Saturday, January 3, 2009

First Clojure Project

Finished my first clojure project yesterday, at least finished functionally.

I wrote a small program where you have some 3d graphics running and you can change the anti-aliasing settings, the arguments to the glsl shaders, and a few other things.

It mainly uses clojure, JOGL and swing. It runs with 4x antialiasing of the screen on my system and uses about 35% of one of my two cores. Most of that time is in opengl or swing, I am not sure which. I have figured out that for this program, java 1.6 significantly outperforms java 1.5. This may be due to a much faster swing rendering implementation that is itself rooted in opengl, or the fact that the 64-bit java implementation is faster, or any number of other things.

When I have thought of a better name for it I will move it to a new github repository and people can try it out. You would need to install java, JOGL, clojure, and clojure-contrib.

So, here are some of the harder issues encountered doing this:

1. Figuring out which windowing library to use was horrid because I tried QT first. It hung or crashed for days; I tried installing different jvms and all manner of things. Mixing QT with JOGL and clojure really was a waste of time; QT-java, for the mac, just isn't ready for primetime. This was the only real issue that really wasn't fun to solve.

The rest came up during implementation and debugging (these were fun to figure out).

2. Laziness confused me several times. One time it was the classic: I called map and didn't use the return value. Thus nothing happened, even though I wanted the side effect of writing out to a file. Another time I used a lazy list while doing opengl stuff. I tried to look at this list in the REPL. This caused the entire jvm to crash with a bus error. This is because the repl and opengl run in different threads, and calculating the result on demand was causing the list to be evaluated in the repl thread. This was the type of bug that, when you figure it out, you know just marked a significant step in your path to functional zen.
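A minimal sketch of the first trap. The `record!` function is a hypothetical stand-in for the file write; it records into an atom so the side effect is easy to see.

```clojure
;; Hypothetical stand-in for "write a line to a file": it records each
;; item in an atom so the side effect can be observed.
(def written (atom []))

(defn record! [x]
  (swap! written conj x))

;; map is lazy: the returned sequence is never consumed here,
;; so record! is never called.
(defn lazy-attempt [items]
  (map record! items)
  nil)

;; doseq is eager and exists for side effects.
(defn eager-attempt [items]
  (doseq [x items]
    (record! x)))

(lazy-attempt [1 2 3])
(def after-lazy @written)   ; still empty

(eager-attempt [1 2 3])
(def after-eager @written)  ; all three items recorded
```

For the second trap, wrapping the sequence in `doall` on the thread that does the opengl work forces it to be realized there, so a later look from the repl only reads values that already exist.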

3. Swing layout systems. Every goddamn time I use these I spend forever on at least one stupid problem. GridBagLayout-4-eva. At least with a repl I could literally change the code, hit return and see the results. It really was kind of interesting.

4. REPL madness. I managed to get the program into all sorts of weird states by reloading files in the repl. I produced another bus error, I got things to hang. Lots of stuff. I would sit there and change all sorts of opengl commands, hit return, and laugh as it did something weird. I would exit, build the project jar, restart, and the program would behave differently because I hadn't loaded a file that I had edited or something along those lines.

5. java.nio.Buffer. There was a bug that took me a long time to solve when I switched to vertex buffer objects. The way you pass these to JOGL is using the newer java.nio.FloatBuffer (or CharBuffer or ShortBuffer...). Anyway, I would hit render and nothing would happen. At first I thought it was perhaps my vertex shader. Then I thought it was in binding the vertex buffer object to this particular vertex shader property. Still nothing. Finally *finally* after working on this for like 2 hours, I looked at the API for java.nio.Buffer. It has a flip function. I figured out that when you fill a buffer from an array, its position member variable gets set to the end of the data you wrote. The flip function sets the limit to the current position and sets the position back to 0. This sets it up for a read operation somewhere on the JNI side of things.

Super, super brutal mistake caused by a few things. First off, when you call the JOGL call you pass in an explicit byte-count argument. I figured if JOGL had this information, it should be able to take the buffer and just make it work. Second off, I didn't even realize that a java.nio object *had* a flip function. Third, I have never used anything with a flip function, so the initial read over the API didn't help anything out.
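The position behaviour is easy to check at the repl without any opengl involved. This sketch uses a plain heap FloatBuffer for simplicity (a buffer handed to native code would usually be a direct one), and the function name is made up for illustration.

```clojure
(import 'java.nio.FloatBuffer)

(defn remaining-before-and-after-flip
  "Fills a fresh FloatBuffer from data with a relative put and returns
  [remaining-before-flip remaining-after-flip]."
  [^floats data]
  (let [^FloatBuffer buf (FloatBuffer/allocate (alength data))]
    ;; A relative put advances position past the data just written.
    (.put buf data)
    (let [before (.remaining buf)]
      ;; flip sets limit to the current position and position to 0,
      ;; which makes the written data readable from the start.
      (.flip buf)
      [before (.remaining buf)])))

(def flip-demo
  (remaining-before-and-after-flip (float-array [0.0 1.0 2.0 3.0 4.0 5.0])))
;; flip-demo is [0 6]: nothing to read before the flip, six floats after.
```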

6. glVertexAttribPointer takes *byte* arguments: its stride and offset are measured in bytes. I was giving it an array of floats, and I figured its arguments would be float sized, since it knew the buffer that was bound at the time of its invocation and thus knew the datatype in the buffer. After getting extremely bizarre results for another couple of hours I figured this out. Unfortunately this was while I was hooking up the antialiasing code, which renders to a multi-sample fbo, then downsamples this multi-sample fbo to a single-sample (non-antialiased) fbo, and then finally renders this to the screen. Get it? So there were a lot of links in the chain that could have failed. Oh yeah, and I had to render to a fullscreen quad for the final step to the screen, so I had this other piece of the chain that was failing for a while.
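To make the byte arithmetic concrete, here is a small sketch that computes stride and offsets for an interleaved vertex layout. The layout and the function name are assumptions for illustration, and nothing here calls JOGL; it only produces the numbers you would pass along.

```clojure
;; Size of one float in bytes. Stride and offset are counted in bytes,
;; not in float elements.
(def bytes-per-float 4)

;; Hypothetical interleaved layout: 3 position floats, then 2 texcoord floats.
(def example-layout [[:position 3] [:texcoord 2]])

(defn attribute-layout
  "Returns a map from attribute name to {:size :offset :stride},
  with :offset and :stride in bytes and :size in components."
  [layout]
  (let [stride (* bytes-per-float (reduce + (map second layout)))]
    (loop [remaining layout, offset 0, acc {}]
      (if-let [[attr-name size] (first remaining)]
        (recur (rest remaining)
               (+ offset (* size bytes-per-float))
               (assoc acc attr-name {:size size :offset offset :stride stride}))
        acc))))

(def example-attributes (attribute-layout example-layout))
;; :position starts at byte 0, :texcoord at byte 12, and both use a
;; stride of 20 bytes (5 floats per vertex).
```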

7. The general problems related to running a realtime rendering system. These include: Swing completely destroys the opengl render context every once in a while on resize. I wasn't initially planning to handle this condition because it would traditionally only happen if you ran a screensaver or something like that. This isn't a rip on swing; I understand why they render to pbuffers and thus why they have to reallocate them sometimes. The java debug opengl interface, instead of only throwing an exception once upon error and resetting, throws an error for every single opengl call after the error is made. Thus I would get exception stack frames printed to the repl at 60 frames/second.

The upside of most of the problems of 7 is that I built a much more robust rendering platform much earlier than I was planning. So it automatically reloads files that have been kicked out of the system, rebuilds vertex buffer objects, and is generally pretty tough.


OK, so enough about the problems. What did I build? What did I get out of the experience?


I proved clojure's viability, at least at some level, for doing 3d graphics. They are certainly possible. As swing gets faster, they will get more possible. Plus, not all applications need to update at 60 frames a second all the time. Additionally, my usage of clojure is still quite amateur. As I learn more about the language and swing, I may have significant opportunities to make things cooler.

The app is really cool, tight, and small. It has an application log, a 3d view, and an inspector palette where you can check out the properties of things you are looking at. You can click on a shader file and (at least on my mac) it will open the file in an external editor and start listening for changes. Any time the file changes, it attempts to reload the file and shove it back into the program. If it works then you get a new program. If it doesn't it saves the result, deletes intermediates and prints the error log to the application log. Thus you can sit there and tweak shaders till you are dizzy and the app will continue to display reasonable stuff and tell you exactly why the shader didn't load.
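The "keep showing something reasonable" behaviour boils down to only swapping in the new program when the rebuild succeeds. A rough sketch of that idea, where `compile-fn` and `fake-compile` are hypothetical stand-ins for the real shader compile and link step:

```clojure
(defn try-reload
  "Returns the new state after attempting to rebuild from source.
  compile-fn is a hypothetical function that returns a program or throws."
  [state compile-fn source]
  (try
    (assoc state :program (compile-fn source) :last-error nil)
    (catch Exception e
      ;; Keep whatever program was already running and remember why
      ;; the rebuild failed, so it can be shown in the log.
      (assoc state :last-error (.getMessage e)))))

;; Stand-in compiler: accepts any source that mentions "main".
(defn fake-compile [^String source]
  (if (.contains source "main")
    {:source source}
    (throw (ex-info "no main function" {}))))

(def state-1 (try-reload {:program nil} fake-compile "void main() {}"))
(def state-2 (try-reload state-1 fake-compile "garbage"))
;; state-2 still holds the program from state-1, plus the error message.
```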

You can switch the antialiasing on or off, which actually changes a bit of the rendering pipeline, and you can choose the amount of aa you like. My laptop, for example, only supports 4x antialiasing. Thus if you select 16x it will try that. Failing 16x oversampling, it falls back to 8x and tries that. It will continue doing this until it finds antialiasing that is supported. This is supported in the graphics library; you can pass it a sequence of surface specifications and it will try each one until it finds one that is valid.
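The fallback itself is a one-liner with `some`. In this sketch the spec maps, `first-supported`, and `fake-create` are all made up for illustration; the real library's names and spec format are not shown in this post.

```clojure
;; Hypothetical surface specs, most demanding first.
(def candidate-specs
  [{:samples 16} {:samples 8} {:samples 4} {:samples 2} {:samples 1}])

(defn first-supported
  "Returns the result of try-create for the first spec it accepts, or nil.
  try-create is assumed to return nil (rather than throw) on failure."
  [try-create specs]
  (some try-create specs))

;; Stand-in for a real allocation: pretend the hardware tops out at 4x.
(defn fake-create [spec]
  (when (<= (:samples spec) 4)
    {:surface :allocated :samples (:samples spec)}))

(def chosen (first-supported fake-create candidate-specs))
;; chosen is the 4x surface; the 16x and 8x specs were tried and rejected.
```

`some` stops at the first non-nil result, so the lower-quality specs after the first success are never attempted.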

The docking panel framework is one I found on the web, and it is OK. Not great, but not horrible. It wouldn't work for a commercial application but it is like Christmas for an open source or shareware application. Plus it is LGPL'd. Anyway, you can dock, undock, and do all manner of stuff with the windows.

The application is built on a heavily threaded architecture. OpenGL can only run in one thread, so that is capped. There is a large thread pool for blocking IO and a smaller thread pool (the number of processors) for CPU-bound processing. This is taken care of using clojure's agents which I bastardize to do what I want. I don't use them the way they were intended; I just use their send and send-off commands.
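One way to use agents purely as a door into the two thread pools looks roughly like this. It is a sketch of the idea, not the code from the project, and `run-io` and `run-cpu` are names invented here.

```clojure
;; One throwaway agent per task; its state is just a slot for the result.
(defn run-io
  "Runs the blocking function f on the expandable send-off pool."
  [f]
  (send-off (agent nil) (fn [_] (f))))

(defn run-cpu
  "Runs the CPU-bound function f on the fixed-size send pool."
  [f]
  (send (agent nil) (fn [_] (f))))

;; Usage: a pretend blocking read and a pretend computation.
(def io-result (run-io (fn [] (Thread/sleep 10) "file contents")))
(def cpu-result (run-cpu (fn [] (reduce + (range 1000)))))

;; await blocks until the actions sent to these agents so far have run.
(await io-result cpu-result)
;; @io-result is now "file contents" and @cpu-result is 499500.
```

In a standalone script, call `(shutdown-agents)` at the end so the pool threads do not keep the jvm alive.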

So, for instance, let's say you want to load a file. All files are md5'd upon load so that I can avoid doing anything redundant with large pieces of data. Let's say this file is a vertex shader that is used in one or more programs.

An IO thread loads the file into a byte buffer. It then hands this byte buffer to a CPU thread to do the md5. Finally, the rendering system picks up this new buffer (during the render loop) and tries to make a shader with it. Should it succeed, it finds all programs that included the old shader and attempts to re-link the programs with the new shader and its old counterpart (the shader could have been either vertex or fragment). If it succeeds it replaces the old program with the new one. At each point it logs something.
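The md5 step and the "avoid redundant work" check can be sketched with java.security.MessageDigest. The cache atom and `load-once!` are assumptions for illustration; the check-then-store is not atomic, which is fine for a sketch but would need tightening if several threads load the same file at once.

```clojure
(import 'java.security.MessageDigest)

(defn md5-hex
  "Returns the MD5 digest of a byte array as a 32-character hex string."
  [^bytes data]
  (let [digest (.digest (MessageDigest/getInstance "MD5") data)]
    (apply str (map #(format "%02x" (bit-and % 0xff)) digest))))

;; Hypothetical cache keyed by digest, so identical contents are
;; processed only once.
(def loaded (atom {}))

(defn load-once!
  "Calls (process data) only if no entry with the same digest exists yet."
  [^bytes data process]
  (let [k (md5-hex data)]
    (if (contains? @loaded k)
      (get @loaded k)
      (let [v (process data)]
        (swap! loaded assoc k v)
        v))))
```

Calling `load-once!` twice with the same bytes runs `process` once and returns the cached value the second time.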

The log data gets shoved onto a CPU thread where it is split into lines and has the module and message type prepended to each line. The application log messages list gets appended to with the new log data.

There is another thread that every so often, perhaps 3 times a second, checks to see if the written log messages are different from the application log messages. If they are, it copies the app log messages to the written messages list and fires off a thread that takes the written messages list and builds a large string of them all concatenated together. It then queues yet another task using SwingUtilities/invokeLater, which runs on the swing event thread and does the actual text setting on the log window object.

I didn't want logging to block whatever thread is doing the logging. I didn't want the log window to be appended to very often because this is an extremely expensive operation, especially when the log is long (it is capped at around 1000 lines right now in a clever, lazy fashion using 'take'). I wanted any real work to be done in a CPU thread and not in the logging thread or the UI event thread. Finally, you shouldn't access swing components outside of the swing thread if you can help it.
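Capping the log with take could look something like the sketch below. The names are invented, the list is kept newest-first so the cap is a take from the front, and unlike the lazy version described above this one forces the result with `doall` so that lazy sequences don't pile up inside each other over many appends.

```clojure
(require '[clojure.string :as string])

(def max-log-lines 1000)

;; Newest line first, so capping is a cheap take from the front.
(def app-log (atom ()))

(defn log!
  "Splits message into lines, prefixes each with module and level,
  and pushes them onto the capped log."
  [module level message]
  (let [lines (map #(str module " " level ": " %)
                   (string/split-lines message))]
    (swap! app-log
           (fn [old]
             (doall (take max-log-lines (concat (reverse lines) old)))))
    nil))

(defn log-text
  "Builds the full log as one string, oldest line first."
  []
  (string/join "\n" (reverse @app-log)))

(log! "render" "info" "first\nsecond")
(log! "io" "error" "third")
(def log-snapshot (log-text))
```

Building the string with `log-text` is the part that would run on a CPU thread, with only the final set-text call handed to the swing event thread.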

So, the point is that I am using threads like they are growing on trees, because I have two cores and I know for a fact the second core is usually doing absolutely nothing. This is nothing compared to what will happen over time as I get more and more functionality; I love threads and continuation style threading.

Now, I have all the functionality I want and the application is damn stable and responsive. If it runs, I bet you can't crash it. I can say this because it has very fixed inputs and I have tested every one of them exhaustively, which is feasible in this case.

Let's talk about repl-based development. Let's take my FBO (frame-buffer-object) implementation. Starting from zero, I first write a function that pushes a closure onto a list that gets run on the render thread. It waits for this closure to finish (using a CountDownLatch), and then returns the result to the repl *just as if the function was synchronous*, even though it happened on the render thread.
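A sketch of that synchronous-looking call, with a plain daemon thread draining a queue standing in for the real render loop. The names `render-queue`, `run-on-render-thread`, and `render-thread` are assumptions for illustration.

```clojure
(import '(java.util.concurrent CountDownLatch LinkedBlockingQueue))

;; Closures waiting to be run by the render thread.
(def render-queue (LinkedBlockingQueue.))

(defn run-on-render-thread
  "Queues f for the render thread, blocks until it has run, and returns
  its result (or rethrows its exception) on the calling thread."
  [f]
  (let [latch (CountDownLatch. 1)
        result (atom nil)]
    (.put ^LinkedBlockingQueue render-queue
          (fn []
            (try
              (reset! result {:value (f)})
              (catch Throwable t
                (reset! result {:error t}))
              (finally
                ;; Always release the caller, even if f threw.
                (.countDown latch)))))
    (.await latch)
    (if-let [e (:error @result)]
      (throw e)
      (:value @result))))

;; Stand-in for the real render loop: a daemon thread that takes
;; closures off the queue and runs them one at a time.
(def render-thread
  (doto (Thread. (fn []
                   (loop []
                     ((.take ^LinkedBlockingQueue render-queue))
                     (recur)))
                 "render")
    (.setDaemon true)
    (.start)))
```

From the repl, `(run-on-render-thread #(.getName (Thread/currentThread)))` returns the name of the render thread, which shows the closure ran there even though the call looked synchronous.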

Next I write a function that creates an FBO. I test this function extensively with the repl right then and there. I pass in nonsense, I pass in large numbers, I find out all about the failure modes of my FBO allocation implementation (of which there are a few). Now I do the same for a simple FBO delete function, chained after an allocate function. Next I test managing maps of named FBO objects so you can create an FBO and refer to it by name to get it later. For the most part, all of this is done without shutting the program down.

In 3d-graphics, this is pretty hard to get right. But designing your code to be run from the repl has the same effect as designing it to be unit tested; it is just a lot tighter and easier to mess with.

Anyway, I used this technique for designing the user interface of my program. Swing layouts really suck to get right. They are time sinks, but a repl and a good testing strategy *really* help speed things up.

OK, so you know what hurt and you know what I thought was cool. Now the next steps...

Here are the famous (and correct) steps to creating good systems:
1. Make it work. Be goddamn sure that you understand the problem.
2. Make it elegant. <== we are here
3. Make it fast. (probably doesn't apply yet).

Now, how do we make it elegant?

1. Remove as much as possible. I call this the code compression stage, although a better way to state it would be the code evaporation stage. This is where you study the language, the code, and really think about your algorithms and the way they are implemented to see if you can think of more concise (but not overly clever) ways of doing things. This is also where I will try to significantly improve the modular decomposition of my code: write better utilities and break code up into generic pieces. This is the best part, where you take a rough sculpture and make it truly a piece of art.

2. Ensure API exposure is consistent. This means that there are consistent names and overall structure to the system. I am doing very poorly on this point right now because I was learning the language while I was writing code. This is also where you separate public APIs from internal APIs and document the public API thoroughly.

3. Attempt to match the idioms of the language. I used underscores throughout my program because I like them and I didn't realize I wasn't supposed to; I will replace these with dashes because clojure uses dashes. You should also look at your usage of standard library functions and try to make sure you understand how they are supposed to be used. Remove any functions you wrote that are replacements for standard library functions; etc. Go through the clojure libraries and try to match the naming and design conventions of the major packages.

When I code, I do whatever is expedient because I want to see something work. But after I see something work, I want to make it really nice. The above steps are my standard steps and really they just facilitate me thinking about what I did in a very thorough but still abstract manner.

Chris
