Monday, March 17, 2008

Graph-based Programming

So I was really out of it tonight. I did yoga today for the first time in a very long time in a super hot room. It felt amazing, but I was pretty zoned the rest of the day.

Anyway, I am trying to decide how to spend my time.

I love LISP, and more the idea of super sophisticated metaprogramming systems. I spent several hours yesterday and today looking over the iron python code, and I had a very interesting thought.

Python is by no means a complex language these days. Yet the python .net compiler is a seriously tough project to get into. It may be faster than CPython, but it is really tricky trying to figure out how things work. The craziest thing is how much random code there is in the compiler doing various things, a lot of it very very repetitive. Wouldn't a simpler language with more powerful metaprogramming abilities be capable of overwhelming a complex language with less sophisticated metaprogramming abilities?

Lets say you have a sophisticated problem X. The closer your language looks to X, the less work it is going to take to solve it. Couldn't a compiler be seen as a general solution to all problems that then has to be customized to the particular problem? Now you could add all the crazy complexity to the compiler to make X very easy to solve. Then your solutions probably would not generalize to making problem Y very easy to solve. This is fundamentally why a given language has a given advantage in a given problem domain.

But a programmable language should be capable of being used efficiently for any problem, because in a sense, you are building your own compiler just for that problem.

The point is that I could see removing 90% of the structure and complexity of Iron Python, replacing it with general high-level metaprogramming capabilities, and ending up with a simpler system. A great test would be to write an Iron Python compiler in super-meta-language Y and see how much shorter it is.

Anyway, implementing some sort of LISP on .net is something that I am seriously thinking about.

Next on the list of really interesting things to fuck with is writing a graph-based graphics manipulation program and writing a book about it. I know a ton about large scale editing applications that very few other people know. I don't know how generalizable this knowledge is to other fields, or how useful it would appear to other people.

But I have to believe that speaking clearly about the technical advantages of the entire system we have designed for editing applications should elicit quite a response in people who like 3d graphics, who do like graphs, or who just like to think about things that blow your mind a bit. Plus it is something I can write about with a high degree of confidence that what I am saying is at least feasible. And I didn't have to pay for the feasibility test myself (thanks Anark!).


So, I was really sitting there completely in my head in a dark room thinking about all the advantages of a graph based editing systems, and I started thinking about representing programs a graphs, and could I build a lisp compiler that had macros, but that also let you run code generically operating on the representing of the code as a graph before it is dumped to IL (CLR assembly code)?

What does a program look like in its graph state? It should look something like a callgraph. Along with this (or embedded in it), would be a graph of what datatypes are used in which functions (or perhaps just allocated). And the datatype graph (schema) described by the program. Also, there is the syntax tree which hooks into the callgraph.

Now merge this with the datatype graph created by reading an xml schema. You could have extend the pure language directly with the datatype graph you are working with. This would be a code generator of sorts; but it would be a much more general one because your base language is already completely graph based. Talk about the results fitting into the language. You could build systems like this that were technically indistinguishable other programs but still extended the programming language in absolutely crazy ways.

Anyway, it is interesting to think about programs and think about graphs made up of every possible piece of related information. It is also interesting to think about the possibilities of being able to, at compile time, to manipulate those graphs to a great degree. Why not run time?

Because I really don't know how far I could push .net during run time and still have things work. Can I dynamically generate classes? Where does the code go? Sometimes you need a dll to be saved to disk in order for things to work, I know that, but the Iron Python guys may have solved a lot of those issues. Plus, I haven't really figured out what it would mean if you had access to all this stuff, and compile time just feels a little more natural to me.

Why have we designed all this structure around systems that are fundamentally graph based to begin with? I am not saying that visually connecting lines is a good idea, I never really liked graphical programming techniques. What if you built a runtime system that was fundamentally made to run these graphs, just as the current systems are made to run assembly? Then you would have all the crazy expressivity of the graphs all the time. It could do all sorts of optimizations based on which edges were traversed more than other edges, which ones were never traversed, etc. It could really morph itself both to new demands and new problems. I guess I should take a compiler theory class and find out just how far you can really take this idea.

Chris

1 comment:

Unknown said...

You should have a look at FME from www.safe.com. It is more like programming in CSP than graph-based state machine but I think this is what you would be after