Monday, January 12, 2009

graphics file formats

It is an odd fact of life for 3d programmers that the file formats *really* suck.

The collada specification (*just the spec*) is a 3.7M pdf. It doesn't have a lot of pictures.

The collada xml schema is over 400K bytes and clocks in at 11,848 lines.

The FBX toolkit has a c++ library as a file specification. They can't be bothered to publish their file format, even though it is extremely regular and pretty damn well designed by my estimation. You can see it in ascii, but I wonder if the binary version is identical to the ascii or if there is something neat about it.

Then check out lib3ds, or try looking at a maya file. Obj is ok if you need something fairly simple.

Supporting any one of these in an application takes a huge amount of work.

Out of all of these I only really know collada. I did a 3ds importer a long time ago. One time I reverse engineered the excel .xls format going from partial documentation and OpenOffice's source code. Even with all of that, it took a *lot* of effort. Lots of hex dumps. Anyway...

Collada is the best out of all of them in my opinion. I bet I am one of very few knowledgeable graphics developers who believe that, but it is. They all suck, but at least one has an open development process along with a few large companies pushing it. It isn't a game engine data format; xml is big, verbose, and extremely slow to parse compared to fairly trivial binary file formats. It isn't the best designed piece of software I have ever seen, either. It isn't modular nor entirely consistent (as if something of that size could be consistent, but anyway); in some places it is way overly generic and arbitrary. Take animations: there are at least 5 different ways to store bezier and linear key data. 5. FX composer, 3ds max, maya, and another 3d editor whose name escapes me now because it isn't sold in the US all do them differently. It doesn't have nearly enough examples of what could be considered the right way to do things; thus everyone has just rolled their own.

But it is something people can rally behind and we can all use it. I think the way collada does transforms is really really good. They are the source of a lot of problems, but you can do anything with them. Really goddamn good. I spent a huge amount of time getting them to work, but I think they are cool. I also think the way they store geometry data is pretty cool and lends itself perfectly to using opengl efficiently (vbos). I am not a huge fan of its shader sections; the interaction between shaders and scenes (using evaluate_scene tags) kind of annoys me. I just don't think of shaders, be they a special material on an object or a multi-pass effect, in the same model, and it irks me.
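To make the transform point concrete, here is a rough sketch of the idea as I understand it: a node carries an ordered list of translate, rotate and scale elements, and they get composed by post-multiplying in the order they appear. The function names and the tuple layout of the stack are mine, made up for illustration; the rotate values follow the axis-plus-angle-in-degrees layout. This is not a collada loader, just the composition step.

```python
import math

def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def translate(x, y, z):
    m = identity()
    m[0][3], m[1][3], m[2][3] = x, y, z
    return m

def scale(x, y, z):
    m = identity()
    m[0][0], m[1][1], m[2][2] = x, y, z
    return m

def rotate(ax, ay, az, degrees):
    # Axis-angle rotation; the axis is normalized first.
    length = math.sqrt(ax * ax + ay * ay + az * az)
    x, y, z = ax / length, ay / length, az / length
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    t = 1.0 - c
    m = identity()
    m[0][0], m[0][1], m[0][2] = t * x * x + c, t * x * y - s * z, t * x * z + s * y
    m[1][0], m[1][1], m[1][2] = t * x * y + s * z, t * y * y + c, t * y * z - s * x
    m[2][0], m[2][1], m[2][2] = t * x * z - s * y, t * y * z + s * x, t * z * z + c
    return m

# Hypothetical lookup from element name to matrix builder.
BUILDERS = {"translate": translate, "rotate": rotate, "scale": scale}

def compose(stack):
    """Post-multiply each element's matrix in the order it appears."""
    result = identity()
    for kind, values in stack:
        result = matmul(result, BUILDERS[kind](*values))
    return result

def transform_point(m, p):
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[r][k] * v[k] for k in range(4)) for r in range(3))

# A node that translates by +1 in x and rotates 90 degrees about z.
stack = [("translate", (1.0, 0.0, 0.0)), ("rotate", (0.0, 0.0, 1.0, 90.0))]
point = transform_point(compose(stack), (1.0, 0.0, 0.0))
```

Because the list is arbitrary in length and order, an exporter can write a pivot-translate, rotate, un-pivot sequence or a single baked matrix and the importer handles both the same way. That flexibility is where both the power and the pain come from.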

I like collada's extension mechanism but it is clunky. Certain really odd situations happen with different interpretations of the specification.

I wish they had used certain elements of xml schema and done things a little differently in others. For instance, they don't break up logically different spaces using sub-schemas and xs:include. I don't know if this is for technical reasons, but it makes interpreting the entire specification tough because you *have* to look at a lot of stuff.

Why isn't there a collada-geometry schema that includes collada core? Why isn't there a collada-animation in a completely different schema that includes the common bits? Why doesn't the file start with their extension mechanism and use that to make pieces more modular? Perhaps you can't really have a unified ID-space if you are using different schemas? It may break schema verifiers, which for something like collada are particularly useless as it does lots of outside-schema stuff anyway, and certain large vendors produced files that didn't pass verification as it was.

I believe the FBX file format has superior architecture but it isn't open and it has weird characteristics that really irk me. I wish it had collada's geometry section, and for the record the way collada at least partially allows a good design for shaders and effects blows FBX out of the water. FBX starts with a very generic design (essentially objects with arbitrary properties) and uses convention and their API to ensure consistency. The details of the data in the format aren't as smart as collada's, however, and that kills it for me.
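The "generic objects plus convention" idea can be sketched in a few lines. Everything here is hypothetical: the Node class, the "Mesh" and "Vertices" names and the accessor are invented for illustration and are not the real FBX layout or SDK API. The point is only that the container knows nothing about meshes, and the consistency lives in a thin accessor layer.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A generic record: a name, arbitrary properties, arbitrary children."""
    name: str
    properties: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def child(self, name):
        for c in self.children:
            if c.name == name:
                return c
        return None

def mesh_vertices(node):
    """Convention layer (hypothetical): a 'Mesh' node is expected to carry a
    'Vertices' child whose properties are a flat x, y, z float list."""
    if node.name != "Mesh":
        raise ValueError("not a mesh node")
    verts = node.child("Vertices")
    if verts is None or len(verts.properties) % 3 != 0:
        raise ValueError("mesh breaks the vertex convention")
    p = verts.properties
    return [tuple(p[i:i + 3]) for i in range(0, len(p), 3)]

mesh = Node("Mesh", ["cube"], [
    Node("Vertices", [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]),
    Node("Whatever", ["tools can stash anything here"]),
])
vertices = mesh_vertices(mesh)
```

The parser for a format like this stays tiny and never changes; new features are just new conventions. The flip side is that without the API enforcing the conventions, nothing stops two writers from disagreeing.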

They both are light years better than 3ds files. I bet maya's format is a pretty good design, as that app is really in a league of its own with regard to really amazingly smart design. Also, for the record, Adobe can do some unbelievably cool things with applications. After Effects, even after all of these years, is an amazing application. Their plugin API, while tedious and sometimes very poorly thought out, is documented with humor, and looking at some of the things they allow is quite enlightening as to the application's internal architecture.

This is perhaps the weirdest thing I have admitted for a long time but file formats are really fascinating to me. APIs almost always bore me or piss me off. But for some reason file formats are interesting, especially old ones. The excel file format is really damn cool, and some of the things that the microsoft applications could do were legitimately goddamn tough to do in binary. Things like in-place editing, where the app could run directly from the file without ever loading the entire file into memory at once and could write to the file in the same locations without growing its size (or growing it in a very generic way). Meaning the application could memory-map the file and then instantly use structures with no further loading, theoretically. It could also perform edits to this file while simultaneously growing its memory. This isn't shit tricks done with malloc or new; this is really good, consistent engineering and a darn hard challenge that would affect the design of your entire application. On the other hand, the applications take forever to load files now and you have to wonder. Meanwhile KOffice tends to insta-load apps so fast it can be somewhat shocking.
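A toy version of the memory-map-and-edit-in-place trick, to show what I mean. The file layout (a "DEMO" magic, a record count, then fixed-size id/value records) is invented for this sketch and has nothing to do with the real .xls layout; it only uses Python's standard mmap and struct modules.

```python
import mmap
import os
import struct
import tempfile

HEADER = struct.Struct("<4sI")  # made-up layout: magic bytes, record count
RECORD = struct.Struct("<Id")   # made-up layout: record id, value

def create(path, values):
    with open(path, "wb") as f:
        f.write(HEADER.pack(b"DEMO", len(values)))
        for i, v in enumerate(values):
            f.write(RECORD.pack(i, v))

def read_record(mm, index):
    # Decode one record straight out of the mapping; nothing else is read.
    return RECORD.unpack_from(mm, HEADER.size + index * RECORD.size)

def write_record(mm, index, value):
    # Overwrite the record at its existing offset; the file does not grow.
    RECORD.pack_into(mm, HEADER.size + index * RECORD.size, index, value)

def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        create(path, [1.0, 2.0, 3.0])
        size_before = os.path.getsize(path)
        with open(path, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
            before = read_record(mm, 1)
            write_record(mm, 1, 20.0)
            mm.flush()
            after = read_record(mm, 1)
        return before, after, size_before == os.path.getsize(path)
    finally:
        os.remove(path)

result = demo()
```

This only works because every record has a fixed size and a computable offset. The moment you want variable-length data or want to insert in the middle, the file format has to be designed around it, which is exactly the hard part.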

I hate microsoft office, but I believe that in odd corners of it you find some of the most amazing engineering I have ever seen. Excel is an application that has yet to be equalled (or even approached) by an open source alternative. The best open source office suites just don't even compare to Microsoft's tools in terms of quality, features, or unified, consistent design. I guess I just hate office software in general as just the look of it forces me to puke instantly while seeing images of baby Jesus crying.

/ramble

Chris
