15 Most Recent [RSS]
I am not recycling - I collect garbage!
The last couple weeks were very busy at work. We had to get the new site into a state where it could be presented to a representative of the customer's. So, when I returned, I was pretty exhausted and usually spent my time catching up on sleep. What little time I had left, however, I spent on a little endeavour that I'd wanted to try for a long while. For this SEKR1T PR0JEKT, I needed a programming language. After a little search, I came across Lua, which is a very lightweight, very flexible and powerful embeddable scripting language. It's under a very liberal MIT-style license, and sticks mostly to ANSI C and thus compiles beautifully out of the box.
I absolutely adored its object model. You have numbers and strings, and then you have tables - essentially associative arrays or maps. For any table you can define a meta-table that can serve as its super-class, providing values for keys that aren't present in your subclass, and since functions are just another kind of data type in Lua, you can simply assign a function to a table entry and then call it using good old dot-syntax and brackets. Essentially, it's the same kind of prototype-based programming that I liked about TADS, and which is also used by other languages like Io and Self.
Another advantage of Lua is that it was built to integrate easily with a host application. You can create custom userdata objects that wrap around a C data structure or pointer to a C++ object, and you get notified when your object is garbage collected or someone calls one of its methods. You can also register C functions with Lua so your scripts can call it.
The downside of Lua is that it is very general-purpose. The features were chosen in a way that lets you do everything, but some of them are a little complicated to do, and others use two very similar syntax notations where simply holding down the shift key too long gives you very similar behaviour but not the same. On top of that, Lua's developers don't give any guarantees regarding backwards compatibility. The byte code and the API may change without further notice, and byte code compiled by an older version of Lua can't be run in a newer one. So, while Lua worked pretty nicely for my test app, it turned out not to be a viable option after all. The parser was also pretty odd code that I really didn't manage to read very well, so my plans to fork the code-base and fix some of the syntax issues were pretty quickly laid to rest.
So, what do you do when you find n cool tools but none that has everything you need? You write the n +1th one yourself. I'm not sure whether I'll ever finish this, but today I did the basic object model. Ints, Strings, Tables, and a garbage collector to go with them. Due to prolonged exposure to Cocoa, I had already done several reference counting implementations, but this time I decided I wanted something that just worked, without having to worry about circular references. So I read a little about mark-and-sweep garbage collectors, which seemed simple enough.
Essentially, a mark-and-sweep garbage collector assumes that there is one central hierarchy in which all objects are. Usually, this hierarchy is rooted in the list of global variables. In addition, the garbage collector has a list of all existing objects (or knows how to find them). Every object has an associated "reachable" flag. When the garbage collector runs, it first sets all objects' "reachable" flags to FALSE. Then it does a deep search through the object hierarchy and marks all objects it comes across as reachable. If you have any orphaned chains of circular-retained objects, where each one holds on to the other, those aren't part of the tree anymore and stay marked as unreachable. All you have to do is sweep over your list of objects and de-allocate each one marked as unreachable.
Simple and beautiful, except if you have a lot of objects, in which case running the garbage collector takes a bunch of time, which is where garbage collectors got their bad reputation from. Of course, there are ways to optimise this. For instance, you can use reference counting as your primary garbage-collection method and then run the collector a lot less often to get rid of the occasional retain-circle.
Another approach which is more complex is having an incremental garbage collector. I.e. instead of marking and sweeping all objects at once, you only process part of the list. Of course, this opens a whole other can of worms if an object moves from a not yet processed object to one that has already been processed. It will then never be marked as reachable. You could of course just have one object inherit its owner's "reachable" flag if the container is marked as reachable, and thus at least avoid deallocating an object too early, but there may be other holes to be avoided.
Anyway: Next stop is actually writing the programming language around that, and drinking a cup of that delicious "Kiba Flip" tea that just finished (pineapple, banana, cherry ... mhhh...). And yeah, it's real tea, not that bagged stuff I've been drinking the last weeks :-)