Shotgun: The Rubinius virtual machine and some musings on compiling Ruby

If you're interested in the future of Ruby, read Shotgun: The Rubinius virtual machine - it's a great overview of Rubinius, one of the promising alternative Ruby implementations.

Rubinius seems very clean - certainly a great step forward over Ruby 1.8. I haven't yet looked enough at YARV / Ruby 1.9 to know how Rubinius and YARV compare, but that's definitely on my todo list.

The design of Rubinius is in any case a valuable basis for anyone who wants to understand execution and parsing of Ruby.

Native compilation of Ruby

Personally, I've long been thinking about what would be needed for native compilation of Ruby or a Ruby-like language. To get really good performance you'd need to make at least some subtle changes to the semantics, but I do believe it's possible to make something highly compatible, including keeping most of the dynamism that makes Ruby great.

A major change is to decide on a split between the read/parse/compile phase and the execution phase. I've started seeing a Ruby script as a program that is interpreted to create a program (a set of classes, constants and globals) - a "native" Ruby compiler would need some mechanism for executing a Ruby script at compile time and then deciding where execution should start when the resulting executable is invoked. A simple approach might be to require a "main" method to be present, or to take a class/method on the command line to indicate where to start execution. There'd be an issue with whether you'd allow the full language (as currently) to be executed at compile time, but nothing insurmountable.
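To make the "program that creates a program" idea concrete, here is a small sketch. The Greeter class and the "main" convention are made up for illustration; no existing Ruby implementation works this way as far as I know. The point is only that the class body is ordinary code that runs while the script loads, and that a compiler would need an agreed place to start once loading is done.

```ruby
# "Compile phase": running the script top to bottom is what defines the program.
class Greeter
  # This loop executes while the script is loaded, not when the program "runs".
  %w[hello goodbye].each do |word|
    define_method(word) { |name| "#{word} #{name}" }
  end
end

# Hypothetical convention: a native compiler would snapshot the classes,
# constants and globals that exist at this point, and emit an executable
# whose entry point is this method.
def main(args)
  Greeter.new.hello(args.first || "world")
end

# A plain interpreter has no such split, so the "execution phase" is
# simulated here by calling the entry point by hand.
puts main(["Ruby"])
```

Under that convention, everything above "main" could in principle run inside the compiler, and only "main" and whatever it reaches would need to end up in the executable.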

That's actually only half the problem. The second half is Ruby's object model, which allows methods to be removed, added or aliased at any point during the program's lifetime, and allows the same for instance variables. It's tremendously hard to make that memory efficient and fast - Matz' Ruby interpreter handles this by giving each object a hash table for instance variables and each class a hash table for methods, which makes both instance variable and method lookups slow, and makes instance variables waste a tremendous amount of memory. Compiling the code in the simple, obvious way that mirrors the interpreter would never approach the performance of C++ or similar languages. There ARE opportunities for static analysis and for models that are both faster and less memory intensive, but I suspect they'd require full JIT support and conditional recompilation of methods on changes to the object model - possibly even moving objects if you really want to push it.
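For anyone who hasn't run into this side of Ruby, the following shows the kind of dynamism a compiler has to cope with. The Point class is just an illustrative example; the calls used are standard Ruby.

```ruby
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  def sum
    @x + @y
  end
end

a = Point.new(1, 2)
b = Point.new(3, 4)

# Instance variables are per object: two Points need not have the same set.
a.instance_variable_set(:@label, "first")
b.send(:remove_instance_variable, :@y) # send, since this was private in older Rubies

puts a.instance_variables.inspect
puts b.instance_variables.inspect

# Methods can be aliased and removed on a live class, and objects created
# earlier are affected immediately.
class Point
  alias_method :total, :sum
  remove_method :sum
end

puts a.total
puts a.respond_to?(:sum)
```

A compiler that laid out Point as a fixed two-field struct with a direct call to sum would be wrong for both a and b by the end of this script, which is why the naive approach ends up mirroring the interpreter's hash tables.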

The key is to optimize for the common cases, which I believe exclude a lot of the painful (hard to compile) dynamism for most classes and objects most of the time... We'll see if someone picks up the torch. I'd love to (I am playing with a prototype compiler backend that I think would be helpful, but it's FAR from supporting even a minimal subset of the Ruby object model - I'll start a series of posts about it soon), but I very much suspect I won't get the time anytime soon.
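As a rough illustration of what optimizing for the common case could mean for instance variables, here is a toy model written in Ruby itself. SlotObject, POINT_LAYOUT and the method names are all hypothetical and are not taken from Rubinius or any other implementation. Variables the compiler saw ahead of time get a fixed slot, and only variables added later pay for a per-object hash. In a real compiler the slot index would be resolved at compile time; the layout lookup below merely stands in for that.

```ruby
# Toy model of a compiled object: known instance variables live in fixed
# array slots, anything added later falls back to a per-object hash.
class SlotObject
  def initialize(layout)
    @layout = layout                 # name => slot index, shared by all instances
    @slots  = Array.new(layout.size) # fast path storage
    @extra  = nil                    # slow path, allocated only on demand
  end

  def get(name)
    index = @layout[name]
    return @slots[index] if index
    @extra && @extra[name]
  end

  def set(name, value)
    index = @layout[name]
    if index
      @slots[index] = value
    else
      (@extra ||= {})[name] = value
    end
  end

  # True once this object has left the common case.
  def uses_fallback?
    !@extra.nil?
  end
end

# Assumed result of static analysis: Point objects normally have @x and @y.
POINT_LAYOUT = { :@x => 0, :@y => 1 }.freeze

point = SlotObject.new(POINT_LAYOUT)
point.set(:@x, 1)
point.set(:@y, 2)
puts point.uses_fallback?

point.set(:@label, "late addition")
puts point.uses_fallback?
```

Objects that stay within the predicted layout never allocate the hash at all, which is where the memory saving would come from. Removing variables or changing the layout of a whole class is the part that would need the recompilation mentioned above, and this sketch doesn't attempt it.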

Shotgun: The Rubinius virtual machine and some musings on compiling Ruby 2008-03-20
