This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.
The main reason for handling re-opening of class-definitions at this stage, is for a simple reason:
We're going to attack eigenclasses soon, and being able to handle class re-opening makes that a lot simpler in addition to being needed in and of itself too.
Part of the reason is that I made the choice way back in part 19 to use vtables as the basic method for dispatch, and while vtables make for fast method calls, there's extra book-keeping required if we allow re-defining a method, which Ruby of course does.
But how does that affect eigenclasses?
As you may or may not remember, this involves an array in the class object with a pointer to
each method. This is what an instance, and its class (which is itself an instance of Class
)
looks like today:
(the left column represents the "slot" in the object structure. With our current 32-bit backend, this is a multiple of 4 bytes offset from the start of the structure)
This is the standard approach for virtual method dispatch in languages like C++. There it is easier than in Ruby for a few reasons:
The class hierarchies are rarely as deep, since they're not all rooted in a
single class (Object
previously, these days BasicObject
). Furthermore, not all methods are virtual,
and member functions that are not virtual do not typically get vtable slots.
This substantially reduces the likely maximum size of the vtables.
You're not expected to be able to safely call any method on any object. If you do stupid
stuff like statically casting objects to incompatible types, your application will crash
when you call the wrong methods on it. But in Ruby, that's expected to be "safe" (as in,
the worst case is a method_missing
. This means a vtable in C++ only needs to contain
space for virtual member functions up to and including the current class. New ones added
in sub-classes does not impact the size.
You're not expected to be able to re-open classes and make further definitions, and certainly not allowed to define the same method multiple times for the same class (with the result of overwriting past versions)
It's this last one that makes re-opening classes and Eigenclasses tangled up for us. Consider these two variants:
class Foo
# Here self == Foo
class << self
# Here self == Foo's eigenclass
def foo
end
end
# Here self == Foo again
end
class Foo
def self.foo
end
end
While in the latter case it's not as explicit, in both cases we're handling first class Foo
,
then opening Foo
's eigenclass, then handling class Foo
again. While we can optimize
away some cruft here later, to merge processing of adjacent definitions (so that e.g. defining
self.foo
and self.bar
right after each other maintains the same scope), that's extra
complexity that does not solve the wider issue.
There is one obvious issue with trying to re-open a class with vtables: What about sub-classes?
Consider our earlier example fleshed out a bit more:
What do we now have to do if we want to override, say, vtable slot 2 in Foo
? Well, that depends on Bar
:
Bar
has overridden vtable slot 2, then we just update Foo
Bar
has not overridden vtable slot 2, then we need to update Foo
and Bar
.So we need to expand the Class
instance variables to let us find the sub-classes of any given class. As an example, I've added in a Baz
class which is a second subclass of Foo. So this:
class Foo
end
class Bar < Foo
end
class Baz < Foo
end
ob = Bar.new
turns into this:
This leaves us with these steps to set a vtable slot:
(Instead of using a pointer to the first subclass, and a chained linked list like this, we could use a hash table or array, and we might very well do that at some point, but not now).
I've added a test case in 02021ad which simply exercises the basic scenarios by creating a class and two sub-classes - one for which a method is overridden - and then re-opens the base class and overrides one of the original methods.
One of the first things to overhaul is the method I introduced way back at the start of this series to keep track of functions we want to output.
While Ruby does not have functions, we turn every method we find statically into something that looks much more like a function, and name it in a reasonably human-friendly way in part to make it easier to debug the assembler output.
You can check out most of this (all of the handful of lines) yourself, apart from the new globals.rb
which contains the meat in this method (in b88a31b):
# Returns the actual name
def set(fname, function)
suffix = ""
i = 0
while @global_functions[fname + suffix]
i += 1
suffix = "__#{i}"
end
fname = fname + suffix
# add the method to the global list of functions defined so far
# with its "munged" name.
@global_functions[fname] = function
fname
end
All this does is append __
The reason we need to do this is that we now of course can have multiple Foo#bar
methods at various
stages of the execution.
I also snuck in another minor change in the preceding commit (b88a31b)
To allow re-opening classes, we need to not blindly create new class objects as we did before (though we're still assuming the class and its name is/was possible to determine statically. This is the change, which basically wraps an if check around the old version:
- compile_exp(scope, [:assign, name.to_sym, [:sexp,[:call, :__new_class_object, [cscope.klass_size,superclass,ssize]]]])
+ compile_eval_arg(scope, [:if,
+ [:sexp,[:eq, name, 0]],
+ # then
+ [:assign, name.to_sym,
+ [:sexp,[:call, :__new_class_object, [cscope.klass_size,superclass,ssize]]]
+ ]
+ ])
Next up we assign two extra instance variable slots in the class objects:
# slot 4 is reserved for subclasses
# slot 5 is reserved for next_sibling
CLASS_IVAR_NUM = 6
You'll find the remaining bits in 4733be0, and we'll look at lib/core/class.rb
in more detail.
Firstly, this part of __new_class_object
is new:
# Sub-classes
(assign (index ob 4) 0)
(if (eq superclass 0)
(assign (index ob 5) 0)
(do
# Link in as subclass:
(assign (index ob 5) (index superclass 4))
(assign (index superclass 4) ob)
)
)
Basically it simply links this class into the list of sub-classes for it's superclass, by prepending it to the head.
The more interesting bit is __set_vtable
. This was previously the over-simplistic:
%s(defun __set_vtable (vtable off ptr)
(assign (index vtable off) ptr)
)
Now it looks like this:
%s(defun __set_vtable (vtable off ptr)
(let (p)
(assign p (index vtable 4))
(while (sexp p)
(do
(if (eq (index p off) (index vtable off)) (__set_vtable p off ptr))
(assign p (index p 5))
)
)
(assign (index vtable off) ptr)
))
This boils down to this "pseudo-Ruby" which follows the algorithm sketched out earlier:
p = vtable.subclasses
while p
if p[off] == vtable[off]; __set_vtable(p,off,ptr); end
p = p.next_sibling
end
vtable[off] = ptr
And that's it!
Next up: Eigenclasses.