This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.
One of the big elephants sauntering around the room for a long time has been
the issue of how to handle the specifics of how Ruby handles true, nil and
false. To a lesser extent this issue also affects numbers, but it is those
three values that are most critical right now.
The reason is control flow. So far, we've treated these values the way C does:
nil is simply the null pointer;
true is any non-zero value, and false is zero
(and thus for most practical intents the same as nil).
The problem, of course, is that this is not the way it is in Ruby.
true, false and nil are values distinct from the numbers, and they compare with
each other and with other values in different ways than in C.
They are also objects, which means we lose out on some of the simplest ways of
doing comparisons and turning the comparison results into a value. We may find
people doing things like
if <some expression>.nil?.
nil and false both evaluate to false in a conditional, but
nil != false.
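To make the mismatch concrete, here is a small sketch in plain Ruby (run under a standard interpreter such as MRI, not under the compiler from this series). The helper name truthy? and the TRUTHINESS constant are made up for this illustration.

```ruby
# Plain Ruby semantics that a C-style "zero is false" model gets wrong.
# truthy? is a hypothetical helper for this example only.
def truthy?(value)
  value ? true : false
end

TRUTHINESS = {
  nil_is_truthy:    truthy?(nil),    # nil is falsy
  false_is_truthy:  truthy?(false),  # false is falsy
  zero_is_truthy:   truthy?(0),      # 0 is an ordinary object, so it is truthy
  nil_equals_false: (nil == false),  # nil and false are distinct objects
  nil_is_nil:       nil.nil?,        # nil responds to methods like any object
  false_is_nil:     false.nil?
}

TRUTHINESS.each { |name, value| puts "#{name}: #{value}" }
```

Only nil and false are falsy. Everything else, including 0, takes the "if" branch, which is exactly what the C-style treatment cannot express.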
So far "faking it" has worked because, with a few exceptions like the ones above, the C and Ruby variations are relatively compatible. But it's not a lasting solution.
There is another problem: If we change basic constructs like the s-expression
if to work on Ruby objects, we'll find it hard to implement the "plumbing" under Ruby.
Don't bring out the pickaxe just yet (groan). As it happens, our compiler compiles two very different languages: The s-expression inspired low level language used both as the compilation target for Ruby and implementation language for low level features, and Ruby itself.
The former language is de facto typeless, like BCPL: We pass values around with wild abandon, and we even clobber Ruby local variables and instance variables with it, but what meaning these values have depends entirely on usage rather than the type of the variable (as in C) or a type attached to the value itself, like in Ruby.
And as it happens, here lies both the problem and solution to our conundrum from above:
If only the compiler can know when it is dealing with real Ruby values, and when
it is dealing with something else, then, e.g.
compile_if can generate different
code in these situations.
Not only that: We will need this information when we eventually get tired of leaking memory and start adding a garbage collector - otherwise we're stuck with a conservative collector, so we get twice the benefit.
It will also help us contain the "leakage" of untyped values into Ruby, by letting us define and narrow the rules for when and where and how we're allowed to work with them.
As it happens, we don't need a very complicated type-system: For now we can get away with knowing if a reasonable subset of constructs returns either an object or may contain anything.
That's it. That's the grand total of the static typing we'll introduce this time.
However the changes start laying the groundwork for more static typing that we can use for optimizations and sanity checks. Ultimately I wish to relegate the "s-expression plumbing" to a very restricted space.
Apart from just categorizing stuff into two types, there's another limitation
for now: Where we act on type information, we will treat all variables as typed to
objects, and all return values from method calls as typed to objects. We
will implicitly assume that the s-expression syntax will be contained, though we
are not yet verifying that. In some cases this will be outright wrong. E.g. this
will prevent if foo; bar; end from working correctly if foo is not an
object and happens to contain 0, and likewise in any number of similar instances,
so it is likely introducing some regressions (I caught one while writing this).
First, let's put some basic test cases in place. You'll find them in d22b95f.
Then let's start putting our new typing into place. Let's start with a class
to hold a possibly typed value (in 3ec81cb):
    require 'delegate'

    # Used to hold a possibly-typed value
    # Currently, valid values for "type"
    # are :object or nil.
    class Value < SimpleDelegator
      attr_reader :type

      def initialize ob, type = nil
        super(ob)
        @type = type
      end

      # Evil. Since we explicitly check for Symbol some places
      def is_a?(ob)
        __getobj__.is_a?(ob)
      end
    end
To simplify refactoring, we have it be a delegator, so we only selectively add/change
behaviour as needed. For now, the only new thing is that
#type will return the associated
type tag, or nil. We only support
:object for now, to indicate we know the value to be a
pointer to a Ruby object.
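As a quick illustration of what the delegator buys us, here is a sketch that can be run under a standard Ruby. The class body is copied from above so the example is self-contained. The wrapped values ([:subexpr] and :eax) are just sample payloads picked for the example, not a claim about what the compiler passes around at any given point.

```ruby
require 'delegate'

# Copy of the Value class from the post, repeated so this runs on its own.
class Value < SimpleDelegator
  attr_reader :type

  def initialize(ob, type = nil)
    super(ob)
    @type = type
  end

  # Forward is_a? to the wrapped object so existing Symbol checks keep working.
  def is_a?(ob)
    __getobj__.is_a?(ob)
  end
end

typed   = Value.new([:subexpr], :object) # known to be a Ruby object
untyped = Value.new(:eax)                # :eax is a sample payload; no type tag

puts typed.type.inspect          # the type tag we attached
puts untyped.type.inspect        # no tag, so this is nil
puts typed.first.inspect         # Array#first is delegated to the wrapped array
puts untyped.is_a?(Symbol)       # answered by the wrapped Symbol, not the wrapper
puts(typed == [:subexpr])        # equality is delegated as well
```

Code that only cares about the wrapped value keeps working unchanged, while code that cares about typing can ask for #type.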
I'm not going to go through every detail of the changes in
compiler.rb. You can find the full set
Apart from a number of changes to return objects of the new
Value class, the main things
to notice are as follows:
First, we add true, false and nil to @global_constants to prevent them from being treated
as method calls:
    +    @global_constants << :false
    +    @global_constants << :true
    +    @global_constants << :nil
Next up is this change in compile_if:
    -    @e.jmp_on_false(l_else_arm, res)
    +
    +    if res && res.type == :object
    +      @e.save_result(res)
    +      @e.cmpl(@e.result_value, "nil")
    +      @e.je(l_else_arm)
    +      @e.cmpl(@e.result_value, "false")
    +      @e.je(l_else_arm)
    +    else
    +      @e.jmp_on_false(l_else_arm, res)
    +    end
    +
What's happening here is that instead of assuming an untyped value, we check to see if we know we have an object. If we do, and we come across "if result; ...; else ...; end", we change the code to effectively do the equivalent of:
    if result != nil && result != false
      # if block
    else
      # else block
    end
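To see the two code paths side by side without running the whole compiler, here is a sketch with a stand-in emitter that merely records which calls are made. RecordingEmitter, TypedResult and emit_condition are names invented for this illustration. Only the method names called on the emitter (save_result, cmpl, je, jmp_on_false, result_value) are taken from the diff above, and their real behaviour is not modelled.

```ruby
# Hypothetical stand-in for the compiler's emitter: records calls, emits nothing.
class RecordingEmitter
  attr_reader :log

  def initialize
    @log = []
  end

  # Placeholder for wherever the emitter keeps the last result.
  def result_value
    :result
  end

  def save_result(_res)
    @log << [:save_result]
  end

  def cmpl(a, b)
    @log << [:cmpl, a, b]
  end

  def je(label)
    @log << [:je, label]
  end

  def jmp_on_false(label, _res)
    @log << [:jmp_on_false, label]
  end
end

# Minimal stand-in for Value: all we need here is a #type.
TypedResult = Struct.new(:type)

# Same branching as the diff: object-typed results are compared against
# nil and false; anything else falls back to the old zero test.
def emit_condition(e, res, l_else_arm)
  if res && res.type == :object
    e.save_result(res)
    e.cmpl(e.result_value, "nil")
    e.je(l_else_arm)
    e.cmpl(e.result_value, "false")
    e.je(l_else_arm)
  else
    e.jmp_on_false(l_else_arm, res)
  end
  e.log
end

p emit_condition(RecordingEmitter.new, TypedResult.new(:object), :else_arm)
p emit_condition(RecordingEmitter.new, TypedResult.new(nil), :else_arm)
```

The object-typed case produces two compare-and-jump pairs, while the untyped case still produces the single jmp_on_false we had before.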
There's an equivalent change for
Furthermore, there are a few minor additional changes to
scope.rb to prevent true/false/nil from
being treated as method calls in 52f31ad3.
We also need to check in
transform.rb that we're not trying to treat true, false and nil as
local variables. See f2af5fc.
In order to make these changes work, we also need to modify the runtime in various ways.
Most obviously, we need to actually make true, false and
nil real objects. We do that in the following change:
    +require 'core/true'
    +true = TrueClass.new # FIXME: MRI does not allow creating an object of TrueClass
    +require 'core/false'
    +false = FalseClass.new # FIXME: MRI does not allow creating an object of FalseClass
    +require 'core/nil'
    +nil = NilClass.new # FIXME: MRI does not allow creating an object of NilClass.
    +
     # OK, so perhaps this is a bit ugly...
     self = Object.new
    @@ -59,9 +66,6 @@ STDERR = 1
     STDOUT = IO.new
     ARGV=7
     Enumerable=8 #Here because modules doesn't work yet
    -nil = 0   # FIXME: Should be an object of NilClass
    -true = 1  # FIXME: Should be an object of TrueClass
    -false = 0 # FIXME: Should be an object of FalseClass
These depend on very basic initial implementations of TrueClass, FalseClass and
NilClass - see c356591.
Another change is in
lib/core/fixnum.rb, where all the comparison operators need to change:
       def == other
    -    %s(eq @value (callm other __get_raw))
    +    %s(if (eq @value (callm other __get_raw)) true false)
       end
This is because
%s(eq ..) etc. do not handle typing yet (and they may not necessarily
ever need it), so we use our newly
typed %s(if ..) coupled with explicitly returning the
right objects instead of the numeric values we'd previously get.
It is important to do this in particular because one of the changes I snuck past earlier
assumes that method calls return Ruby objects.
Almost done now, but there's also a minor change to
lib/core/object.rb to remove the horribly
false methods we used previously.
As it happens, we have a few more things to do:
%s(and ..) and
%s(or ...) need to take
type information into account to be able to generate proper code for e.g.:
    if a and b
      ...
    elsif a or c
      ...
    end
In our new world,
a and b (or
a && b) will always be true, because both
have integer values that are non-null. Similarly
a or c /
a || c will always be true as well,
since both values will be seen to evaluate to true.
First of all, I've added a test case to catch this, in 1dfe043. But one of our other test
cases shows a regression as well.
features/inputs/strcmp.rb now gives wrong results, because
we previously relied on being able to use "plain Ruby"
if to check the result of a call to
strcmp that we stored in a local variable. But for now at least, we're assuming variables
contain objects. We'll likely want to refine that, but for now we'll apply a workaround that
will work (in 32ddcde):
%s(assign res (if (strcmp @buffer (callm other __get_raw)) false true))
By explicitly assigning the values false and
true, based on the result of an expression that
will get an indeterminate type, it will work again.
But let's fix "&&"/"and". Firstly we need to actually store the return values from
if_arm and else_arm (in 2ae727d):
    -    compile_eval_arg(scope, if_arm)
    +    ifret = compile_eval_arg(scope, if_arm)
         @e.jmp(l_end_if_arm) if else_arm
         @e.local(l_else_arm)
    -    compile_eval_arg(scope, else_arm) if else_arm
    +    elseret = compile_eval_arg(scope, else_arm) if else_arm
Secondly, we need to determine the type based on them. Most importantly, we can only
safely return a type that is shared by both of them if both are
present (also in 2ae727d):
    -    return Value.new([:subexpr])
    +    # We only return a specific type if there's either only an "if"
    +    # expression, or both the "if" and "else" expressions have the
    +    # same type.
    +    #
    +    type = nil
    +    if ifret && (!elseret || ifret.type == elseret.type)
    +      type = ifret.type
    +    end
    +
    +    return Value.new([:subexpr], type)
       end
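The type-merging rule is small enough to pull out and try in isolation. The following is a sketch under the assumption that all we need from each arm is something responding to #type. ArmResult and merged_type are names made up for this example; in the compiler the logic lives inline in compile_if as shown in the diff.

```ruby
# Stand-in for the Value returned from compiling one arm; only #type matters.
ArmResult = Struct.new(:type)

# Same rule as the diff: keep the "if" arm's type when there is no "else"
# arm, or when both arms agree. Otherwise the result is untyped (nil).
def merged_type(ifret, elseret)
  return nil unless ifret
  return ifret.type if elseret.nil? || ifret.type == elseret.type
  nil
end

object_arm  = ArmResult.new(:object)
untyped_arm = ArmResult.new(nil)

p merged_type(object_arm, nil)          # only an "if" arm
p merged_type(object_arm, object_arm)   # both arms are objects
p merged_type(object_arm, untyped_arm)  # arms disagree
p merged_type(untyped_arm, object_arm)  # arms disagree the other way round
```

Falling back to nil whenever the arms disagree is the conservative choice: the caller then treats the result as "may contain anything" and uses the old untyped code path.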
Other than that, we're simply adding our missing compile_or:
(EDIT: This implementation is broken; a correct version will be in part 41)
    +  def compile_or scope, left, right
    +    compile_if(scope, left, false, right)
    +  end
And that's it for this time.