Writing a (Ruby) compiler in Ruby bottom up

This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.

Untangling Expect

Next up in my attempt to self-host parts of the parser, is Scanner#expect. This was no fun, and this part again recounts a long string of attempts at addressing relatively small annoying issues that will seem disjoint and probably confusing, but possibly useful in showing some of the debugging approaches.

Diving in

The immediate problem with Scanner#expect is this curious line:


        return Tokens::Keyword.expect(self, str) if str.is_a?(Symbol)

It calls into Tokens::Keyword for no obvious reason. Because when we dig into Tokens::Keyword, we find that Tokens::Keyword.expect simply calls Atom.expect and compares the result with what it got passed in.

So in our case, the only result of going that route, instead of falling through to the rest of Scanner#expect is basically to turn the expected string into a Symbol on returning it.

Now, first of all, we don't need to call Tokens::Keyword.expect for that. I'm not quite sure why I did that. Secondly, the "proper" thing to do here is to just return a String anyway, or specifically a ScannerString, as that includes the position, allowing for better error reporting.

The only place I find that depend on a specific value returned from expect is a line that uses ParserBase#expect with three symbols, which again call Scanner#expect with each in turn (I'm itching to optimize, but it's not for now), and uses that to determine what expression to return (:if, :while, or :rescue nodes). But that piece of code actually calls #to_sym on the result...

So, we're going to clean up Scanner#expect to simply call #to_s on objects that don't respond to #expect, and treat it as we do strings in f190425 :


    
       def expect(str,&block)
         return buf if str == ""
         return str.expect(self,&block) if str.respond_to?(:expect)
    -    return Tokens::Keyword.expect(self, str) if str.is_a?(Symbol)
    +
    +    str = str.to_s
         buf = ScannerString.new
         buf.position = self.position
         str.each_byte do |s|
    @@ -102,6 +103,7 @@ class Scanner
           end
           buf << get
         end
    +
         return buf
       end

And the problems pile up...

Firstly, our number of arguments check does not handle an explicit block argument as optional.

Comment out that for now, and I get "Method missing: Symbol#==" when testing Scanner#expect by calling expect(:rescue).

I've added one in 4292c34 - it's trivial, just comparing the string values This version is inefficient; part of the value of Symbol in Ruby is that it does not need to have the cost of String for comparisions, but it's functionally sufficient for now. A better way of handling it is to ensure Symbol objects are unique per symbol (so we can depend on just a pointer comparison), or at least obtain a unique id for any given string and compare that (which would be closer to how MRI is doing it); we'll revisit that again some day (not soon).

Next up, I stub out String#is_a?, since Scanner#expect depends on it. I just make it return true, which will obviously cause problems later. We don't yet have exactly what we need to implement it properly. See 284d493

Then I ran into a segmentation fault. Some poking reveals several possible causes.

One clear candidate is that String#initialize by default uses a null pointer, but that'd force us to check for null all over the place. And indeed, String#length doesn't. Rather than adding null checks all over, we fix String#initialize in 7b5805b.

Default nil from blocks

We also need to add something to String#reverse so what's returned is at a bare minimum a viable object. The obvious answer here would be to implement it, and we will, but first we should solve the reason this crashes:

As it turns out, when we compile an empty array of expressions, we return junk: Whatever is in %eax. Thankfully this is trivial to fix the most obvious cases of. In Compiler#compile_do we do this (in c359ea7ca) :


    diff --git a/compiler.rb b/compiler.rb
    index 751c34b..dfcd253 100644
    --- a/compiler.rb
    +++ b/compiler.rb
    @@ -649,6 +649,10 @@ class Compiler
     
       # Compiles a do-end block expression.
       def compile_do(scope, *exp)
    +    if exp.length == 0
    +      exp = [:nil]
    +    end
    +
           exp.each { |e| source=compile_eval_arg(scope, e); @e.save_result(source); }
         return Value.new([:subexpr])
       end

String#reverse

Great! Now we fail on NilClass#length due to the empty String#reverse. So lets add a basic version. Not the most inspired, nor most Ruby-ish implementation, but it has the benefit that it works with the current state of the compiler (in 9a363ee):


    diff --git a/lib/core/string.rb b/lib/core/string.rb
    index 0cbba12..d118d91 100644
    --- a/lib/core/string.rb
    +++ b/lib/core/string.rb
    @@ -130,6 +130,16 @@ class String
     
     
       def reverse
    +    buf = ""
    +    l = length
    +    if l == 0
    +      return
    +    end
    +    while (l > 0)
    +      l = l - 1
    +      buf << self[l].chr
    +    end
    +    buf
       end

String#<<

The next error I ran into was this:

WARNING: __send__ bypassing vtable (name not statically known at compile time) not yet implemented.
WARNING:    Method: '&lt;6a5ffde) since we don't have support for method aliases (method aliases are actually quite

straight forward to implement in this case: We 'just' need to update the vtable pointers; the only complication is that since this effectively means re-opening the class, it may have been sub-classed, in which case we need to update the sub-classes vtables as well; we'll deal with this properly once we "have to" deal with class re-opening).

String#count

The scanner uses String.count(character) to count the number of instances of LF (ASCII 10) to determine how to adjust the line number. We don't care about this for now, so I just give it an argument that defaults to nil, and make it return 1 (in f41af6b)

Now my test program runs all the way through, but it doesn't do much.

Adding String#each_byte, and the fallout

The big deal is the missing String#each_byte which the Scanner class uses to check the stream against each byte in the expected string. We fill in a basic version like this (in b7879ee):


    diff --git a/lib/core/string.rb b/lib/core/string.rb
    index b89ae2c..51dba30 100644
    --- a/lib/core/string.rb
    +++ b/lib/core/string.rb
    @@ -82,6 +82,13 @@ class String
       end
    
       def each_byte
    +    i = 0
    +    len = length
    +    while i <  len
    +      yield(self[i])
    +      i = i + 1
    +    end
    +    self
       end

Boom. Back to segmentation faults. The most obvious culprit is yield.

Replacing the yield with __closure__.call(self[i]), using what should eventually be a "hidden" internal variable name for the block argument, we quickly run into a different error, but one that makes it clear that yield doesn't work. This is another unfortunate regression. As it turns out we can now reduce yield to this, though:


      # Yield to the supplied block
      def compile_yield(scope, args, block)
        @e.comment("yield")
        args ||= []
        compile_callm(scope, :__closure__, :call, args, block)
      end

But while this makes yield and __closure__.call() equivalent, it does not make it work with an argument.

Splat handling, again ##

This is one of those little overcomplicated bits that badly needs to be re-thought. Basically, whenever we come across *rest in a method, we do a raw copy of it onto the stack for the call we expect to currently be preparing for.

Except *rest can occur anywhere. And it can be used for things that does not refer to the argument list from the calling function. Except that will fail horribly.

Basically this is a hack that is quickly overstaying its welcome. Particularly because the current solution is error prone. We'll give it this one last chance, before getting rid of it.

There are two glaring problems with the current code: Firstly, it doesn't handle the case where there are no arguments properly. Secondly, I actually confused the parameters provided to the method/function being called, with the arguments provided to the method we're currently in.

This has worked, sort of, as long as the methods have sufficiently similar signatures, which is why I haven't spotted it previously.

I've fixed it like this (in 58c4373):


    diff --git a/splat.rb b/splat.rb
    index 65cf0f9..a7c1b46 100644
    --- a/splat.rb
    +++ b/splat.rb
    @@ -20,7 +20,10 @@ class Compiler
         # (needs proper register allocation)
                         @e.comment("*#{args.last.last.to_s}")
         reg = compile_eval_arg(scope,:numargs)
    -    @e.subl(args.size-1,reg)
    +
    +    # "reg" is set to numargs - (number of non-splat arguments to the *method we're in*)
    +    m = scope.method
    +    @e.subl(m.args.size-1,reg)
         @e.sall(2,reg)
         @e.subl(reg,@e.sp)

The above is the critical bit that takes the correct argument length.

And this changes the loop that copies, so that we jump to the end and do the check against the end condition first to handle the case of zero arguments to copy:


    @@ -32,16 +35,25 @@ class Compiler
           
           @e.with_register do |dest|
             @e.movl(@e.sp,dest)
    -        l = @e.local
    +        lc = @e.get_local
    +        @e.jmp(lc) # So we can handle the zero argument case
    +
    +        ls = @e.local
             @e.load_indirect(reg,@e.scratch)
             @e.save_indirect(@e.scratch,dest)
             @e.addl(4,@e.result)
             @e.addl(4,dest)
    +
    +        @e.local(lc)  # So we can jump straight to condition
             @e.cmpl(reg,argend)
    -        @e.jne(l)
    +        @e.jne(ls)
    +
    +        # At this point, dest points to the position *after* the
    +        # splat arguments. Subtracting the stack pointer, and shifting
    +        # right twice gives us the number of splat arguments
    +
             @e.subl(@e.sp,dest)
             @e.sarl(2,dest)
    -        @e.subl(1,dest)
             @e.movl(dest,@e.scratch)
            @e.comment("*#{args.last.last.to_s} end")

The above also depends on some minor changes to the classes in scope.rb, which now have finally gotten a shared superclass. FuncScope have had a #method method that returns the Function object. The change introduced #method methods in all the scope classes, so that we can get at the method in question to figure out the argument length (in 58c4373 too)

We also need to "fix" compile_call, as we're "abusing it" to implement yield, and it also needs to handle splats (we are long overdue a refactoring of compile_call and compile_callm, so we'll give that a gentle nudge by factoring out a tiny bit of code from it as well).

Here is the change that includes invoking handle_splat (in 8b013ef15a):


    diff --git a/compiler.rb b/compiler.rb
    index d77bf08..f09da08 100644
    --- a/compiler.rb
    +++ b/compiler.rb
    @@ -489,6 +489,15 @@ class Compiler
       end
    
    
    +  # Push arguments onto the stack
    +  def push_args(scope,args, offset = 0)
    +    args.each_with_index do |a, i|
    +      param = compile_eval_arg(scope, a)
    +      @e.save_to_stack(param, i + offset)
    +    end
    +  end
    +
    +
       # Compiles a function call.
       # Takes the current scope, the function to call as well as the arguments
       # to call the function with.
    @@ -504,15 +513,17 @@ class Compiler
    
         args = [args] if !args.is_a?(Array)
         @e.caller_save do
    -      @e.with_stack(args.length, true) do
    -        args.each_with_index do |a, i|
    -          param = compile_eval_arg(scope, a)
    -          @e.save_to_stack(param, i)
    +      handle_splat(scope, args) do |args,splat|
    +        @e.comment("ARGS: #{args.inspect}; #{splat}")
    +        @e.with_stack(args.length, !splat) do
    +          @e.pushl(@e.scratch)
    +          push_args(scope, args,1)
    +          @e.popl(@e.scratch)
    +          @e.call(compile_eval_arg(scope, func))
             end
    -        @e.movl(args.length, :ebx)
    -        @e.call(compile_eval_arg(scope, func))
           end
         end

Yet more is needed to make this work properly, as we did not handle arguments to Proc#call previously.

Fixing Proc

I will repeat the entirety of the "new" Proc here, as it's not very large. The main change is that instead of continuing to expand the __new_proc function, I've followed the approach taken in many of the other classes and added a #__set_raw method, with the intent of eventually preventing access to this method from regular Ruby.

Other than that, the main difference, and the point of changing the class, is to make use of the fixed splat support to allow arguments, and at the same time attack another problem we'd run into shortly: self.

The "old Proc executed the block with self bound to the instance of Proc itself, which does not work very well if the block tries to run methods on the object it was defind in. So now we pass in the value that should be bound to self as well.

Also note the warning at the end of #call. This has a very good reason, which will be addressed in the next section.

You can find this in 226e24f


    class Proc
      def initialize
        @addr = nil
        @env  = nil
        @s    = nil  # self in block
      end
    
      # We do this here rather than allow it to be
      # set in initialize because we want to
      # eventually "seal" access to this method
      # away from regular Ruby code
    
      def __set_raw addr, env, s
        @addr = addr
        @env = env
        @s = s
      end
    
    
      def call *arg
        %s(call @addr (@s 0 @env (splat arg)))
    
        # WARNING: Do not do extra stuff here. If this is a 'proc'/bare block
        # code after the %s(call ...) above will not get executed.
      end
    end
    
    %s(defun __new_proc (addr env self)
    (let (p)
       (assign p (callm Proc new))
       (callm p __set_raw (addr env self))
       p
    ))

There's also a minor change to transform.rb to ensure __new_proc is called with a self value to assign to the Proc object:


    diff --git a/transform.rb b/transform.rb
    index 22f3efb..406ca30 100644
    --- a/transform.rb
    +++ b/transform.rb
    @@ -25,7 +25,7 @@ class Compiler
                 E[:self,:__closure__,:__env__]+args,
                 body]
             ]
    -        e[2] = E[exp.position,:sexp, E[:call, :__new_proc, E[:__tmp_proc, :__env__]]]
    +        e[2] = E[exp.position,:sexp, E[:call, :__new_proc, E[:__tmp_proc, :__env__, :self]]]
           end
         end
       end

We have two minor little bits to add after this: Fixnum#! and String#ord which I've added in a93f95e.

And then it's on to one more larger change before we're done:

Return from blocks

Ruby semantics is that a "return" from a block exits the method that calls the block. That doesn't currently work for us, because a block gets compiled into a C-style function that takes some extra arguments.

A solution is to put the return address into the environment, and treat a return as a goto __env__.__return_address__ of sorts.

Another complication is that when you use lambda to create Proc object, a return inside the block does seemingly not exit the defining method. But another way of seeing it, it that the return does exit Proc#call. This is a useful realisation, since for the time being at least, we've been creating Proc objects for simplicity.

There are a few ways around this difference:

Create a different class for "plain" blocks that can handle the call accordingly.
Don't create a class for "plain" blocks, at all, but pass around a "raw" tuple of the address and environment of some sort.
Change how we create he Proc objects, so that the return address for "proper" Proc objects explicitly instantiated, or created via lambda, gets the following instruction in Proc#call set as its address for explicit returns.

I'm sure there are other options too.

One additional complication: If you return a block as a Proc, and it has a return inside, it will raise a LocalJumpError exception, so ideally we should be able to recognize this case, though we will defer that for later.

We'll opt for storing what we need to return out of the defining scope on creation of the object. We'll also make a block act as the proc keyword (which we don't currently support). Note that there are additional complications: We're meant to nil out missing arguments, and ignore excess arguments rather than do strict argument checking as well. Those will have to wait.

The approach chosen will absent additional work make things go badly haywire if we commit the sin of returning a Proc by returning a bare block / or Proc.new/proc result (which we also doesn't support yet), since we can't yet detect it.

First of all, we will add a new low level operation: :stackframe, which will return the contents of %ebp. By restoring %ebp to that of the defining method, we can do a normal leave then ret sequence, and the stack pointer will be correctly adjusted, and the return address of the next instruction in the containing method will be used. Conversely, for non-proc blocks, a normal return will still be done, resulting in a return to Proc#call, which will again just return out in the calling scope.

Apart from adding it to @@keywords in Compiler, #compile_stackframe consists of just this (in e39ae721f48ed) :


      def compile_stackframe(scope)
        @e.comment("Stack frame")
        Value.new([:reg,:ebp])
      end

Above I've eluded to special handling of :return for :proc's. We do this in a new low level operation called :preturn that you can se below. In a little bit we'll then go through changes to transformations in transform.rb that rewrites :return nodes to :preturn as needed. But first #compile_preturn (still in e39ae721f48ed) :


      # "Special" return for `proc` and bare blocks
      # to exit past Proc#call.
      def compile_preturn(scope, arg = nil)
        @e.comment("preturn")
    
        @e.save_result(compile_eval_arg(scope, arg)) if arg
        @e.pushl(:eax)
    
        # We load the return address pre-saved in __stackframe__ on creation of the proc.
        # __stackframe__ is automatically added to __env__ in `rewrite_let_env`
    
        ret = compile_eval_arg(scope,[:index,:__env__,0])
    
        @e.movl(ret,:ebp)
        @e.popl(:eax)
        @e.leave
        @e.ret
        @e.evict_all
        return Value.new([:subexpr])
      end

__stackframe__ here is hardcoded to offset 0 of __env__ which isn't particularly clean, but it works. So first we reload %ebp from what is effectively __env__[0], then we pop the return value off the stack, and do a normal return. Simple and fast.

Here's our new #compile_proc... At the moment it's a pointless waste as it is pretty much identical to #compile_lambda. It will make more sense later, given the differences in how proc and lambda affect treatment of checking argument count etc. (e39ae721f48ed)


      # To compile `proc`, and anonymous blocks
      # See also #compile_lambda
      def compile_proc(scope, args=nil, body=nil)
        e = @e.get_local
        body ||= []
        args ||= []
    
        r = compile_defun(scope, e, [:self,:__closure__]+args,[:let,[]]+body)
        r
      end

Some minor additional static typing

To make accessing captured variables (variables defined outside of the block, but used inside it) work with our recent typing, we need to define the contents of __env__ apart from the stack frame to be of type :object. To do this reasonably cleanly, we add a tiny #lookup_type method which for now is a bit of a hack, but that allow us to expand the type lookups later (e39ae721f48ed) :


      # For our limited typing we will in some cases need to do proper lookup.
      # For now, we just want to make %s(index __env__ xxx) mostly treated as
      # objects, in order to ensure that variables accesses that gets rewritten
      # to indirect via __env__ gets treated as object. The exception is
      # for now __env__[0] which contains a stackframe pointer used by
      # :preturn.
      def lookup_type(var, index = nil)
        (var == :__env__ && index != 0) ? :object : nil
      end

For now this is only used in #compile_index:


       return Value.new([:indirect, r], lookup_type(arr,index))

Transformations

The above is still not enough to use :preturn and :proc. For starters, in a number of places we now check for both :lambda an :proc nodes.

Additionally, in the event we get a :proc, we call a new method to rewrite the body of the :proc to replace and :return nodes with :preturn:

(e39ae721f48ed)


    +        if e[0] == :proc && body
    +          body = rewrite_proc_return(body)
    +        end

#rewrite_proc_return looks like this:


      def rewrite_proc_return(exp)
        exp.depth_first do |e|
          next :skip if e[0] == :sexp
          if e[0] == :return
            e[0] = :preturn
          end
        end
        exp
      end

And in #rewrite_let_env we add __stackframe__ to the __env__:


          # For "preturn". see Compiler#compile_preturn
          aenv = [:__stackframe__] + env.to_a
          env << :__stackframe__

In addition there are a few minor changes to correctly pick up variable references that were left out.

Finally, in 927f173 I make a minor change to actually generate :proc nodes from bare blocks. I've not yet added support for the proc keyword itself, though that should be trivial.

Some final changes

e535b77 contains a number of test case updates to cover some of the above changes, as well as a bug fix that caused a number of vasted lines special casing on single-argument function/method calls. It also contains some minor refactoring to start reducing the coupling of the shunting yard parser to the higher level Ruby parser (eventually I'd like to extract a useful generic component from it, but the separation is also intended to make the next step of self-hosting easier).

The current state of the test scanner

You can find features/inputs/scanner6.rb in 36e3da5. This is the exact bastardized / cut down version of the scanner that compiles after the above changes. When run it tests #peek, #get, #unget and #expect (for some limited values of "tests" - many things will still not work). Compile and echo a suitable string in:

$ ./compile features/inputs/scanner6.rb
[...]
$ echo "forescue" | /tmp/scanner6
Got: 102 (PEEK)
Got: f (GET)
Got: o (GET2)
rescue
f
o
o
DONE

Parting words

The output from the scanner test code is hardly impressive for the amount of work put in so far, but more and more of the foundation is in place now, and as you can see from this part, apart from some hairy moments dealing with things like proc and yield, an increasing number of the fixes are a matter of small additions to our highly under-developed version of the Ruby standard library rather than gnarly low level compiler code.

As a result I expect progress towards self-hosting to be faster for each coming part.

There's certainly still plenty of gnarly low level compiler code left to do, such as re-opening of classes, proper handling of method-missing and send, as well as attr_accessor/reader/writer, define_method (again...), module, class methods and include. (I'm sure there are many more that I'm repressing too...), but even much of that should be simpler than what we've gone through over the last couple of parts.

The next part too will be fairly messy, but then we'll have soother sailing for a few rounds as I'll be covering more focused features again.

Writing a (Ruby) compiler in Ruby bottom up - step 40 2014-11-05