Writing a (Ruby) compiler in Ruby bottom up

This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.

Further misadventures on the way to self-hosting

In which we continue the circuitous approach towards parsing s-expressions

As you may or may not remember (it's taking a while, after all), the immediate purpose is to use the limited s-expression part of the parser to ferret out the first batch of problems standing beween us and fully self-hosting the compiler.

For that purpose, I've now put together a tiny test driver and the minimal dependencies of the s-expression parser, and this part we will look at 1) changes to reduce coupling (which means we're cheating by delaying certain types of fixes by moving the troublesome code out of the immediate dependencies of the s-expression parser) and 2) changes to make that code compile, and hopefully even run.

I'll give you a spoiler right now, and tell you that at the end of this there will be at least one more roadblock: Implementing dynamic method dispatch (looking up a method by name for #send).

At the time of writing I don't know if that will be the only outstanding issue (probably not), but it's most likely the biggest of the remaining issues to get the s-expression parser to work by itself, since call parser methods by name.

(Whether or not it's a good idea to do that is another matter, but for now it'll stay since we need to cover that bit anyway).

Improving 'require' and the driver

An immediate problem presented itself when trying to compile code that require'd other code expecting a different load path. I'm not quite reading to start dealing with the pain (in terms of ahead-of-time, semi-statically) compiling code that dynamically messes with $:, but adding options to set the load path, and improving/fixing some of the path issues with the current implementation of require, as well as improving the option passing in the driver code seems like a good idea.

Time to start cleaning up the driver for the compiler ever so slightly. In 91aa84d I started by making the driver bail out without trying to assemble if the compilation fails, as that's been driving me crazy for a while.

In 4b62af4 I added slightly better load/include path handling. In driver.rb, I added support for "-I" to indicate the initial include path (we do not support $: etc. yet). That is supported in parser.rb by quite straight forward code to mangle the paths and identify the actual files as needed, and to ensure the core library still loads.

Improving nested symbol resolution

What was revealed once I added the above, was that I took too many shortcuts for resolving Foo::Bar back in part 41:


    ./compiler.rb:199:in `compile_deref': Unable to resolve: Scanner::ScannerString statically (FIXME) (RuntimeError)
    from ./compiler.rb:540:in `compile_exp'
    from ./compiler.rb:114:in `get_arg'
    from ./compiler.rb:370:in `compile_eval_arg'

We're going to attack this first by improving the debugging of symbol table content.

In a8b23a9 I added the --dumpsymtab option, which makes use of new #dump methods added to Scope with specializations for GlobalScope and ClassScope in debugscope.rb, which recursively processes the symbol table.

What that that revealed was the result of the ugly shortcut of conflating the low level output of assembler level symbols with Ruby-level constant nesting, as a result of hacking our way through this without a detailed up front design.

Thankfully that's not all that hard to fix at this stage - we already did the most important work with the introduction of build_class_scopes in part 41, which ensures that the right information is available - that's what made the --dumpsymtabs option easy to provide.

All we need to do is to actually look up what we should be looking up: the name of the inner constant, instead of the synthetic assembly level name, so that it is found in the scope chain instead of ending up being forwarded all the way up to GlobalScope (which still has the ugly synthetic variable name - that needs to go eventually, and not be exposed to Ruby).

In ClassScope (in e4eafcc) :


       def find_constant(c)
         const = @constants[c]
         return const if const
    -    return @next.find_constant(@name+"__"+c.to_s) if @next
    +    return @next.find_constant(c) if @next
         return nil
       end

While doing a global looking in compiler.rb we'll actually look up these ugly compound names for now, and then add them to the list (e4eafcc) :


    +      prefix = scope.name
    +      aparam = prefix + "__" + aparam.to_s if !prefix.empty?
           @global_constants << aparam

Handling "include"

For now we're just stubbing it out, together with protected. See f0dfb45 which basically just adds them without doing anything of substance...

Handling splat => array conversion

Now to the big one for this round. The remaining other issues I dealt with for this part all fell out of addressing the splat handling.

This is one of the long-standing tensions between efficiency and getting required Ruby functionality.

This does what you'd expect with MRI, of course:


    def foo *args
      puts "Length:"
      puts args.length
      puts "Members:"
      puts args[0]
      puts args[1]
      puts args[2]
    end
    
    foo(1,2,3)

But on entering foo at the moment, args is a c-style array. It works if we dip into the s-expression syntax:


    def foo *args
      puts "Length:"
      %s(callm self puts ((__get_fixnum numargs)))
      puts "Members:"
      %s(callm self puts ((index args 0)))
      %s(callm self puts ((index args 1)))
      %s(callm self puts ((index args 2)))
    end
    
    foo(1,2,3)

(never mind that it outputs a length of 5 - that reflects self and space for a pointer to a block).

To handle this, we need to add support for turning args into an actual Ruby array.

There's a wrinkle: So far we've used this as a shortcut for quickly pushing members onto the call stack if we subsequently turn around and call another method with it. Either we need to continue supporting that (it will be sufficient if the second example above still works, but there are other alternatives)

One solution is to allocate arguments in a Ruby array, but that has terrible overhead, and complicates interoperability with C. Another is to allocate a Ruby Array in the method, optionally using some method to determine if the array is likely to be used.

For now, I have chosen to add code in the method to convert splat arguments to a real Ruby array, and to take the pain of rewriting code that used index/numargs on splat arguments.

A problem which instantly materialize is that calling Class#new can not directly or indirectly require calls to Class#new in order to complete. If it does, we get infinite recursion, obviously (or rather, recursion until we run out of stack).

Unfortunately this is tricky: Class#new currently uses a splat argument, which in the most obvious naive implementation of turning splat arguments into a real Ruby Array itself triggers a Class#new call (in the form of Array.new).

Array#initialize further assigns Fixnum'a to instance variables, and this too triggers Class#new to instantiate said Fixnum's.

We either needs to shortcircuit this in the case of an empty splat array, and add a way of constructing an empty Array, or find some other way of short-circuiting this recursion.

(We're going to deal with the fallout of figuring out a more reasonable ordering of the bootstrapping of the core classes for some time to come).

Let's look at the code

You'll find this entire change in d9ac43d75

I'm not going to go over every tiny little bit, but let's examine the most important pieces.

First of all, this is a bit on the sidelines, but a major reason to complete the splat support was to handle literal arrays. There's now a new method Compiler#compile_array to handle that simply by turning them into Array[] calls:


    +  # Compile a literal Array initalization
    +  #
    +  # FIXME: An alternative is another "transform" step
    +  #
    +  def compile_array(scope, *initializers)
    +    compile_eval_arg(scope,
    +                     [:sexp,
    +                      [:callm, :Array, :[], initializers]
    +                      ])
    +    return Value.new([:subexpr], :object)
    +  end

Next, in transform.rb the work starts by modifying methods with splat arguments. First we rename the argument to __splat (at some point I'll probably change this to hide them from the Ruby code entirely):


        +      rest = e[2][-1] rescue nil
        +      rest = (rest[-1] == :rest ? rest[0] : nil) rescue nil
        +      if rest
        +        e[2][-1][0] = :__splat
        +      end
        +

Then we introduce a new local variable (hence the location of this transformation rule in rewrite_let_env) with the original name of the argument, and assign the result of a call to __splay_to_Array_ to it:


    -      e[3] = E[e.position,:let, vars,*e[3]]
    +      if rest
    +        vars << rest.to_sym
    +        rest_func =
    +          [:sexp,
    +           [:assign, rest.to_sym, [:__splat_to_Array, :__splat, :numargs]]
    +          ]
    +      else
    +        rest_func = []
    +      end
    +
    +      e[3] = E[e.position,:let, vars, rest_func,*e[3]]

One immediate thing is worth observing: We could certainly defer this call as long as possible, to hope to avoid the conversion in e.g. cases of early return.

At some point that will be worth exploring, but note also that this could/should fall out automatically if we later add suitable optimization passes. For now it's by far easiest to just put it at the beginning of the method.

This of course means we need to implement __splat_to_Array:


    +%s(defun __splat_to_Array (r na)
    +   (let (splat pos data max)
    +    (assign splat (callm Array __new))
    +    (assign pos 0)
    +    (assign max (sub na 2))
    +    (callm splat __grow (max))
    +    (while (lt pos max)
    +       (do
    +          (callm splat __set (pos (index r pos)))
    +          (assign pos (add pos 1))
    +          )
    +        )
    +  splat
    +  ))
    +

This is roughly equivalent to (pseudo-Ruby):


    def __splat_to_Array(r na) # na is "numargs"; r is the c-style splat array.
      splat = Array.__new
      pos = 0
      max = numargs - 2 # Subtract two for "self" and a &block argument we always allocate space for
      splat.__grow(max)
      while pos < max
        splat.__set(pos, r[pos])
        pos += 1
      end
    end

There are a few odd things here. For starters, we do this using the s-expression syntax because nearly "everything" otherwise would potentially trigger attempts to allocate objects, and Class#new uses splats. This also means we can't use Class#new, as discussed above.

So I've introduced Class#__new as a lower level method that you should never-ever use in Ruby code (and which we'll find some way of hiding later). I also had to modify Array extensively to avoid even using Fixnum's, since we're for the time being at least using objects rather than type-tagged integers.

The rest pretty much falls out of that. Class#__new and Class#__alloc (used by Class#__new) looks like this:


    +  # Clients that need to be able to allocate a completely clean-slate empty
    +  # object, should call <tt>__alloc</tt>.
    +  #
    +  def __alloc
    +    %s(assign ob (__array @instance_size))
    +    %s(assign (index ob 0) self)
    +    ob
    +  end
     
     +  # Clients that want to be able to create and initialize a basic version of
     +  # an object without normal initializtion should call <tt>__new</tt>. See
     +  # <tt>__splat_to_array</tt>
     +  #
     +  def __new
     +    ob = __alloc
     +    ob.__initialize
     +    ob
     +  end

These should be pretty self-explanatory. Note that the new Class#new calls __alloc, but not __initialize.

A very basic initial version of Array is now first introduced in lib/core/core.rb. It needs to be introduced that early for bootstrapping reasons, since it needs to exist when Class#new is first called. This showcases __initialize:


    +class Array
    +  # __get_fixum should technically be safe to call, but lets not tempt fate
    +  # NOTE: The order of these is important, as it is relied on elsewhere
    +  def __initialize
    +    %s(assign @len 0)
    +    %s(assign @ptr 0)
    +    %s(assign @capacity 0)
    +  end
    +
    +  #FIXME: Private; Used by splat handling
    +  def __len
    +    @len
    +  end
    +
    +  def __ptr
    +    @ptr
    +  end

And here's __grow that's used internally to ensure the Array is big enough. Normally it should not be called directly, but we make an exception for the splat code:


    +  def __grow newlen
    +    # FIXME: This is just a guestimate of a reasonable rule for
    +    # growing. Too rapid growth and it wastes memory; to slow and
    +    # it is, well, slow to append to.
    +
    +    # FIXME: This called __get_fixnum, which means it fails when called
    +    # from __new_empty. May want to create new method to handle the whol
    +    # basic nasty splat allocation
    +    # @capacity = (newlen * 4 / 3) + 4
    +    %s(assign @capacity (add (div (mul newlen 4) 3) 4))
    +
    +    %s(if (ne @ptr 0)
    +         (assign @ptr (realloc @ptr (mul @capacity 4)))
    +         (assign @ptr (malloc (mul @capacity 4)))
    +         )
    +  end

And __set is a low level method to assign to the array and extend, that makes assumptions we can't safely make in general (that is, it expects an expansion by at most one element at a time. Frankly rewriting __splat_to_Array to set length once and omit it here would probably be worth it, but some other time:


    +  #FIXME: Private. Assumes idx < @len && idx >= 0
    +  def __set(idx, obj)
    +    %s(if (ge idx @len) (assign @len (add idx 1)))
    +    %s(assign (index @ptr idx) obj)
    +  end
    +
    +end
    +

Most of the rest of the changes, apart from compile_calls.rb, are changes to the library code (particularly Array to accommodate the splat handling changes and/or take advantage of them). compile_calls.rb, though has seen a huge rewrite. Rather than go through the diff, it's easier to walk through the new version.

Both method and our "c-level" function calls now use the same method-chanin to push arguments onto the stack. It starts with this:


      def compile_args(scope, ob, args, &block)
        @e.caller_save do
          splat = args.detect {|a| a.is_a?(Array) && a.first == :splat }
    
          if !splat
            compile_args_nosplat(scope,ob,args, &block)
          else
            compile_args_splat(scope,ob,args, &block)
          end
        end
      end

The "nosplat" case is simple, and pretty much what we used to do:


      def compile_args_nosplat(scope, ob, args)
        @e.with_stack(args.length, true) do
          @e.pushl(@e.scratch)
          args.each_with_index do |a, i|
            param = compile_eval_arg(scope, a)
            @e.save_to_stack(param, i + 1)
          end
          @e.popl(@e.scratch)
          yield
        end
      end

The tricky bit is the splat handling, which I'm by no means happy with, and it will probably need significant simplification:


     def compile_args_splat(scope, ob, args)
        # Because Ruby evaluation order is left to right,
        # we need to first figure out how much space we need on
        # the stack.
        #
        # We do that by first building up an expression that
        # adds up the static elements of the parameter list
        # and the result of retrieving 'Array#length' from
        # each splatted array.
        #
        # (FIXME: Note that we're not actually type-checking
        # what is *actually* passed)
        #
        num_fixed = 0
        exprlist = []
        args.each_with_index do |a, i|
          if a.is_a?(Array) && a[0] == :splat
            # We do this, rather than Array#length, because the class may not
            # have been created yet. This *requires* Array's @len ivar to be
            # in the first ivar;
            # FIXME: should enforce this.
            exprlist << [:index, a[1], 1]
          else
            num_fixed += 1
          end
        end
        expr = num_fixed
        while e = exprlist.pop
          expr = [:add, e, expr]
        end
    
        @e.comment("BEGIN Calculating argument count for splat")
        ret = compile_eval_arg(scope, expr)
        @e.movl(@e.result, @e.scratch)
       @e.comment("END Calculating argument count for splat; numargs is now in #{@e.scratch.to_s}")

The comments in the part above hopefully speaks for itself. For e.g. foo(*a, b, *c) it basically does the equivalent of a.length + c.length + n where n is the number of non-splat arguments.

Next we save the count we found, and then allocate space on the stack:


        @e.comment("Moving stack pointer to start of argument array:")
        @e.pushl(@e.scratch) # We assume this may get borked during argument evaluation
        @e.imull(4,@e.result)
    
        # esp now points to the start of the arguments; ebx holds numargs,
        # and end_of_arguments(%esp) also holds numargs
        @e.subl(@e.result, :esp)

Then we start at the bottom of the allocated stack segment, and evaluate arguments and save the result, working our ways up the stack, except if we find a splat argument, we call compile_args_copysplat to take care of that:


        @e.comment("BEGIN Pushing arguments:")
        @e.with_register do |indir|
          # We'll use indir to put arguments onto the stack without clobbering esp:
          @e.movl(:esp, indir)
          @e.pushl(@e.scratch)
          @e.comment("BEGIN args.each do |a|")
          args.each do |a|
            @e.comment(a.inspect)
            if a.is_a?(Array) && a[0] == :splat
              compile_args_copysplat(scope, a, indir)
            else
              param = compile_eval_arg(scope, a)
              @e.save_indirect(param, indir)
              @e.addl(4,indir)
            end
          end
          @e.comment("END args.each")
          @e.popl(@e.scratch)
        end
        @e.comment("END Pushing arguments")

Then we yield to handle the actual call for various types of calls, and then adjust the stack:


        yield
        @e.comment("Re-adjusting stack post-call:")
        @e.imull(4,@e.scratch)
        @e.addl(@e.scratch, :esp)
      end

That leaves compile_args_copysplat:


      # For a splat argument, push it onto the stack,
      # forwards relative to register "indir".
      #
      # FIXME: This method is almost certainly much
      # less efficient than it could be.
      #
      def compile_args_copysplat(scope, a, indir)
        @e.comment("SPLAT")
        @e.with_register do |splatcnt|
          param = compile_eval_arg(scope, a[1])
          @e.addl(4,param)
          @e.load_indirect(param, splatcnt)
          @e.addl(4,param)
          @e.load_indirect(param, :eax)
          @e.testl(:eax,:eax)
          l = @e.get_local

Notice the ugliness that comes next: We need to sort-of the case where the Array class itself simply hasn't been created yet at all (which will only "work" if the splat arguments in question are not actually used):


          # If Array class ptr has not been allocated yet:
          @e.je(l)

I really hope there's a nicer/cleaner/faster way of doing this, but what's driving me crazy below is the addressing modes for x86. For m68k, you can get away with some indirect gymnastics which makes stuff like this far simpler:


          @e.loop do |br|
            @e.testl(splatcnt, splatcnt)
            @e.je(br)
            # x86 will be the death of me.
            @e.pushl("(%eax)")
            @e.popl("(%#{indir.to_s})")
            @e.addl(4,:eax)
            @e.addl(4,indir)
            @e.subl(1,splatcnt)
          end
    
          @e.local(l)
        end
      end

I'm sure there are better ways of doing it for x86 too (and certainly for x64; one day we'll get around to doing a 64-bit backend).

Parting words

And that's it. This brings me to the point where the main obvious hindrance (I'm sure others will pop up) is the lack of proper support for send, and so that is what I'll cover in part 45.

Writing a (Ruby) compiler in Ruby bottom up - step 44 2015-03-17