Writing a (Ruby) compiler in Ruby bottom up - step 37 2014-07-19


This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.

(I will be at the Brighton Ruby Conference on Monday July 20th if you're attending and follow my series or just want to talk about Ruby in general, I'm happy to have a chat)

Default arguments

First order of business today is to get rid of this abomination from last time:


    # FIXME: ScannerString is broken -- need to handle String.new with an argument
    #  - this again depends on handling default arguments
    
        s = ScannerString.new
        r = ch.__get_raw
        s.__set_raw(r)

Essentially we want to be able to do both String.new and String.new "foo", and then inherit String with ScannerString, and use super to pass a String argument straight through to the super class constructory, so we can change the code above to:


        s = ScannerString.new(ch)

This shouldn't be too hard: We pass the argument count to methods, and


        def foo arg1, arg2 = "something"
        end

.. is equivalent to (pseudo code):


       def foo args
          arg1 = args[0]
          if <number of args> == 2
             arg2 = args[1]
          else
            if <number of args> == 1
               arg2 = "something"
            else
              [here we really should raise an exception]
            end
          end
       end

The overheads Ruby imposes for all the nice high level stuff should start to become apparent now...

We don't have support for raise yet, but we can at least output a message for the else case too while we're at it. Of course, the more arguments, the more convoluted the above will get, but it's not hard - we have %s(numargs) to get at the argument count, and we should be able to do this pretty much just with high level constructs.

A first naive approach

My first attempt was something like this:


    def foo arg1, arg2 = "default"
      %s(if (gt numargs 4) (printf "ERROR: wrong number of arguments (%d for 1)\n" (sub numargs 2))
           (if (lt numargs 3) (printf "ERROR: wrong number of arguments (%d for 1)\n" (sub numargs 2))
            (if (le numargs 3) (assign arg2 (__get_string "default"))))
             )
    
      puts arg1
      puts arg2
    end

Spot the mistake? It is slightly tricky. The problem is not in this function itself, but in how the arguments get allocated: Since the caller allocates space on the stack for the arguments, if arg2 is not being passed, then there's no slot for it, so we can't assign anything to it.

One way out of this is to allocate a local variable, and copy either the default value or or the passed in argument to the local variable, and then rewrite all references to arg2 to local_arg2 in the rest of the function.

Another variation is to instead of doing the rewrite, change the Function#get_arg method to allow us to "redirect" requests for the argument afterwards. We can do this similarly to how we reference numargs, by treating them as negative offsets

Parsing

We were getting ahead of ourselves. First lets update the parser. In 7bca9f3 I've added this to parse_arglist. Previously we just ignored the result of the call to the shunting yard parser. Now we keep it, and treat the argument differently if we have a default value:


        default = nil
        if expect("=")
          nolfws
          default = @shunting.parse([","])
        end
    
        if prefix then args = [[name.to_sym, prefix == "*" ? :rest : :block]]
        elsif default
          args = [[name.to_sym, :default, default]]
        else
          args = [name.to_sym]
        end

This change breaks a number of test cases, so there's also a series of updates to features/parser.feature

To make this work, we also need to slightly change transform.rb to actually handle these arrays in the argument list (which brings up the question of whether or not it might not be cleaner to modify the parser to store them uniformly, but lets leave that for later).


       def rewrite_let_env(exp)
         exp.depth_first(:defm) do |e|
    -      args   = Set[*e[2]]
    +      args   = Set[*e[2].collect{|a| a.kind_of?(Array) ? a[0] : a}]

Creating our own local variables

Note that I'm not particularly happy with the approach I'm about to describe. It feels a bit dirty. Not so much the concept, but the implementation. I'd love suggestions for cleanups.

But first, there's a test case in f25101b

Next, we turn our attention to function.rb. Function and Arg objects are instantiated from the data in the parse tree, and we now need to handle the arguments with :default in them.

We start by simply adding a @default in Arg (In 1a09e19 ). You'll note I've also added a @lvar that is not being set immediately, as this value will depend on the function object. But as the comment says, this will be used to let us "switch" this Arg object from referring to refer to a local variable that will effectively (and intentionally) alias the original in the case of defaults.


     class Arg
    -  attr_reader :name, :rest
    +  attr_reader :name, :rest, :default
    +
    +  # The local variable offset for this argument, if it 
    +  # has a default
    +  attr_accessor :lvar
    
       def initialize(name, *modifiers)
         @name = name
         # @rest indicates if we have
         # a variable amount of parameters
         @rest = modifiers.include?(:rest)
    +    @default = modifiers[0] == :default ? modifiers[1] : nil
       end

In Function, we add a @defaultvars that keeps track of how many extra local variable slots we need to allocate. In Function#initialize we change the initalization of the Arg objects, and once we've created it, we assign a local variable slot for this argument.


    +    @defaultvars = 0
         @args = args.collect do |a|
    -      arg = Arg.new(*[a].flatten)
    +      arg = Arg.new(*[a].flatten(1))
    +      if arg.default
    +        arg.lvar = @defaultvars
    +        @defaultvars += 1
    +      end

We then initialize @defaults_assigned to false. This is a new instance variable that will act as a "switch". When it is false, compiling request for an argument with a default will refer to the argument itself. So we're not yet aliasing. At this stage, we need to be careful - if we read the argument without checking numargs, we may be reading random junk from the stack.

Once we switch it to true, we're telling the Function object that we wish to redirect requests for arguments that have defaults to their freshly created local variables. We'll see how this is done when changing Compiler shortly.

Let's now look at get_arg and some new methods, in reverse order of the source, Function#get_arg first:


      def get_arg(a)
        # Previously, we made this a constant for non-variadic
        # functions. The problem is that we need to be able
        # to trigger exceptions for calls with the wrong number
        # of arguments, so we need this always.
        return [:lvar, -1] if a == :numargs

The above is basically a bug fix. It was a short-sighted optimization (I knew there was a reason why I don't like premature optimizations, but every now and again they are just too tempting) which worked great until actually getting the semantics right.

And this is the start of the part I'm not particularly happy with:


        r = get_lvar_arg(a) if @defaults_assigned || a[0] == ?#
        return r if r
        raise "Expected lvar - #{a} / #{args.inspect}" if a[0] == ?#
    
        args.each_with_index do |arg,i|
          return [arg.type, i] if arg.name == a
        end
    
        return @scope.get_arg(a)
      end

Basically, if the @defaults_assigned "switch" has been flipped, or we explicitly request a "fake argument", we call get_lvar_arg to try to look it up. If we find it, great. If not, we check for a normal variable. We feel free to throw an exception for debugging purposes if we're requesting a "fake argument", as they should only ever be created by the compiler itself, and should always exist if they've been created. The "fake argument" here is "#argname" if the arguments name is "argname". I chose to use the "#" prefix as it can't legally occur in an argument name, and so it's safe. It's still ugly, though.

Next a helper to process the argument list and yield the ones with defaults, and "flip the switch" when done:


      def process_defaults
        self.args.each_with_index do |arg,index|
          # FIXME: Should check that there are no "gaps" without defaults?
          if (arg.default)
            yield(arg,index)
          end
        end
        @defaults_assigned = true
      end

Last but not least, we do the lookup of our "fake argument":


      # For arguments with defaults only, return the [:lvar, arg.lvar] value
      def get_lvar_arg(a)
        a = a.to_s[1..-1].to_sym if a[0] == ?#
        args.each_with_index do |arg,i|
          if arg.default && (arg.name == a)
            raise "Expected to have a lvar assigned for #{arg.name}" if !arg.lvar
            return [:lvar, arg.lvar]
          end
        end
        nil
      end

Using our own local variables

in Compiler#output_functions (still in 1a09e19), we add a bit to actually handle the default argument:


    -        @e.func(name, func.rest?, pos, varfreq) do
    +        @e.func(name, pos, varfreq) do
    +
    +          if func.defaultvars > 0
    +            @e.with_stack(func.defaultvars) do 
    +              func.process_defaults do |arg, index|
    +                @e.comment("Default argument for #{arg.name.to_s} at position #{2 + index}")
    +                @e.comment(arg.default.inspect)
    +                compile_if(fscope, [:lt, :numargs, 1 + index],
    +                           [:assign, ("#"+arg.name.to_s).to_sym, arg.default],
    +                           [:assign, ("#"+arg.name.to_s).to_sym, arg.name])
    +              end
    +            end
    +          end
    +
    +          # FIXME: Also check *minium* and *maximum* number of arguments too.
    +

This piece basically allocates a new stack frame for local variables, and then iterates through the arguments with defaults; we then compile code equivalent to if numargs &lt; [offset]; #arg = [default value]; else #arg = arg; end where "arg" is the name of the current argument.

Adding checks for minimum and maximum number of arguments

What should have been utterly trivial, given the above, turned into a couple of annoying debugging sessions, due to bugs that were in fact exposed once I started checking the argument numbers...

But let us start with the actual checks:


    +          minargs = func.minargs
    +
    +          compile_if(fscope, [:lt, :numargs, minargs],
    +                     [:sexp,[:call, :printf, 
    +                             ["ArgumentError: In %s - expected a minimum of %d arguments, got %d\n",
    +                              name, minargs - 2, [:sub, :numargs,2]]], [:div,1,0] ])
    +
    +          if !func.rest?
    +            maxargs = func.maxargs
    +            compile_if(fscope, [:gt, :numargs, maxargs],
    +                       [:sexp,[:call, :printf, 
    +                               ["ArgumentError: In %s - expected a maximum of %d arguments, got %d\n",
    +                                name, maxargs - 2, [:sub, :numargs,2]]],  [:div,1,0] ])
    +          end

These should be reasonably straightforward:

In Function (function.rb), this is #minargs and #maxargs:


    +  def minargs
    +    @args.length - (rest? ? 1 : 0) - @defaultvars
    +  end
    +
    +  def maxargs
    +    rest? ? 99999 : @args.length
    +  end

(We could leave #maxargs undefined for cases where #rest? returns true, but really, we could just as well try to set a reasonable maximum - if it gets ridiculously high, it likely indicates a messed up stack again)

Fixing saving of caller saved registers

One thing that I wasted quite a bit of time with when adding the min/max argument checks was that this triggered a few lingering bugs/limitations of the current register handling.

The main culprit was the [:div, 1, 0] part. As it happens, because this forces use of %edx (if you remember, this is because on i386, idivl explicitly depends on %edx), which is a caller saved register, our lack of proper handling of caller saved registers caused spurious errors.

The problem is that Emitter#with_register is only safe as long as nothing triggers forced evictions of registers within the code it uses. This code is going to need further refactoring to make it safer. For now I've added a few changes to make the current solution more resilient (at the cost of sometimes making it generate even less efficient code, but it should long since have been clear that efficiency is secondary until everything is working).

The most important parts of this change is in Emitter and RegAlloc (in 3997a31)

Firstly, we add a Emitter#caller_save, like this:


      def caller_save
        to_push = @allocator.evict_caller_saved(:will_push)
        to_push.each do |r|
          self.pushl(r)
          @allocator.free!(r)
        end
        yield
        to_push.reverse.each do |r|
          self.popl(r)
          @allocator.alloc!(r)
        end
      end

This depends on an improvement of #evict_caller_saved that returns the registers that needs to be saved on the stack, because they've been allocated "outside" of the variable caching (for cached variables, we can "just" spill them back into their own storage locations, and reload them later, if they are referenced again). We then push them onto the stack, and explicitly free them, yield to let the upper layers do what it wants (such as generate a method call), and pop them back off the stack.

There's likely lots to be gained from cleaning up the entire method call handling once the dust settles and things are a bit closer to semantically complete - we'll get back to that eventually, I promise.

The new version of evict_caller_saved looks like this:


      def evict_caller_saved(will_push = false)
        to_push = will_push ? [] : nil
        @caller_saved.each do |r|
          evict_by_cache(@by_reg[r])
          yield r if block_given?
    
          if @allocated_registers.member?(r)
            if !will_push
              raise "Allocated via with_register when crossing call boundary: #{r.inspect}"
            else
              to_push << r
            end
          end
        end
        return to_push
      end

Basically, we iterate through the list of caller-save reigsters, and if we've indicated we intend to push the values on the stack, we return a list of what needs to be pushed. Otherwise we raise an error, as we risk losing/overwriting an allocated register.

You will see Emitter#caller_save in use in compiler.rb in #compile_call and #compile_callm, where it simply wraps the code that evaluates the argument list and calls themselves.

Other than that, I've made some minor hacky changes to make #compile_assign, #compile_index and #compile_bindex more resilient against these problems.

Splats and numarg revisited

The last, missing little piece is that it surfaced a bug in the handling of numargs with splats. In particular, once I added the min/max checks, the call to a class's #initialize triggered the errors, because this is done with ob.initialize(*args) in lib/core/class.rb, and it didn't set numargs correctly.

I'm not going into this change in detail, as it's a few lines of modifications to splat.rb, compiler.rb and emitter.rb. The significant part of the change is that the splat handling now expects to overwrite %ebx, as it should, and Emitter#with_stack will now set %ebx outright unless it is called after the splat handling, in which case it will add the number of "fixed" arguments to the existing "splat argument count" in %ebx.

(We'll still need to go back and fix the splat handling, as we're still not creating a proper array, and the current support only works for method calls)

Reflections on optimizations

While working on this, it struck me that a future optimization to consider is different entry-points for different variations of the method. While the caller does not really "know" exactly which method will be called, the caller does know how many arguments it has, and so knows that either there is an implementation of a method that can take that number of arguments, or it will trigger an ArgumentError (or should do when we add exceptions).

Thus, if we allocate vtable slots based on the number of provided arguments too, then static/compile-time lookups will fail for method calls where no known possible combination of method name / number of arguments is known at compile time, and it will fall back to the slow path (which we've not yet implemented).

The question of course is how much it would bloa the vtables. However this approach can also reduce the cost of the code, as we can jump past the numargs checks for the version where all arguments are specified etc.

Another approach is to decide on rules for "padding" the argument space allocated, so that the most common cases, of, say 1-2 default arguments can be supported without resorting to extra local variables. This saves some assignments.

Next time

... we'll fix super. Well, add it, as currently it ends up being treated as a regular method call.


blog comments powered by Disqus