This is part of a series I started in March 2008 - you may want to go back and look at older parts if you're new to this series.
It's been a few months, and apart from my side step into parser land I haven't had much time to keep posting this series, in part because of work, and part because of repeated illness that after being poked and prodded and x-rayed and being subjected to ultrasound (no, I'm not pregnant, nor is anything about to rupture) and being tapped for what feels like several litres of blood turns out to likely 'only' be infectious mononucleosis. Luckily I've avoided the sometimes lengthy bouts of fatigue, but 4 rounds with several days of fevers and fatigue in just a few months hasn't been entertaining, and if I'm unlucky I might keep getting them occasionally for a few more months. Fun.
Anyway, enough moaning. Last time, I implemented support for arrays, and mentioned local variables as one of the remaining targets:
Lets first figure out how to do it, by resorting to our trusted old method of seeing what gcc does to this:
int foo() { unsigned int i; i = 42; }
This is the abbreviated result of gcc -s:
foo: pushl %ebp movl %esp,%ebp subl $16, %esp movl $42, -4(%ebp) leave ret
Figuring out what's going on shouldn't be too hard:
There's not all that much to it.
Our first step is to introduce a new type of scope. Previously I added the Scope class to keep track of named parameters to a function. Now we have another form: A set of variables tied to a local scope. Why not tie it directly to the function, and just add it to the Scope class? Consider Ruby blocks or the lambda's we've already introduced: Many languages have constructs that introduce new scopes inside a function that may only encompass part of that function. Adding a new class gives us flexibility to introduce new scopes wherever we want:
class LocalVarScope def initialize locals, next_scope @next = next_scope @locals = locals end def get_arg a a = a.to_sym return [:lvar,@locals[a]] if @locals.include?(a) return @next.get_arg(a) if @next return [:addr,a] # Shouldn't get here normally end end
This LocalVarScope class follows the same interface as the Scope class for the #get_arg method which is the only thing we really cares about. It returns a new type :lvar to indicate a local variable, which needs to be given different treatment during code generation. If it can't find a local variable, it tries passing the #get_arg request on to the next scope. The last line is a fallback - at that point we could either make it an error, or do as we've done in the past and just assume it refers to a global variable defined elsewhere. It would only get that far if the LocalVarScope occurs outside a function, which is a major WTF, so perhaps I should make it a variable. That's one for later.
Our next step is to handle :lvar. The following is added to #compile_eval_arg immediately before the return:
@e.load_local_var(aparam) if atype == :lvar
This calls a new function in the emitter to load a local variable into %eax. So lets add the new functions to the emitter:
def local_var(aparam) # FIXME: The +2 instead of +1 is ecause %ecx is pushed onto the stack in "main". It's really only needed there. "-#{PTR_SIZE*(aparam+2)}(%ebp)" end def load_local_var(aparam) movl(local_var(aparam),:eax) end def save_to_local_var(arg,aparam) movl(arg,local_var(aparam)) end def with_local(args) # FIXME: The "+1" is a hack because main contains a pushl %ecx with_stack(args+1) { yield } end
#local_var is a helper to return the correct offset off (%ebp). The FIXME relates to the fact that we save %ecx in the main function, and for now I won't treat the main function differently, which means wasting 4 bytes for every strack fame instead. [Yeah, I know - it'll get fixed]
#with_local is a helper equivalent to #with_stack. It's introduced for two reasons: a) it's possible we'll want to implement them differently on different architectures, and more importantly for now, b) it hides the "hack" mentioned above squarely in the Emitter.
#load_local_var and #save_to_local var does the obvious things - they load and store the local variable from/to a register where we can manipulate the value more easily. Which brings us to a problem...
So far we've done nothing to handle register allocation. We've just assumed "default" registers for various expressions, and it's worked. But it's worked because almost all we've done has been function calls which involve saving stuff to the stack and popping it back off.
However, that's a very inefficient solution, and eventually it needs to be fixed. There are two approaches to solving this:
1) Allocate registers "properly"; and there's a hell of a lot of literature on doing that well - you want to avoid unnecessary memory accesses by keeping the most frequently used variables in memory, but you also need to be careful as it can cause a real mess when it comes to thread synchronization etc.
2) Keep pushing things on the stack, and then rely on a "smart" peephole analyzer to "fix" things by removing stack manipulation and doing the register allocation properly at that stage. Essentially you're not getting away from #1 - to get a high performance compiler, sooner or later you need to worry about proper register allocation.
Unsurprisingly, I've opted for later. Functionality first. So to make sure things continues to work for now we make some small changes to use the stack. Here's a diff of #compile_assign:
def compile_assign scope, left, right source = compile_eval_arg(scope, right) + @e.pushl(source) atype, aparam = get_arg(scope,left) if atype == :indirect + @e.popl(:eax) @e.emit(:movl,source,"(%#{aparam})") + elsif atype == :lvar + @e.popl(:eax) + @e.save_to_local_var(source,aparam) elsif atype == :arg + @e.popl(:eax) @e.save_to_arg(source,aparam) else raise "Expected an argument on left hand side of assignment" end return [:subexpr] end
In addition to the call to the previously defined #save_to_local_var to handle the case where a local variable is the left argument to an assign, we've littered the function with a pushl (push to the stack) and popl (pop something off the stack) to avoid clobbering intermediate results.
"let" is a name with long traditions - having been used in LISP, numerous BASIC's and a variety of modern functional languages. We use "let" as a construct to list a bunch of local variables, and make the expressions contained in a "let" define the scope. Example:
(let (a b c) expr1 expr2 .. expr-n)
This allows us to introduce new scopes wherever we like. Later, when I put a more advanced parser on top, we may end up hiding this functionality and have the parser or an intermediate stage introduce let's where it fits to introduce new scopes.
First we augment #compile_exp:
return compile_let(scope,*exp[1..-1]) if (exp[0] == :let)Simple enough, so let's move on to #compile_let:
def compile_let(scope,varlist,*args) vars = {} varlist.each_with_index {|v,i| vars[v]=i} ls = LocalVarScope.new(vars,scope) @e.with_local(vars.size) { compile_do(ls,*args) } end
As you can see it quite straightforwardly uses the new LocalVarScope class, and it then compiles the expressions enclosed in the let expression.
prog = [:let, [:i], [:assign, :i, [:array, 2]], [:printf, "i=%p\n",:i], [:assign, [:index, :i, 0], 2], [:assign, [:index, :i, 1], 42], [:printf, "i[0]=%ld, i[1]=%ld\n",[:index,:i,0],[:index,:i,1]] ]
As usual the full source code is available for download here.