I've spent the last couple of weeks hand-compiling Scheme programs, working out the details of the VM for the next release. I've ended up with a VM strongly reminiscent of the CEK abstract machine.
While this is a work in progress, it'll do me good to write it up here.
Registers
call-with-current-continuation.
One advantage of keeping the activation chain separate from the process callstack is having Windows system calls interoperate nicely with reentrant continuations. In Pocket Scheme 1.x, FFI wndprocs and the interpreter share a common stack, which means that any windows running in the Scheme thread (i.e. any windows with a wndproc written in Scheme) must take care never to reenter themselves accidentally.
Instructions
A frame is a multipurpose structure, responsible for saving the values of local variables, saving temporary computation results, and finally passing a set of parameters to a procedure. A frame starts life with a FRAME instruction, and has its elements set with FRAMESET instructions. It may stay in FRAME for a while, possibly as the tail of a chain of frames created by subsequent FRAME operations, but eventually it will move into the ENV chain via PUSHENV, CALL, or TAILCALL. A frame ends its life in one of two ways: via RETURN or POPENV, both of which immediately recycle the frame for subsequent FRAME instructions, or via a CALL or TAILCALL that makes it unreachable.
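To make that lifecycle concrete, here is a rough sketch of a frame moving from FRAME to ENV and back out again. It is a toy model under my own assumptions, not the real VM: the `Machine` class, the `op_*` method names, and the exact way PUSHENV relinks the chains are all hypothetical.

```python
class Frame:
    """Hypothetical frame: a fixed set of slots plus a link to the next frame."""
    def __init__(self, size, next_frame):
        self.slots = [None] * size
        self.next = next_frame


class Machine:
    """Toy machine with only the two registers this example needs."""
    def __init__(self):
        self.frame = None  # FRAME register: frames still being filled in
        self.env = None    # ENV register: frames holding live bindings

    def op_frame(self, size):
        # FRAME: start a new frame; the previous one becomes its tail.
        self.frame = Frame(size, self.frame)

    def op_frameset(self, index, value):
        # FRAMESET: store a value into the frame under construction.
        self.frame.slots[index] = value

    def op_pushenv(self):
        # PUSHENV (assumed behavior): unlink the head of the FRAME chain
        # and link it onto the head of the ENV chain.
        frame = self.frame
        self.frame = frame.next
        frame.next = self.env
        self.env = frame

    def op_popenv(self):
        # POPENV: drop the head of the ENV chain and hand it back,
        # so the caller can decide whether to recycle it.
        frame = self.env
        self.env = frame.next
        return frame
```

A `let` with two bindings would then look something like `op_frame(2)`, two `op_frameset` calls, `op_pushenv()`, the body, and finally `op_popenv()`.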
A frame is recycled by returning it to a pool from which subsequent FRAME instructions will reallocate it.
Operations that break the stack access pattern of a frame will mark it as nonrecyclable,
in which case the frame does not return to the pool, but instead is left for eventual GC.
The CLOSURE instruction will mark every frame in the current ENV as nonrecyclable.
Likewise, the procedure call-with-current-continuation
will mark every frame in both ENV and CONT as nonrecyclable,
so that those frames last for the life of the reified continuation.
Because a CLOSURE operation defeats recycling in this way, the compiler attempts to elide it wherever possible
(e.g. tail-calls to procedures defined via let and letrec).
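The recycling rule can be sketched in a few lines. This is only an illustration under my own assumptions: the names `FramePool`, `release`, and `mark_nonrecyclable` are hypothetical, and keying the pool by frame size is a guess rather than a description of the real allocator.

```python
class Frame:
    """Hypothetical frame with a flag recording whether it may be recycled."""
    def __init__(self, size, next_frame=None):
        self.slots = [None] * size
        self.next = next_frame
        self.recyclable = True


class FramePool:
    """Free list of idle frames, keyed by size (an assumption of this sketch)."""
    def __init__(self):
        self.free = {}      # size -> list of idle frames
        self.allocated = 0  # number of frames built from scratch

    def alloc(self, size, next_frame=None):
        # What a FRAME instruction would do: reuse an idle frame if one exists.
        bucket = self.free.get(size)
        if bucket:
            frame = bucket.pop()
            for i in range(size):
                frame.slots[i] = None
            frame.next = next_frame
            return frame
        self.allocated += 1
        return Frame(size, next_frame)

    def release(self, frame):
        # What RETURN or POPENV would do with the frame it drops.
        if not frame.recyclable:
            return False  # left alone for the garbage collector
        frame.next = None
        self.free.setdefault(len(frame.slots), []).append(frame)
        return True


def mark_nonrecyclable(chain):
    # What CLOSURE would do to ENV, and call-with-current-continuation
    # to both ENV and CONT: pin every frame on the chain.
    while chain is not None:
        chain.recyclable = False
        chain = chain.next
```

The point of the flag is that the check on release stays cheap. The cost is paid once, when the closure or continuation is created, by walking the chain.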
apply and let, but not letrec.
Issue: if VAR-ARITY has extended the environment, RETURN should recycle two frames.
let;
otherwise, it is a Scheme letrec.
I always write JUMP with a symbolic offset, just as I do JUMPFALSE and CLOSURE. A LABEL pseudo-instruction gives a name to the target.
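Turning the symbolic form into numeric offsets is a standard two-pass job. The sketch below is one way it might be done, not the actual assembler: the tuple encoding of instructions is hypothetical, and so is the assumption that an offset is measured from the instruction following the jump.

```python
# Instructions whose first operand is a label in the hand-written form.
BRANCHING = {"JUMP", "JUMPFALSE", "CLOSURE"}


def resolve_labels(program):
    """Replace symbolic targets with relative offsets and drop LABEL entries."""
    # Pass 1: record the index each label names. LABEL is a
    # pseudo-instruction, so it emits nothing itself.
    targets = {}
    emitted = []
    for instr in program:
        if instr[0] == "LABEL":
            targets[instr[1]] = len(emitted)
        else:
            emitted.append(instr)

    # Pass 2: rewrite each branching instruction's label as an offset
    # relative to the next instruction (an assumption of this sketch).
    resolved = []
    for index, instr in enumerate(emitted):
        op = instr[0]
        if op in BRANCHING:
            offset = targets[instr[1]] - (index + 1)
            resolved.append((op, offset) + tuple(instr[2:]))
        else:
            resolved.append(tuple(instr))
    return resolved


# A hand-compiled (if #f 1 2), using the made-up tuple encoding.
program = [
    ("LITERAL", False),
    ("JUMPFALSE", "else"),
    ("LITERAL", 1),
    ("JUMP", "done"),
    ("LABEL", "else"),
    ("LITERAL", 2),
    ("LABEL", "done"),
    ("RETURN",),
]
```

A reference to a label that was never defined raises a `KeyError` in the second pass, which is about as much error handling as a hand-compilation aid needs.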
Closing over the environment marks every frame in the current environment as nonrecyclable.
I always write LITERAL with the explicit datum, just as I do GLOBDEF, GLOBSET, and GLOBREF.
The intent is for all -ARITY instructions eventually to emit procedure metadata that is interpreted by CALL
instead of performing the check inline.
This would allow a better error message in the case of a mismatch in a tail call,
would allow a direct jump to the callsite for a case-lambda,
and, for calls to statically determined sites,
would allow arity and type checks to be hoisted higher in the calltree.
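Since none of this exists yet, the following is only a guess at what metadata-driven checking might look like. `Clause`, `Procedure`, and `select_entry` are all hypothetical names, and representing an entry point as a label string is a simplification.

```python
class ArityError(Exception):
    """Raised when no clause of a procedure accepts the argument count."""


class Clause:
    """Arity metadata for one entry point (one clause of a case-lambda)."""
    def __init__(self, required, variadic, entry):
        self.required = required  # number of required parameters
        self.variadic = variadic  # True if a rest parameter is present
        self.entry = entry        # where to jump, here just a label name

    def accepts(self, argc):
        if self.variadic:
            return argc >= self.required
        return argc == self.required


class Procedure:
    def __init__(self, name, clauses):
        self.name = name
        self.clauses = clauses


def select_entry(proc, argc):
    # What CALL might do before transferring control: consult the
    # metadata, pick the first clause that fits, and report a mismatch
    # while the procedure's name is still at hand.
    for clause in proc.clauses:
        if clause.accepts(argc):
            return clause.entry
    raise ArityError(f"{proc.name}: no clause accepts {argc} argument(s)")
```

An ordinary lambda would carry a single clause and a case-lambda several. Because the check happens before the jump, the error can still name the procedure even when the call is a tail call.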