Bytecode and the VM - The Guts of My Interpreter
Still working on the clox language from crafting interpreters. Actually executing code now, which is pretty cool
Bytecode is just a sequence of instructions encoded as bytes. It's a middle ground between source code and machine code - more compact and faster to execute than parsing source code repeatedly, but more portable than native machine code. Following Nystrom's book, I implemented a system to store and execute this bytecode for my clox interpreter.
The fundamental structure is what the book calls a "chunk" - basically a dynamic array of bytes:
typedef struct {
int count;
int capacity;
uint8_t* code;
int* lines; // Track source line numbers
ValueArray constants; // Storage for constant values
} Chunk;Each operation in the language is represented by a single-byte opcode. Some operations need additional data - like loading a constant requires knowing which constant to load. So the bytecode for 1 + 2 looks like:
OP_CONSTANT 0 // Push constant from slot 0 (value 1)
OP_CONSTANT 1 // Push constant from slot 1 (value 2)
OP_ADD // Pop two values, add them, push result
OP_RETURN // End execution
The book has an interesting approach for tracking line numbers for error reporting. When a runtime error occurs, you need to tell the user which line of source code caused it. Nystrom's solution, which I implemented, is to store a parallel array of line numbers, one for each byte of code:
writeChunk(chunk, OP_CONSTANT, 123); // Line 123 in source
writeChunk(chunk, constantIndex, 123); // Same lineThe VM itself is surprisingly simple. It's stack-based, meaning operations pop their operands from a stack and push their results back. This is much simpler than a register-based VM:
typedef struct {
Chunk* chunk;
uint8_t* ip; // Instruction pointer
Value stack[STACK_MAX];
Value* stackTop;
} VM;Stack-Based vs Register-Based VMs: A Brief Detour
When I hit this part in the book, I went down a rabbit hole learning about different VM architectures. Turns out there are two main flavors: stack-based (like clox) and register-based (like Lua).
In a stack-based VM:
- Operations implicitly work on the top of the stack
- Instructions are shorter (no need to specify operands)
- Implementation is simpler
- Typically needs more instructions to do the same work
For example, adding two numbers in a stack VM:
LOAD_CONST 0 # Push first number
LOAD_CONST 1 # Push second number
ADD # Pop both, add, push result
In a register-based VM:
- Values live in virtual registers
- Instructions explicitly specify which registers to use
- Instructions are longer but you need fewer of them
- Implementation is more complex but can be more efficient
The same addition in a register VM:
LOAD_CONST r0, 0 # Load first number into register 0
LOAD_CONST r1, 1 # Load second number into register 1
ADD r0, r0, r1 # r0 = r0 + r1
Most actual CPUs use registers, which is why register VMs can be faster—they're closer to how hardware works. But they're harder to implement and the instructions take up more space. The JVM and CLR (.NET) are stack-based, while Lua's VM and WebAssembly are register-based.
I of course am not deviating from the book and will continue with using the stack based compiler, but, I did find this really interesting and might someday try migrating clox to a register based VM.
The core execution loop is just a big switch statement:
static InterpretResult run() {
#define READ_BYTE() (*vm.ip++)
#define READ_CONSTANT() (vm.chunk->constants.values[READ_BYTE()])
#define BINARY_OP(op) \
do { \
double b = pop(); \
double a = pop(); \
push(a op b); \
} while (false)
for (;;) {
uint8_t instruction = READ_BYTE();
switch (instruction) {
case OP_CONSTANT: {
Value constant = READ_CONSTANT();
push(constant);
break;
}
case OP_ADD: BINARY_OP(+); break;
case OP_SUBTRACT: BINARY_OP(-); break;
case OP_MULTIPLY: BINARY_OP(*); break;
case OP_DIVIDE: BINARY_OP(/); break;
case OP_NEGATE: push(-pop()); break;
case OP_RETURN: {
printValue(pop());
printf("\n");
return INTERPRET_OK;
}
}
}
#undef READ_BYTE
#undef READ_CONSTANT
#undef BINARY_OP
}The beauty of macros like BINARY_OP is how they compress repetitive code. With a single macro, addition, subtraction, multiplication, and division can all be implemented. Each just pops two values, performs the operation, and pushes the result.
Debugging was half the battle. Following the book, I wrote a disassembler to convert the raw bytecode back into a human-readable format:
static int disassembleInstruction(Chunk* chunk, int offset) {
printf("%04d ", offset);
if (offset > 0 && chunk->lines[offset] == chunk->lines[offset - 1]) {
printf(" | ");
} else {
printf("%4d ", chunk->lines[offset]);
}
uint8_t instruction = chunk->code[offset];
switch (instruction) {
case OP_CONSTANT:
return constantInstruction("OP_CONSTANT", chunk, offset);
case OP_ADD:
return simpleInstruction("OP_ADD", offset);
// More cases...
}
}This produces output like:
== test chunk ==
0000 1 OP_CONSTANT 0 '1.2'
0002 | OP_CONSTANT 1 '3.4'
0004 | OP_ADD
0005 | OP_NEGATE
0006 2 OP_RETURN
I also added the stack tracing code from the book to see the VM's internal state during execution:
0000 1 OP_CONSTANT 0 '1.2'
[ 1.2 ]
0002 | OP_CONSTANT 1 '3.4'
[ 1.2 ][ 3.4 ]
0004 | OP_ADD
[ 4.6 ]
0005 | OP_NEGATE
[ -4.6 ]
0006 2 OP_RETURN
-4.6
For now, there really isn't a discernible "programming language" yet. It's starting to form within the compiler but ultimately we're working backwards. We created the lexer already which is creating the tokens and we're started building out the VM which ultimately executes the code and we're working back towards the lexer.
That being said, our VM is executing code which is pretty cool. The book has you hand build some chunks in the main.c file which when run, executes the fundamental building block of the clox language and runs.
Chunk chunk;
initChunk(&chunk);
int constant = addConstant(&chunk, 1.2);
writeChunk(&chunk, OP_CONSTANT, 123);
writeChunk(&chunk, constant, 123);
constant = addConstant(&chunk, 3.4);
writeChunk(&chunk, OP_CONSTANT, 123);
writeChunk(&chunk, constant, 123);
writeChunk(&chunk, OP_ADD, 123);
writeChunk(&chunk, OP_NEGATE, 123);
writeChunk(&chunk, OP_RETURN, 124);