I recently got a task to implement a simple stack based interpreter in clojure to interview for a clojure developer position. Building an interpreter is a good excercise to learn any programming language. It uses all the tools in a programmer's arsenal and gives you plenty of opportunity to flex your brain muscle. If you can make a working programming language using a particular tool then you can make almost anything your customer or employer desires
There are three main type of Von Neumann Machines:
- Accumulator - The most basic form of processor where only a single register is used to store the results of computation.
- Stack - Stack machines use an operand stack to push and pop results off the top
- Register - Register machiens use a number of named (or numbered) registers to store values or pass arguments.
As far as I can tell most dynamic language interpreters are stack machines. I expect that this is largely due to the fact that they're a little easier to write and you don't have to keep track of your register usage when writing your assembly.
Interestingly both stack and register machines can emulate each other which makes sense since we run them all on physical processors which are usually register machines.
You can think of a stack as something as simple as an
list. When you initialize your machine the stack is empty and each instruction executed by the machine can pop values off the top of the stack and push values onto the stack. Say we start with the simple calculator example and add two numbers:
- Initialize Machine - (
- Use a
pushinstruction to add the number 2 onto the stack -
- Push the number 3 onto the stack -
- Use an
addinstruction to pop two items off the stack and push back the result -
Hopefully that gives you enough basic idea of how a stack machine works.
Here's a few basic terms that we're going to use in the course of this series:
- Stack - A pile of values, usually in the form of an array or list where values can be added and removed from the "top" only.
- Operand - The values that are being added and removed to the stack and operated upon by the instructions.
- Instruction - The smallest unit of behavior in our machine. Instructions can pop operants off the stack, operate on them, and then push results back onto the stack.
- OpCode - Short for operation code. A unique number which corresponds to a specific instruction to allow programs to be serialized more compactly than using their names.
- Program - A list of instructions (usually by opcode) and their operand arguments for the machine to execute.
- Instruction Pointer - The index into the program which keeps track of the currently executing (or next) instruction.
There are few others, that we will cover as we progress.
Want to dig deeper into the code? Checkout Part 2.