Building a stack-based virtual machine, part 3 - instructions

#rust #language #theory #computerscience

In part 2 we build a stack that we can use to store our operands. That's pretty cool. The next step for us is to define instructions that allow us to perform operations on the operands in the stack.

In case you missed the definitions in part 1 let's recap what we need to make instructions.

A name - Technically this is optional, but I'll be damned if I'm going to use a machine where I have to remember the opcodes of our instructions.
An opcode - A unique number for this instruction. It's important that this doesn't change between VM versions otherwise we won't be able to run old code.
Arity - The number of arguments this instruction takes - that's different to the number of operands it pops from the stack. Think about the instruction push 2; where does the 2 come from? It is supplied as an argument. Most instructions will take 0 arguments but sometimes you need to give an instruction more to go on than simply the contents of the stack.
Behaviour - The action that our instruction will perform. In a CPU this is made using transistors but in our case we'll use Rust functions.

With all that in mind, let's build our instruction type. Remember that because our operands are generic we need our instructions to be generic too.

pub struct Instruction<T> {
  pub op_code: usize,
  pub name: String,
  pub arity: usize,
  pub fun: InstructionFn<T>
}

pub type InstructionFn<T> = fn(machine: &mut Machine<T>, args: &[usize]);

impl<T> Instruction<T> {

    pub fn new(op_code: usize, name: &str, arity: usize, fun: InstructionFn<T>) -> Instruction<T> {
        Instruction {
            op_code: op_code,
            name: String::from(name),
            arity: arity,
            fun: fun
        }

    }
}

Don't worry about the Machine type. We'll get to that in a later episode.

Something important to note here is that the instructions arguments aren't of type T here - they're indexes into the program's data section. We'll get on to that later too.

Because the Instruction type is not much more than a handy container the tests are correspondingly simple:

#[cfg(test)]
mod test {
    use super::*;

    struct Operand(i64);

    fn noop(_machine: &mut Machine<Operand>, _args: &[usize]) {}

    #[test]
    fn new() {
        let operand = Instruction::new(13, "noop", 7, noop);
        assert_eq!(operand.op_code, 13);
        assert_eq!(operand.name, "noop".to_string());
        assert_eq!(operand.arity, 7);
    }
}

Next we need an instruction table - a structure which we can store instructions in and look them up by opcode and name. We're going to use a HashMap.

pub struct InstructionTable<T>(HashMap<usize, Instruction<T>>);

impl<T> InstructionTable<T> {
    pub fn new() -> InstructionTable<T> {
        InstructionTable(HashMap::new())
    }

    pub fn by_op_code(&self, op_code: usize) -> Option<&Instruction<T>> {
        self.0.get(&op_code)
    }

    pub fn by_name(&self, name: &str) -> Option<&Instruction<T>> {
        self.0
            .values()
            .find(|ref instr| instr.name == name)
    }

    pub fn insert(&mut self, instr: Instruction<T>) {
        self.0.insert(instr.op_code, instr);
    }

    pub fn is_empty(&self) -> bool {
        self.0.is_empty()
    }
}

That wasn't so hard. Now let's write some tests:

#[cfg(test)]
mod test {
    use super::*;
    use machine::Machine;

    fn noop(_machine: &mut Machine<usize>, _args: &[usize]) {}

    #[test]
    fn new() {
        let table: InstructionTable<usize> = InstructionTable::new();
        assert!(table.is_empty())
    }

    #[test]
    fn insert() {
        let mut table: InstructionTable<usize> = InstructionTable::new();
        assert!(table.is_empty());
        table.insert(Instruction::new(0, "NOOP", 0, noop));
        assert!(!table.is_empty());
    }

    #[test]
    fn by_op_code() {
        let mut table: InstructionTable<usize> = InstructionTable::new();
        table.insert(Instruction::new(0, "NOOP", 0, noop));
        let instr = table.by_op_code(0).unwrap();
        assert_eq!(instr.name, "NOOP");
    }

    #[test]
    fn by_name() {
        let mut table: InstructionTable<usize> = InstructionTable::new();
        table.insert(Instruction::new(0, "NOOP", 0, noop));
        let instr = table.by_name("NOOP").unwrap();
        assert_eq!(instr.op_code, 0);
    }
}

We've come to the end of today's journey. We now have an operand stack, and a way to collect and use our instructions. In our next episode we're going to create the necessary structures to represent a program that our machine can execute.

DEV Community

Building a stack-based virtual machine, part 3 - instructions

Top comments (0)

Read next

Elon Musk and the Open Source Revolution: Pioneering a New Era of Innovation

Magic Eden: Pioneering Open Source Licensing in the Blockchain Era

Microsoft's Journey into Open Source: A Transformational Tale

Advanced WebRTC Python: Unleashing Real-Time Communication