Ethereum Virtual Machine (EVM)
The Ethereum network exists solely for the purpose of keeping the single, continuous, uninterrupted, and immutable operation of the state machine that is the Ethereum blockchain. It is the environment in which all Ethereum accounts and smart contracts and data live. At any given block, Ethereum has one and only one globally accepted 'state'. The Ethereum Virtual Machine (EVM) is what defines the rules for computing a new valid state from block to block.
Prerequisites
Some basic understanding of bytes, memory, and the stack is required to understand the EVM.
It might also be helpful to familiarize yourself with some cryptography like hash functions.
Ethereum as a State Machine
Blockchains like Bitcoin are often described as 'distributed ledgers' which enable the existence of a decentralized currency using fundamental tools of cryptography.
A cryptocurrency can behave like a 'normal' currency because of the rules which govern what one can and cannot do to modify this ledger. For example, a Bitcoin address cannot spend more Bitcoin than it has previously received. These rules underpin all transactions that take place on Bitcoin, and similarly other blockchains.
While Ethereum also has its own native cryptocurrency, ether (ETH), it enables a much more powerful feature than the ones we have seen so far: smart contracts. For this more complex feature, we need a more powerful analogy than just 'distributed ledger'.
Instead of a distributed ledger, Ethereum can be described as a distributed state machine. A state machine is essentially any machine that can change from one state to another in response to certain inputs.
A simple state machine is a coin-operated turnstile, commonly found in subways or train stations, to prevent people from entering unless they pay using a coin or have a ticket.
The initial state for a turnstile is locked. In the locked state, if you keep pushing it, it remains locked. If you insert a coin, it moves to the unlocked state. If you keep inserting coins, it remains in the unlocked state. Once you push in the unlocked state (and someone passes through), it becomes locked again.
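The turnstile described above can be sketched as a small transition table. This is only an illustration; the names `TRANSITIONS`, `step`, and `run` are made up for this example.

```python
# Transition table for the turnstile: (current_state, input) -> next_state
TRANSITIONS = {
    ("locked", "push"): "locked",      # pushing a locked turnstile does nothing
    ("locked", "coin"): "unlocked",    # a coin unlocks it
    ("unlocked", "coin"): "unlocked",  # extra coins change nothing
    ("unlocked", "push"): "locked",    # passing through locks it again
}


def step(state: str, event: str) -> str:
    """Return the next turnstile state for a single input."""
    return TRANSITIONS[(state, event)]


def run(events, state="locked"):
    """Feed a sequence of inputs to the turnstile, starting from `state`."""
    for event in events:
        state = step(state, event)
    return state


print(run(["push", "coin", "coin", "push"]))  # prints "locked"
```

The same inputs applied to the same starting state always end in the same final state, which is the property the rest of this article relies on.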
For Ethereum, the state is much more complex. It is described by a large data structure which contains the entire state of the blockchain. The specific rules for how the state can change from block to block are defined by the EVM.
Ethereum State Transition
At a high level, the EVM behaves like a mathematical state transition function. Given the current state and a new set of valid transactions, it produces a new state. The output is deterministic, which means that the same input will always produce the same output.
Y(S, T) = S'
Given the old valid state S and a new set of valid transactions T, the state transition function Y produces the new valid state S'.
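To make Y(S, T) = S' concrete, here is a deliberately simplified sketch in which the state is just a mapping from account names to balances and a transaction is a plain transfer. The real Ethereum state and transaction formats are far richer; the `State` and `Transaction` types and the `apply_transactions` function are assumptions made for this example only.

```python
from typing import Dict, List, Tuple

State = Dict[str, int]              # hypothetical: account name -> balance
Transaction = Tuple[str, str, int]  # hypothetical: (sender, recipient, amount)


def apply_transactions(state: State, txs: List[Transaction]) -> State:
    """Toy version of Y(S, T) = S': returns a new state and leaves S untouched."""
    new_state = dict(state)
    for sender, recipient, amount in txs:
        # An account cannot spend more than it holds.
        if amount < 0 or new_state.get(sender, 0) < amount:
            raise ValueError(f"invalid transaction from {sender}")
        new_state[sender] = new_state.get(sender, 0) - amount
        new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


S = {"alice": 10, "bob": 0}
T = [("alice", "bob", 4)]
print(apply_transactions(S, T))  # prints {'alice': 6, 'bob': 4}
```

Because the function depends only on its inputs, calling it again with the same S and T gives the same S', which mirrors the determinism described above.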
The state in Ethereum is stored in a very large data structure called a modified Merkle Patricia Trie. You do not need to understand exactly how it is structured to follow the rest of this article.
EVM Layer
The EVM lives as a layer in the software stack of Ethereum.
Every Ethereum node runs an implementation of the EVM, which executes EVM code. EVM code is the bytecode that a smart contract is compiled to.
EVM Code Generation
EVM Instructions (OPCODES)
The EVM itself behaves as a stack machine with a maximum depth of 1024 items on the stack. Each item in the stack is a 256-bit (32 bytes) word.
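The two limits mentioned above (at most 1024 items, each a 256-bit word) can be sketched with a small wrapper around a Python list. The `Stack` class and the exception types are illustrative choices, not how any particular client implements it; a real EVM halts execution of the call when these limits are violated.

```python
MAX_STACK_DEPTH = 1024
WORD_MASK = (1 << 256) - 1  # keeps only the lowest 256 bits of a value


class Stack:
    """Toy EVM-style stack: bounded depth, 256-bit items."""

    def __init__(self):
        self._items = []

    def push(self, value: int) -> None:
        if len(self._items) >= MAX_STACK_DEPTH:
            raise OverflowError("stack limit of 1024 items reached")
        self._items.append(value & WORD_MASK)  # truncate to one 256-bit word

    def pop(self) -> int:
        if not self._items:
            raise IndexError("stack underflow")
        return self._items.pop()

    def __len__(self) -> int:
        return len(self._items)


stack = Stack()
stack.push(2**256 + 5)  # does not fit in 256 bits, so it wraps around
print(stack.pop())      # prints 5
```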
During execution, the EVM maintains a transient memory, a word-addressed byte array that does not persist between transactions. Memory starts out empty for each new execution and is discarded when it finishes.
Smart contracts, however, do maintain their own state on the blockchain. This state is also modeled as a Merkle Patricia Trie. It is commonly referred to as the EVM storage during transaction execution.
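The difference between transient memory and persistent storage can be illustrated with a toy model. Everything here (`ToyContract`, `call`, the use of a Python dict for storage) is a hypothetical simplification; real storage lives in a trie and is accessed through dedicated opcodes.

```python
class ToyContract:
    """Toy model: storage survives between calls, memory does not."""

    def __init__(self):
        # Persistent: maps a slot number to a value. Unset slots read as 0.
        self.storage = {}

    def call(self, slot: int, value: int) -> int:
        """Write `value` to `slot` and return what the slot held before."""
        memory = bytearray()                      # fresh scratch space per call
        memory.extend(value.to_bytes(32, "big"))  # stage the value as one 32-byte word
        previous = self.storage.get(slot, 0)
        self.storage[slot] = int.from_bytes(memory[:32], "big")
        return previous                           # `memory` is discarded here


contract = ToyContract()
print(contract.call(1, 42))  # prints 0, the slot was empty
print(contract.call(1, 7))   # prints 42, storage remembered the first call
```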
The EVM has logic present that allows it to execute EVM opcodes, which perform standard operations on the stack like XOR, ADD, AND, SUB, MUL, etc. The EVM also implements a number of blockchain-specific stack operations, such as BALANCE and BLOCKHASH.
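As a rough sketch of how such opcodes act on the stack, the loop below interprets a handful of them. The byte values used for STOP, ADD, MUL, SUB, AND, XOR, and PUSH1 are the standard ones, but the `execute` function itself is a simplification written for this article: it ignores gas, memory, storage, and the stack depth limit.

```python
WORD_MASK = (1 << 256) - 1  # results wrap around at 256 bits

# Standard opcode byte values for the instructions used here
STOP, ADD, MUL, SUB, AND, XOR, PUSH1 = 0x00, 0x01, 0x02, 0x03, 0x16, 0x18, 0x60

BINARY_OPS = {
    ADD: lambda a, b: a + b,
    MUL: lambda a, b: a * b,
    SUB: lambda a, b: a - b,  # top of stack minus the item below it
    AND: lambda a, b: a & b,
    XOR: lambda a, b: a ^ b,
}


def execute(code: bytes) -> list:
    """Run a tiny subset of EVM bytecode and return the final stack."""
    stack = []
    pc = 0  # program counter: index of the next byte to read
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == STOP:
            break
        elif op == PUSH1:
            # The byte after PUSH1 is the value to push (0 if it is missing).
            stack.append(code[pc] if pc < len(code) else 0)
            pc += 1
        elif op in BINARY_OPS:
            a = stack.pop()  # top of the stack
            b = stack.pop()  # item below it
            stack.append(BINARY_OPS[op](a, b) & WORD_MASK)
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


print(execute(bytes.fromhex("6002600301")))  # PUSH1 2, PUSH1 3, ADD -> prints [5]
print(execute(bytes.fromhex("6003600a03")))  # PUSH1 3, PUSH1 10, SUB -> prints [7]
```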
When a smart contract is compiled into bytecode (represented in hexadecimal), it compiles down to EVM opcodes. These opcodes are what get executed on the EVM.
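To see how hexadecimal bytecode maps to opcodes, here is a minimal disassembler sketch. It only knows the few opcodes named in its table (plus the PUSH1 to PUSH32 range, 0x60 to 0x7f) and labels everything else as unknown; the `disassemble` function and the `OPCODE_NAMES` table are illustrative and far from a complete opcode list.

```python
# A small, incomplete table of opcode byte values and their mnemonics
OPCODE_NAMES = {
    0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x03: "SUB",
    0x16: "AND", 0x18: "XOR", 0x31: "BALANCE", 0x40: "BLOCKHASH",
    0x52: "MSTORE",
}


def disassemble(hex_code: str) -> list:
    """Turn a hex bytecode string into a list of readable instructions."""
    if hex_code.startswith("0x"):
        hex_code = hex_code[2:]
    code = bytes.fromhex(hex_code)
    out = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 bytes of data right after the opcode.
            size = op - 0x5F
            data = code[pc:pc + size]
            out.append(f"PUSH{size} 0x{data.hex()}")
            pc += size
        else:
            out.append(OPCODE_NAMES.get(op, f"UNKNOWN(0x{op:02x})"))
    return out


print(disassemble("0x6080604052"))
# prints ['PUSH1 0x80', 'PUSH1 0x40', 'MSTORE']
```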
EVM Implementations
All implementations of the EVM must adhere to the specification described in the Ethereum Yellow Paper. Over Ethereum's history, the EVM has undergone multiple revisions, and there are now multiple implementations of the EVM in various programming languages.
All Ethereum clients include an EVM implementation. In addition to those, there are multiple standalone implementations as well.
Ethereum Clients (with EVM)
- Geth | Programming Language = Go
- OpenEthereum | Programming Language = Rust
- Nethermind | Programming Language = C# (.NET)
- Besu | Programming Language = Java
- Erigon | Programming Language = Go
Standalone EVM Implementations
- Py-EVM | Programming Language = Python
- evmone | Programming Language = C++
- ethereumjs-evm | Programming Language = JavaScript
- Enclave EVM | Programming Language = C++
This article is brought to you by LearnWeb3 DAO. A free, comprehensive A to Z blockchain training program for developers across the globe.