DEV Community

TangHaosuan

REVM Source Code - Frame Part 1

Foreword

The previous series of articles was meant to walk through the flow quickly, giving everyone a conceptual understanding of REVM and making the source code less intimidating to read. Many details and concepts were skipped along the way.

In subsequent articles, we'll try to cover things in more detail.

I'll also try to explain Rust syntax where I can.

Rust generics are truly headache-inducing — for a beginner like me, it's practically hieroglyphics.

I have to research and understand things before I can write about them. Of course, I'll try to ensure accuracy.

A Frame contains several other EVM types. We'll cover them in this chapter as well.

Purpose

Much of Frame's source code was already introduced in Flow(4).

In the flow articles, we covered the execution process of make_call_frame and make_create_frame.

Both functions create Frames for executing contracts.

In REVM, a Frame is created for every contract call.

Simple transfers between EOA accounts don't create a Frame.

Contract-to-contract calls (CALL, CALLCODE, STATICCALL, DELEGATECALL) and contract creation (CREATE, CREATE2) each generate a new Frame.

Execution environments between Frames are isolated, each having its own independent:

  • Bytecode
  • Stack
  • Contract address
  • PC (program counter / instruction position)
  • Remaining Gas
  • Memory (sharing one large memory block, but each has its own range)
  • Journaled State change records
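
To make the "one large memory block, own range per Frame" point concrete, here is a minimal sketch. It is a simplified model and not REVM's actual SharedMemory: the type SharedBuf and the methods enter_child, leave_child, resize and current are hypothetical names chosen for illustration.

```rust
/// Simplified model of frames sharing one growable buffer.
/// Hypothetical type: REVM's real SharedMemory is more involved.
struct SharedBuf {
    buffer: Vec<u8>,
    /// Start offset of each active frame's region; the last entry is the current frame.
    checkpoints: Vec<usize>,
}

impl SharedBuf {
    fn new() -> Self {
        Self { buffer: Vec::new(), checkpoints: vec![0] }
    }

    /// A child frame starts where the parent's region currently ends.
    fn enter_child(&mut self) {
        self.checkpoints.push(self.buffer.len());
    }

    /// Leaving a child discards its region but keeps the allocation for reuse.
    fn leave_child(&mut self) {
        if self.checkpoints.len() > 1 {
            let start = self.checkpoints.pop().unwrap();
            self.buffer.truncate(start);
        }
    }

    fn start(&self) -> usize {
        *self.checkpoints.last().unwrap()
    }

    /// Resize the current frame's region to `len` bytes (new bytes are zero).
    fn resize(&mut self, len: usize) {
        let start = self.start();
        self.buffer.resize(start + len, 0);
    }

    /// The current frame only sees its own range of the buffer.
    fn current(&mut self) -> &mut [u8] {
        let start = self.start();
        &mut self.buffer[start..]
    }
}

fn main() {
    let mut mem = SharedBuf::new();
    mem.resize(32);
    mem.current()[0] = 0xAA;

    mem.enter_child();
    mem.resize(64);
    mem.current()[0] = 0xBB; // child offset 0 is buffer offset 32
    let child_len = mem.current().len();

    mem.leave_child();
    let parent_len = mem.current().len();
    println!("child len = {child_len}, parent len = {parent_len}");
}
```

The point of the design is that entering and leaving a frame only moves an offset; the underlying allocation is shared and reused.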

Frame Definition

Let's first look at the Frame struct definition:

#[derive_where(Clone, Debug; IW,
    <IW as InterpreterTypes>::Stack,
    <IW as InterpreterTypes>::Memory,
    <IW as InterpreterTypes>::Bytecode,
    <IW as InterpreterTypes>::ReturnData,
    <IW as InterpreterTypes>::Input,
    <IW as InterpreterTypes>::RuntimeFlag,
    <IW as InterpreterTypes>::Extend,
)]
pub struct EthFrame<IW: InterpreterTypes = EthInterpreter> {
    pub data: FrameData,
    pub input: FrameInput,
    pub depth: usize,
    pub checkpoint: JournalCheckpoint,
    pub interpreter: Interpreter<IW>,
    pub is_finished: bool,
}

Explanation of each field:

  • data: a FrameData enum holding the data specific to the kind of frame
    • Call
      • return_memory_range: the range in memory where the call's return data is written
    • Create
      • created_address: the address of the contract being created
  • input: the FrameInput, explained in detail later
  • depth: the current Frame's depth. Each newly generated frame is pushed onto the frame stack, so think of this as its index in that stack.
  • checkpoint: a journal checkpoint, used for rolling back state on revert.
  • interpreter: the interpreter that executes the bytecode.
  • is_finished: whether the current Frame has completed execution and returned a result.

The derive_where above is a macro from a third-party library. It derives Clone and Debug for the struct EthFrame.

The types listed after the semicolon are not bounds on the struct itself. They become where-clauses on the generated impl blocks. After expansion, it looks roughly like this (method bodies omitted):

impl<IW: InterpreterTypes + Clone> Clone for EthFrame<IW>
where
    <IW as InterpreterTypes>::Stack: Clone,
    <IW as InterpreterTypes>::Memory: Clone,
    <IW as InterpreterTypes>::Bytecode: Clone,
    <IW as InterpreterTypes>::ReturnData: Clone,
    <IW as InterpreterTypes>::Input: Clone,
    <IW as InterpreterTypes>::RuntimeFlag: Clone,
    <IW as InterpreterTypes>::Extend: Clone,
{
    fn clone(&self) -> Self {
        // Generated body omitted: clones each field.
    }
}

impl<IW: InterpreterTypes + Debug> Debug for EthFrame<IW>
where
    <IW as InterpreterTypes>::Stack: Debug,
    <IW as InterpreterTypes>::Memory: Debug,
    <IW as InterpreterTypes>::Bytecode: Debug,
    <IW as InterpreterTypes>::ReturnData: Debug,
    <IW as InterpreterTypes>::Input: Debug,
    <IW as InterpreterTypes>::RuntimeFlag: Debug,
    <IW as InterpreterTypes>::Extend: Debug,
{
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        // Generated body omitted: formats each field.
    }
}
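
To see why the bounds are placed on the associated types, here is a small hand-written sketch that does not use the macro at all. The trait Types, the marker Eth and the struct Holder are hypothetical stand-ins for InterpreterTypes, EthInterpreter and EthFrame, cut down to two associated types.

```rust
use std::fmt::Debug;

/// Hypothetical, cut-down stand-in for InterpreterTypes.
trait Types {
    type Stack;
    type Memory;
}

/// A marker type that is itself neither Clone nor Debug.
struct Eth;

impl Types for Eth {
    type Stack = Vec<u64>;
    type Memory = Vec<u8>;
}

/// The fields use the associated types, not `T` directly.
struct Holder<T: Types> {
    stack: T::Stack,
    memory: T::Memory,
}

// The where-clauses bound the field types, which is what cloning actually needs.
impl<T: Types> Clone for Holder<T>
where
    T::Stack: Clone,
    T::Memory: Clone,
{
    fn clone(&self) -> Self {
        Self { stack: self.stack.clone(), memory: self.memory.clone() }
    }
}

impl<T: Types> Debug for Holder<T>
where
    T::Stack: Debug,
    T::Memory: Debug,
{
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Holder")
            .field("stack", &self.stack)
            .field("memory", &self.memory)
            .finish()
    }
}

fn main() {
    let a = Holder::<Eth> { stack: vec![1, 2], memory: vec![0xff] };
    // Works even though `Eth` implements neither Clone nor Debug.
    let b = a.clone();
    println!("{:?}", b);
}
```

In this sketch the marker type carries no data, so only the associated types need the bounds. The REVM attribute additionally lists IW itself, so there the type parameter is bounded as well.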

Frame Implementation

impl<IT: InterpreterTypes> FrameTr for EthFrame<IT> {
    type FrameResult = FrameResult;
    type FrameInit = FrameInit;
}

impl Default for EthFrame<EthInterpreter> {
    fn default() -> Self {
        Self::do_default(Interpreter::default())
    }
}

impl EthFrame<EthInterpreter> {
    pub fn invalid() -> Self {
        Self::do_default(Interpreter::invalid())
    }

    fn do_default(interpreter: Interpreter<EthInterpreter>) -> Self {
        Self {
            data: FrameData::Call(CallFrame {
                return_memory_range: 0..0,
            }),
            input: FrameInput::Empty,
            depth: 0,
            checkpoint: JournalCheckpoint::default(),
            interpreter,
            is_finished: false,
        }
    }

    pub fn is_finished(&self) -> bool {
        self.is_finished
    }

    pub fn set_finished(&mut self, finished: bool) {
        self.is_finished = finished;
    }
}

The only confusing part here is the first impl block: in type FrameResult = FrameResult, both sides of the equals sign have the same name.

Clicking into the two definitions navigates to different places.

First, let's look at FrameResult's definition in the FrameTr trait.

The declaration type FrameResult: From<FrameResult> can be understood in two parts:

type FrameResult and FrameResult: From<FrameResult>.

On its own, type FrameResult declares an associated type whose concrete type is left unspecified until the trait is implemented.

FrameResult: From<FrameResult> is a trait bound. It requires that whatever type is chosen can be built from the concrete FrameResult enum, that is, it must implement From<FrameResult>. Note the colon here.

#[auto_impl(&mut, Box)] is a third-party macro.

The auto_impl macro automatically implements this trait for &mut and Box pointer types. This means if you have a type T that implements FrameTr, then &mut T and Box<T> will also automatically implement FrameTr.
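
As a rough idea of what such forwarding impls look like when written by hand, here is a sketch on a toy trait. Counter, Simple and bump_twice are hypothetical names; FrameTr itself has no methods, so a trait with one method makes the forwarding visible. The exact code the macro generates may differ.

```rust
trait Counter {
    fn bump(&mut self) -> u32;
}

struct Simple(u32);

impl Counter for Simple {
    fn bump(&mut self) -> u32 {
        self.0 += 1;
        self.0
    }
}

// Hand-written forwarding impls, similar in spirit to what
// `#[auto_impl(&mut, Box)]` provides (an assumption about its output shape).
impl<T: Counter + ?Sized> Counter for &mut T {
    fn bump(&mut self) -> u32 {
        (**self).bump()
    }
}

impl<T: Counter + ?Sized> Counter for Box<T> {
    fn bump(&mut self) -> u32 {
        (**self).bump()
    }
}

/// Generic code that takes any `Counter` by value.
fn bump_twice<C: Counter>(mut c: C) -> u32 {
    c.bump();
    c.bump()
}

fn main() {
    let mut s = Simple(0);
    let by_ref = bump_twice(&mut s); // C = &mut Simple
    let boxed: Box<dyn Counter> = Box::new(Simple(10));
    let by_box = bump_twice(boxed); // C = Box<dyn Counter>
    println!("{by_ref} {by_box}");
}
```

Without the two forwarding impls, bump_twice(&mut s) would not compile, because &mut Simple would not implement Counter.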


Looking back at type FrameResult = FrameResult, this just makes the associated type FrameResult on the left equal to the concrete type FrameResult on the right. Note the equals sign here.

It looks confusing because it makes two identical names equal.

impl<IT: InterpreterTypes> FrameTr for EthFrame<IT> {
    type FrameResult = FrameResult;
    type FrameInit = FrameInit;
}

// crates/handler/src/evm.rs 
// Left side of the equation
#[auto_impl(&mut, Box)]
pub trait FrameTr {
    /// The result type returned when a frame completes execution.
    type FrameResult: From<FrameResult>;
    /// The initialization type used to create a new frame.
    type FrameInit: From<FrameInit>;
}

// crates/handler/src/frame_data.rs
// Right side of the equation
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[derive(Debug, Clone)]
pub enum FrameResult {
    /// Call frame result.
    Call(CallOutcome),
    /// Create frame result.
    Create(CreateOutcome),
}
#[derive(Clone, Debug, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct FrameInit {
    /// depth of the next frame
    pub depth: usize,
    /// shared memory set to this shared context
    pub memory: SharedMemory,
    /// Data needed as input for Interpreter.
    pub frame_input: FrameInput,
}
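
The colon-versus-equals distinction is easier to see in a tiny standalone program. The sketch below reuses the names FrameTr, FrameResult and EthFrame but with made-up, simplified contents; Traced, TracedFrame and finish are hypothetical additions for illustration.

```rust
/// Concrete result type (simplified stand-in for the real enum).
#[derive(Debug, PartialEq)]
enum FrameResult {
    Call(u64),
    Create(u64),
}

trait FrameTr {
    /// Colon: the associated type must be buildable from the concrete FrameResult.
    type FrameResult: From<FrameResult>;
}

struct EthFrame;

impl FrameTr for EthFrame {
    /// Equals sign: pick the concrete enum. It satisfies the bound because
    /// the standard library implements `From<T> for T`.
    type FrameResult = FrameResult;
}

/// A hypothetical custom result type that wraps the concrete one.
#[derive(Debug, PartialEq)]
struct Traced {
    inner: FrameResult,
    note: &'static str,
}

impl From<FrameResult> for Traced {
    fn from(inner: FrameResult) -> Self {
        Traced { inner, note: "traced" }
    }
}

struct TracedFrame;

impl FrameTr for TracedFrame {
    type FrameResult = Traced;
}

/// Generic code relies only on the bound, so `.into()` always works.
fn finish<F: FrameTr>(raw: FrameResult) -> F::FrameResult {
    raw.into()
}

fn main() {
    println!("{:?}", finish::<EthFrame>(FrameResult::Call(1)));
    println!("{:?}", finish::<TracedFrame>(FrameResult::Create(2)));
}
```

So the bound leaves room for a custom frame type to use its own result type, as long as it can be converted from the standard one.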

Let's continue. Most of the content was already introduced in the previous Execute flow.

pub type ContextTrDbError<CTX> = <<CTX as ContextTr>::Db as Database>::Error;

impl EthFrame<EthInterpreter> {
    /// Clear and initialize a frame.
    #[allow(clippy::too_many_arguments)]
    #[inline(always)]
    pub fn clear(
        &mut self,
        data: FrameData,
        input: FrameInput,
        depth: usize,
        memory: SharedMemory,
        bytecode: ExtBytecode,
        inputs: InputsImpl,
        is_static: bool,
        spec_id: SpecId,
        gas_limit: u64,
        checkpoint: JournalCheckpoint,
        gas_params: GasParams,
    ) {
        let Self {
            data: data_ref,
            input: input_ref,
            depth: depth_ref,
            interpreter,
            checkpoint: checkpoint_ref,
            is_finished: is_finished_ref,
        } = self;
        *data_ref = data;
        *input_ref = input;
        *depth_ref = depth;
        *is_finished_ref = false;
        interpreter.clear(
            memory, bytecode, inputs, is_static, spec_id, gas_limit, gas_params,
        );
        *checkpoint_ref = checkpoint;
    }
    #[inline]
    pub fn make_call_frame<
        CTX: ContextTr,
        PRECOMPILES: PrecompileProvider<CTX, Output = InterpreterResult>,
        ERROR: From<ContextTrDbError<CTX>> + FromStringError,
    >(
        mut this: OutFrame<'_, Self>,
        ctx: &mut CTX,
        precompiles: &mut PRECOMPILES,
        depth: usize,
        memory: SharedMemory,
        inputs: Box<CallInputs>,
        gas_params: GasParams,
    ) -> Result<ItemOrResult<FrameToken, FrameResult>, ERROR> {}

    #[inline]
    pub fn make_create_frame<
        CTX: ContextTr,
        ERROR: From<ContextTrDbError<CTX>> + FromStringError,
    >(
        mut this: OutFrame<'_, Self>,
        context: &mut CTX,
        depth: usize,
        memory: SharedMemory,
        inputs: Box<CreateInputs>,
        gas_params: GasParams,
    ) -> Result<ItemOrResult<FrameToken, FrameResult>, ERROR> {}

    pub fn init_with_context<
        CTX: ContextTr,
        PRECOMPILES: PrecompileProvider<CTX, Output = InterpreterResult>,
    >(
        this: OutFrame<'_, Self>,
        ctx: &mut CTX,
        precompiles: &mut PRECOMPILES,
        frame_init: FrameInit,
        gas_params: GasParams,
    ) -> Result<
        ItemOrResult<FrameToken, FrameResult>,
        ContextError<<<CTX as ContextTr>::Db as Database>::Error>,
    > {}
}

impl EthFrame<EthInterpreter> {
    /// Processes the next interpreter action, either creating a new frame or returning a result.
    pub fn process_next_action<
        CTX: ContextTr,
        ERROR: From<ContextTrDbError<CTX>> + FromStringError,
    >(
        &mut self,
        context: &mut CTX,
        next_action: InterpreterAction,
    ) -> Result<FrameInitOrResult<Self>, ERROR> {}
    pub fn return_result<CTX: ContextTr, ERROR: From<ContextTrDbError<CTX>> + FromStringError>(
        &mut self,
        ctx: &mut CTX,
        result: FrameResult,
    ) -> Result<(), ERROR> {}
}
pub fn return_create<JOURNAL: JournalTr>(
    journal: &mut JOURNAL,
    checkpoint: JournalCheckpoint,
    interpreter_result: &mut InterpreterResult,
    address: Address,
    max_code_size: usize,
    is_eip3541_disabled: bool,
    spec_id: SpecId,
) {}
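
One piece of Rust syntax in clear deserves a note: let Self { ... } = self; destructures &mut self into a mutable reference per field. Here is a minimal sketch on a hypothetical three-field Frame (not the real EthFrame):

```rust
#[derive(Debug, Default)]
struct Frame {
    depth: usize,
    is_finished: bool,
    stack: Vec<u64>,
}

impl Frame {
    /// Reset the frame in place so it can be reused.
    fn clear(&mut self, depth: usize) {
        // `self` is `&mut Frame`, so each binding below is a `&mut` to that field.
        // Every field must be listed (there is no `..`), so adding a field to
        // Frame later forces this function to be updated too.
        let Self {
            depth: depth_ref,
            is_finished: is_finished_ref,
            stack,
        } = self;
        *depth_ref = depth;
        *is_finished_ref = false;
        stack.clear(); // drops the elements but keeps the allocated capacity
    }
}

fn main() {
    let mut frame = Frame { depth: 1, is_finished: true, stack: vec![1, 2, 3, 4] };
    frame.clear(3);
    println!("{:?}", frame);
}
```

The exhaustive pattern is a cheap safety net for a reset function: forgetting to reset a newly added field becomes a compile error instead of a stale value.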

Since most of the content was covered earlier, why write a separate chapter for Frame?

Because many details haven't been covered yet — we only walked through the flow, and the understanding of the entire EVM execution is still fragmented.

In the following sections, we'll focus on the details that weren't previously covered.

init_with_context can be considered Frame's entry point.

Let's go through it again starting from init_with_context:

// Initialize a Frame with the given context and precompiled contracts
pub fn init_with_context<
        CTX: ContextTr,
        PRECOMPILES: PrecompileProvider<CTX, Output = InterpreterResult>,
    >(
        this: OutFrame<'_, Self>,
        ctx: &mut CTX,
        precompiles: &mut PRECOMPILES,
        frame_init: FrameInit,
        gas_params: GasParams,
    ) -> Result<
        ItemOrResult<FrameToken, FrameResult>,
        ContextError<<<CTX as ContextTr>::Db as Database>::Error>,
    > {}

A total of 5 parameters:

  • this (OutFrame<'_, Self>): a temporary handle to the slot in which the Frame is created.
  • ctx (&mut CTX): the context; CTX is constrained to implement the ContextTr trait.
  • precompiles (&mut PRECOMPILES): the collection of precompiled contracts.
  • frame_init (FrameInit): the parameters for Frame initialization.
  • gas_params (GasParams): the gas cost parameters for opcodes.

OutFrame

Among the 5 parameters above, the others are fairly easy to understand. Only OutFrame isn't immediately obvious.

Let's jump to the definition:

#[allow(missing_debug_implementations)] suppresses the lint warning that a pub type does not implement the Debug trait.

The struct has 3 fields:

  • ptr (*mut T): the memory address where the Frame data is stored
  • init (bool): whether the memory pointed to by ptr is initialized as a valid T
  • lt (core::marker::PhantomData<&'a mut T>): ties the struct to the lifetime 'a and makes the compiler treat it as if it held a &'a mut T. This is needed because ptr is a raw pointer (*mut T), not a reference (&mut T), and raw pointers carry no lifetime or borrow information. With the marker, the borrow checker keeps the underlying slot exclusively borrowed for as long as the OutFrame exists.

// crates/context/interface/src/local.rs
// A potentially initialized frame. Used when initializing new frames in the main loop.
#[allow(missing_debug_implementations)]
pub struct OutFrame<'a, T> {
    ptr: *mut T,
    init: bool,
    lt: core::marker::PhantomData<&'a mut T>,
}
impl<'a, T> OutFrame<'a, T> {
    pub fn new_init(slot: &'a mut T) -> Self {
        unsafe { Self::new_maybe_uninit(slot, true) }
    }
    pub fn new_uninit(slot: &'a mut core::mem::MaybeUninit<T>) -> Self {
        unsafe { Self::new_maybe_uninit(slot.as_mut_ptr(), false) }
    }
    pub unsafe fn new_maybe_uninit(ptr: *mut T, init: bool) -> Self {
        Self {
            ptr,
            init,
            lt: Default::default(),
        }
    }
    pub unsafe fn get_unchecked(&mut self) -> &mut T {
        debug_assert!(self.init, "OutFrame must be initialized before use");
        unsafe { &mut *self.ptr }
    }
    pub fn consume(self) -> FrameToken {
        FrameToken(self.init)
    }
}

Let's find where init_with_context is called and jump to where the OutFrame is created.

The key line is unsafe { OutFrame::new_maybe_uninit(self.stack.as_mut_ptr().add(idx), idx < self.stack.len()) }.

It takes the FrameStack's starting pointer, offsets it by idx elements, and passes idx < self.stack.len() as the init flag: a slot below the current length already holds a valid Frame.

On the very first call, the FrameStack's len is 0, so idx < self.stack.len() is false and the slot is treated as uninitialized.

new_maybe_uninit, whose definition is shown above, simply creates a new OutFrame and returns it.

// crates/handler/src/evm.rs
let new_frame = if is_first_init {
    self.frame_stack.start_init()
} else {
    self.frame_stack.get_next()
};

// crates/context/interface/src/local.rs
#[inline]
pub fn start_init(&mut self) -> OutFrame<'_, T> {
    self.index = None;
    if self.stack.is_empty() {
        self.stack.reserve(8);
    }
    self.out_frame_at(0)
}

fn out_frame_at(&mut self, idx: usize) -> OutFrame<'_, T> {
    unsafe {
        OutFrame::new_maybe_uninit(self.stack.as_mut_ptr().add(idx), idx < self.stack.len())
    }
}
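
The pointer arithmetic in out_frame_at relies on the difference between a Vec's capacity and its length. The following sketch shows the same idea on a plain Vec<u64>. The helper slot_at is hypothetical, and the set_len step is my assumption about how a slot becomes "initialized" once it has been written; the article does not show that part of FrameStack.

```rust
/// Returns a raw pointer to slot `idx` and whether that slot already holds
/// an initialized element (i.e. lies below the Vec's length).
fn slot_at(stack: &mut Vec<u64>, idx: usize) -> (*mut u64, bool) {
    assert!(idx < stack.capacity(), "slot must lie inside the allocation");
    // SAFETY: idx is within the allocated capacity, so the offset stays in bounds.
    let ptr = unsafe { stack.as_mut_ptr().add(idx) };
    (ptr, idx < stack.len())
}

fn main() {
    let mut stack: Vec<u64> = Vec::new();
    stack.reserve(8); // capacity >= 8, but len is still 0

    let (ptr, init) = slot_at(&mut stack, 0);
    println!("before write: init = {init}");

    // SAFETY: the slot is inside the reserved capacity; after writing a valid
    // value we may extend the length to cover it.
    unsafe {
        ptr.write(42);
        stack.set_len(1);
    }

    let (_, init) = slot_at(&mut stack, 0);
    println!("after write: init = {init}, value = {}", stack[0]);
}
```

Memory between len and capacity is allocated but not initialized, which is exactly the state the init flag of OutFrame tracks.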

Back to init_with_context: it just passes this, the OutFrame, on to one of the two frame-creation functions, make_call_frame or make_create_frame.

Clicking into make_call_frame, we finally find where this is used, through this.get and this.consume:

this.get(EthFrame::invalid).clear(
    FrameData::Call(CallFrame {
        return_memory_range: inputs.return_memory_offset.clone(),
    }),
    FrameInput::Call(inputs),
    depth,
    memory,
    ExtBytecode::new_with_hash(bytecode, bytecode_hash),
    interpreter_input,
    is_static,
    ctx.cfg().spec().into(),
    gas_limit,
    checkpoint,
    gas_params,
);
Ok(ItemOrResult::Item(this.consume()))

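Let's look at the implementation of this.get and this.consume.

It's clear now. If the slot is not yet initialized, get calls the passed function, here EthFrame::invalid, and writes the resulting invalid Frame into the slot. If the slot already holds a Frame, that Frame is reused as is.

Either way, get returns a mutable reference to the Frame in the slot, which clear then fills in.

Why is there unsafe here? Because it dereferences the raw pointer (*self.ptr) and writes through it: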

// crates/context/interface/src/local.rs
pub fn get(&mut self, f: impl FnOnce() -> T) -> &mut T {
    if !self.init {
        self.do_init(f);
    }
    unsafe { &mut *self.ptr }
}
#[inline(never)]
#[cold]
fn do_init(&mut self, f: impl FnOnce() -> T) {
    unsafe {
        self.init = true;
        self.ptr.write(f());
    }
}
pub fn consume(self) -> FrameToken {
    FrameToken(self.init)
}
pub struct FrameToken(bool);
impl FrameToken {
    /// Asserts that the frame token is initialized.
    #[cfg_attr(debug_assertions, track_caller)]
    pub fn assert(self) {
        assert!(self.0, "FrameToken must be initialized before use");
    }
}

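
To check my understanding of get, I wrote a stripped-down version of the same pattern. LazySlot is a hypothetical stand-in for OutFrame, built on std's MaybeUninit and PhantomData; it is a sketch of the idea, not REVM's code.

```rust
use std::marker::PhantomData;
use std::mem::MaybeUninit;

/// Hypothetical stand-in for OutFrame: a slot that may or may not hold a valid T.
struct LazySlot<'a, T> {
    ptr: *mut T,
    init: bool,
    /// Makes the borrow checker treat the slot as borrowed mutably for 'a.
    _lt: PhantomData<&'a mut T>,
}

impl<'a, T> LazySlot<'a, T> {
    fn new_init(slot: &'a mut T) -> Self {
        Self { ptr: slot as *mut T, init: true, _lt: PhantomData }
    }

    fn new_uninit(slot: &'a mut MaybeUninit<T>) -> Self {
        Self { ptr: slot.as_mut_ptr(), init: false, _lt: PhantomData }
    }

    /// Runs `f` only if the slot is still uninitialized, then hands out the value.
    fn get(&mut self, f: impl FnOnce() -> T) -> &mut T {
        if !self.init {
            // SAFETY: ptr comes from a live `&'a mut` borrow, so it is valid for writes.
            unsafe { self.ptr.write(f()) };
            self.init = true;
        }
        // SAFETY: the slot is initialized at this point and exclusively borrowed.
        unsafe { &mut *self.ptr }
    }

    fn is_init(&self) -> bool {
        self.init
    }
}

fn main() {
    let mut storage = MaybeUninit::<u64>::uninit();
    let mut slot = LazySlot::new_uninit(&mut storage);
    let first = *slot.get(|| 7); // constructor runs, slot becomes valid
    let second = *slot.get(|| 99); // constructor is skipped, value is reused
    println!("{first} {second} init={}", slot.is_init());

    let mut existing = 5u64;
    let mut reused = LazySlot::new_init(&mut existing);
    println!("{}", reused.get(|| 0));
}
```

The second get does not run its closure, which mirrors how an already initialized Frame slot is reused instead of being rebuilt.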

Let's reorganize the flow:

  1. In crates/handler/src/evm.rs, the Evm has a frame_stack field that stores all generated Frames.
  2. In frame_init, either self.frame_stack.start_init() or self.frame_stack.get_next() is called to produce new_frame, an OutFrame whose raw pointer holds the frame's slot in frame_stack.
  3. new_frame is passed into init_with_context and from there into make_call_frame or make_create_frame, where the Frame constructor function is handed to OutFrame's get, which creates the Frame if needed; clear then initializes it.

All this circuitous logic is just to create a Frame.

So why not create a Frame directly instead of going through an intermediate OutFrame?

I thought about it for a long time and asked several AIs about these details many times without getting a satisfying answer.

Initially, I thought it was a Rust ownership issue.

A direct Frame::new would need some member fields of self as arguments, and a following self.frame_stack.push(new_frame) could then run into ownership problems.

But looking back at Frame's structure, it doesn't hold any references or borrows.

So my conclusion is that it is a performance optimization.

REVM creates Frames frequently during execution. If each Frame were constructed from scratch and destroyed when finished, the repeated memory allocation and deallocation would be costly.

With this design, FrameStack can serve as an object pool: finished Frames stay in their slots and are reset with clear the next time that depth is reached.

Another reason is lazy creation. make_call_frame and make_create_frame have many failure cases.

If a check fails, the function returns before calling get, which avoids creating a Frame unnecessarily.
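
The object-pool idea can be sketched in safe Rust. FramePool and its enter and leave methods are hypothetical, and the Frame here is a toy with only a depth and a stack; the real FrameStack uses raw pointers and OutFrame instead, as shown above.

```rust
#[derive(Debug, Default)]
struct Frame {
    depth: usize,
    stack: Vec<u64>,
}

impl Frame {
    /// Reset in place; the Vec keeps its allocation.
    fn clear(&mut self, depth: usize) {
        self.depth = depth;
        self.stack.clear();
    }
}

/// Hypothetical pool: slots are created once per depth and then reused.
struct FramePool {
    slots: Vec<Frame>,
    live: usize,
    created: usize,
}

impl FramePool {
    fn new() -> Self {
        Self { slots: Vec::new(), live: 0, created: 0 }
    }

    /// Enter a new call depth, reusing an old slot when one exists.
    fn enter(&mut self) -> &mut Frame {
        let idx = self.live;
        if idx == self.slots.len() {
            self.slots.push(Frame::default());
            self.created += 1;
        }
        self.live += 1;
        let frame = &mut self.slots[idx];
        frame.clear(idx);
        frame
    }

    /// Leave the current depth; the slot stays allocated for later reuse.
    fn leave(&mut self) {
        self.live -= 1;
    }
}

fn main() {
    let mut pool = FramePool::new();
    pool.enter().stack.extend(0..100);
    pool.enter();
    pool.leave();
    pool.leave();

    let reused = pool.enter();
    let capacity = reused.stack.capacity();
    println!("frames created = {}, reused capacity = {}", pool.created, capacity);
}
```

After the first round of calls, re-entering the same depth allocates nothing: both the Frame slot and the buffers inside it are recycled.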

PRECOMPILES

We've mentioned before that precompiles is a precompiled contract collection, but haven't dived in to explain it.

The PRECOMPILES here is generic, accepting types that implement the PrecompileProvider<CTX, Output = InterpreterResult> Trait.

Let's look at the PrecompileProvider<CTX: ContextTr> definition:

// crates/handler/src/precompile_provider.rs
#[auto_impl(&mut, Box)]
pub trait PrecompileProvider<CTX: ContextTr> {
    type Output;
    fn set_spec(&mut self, spec: <CTX::Cfg as Cfg>::Spec) -> bool;
    fn run(
        &mut self,
        context: &mut CTX,
        inputs: &CallInputs,
    ) -> Result<Option<Self::Output>, String>;
    fn warm_addresses(&self) -> Box<impl Iterator<Item = Address>>;
    fn contains(&self, address: &Address) -> bool;
}
  • set_spec: sets the spec; returns true if it differs from the previous one.
  • run: runs the precompile at the call's bytecode address; returns Ok(None) if the address is not a precompile.
  • warm_addresses: returns the addresses to pre-warm; here, the addresses of the precompiled contracts.
  • contains: checks whether an address is a precompiled contract.

The functionality is clear: get all precompiled contracts based on the EVM version and store them. Then provide interfaces to run specified precompiled contracts, get contract lists, and check if something is a precompiled contract.

In the crates/handler/src/precompile_provider.rs file, only EthPrecompiles implements the Trait.

Let's look directly at EthPrecompiles's content.

Much of the content is relatively simple — we'll cover selected parts.

In &'static Precompiles, the 'static is a lifetime, meaning the referenced Precompiles value is valid for the entire program runtime.

pub fn warm_addresses(&self) -> Box<impl Iterator<Item = Address>> returns a boxed value of some type implementing the Iterator trait that yields Address elements.

Precompiles::new is quite long, so we won't paste it.

The logic is simple — based on spec_id, return all precompiled contracts:

pub struct EthPrecompiles {
    pub precompiles: &'static Precompiles,
    pub spec: SpecId,
}

impl EthPrecompiles {
    pub fn warm_addresses(&self) -> Box<impl Iterator<Item = Address>> {
        Box::new(self.precompiles.addresses().cloned())
    }
    pub fn contains(&self, address: &Address) -> bool {
        self.precompiles.contains(address)
    }
}
impl Clone for EthPrecompiles {
    fn clone(&self) -> Self {
        Self {
            precompiles: self.precompiles,
            spec: self.spec,
        }
    }
}
impl Default for EthPrecompiles {
    fn default() -> Self {
        let spec = SpecId::default();
        Self {
            precompiles: Precompiles::new(PrecompileSpecId::from_spec_id(spec)),
            spec,
        }
    }
}
impl<CTX: ContextTr> PrecompileProvider<CTX> for EthPrecompiles {
    type Output = InterpreterResult;

    fn set_spec(&mut self, spec: <CTX::Cfg as Cfg>::Spec) -> bool {
        let spec = spec.into();
        if spec == self.spec {
            return false;
        }
        self.precompiles = Precompiles::new(PrecompileSpecId::from_spec_id(spec));
        self.spec = spec;
        true
    }

    fn run(
        &mut self,
        context: &mut CTX,
        inputs: &CallInputs,
    ) -> Result<Option<InterpreterResult>, String> {
        let Some(precompile) = self.precompiles.get(&inputs.bytecode_address) else {
            return Ok(None);
        };

        let mut result = InterpreterResult {
            result: InstructionResult::Return,
            gas: Gas::new(inputs.gas_limit),
            output: Bytes::new(),
        };

        let exec_result = {
            let r;
            let input_bytes = match &inputs.input {
                CallInput::SharedBuffer(range) => {
                    if let Some(slice) = context.local().shared_memory_buffer_slice(range.clone()) {
                        r = slice;
                        r.as_ref()
                    } else {
                        &[]
                    }
                }
                CallInput::Bytes(bytes) => bytes.0.iter().as_slice(),
            };
            precompile.execute(input_bytes, inputs.gas_limit)
        };

        match exec_result {
            Ok(output) => {
                result.gas.record_refund(output.gas_refunded);
                let underflow = result.gas.record_cost(output.gas_used);
                assert!(underflow, "Gas underflow is not possible");
                result.result = if output.reverted {
                    InstructionResult::Revert
                } else {
                    InstructionResult::Return
                };
                result.output = output.bytes;
            }
            Err(PrecompileError::Fatal(e)) => return Err(e),
            Err(e) => {
                result.result = if e.is_oog() {
                    InstructionResult::PrecompileOOG
                } else {
                    InstructionResult::PrecompileError
                };
                if !e.is_oog() && context.journal().depth() == 1 {
                    context
                        .local_mut()
                        .set_precompile_error_context(e.to_string());
                }
            }
        }
        Ok(Some(result))
    }
    fn warm_addresses(&self) -> Box<impl Iterator<Item = Address>> {
        Self::warm_addresses(self)
    }
    fn contains(&self, address: &Address) -> bool {
        Self::contains(self, address)
    }
}
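
Stripped of generics, the provider boils down to a lookup table from address to function. The sketch below is a toy model under my own assumptions: Provider, Outcome, Status, addr and with_identity are hypothetical names, addresses are plain byte arrays, and only the identity precompile (address 0x04, 15 gas plus 3 per 32-byte word) is registered.

```rust
use std::collections::HashMap;

type Address = [u8; 20];
/// Returns (gas_used, output), or None when the gas limit is too low.
type PrecompileFn = fn(&[u8], u64) -> Option<(u64, Vec<u8>)>;

#[derive(Debug, PartialEq)]
enum Status {
    Return,
    OutOfGas,
}

#[derive(Debug, PartialEq)]
struct Outcome {
    status: Status,
    gas_used: u64,
    output: Vec<u8>,
}

/// Builds an address whose last byte is `last` and all other bytes are zero.
fn addr(last: u8) -> Address {
    let mut a = [0u8; 20];
    a[19] = last;
    a
}

/// Identity precompile: copies the input to the output.
fn identity(input: &[u8], gas_limit: u64) -> Option<(u64, Vec<u8>)> {
    let words = (input.len() as u64 + 31) / 32;
    let cost = 15 + 3 * words;
    (cost <= gas_limit).then(|| (cost, input.to_vec()))
}

struct Provider {
    table: HashMap<Address, PrecompileFn>,
}

impl Provider {
    fn with_identity() -> Self {
        let mut table: HashMap<Address, PrecompileFn> = HashMap::new();
        table.insert(addr(4), identity as PrecompileFn);
        Self { table }
    }

    fn contains(&self, address: &Address) -> bool {
        self.table.contains_key(address)
    }

    /// None means "not a precompile": the caller should run normal bytecode.
    fn run(&self, address: &Address, input: &[u8], gas_limit: u64) -> Option<Outcome> {
        let f = self.table.get(address)?;
        Some(match f(input, gas_limit) {
            Some((gas_used, output)) => Outcome { status: Status::Return, gas_used, output },
            // A failed precompile is still a result, not a missing precompile.
            None => Outcome { status: Status::OutOfGas, gas_used: gas_limit, output: Vec::new() },
        })
    }
}

fn main() {
    let provider = Provider::with_identity();
    println!("{:?}", provider.run(&addr(4), b"hello", 100));
    println!("{:?}", provider.run(&addr(9), b"hello", 100));
    println!("{}", provider.contains(&addr(4)));
}
```

The important distinction carried over from the real run is between "this address is not a precompile" and "the precompile ran and failed".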

Let's look at run. It first converts the input data, &inputs.input, into a &[u8] slice, whichever variant it is.

range is a Range with a start and an end index.

shared_memory_buffer_slice returns the slice of the shared memory buffer between those two indices (we won't expand on it here; it belongs to another chapter). The let r; declared before the match keeps the returned slice alive for as long as input_bytes borrows from it.

Bytes is a tuple struct, pub struct Bytes(pub bytes::Bytes), so bytes.0 is used to access the inner value:

let exec_result = {
    let r;
    let input_bytes = match &inputs.input {
        CallInput::SharedBuffer(range) => {
            if let Some(slice) = context.local().shared_memory_buffer_slice(range.clone()) {
                r = slice;
                r.as_ref()
            } else {
                &[]
            }
        }
        CallInput::Bytes(bytes) => bytes.0.iter().as_slice(),
    };
    precompile.execute(input_bytes, inputs.gas_limit)
};
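
The let r; trick is worth isolating, since it looks odd at first. Below is a sketch under the assumption that the shared buffer sits behind a RefCell, so the lookup returns a Ref guard; Input, slice_of and checksum are hypothetical names.

```rust
use std::cell::{Ref, RefCell};
use std::ops::Range;

enum Input {
    Shared(Range<usize>),
    Bytes(Vec<u8>),
}

/// Borrow a sub-slice of the shared buffer, or None if the range is out of bounds.
fn slice_of(buf: &RefCell<Vec<u8>>, range: Range<usize>) -> Option<Ref<'_, [u8]>> {
    let guard = buf.borrow();
    if range.start <= range.end && range.end <= guard.len() {
        Some(Ref::map(guard, |v| &v[range]))
    } else {
        None
    }
}

fn checksum(buf: &RefCell<Vec<u8>>, input: &Input) -> u32 {
    // Declared without a value, before the match: if a guard is stored here,
    // it lives until the end of the function and so outlives `bytes`.
    let r;
    let bytes: &[u8] = match input {
        Input::Shared(range) => {
            if let Some(slice) = slice_of(buf, range.clone()) {
                r = slice;
                &*r
            } else {
                &[]
            }
        }
        Input::Bytes(b) => b.as_slice(),
    };
    bytes.iter().map(|&b| b as u32).sum()
}

fn main() {
    let buf = RefCell::new(vec![1u8, 2, 3, 4]);
    println!("{}", checksum(&buf, &Input::Shared(1..3)));
    println!("{}", checksum(&buf, &Input::Bytes(vec![10, 20])));
}
```

If the guard were bound inside the match arm instead, it would be dropped at the end of the arm and the borrow could not escape into bytes.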

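Next comes processing of the return result. There is no particularly complex logic here.

oog stands for out of gas, meaning insufficient gas.

Despite its name, underflow holds the success flag of record_cost: assuming record_cost returns true when the cost fits into the remaining gas, the assert! checks that the precompile did not use more gas than the limit it was given.

Both Revert and Return mean that execution ran to completion from the EVM's perspective. The difference is for the caller: Return counts as success, Revert counts as failure: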

match exec_result {
    Ok(output) => {
        result.gas.record_refund(output.gas_refunded);
        let underflow = result.gas.record_cost(output.gas_used);
        assert!(underflow, "Gas underflow is not possible");
        result.result = if output.reverted {
            InstructionResult::Revert
        } else {
            InstructionResult::Return
        };
        result.output = output.bytes;
    }
    Err(PrecompileError::Fatal(e)) => return Err(e),
    Err(e) => {
        result.result = if e.is_oog() {
            InstructionResult::PrecompileOOG
        } else {
            InstructionResult::PrecompileError
        };
        if !e.is_oog() && context.journal().depth() == 1 {
            context
                .local_mut()
                .set_precompile_error_context(e.to_string());
        }
    }
}
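
Here is a minimal sketch of the gas bookkeeping used above. This Gas struct is a simplified, hypothetical model with only three fields; in particular, the bool returned by record_cost follows my reading of the snippet (true means the cost was recorded) and should be checked against the real Gas type.

```rust
/// Simplified, hypothetical model of a gas counter.
#[derive(Debug)]
struct Gas {
    limit: u64,
    remaining: u64,
    refunded: i64,
}

impl Gas {
    fn new(limit: u64) -> Self {
        Self { limit, remaining: limit, refunded: 0 }
    }

    /// Deducts `cost` and returns true, or returns false and leaves the
    /// counter untouched when there is not enough gas left.
    #[must_use]
    fn record_cost(&mut self, cost: u64) -> bool {
        match self.remaining.checked_sub(cost) {
            Some(left) => {
                self.remaining = left;
                true
            }
            None => false,
        }
    }

    fn record_refund(&mut self, refund: i64) {
        self.refunded += refund;
    }

    fn spent(&self) -> u64 {
        self.limit - self.remaining
    }
}

fn main() {
    let mut gas = Gas::new(100);
    let ok = gas.record_cost(60);
    gas.record_refund(5);
    let too_much = gas.record_cost(41);
    println!("ok={ok} too_much={too_much} spent={} {:?}", gas.spent(), gas);
}
```

checked_sub is what keeps the subtraction from wrapping around, so a failed deduction is reported instead of silently producing a huge remaining value.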

FrameInit

FrameInit contains all parameters for initializing a Frame.

For the first Frame, i.e., the root Frame, it's generated and returned by the handler in first_frame_input.

Otherwise, it's produced by the previous Frame's execution, along the path handler->frame_run->frame.process_next_action.

This was already covered in the flow articles and won't be repeated here.

// crates/handler/src/frame.rs
pub fn process_next_action<
        CTX: ContextTr,
        ERROR: From<ContextTrDbError<CTX>> + FromStringError,
    >(
        &mut self,
        context: &mut CTX,
        next_action: InterpreterAction,
    ) -> Result<FrameInitOrResult<Self>, ERROR> {
        let spec = context.cfg().spec().into();

        // Run interpreter

        let mut interpreter_result = match next_action {
            InterpreterAction::NewFrame(frame_input) => {
                let depth = self.depth + 1;
                return Ok(ItemOrResult::Item(FrameInit {
                    frame_input,
                    depth,
                    memory: self.interpreter.memory.new_child_context(),
                }));
            }
            InterpreterAction::Return(result) => result,
        };

Let's continue with FrameInit's definition.

#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]: if the serde feature is enabled in Cargo.toml, this derives serialization and deserialization for the struct.

  • depth: the depth of the frame being created.
  • memory: shared memory. All frames share one large pre-allocated memory block; each Frame has its own offset and len. SharedMemory will be covered in more detail later.
  • frame_input: comes in two kinds, CallInputs and CreateInputs
    • CallInputs
      • input: the call data for the frame, either a range in shared memory (CallInput::SharedBuffer) or owned bytes (CallInput::Bytes).
      • return_memory_offset: where the return data goes in memory. As mentioned, all frames share memory, and both input and output are passed through it.
      • gas_limit
      • bytecode_address: address of the contract whose code is executed
      • known_bytecode: if the contract's bytecode was already loaded, it is stored here
      • target_address: the account whose storage the frame operates on. If you're familiar with Solidity, this is easier to understand. The main contract call methods are Call, DelegateCall and StaticCall. Call and StaticCall run against the callee's storage, so target_address and bytecode_address are the same (StaticCall can only read it). DelegateCall runs the callee's code against the caller's storage. Most proxy contracts use DelegateCall: they call another contract's code but modify their own data.
      • caller: the sender calling this Frame's contract, not the sender of the entire transaction. If a user calls contract A, and A calls contract B, the caller of B's frame is A.
      • value: the value to send
      • scheme: the call type
        • Call
        • CallCode
        • DelegateCall
        • StaticCall
      • is_static: whether the call is read-only. When set, state is read-only and cannot be modified.
    • CreateInputs
      • caller
      • scheme: the method of contract creation. The schemes take different parameters and compute the new address differently.
        • CREATE
        • CREATE2
        • Custom
      • value
      • init_code: the initcode for contract creation. A deployed contract's code has two parts. The constructor logic lives in the initcode; the rest is the runtime code. Deployment executes the initcode, and the runtime code is what runs in later calls.
      • gas_limit
      • cached_address: the address of the created contract, cached once it has been computed.

#[derive(Clone, Debug, PartialEq, Eq)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
pub struct FrameInit {
    /// depth of the next frame
    pub depth: usize,
    /// shared memory set to this shared context
    pub memory: SharedMemory,
    /// Data needed as input for Interpreter.
    pub frame_input: FrameInput,
}
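
The relationship between target_address, bytecode_address and caller is easiest to see side by side. The function below encodes the standard EVM rules for the four call schemes; it is my own illustrative sketch (Ctx, child_context and the string addresses are hypothetical), not code taken from REVM.

```rust
type Address = &'static str;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Scheme {
    Call,
    CallCode,
    DelegateCall,
    StaticCall,
}

#[derive(Debug, PartialEq)]
struct Ctx {
    target_address: Address,   // whose storage is read and written
    bytecode_address: Address, // whose code is executed
    caller: Address,           // what the CALLER opcode reports in the child
    is_static: bool,
}

/// `current` executes the call opcode, `parent_caller` is whoever called
/// `current`, and `callee` is the address given to the opcode.
fn child_context(
    scheme: Scheme,
    current: Address,
    parent_caller: Address,
    callee: Address,
    parent_is_static: bool,
) -> Ctx {
    let (target_address, caller) = match scheme {
        Scheme::Call | Scheme::StaticCall => (callee, current),
        Scheme::CallCode => (current, current),
        Scheme::DelegateCall => (current, parent_caller),
    };
    Ctx {
        target_address,
        bytecode_address: callee,
        caller,
        // Once static, every nested call stays static.
        is_static: parent_is_static || scheme == Scheme::StaticCall,
    }
}

fn main() {
    // A user calls a proxy, and the proxy delegates to its logic contract.
    let ctx = child_context(Scheme::DelegateCall, "proxy", "user", "logic", false);
    println!("{:?}", ctx);
    let ctx = child_context(Scheme::Call, "proxy", "user", "token", false);
    println!("{:?}", ctx);
}
```

In the proxy case the logic contract's code runs, but it writes to the proxy's storage and still sees the user as the caller.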

GasParams

During EVM OpCode execution, Gas is divided into two parts: one is static Gas, where each OpCode has a fixed consumption value.

The other is dynamic Gas, such as memory expansion, whether addresses are warmed, LOGx data length.

In the previous Flow(4), I made an error — I didn't look carefully and thought it only stored the static part.

But actually, it's not just the static part. The static part is stored in an array.

The dynamic part is returned via function calls.

Let's look at the definition:

table is of type Arc<[u64; 256]>. Arc is an atomically reference-counted pointer, used to share ownership of a value, including across threads.

[u64; 256] is an array containing 256 u64-type elements.

table stores the base Gas consumption for each OpCode.

#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct GasParams {
    /// Table of gas costs for operations
    table: Arc<[u64; 256]>,
    /// Pointer to the table.
    ptr: *const u64,
}

Won't paste the specific code. Here's an explanation of the impl methods:

  • pub fn new(table: Arc<[u64; 256]>) -> Self: builds a new GasParams from the passed-in table and sets the raw pointer to point at it
  • pub fn override_gas(&mut self, values: impl IntoIterator<Item = (GasId, u64)>) — replaces the gas for specified opcodes
  • pub fn table(&self) -> &[u64; 256] — returns a reference to the table
  • pub fn new_spec(spec: SpecId) -> Self — generates the opcode gas consumption table based on specId
  • pub const fn get(&self, id: GasId) -> u64 — gets the gas consumption for a specified opcode
  • pub fn exp_cost(&self, power: U256) -> u64 — gas consumption for the exp opcode
  • pub fn selfdestruct_refund(&self) -> i64 — gas refund when a contract calls Selfdestruct
  • selfdestruct_cold_cost() — calculates the additional gas cost of SELFDESTRUCT for cold accounts (combination of cold + warm)
  • selfdestruct_cost(should_charge_topup, is_cold) — comprehensively calculates the total gas cost of SELFDESTRUCT, including new account creation and cold access fees
  • extcodecopy(len) — calculates the dynamic gas consumption for EXTCODECOPY copying code of specified length (per word)
  • mcopy_cost(len) — calculates the dynamic gas consumption for MCOPY copying memory of specified length (per word)
  • sstore_static_gas() — returns the base static gas cost for SSTORE operation
  • sstore_set_without_load_cost() — returns the additional gas for SSTORE "setting new slot" after deducting load cost
  • sstore_reset_without_cold_load_cost() — returns the additional gas for SSTORE "resetting existing slot" after deducting cold load
  • sstore_clearing_slot_refund() — returns the refund amount for SSTORE clearing a storage slot (non-0→0)
  • sstore_dynamic_gas(is_istanbul, vals, is_cold) — calculates the dynamic gas portion of SSTORE based on storage value changes and cold/warm status
  • sstore_refund(is_istanbul, vals) — calculates the storage refund for SSTORE based on before/after value changes (may be negative)
  • log_cost(n, len) — calculates the total gas for LOGx operations, including fixed cost per topic + linear cost of data length
  • keccak256_cost(len) — calculates the dynamic gas consumption of KECCAK256 for input of specified length (per word)
  • memory_cost(len) — calculates total gas based on memory expansion to specified length, using linear + quadratic formula
  • initcode_cost(len) — calculates the gas consumption corresponding to initcode length in CREATE/CREATE2 (fixed cost per word)
  • create_cost() — returns the base gas cost for CREATE operation
  • create2_cost(len) — calculates the total gas cost of CREATE2 (base + keccak256 hash input length cost)
  • call_stipend() — returns the fixed gas stipend additionally granted during sub-calls (with value), typically 2300
  • call_stipend_reduction(gas_limit) — calculates the 1/64 portion deducted during sub-call gas forwarding
  • transfer_value_cost() — returns the additional transfer gas cost charged for CALL operations with value
  • cold_account_additional_cost() — returns the additional gas charged when accessing cold accounts (EIP-2929)
  • cold_storage_additional_cost() — returns the additional gas difference charged when accessing cold storage slots
  • cold_storage_cost() — returns the total gas cost for accessing cold storage slots (warm + additional)
  • new_account_cost(is_spurious_dragon, transfers_value) — determines whether to charge new account creation gas based on fork and whether value is transferred
  • new_account_cost_for_selfdestruct() — returns the additional creation gas when SELFDESTRUCT's target is a new account
  • warm_storage_read_cost() — returns the base read gas cost for accessing warm storage slots
  • copy_cost(len) — calculates the dynamic gas consumption for general memory/code copy operations (per word)
  • copy_per_word_cost(word_num) — calculates the result of per-word copy gas cost multiplied by the number of words
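Several of the methods above charge "per word", where a word is 32 bytes and partial words round up. The sketch below illustrates that pattern with hypothetical free functions. The constants are the commonly cited mainnet values and are assumptions here; they are not read from revm's table, and the function names are not revm's.

```rust
// Assumed mainnet constants (illustrative, not taken from revm's table).
const COPY_PER_WORD: u64 = 3;
const KECCAK256_PER_WORD: u64 = 6;
const LOG_PER_TOPIC: u64 = 375;
const LOG_PER_DATA_BYTE: u64 = 8;

/// Number of 32-byte words needed to hold `len` bytes, rounding up.
pub const fn words_for(len: usize) -> u64 {
    (len as u64).saturating_add(31) / 32
}

/// Dynamic gas for copy-style opcodes: a fixed price per word.
pub fn copy_cost_sketch(len: usize) -> u64 {
    COPY_PER_WORD * words_for(len)
}

/// Dynamic gas for hashing `len` bytes with KECCAK256.
pub fn keccak256_cost_sketch(len: usize) -> u64 {
    KECCAK256_PER_WORD * words_for(len)
}

/// Topic and data portion of a LOGx charge. The base charge for the opcode
/// itself is deliberately left out of this sketch.
pub fn log_dynamic_cost_sketch(topics: u8, data_len: u64) -> u64 {
    LOG_PER_TOPIC * topics as u64 + LOG_PER_DATA_BYTE * data_len
}

/// The 1/64 portion held back when forwarding gas to a sub-call.
pub fn one_64th(gas: u64) -> u64 {
    gas / 64
}

/// Largest amount of gas a caller can forward to a sub-call.
pub fn max_forwardable_gas(remaining: u64) -> u64 {
    remaining - one_64th(remaining)
}
```

For example, copying 33 bytes touches two words, so copy_cost_sketch(33) is 6, and a caller with 6400 gas left can forward at most 6300.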

SharedMemory

SharedMemory has been mentioned many times before.

All Frames share one large Buffer, and each Frame has its own offset and len.

This reduces memory allocation and deallocation overhead when creating and destroying frames.

When a parent Frame calls a child Frame, parameters are passed via SharedMemory.

The child Frame also returns results to the parent Frame via SharedMemory.

Let's jump to SharedMemory's definition:

  • buffer: where Frame data is stored.
    • Rc<RefCell<Vec<u8>>>, read from the inside out: Vec<u8> is a growable byte array.
    • RefCell allows mutable access through an immutable reference, with the borrow rules checked at run time.
    • Rc is a reference-counting smart pointer that lets the wrapped value have multiple owners. On its own, Rc only gives shared (read-only) access to the data.
    • Combined, buffer can have multiple owners, and each owner can modify the data.
  • my_checkpoint: the current Frame's checkpoint, essentially the starting position of the current Frame's data.
  • child_checkpoint: the starting position of the child Frame's data. Since Frame data length isn't stored, child_checkpoint must be saved so the buffer can be rolled back when the child Frame releases its data.
  • memory_limit: maximum memory limit. The field only exists when the memory_limit feature is enabled (note the #[cfg] attribute). The current EVM version has no memory limit, so it is unused by default.
pub struct SharedMemory {
    /// The underlying buffer.
    buffer: Option<Rc<RefCell<Vec<u8>>>>,
    /// Memory checkpoints for each depth.
    /// Invariant: these are always in bounds of `data`.
    my_checkpoint: usize,
    /// Child checkpoint that we need to free context to.
    child_checkpoint: Option<usize>,
    /// Memory limit. See [`Cfg`](context_interface::Cfg).
    #[cfg(feature = "memory_limit")]
    memory_limit: u64,
}
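The Rc<RefCell<Vec<u8>>> combination described for the buffer field can be tried in isolation. The following is a small standalone sketch using only the standard library; shared_buffer_demo is a hypothetical name and the "parent" and "child" variables only mimic two Frames holding the same buffer.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Two handles to one buffer: both own it, and both can mutate it.
/// Returns the owner count and the final buffer contents.
pub fn shared_buffer_demo() -> (usize, Vec<u8>) {
    let parent: Rc<RefCell<Vec<u8>>> = Rc::new(RefCell::new(vec![1, 2]));
    // Cloning the Rc bumps the reference count; no bytes are copied.
    let child = Rc::clone(&parent);

    // Mutation through a shared handle: RefCell checks the borrow at run time.
    child.borrow_mut().push(3);
    parent.borrow_mut()[0] = 9;

    let owners = Rc::strong_count(&parent);
    let contents = parent.borrow().clone();
    (owners, contents)
}
```

Both handles see the same bytes, which is why a write made through the child handle is visible through the parent handle.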

Let's look at SharedMemory's functions (only partial excerpts).

Although SharedMemory has a new(), it's not actually used in practice.

The first creation (root Frame) uses new_with_buffer, which simply stores the passed-in buffer in the buffer field.

Let's find where it's called.

The buffer passed in is obtained from ctx.local().shared_memory_buffer().

This part belongs to context and will be covered later:

// crates/handler/src/handler.rs
//first_frame_input
let mut memory = SharedMemory::new_with_buffer(ctx.local().shared_memory_buffer().clone());

All subsequent child Frames call new_child_context.

It sets the current Frame's child_checkpoint to the total length of the buffer and returns a newly created SharedMemory.

Note: the returned data is not the current Frame's data. Rather, it returns memory for use in the child Frame, so my_checkpoint equals the parent Frame's child_checkpoint, and child_checkpoint is None:

pub fn new_with_buffer(buffer: Rc<RefCell<Vec<u8>>>) -> Self {
    Self {
        buffer: Some(buffer),
        my_checkpoint: 0,
        child_checkpoint: None,
        #[cfg(feature = "memory_limit")]
        memory_limit: u64::MAX,
    }
}
pub fn new_child_context(&mut self) -> SharedMemory {
    if self.child_checkpoint.is_some() {
        panic!("new_child_context was already called without freeing child context");
    }
    let new_checkpoint = self.full_len();
    self.child_checkpoint = Some(new_checkpoint);
    SharedMemory {
        buffer: Some(self.buffer().clone()),
        my_checkpoint: new_checkpoint,
        // child_checkpoint is same as my_checkpoint
        child_checkpoint: None,
        #[cfg(feature = "memory_limit")]
        memory_limit: self.memory_limit,
    }
}
fn full_len(&self) -> usize {
    self.buffer_ref().len()
}

fn buffer_ref(&self) -> Ref<'_, Vec<u8>> {
    self.buffer().dbg_borrow()
}
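The checkpoint bookkeeping above is easier to follow on a toy model. The sketch below is not revm's type: ToyMemory and its method names are hypothetical, the Option wrapper and the memory_limit field are dropped, and the safe Vec::truncate is used to undo a child instead of anything unsafe.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Toy model of the checkpoint bookkeeping (not revm's actual type).
pub struct ToyMemory {
    buffer: Rc<RefCell<Vec<u8>>>,
    my_checkpoint: usize,
    child_checkpoint: Option<usize>,
}

impl ToyMemory {
    /// Root frame: owns the start of an empty shared buffer.
    pub fn root() -> Self {
        Self {
            buffer: Rc::new(RefCell::new(Vec::new())),
            my_checkpoint: 0,
            child_checkpoint: None,
        }
    }

    /// Bytes from this frame's checkpoint to the end of the shared buffer.
    pub fn len(&self) -> usize {
        self.buffer.borrow().len() - self.my_checkpoint
    }

    /// Set this frame's region to `new_size` bytes, zero-filling new bytes.
    pub fn resize(&mut self, new_size: usize) {
        self.buffer.borrow_mut().resize(self.my_checkpoint + new_size, 0);
    }

    /// The child starts exactly where the shared buffer currently ends.
    pub fn new_child(&mut self) -> ToyMemory {
        assert!(self.child_checkpoint.is_none(), "child context already active");
        let checkpoint = self.buffer.borrow().len();
        self.child_checkpoint = Some(checkpoint);
        ToyMemory {
            buffer: Rc::clone(&self.buffer),
            my_checkpoint: checkpoint,
            child_checkpoint: None,
        }
    }

    /// Drop everything the child appended by truncating back to its start.
    pub fn free_child(&mut self) {
        if let Some(checkpoint) = self.child_checkpoint.take() {
            self.buffer.borrow_mut().truncate(checkpoint);
        }
    }
}
```

With a 64-byte parent, a child that grows to 32 bytes occupies bytes 64..96 of the shared buffer, and freeing the child shrinks the buffer back to 64 bytes while keeping the allocation.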

The above covers creation-related functions. Let's look at the other functions:

  • free_child_context: releases the child Frame's memory by simply setting the buffer length back to the child Frame's checkpoint. The allocation itself is kept, so when a new Frame needs memory later it can reuse it directly, avoiding re-allocation and deallocation.
  • resize: resizes the current Frame's memory to new_size bytes, zero-filling any newly added bytes. It is called during OpCode execution when the current memory is insufficient.
  • slice_len: returns a data slice within this Frame, given an offset and a size.
  • slice_range: returns a data slice within this Frame, given a Range.
    • self.buffer_ref() returns a shared borrow of buffer. The type is Ref<'_, Vec<u8>>, where '_ is an elided lifetime; for easier understanding, think of it as Ref<Vec<u8>>.
    • Ref::map(buffer, ...) converts that borrow into a Ref<[u8]> slice.
    • The closure |b| match b.get(range.start + self.my_checkpoint..range.end + self.my_checkpoint) picks the requested range out of the whole buffer, shifted by this Frame's my_checkpoint.
  • global_slice_range: returns specified range data from the entire buffer (shared across all frames, not just the current frame).
  • slice_mut: the mutable counterpart of slice_len. It takes an offset and a size, and uses buffer_ref_mut, RefMut::map, and get_mut to return a RefMut<[u8]>.
pub fn free_child_context(&mut self) {
    let Some(child_checkpoint) = self.child_checkpoint.take() else {
        return;
    };
    unsafe {
        self.buffer_ref_mut().set_len(child_checkpoint);
    }
}
pub fn resize(&mut self, new_size: usize) {
    self.buffer()
        .dbg_borrow_mut()
        .resize(self.my_checkpoint + new_size, 0);
}
pub fn slice_len(&self, offset: usize, size: usize) -> Ref<'_, [u8]> {
    self.slice_range(offset..offset + size)
}
pub fn slice_range(&self, range: Range<usize>) -> Ref<'_, [u8]> {
    let buffer = self.buffer_ref();
    Ref::map(buffer, |b| {
        match b.get(range.start + self.my_checkpoint..range.end + self.my_checkpoint) {
            Some(slice) => slice,
            None => debug_unreachable!("slice OOB: range; len: {}", self.len()),
        }
    })
}
pub fn global_slice_range(&self, range: Range<usize>) -> Ref<'_, [u8]> {
    let buffer = self.buffer_ref();
    Ref::map(buffer, |b| match b.get(range) {
        Some(slice) => slice,
        None => debug_unreachable!("slice OOB: range; len: {}", self.len()),
    })
}
pub fn slice_mut(&mut self, offset: usize, size: usize) -> RefMut<'_, [u8]> {
    let buffer = self.buffer_ref_mut();
    RefMut::map(buffer, |b| {
        match b.get_mut(self.my_checkpoint + offset..self.my_checkpoint + offset + size) {
            Some(slice) => slice,
            None => debug_unreachable!("slice OOB: {offset}..{}", offset + size),
        }
    })
}
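The Ref::map and RefMut::map pattern used by the slice functions above can be shown on a plain RefCell. This is a minimal sketch with hypothetical helper names (window, window_mut); the base parameter plays the role of my_checkpoint. Unlike the revm code, it uses plain indexing, so an out-of-bounds range panics.

```rust
use std::cell::{Ref, RefCell, RefMut};

/// Borrow `size` bytes starting at `base + offset`, keeping the RefCell
/// guard alive for as long as the returned slice is in use.
pub fn window(buf: &RefCell<Vec<u8>>, base: usize, offset: usize, size: usize) -> Ref<'_, [u8]> {
    Ref::map(buf.borrow(), |b| &b[base + offset..base + offset + size])
}

/// Mutable counterpart: the guard is a RefMut, so writes are allowed.
pub fn window_mut(
    buf: &RefCell<Vec<u8>>,
    base: usize,
    offset: usize,
    size: usize,
) -> RefMut<'_, [u8]> {
    RefMut::map(buf.borrow_mut(), |b| &mut b[base + offset..base + offset + size])
}
```

Returning the mapped guard rather than a bare &[u8] is what lets the borrow outlive the function call: the RefCell stays borrowed until the caller drops the returned value.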

Let's continue with functions for saving and retrieving specific data types.

Nothing particularly noteworthy here:

pub fn get_byte(&self, offset: usize) -> u8 {
    self.slice_len(offset, 1)[0]
}
pub fn get_word(&self, offset: usize) -> B256 {
    (*self.slice_len(offset, 32)).try_into().unwrap()
}
pub fn get_u256(&self, offset: usize) -> U256 {
    self.get_word(offset).into()
}
pub fn set_byte(&mut self, offset: usize, byte: u8) {
    self.set(offset, &[byte]);
}
fn set_word(&mut self, offset: usize, value: &B256) {
    self.set(offset, &value[..]);
}
pub fn set_u256(&mut self, offset: usize, value: U256) {
    self.set(offset, &value.to_be_bytes::<32>());
}
pub fn set(&mut self, offset: usize, value: &[u8]) {
    if !value.is_empty() {
        self.slice_mut(offset, value.len()).copy_from_slice(value);
    }
}
pub fn set_data(&mut self, memory_offset: usize, data_offset: usize, len: usize, data: &[u8]) {
    let mut dst = self.context_memory_mut();
    unsafe { set_data(dst.as_mut(), data, memory_offset, data_offset, len) };
}
pub fn global_to_local_set_data(
    &mut self,
    memory_offset: usize,
    data_offset: usize,
    len: usize,
    data_range: Range<usize>,
) {
    let mut buffer = self.buffer_ref_mut();
    let (src, dst) = buffer.split_at_mut(self.my_checkpoint);
    let src = if data_range.is_empty() {
        &mut []
    } else {
        src.get_mut(data_range).unwrap()
    };
    unsafe { set_data(dst, src, memory_offset, data_offset, len) };
}
pub fn copy(&mut self, dst: usize, src: usize, len: usize) {
    self.context_memory_mut().copy_within(src..src + len, dst);
}
pub fn context_memory(&self) -> Ref<'_, [u8]> {
    let buffer = self.buffer_ref();
    Ref::map(buffer, |b| match b.get(self.my_checkpoint..) {
        Some(slice) => slice,
        None => debug_unreachable!("Context memory should be always valid"),
    })
}
pub fn context_memory_mut(&mut self) -> RefMut<'_, [u8]> {
    let buffer = self.buffer_ref_mut();
    RefMut::map(buffer, |b| match b.get_mut(self.my_checkpoint..) {
        Some(slice) => slice,
        None => debug_unreachable!("Context memory should be always valid"),
    })
}
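Two details of the accessors above are worth seeing in isolation: words are stored as 32 big-endian bytes, and copy relies on copy_within, which tolerates overlapping ranges. The sketch below works on a plain byte slice with hypothetical helper names, and uses u128 as a stand-in for U256 so it needs only the standard library.

```rust
/// Write `value` as a 32-byte big-endian word at `offset`.
/// `u128` stands in for U256, so the upper 16 bytes are always zero.
pub fn set_word_be(memory: &mut [u8], offset: usize, value: u128) {
    let word = &mut memory[offset..offset + 32];
    word[..16].fill(0);
    word[16..].copy_from_slice(&value.to_be_bytes());
}

/// Read back the low 16 bytes of the 32-byte word at `offset`.
pub fn get_word_be(memory: &[u8], offset: usize) -> u128 {
    let mut low = [0u8; 16];
    low.copy_from_slice(&memory[offset + 16..offset + 32]);
    u128::from_be_bytes(low)
}

/// Copy `len` bytes from `src` to `dst` inside one region.
/// `copy_within` behaves like memmove, so the ranges may overlap.
pub fn copy_in_frame(memory: &mut [u8], dst: usize, src: usize, len: usize) {
    memory.copy_within(src..src + len, dst);
}
```

Writing 0x0102 at offset 0 puts 0x01 in byte 30 and 0x02 in byte 31, the least significant end of the word.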

At the bottom of the file, two functions are provided for expanding memory.

As you can see, memory expansion always happens in multiples of 32 bytes:

#[inline]
pub fn resize_memory<Memory: MemoryTr>(
    gas: &mut crate::Gas,
    memory: &mut Memory,
    gas_table: &GasParams,
    offset: usize,
    len: usize,
) -> Result<(), InstructionResult> {
    #[cfg(feature = "memory_limit")]
    if memory.limit_reached(offset, len) {
        return Err(InstructionResult::MemoryLimitOOG);
    }

    let new_num_words = num_words(offset.saturating_add(len));
    if new_num_words > gas.memory().words_num {
        return resize_memory_cold(gas, memory, gas_table, new_num_words);
    }

    Ok(())
}
#[cold]
#[inline(never)]
fn resize_memory_cold<Memory: MemoryTr>(
    gas: &mut crate::Gas,
    memory: &mut Memory,
    gas_table: &GasParams,
    new_num_words: usize,
) -> Result<(), InstructionResult> {
    let cost = gas_table.memory_cost(new_num_words);
    let cost = unsafe {
        gas.memory_mut()
            .set_words_num(new_num_words, cost)
            .unwrap_unchecked()
    };

    if !gas.record_cost(cost) {
        return Err(InstructionResult::MemoryOOG);
    }
    memory.resize(new_num_words * 32);
    Ok(())
}
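The expansion flow above can be reduced to a few lines: round the requested end offset up to whole words, do nothing if memory is already that large, otherwise charge only the difference between the new and old total cost. The sketch below is a simplified model under assumptions: ToyMemoryGas and expand are hypothetical names, and the cost formula (3 gas per word plus words squared divided by 512) is the commonly cited mainnet formula rather than a value read from GasParams.

```rust
/// Number of 32-byte words needed to cover `len` bytes, rounding up.
pub const fn num_words(len: usize) -> usize {
    len.saturating_add(31) / 32
}

/// Assumed mainnet formula: linear part plus quadratic part.
pub fn memory_cost(words: usize) -> u64 {
    let w = words as u64;
    3u64.saturating_mul(w)
        .saturating_add(w.saturating_mul(w) / 512)
}

/// Toy tracker for how much memory has already been paid for.
pub struct ToyMemoryGas {
    pub words_num: usize,
    pub expansion_cost: u64,
}

/// Returns the extra gas to charge and the new byte length,
/// or `None` when the access fits in already-paid memory.
pub fn expand(gas: &mut ToyMemoryGas, offset: usize, len: usize) -> Option<(u64, usize)> {
    let new_words = num_words(offset.saturating_add(len));
    if new_words <= gas.words_num {
        return None;
    }
    let new_cost = memory_cost(new_words);
    // Only the increase over what was already charged is billed.
    let delta = new_cost - gas.expansion_cost;
    gas.words_num = new_words;
    gas.expansion_cost = new_cost;
    // Memory always grows to a whole number of 32-byte words.
    Some((delta, new_words * 32))
}
```

Touching 33 bytes from an empty memory therefore grows it to 64 bytes for 6 gas, and a later access inside those 64 bytes costs nothing extra.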
