DEV Community

Crypto Abic
Analysis of Bitroot’s Parallel EVM Technology: Optimistic Parallelism

Blockchain technology, especially the sequential execution bottleneck of the Ethereum Virtual Machine (EVM), has become a major obstacle to large-scale applications. This article focuses on the optimistic parallelization implementation in Bitroot’s parallel EVM, including its conflict detection algorithm and rollback mechanism optimization, and compares it with mainstream parallel execution technologies such as Solana’s Sealevel, Aptos’ Block-STM, and Sui’s object model.

Blockchain Scalability Challenges and EVM Bottlenecks

Traditional blockchain systems, especially Ethereum, prioritize security and decentralization in their design, which leads to fundamental limitations in scalability. One of the core features of the Ethereum Virtual Machine (EVM) is its inherent single-threaded execution mode: all transactions must be processed one by one in strict order. This sequential processing mechanism is crucial to maintaining the certainty and consistency of the network state. It ensures that no matter which node executes the same smart contract code, the same final result is produced, which is indispensable for establishing and maintaining network trust and consensus.

However, this strict sequential execution mode also brings significant performance bottlenecks. It greatly limits the network's transactions per second (TPS) and leads to high gas fees when the network is congested. The EVM's global state tree model further exacerbates this bottleneck, because all transactions, regardless of their independence, must interact with and update a single, large state.

This design choice reveals the inherent trade-off between determinism/consistency and scalability in the traditional EVM: some throughput is sacrificed to maintain core blockchain principles. Therefore, any parallelization scheme aimed at improving EVM performance must introduce robust mechanisms (such as conflict detection and rollback) that achieve final consistency without compromising the integrity of the ledger, thereby breaking through the limitations of sequential execution.


Necessity and advantages of parallel execution

In order to break through the scalability bottleneck of traditional blockchains, parallel execution has become a key direction in blockchain architecture innovation. The core idea of parallel execution is to allow multiple transactions to be processed simultaneously, fundamentally removing the sequential execution limitation of the traditional EVM. This approach is essentially a "horizontal expansion" that improves overall efficiency by distributing the workload across multiple processing units.

The main advantages of parallelization include significantly improved throughput (i.e., more transactions per second), reduced transaction latency, and lower gas fees. These performance improvements are achieved by effectively utilizing modern multi-core hardware, which often fails to fully realize its potential in a sequential execution environment. In addition to pure performance improvements, parallel EVMs also aim to improve user experience by supporting more users and more complex decentralized applications. At the same time, they strive to maintain compatibility with existing Ethereum smart contracts and development tools, thereby reducing the migration cost for developers.

Parallelization can be seen as a direct response to the "blockchain trilemma". The traditional EVM prioritizes decentralization and security through sequential execution; parallelization directly addresses the scalability problem. By enabling higher throughput and lower fees, the parallelized EVM is expected to make decentralized applications more practical and accessible, thereby indirectly enhancing decentralization by lowering the barrier to participation for users and validators (e.g., lower staking hardware requirements). This expands the focus from pure technical performance to broader ecosystem health and adoption, as a scalable network is able to support more participants and use cases.

Overview of the two main parallelization strategies: deterministic and optimistic parallelization

The design space of parallel blockchains revolves around two distinct strategies to manage state access and potential conflicts.

Deterministic parallelism is a “pessimistic” concurrency control method. This strategy requires transactions to explicitly declare all their state dependencies (i.e., read-write sets) before execution. This advance declaration enables the system to analyze dependencies and identify transactions that can be processed in parallel without conflicts, thereby avoiding the need for speculative execution or rollback. Although deterministic parallelism ensures predictability and efficiency when transactions are mostly independent of each other, it also imposes a significant burden on developers, requiring them to precisely define all possible state accesses.

In contrast, optimistic concurrency control (OCC) assumes that conflicts are rare. In this mode, transactions are executed in parallel without pre-declaring dependencies or locking resources. Detection of conflicts is deferred to the validation phase after speculative concurrency. If conflicts are detected at this stage, the affected transactions are rolled back and usually re-executed. This approach provides developers with greater flexibility because they do not need to analyze dependencies in advance. However, its efficiency is highly dependent on low data contention, because frequent conflicts will lead to performance degradation due to re-execution.

The choice between these two paradigms reflects the fundamental trade-off between developer burden and runtime efficiency. Deterministic concurrency shifts complexity to the development phase, requiring developers to do a lot of upfront work to clearly define dependencies. If the upfront investment can perfectly capture the dependencies, it can theoretically lead to efficient runtime execution. Optimistic parallelism reduces the burden on developers, allowing a “fire-and-forget” execution mode, but it places a greater load on the runtime system to dynamically detect and resolve conflicts. If conflicts are frequent, this can lead to significant performance degradation, so this choice often reflects a philosophical decision about where to put complexity: at development time or at runtime. This also means that which approach is “best” depends highly on the typical workload and transaction pattern of the blockchain, as well as the preferences of the target developer community.

Deterministic Parallelism

Basic Principles and Implementation Logic

Deterministic parallelism represents a "pessimistic" concurrency control approach, the core of which is to identify and manage potential conflicts before transactions are executed. Its basic principle is that every transaction must declare in advance the state dependencies (i.e., read-write sets) it will access or modify. This explicit declaration is critical for the system to understand which parts of the blockchain state the transaction will affect.

Based on these pre-declared dependencies, a "dependency graph" or "conflict matrix" is constructed, detailing the interdependencies between transactions within a block. The scheduler uses this graph to identify groups of non-conflicting transactions that can be executed in parallel and distributes them to multiple processing units. Transactions found to have dependencies are automatically serialized to ensure a consistent and predictable execution order. A major advantage of this approach is that, since conflicts are prevented at the design stage, transactions are never executed repeatedly, and there is no pre-execution, pre-analysis, or retry process.

The deterministic paradigm shifts the cost of determinism from runtime complexity to developer predictability. Avoiding runtime conflicts and duplicate execution clearly brings performance benefits, but the cost is that developers must explicitly define all state dependencies for each transaction, or pre-specify conflicts between transactions. This is a significant burden, and if dependencies are not perfectly captured or declarations are too broad, transactions that are not actually in conflict may be forced to execute sequentially. Although deterministic parallelism is theoretically optimal for parallelism, in practice it faces challenges in developer adoption and possible underutilization of parallelism due to conservative dependency declarations, highlighting the tension between theoretical efficiency and practical usability.

```
┌─────────────────────────────────────────────────────────┐
│                 Transaction input pool                  │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│           State dependency declaration phase            │
│  Tx1: {Read: [addr1, addr2], Write: [addr3]}            │
│  Tx2: {Read: [addr4],        Write: [addr5]}            │
│  Tx3: {Read: [addr1],        Write: [addr6]}            │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                   Dependency analysis                   │
│  Tx1 ── conflict with Tx3 (addr1)                       │
│  Tx2 ── independent                                     │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                 Deterministic grouping                  │
│  Group 1: Tx2 (no dependency)                           │
│  Group 2: Tx1 (dependency)                              │
│  Group 3: Tx3 (dependency)                              │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                   Parallel execution                    │
│  Tx2: execute immediately                               │
│  Tx1: execute after Group 1 completes                   │
│  Tx3: execute after Group 1 completes                   │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                      State update                       │
│  Update addr5 │ Update addr3 │ Update addr6             │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                 Final state submission                  │
└─────────────────────────────────────────────────────────┘
```
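As a concrete (and heavily simplified) sketch of the declare → analyze → group flow above, the following Python builds execution "waves" from pre-declared read-write sets. `Tx`, `conflicts`, and `build_groups` are illustrative names, not any project's API, and the example uses a genuine write→read dependency (Tx1 writes addr1, Tx3 reads it) so the serialization is forced by the conflict rule itself:

```python
# Hypothetical sketch of deterministic scheduling: transactions declare
# read/write sets up front, and the scheduler derives conflict-free waves.
from dataclasses import dataclass

@dataclass
class Tx:
    txid: int
    reads: frozenset
    writes: frozenset

def conflicts(a: Tx, b: Tx) -> bool:
    # Two txs conflict if one writes state the other reads or writes.
    return bool(a.writes & (b.reads | b.writes) or b.writes & (a.reads | a.writes))

def build_groups(txs):
    """Greedy wave scheduling: each wave holds mutually non-conflicting txs,
    and a tx is placed after every conflicting predecessor (order-preserving)."""
    waves = []
    placed = {}  # txid -> wave index
    for tx in txs:
        # earliest wave strictly after every conflicting predecessor
        earliest = 0
        for prev in txs:
            if prev.txid < tx.txid and conflicts(prev, tx):
                earliest = max(earliest, placed[prev.txid] + 1)
        # also avoid conflicts with txs already inside the chosen wave
        w = earliest
        while w < len(waves) and any(conflicts(tx, other) for other in waves[w]):
            w += 1
        if w == len(waves):
            waves.append([])
        waves[w].append(tx)
        placed[tx.txid] = w
    return waves

txs = [
    Tx(1, frozenset({"addr2"}), frozenset({"addr1"})),  # writes addr1
    Tx(2, frozenset({"addr4"}), frozenset({"addr5"})),  # independent
    Tx(3, frozenset({"addr1"}), frozenset({"addr6"})),  # reads addr1 -> after Tx1
]
for i, wave in enumerate(build_groups(txs)):
    print(i, [t.txid for t in wave])  # 0 [1, 2]  then  1 [3]
```

Tx1 and Tx2 run in the first wave; Tx3 is serialized into a second wave because its read set overlaps Tx1's write set.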

Advantages and Challenges

Understanding the trade-offs inherent in deterministic parallelism is critical to evaluating its suitability for different blockchain applications.

Advantages:

- Predictability and efficiency: Deterministic parallelism guarantees consistent execution results without speculative execution or rollbacks, resulting in stable and predictable performance.

- Resource optimization: Since dependencies are known in advance, the system can effectively pre-fetch required data into memory, thereby optimizing CPU utilization.

- No runtime conflict overhead: Since conflicts are prevented at the design stage, the computational cost of detecting and resolving conflicts during or after execution is eliminated.

Challenges:

- Developer complexity: The most significant challenge is that developers must explicitly define all state dependencies (read-write sets) for each transaction. This can be a complex, time-consuming, and error-prone process, especially for complex smart contracts.

- Rigidity and insufficient parallelism: If dependencies are not perfectly captured or are conservatively over-declared, transactions that could run in parallel may be unnecessarily serialized, meaning that the theoretical maximum parallelism may not be achieved in practice.

- Difficulty with dynamic state access: Achieving deterministic parallelism is particularly challenging for smart contracts whose state access patterns are not statically known but are determined by runtime conditional logic or external inputs.

Developer experience is a critical but often overlooked factor in blockchain adoption. Deterministic parallelism provides theoretical performance benefits by avoiding runtime conflicts, but even a technology superior in raw TPS may not achieve widespread adoption if the developer experience is poor. Blockchain is an ecosystem, and attracting and retaining developers is critical. This suggests that solutions that simplify the development process, even at some runtime cost, may gain greater traction. The long-term viability of a blockchain platform depends not only on its peak performance but also on how easy it is for developers to build and innovate on it.

Solana’s Sealevel Model

Solana’s Sealevel is a prominent example of achieving deterministic parallelism, demonstrating the power of this approach and its inherent tradeoffs.

Solana’s Sealevel is a parallel smart contract runtime environment that is very different from Ethereum’s sequential EVM. It enables large-scale parallel transaction processing by requiring transactions to explicitly declare the accounts they will read or write before execution. This “read-write aware execution model” enables the Solana Virtual Machine (SVM) to build a dependency graph based on which the SVM schedules non-overlapping transactions to run in parallel on multiple CPU cores, and conflicting transactions are automatically serialized.

Solana also uses Proof of History (PoH), a verifiable cryptographic clock, to pre-order transactions. This mechanism reduces synchronization overhead and enables aggressive parallelism by providing historical context for event sequences. The SVM adopts a “shared nothing concurrency model” and multi-version concurrency control (MVCC), which allows concurrent reads without blocking writes, further ensuring deterministic execution across validators.
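The MVCC idea mentioned above, concurrent reads that are never blocked by writes, can be sketched with a minimal multi-version store: writers append `(version, value)` pairs, and readers pin a snapshot version. This illustrates the general MVCC technique only; `MVStore` is a hypothetical name, not the SVM's implementation:

```python
# Minimal multi-version store sketch (illustrative, not the SVM itself):
# readers pin a snapshot version and are never blocked by concurrent writers.
class MVStore:
    def __init__(self):
        self.versions = {}  # key -> append-only list of (version, value)
        self.clock = 0

    def write(self, key, value):
        self.clock += 1
        self.versions.setdefault(key, []).append((self.clock, value))
        return self.clock

    def read(self, key, snapshot):
        # Newest value written at or before `snapshot`; None if absent.
        for version, value in reversed(self.versions.get(key, [])):
            if version <= snapshot:
                return value
        return None

store = MVStore()
v1 = store.write("balance", 100)
snap = v1                       # reader pins its snapshot here
store.write("balance", 40)      # a later writer does not disturb the reader
print(store.read("balance", snap))         # 100
print(store.read("balance", store.clock))  # 40
```

The reader holding `snap` keeps seeing 100 even after the concurrent write, which is how deterministic results survive aggressive parallelism.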

Pros: Solana is designed for high-speed transactions, theoretically capable of processing up to 65,000 transactions per second (TPS) under optimal conditions, with an impressively low block time (~400 ms), making it ideal for high-frequency applications such as DeFi and GameFi. Its localized fee market helps isolate congestion to specific applications, preventing network-wide fee spikes.

Challenges: Despite its elegant design, requiring explicit declarations of state dependencies increases developer complexity. Empirical analysis shows that Solana blocks can contain “significantly longer conflict chains” (~59% of block size, compared to 18% on Ethereum) and “lower proportions of unique transactions” (only 4%, compared to 51% on Ethereum), suggesting that even with advance declarations, actual transaction patterns can still lead to dense dependency patterns or high contention.

Solana’s deterministic approach requires transactions to explicitly specify the data they will interact with. While this theoretically enables parallelization, the empirical figures above suggest that actual applications on Solana may still contend heavily for shared state, or that developers may fail to declare dependencies optimally, resulting in conservative serialization. Another possibility is that applications built on Solana inherently involve more shared-state interactions (e.g., high-frequency trading on DEXs), which naturally produce longer conflict chains. Even deterministic systems, then, are not immune to hotspots or high contention: the theoretical advantage of declaring dependencies up front can be eroded by the complexity and dynamics of real DApp interactions, producing a different kind of bottleneck (conflict-chain length) than the EVM's sequential one.

Optimistic Concurrency Control: Core Mechanisms and Technical Details

Optimistic Concurrency Control (OCC) Principle

Optimistic Concurrency Control (OCC) provides a paradigm different from deterministic methods: it prioritizes initial concurrency rather than preventing conflicts in advance. The basic assumption of OCC is that conflicts between concurrently executed transactions are rare. This "optimistic" premise allows transactions to be processed in parallel without acquiring locks on shared resources at the outset.

The core idea is to "process transactions as if there are no conflicts". This method skips the initial sorting stage and proceeds directly to concurrent processing. OCC does not prevent conflicts but postpones conflict detection to a subsequent "verification" stage; if a conflict is detected there, the offending transaction is rolled back and usually re-executed. OCC is generally more effective in environments with low data contention, because it avoids the overhead of lock management and of transactions waiting for one another, which can yield higher throughput. If contention for data resources is frequent, however, repeated transaction restarts may significantly degrade performance.

The "optimistic" assumption is a double-edged sword that turns a static problem into a dynamic one. The core of OCC is the assumption of low contention, a powerful simplification that eases the developer experience and allows maximum initial parallelism. If the assumption is violated (i.e., under high contention), the system incurs significant overhead from repeated transaction restarts and re-executions. OCC does not eliminate conflicts; it merely defers their detection and resolution to runtime, turning a static design-time problem (deterministic dependency declaration) into a dynamic runtime one (conflict detection and rollback) and shifting the bottleneck "from account lock to conflict rate". The effectiveness of OCC is therefore highly dependent on actual transaction patterns and on the efficiency of its conflict resolution mechanism, making workload analysis crucial for a successful implementation.

Implementation Logic and Workflow

The actual implementation of OCC involves a series of steps designed to execute transactions in parallel while ensuring eventual consistency. The general workflow of optimistic parallel execution (OCC) usually includes the following stages:

  1. Memory pool: A batch of transactions is collected and placed in a pool, ready for processing.

  2. Execution: Multiple executors or worker threads take transactions from the pool and process them in parallel. During this speculative execution, each thread operates on a temporary, independent copy of the state database, often called the "pending-stateDB". Each transaction records in detail its "read set" (the data it accesses) and its "write set" (the data it modifies).

  3. Sorting: After parallel execution, the processed transactions are reordered in their original submission order, which is the canonical order of block inclusion.

  4. Conflict verification: This is a critical stage for enforcing consistency. The system checks whether the input (data read) of each transaction has been changed by the results (data written) of “earlier submitted” transactions in the determined order. This involves comparing speculative state changes with the actual state.

  5. Re-execution: If a conflict is detected (meaning a state dependency has changed or a transaction read stale data), the conflicting transaction will be marked invalid and returned to the pool for reprocessing. This ensures that only valid state transitions are committed in the end.

  6. Block inclusion: Once all transactions are verified and correctly ordered with no unresolved conflicts, their state changes are synchronized to the global state database and included in the final block.

```
┌─────────────────────────────────────────────────────────┐
│                 Transaction input pool                  │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│              Optimistic parallel execution              │
│  Tx1: direct execution                                  │
│  Tx2: direct execution                                  │
│  Tx3: direct execution                                  │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                Conflict detection phase                 │
│  Real-time monitoring of state access                   │
│  Conflict detected: Tx1 and Tx3 access the same state   │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                    Conflict handling                    │
│  Tx1: continue execution                                │
│  Tx2: continue execution                                │
│  Tx3: roll back and retry                               │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                    State submission                     │
│  Commit state 1 (no conflict)                           │
│  Commit state 2 (no conflict)                           │
│  Tx3: waiting for retry (conflict)                      │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                      Retry queue                        │
│  Tx3 enters the retry queue,                            │
│  waiting for the next round of execution                │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                 Final state submission                  │
└─────────────────────────────────────────────────────────┘
```

This approach ensures that the final state of the blockchain is correct, just as if transactions had been processed sequentially, but with significantly higher throughput due to parallel processing.
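The execute → validate → re-execute loop described above can be sketched as follows. This is a toy single-threaded simulation of the optimistic flow (the "parallel" phase is an ordinary loop over a shared snapshot), and the function names are hypothetical, not any project's API:

```python
# Sketch of the optimistic execute/validate/re-execute loop under the
# low-contention assumption. Each tx body fn(snapshot) returns its read set
# and a dict of writes, i.e. the recorded read-write sets.
def occ_execute_block(txs, state):
    """txs: list of (txid, fn) in canonical order; mutates `state` in place."""
    pending = list(txs)
    while pending:
        # Phase 1: speculative execution against a shared snapshot
        snapshot = dict(state)
        results = []
        for txid, fn in pending:  # conceptually parallel worker threads
            read_set, write_dict = fn(snapshot)
            results.append((txid, read_set, write_dict))
        # Phase 2: validate in canonical order; commit winners, retry losers
        committed_writes = set()
        retry = set()
        for txid, read_set, write_dict in results:
            if read_set & committed_writes:
                retry.add(txid)          # read stale data -> re-execute next round
            else:
                state.update(write_dict)
                committed_writes |= set(write_dict)
        pending = [(t, f) for (t, f) in pending if t in retry]
    return state

def t1(s):  # increments A
    return ({"A"}, {"A": s["A"] + 1})

def t2(s):  # reads A, so it must see t1's committed value
    return ({"A"}, {"B": s["A"] * 2})

state = {"A": 10, "B": 0}
occ_execute_block([(1, t1), (2, t2)], state)
print(state)  # {'A': 11, 'B': 22}
```

In the first round both transactions run against `A = 10`; validation commits t1 but flags t2 (it read `A`, which t1 just rewrote), so t2 re-executes against `A = 11` and commits `B = 22`.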

Temporary states and read-write sets play a crucial role in OCC: each parallel execution thread operates on a "pending-state database" (pending-stateDB) and records the state variables it accesses and modifies, i.e., its read-write sets. Maintaining a speculative state per thread allows independent execution without immediately modifying the global state, and the read-write set acts as a "fingerprint" of each transaction's state access, which is crucial for the post-execution verification phase. Without these temporary states and explicit access sets, conflict detection becomes impossible or inefficient, leading to non-deterministic results. Tracking these speculative states does incur memory and computational overhead, which can itself become a bottleneck if not managed properly.

Conflict Detection Algorithm

The effectiveness of optimistic parallelism depends on its robust and efficient conflict detection mechanism. In standard OCC, conflict detection mainly occurs in the “conflict verification” or “validation” step after speculative execution. The system verifies that the input (read data) of each transaction is not invalidated by the results (written data) of “earlier submitted” transactions in the determined block order.

```
┌─────────────────────────────────────────────────────────┐
│              State item conflict detection              │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│               Transaction execution order               │
│  Transaction Ti (i < j)                                 │
│  Transaction Tj                                         │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                State item access pattern                │
│  Ti: writes state item X                                │
│  Tj: reads state item X                                 │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                Conflict detection process               │
│  1. Monitor Ti's WriteSet                               │
│  2. Monitor Tj's ReadSet                                │
│  3. Detect the shared state item X                      │
│  4. Confirm that Tj reads after Ti writes               │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│              Conflict determination result              │
│  Determination: Tj operated on stale data               │
│  Cause: read state item X modified by Ti                │
│  Result: marked as conflict, must be re-executed        │
└───────────────────────────┬─────────────────────────────┘
                            ↓
┌─────────────────────────────────────────────────────────┐
│                Conflict handling strategy               │
└─────────────────────────────────────────────────────────┘
```

A conflict is formally defined as occurring if transaction Ti writes a state item and a subsequent transaction Tj (where i < j) subsequently reads that state item. This indicates that Tj operates on stale data. Implementations like Reddio monitor the read-write sets of different transactions. If multiple transactions are detected attempting to read or write the same state item, a conflict is flagged.
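This rule can be encoded directly: walk the recorded read-write sets in canonical order, accumulate the writes seen so far, and flag any transaction whose read set intersects an earlier write. A minimal sketch, not any specific project's implementation:

```python
# Direct encoding of the conflict rule: Tj conflicts if it read a state
# item that some earlier transaction Ti (i < j) wrote.
def find_conflicts(read_sets, write_sets):
    """read_sets[i]/write_sets[i] are the sets recorded for transaction i,
    in canonical order. Returns indices that must be re-executed."""
    stale = []
    written_so_far = set()
    for j, reads in enumerate(read_sets):
        if reads & written_so_far:
            stale.append(j)          # Tj operated on stale data
        written_so_far |= write_sets[j]
    return stale

# Ti (index 0) writes X; Tj (index 1) reads X -> Tj is flagged.
print(find_conflicts([set(), {"X"}], [{"X"}, {"Y"}]))  # [1]
```

A full implementation would also discard a flagged transaction's writes before validating later transactions; the sketch keeps them to stay short.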

More advanced OCC variants, such as Aptos’ Block-STM, introduce “dynamic parallelism” where they detect and resolve conflicts “during” execution, not just “after” execution. This involves real-time monitoring of read-write sets and possible temporary locks on conflicting accounts.

Bitroot claims to have a “three-phase conflict detection mechanism,” suggesting that it takes a multi-layered approach to identifying and managing conflicts, although the specifics of these phases are not elaborated in the research materials.

The timing of conflict detection is a key design choice with significant performance impact. The research material makes the distinction clear: traditional OCC detects conflicts after execution, while Block-STM does so during execution. Post-execution detection allows maximum initial parallelism but can waste computation if many transactions must be re-executed; in-execution detection aims to minimize wasted work by identifying conflicts earlier, at the cost of some serialization or monitoring overhead during execution. The trade-off is that earlier detection introduces overhead up front but reduces the cost of full rollbacks and re-executions. This makes "optimism" a spectrum rather than a single method: designs differ in how far conflict resolution is postponed, with the shared goal of maximizing overall throughput by balancing speculative execution against efficient conflict resolution.


Rollback mechanism and optimization points

In an optimistic parallel execution environment, once a conflict is detected, an effective rollback mechanism is crucial to ensure state consistency and minimize performance degradation. The basic response after detecting a conflict in OCC is to “abort the conflicting transaction” and “return it to the pool for reprocessing”. This ensures that only valid state transitions are eventually submitted to the blockchain.

Optimization points for rollback:

- Minimize re-execution: To prevent repeated conflicts and infinite re-execution cycles, the system can adjust the priority of conflicting transactions or re-queue them in an order that reduces the likelihood of repeated conflicts.

- Selective rollback: More sophisticated systems, such as Aptos's Block-STM, implement "selective rollback". Instead of rolling back an entire batch or block, they roll back only the conflicting transactions, allowing non-conflicting transactions to continue uninterrupted, which significantly minimizes wasted computation.

- Conflict resolution mechanisms: Beyond simple re-execution, implementations can introduce lock-based access control or transaction isolation strategies to manage conflicts more effectively during reprocessing, possibly involving temporary locks on affected state items to ensure atomicity during conflict resolution.

- Temporary state database: Approaches like Reddio use a temporary state database (pending-stateDB) for each thread during speculative execution. This design simplifies rollbacks because only the local pending-stateDB needs to be discarded or reset, rather than reverting changes to the global state.

- Asynchronous state management: Further optimization involves decoupling execution from storage operations. For example, Reddio uses "direct state reads" (retrieving state values directly from the key-value database without traversing the Merkle Patricia Trie), "asynchronous parallel node loading" (preloading Trie nodes in parallel with execution), and "streamlined state management" (overlapping execution, state retrieval, and storage updates). These techniques reduce I/O bottlenecks and enable more efficient state updates and faster rollbacks by making state changes speculative and asynchronous before verification.
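A minimal sketch of the pending-state idea, assuming a simple dict-backed store: speculative writes land in a per-transaction overlay, so rollback is just discarding the overlay rather than reverting the global state. `PendingState` is an illustrative name, not Reddio's API:

```python
# Per-thread pending state overlay in the spirit of a "pending-stateDB":
# reads fall through to the global state, writes stay local until commit.
class PendingState:
    def __init__(self, global_state):
        self.global_state = global_state  # shared, untouched during execution
        self.writes = {}                  # speculative write set
        self.reads = set()                # read set, kept for validation

    def get(self, key):
        self.reads.add(key)
        return self.writes.get(key, self.global_state.get(key))

    def put(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Only reached after validation succeeds.
        self.global_state.update(self.writes)

    def rollback(self):
        # No global mutation to undo: just drop the speculative overlay.
        self.writes.clear()
        self.reads.clear()

world = {"X": 1}
p = PendingState(world)
p.put("X", 99)    # speculative write, invisible to `world`
p.rollback()      # conflict detected -> discard the overlay
print(world["X"], p.get("X"))  # 1 1
```

Because the global state is never touched before `commit()`, a conflicting transaction can be abandoned at essentially zero cost, which is exactly why this layout makes rollbacks cheap.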

The rollback mechanism has evolved from simple re-execution to complex, fine-grained recovery. Initially, rollback seemed to be little more than re-execution, but the research material reveals a progression: from re-queueing and priority adjustment, to selectively rolling back only conflicting transactions, to optimizing the underlying state management. The efficiency of optimistic parallelism therefore lies not only in the detection of conflicts but in how efficiently the system recovers from them. Simple re-execution may lead to performance degradation; advanced techniques such as selective rollback and optimized state persistence (e.g., local temporary state, asynchronous commits) are critical to making OCC viable in high-throughput environments. The "resolution cost" of conflicts is a key metric for evaluating OCC implementations, and continued innovation in this area is critical to pushing the boundaries of parallel blockchain performance.

Application of Optimistic Parallelism in Bitroot

Bitroot's core innovation in transaction execution lies in its optimistic parallelization implementation, which aims to achieve high efficiency without placing significant additional burden on developers. Bitroot's parallel execution engine is built on the "optimistic parallel execution model". Bitroot claims the model is "the first in the industry with high technical barriers", although other projects such as Sei and Monad have also adopted optimistic concurrency control (OCC).

Bitroot’s approach combines transaction dependency analysis with optimistic execution, which suggests a hybrid strategy: it folds a degree of dependency awareness, typically associated with deterministic models, into an optimistic framework to optimize initial scheduling. A key technical detail is its “three-phase conflict detection mechanism”. This multi-phase approach is designed to ensure correctness and prevent invalid retries, resulting in a claimed transaction throughput 8–15 times higher than that of the traditional EVM. In addition, the “automatic state tracking” feature is critical to its optimistic model because it frees developers from manually defining state access patterns, a significant advantage over deterministic approaches.

  1. Pre-execution/batch selection: Before parallel execution, obvious conflicts are reduced through initial screening or heuristics (similar to Reddio’s explicit conflict checking during batch acquisition), likely driven by the “transaction dependency analysis” mentioned above.

  2. In-execution/dynamic detection: Monitor read-write sets in real time, detect conflicts immediately when they occur, and possibly suspend or mark transactions for immediate re-evaluation to minimize wasted computation.

  3. Post-execution/verification: Perform a final, comprehensive check on all speculative execution results, verify against the determined order, and roll back if there are any implicit conflicts.

┌────────────────────────────────────────────────┐
│   Three-stage conflict detection mechanism     │
└────────────────────────┬───────────────────────┘
┌────────────────────────────────────────────────┐
│ Stage 1: Pre-execution / batch selection       │
│   · Transaction dependency analysis            │
│   · Build transaction DAG                      │
│   · Initial conflict screening                 │
└────────────────────────┬───────────────────────┘
┌────────────────────────────────────────────────┐
│ Stage 2: In-execution / dynamic detection      │
│   · Real-time monitoring of read/write sets    │
│   · Dynamic conflict identification            │
│   · Temporary locking of conflicting accounts  │
└────────────────────────┬───────────────────────┘
┌────────────────────────────────────────────────┐
│ Stage 3: Post-execution / verification         │
│   · Final conflict verification                │
│   · State consistency check                    │
│   · Selective rollback processing              │
└────────────────────────┬───────────────────────┘
┌────────────────────────────────────────────────┐
│              Final state commit                │
└────────────────────────────────────────────────┘
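The three-stage flow above can be sketched end to end. This is a simplified illustrative model under assumed data shapes (transactions declare read/write key sets up front), not Bitroot’s actual engine.

```python
# Minimal sketch of the three-stage pipeline: (1) pre-screen by static
# key overlap into independent lanes, (2) execute lanes (conceptually
# in parallel; sequential here for clarity), (3) validate and commit.

def prescreen(txs):
    """Stage 1: group txs whose declared keys overlap into one lane."""
    lanes = []
    for tx in txs:
        keys = tx["reads"] | tx["writes"]
        for lane in lanes:
            if lane["keys"] & keys:        # potential conflict
                lane["txs"].append(tx)
                lane["keys"] |= keys
                break
        else:
            lanes.append({"keys": set(keys), "txs": [tx]})
    return lanes

def execute_and_validate(lanes, state):
    """Stages 2-3: run lanes independently, then validate and commit."""
    for lane in lanes:                     # lanes share no keys
        for tx in lane["txs"]:             # within a lane: in order
            snapshot = {k: state.get(k, 0) for k in tx["reads"]}
            writes = tx["fn"](snapshot)
            # Stage 3: within a lane nothing interleaves, so the
            # snapshot is still valid and the writes commit.
            state.update(writes)
    return state

txs = [
    {"reads": {"a"}, "writes": {"a"}, "fn": lambda s: {"a": s["a"] + 1}},
    {"reads": {"b"}, "writes": {"b"}, "fn": lambda s: {"b": s["b"] * 2}},
    {"reads": {"a"}, "writes": {"c"}, "fn": lambda s: {"c": s["a"]}},
]
lanes = prescreen(txs)
assert len(lanes) == 2                     # an {a,c} lane and a {b} lane
state = execute_and_validate(lanes, {"a": 1, "b": 3})
assert state == {"a": 2, "b": 6, "c": 2}
```

The key property the sketch shows: only transactions whose key sets overlap pay a serialization cost; fully independent transactions never wait on each other.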

Bitroot is trying to optimize conflict resolution by combining elements of optimistic (parallel by default, no upfront developer burden) and deterministic (some form of dependency awareness or early detection) approaches, in order to achieve the best of both worlds.

Bitroot’s rollback mechanism optimization points

Bitroot’s rollback mechanism adopts a multi-level design and achieves efficient conflict recovery through its three-stage conflict detection mechanism. In the pre-execution phase, the system quickly screens potential conflicts with an improved counting Bloom filter (CBF), holding the false-positive rate below 0.1%, and pre-groups transactions that are likely to conflict, reducing the probability of conflicts in later phases. In the execution phase, it applies fine-grained read/write locks and versioned state management, using optimistic concurrency control similar to software transactional memory (STM): when a conflict is detected, only the affected transactions are rolled back rather than the entire batch, while versioned state allows concurrent reads and keeps write operations isolated. In the commit phase, the system verifies the correctness of state transitions through hash checks, applies incremental state updates, maintains a state version chain to support fast rollback, and uses an optimized merge algorithm to reduce memory copies.
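A counting Bloom filter is a standard structure; a minimal sketch shows why it suits pre-execution screening. Sizing parameters here are illustrative; the sub-0.1% false-positive target in the text would be hit by tuning the counter-array size and hash count against the expected key load.

```python
# Illustrative counting Bloom filter (CBF): each touched storage key
# increments k counters; unlike a plain Bloom filter, counters can be
# decremented, so keys can be removed when a batch retires.

import hashlib

class CountingBloomFilter:
    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.counters = [0] * size

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.counters[p] += 1

    def remove(self, key):                 # CBFs support deletion
        for p in self._positions(key):
            self.counters[p] -= 1

    def maybe_contains(self, key):
        # May report false positives, never false negatives.
        return all(self.counters[p] > 0 for p in self._positions(key))

cbf = CountingBloomFilter()
cbf.add("slot:0xabc")
assert cbf.maybe_contains("slot:0xabc")        # no false negatives
cbf.remove("slot:0xabc")
assert not cbf.maybe_contains("slot:0xabc")    # deletion supported
```

A “maybe” answer only sends two transactions into the same conservative group; a false positive costs a little parallelism, never correctness, which is why an approximate filter is acceptable in this stage.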

In terms of optimization, Bitroot implements an intelligent retry strategy: retries follow an exponential backoff algorithm, the strategy is adjusted dynamically according to the conflict type, and livelock in high-contention scenarios is effectively avoided. In terms of state management, the system performs fine-grained state dependency analysis, subdividing contract state down to the storage-slot level, and reduces repeated traversals of the state tree through preloading and batch reads, cutting roughly 37% of state access operations per transaction on average. In terms of performance, the system adopts a double-buffer design so that different pipeline stages can proceed simultaneously, implements NUMA-aware scheduling to reduce cross-core communication overhead, and improves CPU utilization by about 22% through a work-stealing algorithm. Together, these optimizations form an efficient, reliable conflict recovery mechanism that lets Bitroot sustain high-throughput parallel processing while maintaining system stability.
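The exponential-backoff part of the retry strategy is easy to make concrete. The base delay and cap below are made-up parameters for illustration.

```python
# Sketch of an exponential-backoff retry schedule for conflicting
# transactions: each failed attempt doubles the delay up to a cap,
# spreading retries out to avoid livelock under heavy contention.

def backoff_schedule(attempts, base_ms=1, cap_ms=64):
    """Delay before each retry: min(base * 2^attempt, cap)."""
    return [min(base_ms * (2 ** a), cap_ms) for a in range(attempts)]

assert backoff_schedule(8) == [1, 2, 4, 8, 16, 32, 64, 64]
# In practice a random jitter term is usually added so that two
# transactions contending on the same slot don't retry in lockstep.
```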

Comparison with Other Parallel Execution Technologies

Block-STM by Aptos

Aptos’ Block-STM is a noteworthy parallel execution engine that adopts the idea of ​​optimistic concurrency control. Block-STM is described as an optimistic parallel execution engine. Its key difference is that it dynamically detects and resolves conflicts “during” execution, not just “after” execution. This means that it is able to “selectively roll back only conflicting transactions”, allowing non-conflicting transactions to continue uninterrupted, thereby significantly reducing wasted computation.

Block-STM leverages software transactional memory (STM) technology and a novel cooperative scheduling mechanism to achieve its dynamic parallelization. This approach eliminates the need for developers to pre-specify transaction conflicts, providing greater flexibility for application development without facing the design limitations of statically declared dependencies.

Aptos’ claimed performance indicators are impressive: up to 160,000 TPS in a simulated environment (based on internal testing), sub-second finality (0.9 seconds), and extremely low gas fees (about $0.00005 per transaction). Its advantages are developer flexibility, efficient conflict resolution, and high throughput. The challenge is that it shifts the bottleneck to the computational overhead of monitoring read and write operations, and its sustained real-world performance remains to be independently verified.

Comparing Aptos to Bitroot, both adopt an optimistic approach. However, Block-STM’s “on-the-fly” conflict resolution is a key difference from standard OCC (and Bitroot’s “three-phase” approach). Block-STM’s dynamic conflict detection aims to catch and resolve conflicts earlier, potentially reducing the waste caused by rolling back entire batches.
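The multi-version idea underlying Block-STM-style engines can be shown with a toy versioned slot. This is a simplified illustration in the spirit of the design, not the Aptos implementation; the class name is an assumption.

```python
# Toy multi-version value: each write is tagged with the writer's
# transaction index, reads see the highest-indexed write below the
# reader's position, and validation re-reads to catch stale reads so
# that only the affected transaction (not the batch) re-executes.

class MultiVersionSlot:
    def __init__(self, initial):
        self.versions = [(-1, initial)]      # (writer_tx_index, value)

    def read(self, reader_idx):
        # Visible version: latest write by a lower-indexed transaction.
        for idx, val in reversed(self.versions):
            if idx < reader_idx:
                return idx, val
        raise RuntimeError("no visible version")

    def write(self, writer_idx, value):
        self.versions.append((writer_idx, value))
        self.versions.sort()

slot = MultiVersionSlot(10)
# tx2 speculatively reads before tx1's write lands:
seen = slot.read(reader_idx=2)
assert seen == (-1, 10)
slot.write(writer_idx=1, value=99)           # tx1's write arrives late
# Validation: tx2 re-reads; the visible version changed, so tx2 alone
# must re-execute while unaffected transactions keep their results.
assert slot.read(reader_idx=2) == (1, 99)
assert slot.read(reader_idx=2) != seen
```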

Sui’s Object Model

Sui introduces a unique data model that is very different from traditional account-based blockchain systems. Sui adopts an “object-centric” data model that treats on-chain assets as independent, mutable objects. This model enables parallel processing by isolating operations on independent objects.

Sui divides objects into “owned objects” and “shared objects”. An owned object has a single owner (a user account or another object, such as an NFT or a token balance), and transactions involving only owned objects can bypass the consensus mechanism for faster finality. A shared object has no designated owner and can be interacted with by multiple users (for example, liquidity pools and NFT minting contracts); transactions involving shared objects require consensus to coordinate reads and writes.
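The routing rule this implies is simple enough to sketch. The object and field names below are assumptions made for illustration; the real Sui runtime works on Move objects, not dictionaries.

```python
# Illustrative router in the spirit of Sui's object model: a
# transaction touching only owned objects takes the fast path past
# consensus ordering, while touching any shared object requires
# consensus to sequence reads and writes.

def route(tx, objects):
    """Return 'fast_path' or 'consensus' for a transaction."""
    if any(objects[o]["shared"] for o in tx["touches"]):
        return "consensus"
    return "fast_path"

objects = {
    "nft_1":  {"shared": False, "owner": "alice"},   # owned object
    "pool_1": {"shared": True,  "owner": None},      # shared AMM pool
}
assert route({"touches": ["nft_1"]}, objects) == "fast_path"
assert route({"touches": ["nft_1", "pool_1"]}, objects) == "consensus"
```

This is also where the "hot object" contention mentioned below comes from: every transaction touching the same shared pool lands in the consensus queue together.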

Sui’s claimed performance indicators include sub-second finality and high throughput. Its advantages lie in fine-grained state management, enhanced security through isolation, and efficient support for applications such as NFTs and games. However, complex transactions involving shared objects can be challenging, and contention may arise on hot objects.

Sui’s model is fundamentally different from EVM-compatible chains, requiring a new programming paradigm (Move language) and object-centric design. This is in contrast to Bitroot’s focus on EVM compatibility. Bitroot aims to achieve expansion by optimizing the existing EVM, while Sui achieves parallelization by redesigning the underlying data structure.

Other technologies worth noting

In addition to Bitroot, Solana, Aptos, and Sui, there are other important developments and technologies in the field of blockchain parallel execution:

· Ethereum’s sharding roadmap (Danksharding/Proto-Danksharding): Ethereum’s scaling strategy has shifted to being rollup-centric, and its sharding roadmap (Danksharding) focuses on data availability (implemented through “blobs”) rather than execution shards. Proto-Danksharding (EIP-4844) is the first step of Danksharding, introducing a new transaction type that carries large amounts of data (blobs), used mainly by Layer 2 rollups to significantly reduce their fees. Danksharding uses a merged fee market and a single block proposer to simplify cross-shard transactions, which shows that Ethereum positions itself as a data availability layer, relying on rollups to handle most of the execution load.

· Monad: Monad is a fully bytecode-compatible parallel EVM Layer 1 blockchain. It adopts an optimistic parallel execution model and decouples consensus (MonadBFT) from execution to reduce the time and communication steps required for block finality. Monad also develops a high-speed custom key-value database (MonadDb) and an asynchronous execution mechanism, aiming to achieve a throughput of 10,000 transactions per second.

· Sei Network: Sei is a Layer 1 blockchain optimized for digital asset exchange, and its V2 version uses optimistic concurrency to improve developer friendliness. Sei’s expansion strategy revolves around optimizing execution, accelerating consensus, and enhancing storage. It processes transactions by checking conflicts after execution, thereby minimizing overhead.

· Reddio: As a ZKRollup project, Reddio optimizes EVM through multi-threaded parallelism. It provides a temporary state database (pending-stateDB) for each thread and synchronizes state changes after execution. Reddio also introduces a conflict detection mechanism, monitors read-write sets, and marks transactions for re-execution when conflicts are detected. In addition, Reddio solves the storage bottleneck through technologies such as direct state reading, asynchronous parallel node loading, and streamlined state management.
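The per-thread temporary state pattern described for Reddio can be sketched as a small class. This is an illustrative model under assumed names (`PendingState`, plain dictionaries for state), not Reddio’s pending-stateDB itself.

```python
# Sketch of a per-thread pending state: each thread buffers its reads
# and writes locally, then a post-execution check re-validates the
# recorded read set against global state before merging; a stale read
# means the transaction is marked for re-execution.

class PendingState:
    def __init__(self, global_state):
        self.base = global_state          # shared committed state
        self.reads, self.writes = {}, {}  # this thread's local view

    def get(self, key):
        if key not in self.reads:
            self.reads[key] = self.base.get(key, 0)   # record read
        return self.writes.get(key, self.reads[key])

    def put(self, key, value):
        self.writes[key] = value          # buffered, not yet visible

    def validate_and_merge(self):
        # Conflict if any recorded read no longer matches global state.
        if any(self.base.get(k, 0) != v for k, v in self.reads.items()):
            return False                  # caller re-executes this tx
        self.base.update(self.writes)
        return True

state = {"x": 1}
t1 = PendingState(state)
t1.put("x", t1.get("x") + 1)
assert t1.validate_and_merge() is True
assert state["x"] == 2

t2 = PendingState(state)
_ = t2.get("x")                           # records read of x == 2
state["x"] = 5                            # another thread commits first
t2.put("x", 10)
assert t2.validate_and_merge() is False   # stale read -> re-execute
assert state["x"] == 5                    # nothing merged
```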

Together, these technologies reveal a general trend in the blockchain industry toward parallel EVMs, focusing on developer experience and optimizing storage and consensus mechanisms in addition to execution.

Conclusion

Blockchain scalability is the key bottleneck for its large-scale application, and parallel execution is the fundamental way to solve this challenge. This article deeply explores two core strategies in parallel blockchain design: deterministic parallelism and optimistic parallelism.

Performance Improvement and Empirical Data

Bitroot’s parallel EVM implementation shows significant performance improvement. In a standard test environment, transaction throughput reaches 12,000–15,000 TPS, 8–15 times that of the traditional EVM. Average confirmation time drops from the traditional EVM’s 12–15 seconds to 0.8–1.2 seconds, and gas fees fall by roughly 40–60%, especially in high-load scenarios. Through the optimized state access mechanism, state access operations per transaction are reduced by about 37%, significantly lowering storage overhead.

In real application scenarios, Bitroot has demonstrated strong performance. In DeFi scenarios such as high-frequency trading environments like Uniswap V3, the system can process 8,000+ transactions per second. Batch minting in NFT marketplaces is 12 times faster, with gas fees reduced by 45%. In gaming scenarios, the system supports 100,000+ concurrent users while keeping transaction latency within 200 ms.

Technology Evolution Path

The technological evolution of parallel EVM is developing in multiple directions. In the field of conflict detection, the system will introduce machine learning to predict conflict probability, develop adaptive conflict detection thresholds, and achieve more fine-grained state access control. In terms of state management, a hierarchical state tree structure will be adopted to implement distributed state caching and develop an intelligent preloading mechanism. The consensus mechanism will also be improved, including asynchronous consensus and execution separation, dynamic block size adjustment, and cross-shard transaction optimization.

Potential Challenges and Solutions

The development of parallel EVM faces many challenges. On the technical level, the state expansion problem needs to be solved through state compression and archiving mechanisms, cross-shard communication requires the development of efficient cross-shard messaging protocols, and security assurance requires strengthening formal verification and audit mechanisms. On the ecological level, it is necessary to solve problems such as developer migration, application compatibility, and performance monitoring. This requires providing a complete tool chain and documentation to ensure full compatibility with existing EVM contracts and establish a comprehensive performance indicator and monitoring system.


Future Outlook

The development of parallel EVM will drive the evolution of blockchain technology towards higher performance and lower cost. Bitroot’s practice shows that through innovative conflict detection mechanisms and optimized state management, significant performance improvements can be achieved while maintaining EVM compatibility. In the future, with the application of more optimization technologies and the maturity of the ecosystem, parallel EVM is expected to become the mainstream choice of blockchain infrastructure, providing stronger technical support for decentralized applications.

Compared with other parallel execution technologies, Aptos’ Block-STM further optimizes the efficiency of optimistic concurrency control by dynamically detecting and resolving conflicts during execution and performing selective rollbacks. Sui’s object model achieves parallel processing of non-overlapping transactions by treating assets as independent objects and introducing the concepts of “owned objects” and “shared objects”, but its underlying design differs greatly from EVM-compatible chains. Ethereum itself focuses on providing data availability for Layer 2 rollups through Danksharding, moving most of the execution load off the base layer. These different technical routes jointly promote the diversified development of blockchain scalability solutions.

📍 Official website: https://bitroot.co
📍 Twitter: https://x.com/bitroot_
📍 Mirror: https://mirror.xyz/bitroot.eth
