ohmygod

Posted on Mar 23

Differential Testing for DeFi Protocol Forks: A Foundry Framework That Would Have Caught $50M in Exploits

#security #solidity #web3 #defi

Differential Testing for DeFi Protocol Forks: Finding the Bugs That Auditors Miss When You Copy-Paste Aave

Every week, a new lending protocol launches as "Aave but for [chain/asset/narrative]." Every month, one of them gets exploited. The pattern is predictable: fork a battle-tested protocol, modify 3-5% of the codebase for your niche, and accidentally break a security invariant that the original team spent years hardening.

Differential testing means running identical inputs against the original and forked implementations and comparing the outputs. It is one of the most efficient ways to catch these fork-specific bugs. This article shows you how to build a differential testing framework in Foundry, aimed at the class of bugs behind $50M+ in fork-related exploits in 2025-2026.

Why Forks Break: The 3-5% Problem

When you fork Aave V3, you inherit ~50,000 lines of audited Solidity. You change maybe 1,500 lines. But those changes interact with the other 48,500 lines in ways the original auditors never considered.

Real examples from 2025-2026:

Radiant Capital — Forked Aave V2, modified the liquidation bonus calculation. The change introduced a rounding error that made underwater positions unliquidatable under specific collateral ratios.
Mango Markets V4 — Forked Serum's orderbook, modified tick sizes. The change allowed oracle manipulation at a granularity the original codebase prevented.
Planet Finance — Forked Compound, modified accrued interest treatment. The change double-counted interest in certain withdrawal paths, leading to a $10K exploit.

The common thread: changes that look correct in isolation but violate implicit invariants of the original protocol.

Building a Differential Testing Framework

Step 1: Deploy Both Implementations Side by Side

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

import "forge-std/Test.sol";
import {IERC20} from "forge-std/interfaces/IERC20.sol";

// Import both implementations
import {LendingPool as OriginalPool} from "../src/original/LendingPool.sol";
import {LendingPool as ForkedPool} from "../src/forked/LendingPool.sol";

contract DifferentialTest is Test {
    OriginalPool originalPool;
    ForkedPool forkedPool;

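    // Mirror state: same tokens, same oracles, same configs.
    // Placeholder addresses: replace with deployed (mock) ERC-20 tokens,
    // since deal() needs real token code at these addresses.
    address constant USDC = address(0x1);
    address constant WETH = address(0x2);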

    function setUp() public {
        originalPool = new OriginalPool();
        forkedPool = new ForkedPool();

        // Initialize both with identical parameters
        _mirrorConfiguration(originalPool, forkedPool);
    }

    function _mirrorConfiguration(
        OriginalPool orig, 
        ForkedPool fork
    ) internal {
        // Same reserve configs, oracle prices, interest rate models
        // This is the tedious but critical part
        orig.initReserve(USDC, /* params */);
        fork.initReserve(USDC, /* same params */);

        orig.initReserve(WETH, /* params */);
        fork.initReserve(WETH, /* same params */);
    }
}

Step 2: Fuzz Identical Sequences

The core insight: if both implementations receive identical inputs, they should produce identical outputs — unless the fork intentionally changed that behavior.

function testFuzz_depositWithdrawParity(
    uint256 amount,
    uint256 blocks
) public {
    amount = bound(amount, 1e6, 1e12);  // 1 USDC to 1M USDC
    blocks = bound(blocks, 1, 100_000);

    // Execute identical operations on both pools
    deal(USDC, address(this), amount * 2);

    IERC20(USDC).approve(address(originalPool), amount);
    IERC20(USDC).approve(address(forkedPool), amount);

    originalPool.deposit(USDC, amount, address(this), 0);
    forkedPool.deposit(USDC, amount, address(this), 0);

    // Advance time identically
    vm.roll(block.number + blocks);
    vm.warp(block.timestamp + blocks * 12);

    // Withdraw max from both
    uint256 origBalance = originalPool.withdraw(
        USDC, type(uint256).max, address(this)
    );
    uint256 forkBalance = forkedPool.withdraw(
        USDC, type(uint256).max, address(this)
    );

    // Compare: should be identical (or within 1 wei for rounding)
    assertApproxEqAbs(
        origBalance, 
        forkBalance, 
        1,  // 1 wei tolerance
        "Deposit-withdraw parity broken"
    );
}
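The test above exercises one fixed sequence (deposit, wait, withdraw). A minimal sketch of fuzzing whole sequences follows. `IPoolLike` and `ToyPool` are hypothetical stand-ins so the file compiles on its own; in a real suite both sides would be the original and forked pools.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

import "forge-std/Test.sol";

// Hypothetical minimal interface; a real lending pool exposes a richer API.
interface IPoolLike {
    function deposit(uint256 amount) external;
    function withdraw(uint256 amount) external;
    function balanceOf(address user) external view returns (uint256);
}

// Toy stand-in used for both sides so the sketch is self-contained.
contract ToyPool is IPoolLike {
    mapping(address => uint256) internal balances;

    function deposit(uint256 amount) external {
        balances[msg.sender] += amount;
    }

    function withdraw(uint256 amount) external {
        balances[msg.sender] -= amount; // reverts on underflow
    }

    function balanceOf(address user) external view returns (uint256) {
        return balances[user];
    }
}

contract SequenceParityTest is Test {
    IPoolLike internal orig;
    IPoolLike internal fork;

    function setUp() public {
        orig = new ToyPool();
        fork = new ToyPool(); // assumption: swap in the forked implementation here
    }

    function testFuzz_sequenceParity(uint256[] memory steps) public {
        // Cap the sequence length to keep each run cheap.
        uint256 n = steps.length > 16 ? 16 : steps.length;

        for (uint256 i = 0; i < n; i++) {
            // Lowest bit picks the action, remaining bits pick the amount.
            bool isDeposit = steps[i] % 2 == 0;
            uint256 amount = bound(steps[i] >> 1, 1, 1e12);

            bytes memory data = isDeposit
                ? abi.encodeCall(IPoolLike.deposit, (amount))
                : abi.encodeCall(IPoolLike.withdraw, (amount));

            // Low-level calls let us compare revert behavior as well as state.
            (bool okOrig, ) = address(orig).call(data);
            (bool okFork, ) = address(fork).call(data);

            assertEq(okOrig, okFork, "Revert behavior diverged");
            assertEq(
                orig.balanceOf(address(this)),
                fork.balanceOf(address(this)),
                "Balance diverged mid-sequence"
            );
        }
    }
}
```

Comparing after every step, not only at the end, tends to make a shrunk counterexample point at the first operation where the two implementations disagree.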

Step 3: Target the Changed Code

The highest-value tests target the specific functions your fork modified. Use a recursive diff to identify changed files, then compare function selectors to see which signatures differ:

#!/bin/bash
# generate-diff-targets.sh
# Compare fork against original and diff the function selectors of each
# changed file. This surfaces added, removed, or re-typed signatures;
# body-only changes appear in the file list but not in the selector diff.

diff -rq original/contracts/ forked/contracts/ | \
  grep "differ" | \
  awk '{print $2}' | \
  while read -r file; do
    echo "=== $file ==="
    diff <(solc --hashes "$file" 2>/dev/null) \
         <(solc --hashes "${file/original/forked}" 2>/dev/null)
  done

Then write targeted differential tests for each changed function:

// If your fork modified the liquidation threshold calculation:
function testFuzz_liquidationThresholdParity(
    uint256 collateral,
    uint256 debt,
    uint256 price
) public {
    collateral = bound(collateral, 1e18, 1e24);
    debt = bound(debt, 1e6, 1e12);
    price = bound(price, 100e8, 10000e8);

    _setupPosition(originalPool, collateral, debt, price);
    _setupPosition(forkedPool, collateral, debt, price);

    bool origLiquidatable = originalPool.isLiquidatable(address(this));
    bool forkLiquidatable = forkedPool.isLiquidatable(address(this));

    // INTENTIONAL DIFFERENCE? Document it.
    // Otherwise, this should match exactly.
    if (origLiquidatable != forkLiquidatable) {
        emit log_string("DIVERGENCE DETECTED");
        emit log_named_uint("collateral", collateral);
        emit log_named_uint("debt", debt);
        emit log_named_uint("price", price);

        // Fail unless this is a documented intentional change
        revert("Liquidation threshold divergence");
    }
}

Pattern Library: 5 Fork Bug Classes

1. Rounding Direction Changes

Suppose the original rounds minted shares down on deposits (the conservative direction for the protocol). Your fork rewrites a division for gas optimization and accidentally rounds up:

// Differential test that catches rounding divergence
function testFuzz_roundingDirection(uint256 amount, uint256 index) public {
    amount = bound(amount, 1, 1e30);
    index = bound(index, 1e27, 2e27);  // Ray-based index

    uint256 origShares = _originalRayDiv(amount, index);
    uint256 forkShares = _forkedRayDiv(amount, index);

    // If the fork rounds up where the original rounds down,
    // forkShares will be >= origShares. Flag ANY difference for manual review.
    if (origShares != forkShares) {
        emit log_named_uint("amount", amount);
        emit log_named_uint("index", index);
        emit log_named_int("delta", int256(forkShares) - int256(origShares));
        fail();
    }
}
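The test above calls `_originalRayDiv` and `_forkedRayDiv` without defining them. Here is a minimal sketch of what the pair could look like. It assumes, as this section does, a floor-rounding original and a fork that rounds half up. Both bodies are illustrative and not taken from any real codebase, and real Aave versions differ in their rounding rules.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

import "forge-std/Test.sol";

contract RayDivRoundingTest is Test {
    uint256 internal constant RAY = 1e27;

    // Assumed original behavior: truncate (round down).
    function _originalRayDiv(uint256 a, uint256 b) internal pure returns (uint256) {
        return (a * RAY) / b;
    }

    // Hypothetical fork: adds half the divisor first, so it rounds half up.
    function _forkedRayDiv(uint256 a, uint256 b) internal pure returns (uint256) {
        return (a * RAY + b / 2) / b;
    }

    function test_knownDivergence() public {
        // 1 / 1.5 = 0.666...: the original floors to 0, the fork rounds to 1.
        assertEq(_originalRayDiv(1, 1.5e27), 0);
        assertEq(_forkedRayDiv(1, 1.5e27), 1);
    }

    function testFuzz_gapIsOneUnitInForksFavor(uint256 amount, uint256 index) public {
        amount = bound(amount, 1, 1e30);
        index = bound(index, 1e27, 2e27);

        uint256 origShares = _originalRayDiv(amount, index);
        uint256 forkShares = _forkedRayDiv(amount, index);

        // The fork never mints fewer shares, and at most one unit more.
        assertLe(origShares, forkShares);
        assertLe(forkShares - origShares, 1);
    }
}
```

A one-unit gap looks harmless in a single call. The reason to flag it anyway is that a depositor can repeat the operation, so the direction of the error matters more than its size.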

2. Access Control Gaps

Forks often add new admin functions but forget to apply the original's access control modifiers:

function testFuzz_accessControlParity(
    address caller,
    bytes4 selector
) public {
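    // For every function in the fork, check if access control matches.
    // Pad with zeroed argument words so functions that take parameters
    // do not revert on short calldata before reaching their access checks.
    bytes memory callData = abi.encodePacked(selector, new bytes(256));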

    vm.startPrank(caller);

    (bool origSuccess,) = address(originalPool).call(callData);
    (bool forkSuccess,) = address(forkedPool).call(callData);

    vm.stopPrank();

    // If original reverts but fork succeeds, potential access control gap
    if (!origSuccess && forkSuccess) {
        emit log_named_address("caller", caller);
        // forge-std has no log_named_bytes4, so widen the selector to bytes32
        emit log_named_bytes32("selector", bytes32(selector));
        emit log_string("ACCESS CONTROL DIVERGENCE: fork allows, original denies");
        fail();
    }
}
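A uniformly random `bytes4` rarely matches a real function selector, so the fuzz test above mostly hits fallbacks. One way around that is to sweep an explicit selector list and fuzz only the caller. The sketch below does this with two hypothetical toy contracts, `GuardedPool` and `UnguardedPool`, standing in for the original and the fork. In a real suite the list would come from the `solc --hashes` output.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

import "forge-std/Test.sol";

// Toy "original": setFee is owner-only.
contract GuardedPool {
    address public owner;
    uint256 public fee;

    constructor() {
        owner = msg.sender;
    }

    function setFee(uint256 newFee) external {
        require(msg.sender == owner, "not owner");
        fee = newFee;
    }
}

// Toy "fork": the access check was dropped.
contract UnguardedPool {
    uint256 public fee;

    function setFee(uint256 newFee) external {
        fee = newFee;
    }
}

contract SelectorSweepTest is Test {
    address internal orig;
    address internal fork;
    bytes4[] internal selectors;

    function setUp() public {
        orig = address(new GuardedPool());
        fork = address(new UnguardedPool());
        selectors.push(GuardedPool.setFee.selector);
    }

    // True when the original rejects the call but the fork accepts it.
    function _forkAllowsWhatOriginalDenies(address caller, bytes4 sel)
        internal
        returns (bool)
    {
        // One zeroed 32-byte word is enough for setFee(uint256).
        bytes memory callData = abi.encodePacked(sel, bytes32(0));

        vm.prank(caller);
        (bool okOrig, ) = orig.call(callData);
        vm.prank(caller);
        (bool okFork, ) = fork.call(callData);

        return !okOrig && okFork;
    }

    function testFuzz_sweepFindsTheGap(address caller) public {
        vm.assume(caller != address(this)); // the test contract owns GuardedPool

        bool found;
        for (uint256 i = 0; i < selectors.length; i++) {
            if (_forkAllowsWhatOriginalDenies(caller, selectors[i])) {
                found = true;
            }
        }

        // The toy fork is deliberately broken, so the sweep must flag it.
        // Against a real fork you would assert the opposite.
        assertTrue(found, "sweep missed a known access control gap");
    }
}
```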

3. Interest Rate Model Drift

Modified interest rate curves can create edge cases where utilization ratios produce unexpected rates:

function testFuzz_interestRateParity(uint256 utilization) public {
    utilization = bound(utilization, 0, 1e27); // 0% to 100% in Ray

    uint256 origRate = originalRateModel.calculateRate(utilization);
    uint256 forkRate = forkedRateModel.calculateRate(utilization);

    // Allow 0.01% tolerance for intentional curve changes
    uint256 tolerance = origRate / 10000;

    if (forkRate > origRate + tolerance || forkRate < origRate - tolerance) {
        emit log_named_uint("utilization", utilization);
        emit log_named_uint("origRate", origRate);
        emit log_named_uint("forkRate", forkRate);

        // Is this an intentional change? Check against documented modifications
        fail();
    }
}
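Uniform fuzzing over utilization can miss the single point where the curve bends. A deterministic boundary scan complements it. The sketch below uses a hypothetical two-slope `KinkedRateModel` for both sides; the parameters are illustrative and not taken from any deployed market.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

import "forge-std/Test.sol";

// Hypothetical two-slope rate model, with all values in Ray (1e27).
contract KinkedRateModel {
    uint256 internal constant RAY = 1e27;
    uint256 public immutable optimal;
    uint256 public immutable slope1;
    uint256 public immutable slope2;

    constructor(uint256 _optimal, uint256 _slope1, uint256 _slope2) {
        optimal = _optimal;
        slope1 = _slope1;
        slope2 = _slope2;
    }

    function calculateRate(uint256 u) external view returns (uint256) {
        if (u <= optimal) {
            return (slope1 * u) / optimal;
        }
        return slope1 + (slope2 * (u - optimal)) / (RAY - optimal);
    }
}

contract RateBoundaryScanTest is Test {
    uint256 internal constant OPTIMAL = 0.8e27;

    KinkedRateModel internal origModel;
    KinkedRateModel internal forkModel;

    function setUp() public {
        origModel = new KinkedRateModel(OPTIMAL, 0.04e27, 0.75e27);
        forkModel = new KinkedRateModel(OPTIMAL, 0.04e27, 0.75e27);
    }

    function test_boundaryScanParity() public {
        // 0%, 50%, just below/at/just above the kink, 99%, 100%.
        uint256[7] memory points = [
            uint256(0), 0.5e27, OPTIMAL - 1, OPTIMAL, OPTIMAL + 1, 0.99e27, 1e27
        ];

        for (uint256 i = 0; i < points.length; i++) {
            assertEq(
                origModel.calculateRate(points[i]),
                forkModel.calculateRate(points[i]),
                "Rate diverged at a boundary point"
            );
        }
    }

    function test_shiftedKinkIsDetected() public {
        // A fork that moves the kink from 80% to 90% diverges between the two.
        KinkedRateModel shifted = new KinkedRateModel(0.9e27, 0.04e27, 0.75e27);

        assertTrue(
            origModel.calculateRate(0.85e27) != shifted.calculateRate(0.85e27),
            "Shifted kink went unnoticed"
        );
    }
}
```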

4. Oracle Integration Mismatches

Forks that change oracle providers or add new assets often get decimal normalization wrong:

function testFuzz_oraclePriceParity(
    uint256 chainlinkPrice,
    uint8 decimals
) public {
    // Cast to uint256 first: bound() has no overload that mixes uint256 and int256
    chainlinkPrice = bound(chainlinkPrice, 1, uint256(type(int256).max) / 1e18);
    decimals = uint8(bound(decimals, 6, 18));

    // Set same raw price on both
    mockChainlink.setPrice(int256(chainlinkPrice), decimals);

    uint256 origNormalized = originalOracle.getAssetPrice(WETH);
    uint256 forkNormalized = forkedOracle.getAssetPrice(WETH);

    assertEq(
        origNormalized, 
        forkNormalized,
        "Oracle normalization divergence"
    );
}
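To make the failure mode concrete, here is a sketch of the kind of normalization bug this test is meant to surface. It assumes the original scales every feed to 8 decimals, while a hypothetical fork takes it for granted that feeds already report 8. Neither function is taken from a real oracle.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24;

import "forge-std/Test.sol";

contract OracleDecimalsTest is Test {
    uint8 internal constant TARGET_DECIMALS = 8;

    // Assumed original: scale the raw feed answer to TARGET_DECIMALS.
    function _originalNormalize(uint256 price, uint8 feedDecimals)
        internal
        pure
        returns (uint256)
    {
        if (feedDecimals == TARGET_DECIMALS) return price;
        if (feedDecimals < TARGET_DECIMALS) {
            return price * 10 ** (TARGET_DECIMALS - feedDecimals);
        }
        return price / 10 ** (feedDecimals - TARGET_DECIMALS);
    }

    // Hypothetical fork: silently assumes every feed has 8 decimals.
    function _forkedNormalize(uint256 price, uint8 /* feedDecimals */)
        internal
        pure
        returns (uint256)
    {
        return price;
    }

    function testFuzz_divergesUnlessEightDecimals(uint256 price, uint8 feedDecimals)
        public
    {
        price = bound(price, 1, 1e30);
        feedDecimals = uint8(bound(feedDecimals, 6, 18));

        uint256 orig = _originalNormalize(price, feedDecimals);
        uint256 fork = _forkedNormalize(price, feedDecimals);

        if (feedDecimals == TARGET_DECIMALS) {
            assertEq(orig, fork, "Should agree on 8-decimal feeds");
        } else {
            // Any other feed precision exposes the missing scaling step.
            assertTrue(orig != fork, "Expected divergence was not observed");
        }
    }
}
```

The fork passes every test written against 8-decimal feeds. That is why the decimals value needs to be a fuzzed input and not a fixture constant.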

5. Flash Loan Callback Mutations

Forks that modify flash loan callbacks or add new callback types can introduce reentrancy:

function testFuzz_flashLoanCallbackParity(
    uint256 amount,
    bytes calldata data
) public {
    amount = bound(amount, 1e6, 1e12);

    uint256 origBalBefore = IERC20(USDC).balanceOf(address(originalPool));
    uint256 forkBalBefore = IERC20(USDC).balanceOf(address(forkedPool));

    // Execute flash loans on both
    try originalPool.flashLoan(address(this), USDC, amount, data) {} catch {}
    try forkedPool.flashLoan(address(this), USDC, amount, data) {} catch {}

    uint256 origBalAfter = IERC20(USDC).balanceOf(address(originalPool));
    uint256 forkBalAfter = IERC20(USDC).balanceOf(address(forkedPool));

    // Pool balance should never decrease after flash loan
    assertGe(origBalAfter, origBalBefore, "Original: flash loan drained funds");
    assertGe(forkBalAfter, forkBalBefore, "Fork: flash loan drained funds");

    // Deltas should match
    assertEq(
        origBalAfter - origBalBefore,
        forkBalAfter - forkBalBefore,
        "Flash loan fee divergence"
    );
}

Automating Differential Tests in CI

# .github/workflows/differential-test.yml
name: Differential Fork Security Tests

on:
  push:
    paths:
      - 'contracts/**'
  pull_request:
    paths:
      - 'contracts/**'

jobs:
  diff-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: recursive

      - uses: foundry-rs/foundry-toolchain@v1

      - name: Generate diff targets
        run: |
          chmod +x scripts/generate-diff-targets.sh
          ./scripts/generate-diff-targets.sh > diff-report.txt
          cat diff-report.txt

      - name: Run differential fuzz tests
        run: |
          forge test \
            --match-contract "Differential" \
            --fuzz-runs 50000 \
            --fuzz-seed 42 \
            -vvv

      - name: Check for divergences
        if: failure()
        run: |
          echo "::error::Differential tests found divergences between original and fork!"
          echo "Review the test output above for specific input values that trigger different behavior."
          exit 1

Solana: Differential Testing for Anchor Program Forks

The same principle applies to Solana program forks. Use LiteSVM to deploy both versions side by side:

// tests/differential_test.rs
use anchor_lang::prelude::*;
use litesvm::LiteSVM;

#[test]
fn test_deposit_withdraw_parity() {
    let mut svm = LiteSVM::new();

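    // Deploy original program (add_program_from_file loads the .so from disk)
    let original_id = Pubkey::new_unique();
    svm.add_program_from_file(original_id, "original_program.so")
        .expect("failed to load original program");

    // Deploy forked program
    let fork_id = Pubkey::new_unique();
    svm.add_program_from_file(fork_id, "forked_program.so")
        .expect("failed to load forked program");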

    // Run identical deposit sequences
    let amount: u64 = 1_000_000; // 1 USDC

    let orig_result = execute_deposit(&mut svm, original_id, amount);
    let fork_result = execute_deposit(&mut svm, fork_id, amount);

    // Compare token balances, account states, events
    assert_eq!(
        orig_result.user_shares, 
        fork_result.user_shares,
        "Share calculation divergence: orig={}, fork={}",
        orig_result.user_shares,
        fork_result.user_shares
    );

    // Compare vault state
    assert_eq!(
        orig_result.total_deposits,
        fork_result.total_deposits,
        "Total deposit tracking divergence"
    );
}

#[test]
fn test_liquidation_boundary_parity() {
    // Sweep the boundary region where positions become liquidatable.
    // `original` and `forked` are assumed handles to the two deployed programs,
    // and `check_liquidation` is a helper you supply.
    for collateral_ratio in 100..200 {
        let ratio = collateral_ratio as f64 / 100.0;

        let orig_liquidatable = check_liquidation(&original, ratio);
        let fork_liquidatable = check_liquidation(&forked, ratio);

        assert_eq!(
            orig_liquidatable, fork_liquidatable,
            "Liquidation boundary divergence at {}% collateral ratio",
            collateral_ratio
        );
    }
}

When Divergence Is Intentional

Not every difference is a bug. Document expected divergences:

/// @notice INTENTIONAL DIVERGENCE LOG
/// 
/// 1. liquidationBonus: Changed from 5% to 7% for volatile assets
///    - Original: 10500 (105%)
///    - Fork: 10700 (107%)
///    - Reason: Higher bonus needed for illiquid markets
///    - Risk assessment: Increases liquidation incentive, no security impact
///
/// 2. flashLoanPremium: Changed from 9 bps to 5 bps
///    - Original: 9 (0.09%)
///    - Fork: 5 (0.05%)  
///    - Reason: Competitive pricing
///    - Risk assessment: Lower premium reduces protocol revenue but no security impact
///
/// 3. maxStableRateBorrowSizePercent: REMOVED
///    - Original: 25% of available liquidity
///    - Fork: No limit
///    - Risk assessment: ⚠️ REVIEW — could enable utilization manipulation

Audit Checklist: Fork Security Review

Before deploying any forked DeFi protocol:

[ ] Full git diff documented — every changed line cataloged and justified
[ ] Differential fuzz tests covering all modified functions (≥50K runs each)
[ ] Rounding direction audit — confirm all division operations match original intent
[ ] Access control comparison — every new function has appropriate modifiers
[ ] Oracle integration test — decimal normalization matches across all asset types
[ ] Interest rate model boundary scan — test at 0%, 50%, optimal, 99%, and 100% utilization
[ ] Flash loan parity test — callback behavior matches or divergences are documented
[ ] Liquidation boundary fuzzing — test positions at exact health factor thresholds
[ ] Invariant tests from original protocol — run the upstream's test suite against your fork
[ ] Intentional divergence log — every deliberate change documented with risk assessment

The $50M Lesson

Differential testing is unsexy work. It's not as impressive as formal verification or as novel as AI-powered auditing. But it catches the bugs that actually kill forks — the subtle, emergent divergences that appear when you change 3% of a codebase and don't fully understand the other 97%.

The next time you audit a fork, don't just read the changed code. Run the original and the fork side by side, throw the same fuzzed inputs at both, and investigate every divergence. The $50M in fork-related losses from 2025-2026 suggests this basic practice is still remarkably rare.

Build boring tools that catch expensive bugs.

References: Foundry Differential Testing Docs, Trail of Bits — How to Review a DeFi Fork, BlockSec Weekly Incident Reports (2025-2026)

DEV Community