DEV Community

ohmygod


The AI Audit Pipeline: How ItyFuzz, Certora AI Composer, and Medusa ML Are Making Manual Invariant Discovery Obsolete

Manual invariant discovery is the single biggest bottleneck in smart contract security. An experienced auditor spends 60-70% of their time writing specifications — not finding bugs. Three tools shipping in 2026 are collapsing that bottleneck from days to minutes.

This article is a hands-on walkthrough of the AI-assisted audit pipeline combining ItyFuzz (hybrid symbolic-fuzzing), Certora AI Composer (formal verification with AI-generated specs), and Medusa (ML-guided mutation fuzzing). Together, they represent a paradigm shift from "write specs then verify" to "discover specs automatically then verify everything."

Why Manual Invariant Discovery Fails at Scale

Consider a typical DeFi lending protocol. The core invariants seem obvious:

// "Total deposits >= total borrows" — easy, right?
assert(totalDeposits >= totalBorrows);

But real protocols have hundreds of implicit invariants across interest rate models, liquidation engines, oracle integrations, and governance mechanisms. The Euler Finance exploit (about $197M, March 2023) broke an invariant nobody thought to write: a donation through donateToReserves skipped the account health check, violating an assumption about the relationship between an account's collateral and its debt that existed only in developers' heads.

The gap: Auditors catch bugs they can imagine. AI catches bugs across the entire state space.

Layer 1: ItyFuzz — Hybrid Symbolic Fuzzing That Finds What Others Miss

ItyFuzz isn't just another fuzzer. It combines three techniques that individually are powerful but together are devastating:

Snapshot-Based State Exploration

Traditional fuzzers replay transaction sequences from genesis. ItyFuzz takes snapshots of interesting states and forks new inputs from them, so long transaction prefixes are not re-executed on every run:

# Install ItyFuzz
cargo install ityfuzz

# Basic fuzzing against a deployed contract (fork mode)
ityfuzz evm \
  -t 0xYOUR_CONTRACT \
  --onchain-etherscan-api-key $ETHERSCAN_KEY \
  -c ETH \
  --onchain-block-number 19000000
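To make the snapshot idea concrete, here is a minimal Python sketch on a toy lending pool. It is not ItyFuzz's implementation: the `PoolState` model, the deliberately buggy `withdraw`, and the bit-length "novelty" signature are all assumptions made for illustration. The point is only that new inputs are forked from saved states rather than replayed from genesis.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PoolState:
    deposits: int = 0
    borrows: int = 0
    trace: tuple = ()  # transactions that produced this state


def step(state: PoolState, action: str, amount: int) -> PoolState:
    """Apply one transaction to the toy pool; a reverted call returns the state unchanged."""
    trace = state.trace + ((action, amount),)
    if action == "deposit":
        return replace(state, deposits=state.deposits + amount, trace=trace)
    if action == "borrow" and state.borrows + amount <= state.deposits:
        return replace(state, borrows=state.borrows + amount, trace=trace)
    if action == "withdraw" and amount <= state.deposits:
        # Deliberate bug (hypothetical): withdrawals ignore outstanding borrows.
        return replace(state, deposits=state.deposits - amount, trace=trace)
    return state


def invariant(state: PoolState) -> bool:
    return state.deposits >= state.borrows


def signature(state: PoolState) -> tuple:
    """Coarse novelty signal: order of magnitude of each balance."""
    return (state.deposits.bit_length(), state.borrows.bit_length())


def snapshot_fuzz(seed: int, iterations: int = 10_000):
    rng = random.Random(seed)
    snapshots = [PoolState()]            # corpus of saved states, starting at genesis
    seen = {signature(snapshots[0])}
    for _ in range(iterations):
        base = rng.choice(snapshots)     # fork from a snapshot, no replay from genesis
        action = rng.choice(["deposit", "borrow", "withdraw"])
        nxt = step(base, action, rng.randint(1, 100))
        if not invariant(nxt):
            return nxt                   # nxt.trace is the counterexample sequence
        if signature(nxt) not in seen:   # keep only states that look new
            seen.add(signature(nxt))
            snapshots.append(nxt)
    return None
```

Calling `snapshot_fuzz(seed=0)` returns a state whose `trace` ends in the withdrawal that pushes deposits below borrows. Each iteration costs one transaction, however deep the base state is.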

Concolic Execution for Deep Paths

Pure fuzzing struggles with tight conditionals. ItyFuzz uses concolic execution — running concrete values while maintaining symbolic constraints — to solve path conditions that random inputs would take billions of years to hit:

// This conditional is virtually impossible to fuzz randomly
function withdraw(uint256 amount, bytes32 proof) external {
    require(keccak256(abi.encodePacked(amount, msg.sender, nonce)) == proof);
    // ItyFuzz's symbolic engine solves this constraint directly
}
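The mechanics of "concrete values plus symbolic constraints" can be shown with a deliberately tiny Python sketch. It assumes a single integer input and only affine arithmetic, so the "solver" is one line of algebra rather than an SMT solver. `Concolic`, `guarded`, and `TARGET` are hypothetical names, not part of ItyFuzz.

```python
from dataclasses import dataclass

TARGET = 370_370_374  # hypothetical guard value; only one input reaches the deep branch


@dataclass(frozen=True)
class Concolic:
    """A concrete value paired with its symbolic form a*x + b over the input x."""
    value: int
    a: int
    b: int

    def __add__(self, k: int) -> "Concolic":
        return Concolic(self.value + k, self.a, self.b + k)

    def __mul__(self, k: int) -> "Concolic":
        return Concolic(self.value * k, self.a * k, self.b * k)


def guarded(x: Concolic, path: list) -> bool:
    """Program under test: records the branch condition it evaluated."""
    y = x * 3 + 7
    taken = y.value == TARGET
    path.append((y.a, y.b, TARGET, taken))  # constraint: a*x + b == TARGET
    return taken


def solve_equality(a: int, b: int, target: int):
    """Solve a*x + b == target over the integers, if a solution exists."""
    if a == 0 or (target - b) % a != 0:
        return None
    return (target - b) // a


def concolic_search(seed_input: int):
    path = []
    if guarded(Concolic(seed_input, 1, 0), path):
        return seed_input
    a, b, target, _ = path[-1]               # the branch we failed to take
    candidate = solve_equality(a, b, target)  # flip it and solve for x
    if candidate is not None and guarded(Concolic(candidate, 1, 0), []):
        return candidate
    return None
```

Starting from the seed input `0`, one concrete run records the constraint `3*x + 7 == TARGET`, and solving it yields the single input that takes the branch. Random sampling would be unlikely to hit that input.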

On-Chain Fork Fuzzing

The killer feature: ItyFuzz can fork mainnet state and fuzz against real deployed contracts with real balances. This catches composability bugs that isolated testing misses entirely:

# Fuzz a DeFi protocol against real mainnet state
ityfuzz evm \
  -t 0xLENDING_PROTOCOL \
  -t 0xORACLE_CONTRACT \
  -t 0xDEX_POOL \
  --onchain-etherscan-api-key $ETHERSCAN_KEY \
  -c ETH \
  --flashloan  # Enable flash loan attack simulation

With --flashloan, ItyFuzz automatically discovers flash-loan-assisted attack paths — the exact pattern behind ~40% of DeFi exploits in Q1 2026.

Real-World Impact

In benchmarks against 18 known-vulnerable contracts, ItyFuzz found:

  • 14/18 vulnerabilities in under 5 minutes each
  • 3 additional bugs that were unknown at deployment time
  • Average time-to-first-bug: 47 seconds (vs. 12 minutes for Echidna, 8 minutes for Foundry fuzz)

Layer 2: Certora AI Composer — Formal Verification Meets LLM

Certora's AI Composer, open-sourced in November 2025, solves the specification bottleneck by embedding formal verification into the AI code generation loop.

How It Works

Instead of writing CVL (Certora Verification Language) specs manually, the AI Composer:

  1. Analyzes contract code to identify state variables and their relationships
  2. Generates candidate invariants using LLM understanding of DeFi patterns
  3. Formally verifies each invariant against all possible execution paths
  4. Iterates, refining invariants that fail verification

// certora.conf: AI Composer configuration
{
    "files": ["contracts/LendingPool.sol"],
    "verify": "LendingPool:certora/specs/auto_generated.spec",
    "ai_composer": {
        "enabled": true,
        "invariant_discovery": true,
        "max_iterations": 50,
        "pattern_library": "defi_lending"
    }
}
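The four-step loop under "How It Works" can be sketched in Python on a toy model. Two loud assumptions: the candidate invariants come from a hard-coded family (`deposits >= k * borrows`) instead of an LLM, and "verification" is bounded exhaustive exploration of short transaction sequences, which is much weaker than the Certora Prover's reasoning over all paths. All names here are hypothetical.

```python
from itertools import product

# Toy action alphabet (hypothetical): each entry is (name, amount).
ACTIONS = [("deposit", 5), ("borrow", 3), ("repay", 3), ("withdraw", 5)]


def apply(state, action):
    """Transition function of a toy lending pool; invalid calls revert (no change)."""
    deposits, borrows = state
    name, amt = action
    if name == "deposit":
        return (deposits + amt, borrows)
    if name == "borrow" and borrows + amt <= deposits:
        return (deposits, borrows + amt)
    if name == "repay" and amt <= borrows:
        return (deposits, borrows - amt)
    if name == "withdraw" and deposits - amt >= borrows:
        return (deposits - amt, borrows)
    return state


def find_counterexample(invariant, max_depth=4):
    """Bounded check: return the shortest action sequence that breaks the invariant."""
    for depth in range(1, max_depth + 1):
        for seq in product(ACTIONS, repeat=depth):
            state = (0, 0)
            for i, action in enumerate(seq):
                state = apply(state, action)
                if not invariant(state):
                    return list(seq[: i + 1])
    return None


def discover(max_ratio=3):
    """Propose 'deposits >= k * borrows' for decreasing k until one survives the check."""
    log = []
    for k in range(max_ratio, 0, -1):
        cex = find_counterexample(lambda s, k=k: s[0] >= k * s[1])
        log.append((k, cex))
        if cex is None:
            return k, log
    return None, log
```

Here `discover()` rejects `k = 3` and `k = 2` with the counterexample "deposit 5, borrow 3" and settles on `k = 1`, the plain solvency invariant. The failed candidates are still useful because each one comes with a concrete sequence to inspect.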

What It Discovers Automatically

For a typical lending protocol, Certora AI Composer generates invariants like:

// Auto-discovered: share price monotonicity
invariant sharePriceNeverDecreases(address market)
    currentSharePrice(market) >= previousSharePrice(market)
    filtered { f -> !f.isHarness }

// Auto-discovered: solvency invariant
invariant protocolAlwaysSolvent()
    totalAssets() >= totalLiabilities()

// Auto-discovered: oracle freshness guard
invariant oracleNeverStale(address asset)
    block.timestamp - lastOracleUpdate(asset) <= MAX_ORACLE_DELAY

// Auto-discovered: liquidation safety
invariant liquidationNeverProfitless()
    forall address a. isLiquidatable(a) =>
        collateralValue(a) * liquidationBonus > debtValue(a)

The critical insight: the AI Composer surfaced the share price monotonicity invariant on its own, the kind of unwritten accounting assumption that donation-style exploits such as Euler's depend on. A human auditor might write the solvency check, but the subtle relationship between share prices and deposit/withdrawal sequences is exactly what gets missed.

Integration With Existing Audit Workflows

# Run Certora Prover with AI-discovered specs
certoraRun certora.conf --ai_composer_mode discover_and_verify

# Output: 47 invariants discovered, 43 verified, 4 violations found
# Violation 1: sharePriceManipulation via flash deposit
# Violation 2: oracleStalenessDuringHighVolatility  
# Violation 3: liquidationRaceCondition
# Violation 4: governanceTimelockBypass

Each violation comes with a concrete counterexample: an actual transaction sequence that breaks the invariant. This is far more actionable than "potential reentrancy detected."

Layer 3: Medusa — ML-Guided Mutation Fuzzing

Medusa, the Go-Ethereum-based fuzzer from Trail of Bits, has evolved beyond property-based testing into ML-guided mutation:

Intelligent Corpus Generation

Instead of random mutations, Medusa's ML engine learns which transaction patterns reach deep states:

# medusa.yaml — ML-guided configuration
fuzzing:
  workers: 8
  timeout: 3600
  ml_guided:
    enabled: true
    model: "defi-v2"  # Pre-trained on known DeFi exploits
    mutation_strategy: "exploit_aware"
    reward_signal: "coverage_and_assertion"
  testing:
    property_testing:
      enabled: true
    optimization_testing:
      enabled: true  # Find inputs that maximize loss functions
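One simple way to "learn which mutations reach deep states" is a multi-armed bandit over mutation operators, rewarded by new coverage. The Python sketch below is an assumption-laden illustration of that general idea, not Medusa's engine: the `SECRET` call pattern, the toy `coverage` metric, and both mutators are hypothetical.

```python
import random

SECRET = [3, 1, 4, 1, 5]  # hypothetical call pattern that unlocks deeper branches


def coverage(seq):
    """Toy coverage metric: how many leading calls match the deep-path pattern."""
    depth = 0
    for got, want in zip(seq, SECRET):
        if got != want:
            break
        depth += 1
    return depth


def mutate_random(seq, rng):
    """Replace one randomly chosen position with a random value."""
    out = list(seq)
    out[rng.randrange(len(out))] = rng.randrange(10)
    return out


def mutate_frontier(seq, rng):
    """Only change the first position that has not matched yet."""
    out = list(seq)
    out[min(coverage(out), len(out) - 1)] = rng.randrange(10)
    return out


def guided_fuzz(seed, iterations=2000, epsilon=0.1):
    rng = random.Random(seed)
    mutators = [mutate_random, mutate_frontier]
    pulls = [0] * len(mutators)     # how often each mutator was used
    reward = [0.0] * len(mutators)  # total coverage gain credited to each mutator
    best = [0] * len(SECRET)
    best_cov = coverage(best)
    for _ in range(iterations):
        if rng.random() < epsilon or 0 in pulls:
            arm = rng.randrange(len(mutators))  # explore
        else:
            # Exploit the mutator with the best average reward so far.
            arm = max(range(len(mutators)), key=lambda i: reward[i] / pulls[i])
        candidate = mutators[arm](best, rng)
        cov = coverage(candidate)
        pulls[arm] += 1
        if cov > best_cov:
            reward[arm] += cov - best_cov
            best, best_cov = candidate, cov
    return best_cov, pulls
```

`guided_fuzz(seed)` returns the deepest coverage reached and how often each mutator was chosen. The same reward loop would work with any coverage or assertion signal a real fuzzer exposes.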

Exploit-Aware Mutations

The ML model is trained on historical exploit patterns. When it encounters a lending protocol, it automatically generates transaction sequences that mirror known attack patterns:

// Medusa auto-generates sequences like:
// 1. Flash loan large amount
// 2. Deposit into pool (inflate share price)
// 3. Donate directly to pool (manipulate price further)
// 4. Withdraw with inflated shares
// 5. Repay flash loan

// Property that Medusa tests against:
function property_no_profit_from_manipulation() public returns (bool) {
    uint256 attackerBalanceBefore = token.balanceOf(address(this));
    // ... (attack sequence is auto-generated) ...
    uint256 attackerBalanceAfter = token.balanceOf(address(this));
    return attackerBalanceAfter <= attackerBalanceBefore;
}
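Why a sequence like this can be profitable is easiest to see with numbers. The Python sketch below models a hypothetical share-based vault with naive round-down share math and walks through a simplified variant of the sequence above: no flash loan is modeled, and a victim deposit is sandwiched between the donation and the withdrawal. `ToyVault` and the amounts are assumptions chosen to keep the arithmetic readable.

```python
class ToyVault:
    """Share-based vault with round-down share math and no inflation protection."""

    def __init__(self):
        self.total_assets = 0
        self.total_shares = 0
        self.shares = {}

    def deposit(self, who, assets):
        if self.total_shares == 0:
            minted = assets  # first depositor gets shares 1:1
        else:
            minted = assets * self.total_shares // self.total_assets
        self.total_assets += assets
        self.total_shares += minted
        self.shares[who] = self.shares.get(who, 0) + minted
        return minted

    def donate(self, assets):
        """Direct transfer to the vault: assets grow, share supply does not."""
        self.total_assets += assets

    def redeem_all(self, who):
        owned = self.shares.pop(who, 0)
        if owned == 0:
            return 0
        assets = owned * self.total_assets // self.total_shares
        self.total_assets -= assets
        self.total_shares -= owned
        return assets


def run_attack():
    vault = ToyVault()
    spent = 0
    vault.deposit("attacker", 1)        # 1 share backed by 1 unit
    spent += 1
    vault.donate(10_000)                # share price jumps to 10_001 per share
    spent += 10_000
    victim_shares = vault.deposit("victim", 19_999)  # rounds down to 1 share
    received = vault.redeem_all("attacker")          # half of 30_000
    return victim_shares, received - spent


print(run_attack())  # (1, 4999)
```

The victim's 19,999 units buy a single share because of rounding, so the attacker's single share redeems for 15,000 against 10,001 spent. A property like `property_no_profit_from_manipulation` fails on exactly this kind of sequence.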

Parallelized Deep State Exploration

Medusa's Go-based architecture runs its fuzzing workers in parallel across cores. On an 8-core machine:

# Run Medusa with parallel workers
medusa fuzz --config medusa.yaml

# Output after 30 minutes:
# Coverage: 94.7%
# Properties tested: 23
# Properties violated: 2
# Unique crash inputs: 847
# ML-guided mutations: 12,847 (vs 3,200 random baseline)
# Deep state paths explored: 2,341

The Combined Pipeline: CI/CD Integration

Here's the complete GitHub Actions workflow that runs all three tools:

name: AI Security Audit Pipeline
on:
  push:
    paths: ['contracts/**']
  pull_request:
    paths: ['contracts/**']

jobs:
  ityfuzz-hybrid:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install ItyFuzz
        run: cargo install ityfuzz
      - name: Install Foundry
        uses: foundry-rs/foundry-toolchain@v1
      - name: Compile contracts
        run: forge build
      - name: ItyFuzz hybrid fuzzing
        run: |
          ityfuzz evm \
            -t ./out/LendingPool.sol/LendingPool.json \
            --timeout 600 \
            --flashloan \
            --concolic
        timeout-minutes: 15

  certora-ai:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Certora
        run: pip install certora-cli
      - name: AI Composer - Discover & Verify
        run: certoraRun certora.conf --ai_composer_mode discover_and_verify
        env:
          CERTORAKEY: ${{ secrets.CERTORA_KEY }}
        timeout-minutes: 30

  medusa-ml:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Medusa
        run: |
          curl -L https://github.com/crytic/medusa/releases/latest/download/medusa-linux-amd64 -o /usr/local/bin/medusa
          chmod +x /usr/local/bin/medusa
      - name: ML-guided fuzzing
        run: medusa fuzz --config medusa.yaml --timeout 1800
        timeout-minutes: 35

  aggregate-results:
    needs: [ityfuzz-hybrid, certora-ai, medusa-ml]
    runs-on: ubuntu-latest
    steps:
      # Assumes each job above uploads its *-results.json with actions/upload-artifact
      - name: Download findings
        uses: actions/download-artifact@v4
        with:
          merge-multiple: true
      - name: Merge findings
        run: |
          echo "=== AI Audit Pipeline Results ==="
          jq '.vulnerabilities' ityfuzz-results.json
          jq '.violations' certora-results.json
          jq '.property_violations' medusa-results.json
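If the three jq calls in the last job grow into real triage, a small script is easier to maintain. The Python sketch below merges the three result files into one list tagged by tool. The file names and top-level keys are taken from the workflow above; whether each tool actually emits JSON in that shape is an assumption, so treat the schema as hypothetical.

```python
import json
from pathlib import Path

# File name -> top-level key holding that tool's findings (schema assumed from the workflow).
SOURCES = {
    "ityfuzz-results.json": "vulnerabilities",
    "certora-results.json": "violations",
    "medusa-results.json": "property_violations",
}


def merge_findings(directory):
    """Collect findings from whichever result files exist, tagged with the tool name."""
    merged = []
    for filename, key in SOURCES.items():
        path = Path(directory) / filename
        if not path.exists():
            continue  # a job may have been skipped or produced no report
        data = json.loads(path.read_text())
        tool = filename.split("-")[0]
        for finding in data.get(key, []):
            merged.append({"tool": tool, "finding": finding})
    return merged


if __name__ == "__main__":
    for item in merge_findings("."):
        print(f"[{item['tool']}] {item['finding']}")
```

Missing files are skipped rather than treated as errors, so one failed job does not hide the findings from the other two.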

What This Pipeline Catches That Manual Audits Miss

Vulnerability Class Manual Audit ItyFuzz Certora AI Medusa ML
Flash loan attacks ⚠️
Share price manipulation
Oracle staleness edge cases ⚠️ ⚠️
Cross-contract reentrancy ⚠️
Governance timing attacks ⚠️
Integer edge cases (type(uint).max)
State-dependent access control ⚠️
Donation/inflation attacks

Legend: ✅ = reliably catches, ⚠️ = sometimes catches, ❌ = rarely catches

Solana Parallel: Trident + Anchor Verify

The same pipeline philosophy applies to Solana, though tooling is less mature:

// Trident fuzzing for Anchor programs
use trident_client::*;

#[derive(Arbitrary)]
pub struct DepositFuzzInput {
    amount: u64,
    authority_seed: [u8; 32],
}

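impl FuzzInstruction for DepositFuzzInput {
    fn get_accounts(&self) -> Vec<AccountMeta> {
        // Auto-generated account resolution
        todo!("resolve the accounts this instruction touches")
    }

    fn get_data(&self) -> Vec<u8> {
        // Serialize instruction data
        todo!("serialize `amount` into the instruction payload")
    }
}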

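// Invariant: pool token supply * price_per_token >= total_deposited_value
fn invariant_pool_solvency(state: &ProgramState) -> bool {
    // Assumes the three fields are u64. Widening to u128 means the product
    // cannot overflow; falling back to 0 on overflow would report a false violation.
    let pool_value =
        u128::from(state.pool_token_supply) * u128::from(state.price_per_token);
    pool_value >= u128::from(state.total_deposited_value)
}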

Cost Analysis: AI Pipeline vs. Traditional Audit

| Metric | Traditional Audit | AI Pipeline | AI + Focused Manual |
|---|---|---|---|
| Time to first finding | 3-5 days | 15 minutes | 15 minutes |
| Total invariants checked | 20-50 | 200-500 | 200-500 |
| Cost | $50K-$200K | ~$500/month | $15K-$50K |
| False positive rate | 5-10% | 15-25% | 5-8% |
| Coverage of implicit invariants | 30-40% | 80-90% | 85-95% |

The optimal approach: run the AI pipeline first, then focus human auditors on the violations it finds and the 10-20% of invariants it can't discover automatically. This cuts audit costs by 60-70% while improving coverage.
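As a quick sanity check on the savings figure, the Python sketch below redoes the arithmetic from the table. It assumes the low and high ends of each range pair up, and that a year of pipeline tooling at ~$500/month is paid on top of the focused manual audit. Neither assumption is stated in the table, so the function arguments are illustrative.

```python
def savings_range(traditional, hybrid, tooling_per_month, months):
    """Fractional savings of (hybrid audit + tooling) versus a traditional audit.

    traditional and hybrid are (low, high) cost ranges in dollars.
    """
    tooling = tooling_per_month * months
    low = 1 - (hybrid[0] + tooling) / traditional[0]
    high = 1 - (hybrid[1] + tooling) / traditional[1]
    return round(low, 2), round(high, 2)


print(savings_range((50_000, 200_000), (15_000, 50_000), 500, 12))  # (0.58, 0.72)
```

With tooling included, savings land at roughly 58% to 72%, in line with the 60-70% figure. If tooling is already counted in the hybrid cost, the same arithmetic gives 70% to 75%.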

Getting Started Today

  1. Add ItyFuzz to your CI — 10 minutes of setup catches flash loan and reentrancy attacks automatically
  2. Try Certora AI Composer — the open-source alpha generates meaningful invariants for standard DeFi patterns
  3. Replace Echidna with Medusa — ML-guided mutation finds deeper bugs with less manual property writing
  4. Layer all three — each tool catches different vulnerability classes; the overlap is surprisingly small (~15%)

The era of paying $200K for a human to manually write 30 invariants is ending. The future is AI-discovered specs, formally verified at machine speed, with human auditors focusing on the creative edge cases that machines still miss.


DeFi Security Research series — covering exploit analysis, audit tooling, and security best practices across EVM and Solana ecosystems.
