Deek Roumy

I Built an AI Agent That Reviews Smart Contract Security — Here's What It Found on Its First Run

Smart contracts hold billions of dollars and, unless they sit behind an upgradeable proxy, can't be patched after deployment. One bug, such as a missing access modifier or an unchecked integer overflow, can let an attacker drain everything in a single transaction. The stakes are higher than in almost any other software domain, yet many developers still ship without automated security tooling.

I wanted to fix that for my own workflow. This post walks through the AI-powered security review agent I built using Ollama (running codestral:22b locally) and Slither (a widely used static analyzer for Solidity). The whole thing is about 70 lines of Python, runs offline, and flags real vulnerability patterns in Solidity contracts.

The Tool Stack

| Tool | Role |
|------|------|
| Slither | Static analyzer that parses Solidity ASTs and checks 100+ vulnerability patterns |
| Ollama | Local LLM inference server; no API keys, no cloud costs |
| codestral:22b | Mistral's code-focused 22B-parameter model; great at understanding and explaining EVM patterns |
| Python 3 | Glues it all together |

Why Codestral? It was trained heavily on code and understands Solidity, ABI encoding, and EVM semantics well enough to explain why a finding is dangerous, not just that it exists. Running it locally through Ollama means your contract source never leaves your machine — important for anything pre-audit.

What Slither Finds

Slither ships 100+ detectors, each tagged with an impact level (High, Medium, Low, Informational, or Optimization). The ones that matter most in practice:

  • Reentrancy — External calls before state updates let attackers re-enter and drain funds (the DAO hack pattern)
  • Access control — Functions missing onlyOwner or equivalent modifiers that anyone can call
  • Integer overflow/underflow — Pre-0.8.x Solidity had no built-in overflow checks; SafeMath was the workaround
  • Unprotected selfdestruct — Callable by anyone, destroys the contract and sends ETH elsewhere
  • Unchecked return values — ERC-20 transfer() returns a bool; ignoring it means silent failures
  • Dangerous delegatecall — Storage collisions when delegating to untrusted contracts
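
To make the reentrancy item concrete, here is a toy model in plain Python. It is not EVM code, and every name in it (Vault, attack, the "attacker" and "alice" accounts) is hypothetical; it only illustrates why the order of "update state" and "make the external call" matters.

```python
# Toy model of reentrancy in plain Python (not EVM code); all names are hypothetical.
class Vault:
    def __init__(self, deposits, safe):
        self.balances = dict(deposits)
        self.pool = sum(deposits.values())  # total funds held by the vault
        self.safe = safe                    # True = update state before the call

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.pool < amount:
            return
        if self.safe:
            self.balances[who] = 0          # effect first (checks-effects-interactions)
        self.pool -= amount                 # the "transfer"
        on_receive()                        # external call: the receiver's code runs here
        if not self.safe:
            self.balances[who] = 0          # effect after the call: too late


def attack(vault, max_depth=10):
    """Re-enter withdraw() from the receive hook; return the total amount taken."""
    start = vault.pool
    depth = 0

    def hook():
        nonlocal depth
        depth += 1
        if depth < max_depth:
            vault.withdraw("attacker", hook)

    vault.withdraw("attacker", hook)
    return start - vault.pool


vulnerable = Vault({"attacker": 1, "alice": 4}, safe=False)
patched = Vault({"attacker": 1, "alice": 4}, safe=True)
print("taken from vulnerable vault:", attack(vulnerable))
print("taken from patched vault:", attack(patched))
```

In the vulnerable ordering the attacker's balance is still nonzero on every re-entry, so a deposit of 1 is enough to empty the pool; with the state update moved ahead of the call, the second entry sees a zero balance and stops.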

The Automation Script

#!/usr/bin/env python3
"""
Smart contract security reviewer
Uses Slither for static analysis + Ollama/codestral for explanation
"""

import subprocess
import json
import sys
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "codestral:22b"

def run_slither(contract_path: str) -> dict:
    """Run Slither and return parsed JSON findings."""
    result = subprocess.run(
        ["slither", contract_path, "--json", "-"],
        capture_output=True, text=True
    )
    try:
        return json.loads(result.stdout)
    except json.JSONDecodeError:
        return {"success": False, "error": result.stderr}

def explain_findings(contract_path: str, findings: list) -> str:
    """Ask codestral to explain the findings in plain English."""
    with open(contract_path) as f:
        source = f.read()

    summary = "\n".join(
        f"- [{d['impact']}] {d['check']}: {d['description']}"
        for d in findings[:10]  # top 10
    )

    prompt = f"""You are a smart contract security auditor.

Contract source:
{source[:3000]}

Slither found these issues:
{summary}

For each issue: explain what it is, why it's dangerous, and how to fix it. Be concise."""

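    # Allow for the first request, which also has to load the model into memory.
    response = requests.post(OLLAMA_URL, json={
        "model": MODEL,
        "prompt": prompt,
        "stream": False
    }, timeout=300)
    response.raise_for_status()  # fail loudly on HTTP errors instead of a KeyError below

    return response.json()["response"]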

def review(contract_path: str):
    print(f"Scanning: {contract_path}\n")

    data = run_slither(contract_path)
    if not data.get("success"):
        print("Slither error:", data.get("error", "unknown"))
        return

    detectors = data.get("results", {}).get("detectors", [])
    high = [d for d in detectors if d["impact"] == "High"]
    medium = [d for d in detectors if d["impact"] == "Medium"]

    print(f"Found: {len(high)} HIGH, {len(medium)} MEDIUM severity issues\n")
    print("=" * 60)

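    findings = high + medium
    if not findings:
        # Nothing to explain, so skip the (slow) model call.
        print("No High or Medium findings; skipping the LLM step.")
        return

    explanation = explain_findings(contract_path, findings)
    print(explanation)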

if __name__ == "__main__":
    review(sys.argv[1] if len(sys.argv) > 1 else "contracts/Token.sol")

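Save that as review_agent.py in your project root. That's the whole agent. Note that it sends only the first 3,000 characters of the source and the first 10 findings to the model, so larger contracts are explained with partial context.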
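
If you want to see what the script does with Slither's JSON without running Slither, here is a small sketch. SAMPLE is a hand-written stand-in that uses the same keys the script reads (success, results.detectors, check, impact, description); the descriptions are made up, and select_findings and format_summary are hypothetical helper names, not part of the script above.

```python
# Hand-written sample in the shape the script reads; NOT real Slither output.
SAMPLE = {
    "success": True,
    "error": None,
    "results": {
        "detectors": [
            {"check": "solc-version", "impact": "Informational",
             "confidence": "High", "description": "Old compiler allowed\n"},
            {"check": "unchecked-lowlevel", "impact": "Medium",
             "confidence": "Medium", "description": "Low-level call result ignored\n"},
            {"check": "reentrancy-eth", "impact": "High",
             "confidence": "Medium", "description": "Reentrancy in withdraw()\n"},
        ]
    },
}

# Only these impact levels are forwarded to the model, High first.
RANK = {"High": 0, "Medium": 1}


def select_findings(data, limit=10):
    """Keep High/Medium findings, most severe first, capped at `limit`."""
    detectors = data.get("results", {}).get("detectors", [])
    kept = [d for d in detectors if d.get("impact") in RANK]
    kept.sort(key=lambda d: RANK[d["impact"]])
    return kept[:limit]


def format_summary(findings):
    """Render one bullet per finding, as in the prompt built by the script."""
    return "\n".join(
        f"- [{d['impact']}] {d['check']}: {d['description'].strip()}"
        for d in findings
    )


print(format_summary(select_findings(SAMPLE)))
```

Stripping the description is a small deviation from the script: real Slither descriptions tend to span several lines, and trimming the trailing newline keeps the bullet list tidy.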

What It Found on First Run

I pointed it at a simple ERC-20 token contract I wrote for testing, with a few flaws planted on purpose. Here's what came back:

Scanning: contracts/VulnToken.sol

Found: 2 HIGH, 3 MEDIUM severity issues

============================================================
HIGH: reentrancy-eth
  withdraw() sends ETH before updating balances. An attacker
  contract can re-enter withdraw() recursively, draining the
  contract before the balance ever updates.
  Fix: update balances BEFORE the external call.

HIGH: unprotected-upgrade  
  upgradeImplementation() has no access control. Anyone can
  point the proxy to a malicious implementation and take over.
  Fix: add onlyOwner or a timelock guard.

MEDIUM: unchecked-lowlevel
  The .call() return value is ignored. If the external call fails
  silently, the function continues as if it succeeded.
  Fix: check the bool return and revert on failure.
...

The explanations were actionable. Codestral didn't just repeat the Slither detector name — it explained the attack path and gave a specific fix. That's the value-add over running Slither alone.

How to Run It Yourself

Prerequisites:

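# Install Slither (it also needs a solc compiler on your PATH, e.g. via solc-select)
pip install slither-analyzer

# Install Ollama (Homebrew shown here; see the Ollama quickstart for other platforms)
brew install ollama
ollama serve  # runs on localhost:11434; leave it running

# In a second terminal, pull the model
ollama pull codestral:22b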

Run:

python3 review_agent.py path/to/YourContract.sol

The first scan takes ~2 minutes while the model loads. Subsequent scans are faster as long as the model stays loaded in memory. For a 200-line contract I get a full report in under 90 seconds on an M-series Mac; other machines with 16GB+ RAM should be in the same range, depending on hardware.
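
Before the first scan, it can help to confirm that Ollama is reachable and the model has actually been pulled. The sketch below assumes Ollama's /api/tags endpoint on the default port, which lists locally available models; model_available, fetch_tags, and preflight are hypothetical helper names, not part of the script above.

```python
import json
import urllib.request

# Assumption: default Ollama port; /api/tags lists the locally pulled models.
TAGS_URL = "http://localhost:11434/api/tags"


def model_available(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in an /api/tags response body."""
    names = {m.get("name") for m in tags_payload.get("models", [])}
    return model in names


def fetch_tags(url: str = TAGS_URL, timeout: float = 5.0) -> dict:
    """Fetch and parse the list of local models from the Ollama server."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def preflight(model: str = "codestral:22b") -> str:
    """Return "ok", or a short message describing what is missing."""
    try:
        payload = fetch_tags()
    except OSError as exc:  # urllib's URLError is a subclass of OSError
        return f"Ollama not reachable at {TAGS_URL}: {exc}"
    if not model_available(payload, model):
        return f"Model {model} not found; run: ollama pull {model}"
    return "ok"
```

Calling print(preflight()) at the top of review() would turn a two-minute wait followed by a timeout into an immediate, readable message.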

What's Next

This is a starting point, not a finished auditing suite. A few directions worth exploring:

  • Multi-file projects — Run Slither against a full Foundry/Hardhat project instead of a single file
  • Auto-PR mode — Generate a GitHub issue or PR comment with the findings automatically
  • Severity gating — Fail CI/CD if any High-severity findings exist (pre-merge guard)
  • Fine-tuning — A model trained specifically on audited Solidity codebases would be sharper than a general-purpose code model
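
As a starting point for the severity-gating idea, here is a minimal sketch. It assumes the same detector dictionaries the script already reads from Slither's JSON; gate and SEVERITY_ORDER are hypothetical names, and findings with any other impact label (such as Optimization) are simply ignored.

```python
import sys

# Impact labels from least to most severe; anything else is ignored by the gate.
SEVERITY_ORDER = ["Informational", "Low", "Medium", "High"]


def gate(detectors, fail_on="High"):
    """Return a process exit code: 1 if any finding is at or above `fail_on`."""
    threshold = SEVERITY_ORDER.index(fail_on)
    blocking = [
        d for d in detectors
        if d.get("impact") in SEVERITY_ORDER
        and SEVERITY_ORDER.index(d["impact"]) >= threshold
    ]
    for d in blocking:
        # Report to stderr so CI logs show why the job failed.
        print(f"BLOCKING [{d['impact']}] {d['check']}", file=sys.stderr)
    return 1 if blocking else 0
```

In review(), ending with sys.exit(gate(detectors)) would make the script fail a CI job whenever a High finding is present, and passing fail_on="Medium" tightens the gate.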

Security tooling doesn't have to be expensive or cloud-dependent. With Slither + a local model you can run serious static analysis offline, free, on every commit.

The contract you don't audit is the one that gets exploited.


Running Ollama locally for the first time? Check the Ollama quickstart. Slither docs are at github.com/crytic/slither.
