DEV Community

Gokul P

I built a tool to measure how much of my codebase is AI-written — here's how it works

Ask your team what percentage of your production codebase was written by an AI last quarter. You'll get silence — not because nobody cares, but because there's no way to measure it.

We instrument everything else. Deployments, latency, error rates, test coverage. But code provenance? Nothing. Git blame still assumes a human wrote every line.

I built aigit to fix this.

The problem

As AI coding tools became part of my workflow, I noticed something uncomfortable: I had no visibility into the quality or longevity of AI-generated code versus hand-written code. Was AI code churning faster? Did it correlate with bug fixes? Which files were effectively AI-authored?

These aren't philosophical questions — they're engineering metrics that every team should be tracking.

How it works

Step 1: Session ingestion

Claude Code stores every session as JSONL under ~/.claude/projects//. Each assistant message contains either markdown fenced code blocks or — more importantly — tool_use blocks from Write and Edit calls. That's where the actual code written to disk lives.

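# Extract from both text responses AND Write/Edit tool calls
if block.get("type") == "tool_use" and block.get("name") in ("Write", "Edit"):
    inp = block.get("input", {})  # tool arguments: Write carries "content", Edit carries "new_string"
    code_text = inp.get("content") or inp.get("new_string", "")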
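
To make the ingestion step concrete, here is a minimal sketch of a full extraction loop. It is not aigit's actual code. It assumes each JSONL record holds a `message` object with a `role` and a list of content blocks, and the function name `extract_snippets` is made up for illustration.

```python
import json
import re
from typing import Iterable, Iterator

# Markdown code fence, built from single backticks so this example stays readable.
FENCE = "`" * 3
FENCED_BLOCK = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def extract_snippets(jsonl_lines: Iterable[str]) -> Iterator[str]:
    """Yield code snippets found in the assistant messages of one session log.

    Assumption: each record looks like {"message": {"role": ..., "content": [...]}}.
    """
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        message = record.get("message") or {}
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if block.get("type") == "text":
                # Fenced code inside a prose reply.
                yield from FENCED_BLOCK.findall(block.get("text", ""))
            elif block.get("type") == "tool_use" and block.get("name") in ("Write", "Edit"):
                # Code that was written to disk through a tool call.
                inp = block.get("input", {})
                code_text = inp.get("content") or inp.get("new_string", "")
                if code_text:
                    yield code_text
```

Calling `list(extract_snippets(open(path)))` on one session file would give every candidate snippet, ready for normalization and matching.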

Step 2: Tiered fuzzy matching

Before hashing, code is normalized: comments stripped, whitespace collapsed, lowercased. The normalized code is then matched against git diff hunks in three tiers:

  • Exact SHA-256 match → confidence 1.0 (verbatim copy-paste)
  • TLSH distance < 30 → confidence 0.9 (lightly reformatted)
  • TLSH distance < 100 → confidence 0.7 (substantially edited)

TLSH (Trend Micro Locality Sensitive Hash) is designed for fuzzy matching: it produces a distance score between two inputs, where a lower score means more similar, instead of testing for identical content. That is exactly what you need when AI code gets tweaked before committing.
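
The normalization and tiering can be sketched as follows. This is an illustration under stated assumptions, not aigit's implementation: the comment stripping is deliberately naive (it only handles `#` and `//` line comments), the TLSH distance is taken as a precomputed input instead of being calculated here, and returning 0.0 for "no match" is an assumption. The function names are hypothetical.

```python
import hashlib
import re
from typing import Optional


def normalize(code: str) -> str:
    """Strip '#' and '//' line comments, collapse whitespace, lowercase."""
    # Naive on purpose: this also cuts '#' or '//' that appear inside string literals.
    without_comments = re.sub(r"(#|//).*", "", code)
    return re.sub(r"\s+", " ", without_comments).strip().lower()


def fingerprint(code: str) -> str:
    """SHA-256 of the normalized code, used for the exact-match tier."""
    return hashlib.sha256(normalize(code).encode("utf-8")).hexdigest()


def confidence(snippet: str, hunk: str, tlsh_distance: Optional[int] = None) -> float:
    """Map a (snippet, hunk) pair to the confidence tiers listed above.

    tlsh_distance is assumed to be computed elsewhere by a TLSH library.
    """
    if fingerprint(snippet) == fingerprint(hunk):
        return 1.0  # identical after normalization
    if tlsh_distance is None:
        return 0.0  # assumption: no fuzzy score means no attribution
    if tlsh_distance < 30:
        return 0.9  # lightly reformatted
    if tlsh_distance < 100:
        return 0.7  # substantially edited
    return 0.0
```

Because the exact tier hashes the normalized text, a snippet and a hunk that differ only in comments, spacing, or letter case still land in the 1.0 tier.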

Step 3: Attribution overlay

Rather than rebuilding line provenance from scratch, aigit piggybacks on git blame --porcelain. git blame already tracks lines across renames, rebases, and cherry-picks, so aigit only has to annotate its output:

$ aigit blame src/api/routes.py

4 a1b2c3d [claude 100%] def get_user(user_id: int):
5 a1b2c3d [claude 100%] return db.query(User).get(user_id)
6 f9e8d7c
7 f9e8d7c def delete_user(user_id: int):
8 f9e8d7c db.query(User).filter_by(id=user_id).delete()
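
For readers who have not worked with the porcelain format, here is a rough sketch of how such an overlay could be built. In `git blame --porcelain` output, each source line is introduced by a header of the form `<sha> <original line> <final line> [<group size>]`, followed by optional metadata lines, and then the line content prefixed with a tab. The lookup keyed by `(sha, line number)` and the function names are assumptions for illustration, not aigit's real data model.

```python
import re
from typing import Dict, Iterator, Tuple

# Header line: commit hash, original line number, final line number, optional group size.
HEADER = re.compile(r"^([0-9a-f]{40,64}) (\d+) (\d+)(?: (\d+))?$")


def parse_porcelain(output: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (sha, final_line_number, text) for every blamed line."""
    sha, final_line = None, 0
    for line in output.split("\n"):
        if line.startswith("\t"):
            # Content lines are prefixed with a single tab.
            if sha is not None:
                yield sha, final_line, line[1:]
        else:
            match = HEADER.match(line)
            if match:
                sha, final_line = match.group(1), int(match.group(3))
            # Anything else (author, summary, filename, ...) is metadata we skip.


def annotate(output: str, ai_confidence: Dict[Tuple[str, int], float]) -> Iterator[str]:
    """Render blame lines with a hypothetical [claude NN%] tag where attribution exists."""
    for sha, line_no, text in parse_porcelain(output):
        conf = ai_confidence.get((sha, line_no))
        tag = f"[claude {conf:.0%}]" if conf is not None else ""
        yield " ".join(part for part in (str(line_no), sha[:7], tag, text) if part)
```

The output would be fed from something like `subprocess.run(["git", "blame", "--porcelain", path], capture_output=True, text=True).stdout`.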

$ aigit stats

src/api/routes.py 73% AI ████████████░░░░
src/core/engine.py 51% AI ████████░░░░░░░░

Repo-wide: 61% AI-attributed
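
The stats view boils down to simple arithmetic: AI-attributed lines divided by total lines, per file and for the whole repo. A minimal sketch, assuming lines have already been labeled as AI or not, and assuming a 16-cell bar like the sample output appears to use (all names here are hypothetical):

```python
from typing import Dict, List, Tuple


def ai_share(ai_lines: int, total_lines: int) -> float:
    """Fraction of lines attributed to AI; 0.0 for an empty file."""
    return ai_lines / total_lines if total_lines else 0.0


def bar(share: float, width: int = 16) -> str:
    """Render a share between 0 and 1 as a fixed-width block bar."""
    filled = round(share * width)
    return "█" * filled + "░" * (width - filled)


def stats(per_file: Dict[str, Tuple[int, int]]) -> List[str]:
    """per_file maps path -> (ai_lines, total_lines)."""
    rows = []
    for path, (ai, total) in per_file.items():
        share = ai_share(ai, total)
        rows.append(f"{path} {share:.0%} AI {bar(share)}")
    # The repo-wide figure is weighted by line count, not an average of file percentages.
    repo_ai = sum(ai for ai, _ in per_file.values())
    repo_total = sum(total for _, total in per_file.values())
    rows.append(f"Repo-wide: {ai_share(repo_ai, repo_total):.0%} AI-attributed")
    return rows
```

Weighting by line count matters: a 10-line file at 100% AI should not pull the repo-wide number as hard as a 2,000-line file at 50%.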

What I found dogfooding it

I ran aigit on itself. The entire codebase was built in a single Claude Code session. Result: 89.8% AI-attributed across 2,171 lines. The remaining 10.2% consists of lines I added manually to fix bugs the AI introduced, which is itself an interesting metric.
Current limitations

  • Claude Code only: the provider architecture is pluggable, but Cursor and Copilot support isn't built yet
  • Requires local session logs: tools that don't store sessions locally (Devin, cloud-based agents) can't be supported without an API
  • Cold start: commits made before you started using aigit won't be attributed

Install and try it

pip install getaigit
cd your-repo
aigit index
aigit blame src/yourfile.py
aigit stats

The attribution database lives at .aigit/attribution.db — commit it to share attribution data across your team.

GitHub:
Curious what metrics you'd want to see beyond AI% and churn rate — and whether you're seeing the same gap in your teams.
