Jeff Sinason

Originally published at echoforgex.com

Introducing CodeAssay: Git Forensics for AI-Authored Code Quality

If you’re using AI to write code — Claude, Copilot, GPT, or any other tool — you probably have a gut sense of how well it’s working. Some sessions feel productive. Others end with you rewriting half of what the AI generated. But gut feelings don’t scale, and they don’t help you improve your process.

Today we’re open-sourcing CodeAssay, a git forensics tool that answers the question: how good is the code my AI tools are producing, and what goes wrong when it isn’t?

The Problem: You Can’t Improve What You Don’t Measure

AI coding assistants are powerful, but they’re not perfect. Code gets generated, merged, and then quietly fixed days later. Without tracking, you can’t distinguish between an AI tool that nails it 90% of the time and one that creates subtle bugs you spend hours debugging.

Most teams have no visibility into:

  • What percentage of their codebase is AI-authored
  • How often AI-generated code requires rework
  • Whether rework is caused by bugs, misunderstandings, or style violations
  • Which AI tools produce the most reliable code
  • Which files are rework hotspots

CodeAssay extracts all of this from your existing git history. No workflow changes required.

How It Works

CodeAssay analyzes your git history using three detection layers:

1. AI Commit Detection

It identifies AI-authored commits through Co-Authored-By trailers (Claude, Copilot, GPT), branch naming patterns, and manual AI-Assisted: true tags. If your AI tool leaves a signature in the commit, CodeAssay finds it.
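As an illustration, the trailer check can be sketched in a few lines of Python. This is a hypothetical sketch, not CodeAssay’s actual internals — the function name, patterns, and branch prefixes here are assumptions:

```python
import re

# Assumed AI co-author signatures in commit trailers (sketch only).
AI_TRAILER = re.compile(
    r"^Co-Authored-By:.*\b(Claude|Copilot|GPT)\b",
    re.IGNORECASE | re.MULTILINE,
)
# Manual opt-in tag mentioned in the post.
AI_TAG = re.compile(r"^AI-Assisted:\s*true\s*$", re.IGNORECASE | re.MULTILINE)

def is_ai_commit(message: str, branch: str = "") -> bool:
    """Return True if the commit message or branch carries an AI signature."""
    if AI_TRAILER.search(message) or AI_TAG.search(message):
        return True
    # Branch naming convention check (assumed prefixes).
    return branch.startswith(("ai/", "claude/", "copilot/"))
```

Anything that passes a check like this gets recorded as an AI-authored commit; everything else is treated as human-authored.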

2. Rework Detection

When a later commit modifies lines originally written by an AI commit, that’s a rework event. CodeAssay traces these using git blame ancestry within a configurable time window. It also detects file-level rewrites where entire files are replaced — a pattern that’s common when AI misunderstands a requirement.
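The window check itself is simple once blame ancestry is resolved: for each line a new commit touches, look up the commit that originally wrote it and test whether the gap falls inside the window. A minimal sketch, with assumed data shapes (the real implementation works from git blame output):

```python
from datetime import datetime, timedelta

def rework_events(modified_lines, ai_commit_dates, rework_date, window_days=30):
    """Return the set of AI commits being reworked.

    modified_lines: {line_no: original_commit_hash} from blame ancestry
    ai_commit_dates: {commit_hash: authored datetime} for known AI commits
    Hypothetical sketch -- names and data shapes are assumptions.
    """
    window = timedelta(days=window_days)
    reworked = set()
    for original in modified_lines.values():
        authored = ai_commit_dates.get(original)  # None => human-authored
        if authored is not None and timedelta(0) <= rework_date - authored <= window:
            reworked.add(original)
    return reworked
```

A file-level rewrite would be detected separately, e.g. when a diff replaces nearly all of a file’s lines at once rather than a handful.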

3. Automatic Classification

Each rework event is classified into one of seven categories using commit message analysis and diff shape heuristics:

| Category | What It Means |
| --- | --- |
| Bug fix | Code had a defect discovered later |
| Misunderstanding | AI built the wrong thing entirely |
| Test failure | Code didn’t pass tests on first attempt |
| Style/convention | Worked, but didn’t follow project patterns |
| Security issue | Introduced a vulnerability |
| Incomplete | AI left TODOs or placeholders |
| Over-engineering | Unnecessary complexity that was stripped out |

This classification is heuristic-based — no LLM calls, fully offline, deterministic. If the classifier gets it wrong, you can override with codeassay reclassify.
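The commit-message half of such a heuristic can be sketched as an ordered keyword scan. This is only an illustration of the idea — the keyword lists, ordering, and fallback below are assumptions, and the real classifier also weighs diff shape:

```python
# Ordered: more specific categories are checked before the generic "bug fix".
# Keywords are illustrative assumptions, not CodeAssay's actual lists.
CATEGORY_KEYWORDS = [
    ("security issue", ("security", "vulnerability", "cve", "injection")),
    ("test failure", ("failing test", "fix test", "test failure")),
    ("misunderstanding", ("rewrite", "redo", "wrong approach")),
    ("style/convention", ("lint", "style", "convention", "format")),
    ("incomplete", ("todo", "placeholder", "unfinished")),
    ("over-engineering", ("simplify", "remove abstraction", "overkill")),
    ("bug fix", ("fix", "bug", "crash", "error")),
]

def classify(commit_message: str) -> str:
    """Pick the first category whose keywords appear in the message."""
    msg = commit_message.lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in msg for keyword in keywords):
            return category
    return "bug fix"  # assumed fallback when nothing matches
```

Because it is pure string matching, the same commit always classifies the same way, which is what makes a manual override command like reclassify safe to layer on top.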

Real Results from Our Repos

We built CodeAssay because we needed it ourselves. At EchoForgeX, our AI agent platform is heavily AI-assisted: roughly half to three-quarters of commits in our repositories are AI-authored. Here’s what CodeAssay revealed when we pointed it at our own codebase:

| Metric | EchoForgeX | EchoForge Hub |
| --- | --- | --- |
| AI Commit Rate | 49.1% | 74.4% |
| First-Pass Success | 82.5% | 47.3% |
| Rework Events | 21 | 96 |
| Top Rework Cause | Style/convention | Bug fix |
| Mean Time to Rework | 21.7h | 34.1h |

The numbers immediately told us something actionable: our Hub codebase has a much lower first-pass success rate, dominated by bug fixes. That’s a signal to invest in better prompts, more specific specs, and tighter test coverage for that repo. Meanwhile, the EchoForgeX repo’s rework is mostly style violations — a prompt engineering fix, not an architecture problem.

Interactive Dashboard

Numbers in a terminal are useful. Charts are better. CodeAssay generates a self-contained HTML dashboard that opens in your browser — no server required, works offline, and produces publication-ready screenshots.

The dashboard includes:

  • Summary cards — AI commit rate, first-pass success, rework rate, mean time to rework
  • Category doughnut chart — visual breakdown of why rework happens, with percentages
  • Monthly trend lines — AI commits and rework events over time
  • File hotspot chart — which files need the most rework
  • Tool comparison — rework rates across different AI tools

Generate it with one command: codeassay dashboard

Install in 30 Seconds

CodeAssay is a Python package with zero external dependencies — just Python 3.10+ and git.

pip install codeassay

Then scan any git repository:

# Scan a repo
codeassay scan /path/to/your/repo

# View CLI report
codeassay report

# Open interactive dashboard
codeassay dashboard

# Scan multiple repos at once
codeassay scan ../repo1 ../repo2 ../repo3

Claude Code Plugin

If you use Claude Code, install CodeAssay as a plugin:

/plugin marketplace add jeffsinason/codeassay
/plugin install codeassay@codeassay

After installation, /codeassay is available as a skill in your Claude Code sessions.

Filter the Noise

Not every file matters for code quality analysis. Documentation churn, config file updates, and dependency bumps add noise. Create a .codeassayignore file in your repo root to exclude them:

# .codeassayignore
*.md
.DS_Store
.organization
docs/**

This uses gitignore-style patterns and filters files from both AI commit tracking and rework detection.
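To make the matching behavior concrete, here is a rough approximation of gitignore-style filtering using Python’s standard library. It is an assumption about the semantics, not CodeAssay’s implementation, and it only handles the pattern shapes shown above:

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

def is_ignored(path: str, patterns) -> bool:
    """Approximate gitignore-style matching (sketch, not the real tool)."""
    p = PurePosixPath(path)
    for pattern in patterns:
        if pattern.endswith("/**"):
            # "docs/**" ignores everything under docs/
            prefix = pattern[:-3]
            if str(p).startswith(prefix + "/"):
                return True
        elif fnmatch(p.name, pattern) or fnmatch(str(p), pattern):
            return True
    return False
```

With the example file above, a path like README.md or docs/guide/intro.rst would be skipped by both the AI-commit scan and rework detection, while src/main.py would still be analyzed.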

Query Your Own Data

CodeAssay stores everything in SQLite — one database per repo at .codeassay/quality.db. You can query it directly with any SQL tool:

# Which AI tool produces the most rework?
sqlite3 .codeassay/quality.db \
  "SELECT tool, COUNT(*) FROM ai_commits a
   JOIN rework_events r ON a.commit_hash = r.original_commit
   GROUP BY tool ORDER BY COUNT(*) DESC"

# What's the most common rework category this month?
sqlite3 .codeassay/quality.db \
  "SELECT category, COUNT(*) FROM rework_events
   WHERE rework_date >= '2026-04-01' GROUP BY category
   ORDER BY COUNT(*) DESC"

This makes CodeAssay data available for custom analysis, notebooks, and integration into your existing tooling — no vendor lock-in.
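The same queries work programmatically, e.g. from a notebook, via Python’s built-in sqlite3 module. The table and column names below are taken from the shell examples above; the exact schema may differ, so treat this as a sketch:

```python
import sqlite3

def top_rework_categories(db_path: str, since: str):
    """Count rework events per category since a given ISO date.

    Assumes a rework_events table with category and rework_date columns,
    as suggested by the CLI examples above.
    """
    con = sqlite3.connect(db_path)
    try:
        return con.execute(
            "SELECT category, COUNT(*) AS n FROM rework_events "
            "WHERE rework_date >= ? GROUP BY category ORDER BY n DESC",
            (since,),
        ).fetchall()
    finally:
        con.close()
```

From there the rows drop straight into pandas, a plotting library, or a CI check that fails a build when a category spikes.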

What’s Next

CodeAssay v0.1.0 is manual — you run scans when you want them. Coming in v1.1:

  • Continuous mode — a Claude Code hook that auto-scans after every commit
  • Enhanced dashboards — more chart types, drill-down views, comparison across repos

The project is fully open source and MIT licensed. Contributions, issues, and feedback are welcome on GitHub: github.com/jeffsinason/codeassay

At EchoForgeX, we build AI-powered tools and help businesses integrate AI into their workflows. Get in touch to learn how we can help your team work smarter with AI.
