If you’re using AI to write code — Claude, Copilot, GPT, or any other tool — you probably have a gut sense of how well it’s working. Some sessions feel productive. Others end with you rewriting half of what the AI generated. But gut feelings don’t scale, and they don’t help you improve your process.
Today we’re open-sourcing CodeAssay, a git forensics tool that answers the question: how good is the code my AI tools are producing, and what goes wrong when it isn’t?
The Problem: You Can’t Improve What You Don’t Measure
AI coding assistants are powerful, but they’re not perfect. Code gets generated, merged, and then quietly fixed days later. Without tracking, you can’t distinguish between an AI tool that nails it 90% of the time and one that creates subtle bugs you spend hours debugging.
Most teams have no visibility into:
- What percentage of their codebase is AI-authored
- How often AI-generated code requires rework
- Whether rework is caused by bugs, misunderstandings, or style violations
- Which AI tools produce the most reliable code
- Which files are rework hotspots
CodeAssay extracts all of this from your existing git history. No workflow changes required.
How It Works
CodeAssay analyzes your git history using three detection layers:
1. AI Commit Detection
It identifies AI-authored commits through Co-Authored-By trailers (Claude, Copilot, GPT), branch naming patterns, and manual AI-Assisted: true tags. If your AI tool leaves a signature in the commit, CodeAssay finds it.
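The trailer check can be sketched in a few lines of Python. This is an illustrative approximation, not CodeAssay's actual implementation — the trailer names follow the standard git convention, but the exact patterns and tool list here are assumptions:

```python
import re

# Matches Co-Authored-By trailers naming common AI tools (illustrative list).
AI_COAUTHORS = re.compile(
    r"^Co-Authored-By:.*\b(claude|copilot|gpt)\b",
    re.IGNORECASE | re.MULTILINE,
)

def looks_ai_authored(commit_message: str) -> bool:
    """Return True if the message carries an AI co-author trailer
    or a manual 'AI-Assisted: true' tag."""
    if AI_COAUTHORS.search(commit_message):
        return True
    return bool(re.search(r"^AI-Assisted:\s*true\s*$", commit_message,
                          re.IGNORECASE | re.MULTILINE))
```

Branch-name heuristics would layer on top of this in the same spirit, matching prefixes like an AI tool's default branch naming.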
2. Rework Detection
When a later commit modifies lines originally written by an AI commit, that’s a rework event. CodeAssay traces these using git blame ancestry within a configurable time window. It also detects file-level rewrites where entire files are replaced — a pattern that’s common when AI misunderstands a requirement.
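The core of the blame-ancestry check reduces to a simple set intersection once `git blame` has mapped each modified line to its origin commit. A minimal sketch of that step, assuming the mapping has already been extracted (the function and parameter names are hypothetical, not CodeAssay's API):

```python
from datetime import datetime, timedelta

def find_rework(modified_line_origins: dict, ai_commit_dates: dict,
                rework_date: datetime, window_days: int = 30) -> set:
    """Flag rework: a later commit touched lines that originated in an
    AI commit within the time window.

    modified_line_origins: {line_no: origin_commit_hash} for the lines
        the later commit changed (as reported by `git blame`).
    ai_commit_dates: {commit_hash: authored datetime} for AI commits.
    Returns the set of AI commit hashes whose code was reworked.
    """
    window = timedelta(days=window_days)
    reworked = set()
    for origin in modified_line_origins.values():
        authored = ai_commit_dates.get(origin)
        if authored is not None and authored <= rework_date <= authored + window:
            reworked.add(origin)
    return reworked
```

File-level rewrite detection would sit alongside this, triggering when the fraction of replaced lines in a file crosses a threshold rather than tracing individual lines.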
3. Automatic Classification
Each rework event is classified into one of seven categories using commit message analysis and diff shape heuristics:
| Category | What It Means |
|---|---|
| Bug fix | Code had a defect discovered later |
| Misunderstanding | AI built the wrong thing entirely |
| Test failure | Code didn’t pass tests on first attempt |
| Style/convention | Worked, but didn’t follow project patterns |
| Security issue | Introduced a vulnerability |
| Incomplete | AI left TODOs or placeholders |
| Over-engineering | Unnecessary complexity that was stripped out |
This classification is heuristic-based — no LLM calls, fully offline, deterministic. If the classifier gets it wrong, you can override it with `codeassay reclassify`.
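A keyword-based classifier in this spirit might look like the following. The category names come from the table above, but the keyword lists, their priority order, and the fallback are illustrative assumptions — the shipped heuristics also weigh diff shape:

```python
# Ordered by priority: more specific categories are checked first.
# Keyword lists are illustrative, not CodeAssay's actual rules.
CATEGORY_KEYWORDS = [
    ("security issue", ("vulnerability", "cve", "injection", "sanitize")),
    ("test failure", ("failing test", "fix test", "broken test")),
    ("bug fix", ("fix", "bug", "crash", "regression")),
    ("style/convention", ("lint", "format", "style", "convention")),
    ("incomplete", ("todo", "placeholder", "stub")),
    ("over-engineering", ("simplify", "remove unused", "dead code")),
]

def classify_rework(commit_message: str) -> str:
    """Classify a rework commit by keywords in its message."""
    msg = commit_message.lower()
    for category, keywords in CATEGORY_KEYWORDS:
        if any(k in msg for k in keywords):
            return category
    # Arbitrary fallback for this sketch when nothing matches.
    return "misunderstanding"
```

Because the rules are plain string matching, the same input always yields the same label — which is what makes manual reclassification a safe, stable override.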
Real Results from Our Repos
We built CodeAssay because we needed it ourselves. At EchoForgeX, our AI agent platform is heavily AI-assisted — depending on the repository, roughly half to three-quarters of commits are AI-authored. Here’s what CodeAssay revealed when we pointed it at our own codebases:
| Metric | EchoForgeX | EchoForge Hub |
|---|---|---|
| AI Commit Rate | 49.1% | 74.4% |
| First-Pass Success | 82.5% | 47.3% |
| Rework Events | 21 | 96 |
| Top Rework Cause | Style/convention | Bug fix |
| Mean Time to Rework | 21.7h | 34.1h |
The numbers immediately told us something actionable: our Hub codebase has a much lower first-pass success rate, dominated by bug fixes. That’s a signal to invest in better prompts, more specific specs, and tighter test coverage for that repo. Meanwhile, the EchoForgeX repo’s rework is mostly style violations — a prompt engineering fix, not an architecture problem.
Interactive Dashboard
Numbers in a terminal are useful. Charts are better. CodeAssay generates a self-contained HTML dashboard that opens in your browser — no server required, works offline, and produces publication-ready screenshots.
The dashboard includes:
- Summary cards — AI commit rate, first-pass success, rework rate, mean time to rework
- Category doughnut chart — visual breakdown of why rework happens, with percentages
- Monthly trend lines — AI commits and rework events over time
- File hotspot chart — which files need the most rework
- Tool comparison — rework rates across different AI tools
Generate it with one command: `codeassay dashboard`
Install in 30 Seconds
CodeAssay is a Python package with zero external dependencies — just Python 3.10+ and git.
```shell
pip install codeassay
```
Then scan any git repository:
```shell
# Scan a repo
codeassay scan /path/to/your/repo

# View CLI report
codeassay report

# Open interactive dashboard
codeassay dashboard

# Scan multiple repos at once
codeassay scan ../repo1 ../repo2 ../repo3
```
Claude Code Plugin
If you use Claude Code, install CodeAssay as a plugin:
```
/plugin marketplace add jeffsinason/codeassay
/plugin install codeassay@codeassay
```
After installation, /codeassay is available as a skill in your Claude Code sessions.
Filter the Noise
Not every file matters for code quality analysis. Documentation churn, config file updates, and dependency bumps add noise. Create a .codeassayignore file in your repo root to exclude them:
```gitignore
# .codeassayignore
*.md
.DS_Store
.organization
docs/**
```
This uses gitignore-style patterns and filters files from both AI commit tracking and rework detection.
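The matching behavior can be approximated with Python's standard `fnmatch` module. This is a rough sketch only — full gitignore semantics (negation with `!`, anchoring, directory-only patterns) are richer than what `fnmatch` provides, and the helper names here are hypothetical:

```python
from fnmatch import fnmatch

def is_ignored(rel_path: str, patterns: list) -> bool:
    """Return True if a repo-relative path matches any ignore pattern.

    Note: fnmatch lets '*' cross '/' boundaries, which makes '*.md'
    match at any depth and 'docs/**' match everything under docs/ --
    close enough to gitignore behavior for this illustration.
    """
    return any(fnmatch(rel_path, pattern) for pattern in patterns)
```

Matched files are then skipped in both the AI-commit and rework passes, so a doc-only commit neither inflates the AI commit count nor registers as rework.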
Query Your Own Data
CodeAssay stores everything in SQLite — one database per repo at .codeassay/quality.db. You can query it directly with any SQL tool:
```shell
# Which AI tool produces the most rework?
sqlite3 .codeassay/quality.db \
  "SELECT tool, COUNT(*) FROM ai_commits a
   JOIN rework_events r ON a.commit_hash = r.original_commit
   GROUP BY tool ORDER BY COUNT(*) DESC"

# What's the most common rework category this month?
sqlite3 .codeassay/quality.db \
  "SELECT category, COUNT(*) FROM rework_events
   WHERE rework_date >= '2026-04-01' GROUP BY category
   ORDER BY COUNT(*) DESC"
```
This makes CodeAssay data available for custom analysis, notebooks, and integration into your existing tooling — no vendor lock-in.
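The same queries work from a notebook via Python's built-in `sqlite3` module. The table and column names below are taken from the SQL examples above; treat the exact schema as illustrative:

```python
import sqlite3

def rework_by_category(db_path: str = ".codeassay/quality.db") -> dict:
    """Return {category: count} of rework events, most common first."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT category, COUNT(*) AS n FROM rework_events "
            "GROUP BY category ORDER BY n DESC"
        ).fetchall()
    finally:
        con.close()
    return dict(rows)
```

From there the data drops straight into pandas or a plotting library for any analysis the built-in dashboard doesn't cover.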
What’s Next
CodeAssay v0.1.0 is manual — you run scans when you want them. Coming in v1.1:
- Continuous mode — a Claude Code hook that auto-scans after every commit
- Enhanced dashboards — more chart types, drill-down views, comparison across repos
The project is fully open source and MIT licensed. Contributions, issues, and feedback are welcome on GitHub: github.com/jeffsinason/codeassay
At EchoForgeX, we build AI-powered tools and help businesses integrate AI into their workflows. Get in touch to learn how we can help your team work smarter with AI.