TL;DR: AI coding agents are now part of the dev workflow, but most repos still treat context like it is free. It is not.
dist/, coverage output, source maps, generated clients, logs, snapshots, and lockfile churn all make agents waste attention. I built ContextLevy as a small PR guardrail for that.
The thing nobody wants to admit
Your AI coding agent reads your repo mess.
Not always perfectly. Not always literally every file. But enough that messy repositories become worse environments for tools like:
- Cursor
- Codex
- Claude Code
- Copilot
- local coding agents
- random CLI agents your team experiments with next week
And the mess usually does not look dramatic.
It looks like this:
+ dist/index.js
+ coverage/lcov.info
+ generated/client.ts
+ bundle.js.map
+ snapshots/
+ debug.log
+ package-lock.json with 9,000 changed lines
None of that necessarily breaks your app.
But it does make your repo heavier to reason about.
Humans learned to ignore junk.
For agents, junk becomes context.
Bundle bloat already taught us this lesson
Frontend devs understand bundle bloat because we learned to measure it.
A pull request adding 30 KB of JavaScript is easy to miss.
A bot comment saying this is harder to ignore:
Bundle size increased by 30 KB
Main chunk increased by 18 KB
Vendor chunk increased by 12 KB
That comment does not replace engineering judgment.
It creates friction at the right moment.
Before the cost lands in main.
That is the same mental model I think AI-heavy repos need now.
Bundle bloat was hidden cost for users.
Context bloat is hidden cost for agents.
What is context bloat?
Context bloat is when a repo accumulates files that are technically valid, but low-value or noisy for AI coding workflows.
Common examples
| File/change | Why it is noisy |
|---|---|
| dist/ | Usually generated output, not source of truth |
| coverage/ | Huge text output that agents should rarely inspect |
| *.map | Source maps can be massive and low-signal |
| generated clients | Sometimes needed, often overwhelming |
| lockfile churn | Can dominate PR diffs with little semantic value |
| snapshots | Useful for tests, noisy for reasoning |
| logs | Almost never belong in repo context |
| agent instruction files | Small changes can affect agent behavior a lot |
The issue is not that these files are always bad.
The issue is that they should not become invisible cost.
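To make the table concrete, here is a minimal sketch of how paths could be sorted into those categories. The pattern list and the `classify` helper are my own illustrative assumptions, not ContextLevy's actual defaults, and the patterns only cover the simple cases (for example, lockfiles at the repo root).

```python
from fnmatch import fnmatchcase

# Hypothetical category table, loosely mirroring the examples above.
# These patterns are assumptions for illustration, not ContextLevy's real rules.
NOISY_PATTERNS = [
    ("build output", ["dist/*", "build/*"]),
    ("coverage", ["coverage/*"]),
    ("generated client", ["generated/*"]),
    ("source map", ["*.map"]),
    ("lockfile", ["package-lock.json", "yarn.lock", "pnpm-lock.yaml"]),
    ("snapshot", ["*.snap", "snapshots/*"]),
    ("log", ["*.log"]),
    ("agent instructions", ["AGENTS.md", "CLAUDE.md", ".cursorrules"]),
]


def classify(path):
    """Return the first noisy category whose pattern matches, or None."""
    for category, patterns in NOISY_PATTERNS:
        # fnmatchcase lets "*" span "/" too, so "dist/*" covers nested files.
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            return category
    return None


for path in ["dist/index.js", "coverage/lcov.info", "bundle.js.map", "src/app.ts"]:
    print(path, "->", classify(path))
```

Order matters in a table like this: a source map inside dist/ is reported as build output because that rule comes first.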
“Just use .gitignore” is not enough
You should use .gitignore.
Seriously.
But .gitignore only applies to files that are not tracked yet. Once a file is committed, ignore rules no longer affect it.
It does not help much when:
- generated files are intentionally committed
- old junk is already tracked
- snapshots grow over time
- lockfiles churn hard
- someone changes agent instructions in a risky way
- different tools use different ignore/indexing behavior
Also, .gitignore works silently on each developer's machine.
A PR comment is visible to the whole team.
It shows up where the merge decision happens.
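The "old junk is already tracked" case is easy to check for. The sketch below asks git for files that are tracked even though an ignore rule matches them. The helper name `tracked_but_ignored` and the injectable `run` parameter are my own choices for illustration; the `git ls-files` flags are standard git.

```python
import subprocess


def tracked_but_ignored(run=subprocess.run):
    """List files that are tracked even though an ignore rule matches them."""
    result = run(
        # --cached: look at tracked files; --ignored: keep only those matching
        # an ignore rule; --exclude-standard: use .gitignore and friends.
        ["git", "ls-files", "--cached", "--ignored", "--exclude-standard"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Note: git quotes unusual path names in this output; that is not handled here.
    return [line for line in result.stdout.splitlines() if line]


if __name__ == "__main__":
    # Must be run inside a git repository.
    for path in tracked_but_ignored():
        print(path)
```

Anything this prints can be untracked without deleting it from disk, for example with `git rm -r --cached dist/`, followed by a commit.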
“Just tell the AI to ignore it” also does not scale
This sounds good until you remember how real teams work.
One dev uses Cursor.
Another uses Claude Code.
Someone else uses Codex.
Someone runs a local model through a CLI.
Next month, the team tries another agent entirely.
Every tool has different rules for indexing, retrieval, file search, ignore behavior, and context selection.
The repo is the shared layer.
Cleaner repo context helps every tool downstream.
So I built ContextLevy
ContextLevy is a small open-source tool that acts like:
bundle-size checks, but for AI coding context
It runs on pull requests and flags diffs that add a lot of context weight.
It catches things like
- committed build output
- coverage reports
- source maps
- generated clients
- large lockfile churn
- snapshots
- logs
- agent instruction changes
It can run as
- a GitHub Action
- a GitHub App
- a local CLI
It does not
- call an LLM
- upload your code
- judge code quality
- replace code review
- pretend to be an AI platform
It just analyzes the diff and leaves a focused PR comment.
That is the whole point.
Small guardrail. Clear feedback.
What a ContextLevy comment is supposed to do
The goal is not to shame people for committing generated files.
Sometimes generated files belong in the repo.
The goal is to make the cost visible:
ContextLevy · Warning · ~84k added context tokens
Largest contributors:
+ coverage/lcov.info
+ dist/index.js
+ generated/client.ts
Suggestion:
Consider ignoring coverage output and build artifacts unless they are intentionally tracked.
That is it.
Just a useful nudge before main gets heavier.
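For a feel of how a comment like that could be produced, here is a rough sketch. Everything in it is an assumption for illustration: the 4-characters-per-token heuristic, the 50k warning threshold, and the `render_comment` helper are mine, not ContextLevy's actual implementation.

```python
CHARS_PER_TOKEN = 4  # assumption: crude average; real tokenizers vary


def estimate_tokens(text):
    """Approximate a token count from character length."""
    return len(text) // CHARS_PER_TOKEN


def render_comment(added_by_path, warn_at=50_000, top=3):
    """Build a short summary from a {path: added_text} mapping.

    warn_at is a hypothetical threshold in estimated tokens.
    """
    tokens = {path: estimate_tokens(text) for path, text in added_by_path.items()}
    total = sum(tokens.values())
    level = "Warning" if total >= warn_at else "OK"
    # Largest contributors first, limited to the top few.
    largest = sorted(tokens, key=tokens.get, reverse=True)[:top]
    lines = [
        f"ContextLevy · {level} · ~{round(total / 1000)}k added context tokens",
        "Largest contributors:",
    ]
    lines += [f"+ {path}" for path in largest]
    return "\n".join(lines)


added = {
    "coverage/lcov.info": "x" * 200_000,
    "dist/index.js": "y" * 100_000,
    "src/app.ts": "z" * 400,
}
print(render_comment(added))
```

The estimate is deliberately cheap. The point is a consistent, order-of-magnitude signal in the PR, not an exact token count for any particular model.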
Why this is not just another AI wrapper
Most AI devtools try to add more intelligence.
ContextLevy does the opposite.
It assumes the boring part matters:
- what files exist
- what changed in the PR
- how much text was added
- whether that text is likely useful
- whether the repo is getting noisier over time
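The first three items on that list need nothing smarter than git itself. As a minimal sketch, this parses the output of `git diff --numstat` into added-line counts per file. The `parse_numstat` helper is a hypothetical name of mine, and it does not handle renames, which git prints as `old => new` in the path column.

```python
def parse_numstat(output):
    """Parse `git diff --numstat` output into {path: added_lines}."""
    added = {}
    for line in output.splitlines():
        # Each line is "<added>\t<deleted>\t<path>".
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue
        added_count, _deleted, path = parts
        if added_count == "-":
            # git prints "-" instead of counts for binary files.
            continue
        added[path] = int(added_count)
    return added


sample = "9000\t8700\tpackage-lock.json\n12\t3\tsrc/app.ts\n-\t-\tlogo.png\n"
for path, count in parse_numstat(sample).items():
    print(f"{count:>6} added lines  {path}")
```

In CI, the input would come from something like `git diff --numstat origin/main...HEAD`, assuming the base branch is called `main`.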
A lot of AI tooling discourse focuses on better models.
But model quality is only half the story.
The other half is context quality.
Garbage context still hurts, even with better models.
Bigger context windows do not fix this.
They just make it easier to stuff more junk into the prompt.
The fair criticism
The obvious criticism is:
“Couldn’t I make this with a script?”
Yes.
You can also write your own formatter, linter, bundle-size checker, release script, changelog generator, and dependency bot.
Most useful devtools are not valuable because the underlying idea is impossible.
They are valuable because they package the boring workflow into something teams actually run.
The value is in:
- useful defaults
- CI integration
- PR comments
- config
- predictable output
- low setup cost
- making the issue visible consistently
That is what ContextLevy is trying to be.
Who this is for
ContextLevy makes sense if:
- your team uses AI coding agents heavily
- your repo has lots of generated or build output
- your PRs often include noisy files
- you care about keeping AI context clean
- you want a lightweight CI guardrail
It probably does not make sense if:
- your repo is tiny
- you barely use coding agents
- your team already has strict generated-file policies
- you do not want another PR check
- you expect semantic code review from it
That last point matters.
ContextLevy is not a reviewer.
It is a warning light.
Why I think this will matter more
AI coding agents are moving from autocomplete to actual development loops.
People now ask agents to:
- explain unfamiliar codebases
- implement cross-file features
- review pull requests
- debug CI failures
- migrate frameworks
- generate tests
- refactor architecture
That means repo context is becoming part of the development environment.
We already optimize package size.
We already optimize test speed.
We already optimize CI time.
We already optimize dependency weight.
So why are we pretending AI context is free?
My actual question
I am not claiming ContextLevy is the final answer.
I am trying to figure out if this problem deserves more serious tooling.
Repo:
https://github.com/unloopedmido/contextlevy
I would genuinely like blunt feedback:
- Is “context bloat” a real problem you have felt?
- Would you install a PR check for this?
- Are the default noisy-file categories correct?
- What would make this feel like a serious devtool instead of AI-tool noise?
- Is the bundle-size analogy clear, or does it feel forced?
Final thought
Bundle bloat became obvious once teams started measuring it.
Context bloat is still mostly invisible.
But as AI agents become normal parts of development, invisible repo noise will matter more.
Maybe the fix is not complicated.
Maybe it starts with a simple PR comment saying:
“This change adds a lot of context weight. Are you sure?”
That is what ContextLevy is trying to do.