Claude burned 83,000 tokens fixing test failures after a refactor — raw pytest output, coverage noise, ruff warnings, all re-fed every loop.
It worked. But it was absurdly expensive.
The problem isn’t the model — it’s the context.
So I made cq (python-code-quality on PyPI). It runs 10+ quality tools and surfaces exactly one issue at a time.
Minimal context
Instead of dumping everything into the prompt, cq:
- runs tools in priority order
- stops at the first failure
- emits a single, focused fix request
> cq check . -o llm
`src/myproject/utils.py:21` — **F841**: Local variable `unused_variable` is assigned to but never used
18: min_dist = float("inf")
19: nearest_city = None
20: for city in cities:
21: unused_variable = 67
22: dist = calc_dist(current_city, city)
Please fix only this issue. After fixing, run `cq check . -o llm` to verify.
That’s it. No test logs, no coverage spam, no unrelated warnings.
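The selection logic behind this can be sketched in a few lines (a hypothetical sketch, not cq's actual source): run checkers in priority order, stop at the first failure, and format only that finding.

```python
# Hypothetical sketch of a stop-at-first-failure runner (not cq's actual source).
from typing import Callable, Optional

# Each checker returns None on success, or a short failure report string.
Checker = Callable[[], Optional[str]]

def first_failure(checkers: list[tuple[str, Checker]]) -> Optional[str]:
    """Run checkers in priority order; return the first failure, formatted."""
    for name, check in checkers:
        report = check()
        if report is not None:
            return f"{name}: {report}\nPlease fix only this issue."
    return None  # everything passed

# Example with stubbed checkers: lint fails, so type-checking never runs.
checks = [
    ("compile", lambda: None),
    ("ruff", lambda: "F841: local variable `unused_variable` is never used"),
    ("ty", lambda: "this should never be reached"),
]
print(first_failure(checks))
```

Because the loop returns at the first failure, everything after the failing tool costs nothing: no time, no tokens.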
If the error looks like a caller/callee mismatch, cq also fetches the callee's signature, which can save the model an extra tool call.
The minimal loop
smallest complete context → smallest capable model → fewest tool calls → successful edit
Small, focused context means you can use a small, cheap model and get the fix in 1 second. No tool-calling needed (if you edit yourself):
cq check . -o llm | ollama run qwen3:4b --think=false \
'show a unified diff to correct this code. Add a one line explanation'
--- a/src/myapp/calculator.py
+++ b/src/myapp/calculator.py
@@ -1,5 +1,5 @@
def evaluate(expression):
- return eval(expression)
+ import ast
+ return ast.literal_eval(expression)
Explanation: Replaced eval() with ast.literal_eval() to safely evaluate strings as Python literals.
Apply the fix. Run cq again. Repeat.
Or with Claude Code:
cq check . -o llm | claude -p "fix this"
Tool ordering
In -o llm mode, the tools are run sequentially, and we stop at the first error.
In other modes, we run in parallel and cache results for fast re-runs.
cq check .
┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━┓
┃ Tool ┃ Time ┃ Metric ┃ Score ┃ Status ┃
┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━┩
│ compile │ 0.47s │ compile │ 1.000 │ OK │
│ ruff │ 0.22s │ lint │ 1.000 │ OK │
│ ty │ 0.80s │ type_check │ 1.000 │ OK │
│ bandit │ 0.53s │ security │ 1.000 │ OK │
│ pytest │ 2.11s │ tests │ 1.000 │ OK │
│ radon-cc │ 0.34s │ simplicity │ 0.982 │ OK │
│ radon-mi │ 0.41s │ maintainability │ 0.848 │ OK │
│ radon-hal │ 0.36s │ file_bug_free │ 0.810 │ OK │
│ radon-hal │ │ file_smallness │ 0.655 │ OK │
│ radon-hal │ │ functions_bug_free │ 0.808 │ OK │
│ radon-hal │ │ functions_smallness │ 0.808 │ OK │
│ vulture │ 0.37s │ dead_code │ 1.000 │ OK │
│ interrogate │ 0.38s │ doc_coverage │ 0.853 │ OK │
│ │ │ Score │ 0.945 │ │
└──────────────────┴──────────┴───────────────────────────┴─────────┴──────────┘
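The parallel mode with caching can be sketched like this (stubbed tools; the cache key and storage here are my assumptions, not cq's actual design):

```python
# Sketch: run tools in parallel, caching results by (tool, source-hash).
# Stubbed tools; cq's real cache key and storage are my assumptions.
import hashlib
from concurrent.futures import ThreadPoolExecutor

cache: dict[tuple[str, str], float] = {}

def run_tool(name: str, source: str) -> float:
    """Return a cached score if the source is unchanged, else recompute."""
    key = (name, hashlib.sha256(source.encode()).hexdigest())
    if key not in cache:
        cache[key] = 1.0  # stand-in for actually invoking the tool
    return cache[key]

def check_all(tools: list[str], source: str) -> dict[str, float]:
    with ThreadPoolExecutor(max_workers=len(tools)) as pool:
        futures = {name: pool.submit(run_tool, name, source) for name in tools}
        return {name: f.result() for name, f in futures.items()}

scores = check_all(["ruff", "ty", "bandit"], "print('hi')\n")
print(scores)  # second run with the same source hits the cache
```

Keying on a hash of the source means an unchanged file never pays for a second run, which is what makes repeated `cq check .` invocations fast.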
Claude Code stop hook
If you want to auto-run, add a hook to your project's .claude/settings.json:
{
  "hooks": {
    "Stop": [{
      "matcher": "",
      "hooks": [{
        "type": "command",
        "command": "cq check . -o score && echo 'CQ: all clear' || cq check . -o llm; true"
      }]
    }]
  }
}
- pass → tiny output
- fail → targeted fix prompt
- loop continues with minimal context
For manual use, create .claude/commands/cq-fix.md:
$(cq check . -o llm)
/cq-fix embeds the live output directly into the prompt.
Install
uv tool install python-code-quality
Help
cq check --help
Usage: cq check [OPTIONS] [PATH]
Feed the results from 11+ code quality tools to an LLM. Try: cq check . -o llm
╭─ Arguments ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ path [PATH] Path to Python file or project directory [default: .] │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --output -o [table|score|json|llm|raw] Output mode: table (default), score, json, llm │
│ --log-level TEXT Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) [default: CRITICAL] │
│ --clear-cache Clear cached tool results before running │
│ --workers INTEGER Max parallel workers (default: one per tool, use 1 for sequential) [default: 0] │
│ --language -l TEXT Override language detection (e.g. python, typescript, rust) # FUTURE │
│ --only TEXT Comma-separated tool IDs to run (e.g. ruff,ty,pytest) │
│ --skip TEXT Comma-separated tool IDs to skip (e.g. bandit,vulture) │
│ --exclude TEXT Comma-separated paths to exclude (e.g. demo,docs) │
│ --help Show this message and exit. │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Notes
- Python only (for now), but the approach generalizes
- No agent/tool orchestration required — just a shell pipeline
- Works with local models or hosted ones
Repo: github.com/rhiza-fr/py-cq — MIT, actively maintained.
Enjoy!