I have unreasonably high standards for commit messages.
That's not a brag — it's a diagnosis. I'll rewrite a subject line four times before even considering pushing. I'll agonize over whether something is a refactor or a feat. I've more than once spent longer on a commit message than on the code it described.
So naturally, when AI commit message generators became a thing, I tried them all. Every single one does the same thing: take git diff, throw it at an LLM, hope something plausible comes back.
And every single one produces something like this:
refactor: update code and improve things
That's not a commit message. That's a cry for help.
🤯 The Problem Nobody Is Solving
Here's what happens when you send a raw diff to an LLM:
-pub fn validate(input: &str) -> bool {
+pub fn validate(input: &str, strict: bool) -> Result<()> {
You and I see this instantly: new parameter, changed return type, breaking change.
An LLM sees two lines that are kinda similar and one has more stuff in it. It doesn't know those are function signatures. It doesn't know bool → Result<()> is a breaking change. It's just doing text completion on a fancy diff.
So it writes refactor: update validate function and calls it a day.
No commit tool on the market does anything about this. opencommit (7K+ stars), aicommits (8K+ stars), GitHub Copilot, Cursor, Windsurf — all raw diffs.
🐝 So I Built CommitBee
CommitBee parses your code with tree-sitter before the LLM ever sees it. Both the staged version and the HEAD version. In parallel, across CPU cores. For 10 languages.
For that same diff, the LLM doesn't get + and - lines. It gets:
STRUCTURED CHANGES:
Validator::validate(): +param strict: bool, return bool → Result<()>, body modified (+5 -2)
The model doesn't have to guess. It knows a parameter was added, the return type changed, and the function lives inside Validator.
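To make that concrete, here is a minimal sketch of how a structured summary like the one above could be derived from two versions of a signature. It is not CommitBee's code: a toy string parser stands in for the tree-sitter pass, and `Signature`, `parse_signature`, and `describe_change` are hypothetical names invented for this illustration.

```rust
// Hypothetical sketch: string handling stands in for what a real
// tree-sitter pass would read off the syntax tree.

#[derive(Debug, PartialEq)]
struct Signature {
    name: String,
    params: Vec<String>,
    ret: String,
}

/// Parse a single-line Rust function signature such as
/// `pub fn validate(input: &str) -> bool {`.
/// Returns None for anything this toy parser does not understand.
/// Limitation: parameters containing `)` or `,` (fn pointers, multi-argument
/// generics) are not handled; a real parser would be.
fn parse_signature(line: &str) -> Option<Signature> {
    let after_fn = line.split_once("fn ")?.1;
    let (name, rest) = after_fn.split_once('(')?;
    let (params_str, tail) = rest.split_once(')')?;

    let params = params_str
        .split(',')
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(String::from)
        .collect();

    // Whatever follows `->` (minus the opening brace) is the return type;
    // a missing arrow means the function returns unit.
    let tail = tail.trim().trim_end_matches('{').trim();
    let ret = match tail.strip_prefix("->") {
        Some(ty) => ty.trim().to_string(),
        None => "()".to_string(),
    };

    Some(Signature { name: name.trim().to_string(), params, ret })
}

/// Describe what changed between two versions of the same function.
fn describe_change(old: &Signature, new: &Signature) -> Vec<String> {
    let mut changes = Vec::new();
    for p in new.params.iter().filter(|p| !old.params.contains(*p)) {
        changes.push(format!("+param {p}"));
    }
    for p in old.params.iter().filter(|p| !new.params.contains(*p)) {
        changes.push(format!("-param {p}"));
    }
    if old.ret != new.ret {
        changes.push(format!("return {} → {}", old.ret, new.ret));
    }
    changes
}
```

Feeding it the two `validate` lines from the diff and joining the result with `", "` gives `+param strict: bool, return bool → Result<()>`, which is the kind of fact the model no longer has to infer from raw text.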
Output: feat(validator): add strict mode with fallible return type
🧠 The LLM Is the Last Step, Not the Only Step
Before the LLM generates a single token, CommitBee has already figured out:
- What kind of change this is. All test files → test. Only docs → docs. Dependencies → chore. These come from code analysis, so the model can't hallucinate types that don't match the evidence.
- What structurally changed per symbol. Which parameters were added, which return types changed, whether visibility was widened, whether unsafe was introduced.
- How files relate. Source file and test file both staged? Linked. Import added that matches a new symbol elsewhere? Tracked.
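The first item in that list can be sketched as a handful of path checks. This is an illustration only: the path rules and the `infer_type` helper are assumptions made up for this example, not CommitBee's actual heuristics.

```rust
// Hypothetical path-based heuristics; a real tool would likely use
// richer signals than file names alone.

fn is_test(path: &str) -> bool {
    path.starts_with("tests/") || path.contains("/tests/") || path.ends_with("_test.rs")
}

fn is_doc(path: &str) -> bool {
    path.starts_with("docs/") || path.ends_with(".md")
}

fn is_dependency_manifest(path: &str) -> bool {
    // Compare only the file name, so nested manifests count too.
    matches!(
        path.rsplit('/').next(),
        Some("Cargo.toml" | "Cargo.lock" | "package.json")
    )
}

/// Decide the commit type from the staged paths alone when the evidence
/// is unambiguous. Returns None when the files are mixed, leaving the
/// decision to later analysis.
fn infer_type(staged: &[&str]) -> Option<&'static str> {
    if staged.is_empty() {
        return None;
    }
    if staged.iter().all(|p| is_test(p)) {
        Some("test")
    } else if staged.iter().all(|p| is_doc(p)) {
        Some("docs")
    } else if staged.iter().all(|p| is_dependency_manifest(p)) {
        Some("chore")
    } else {
        None
    }
}
```

The point of deciding this outside the model is that the type becomes a constraint the prompt can state rather than something the LLM guesses at.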
The output then goes through a 7-rule validator. Wrong type, generic subject, hallucinated breaking change — it retries with targeted corrections. Up to 3 attempts.
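A validate-and-retry loop of this shape might look like the sketch below. Only two of the rules are shown, and their exact wording, the `GENERIC` phrase list, and the `generate_with_retries` helper are assumptions for illustration; the real validator's seven rules are not spelled out in this post. The `generate` closure stands in for the LLM call and receives the previous rejection reason as a hint.

```rust
/// Check a candidate message against two illustrative rules.
/// On failure, the error string doubles as the correction sent back to the model.
fn validate(message: &str, expected_type: &str) -> Result<(), String> {
    let (prefix, subject) = message
        .split_once(": ")
        .ok_or_else(|| "use the form `type(scope): subject`".to_string())?;

    // Strip an optional `(scope)` and a trailing `!` to get the bare type.
    let ty = prefix
        .split('(')
        .next()
        .unwrap_or(prefix)
        .trim_end_matches('!');
    if ty != expected_type {
        return Err(format!("type must be `{expected_type}`, not `{ty}`"));
    }

    // Hypothetical list of phrases treated as too generic.
    const GENERIC: [&str; 3] = ["update code", "improve things", "fix stuff"];
    if GENERIC.iter().any(|g| subject.contains(g)) {
        return Err("subject is too generic; name the symbol that changed".to_string());
    }
    Ok(())
}

/// Call `generate` up to `max_attempts` times, feeding each rejection
/// reason into the next attempt. Returns the first message that passes.
fn generate_with_retries<F>(
    mut generate: F,
    expected_type: &str,
    max_attempts: usize,
) -> Option<String>
where
    F: FnMut(Option<&str>) -> String,
{
    let mut correction: Option<String> = None;
    for _ in 0..max_attempts {
        let candidate = generate(correction.as_deref());
        match validate(&candidate, expected_type) {
            Ok(()) => return Some(candidate),
            Err(reason) => correction = Some(reason),
        }
    }
    None
}
```

Passing the specific failure back, rather than simply asking again, is what makes the retry targeted: the second attempt knows which rule the first one broke.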
There's also commit splitting, secret scanning, git hooks, and a few other things — but that's what the docs are for.
🚀 Try It
cargo install commitbee
ollama pull qwen3.5:4b
commitbee
Zero config. Ollama detected automatically. Run commitbee --show-prompt if you want to see exactly what the LLM receives — that's usually the "oh, that's why it's better" moment.
Written in Rust. CommitBee · GitHub
If refactor: update code has ever made you feel things — give it a shot. And if you do: I'd genuinely love to hear what you think. Bug reports, missing language support, feature ideas I haven't thought of yet — all of it. This project is early and feedback is the one thing I'm actually starving for right now.
This post was created with some help from Claude Sonnet 4.6.