Semgrep vs Kolega: a great floor, but a floor is not a finish line

#security #devops #sast #opensource

Semgrep is the one we get compared to most, and honestly the one we have the most time for, so let me be fair before I get to the but.

Where Semgrep is good

Semgrep is great. It is free, it is fast, the custom rule engine is genuinely good, and "drop it in CI in an afternoon" is a real thing you can do. If you are not running anything yet, run Semgrep today. It is the sensible first move and we would tell you that even though we would rather you used us. No notes on it as a starting point.

Where it stops

Here is the but. Semgrep does exactly what it says: it matches patterns. You give it a rule, it finds things that look like the rule. That is perfect for known signatures, enforcing your own conventions, and catching the obvious stuff.
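To make "finds things that look like the rule" concrete, here is a toy matcher built on Python's standard `ast` module. It is not Semgrep's engine or rule syntax, just a minimal sketch of the idea: describe a shape (a direct call to a named function) and report every place the code has that shape. `find_calls` and `SAMPLE` are names we made up for the illustration.

```python
import ast


def find_calls(source: str, func_name: str) -> list[int]:
    """Return the line numbers of direct calls to `func_name` in `source`."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # The "rule": a call expression whose callee is a bare name we care about.
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == func_name
    )


# Hypothetical snippet to scan; line 1 of the string is blank.
SAMPLE = '''
user_input = input()
result = eval(user_input)
safe = int(user_input)
'''

print(find_calls(SAMPLE, "eval"))  # [3]
```

The matcher knows what `eval(...)` looks like. It has no idea whether the surrounding code means what its author intended, which is the gap the rest of this post is about.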

It is structurally incapable of finding things that are not a pattern, and the vulns that actually end up in incident writeups usually are not patterns:

- Business logic flaws
- Auth that breaks across multiple files
- Second order injection
- An operator precedence bug that quietly turns a permission check into a no-op (a real example we found in a secrets manager, of all things)

No rule describes those, because they are not patterns. They are the code not meaning what the author thought it meant. You cannot write a Semgrep rule for "this is subtly wrong."
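For a feel of the operator precedence case, here is a hypothetical sketch. It is not the actual bug from the secrets manager, and the function and parameter names are invented. In Python, `and` binds tighter than `or`, so the first version silently skips the revocation check for owners.

```python
def can_read_buggy(is_owner: bool, is_admin: bool, revoked: bool) -> bool:
    # Intended: (owner or admin) and not revoked.
    # Actually parsed as: is_owner or (is_admin and not revoked),
    # so the revocation check never applies to owners.
    return is_owner or is_admin and not revoked


def can_read_fixed(is_owner: bool, is_admin: bool, revoked: bool) -> bool:
    # Explicit parentheses make the revocation check apply to everyone.
    return (is_owner or is_admin) and not revoked


# An owner whose access has been revoked:
print(can_read_buggy(is_owner=True, is_admin=False, revoked=True))  # True
print(can_read_fixed(is_owner=True, is_admin=False, revoked=True))  # False
```

Both versions are valid, tidy-looking boolean expressions. The difference only shows up once you know what the check was supposed to mean.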

The rare case where we do not have to hand wave

Semgrep is literally on our benchmark. RealVuln is an open benchmark: 676 real vulnerabilities across 26 production repositories, plus 120 false positive traps to catch tools that flag everything to inflate recall.
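Why the traps matter is easier to see with numbers. The sketch below is not RealVuln's scoring formula, and the finding IDs and counts are made up for the example. It only shows that a tool flagging everything gets perfect recall, and that measuring trap hits alongside recall exposes it.

```python
def score(flagged: set[str], real: set[str], traps: set[str]) -> dict[str, float]:
    """Recall on real vulnerabilities, plus the share of traps the tool fell for."""
    return {
        "recall": len(flagged & real) / len(real),
        "trap_hit_rate": len(flagged & traps) / len(traps),
    }


# Hypothetical miniature benchmark: 10 real vulnerabilities, 4 traps.
real = {f"vuln-{i}" for i in range(10)}
traps = {f"trap-{i}" for i in range(4)}

# A tool that flags everything it sees.
flag_everything = real | traps

# A tool that finds 6 of the 10 real issues and falls for 1 trap.
selective = {f"vuln-{i}" for i in range(6)} | {"trap-0"}

print(score(flag_everything, real, traps))
print(score(selective, real, traps))
```

On recall alone the first tool wins, 1.0 against 0.6. Add the trap hit rate (1.0 against 0.25) and it is clear which one is actually discriminating.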

RealVuln
- 676 real vulnerabilities
- 26 production repos
- 120 false positive traps
- Semgrep score: ~17%
- fully open source

Semgrep sits near the bottom, around 17 percent. Not because it is a bad tool, but because pattern matching has a ceiling and that is the ceiling. Run it yourself against the benchmark; the repo is right there. We published it specifically so nobody has to take our word for it.

So which one

This is not "Semgrep bad." Semgrep is the floor everyone should have. We are the layer that catches what rules cannot see. Best case, you run both: Semgrep for the fast pattern sweep, us for the semantic stuff underneath. We are not trying to delete Semgrep from your stack, just the assumption that it is enough on its own.

Full breakdown and the benchmark: https://kolega.dev/compare/semgrep/