DEV Community

Vlad Kapitsyn

We've open-sourced our AI security scanner: it found 221 issues

Recently, mentions of Mythos, a model from Anthropic that reportedly found vulnerabilities even in OpenBSD, leaked online. This seemed like a good moment to share our own work on code-quality analysis and to show how underestimated this class of problems is in product development.

Besides security bugs that lead to leaks of private data or loss of control over a system, there is another strongly underestimated class of problems.

My team spent the last year building and improving processes at a fast-growing, 120-person startup with more than 60 engineers and a dozen different products. And, as is well known, to improve something you first have to measure it.

The first thing that became apparent was the constant desynchronization between product requirements, code, and documentation. It led to constant code rewrites, which burned roughly 70% of the teams' time and caused burnout among engineers and managers alike.

We decided to measure and fix it, and that is how the idea of a spec-drift scanner was born: a script that scans commits, tasks, product requirement docs, and development channels in Slack, analyzes them with an LLM, and produces a report on mismatches along with an offer to generate a fix.
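Conceptually, one pass of such a check boils down to pairing a requirement with the code change that claims to implement it and asking a model for a verdict. Here is a minimal Python sketch of that idea; the prompt wording and function names are hypothetical illustrations, not the scanner's actual implementation:

```python
def build_drift_prompt(spec_excerpt: str, commit_diff: str) -> str:
    """Assemble an LLM prompt asking whether a commit matches its requirement."""
    return (
        "You are a spec-drift auditor.\n\n"
        "Requirement:\n" + spec_excerpt + "\n\n"
        "Commit diff:\n" + commit_diff + "\n\n"
        "Answer with MATCH or DRIFT, then one sentence of justification."
    )

def parse_verdict(llm_reply: str) -> bool:
    """Return True when the model flags a discrepancy (a DRIFT verdict)."""
    return llm_reply.strip().upper().startswith("DRIFT")
```

The real scanner obviously does much more (collecting sources, batching, generating fixes), but every mismatch in the report ultimately comes from a comparison shaped like this one.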

The first version, put together in a couple of days, already impressed us with its results: scanning the first project surfaced 221 missed discrepancies over 90 days, and the team accepted 212 of them (96%) as valid. Moreover, this project already had a well-known paid AI code analyzer with a rabbit in the logo connected to it.

As a control test, we ran the scanner against a popular open-source project on GitHub, Wagtail. Even though it only had the code and the open issues in the repository as data, it found discrepancies there as well: over the last 30 days, 5 issues had been closed whose code changes partially diverged from the description.

We decided to put the scanner on GitHub and make it public, so anyone can try it: https://github.com/OneSpur/scanner

  • First, we believe in the Open Source approach and that the developer community can make a product much more useful than any single team.
  • Second, we realize that the scanner needs access to sensitive data, and the only way to build trust in such a product is to show its code and never touch other people's data.

Connecting the scanner to an LLM does not require an API key: it can work through Claude Code or through the Ollama client.
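For the Ollama route, everything stays on your machine: a locally running Ollama server exposes a REST endpoint at `http://localhost:11434/api/generate` that takes a model name and a prompt. The snippet below is a generic stdlib-only sketch of querying that endpoint, not the scanner's own code; the model name is just an example:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def ollama_payload(model: str, prompt: str) -> bytes:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def ask_ollama(model: str, prompt: str) -> str:
    """Send a prompt to a locally running Ollama server and return its reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=ollama_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `stream` set to `False`, the server returns one JSON object whose `response` field holds the full completion, which keeps the client trivially simple.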

You can connect a repository / Slack / project docs in two ways:

  • the quick way, in two clicks via authorization extensions;
  • the manual way, where you create access tokens for the script yourself for each application (this method takes about 5-7 minutes).

The scanning itself takes 1-2 minutes on average.


We would love feedback and stars on the repository :)
Also, if you are interested in following product updates, join our Discord server.
