I built a supply chain security scanner in Rust — here's what I learned

#npm #bunjs #security

If you've ever run npm install and thought "what exactly did I just put on my machine?", this one's for you.

The problem that got me started
A while back I was poking around the node_modules of a mid-sized project. It had 47 direct dependencies and... 841 transitive ones. Forty-seven became eight hundred and forty-one. No tool was giving me a clear picture of which ones were risky, which had active CVEs, or — worse — which had been quietly compromised.

Snyk and Dependabot exist, sure, but they're either paid or require you to hand your repository over to a third-party service. I wanted something that ran locally, offline, and without giving anyone access to my code.

That's how OpenSentinel was born.

What it does
One line: it analyzes your full dependency tree and tells you which packages are risky and why.

opse scan ~/projects/my-app
That launches an interactive TUI where you can navigate package by package, inspect each CVE, see what suspicious code patterns were detected, and export an SBOM if you need it for compliance.

For CI/CD it's just as simple:

opse analyze --format=json --severity=high,critical
It exits with a meaningful code (0=clean, 1=medium, 2=high, 3=critical) so your pipeline can block automatically.
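
To make the exit-code contract concrete, here is a minimal sketch of how the worst finding in a scan could map to a process exit code. This is not the actual OpenSentinel source: the `Severity` enum and the `exit_code` function are hypothetical names, and only the 0/1/2/3 mapping comes from the description above.

```rust
/// Severity levels, declared from least to most severe so that the
/// derived `Ord` lets us pick the worst one with `max()`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Medium,
    High,
    Critical,
}

/// Map the worst finding to an exit code:
/// 0 = clean, 1 = medium, 2 = high, 3 = critical.
fn exit_code(findings: &[Severity]) -> i32 {
    match findings.iter().max() {
        Some(Severity::Critical) => 3,
        Some(Severity::High) => 2,
        Some(Severity::Medium) => 1,
        // No findings at all: the scan is clean.
        None => 0,
    }
}
```

A caller would typically pass the result to `std::process::exit`, and a pipeline can then fail the build on any code at or above its chosen threshold.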

The parts I had the most fun building

The risk scoring system
I didn't want it to be just "has a CVE = bad". Reality is more nuanced. A package might have a high-severity CVE that doesn't apply to your setup, or it might have zero CVEs but be completely abandoned by its maintainer.

The scoring weighs 5 dimensions:

Dimension                      | Weight
Advisories (CVEs, GHSA, NVD)   | 40%
Malicious code patterns        | 20%
Version behavior changes       | 15%
Maintainer reputation          | 15%
Community reports              | 10%

There's one special case: if a package is in the known-malicious database with a score ≥ 0.8, the weighted formula gets skipped entirely and it goes straight to CRITICAL. There's no point averaging things out if we already know it's malware.
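
A minimal sketch of that scoring logic, assuming each dimension has already been normalized to a value between 0.0 and 1.0. The `Signals` struct, the `Risk` enum, and the `assess` function are hypothetical names for illustration; only the weights and the 0.8 override threshold come from the description above.

```rust
/// Per-dimension risk signals, each assumed to be normalized to 0.0..=1.0.
struct Signals {
    advisories: f64,
    malicious_patterns: f64,
    version_changes: f64,
    maintainer: f64,
    community: f64,
    /// Score from the known-malicious database, if the package is listed.
    known_malicious: Option<f64>,
}

#[derive(Debug, PartialEq)]
enum Risk {
    /// Known malware: the weighted formula was skipped.
    Critical,
    /// Weighted score in 0.0..=1.0.
    Score(f64),
}

fn assess(s: &Signals) -> Risk {
    // Special case: a strong known-malicious match short-circuits everything.
    if let Some(m) = s.known_malicious {
        if m >= 0.8 {
            return Risk::Critical;
        }
    }
    // Otherwise combine the five dimensions with the weights from the table.
    let score = 0.40 * s.advisories
        + 0.20 * s.malicious_patterns
        + 0.15 * s.version_changes
        + 0.15 * s.maintainer
        + 0.10 * s.community;
    Risk::Score(score)
}
```

Keeping the override as an early return, rather than as a very large weight, means no combination of healthy-looking signals can dilute a known-malware match.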

The known-malicious database embedded in the binary
Some npm packages have a documented history of being compromised or deliberately sabotaged: event-stream@3.3.6, ua-parser-js@0.7.29, colors@1.4.1, node-ipc during the war in Ukraine... the list is real and well-documented.

Instead of hitting an external API every time, I embedded the database directly into the binary at compile time using include_str!():

static BUNDLED_DB: &str = include_str!("../../data/known_malicious.json");
Compile-time. No internet needed. No latency. The binary knows from birth which packages are known bad actors.
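
Here is one way the embedded string could be turned into a lookup table the first time it is needed. Treat it as a sketch under two loudly stated assumptions: an inline string stands in for `include_str!` so the example compiles without the data file, and it uses one `name@version` entry per line instead of the real JSON (which would go through Serde). The function names are hypothetical.

```rust
use std::collections::HashSet;
use std::sync::OnceLock;

// Stand-in for `include_str!("../../data/known_malicious.json")`.
// Assumption: one `name@version` per line; the real file is JSON.
static BUNDLED_DB: &str = "event-stream@3.3.6\nua-parser-js@0.7.29\ncolors@1.4.1\n";

/// Parse the embedded text once, on first use, into a set of entries.
/// The entries borrow directly from the static string, so nothing is copied.
fn known_malicious() -> &'static HashSet<&'static str> {
    static SET: OnceLock<HashSet<&'static str>> = OnceLock::new();
    SET.get_or_init(|| {
        BUNDLED_DB
            .lines()
            .map(str::trim)
            .filter(|line| !line.is_empty())
            .collect()
    })
}

/// Check whether an exact package version is in the bundled list.
fn is_known_malicious(name: &str, version: &str) -> bool {
    known_malicious().contains(format!("{name}@{version}").as_str())
}
```

Because the data lives in the binary's read-only section, the only startup cost is the one-time parse, and later lookups are plain hash-set hits.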

Building the TUI with Ratatui
I'd never built a terminal UI in Rust before. Ratatui (the successor to tui-rs) turned out to be surprisingly ergonomic. The layout system works a lot like CSS Flexbox/Grid, but for the terminal.

The render loop runs on Tokio's blocking thread pool so it doesn't starve the async scan tasks.

AST-based detection with Tree-sitter
This was the most technically involved part, and my favorite. Instead of running regexes over source code (fragile, easy to evade), the tool uses Tree-sitter to parse the actual AST of the JavaScript/TypeScript and look for specific patterns:

process.env access followed by an HTTP call in the same scope
eval(Buffer.from(..., 'base64')) — the classic obfuscated malware trick
require() calls with dynamic paths
Crypto mining signatures (stratum+tcp://)
AST analysis is opt-in and activates with downloadSource: true in the config — it downloads the package tarball from the registry and scans it locally. Nothing leaves your machine.

Tech stack
Rust with Tokio for async
Ratatui + Crossterm for the TUI
Tree-sitter for AST analysis
SQLx with PostgreSQL (optional — works fine without a DB too)
Reqwest for the OSV, GitHub Advisories, and NVD APIs
Serde for all the JSON/TOML work
The PostgreSQL part is for caching advisories and maintainer metrics between scans. No Postgres? Things just don't get cached, everything still works.

The GitHub Action
A good CLI tool should be easy to drop into a pipeline. There's a composite GitHub Action:

    - uses: ./
      id: scan
      with:
        severity: high,critical
        fail-on: "2"
        github-token: ${{ secrets.GITHUB_TOKEN }}

It automatically leaves a comment on the PR with the results table, and updates it on every push instead of piling up new comments.

What I actually ran into
Running a TUI and async tasks at the same time is not obvious.
Ratatui needs to own the terminal in a tight render loop. Tokio needs its threads free for async work. The fix was spawn_blocking — the render loop runs on a dedicated OS thread, the scan tasks run on the async runtime, and they talk through an unbounded channel. Once I understood why that separation exists, a lot of Rust's async model clicked into place.
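
The shape of that separation can be sketched with nothing but the standard library. To be clear about the swap: this uses `std::thread` and `std::sync::mpsc` in place of Tokio's `spawn_blocking` and a Tokio channel, and it collects strings instead of drawing with Ratatui, so it illustrates the pattern rather than the real code. `scan_with_ui` is a hypothetical name.

```rust
use std::sync::mpsc;
use std::thread;

/// Run a pretend scan on worker threads while a dedicated thread "renders".
/// Returns the lines the UI thread drew, in the order they arrived.
fn scan_with_ui(packages: Vec<String>) -> Vec<String> {
    // `mpsc::channel` is unbounded: senders never block on a slow UI.
    let (tx, rx) = mpsc::channel::<String>();

    // Dedicated OS thread that owns the "terminal" and drains the channel.
    // The loop ends once every sender has been dropped.
    let ui = thread::spawn(move || {
        let mut drawn = Vec::new();
        for name in rx {
            drawn.push(format!("scanned {name}"));
        }
        drawn
    });

    // Scan side: in the real tool these would be async tasks on the runtime.
    let workers: Vec<_> = packages
        .into_iter()
        .map(|pkg| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(pkg).expect("UI thread should still be receiving");
            })
        })
        .collect();

    for worker in workers {
        worker.join().expect("worker thread panicked");
    }
    // Drop the last sender so the UI loop sees the channel close.
    drop(tx);

    ui.join().expect("UI thread panicked")
}
```

The UI thread never touches scan state directly. It only sees messages, which is what keeps the render loop and the scan work from blocking each other.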

include_str!() solved a problem I was overcomplicating.
I spent way too long thinking about how to ship the known-malicious database — separate file? download on first run? bundled as a dependency? Then I remembered that include_str!() embeds the file contents directly into the binary at compile time. One line. Works offline. No install step. Sometimes the simplest thing is actually the right thing.

Not having a database shouldn't break anything.
Early on, if PostgreSQL wasn't configured, the whole scan failed. That's the wrong default for a CLI tool — most people running it locally won't have a database set up. I reworked the orchestrator so the DB is genuinely optional: advisories still get fetched, scoring still runs, everything still works. The database just adds caching and persistence on top.
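
One way to express "the DB is genuinely optional" is to hold the cache as an `Option` and treat its absence as a normal path rather than an error. The sketch below assumes a hypothetical `AdvisoryCache` trait, an in-memory `MemoryCache` standing in for the PostgreSQL-backed one, and an `Orchestrator` whose fields and methods are invented for illustration.

```rust
use std::collections::HashMap;

/// Anything that can remember advisories between scans.
trait AdvisoryCache {
    fn get(&self, package: &str) -> Option<Vec<String>>;
    fn put(&mut self, package: &str, advisories: &[String]);
}

/// In-memory stand-in for a database-backed cache.
#[derive(Default)]
struct MemoryCache(HashMap<String, Vec<String>>);

impl AdvisoryCache for MemoryCache {
    fn get(&self, package: &str) -> Option<Vec<String>> {
        self.0.get(package).cloned()
    }
    fn put(&mut self, package: &str, advisories: &[String]) {
        self.0.insert(package.to_string(), advisories.to_vec());
    }
}

struct Orchestrator {
    /// `None` means no database is configured; scans still work.
    cache: Option<Box<dyn AdvisoryCache>>,
}

impl Orchestrator {
    /// Return advisories for a package, using the cache only if one exists.
    fn advisories_for(
        &mut self,
        package: &str,
        fetch: impl Fn(&str) -> Vec<String>,
    ) -> Vec<String> {
        // Cache hit: skip the network entirely.
        if let Some(hit) = self.cache.as_ref().and_then(|c| c.get(package)) {
            return hit;
        }
        // No cache or a miss: fetch, then store the result if we can.
        let fresh = fetch(package);
        if let Some(cache) = self.cache.as_mut() {
            cache.put(package, &fresh);
        }
        fresh
    }
}
```

With `cache: None` every call goes through `fetch`; with a cache configured, a repeat lookup for the same package is served from storage. Either way the caller gets the same result.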

The borrow checker catches real bugs, not just theoretical ones.
At one point I had scan results being mutated from two places at the same time: one path was updating scores, another was building the output. The compiler refused to compile it. I thought it was being annoying. It wasn't: that was a real data race that would've caused silently incorrect output in a concurrent scan. The error message was actually pointing at the exact problem.

Where it stands
It works today for Node.js and Bun projects. Python, Go, and Rust support is on the roadmap.

The code is on GitHub — contributions are very welcome, especially around:

More entries in the known-malicious database
Parsers for other ecosystems
Integration tests