benjamin

A GPL dep can quietly poison your closed-source product. I built a tiny offline tool that catches it.

A few months ago a lawyer asked our team a simple question: "Can you prove
nothing in this product is GPL?"
We couldn't — not quickly. A couple thousand
transitive deps across Node and Python services, and the honest answer was "uh,
probably?", which is not what you want to tell a lawyer.

So I went looking for a tool to just tell me, locally, right now, which
licenses my dependencies carry and which ones are a problem. What I found:

  • license-checker — the npm default, ~900K weekly downloads — has been unmaintained for years, and it dumps raw license strings; it won't tell you GPL is a bigger deal than MPL.
  • Snyk, FOSSA, Black Duck — all good, all want a signup, an API token, and a network round-trip before classifying a folder already on my disk.

But the info I needed was already in my node_modules: every package ships a
package.json license field, every Python wheel a METADATA file. Why am I
uploading anything?
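
For the Node side, that local read can be sketched in a few lines of Python. This is an illustration, not licsniff's source: the name `read_node_licenses` is made up for this example, and tolerating the legacy object form of the `license` field is an assumption about what a scanner would want to handle.

```python
import json
from pathlib import Path


def read_node_licenses(node_modules):
    """Yield (name, version, license) for each package manifest found on disk."""
    root = Path(node_modules)
    for manifest in sorted(root.glob("**/package.json")):
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest: skip it
        if not isinstance(data, dict) or "name" not in data:
            continue  # not a package manifest (for example a fixture file)
        license_field = data.get("license")
        # Some old packages use {"type": "MIT", ...} instead of a plain string.
        if isinstance(license_field, dict):
            license_field = license_field.get("type")
        yield data["name"], data.get("version", "?"), license_field or "(none)"
```

Calling `list(read_node_licenses("node_modules"))` returns plain tuples, with no network access involved at any point.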

So I built licsniff — a zero-dependency CLI that reads those files locally,
classifies each license into a risk tier, and exits. No account, no network,
nothing to set up.

npx licsniff
PACKAGE          VERSION  LICENSE              RISK
some-gpl-lib     2.1.0    GPL-3.0              strong-copyleft
mystery-pkg      0.0.3    (none)               unknown
copyleft-utils   1.4.0    LGPL-2.1             weak-copyleft
left-pad         1.3.0    MIT                  permissive
fast-json        3.1.4    (MIT OR Apache-2.0)  permissive

Riskiest first. The line you actually need to worry about is at the top.

Tiers, not just strings

The whole point is that "GPL-3.0" is only useful if you know what bucket it falls
into. licsniff sorts every license into one of five tiers:

  • permissive — MIT, ISC, BSD, Apache-2.0, 0BSD, Unlicense, CC0… use freely.
  • weak-copyleft — LGPL-*, MPL-2.0, EPL-*, CDDL-*. File/linking obligations.
  • strong-copyleft — GPL-*, AGPL-*. Can force you to open-source your code.
  • proprietary — UNLICENSED, SEE LICENSE IN …. Not open source at all.
  • unknown — missing or unrecognized. The scariest, honestly — you don't even know what you're shipping.
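
As a rough illustration of the bucketing, here is a minimal sketch built only from the patterns in the list above. The table and the `classify_tier` name are hypothetical, the wildcards on BSD and CC0 are a guess at what "BSD" and "CC0…" cover, and the real classifier presumably knows many more identifiers.

```python
from fnmatch import fnmatchcase

# Patterns copied from the tier list in this post; not licsniff's real table.
TIER_PATTERNS = [
    ("strong-copyleft", ["GPL-*", "AGPL-*"]),
    ("weak-copyleft", ["LGPL-*", "MPL-2.0", "EPL-*", "CDDL-*"]),
    ("permissive", ["MIT", "ISC", "BSD*", "Apache-2.0", "0BSD", "Unlicense", "CC0*"]),
    ("proprietary", ["UNLICENSED", "SEE LICENSE IN *"]),
]


def classify_tier(license_id):
    """Return the tier for one license identifier, or 'unknown'."""
    if not license_id:
        return "unknown"  # missing license field
    candidate = license_id.strip()
    for tier, patterns in TIER_PATTERNS:
        # fnmatchcase matches the whole string, so "LGPL-2.1" never hits "GPL-*".
        if any(fnmatchcase(candidate, pattern) for pattern in patterns):
            return tier
    return "unknown"  # present but unrecognized
```

The match is deliberately case-sensitive: `Unlicense` (a public-domain dedication) and `UNLICENSED` (npm's marker for "no license granted") land in opposite tiers, so folding case would merge two very different answers.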

SPDX expressions, parsed properly

Real metadata isn't clean — you get (MIT OR Apache-2.0), GPL-3.0 AND MIT,
GPLv3, GPL-3.0+, GPL-3.0-only, Apache License 2.0. licsniff normalizes
all of it and evaluates the boolean expressions the way they actually work:

  • OR → the least restrictive option wins (you pick the friendly one), so (MIT OR GPL-3.0) is permissive.
  • AND → the most restrictive wins, so GPL-3.0 AND MIT is strong-copyleft. (You don't want a false "permissive" gating your CI here.)
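
The two rules above can be sketched as a tiny recursive-descent evaluator. Everything here is an assumption made for illustration: `evaluate`, `tier_of`, and the `RISK_ORDER` ranking are hypothetical names, the stand-in classifier only knows a handful of ids, and the position of proprietary in the ranking is a guess (the sample output earlier only shows unknown sitting between strong-copyleft and weak-copyleft). AND binds tighter than OR, as in the SPDX expression grammar.

```python
import re

# Least to most risky; an assumed ordering for this sketch.
RISK_ORDER = ["permissive", "weak-copyleft", "proprietary", "unknown", "strong-copyleft"]
RANK = {tier: index for index, tier in enumerate(RISK_ORDER)}


def tier_of(license_id):
    """Stand-in classifier covering only the ids used in this example."""
    if license_id.startswith(("GPL-", "AGPL-")):
        return "strong-copyleft"
    if license_id.startswith(("LGPL-", "MPL-", "EPL-", "CDDL-")):
        return "weak-copyleft"
    if license_id in {"MIT", "ISC", "Apache-2.0", "0BSD"}:
        return "permissive"
    return "unknown"


def evaluate(expression):
    """Return the effective tier of an expression built from AND, OR, and parentheses."""
    tokens = re.findall(r"\(|\)|[^\s()]+", expression)
    pos = 0

    def parse_or():
        nonlocal pos
        tier = parse_and()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            # OR: the licensee chooses, so the least risky branch wins.
            tier = min(tier, parse_and(), key=RANK.__getitem__)
        return tier

    def parse_and():
        nonlocal pos
        tier = parse_atom()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            # AND: every term applies, so the most risky one wins.
            tier = max(tier, parse_atom(), key=RANK.__getitem__)
        return tier

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] == ")":
            raise ValueError("expected a license id or '('")
        token = tokens[pos]
        pos += 1
        if token == "(":
            tier = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return tier
        return tier_of(token)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token: {tokens[pos]!r}")
    return result
```

With this sketch, `evaluate("(MIT OR GPL-3.0)")` gives `permissive` and `evaluate("GPL-3.0 AND MIT")` gives `strong-copyleft`. It does not handle `WITH` exceptions or normalize spellings like GPLv3; those would need their own steps.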

The flag that earns its keep: --fail-on

This is what made it stick on our team. Drop one line in CI:

licsniff --fail-on strong-copyleft

It exits 1 the moment any dependency lands at or above that tier, so a GPL
transitive dep can never sneak in through an npm install again. There's also
--summary for counts and --json | jq for everything else.
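
The "at or above that tier" check boils down to a comparison against an ordered list of tiers. A minimal sketch, assuming the ordering below (the sample output puts unknown between strong-copyleft and weak-copyleft; where proprietary sits is a guess) and using a hypothetical `gate` function that is not part of licsniff:

```python
# Least to most risky; an assumed ordering for this sketch.
RISK_ORDER = ["permissive", "weak-copyleft", "proprietary", "unknown", "strong-copyleft"]


def gate(dependency_tiers, fail_on):
    """Return (exit_code, offenders) for a {package: tier} mapping."""
    threshold = RISK_ORDER.index(fail_on)
    offenders = sorted(
        name
        for name, tier in dependency_tiers.items()
        if RISK_ORDER.index(tier) >= threshold
    )
    # A non-zero exit code is what makes the CI step fail.
    return (1 if offenders else 0), offenders
```

A CLI wrapper would print the offenders and pass the code to `sys.exit`. Under this ordering, a threshold of `strong-copyleft` flags only GPL-family packages, while a threshold of `unknown` also catches packages with missing license data.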

It runs on both ecosystems

Half our services are Node, half Python, so licsniff ships on both registries.
Same tool, same tiers; each version audits its own ecosystem:

npx licsniff           # Node — scans node_modules, zero deps
pipx run licsniff      # Python — scans site-packages, pure stdlib

The Python build reads *.dist-info/METADATA, including the modern PEP 639
License-Expression: field newer wheels use. Both ports share the exact same
classifier, tested against the same vectors, so they tier a license
byte-for-byte identically.
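
METADATA files use an email-style header format, so the standard library can parse them without any dependency. The sketch below is a guess at a sensible lookup order (License-Expression first, then the older License field, then trove classifiers); the function name `license_from_metadata` and the fallback order are assumptions, not licsniff's documented behavior.

```python
from email.parser import HeaderParser


def license_from_metadata(metadata_text):
    """Pull a license string out of the text of a *.dist-info/METADATA file."""
    headers = HeaderParser().parsestr(metadata_text)

    # PEP 639: an SPDX expression, present in newer wheels.
    expression = headers.get("License-Expression")
    if expression and expression.strip():
        return expression.strip()

    # Older free-form field; it can hold a whole license text, so keep line one.
    legacy = headers.get("License")
    if legacy and legacy.strip():
        return legacy.strip().splitlines()[0]

    # Last resort: a "License :: ..." trove classifier.
    for classifier in headers.get_all("Classifier") or []:
        if classifier.startswith("License ::"):
            return classifier.split("::")[-1].strip()
    return None
```

To scan an environment, one could feed it each file matched by `Path(site_packages).glob("*.dist-info/METADATA")`, where `site_packages` is whatever directory is being audited.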

A few design notes

  • One pure function at the core. classifyLicense(idOrName) → {tier, spdx} has no I/O, no clock, no globals; the CLI is just a thin folder-reader around it. That's why the Node and Python builds can be proven identical — they run one shared test table.
  • Offline and read-only by design. Never writes a file, never opens a socket. Safe in air-gapped CI, on a client's machine, everywhere.
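
A shared test table can be as simple as a JSON list that each port loads and replays against its own classifier. The vector format, the `check_vectors` helper, and the stand-in classifier below are all hypothetical; the post does not show the real table.

```python
import json

# Hypothetical vector format: one input string and the tier it must map to.
VECTORS_JSON = """
[
  {"input": "MIT", "tier": "permissive"},
  {"input": "GPL-3.0", "tier": "strong-copyleft"},
  {"input": "LGPL-2.1", "tier": "weak-copyleft"},
  {"input": "", "tier": "unknown"}
]
"""


def check_vectors(classify, vectors_json=VECTORS_JSON):
    """Return (input, expected, actual) for every vector classify() gets wrong."""
    failures = []
    for case in json.loads(vectors_json):
        actual = classify(case["input"])
        if actual != case["tier"]:
            failures.append((case["input"], case["tier"], actual))
    return failures


def standin_classify(license_id):
    """Toy pure function standing in for a port's classifier."""
    if license_id.startswith("GPL-"):
        return "strong-copyleft"
    if license_id.startswith("LGPL-"):
        return "weak-copyleft"
    return "permissive" if license_id == "MIT" else "unknown"
```

An empty list from `check_vectors(standin_classify)` means the port agrees with the table. Because the classifier is a pure function, the same JSON file can drive the equivalent loop in the Node build; agreement is only demonstrated for the inputs the table contains.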

Try it / break it

Code, issues, and the full README:

It's MIT and small. I'd genuinely like to know which license string it
mis-tiers — paste me a weird one from your node_modules and I'll add it to the
vectors.

How are you checking your dependency licenses today — or are you, like past me,
just hoping no lawyer asks?
