The day lib/ stopped being readable
One Sunday afternoon, I run ls lib/ in Rembrandt, the ERP I've been coding alone for L'Atelier Palissy for a month. When I started, lib/ fit on one MacBook screen; I could scan the names at a glance and know what each one did. That Sunday, I count forty-one files. Thirteen are adapters to third-party services — Supabase, Gmail, Brevo, Slack, Stripe, Meta CAPI, QStash, Push, PennyLane — and each one is between 120 and 260 lines of honest plumbing. Nothing crashes, everything works. And yet the folder that used to invite a read-through now asks me to scroll.
I realize that every new integration I asked Claude Code for crystallized into an adapter file of this size, because an adapter is easy to generate: clear signature, no business invariant to protect, no test to write. My agent did exactly what I asked, every time, and through daily sedimentation it produced the kind of codebase that Sculley and his Google co-authors described in 2015 as ending up, in pathological systems, as 95% glue code for 5% business logic.
A few weeks ago, Gaspard — our long-time IT contractor — dropped by the office for a reason that escapes me today. I showed him an early lib/ on screen, proud of the progress. He scrolled for three seconds without sitting down, and said without looking up: « C'est de la plomberie, ça. » — That's plumbing, right there. I nodded the way you nod at a technical remark you don't quite understand yet, assuming he meant a detail. Six weeks later I understand that he had just named, in two words, what Sculley and the technical-debt literature have been trying to articulate for ten years.
If you have 30 seconds. Glue code (adapters, format conversions, plumbing to external APIs) proliferates silently when you code fast, and even faster when you code with an LLM that happily produces adapters. The countermeasure: measure the glue/business ratio in lib/ with a 130-line script, and hook the CI to non-regression rather than an absolute threshold. This article gives the script, the CI pattern, and why non-regression beats a cap. Useful if you run a codebase that talks to many external services.
The framing I missed for three weeks
The paper Hidden Technical Debt in Machine Learning Systems (Sculley et al., NIPS 2015) describes a particular debt in ML systems: the code useful to the model is a tiny box at the center of a large plumbing ecosystem — data ingestion, normalization, serving, monitoring. The typical ratio they observe in production is on the order of 5% model code to 95% plumbing. The authors don't claim glue is bad in itself; they claim that when it isn't named, it gets paid in hidden costs: every refactor becomes acrobatic, every migration is negotiated with ten files that shouldn't be concerned.
The framing is ML but the form extends far beyond. As soon as a system talks to five or six external services, it produces glue. A vertical ERP with six third-party integrations is structurally condemned to manufacture it, and the risk isn't that there is some — there will be — but that it gets counted as business code in the developer's mental equation. The day I reread lib/supabase-paginate.ts thinking it's a business brick, I've lost. It's an adapter, it must remain an adapter, it must be named as such, and its volume must enter a metric whose curve is entitled to worry me.
Nine years before Sculley, Moseley and Marks had laid down in Out of the Tar Pit (2006) the founding distinction that gives the problem its grid: essential complexity, which comes from the business, and accidental complexity, which comes from the technical solution chosen. Glue, in this grid, is accidental complexity in its purest form. It serves no business requirement; it only solves the fact that two systems don't speak the same language. It's this asymmetry — essential is paid once, accidental is paid at every reading, every refactor, every migration — that explains why glue becomes dangerous well before it becomes dominant.
The script, one hundred and thirty lines
I wrote scripts/glue-ratio.sh one afternoon, a bit against myself. Two hardcoded lists: the lib/*.ts files that are glue, and the ones that are business logic. Every new file I add must be consciously classified into one of the two. Nothing is automatic, and that's the only way every addition decision is a named decision.
GLUE_FILES=(
"lib/supabase.ts" "lib/supabase-admin.ts" "lib/supabase-server.ts"
"lib/supabase-paginate.ts" "lib/gmail.ts" "lib/gmail-api.ts"
"lib/brevo.ts" "lib/slack.ts" "lib/stripe.ts"
"lib/meta-capi.ts" "lib/pennylane.ts" "lib/qstash.ts"
"lib/push.ts" "lib/rate-limit.ts" "lib/cache.ts"
"lib/webhook-idempotency.ts" "lib/wordpress.ts" "lib/utils.ts"
"lib/database.types.ts"
)
BUSINESS_FILES=(
"lib/rembrandt.ts" "lib/rembrandt-tool-defs.ts"
"lib/rembrandt-tool-handlers.ts" "lib/lead-pipeline.ts"
"lib/email-outbox.ts" "lib/email-templates.ts"
"lib/permissions.ts" "lib/contacts.ts"
"lib/calendrier.ts" "lib/segments.ts"
)
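Forcing that classification can itself be mechanical. Here is a sketch of a guard that fails when a lib/ file appears in neither list — this guard is my reconstruction, not the article's script, and the demo paths stand in for the real repo:

```shell
# Sketch: every *.ts file under lib/ must be classified as glue or business.
# Demo setup standing in for a real repo (paths are illustrative).
mkdir -p demo/lib
touch demo/lib/stripe.ts demo/lib/contacts.ts demo/lib/mystery.ts

GLUE_FILES=("demo/lib/stripe.ts")
BUSINESS_FILES=("demo/lib/contacts.ts")

unclassified=()
for f in demo/lib/*.ts; do
  known=no
  for g in "${GLUE_FILES[@]}" "${BUSINESS_FILES[@]}"; do
    if [ "$f" = "$g" ]; then known=yes; break; fi
  done
  if [ "$known" = no ]; then unclassified+=("$f"); fi
done

if [ "${#unclassified[@]}" -gt 0 ]; then
  echo "UNCLASSIFIED: ${unclassified[*]}"   # in CI this would exit 1
else
  echo "OK: every lib file is classified"
fi
```

The point of the double loop is that a new file fails loudly until someone writes its name into one of the two arrays — the friction is the feature.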
The rest of the script sums lines, computes two percentages (global, and excluding database.types.ts), and prints a short verdict. The --metric mode only outputs the types-excluded ratio, designed to be compared in CI.
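The summing half needs nothing exotic. A minimal sketch of what it might look like — the --metric flag mirrors the article's, but the helper name, the integer arithmetic, and the demo files are my reconstruction:

```shell
# Sketch of the counting half of glue-ratio.sh: sum wc -l over each list,
# then compute an integer glue percentage. Demo files stand in for lib/.
mkdir -p demo/lib
printf 'line\n%.0s' {1..300} > demo/lib/stripe.ts     # 300 lines of "glue"
printf 'line\n%.0s' {1..700} > demo/lib/contacts.ts   # 700 lines of "business"

GLUE_FILES=("demo/lib/stripe.ts")
BUSINESS_FILES=("demo/lib/contacts.ts")

count_lines() {  # sum wc -l over a list of files
  local total=0 f n
  for f in "$@"; do
    n=$(wc -l < "$f")
    total=$((total + n))
  done
  echo "$total"
}

glue=$(count_lines "${GLUE_FILES[@]}")
business=$(count_lines "${BUSINESS_FILES[@]}")
total=$((glue + business))
ratio=$((100 * glue / total))   # integer percentage, enough to steer by

if [ "${1:-}" = "--metric" ]; then
  echo "$ratio"                 # bare number, consumed by the CI gate
else
  echo "Glue: ${glue} lines (${ratio}%)  Business: ${business} lines  Total: ${total}"
fi
```

The bare-number --metric output matters: it lets the CI gate compare two runs with plain shell arithmetic instead of parsing the human report.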
The auto-generated types trap
lib/database.types.ts is a file auto-generated by Supabase from the schema. It weighs over twenty thousand lines in Rembrandt, and since it is entirely glue (TypeScript definitions of tables, nothing business), it tips the global ratio above 60% if counted. That would be accurate, and useless, because no one makes a decision by rereading that file. The rule I eventually settled on: the reference ratio is excluding database.types.ts. The script exposes both figures — global for the record, types-excluded to steer by. Current repo ratio: 28% excluding types on main. Target I set myself: under 25% durably.
glue/business ratio — lib/
==========================
Glue: 25,173 lines (67%)
Business: 12,487 lines (33%)
Total: 37,660 lines
(excluding database.types.ts: 4,856 glue / 17,343 total = 28%)
OK: glue excl. types below 30% alert threshold (target 25%)
The CI that blocks regression, not an absolute
Here's the choice that took me time to make, and that matters more than the script itself. CI guardrails are often written with an absolute threshold: if (glue > 30%) fail. It's seductive because it's simple, and it's a bad idea. A mature project at 35% glue that holds can be perfectly healthy. A project at 18% rising to 22% in a week is drifting. The absolute threshold doesn't see the drift, it only sees the arrival.
I hooked the CI to non-regression between HEAD and origin/main, with a tolerance of zero points. Any PR that raises the ratio above main fails, and the message asks the real question: "are you adding business logic that justifies more plumbing, or are you adding an adapter that nobody asked for?". If the former, you add business alongside, the ratio drops, the PR passes. If the latter, you look to extract, to share, to rename.
# scripts/glue-ratio-check.sh (excerpt)
set -euo pipefail
BASE_REF="${BASE_REF:-origin/main}"   # ref the PR is compared against
TOLERANCE="${TOLERANCE:-0}"          # allowed increase, in percentage points

# Ratio at HEAD, with the current classification lists.
current=$(bash scripts/glue-ratio.sh --metric)

# Re-measure lib/ as it exists on the base ref, in a throwaway copy,
# using the same script so both figures share one classification.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/scripts"
cp scripts/glue-ratio.sh "$tmp/scripts/"
git archive "$BASE_REF" lib/ | tar -x -C "$tmp"
base=$(cd "$tmp" && bash scripts/glue-ratio.sh --metric)

delta=$((current - base))
if [ "$delta" -gt "$TOLERANCE" ]; then
  echo "FAIL: glue ratio increased by ${delta} pts (tolerance +${TOLERANCE})."
  echo "Check if glue can be extracted into lib/mappings/ or lib/adapters/,"
  echo "or if a new file is miscategorized in scripts/glue-ratio.sh."
  exit 1
fi
A secondary safety net, for pathological cases: above 40%, the script enters alert mode in the human output, which forces a team debate even if non-regression passes. But it's a net, not the main metric.
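That net is a one-liner on top of the human-readable output. A possible shape — the 40% figure is from the article, the variable names and wording are mine:

```shell
# Sketch of the secondary net: flag, in the human output only, a
# types-excluded ratio above the 40% pathological threshold.
ratio=43             # stand-in; the real script computes this from lib/
ALERT_THRESHOLD=40

if [ "$ratio" -gt "$ALERT_THRESHOLD" ]; then
  verdict="ALERT: glue excl. types at ${ratio}%, above the ${ALERT_THRESHOLD}% pathological threshold"
else
  verdict="OK: glue excl. types below ${ALERT_THRESHOLD}% alert threshold"
fi
echo "$verdict"
```

Note that it prints an alert rather than exiting nonzero: the non-regression gate stays the only thing that blocks a PR, and this line only forces the conversation.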
Why a rule written in CLAUDE.md isn't enough
I had first written a rule in my CLAUDE.md, phrased roughly as "prefer business logic over adapters, keep lib/ thin". That rule prevented nothing. It stood against no fact, and an adapter that seems necessary in the moment always wins against an abstract sentence read at the top of a constraints file. A numerical metric, on the other hand, puts a material fact in front of the writer: +3 points on this PR. The debate becomes concrete, the rule becomes enforceable, and the writer — human or LLM — becomes aware of what they are doing. That's exactly what a CLAUDE.md cannot produce as long as it remains text.
There's a lesson here that goes beyond the metric itself. Disciplines that hold all have a number the machine computes for you. Not an intention, not a principle, not a wish — a number. The rest erodes at the pace of developer fatigue and agent complacency.
What you can copy into your project
Both scripts and a CI workflow example live in the companion repo, MIT license: github.com/michelfaure/rembrandt-samples.
Four directly applicable moves if your codebase has many external integrations:
- Maintain two hardcoded lists in a shell script, glue and business, and force every new addition to be classified into one or the other. No automatic detection — the friction is the point
- Exclude auto-generated files from the denominator. Expose them as a global figure for the record, but steer on the types-excluded ratio
- Hook the CI to non-regression, not an absolute threshold. Zero-point tolerance, message that asks the real question of the PR writer
- Secondary safety net at 40% for pathological cases, but it's a net, not the rule
And a broader discipline: anything that isn't measured drifts. A rule in a constraints file is read, then forgotten; a numerical metric that blocks a PR is bypassed consciously or not, but it is seen. LLMs are no exception to this rule — they make it more urgent, because they produce faster what they aren't asked to moderate.
And you — which metrics actually drive your PRs, and which have stayed intentions? I do read the comments.
Companion code: rembrandt-samples/glue-ratio/ — the measurement script, the non-regression CI gate, and the GitHub Actions workflow, MIT, copy-pastable.
