Phil Rentier Digital

Originally published at rentierdigital.xyz

You Have a Third Pile of Technical Debt. Nobody Has Built a Tool to Find It.

You launch your usual app. ERROR. You're already pissed, because you hadn't planned on doing maintenance today. So you dig. In my case, the terminal threw ECONNREFUSED 127.0.0.1:46279 at my face, and it took me exactly thirty seconds to understand that nobody, anywhere, was going to deal with this. No status page. No ticket to file. No SLA to wave around. The free service that one piece of my pipeline depended on had just gone down, and the only human on Earth who cared was me.

Of course I had never signed a contract with them. I was using a free service. I never showed up on any of their dashboards. And yet at 9:17 that Monday morning, they owed me something they didn't even know they owed me. Or actually the other way around: I owed them. I had been running up a tab for months without noticing, and the creditor had just called it in.

TL;DR: You know about the technical debt you wrote. You measure the technical debt you inherited from your dependencies. There's a third pile nobody talks about: the debt you import every time you wire a free SaaS into your pipeline. It doesn't show up on any dashboard. No linter catches it. Dependabot doesn't see it. Audit your imported debt before a Monday morning audits it for you.

The Three Piles

There are three kinds of technical debt and most teams only count two of them.

The first one is the debt you wrote. The Friday afternoon hack that survived. The TODO from 2022 that became load-bearing. The if-statement that was supposed to be temporary and now has its own commit history. You know it exists because you wrote it, and you know roughly where it lives because the bad smell follows you around. Your linter catches a slice of it. Code review catches another slice. The rest sits in your head.

The second pile is the debt you inherited. Lockfiles full of transitive dependencies you never picked. Libraries that haven't shipped in three years but still install. Packages with two maintainers and one of them just moved to a farm. You know this pile exists too, because there are tools for it. npm audit, Dependabot, Snyk, Renovate. They scream at you every Monday morning whether you want it or not. The screaming is annoying, but at least somebody is screaming.

Then there's the third pile. The pile you don't have a name for, because nobody named it. The free service you plug into one step of your pipeline because it was easier than building the thing yourself. The hosted API that processes a piece of your data because they had a generous free tier. The webhook endpoint that does a conversion you were never going to write yourself. None of this debt is in your codebase. None of it shows up in package.json. No tool monitors it. You don't even count it as a dependency in your own head, because dependencies are things you import and these things are things you call.

But they are dependencies. And they are debt. The debt is just sitting somewhere else, on somebody else's server, with somebody else's incentives. You didn't issue it. You imported it.

The debt you didn't issue is still your debt the moment it falls.

Why Nobody Sees It

The reason nobody sees this third pile is structural, not stupid.

Every tool we built for measuring technical debt looks inside your codebase. Linters parse your files. Dependabot reads your manifests. Snyk scans your lockfile. Code review happens on your PRs. The whole observability stack assumes the debt lives in artifacts you own and can grep. Imported debt does not live in artifacts you own. It lives at a URL. And a URL is not an asset, it's a promise.

A promise made by an entity that owes you nothing.

There's also a softer reason. You didn't import these things on purpose. You imported them on a Tuesday afternoon when you needed a quick conversion, googled it, found a free endpoint, pasted the URL into a config file, and moved on. It felt like using a tool, not like signing a contract. Nothing in your editor told you that you had just bolted a stranger's mortality to your pipeline. So you didn't update any mental ledger. There was no ledger.

The funny thing is the rest of the industry is perfectly aware that this stuff breaks. A 2025 survey of 1,000 senior tech executives found that 93% worry about downtime impact and 100% experienced outage-related revenue loss that year. The list of public incidents reads like a horror catalog: AWS us-east-1 going down for hours and dragging dependent SaaS providers along, Cloudflare WAF rules wiping out a chunk of global traffic in a single push, Azure configuration errors taking out Microsoft 365 and Xbox at the same time. We know outages happen. We track them publicly. We write postmortems.

But we track them after. There is no tool that walks into your repo and says "you depend on a fistful of things that could disappear tomorrow and you have a plan for zero of them." That tool does not exist because the inputs are not in your repo. They are scattered across HTTP calls in random files, hardcoded URLs in config, fetch statements buried in service modules, env vars pointing at hostnames you wrote down once and forgot.

You know what your package.json looks like. You have no idea what your outbound calls look like. That's the gap.

The Audit Nobody Runs

After Kroki's Excalidraw backend crashed and refused to come back, I sat down for the first time in years and ran the audit. Not the one for npm packages. The one for outbound HTTP calls to free services I had never paid for and could not replace in a hurry.

It took me a Saturday morning to grep through everything. The pipeline I was working on at the time was a product catalog automation for a small ecommerce client, the kind of thing that generates assembly diagrams and visual specs for product pages on their store. Not glamorous, but it ships every day. And every step of the pipeline had a free SaaS bolted onto it somewhere.
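The grep itself is nothing fancy, and you can do it as a script in the same language as the pipeline. Here's a minimal sketch: it walks a source tree, pulls out every outbound http(s) URL, and tallies hostnames. The demo tree and domains below are invented so the output is reproducible; point `root` at your real repo instead.

```typescript
// Sketch of the imported-debt audit. The /tmp/audit-demo tree and the
// example domains are stand-ins, not real services.
import { mkdirSync, writeFileSync, readdirSync, readFileSync, statSync } from "node:fs";
import { join } from "node:path";

const root = "/tmp/audit-demo";
mkdirSync(root, { recursive: true });
writeFileSync(join(root, "pipeline.ts"), `
await fetch("https://kroki.example/excalidraw/svg");
await fetch("https://api.convert.example/v1/render");
await fetch("https://kroki.example/mermaid/png");
`);

// Recursively list every file under a directory.
function walk(dir: string): string[] {
  return readdirSync(dir).flatMap((name) => {
    const p = join(dir, name);
    return statSync(p).isDirectory() ? walk(p) : [p];
  });
}

// Extract every http(s) URL and count calls per hostname.
const counts = new Map<string, number>();
for (const file of walk(root)) {
  for (const m of readFileSync(file, "utf8").matchAll(/https?:\/\/([A-Za-z0-9.-]+)/g)) {
    counts.set(m[1], (counts.get(m[1]) ?? 0) + 1);
  }
}

// The hosts you call most often are the debts to refinance first.
for (const [host, n] of [...counts].sort((a, b) => b[1] - a[1])) {
  console.log(`${n}  ${host}`);
}
// → 2  kroki.example
// → 1  api.convert.example
```

Every hostname that comes out of this and isn't yours is a line on the balance sheet.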

I came up with 14 outbound calls to services I had never paid for. I won't bore you with the full inventory. The interesting numbers were elsewhere. The number of items that had a fallback plan documented somewhere: zero. The number of items with monitoring on the upstream service: zero. The number of items I had ever stress-tested by killing the dependency on purpose: also zero.

I was running a production pipeline on top of a stack of free promises, and I had been doing it for so long that it didn't even register as a risk anymore. It registered as "infrastructure."

The Kroki replacement, the actual fix, was small. A single TypeScript file that does exactly what the broken service was doing for me, no more, no less. Runs on Bun. Calls a library directly instead of going through a headless browser. Lives in 47 lines. Uses 93MB of RAM. Renders a diagram in roughly 2 milliseconds. It runs in a Docker container on a VPS I was already paying $6 a month for, on the same internal network as the rest of the stack. No public endpoint. No certificates. No attack surface. It has been running since the day I wrote it and it has not once gone down.
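The shape of that file is roughly this. To be clear, this is a sketch, not the actual code: the rendering call is stubbed out, where the real version calls the diagram library directly. The point is the architecture: one handler, one direct library call, no headless browser, nothing public.

```typescript
// Stub standing in for a direct call into the rendering library.
// In the real fix, this is where the diagram library does the work.
function renderSvg(source: string): string {
  return `<svg xmlns="http://www.w3.org/2000/svg"><!-- ${source.length} bytes of diagram --></svg>`;
}

// Bun serves this handler shape natively via Bun.serve({ fetch: handle }).
// It listens only on the internal Docker network: no TLS, no public surface.
async function handle(req: Request): Promise<Response> {
  const source = await req.text();
  return new Response(renderSvg(source), {
    headers: { "content-type": "image/svg+xml" },
  });
}
```

Everything a hosted service wraps in a dashboard, a queue, and a status page collapses into a function call when you own both ends of the wire.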

Now, the part that bothers me is recent. Five years ago, that fix would have taken me a full day. Maybe two. The cost-benefit of replacing a free dependency would have been a net loss for any single one of them, so I would have done what everyone does, which is wait for the upstream to come back and pray. Today, with Claude Code in front of me, the same fix took thirty minutes and I did it during a coffee break. The math has flipped. The thing that used to be too expensive to fix is now too cheap to ignore.

It's the same move as a few months back, when I rebuilt, for a fraction of the cost, a paid setup that the vendor had decided to retire on me. The arbitrage has changed under our feet, and most of us are still running the old prices in our heads.

Every imported debt I had been carrying for years was suddenly cheap to refinance. I just hadn't noticed.

The Fix Is Not Self-Hosting

Careful here, because the easy version of this story is "self-host everything" and that is the wrong conclusion.

Self-hosting has its own debt. Servers need patching. Containers need restarting. Disks fill up. The fact that I replaced one free dependency with 47 lines of my own code does not mean I won the game. It means I traded one creditor for another. The new creditor is me, and at least I know where to find me.

The actual fix is much more boring. The actual fix is keeping a balance sheet.

You don't need a tool. You need a list. A flat text file in your repo, or a section in your README, or whatever survives your own laziness. Every external service your pipeline calls. Three columns. What it does, what dies if it dies, and what you would do about it on a Monday morning. That's it. The act of writing it forces you to look at each line and ask the only question that matters: do I have a plan, or am I betting that this one is too big to fall?
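A sketch of what that file can look like. The services, blast radii, and plans below are invented examples, not a template to copy:

```text
# imported-debt.txt — everything this pipeline calls but does not own

service                         what dies if it dies           Monday-morning plan
kroki.example (diagrams)        product-page diagram step      self-host: ~50-line renderer
api.convert.example (images)    thumbnail generation           none yet — circled in red
public fonts CDN                cosmetic styling only          accept the risk
```

The third column is the one that hurts to fill in, which is exactly why it's the one worth filling in.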

Most of your entries will not have a plan. That's fine. The point of the audit isn't to fix everything in a weekend, it's to make the third pile visible. Once you can see it, you can start refinancing the cheapest items first. The two-millisecond replacements. The 47-line fixes. The ones where the library exists as a package and the only thing you were ever paying for was the HTTP wrapper.

You will discover, like I did, that a surprising number of your imported debts are exactly that: an HTTP wrapper around something you could have called directly. The infrastructure looked impressive because there was a hosted dashboard and a status page and a brand. Strip the wrapper and the actual logic is fifty lines. This is the same pattern I keep hitting elsewhere too, and it's why I wrote a whole piece on how the cheapest tool that does the job tends to beat the fancy one in production. Less surface, less to break, less to depend on.

Three questions to put on the balance sheet, for every line.

First question: if this thing dies on a Monday morning at 9 a.m., what stops working? Be honest. Don't say "nothing critical." Walk through it. Trace the call. See where it lands. If the answer is "the publish step" or "the customer-facing thing" or "the part that makes money," circle the line in red.

Second question: do I have a fallback, or do I just believe this one is too big to fall? The "too big to fall" reasoning is exactly the reasoning that gets you killed. Cloudflare is too big to fall. AWS us-east-1 is too big to fall. They both fell in 2025. Free tiers from indie maintainers fall every week. Belief is not a fallback.

Third question is the cheap one. How many lines of code would it cost to rebuild this myself, right now, while I'm calm, instead of in a panic at 9:17 on a Monday? Maybe the answer is "thousands" and you decide to live with the risk. That's a real decision. Maybe the answer is "47" and you do it during a coffee break. That's a real decision too. The point is making the decision instead of having it made for you.

Most of us have never made the decision. We just kept clicking the free service into the pipeline because it was there.


Three days after the incident, I went back to Kroki's status page. The Excalidraw backend was still listed as down. Someone had posted a message on their Discord asking if anyone was working on it. Nobody had answered.

My pipeline had been running for 72 hours without interruption. I had forgotten that I owed something to somebody.

AI makes you resilient or selfish. Depends how you squint. 🤷

Anyway, the point is this: audit your imported debt. Not because you need to fix it all, but because you need to see it. The debt you can see, you can plan for. The debt you can't see just waits for a Monday morning.


(*) The cover image was made by an AI, which is itself a free service I'm importing into my workflow. Make of that what you will.
