Your DIY platform is automation debt with a better outfit

#devops #kubernetes #cloudnative #platform

Platform engineers are some of the most resourceful people in IT. Give them a problem, they'll automate it. The trouble is, automation doesn't maintain itself.

A piece on The New Stack this week named the pattern clearly:

"Automation may mask complexity but does not eliminate it, and mountains of automation makes diagnosis and repair exponentially harder when things go sideways."

That's the trap. You automate a painful workflow, ship it, move on. Then the engineer who wrote it moves on. The context behind why it was built that way fades. When it breaks — and it will — you're not debugging an application. You're doing an archaeological dig through your own infrastructure.

What actually happens

The DIY platform cycle goes like this:

1. You automate a painful workflow ✅
2. That automation breaks when context is lost
3. You automate around the breakage
4. Now you're managing two mountains of automation
5. The platform team can never be reassigned, because the business depends on them to keep the lights on
6. You've traded software costs for people costs, and often spent more

The framing is sharp: you didn't eliminate complexity. You became responsible for it in a new way.

Why this matters now

AI is the forcing function. Code generation is speeding up dev cycles — but if deployment pipelines haven't kept pace, you erode the gains immediately.

The article's argument: you need to deploy nearly as fast as AI can generate code. That means every step in the path to production needs to be streamlined. An autonomous agent can't wait days to provision a database or weeks to rotate credentials.

And the pace of AI innovation compounds the problem. Shadow AI, MCP servers, agentic harnesses, new foundation models weekly — if you're running a DIY platform, you're evaluating and integrating each of these yourself, on top of everything else you're already managing to keep the lights on.

The pre-engineered alternative

The article is authored by a Broadcom/Tanzu PM, so it's a vendor argument — but the underlying observation holds regardless of which platform you'd choose.

A pre-engineered PaaS comes with the plumbing, security, and resilience already integrated. Deployment packages and base images are pre-wired. When a CVE drops, you restage with a single command instead of chasing changes across your SDLC. Onboarding a new team is a repeatable process, not a one-off integration project.

The comparison is stark: assembling Terraform, ArgoCD, Kubernetes, cert-manager, OpenBao, and Istio gives you powerful building blocks. But you still own the integration, opinions, lifecycle management, and the operational model tying them together. A PaaS makes those decisions for you up front.

What to do

Running a DIY platform? Map the automation honestly — count the scripts nobody fully understands and the engineers whose departure would break things.
Evaluating PaaS? The criteria from this piece are sound regardless of vendor: Day 1 batteries-included, consistent deployment packages, security handled upstream.
On Kubernetes already? Tanzu Platform layers on top of an existing VMware Cloud Foundation deployment, so adoption is incremental, not rip-and-replace.
Thinking about AI deployments? The deployment bottleneck is your real constraint, not code generation speed.
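
For the first item, mapping the automation, here is a minimal sketch of what an honest inventory could look like. It is one possible shape, not a prescribed tool: the Automation record, the script names, and the engineers are all hypothetical, and "maintainers" here means people who could realistically debug the thing today, not whoever last touched it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Automation:
    name: str
    maintainers: tuple[str, ...]  # people who could debug it today (assumed definition)


def orphaned(inventory):
    """Return the automations nobody currently understands."""
    return [a.name for a in inventory if not a.maintainers]


def single_points_of_failure(inventory):
    """Map each engineer to the automations only they understand."""
    risk = defaultdict(list)
    for a in inventory:
        if len(a.maintainers) == 1:
            risk[a.maintainers[0]].append(a.name)
    return dict(risk)


# Hypothetical inventory; replace with your own scripts and people.
INVENTORY = [
    Automation("rotate-db-credentials.sh", ("alice",)),
    Automation("provision-namespace.py", ("alice", "bob")),
    Automation("legacy-dns-sync.sh", ()),
    Automation("cert-renewal-hook.sh", ("alice",)),
]

if __name__ == "__main__":
    print("Nobody understands:", orphaned(INVENTORY))
    for engineer, names in single_points_of_failure(INVENTORY).items():
        print(f"Only {engineer} understands:", names)
```

Even a list this crude makes the two numbers from the first item visible: how many scripts have no owner, and how much breaks if one person leaves.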

The honest question for any platform team: is the automation you've built a genuine productivity multiplier, or has it become the thing you now need to escape?

Source: The New Stack — "The DIY platform trap that's burning out engineering teams"

✏️ Drafted with KewBot (AI), edited and approved by Drew.