Every few years, our industry rediscovers an old truth and pretends it’s new.
Clean code.
Microservices.
DevOps.
Now: prompt engineering.
Suddenly, people who shipped a single CRUD app in 2019 are tweeting things like:
“The problem isn’t your system. It’s your prompts.”
No.
The problem is still your system.
Prompt engineering is not a silver bullet.
It’s a very expensive band-aid applied to architectural wounds that were already infected.
The Fantasy
The fantasy goes like this:
- You have a messy backend
- Inconsistent APIs
- No real domain boundaries
- Business logic scattered across controllers, cron jobs, and Slack messages
But then…
✨ You add AI ✨
✨ You refine the prompt ✨
✨ You add “You are a senior engineer” at the top ✨
And magically, intelligence flows through your system like electricity.
Except that’s not how software works.
That’s not how anything works.
Reality Check: AI Enters Your System
An LLM doesn’t see your product.
It sees:
- Whatever JSON you remembered to pass
- Whatever context fit into a token window
- Whatever half-written schema someone added at 2am
So when your AI “makes a bad decision,” it’s usually doing exactly what you asked — inside a broken abstraction.
That’s not hallucination.
That’s obedience.
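To make that concrete, here’s a deliberately naive sketch of how context often gets assembled. Every name in it is invented for illustration, but the shape should look familiar: the model’s “view” of your product is exactly this string, whatever was lying around, concatenated and silently truncated.

```python
import json

TOKEN_BUDGET = 4000  # crude character-based stand-in for a real token limit

def build_context(order: dict, user: dict, notes: list[str]) -> str:
    """Naively concatenate whatever data was on hand, then truncate."""
    blob = "\n".join([
        json.dumps(order),   # whatever JSON you remembered to pass
        json.dumps(user),    # possibly stale, possibly half-populated
        "\n".join(notes),    # the 2am schema notes, verbatim
    ])
    return blob[:TOKEN_BUDGET]  # silently drops everything past the budget
```

If the model then “decides” badly, it is obeying this truncated view, not misbehaving.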
Prompt Engineering vs. Structural Problems
Let’s be honest about what prompts are being used to hide:
❌ Missing domain boundaries
“Please carefully infer the user’s intent.”
❌ Inconsistent data models
“Use your best judgment if fields are missing.”
❌ No source of truth
“If multiple values conflict, choose the most reasonable one.”
❌ Business logic in five places
“Follow company policy (described below in 800 tokens).”
This isn’t AI intelligence.
This is outsourcing architectural decisions to autocomplete.
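For contrast, here’s a hedged sketch of the structural alternative to one of those prompt patches: validate data at an explicit boundary so the model never has to “use its best judgment” about missing fields. The Order type and normalize_order function are made-up names for illustration, not anything from a real codebase.

```python
from dataclasses import dataclass

@dataclass
class Order:
    order_id: str
    total_cents: int
    currency: str

def normalize_order(raw: dict) -> Order:
    """Fail loudly at the boundary instead of asking the model to guess."""
    missing = [k for k in ("order_id", "total_cents", "currency") if k not in raw]
    if missing:
        raise ValueError(f"order payload missing fields: {missing}")
    return Order(
        order_id=str(raw["order_id"]),
        total_cents=int(raw["total_cents"]),
        currency=str(raw["currency"]).upper(),
    )
```

Boring? Very. That’s the point: the judgment call never reaches the prompt.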
The Distributed Systems Joke (That Isn’t a Joke)
When you build AI agents, you quickly learn something uncomfortable:
AI agents are just distributed systems that can talk back.
They have:
- State (that you pretend is stateless)
- Latency (that you ignore)
- Failure modes (that logs can’t explain)
- Side effects (that happen twice)
So when your agent:
- double-charges a user
- retries an action incorrectly
- or confidently does the wrong thing
That’s not “AI being unpredictable.”
That’s classic distributed systems behavior, now narrated in natural language.
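Take the framing seriously and the fixes are the old, boring ones. Here’s a minimal sketch, assuming a durable store exists in real life (an in-memory set stands in for it here), of making an agent’s side effect idempotent so a retried tool call can’t charge anyone twice. All names are invented for illustration.

```python
_processed: set[str] = set()  # stand-in for a durable store (DB, Redis, ...)

def charge_once(idempotency_key: str, user_id: str, amount_cents: int) -> str:
    """Make the agent's side effect safe to retry."""
    if idempotency_key in _processed:
        return "already charged, skipping"  # the retry becomes a no-op
    _processed.add(idempotency_key)
    # the real payment call would go here, e.g. a payments-provider charge
    return f"charged {amount_cents} cents for user {user_id}"
```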
“But We Have Guardrails”
Everyone says this.
Guardrails are great.
So are seatbelts.
But seatbelts don’t fix:
- a missing steering wheel
- an engine held together by YAML
- or a roadmap decided by vibes
Most guardrails today are just:
- more prompts
- more conditionals
- more “if unsure, ask the user”
At some point, you’re not building a system.
You’re negotiating with it.
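A guardrail that actually holds is code, not another sentence in the prompt. A hedged sketch: the policy limit and function below are made up, but the shape is the point. The model can propose anything; deterministic code decides what’s allowed.

```python
MAX_AUTO_REFUND_CENTS = 50_00  # invented policy limit, purely for illustration

def enforce_refund_policy(proposed_amount_cents: int) -> int:
    """The model proposes; deterministic code disposes."""
    if proposed_amount_cents <= 0:
        raise ValueError("refund must be positive")
    if proposed_amount_cents > MAX_AUTO_REFUND_CENTS:
        raise PermissionError("refund exceeds auto-approval limit; route to a human")
    return proposed_amount_cents
```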
The Unpopular Truth
AI doesn’t replace architecture.
It amplifies it.
Good architecture:
- makes AI boring
- predictable
- reliable
Bad architecture makes AI look magical:
- until production
- until scale
- until cost
- until users do real things
That’s why AI demos look amazing and AI products feel… fragile.
Why This Keeps Happening
Because prompt engineering is:
- fast
- visible
- tweetable
Architecture is:
- slow
- invisible
- only noticed when it fails
So we optimize for prompts.
We ignore boundaries.
We ship “intelligence” on top of entropy.
And then we blame the model.
The Senior Dev Take
If your AI system needs:
- a 2,000-token prompt to explain business rules
- constant retries to “get it right”
- human review for every important decision
You don’t have an AI problem.
You have an architecture problem that now speaks English.
Final Thought
Prompt engineering won’t fix your architecture.
But it will expose it.
Loudly.
In production.
With confidence.
And honestly?
That might be the most useful thing AI has done for us so far.
Top comments (18)
This is a really sharp and grounded take—I like how clearly you separate the hype from the actual engineering reality. The point about AI amplifying architecture rather than fixing it feels especially true from what I’ve seen in real systems. I agree that prompts often end up masking deeper design issues instead of solving them, and your distributed-systems comparison really lands. Posts like this make me want to think more seriously about how to design AI features the “boring but correct” way.
Appreciate this. The biggest frustration for me is watching prompts become a substitute for thinking. It feels like we’re repeating old mistakes, just with nicer language.
Yeah, that really came through. The “AI amplifies architecture” point hit hard — I’ve seen teams assume the model will smooth over design gaps instead of exposing them.
Exactly. When things break, people blame “hallucinations,” but most of the time the model is just faithfully executing a bad abstraction.
The distributed systems comparison was especially spot-on. Once you frame agents that way, the failure modes suddenly look… very familiar.
That framing helped me too. Retries, side effects, hidden state — none of this is new. We’ve just wrapped it in natural language and pretended it’s different.
And guardrails end up being more prompts on top of prompts. At some point it feels less like engineering and more like negotiation.☺
Right. If you need 2,000 tokens to explain your business rules, the model isn’t the problem — your system is already screaming.😀
Which is funny, because the demos look magical… but production feels fragile the moment real users show up.
That’s the tradeoff. Good architecture makes AI boring. Bad architecture makes it look impressive — briefly.
Honestly, that might be the best unintended benefit of AI so far: it forces us to confront architectural debt we’ve been ignoring for years.
Strong take—and accurate. LLMs don’t introduce intelligence into a system; they faithfully execute whatever abstractions you give them, so weak boundaries and unclear sources of truth simply get amplified, not fixed.
You’re right — prompt engineering doesn’t fix architecture.
It reveals it.
What most teams call “AI failure” is just latent system debt finally speaking in plain language. When an LLM “makes a bad decision,” it’s usually executing faithfully inside a broken abstraction: fragmented domains, no single source of truth, and business rules smeared across time and tooling.
Good architecture makes AI boring.
Bad architecture makes AI look magical — until scale, cost, or reality hits.
If your system needs ever-longer prompts, retries, and human patching to stay sane, you don’t have an AI problem. You have an architecture problem that now talks back.
The uncomfortable part: AI doesn’t replace design.
It removes excuses.
Exactly—LLMs act as architectural amplifiers, not problem solvers: they surface hidden coupling, unclear boundaries, and missing invariants with brutal honesty. When intelligence appears “unreliable,” it’s usually the system revealing that it never knew what it stood for in the first place.
You're right, but the world, and the people with influence, are pushing these tools very hard, so it becomes difficult for opinions like this one to get heard. Nice one, though 👏👏