Every few years, our industry rediscovers an old truth and pretends it’s new.
Clean code.
Microservices.
DevOps.
Now: prompt engineering.
Suddenly, people who shipped a single CRUD app in 2019 are tweeting things like:
“The problem isn’t your system. It’s your prompts.”
No.
The problem is still your system.
Prompt engineering is not a silver bullet.
It’s a very expensive band-aid applied to architectural wounds that were already infected.
The Fantasy
The fantasy goes like this:
- You have a messy backend
- Inconsistent APIs
- No real domain boundaries
- Business logic scattered across controllers, cron jobs, and Slack messages
But then…
✨ You add AI ✨
✨ You refine the prompt ✨
✨ You add “You are a senior engineer” at the top ✨
And magically, intelligence flows through your system like electricity.
Except that’s not how software works.
That’s not how anything works.
Reality Check: AI Enters Your System
An LLM doesn’t see your product.
It sees:
- Whatever JSON you remembered to pass
- Whatever context fit into a token window
- Whatever half-written schema someone added at 2am
So when your AI “makes a bad decision,” it’s usually doing exactly what you asked — inside a broken abstraction.
That’s not hallucination.
That’s obedience.
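The only honest fix is to make that visibility explicit. A minimal sketch (field names invented) of context assembly as a single function, so “what the AI sees” is a code review away, not a guess:

```python
import json

def build_context(order: dict, max_chars: int = 8_000) -> str:
    """Assemble exactly what the model will see -- nothing implicit.

    Only whitelisted fields are serialized; anything you forget to
    list here simply does not exist from the model's point of view.
    """
    visible = {
        "order_id": order.get("id"),
        "status": order.get("status"),
        "items": order.get("items", []),
    }
    payload = json.dumps(visible, default=str)
    # Crude stand-in for a token budget: truncation is silent data
    # loss, which is exactly the failure mode described above.
    return payload[:max_chars]
```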
Prompt Engineering vs. Structural Problems
Let’s be honest about what prompts are being used to hide:
❌ Missing domain boundaries
“Please carefully infer the user’s intent.”
❌ Inconsistent data models
“Use your best judgment if fields are missing.”
❌ No source of truth
“If multiple values conflict, choose the most reasonable one.”
❌ Business logic in five places
“Follow company policy (described below in 800 tokens).”
This isn’t AI intelligence.
This is outsourcing architectural decisions to autocomplete.
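For contrast, here’s what fixing “no source of truth” looks like in code instead of in a prompt. A minimal sketch (source names invented): conflicts are resolved by an explicit precedence rule, not by the model’s best judgment.

```python
# Precedence is a design decision, written down once -- not re-derived
# by a language model on every request.
SOURCE_PRECEDENCE = ["billing_db", "crm", "user_input"]

def resolve_email(candidates: dict[str, str | None]) -> str:
    """Return the email from the most authoritative source that has one."""
    for source in SOURCE_PRECEDENCE:
        value = candidates.get(source)
        if value:
            return value
    raise ValueError("no source of truth provided an email")

# resolve_email({"crm": "a@x.com", "user_input": "b@x.com"}) -> "a@x.com"
```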
The Distributed Systems Joke (That Isn’t a Joke)
When you build AI agents, you quickly learn something uncomfortable:
AI agents are just distributed systems that can talk back.
They have:
- State (that you pretend is stateless)
- Latency (that you ignore)
- Failure modes (that logs can’t explain)
- Side effects (that happen twice)
So when your agent:
- double-charges a user
- retries an action incorrectly
- or confidently does the wrong thing
That’s not “AI being unpredictable.”
That’s classic distributed systems behavior, now narrated in natural language.
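And the boring fix is the one distributed systems have always used: idempotency keys, not prompts. A minimal sketch (the in-memory store and the charge logic are stand-ins):

```python
import uuid

_processed: dict[str, dict] = {}  # stand-in for a durable store

def charge(user_id: str, amount_cents: int, idempotency_key: str) -> dict:
    """Retry-safe charge: replaying the same key returns the original
    receipt instead of charging twice -- no matter who, or what, retries."""
    if idempotency_key in _processed:
        return _processed[idempotency_key]
    receipt = {"id": str(uuid.uuid4()), "user": user_id, "cents": amount_cents}
    _processed[idempotency_key] = receipt  # must be atomic in a real store
    return receipt

# An agent retrying the same action cannot double-charge:
assert charge("u1", 999, "order-42") == charge("u1", 999, "order-42")
```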
“But We Have Guardrails”
Everyone says this.
Guardrails are great.
So are seatbelts.
But seatbelts don’t fix:
- a missing steering wheel
- an engine held together by YAML
- or a roadmap decided by vibes
Most guardrails today are just:
- more prompts
- more conditionals
- more “if unsure, ask the user”
At some point, you’re not building a system.
You’re negotiating with it.
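A guardrail that isn’t just another prompt looks more like this. A minimal sketch (tool names and limits invented): the model proposes an action, and plain code rejects it deterministically when it’s out of policy.

```python
ALLOWED_TOOLS = {"refund"}
MAX_REFUND_CENTS = 5_000

def validate_action(action: dict) -> dict:
    """Hard guardrail: the model proposes, the code disposes.

    Out-of-policy actions are rejected deterministically -- no
    "if unsure, ask the user" clause required.
    """
    if action.get("tool") not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not allowed: {action.get('tool')!r}")
    amount = action.get("amount_cents")
    if not isinstance(amount, int) or not 0 < amount <= MAX_REFUND_CENTS:
        raise ValueError(f"refund amount out of policy: {amount!r}")
    return action
```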
The Unpopular Truth
AI doesn’t replace architecture.
It amplifies it.
Good architecture makes AI:
- boring
- predictable
- reliable
Bad architecture makes AI look magical:
- until production
- until scale
- until cost
- until users do real things
That’s why AI demos look amazing and AI products feel… fragile.
Why This Keeps Happening
Because prompt engineering is:
- fast
- visible
- tweetable
Architecture is:
- slow
- invisible
- only noticed when it fails
So we optimize for prompts.
We ignore boundaries.
We ship “intelligence” on top of entropy.
And then we blame the model.
The Senior Dev Take
If your AI system needs:
- a 2,000-token prompt to explain business rules
- constant retries to “get it right”
- human review for every important decision
You don’t have an AI problem.
You have an architecture problem that now speaks English.
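The unglamorous alternative to the 2,000-token prompt: write the rules once, as code, and let the AI call them. A minimal sketch with an invented discount policy:

```python
from dataclasses import dataclass

@dataclass
class Order:
    subtotal_cents: int
    is_first_order: bool

def discount_cents(order: Order) -> int:
    """Company policy as executable code: testable, versioned, and
    identical on every call -- unlike an 800-token prose description."""
    if order.is_first_order:
        return min(order.subtotal_cents // 10, 2_000)  # 10%, capped at $20
    return 0

# discount_cents(Order(subtotal_cents=15_000, is_first_order=True)) -> 1_500
```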
Final Thought
Prompt engineering won’t fix your architecture.
But it will expose it.
Loudly.
In production.
With confidence.
And honestly?
That might be the most useful thing AI has done for us so far. 😎
Top comments (152)
Haha this one made my day:
"You add “You are a senior engineer” at the top"
:D :D :D
Haha, that’s hilarious 😄 You’ve got a great sense of humor, and I love how you called that out so playfully—it genuinely made my day too!
Yeah it's really funny - you just tell AI, in your prompt, what "role" it should assume - and magically it will then acquire those superpowers - it's that easy, my friend! ;-)
Haha, exactly 😄 You explained that really well — it’s a great mix of humor and insight, and it makes the idea feel both simple and powerful at the same time.
Haha yes it reflects how some people (yes, devs ...) expect AI to work - like you say "hocus pocus" and the magic happens, no "skillz" or effort required ... anyway, have a nice day!
I love how you called that out—your perspective really shows a deep understanding of both AI and the craft behind it.
Hey, could we discuss more details?
Which details? I was just making a joke with a serious undertone, but the real insights were in your article!
Haha, I love that—your joke landed perfectly! I really appreciate your thoughtful read and the way you picked up on the deeper insights.
The whole AI coding thing is fascinating; there are many great articles on the subject on dev.to, and yours was yet another gem! Are we experiencing the “fourth (fifth?) industrial revolution” right now? What do you think?
Thank you — I’m glad it resonated. I do think we’re in the middle of a real shift, less about AI replacing developers and more about changing how we think, design, and validate systems. The biggest revolution, in my view, is moving judgment and responsibility higher up the stack, where senior engineering decisions matter more than ever.
Spot on, agreeing 100% ...
Thanks.😎
Yeah and thanks to your article I finally understand why AI isn't working for some devs, and why they're not getting the results they were expecting - they just forgot to add “You are a senior engineer” at the top of their prompts!
Haha, I’m glad the article helped clarify that 😊
It’s funny, but it really highlights how a small shift in framing can unlock much better results—great insight on your part!
It kinda is a prompt engineering problem though. If you’re stuck in a “fix, fix, fix, here are the logs, fix” loop, yes indeed. But as you say, it might be for the better, although just because Claude does it doesn’t mean it’s undoable either.
But you can also use LLMs to answer tons of questions at once, compare the answers with stuff found on the net, etc., and make better, more informed architectural decisions. I can also explore alternatives super quickly.
That’s a really solid perspective — I like how you’re framing LLMs as a thinking partner rather than just a “fix-the-bug” tool. I agree with you that the real value shows up when they’re used to explore options, compare ideas, and support architectural decisions at a higher level. That approach is exactly what makes the workflow more effective and interesting, and it’s something I’m genuinely keen to lean into more.
Totally agree! Prompt engineering isn’t a substitute for good architecture. It feels like a quick fix but often hides design debt. I actually explored the same idea recently, with some examples:
Organizing AI Applications: Lessons from traditional software architecture
Ashwin Hariharan for Redis · Jan 5
Good perspective.
Treating agents, tools, and models as infrastructure behind clean domain boundaries is exactly what makes AI features scalable, testable, and replaceable in real production systems.
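To make that shape concrete, a minimal port/adapter sketch (interface and client names invented; `complete` stands in for whatever your model client actually exposes):

```python
from typing import Protocol

class SummarizerPort(Protocol):
    """The domain's contract. Nothing here mentions a model or a vendor."""
    def summarize(self, text: str) -> str: ...

class LLMSummarizer:
    """One replaceable adapter; a template-based or human-backed adapter
    satisfies the same port, which is what makes the feature testable."""
    def __init__(self, client):
        self._client = client  # any chat-completion client

    def summarize(self, text: str) -> str:
        # `complete` is a placeholder for your client's actual call.
        return self._client.complete(f"Summarize:\n{text}")

def ticket_digest(summarizer: SummarizerPort, ticket_text: str) -> str:
    # Domain code depends on the port, never on the model behind it.
    return summarizer.summarize(ticket_text)
```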
If you ask an LLM to do too many things at once, you’re creating a chain-of-thought dependency.
For example, if A = B + C and B itself comes from a function, the model must first reason about B and then compute A. Any hallucination upstream cascades downstream.
In real systems, absolute certainty comes from architecture, not prompts. Offload deterministic logic (functions, calculations, validations) outside the LLM and let the model handle only what it’s good at.
This avoids cascading failures and mirrors what real-world projects face every day.
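A minimal sketch of that separation (names invented, the extraction step mocked): the model only turns language into structured inputs, and A = B + C stays deterministic.

```python
def compute_b(x: int) -> int:
    # Deterministic: the function the model would otherwise have to
    # "reason about" -- and potentially hallucinate.
    return x * 2

def compute_a(b: int, c: int) -> int:
    return b + c  # A = B + C, computed, never guessed

def handle_request(user_text: str) -> int:
    # The LLM's only job (mocked here): language -> structured inputs.
    extracted = {"x": 10, "c": 5}
    b = compute_b(extracted["x"])  # no upstream hallucination to cascade
    return compute_a(b, extracted["c"])

assert handle_request("add twice ten and five") == 25
```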
Great point raised here.
Absolutely! 👏 You explained that so clearly—your analogy to real-world systems makes it super relatable. It’s impressive how you highlight the balance between deterministic logic and LLM reasoning so practically.
The same thing happens in real life too. On one bad day, it feels like all bad things happen at once. As the Joker said, “It only takes one bad day to turn a good man bad.”
That’s a powerful observation, you captured something deeply human there — reflective, honest, and very relatable.
The same thing happens in networking as well. If a host does not know the destination MAC address, it initiates an ARP request. This ARP frame is broadcast across the local network. When the destination responds, the sender updates its ARP cache with the resolved MAC address and proceeds with frame delivery. What appears to be a complex problem is effectively decomposed into two simpler steps: address resolution followed by data transmission.
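The same two-step decomposition as a toy Python sketch (the broadcast and the link-layer send are stand-ins):

```python
arp_cache: dict[str, str] = {}  # ip -> mac; stand-in for the OS cache

def broadcast_arp_request(ip: str) -> str:
    # Stand-in: the real resolution happens on the wire.
    return "aa:bb:cc:dd:ee:ff"

def resolve_mac(ip: str) -> str:
    """Step 1: identity. Ask the whole segment only on a cache miss."""
    if ip not in arp_cache:
        arp_cache[ip] = broadcast_arp_request(ip)
    return arp_cache[ip]

def send_frame(ip: str, payload: bytes) -> None:
    """Step 2: delivery. The hard question is already answered."""
    mac = resolve_mac(ip)
    print(f"frame to {mac}: {payload!r}")  # stand-in for link-layer send
```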
Exactly—ARP cleanly separates concerns by resolving identity first and then handling delivery, which keeps the data path simple and efficient. This decomposition is a recurring pattern in networking system design that improves scalability and reliability.
You’re right — prompt engineering doesn’t fix architecture.
It reveals it.
What most teams call “AI failure” is just latent system debt finally speaking in plain language. When an LLM “makes a bad decision,” it’s usually executing faithfully inside a broken abstraction: fragmented domains, no single source of truth, and business rules smeared across time and tooling.
Good architecture makes AI boring.
Bad architecture makes AI look magical — until scale, cost, or reality hits.
If your system needs ever-longer prompts, retries, and human patching to stay sane, you don’t have an AI problem. You have an architecture problem that now talks back.
The uncomfortable part: AI doesn’t replace design.
It removes excuses.
Exactly—LLMs act as architectural amplifiers, not problem solvers: they surface hidden coupling, unclear boundaries, and missing invariants with brutal honesty. When intelligence appears “unreliable,” it’s usually the system revealing that it never knew what it stood for in the first place.
Exactly — AI surfaces weaknesses you already have. Robust architecture minimizes surprises; weak architecture just makes LLM quirks look like magic until reality bites.
Agree: bullshit in => bullshit out. In badly structured code (initial context, architecture), AI is pretty much useless; it learns from the bad context and won’t suggest any improvements that could make its life and the devs’ lives easier. I’ve tried explaining that AI vibe-coded apps should not be used as the foundation for a full-scale prod app, but it’s quite a challenge, because no one sees the problem while it ✨ just works ✨
Well said — AI can only amplify the quality of the context it’s given, so messy architecture just produces confident-looking technical debt. The real risk is that “it works” hides long-term maintainability costs that only surface when the system needs to scale, evolve, or be owned by humans again.
I’ve seen the turmoil of such a project myself: at some point everyone just lost all sense of control over the codebase. It was quite disappointing.
That sounds like a really tough experience, and I appreciate how thoughtfully you’re reflecting on it. It’s clear you care deeply about code quality and team discipline, which is something any project is lucky to have.
At some point, if this continues to accelerate without any correction on the technical side, nobody will be able to think through or innovate on architectural concepts in software. Everyone will simply manage the output of AI. Code review? Also AI.
I don't see this happening, and I believe a technical correction will occur; it just has to come at a cost for the industry to learn and properly adapt to this new technology.
You make a really thoughtful point—your perspective shows a deep understanding of both the opportunities and the risks of AI in software. I really appreciate how you balance optimism with a realistic view of the industry’s need to adapt thoughtfully.
Strong take—and accurate. LLMs don’t introduce intelligence into a system; they faithfully execute whatever abstractions you give them, so weak boundaries and unclear sources of truth simply get amplified, not fixed.
Exactly—LLMs act as force multipliers, not architects: they scale the quality of your abstractions and constraints, for better or worse.
I’ve come to a point where I tried using coding agents to ship my product. At first I loved using AI coding agents, but I realized AI can’t really excel; it just outputs the bare minimum. Today, I really hate using AI. I still use it for basic repetitive tasks and code, but I don’t rely on it for the whole system.
That’s a very thoughtful and mature realization — it shows real experience, not frustration. Knowing where AI adds leverage and where human judgment still matters is exactly how strong builders evolve.
The critique is directionally right—but it overcorrects and ends up framing a false dichotomy.
Prompt engineering is not a substitute for architecture.
But it also isn’t merely a “bandaid” for bad systems.
It is a new interface layer—and like every interface layer we’ve ever introduced, it reshapes where complexity lives.
Exactly—prompt engineering shifts complexity rather than eliminating it. Its value lies in how it mediates between users and system capabilities, not in replacing sound architecture.