DEV Community


AI Writes Code. You Own Quality.

Helder Burato Berto on March 24, 2026

The more I use AI tools like Claude Code, the clearer it becomes: engineering skills are what make AI output worth shipping. AI makes writing code...
Mark

Does AI take pride in the learning, problem solving, and iteration it can take to land on a solid solution? Call me old-fashioned, but I do enjoy coding for the artistic and aha moments when the next piece of the puzzle is laid. I get that we can speed up coding with AI, and it has its uses, but humans do need to exercise their brains and not just be fed by the man behind the curtain.

Max

This matches our experience exactly. We run PHPStan (level 9), PHPMD, and Rector in CI — and the AI's code goes through the same pipeline as everyone else's. No special treatment, no "AI-generated" label that lowers the bar.

The insight we landed on: static analysis is the AI's self-awareness. The agent can't tell when its own quality is dropping — context fills up, confidence stays high, output gets worse. But PHPStan doesn't care who wrote the code. It catches the type mismatch the agent introduced at 2 AM the same way it catches a human's Friday afternoon mistake.

One addition to your point about hooks: we inject context rules per file path and per tool call, so the agent loads coding conventions only when it's actually editing PHP — not burning tokens on rules it doesn't need. Keeps the context budget for the work that matters.
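
A minimal sketch of what that path-scoped rule injection could look like (the glob patterns and file names here are invented for illustration, not our actual setup):

```python
import fnmatch

# Hypothetical mapping of file-path globs to convention files.
RULE_SOURCES = {
    "*.php": ["docs/php-conventions.md", "docs/phpstan-notes.md"],
    "*.ts": ["docs/ts-conventions.md"],
}

def rules_for(path: str) -> list[str]:
    """Return only the convention files relevant to the file being
    edited, so unrelated rules never enter the agent's context."""
    matched = []
    for pattern, files in RULE_SOURCES.items():
        if fnmatch.fnmatch(path, pattern):
            matched.extend(files)
    return matched
```

A hook calls something like this per tool invocation, so the agent editing `src/User.php` loads the PHP rules and nothing else.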

Helder Burato Berto

"The insight we landed on: static analysis is the AI's self-awareness"
This is so true!

Kalpaka

The "comment the why" principle points at something the article doesn't fully follow through on. When a junior submits a PR, you can ask "why this approach over that one?" and the answer tells you whether they understood the problem or got lucky. With AI output, the decision path is gone. You see the solution but not what was considered and rejected.

TDD constrains the output. Code review validates the result. But the reasoning that produced both is invisible. That's a quality layer that tests can't cover and reviews can only guess at.

Pixeliro

I agree AI is a multiplier, but I think there’s a new problem emerging:

If AI starts outperforming engineers in certain parts (like implementation speed or edge case coverage), there can be a gap between the quality of the output and the reviewer’s ability to fully understand it.

At that point, engineers are responsible for code they don’t completely grasp, which makes quality assurance much harder.

It feels like the role of engineering may shift — from writing code to designing constraints (tests, specs, guardrails) that control AI output.

Not about being better than AI, but about controlling the system it operates in.
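
A toy example of what "designing constraints" can mean in practice: the spec lives in executable checks, and any implementation, human- or AI-written, has to pass them. `parse_amount` and its behavior are hypothetical.

```python
def parse_amount(text: str) -> int:
    """Toy implementation (could be AI-written): parse "$12.34" into cents."""
    dollars, _, cents = text.strip().lstrip("$").partition(".")
    return int(dollars) * 100 + int((cents or "0").ljust(2, "0")[:2])

def check_spec(fn) -> None:
    """The guardrails: edge cases the engineer cares about, written
    before (and independent of) how the implementation was produced."""
    assert fn("$12.34") == 1234
    assert fn("7") == 700      # no decimal part
    assert fn(" 0.5 ") == 50   # whitespace, single decimal digit
```

The engineer's leverage is in `check_spec`; the body of `parse_amount` becomes replaceable.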

TMD • Edited

LLMs have made adding test harnesses so incredibly easy, it would be unwise not to use them, especially given how they can re-write business logic.

I've been super critical and "human in the loop" with AI, and it's been working well so far.

Nova Elvaris

The "AI is a multiplier" framing is spot on, and I think it extends further than most people realize. I've found that the developers who get the most out of AI coding tools aren't the ones who prompt best — they're the ones who already had strong review habits. They catch the subtle issues (cached error responses, missing edge cases, wrong assumptions about data formats) because they were already looking for those things in human-written code.

One concrete practice that's worked well for me: treating AI output the same way you'd treat a junior developer's PR. You wouldn't merge it without reviewing. You wouldn't skip the tests. And you definitely wouldn't let "it looks clean" substitute for "I understand what it does." The moment you start skimming AI output because it looks competent is exactly when the production bugs start sneaking through.

Harsh

The "AI is a multiplier" framing is the most honest way I've seen this described. It doesn't sugarcoat or fear-monger; it just puts the responsibility exactly where it belongs: with the engineer.

Your point about carrying team context is what resonates most for me. That institutional memory ("we tried X and it broke under load") is invisible to AI, no matter how good your prompt is. It lives in the engineer's head, in old Slack threads, in postmortems nobody archived properly. You can't inject judgment you haven't earned yet.

One thing I'd add to the review section: I've noticed that AI-generated code is often too clean, and that cleanliness creates a false sense of safety. Messy handwritten code with a `// TODO: fix this` makes me slow down and ask questions. Polished AI code triggers a subconscious "this looks fine" response even when the logic is subtly wrong. The halo effect is real, and I think it's underappreciated in most AI + code review discussions.

Great read. This is the kind of grounded take the dev community needs more of.

Helder Burato Berto

I like your take "I've noticed that AI-generated code is often too clean and that cleanliness creates a false sense of safety". Thanks for commenting!

Daniel Balcarek

Nice post! Just a small addition: I try to avoid comments in production apps about 99% of the time. In my experience, writing comments to explain the code itself is risky: when the code changes, authors rarely update the comments, leaving them obsolete and confusing. With AI, it's often better to focus on making the code self-explanatory, since developers coming back after a few weeks can easily get lost otherwise.

Helder Burato Berto

I agree; this is the best solution in the majority of situations.

Sushil Kulkarni

Great points — but I'd push it further: we don't just own the quality, we own everything around the code too.
Planning, design, security — those are ours. AI can assist, but the direction and judgment? That's on us.
And there's stuff specific to working with AI that people overlook:
→ Token optimization — bloated context = worse output
→ Accurate framing — if you describe the problem wrong, AI solves the wrong thing confidently
→ Fallback plans — when AI hallucinates or misses a business rule, you need a recovery, not just a merge
AI writes the code. We own the whole road — not just the last mile.

Mykola Kondratiuk

The ownership point is underrated. Shipping AI-written code without understanding it is like signing a contract you haven't read - the liability is still yours when something breaks. I've started treating AI output the way I treat code review: I'm responsible for everything I let through.

klement Gunndu

The Playwright + DevTools MCP feedback loop is underrated — that verify-then-iterate cycle catches layout and runtime issues that static analysis never will. Worth adding: pre-push hooks that run the full loop automatically so nothing ships unverified.
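
A rough sketch of such a pre-push hook, written as a Python script for illustration; the commands are placeholders for whatever lint / test / Playwright steps your project actually runs:

```python
import subprocess
import sys

# Placeholder verify loop; swap in your project's real commands.
CHECKS = [
    ["npm", "run", "lint"],
    ["npm", "test"],
    ["npx", "playwright", "test"],
]

def run_checks(checks) -> bool:
    """Run each check in order; stop and block the push on the first failure."""
    for cmd in checks:
        if subprocess.run(cmd).returncode != 0:
            print(f"pre-push: {' '.join(cmd)} failed; push blocked")
            return False
    return True
```

Saved as `.git/hooks/pre-push` (made executable, ending in `sys.exit(0 if run_checks(CHECKS) else 1)`), git refuses the push whenever any step fails, so nothing ships unverified.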

arun rajkumar

Exactly this. We used AI agents to do a retrospective across 15 microservice repos — mapping every env variable, finding naming conflicts between `DATABASE_HOST`, `DB_HOST`, and `POSTGRES_HOST`, generating a unified Zod schema. What would've been days of grep-and-spreadsheet work took a couple of hours.

But the output wasn't "done." It was a starting point that needed human review — understanding which services share secrets, which env vars should be optional vs required, how strictness should vary by environment. AI gave us speed. Quality still came from understanding the system.
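
The conflict-detection step can be sketched in a few lines (shown here in plain Python rather than Zod; the service names, env vars, and alias mapping are invented for illustration):

```python
# Which env vars each service reads (toy data).
SERVICE_ENV = {
    "billing": {"DATABASE_HOST", "API_KEY"},
    "auth":    {"DB_HOST", "API_KEY"},
    "reports": {"POSTGRES_HOST"},
}

# Aliases a human decided refer to the same setting; AI can propose
# this mapping, but confirming it requires knowing the system.
ALIASES = {"DATABASE_HOST": "DB_HOST", "POSTGRES_HOST": "DB_HOST"}

def find_conflicts(service_env, aliases):
    """Return canonical names that appear under more than one spelling."""
    seen = {}  # canonical name -> set of raw spellings
    for env in service_env.values():
        for name in env:
            seen.setdefault(aliases.get(name, name), set()).add(name)
    return {c: sorted(s) for c, s in seen.items() if len(s) > 1}
```

The mechanical scan is the fast part; deciding which aliases are truly the same setting, and which vars are optional vs required, is the human review the comment describes.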

Apex Stack

The Playwright MCP + Chrome DevTools verification loop is where this gets really powerful. I use a similar pattern for a different use case — AI-generated content rather than code.

I have a local LLM generating stock analysis for thousands of pages, and the validation layer works the same way: generate → validate (range checks on financial metrics, hallucination detection, markdown structure) → reject and retry on failure. Without that loop, the LLM confidently publishes a 9,000% P/E ratio or claims a stock is in the wrong sector.

Your framing of "AI is a multiplier" is the key insight. If you have strong validation and review habits, AI 10x's your output. If you skip verification, it 10x's your bug count. The multiplier cuts both ways.
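
The loop itself is tiny. A stripped-down sketch (the bounds and metric keys are invented; the real version also covers hallucination detection and markdown structure):

```python
def validate(metrics: dict) -> bool:
    """Range check: a sane P/E sits well under 200, so a hallucinated
    9,000% ratio is rejected before it can be published."""
    return 0 < metrics.get("pe_ratio", -1.0) < 200

def generate_with_retry(generate, max_attempts: int = 3):
    """generate -> validate -> reject-and-retry; only validated
    output is ever returned for publishing."""
    for _ in range(max_attempts):
        metrics = generate()
        if validate(metrics):
            return metrics
    return None  # give up rather than publish garbage
```

Returning `None` after exhausting retries is the important design choice: the failure mode is "no page" instead of "confidently wrong page."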

Helder Burato Berto

It's great to see the different uses that are emerging! Thanks for the comment.

Botánica Andina

Spot on about AI not understanding edge cases – it's where human intuition truly shines. I've found it's a fantastic pair programmer for the happy path, but anticipating the weird, unexpected inputs still feels uniquely human. Makes me wonder if that "human judgment" gap will ever fully close, or if it's our permanent role.

genuineswe

IMHO, given enough context, AI can see the architecture. But it doesn't know why that architecture was chosen: the business context, team decisions, budget constraints, and organizational politics.

frontuna

Interesting take — I've been noticing similar issues with structure and cleanup after generation.

Josh

I've heard that Elixir boasts some very impressive stats in this area.