I audited a codebase written by Devin 3.0. It was a nightmare.

Saqib Shah on March 05, 2026

We aren't just shipping features faster; we are shipping technical debt faster. If you treat AI as an architect instead of an intern, you are build...
crevilla2050

I have 25+ years of programming experience that have taught me a lot about dealing with teammates, interns and clients. I have learned to work alongside AI very well: I plan, design, make schemas and review them with the AI. Sometimes it opens up ideas I hadn't thought about, or points out potential risks, and I go back to the drawing board. Once I am happy with the flow and result, I write simple pseudo-code and tell the AI to translate it into what we planned, and very rarely end up with spaghetti code. In fact, I am going back to old projects and "improving them" thanks to AI, reducing code size a lot. You are right, AI should be treated as an intern, not as a senior architect. That's our job as developers. Great article, regards.
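The pseudo-code → implementation step described above might look something like this. A minimal sketch: the function name, the pseudo-code, and the data shape are all invented for illustration, not taken from the commenter's projects.

```typescript
// The pseudo-code is kept as a comment so the design intent and the
// generated implementation live side by side:
//
// PSEUDO:
//   for each order in orders
//     if the order is paid, add its total to the sum
//   return the sum
interface Order {
  paid: boolean;
  total: number;
}

function totalPaid(orders: Order[]): number {
  return orders
    .filter((o) => o.paid)           // keep only paid orders
    .reduce((sum, o) => sum + o.total, 0); // sum their totals
}
```

Reviewing the AI's translation against the pseudo-code comment is then a simple line-by-line check, which is much easier than reverse-engineering generated code with no stated intent.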

Saqib Shah

This comment is gold.

The workflow you described (Plan → Schema → Pseudo-code → AI) is exactly what’s missing in modern 'vibe coding.' You are using AI as a force multiplier, not a crutch.

Love the point about reducing code size in old projects—that’s the ultimate proof of using AI correctly. Thanks for sharing this valuable perspective!

Comment deleted
Saqib Shah

Appreciate that! Just trying to bring some sanity back to the development process.

Tommy Leonhardsen

The cleanup problem is itself an AI job — you just need the right tool for each step.

I run GLM-5 over messy codebases first. It's ruthlessly good at spotting inconsistent patterns, dead logic, and redundant abstractions — better than Sonnet/Opus on pure code review in my experience. I dump its findings into a structured Markdown file: what's wrong, where, and why.

That file goes straight into Claude Code as context, with Opus handling the actual refactor. Opus has the reasoning depth to hold the whole picture and collapse 400-line handlers into something coherent without breaking correctness.

The Markdown handoff is the key. GLM-5 deciphers the mess. Claude Code fixes it. No manual cleanup sprints needed.

Saqib Shah

That’s actually a really solid pipeline. Using GLM-5 to audit and Opus to refactor makes a lot of sense.
But my main concern is still ownership. If AI writes the code and another AI cleans it, the dev is completely out of the loop. When a bug hits in production, debugging a system you never truly understood is still a nightmare.

deep mishra

I think the key difference is whether the AI is being used as a copilot or a replacement. When it’s assisting a developer, the results are usually great. When it’s left to generate entire systems, you often end up with exactly the kind of mess you described.

Saqib Shah

Exactly. It comes down to intent.

'Help me optimize this function' = Engineering.
'Build this entire system for me while I grab coffee' = Gambling.

Great distinction!

Comment hidden by post author; thread accessible via permalink
0x0F8

Why are you using "UNKNOWN"? It's just wasting space. You shouldn't include the name at all if it's undefined; the "unknown" string should render conditionally at the end, in the pipe.

Saqib Shah

Technically, you are advocating for Separation of Concerns (keeping data raw and handling fallbacks in the UI/Pipe). That is a valid architectural pattern.
However, in this specific transformation layer, I prefer Data Predictability.
I want the consumer of this function to always receive a string, not string | undefined. Sanitizing nulls here simplifies the downstream logic, so the UI doesn't have to guess if a field is missing.
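The "Data Predictability" approach described here can be sketched as follows. This is a hypothetical example: the RawUser/UserViewModel shapes and the UNKNOWN sentinel are invented for illustration, not the author's actual code.

```typescript
// Raw API data may omit or null out fields.
interface RawUser {
  name?: string | null;
}

// The mapped model always exposes a plain string, so the UI
// never has to branch on undefined.
interface UserViewModel {
  name: string;
}

const UNKNOWN = "UNKNOWN";

function toViewModel(raw: RawUser): UserViewModel {
  // The fallback is applied once, in the transformation layer,
  // instead of conditionally in every template or pipe.
  return { name: raw.name ?? UNKNOWN };
}
```

The trade-off the thread debates is real: this centralizes the fallback, but it also bakes a display string into the data layer, so a client that wants different fallback text has to compare against the sentinel.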

0x0F8

It doesn't actually simplify anything; it makes things worse. If the client wants to render "unknown" without caps, they have to check the string anyway, and if you change the string on the backend, every client breaks. Just admit your pattern is flawed.

0x0F8

Also, stop using AI for your replies; you sound like a bot.

Harsh

100% agree. AI is great for boilerplate and prototyping, but treating it as an architect is like letting an intern design the database schema. Fast now, firefighting later.

Saqib Shah

That analogy is terrifyingly accurate. 😅

Letting an intern design the schema works fine... until the first JOIN query kills the production server. 'Firefighting later' is the perfect summary.

Shekhar Rajput

Never heard of Devin again after that scandal.

Saqib Shah

You're referring to the 2024 demo controversy (the Upwork fiasco). Totally valid point—the early hype was definitely misleading.

But here in 2026, the problem isn't that they 'faked' it—it's that they actually shipped it. People ARE using these agents now, and while they work better than the 2024 demos, they are generating massive technical debt hidden behind 'successful' PRs. That's the crisis I'm highlighting.

Hakan Ünal

Good

Saqib Shah

Glad you found it useful!

Some comments have been hidden by the post's author.