“I see you’re trying to show me the contents of that file, but it appears to be empty. Ah, now I see the issue. Let me try a different approach to debug this…”
— Cursor Agent 250511.
Alright, intro!
Let’s talk about the latest ripple making waves across the tech pond. You probably caught the buzz last week when Shopify’s CEO, Tobi Lütke, dropped an internal memo that basically set a new baseline: using AI isn’t just encouraged anymore, it’s practically expected. The memo even suggested teams need to justify not using AI before asking for more resources or headcount. Forbes quickly jumped on this, slapping a catchy (and maybe slightly terrifying?) “Hire AI, Not Humans” headline on it.
Now, before we dive deep, let’s be clear: this isn’t another think piece about the grand future of AGI, whether LLMs will pass the Turing test next Tuesday, or how the job market is going to morph entirely (though, yeah, those are big questions!).
Instead, I want to zoom in on something super relevant to those of us actually building things: the hype around Agentic Coders, or “Vibe Coding” as it’s now trendy to call it. The promise is seductive, right? If AI can whip up graphics, translate languages on the fly, or slash presentation prep time, surely it can handle software engineering? Can startups really run lean with 5 devs instead of 50, like some breathless posts suggest? And personally, as an engineer, is this the silver bullet that finally unlocks that mythical “10x developer” status we’ve all heard about?
(Co-pilot won’t let you create Studio Ghibli style illustrations any longer, but ChatGPT still does…)
Here we go: My Reality Check on “Vibe Coding”
To find out, I rolled up my sleeves and spent the last couple of months putting six different Vibe Coding products through their paces on real work. Here’s the lowdown from the trenches:
The Not-So-Groovy Parts (Where the Vibe Gets Weird):
- Hallucination Station: Oh boy. I saw AI agents confidently referencing libraries that simply don’t exist, either in my codebase or anywhere on npm/PyPI/etc. They’d suggest calling phantom methods on perfectly good, existing libraries. They even started inventing configuration keys in my YAML and JSON files like they were writing avant-garde poetry for my linters.
- Fantasy Code Chronicles: In one memorable instance, the agent decided the best way to parse something was with a complex regex. Anyone who’s wrestled with parsing knows: that’s usually a job for a lexer/parser combo (see the sketch after this list). It wasn’t just suboptimal; it was the wrong tool for the job.
- Regex Roadblock & Model Hopping: Speaking of regex, another time the agent wrote a flawed regex pattern… and then got completely stuck trying to fix its own mistake. I eventually had to switch to a different AI model (looking at you, o3-mini) just to get past it.
- The “Ship It” Temptation: This is a sneaky one. The AI spits out code that looks plausible. It’s super tempting for a dev (especially one under pressure) to just grab it and move on without truly understanding the how or why. That’s tech debt waiting to happen, and good luck debugging it later if you didn’t grasp it initially.
- Code Quality — Duplication Drama: I saw the same logic rewritten multiple times in slightly different ways, leading to classic code duplication. Not exactly DRY (Don’t Repeat Yourself).
- Code Quality — Junior Moves: Experienced devs spend tons of time refactoring, generalizing, using inheritance, tweaking existing functions and classes. The agents? They often behaved like enthusiastic junior devs, preferring to “invent the world” by writing brand new code for every little thing instead of building smartly on what’s already there.
- Silent Breaking Changes: One agent tweaked my linter config. It seemed innocent enough, until I realized it necessitated small, annoying changes across the entire codebase. And the agent? Blissfully unaware of the ripple effect it caused.
- Library Loyalty Issues: I use a specific, slightly less common open-source library (deepkit/types). The agent seemed determined to either replace it entirely or just ignore it, defaulting to more mainstream choices like Joi or even React libraries (in a backend context!). This got really obvious when there was a bug — instead of fixing the code using my preferred library, it just rewrote the whole section using a different library. Not helpful!
- Security? What Security?: This was genuinely concerning. Agents generated code that introduced vulnerabilities — specifically, forgetting to add input validation for data coming from external sources and neglecting to implement authorization checks on newly created API endpoints (see the endpoint sketch after this list). Yikes.
- Lost in Translation (Jargon & Ambiguity): I often slipped up, using slightly “wrong” or ambiguous terms in my prompts. Asking for changes related to a ‘privilege’ when my code uses the term ‘permission’. Mentioning ‘swagger’ instead of the precise ‘OpenAPI’. Using internal shorthand like ‘CVE’ interchangeably with ‘vulnerability’ or ‘weakness’. A human teammate, even a junior one, would likely pause, ask for clarification, or realize a simple request shouldn’t require building a whole new universe of code. The agent? It often went full bazooka, interpreting a minor tweak request as a signal to construct entirely new models and libraries.
- The Ripple Effect (also a Lost in Translation): Tools like Morph.ai promise seamless integration, letting agents act directly on JIRA tickets. Cool concept, but now your Product Managers and Business Analysts need to be super precise with their language, basically speaking the same internal dialect as your codebase. If a PM writes a story about changing a ‘product’ but your system calls it a ‘catalog ID’, the agent might get hopelessly confused or build the wrong thing entirely.
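To make the regex point above concrete, here’s a minimal sketch of the kind of input where a flat pattern gives up and a tiny recursive-descent parser doesn’t. It’s TypeScript because that’s what my codebase is in; the grammar, the names, and the lack of error handling are all invented for illustration, not the agent’s actual output:

```ts
// A regex handles the flat case fine...
const flatCall = /^(\w+)\(([^()]*)\)$/;
console.log(flatCall.test("f(x, y)"));    // true
console.log(flatCall.test("f(g(x), y)")); // false – it can’t track nested parentheses

// ...while a small recursive-descent parser handles nesting naturally.
type Node = { name: string; args: Node[] } | { value: string };

function parseCall(input: string): Node {
  let pos = 0;

  function parseExpr(): Node {
    const start = pos;
    while (pos < input.length && /\w/.test(input[pos])) pos++; // read an identifier
    const word = input.slice(start, pos);
    if (input[pos] !== "(") return { value: word };            // plain argument

    pos++; // consume "("
    const args: Node[] = [];
    while (input[pos] !== ")") {
      args.push(parseExpr());                                  // recurse into nested calls
      if (input[pos] === ",") pos++;                           // consume ","
      while (input[pos] === " ") pos++;                        // skip spaces
    }
    pos++; // consume ")"
    return { name: word, args };
  }

  return parseExpr();
}

console.log(JSON.stringify(parseCall("f(g(x), y)")));
// {"name":"f","args":[{"name":"g","args":[{"value":"x"}]},{"value":"y"}]}
```

The real case was messier, but the shape of the problem was the same: balanced, nested structure, which regular expressions fundamentally can’t count.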
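And to spell out the security point: the first handler below is roughly the shape the agents kept producing, where external input flows straight into the logic and anyone who can reach the route can call it; the second is what I’d expect instead. This is a hedged Express/zod sketch with made-up names (requireAuth, findOrders, the userId schema), not a transcript of real agent output:

```ts
import express from "express";
import { z } from "zod";

const app = express();

// Hypothetical stand-ins for whatever your app actually has.
const requireAuth = (req: any, _res: any, next: any) => {
  req.user = { id: "user-123", isAdmin: false }; // pretend a session/token was verified here
  next();
};
const findOrders = async (userId: string) => [{ userId, total: 42 }];

// Roughly what the agent generated: no input validation, no authorization check.
app.get("/orders", async (req, res) => {
  // userId comes straight from the client, and any caller who can reach the route gets an answer.
  res.json(await findOrders(String(req.query.userId)));
});

// What I expected: validate the external input, then check the caller is allowed to see the data.
const querySchema = z.object({ userId: z.string().uuid() });

app.get("/v2/orders", requireAuth, async (req, res) => {
  const parsed = querySchema.safeParse(req.query);
  if (!parsed.success) {
    return res.status(400).json({ error: "invalid userId" });
  }
  const user = (req as any).user; // attached by requireAuth above
  if (user.id !== parsed.data.userId && !user.isAdmin) {
    return res.status(403).json({ error: "forbidden" });
  }
  res.json(await findOrders(parsed.data.userId));
});
```

Neither omission is exotic; it’s the kind of thing a reviewer catches in seconds, which is exactly why shipping agent output unreviewed worries me.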
Honestly? There were definitely moments I felt like I was wasting more time, painstakingly explaining the requirements, trying to guide the refactoring process, pointing out missed edge cases. Absolutely had a few “why didn’t I just write this myself?!” breakdowns, cursing the digital void.
And yeah, you could argue that some of these issues — misunderstanding requirements, needing guidance on refactoring — sound like onboarding a new human team member. True! But here’s the kicker: the agent delivers its answers with absolute, unwavering confidence. It never doubts itself. Imagine having a teammate who knows they’re right, even when they’re completely off-base. Annoying, right? You wouldn’t want that vibe on your team.
But What About Those Slick Demos?
We’ve all seen them — those mind-blowing demos from tools like v0 or Lovable, where an entire app seemingly springs into existence from a simple prompt. And yes, they are awesome! For whipping up a quick prototype, testing a small concept, or scaffolding something based on simple templates, they can definitely work and speed things up.
But scaling that to a real-world, production-grade application? One with multiple microservices, hundreds or thousands of files, strict linting rules, established coding patterns, and specific company best practices? Sorry, but we’re not there yet. Check back next year, maybe?
Performance Puzzles:
Can these agents help optimize algorithms or find performance bottlenecks? Sometimes, maybe. They might point you in a vaguely correct direction. But just as often, I found they’d latch onto my (potentially) flawed hunch and dig the rabbit hole even deeper, suggesting ineffective fixes or repeating suggestions I’d already ruled out.
Okay, Okay — FOCUS! Where Does Vibe Coding Actually Shine?
It’s not all doom and gloom! Based on my experiments, these tools are genuinely useful for specific tasks:
- Unit Test Automation: This is a big one. Pointing an AI at existing code and asking it to generate unit tests? It’s often surprisingly good at this, saving a ton of boilerplate time (there’s a small example after this list).
- Refactoring Assistance (with Guardrails): Once you have those solid tests, AI can be a great partner in refactoring older, gnarlier code. It can help explain undocumented logic and then assist in rewriting it using more modern approaches. Crucially, the tests keep it honest.
- Reducing Handoff Friction: Need to jump into a part of the codebase you’ve never touched before? An AI agent can help you add small features or make targeted changes without needing a deep, upfront understanding of the entire module. Think code bits, not massive features.
- Language Learning Accelerator: If you’re a JavaScript pro needing to dabble in Java (or vice-versa), these tools are fantastic tutors, helping you translate concepts and syntax quickly.
- Automated Code Review Buddy: When you think your code is ready, running it past an AI reviewer can surface genuinely insightful suggestions for improvement, catching things you might have missed. Like, 25% good suggestions and 75% nagging. But still.
- Skip Google, Forget Stack Overflow: This one is 75% good for the Viber. If you know what your question should be, it will give you a concise answer, sparing you the endless scrolling on the internet.
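To give a flavor of the unit-test point above: hand the agent a small utility like the one below and ask for tests, and what comes back usually looks a lot like the Jest suite underneath, happy path plus the obvious edge cases. The function and the cases here are invented for the example (and assume Jest globals), not lifted from my codebase:

```ts
// slugify.test.ts – the utility plus the kind of Jest suite an agent generates for it.

function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into "-"
    .replace(/^-+|-+$/g, "");    // strip leading/trailing dashes
}

describe("slugify", () => {
  it("lowercases and hyphenates words", () => {
    expect(slugify("Hello World")).toBe("hello-world");
  });

  it("collapses punctuation and repeated separators", () => {
    expect(slugify("  Hello,   World!! ")).toBe("hello-world");
  });

  it("returns an empty string for empty input", () => {
    expect(slugify("")).toBe("");
  });
});
```

It’s boilerplate, sure, but it’s exactly the boilerplate I’d rather not type, and it becomes the safety net for the refactoring point below.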
The Bottom Line: Peer Programmer, Not Replacement (Yet?)
Let’s crunch some rough numbers. Assume a developer spends about 5 hours a day actually heads-down coding (the rest is meetings, design, reviews, learning, lunching, etc.). If AI tools make that coding time, say, 30% more efficient — which feels plausible to me — that translates to roughly a 15–20% overall productivity boost for a typical developer.
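(Back-of-the-envelope: a 30% gain on roughly 5 of 8 working hours is about a 19% boost if you count it as extra output in the same time, or closer to 14% if you count it as the same output delivered in 5 / 1.3 ≈ 3.8 hours; either way it lands in that 15–20% band.)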
That’s huge! It’s a significant improvement and absolutely justifies Tobi Lütke’s stance that reflexive AI usage should be the baseline.
But does it mean we’re replacing devs wholesale or creating legions of 10x engineers overnight? Nope. Not anytime soon. It’s powerful augmentation, a new tool in the belt that makes good developers better and faster at certain things. It helps, it assists, it speeds up knowledge building. But the human element — the critical thinking, the architectural vision, the nuanced understanding of requirements, the collaborative problem-solving, the ability to doubt and question — remains absolutely essential.
So yeah, embrace the AI assistants. Expect them to be part of the workflow. But keep your brilliant human developers close — you’re going to need them more than ever to wield these powerful new tools effectively.