
Miguel Paracuellos

Originally published at Medium

AI Coding Assistants Don't Suck Anymore

The hype is fading, the hallucinations are dropping, and the robots finally feel like teammates instead of toddlers with keyboards.


☕️ Quick Sip Summary

  • New models, fewer face-palms. Claude Sonnet 4 (via Cursor), GPT-4.5, and Gemini 2.5 crank hallucinations down to ~15%.
  • Speed is real, trust is tricky. One controlled study showed a 55% speed boost, but Stack Overflow’s 2025 survey says only 29% of devs actually trust AI output.
  • My reality check: AI now handles the boring 80%, but the final 20% is still very human. I’ve never merged a PR without running tests and giving the code a side-eye.

1. The Day My “Intern” Grew Up

Back in early ’24, AI coding tools felt... twitchy. I remember asking one to build a login page and getting something that not only skipped validation but practically whispered "hey, let’s hardcode a password for fun."

But I didn’t quit on it. I kept using it — more cautiously at first — learning where it helped and where it hallucinated. It was like watching a junior dev slowly grow up. The suggestions started making more sense. The bugs showed up less often. And somewhere between dozens of commits and a few thousand prompts, I realized: I was trusting it more than I thought.

Then came May 2025. Cursor integrated Claude Sonnet 4, and everything clicked. I typed: “Scaffold a Nuxt 3 page listing Stripe invoices with a Tailwind table.”

Two lattes later, I had a fully working page: clean props, sensible Tailwind, pagination built-in, no missing imports. It didn’t just look good — it ran.

Why the glow-up? Partly the models themselves — hallucination rates have dropped sharply — and partly the tooling, with editors like Cursor feeding them real project context instead of a single file. That aligned with what I was feeling: the tool had matured — and maybe so had I.


2. Where AI Shines — And Where It Slips

After months with Sonnet 4, here’s what’s felt like magic — and what still gives me pause:

What works well:

  • Boilerplate scaffolding (components, DTOs)
  • Writing unit tests
  • Repo Q&A like “Where do we parse JWTs?”
  • Project-wide refactors

Where I don’t trust it (yet):

  • Security-critical flows like auth and crypto
  • Perf-sensitive logic
  • Legacy spaghetti with zero documentation
  • Tasks I haven’t mentally designed yet

If I don’t know exactly what I want, the model will happily hallucinate an entire fantasy architecture. Then I get to debug my own laziness at 2 a.m.


3. My Four-Step Prompt Ritual 🙏

Here’s how I talk to the model now:

  1. Set the scene. “You're a senior dev experienced in Nuxt and Stripe.”
  2. Describe the goal. “Implement server-side pagination for /api/invoices.”
  3. Set the stack. “Nuxt 3, Prisma, PostgreSQL, limit 50 rows, return totalCount.”
  4. Guide the scope. “Please outline the steps only.”

From there, I review its plan, give feedback, and go step-by-step through implementation. It feels like pair programming — minus the headphone tugging.
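Step 3 of the ritual works best when I already have the contract in my head before the model does. Here’s a minimal TypeScript sketch of the skip/take/totalCount math I’d expect behind that /api/invoices prompt — the names (`pageParams`, `PageInfo`) are my own for illustration, not from any real codebase; with Prisma, `skip` and `take` map straight onto `findMany`:

```typescript
// Pagination parameters for a "limit 50 rows, return totalCount" endpoint.
// Names here are illustrative, not from a real project.
interface PageInfo {
  skip: number;       // rows to skip (Prisma: findMany({ skip }))
  take: number;       // rows to fetch (Prisma: findMany({ take }))
  totalPages: number; // derived from totalCount, for the UI pager
}

function pageParams(page: number, totalCount: number, pageSize = 50): PageInfo {
  const safePage = Math.max(1, Math.floor(page)); // clamp bad input to page 1
  return {
    skip: (safePage - 1) * pageSize,
    take: pageSize,
    totalPages: Math.max(1, Math.ceil(totalCount / pageSize)),
  };
}

// e.g. page 3 of 120 invoices:
// pageParams(3, 120) → { skip: 100, take: 50, totalPages: 3 }
```

Having this sketched out first means that when the model proposes its plan, I can spot immediately whether its pagination logic is off by one — which is exactly the kind of subtle bug AI optimism loves to ship.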


4. Trust Issues: Everyone’s Using It, Nobody’s Sleeping Easy

  • 80% of devs use AI tools (according to Stack Overflow 2025).
  • Only 29% trust the output unedited.

Reddit is full of rants like “Management wants 20% of commits from Copilot.” One post even mentioned execs tracking prompt counts per day.

That’s not how I roll. I’d rather measure features shipped, not lines of AI-assisted code.


5. The Numbers Don’t Lie

One recent feature:

  • Old-school dev time: ~3h 45m

    • 45m for scaffolding
    • 120m for core logic
    • 60m for tests
  • With Cursor + Sonnet 4: ~1h 45m

    • 5m prompt for scaffolding
    • 90m for core logic (prompts + tweaks)
    • 10m for tests

That’s two hours saved — enough for a gym session, or let’s be real, another coffee and a doomscroll.
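If you want to keep score on your own features, the tally is just a sum of the breakdowns above — a trivial sketch, but handy as a template for logging before/after numbers:

```typescript
// Minutes from the breakdowns above.
const oldSchool = [45, 120, 60]; // scaffolding, core logic, tests
const withAI = [5, 90, 10];      // scaffolding prompt, core logic, tests

const sum = (xs: number[]) => xs.reduce((a, b) => a + b, 0);

const saved = sum(oldSchool) - sum(withAI); // 225 − 105 = 120 minutes
console.log(`${saved / 60} hours saved`);   // → "2 hours saved"
```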


6. Five Things I Wish I’d Known Sooner

  1. Design before you prompt. AI isn’t great at mind-reading.
  2. Break it up. Big tasks become small, accurate prompts.
  3. Test everything. No green CI, no merge.
  4. Sleep on major merges. AI optimism is real — and sneaky.
  5. Don’t ditch juniors. Pair them with AI, then make them explain every line.

7. So… Should You Trust the Robot?

Yes — but only like you’d trust a hyper-literal intern. Brilliant with grunt work. Hopeless with nuance. When I give it structure and oversight, it makes me faster. When I hand it the wheel, it usually drives into a wall of undefined variables.


Your Turn

Are you vibing with the new generation of AI dev tools? Or still fighting ghosts in your PRs?

Drop a story below — horror or happy ending. And if enough folks ask, I’ll share my prompt cheat sheet in the next post.


This article was originally published on Medium:
AI Coding Assistants No Longer Hallucinate — If You Know What You’re Doing
