Nova Elvaris

I Tracked Every AI Suggestion for a Week — Here's What I Actually Shipped

Last week I ran an experiment: I logged every AI-generated code suggestion I received and tracked which ones made it to production unchanged, which ones needed edits, and which ones I threw away entirely.

The results surprised me.

The Setup

  • Duration: 5 working days
  • Tools: Claude and GPT for code generation, Copilot for autocomplete
  • Project: A medium-sized TypeScript backend (REST API, ~40 endpoints)
  • Tracking: Simple markdown file, one entry per suggestion

The Numbers

| Category | Count | Percentage |
| --- | --- | --- |
| Shipped unchanged | 12 | 18% |
| Shipped with edits | 31 | 47% |
| Thrown away | 23 | 35% |
| **Total suggestions** | **66** | **100%** |

Only 18% of AI suggestions shipped without changes. Almost half needed editing. And over a third were useless.

What Got Shipped Unchanged

The 12 suggestions that shipped as-is had something in common: they were small and well-specified.

  • Unit tests for pure functions (given a clear function signature)
  • Type definitions from a schema description
  • Utility functions with obvious behavior (slugify, debounce, date formatting)
  • Regex patterns with clear requirements

Pattern: The more constrained the task, the better the output.
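To make "small and well-specified" concrete, here's a sketch of the kind of task that landed in this category. The `slugify` function below is illustrative, not code from my project:

```typescript
// Illustrative example of a "small, well-specified" task that AI
// tends to get right: convert a string into a URL-safe slug.
export function slugify(input: string): string {
  return input
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation
    .replace(/[\s-]+/g, "-")      // collapse whitespace and hyphen runs
    .replace(/^-+|-+$/g, "");     // trim leading/trailing hyphens
}
```

The signature and behavior are fully determined by a one-line description, which is exactly why suggestions like this shipped untouched.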

What Needed Edits

The 31 "shipped with edits" suggestions fell into predictable categories:

  • Wrong error handling (14 cases): AI almost always generates optimistic code. Try/catch blocks that log and continue instead of throwing. Missing null checks on database results.
  • Wrong abstraction level (9 cases): AI tends to over-abstract. Creating a class where a function would do. Adding config options nobody asked for.
  • Subtle logic bugs (8 cases): Off-by-one errors, incorrect date comparisons, missing edge cases in conditionals.
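The error-handling category was the biggest, so here's a hedged sketch of what those edits usually looked like. The names and shapes are hypothetical, not from my codebase:

```typescript
// Hypothetical example of the most common edit: AI-generated code
// that assumes a database lookup always succeeds.

interface User {
  id: string;
  email: string;
}

// The optimistic version AI tends to produce: the non-null assertion
// hides the failure, so a missing user becomes a TypeError downstream.
function getUserEmailUnsafe(users: Map<string, User>, id: string): string {
  return users.get(id)!.email;
}

// The edited version: make the missing-row case explicit.
function getUserEmail(users: Map<string, User>, id: string): string {
  const user = users.get(id);
  if (!user) {
    throw new Error(`User not found: ${id}`);
  }
  return user.email;
}
```

The fix is trivial, but you have to notice it's missing, and 14 times in one week that adds up.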

What Got Thrown Away

The 23 rejected suggestions shared patterns too:

  • Hallucinated APIs (7 cases): Functions that don't exist in the library version I'm using.
  • Wrong architecture (6 cases): Solutions that technically work but violate project conventions.
  • Overcomplicated (5 cases): A 40-line solution for a 5-line problem.
  • Just wrong (5 cases): Logic that doesn't match the requirement at all.

The Real Insight

I spent roughly 45 minutes per day on AI-assisted coding. My estimate of time saved (vs. writing everything manually): about 90 minutes per day.

Net gain: ~45 minutes/day, or about 3.75 hours across the 5-day week.

That's real, but it's not the 10x productivity boost people claim. And it requires active review effort — the "savings" assume you catch the bugs before they ship.

What I Changed After This Experiment

  1. Stopped using AI for complex logic. If I need to think hard about the algorithm, I write it myself. AI is best for boilerplate and well-defined transformations.

  2. Started writing specs before prompting. Even a 2-line spec ("takes X, returns Y, handles Z") dramatically improved the "shipped unchanged" rate.

  3. Set a 3-minute rule. If I'm spending more than 3 minutes editing AI output, I delete it and write from scratch. It's faster.
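To illustrate point 2, here's the kind of 2-line spec I mean, with the function it produced. The example is hypothetical, but the shape is accurate: inputs, output, and the edge case, stated up front:

```typescript
// A 2-line spec of the kind described above (hypothetical example):
//   truncate(s, max): takes a string and a max length; returns s unchanged
//   if it fits, else cuts to max-1 chars plus "…"; handles max <= 1.
function truncate(s: string, max: number): string {
  if (max <= 1) return s.slice(0, Math.max(0, max));
  return s.length <= max ? s : s.slice(0, max - 1) + "…";
}
```

Naming the edge case ("handles max <= 1") is what moves a suggestion from "shipped with edits" to "shipped unchanged".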

Try It Yourself

Track your AI suggestions for one week. Just a simple log: accepted / edited / rejected. You might be surprised how much time you're spending on the "editing" step.
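If you want to automate the tally, here's a minimal sketch. It assumes a log where each line starts with a status word (`accepted` / `edited` / `rejected`); that format is my invention, so adapt it to however you actually track:

```typescript
// Minimal tally script for a suggestion log. Assumes one entry per
// line, starting with "accepted", "edited", or "rejected" (this log
// format is hypothetical — use whatever convention fits your notes).

type Status = "accepted" | "edited" | "rejected";

function tally(log: string): Record<Status, number> {
  const counts: Record<Status, number> = { accepted: 0, edited: 0, rejected: 0 };
  for (const line of log.split("\n")) {
    const status = line.trim().split(/\s+/)[0] as Status;
    if (status in counts) counts[status]++;
  }
  return counts;
}

const sampleLog = `
accepted unit test for parseDate
edited   retry logic needed a null check
rejected hallucinated fs.readJson API
edited   over-abstracted config class
`;

console.log(tally(sampleLog)); // { accepted: 1, edited: 2, rejected: 1 }
```

One entry per suggestion, a word of context after the status, and the numbers fall out at the end of the week.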


What's your accept rate? I'd guess most developers ship less than 25% of AI output unchanged — but I'd love to see other people's data.