
Gabriel Anhaia


AI Coding Tools Are Making Developers Dumber. The Data Agrees.


A Confession Heard Round the Industry

The New Stack published a piece recently that hit like a server outage on Black Friday. The title: "I started to lose my ability to code."

Not a bootcamp student. Not someone three months into their first job. A working, shipping, production-touching developer admitting that AI tools had quietly eaten their skills from the inside out. The article went viral for the worst possible reason: recognition. Thousands of developers read it and felt the same queasy flicker of self-awareness.

That collective "oh no, same" should terrify engineering leadership everywhere.

The Tab Key Is a Trap

One number deserves more scrutiny: developers accept AI-generated code suggestions, with minimal modification, somewhere between 40% and 60% of the time. Various studies from GitHub, JetBrains, and independent researchers converge on this range.

That's not a productivity stat. It's a trust stat.

Picture the workflow. Copilot suggests a function body. The variable names look fine. The return type matches. The developer hits Tab. Done. Next problem.

But "looks fine" is doing a terrifying amount of heavy lifting in that sentence. A real-world example shows what "looks fine" can hide. Say a developer needs a function to find duplicate entries in a list of user records:

def find_duplicate_emails(users: list[dict]) -> list[str]:
    """Find emails that appear more than once."""
    seen = set()
    duplicates = set()
    for user in users:
        email = user["email"]
        if email in seen:
            duplicates.add(email)
        seen.add(email)
    return list(duplicates)

The AI generates this. It works. Ship it.

Except: what happens when user["email"] is None? What about mixed casing — is Bob@test.com the same as bob@test.com? What if the dict doesn't have an "email" key at all? A developer who wrote this from scratch would've hit at least one of those edge cases during testing. A developer who hit Tab never thought about them.
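A version written with those edge cases in mind might look like the sketch below. The normalization choices (lowercasing, skipping empty values) are assumptions about the app's requirements, not something the original spec stated — which is exactly the kind of decision a developer should be making consciously rather than delegating to Tab:

```python
def find_duplicate_emails(users: list[dict]) -> list[str]:
    """Find emails that appear more than once, ignoring case and missing values."""
    seen: set[str] = set()
    duplicates: set[str] = set()
    for user in users:
        email = user.get("email")  # tolerate a missing "email" key
        if not email:              # skip None and empty strings
            continue
        normalized = email.strip().lower()  # treat Bob@test.com same as bob@test.com
        if normalized in seen:
            duplicates.add(normalized)
        seen.add(normalized)
    return list(duplicates)
```

Nothing here is hard. The point is that each line answers a question the developer had to ask first.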

The acceptance rate isn't measuring efficiency. It's measuring how fast the skill floor is dropping.

GPS Brain, But for Code

Cognitive scientists have studied what happens to people who rely on GPS for years. The findings are bleak. Heavy GPS users show measurably reduced activity in the hippocampus — the region responsible for spatial memory and navigation. They can't give directions to places they've driven to hundreds of times. Drop them two blocks from home without their phone and they freeze.

The brain didn't forget all at once. It just stopped practicing. Neurons that don't fire together stop wiring together. Use it or lose it isn't a motivational poster. It's neuroscience.

Now apply that same mechanism to programming. Every time a developer describes a problem in a comment and lets the AI fill in the implementation, a small rep gets skipped. One skipped rep means nothing. A thousand skipped reps over six months means the developer can no longer do the exercise.

GPS brain in a terminal looks like this. A developer with intact debugging skills sees this:

$ python app.py
Traceback (most recent call last):
  File "app.py", line 47, in process_batch
    result = transform(item)
  File "app.py", line 31, in transform
    return {"id": item.id, "value": item.score / item.baseline}
ZeroDivisionError: division by zero

Their brain fires immediately: "Okay, item.baseline is zero somewhere. Where does baseline come from? Is it user input? A database default? Let me check the migration file and the model definition."

$ grep -n "baseline" models/item.py
12:    baseline: float = Field(default=0.0)

Found it. Default of 0.0. New records that haven't been calibrated will blow up the division. Fix the default, add a guard clause, write a test, done.
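That fix might be sketched like this (a simplified version — the original model used a Pydantic `Field`, but a dataclass shows the same shape; the `Item` fields come from the traceback above, the rest is assumed):

```python
from dataclasses import dataclass

@dataclass
class Item:
    id: int
    score: float
    baseline: float = 1.0  # was 0.0: uncalibrated records divided by zero

def transform(item: Item) -> dict:
    if item.baseline == 0:
        # Guard clause: fail loudly with context instead of ZeroDivisionError
        raise ValueError(f"item {item.id} has no calibrated baseline")
    return {"id": item.id, "value": item.score / item.baseline}

def test_transform_rejects_zero_baseline():
    # Regression test for the exact failure mode seen in process_batch
    try:
        transform(Item(id=1, score=5.0, baseline=0.0))
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for zero baseline")
```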

That whole chain took maybe three minutes and zero AI involvement. It's pattern recognition built from hundreds of similar debugging sessions.

Now here's the same scenario with an atrophied developer:

$ python app.py
ZeroDivisionError: division by zero

They copy the entire traceback. Paste it into their AI assistant. "Fix this error." The AI suggests adding a try/except block around the division. The developer applies the patch. The error goes away. The actual problem — a bad default in the data model — stays. It'll bite someone else later, probably in production, probably at 2 AM.

The gap between those two developers isn't talent. It's reps.

Code Review Is Becoming a Rubber Stamp

Something weird happens when every pull request contains AI-generated code. The code looks immaculate. Clean variable names. Proper type hints. Consistent formatting. Appropriate patterns.

This is the worst possible thing for code review quality.

Human reviewers are pattern-matching machines. They're trained — through years of reading sloppy code, clever code, broken code — to spot anomalies. But AI-generated code doesn't have anomalies. It has a homogeneous, polished surface that lulls reviewers into skimming mode.

Consider this AI-generated database query in a Node.js app:

async function getUserOrders(userId: string): Promise<Order[]> {
  const orders = await db.query(
    `SELECT * FROM orders WHERE user_id = $1 ORDER BY created_at DESC`,
    [userId]
  );
  return orders.rows.map(row => ({
    id: row.id,
    total: row.total,
    status: row.status,
    createdAt: row.created_at,
    items: JSON.parse(row.items_json),
  }));
}

Clean code. Parameterized query. Proper mapping. A reviewer would glance at this and approve. But there's no pagination. On a user with 50,000 orders, this returns everything. SELECT * grabs every column including potentially large blobs. JSON.parse on items_json will throw if the column is null or malformed. None of these issues are visible at the surface level.

When code always looks like it was written by a thoughtful senior developer, reviewers stop expecting problems. That's not efficiency. That's a dormant bug farm.

The Junior Developer Void

This is where the conversation turns dark.

Senior developers who adopted AI tools at least had years of scar tissue. They'd spent weekends bisecting git history to find regressions. They'd manually traced memory leaks through C++ destructors. They'd stared at hexdumps. Those experiences left permanent neural pathways that AI can erode but hasn't fully destroyed. Yet.

Junior developers entering the field right now? Many of them are building on sand.

A junior dev in 2026 can scaffold an entire full-stack app in an afternoon with AI assistance. Impressive output. Zero learning. They never sat with a broken useEffect for three hours until they actually understood the dependency array. They never hand-wrote a SQL join and got it wrong in four different ways before getting it right. They never implemented a linked list badly, then less badly, then correctly.

The struggle was the education. Strip it away and what remains is someone who can operate tools but can't reason about what the tools produce. That's not a software engineer. That's a dispatcher.

Some people wave this off. "The tools aren't going away. Adapt or die." Sure. But there's a canyon between a developer who chooses AI because they understand the underlying code and one who needs AI because they've never understood it. The first developer catches AI mistakes. The second one ships them.

The Borrowing-From-Tomorrow Problem

AI coding tools increase output. Nobody disputes this. Boilerplate evaporates. Unfamiliar APIs become approachable. Language barriers shrink. A developer with good AI tools ships more code per week than that same developer without them.

But output and capability aren't synonyms.

If a developer's raw problem-solving ability declines by 5% a year while their tool-assisted productivity increases by 20% a year, the dashboards look great. Tickets close faster. Sprint velocity rises. Management is thrilled.
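The divergence compounds fast. A back-of-the-envelope sketch (the 5% and 20% rates are the hypothetical figures above, not measurements):

```python
raw_skill = 1.00    # baseline unassisted problem-solving ability
tool_output = 1.00  # baseline tool-assisted productivity

for year in range(1, 6):
    raw_skill *= 0.95    # hypothetical 5% annual decline
    tool_output *= 1.20  # hypothetical 20% annual gain
    print(f"year {year}: output {tool_output:.2f}x, raw skill {raw_skill:.2f}x")

# After five years the dashboards show roughly 2.5x output
# while unassisted ability has quietly dropped about 23%.
```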

Then the AI hallucinates a subtle authentication bypass in a payment flow. Or the LLM generates a recursive function that works on test data but stack-overflows on production-scale inputs. Or the cloud provider has an outage and the on-call engineer needs to debug a distributed system using nothing but logs and curl.

Nobody notices they can't find their way until the GPS satellite goes dark. And in software, the satellite always goes dark eventually. Production doesn't care about your toolchain.

A Specific Inventory of What's Rotting

"Coding ability" is too vague. Here's what's actually degrading, specifically:

Debugging intuition — the ability to look at a broken system and form a hypothesis in under 30 seconds. This comes exclusively from repetition. There are no shortcuts. AI skips the reps.

System-level mental models — holding an architecture in your head, tracing data flow across service boundaries, predicting what breaks when component X goes down. Developers who outsource implementation stop building these maps.

Critical code reading — not syntax checking, but understanding intent. Recognizing that a caching layer is missing. Spotting that a retry mechanism will cause a thundering herd. This atrophies when most code a developer reads was generated by a model optimized for "looks correct."

Problem decomposition — breaking a monster task into small solvable pieces. When the AI can swallow a vague prompt and spit out a complete solution, developers stop exercising the muscle that makes them architects rather than typists.

Algorithmic reasoning — understanding why a hash map beats a sorted array for this specific access pattern. Knowing when O(n log n) matters and when it doesn't. These concepts fade when something else always picks the approach.
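That last point is concrete enough to demonstrate. A minimal illustration of why a hash-based structure wins for repeated membership checks (absolute timings will vary by machine; the ratio won't):

```python
import timeit

items = list(range(100_000))
as_list = items       # membership is an O(n) linear scan
as_set = set(items)   # membership is O(1) on average via hashing

needle = 99_999       # worst case for the list: scans every element

list_time = timeit.timeit(lambda: needle in as_list, number=100)
set_time = timeit.timeit(lambda: needle in as_set, number=100)

print(f"list: {list_time:.4f}s, set: {set_time:.6f}s")
```

A developer who has internalized this picks the right structure without thinking. One who hasn't ships the list and waits for the slow-endpoint ticket.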

What Actually Helps

Identifying rot is easy. Reversing it is the hard part.

Regular AI-free reps. Not as punishment. As practice. Once a week, pick a small task and do it from scratch. Write the function. Debug the error. Read the docs. Keep the neural pathways warm. Athletes don't skip drills because games exist.

Structured junior onboarding with mandatory struggle. Companies hiring entry-level developers need explicit AI-free phases. Let new engineers write terrible code. Let them debug it for hours. Let them feel the frustration of a race condition they don't understand. Then — after the foundation exists — hand them the AI tools.

Harder code reviews, especially for AI-generated code. If a PR was generated with AI assistance (and it should be labeled as such), reviewers should increase scrutiny, not decrease it. Ask: "What happens when this input is null? What's the time complexity? Where's the test for the unhappy path?"

AI tools should teach, not just produce. The best outcome isn't an AI that writes code for developers. It's an AI that explains its reasoning, highlights trade-offs, and asks "are you sure about this edge case?" Some friction is a feature. The developers who resist that friction are the ones who need it most.

# A simple practice routine for any developer
# Pick one per week. No AI allowed.

Monday:    Debug a failing test by reading the stacktrace only
Tuesday:   Implement a small utility function from memory
Wednesday: Review a PR and write 3 substantive comments
Thursday:  Explain a system's architecture on a whiteboard
Friday:    Read 200 lines of unfamiliar open-source code

Companies need to test for actual understanding. Technical interviews that test "can you solve this with Copilot" are worthless. Test whether a candidate can read code, spot bugs, explain trade-offs, and reason about systems. The bar shouldn't drop because the tools improved.

The Uncomfortable Ending

The tech industry is running an uncontrolled experiment on millions of developers simultaneously. Short-term metrics look fantastic. Velocity is up. Output is up. Confidence is up.

Underneath those metrics, something is quietly hollowing out. The cognitive abilities that separate a developer from a code-accepting terminal are weakening. You can see it in the developer who freezes when Copilot goes down. In the review that waves through a SQL injection. In the junior who's shipped fifty PRs but has never truly debugged anything alone.

AI coding tools aren't bad. GPS isn't bad either. But a delivery driver who can't read a street sign is a liability, and an engineer who can't debug a stack trace is a time bomb.

The developers who'll still be employable in five years aren't the fastest AI adopters. They're the ones who use AI tools while keeping their own skills sharp enough to catch when the AI is wrong. That requires discipline. It requires occasionally choosing the slower path on purpose. It requires admitting that convenience has a cost.

The alternative is an industry full of developers who are astonishingly productive right up until the moment they need to actually think.

That moment always comes. And it never schedules itself in advance.


Have you noticed your own skills slipping? Or do you have a practice routine that keeps them sharp? Drop it in the comments.
