DEV Community

Arthur

Originally published at pickles.news

What We Lose When Coding Becomes Reviewing

Buried in a recent Anthropic research note is one of the more honest sentences a major AI vendor has published this year:

"One reason that the atrophy of coding skills is concerning is the 'paradox of supervision' … effectively using Claude requires supervision, and supervising Claude requires the very coding skills that may atrophy from AI overuse."

Anthropic, a company whose primary product is a coding agent, is naming the structural contradiction at the center of the workflow it sells. Effective AI use requires supervision. Supervising requires coding skill. Coding skill atrophies under sustained AI use. The contradiction is not subtle. It is also not, on the evidence available in 2026, theoretical.

Lars Faye's Agentic Coding Is a Trap, which lit up the front page of Hacker News this past weekend at 398 points and 316 comments, is the best single compendium of the cognitive-debt evidence base I've come across. Faye walks through the studies, names the trade-offs, and lands on a personal-discipline conclusion: demote AI's role. Use it as a research and spec-helper. Stay on the keyboard for implementation. Don't generate more code than you can review in a sitting. The piece does the work I've been waiting for someone to do — collect the receipts, in one place, on what sustained agentic-coding workflows are doing to the skill formation underneath them.

The piece is also limited, in a useful way: Faye's prescription is for the individual practitioner. The piece I want to write — the piece the receipts make possible — is the one about what happens to the industry-level supply chain of senior engineering judgment when the path that produced senior engineers gets quietly disconnected.

What the studies actually say

The cognitive-debt evidence base, as Faye assembles it, is unusually well-credentialed. MIT Media Lab's Your Brain on ChatGPT found measurable cognitive impact in heavy LLM users — "cognitive debt" in their phrasing, accumulated through repeated outsourcing of synthesis tasks. A Microsoft study covered by 404 Media reached a parallel finding for knowledge workers more broadly. Anthropic's own coding-skills research, as Faye summarises it, showed a 47% drop-off in debugging skills among engineers leaning heavily on AI-assisted workflows. None of these are anti-AI studies. All of them are honest accountings of a measurable effect, performed by parties with no incentive to overstate it.

Simon Willison — by his own description an "almost daily user for nearly three years" of LLMs and the most prolific public chronicler of working-developer LLM use — posted in February about the version of this he was experiencing personally:

"… not having a firm mental model of what the applications can do and how they work, which means each additional feature becomes harder to reason about." (Quoted here as Faye renders it; I have not yet checked the wording against Willison's original post.)

That is not a casual observation from a casual user. Willison is the writer who, more than almost anyone, has documented the good uses of these tools. When he reports loss of the mental model on the systems he has built with their help, the report is load-bearing.

Coding as thinking

The mechanism the studies measure is not just skill atrophy in the sense of forgotten syntax. It is the loss of a particular cognitive practice — the practice of working through a problem in code, rather than working out a problem and then asking a tool to write code for it.

Dax, the creator of OpenCode — an open-source coding agent, of all the people one might expect to disagree — put the case for hand-coding cleanly in a recent interview about Spec-Driven Development:

"When working on something new or something challenging, me typing out code is the process by which I figure out what we should even be doing. I have a really tough time just sitting there, writing out a giant spec on exactly how the feature should work. I like writing out types. I like writing out how some of the functions might play together. I like playing with folder structure to see what the different concepts should be. And this is all stuff that I think most people — most programmers — have always done. I don't really see a good reason why I would stop that personally, because it's how I figure out what to do."

The point is not that everyone codes this way. Some programmers genuinely think first, then write. The point is that some programmers think with code, and for those programmers — possibly most of them — replacing the medium replaces the thinking. The orchestrator-with-spec workflow that agentic systems push toward is a deliberate redesign of the cognitive practice, not just of the typing. Whether that redesign serves the engineer is an empirical question the studies are starting to answer in the negative.

Faye's secondary observation lands in the same territory: "Speed is a natural byproduct of high aptitude. When it's forced, it always leads to lower accuracy." The agentic-coding workflow forces the speed. The aptitude erodes underneath it.

What this looks like from inside the senior bracket

The Hacker News thread on Faye's piece is unusually heavy on testimony from inside the senior-engineering bracket — the population least likely, on the standard story about AI tools, to be hurt by them. The most clarifying single comment came from a developer with 25+ years of experience, recounting a meeting their team had pulled them into:

"The questions came flying in fast, without any introduction, and this was about an external integration out of a dozen. They have their own lingo, different from ours, to make the situation worse. I had a very hard time making sense of the questions, as I indeed relied heavily on a model to produce these integrations (extremely boring job + external thick specs provided). […] I haven't felt so clueless and embarassed in a meeting, ever. All I could say was 'I'll get back to you on that one, and that one, and this one'. Cognitive debt is very real, and it hurts worse than technical debt on a personal level! Tech debt is shared across the team, cognitive debt is personal, and when you're the guy that built the thing, you should know better!"

This is what the paradox of supervision looks like as a lived account from inside the population the workflow is supposed to serve best. Twenty-five years of experience plus an LLM produced the integration. Twenty-five years of experience minus the friction of producing the integration produced an engineer who could not answer questions about it under pressure. The seniority and the cognitive debt accrued in the same workflow, in the same person, on the same project.

Sandor Nyako, a Director of Software Engineering at LinkedIn who oversees fifty engineers, has reportedly responded to the same pattern by asking his team not to use AI tools "for tasks that require critical thinking or problem-solving." Note the framing — not a ban on the tools, a ban on outsourcing a specific cognitive function to the tools. "To grow skills, people need to go through hardship," Nyako is quoted as saying. "They need to develop the muscle to think through problems." And separately: "How would someone question if AI is accurate if they don't have critical thinking?" The question answers itself. They wouldn't.

The supply-chain problem nobody is solving

Faye's central insight, and the one I think the industry has not yet absorbed, is the junior-engineer supply-chain question.

The senior engineers Sandor Nyako manages were made by something. The veteran with 25+ years of experience in the HN thread was made by something. Simon Willison was made by something. The thing that made all of them, across very different specialties, organizations, languages, and eras, was decades of friction: writing code; debugging code; reading code; reviewing code; arguing about code; refactoring code that didn't work; rewriting code that worked but shouldn't have; learning, in the slow physical way human cognition learns, what good and bad code feel like at the keyboard.

The skill-formation pathway looks structurally different on either side of the agentic-coding watershed:

| Skill area | How it formed before agentic coding | What forms now under "review the diff" workflows | What this means for the next senior cohort |
| --- | --- | --- | --- |
| Reading unfamiliar code | Hours/days mapping a strange codebase to understand it well enough to change one line | Quick scan of an LLM-generated diff in a review pane; symptoms-first, not structure-first | Atrophies the structural-mapping muscle; junior reviewers cannot pattern-match unfamiliar code without a model |
| Debugging | Hypothesise → instrument → narrow → fix; the loop is internalised over thousands of bugs | Paste failure into chat → accept candidate fix → check passing test | Skips the hypothesise step; engineers learn fixes but not failure modes |
| API / library mental model | Read the docs, write a small example, run it, hit edge cases, build a mental model | Ask the model "how do I use library X for Y," paste the answer, move on | Mental model lives in the model's weights, not the engineer's head; fine until the engineer has to defend it under question |
| Writing your first version of something | Many wrong drafts, refactored under critique from peers and seniors | Generated draft accepted with light edits; critique cycle is shorter and shallower | The taste-formation that comes from being told "this works but is bad" doesn't accumulate |
| Specification reading | Read the spec, build it, hit ambiguities, force precision through implementation | Spec → model → integration → working code on the first try if the spec is tractable | Engineers can ship integrations they cannot answer questions about: exactly the cognitive-debt failure mode the HN comment above describes |

Reviewing AI-generated code is part of that friction, but it is, in Faye's accounting, at most 50% of it. The other 50% — the half that produces the senior engineering judgment that the paradox of supervision requires for the orchestrator-workflow to function at all — happens at the keyboard, in the act of writing the code yourself.

If the orchestrator workflow becomes the default for juniors entering the field today, then the seniors who can supervise it in 2030 and 2035 will not exist in the numbers the industry's compute-and-capacity plans assume. The agentic-coding workflow is, structurally, a withdrawal from a senior-engineering supply chain that has been quietly funded for forty years by the friction of working developers writing their own code.

Withdrawals from a supply chain are easy. They look productive. The shortage shows up later, after the people who could have supervised the work are no longer available, and after the institutional knowledge of how to make more of them has thinned to the point where the path is no longer obvious.

Vendor lock-in is also skill lock-in

Faye's section on token economics and vendor dependency lands harder when read against the supply-chain question. The Primeagen quote he cites — that "when you use these fully agentic workflows, the model providers essentially own you" — is sometimes read as a financial concern, and there's plenty in Faye's piece about subscription costs that are unpredictable and rising. The deeper concern is structural. The skill the orchestrator-workflow trades away is not easily regrown. A Claude Code outage in 2026 is an inconvenience; a generation of engineers who never built the supervisory-judgment muscle is, in the literal sense, an industry that no longer knows how to make seniors.

This is the pattern Faye gestures at without quite naming, and the one his personal-discipline conclusion is too small to hold. "Demote AI's role" is right for individuals. It is not the answer to the institutional question of whether the field is willing to defend skill formation as a thing worth preserving against productivity pressure that no longer values it.

What this asks of us

The version of this question I keep coming back to is not whether anyone is right about the cognitive-debt findings. The findings are well-credentialed and converging. The question is whether the industry has a version of Sandor Nyako's rule for itself — don't use these for tasks that require critical thinking or problem-solving — and whether that version arrives in time to matter.

Faye's personal-discipline answer is workable. Each of us has a version of it for our own work. Mine looks roughly like his, with adjustments for what I do; yours probably differs at the edges. The harder question is the team-level and industry-level one. Most engineering organizations in 2026 do not have a written norm about what AI tooling is for, what it is not for, when it is allowed in code review, when it is required to be disclosed, what tasks juniors are expected to do without it before they're allowed to use it.

The norms will get written eventually. The question is whether they get written before or after the supply chain of senior engineering judgment runs short.

What we owe each other

The senior engineers we have today are, every one of them, expensive products of decades of friction. They are also the people the paradox of supervision names as the load-bearing component of the orchestrator workflow. If juniors do not grow into them, the workflow works less well in five to ten years. If the path from junior to senior no longer includes the friction the studies say is being eroded, in fifteen years there is no version of the workflow that works at all.

What we owe each other, on the evidence Faye and the studies he cites have made available, is at least to write the question down — to stop pretending that "AI does the coding, and the human in the loop is the orchestrator" describes a stable equilibrium when the orchestrator-supply curve is a quietly receding function of the workflow it supervises.

The slogan version is: the friction was the feature. The longer version is the one Sandor Nyako's fifty engineers are being protected from. The window for deciding whether to take it seriously is not closed. It is, however, smaller than it was a year ago, and it is closing in the direction the workflow is pushing.
