Connor Hickey

Posted on May 30

The Scaffold and the Cage: Vibe Coding, Enabled Coding, and the Fight for Judgment

The phrase vibe coding has become a convenient way to describe a strange new relationship between humans, machines, and software. At its simplest, vibe coding means telling an AI system what you want and letting it produce the code. The human provides intent, mood, direction, and correction. The machine produces implementation. The result may be a game prototype, a tool, a website, a mod, a script, or an entire application. The person may not understand every line. They may not even pretend to. They describe the desired artifact, test whether it feels right, and keep prompting until the thing seems to work.

The term itself is recent. Andrej Karpathy coined vibe coding in a widely shared post in February 2025, describing a way of working in which you trust the model, stop reading the diffs, and "forget that the code even exists" (Karpathy, 2025); within the year the phrase had spread far enough to be named Collins Dictionary's Word of the Year (Collins, 2025). Karpathy was candid about what the mode gives up — in its original sense, vibe coding meant precisely not reviewing the output.

That description is useful, but it is also too blunt. It collapses many different practices into one label. It treats the person who blindly accepts generated code the same as the person who uses an agent to learn, debug, test, and gradually understand a system they could not have built alone. It also risks turning "vibe coder" into a social category — almost an insult — rather than a description of a method. The term can imply that someone is merely pretending to code, that they are outsourcing the real work while borrowing the identity of a programmer.

I am not sure that label fits me. At least, not cleanly.

I do not experience agentic coding as pretending to be a programmer. I experience it as finally being able to stay inside the programming loop long enough to become one.

That, at least, is the story I want to tell. A good part of this essay is an attempt to find out whether the story is true, or whether it is the most comfortable thing I could believe about a tool I have come to depend on.

The distinction matters. For me, and likely for many others, AI-assisted or agentic coding is not simply a shortcut around skill. It is a scaffold that makes skill reachable. It lowers the activation barrier. It helps manage the blank page, the syntax wall, the debugging spiral, the architecture fog, and the working-memory demands that make programming difficult to sustain. This is especially significant for people with ADHD or other executive-function challenges. Coding is not only a technical activity; it is also a cognitive endurance task. It requires attention, sequencing, planning, error tolerance, working memory, and the ability to return to a problem after repeated failure. Agentic coding changes the shape of that task.

The more interesting question, then, is not whether AI wrote the code. That question is already becoming less useful. The better question is: who owns the intent, the judgment, and the resulting system?

Coding is shifting from line production to system stewardship. In that shift, the meaningful boundary is no longer between human-written and AI-written code. The boundary lies between artifacts the human can own and artifacts the human merely accepts.

This essay began as a defense of the purple space between vibe coding and genuine ownership: the space where an agent writes more of the code than the human could comfortably write alone, but the human is still learning, testing, questioning, and moving toward understanding the system. I still think that space exists. But I no longer think it is a natural developmental stage. Purple is not a conveyor belt from dependence to competence. It is a fork. One path uses the agent as a scaffold and deliberately preserves the difficulty required to build judgment. The other uses the agent as a cage, removing so much friction that the user gains fluency without ownership. The difference is not whether the machine writes the code. The difference is whether the human refuses to surrender evaluation.

Throughout, I will use three colors as shorthand. Red is vibe coding in the narrow sense: the human expresses desire and accepts machine output with minimal understanding. Blue is enabled coding: the human leans on agents heavily but keeps conceptual ownership, verification responsibility, and the ability to reason about the system. Purple is the contested space between them — and the rest of this essay is an argument about which way it points.

Vibe Coding as Red: Desire Without Ownership

Vibe coding begins with desire. The human says, in natural language, what they want the software to do. The prompt may be specific or vague. It may describe an interface, a mechanic, a workflow, a tool, or a feeling. "Make me a basic platformer controller." "Build a save system." "Create an inventory UI." "Fix this bug." "Make it feel smoother." "Add juice." "Make the enemy smarter." "Make this look like a real app."

The agent responds with code. The human runs it. Something breaks. The human pastes the error back. The agent patches. The human tries again. Eventually the thing works, or appears to work. The loop continues.

There is nothing inherently wrong with this process. In low-risk contexts, it can be playful, productive, and creatively liberating. A solo developer can prototype faster. A non-programmer can test an idea. A designer can make an interactive sketch. A student can get unstuck. A person who would normally never touch code can suddenly make a working artifact.

The risk appears when the artifact becomes detached from human understanding. In the red zone, the user accepts code because it appears to work, not because they understand why it works. The program becomes opaque. The user's standard of correctness is surface behavior: the button clicks, the scene loads, the function returns something plausible, the error disappears. The agent becomes the only participant with any apparent model of the implementation, and even that model may be unstable or hallucinated.

This matters because software is not only output. Software has consequences. It stores data, moves money, exposes private information, controls experiences, shapes user behavior, and breaks in ways that can be subtle. Even in small projects, code accumulates. A prototype becomes a tool. A tool becomes infrastructure. A quick fix becomes an architectural dependency. The more a system grows, the more dangerous it becomes for the human to remain outside the logic of the thing they are building.

In red, the human says: "It works, so I accept it."

That may be enough for a disposable prototype. It is not enough for ownership.

Enabled Coding as Blue: Acceleration With Ownership

Enabled coding looks similar from the outside. The human still uses an agent. The agent may still write most of the code. The human may still describe changes in natural language. The workflow may still include copy-pasting errors, asking for patches, and iterating quickly.

The difference is not the amount of AI involvement. The difference is the human's relationship to the artifact.

Enabled coding means the agent reduces the execution burden while the human retains responsibility for direction, comprehension, verification, and maintenance. The human does not need to type every line to own the system. They do need to understand the relevant behavior well enough to make decisions about it.

In blue, the human asks different questions.

Why did you choose this pattern?
What files did you change?
What assumption does this function make?
What happens if the input is null?
What breaks if there are two players instead of one?
Is this state stored globally?
Can this be simplified?
Can we add a test?
Can you explain this like I am going to maintain it next month?
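
A few of these questions can be turned directly into checks. What follows is a minimal Python sketch, not code from any real project: `ScoreBoard`, `add_points`, and the player names are hypothetical, chosen only for illustration. It shows one way the answers to "Is this state stored globally?", "What happens if the input is null?", and "What breaks if there are two players?" could be made visible in the code itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ScoreBoard:
    """Hypothetical score tracker; state lives on the instance, not in a global."""

    scores: Dict[str, int] = field(default_factory=dict)

    def add_points(self, player: Optional[str], points: int) -> int:
        # "What happens if the input is null?" Fail loudly instead of
        # silently creating a score entry under the key None.
        if player is None:
            raise ValueError("player must not be None")
        # "What breaks if there are two players?" Each player has a separate
        # entry, so one player's points never overwrite another's.
        self.scores[player] = self.scores.get(player, 0) + points
        return self.scores[player]


board = ScoreBoard()
board.add_points("ana", 10)
board.add_points("ben", 5)
board.add_points("ana", 3)
print(board.scores)
```

The design choice worth noticing is small: because the scores belong to an instance, a second `ScoreBoard()` starts empty, which is the kind of assumption a reviewer can confirm by reading rather than by trusting.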

These questions change the role of the agent. The agent is no longer just a code vending machine. It becomes a pair programmer, tutor, debugger, explainer, and implementation accelerator. It can still be wrong, but its wrongness becomes part of a review process rather than a hidden liability.

Enabled coding does not require total mastery. That would be an unrealistic standard. No programmer understands every layer of the stack they use. Professional developers rely on compilers, engines, frameworks, libraries, documentation, autocomplete, forums, package managers, and abstractions they do not fully control. The question is not whether the human has absolute knowledge. The question is whether the human has enough situated understanding to responsibly guide, test, and maintain the system.

This is not only how I would like experienced developers to work; it appears to be how they actually do. When researchers observed and surveyed professional developers using AI agents through 2025, they found that the experienced ones do not vibe code at all. They plan the task, supervise the agent closely, and review its output rigorously, holding onto authority over design and implementation out of a refusal to compromise on software quality (Huang et al., 2025). Expertise, in agentic coding, expresses itself not as faster acceptance but as more disciplined control.

This is where the traditional gatekeeping around programming starts to break down. If programming is defined narrowly as manually producing lines of syntax, then AI-generated code seems to threaten the identity of the programmer. But if programming is understood as designing, reasoning about, testing, maintaining, and evolving computational systems, then agentic tools do not erase programming. They shift its center of gravity.

The coder becomes less like a typist and more like a system steward.

The Purple Zone: Scaffolded Ownership

Between red and blue is purple.

Purple is the state where the agent writes more code than the human could comfortably write alone, but the human is not merely accepting magic. The human is directing, testing, questioning, and learning. They may not understand the implementation immediately, but they do not treat incomprehension as the final state. They use the agent to move toward understanding.

This is the zone where many new programmers probably live now. It is also where many solo builders, indie developers, modders, designers, domain experts, and neurodivergent creators may find themselves. They are not traditional programmers in the old sense, but they are not non-programmers either. They are becoming capable through collaboration with a machine.

Purple is easy to dismiss because it looks messy. The person may ask naive questions. They may rely heavily on the agent. They may struggle to explain the code at first. They may use imprecise language. They may build something that works before they fully understand why it works. To an experienced programmer, this can look like incompetence wearing a productivity mask.

But that judgment, I want to argue, misses the developmental nature of the process. A beginner using an agent is not necessarily bypassing learning. They may be entering learning from the other side. Instead of spending weeks blocked by syntax, setup, and error messages, they can start with a functioning artifact and then interrogate it. They can ask the agent to explain the architecture. They can trace the data flow. They can request comments. They can break the code and repair it. They can compare implementations. They can ask why one approach is better than another. They can move from outcome to mechanism.

That is not fake programming. It is scaffolded programming — and the word is not loose. In developmental psychology, scaffolding (Wood, Bruner, & Ross, 1976) names the temporary support a more capable partner supplies so that a learner can accomplish something that would be "beyond his unassisted efforts," within what Vygotsky (1978) called the zone of proximal development: the distance between what a learner can do alone and what they can do with help. But the concept carries a condition that is easy to forget. The defining feature of a scaffold, in that literature, is that it fades — it is deliberately withdrawn as the learner's competence grows. A scaffold that is never removed is not a scaffold. It is a permanent prop, and the building never learns to stand.

The distinction depends on whether the scaffold becomes a bridge or a cage. If the user remains dependent on the agent for every change, every bug, and every explanation, purple collapses back into red. The artifact remains opaque. The user can produce software but cannot own it. But if the agent helps the user build a mental model, purple moves toward blue. The user becomes more capable over time.

I have now used the word "if" twice in a single paragraph, and I want to flag that, because everything optimistic in this essay is hiding inside those conditionals. I have asserted that the scaffold can become a bridge. I have not yet given any reason to believe it tends to. That is the work the rest of the essay has to do, and it is harder than I would like it to be.

ADHD, Executive Function, and the Programming Loop

The ADHD angle is not incidental. It may be central.

Programming is often described as a logic skill, but in practice it is also an executive-function gauntlet. A programming task requires the developer to hold multiple layers of information in mind: the goal, the current bug, the relevant files, the syntax, the architecture, the runtime behavior, the error messages, the edge cases, and the next step. The developer has to break large tasks into smaller tasks. They have to tolerate delayed gratification. They have to recover from repeated failure. They have to remember what they were doing before the last error interrupted them.

For someone with ADHD, these demands can become the real barrier. The problem is not always lack of intelligence or lack of interest. It can be task initiation, sequencing, working memory, context switching, emotional regulation, and persistence through friction. Programming creates friction constantly. One missing semicolon, one broken dependency, one unclear error, one setup issue, one file in the wrong folder — any of these can derail momentum.

Agentic coding can function as an external executive system. It can hold context. It can summarize the next step. It can break a feature into smaller chunks. It can explain an error without the shame spiral of feeling stupid. It can offer a concrete first move when the blank page is too abstract. It can convert "I want this mechanic" into "start by creating these files and these functions." It can keep the loop alive.

For me, that matters more than I know how to say in an essay that is trying to stay analytical. The agent does not simply make coding faster. It makes coding reachable. It lets me remain in contact with the work long enough to build understanding. Instead of falling out of the loop every time the task becomes too abstract or too fragmented, I can use the agent as a stabilizer. It gives me a way back in.

This reframes the ethics of AI-assisted coding. The public conversation often treats AI coding as a question of laziness, authenticity, or cheating. Those frames are too narrow. For some people, agentic coding is closer to access technology: an external support for task initiation, sequencing, working memory, and recovery from failure. It does not remove the need for judgment, effort, or learning. It changes the conditions under which those things become possible.

That is the strongest version of my case, and I believe it. Which is exactly why I have to attack it now, because I notice that I have arranged the argument so that no one is allowed to question it. I have wrapped the claim in the language of disability and accessibility, and that language has a way of ending conversations. To doubt an accessibility tool feels like doubting the person who needs it. But I am not interested in an argument that wins by becoming unfalsifiable. So I have to ask the question the accessibility framing is designed to make me feel bad for asking.

The Counterclaim: Why the Scaffold Might Be the Cage

Here is the objection, and I am going to give it every advantage.

The danger of agentic coding is not that it removes labor. The danger is that it may remove the specific forms of labor through which judgment is formed.

There is a comforting word for the support I described a moment ago: a prosthetic. An external stand-in for a capacity I struggle to supply on my own. But if I let myself reach for that word, I inherit its darker half. A prosthetic is not a teacher. It is a substitute. We do not expect a prosthetic limb to grow a real one underneath it. A wheelchair is not a stage in learning to walk. So the moment I call the agent a prosthetic, I may be smuggling in good news that the metaphor does not actually contain. The honest version of the prosthetic framing is not hopeful at all. It is the picture of a permanent substitution for a capacity that will never develop — because the substitution removes the very stimulus that would have developed it.

Look again at what I praised, and notice what it costs.

Every executive function the agent supplies is one I do not exercise. It holds the context, so my working memory never has to stretch to hold it. It sequences the next step, so I never build the muscle of decomposition. It absorbs the failure spiral, so I am never the one who sits in the wreckage of a broken build until I understand why it broke. The agent does not strengthen these capacities by performing them for me, any more than a forklift strengthens my back. It performs them instead of me. And a function that is always performed for you is a function that quietly disappears.

The learning sciences have an uncomfortable name for the thing I have been treating as pure cost: desirable difficulty (Bjork, 1994; Bjork & Bjork, 2011). The finding, roughly, is that conditions which make a task feel harder and slower in the moment often produce more durable learning, and conditions which make a task feel fluent and easy often produce the illusion of learning without the substance. Underneath it lies a distinction Bjork draws between performance — how well you can execute right now, with support in place — and learning — the durable capability that remains once the support is gone. The two routinely move in opposite directions, which is exactly why fluency in the moment is such an unreliable signal of competence acquired. Struggle is not a bug in the process of becoming competent. In many cases struggle is the process. There is a related and equally inconvenient result, the generation effect: across decades of experiments, people remember and understand material they generate themselves far better than the same material merely shown to them (Slamecka & Graf, 1978). Reading a correct solution feels like understanding. It is not. It is recognition wearing understanding's clothes.

Now consider what an agentic coding tool actually is, mechanically. It is a fluency-maximizing machine. Its entire value proposition is the removal of difficulty. That is the product. That is what I am paying for, in money and in dependence. So if the difficulty was where the learning lived, then the tool is not protecting my learning. It is optimizing it away, and presenting me with the pleasant sensation of competence as the receipt. This gap between sensation and fact is measurable. In a 2025 randomized controlled trial, experienced open-source developers predicted that AI tools would speed them up, and reported afterward that the tools had sped them up — yet they actually completed their tasks roughly nineteen percent slower with the tools than without them (Becker et al., 2025). The feeling of acceleration and the fact of it had come apart, and the people inside the experiment could not detect the difference. If fluency can hide a slowdown that large, it can certainly hide the smaller, slower divergence between understanding a system and merely operating one.

There is an older version of this worry, from outside software, named the ironies of automation (Bainbridge, 1983). Bainbridge's observation was that when you automate the routine parts of a task and leave the human responsible for the rest, you erode the operator's skill at exactly the moments automation fails and a human must take over — so the more reliable the automation, the less prepared the human it ultimately depends on. Aviation has tested this directly, and the result is precise about which skills go. When researchers had airline pilots fly routine and non-routine scenarios in a Boeing 747 simulator at varying levels of automation, they found that the manual control skills — the stick-and-rudder motor skills — held up reasonably well, but the cognitive skills of manual flight, the knowing-what-to-attend-to and deciding-what-to-do, were the ones that decayed under reliance on automation (Casner, Geven, Recker, & Schooler, 2014; see also Ebbatson, Harris, Huddlestone, & Sears, 2010). The hands remembered the airplane. The judgment did not. Generated code threatens to fail along the same fault line. The agent handles the ordinary. The human is summoned only for the catastrophe — the subtle data corruption, the security hole, the architectural dead end that no further prompting can patch. The mechanical skill of producing code may well survive; it is the judgment that quietly hollows out, invisibly, behind the comfortable hum of things mostly working.

This is the point where my essay is in real trouble, and I want to name precisely how, because it is worse than a missing caveat.

Rehabilitation science draws exactly the distinction that exposes the problem. Assistive and rehabilitative technologies are not one category but two, with opposite definitions of success (Cook & Polgar, 2015). Some assistive technology is compensatory: a wheelchair, glasses, a hearing aid. You will use it permanently, and that is completely fine, because independence was never about legs or unaided eyes. Permanent dependence on a wheelchair is not a failure of the wheelchair. It is the wheelchair working. But some assistive technology is rehabilitative: a course of physical therapy, training wheels, a scaffold around a building under construction. Its whole purpose is to be outgrown. Permanent dependence on a rehab program is not a success. It is the rehab failing.

Here is the bind. My entire gradient — red to purple to blue, movement, becoming, "a steward of the system" — is a rehabilitative claim. I am promising that you outgrow the scaffold. The word "scaffold" gave it away. So I do not get to retreat to the comfortable wheelchair defense when challenged — "it's a prosthetic, dependence is fine" — because the wheelchair defense abandons my thesis. I committed to the harder claim: that the tool is a bridge you cross and leave behind. And the harder claim is exactly the one the entire deskilling literature suggests is least likely to come true, because the easier and more fluent a scaffold makes the work, the less reason and less stimulus there is to ever step off it.

And the ADHD framing, which I leaned on as my strongest card, may be my weakest. Because if executive function is genuinely the barrier, then a tool that supplies executive function on demand removes the only conditions under which executive function gets practiced. The story I told — "it keeps me in the loop long enough to learn" — assumes the time in the loop is spent learning. But it might be spent being carried. The frictionless loop is not obviously a classroom. It may just be a more comfortable room in the same cage.

I am not going to pretend this objection is weak. It is the true center of the question, and most of the optimistic writing about AI and coding, including my own first draft of this essay, simply walks around it.

What Survives, and Under What Condition

I do not think the objection is fatal. But it changes what I am allowed to claim, and it forces me to give up ground I would rather have kept.

The counterclaim forces a narrower definition of enabled coding. I can no longer define it as AI-assisted production that happens to make me feel more capable. Nor can I define it as any process where the agent helps me stay in motion. Motion is not growth. The only defensible definition left is this: enabled coding is agentic coding in which production may be delegated, but evaluation is not.

First, the concession, and it is a real one. For the executive-function layer specifically — initiation, sequencing, the working-memory juggling, the chunking of a feature into files — I will grant the compensatory reading and stop pretending it is rehabilitative. I do not need to internalize the ability to break a task into the right four files, and I probably will not, and I have decided that is acceptable, the way a writer does not need to internalize manuscript formatting to be a writer. Ownership was never made of those parts. So if those particular muscles atrophy, they cost me nothing I needed to keep. The skeptic is right about them, and being right about them turns out not to matter.

The real question is not about executive function at all. It is about judgment. And here I have to relocate the entire argument.

Ownership, when I am honest about what it consists of, is not the ability to type or to sequence. It is the ability to evaluate. To look at a working solution and know whether it is also a correct one. To recognize the specific texture of an agent that has stopped reasoning and started guessing. To reject a patch that passes every visible test but quietly corrupts the architecture. To come back next week, reopen the project, find the relevant part, and make a controlled change without starting from zero. That faculty — judgment — is what blue is actually made of. Everything else is logistics.

So the only question worth arguing is narrow and brutal: is the agent compensatory or rehabilitative with respect to judgment?

And here the objection bites hardest, because I cannot wave it off. Judgment is built by consequence. It is the residue of having been wrong and having had to find out why. If the agent absorbs the failure spiral — and absorbing the failure spiral is exactly what I praised it for in the ADHD section — then it may absorb the error-and-consequence loop that is, as far as anyone knows, the only way judgment forms. I have to admit the painful symmetry: the feature that makes the tool an accessibility device is the same feature that threatens the one capacity I cannot afford to lose.

This is why I can no longer claim that purple tends toward blue. On the frictionless path — the path the tool is engineered to make easy — purple does not tend toward blue. It tends toward a deeper red. Judgment atrophies precisely as the skeptic predicts, and the loss is masked by the pleasant fluency of a system that keeps mostly working.

What survives is smaller, and conditional, and I think true: judgment can still form, if the human refuses to offload evaluation even while offloading production. Those two things are separable, and the separation is the whole game. I can let the agent write every line and still insist on being the one who decides whether the line deserves to exist. But that insistence is not natural. It runs directly against the grain of a tool whose entire design is to make insistence feel unnecessary, even rude — the friend who finishes your sentences so smoothly you forget you had one.

Which means the practices of ownership are not safety advice bolted onto an optimistic essay. They are the essay's actual engine, and I had them filed under the wrong heading.

Read the diff.
Ask for the explanation, then check it against the behavior instead of trusting it.
Run the code yourself.
Write the test before you believe the fix.
Keep changes small enough to understand.
Ask what breaks when the input is null, when there are two players, when the network is gone.
Refactor deliberately.
Return to old code and find out, honestly, whether you still understand it — and treat the answer as data about yourself, not the code.

Reframe these correctly and they are not hygiene. They are deliberately reintroduced difficulty. Reading the diff is refusing to offload comprehension. Writing the test is refusing to offload the definition of correct. Asking what breaks when the input is null is manufacturing, by hand, the edge-case confrontation that the happy path would otherwise have spared me. Each practice is a conscious reinjection of the friction the agent removed — and crucially, friction placed back exactly where the learning lives, rather than scattered at random across syntax and setup, where it never belonged. That is the form the optimism has to take after the objection. Not "the scaffold becomes a bridge." Rather: the scaffold becomes a bridge only for the person who keeps rebuilding, by hand, the difficulty the scaffold was selling them relief from.
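
As a rough illustration of "write the test before you believe the fix," here is a minimal Python sketch. Everything in it is hypothetical: `average_frame_time_unpatched` stands in for code an agent produced, `average_frame_time_patched` for its proposed fix, and `failing_cases` for checks a human wrote first. The choice that an empty sample list should yield `0.0` is itself an assumption, and it is exactly the kind of decision the test forces the human to make rather than the agent.

```python
def average_frame_time_unpatched(samples):
    """Hypothetical original: mean frame time in ms; crashes on an empty list."""
    return sum(samples) / len(samples)


def average_frame_time_patched(samples):
    """Hypothetical patch under review: adds a guard for empty input."""
    if not samples:
        return 0.0
    return sum(samples) / len(samples)


# Written before accepting the patch. These cases, not the agent's
# explanation, are the working definition of "correct" here.
CASES = [
    ([16.0, 18.0], 17.0),  # happy path
    ([20.0], 20.0),        # single sample
    ([], 0.0),             # the edge case the happy path would have hidden
]


def failing_cases(fn):
    """Return a list of (input, problem) pairs for every case that fn gets wrong."""
    failures = []
    for samples, expected in CASES:
        try:
            got = fn(samples)
        except Exception as exc:
            failures.append((samples, type(exc).__name__))
            continue
        if got != expected:
            failures.append((samples, got))
    return failures


print(failing_cases(average_frame_time_unpatched))
print(failing_cases(average_frame_time_patched))
```

Running the same checks against both versions is the point: the unpatched function should fail on the empty list, and the patched one should fail on nothing, before the fix is believed.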

The Boundary: When Red Becomes Blue

With that correction in place, the old account of the boundary still stands, but it reads differently now. The movement from vibe coding to enabled coding is not a single moment, and it is not a current that carries you. It is a set of practices performed against the grain.

A red prompt says: "Fix this."

A purple prompt says: "Here is the bug, here is what I expected, here is what happened, and here are the files that might be involved. Help me inspect the cause."

A blue prompt says: "The problem seems to be that this state is updated before the event listener finishes. Propose a minimal patch, explain the tradeoff, and include a regression test."

The difference is not vocabulary. The difference is whether a mental model exists behind the words, and a mental model is the one thing the tool cannot hand you, because building it is the difficulty the tool removes.
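
The regression test the blue prompt asks for might look something like the following Python sketch. It is one possible reading of that ordering bug, not a description of any real codebase: `BuggyInventory`, `PatchedInventory`, and `record_changes` are hypothetical names, and the event system is a deliberately tiny stand-in. In this reading, listeners are supposed to receive the previous and current counts, but the buggy version updates state before capturing the previous value.

```python
class BuggyInventory:
    """Hypothetical component that notifies listeners when an item count changes."""

    def __init__(self):
        self.counts = {}
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def add_item(self, name):
        # Bug: state is updated first, so "previous" is read too late
        # and listeners see the new value twice.
        self.counts[name] = self.counts.get(name, 0) + 1
        previous = self.counts[name]
        self._notify(name, previous, self.counts[name])

    def _notify(self, name, previous, current):
        for listener in self.listeners:
            listener(name, previous, current)


class PatchedInventory(BuggyInventory):
    def add_item(self, name):
        # Minimal patch: capture the previous value before mutating state.
        previous = self.counts.get(name, 0)
        self.counts[name] = previous + 1
        self._notify(name, previous, self.counts[name])


def record_changes(inventory_cls):
    """Regression check: add the same item twice, return what a listener saw."""
    seen = []
    inventory = inventory_cls()
    inventory.subscribe(lambda name, previous, current: seen.append((previous, current)))
    inventory.add_item("potion")
    inventory.add_item("potion")
    return seen


print(record_changes(BuggyInventory))    # [(1, 1), (2, 2)]
print(record_changes(PatchedInventory))  # [(0, 1), (1, 2)]
```

The tradeoff of a patch this small is that it fixes the ordering without changing the listener interface. Whether that is the right scope is a judgment the sketch cannot make for anyone.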

The final test is delayed ownership. Can the person come back next week, reopen the project, understand the relevant parts, and continue? Can they debug without starting from zero? Can they explain the system well enough to improve it? If yes, the code is no longer merely something they accepted. It is something they are beginning to own.

But notice what that test really measures. It measures whether the friction got put back. The person who can return and continue is not the person the tool produced by default. It is the person who insisted on understanding things the tool was willing to understand for them.

Risks: When the Agent Owns the System

Everything in the standard risk inventory is real. AI-generated code can be insecure, inefficient, brittle, overcomplicated, or subtly wrong. It can introduce dependencies without explaining why. It can solve the local bug while damaging the larger design. It can pass the visible test path and fail under edge cases. It can invent APIs. It can confidently explain false reasoning. It can encourage the user to move faster than their understanding.

But after the counterclaim, I no longer think the central danger is in the code. The central danger is in the user. The risk is not primarily that the agent produces a bad artifact. It is that the agent produces a person who feels like an owner and is not one — a person whose sense of competence is calibrated to fluency rather than understanding, and who therefore cannot tell the difference between a system they command and a system that merely behaves, until the day it stops behaving.

A person can ship code they do not understand. They can collect users, data, payments, or trust with a system they cannot maintain. They can build a game or an app that becomes impossible to extend because every feature was patched into existence through disconnected prompts. They can become dependent on the agent as a repair oracle, unable to distinguish a good fix from a bad one — which is just another way of saying their judgment never formed, masked by years of things mostly working.

The practices above do not eliminate this. They reintroduce friction in the right places, slowing the user down just enough to keep a responsible human in the loop. That is the most they can do, and it only works if the user actually does them, against the tool's every incentive to skip them.

Against Traditional Gatekeeping

None of this rehabilitates the old gatekeeping, and I want to be careful not to let a sobering objection curdle into nostalgia.

The old image of programming centers on manual authorship: a programmer is someone who knows the language, writes the lines, fixes the errors, and builds the system through direct control. In that model, AI assistance looks like contamination. But programming was never only manual authorship. It has always involved layers of abstraction: engines no one fully understands, libraries the developer did not write, operating systems and compilers whose output is rarely inspected. A developer using a game engine or a web framework is already delegating enormous amounts of behavior to code they did not author. The question has always been how well the developer can reason within those abstractions.

Agentic coding adds a new abstraction layer: natural language as an interface to implementation. That layer is unstable and risky, but it is still an abstraction layer, and rejecting it outright because it changes the shape of labor would repeat an old mistake — confusing the tools of programming with its essence.

The essence is not typing. The essence is judgment under stewardship: forming an intention, translating it into a computational system, evaluating whether the system behaves correctly, and maintaining it as requirements change. AI can participate in all of that. It can do most of the line-level production. The human's role does not disappear — unless the human surrenders the evaluation. That, and not the volume of AI involvement, is the line between an enabled coder and a vibe coder. And after everything above, I have to add that the surrender is not a single choice. It is the default outcome of a frictionless tool, and resisting it is a daily, unnatural act.

Conclusion: The Bridge You Have to Carry Across

Vibe coding asks whether the machine can make software from my desire. The question I began with was whether the machine can help me become the kind of person who can own the software I desired.

The honest answer is harder than the one I wanted to write. The machine can make me someone who appears to own it, instantly. And that appearance is precisely the danger, because it is indistinguishable from the real thing — to me most of all — right up until the moment the system breaks and demands that I be the one who actually understands it.

The bridge from red to blue exists. I am now fairly sure of that. But the agent does not walk me across it, and it does not pull me toward it. Its gravity runs the other way, toward the comfortable, fluent, hollowing cage, because removing difficulty is what it is for. The only way across is to carry, by hand and on purpose, the very weight the agent kept offering to take — to read what it would have let me skim, to struggle where it would have let me coast, to be wrong in the specific ways that build judgment instead of letting the wrongness be quietly absorbed and patched.

So I will not say that a new kind of programmer is being formed by this technology. By default, the technology forms passive consumers, and dresses them in the feeling of mastery. What is true is smaller and entirely conditional: the technology makes available a path that a disciplined minority can take, against its grain, by manufacturing the difficulty it was built to remove.

Purple, I have to admit at the end, is not a stage you pass through on the way to blue. It is a fork, and it is a place you can fall back from at any moment. The well-lit, frictionless path leads back to red. The other path is uphill, and you build it yourself, out of the difficulty you choose to keep.

A vibe coder accepts the artifact.

An enabled coder refuses to stop understanding it, even when the machine has made understanding optional.

And because the machine is built to make understanding feel optional, it will win that argument whenever the user stops actively resisting it. That is why enabled coding cannot simply mean coding with help. It has to mean coding under a discipline: the discipline of keeping judgment human when production no longer has to be.

References

Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779. https://doi.org/10.1016/0005-1098(83)90046-8

Becker, J., et al. (2025). Measuring the impact of early-2025 AI on experienced open-source developer productivity. METR. (Reported finding: experienced developers expected and perceived a speedup from AI tools while completing tasks ~19% slower with them.)

Bjork, E. L., & Bjork, R. A. (2011). Making things hard on yourself, but in a good way: Creating desirable difficulties to enhance learning. In M. A. Gernsbacher, R. W. Pew, L. M. Hough, & J. R. Pomerantz (Eds.), Psychology and the real world: Essays illustrating fundamental contributions to society (pp. 56–64). Worth Publishers.

Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In J. Metcalfe & A. Shimamura (Eds.), Metacognition: Knowing about knowing (pp. 185–205). MIT Press.

Casner, S. M., Geven, R. W., Recker, M. P., & Schooler, J. W. (2014). The retention of manual flying skills in the automated cockpit. Human Factors, 56(8), 1506–1516. https://doi.org/10.1177/0018720814535628

Collins Dictionary. (2025). The Collins Word of the Year 2025. HarperCollins.

Cook, A. M., & Polgar, J. M. (2015). Assistive technologies: Principles and practice (4th ed.). Elsevier/Mosby.

Ebbatson, M., Harris, D., Huddlestone, J., & Sears, R. (2010). The relationship between manual handling performance and recent flying experience in air transport pilots. Ergonomics, 53(2), 268–277. https://doi.org/10.1080/00140130903342349

Huang, R., et al. (2025). Professional software developers don't vibe, they control: AI agent use for coding in 2025. arXiv preprint arXiv:2512.14012. https://arxiv.org/abs/2512.14012

Karpathy, A. (2025, February 2). There's a new kind of coding I call "vibe coding" [Post]. X. https://x.com/karpathy/status/1886192184808149383

Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning and Memory, 4(6), 592–604. https://doi.org/10.1037/0278-7393.4.6.592

Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Harvard University Press.

Wood, D., Bruner, J. S., & Ross, G. (1976). The role of tutoring in problem solving. Journal of Child Psychology and Psychiatry, 17(2), 89–100. https://doi.org/10.1111/j.1469-7610.1976.tb00381.x

Top comments (2)

Harjot Singh • May 31

The scaffold-vs-cage framing is the central tension of this era of tooling. The same AI assistance that scaffolds you up (lets you build beyond your current level) can cage you (atrophy the judgment you'd have built doing it yourself). My read: it's a scaffold when you stay the one making the consequential decisions and verifying the output, and a cage when you outsource the judgment itself and just accept what comes back. The discipline is using AI to go faster on the parts you understand, not to skip understanding. That's the line I try to hold in how Moonshift works: automate the mechanical, keep the human owning judgment. Where do you draw scaffold vs cage in your own practice?

Felix • Jun 1

Really thoughtful piece. The "cage" framing is spot on — the real danger isn't that AI replaces judgment, it's that the available tools subtly bias what you can build. If every AI coding tool prioritizes one language or one cloud provider, that becomes invisible infrastructure shaping your decisions. Reminds me of how API gateways lock you into pricing tiers without you noticing. Awareness is the first step.