
Phil Whittaker

Suspend Disbelief - From Implementation to Intention

Introduction

It's a reasonable thing to be sceptical about coding with AI. If you've been burned by earlier models, the hesitation makes sense. But the models really are good enough now. Not "kind of good enough for prototypes" or "good enough if you keep a close eye on them"—genuinely, substantively good enough to write production code. Code that holds up in review. The kind that ships.

It's taken a while to get here. Early ChatGPT produced code that looked plausible at first glance but fell apart under scrutiny—wrong outputs, runtime errors, style problems that signalled the model was pattern-matching rather than reasoning. That reputation stuck. And here's the uncomfortable truth: for a significant slice of the developer community, the mental model formed in 2022 hasn't been updated since.

The tools moved on. The assumptions didn't.

That gap—between what the models can actually do right now and what most developers believe they can do—is what this post is about. The bottleneck is no longer the model's capability. It's your imagination and your willingness to explore what these tools can actually do. Now it's time to suspend disbelief.

The Reality Check

Models Really Do Work Now

The trajectory tells the story better than any argument. On SWE-bench—a benchmark that measures how well AI systems resolve real-world GitHub issues, not toy puzzles—AI solved just 4.4% of tasks in 2023. By 2024, that figure had jumped to 71.7%. One year. A 67-percentage-point gain. Claude Sonnet 4.6, released in February 2026, now scores 79.6% on SWE-bench Verified.

The reason this benchmark matters more than most is that SWE-bench tests models against actual software engineering work: reading existing codebases, understanding context, making targeted changes across multiple files. It's not a contrived test. It's engineering.

What changed isn't one thing—it's compounding gains across reasoning, context handling, and architectural coherence that have quietly crossed a threshold where the output doesn't just look right, it works right. Each generation has closed the gap between "interesting toy" and "real engineering tool" at a pace that consistently outran even optimistic predictions.

And that pace shows no sign of slowing. Even if a model falls short of your specific needs today, betting against it tomorrow means betting against one of the most consistent improvement curves in modern software.

From Generic to Specific

The Core Developer Challenge

There's a useful way to think about what these models actually are: they're infinitely generic. They've been trained on a vast breadth of human knowledge—code, documentation, discussions, patterns—across virtually every domain and technology stack.

That breadth is the superpower. But it's also the limitation.

Infinitely generic doesn't cut it when you need something specific. Your project has a particular architecture, a particular set of constraints, a particular set of requirements that exist nowhere in any training dataset. The model doesn't know about your legacy system, your team's conventions, your product's edge cases.

It knows everything in general and nothing in particular.

Your job—the developer's job—is to bridge that gap. You translate infinitely generic capability into infinitely specific solutions. The quality of that translation depends on how clearly you can articulate what you actually need: the context, the constraints, the intent. That communication skill matters. The good news is it matters less with every generation, as models get better at inferring context, asking clarifying questions, and recovering from ambiguous instructions.

But intentional direction still drives better results. You are the one who knows what you're building. The model knows how to build it. Put those two things together effectively and you get something genuinely powerful.

The Seven Stages of AI Engineering

Building Confidence Through Progression

Getting to that effective partnership doesn't happen overnight. It happens in stages—and understanding those stages is one of the most useful mental models available for developers trying to figure out where they are and where to go next.

Dr. Waleed Kadous—Generative AI Engineering Lead at Canva—published The Seven Spheres of Human-AI Co-Development in December 2025. Picture concentric circles. At the centre, you're accepting autocomplete suggestions. At the outer edge, AI agents act as the engineering team, building whole systems under your direction. In between: assigning small tasks, delegating modules, orchestrating multi-step workflows.

Each stage builds on the last—you don’t lose anything, you add to it. The framework works because it matches how confidence actually develops: outward, one ring at a time.

The real insight from the framework is this: the bottleneck at every stage is never the model's capability. It's your mindset—your assumptions about what's possible, and your willingness to let go of control and try something bigger.

Suspend Disbelief

The Power of Curiosity and Experimentation

I tell developers this all the time: just ask the AI to do it. And they'll say they don't think the model will understand. The request is too complex, too specific, too weird, too ambitious. So they don't ask. They break it down into tiny, safe pieces that they could almost write themselves. They never discover what the model can actually do.

That's the disbelief in action. And it's costing them.

Curiosity is a technical skill. The willingness to ask "what if I tried this?", and then actually try it, is what separates developers who get extraordinary results from AI coding from those who get mediocre ones. Describe the whole feature instead of just the function. Paste the entire module and ask for a refactor. Sketch the user journey and let the model design the data model.

These aren't reckless experiments. They're how you discover what's actually possible. The developers getting the most from these tools aren't the most technically sophisticated. They're the most curious. They push further, expect more, and they're regularly surprised by how much they get.

That surprise—“I didn’t expect it to be able to do that”—is how disbelief dissolves. You set your scepticism aside just long enough to try something ambitious. The model delivers. Your expectations shift upward—not temporarily, but permanently. Next time, you ask for something bigger. The cycle compounds, and each round replaces a little more doubt with direct evidence.

It Comes Down to Trust

There are two kinds of trust you need to build, and they develop in parallel.

The first is trust in the model's capability—confidence that it can understand a genuinely complex request and produce a workable solution.

The second is trust in yourself—belief that you can communicate your needs effectively, evaluate what comes back, and know when to push further. This is the quieter challenge.

Developers who've spent careers measuring their value by their ability to write code sometimes struggle to find footing when the writing is handled. The skill is still there. It's just been redeployed.

You’re no longer valued for writing lines of code; you’re valued for producing the right ones—and increasingly, for knowing what those should be.

Confidence builds confidence. Each successful experiment—each time you asked for something you weren't sure would work and got back something useful—removes a small piece of doubt.

That doubt doesn't grow back. The floor of your expectations rises permanently. This is how suspended disbelief becomes actual belief.

The Hyper-Personal App

Your Best Learning Ground

It may be that the fastest way to build that confidence isn’t at work. It’s at home, on a project that matters only to you—a thing you’ve always wanted to build but never had the time or the know-how to create.

You are your own best customer for this kind of project. You know the requirements perfectly. You understand every edge case. You're the only stakeholder—no committees, no compromises, no sign-offs, no sprint planning, no production incidents waiting to happen. The freedom to experiment is total.

What does that look like? A tool that tracks every TV show you're watching, filtered and displayed in the way you want. Something that pulls together all the live music listings for your city into one feed so you never miss a gig. A script that monitors a niche RSS feed you care about and summarises the week every Sunday morning.

Something that only you would need, built exactly the way your brain works. These aren't products—they're prosthetics. Tools shaped to you rather than you adapting to them.
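To make the last example concrete, here is a minimal sketch of the weekly RSS digest idea, using only Python's standard library. The function names, the seven-day window, and the plain-text digest format are all illustrative assumptions—a starting point you might ask a model to build out, not a prescribed design.

```python
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def summarise_feed(rss_xml: str, days: int = 7) -> str:
    """Return a plain-text digest of feed items published in the last `days` days."""
    root = ET.fromstring(rss_xml)
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    lines = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        pub = item.findtext("pubDate")
        if pub:
            published = parsedate_to_datetime(pub)
            if published < cutoff:
                continue  # older than the window: leave it out of the digest
        lines.append(f"- {title}")
    header = f"This week's items ({len(lines)}):"
    return "\n".join([header] + lines)

def fetch_and_summarise(url: str) -> str:
    # A real Sunday-morning version would run on a schedule and
    # email or message you the result; this just returns the text.
    with urllib.request.urlopen(url) as resp:
        return summarise_feed(resp.read().decode("utf-8"))
```

Trivial on its own—but exactly the kind of seed you can hand to a model and say "now add the schedule, the email, and the filtering rules I care about."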

When the stakes are personal rather than professional, you give yourself permission to experiment without fear of failure. Ask for something you genuinely don't know how to build. Push the model into unfamiliar territory. See what it comes back with. Iterate. Over-engineer.

You'll naturally refine how you communicate with the model because you care about the outcome—not because someone told you to. That caring is the learning engine.

Play, explore, break things, start over. This is how you develop the intuition and confidence that transfers directly into your professional work—not by reading about it, but by doing it, repeatedly, in a context where the only person you need to impress is yourself.

The Only Question That Remains

Something changes when you hold a tool you actually built—something you couldn't have built without AI. The benchmarks stop being abstract. The adoption statistics stop being someone else's story. You have direct, personal evidence that it works—and that evidence outweighs any argument.

The bottleneck has shifted. It's no longer the model's capability—it's developer mindset. The willingness to experiment, to push further than feels comfortable, to ask for more than you think you'll get.

Curiosity and experimentation are technical skills, and arguably the most important ones right now.

You don't have to make a leap of faith. You take a small step, the model delivers, and the next step feels natural. That's how suspended disbelief becomes actual belief.

Stop wondering if the AI can do it. Build something for yourself—something only you would dream up, something that solves a problem only you have in exactly the way that makes sense to you. Once you have that tool, the question will never again be "can AI do this?" It will be "what should I build next?"
