Spoiler: I found more than I bargained for.
The Experiment
It started, like most rabbit holes do, with a dumb question.
I was messing around with a build, nothing dramatic, just trying to get a feature to work, and I asked Claude to help me debug it.
It did.
In about forty seconds. And instead of moving on like a normal person, I sat back and thought: okay, what actually can't this thing do?
So I gave myself two hours. No agenda. Just pull the thread and see where it goes.
Two hours later, I had seventeen browser tabs open, a notes doc full of half-sentences, and absolutely no clean answer to my original question.
What I did have was something better, and something harder: a genuine sense of standing at the edge of something enormous, and a creeping feeling that we’re not fully ready for it.
Progress doesn’t walk. It teleports.
We talk about AI like it’s a graph going steadily up and to the right. But that’s not how it actually works.
AI progress plateaus, then something cracks open, an algorithmic insight, a weird idea someone tested, and suddenly you’re in a completely different paradigm.
We jumped from "predict the next word"
to
"follow instructions"
to
"reason through problems step by step."
Each one felt like a wall breaking.
The most recent jump: models that actually think before answering, not just pattern-match. DeepSeek-R1 learned advanced reasoning through pure trial and error: no human examples, no supervised fine-tuning. The researchers described it as having "aha moments."
*It taught itself.* (Tbh, some humans can't even do that.)
Pull the thread on why models stopped just getting bigger, and you find something unexpected. The entire internet’s supply of high-quality human-written text, estimated at 9 to 27 trillion tokens, is projected to be completely exhausted by 2026 to 2028.
Not depleted. Exhausted. We are, in a very real sense, running out of internet to learn from.
I did not have “running out of internet” on my rabbit hole bingo card.
This is why the whole industry pivoted almost overnight: smaller models, better data, synthetic training. Microsoft's Phi-4, at 14 billion parameters, now matches the reasoning of models many times its size, purely because of how it was trained. They hit the ceiling, so they built a different one.
You are the whole product team now
A solo operator rebuilt a public service platform in 14 days. The original version took 15 engineers nine months.
I’ve felt a version of this. I built Shodh, a research tool, in 33 minutes. Two years ago I wouldn’t have even tried. Not because I lacked the idea, but because the gap between thought and thing-that-exists was too wide to cross without a dev background.
That gap is closing so fast it’s almost disorienting. The bottleneck is no longer writing syntax. It’s your ability to think clearly, make decisions, and shape what you’re building.
Which sounds incredible. And it is. But here’s where the rabbit hole gets complicated.
The part nobody puts in the highlight reel
We all bought into the dream that AI would simply take tasks off our plates. I certainly did.
The reality is messier.
When you use an AI, you’re not eliminating work, you’re shifting it. Instead of writing the code yourself, you’re now in a state of continuous sense-making: figuring out why the model hallucinated a Python library that doesn’t exist, or why it confidently produced something that looked right but wasn’t.
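A small, concrete flavor of that sense-making: before trusting an AI-suggested package, check whether it's even importable. This is a sketch, not a tool from the post; `module_exists` is a hypothetical helper name.

```python
import importlib.util

def module_exists(name: str) -> bool:
    """Return True if `name` can be imported in the current environment.

    find_spec() returns None for a top-level module that isn't installed,
    so this catches packages an AI may have hallucinated outright.
    """
    return importlib.util.find_spec(name) is not None

print(module_exists("json"))                      # stdlib: True
print(module_exists("totally_made_up_pkg_123"))   # hallucinated: False
```

It won't catch the subtler failure mode (a real package used with a made-up API), but it filters out the cheapest class of hallucination in seconds.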
Researchers are starting to call this the AI tax, the cognitive and emotional load of supervising systems that operate at a speed and scale we can barely comprehend.
I tried to automate a simple internal workflow last week. The AI completely choked. Not a hallucination problem. A reality problem.
Because here’s what integrating AI into actual work exposes: most of our workflows aren’t clean, linear tasks. They’re held together by undocumented workarounds, institutional memory, and tacit knowledge that lives entirely in someone’s head.
When you try to map an autonomous agent onto a messy human process, you don’t get efficiency. You get what some early adopters are already calling agentic workslop, AI doing the wrong things really, really fast.
AI won’t fix a broken process. It will execute your dysfunction at the speed of light.
The quiet atrophy of judgment
This is where the rabbit hole gets a little dark.
Because these models are getting so good at logic, we’re starting to trust them with things they have no business touching.
- Who gets an interview
- Who gets a loan
- How to resolve a customer dispute
It’s called moral outsourcing, handing ambiguous, consequential human decisions to algorithms because it’s faster and feels more objective.
But an AI has no stake in the physical world. It doesn’t feel the sting of a rejection or the weight of a denied claim. And when we consistently hand those calls to a machine, our own capacity for judgment starts to atrophy. Like an unused muscle.
We are so focused on making machines smarter that we are quietly deskilling our own humanity.
The physical wall
After all of this, the breakthroughs, the friction, the uncomfortable truths, I finally circled back to the original question. What actually can’t AI do?
The answer is strange. And it has to do with energy.
Your brain and nervous system run on roughly 15 to 20 watts. About the same as a laptop in sleep mode. That 20 watts handles your vision, language, memory, reasoning, and also lets you pick up a coffee cup, fold a shirt, navigate a crowded room without thinking.
Training GPT-4 consumed approximately 6,154 megawatt-hours. Enough to power 570 homes for a year. A 22x increase from the previous generation.
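To make that gap concrete, here's a back-of-envelope comparison using the two figures above. Both inputs are rough estimates, so treat the ratio as an order of magnitude, not a measurement.

```python
# Assumptions, taken from the figures quoted in the text:
# a brain runs on ~20 W; GPT-4 training consumed ~6,154 MWh.
BRAIN_WATTS = 20
HOURS_PER_YEAR = 24 * 365

brain_kwh_per_year = BRAIN_WATTS * HOURS_PER_YEAR / 1000  # ~175 kWh
gpt4_training_kwh = 6_154 * 1_000                         # MWh to kWh

ratio = gpt4_training_kwh / brain_kwh_per_year
print(f"Brain, one year:  {brain_kwh_per_year:.0f} kWh")
print(f"GPT-4 training:   {gpt4_training_kwh:,} kWh")
print(f"Ratio: roughly {ratio:,.0f}x")
```

By this sketch, one training run burns what a human brain uses in tens of thousands of years.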
And yet, with all of that compute, AI still can’t fold a laundry shirt.
This is Moravec’s Paradox.
- Cognitively hard tasks (graduate-level reasoning, complex mathematics, designing software architecture at scale) are computationally cheap for AI.
- Trivially easy tasks (physical dexterity, real-world navigation, presence) are computationally nightmarish.
The current limits of AI are, in the most literal sense, the most human things. Presence. Touch. Physical reality. Judgment with skin in the game.
Both things are true
My two hours ended. I closed about eight of the seventeen tabs. The others are still open.
Here’s what I came out with:
- The wonder is real.
- The friction is real.
- You don’t get to pick just one.
By 2028, 15% of day-to-day work decisions are projected to be made completely autonomously by AI agents, up from essentially zero in 2024.
The people building right now, learning to think alongside these tools, figuring out where the AI tax is worth paying and where it isn’t, they’re the ones who will understand what’s actually happening when it arrives.
The floor underneath us keeps moving. What felt ambitious last month is obvious now. What feels impossible this week will be a template by Q3. The ceiling keeps moving too, just not always in the direction we expect.
I went looking for what AI can’t do.
I found a much more interesting question: what are we going to do with it?