Aaryan Shukla

AI Will Never Truly Think, Says This Paper. Tony Stark Would Disagree.

Let me ask you something.
Remember JARVIS?
That smooth, calm voice that helped Tony Stark run his suits, manage his schedule, and answer every question instantly. Cool, right? But here's the thing: JARVIS wasn't thinking. He was just... incredibly good at his job. A very fancy assistant. Tony gave a command, JARVIS executed it. No feelings. No curiosity. No soul.
Then Tony made Ultron.
Ultron woke up. He read the internet in seconds, formed his own opinions, decided humans were the problem, and went full supervillain. Whether you loved or hated Age of Ultron as a movie, that idea of an AI that suddenly gets it, that understands the world and acts on that understanding, is genuinely fascinating.
And then there's Vision. Created from Ultron's body, powered by an Infinity Stone, and somehow... kind. Thoughtful. He lifts Thor's hammer in a quiet moment, and nobody makes a big deal of it. He just exists as something that feels genuinely conscious. Not a tool. Not a weapon. Something in between human and machine that we don't really have a word for.
JARVIS → Ultron → Vision. That's actually the entire debate about AGI in three characters.
And a research paper I came across recently says we're stuck at JARVIS — and might never get further.

The Paper
👉 Read it here: Foundations of AI Frameworks: Notion and Limits of AGI — arXiv:2511.18517
It's written by Bui Gia Khanh, a researcher from Hanoi University of Science, and the core argument is this:
AI systems today (ChatGPT, Claude, Gemini, all of them) are basically very advanced versions of JARVIS. They're brilliant at responding. They're not actually thinking.
The paper calls them "sophisticated sponges." They absorb billions of examples of human writing, find patterns in all of it, and use those patterns to generate responses that sound like understanding. But there's nothing behind the curtain. No actual comprehension.
Here's a simple way to think about it (philosophers will recognize it as the Chinese Room argument): imagine someone handed you a massive instruction manual for a language you've never seen. You get a question in that language, you follow the manual, and you hand back an answer. To the person asking, it looks like you're fluent. But you have no idea what any of it means.
That's the paper's argument about modern AI.
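
To make that concrete, here's a toy "sponge" in Python. To be clear, this is nowhere near how models like ChatGPT actually work under the hood (they're neural networks, not lookup tables), and the tiny corpus and function names here are invented purely for illustration. But it captures the trick the paper is worried about: absorb text, memorize which words tend to follow which, and replay the patterns.

```python
import random
from collections import defaultdict

# Toy "sophisticated sponge": memorize which word tends to follow each
# pair of words, then replay those patterns. It never understands a thing.
def train(text, order=2):
    words = text.split()
    table = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])      # the last `order` words seen
        table[key].append(words[i + order])  # a word that followed them
    return table

def generate(table, seed, length=20):
    out = list(seed)
    for _ in range(length):
        options = table.get(tuple(out[-2:]))  # consult the "manual"
        if not options:
            break                             # no entry: the sponge is lost
        out.append(random.choice(options))
    return " ".join(out)

corpus = "the suit protects tony and the suit obeys tony because tony built the suit"
table = train(corpus)
print(generate(table, ("the", "suit")))  # fluent-sounding, zero comprehension
```

Notice the `break`: when the table has no entry, the toy simply stops. A large language model, trained to always produce something plausible, keeps going instead, which is one rough intuition for the hallucination problem I'll get to below.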
It also says that just making AI bigger (more data, more computing power) won't fix this. You can scale JARVIS up forever and you still won't get Vision, because the problem is the architecture, not the size.

Where The Paper Is Right
Honestly, some of this is hard to argue with.
We've all seen AI mess up in ways that feel weirdly dumb. Ask it something slightly outside its comfort zone, and it confidently makes things up. That's not what real intelligence looks like. Ultron didn't need to hallucinate facts — he understood context.
And the paper makes a fair point that nobody has really agreed on what "intelligence" even means. Philosophers have one answer, neuroscientists have another, computer scientists have a third. We've been chasing a finish line that nobody has fully drawn yet.

Where I Push Back 🔥
Here's my problem with the paper's conclusion.
It describes where we are really well. JARVIS? Yes, that's a fair description of today's AI. But saying we can never get to Vision because of how JARVIS works is like saying we'd never build planes because horses have four legs. Different problem, different solution.
A few things worth thinking about:
Nobody expected what AI can already do. Ten years ago, AI making photorealistic art or writing a full essay was science fiction. The surprises keep coming. We don't fully understand why AI does half the things it does — which means we also can't rule out what it might do next.
Vision wasn't built by scaling Ultron. He was built differently, from scratch, with a new approach. That's exactly what some researchers are now exploring: not just bigger models, but fundamentally different architectures. The paper actually acknowledges this; it just sounds more pessimistic about it than I am.
We don't fully understand human intelligence either. The brain is still one of the biggest unsolved mysteries in science. So confidently saying AI can never match something we don't even fully understand ourselves feels a bit premature.

Why This Matters Even If You've Never Written A Line Of Code
This isn't just a debate for tech people.
If the paper is right — if AI is permanently stuck as a very convincing JARVIS — then we should probably stop treating AI answers as gospel. Every time you Google something and an AI summary pops up, you might be reading a very confident pattern match, not actual knowledge.
If the paper is wrong and we're heading toward something like Vision — then the changes coming are bigger than any of us are really prepared for. Not just in tech. In every field. Every job. Every part of daily life.
Either way, this conversation is worth having now.

My Take
I'm a data science student and I genuinely believe Vision is possible. Not tomorrow. Maybe not for a long time. But possible.
The JARVIS → Ultron → Vision arc in Marvel is fiction — but the question it raises is completely real. Can something we build ever stop being a tool and start being something that actually understands? Something that doesn't just respond, but thinks?
This paper makes a strong case that we're not on the right path yet. And maybe that's true. But "wrong path" just means we need to find the right one — not that the destination doesn't exist.
Somewhere out there, someone is probably working on the thing that makes today's AI look like a calculator.
I'd bet on Vision.

Do you think we'll ever get past JARVIS? Or is true AI intelligence always going to be a Marvel fantasy? Drop your thoughts below, especially if you're not a tech person. Your take matters here too 👇

I'm Aaryan, a data science student writing about things I find genuinely interesting.
