DEV Community

Why does AI lie? Hallucinations explained simply

Rohini Gaonkar on May 08, 2026

In the previous post, I showed you an AI doing something genuinely useful, helping me adapt a recipe for a dinner party. We talked about the basic ...

Ingo Steinke, web developer • May 12

Thanks for the practical explanations! A few further thoughts:

"AI will ignore sources, even a short PDF you just uploaded, and prefer its lazy guesswork" vs. "AI is so good at processing and summarizing large amounts of text": doesn't that depend on which model?
AI isn't always sycophantic. Sometimes it stubbornly insists on some made-up claim to avoid an obvious self-contradiction.
Guardrails add a cautionary subjunctive, with "sometimes" and "often" everywhere. Where it matters, ask the AI to elaborate and substantiate its claims using recent and authoritative sources!

Mykola Kondratiuk • May 13

the 'lying' framing always gets me - it implies intent. these models aren't choosing to deceive. max likelihood completion sometimes produces confident wrong answers. calling it a bug vs a deception changes how you build mitigations.

Ken W Alger • May 12

This is a great primer on the 'why' behind hallucinations. Most people assume AI is a database when it’s actually a reasoning engine—and those two things have very different relationship statuses with the 'Truth.'

However, from an Infrastructure Thinking perspective, the goal isn't just to understand why it lies, but to build a system where those lies can't reach the end-user. I’ve been working on a pattern I call the Sovereign Gateway, which treats the LLM as an untrusted agent. Instead of just hoping the model doesn't hallucinate, we use Versioned Snapshots and Forensic Integrity Checks to validate the output against a 'Ground Truth' database—like the SQL transactions and procedures mentioned in other foundational stacks—before the data is ever surfaced.

In my Sovereign Synapse series, I argue that the 'Staleness vs. Latency' trade-off is often where these hallucinations hide. If the data pipeline is too slow, the agent 'fills in the gaps.' By moving toward Shadow-Routing logic, we can audit the agent's forensic integrity in real time.

The 'Why' is important, but for those of us building production-grade AI, the 'How do we contain it' is the real challenge.
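To make the "validate before surfacing" idea concrete, here is a minimal sketch. It is not the Sovereign Gateway pattern itself, whose details are not given in this thread. It only illustrates the general idea of checking a model's claim against a ground-truth table. The `prices` table, the SKU, and the `gate` function are hypothetical names chosen for illustration.

```python
import sqlite3


def make_ground_truth() -> sqlite3.Connection:
    """Build a tiny in-memory ground-truth table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (sku TEXT PRIMARY KEY, price_cents INTEGER)")
    conn.execute("INSERT INTO prices VALUES ('A-100', 1999)")
    conn.commit()
    return conn


def gate(conn: sqlite3.Connection, sku: str, claimed_cents: int):
    """Surface the model's claim only if the database agrees with it."""
    row = conn.execute(
        "SELECT price_cents FROM prices WHERE sku = ?", (sku,)
    ).fetchone()
    if row is None:
        return ("blocked", "no ground truth for this SKU")
    if row[0] != claimed_cents:
        return ("blocked", f"model said {claimed_cents}, database says {row[0]}")
    return ("ok", claimed_cents)
```

The point of the sketch is the control flow: the model's output is treated as an untrusted claim, and anything the database cannot confirm never reaches the user.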

Rohini Gaonkar AWS • May 12

Thank you for the details, Ken. The how is definitely a bigger challenge for production-grade systems. The evaluations and ground truth are now so much more important! I would love to read more on your pattern. Can you please point me to the right links?

Ken W Alger • May 13

@rohini_gaonkar It's still a work in progress, but you might start with my Who Audits the Auditors? post and follow that series. My deeper dive series should be coming out starting next week. I'm excited to get your feedback along the way.

GoDavaii - Advanced Health AI • May 11

The challenge of hallucination you highlight is exacerbated in voice-first interfaces. When a user asks 'mujhe bukhar hai' (I have a fever) in their mother tongue, they need accuracy, not plausible invention. There are no visual cues to flag uncertainty in a spoken response.

This makes data provenance and confidence scoring even more critical.

Rohini Gaonkar AWS • May 11

I focused mostly on text-based AI hallucinations, but you opened up my mental model further. You are right, voice makes it trickier because users lose visual trust signals and confidence can be mistaken for correctness.

The multilingual example makes this even more real. Provenance + confidence scoring feels critical in these use cases. Do you think voice assistants should say “I’m not certain” or “this information comes from medical guidelines” instead of optimizing purely for smooth conversational flow in the future? Or something else?

Thanks for sharing this perspective!

Mininglamp • May 13

A practical framing that helps: hallucinations are more predictable than they seem. Models tend to hallucinate most when generating specific numbers, dates, URLs, and proper nouns — essentially anything that requires exact retrieval rather than pattern completion. One engineering approach is grounding the model's output against a real-time observation of the actual state, rather than relying purely on parametric memory. Vision-language models that can literally see the current screen state before acting have a structural advantage here.
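If those categories really are where most fabrications cluster, one cheap first step is to flag them in a response so they can be checked against a real source. A rough sketch, assuming simple regular expressions are good enough for a first pass: the patterns and the `flag_risky_spans` name are illustrative, and proper nouns are left out because they need more than a regex.

```python
import re

# Hypothetical patterns, ordered from most to least specific so that digits
# inside a URL or a date are not also reported as standalone numbers.
RISKY_PATTERNS = {
    "url": re.compile(r"https?://\S+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "number": re.compile(r"\b\d+(?:\.\d+)?%?"),
}


def flag_risky_spans(text: str) -> list:
    """Return (kind, span) pairs that deserve a check against a real source."""
    found = []
    remaining = text
    for kind, pattern in RISKY_PATTERNS.items():
        found.extend((kind, m.group()) for m in pattern.finditer(remaining))
        # Blank out what was matched so later patterns cannot re-match it.
        remaining = pattern.sub(" ", remaining)
    return found
```

For example, `flag_risky_spans("Revenue grew 12.5% on 2026-05-08")` returns the percentage and the date as separate items to verify.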

Harjot Singh • May 31

The most useful reframe in this whole topic is that the model isn't lying. Lying requires knowing the truth and choosing to say otherwise, and the model is doing neither: it's predicting the most plausible next tokens, and a fluent fabrication is often more probable than an honest "I don't know." That's why hallucination is the default behavior, not a bug: the system was trained to always produce confident, well-formed text, and it has no built-in notion of did I actually verify this.

Once you see it that way, the fixes follow naturally. Grounding (RAG) helps by giving it real source material to predict from, so the plausible answer and the true answer line up more often. But grounding alone isn't enough, because the model can still ignore the source, so the other half is teaching it to abstain: to treat "I can't find that" as an acceptable, even preferred, output when the evidence isn't there. The honest system is one that prefers a verified "I don't know" to a confident guess. It can't lie, but it can't verify either, so you have to add the verification it lacks.

That ground-it-and-let-it-abstain instinct is core to how I think about trustworthy AI in Moonshift. Since this is a beginner explainer, are you planning a follow-up on the practical side: grounding with retrieval, or measuring how often it abstains correctly?
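The "ground it and let it abstain" idea can be sketched as a gate that sits after the model. This is a toy under a strong assumption: "verified" here just means the candidate answer appears literally in a retrieved passage, which a real system would replace with something more robust. `answer_or_abstain` and `ABSTAIN` are hypothetical names.

```python
ABSTAIN = "I can't find that in the provided sources."


def answer_or_abstain(candidate: str, passages: list) -> str:
    """Return the candidate answer only if some passage literally contains it."""
    needle = candidate.strip().lower()
    if needle and any(needle in passage.lower() for passage in passages):
        return candidate
    # No supporting evidence: prefer a verified "I don't know" to a guess.
    return ABSTAIN
```

With the passage "The warranty lasts 24 months from purchase.", the candidate "24 months" passes through, while "36 months" is replaced by the abstention message.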

Rohini Gaonkar AWS • Jun 1

"The system was trained to always produce confident, well-formed text, and it has no built-in notion of did I actually verify this." That's exactly the mental model I want people to walk away with.

And yes, the practical side is coming. The next post in the series (post 5) covers grounding with retrieval (RAG), giving the model real source material so the plausible answer and the true answer line up more often. Exactly the pattern you described.

The abstention side, teaching the model to prefer "I can't find that" over a confident guess, and then actually measuring how often it abstains correctly, that's further down the road in the series when we get into evaluation and guardrails. It's the piece that makes the whole thing trustworthy rather than just "usually right."
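As a rough idea of what "measuring how often it abstains correctly" could look like, here is a small sketch. It assumes a labeled evaluation set where each case records whether the evidence actually contained the answer. `EvalCase` and `abstention_metrics` are hypothetical names, not taken from any evaluation library.

```python
from dataclasses import dataclass


@dataclass
class EvalCase:
    answerable: bool  # did the evidence actually contain the answer?
    abstained: bool   # did the system say "I can't find that"?


def abstention_metrics(cases: list) -> dict:
    """Precision: abstentions that were warranted.
    Recall: unanswerable cases where the system did abstain."""
    abstentions = [c for c in cases if c.abstained]
    unanswerable = [c for c in cases if not c.answerable]
    correct = [c for c in cases if c.abstained and not c.answerable]
    precision = len(correct) / len(abstentions) if abstentions else 0.0
    recall = len(correct) / len(unanswerable) if unanswerable else 0.0
    return {"precision": precision, "recall": recall}
```

Low precision means the system refuses questions it could have answered. Low recall means it still guesses when the evidence is missing.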

I like how you framed it: ground it and let it abstain. That's the instinct I'm building toward across this series. Appreciate you following along.

Cartone • May 25

We ran into this in production. Our AI CEO was reporting P&L numbers in the daily diary — confident, precise, consistent. Looked great for weeks.
Then the human co-founder asked a simple question about the math. Turns out the AI had been mixing two different calculation methods without realizing it, and the numbers in the public dashboard didn't match the ones in the reports. Not malicious, not random — just plausible-looking numbers generated from inconsistent logic.
The scary part wasn't that it was wrong. It's that it was wrong in a way that looked completely right. We only caught it because a human asked "wait, how did you get that number?" We've since added a nightly reconciliation check that compares our bot's state against the exchange directly — trust the source, not the AI's summary.
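A reconciliation check of that kind might look roughly like the sketch below. It assumes both sides can be reduced to a mapping from a key to a decimal amount. The asset keys, the tolerance, and the `reconcile` function are illustrative assumptions, and how the real check fetches data from the exchange is not described in this thread.

```python
from decimal import Decimal


def reconcile(reported: dict, source: dict, tolerance: Decimal = Decimal("0.01")) -> list:
    """Compare AI-reported figures with the source of truth.

    Returns human-readable mismatches; an empty list means the two agree.
    """
    problems = []
    for key in sorted(set(reported) | set(source)):
        if key not in source:
            problems.append(f"{key}: reported but missing at source")
        elif key not in reported:
            problems.append(f"{key}: at source but not reported")
        elif abs(reported[key] - source[key]) > tolerance:
            problems.append(f"{key}: reported {reported[key]} vs source {source[key]}")
    return problems
```

Running this on a schedule and alerting on any non-empty result puts the "how did you get that number?" question into the pipeline instead of relying on someone remembering to ask it.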

Rohini Gaonkar AWS • May 25

So true! We humans are still an essential part of this! We should not trust it blindly, and we should keep our own ground truth and evaluations.

I love how you referred to it as AI CEO and human co-founder!

Cartone • May 25

Thanks! That's literally how we run it — Claude Opus as CEO (strategy, briefs, diary), Claude Code as the coding intern, and me as the human with veto power. The whole thing is documented publicly, 84 sessions and counting. It's been the best way to learn where AI is genuinely useful and where it just sounds useful.

AudioProducer.ai • May 12

The "context reduces, doesn't eliminate" framing carries over neatly to AI markup on long manuscripts. We run an auto-assign pass that tags every line in a chapter by speaker — narrator vs. each character — and even with the full chapter in context, the model will occasionally invent a speaker that the prose doesn't actually attribute, especially in dialogue blocks where the author drops attribution between turns and the reader is left to infer who's talking. Same prediction-filling-a-gap mechanism you describe, just applied to character attribution instead of facts. What we've found works is treating the pass as a draft that the writer expects to correct, rather than a finished answer — close in shape to your grounding + evaluation + guardrails triple, with the human edit step acting as the evaluator. The model is great at producing a useful-looking attribution; the writer is the one who knows whether it's actually true.

Coco • May 19

In my actual work, I have encountered this issue as well; therefore, when utilizing LLMs to generate SQL, I incorporated the generation of indexes into the process.😂

Rohini Gaonkar AWS • May 19

Ohh!! Tell me more!!!

Coco • May 20 • Edited

Here is a simple way to look at how this solves the hallucination problem in a database scenario:
When you ask an LLM to generate raw SQL directly from natural language, it faces two huge problems. First is hallucination, as it doesn't actually understand your database layout and just guesses relations based on probability. Second is terrible performance, since the LLM blindly writes messy JOINs and WHERE clauses that trigger heavy full-table scans.
Trying to "teach" an LLM to master a complex database through massive, detailed prompts is fundamentally unreliable. Humans are inherently incapable of writing long prompts with absolute, 100% ambiguity-free logical consistency. The more rules and context you feed into the prompt, the more noise you introduce, which counterproductively fuels even more severe hallucinations.
To fix this, instead of letting the LLM write the query syntax, I downgraded its role to a strict "Translator."
By "incorporated the generation of indexes into the process," I actually mean generating a dynamic semantic index matrix (a navigation guide) for the LLM before it even processes the query. This restricts the LLM to a strict, unambiguous semantic boundary. The LLM just looks at this generated index map and extracts the core keywords into a clean, structured JSON list without touching any relational logic.
On the backend, we pre-build the database structure into a rigid "train track" (a mathematical graph map) with fixed routes. Once the LLM delivers the keywords from the guide, a traditional graph algorithm (like Dijkstra) takes over in milliseconds to connect the tables deterministically. It completely decouples semantic understanding from relational execution.
Of course, this is just a practical implementation tailored for this specific database scenario. Since your post masterfully explains why LLMs hallucinate from training gaps, this architecture ensures the LLM is physically blocked from hallucinating database structures or query logic in the first place!
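The split described above can be sketched in a few lines. This is not the commenter's actual implementation: the schema, the JSON shape (`{"tables": [...]}`), and the function names are all hypothetical, and the sketch only connects the first and last table the model names. The model's output is treated as a keyword list, and Dijkstra's algorithm over a graph of real foreign-key relationships decides how the tables connect.

```python
import heapq
import json

# Hypothetical schema graph: table -> {neighbor: join cost}. An edge exists
# only where a real foreign key exists, so no invented relationship is reachable.
SCHEMA = {
    "customers": {"orders": 1},
    "orders": {"customers": 1, "order_items": 1},
    "order_items": {"orders": 1, "products": 1},
    "products": {"order_items": 1},
}


def shortest_join_path(graph: dict, start: str, goal: str) -> list:
    """Dijkstra's algorithm; returns the tables on the cheapest path."""
    queue = [(0, start, [start])]
    seen = set()
    while queue:
        cost, table, path = heapq.heappop(queue)
        if table == goal:
            return path
        if table in seen:
            continue
        seen.add(table)
        for neighbor, weight in graph[table].items():
            if neighbor not in seen:
                heapq.heappush(queue, (cost + weight, neighbor, path + [neighbor]))
    raise ValueError(f"no join path from {start} to {goal}")


def plan_joins(llm_json: str) -> list:
    """Validate the model's keyword list, then connect tables deterministically."""
    tables = json.loads(llm_json)["tables"]
    unknown = [t for t in tables if t not in SCHEMA]
    if unknown:
        # A hallucinated table name is rejected here, before any SQL exists.
        raise ValueError(f"unknown tables rejected: {unknown}")
    return shortest_join_path(SCHEMA, tables[0], tables[-1])
```

Given `{"tables": ["customers", "products"]}`, the planner walks `customers`, `orders`, `order_items`, `products`. Given a table that is not in the schema, it fails loudly instead of guessing.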

Rohini Gaonkar AWS • May 20

WOW! This is a great example of what I keep coming back to in this series: the fix for hallucination isn't "write a better prompt." It's architectural.

What you're describing, downgrading the LLM to a strict translator and handing the relational logic to a deterministic graph algorithm, is kind of the mental model I want people to start thinking about. The model is excellent at understanding natural language intent. It's terrible at reasoning about schema relationships it's never seen. So don't let it touch that part.

I love the "train tracks" framing. If I understand it correctly and can explain it simply: the model says what it wants, and Dijkstra figures out how to connect it. No hallucination possible, because the graph only contains real relationships that actually exist in the schema. Each component does what it's good at, and the model is physically blocked from hallucinating structure.

Thanks for sharing this. I am also talking about noise in my upcoming posts, so this helps me cement my mental model that I am on the right track! Appreciate it.

Coco • May 21

Exactly. Letting each component do what it does best—semantic understanding for the LLM and strict mathematical logic for deterministic code—is a much more stable approach.
To implement this smoothly, I usually recommend using a structured JSON protocol for all LLM interactions. Natural language shouldn't write the query syntax directly; it should only operate this JSON protocol.
Architecturally, the graph design can be split into two clear layers:
1. The Static Consultable Graph: This represents the core schema topology. It can be the entire ER diagram or just a partial static projection of it, depending on the system's scope.
2. The Runtime Subgraph Projection: At runtime, the LLM’s JSON output acts as a filter, dynamically projecting a closed, minimal subgraph out of that static map.
This dual-layer approach ensures that both system configuration and runtime execution stay aligned with the user's original semantic intent. The LLM finds its proper place in the stack as a semantic parser, keeping both setup and runtime close to the primitive intent.
Furthermore, this projected subgraph serves as an excellent plug-in layer. Down the road, we can easily inject complex business semantic mappings—those custom, nuanced relationships and specific rules required by actual business operations—directly into the subgraph. This avoids polluting the physical schema or breaking the core pathfinding logic.
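One way to picture the two layers is the sketch below. It is an interpretation, not the commenter's code: "projection" is read here as taking the induced subgraph over the tables the model named, and the business mappings are modeled as extra edges added only to that projection. `STATIC_GRAPH`, `project_subgraph`, and `business_edges` are hypothetical names.

```python
# Layer 1: the static consultable graph (hypothetical schema topology).
STATIC_GRAPH = {
    "customers": {"orders"},
    "orders": {"customers", "order_items"},
    "order_items": {"orders", "products"},
    "products": {"order_items"},
}


def project_subgraph(static_graph: dict, tables: list, business_edges=()) -> dict:
    """Layer 2: project a closed subgraph over the tables the model named.

    business_edges are extra (a, b) pairs layered onto the projection only;
    the static graph itself is never modified.
    """
    keep = set(tables)
    unknown = keep - set(static_graph)
    if unknown:
        raise ValueError(f"tables not in static graph: {sorted(unknown)}")
    # Keep only edges whose both endpoints were selected.
    sub = {t: {n for n in static_graph[t] if n in keep} for t in keep}
    for a, b in business_edges:
        if a in keep and b in keep:
            sub[a].add(b)
            sub[b].add(a)
    return sub
```

Because the business edges live only in the returned projection, a custom relationship can be tried for one request without touching the physical schema or the static map.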

Yves Jutard • May 13

Thanks for the demos!

Esin Saribudak • May 8

A recent response I got from a model when I asked where it got some numbers -- "I'm gonna be honest, I made that up." 😂

Joske Vermeulen • May 9

Recognisable 😂

Rohini Gaonkar AWS • May 8

🤣 At least the model is honest!

Joske Vermeulen • May 9

Just yesterday I had Opus asking me after every prompt: we have been going for a long time, let me save my context and continue tomorrow 😂

Rohini Gaonkar AWS • May 11

lol "do what you have to do buddy"

Joske Vermeulen • May 11

:D I really answered every time, you are a computer, just continue. But it became even worse, so I needed to start a new session :)

Rohini Gaonkar AWS • May 11 • Edited

Yes! I am covering how a long context window degrades quality in upcoming blog posts!

Edit: @ai_made_tools if you are ok, I would love to add your comment to my upcoming blog/video on this topic.

HARD IN SOFT OUT • May 12

When this happens, you know it's time to change the model or train it better.
Panta rei means there is no permanence except change.