Why Does AI Just... Make Stuff Up?
The first time it happened to me, I didn't catch it right away. I asked ChatGPT to write a research piece and to cite its sources. It gave me a clean, confident report with specific findings, author names, and publication details. It read like a real paper. I was excited to share it. Then I checked the sources, and most of them didn't exist...
This is called hallucination, and if you use AI regularly, it will happen to you. Not might. Will.
Why it happens
AI models do not look things up. They predict what text is likely to come next based on patterns in their training data. When you ask a question, the model is not retrieving a fact from a database. It is generating the most plausible-sounding response it can construct.
Most of the time, "plausible-sounding" and "true" overlap. The model saw enough accurate information during training that its predictions tend to be correct. When it does not have strong patterns to draw from, though, it does not say "I do not know." It generates something that fits the shape of a correct answer without actually being one.
This is the fundamental thing to understand: the model is always generating, never retrieving. It does not have a sense of what it knows versus what it is making up. There is no internal fact-checker.
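If it helps to see that idea in miniature, here is a toy sketch. The word table and probabilities below are invented purely for illustration and are nothing like a real model's scale, but the mechanism is the same: generation is just repeatedly picking a likely next word, so a fabrication that fits the statistical shape of an answer comes out looking exactly like a fact.

```python
import random

# Toy "model": for each word, the next words seen in "training" and how often.
# These patterns and probabilities are made up purely for illustration.
patterns = {
    "The": {"capital": 1.0},
    "capital": {"of": 1.0},
    "of": {"France": 0.6, "Freedonia": 0.4},   # thin data: real and fictional look alike here
    "France": {"is": 1.0},
    "Freedonia": {"is": 1.0},
    "is": {"Paris.": 0.7, "Fredville.": 0.3},  # a plausible fabrication sits right next to the truth
}

def generate(first_word, max_steps=6):
    """Build a sentence one word at a time by weighted choice. Nothing is looked up."""
    words = [first_word]
    for _ in range(max_steps):
        options = patterns.get(words[-1])
        if not options:
            break
        choices, weights = zip(*options.items())
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("The"))
# Different runs produce "The capital of France is Paris." but also
# "The capital of France is Fredville." -- equally fluent, equally confident, one invented.
```

The toy has no idea which of its outputs is true, and neither does a real model. It only knows which continuations are likely.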
When the risk is highest
Not all questions carry the same hallucination risk. Some categories are reliably dangerous:
Specific numbers and statistics. "What percentage of companies adopted AI in 2024?" The model will give you a number. It might be right. It might be a plausible-sounding fabrication. Unless you verify, you have no way to tell.
Citations and sources. This is the classic trap. Ask for academic papers, legal citations, or news articles and the model will often generate titles, authors, and publication details that look completely real but do not exist. The format is perfect; the content is invented.
Recent events. Models have a training cutoff date. Anything after that date is either unknown to the model or reconstructed from limited information, and the model will not always tell you when a question falls past its knowledge boundary. To be clear, this is about the model itself: many of the interfaces have tools that can look up current events, but that is a separate layer, not the model's own response.
Obscure or niche topics. The less training data exists about a subject, the more the model has to fill in gaps. Mainstream topics tend to be more reliable than specialized ones.
Multi-step reasoning with specific facts. "What was the GDP of Portugal in 2019, and how did it compare to Greece?" Each fact in the chain is an opportunity for error, and errors compound.
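To make "errors compound" concrete, here is a rough back-of-the-envelope calculation. The 90% per-fact accuracy is an invented number for illustration, not a measured rate; the point is how quickly even a decent per-fact rate erodes over a chain.

```python
# Illustrative only: assume each individual fact the model states is right 90% of the time.
per_fact_accuracy = 0.9

for n_facts in (1, 2, 3, 5, 8):
    chance_all_correct = per_fact_accuracy ** n_facts
    print(f"{n_facts} fact(s) in the chain: ~{chance_all_correct:.0%} chance every one is correct")

# 1 -> ~90%, 2 -> ~81%, 3 -> ~73%, 5 -> ~59%, 8 -> ~43%
```

By the time an answer strings together a handful of specific figures, the odds that every one of them is right have dropped well below what any single confident sentence would suggest.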
How to protect yourself
Spot-check in the area you care about. If you are asking the AI about an unfamiliar topic, independently look up one or two facts in that area first. Then ask the AI about those same facts and see how it answers. If it gets them wrong, the training data in that area might be thin, and you should verify more carefully. This gives you a direct signal about reliability in the specific domain you are relying on, rather than assuming that accuracy in one area carries over to another.
Ask it to show its work. Instead of "what is the answer?", try "walk me through your reasoning step by step." Hallucinations are easier to spot in the reasoning than in the conclusion. A confident wrong answer looks solid; a confident wrong reasoning chain usually has an obvious gap. Forcing this kind of step-by-step reasoning also tends to reduce hallucinations in the first place.
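If you are working through an API rather than the chat window, this is just a change to the prompt. A minimal sketch using the OpenAI Python SDK follows; the model name and the exact wording of the instruction are my own choices for illustration, not a recommendation.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

question = "Which had the larger GDP in 2019, Portugal or Greece, and by roughly how much?"

# Bare question: one confident paragraph, hard to audit.
bare = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name; use whatever you normally use
    messages=[{"role": "user", "content": question}],
)

# Same question, but asking for the reasoning chain: each step becomes a checkable claim.
reasoned = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": question
        + " Walk me through your reasoning step by step and flag any figure you are not sure about.",
    }],
)

print(bare.choices[0].message.content)
print("---")
print(reasoned.choices[0].message.content)
```

The second response is longer, but every intermediate figure it names is something you can check on its own, which is exactly where fabrications tend to surface.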
Check specific claims independently. If the AI cites a statistic, a study, or a quote, look it up. This takes 30 seconds and can save you from repeating fabricated information. I do this routinely, even with outputs that feel right.
Ask the AI directly. "How confident are you in this answer?" or "Is there anything in your response that might not be accurate?" This works not because the model knows what it knows, but because the prompt shifts it toward more cautious generation. You often get a more hedged, careful response that flags areas of uncertainty.
Use AI for drafts, not final answers. The safest framing is to treat AI output as a first draft that needs human verification, not as a finished product. Use it to generate ideas, structure arguments, and explore options. Then verify the facts yourself.
A reasoning engine, not an encyclopedia
AI is not an encyclopedia that occasionally makes mistakes. It is a reasoning engine that is always constructing its answers in real time. Sometimes the construction is brilliant. Sometimes it is confidently wrong. The output looks the same either way, and that is exactly what makes hallucination dangerous.
The good news is that this is manageable. You do not need to distrust everything. You need to know where the risk is highest and verify in those areas. Specific facts, numbers, citations, and recent events get checked. Brainstorming, structuring, and drafting are lower risk because you are using the AI for its reasoning, not its facts.
Once you internalize this, you stop being surprised when it happens and start building the verification step into your workflow.
Next time: everyone keeps saying "agent" like it means something. What it actually means and whether you need one.
If there is anything I left out or could have explained better, tell me in the comments.