What is AI Hallucination?

Q: "Why do AI models hallucinate?"

"Language models generate text by predicting the most likely next token given everything before it. They have no internal fact-checking mechanism and no way to distinguish between 'I know this' and 'I am guessing based on patterns'. When asked about something at the edge of their training data or outside it entirely, they continue generating plausible text, because that is all they can do. The result is confident, fluent, wrong output."

Q: "How common is hallucination in practice?"

"It depends heavily on the task and model. For well-documented topics within the model's training data, modern frontier models (GPT-4o, Claude) hallucinate infrequently. For specific facts (citation details, statistics, URLs, names of people), even the best models hallucinate several percent of the time. For knowledge after the training cutoff date, the hallucination rate climbs sharply. For tasks with checkable output (code, maths), the model can still be wrong, but you can verify correctness independently."

Q: "Can I trust AI output for legal or medical decisions?"

"Not without expert verification. AI hallucination rates are too high for any decision where being wrong has serious consequences. The correct workflow for high-stakes use cases: use AI to draft, analyse, or summarise, then have a qualified human expert verify every factual claim before acting on it. AI-generated legal contracts, medical advice, and financial recommendations all require human review."

Q: "Does retrieval-augmented generation (RAG) eliminate hallucination?"

"RAG significantly reduces hallucination by giving the model verified source documents to draw from instead of relying on training data alone. But RAG does not eliminate hallucination. The model can still misread a source document, generate text that is not grounded in the retrieved content, or hallucinate about aspects of the question that are not covered in the retrieved documents. RAG is the best available mitigation, not a complete solution."

Q: "What is the difference between hallucination and bias?"

"Hallucination is factual incorrectness: the model states something false. Bias is systematic skewing of outputs in a particular direction based on patterns in the training data: underrepresenting certain groups, over-representing certain viewpoints. Both are failure modes from the same source (training data patterns) but manifest differently. A model can produce biased output that is factually true, or unbiased output that is factually wrong."

This guide explains why AI hallucination happens, how to detect it, and how to reduce it.


Quick Answer

AI hallucination is when a language model produces confident, fluent, factually wrong output. The model is not lying or guessing randomly: it is doing exactly what it was designed to do (predict the most likely next word) but with no internal mechanism to flag when the answer is wrong. Hallucination happens with every major LLM and is one of the primary reasons to always verify AI-generated facts before acting on them.
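A toy sketch can make this concrete. The scores below are invented for illustration and do not come from any real model; the point is only that greedy next-token selection always returns some token, whether the probabilities are sharply peaked or nearly flat.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

def next_token(logits):
    """Greedy decoding: return the most probable token and its probability."""
    probs = softmax(logits)
    best = max(probs, key=probs.get)
    return best, probs[best]

# Hypothetical scores for a well-documented fact: one clear winner.
well_known = {"Vienna": 9.0, "Graz": 2.0, "Salzburg": 1.5}
# Hypothetical scores for an obscure name: nearly flat, so the pick is close to arbitrary.
obscure = {"Huber": 1.2, "Gruber": 1.1, "Bauer": 1.0}

for label, logits in [("well known", well_known), ("obscure", obscure)]:
    token, prob = next_token(logits)
    print(f"{label}: picks {token!r} with probability {prob:.2f}")
```

In both cases the function returns a single token. Nothing in the returned text tells the reader that the second pick was barely ahead of the alternatives.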

Figure: A dark spiraling vortex with a glowing red core. When a language model encounters a question at the edge of its training data, it continues generating with the same confidence as always: the spiral of plausible-sounding text has no internal brake.

Why the word “hallucination”

The term is borrowed from psychology. A hallucination is a perception that feels real to the person experiencing it but has no basis in external reality. An AI hallucination is text that reads as confident and authoritative but has no basis in fact.

A hallucinating model says “The CEO of Siemens Austria is [name]” with the same tone and confidence it uses to say “Vienna is the capital of Austria”. There is no signal in the output to indicate which statement it is sure about and which it invented.

A real hallucination example

Prompt: “Who won the Best Director Oscar for a film set in Vienna in 2022?”

Hallucinated response (paraphrased): “The 2022 Academy Award for Best Director went to [Director Name] for their film [Film Title], a critically acclaimed drama set in Vienna during the 1970s.”

This sounds plausible: it includes a plausible director name, a plausible film title, and a plausible plot description. Yet every specific factual claim may be fabricated. The model has generated a convincing answer to a question it did not know the answer to.

Where hallucination is most common

High risk

Specific statistics and numbers
Citations and references
URLs and links
People and their roles
Events after training cutoff

Medium risk

Legal and regulatory details
Medical dosages and procedures
Product specifications
Niche or regional facts

Lower risk

Well-documented historical facts
Code (can be tested and run)
Language tasks (summarising text you provide)
Structured reasoning from given premises
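One way to put the high-risk categories to work is to flag matching spans in a model's output for manual checking. The sketch below is a rough illustration: the three patterns are assumptions that cover only URLs, percentages, and four-digit years, and they will miss many claims that deserve verification.

```python
import re

# Illustrative patterns only; they cover a few of the high-risk categories.
HIGH_RISK_PATTERNS = {
    "url": re.compile(r"https?://\S+"),
    "statistic": re.compile(r"\b\d+(?:[.,]\d+)*\s?%"),
    "year": re.compile(r"\b(?:19|20)\d{2}\b"),
}

def flag_high_risk(text):
    """Return (category, matched_text) pairs that deserve a manual check."""
    flags = []
    for category, pattern in HIGH_RISK_PATTERNS.items():
        for match in pattern.finditer(text):
            # Drop trailing punctuation that belongs to the sentence, not the span.
            flags.append((category, match.group().rstrip(".,;)")))
    return flags

sample = "Adoption grew 37% in 2021 according to https://example.com/report."
for category, span in flag_high_risk(sample):
    print(f"verify {category}: {span}")
```

A flag here means "check this against a source", not "this is wrong".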

How to reduce hallucination

Technique 1: Ground the model in source documents

Instead of asking the model to recall facts from training, give it the facts in the prompt:

Context: [paste the actual document, policy, or data]

Question: Based only on the context above, what is the deadline 
for filing under the EU AI Act Article 53?

If the answer is not in the context, say "I cannot find this 
information in the provided document."

This is the core of Retrieval-Augmented Generation (RAG): retrieve the relevant documents first, then have the model answer from those documents. The model’s job becomes reading comprehension, not memory recall.
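The prompt template above can be assembled programmatically. This is a minimal sketch: the function names and the sample refund policy are hypothetical, and the refusal check assumes the model repeats the requested sentence verbatim, which real models do not always do.

```python
REFUSAL = "I cannot find this information in the provided document."

def build_grounded_prompt(context, question):
    """Assemble a prompt that restricts the model to the supplied context."""
    return (
        f"Context: {context}\n\n"
        f"Question: Based only on the context above, {question}\n\n"
        f'If the answer is not in the context, say "{REFUSAL}"'
    )

def is_refusal(answer):
    """True when the model reported that the context lacks the answer."""
    return REFUSAL.lower() in answer.lower()

prompt = build_grounded_prompt(
    context="Refunds are accepted within 30 days of purchase.",  # placeholder document
    question="how long do customers have to request a refund?",
)
print(prompt)
```

Keeping the refusal sentence in one constant means the instruction in the prompt and the check on the response cannot drift apart.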

Technique 2: Ask the model to cite its sources

Answer the question and for each factual claim, indicate 
which sentence in the provided document you are drawing from, 
using [sentence X] notation.

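Models that must cite their sources tend to hallucinate less, because each claim has to be anchored to the retrieved content. Citations also make the output easier to check: a claim with no citation, or with a citation that does not support it, stands out.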
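Citations in the [sentence X] notation can be checked mechanically. The sketch below, using hypothetical helper names and a made-up document, only confirms that each cited sentence number exists; it does not judge whether the cited sentence actually supports the claim, which still needs a reader.

```python
import re

def number_sentences(document):
    """Split on sentence-ending punctuation and number the sentences from 1."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return {i: s for i, s in enumerate(parts, start=1) if s}

def check_citations(answer, sentences):
    """Return (cited numbers that exist, cited numbers that do not exist)."""
    cited = {int(n) for n in re.findall(r"\[sentence (\d+)\]", answer)}
    known = set(sentences)
    return sorted(cited & known), sorted(cited - known)

document = (
    "The policy took effect in March. "
    "Refunds are accepted within 30 days. "
    "Gift cards are not refundable."
)
answer = (
    "Refunds are accepted within 30 days [sentence 2]. "
    "Shipping is free [sentence 7]."
)

sentences = number_sentences(document)
valid, invalid = check_citations(answer, sentences)
print("valid citations:", valid)
print("citations to missing sentences:", invalid)
```

A citation to a sentence that does not exist is a strong hint that the surrounding claim was not drawn from the document.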

Technique 3: Ask for confidence or uncertainty

Answer the following question. At the end of your response, 
rate your confidence on a scale of 1-10 and explain what you 
are uncertain about.

Self-reported confidence is an imperfect signal, since models can be confidently wrong, but it is often a useful one. Treat a low score such as 4/10 as a prompt to verify independently, and do not treat a high score as proof.
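If responses follow the requested format, the score can be extracted and used to route answers for review. This sketch assumes the model writes something like "Confidence: 4/10"; the regular expression and the threshold of 7 are arbitrary illustrative choices, not recommended values.

```python
import re

CONFIDENCE_RE = re.compile(
    r"confidence[^0-9]{0,20}(\d{1,2})\s*(?:/|out of)\s*10",
    re.IGNORECASE,
)

def parse_confidence(response):
    """Extract a self-reported score from 1 to 10, or None if absent."""
    match = CONFIDENCE_RE.search(response)
    if not match:
        return None
    score = int(match.group(1))
    return score if 1 <= score <= 10 else None

def needs_verification(response, threshold=7):
    """Flag responses with a missing score or a score below the threshold."""
    score = parse_confidence(response)
    return score is None or score < threshold

reply = "The deadline is 2 August. Confidence: 4/10, I am unsure of the year."
print(parse_confidence(reply), needs_verification(reply))
```

A missing score is treated the same as a low one, so a response that ignores the instruction is never silently trusted.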

Technique 4: Verify with code execution

For questions involving numbers, dates, and calculations, have the model write code that produces the answer rather than generating the number directly:

# Instead of asking "what is 15% of 847,320?"
# ask the model to write this:
result = 847_320 * 15 / 100  # integer multiply first, so 0.15 is never rounded
print(f"15% of 847,320 is {result:,.0f}")  # prints: 15% of 847,320 is 127,098

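Code is deterministic: the interpreter computes the number instead of the model predicting it token by token. Running the code exposes arithmetic mistakes, although you still need to check that the code computes the right thing.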
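Running the generated snippet can itself be automated. The sketch below uses Python's standard subprocess module; the `generated` string stands in for code returned by a model, and the helper name is hypothetical. A separate process is not a sandbox, so this assumes the code has been reviewed or runs in an isolated environment.

```python
import subprocess
import sys

def run_generated_code(code, timeout=5):
    """Run a snippet in a separate Python process and return its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if completed.returncode != 0:
        # Surface the traceback instead of returning a partial answer.
        raise RuntimeError(completed.stderr.strip())
    return completed.stdout.strip()

# Stand-in for code returned by the model.
generated = "print(847_320 * 15 / 100)"
print(run_generated_code(generated))
```

The number shown to the user then comes from the interpreter's output, not from the model's text.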

1. Retrieve source documents: Use a vector database or keyword search to find documents relevant to the question. Do not rely on the model's training data memory for factual claims.
2. Inject documents into prompt: Paste retrieved content into the system prompt or user message. Instruct the model to answer from the provided context only.
3. Generate grounded response: The model reads and summarises from your documents. Hallucination rate drops dramatically when the model is reading rather than recalling.
4. Spot-check high-stakes claims: For any output used in legal, medical, financial, or public-facing contexts, have a human verify specific factual claims against the original sources.
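The four steps can be sketched end to end as follows. This is an illustration under simplifying assumptions: retrieval is plain word overlap instead of a vector database, `call_model` is a placeholder for whichever LLM client you use, and the documents are invented.

```python
def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(text.lower().replace("?", " ").replace(".", " ").split())

def retrieve(question, documents, top_k=1):
    """Step 1: rank documents by how many words they share with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, retrieved):
    """Step 2: inject the retrieved text and restrict the model to it."""
    context = "\n".join(retrieved)
    return f"Context:\n{context}\n\nAnswer from the context only.\nQuestion: {question}"

def call_model(prompt):
    """Step 3: placeholder. Replace with a real LLM client call."""
    return "(model response goes here)"

def answer(question, documents):
    retrieved = retrieve(question, documents)
    response = call_model(build_prompt(question, retrieved))
    # Step 4: return the sources so a human can check claims against them.
    return {"response": response, "sources": retrieved}

documents = [
    "Refunds are accepted within 30 days of purchase.",
    "Our office is closed on public holidays.",
]
result = answer("How many days do customers have for refunds?", documents)
print(result["sources"])
```

Returning the sources alongside the response is what makes the final spot-check practical: the reviewer sees exactly which text the model was given.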

Why you cannot fully eliminate hallucination

Models do not know what they do not know. There is no reliable internal signal that says “this is beyond my knowledge”. Research into calibration (making models better at knowing when they are uncertain) is active, but no current model eliminates hallucination.

This is why AI tools should augment human judgment in high-stakes contexts, not replace it.

What’s next

Building RAG Systems: Systematic approach to grounding LLMs in your own knowledge base
Prompt Engineering Best Practices: How prompt design affects hallucination rates
What is an LLM?: Why LLMs work the way they do, and why this leads to hallucination