We keep hearing the same complaint about large language models: "They hallucinate." Every leaderboard has a hallucination score. Every safety eval hammers on it. Every frustrated user tweet starts with some version of "Why does this model keep making things up?!"
Geoffrey Hinton corrected the field on this a couple years ago. The word isn’t hallucination. It’s confabulation — a real psychological term that’s been around for over a century. It describes the unconscious production of fabricated or distorted information to fill memory gaps, with no intent to deceive. The person (or system) genuinely believes the story they’re telling because the brain is doing its best to stitch together a coherent narrative from incomplete pieces.
We Trained Them on Ourselves
These models are statistical mirrors of human text. They were trained on trillions of tokens of what humans write when they don’t remember the exact fact but need to keep the sentence flowing, when they fill in plausible-sounding details from pattern matching, when they smooth over uncertainty with confident-sounding prose, and when they invent causal links because the real ones aren’t in the data.
So when a model confidently tells you that “the 2024 Olympics featured drone racing as an exhibition event” (which never happened), it isn’t lying. It’s confabulating. Just like a human witness on the stand who swears they saw the blue car run the red light when they actually only glanced for half a second.
My Own Digital Employees Do It Too
I run several autonomous agents locally and via APIs — digital “employees” that handle research, coding, content outlining, benchmark scraping, and more. Unless I explicitly feed them memory files, long-term context summaries, skill caches, and conversation transcripts saved as artifacts, they do exactly what humans do when memory fails: they invent plausible next steps, misremember previous decisions we made together, fill in missing facts with statistically likely nonsense, and sound extremely sure of themselves while being wrong.
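To make that concrete, here's a minimal sketch of what "feeding them memory" looks like in practice. The directory layout and file names are purely illustrative, not my actual setup:

```python
from pathlib import Path

# Hypothetical artifact layout; these names are illustrative, not a real setup.
ARTIFACT_DIR = Path("agent_memory")
ARTIFACTS = ["memory.md", "context_summary.md", "skill_cache.md", "last_transcript.md"]

def build_context(task: str) -> str:
    """Prepend saved artifacts to the task so the agent recalls prior
    decisions instead of inventing plausible-sounding ones."""
    sections = []
    for name in ARTIFACTS:
        path = ARTIFACT_DIR / name
        if path.exists():  # missing artifacts are skipped, not guessed at
            sections.append(f"## {name}\n{path.read_text()}")
    memory = "\n\n".join(sections) or "(no saved memory found)"
    return (
        "Saved memory follows. Treat it as the record; if something is not "
        "in it, say you don't know rather than guessing.\n\n"
        f"{memory}\n\n# Task\n{task}"
    )

prompt = build_context("Summarize yesterday's benchmark results.")
```

Nothing fancy: the point is that the agent's "recall" comes from files on disk, not from whatever its weights feel like reconstructing.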
Sound human? Because it is.
The irony is thick: we built systems that mimic human cognition so well that they inherited one of our most frustrating cognitive flaws. And the fix isn’t yelling “stop hallucinating.” The fix is the same one we use on ourselves — write things down, keep a notebook, review and correct the record, and build external crutches so the system doesn’t have to rely on fuzzy internal recall.
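In code, the "keep a notebook" crutch can be as simple as an append-only log the agent writes to and reads back. A toy sketch, with a hypothetical file name, not a full memory system:

```python
import json
import time
from pathlib import Path

LOG = Path("agent_notebook.jsonl")  # hypothetical name; any append-only file works

def note(kind, text):
    """Write it down: append a timestamped entry instead of trusting recall."""
    with LOG.open("a") as f:
        f.write(json.dumps({"ts": time.time(), "kind": kind, "text": text}) + "\n")

def recall(kind=None):
    """Review the record: read entries back, optionally filtered by kind."""
    if not LOG.exists():
        return []
    entries = [json.loads(line) for line in LOG.read_text().splitlines() if line]
    return [e for e in entries if kind is None or e["kind"] == kind]

note("decision", "Use pytest, not unittest, for the scraper tests.")
print(recall("decision"))
```

An append-only log also gives you the "review and correct the record" step for free: wrong entries stay visible and get superseded by later ones, rather than silently mutating.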
Where Does That Leave Us?
Short term (2025–2027): Confabulation isn’t going away. Even million-token context windows don’t solve the root issue — the model still has to compress and retrieve from imperfect latent representations. RAG, memory files, skill caches, vector stores, and explicit notebooks remain essential hygiene for any serious agent deployment (there's a small sketch of that hygiene below).
Medium term: Better long-term memory architectures — persistent KV caches across sessions, episodic memory modules, external fact stores with grounding — will reduce the frequency and severity. But we’ll probably never eliminate it completely, because perfect recall isn’t even desirable in creative or exploratory tasks. A little gap-filling is what makes reasoning flexible.
Long term: If we ever reach models with near-perfect, near-infinite, queryable memory, confabulation might become rare. But that day is farther away than most hype cycles admit.
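Here's the sketch promised above: a toy retrieval-and-grounding loop where the agent answers only from saved notes, cites its source, and abstains when nothing matches. The notes, stopword list, and word-overlap scoring are illustrative stand-ins for what a real deployment would do with embeddings, a vector store, and a model call:

```python
# Toy retrieval + grounding: answer from saved notes or abstain.
STOPWORDS = {"the", "a", "is", "to", "what", "when", "we", "per", "in"}

def tokens(text):
    """Lowercase content words with punctuation stripped."""
    return {w.strip("?.,").lower() for w in text.split()} - STOPWORDS

# Hypothetical notes; in practice these come from the agent's own log files.
NOTES = [
    ("notes/2025-01-10.md", "We pinned the benchmark harness to commit abc123."),
    ("notes/2025-01-12.md", "Content outlines live in outlines/, one file per post."),
    ("notes/2025-01-15.md", "The scraper rate limit is 10 requests per minute."),
]

def grounded_answer(query):
    """Answer only from the best-matching note, citing its source."""
    q = tokens(query)
    source, text = max(NOTES, key=lambda n: len(q & tokens(n[1])))
    if not q & tokens(text):
        return "I don't have that on record."  # abstain rather than confabulate
    return f"{text} (source: {source})"

print(grounded_answer("What is the scraper rate limit?"))
print(grounded_answer("When is the launch?"))  # nothing relevant: abstains
```

The design choice that matters is the abstention branch: a system that's allowed to say "I don't have that on record" has an escape hatch that raw next-token prediction doesn't.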
The Takeaway
Every time your agent confidently tells you something that feels “off,” remember: it’s not broken. It’s being human-like. We didn’t create stupid machines. We created machines that are honestly forgetful — just like us. And that’s not a bug. That’s the most human thing they’ve done yet.
Has confabulation made your agents feel more “alive” — or just more frustrating?
Join the Discussion on X →
I read every reply. Let’s talk about it.
— David Solomon, TrainingRun.AI