
Robert Kirkpatrick

Posted on • Originally published at Medium

Your AI Has the Memory of a Goldfish. Here Is Why That Is Actually Your Fault.

A thread hit r/artificial last week that stopped me mid-scroll. Eighty-eight upvotes and fifty-five comments, which is solid for that sub. The title was something like "LLMs forget instructions the same way ADHD brains do."

I have ADHD. Diagnosed, medicated, the whole thing. And when I read that post, I did not feel attacked. I felt seen.

Because the comparison is disturbingly accurate. And the reason it matters goes way beyond a clever Reddit analogy.

The Pattern Nobody Is Talking About

Here is what happens when you give a large language model a detailed set of instructions at the start of a conversation. For the first few turns, it follows them. The tone is right. The format holds. The constraints stick. Everything looks perfect.

Then around turn eight or nine, things start drifting. The model stops using your formatting. It forgets the constraint you set about length. It reverts to default behavior. By turn fifteen, it is writing like you never gave it instructions at all.

This is not a bug. It is how attention works in transformer architecture.

The model processes your instructions as tokens in a context window. Early tokens carry weight. As the conversation grows, those early tokens get pushed further from the model's active attention. The instructions do not disappear from the window. They just stop mattering as much.
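
A toy example makes the dilution concrete. This is not the real multi-head attention mechanism, just the softmax weighting idea at its core: when an early instruction token competes with a growing number of comparably relevant conversation tokens, its share of the attention budget shrinks.

```python
import math

def softmax(scores):
    # Convert raw relevance scores into attention weights that sum to 1.
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One "instruction" token with a fixed relevance score, followed by a
# growing number of conversation tokens with comparable scores.
for n_conversation in (5, 50, 500):
    scores = [1.0] + [1.0] * n_conversation  # instruction first, then chat
    weights = softmax(scores)
    print(f"{n_conversation:>3} later tokens -> instruction weight {weights[0]:.4f}")
```

With equal scores the instruction's weight is simply 1/(n+1), so it falls from roughly 0.17 at five conversation tokens to 0.002 at five hundred. Real models are far more sophisticated than this, but the competitive nature of the weighting is the point: nothing is deleted, it is just outvoted.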

It is the AI equivalent of someone nodding along in a meeting, saying "got it," and then doing something completely different forty-five minutes later.

Sound familiar?

Why the ADHD Comparison Actually Holds

ADHD is not a deficit of attention. It is a deficit of attention regulation. People with ADHD can hyperfocus for hours on the right task. The problem is sustaining attention on instructions that are not immediately reinforced.

Large language models have the same structural problem. Not because models have brains or consciousness or feelings, but because the architecture has a similar failure mode. The model's core mechanism is literally called "attention." It assigns weight to tokens based on relevance to the current query, and just like an ADHD brain, the model's ability to hold onto instructions degrades as distance increases.

A person with ADHD can hear "remember to send that email after lunch," fully understand it, genuinely intend to do it, and still forget by two fifteen. The instruction was received. It just was not reinforced.

An LLM can receive "always format your response as bullet points with no more than three sentences each," follow it perfectly for six turns, and then start writing paragraphs again. The instruction is still in the context window. It is just no longer weighted heavily enough to override the model's default patterns.

Same failure. Different substrate.

Why "Just Remind It" Does Not Work

The most common advice you will see is to repeat your instructions periodically. Paste them again every few turns. Remind the model what you asked for.

This works. Kind of. In the same way that setting seventeen alarms works for someone with ADHD. It addresses the symptom without touching the cause.

Every time you re-paste instructions, you are burning context window space. You are adding tokens that could have been used for actual work. And you are creating a workflow where you, the human, are responsible for babysitting the AI's attention in real time.
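
The cost is easy to put a number on. Here is a back-of-envelope sketch; the token counts are made up for illustration, not measured from any real workflow:

```python
INSTRUCTION_TOKENS = 500  # hypothetical size of your instruction block
REMIND_EVERY = 5          # re-paste cadence, in turns
TOTAL_TURNS = 30

# Tokens spent repeating yourself instead of doing actual work.
repastes = TOTAL_TURNS // REMIND_EVERY
wasted = repastes * INSTRUCTION_TOKENS
print(f"{repastes} re-pastes burn {wasted} tokens of context")
```

Six re-pastes of a 500-token brief is 3,000 tokens of pure overhead in a single conversation, and every one of them depends on you remembering to do it.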

That is not a system. That is a coping mechanism.

I know. I have about forty of them.

The Real Problem Is Architectural

The reason LLMs drift is not laziness. It is not bad training. It is the fundamental structure of how transformers allocate attention across a sequence of tokens.

Newer models have longer context windows. Claude can handle two hundred thousand tokens. Gemini pushed past a million. People treat this as a solution. More room means the model can hold onto instructions longer, right?

Not exactly.

A longer context window does not fix attention weighting. It just gives the problem more room to develop before it becomes obvious. Your instructions still lose relative weight as the conversation grows. It just takes longer to notice because the window is bigger.

Think of it this way. Losing your keys in a studio apartment makes the problem obvious fast. Losing your keys in a four bedroom house lets you wander around longer before you realize something is wrong. The keys are still lost.

What Actually Fixes It

Here is where the ADHD analogy becomes genuinely useful instead of just interesting.

The gold standard for ADHD management is not "try harder to remember." It is external structure. Checklists. Routines. Systems that do not depend on your attention being perfect in the moment.

My therapist said something years ago that stuck with me. She said, "You will never fix your working memory. Stop trying. Build systems that do not need it."

That is exactly what an AI workflow needs.

The fix for LLM memory drift is not bigger context windows. Better prompts will not solve it either. Re-pasting instructions every six turns like you are playing whack-a-mole with the model's attention span is just busywork.

The fix is building the checkpoints into the system itself.

What a Checkpoint System Looks Like

A checkpoint is a structured evaluation step that runs at a defined point in the workflow. Not when you remember to run it. Not when you notice the output drifting. At a defined, automatic, non-negotiable point.

In a CORE system design, this means every output stage has a validation pass built in. The model produces something. Before it moves forward, it gets checked against the original requirements. A structured rubric runs the same way every time, not a human eyeballing the output.
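
To make the idea concrete, a checkpoint can be as simple as a list of programmatic checks run against every draft before it moves forward. This is a hypothetical sketch of the pattern, not the actual CORE implementation; the rubric here checks the "bullet points, max three sentences each" brief from earlier:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    passed: Callable[[str], bool]

def run_checkpoint(output: str, checks: list[Check]) -> list[str]:
    # Return the names of every requirement the output violates.
    return [c.name for c in checks if not c.passed(output)]

# Hypothetical rubric for a "bullet points, max three sentences each" brief.
rubric = [
    Check("uses bullet points", lambda t: t.lstrip().startswith("-")),
    Check("no bullet over 3 sentences",
          lambda t: all(line.count(".") <= 3 for line in t.splitlines())),
]

draft = "Here is a long paragraph that ignores the formatting brief entirely."
failures = run_checkpoint(draft, rubric)
print(failures)  # ['uses bullet points']
```

The point is that the rubric runs identically on turn two and turn twenty. It does not get tired, and it does not forget what the brief said.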

This is how we built Bulletproof Writer. The system does not depend on you noticing that the AI drifted off your instructions. It catches the drift automatically. Eight scoring dimensions. Every draft. Every time. You could be half asleep and the quality gate still holds.

Discovery Mode in version 3.4 takes this further. Instead of just scoring what you wrote, it maps the gap between what you intended and what actually landed on the page. It shows you where the model's attention drifted from your original brief. Specific, measurable dimensions that you can act on.

That is the difference between "try harder to remember" and "build a system that remembers for you."

The Bigger Lesson From Neuroscience

The ADHD research community figured this out decades ago. External structure beats internal willpower every single time. The people who manage ADHD well are not the ones with the best memory. They are the ones with the best systems.

The same principle applies to AI workflows. The people getting reliable, consistent results from LLMs in 2026 are not the ones with the most patience for re-pasting instructions. They are the ones who stopped relying on the model's attention span and built accountability structures around it.

You would not hand an employee a list of twenty requirements on Monday morning and then check the work on Friday with no check-ins, no milestones, no structured review points in between. That would be insane management. Yet that is exactly how most people use AI. Giant instruction dump at the top. Hope for the best. Get frustrated when the output drifts.

The model is not the problem. The workflow is.

A Practical Framework

If you are dealing with LLM memory drift right now, here is what actually works.

Break long sessions into short ones. Instead of one conversation with thirty turns, run three conversations with ten turns each. Re-inject your full context at the start of each. The model's attention is strongest at the beginning of a session. Use that.
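
Here is a minimal sketch of that session-chunking pattern. `call_model` is a stand-in for whatever chat API you actually use, and `SYSTEM_BRIEF` is a placeholder for your full instructions:

```python
# Re-inject the full brief at the start of each short session instead of
# letting one long conversation drift.
SYSTEM_BRIEF = "Always answer in bullet points, max three sentences each."

def run_session(turns: list[str], call_model) -> list[str]:
    # Fresh context: the brief sits at the front, where attention is strongest.
    history = [{"role": "system", "content": SYSTEM_BRIEF}]
    replies = []
    for user_msg in turns:
        history.append({"role": "user", "content": user_msg})
        reply = call_model(history)
        history.append({"role": "assistant", "content": reply})
        replies.append(reply)
    return replies

# Three sessions of ten turns each, instead of one thirty-turn conversation.
all_turns = [f"turn {i}" for i in range(30)]
sessions = [all_turns[i:i + 10] for i in range(0, 30, 10)]
```

Each call to `run_session` starts from a clean history with the brief in the strongest position, so the drift clock resets every ten turns instead of running for thirty.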

Build validation into the workflow, not your calendar. Do not plan to "check the output later." Build the check into the process itself. Every output gets evaluated against the same criteria before it moves forward.

Stop treating instructions as one-time events. In ADHD management, you do not tell yourself something once and expect it to stick. You build the reminder into the environment. Same with AI. Your instructions should be embedded in the system architecture, not pasted once at the top of a chat and forgotten.

Use structured scoring instead of vibes. "This looks good" is not a quality gate. "This scored seven point two out of ten on pacing and four point one on gap density" is a quality gate. One of those depends on you being sharp. The other does not.
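
The last point can be sketched as a simple threshold gate. The dimension names and minimums here are hypothetical, chosen to match the example scores above:

```python
# Hypothetical quality gate: dimension scores out of 10, with per-dimension
# minimums that must all hold before a draft moves forward.
THRESHOLDS = {"pacing": 6.0, "gap_density": 5.0, "format_adherence": 7.0}

def gate(scores: dict[str, float]) -> bool:
    # Missing dimensions count as zero, so an incomplete scorecard fails.
    return all(scores.get(dim, 0.0) >= minimum
               for dim, minimum in THRESHOLDS.items())

print(gate({"pacing": 7.2, "gap_density": 4.1, "format_adherence": 8.0}))
```

This draft fails on gap density, and the gate says so regardless of whether you were paying attention. That is the whole point: the judgment is in the thresholds, not in your state of mind at review time.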

Why This Matters Beyond AI

The reason the Reddit thread resonated is not just because the technical comparison is accurate. It touches something personal for a lot of people.

Millions of adults have ADHD. Many of them discovered it through exactly this kind of pattern recognition. They saw something that described their brain, and it clicked. The LLM comparison works the same way, in reverse. You watch the model forget your instructions, and you think, "Oh. That is what I do."

The solution is the same in both directions. Stop blaming the hardware. Start building better systems around it.

I did not fix my ADHD by getting a better brain. I fixed my output by building workflows that compensated for the attention regulation I would never have naturally. External checklists. Structured routines. Evaluation steps that do not depend on me remembering to do them.

That is what a CORE system does for AI. It does not fix the model's attention span. It builds the structure around the model so that attention drift stops mattering.

Where to Start

If you are tired of re-pasting instructions every six turns, stop doing that. Build a system that does not need you to.

The CORE Operating System is how we approach this at TotalValue. Structured checkpoints and validation passes that catch drift before it reaches your output.

Bulletproof Writer v3.4 is one specific implementation. Eight scoring dimensions. Automatic quality gates. Discovery Mode that maps the gap between your intent and the model's output. It is the external structure your AI workflow has been missing.

Your model does not need a better memory. It needs a system that does not depend on memory at all.


The comparison between LLMs and ADHD brains came from r/artificial. The engineering solution came from fifteen years of living with one of them and two years of building systems for the other.


