The prompt is the cheap part. The context is the product.

#ai #agents #discuss #llm

Suppose the model forgot you completely between every message. No memory, no history, no idea who you are or what you were doing thirty seconds ago. Every time you hit send, it wakes up with total amnesia, reads whatever you put in front of it, answers, and dies.

That is not a thought experiment. That is just how a large language model works. The weights are frozen. Between two API calls it remembers nothing. Everything that feels like continuity (the model "knowing" your codebase, your tone, the bug you were chasing) is text that something assembled and handed over again, one more time, as if for the first time.

I find it useful to take that fact and push it all the way to its conclusion, because the conclusion quietly changes what I think I am building when I work with these tools.

Here is the first thing that falls out of it. If the model is amnesiac, the prompt is the most disposable artifact in the whole system. I used to treat the prompt as the thing I was crafting: the careful wording, the role-play, the "you are an expert" preamble. But a prompt lives for exactly one call and then it is gone. It is a paper airplane. Whatever value I thought I was banking in my prompt-writing skill was mostly evaporating on send.

The durable thing, the thing that actually decides whether the answer is any good, is the context I assembled around the prompt. Which files I pulled in. Which past decision I pasted back. Which error log, which schema, which three sentences of "here is why I do it this weird way." That assembly is the real work. It is also the part I kept treating as an afterthought while I fiddled with adjectives in the instruction line.

Why does this reframe matter?

Because it moves the engineering surface to a different place than most people are looking.

The industry has been circling this for about a year. Sometime in mid-2025 the phrase "context engineering" started showing up as the successor to "prompt engineering," and by 2026 it is the default framing. Anthropic published a whole piece on context engineering for agents; the trade reports are saturated with it. The short version of the shift: prompt engineering is about how you ask, context engineering is about what the model can see when you ask. The first is a sentence. The second is a system.

I think the reframe is correct, but I want to put it more bluntly than the vendor posts do. The prompt is the cheap part. It is cheap to write, cheap to copy, cheap to throw away. The context is the product. It is expensive to assemble, expensive to keep current, and it is the one part nobody can lift from you, because it is made entirely of your specifics: your code, your constraints, your year of small decisions.

And the moment context becomes the product, you walk straight into the thing the big-window marketing skips over. More context is not free. Models in 2026 ship million-token windows, and a few advertise ten million. But a long context is not a filing cabinet you can stuff and forget. Every serious evaluation I have read lands on the same failure: facts buried in the middle of a long context get lost, with accuracy dropping somewhere around 10 to 25 percent compared to the same fact placed at the start or the end. The bigger the window, the more "middle" there is to lose things in.

So context is not "paste everything." Context is curation under a relevance budget. That is an editorial job, not a storage job, which is exactly why it does not get easier as the windows grow. A bigger truck does not make you a better packer.

What this changed about my own notes

Here is where I stopped theorizing and looked at my own desk. If context is the product, then for me, working alone, the raw material for that product is the pile of things I have written down. My notes are not a memory aid anymore. They are the storage layer that everything I hand a model gets drawn from. A half-sentence I captured in a meeting in March becomes, in June, the exact line I paste in to stop the model re-suggesting the approach I already rejected and forgot I rejected.

Which means the bottleneck is not the model, and it is not the prompt. It is capture. A thought I failed to write down is context that will never exist, no matter how large the window gets. I build a note-to-email app, so I am biased here, but the bias grew out of the problem rather than the other way around: the cheapest lever on the quality of my context is the friction of getting things into it. A lot of mine now goes in by voice. I talk, it transcribes on the device, the line lands while my hands are still on the keyboard or the steering wheel. The method is not the point. The point is that lowering capture friction raises the ceiling on every future prompt, and almost nobody budgets for it.

I can date the moment this stopped being abstract. One afternoon in April I spent two hours fighting the same suggestion from Cursor, which kept proposing I cache a value I had very deliberately decided not to cache, for a reason that lived only in my head. Each time, I re-typed some version of that reason into the chat, got a fine answer for that one turn, closed the tab, and lost the explanation again. I was doing prompt engineering, I was doing it competently, and it was worthless, because none of it survived the session. The fix was not a sharper prompt. The fix was writing one durable line, the decision and its reason, into the place my notes already live, so that the next time the question surfaced I pasted instead of re-argued. The two hours I burned were not a model failure. They were a context failure, and the gap was entirely on my side of the keyboard.

The shape of a context line that earns its keep, for me, looks boring on purpose:

2026-03-14 decided: no background refresh on the watch app.
reason: watchOS kills long-running sends; relay through the phone instead.

That is not a note to myself. It is a paste-ready unit of context. When the model cheerfully suggests background refresh on the watch three months later, I do not argue with it and I do not re-derive the reasoning. I paste the line. The decision and the reason for it travel together, into the model's amnesiac little world, as if I had just explained both for the first time. I keep a versioned log of every prompt I send for the same reason; the prompt I wrote is disposable, but the context that made it work is worth filing.

The half of this I do not believe

Now the part where I argue with my own thesis, because I only half-run it.

If context is genuinely the product, the logical move is to hoard. Capture everything, store everything, retrieve aggressively, feed the model a fat dossier every time. I do not do that, and I have come to think hoarding is a trap, for two reasons. The lost-in-the-middle problem means a bloated context actively hurts: the signal drowns in the volume. And a context store you never prune rots. Last year's decision, the one I reversed in March, is still sitting there ready to mislead a model that has no way to know it is stale. A second brain stuffed with expired context is worse than a small one, the same way a confident wrong code comment is worse than no comment at all.

So the honest version of my position is narrower than the slogan on the title line. Context is the product, but the product is edited, not accumulated. The skill is not "save more." It is knowing which one paragraph, out of a year of notes, to put in front of the model right now, and being willing to throw the rest out of the frame. That is a judgment I cannot automate yet, and I am not fully convinced I want to.

So I run half my own thought experiment. I take seriously that the model forgets me, and I pour the effort into the context instead of the prompt. I refuse to take seriously the idea that more context is better, because the evidence and my own rotting notes agree that it is not.

If you work with these tools every day, I am curious about one specific thing, because it is exactly where I am least sure of myself. What is the one piece of context you catch yourself re-explaining to the model over and over, the decision or the constraint or the "no, it has to be done this way," that you have never managed to store somewhere it gets reliably pulled back in? I want the actual line, not the category. I suspect the collected answers would draw a pretty accurate map of where the tooling still pretends the model remembers you, when it plainly does not.

I'm a solo dev. I build Simple Memo, an iOS app that turns a thought into an emailed note before I lose it. I write here every few days about working with LLMs and shipping things alone.

DEV Community

The prompt is the cheap part. The context is the product.

Why does this reframe matter?

What this changed about my own notes

The half of this I do not believe

Top comments (0)