AI in Practice, No Fluff — Day 3/10
The first time I sat down to design a conversation instead of just have one, I wrote a single starter message, pasted it into three different tools, and watched each respond to the exact same prompt. The message was far longer than the ones I usually write: structured with headers, carefully selected context, and a short list of constraints I wanted the model to keep in mind through the whole session.
The first exchange was better than what I usually got after ten.
That experiment pushed me from thinking of AI as something I talk to, to thinking of it as something I write for.
The mechanic behind the reframe
In the first series we covered context windows: the fixed-size whiteboard an AI works from during a conversation. That series took the mechanic and asked what to do when the window fills up. This post takes the same mechanic and asks a different question: what would you do differently if you designed the contents on purpose?
A multi-turn conversation is exactly what it sounds like: a back-and-forth where each message you send and each response from the AI counts as one turn. It is helpful to remember that the AI is not remembering your previous messages. The platform is resending them on each turn.
Every time you hit send, the tool you are using takes your entire conversation history, packs it into a single request, and sends the whole thing to the model. Your first message. The AI's reply. Your second message. The AI's reply. All of it, concatenated, every time. The model reads the whole block, produces the next response, and hands the new entry back to the platform to append.
There is no stored state on the model side. No session it is tracking. The model sees the exact same thing whether you are on message three or message thirty: one request containing everything that has happened so far, plus your new message at the bottom.
This is not a quirk of ChatGPT or Claude or any specific product. It is how the underlying API works. The consumer app is doing the bookkeeping of "who said what, in what order" on your behalf, and resending the transcript every call.
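The bookkeeping described above can be sketched in a few lines. This is a minimal illustration, not any product's actual code; the role/content shape mirrors the common chat-API convention, and the model call is stubbed out because the point is the transcript, not the model:

```python
def fake_model(messages):
    """Stand-in for the real model API: it reads the whole transcript."""
    return f"(reply after reading {len(messages)} messages)"

transcript = []  # append-only document, owned by the app, not the model

def send(user_message):
    transcript.append({"role": "user", "content": user_message})
    # Every call resends the ENTIRE history, not just the new message.
    reply = fake_model(transcript)
    transcript.append({"role": "assistant", "content": reply})
    return reply

send("First message")
send("Second message")
# On the second send, the model receives all three messages so far.
# All conversational state lives in this list, on the app's side.
```

Nothing about `fake_model` changes between turn three and turn thirty; only the list it is handed grows.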
Once you internalize that, the conversation stops looking like a conversation. It starts looking like something else entirely: a document you and the AI are co-editing, append-only, that gets re-read in full before every new line is written.
Designing the opener
The first message in any conversation is the most-reread thing in the whole session. It will be read again on turn two, again on turn five, again on turn twenty. Every other message gets read fewer times as newer content pushes it deeper into the transcript, but the opener is always there.
That changes how I write the first message. When I care about the quality of the whole session, I stop typing casually and start writing a mini-briefing:
- What I am trying to do (one or two sentences)
- What context the AI needs (the actual relevant background, not everything I know)
- What constraints matter (tone, format, things to prioritize, things to avoid)
- A sample of what good output looks like, when format is specific
I write it in a text editor, not the chat box. Then I paste it in. The session that follows inherits the document I just wrote as its permanent foundation.
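The four-part briefing above is mechanical enough to script. Here is a hypothetical helper that assembles it into one pasteable message; the section labels and example values are my own, not a standard:

```python
def build_opener(goal, context, constraints, sample=None):
    """Assemble a mini-briefing opener from its four parts."""
    parts = [
        "## Goal\n" + goal,
        "## Context\n" + context,
        "## Constraints\n" + "\n".join(f"- {c}" for c in constraints),
    ]
    if sample:  # only include a sample when the format is specific
        parts.append("## Example of good output\n" + sample)
    return "\n\n".join(parts)

opener = build_opener(
    goal="Draft a migration plan for moving the billing service to Postgres.",
    context="Current store is MySQL 5.7; ~40 tables; nightly jobs depend on it.",
    constraints=["Zero-downtime cutover", "No ORM changes this quarter"],
)
```

Whether you script it or keep a template in a text file, the value is the same: the opener is written as a document, then pasted in.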
For work I return to regularly, this opener graduates into a Project. ChatGPT, Claude, and Gemini have all landed on some version of this: a folder that holds a persistent set of instructions and files, where every conversation opened inside it inherits them without you having to repaste anything. Gemini's rollout is the most recent and still surfaces under multiple names (Projects, Notebooks) depending on which product surface you are in, but the idea is the same regardless of the label. Once the opener is stable, projects promote it from "thing I keep in a text file" to "something every chat in this workspace inherits automatically."
Sometimes the right opener is an invitation for the AI to interview you. "Ask me five questions before you try to answer X." It is still multi-turn with the same mechanic, just a very different shape: the first few exchanges fill the document with context the model helped shape, not context you wrote alone.
Branching as a design choice
There is a habit worth picking up: when you are about to shift to a related but distinct task, do not continue the current conversation. Open a new one.
Not because the current conversation is "full." Because the next task deserves its own transcript. Two related questions will both perform better in their own sessions than if they share a growing combined history. The model stops weighing your architectural discussion from twenty minutes ago against a small refactor question that does not need any of it.
A conversation for me is usually scoped to a single question or a single task. When the task shifts, I open a new window. The overhead of re-establishing context is small because my opener does most of that work. The savings on model attention are large because the session stays focused.
Distillation, not just summarization
A common technique I use is to ask the AI to summarize a long conversation and then use the summary to start a fresh one. Series 1 covered this as drift management. This is a design-level version of the same move.
When a conversation has done real work, the summary is not just a tool for starting over. It is an artifact. The summary of a session that produced a useful decision is itself a reusable starter message for future sessions in the same territory. You are summarizing to extract.
Pattern I use:
- At the end of a productive session, ask the AI to produce a structured summary: the decisions made, the constraints, the open questions.
- Spend time reading and editing it; this is the real work of turning a session into reusable context.
- Save it somewhere you can find it when the topic comes up again.
Every good session leaves a distillable residue behind. Treating that residue like an asset, not exhaust, is the move. I use this a lot for capturing decisions in architecture decision records (ADRs) and for creating guides after I build features.
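The distillation step can also be scripted. A sketch of one way to do it, using the same messages-list shape as a chat API; the prompt wording is illustrative, not canonical, and the point is that a fixed prompt makes every session's residue land in the same three-section shape:

```python
DISTILL_PROMPT = """Summarize this session as a reusable briefing with
exactly three sections:
## Decisions made
## Constraints
## Open questions
Keep each section to short bullet points. Omit conversational filler."""

def distillation_request(transcript):
    """Return a new request: the session plus the distill prompt as a final turn."""
    return transcript + [{"role": "user", "content": DISTILL_PROMPT}]

session = [
    {"role": "user", "content": "Help me pick a queue for event ingestion."},
    {"role": "assistant", "content": "...long discussion..."},
]
request = distillation_request(session)
```

The model's answer to that final turn is the artifact; the edit-and-save steps above still happen by hand.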
Where consumer memory features fit
Most of the major products now have some form of cross-conversation memory. ChatGPT has a Memory feature that carries useful facts about you between sessions. Other products are rolling out similar capabilities at their own pace.
These do not change the in-conversation mechanic. Every message in the current chat still resends the full history to the model. The memory layer runs alongside that, injecting persistent facts into the system prompt or as retrieved context. Useful, and a layer above the conversation structure, not a replacement for it.
If you want stable per-task context, use projects. If you want stable per-user context, use the memory feature. If you want the AI to respond well to what you said two messages ago, do not think of it as remembering at all. Think of the transcript as the document, and design around that.
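A rough sketch of how the three layers stack inside a single request: persistent project instructions, injected memory facts, then the per-conversation transcript. Field names and structure here are illustrative assumptions; real products assemble this internally with their own formats:

```python
def assemble_request(project_instructions, memory_facts, transcript, new_message):
    """Stack project context, memory, and transcript into one request."""
    system = (
        project_instructions
        + "\n\nKnown about the user:\n"
        + "\n".join(f"- {fact}" for fact in memory_facts)
    )
    return (
        [{"role": "system", "content": system}]  # per-task + per-user layers
        + transcript                             # the document you are co-editing
        + [{"role": "user", "content": new_message}]
    )

request = assemble_request(
    project_instructions="You are reviewing Go services for team X.",
    memory_facts=["Prefers table-driven tests", "Works mostly in Go"],
    transcript=[
        {"role": "user", "content": "earlier turn"},
        {"role": "assistant", "content": "earlier reply"},
    ],
    new_message="Review this handler.",
)
```

Note that the transcript is still resent whole; the memory and project layers sit above it, not in place of it.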
The reflex
The instinct, when an exchange is not going well, is usually to send another message to fix it. Another clarification. Another correction. Another example. That instinct is sometimes right.
The better reflex, most of the time: stop, close the window, and redesign. Write the starter message you wish you had used. Open a fresh session with it. The minutes you spend on the opener save more than you would lose nudging the current conversation into shape.
The conversation is not a memory. It is a document you are writing. You are the designer.
Tomorrow: when your prompt is almost working but keeps missing in the same way, how to handle it like a bug rather than guessing your way to a fix.