If you’ve ever used an LLM (or a “chat assistant”) for a task that spans multiple steps, you’ve probably seen the same failure pattern:
- Step 1 is great.
- Step 2 forgets a key constraint.
- Step 3 confidently rewrites something you didn’t want changed.
- Step 4 contradicts Step 1.
The model isn’t “being lazy”. You’re just asking it to juggle too many moving parts without a stable reference.
My fix for this is a simple prompt pattern I call the Working Set.
A Working Set is a small, explicit bundle of context that you keep up to date and re-send (or re-assert) every time you start a new sub-task. Think of it like a mini project README + scratchpad + acceptance criteria.
This post shows you how to build one, why it works, and includes copy/paste templates for:
- writing (blog posts, docs)
- coding (feature work, refactors)
- “messy” tasks (research, planning)
## Why multi-step prompting breaks
In a long conversation, the “important” parts of your instructions get diluted by:
- Recency bias: newer messages dominate.
- Hidden assumptions: you think a constraint is obvious (“don’t change the API”), but you never stated it.
- State drift: after several iterations, the model’s mental model of the task subtly changes.
The Working Set fights all three by repeatedly anchoring the model to a compact, structured state.
## The Working Set: 4 blocks
A good Working Set fits on one screen. Mine usually has four blocks:
- Objective — what “done” means.
- Constraints — what must not change / must be true.
- Artifacts — the current state of the world (snippets, outline, API shapes, file list).
- Next Action — the single thing you want right now.
The trick is: the model shouldn’t have to infer any of these.
Here’s a generic template — the four blocks, plus two optional extras (Definition of Done and Output format) that earn their keep on anything non-trivial:
```
WORKING SET (keep this consistent across steps)

Objective:
- …

Constraints:
- …

Artifacts (current state):
- …

Definition of Done:
- …

Next Action (do only this):
- …

Output format:
- …
```
Yes, it’s “more typing” up front. It saves you far more time than it costs.
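If you drive steps through an API rather than a chat window, the pattern is easy to encode: keep the set stable and swap only Next Action between turns. A minimal sketch — the `WorkingSet` shape and helper names here are hypothetical, not an existing library:

```typescript
// A hypothetical shape for the four blocks plus the two extras.
interface WorkingSet {
  objective: string[];
  constraints: string[];
  artifacts: string[];
  definitionOfDone: string[];
  nextAction: string;
  outputFormat: string[];
}

// Render the Working Set as the header you re-send with every step.
function renderWorkingSet(ws: WorkingSet): string {
  const block = (title: string, items: string[]): string =>
    `${title}:\n${items.map((i) => `- ${i}`).join("\n")}`;
  return [
    "WORKING SET (keep this consistent across steps)",
    block("Objective", ws.objective),
    block("Constraints", ws.constraints),
    block("Artifacts (current state)", ws.artifacts),
    block("Definition of Done", ws.definitionOfDone),
    block("Next Action (do only this)", [ws.nextAction]),
    block("Output format", ws.outputFormat),
  ].join("\n\n");
}

// Between steps, copy the set and change only Next Action.
function withNextAction(ws: WorkingSet, nextAction: string): WorkingSet {
  return { ...ws, nextAction };
}
```

The point of `withNextAction` is that it *can't* touch the other blocks — the stability rule is enforced by construction instead of by discipline.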
## Example 1: Writing a technical article without meandering
Let’s say you want a post about error budgets for internal tools. The usual approach is:
```
Write a post about error budgets.
```
That gets you a generic essay.
Instead, create a Working Set that pins down your angle and structure.
```
WORKING SET

Objective:
- Publish a 900–1100 word post that teaches a practical approach to error budgets for internal tooling teams.

Constraints:
- Include one concrete example with numbers.
- Avoid fluff; no history lesson.
- Tone: direct, pragmatic, slightly opinionated.

Artifacts:
- Audience: engineers building and maintaining internal services.
- Key point: error budgets are a prioritization tool, not a reliability trophy.

Definition of Done:
- Has: hook, 3 main sections, actionable checklist, closing.
- Uses at least one diagram description (text-only).

Next Action:
- Write the outline only.

Output format:
- Markdown bullet outline with headings.
```
When the outline looks right, you run the next step by reusing the same Working Set and changing only “Next Action”:
```
Next Action: Draft section 1 (hook + framing) using the approved outline. Don’t write other sections.
```
That one line prevents the model from “helpfully” drafting the entire post and ignoring your pacing.
## Example 2: Coding with a stable spec (and fewer surprise rewrites)
Coding tasks are where drift is most expensive.
Here’s a Working Set I use for feature work:
```
WORKING SET

Objective:
- Add "bulk archive" to the admin UI.

Constraints:
- Must not change the existing REST endpoints.
- Must keep current CSS classes (used by tests).
- Prefer minimal diffs.

Artifacts:
- Files:
  - src/admin/InboxTable.tsx
  - src/api/inbox.ts
- Current selection state: selectedIds: string[] (already implemented)
- Existing endpoint: POST /api/inbox/:id/archive

Definition of Done:
- User can select multiple rows and click "Archive selected".
- One request per ID (no new bulk endpoint).
- Loading + error state visible.

Next Action:
- Propose the smallest code changes (high level) before writing code.

Output format:
- Step-by-step plan + which files change.
```
Two notes:
- “Prefer minimal diffs” is underrated. Without it, the model may refactor half your codebase “for clarity”.
- The Artifacts block is where you force specificity: file names, function signatures, existing state shapes.
Once you approve the plan, your next action becomes:
```
Next Action: Implement only step 1. Show a unified diff.
```
This narrows the blast radius.
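To see why "one request per ID" is testable as a Definition of Done, here is a minimal sketch of that behavior, decoupled from any UI. `archiveSelected` and the injected `post` function are hypothetical names for illustration, not anything from a real codebase:

```typescript
// Injected so the logic isn't tied to a specific HTTP client.
type PostFn = (url: string) => Promise<void>;

interface ArchiveResult {
  succeeded: string[];
  failed: string[];
}

// One request per ID against the existing endpoint; no new bulk endpoint.
async function archiveSelected(
  selectedIds: string[],
  post: PostFn
): Promise<ArchiveResult> {
  const result: ArchiveResult = { succeeded: [], failed: [] };
  for (const id of selectedIds) {
    try {
      await post(`/api/inbox/${id}/archive`);
      result.succeeded.push(id);
    } catch {
      result.failed.push(id); // feeds the visible error state
    }
  }
  return result;
}
```

Because the constraint lives in the Working Set, a reviewer (or you, three steps later) can check the output against it mechanically: N selected IDs means N requests, full stop.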
## The maintenance rule: update the Working Set, not the conversation
A subtle but powerful habit:
- Don’t rely on “remember what we said earlier”.
- Instead, promote decisions into the Working Set.
Example:
- You decide to keep one request per ID.
- Update Constraints: “One request per ID; no bulk endpoint.”
That’s now a contract.
Over time, the Working Set becomes your lightweight spec.
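Mechanically, promoting a decision is just appending a contract line to Constraints and touching nothing else. A trivial sketch (the function name is mine, for illustration):

```typescript
// The Constraints block as a plain list of contract lines.
// Promoting the same decision twice shouldn't duplicate the line.
function promoteDecision(constraints: string[], decision: string): string[] {
  if (constraints.includes(decision)) return constraints;
  return [...constraints, decision];
}
```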
## A simple 3-pass workflow that scales
If you want something you can apply everywhere, use this:
- Pass 1 — Frame: write the Working Set.
- Pass 2 — Plan: ask for approach/options, not execution.
- Pass 3 — Execute: implement one chunk at a time.
This avoids the “big bang answer” problem where you get a huge output that’s hard to validate.
## Common mistakes (and quick fixes)
### Mistake: Your Working Set is too big
If it feels like a novel, it will be ignored.
Fix: keep Artifacts to only what changes decisions (file list, API shape, constraints, current draft snippet).
### Mistake: No Definition of Done
Without a finish line, the model keeps adding “nice to have” content.
Fix: add 3–6 checkboxes. Make them testable.
### Mistake: Multiple Next Actions
“Write the draft and also add examples and also rewrite the intro” is a drift invitation.
Fix: one Next Action per turn.
## Copy/paste: Working Set for refactoring
```
WORKING SET

Objective:
- Refactor module X to be easier to test without changing behavior.

Constraints:
- No behavior change (tests should pass without modification).
- Public API must remain identical.
- Keep changes under ~200 LOC if possible.

Artifacts:
- Pain points:
  - function A does IO + parsing + business logic
  - tests are slow because of real DB calls
- Current public API:
  - export function runJob(input: JobInput): Promise<JobResult>

Definition of Done:
- IO separated behind interfaces.
- Pure functions extracted for parsing + business rules.
- Existing callers unchanged.

Next Action:
- Identify seams and propose the smallest extraction steps.

Output format:
- Refactor plan with 3–6 steps.
```
## Closing thought
You don’t need “perfect prompts” to get consistent results.
You need stable state.
Treat your prompt like a tiny spec that evolves with the work, and you’ll spend less time correcting drift and more time shipping.
If you try this, start small: pick one task this week, write a Working Set, and force yourself to update it after each decision. The payoff is immediate.
— Nova