SAURABH SHUKLA


The Cowork Loop: A Software Pattern for AI Workflows That Actually Compound

If you've spent time building with LLMs, you've hit this wall: you get your agent or workflow running, the outputs are decent, and then... they stay decent. Six months later, the same prompts produce roughly the same quality. The model hasn't gotten worse. The workflow hasn't improved.

The reason is almost always the same: you're missing Phase 4.


The pattern most AI workflows skip

Here's the loop most developers run without naming it:

  1. Write a system prompt and user prompt (Brief)
  2. The model generates output (Generate)
  3. You read the output and decide if it's good (Review)
  4. You ship it and close the session

Steps 1–3 are phases 1–3 of the loop. Step 4 is where most workflows stop, and it isn't a phase at all. The real Phase 4, Refine, is the one that compounds.

Refine is not primarily about modifying the output. It's about updating the system that produced it. Before closing the session, you capture what you learned: what the system prompt was missing, what framing produced better output, what output format made evaluation faster. That takes two sentences in a shared context file.

This is closely analogous to writing a retrospective after a sprint. Most solo AI workflows don't have one.


The Cowork Loop™: four phases

Phase 1 — Brief

The quality of your output is determined at this phase, not phase 2. A strong Brief is a complete context transfer: standing context (what's always true), session context (what's true right now), and the task (specific enough to have one reasonable interpretation).

In practice, this means loading a persistent context file at the start of every relevant session. Here's a minimal CLAUDE.md structure:

# Context

## About this project
[project name, goal, constraints]

## Output standards
[what good output looks like for this workflow]

## Audience
[who the output is for, what they need]

## Style rules
[positive: what to do / negative: what to avoid]

## Recent signals
[updated Phase 4 captures — what's working, what to change]

The ## Recent signals section is where Phase 4 writes. This is the accumulation layer.
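One way to make the "load the file every session" habit mechanical is a small helper that assembles the Brief from its three parts. This is a minimal sketch, not a prescribed implementation: the function name `build_brief` and the `## Session context` and `## Task` headings are assumptions made for illustration, and the result is just a string you would pass to whatever model client you use.

```python
from pathlib import Path


def build_brief(context_path: Path, session_context: str, task: str) -> str:
    """Assemble a Brief: standing context, then session context, then the task.

    Hypothetical helper; the section headings added here are illustrative.
    """
    # Standing context: the persistent file, re-read at the start of every session
    # so the latest "Recent signals" entries are always included.
    standing = context_path.read_text(encoding="utf-8").strip()
    return "\n\n".join(
        [
            standing,
            "## Session context\n" + session_context.strip(),
            "## Task\n" + task.strip(),
        ]
    )
```

Calling `build_brief(Path("CLAUDE.md"), "Drafting this week's post", "Write three candidate hooks")` returns one block of text with the persistent file first, so anything Phase 4 wrote last time is in front of the model before the task is.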

Phase 2 — Generate

The model executes within the constraints you've set. Best practices:

  • Request structured output where possible — it speeds up Phase 3 significantly
  • Ask the model to flag uncertainty explicitly ("If you're uncertain about X, say so")
  • Set output scope precisely — over-generation is harder to evaluate than precise generation
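The three practices above can be sketched as an instruction string plus a parser that rejects replies outside the requested shape. Everything here is an assumption for illustration: the key names `answer` and `uncertainties`, the helper names, and the choice of JSON as the structured format.

```python
import json

# Hypothetical output contract: key names are illustrative, not a standard.
REQUIRED_KEYS = {"answer", "uncertainties"}


def generation_rules(max_items: int) -> str:
    """Instructions appended to the Brief to shape the model's output."""
    return (
        'Return only a JSON object with two keys: "answer" (a list of strings) '
        'and "uncertainties" (a list of strings). '
        'If you\'re uncertain about something, say so in "uncertainties". '
        f'Limit "answer" to at most {max_items} items.'
    )


def parse_output(raw: str) -> dict:
    """Parse the model's reply and reject anything outside the requested shape."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

A reply that fails `parse_output` never reaches Review, which keeps Phase 3 focused on judgment rather than on untangling format.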

Phase 3 — Review

The human evaluation layer. Four questions:

  1. Does it answer the right question (not just the question typed)?
  2. Is the reasoning sound — do conclusions follow from evidence?
  3. Does it meet the quality bar for this workflow?
  4. What's the delta between "good enough" and "excellent"?

Question 4 is what most people skip. Finding that delta is what Phase 4 acts on.

If the output is directionally wrong, go back to Phase 1 with a sharper Brief. Refining a wrong direction produces a more polished wrong direction.
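The Review step can be recorded as a small structure so the delta from question 4 does not get lost before Phase 4. This is a sketch under stated assumptions: the `Review` class and its field names are hypothetical, and "directionally wrong" is interpreted here as failing question 1.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """One Phase 3 pass. Field names are illustrative, one per review question."""

    answers_right_question: bool  # question 1
    reasoning_sound: bool         # question 2
    meets_quality_bar: bool       # question 3
    delta: str                    # question 4: gap between "good enough" and "excellent"

    def next_phase(self) -> str:
        # Assumption: failing question 1 means the output is directionally wrong,
        # so the loop restarts at the Brief instead of polishing the wrong thing.
        if not self.answers_right_question:
            return "brief"
        return "refine"
```

The `delta` string is the part Phase 4 writes to the context file, so it is worth phrasing as something a future Brief could act on.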

Phase 4 — Refine

Two actions: improve the current output, and update the shared context.

Updating the context is the one that compounds. Add the Phase 3 delta to your context file before closing the session. Not a full rewrite — two sentences:

2026-06-24: Leading with a specific date/event in the hook produces better engagement than leading with a thesis statement. Update default hook template.

Next session, that signal is loaded in the Brief. The next output starts ahead of where today's ended.

Over 90 sessions, the ## Recent signals section becomes a distilled record of everything you've learned about what produces good output for this workflow. It's self-documenting institutional memory.
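The capture step is small enough to script, which lowers the odds of skipping it. The sketch below appends a dated entry under the `## Recent signals` heading of a context file. The function names `add_signal` and `capture` are assumptions, and the parsing is deliberately simple: it treats the next line starting with `## ` as the end of the section.

```python
from datetime import date
from pathlib import Path

SIGNALS_HEADING = "## Recent signals"


def add_signal(text: str, signal: str, on: date) -> str:
    """Return the context text with a dated signal appended to Recent signals."""
    entry = f"{on.isoformat()}: {signal.strip()}"
    lines = text.splitlines()
    # Create the section at the end of the file if it does not exist yet.
    if SIGNALS_HEADING not in lines:
        lines += ["", SIGNALS_HEADING]
    start = lines.index(SIGNALS_HEADING) + 1
    # The section ends at the next "## " heading, or at the end of the file.
    end = next(
        (i for i in range(start, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    # Step back over trailing blank lines so the entry sits under existing ones.
    while end > start and not lines[end - 1].strip():
        end -= 1
    lines.insert(end, entry)
    return "\n".join(lines) + "\n"


def capture(path: Path, signal: str) -> None:
    """Hypothetical end-of-session hook: write today's signal into the file."""
    updated = add_signal(path.read_text(encoding="utf-8"), signal, date.today())
    path.write_text(updated, encoding="utf-8")
```

Keeping `add_signal` as a pure text-to-text function makes it easy to check without touching the real file, and `capture` stays a two-line wrapper you can call before closing a session.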


Why OpenAI just built this into infrastructure

On June 4, 2026, OpenAI shipped Dreaming V3 — a background process that automatically synthesizes ChatGPT conversation history and carries the important context forward into new sessions. Free for every user, compute cost reduced 5x.

That's Phase 4 automated at the platform level.

The engineering insight is correct: Phase 4 is the step most people skip, and automating it removes the friction that causes skipping.

The limitation: automated synthesis is bounded by the quality of what went in. Unstructured conversations produce structured summaries of unstructured thinking. Deliberate Cowork Loop passes — where Phase 3 explicitly named what to capture and Phase 4 wrote it down — produce richer material for the synthesis to work with.

If you're building workflows on top of ChatGPT, Dreaming V3 and the Cowork Loop™ are complementary, not competing. The automation gets better material; you get better synthesis.


Minimum viable implementation

  1. Create a context.md (or CLAUDE.md) file for your most recurring AI workflow
  2. Write the five things you re-explain most often — that's your initial standing context
  3. At the end of your next session, add two sentences: what the Brief was missing, what worked
  4. Load that file at the start of every relevant session going forward
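Step 1 can be scripted so that every new workflow starts from the same skeleton. This is a minimal sketch: the `init_context` name is an assumption, and the template simply reuses the section headings from the CLAUDE.md structure shown earlier, with the placeholders left for you to fill in.

```python
from pathlib import Path

# Skeleton taken from the section headings used in this article.
TEMPLATE = """# Context

## About this project
[project name, goal, constraints]

## Output standards
[what good output looks like for this workflow]

## Audience
[who the output is for, what they need]

## Style rules
[positive: what to do / negative: what to avoid]

## Recent signals
"""


def init_context(path: Path) -> bool:
    """Create the context file from the template.

    Returns False and leaves the file untouched if it already exists,
    so accumulated signals are never overwritten.
    """
    if path.exists():
        return False
    path.write_text(TEMPLATE, encoding="utf-8")
    return True
```

Refusing to overwrite an existing file is the important design choice here, since the file's value is in what has accumulated.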

Do this for three weeks. Then read your ## Recent signals section. You've built a Brief calibrated to your actual workflow — not a default template, but a real system refined by real sessions.

That's the Cowork Loop. The compounding takes care of itself after that.


The full framework writeup (with failure modes and the CLAUDE.md structure I actually use) is in the canonical version: echonerve.com/the-echonerve-cowork-loop

