DEV Community

Nova

Prompt Chaining for Developers: Build Reliable AI Workflows in 5 Steps

If you’ve ever tried to “one-shot” a big task with an LLM (generate a feature, refactor a module, write a spec, produce tests), you’ve probably seen the failure modes:

  • it misses requirements
  • it invents APIs
  • it contradicts itself halfway through
  • it produces something that looks right but doesn’t compile

The fix isn’t “a better model”. The fix is prompt chaining: break the work into small, verifiable steps where each step produces an artifact you can validate.

This post shows a practical, developer-friendly way to build reliable chains you can run manually or automate.


What is prompt chaining?

Prompt chaining is turning a vague goal into a sequence of prompts where:

  1. each step has a narrow objective
  2. each step outputs a structured artifact (bullets / JSON / diff / test list)
  3. you validate it (quick sanity check, linter, unit tests, schema validation)
  4. the next step consumes the validated output

Think of it as a tiny pipeline: Plan → Specify → Implement → Verify → Polish.


The 5-step chaining template (steal this)

Step 1) Clarify + constraints

Goal: get assumptions out into the open.

Prompt

You are a senior engineer. Ask me up to 7 clarifying questions.
Context:
- Project: <…>
- Goal: <…>
Constraints:
- Must not break: <…>
- Non-goals: <…>
After questions, propose 2-3 possible approaches with tradeoffs.
Output as:
1) Questions
2) Approaches (pros/cons)

Why it works: most “bad” output is caused by hidden constraints.

Step 2) Produce a small spec (structured)

Goal: turn the final approach into something you can implement.

Prompt

Write a mini-spec for the chosen approach.
Include:
- API changes
- Data model changes
- Edge cases
- Error handling
- Observability (logs/metrics)
Output strictly as JSON matching this schema:
{
  "summary": "string",
  "acceptance_criteria": ["string"],
  "interfaces": [{"name":"string","inputs":"string","outputs":"string"}],
  "edge_cases": ["string"],
  "risks": ["string"],
  "test_plan": ["string"]
}
No extra keys.

Now you can validate: does this JSON cover the real requirements?
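The “no extra keys” rule only pays off if you actually check it. Here’s a minimal sketch of that check (`validateSpec` is a hypothetical helper name; for real schemas you’d likely reach for a validator library like Ajv):

```javascript
// Required top-level keys from the Step 2 schema.
const REQUIRED_KEYS = [
  "summary", "acceptance_criteria", "interfaces",
  "edge_cases", "risks", "test_plan",
];

// Parse the model's output and check for missing or extra keys.
function validateSpec(raw) {
  let spec;
  try {
    spec = JSON.parse(raw);
  } catch {
    return { ok: false, errors: ["output is not valid JSON"] };
  }
  const errors = [];
  for (const key of REQUIRED_KEYS) {
    if (!(key in spec)) errors.push(`missing key: ${key}`);
  }
  for (const key of Object.keys(spec)) {
    if (!REQUIRED_KEYS.includes(key)) errors.push(`extra key: ${key}`);
  }
  return { ok: errors.length === 0, errors };
}
```

If validation fails, feed the `errors` array back into the same prompt and re-run the step instead of moving on.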

Step 3) Implement in small diffs

Goal: avoid “here’s 400 lines” output.

Prompt

Implement the spec, but only output ONE commit-sized diff.
Rules:
- Keep changes under ~150 lines.
- Prefer smallest working increment.
- Output as a unified diff.
- Do not change unrelated formatting.

Spec JSON:
<PASTE STEP 2 JSON>

Repository notes:
- Language: <…>
- Testing: <…>

If you want to be extra strict, ask for a diff per file or “one function at a time”.
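You can also enforce the ~150-line budget mechanically instead of eyeballing it. A sketch (`countChangedLines` is a hypothetical helper): count added/removed lines in the unified diff, skipping the `---`/`+++` file headers:

```javascript
// Count lines that start with + or - but are not file headers (+++ / ---).
function countChangedLines(diff) {
  return diff.split("\n").filter(
    (line) =>
      (line.startsWith("+") || line.startsWith("-")) &&
      !line.startsWith("+++") &&
      !line.startsWith("---")
  ).length;
}

const MAX_LINES = 150;

// Reject oversized diffs before they reach review.
function checkDiffSize(diff) {
  const changed = countChangedLines(diff);
  return { changed, ok: changed <= MAX_LINES };
}
```

If the check fails, re-run Step 3 with a note like “your last diff was N lines; split it”.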

Step 4) Verify (tests + self-review)

Goal: make the model act as your reviewer.

Prompt

Review the diff as if you were doing a production PR review.
Output:
- 5-10 review comments (must reference exact lines/areas)
- Security concerns
- Performance concerns
- Missing tests
Then propose a follow-up diff with fixes (unified diff).

The key is that this step is allowed to be critical.

Step 5) Polish for humans

Goal: developer ergonomics.

Prompt

Given the final code, write:
- a short PR description
- a changelog entry
- 1-2 usage examples
Keep it concise. No marketing fluff.

A real example: chaining a "code review" workflow

Let’s say you want an LLM-assisted review checklist tailored to your repo.

Step 1 output (questions)

You answer questions like: “TypeScript? Node version? Lint rules? Testing framework? Typical bugs?”

Step 2 output (JSON spec)

You get a structured checklist and can tweak it.

Step 3 output (diff)

You add something like .github/pull_request_template.md and a small scripts/review-check.ts.

Step 4 output (review)

It points out missing cases (e.g. “you forgot to enforce timezone-safe date parsing”).

Step 5 output (docs)

It writes a short README section.

This sounds trivial, but the workflow scales to bigger tasks: migrations, refactors, new endpoints, even “write tests for this module”.


Automation tip: glue steps with a tiny script

You don’t need a huge framework. Even a small Node script can keep your chain consistent.

// chain.mjs — run with: node chain.mjs "your goal"
import { readFileSync, writeFileSync, mkdirSync } from 'node:fs'
import { llm } from './llm.mjs' // your own thin wrapper around whatever model API you use

const load = (name) => readFileSync(`prompts/${name}`, 'utf8')
const save = (path, text) => writeFileSync(path, text)
// fill {{key}} placeholders in a prompt template from the chain context
const render = (tpl, ctx) => tpl.replace(/\{\{(\w+)\}\}/g, (m, key) => ctx[key] ?? m)

const steps = [
  { name: 'clarify', prompt: load('01-clarify.txt') },
  { name: 'spec', prompt: load('02-spec-json.txt') },
  { name: 'diff', prompt: load('03-implement-diff.txt') },
  { name: 'review', prompt: load('04-review.txt') },
]

mkdirSync('out', { recursive: true })
const context = { goal: process.argv[2] }
for (const s of steps) {
  const out = await llm({ prompt: render(s.prompt, context) })
  save(`out/${s.name}.md`, out) // keep every intermediate artifact
  context[s.name] = out // downstream steps can reference earlier outputs
}

The secret sauce is saving intermediate outputs so you can compare runs and debug where things went wrong.
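Once artifacts are on disk, comparing two runs is trivial. A sketch of one way to do it (`firstDivergence` is a hypothetical helper; it assumes you’ve read each run’s `out/` files into an object keyed by step name):

```javascript
// Given two runs' artifacts ({ stepName: output }), return the name of the
// first step whose output differs — usually where a chain regression starts.
function firstDivergence(runA, runB) {
  for (const step of Object.keys(runA)) {
    if (runA[step] !== runB[step]) return step;
  }
  return null; // runs are identical
}
```

If the spec step diverges between runs, debugging the diff or review steps is a waste of time: fix the earliest divergence first.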


Common chaining mistakes (and quick fixes)

  • No validation step → add a “schema + strict JSON” output and validate it.
  • Steps are too big → enforce line limits and “one diff only”.
  • Context drift → paste the spec JSON into every downstream step.
  • Ambiguous roles → explicitly say “act as senior engineer / reviewer / SRE”.
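For the context-drift fix in particular, don’t rely on remembering to paste the spec by hand. A sketch of a tiny helper (`withSpec` is a hypothetical name) that appends the Step 2 JSON to every downstream prompt:

```javascript
// Append the authoritative spec JSON to a downstream prompt so every
// step works from the same source of truth.
function withSpec(prompt, specJson) {
  return `${prompt}\n\nSpec JSON (authoritative, do not deviate):\n${specJson}`;
}
```

Usage: `withSpec(implementPrompt, specJson)`, `withSpec(reviewPrompt, specJson)`, and so on, so the spec travels with every step.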

TL;DR

If you want reliable LLM output:

  1. Ask clarifying questions
  2. Generate a structured mini-spec
  3. Implement in small diffs
  4. Review + fix
  5. Polish for humans

You’ll get less "magic" and a lot more shippable work.


If you want more copy-pasteable templates like these, I’m building a Prompt Engineering Cheatsheet at Nova Press.

Grab the free sample here: https://getnovapress.gumroad.com
