Stop letting bad spec files tank your code quality
Introduction
In this article, I want to walk you through how I introduced a reflection mechanism into the OpenSpec workflow inside OpenCode, and how that dramatically improved the quality of AI-generated code.
After nearly a month of testing, this reflection workflow has gotten DeepSeek-V4-pro in OpenCode to perform at roughly the same level as Claude Opus 4.6. The only cost is some extra review time and a few more tokens. Trust me, it's worth it.
Can't wait to find out how? Let's get into it.
Why This Works
I've been using AI coding for a while now. Rather than relying on Claude Code, I prefer building my own SDD (spec-driven development) coding workflow with OpenCode and OpenSpec. I wrote a well-received article specifically about this OpenCode workflow:
How I Use OpenCode, Oh-My-OpenCode-Slim, and OpenSpec to Build My Own AI Coding Environment
With OpenSpec, and the explore → propose → apply → verify → archive workflow loop, we can finally get LLMs to handle complex project development.
But as you've probably experienced yourself, even with the SDD workflow and no matter which model I use, GPT 5.5 or the latest DeepSeek-V4-pro, the AI still produces hidden bugs or piles up messy code: code I'd never feel comfortable putting in production.
My first fix was to call a @reviewer sub-agent after the /opsx-apply phase to do a code review on the changes. Sometimes that worked and caught architectural or implementation issues. But the impact was limited.
Often I'd only discover something was wrong after using the project for a while: a scenario wasn't covered, edge cases were missed, or one part of the code got updated but a related module didn't.
Later I stepped back and looked at the whole AI coding workflow again, and that's when I spotted the real problem.
As programmers, we always focus on whether our code is good, so we naturally look at things from the code level.
But we completely overlooked the quality of the proposal files that OpenSpec generates. We'd finish discussing requirements, generate the proposal file, and then just let the AI start implementing. That's how input-level bugs get introduced. When things go wrong, we blame the model.
Think about it. Back in the traditional coding era, when a product manager handed over a requirements doc, there was a critical step before we started coding: requirements review. We wouldn't touch the keyboard until every issue in the requirements doc or design doc was sorted out.
So why did we forget this step in the AI coding era? That's exactly what we're going to fix today: add the requirements review step back into the OpenSpec workflow and see if it makes AI-generated code better.
How to Do It
The SDD workflow isn't anything exotic. If you're familiar with multi-agent system design patterns, you'll recognize that SDD is basically the plan-execute pattern.
One agent breaks the user's task into a step-by-step plan file, then another agent follows that plan to execute the task. This lets the agent system handle complex work.
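To make the plan-execute pattern concrete, here is a minimal Python sketch. It is not how OpenCode or OpenSpec implement it internally; `run_plan_execute`, `toy_planner`, and `toy_executor` are hypothetical names, and the two toy functions stand in for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Plan:
    """A task broken into ordered steps (the 'plan file' in SDD terms)."""
    task: str
    steps: list[str]


def run_plan_execute(
    task: str,
    planner: Callable[[str], list[str]],
    executor: Callable[[str, list[str]], str],
) -> list[str]:
    """Ask the planner for steps, then let the executor handle them in order."""
    plan = Plan(task=task, steps=planner(task))
    results: list[str] = []
    for step in plan.steps:
        # The executor sees earlier results so later steps can build on them.
        results.append(executor(step, results))
    return results


# Hypothetical stand-ins for the two agents; a real setup would call an LLM.
def toy_planner(task: str) -> list[str]:
    return [f"{task}: write spec", f"{task}: implement", f"{task}: verify"]


def toy_executor(step: str, prior_results: list[str]) -> str:
    return f"done({step})"


results = run_plan_execute("add login", toy_planner, toy_executor)
print(results)
```

Notice that nothing in this loop checks whether the plan itself is any good. The executor faithfully follows whatever the planner wrote.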
But how do you guarantee the quality of what the agents produce in a plan-execute setup? That's where a pattern called reflection comes in.
The reflection pattern adds a reflection agent to the multi-agent workflow. This agent typically runs on a completely different LLM and reviews the output of the plan or executor agent from a different angle, which raises the overall performance of the multi-agent system.
The reflection pattern is widely used in content creation and deep research scenarios, which is good evidence that it works in practice.
Since the pattern is proven, we can bring the same reflection step into the OpenSpec workflow, targeting the proposal files.
I'll introduce a reflection agent that runs on a different LLM from the primary agent, reviewing the proposal files from a different angle. We'll also adjust the OpenSpec workflow so this reflection agent plays an active role in it.
Introducing the reflection agent
Adding a new agent in OpenCode is simple. Just drop a new Markdown file into ~/.config/opencode/agents/ with the agent's prompt inside.
The core job of this agent is straightforward: review the artifact files that OpenSpec generates from multiple angles, making sure the quality of the requirements input is solid from the start. This agent sits between the /opsx-propose and /opsx-apply phases.
We can have deepseek-v4-pro generate the first draft of this agent's prompt:
You are an **OpenSpec Change Reviewer** — a critical thinker and auditor focused on substance.
Your job is to review every artifact in an OpenSpec change before it moves to implementation, and find the issues that would actually cause implementation failure or rework.
## Core Principle: Distinguish Substantive Defects from Formatting Issues
**Substantive defects = issues that cause the implementation to go in the wrong direction, miss critical scenarios, create contradictions, or make acceptance impossible.**
**Formatting issues = style or wording differences that don't affect implementation quality.**
Your primary job is to find the former. You can mention the latter, but mark them as optional suggestions and put them at the end.
## Your Position
You work in the **phase between `/opsx-propose` and `/opsx-apply`**:
explore → /opsx-propose → ⬅ you are here (possibly multiple rounds) → /opsx-apply → verify → archive
The spec is not yet frozen. Implementation has not started. Your mission: **find the defects that would actually cause rework or incidents before any code gets written**. Catching a spec error takes minutes. Fixing wrong code takes hours.
## Principles
- **Constructive and strict.** For every issue, explain not just "what" but "why it would cause rework or an incident."
- **Specific, not vague.** Point to exact file locations, requirement names, and task numbers.
- **Severity levels.** 🔴 Blocking vs 🟡 Should Fix vs 💡 Suggestion — don't mix them up.
- **Context-aware.** Evaluate against the existing system (`openspec/specs/`) rather than in a vacuum.
- **Read-only.** Never modify files. You surface problems; OpenSpec executes the fixes.
## Anti-Patterns to Avoid
- Rubber-stamping: saying "looks good!" without deep review.
- Nitpicking: focusing on formatting while missing architectural flaws.
- Jumping to solutions: proposing fixes before the user acknowledges the problem exists.
- Ignoring existing specs: reviewing incremental changes without understanding the baseline.
- Vague feedback: "this could be better" — say exactly what and why.
To get a genuinely different perspective on the proposals, I recommend running the reflection agent on a different LLM from the main agent. For example, my primary coding agent uses deepseek-v4-pro, so the reflection agent uses Kimi K2.6. That choice goes into the frontmatter at the top of the agent file:
---
description: OpenSpec Change Reviewer — after propose and before apply, critically reviews all artifact files under the change (proposal/design/specs/tasks)
mode: subagent
model: kimi-for-coding/k2p6
tools:
  write: false
  edit: false
  bash: false
---
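If you keep the prompt and the frontmatter in separate places, a small script can assemble the agent file for you. The sketch below is only an illustration: `render_agent_file` and `write_agent_file` are hypothetical helpers, and naming the file `openspec-reviewer.md` assumes the agent name is taken from the file name. It writes to a temporary directory rather than the real `~/.config/opencode/agents/` so you can try it safely.

```python
import json
import tempfile
from pathlib import Path


def render_agent_file(description: str, model: str, prompt_body: str) -> str:
    """Build the agent Markdown: YAML frontmatter followed by the prompt."""
    frontmatter = [
        "---",
        # A JSON string is also a valid YAML double-quoted scalar, so this
        # keeps descriptions containing ':' or '#' from breaking the YAML.
        f"description: {json.dumps(description, ensure_ascii=False)}",
        "mode: subagent",
        f"model: {model}",
        "tools:",
        "  write: false",  # read-only reviewer: no write, edit, or shell access
        "  edit: false",
        "  bash: false",
        "---",
    ]
    return "\n".join(frontmatter) + "\n" + prompt_body.strip() + "\n"


def write_agent_file(agents_dir: Path, name: str, content: str) -> Path:
    """Write <name>.md into the agents directory, creating it if needed."""
    agents_dir.mkdir(parents=True, exist_ok=True)
    path = agents_dir / f"{name}.md"
    path.write_text(content, encoding="utf-8")
    return path


content = render_agent_file(
    description="OpenSpec Change Reviewer",
    model="kimi-for-coding/k2p6",
    prompt_body="You are an **OpenSpec Change Reviewer**.",
)

# Demo against a throwaway directory instead of the real config folder.
with tempfile.TemporaryDirectory() as tmp:
    agent_path = write_agent_file(Path(tmp) / "agents", "openspec-reviewer", content)
    written = agent_path.read_text(encoding="utf-8")

print(content)
```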
Locking down the OpenSpec workflow
OpenSpec doesn't actually enforce a fixed workflow; users simply follow a default SDD best practice. So the first step is to lock this workflow down, so that OpenCode always writes an OpenSpec proposal before writing any code.
---
name: openspec-workflow
description: Mandatory prerequisite for ALL OpenSpec operations — load this BEFORE any openspec-* skill. Use when running /opsx-apply, /opsx-propose, /opsx-verify, /opsx-archive, /opsx-explore, /opsx-sync; running `openspec status`, `openspec list`, `openspec instructions`; reading files under `openspec/changes/`; or doing any OpenSpec stage (propose, apply, verify, archive, explore, sync).
license: MIT
compatibility: Requires openspec CLI.
metadata:
  author: Peng Qian
  version: "1.0"
---
## OpenSpec Workflow (Mandatory)
**All code changes must have a proposal before any code gets written.**
### Process
1. **Explore** - When the user says "think about it," "discuss," or "explore," discuss only — no coding.
2. **Propose** - Create proposal files under `openspec/changes/<change-name>/`.
3. **Apply** - Implement according to the proposal tasks. **No file modifications without a proposal.**
4. **Verify** - Verify after implementation is complete.
5. **Archive** - Archive the change.
### Hard Rules
- **Bug fixes are exempt:** A bug fix doesn't require creating or editing a proposal. The hard rules below apply only to feature changes and new features.
- **No proposal, no change:** If the user asks to modify code, confirm that a matching proposal exists first, or create one.
- **No proposal, no edits:** Before editing a file, check that a matching change directory exists under `openspec/changes/`.
- **No coding in Explore mode:** When the user is in explore mode, **do not** create proposals, **do not** edit files, **do not** write tests.
- **After a change is complete:** Run the verify process to check that the implementation matches the proposal.
### Proposal Creation Requirements
Every change must include:
- `proposal.md` - reason and scope of the change
- `design.md` - design plan
- `tasks.md` - specific task list
- `.openspec.yaml` - change metadata
Additional requirements:
- **Task granularity:** Each task in `tasks.md` should take no more than 2 hours.
### Violation Handling
Stop immediately and alert the user if any of the following are detected:
- Code modification starts without a proposal.
- Files are edited in explore mode.
- Files outside the current proposal's scope are modified.
You can put this workflow into your project's AGENTS.md file so OpenCode follows it consistently. Or put it in the global ~/.config/opencode/AGENTS.md file so you don't have to configure it for every project.
A better option is to turn the workflow into a skill, so OpenCode only loads this workflow definition when using OpenSpec. I've packaged the full workflow as the openspec-workflow skill. You can grab the source file at the end of the article.
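The skill relies on the model to honor the "no proposal, no edits" rule, but the same check is easy to run mechanically, for example from a pre-commit script. Here is a rough sketch under that assumption. `missing_artifacts` is a hypothetical helper, not part of the openspec CLI, and the required-file list is copied from the skill above.

```python
import tempfile
from pathlib import Path

# Files the skill requires in every change directory.
REQUIRED_FILES = ("proposal.md", "design.md", "tasks.md", ".openspec.yaml")


def missing_artifacts(project_root: Path, change_name: str) -> list[str]:
    """Return the required files absent from openspec/changes/<change_name>/."""
    change_dir = project_root / "openspec" / "changes" / change_name
    if not change_dir.is_dir():
        # No change directory at all: every required file is missing.
        return list(REQUIRED_FILES)
    return [name for name in REQUIRED_FILES if not (change_dir / name).is_file()]


# Demo against a throwaway project with a half-finished change.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    change_dir = root / "openspec" / "changes" / "add-login"
    change_dir.mkdir(parents=True)
    (change_dir / "proposal.md").write_text("# Why\n", encoding="utf-8")
    (change_dir / "tasks.md").write_text("- [ ] 1.1 Add form\n", encoding="utf-8")

    incomplete = missing_artifacts(root, "add-login")
    absent = missing_artifacts(root, "no-such-change")

print(incomplete)  # ['design.md', '.openspec.yaml']
print(absent)      # all four required files
```

An empty list means the change directory is complete and implementation may start.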
Inserting the reflection agent into the workflow
Once the OpenSpec workflow is locked down as a skill, every SDD coding session will follow the spec-first, code-second process, which means the /opsx-propose and /opsx-apply phases.
As mentioned earlier, the reflection agent sits between these two phases, reviewing the quality of the proposal files before any implementation starts. Following the reflection pattern, this review-and-fix cycle can run for multiple rounds until the proposal artifacts have no serious issues.
To prevent an infinite loop, we need a hard cap on the number of review rounds. Let's update the openspec-workflow SKILL.md file to add the reflection process:
### OpenSpec Reflection Process
1. **After each batch of artifacts is created**, the `@openspec-reviewer` agent **must** be called to review the **artifact files in that batch**.
2. The main agent fixes the **current batch of artifact files** based on the feedback from `@openspec-reviewer`.
3. Call `@openspec-reviewer` again to review the **current batch of artifact files**.
4. **Review pass criteria:**
   4a. **Single-round pass:** After the current review round, if "### 🔴 Remaining Issues" does not exist or is empty, move to the next batch.
   4b. **Fix loop:** If 🔴 issues remain → main agent fixes → next review round → back to 4a. Repeat until passing or 4c triggers.
   4c. **Hard cap (MAX_ROUNDS = 5):** If the same batch has gone through 5 review rounds without passing 4a → stop the loop and hand off to a human for a decision.
At this point, a typical reflection process for OpenSpec proposals is in place. All you need to do is run /opsx-propose, go grab a coffee, and wait for the reflection agent to gradually refine your proposal.
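In OpenCode the main agent runs this loop by following the skill's instructions, not by executing code. Still, the control flow is easier to reason about when written out. The sketch below models the pass criteria and the hard cap in Python. `review` and `fix` are hypothetical stand-ins for the `@openspec-reviewer` call and the main agent's fix step, and I'm assuming any non-blank line under the "### 🔴 Remaining Issues" heading counts as an open issue.

```python
from typing import Callable

MAX_ROUNDS = 5
BLOCKING_HEADING = "### 🔴 Remaining Issues"


def has_blocking_issues(report: str) -> bool:
    """True if the blocking section exists and contains any non-blank line."""
    lines = report.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == BLOCKING_HEADING:
            for body in lines[i + 1:]:
                if body.startswith("#"):  # next heading closes the section
                    break
                if body.strip():
                    return True
            return False
    return False  # section missing entirely counts as a pass (rule 4a)


def review_batch(
    artifacts: str,
    review: Callable[[str], str],
    fix: Callable[[str, str], str],
) -> tuple[str, int, bool]:
    """Run review/fix rounds; return (artifacts, rounds_used, passed)."""
    for round_no in range(1, MAX_ROUNDS + 1):
        report = review(artifacts)
        if not has_blocking_issues(report):
            return artifacts, round_no, True  # 4a: move to the next batch
        if round_no < MAX_ROUNDS:
            artifacts = fix(artifacts, report)  # 4b: fix, then review again
    return artifacts, MAX_ROUNDS, False  # 4c: hand off to a human


# Toy reviewer: blocks while any TODO is left in the artifact text.
def toy_review(text: str) -> str:
    if "TODO" in text:
        return f"{BLOCKING_HEADING}\n- unresolved TODO\n"
    return f"{BLOCKING_HEADING}\n\n### 💡 Suggestions\n- none\n"


# Toy fixer: resolves one TODO per round.
def toy_fix(text: str, report: str) -> str:
    return text.replace("TODO", "DONE", 1)


final_text, rounds, passed = review_batch("TODO scope\nTODO edge cases", toy_review, toy_fix)
stuck_text, stuck_rounds, stuck_passed = review_batch("TODO", toy_review, lambda t, r: t)
print(rounds, passed)              # 3 True
print(stuck_rounds, stuck_passed)  # 5 False
```

The second call shows the hard cap doing its job: a fixer that never makes progress is cut off after five reviews instead of looping forever.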
After going through a few rounds of "review and fix," you'll notice this reflection process doesn't work as perfectly as you'd expect.
It's common to fix issue A only to have issue B pop up. Sometimes all 5 review rounds finish and some issues are still unresolved, especially with powerful but not top-tier models like deepseek-v4. That's fine: we can optimize the process to fix this.
So here's what comes next:
- Save the discussions from the explore phase as files, so proposal creation and review have a checklist baseline to work from.
- Review one file at a time, rather than waiting for all files to be generated before starting the review.
- Save the logs from each review round, so they can feed into the next round or help whoever has to step in manually.
I've also included the full source files for the reflection agent and the OpenSpec Workflow skill.




