
Nirsa


The problem wasn't that the AI wrote bad code — weak specs caused unstable implementations

Recently I’ve been experimenting a lot with AI-assisted development workflows using tools like Codex and Claude Code.

At first, I assumed most implementation failures came from the AI itself.

But after repeatedly testing spec-driven workflows, I noticed something different:

The problem wasn't that the AI wrote bad code. The problem was that weak specs caused unstable implementations.

Ambiguous requirements often led to:

- unstable architecture
- inconsistent contracts
- missing ownership boundaries
- unsafe delete/update behavior
- implementation drift
- features expanding outside original intent

In many cases, the AI was actually trying to follow the provided specification. The issue was that the specification itself was incomplete, unsafe, or unclear.

That led me to start experimenting with what I’ve been calling VFW (Validation First Workflow).

The core idea is simple: before AI coding starts, validate whether the specification is actually implementation-ready.
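
To make the idea concrete, here is a minimal sketch of a heuristic readiness check. It is not SpecGuard's actual implementation: the rule names, the regular expressions, and the `check_readiness` function are all assumptions made up for illustration. It flags vague wording and destructive operations that are described without any ownership rule on the same line.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical rules for illustration only; not SpecGuard's actual checks.
VAGUE_TERMS = re.compile(r"\b(etc|somehow|as needed|tbd|appropriate(?:ly)?)\b", re.IGNORECASE)
DESTRUCTIVE = re.compile(r"\b(delete|remove|update|overwrite)\b", re.IGNORECASE)
OWNERSHIP = re.compile(r"\b(owner|owned by|only the|authorized|permission)\b", re.IGNORECASE)


@dataclass
class Finding:
    line_no: int
    message: str


def check_readiness(spec_text: str) -> list[Finding]:
    """Scan a spec line by line and collect heuristic findings."""
    findings = []
    for line_no, line in enumerate(spec_text.splitlines(), start=1):
        if VAGUE_TERMS.search(line):
            findings.append(Finding(line_no, "vague wording"))
        # A destructive verb with no ownership wording on the same line is suspicious.
        if DESTRUCTIVE.search(line) and not OWNERSHIP.search(line):
            findings.append(Finding(line_no, "destructive operation without an ownership rule"))
    return findings


spec = """Users can create notes.
Users can delete notes as needed.
Only the owner of a note can update it."""

for finding in check_readiness(spec):
    print(f"line {finding.line_no}: {finding.message}")
```

Keyword rules like these are cheap, but they misfire easily: a spec that states its ownership rule on a separate line would still be flagged.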

As part of that experiment, I started building a small OSS project called SpecGuard:
https://github.com/KoreaNirsa/spec-guard

SpecGuard is not a code generator.

Instead, it acts more like a validation-first guard layer for spec-driven / AI-assisted development workflows.

The current version, v0.3.0, supports things like:
- readiness review for spec packages
- Critical / Major / Minor findings
- low review mode
- implementation handoff artifacts
- experimental PR drift review
- heuristic-first review flow

Typical workflow
Discovery → spec.md → technical-design.md → SpecGuard Review → readiness validation → implementation handoff → external coding agent → Pull Request → SpecGuard PR review
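
The step from readiness validation to implementation handoff works like a gate. The sketch below shows one way such a gate could be expressed with the three severity levels. The `Severity` enum, the `ready_for_handoff` function, and the default policy of blocking at Major and above are assumptions for illustration, not SpecGuard's documented behavior.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Ordered so that comparisons reflect how serious a finding is.
    MINOR = 1
    MAJOR = 2
    CRITICAL = 3


def ready_for_handoff(severities, block_at=Severity.MAJOR):
    """Return True when no finding reaches the blocking severity.

    The default threshold (block at Major or above) is an assumed policy.
    """
    return all(severity < block_at for severity in severities)


# Only minor findings: the spec can move on to the coding agent.
print(ready_for_handoff([Severity.MINOR, Severity.MINOR]))

# A critical finding keeps the spec in review.
print(ready_for_handoff([Severity.MINOR, Severity.CRITICAL]))
```

Making the threshold a parameter lets a team start with a lenient gate and tighten it as the review calibration improves.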

The project is still very experimental and immature in many areas.

Known limitations
- heuristic false positives / false negatives
- limited benchmark coverage
- small real-world validation set
- review calibration still evolving
- UX/docs still rough

Right now this is still much closer to a demo-stage OSS project than a mature production tool.

But I’d like to continue evolving it toward something practical enough for real engineering workflows.

I’m especially interested in exploring:
- Spec-Driven Development
- validation-first workflows
- contract validation
- AI-assisted engineering
- PR review automation
- CI/CD validation gates
- harness/evaluation engineering

Feedback and contributors are very welcome.
