DEV Community: Robert Douglass

Read the Spec: Lock Intent Before You Implement

Robert Douglass — Wed, 20 May 2026 12:35:13 +0000

I am coming to understand that different people want very different things from agentic coding.

That should not be surprising. Not everybody is building the same kind of software. Not everybody works in the same kind of organization. Not everybody is trying to solve the same problem when they open Claude Code, Codex, Cursor, Gemini, or any other coding agent.

For some people, agentic coding is a way to explore.

They have an idea, but not yet a settled design. They want to try three versions of a UI, poke at an API shape, test a library, or build a throwaway prototype just to learn what the problem really is. That is a valid way to work. I do it too. When I am feeling out the shape of a piece of software, I often need the software to push back on my assumptions.

But that is not the only mode of software development.

And for many serious software teams, it is not the normal mode.

In most software companies I have worked with, it would not be acceptable to tell developers:

"Go prototype several different things and bring them all back so product can decide which one we like."

That is not a process. That is expensive ambiguity.

It may work for a founder hacking on an idea over the weekend. It may work for a solo developer trying to understand a new domain. But in a team with product owners, business analysts, customers, delivery commitments, architecture constraints, compliance expectations, and release gates, the work needs to become explicit before implementation runs away.

That is what spec-driven development (SDD) means to me.

It is not paperwork for its own sake. It is not a nostalgic return to heavyweight requirements documents. It is not a way to slow engineers down.

Spec-driven development is about forming agreement.

Before code is written, the team needs a shared understanding of what the software is supposed to do. Product intent, customer intent, business rules, edge cases, non-goals, vocabulary, and constraints all need somewhere to live.

At this point in history, there is still no better medium for that than words.

We can draw diagrams. We can point to screenshots. We can link tickets and paste chat threads. But eventually, the thing we mean has to be expressed in language clearly enough that another person, or now another agent, can act on it.

This is the gap that agentic coding exposes so sharply.

Coding agents are very good at producing code. They are increasingly good at reading code. They can navigate large repositories, infer patterns, run tests, and make changes faster than any human team could have imagined a few years ago.

But they still need to know what we mean.

If the intent is vague, the output will be vague. If the vocabulary is inconsistent, the implementation will drift. If product and engineering have not agreed on the behavior, the agent will happily encode one interpretation while stakeholders are still carrying three others in their heads.

That is why "just let the agent build something" is not enough for team software.

The question is not whether agents can generate code. They can.

The question is whether the team has created a durable expression of intent that the agent can build from, reviewers can evaluate against, and future maintainers can understand.

This is where Spec Kitty is focused.

Spec Kitty accelerates the path toward those words. It helps break down intent into two different but connected artifacts:

The What: the specification.

What should the software do? Who is it for? What are the user scenarios? What are the acceptance criteria? What is explicitly out of scope?

The How: the plan.

How should the system change? What architecture decisions matter? Which files, services, data models, APIs, or workflows are involved? What are the risks, dependencies, and sequencing concerns?

That distinction matters.

A product owner should not have to specify implementation details to express customer intent. A developer should not have to infer product meaning from a half-written ticket. A reviewer should not have to reverse-engineer the intended behavior from a diff.

The spec and the plan create a bridge.

Then the work can be decomposed into work packages, reviewed against the stated intent, and accepted against clear criteria. Along the way, the team's own rules can be applied: quality gates, common vocabularies, architecture constraints, testing expectations, glossary terms, review policies, and the hard-won best practices that have accumulated inside the organization.

That is the part people miss when they reduce spec-driven development to "write a prompt before coding."

The spec is not just a prompt. It is a point of agreement. A control surface.

It is the place where a team says: this is what we mean, this is what matters, this is how we will know whether the work is done.
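One way to picture the spec as a control surface is to treat it as data that a review can check against. The sketch below is only an illustration under stated assumptions, not Spec Kitty's actual data model: the class names, the fields, the `uncovered_criteria` helper, and the invoice example are all hypothetical. It holds acceptance criteria in a spec (the what), work packages in a plan (the how), and reports any criterion that no work package claims.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """The What: intent, stated without implementation detail."""
    title: str
    acceptance_criteria: dict[str, str]  # criterion id -> plain-language text
    out_of_scope: list[str] = field(default_factory=list)


@dataclass
class WorkPackage:
    """One reviewable slice of the How."""
    name: str
    covers: set[str]  # ids of the acceptance criteria this slice addresses


@dataclass
class Plan:
    """The How: the decomposition of the change."""
    work_packages: list[WorkPackage]


def uncovered_criteria(spec: Spec, plan: Plan) -> list[str]:
    """Return the criterion ids that no work package claims to satisfy."""
    covered: set[str] = set()
    for package in plan.work_packages:
        covered |= package.covers
    return sorted(set(spec.acceptance_criteria) - covered)


# Hypothetical example data.
spec = Spec(
    title="Export invoices as CSV",
    acceptance_criteria={
        "AC1": "A signed-in user can download a CSV of their invoices.",
        "AC2": "Amounts use the account currency.",
    },
    out_of_scope=["PDF export"],
)
plan = Plan(work_packages=[WorkPackage("csv-endpoint", {"AC1"})])

print(uncovered_criteria(spec, plan))  # prints ['AC2']
```

A gap like `AC2` is the kind of thing that is cheap to catch while the plan is still words and expensive to catch in a diff.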

There will always be room for exploration. Sometimes you need prototypes. Sometimes you need a quick experiment. Sometimes you need to use the agent as a sketchpad before you know what you want.

But when the work becomes real, when other people depend on it, when the software has to fit into a product, a codebase, a customer promise, and a team process, the intent needs to be written down.

The future of software may be humans expressing intent and AI implementing it. But that future depends on our ability to express intent well.

So for me, it all comes down to one thing:

Read the spec.

Spec Kitty is an open source SDD framework for agentic coding. Try it out!

Why I Created Spec Kitty

Robert Douglass — Wed, 13 May 2026 12:31:16 +0000

Spec Kitty is an SDD framework for agentic coding and governance, especially well suited for serious software teams.

A few months into building a real product with Claude Code, I did one of the most embarrassing things a developer can do in 2025: I let an agent commit environment secrets to a repo, and I pushed them to GitHub.

Public GitHub.

The kind of mistake that gives the AI skeptics a clean shot, and gives the person who made it no defense.

That moment did not drive me away from agentic coding. It drove me to fix it.

Because here is what I actually believe, and I want to put it on the table before I tell the rest of this story: the future of software is humans expressing intent and AI implementing it.

Not as a slogan. As the natural shape of the next phase.

We will talk to our computers about the software we want, and we will get it. I think that is coming, and I think it is coming sooner than the skeptics expect.

But we are not there yet. Not for the work most professional engineers actually do: brownfield projects, millions of lines of code, decisions that were made years ago by people who have moved on, constraints that are not written down anywhere. For that kind of work, the kind serious teams ship every day, intent alone is not enough.

The agents need scaffolding they do not yet have.

Spec Kitty is the scaffolding for serious teams to live in the future before the future is finished arriving.

The Pattern That Broke Me

The secrets incident was the headline. The pattern was worse.

Every fix Claude made seemed to break two other things. The agent kept forgetting principles I thought we had settled: how we deploy staging, how we deploy production, the basic boundaries of where things were supposed to happen and where they were not. Every new session was clean amnesia. I would spend the first ten minutes of every session re-explaining the project I had been building for months.

There was an aggravating factor. I was building an MCP server. MCP was, and still is in some ways, too new to have shown up reliably in model training data. The agent did not just forget our specifics; it did not know the substrate. I had to teach it what MCP was, every session, before I could ask it for anything useful.

The result was that I was producing software, technically. But I was not producing it confidently. I was producing it the way someone gambles: winning enough to keep playing, losing enough to know the house was tilted.

Everything felt permanently negotiable.

The Spec Kit Moment

When GitHub published Spec Kit, I had a flash of recognition.

Yes. This is the right shape.

A spec-driven approach to agentic coding. Sit down, write down what you want, and let the agent build from a stable artifact instead of an off-the-cuff prompt.

I read the docs. And then I disagreed.

Spec Kit’s vision puts the spec at the center as the source of truth for what the software is. That phrasing matters. In that world, the spec is canonical, and the code is downstream: a generated artifact, faithful to its source.

But that is not how I think about software.

The code is the software.

If you change the code, you have changed the software, whether the spec agrees or not. The code is what compiles. The code is what runs in production. The code is what your customers experience. The code is the truth.

The spec is something else. It is a change request: what we want to be different next. It is a history of how we got here. It is a ledger of decisions, with the reasoning attached. It is a guardrail for the future. It is the source of truth for where the software is going.

It is not the source of truth for what the software is.

That distinction is the entire reason Spec Kitty exists.

The code tells the agent what exists. The spec tells the agent what we are trying to change.

What I Actually Wanted

If the code is the truth of what is, then I do not need the agent, the spec, or anyone else to maintain a parallel description of what already exists. The agents are excellent at reading code. Often better than I am. So I spent no time trying to insert Spec Kitty between the agent and the codebase.

Let the agents read. They are great at it.

What I wanted was the opposite. I wanted the agents to stop guessing what I meant.

That is where the breaks were happening: not in the code-reading, but in the intent-translation. The agent thought it understood what I wanted. It was wrong, often in subtle ways. Two fixes later, something else broke.

So Spec Kitty does not try to compete with code-reading. It focuses, almost entirely, on the part agents are still bad at: understanding human intent without misunderstandings, and remembering what was decided.

This is where domain-driven design matters.

I mean that in the Martin Fowler and Domain-Driven Design sense: build a shared language, make bounded contexts explicit, and treat the domain model as something that has to be discovered with the people who understand the work. Fowler’s writing on Domain-Driven Design, Bounded Context, and Ubiquitous Language captures the intellectual tradition Spec Kitty borrows from.

Agents are good at reading code. Humans are responsible for intent.

“The point of Spec Kitty is not to make software development less human. It is to make the human parts harder to lose.”

The Loop

Here is what Spec Kitty actually does.

First, it interviews the human.

Before any code is written, the LLM interviews me. It interviews me until we have, between us, a domain-driven understanding of the boundaries and contracts of whatever change we are about to make. If I am vague, it presses. If I contradict myself, it surfaces the contradiction. I do not get to hand off a half-formed thought and hope. I have to know what I want, and I have to be able to defend it, before the agent will move.

Second, it decomposes the plan.

Once intent is clear, the agent decomposes the implementation plan into pieces small enough that I can survey and approve them. Not “here is a 400-line diff, take it or leave it.” A plan I can read and reason about, push back on, and sign off on a step at a time.

Third, it cross-reviews with fresh context.

When implementation lands, another agent, or the same agent with a fresh context, reviews it. This is the move I am proudest of. An agent that just finished writing code is the worst possible reviewer of that code, because it has spent the last hour persuading itself the code is correct. Strip the context. Bring in clean eyes. The principle is older than computers; Spec Kitty just makes it cheap.

Fourth, it leaves a history.

All of it, the interview, the plan, the review, gets left as a discoverable trail for the next agent. Not as ceremony. As context. So that the next change package, weeks later, with a different agent in a fresh session, does not start from amnesia.

It starts from the ledger.

Spec Kitty turns intent into a durable loop: interview, decompose, review, and leave history.
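The four steps above can be pictured as an append-only ledger that refuses stages arriving out of order and renders its history for the next session. This is a minimal sketch, not Spec Kitty's implementation: the `Ledger` and `Entry` classes, the stage names as strings, and the example change are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Stage order taken from the loop described above; the string names are assumed.
STAGES = ("interview", "decompose", "review")


@dataclass(frozen=True)
class Entry:
    change: str
    stage: str
    summary: str


@dataclass
class Ledger:
    entries: list[Entry] = field(default_factory=list)

    def record(self, change: str, stage: str, summary: str) -> None:
        """Append an entry, rejecting stages that arrive out of order."""
        done = [e.stage for e in self.entries if e.change == change]
        if len(done) >= len(STAGES):
            raise ValueError(f"{change}: all stages already recorded")
        expected = STAGES[len(done)]
        if stage != expected:
            raise ValueError(f"{change}: expected {expected!r}, got {stage!r}")
        self.entries.append(Entry(change, stage, summary))

    def context_for_next_session(self) -> str:
        """Render the history a fresh agent session would start from."""
        return "\n".join(
            f"[{e.change}] {e.stage}: {e.summary}" for e in self.entries
        )


# Hypothetical change, used only to exercise the sketch.
ledger = Ledger()
ledger.record("rename-to-mission", "interview", "Mission is the domain term.")
ledger.record("rename-to-mission", "decompose", "Three work packages, CLI first.")
ledger.record("rename-to-mission", "review", "Fresh-context review approved.")
print(ledger.context_for_next_session())
```

The ordering check is the interesting design choice: a plan cannot be recorded for a change that has no interview behind it, so the history can never claim a decision that was not discussed.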

The Charter Is Where Memory Lives

One of the clearest examples is the Spec Kitty charter.

With Spec Kitty in place, the next LLM does not have to infer project policy from scattered code, stale docs, CI logs, and old conversations. It can read the charter.

Every project gets its own charter. That matters. The charter is not a generic Spec Kitty rulebook that every team inherits unchanged. It is built for and adapted to the user’s project: its domain language, architecture, deployment rules, risk boundaries, review habits, quality gates, and team process.

The charter for my Spec Kitty work contains things that the code alone cannot tell an agent.

For example, in my project, if work touches the SaaS repository, the charter says contributors must use a two-mode Docker workflow. dev-live is for active implementation. prod-like is for pre-merge and pre-deploy validation. A prod-like authenticated preflight must pass before Fly promotion and before SaaS integration work is considered complete.

An agent looking only at scripts and Makefiles might find some commands. It would not necessarily know which workflow is policy, which gate matters, or when the work is actually done.

My charter also says that after spec-kitty merge, a Protect Main Branch CI failure is expected. That workflow fails because spec-kitty merge intentionally pushes directly to main. The agent should not treat that failure as a bug. It should monitor CI Quality, which is the correctness signal.

That is exactly the kind of thing agents get wrong when project memory is not explicit. A well-meaning agent sees a red check and starts fixing the wrong problem.
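As a rough illustration of how that rule could be made mechanical, the sketch below separates expected failures from real ones. The two check names come from the charter described above; the function names and the `"success"` and `"failure"` status strings are assumptions, not part of any real CI API.

```python
# Check names as stated in the charter above.
EXPECTED_FAILURES = {"Protect Main Branch"}  # fails by design after spec-kitty merge
CORRECTNESS_SIGNAL = "CI Quality"            # the check that decides health


def unexpected_failures(checks: dict[str, str]) -> list[str]:
    """Return failing checks that the charter does not list as expected."""
    return sorted(
        name
        for name, status in checks.items()
        if status == "failure" and name not in EXPECTED_FAILURES
    )


def merge_is_healthy(checks: dict[str, str]) -> bool:
    """Health is decided by the correctness signal alone."""
    return checks.get(CORRECTNESS_SIGNAL) == "success"


# Assumed status values, for illustration only.
checks = {"Protect Main Branch": "failure", "CI Quality": "success"}
print(unexpected_failures(checks))  # prints []
print(merge_is_healthy(checks))     # prints True
```

With the rule written down, a red check on Protect Main Branch stops being a prompt for the agent to start fixing things.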

My charter also protects user customizations. Init, upgrade, install, sync, and migration flows must preserve user-authored commands, skills, and project overrides unless package ownership is proven by a manifest or managed path contract. Name matching is not enough.
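The phrase "name matching is not enough" can be sketched as an ownership test that only trusts a manifest entry or a managed directory. Everything here is hypothetical: the `may_overwrite` function, the manifest as a set of paths, and the example paths are assumptions for illustration, not Spec Kitty's actual contract.

```python
from pathlib import PurePosixPath


def may_overwrite(path: str, manifest: set[str], managed_dirs: list[str]) -> bool:
    """Allow overwriting only when package ownership is proven.

    Ownership is proven by an exact manifest entry or by living under a
    managed directory. A matching file name alone proves nothing.
    """
    candidate = PurePosixPath(path)
    if str(candidate) in manifest:
        return True
    return any(candidate.is_relative_to(d) for d in managed_dirs)


# Hypothetical layout, for illustration only.
manifest = {"commands/plan.md"}
managed_dirs = ["managed"]

print(may_overwrite("commands/plan.md", manifest, managed_dirs))       # prints True
print(may_overwrite("managed/skills/review.md", manifest, managed_dirs))  # prints True
print(may_overwrite("overrides/plan.md", manifest, managed_dirs))      # prints False
```

The last case is the one the rule exists for: `overrides/plan.md` shares a file name with a packaged command, but nothing proves the package owns it, so an upgrade leaves it alone.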

And it captures language. In my project, the active domain term is Mission, not Feature. New flags, messages, examples, routes, fields, and user-facing language must not reintroduce feature* for that domain object, except as hidden legacy aliases.

A model can read the code and see both old and new words. The charter tells it which word is law.
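A vocabulary rule like this is easy to check mechanically. The following is a minimal sketch, assuming new lines are available as a list of strings and that hidden legacy aliases are tagged with a `legacy-alias` marker; that marker, the function name, and the sample lines are all hypothetical.

```python
import re

# Match "feature" and anything glued to it (feature_id, FeatureName),
# but not when it is the tail of another word.
BANNED = re.compile(r"(?<![a-z])feature\w*", re.IGNORECASE)
LEGACY_MARKER = "legacy-alias"  # hypothetical tag exempting hidden aliases


def banned_terms(lines: list[str]) -> list[tuple[int, str]]:
    """Return (line number, matched term) for each disallowed use."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if LEGACY_MARKER in line:
            continue  # hidden legacy aliases are allowed by the charter
        hits.extend((number, m.group(0)) for m in BANNED.finditer(line))
    return hits


# Hypothetical new lines from a change under review.
new_lines = [
    'parser.add_argument("--mission")',
    'parser.add_argument("--feature", help=SUPPRESS)  # legacy-alias',
    'print("Feature created")',
    'route = "/missions/<mission_id>"',
    'field_name = "feature_id"',
]

print(banned_terms(new_lines))  # prints [(3, 'Feature'), (5, 'feature_id')]
```

A check like this only enforces the word. Knowing that Mission is the word to enforce still has to come from the charter.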

For another team, the charter will be different. It might encode how they deploy to Kubernetes, how they use Salesforce terminology, which compliance gates matter, how database migrations are reviewed, which Jira statuses are authoritative, or when a security reviewer must be pulled in. The point is not my rules. The point is that the rules become explicit enough for the next agent to follow.

What About the Secrets

I opened with a story about secrets going public. Would Spec Kitty have prevented it?

In a specific way, yes. A charter encodes the rules every serious team already has — no secrets in the repo, no committed .env files, human review on anything touching environment plumbing — and puts them where a fresh-session agent will actually read them. The interview surfaces the risk early. The cross-review checks the diff against the rules.

Not magic. Just the kind of guardrail a senior engineer would build into a team’s process — except now the agents can read it too. The secrets are still your responsibility. The system just stops giving you so many ways to forget.
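For a sense of how small such a guardrail can be, here is a hedged sketch of the "no committed .env files" rule as a path check. The function names and the `.env.example` exemption are assumptions for illustration; a real setup would also want content scanning and human review, which this does not attempt.

```python
from pathlib import PurePosixPath

# Assumption: a template file with no real values is allowed in the repo.
ALLOWED_NAMES = {".env.example"}


def is_env_file(path: str) -> bool:
    """True for .env and .env.<suffix> files that are not explicitly allowed."""
    name = PurePosixPath(path).name
    looks_like_env = name == ".env" or name.startswith(".env.")
    return looks_like_env and name not in ALLOWED_NAMES


def blocked_paths(staged: list[str]) -> list[str]:
    """Return the staged paths that must not be committed."""
    return [path for path in staged if is_env_file(path)]


# Hypothetical list of staged files.
staged = ["src/app.py", ".env", "deploy/.env.production", ".env.example"]
print(blocked_paths(staged))  # prints ['.env', 'deploy/.env.production']
```

The staged list could come from `git diff --cached --name-only` in a pre-commit hook, with the commit refused whenever `blocked_paths` returns anything.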

The Proof

Spec Kitty was built with Spec Kitty.

That is not just a marketing line, although it makes a serviceable one. It is an engineering claim: Spec Kitty is planned, implemented, reviewed, and improved through the same workflow it gives to other teams.

Every change goes through the same loop: interview, decompose, cross-review, leave history. Which means every time I use Spec Kitty to improve Spec Kitty, the artifacts of that work become part of the history that future Spec Kitty work can draw on.

The system gets better every time I use it.

That is not metaphor. It is mechanism.

What It Looks Like For Real Teams

Solo workflows make a fine demo. They are not the point.

The teams I built Spec Kitty for are the ones that have been shipping software for fifteen, twenty, thirty years. They have institutional knowledge. They have process. They have engineering managers who know what a healthy on-call rotation feels like, product managers who know what “blocked” actually means, and tech leads who can tell when an architecture review is performative.

Those teams should not have to throw all of that away to get the speed of agentic coding.

So Spec Kitty has the things those teams need.

A charter: a governance layer that captures the rules of how this team works on this codebase. Not an afterthought; a first-class artifact.

A TeamSpace: a shared place where multiple people can express intent to the system in parallel without stepping on each other’s plans, decisions, or branches.

Product owner visibility: product owners can see what change is in interview, what is in decomposition, what is in review, and what has shipped. Not through a separate reporting layer, but as a natural property of the system.

And it integrates with the tools serious teams already use: Jira, Linear, GitHub, GitLab, and Slack. The point is not to make you adopt a new universe. The point is to slot into the universe you already work in.

What Spec Kitty captures, in particular, are the decision moments: the small, charged conversations where a team commits to a direction. Historically those happen in DMs, hallway chats, and people’s heads. Spec Kitty pulls them into the open and widens them, quickly and without ceremony, to the people who need to weigh in.

The Thesis

I built Spec Kitty because I was tired of the trade-off.

The trade-off being: take the speed of agentic coding and lose the discipline of how serious teams ship software, or keep the discipline and forfeit the speed.

I do not believe that trade-off is real. I think it is an artifact of agentic tools that were designed by and for individuals racing to a prototype, and then got handed to teams that needed something more.

Spec Kitty is what I think the more is.

It lets serious software teams, teams with twenty years of accumulated process and institutional knowledge, adopt agentic coding and get the acceleration without burning their entire process to the ground.

The future where you describe what you want and the software gets built is real, and it is close. Spec Kitty is how you get there from here without losing the codebase, the team, or the process you spent two decades earning.

The point of Spec Kitty is not to make software development less human.

It is to make the human parts harder to lose.

If that sounds like the team you are on, try Spec Kitty. If your team needs shared visibility, coordination, and decision history, try Spec Kitty TeamSpace. And if this sounds like something your company should adopt carefully, we offer training for teams moving into agentic software development.

— Robert Douglass, Creator of Spec Kitty