anand jaisy

Spec-Driven AI-Assisted Development with GitHub Spec Kit

“Vibe coding” is fast, intuitive, and honestly one of the most enjoyable ways to build momentum. It feels like unlocking creativity at full speed, and I use it regularly myself.

It’s amazing how fast you can build something… and how equally fast you can confuse your future self.

However, as we move from exploration to production, momentum alone is not enough. In production systems, speed without structure often leads to avoidable complexity.

Shipping fast is great. Shipping fast and repeatedly fixing the same bug is… less great.

Without a structured feedback loop, teams can quickly accumulate technical debt—resulting in rework, version mismatches, security vulnerabilities, and gradual architecture drift.

Technical debt is like gym membership fees—you don’t notice it at first, but it compounds quietly in the background.

This post serves as an instructional playbook for a mixed audience of engineers, tech leads, and product professionals collaborating in AI-assisted software delivery environments.

AI won’t replace engineers anytime soon, but it will absolutely amplify both good design decisions and bad ones—at scale.

The examples throughout are based on Angular and Java/Quarkus, as they represent my primary technology stack. However, the principles are technology-agnostic—the same challenges around version drift, implicit defaults, and architectural inconsistency exist across all frameworks and platforms.

Frameworks change. Dependency versions change. But production chaos? That part is surprisingly stable.

The main agenda

The goal is not to stop vibe coding.

The goal is to introduce engineering control around vibe coding so we can preserve speed without compromising quality.

Think of it this way:

Vibe coding is excellent for discovery, experimentation, and rapid prototyping—where speed and creativity matter more than structure.
Specs-driven feedback loops, on the other hand, are essential when moving toward production decisions, where correctness, consistency, and maintainability become non-negotiable.

Prototypes are for learning. Production is for living with your decisions.

The winning model is not either-or. It is both, applied in sequence.

First, you move fast enough to explore the problem space.
Then, you slow down just enough to make sure what you built can actually survive contact with reality.

Decision Framework: When to Vibe, When to Spec

Use a simple guiding rule:

Early discovery → start with vibe coding.
This is where speed, experimentation, and iteration matter most, and the goal is to quickly learn what works.

Anything user-facing in production → start with a specs-driven feedback loop.
At this stage, clarity, consistency, and validation matter more than raw speed, because changes have real user and system impact.

Migration or platform-level work → always include explicit versioning and compatibility checks upfront.
This helps avoid hidden dependency issues, breaking changes, and long-term maintenance surprises.

This approach ensures that speed is preserved where it adds value, while rigor is applied where it truly matters—preventing short-term momentum from becoming long-term technical friction.
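
As a rough illustration, the guiding rule can be written down as a small lookup. The names `WorkType` and `starting_point` are hypothetical, invented for this sketch; they are not part of Spec Kit.

```python
from enum import Enum


class WorkType(Enum):
    DISCOVERY = "early discovery"
    USER_FACING = "user-facing production change"
    MIGRATION = "migration or platform-level work"


# Each kind of work maps to the starting point named in the guiding rule.
STARTING_POINTS = {
    WorkType.DISCOVERY: "start with vibe coding",
    WorkType.USER_FACING: "start with a specs-driven feedback loop",
    WorkType.MIGRATION: "include explicit versioning and compatibility checks upfront",
}


def starting_point(work: WorkType) -> str:
    """Return the recommended starting point for a given kind of work."""
    return STARTING_POINTS[work]
```

Calling `starting_point(WorkType.MIGRATION)` returns the versioning-and-compatibility reminder, which is the kind of nudge a team could put in a ticket template.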

Two Workflows, Two Outcomes

Workflow A: Prompt to Implementation (No Feedback Loop)

Typical flow:

  1. Ask AI to build the project.
  2. Accept default choices without much scrutiny (because “AI probably knows better”).
  3. Start implementing features immediately.
  4. Discover issues later during integration, testing, or—best case—just before release.

Common failure modes:

  1. Framework or library versions are selected implicitly by the AI and don’t match team standards or support policies.
  2. Non-functional requirements like security, observability, and maintainability quietly disappear (they were “implied,” apparently).
  3. Product intent gets diluted because implementation starts before the design is actually clear.
  4. A working codebase is mistaken for a correct architecture.

It worked on my machine… and the AI wrote it, so now we have two mysteries.

Nothing says confidence like pushing a feature and immediately opening three new Jira tickets.

Workflow B: Specs-Driven AI Cycle (With Feedback Loop)

Typical flow:

  1. Capture intent, constraints, and what “done” actually means.
  2. Define a high-level technical design before touching code.
  3. Break it down into explicit decisions (APIs, architecture, data flow, versions).
  4. Ask AI to implement strictly against the spec.
  5. Validate dependencies, framework versions, architecture alignment, and tests.
  6. Feed findings back into the next iteration of the spec.
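
Step 5 of this flow can be partly automated. The following is a minimal sketch, assuming the team policy is expressed as allowed major versions per dependency. The `POLICY` table, the dependency names, and `find_violations` are all hypothetical and only illustrate the idea.

```python
# Hypothetical team policy: dependency name -> allowed major versions.
POLICY = {
    "quarkus": {3},
    "angular": {21, 22},
}


def major(version: str) -> int:
    """Extract the major component from a version string such as '3.2.1'."""
    return int(version.split(".")[0])


def find_violations(declared: dict[str, str]) -> list[str]:
    """Return a message for every declared dependency that breaks the policy."""
    problems = []
    for name, version in declared.items():
        allowed = POLICY.get(name)
        if allowed is None:
            problems.append(f"{name} {version}: not covered by the policy")
        elif major(version) not in allowed:
            problems.append(f"{name} {version}: major version not in {sorted(allowed)}")
    return problems
```

An empty result means the declared versions match the policy; anything else is a finding to feed back into the spec in step 6.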

This workflow still moves fast—but it’s a controlled kind of fast. Less “hope-driven development,” more “intent-driven engineering.”

In Workflow A, you discover problems during deployment. In Workflow B, you argue about them politely with a document first.

AI is great at generating code. It’s also great at generating very confident wrong assumptions—so the spec is your seatbelt.

Instead of surprises showing up in production, they show up early—when fixing them doesn’t require emotional support snacks and a rollback plan.

The goal is not fewer surprises. The goal is cheaper surprises.

Common Advice vs Practical Reality

Common advice map

  1. “Just prompt better.”
  2. “AI is smart enough, ship it.”
  3. “We can clean it up later.”

Reality map

  1. Better prompts help, but do not replace system design.
  2. Fast output is not the same as valid architecture.
  3. Cleanup later is usually slower and more expensive.

If vibe coding is jazz, production is classical music—same instruments, very different consequences for wrong notes.

GitHub Spec Kit

Build high-quality software faster.
An open source toolkit that allows you to focus on product scenarios and predictable outcomes instead of vibe coding every piece from scratch.

Finally, a way to stop arguing with AI at 2 AM about whether ‘just one more endpoint’ needs a design doc.

Let’s start with this—because vibe coding doesn’t come with a rollback plan.

It requires Python to install, so no, this is not a ‘just vibe it’ situation anymore.

There are several ways to integrate it into a project:

  1. Download the latest release and manually add it to the project directory or existing codebase.
  2. Run it once without installing: uvx --from git+https://github.com/github/spec-kit.git@vX.Y.Z specify init <project_name> (replace vX.Y.Z with the latest tag from Releases).
  3. Install the CLI persistently: uv tool install specify-cli --from git+https://github.com/github/spec-kit.git@vX.Y.Z
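
Before running any of these, it can help to confirm the prerequisites. This is a minimal sketch, assuming Spec Kit needs the `uv` tool on the PATH and Python 3.11 or newer (check the project's README for the current requirements). `missing_prerequisites` is a hypothetical helper, not something Spec Kit ships.

```python
import shutil
import sys


def missing_prerequisites(min_python=(3, 11), which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means ready."""
    # The interpreter version and the PATH lookup are injectable so the check is testable.
    version = tuple(version or sys.version_info[:2])
    problems = []
    if version < tuple(min_python):
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, found {version[0]}.{version[1]}"
        )
    if which("uv") is None:
        problems.append("uv not found on PATH")
    return problems
```

Calling `missing_prerequisites()` with no arguments checks the current interpreter and the real PATH.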

Oh, we have to install Spec Kit first. Congratulations, we’ve now entered the “real engineering setup phase.”

Diving into Spec Kit

In this tutorial, we will be working with a brownfield project. Let’s begin with the following command:

specify init <project_name> --integration copilot

Since we are using GitHub Copilot, the integration mode is set to copilot.

Spec Kit generates a .specify folder in the project; its memory, templates, and scripts subfolders come up later in this post.

Our technology stack includes:

Angular: Angular Skill (https://angular.dev/ai/agent-skills)
Quarkus: Quarkus Skill (https://github.com/quarkusio/skills)

If we follow the instructions provided in both skills repositories, a dedicated folder will be generated within the agent directory as part of the setup process.

With this setup, any prompt you send to the LLM will now be influenced by the configured skills, meaning code generation will consistently follow those predefined patterns and guidelines.

At this point, you don’t just prompt the LLM anymore—it politely consults its ‘skills folder’ before replying, like it has suddenly become very responsible.

In other words, every response is now a mix of intelligence, structure, and just enough discipline to stop it from freelancing its own architecture decisions at 2 AM.

GitHub Copilot with IntelliJ

I am using the GitHub Copilot plugin CLI in IntelliJ IDEA; however, the same functionality is also available in Visual Studio Code.

The most commonly used Spec Kit slash commands are listed below.

Slash commands with your coding agent:
/speckit.constitution - Establish project principles
/speckit.specify - Create baseline specification
/speckit.plan - Create implementation plan
/speckit.tasks - Generate actionable tasks
/speckit.implement - Execute implementation

The main goal of /speckit.constitution is to define clear boundaries for the project so that the LLM operates within a well-defined set of principles.

Think of it as establishing guardrails for the system—ensuring consistency, predictability, and alignment with project standards.

It’s a bit like a well-designed zoo enclosure for an LLM: plenty of space to move, explore, and perform, but within clearly defined boundaries where it can operate safely and effectively.

Run the command in the CLI as /speckit.constitution <Description of the project>, where the description states the principles you want the project to follow.

**Example**

`/speckit.constitution Services may use Spring Boot, Quarkus, or Micronaut, but each service MUST own clear bounded-context responsibilities, explicit API contracts, and isolated build/runtime configuration. Cross-service dependencies MUST occur through documented interfaces, never through internal package coupling`

The LLM will update the file at `.specify/memory/constitution.md`:
## Core Principles

### I. Polyglot Service Boundaries
Suruppa services MAY use Spring Boot, Quarkus, or Micronaut, but each service MUST own
clear bounded-context responsibilities, explicit API contracts, and isolated build/runtime
configuration. Cross-service dependencies MUST occur through documented interfaces, never
through internal package coupling.  
Rationale: the repository intentionally mixes frameworks; strict boundaries prevent framework
sprawl from becoming architecture drift.

- Spring Boot services MUST use Spring's application context isolation.
- Quarkus services MUST use CDI-based injection and avoid Spring compatibility layer.
- Micronaut services MUST use compile-time DI and avoid runtime reflection where possible.

### II. Shared Contract Libraries
Cross-cutting models and integration helpers MUST be published through shared modules (for
example `shared/model`, `shared/quarkus-common`, `shared/micronaut-common`) and versioned so
consumers can upgrade intentionally. Service-specific logic MUST remain in the owning service.  
Rationale: shared contracts reduce duplication and keep consistency across Angular clients and
multiple JVM frameworks.

- Spring Boot services MAY consume `shared/model` but MUST NOT depend on
  `shared/quarkus-common` or `shared/micronaut-common`.
- Framework-specific shared modules MUST NOT leak framework internals
  into their public API.

### III. Test and Contract Gates (NON-NEGOTIABLE)
Every change MUST include automated tests at the appropriate level: unit tests for domain logic,
framework integration tests for service behavior, and contract tests when APIs, protobuf, or
OpenAPI surfaces change. Builds MUST fail on test failure and no feature is complete without a
failing-then-passing test cycle for new behavior.  
Rationale: a distributed, polyglot architecture requires executable proofs of compatibility.

- Spring Boot: MUST use @SpringBootTest for integration tests.
- Quarkus: MUST use @QuarkusTest and QuarkusTestResource for integration tests.
- Micronaut: MUST use @MicronautTest for integration tests.

## Technology and Runtime Standards

- Java services MUST target Java 25 unless an approved exception is documented in feature plans.
- Build tooling MUST remain Gradle-based for JVM services and Angular CLI/NPM for web clients.
- API changes MUST document backward-compatibility impact and upgrade path for consumers.
- Database migrations MUST pass validation before merge.
- New shared libraries MUST publish to local Maven during development and define ownership.

- Spring Boot services MUST use Spring Boot 4.x (Jakarta EE namespace).
- Quarkus services MUST target Quarkus 3.x LTS.
- Micronaut services MUST target Micronaut 5.x.
- All three frameworks MUST target Java 25 per the existing standard.

- Generate Front-end API Models from Swagger

Run from the **repository root**:

sh .specify/scripts/bash/swagger-codegen.sh <api-docs-url> <server-name>

Arguments:
<api-docs-url>  OpenAPI JSON endpoint (e.g. http://localhost:5100/v3/api-docs)
<server-name>  output folder name (e.g. authorization-server)

Example:
sh .specify/scripts/bash/swagger-codegen.sh \
  http://localhost:5100/v3/api-docs \
  authorization-server
Output goes to src/clients/web-ui/src/swagger-specification/<server-name>/.
 JAR: src/tools/swagger-codegen-cli-3.0.81.jar (Java required on $PATH).

Versioning policy for this constitution follows semantic versioning:

- MAJOR: incompatible governance or principle removal/redefinition.
- MINOR: new principle/section or materially expanded requirements.
- PATCH: clarifications, wording improvements, and non-semantic fixes.
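
The versioning policy at the end of the constitution maps naturally onto a small helper. A minimal sketch, assuming the constitution version is a plain `MAJOR.MINOR.PATCH` string; `bump` is a hypothetical helper, not part of Spec Kit.

```python
def bump(version: str, change: str) -> str:
    """Return the next constitution version for a 'major', 'minor', or 'patch' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":  # incompatible governance or principle removal/redefinition
        return f"{major + 1}.0.0"
    if change == "minor":  # new principle/section or materially expanded requirements
        return f"{major}.{minor + 1}.0"
    if change == "patch":  # clarifications, wording improvements, non-semantic fixes
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")
```

Rewording a principle would be `bump("1.2.3", "patch")`, while removing one would be `bump("1.2.3", "major")`.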

All updated files are based on the templates provided under .specify/templates.

We also need to add these instructions for Copilot in .github/copilot-instructions.md:

For Angular development guidance, read `.github/agents/skills/angular-developer/SKILL.md`.
For creating new Angular apps, read `.github/agents/skills/angular-new-app/SKILL.md`.

For updating quarkus apps, read `.github/agents/skills/quarkus-update/SKILL.md`.

Updating the plan template with your tech stack gives the LLM much better context. A simple example:

**Language/Version**: [e.g., Java 25 or NEEDS CLARIFICATION]

**Primary Dependencies**: [e.g., Spring Boot 4.0.6, Angular 21.2, Quarkus 3.2.1, Micronaut 5.0.0 or NEEDS CLARIFICATION]

**Storage**: [if applicable, e.g., PostgreSQL]

**Testing**: [e.g., JUnit, custom test framework or NEEDS CLARIFICATION]

**Target Platform**: [e.g., Linux server, iOS 15+, WASM or NEEDS CLARIFICATION]
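
The `NEEDS CLARIFICATION` marker in the template is easy to scan for before moving on to implementation. A minimal sketch, assuming the plan is plain Markdown text with `**Field**: value` lines as in the example; `unresolved_fields` is a hypothetical helper.

```python
import re

MARKER = "NEEDS CLARIFICATION"
# Matches lines such as "**Storage**: PostgreSQL" and captures the field name and value.
FIELD = re.compile(r"^\*\*(?P<name>[^*]+)\*\*:\s*(?P<value>.*)$")


def unresolved_fields(plan_text: str) -> list[str]:
    """Return the names of plan fields whose value still contains the marker."""
    unresolved = []
    for line in plan_text.splitlines():
        match = FIELD.match(line.strip())
        if match and MARKER in match.group("value"):
            unresolved.append(match.group("name"))
    return unresolved
```

Run against the untouched template, this reports every field that still carries the placeholder, which is a reasonable gate before asking the agent to write code.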

We are always free to edit the files under .github and .specify that guide the agent.

Oh—that’s the bold statement. Because nothing says ‘responsible engineering’ like confidently editing the rulebook the AI is actively following.

/speckit.specify - Create baseline specification

Usage: /speckit.specify <Description>

Example:
/speckit.specify Create a sign up screen with user first name, last name and email address

Once you run this command, the LLM will generate multiple files under the specs folder. Always make sure to review the .md files generated inside the specs directory.

Treat those .md files like code reviews from a very fast intern who has read the entire internet but still forgot your project context.

Sample from the generated file

### Functional Requirements

- **FR-001**: The registration page MUST present input fields for first name, last name, and email address.
- **FR-002**: All three fields (first name, last name, email) MUST be required; the form MUST NOT submit if any field is empty.
- **FR-003**: The email field MUST be validated against standard email format before submission is processed.
- **FR-004**: The system MUST display inline, field-level error messages for each validation failure without reloading the page.
- **FR-005**: The system MUST trim leading and trailing whitespace from all field values before processing.
- **FR-006**: The system MUST reject registration if the provided email address is already associated with an existing account.
- **FR-007**: The system MUST display a user-friendly error message when a duplicate email is detected, without disclosing details of the existing account.
- **FR-008**: Upon successful registration, the system MUST confirm the outcome to the user (e.g., success message or redirect to a confirmation page).
- **FR-009**: The system MUST prevent duplicate form submissions (e.g., disable the submit button after first click until the response is received).
- **FR-010**: The system MUST display a user-friendly error message if registration fails due to a system or service error, without exposing technical details.
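
To make a few of these requirements concrete, here is a framework-agnostic sketch of FR-002, FR-003, and FR-005. It is not the code Spec Kit generated: `validate_sign_up`, the message texts, and the deliberately simple email pattern are assumptions for illustration, and in the real Angular app this logic would live in form validators.

```python
import re

# Deliberately simple pattern for illustration; real email validation is more involved.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_sign_up(first_name: str, last_name: str, email: str) -> dict[str, str]:
    """Return field-level error messages; an empty dict means the form may submit."""
    # FR-005: trim leading and trailing whitespace before processing.
    values = {
        "first_name": first_name.strip(),
        "last_name": last_name.strip(),
        "email": email.strip(),
    }
    errors = {}
    for field, value in values.items():
        if not value:
            # FR-002: every field is required.
            errors[field] = "This field is required."
    if values["email"] and not EMAIL_PATTERN.match(values["email"]):
        # FR-003: the email must match a standard format.
        errors["email"] = "Enter a valid email address."
    return errors
```

Returning errors keyed by field name lines up with FR-004, which asks for inline, field-level messages.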

/speckit.clarify

We can run this command to clarify a few things. It contains basic questions that the LLM will ask as part of the setup flow. These questions are generated from predefined templates, so it is useful to understand where each input is coming from and how the pieces are connected.

It’s essentially the moment where the AI politely interviews you about your own project—just to make sure you both agree on what you’re building.

/speckit.plan - Create implementation plan

This command is used to define technical specifications and implementation details for the project. It includes decisions such as the framework to be used, architectural approach, tooling, and other technical constraints. For example, UI-level standards like button radius (e.g., 5%) or design tokens such as primary button color (e.g., red) can also be specified here.

Think of it as telling the LLM: yes, you can build it—but here are the laws of physics in this universe.

Remember to review all the files that have been generated inside the specs directory.

Because the AI will happily generate a whole architecture in seconds… and you still need to be the one who politely checks if it makes sense in this universe.

Sample from generated file

## Technical Context

**Language/Version**: TypeScript ~5.9.2

**Primary Dependencies**: Angular 22.x, Angular Material 22.x, Angular Reactive Forms (`@angular/forms`), RxJS ~7.8, TailwindCSS 4.x, `@falcon-ng/tailwind` 0.0.28

**Storage**: None (UI only; persistence delegated to the identity backend)

**Testing**: Angular TestBed + Vitest (Angular 22 default test runner in the identity UI), `ComponentFixture`, `HttpClientTestingModule`

**Target Platform**: Modern desktop/mobile browser (evergreen); no IE or native-app target

**Project Type**: Angular SPA component inside a mono-repo identity UI app (`src/infrastructure/identity/web/ui`)

**Performance Goals**: Form renders and first-interaction ready in < 1 s; inline validation feedback ≤ 200 ms after field blur (client-only)

**Constraints**: UI only — no backend code, no database migrations, no shared/model library changes required for this feature slice; spec FR-006 (duplicate email) handled by surfacing the HTTP 409 error from the existing API endpoint

**Scale/Scope**: Single component (≈ 3 files: `.ts`, `.html`, `.scss`) + 1 spec file + optional service; routes into an existing SPA with ~20 routes

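
The constraint about surfacing HTTP 409 pairs with FR-007 and FR-010 from the spec. A minimal sketch of that mapping, written framework-agnostically; `registration_message` and the message texts are hypothetical, and treating every non-2xx, non-409 status as a generic failure is an assumption about the backend.

```python
def registration_message(status_code: int) -> str:
    """Translate an API status code into a user-facing message without technical details."""
    if 200 <= status_code < 300:
        # FR-008: confirm the outcome to the user.
        return "Registration successful."
    if status_code == 409:
        # FR-007: report the duplicate without disclosing details of the existing account.
        return "This email address is already registered."
    # FR-010: any other failure gets a generic message.
    return "Something went wrong. Please try again later."
```

Keeping the mapping in one place makes it easy to review against the spec when a new error case shows up.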

/speckit.tasks - Generate actionable tasks

This command further decomposes the specification into smaller, executable tasks that can be implemented incrementally.

Example:
/speckit.tasks The user’s first name, last name, and email are mandatory fields. A validation error message should be displayed when the user moves focus out of the input field without providing valid values.
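
Generated task lists are easier to track with a quick progress count. A minimal sketch, assuming tasks are written as Markdown checkboxes (`- [ ]` and `- [x]`), which may differ from what your Spec Kit version emits; `task_progress` and the task titles in the usage note are hypothetical.

```python
import re

# Matches Markdown checkbox items such as "- [ ] Add validators" or "* [x] Done".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[(?P<mark>[ xX])\]\s+(?P<title>.+)$")


def task_progress(tasks_markdown: str) -> tuple[int, int]:
    """Return (done, total) for the checkbox items in a tasks document."""
    done = total = 0
    for line in tasks_markdown.splitlines():
        match = CHECKBOX.match(line)
        if match:
            total += 1
            if match.group("mark").lower() == "x":
                done += 1
    return done, total
```

For a document with one checked and one unchecked item, the function returns `(1, 2)`; lines that are not checkbox items are ignored.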

At this stage, the spec officially graduates into something your future self will definitely assign story points to.

Oh finally the last step… oh dear God, the files generated by the LLM are too many.

Congratulations—you didn’t just scaffold a project, you accidentally initialized a documentation ecosystem.

/speckit.implement - Execute implementation
This command generates the actual code based on the specifications defined in the specs directory.

In other words, it takes all the structured requirements, design decisions, and task breakdowns, and translates them into working implementation files that follow what the files in the specs directory say.

This is the moment where the AI finally stops writing essays and starts writing code.

Everything in the specs folder now graduates into real files—like turning architectural theory into something that can actually break in production.

You don’t just get code… you get spec-compliant code, which is a fancy way of saying: if it breaks, at least it breaks consistently.

A Lightweight Team Working Agreement

If you want this to stick across product and engineering, agree on a short process:

  • No implementation prompts before intent + design are approved.
  • Every AI-generated project must include a version validation step.
  • Any architecture change during coding requires a design delta note.
  • PRs include a spec compliance checklist.
  • Release readiness requires explicit support-lifecycle verification.

This is a small governance layer with a huge payoff.
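
One way to back the spec compliance checklist with something mechanical is to check that every requirement ID from the spec is mentioned somewhere in the tests. A minimal sketch, assuming requirement IDs follow the `FR-001` pattern from the generated spec and that tests mention those IDs in names or comments (a convention the team would have to adopt); `uncovered_requirements` is a hypothetical helper.

```python
import re

REQUIREMENT_ID = re.compile(r"\bFR-\d{3}\b")


def uncovered_requirements(spec_text: str, test_text: str) -> list[str]:
    """Return requirement IDs that appear in the spec but not in the tests, sorted."""
    in_spec = set(REQUIREMENT_ID.findall(spec_text))
    in_tests = set(REQUIREMENT_ID.findall(test_text))
    return sorted(in_spec - in_tests)
```

An empty list does not prove the requirements are met; it only shows that someone wrote a test claiming to cover each one, which is still a useful signal in a PR review.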

Final Takeaway

You do not need to choose between creativity and discipline.

Vibe coding is a powerful accelerant. Specs-driven feedback loops are the steering wheel and brakes.

Great teams use both.

If your organization wants to adopt AI coding responsibly, start with one change: never go from prompt to production without a spec checkpoint and a verification loop.
