Fabián Silva

Posted on Apr 3

I Built a Visual Spec-Driven Development Extension for VS Code That Works With Any LLM

#vscode #ai #opensource #speckit

The Problem

If you've tried GitHub's Spec Kit, you know the value of spec-driven development: define requirements before coding, let AI generate structured specs, plans, and tasks. It's a great workflow.

But there's a gap.

Spec Kit works through slash commands in chat. There is no visual UI, no progress tracking, and no approval workflow. You type /speckit.specify, read the output, type /speckit.plan, and so on. It works, but you have to remember where you are in the workflow yourself.

Kiro (Amazon's VS Code fork) offers a visual experience — but locks you into their specific LLM and requires leaving VS Code for a custom fork.

I wanted both: a visual workflow inside VS Code that works with any LLM I choose.

So I built Caramelo.

What Caramelo Does

Caramelo is a VS Code extension that gives you a complete visual UI for spec-driven development:

1. Connect Any LLM — Including Your Corporate Proxy

Click a preset, enter credentials, done. No CLI tools required.

Supported out of the box:

GitHub Copilot — uses your existing subscription, no API key needed
Local: Ollama, LM Studio (no API key needed)
Cloud: Claude, OpenAI, Gemini, Groq (API key)
Custom: any OpenAI-compatible endpoint
Corporate proxies: custom auth headers for Azure API Manager, AWS API Gateway, etc.

You can have multiple providers of the same type — "Claude Personal" with your own API key and "Claude MyCompany" through your company's proxy, each with different endpoints and auth settings. Switch between them by clicking the dot indicator. Models are fetched from the API when available, or entered manually with automatic validation.
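
To make the per-provider auth settings concrete, here is a minimal sketch of how request headers could be derived from a provider entry. The `ProviderConfig` shape, its field names, and `buildHeaders` are hypothetical and are not taken from Caramelo's source. The header name `Ocp-Apim-Subscription-Key` is the one Azure API Management conventionally uses for subscription keys.

```typescript
// Hypothetical provider shape; field names are illustrative, not Caramelo's schema.
interface ProviderConfig {
  name: string;
  baseUrl: string;
  apiKey?: string;            // omitted for local providers such as Ollama
  authHeaderName?: string;    // defaults to "Authorization"
  authHeaderPrefix?: string;  // defaults to "Bearer "; use "" for raw keys
  extraHeaders?: Record<string, string>;
}

// Build the HTTP headers for one provider, honouring custom header names and prefixes.
function buildHeaders(provider: ProviderConfig): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    ...(provider.extraHeaders ?? {}),
  };
  if (provider.apiKey) {
    const headerName = provider.authHeaderName ?? "Authorization";
    const prefix = provider.authHeaderPrefix ?? "Bearer ";
    headers[headerName] = `${prefix}${provider.apiKey}`;
  }
  return headers;
}

// Two providers of the same type with different endpoints and auth settings.
const personal: ProviderConfig = {
  name: "Claude Personal",
  baseUrl: "https://api.example.com/v1", // placeholder URL
  apiKey: "personal-key",
};
const corporate: ProviderConfig = {
  name: "Claude MyCompany",
  baseUrl: "https://llm.example.internal/v1", // placeholder URL
  apiKey: "subscription-key",
  authHeaderName: "Ocp-Apim-Subscription-Key",
  authHeaderPrefix: "",
};
```

With these assumptions, `buildHeaders(personal)` sends a standard bearer token, while `buildHeaders(corporate)` sends the raw key under the proxy's own header name.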

2. Visual Workflow with Approval Gates

Instead of remembering which slash command to run next, Caramelo shows your workflow visually.

Each phase must be approved before the next unlocks:

Requirements → generates spec.md
Design → generates plan.md + research.md + data-model.md
Tasks → generates tasks.md

You see the documents streaming in real time as the LLM writes them. Approve when satisfied, or edit manually first. If you regenerate an earlier phase, downstream phases are flagged as stale.
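
The gating rules above can be sketched as a small state machine. This is an illustration under assumptions, not Caramelo's implementation: the status names (`locked`, `pending`, `approved`, `stale`) and the functions `approve` and `regenerate` are hypothetical.

```typescript
type Phase = "requirements" | "design" | "tasks";
type PhaseStatus = "locked" | "pending" | "approved" | "stale";
type Workflow = Record<Phase, PhaseStatus>;

const ORDER: Phase[] = ["requirements", "design", "tasks"];

// A new spec starts with only the first phase available.
function newWorkflow(): Workflow {
  return { requirements: "pending", design: "locked", tasks: "locked" };
}

// Approving a phase unlocks the next one if it was still locked.
function approve(workflow: Workflow, phase: Phase): Workflow {
  if (workflow[phase] === "locked") {
    throw new Error(`${phase} is locked until the previous phase is approved`);
  }
  const next: Workflow = { ...workflow };
  next[phase] = "approved";
  const following = ORDER[ORDER.indexOf(phase) + 1];
  if (following && next[following] === "locked") {
    next[following] = "pending";
  }
  return next;
}

// Regenerating a phase sends it back to review and flags every
// already-unlocked downstream phase as stale.
function regenerate(workflow: Workflow, phase: Phase): Workflow {
  if (workflow[phase] === "locked") {
    throw new Error(`${phase} is locked and cannot be regenerated`);
  }
  const next: Workflow = { ...workflow };
  next[phase] = "pending";
  for (const later of ORDER.slice(ORDER.indexOf(phase) + 1)) {
    if (next[later] !== "locked") {
      next[later] = "stale";
    }
  }
  return next;
}
```

Returning a new object on every transition keeps the functions easy to test and fits a UI that re-renders from state.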

3. Constitution-Driven Generation

Before creating any specs, you define your project's constitution — the non-negotiable principles:

"All features must include error handling." "TDD mandatory." "No external dependencies without justification."

You can write them manually or click "Generate with AI" — describe your project, and the LLM suggests principles. These are automatically included as context in every generation.
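
One plausible way to include the constitution in every generation is to prepend it to the system prompt. The sketch below assumes a chat-style message format; `buildMessages`, its parameters, and the prompt wording are hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Combine the phase instruction with the constitution so every request carries both.
function buildMessages(
  principles: string[],
  phaseInstruction: string,
  userInput: string,
): ChatMessage[] {
  const constitution = principles.length > 0
    ? "Project constitution (non-negotiable):\n" +
      principles.map((p, i) => `${i + 1}. ${p}`).join("\n")
    : "";
  const system = [phaseInstruction, constitution].filter(Boolean).join("\n\n");
  return [
    { role: "system", content: system },
    { role: "user", content: userInput },
  ];
}
```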

4. Import Specs from Jira

For teams that plan in Jira:

Connect your Jira Cloud board (search by name for orgs with 2000+ boards)
Click "From Jira" when creating a spec
Search issues or type a key directly (e.g., PROJ-123)
Title, description, acceptance criteria, and comments become your spec's input

The spec card shows a linked Jira badge — click to jump to the issue.

5. Task Execution from the Editor

Generated tasks aren't just a document — they're actionable:

Run Task — click a button, the LLM generates the code
Run All Tasks — execute everything, respecting parallel markers [P]
Output Channel — watch the LLM reasoning in real time
Progress tracking — completion percentage in the sidebar (100% only when all tasks done)
Inline checklist — toggle tasks directly in the sidebar
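
The task features above depend on reading tasks.md as data. Here is a minimal sketch of that idea, assuming task lines look like `- [ ] T001 [P] Description`. The exact line format, the `Task` shape, and the function names are assumptions, not Caramelo's code.

```typescript
interface Task {
  id: string;
  done: boolean;
  parallel: boolean;
  title: string;
}

// Assumed line format: "- [ ] T001 [P] Description" (the [P] marker is optional).
const TASK_LINE = /^\s*[-*]\s+\[( |x|X)\]\s+(T\d+)\s+(\[P\]\s+)?(.*)$/;

function parseTasks(markdown: string): Task[] {
  const tasks: Task[] = [];
  for (const line of markdown.split(/\r?\n/)) {
    const match = TASK_LINE.exec(line);
    if (!match) continue;
    tasks.push({
      done: match[1] !== " ",
      id: match[2],
      parallel: match[3] !== undefined,
      title: match[4].trim(),
    });
  }
  return tasks;
}

// Rounding down means 100% is shown only when every task is done.
function completionPercent(tasks: Task[]): number {
  if (tasks.length === 0) return 0;
  const done = tasks.filter((t) => t.done).length;
  return Math.floor((done / tasks.length) * 100);
}

// Group consecutive [P] tasks into one batch that may run concurrently;
// every other task forms its own sequential batch.
function toBatches(tasks: Task[]): Task[][] {
  const batches: Task[][] = [];
  for (const task of tasks) {
    const last = batches[batches.length - 1];
    if (task.parallel && last && last.every((t) => t.parallel)) {
      last.push(task);
    } else {
      batches.push([task]);
    }
  }
  return batches;
}
```

A "Run All Tasks" command could then walk the batches in order and await each batch with `Promise.all`.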

6. Quality Tools

Before moving forward, verify your work:

Clarify — LLM identifies ambiguities, presents questions as QuickPick dialogs
Analyze — checks consistency across all artifacts, reports findings with severity levels
Fix Issues — one-click auto-fix from the analysis report
Checklists — generates content-specific verification items

All accessible from the Caramelo menu (cat icon in the editor toolbar) — a single grouped dropdown that keeps your toolbar clean.

Architecture: How It Works

The extension is surprisingly simple (~170KB bundle):

No LLM SDKs — native fetch with a shared SSE parser, plus vscode.lm for Copilot
No React — native VS Code APIs (WebviewView, CodeLens, QuickPick)
No external CLI — doesn't require specify CLI or any tool in PATH
Spec Kit compatible — reads/writes specs/, syncs templates from GitHub releases
State-driven UI — all inline editing uses re-render pattern, no fragile DOM manipulation
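
As an illustration of the "shared SSE parser" idea, the sketch below buffers streamed text, splits it into events, and extracts text deltas from the two payload styles. The class and function names are hypothetical. The payload fields used are `choices[0].delta.content` with a `[DONE]` sentinel for OpenAI-compatible streams, and `content_block_delta` events with `delta.text` for Anthropic.

```typescript
// Incremental SSE parser: feed decoded text chunks, get completed data payloads back.
class SseParser {
  private buffer = "";

  push(chunk: string): string[] {
    // Normalise CRLF so events are always separated by a blank line ("\n\n").
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const payloads: string[] = [];
    let separator = this.buffer.indexOf("\n\n");
    while (separator !== -1) {
      const rawEvent = this.buffer.slice(0, separator);
      this.buffer = this.buffer.slice(separator + 2);
      const data = rawEvent
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""))
        .join("\n");
      if (data) payloads.push(data);
      separator = this.buffer.indexOf("\n\n");
    }
    return payloads;
  }
}

// Pull the text delta out of one payload, whichever provider style it uses.
function extractText(payload: string): string {
  if (payload === "[DONE]") return ""; // OpenAI-compatible end-of-stream sentinel
  const json = JSON.parse(payload);
  const openAiDelta = json.choices?.[0]?.delta?.content;
  if (typeof openAiDelta === "string") return openAiDelta;
  if (json.type === "content_block_delta" && typeof json.delta?.text === "string") {
    return json.delta.text;
  }
  return ""; // other event types carry no text
}
```

In a real request, the chunks would come from reading the `fetch` response body through a `TextDecoder`. The parser itself only deals with strings, so it can be tested without a network connection.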

What I Learned Building This

VS Code's WebviewView API is powerful. A single webview panel replaced 3 separate TreeViews and gave us forms, progress rings, task checklists, and inline editing — all with plain HTML/CSS.
SSE streaming is simple. Two LLM provider types (OpenAI-compatible + Anthropic) plus Copilot's vscode.lm API cover 95% of use cases with ~150 lines of streaming code.
Corporate LLM access is messy. Different API managers use different auth header names and prefixes. Making these configurable per-provider was essential for enterprise adoption.
State-driven re-renders beat DOM manipulation. Early attempts to inject form elements via postMessage broke because refresh() destroyed event listeners. Storing editingState and re-rendering the full HTML with editors baked in was the reliable solution.
Spec-driven development works. Using Caramelo to build Caramelo proved the workflow. Each feature went through specify → clarify → plan → tasks → implement.
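
The state-driven re-render lesson can be sketched without any VS Code imports. The idea is that messages from the webview only update state, and the full HTML is rebuilt from that state with the editor already in place. The `ViewState` and message shapes and the functions `reduce` and `render` are hypothetical. In an extension, the resulting string would be assigned to `webviewView.webview.html`.

```typescript
interface Spec { id: string; title: string; }
interface EditingState { specId: string; draft: string; }
interface ViewState { specs: Spec[]; editingState: EditingState | null; }

type Message =
  | { type: "startEdit"; specId: string }
  | { type: "commitEdit"; specId: string; value: string };

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Messages posted by the webview only change state; they never touch the DOM.
function reduce(state: ViewState, msg: Message): ViewState {
  switch (msg.type) {
    case "startEdit": {
      const { specId } = msg;
      const spec = state.specs.find((s) => s.id === specId);
      return spec
        ? { ...state, editingState: { specId: spec.id, draft: spec.title } }
        : state;
    }
    case "commitEdit": {
      const { specId, value } = msg;
      return {
        specs: state.specs.map((s) => (s.id === specId ? { ...s, title: value } : s)),
        editingState: null,
      };
    }
  }
}

// The whole view is rebuilt from state, so a refresh cannot lose the editor.
function render(state: ViewState): string {
  const edit = state.editingState;
  const items = state.specs.map((spec) => {
    if (edit && edit.specId === spec.id) {
      return `<li><input data-spec="${escapeHtml(spec.id)}" value="${escapeHtml(edit.draft)}"></li>`;
    }
    return `<li><span>${escapeHtml(spec.title)}</span> <button data-edit="${escapeHtml(spec.id)}">Edit</button></li>`;
  });
  return `<ul>${items.join("")}</ul>`;
}
```

Because `render` is a pure function of `ViewState`, calling it again after any refresh reproduces the same editor instead of relying on listeners attached to injected elements.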

Try It

Install: Search "Caramelo" in VS Code Extensions, or visit the Marketplace
Source: github.com/fsilvaortiz/caramelo
License: MIT

Contributions welcome! Check the Contributing Guide.

Built with spec-driven development, powered by any LLM you choose.

Top comments (5)

Thomas Landgraf • Apr 5

Nice work shipping this — the "any LLM" angle is a real gap in the current SDD tools. Full disclosure: I'm the creator of SPECLAN, another VS Code extension in this space. We came at it from a slightly different angle — instead of wrapping the Spec Kit workflow visually, we built around a hierarchical spec tree (Goal then Feature then Requirement then Acceptance Criterion) with status gates that control who can touch what, plus an MCP server so AI agents read approved specs directly rather than copy-pasted prompts.

The approval workflow you added is exactly the missing piece from vanilla Spec Kit — the moment you add human checkpoints, the "AI runs wild" failure mode goes away. Curious: for your Jira integration, do you push specs out to Jira, or pull tickets in? We've been debating whether to add something similar or keep SPECLAN as the source of truth and let Jira mirror from it.

Fabián Silva • Apr 8

Thanks Thomas! SPECLAN's approach with the hierarchical spec tree and MCP server sounds really interesting — having AI agents read approved specs directly is a smart architecture choice.

To answer your question: Caramelo currently pulls from Jira (read-only). You select an issue from your board, and its title, description, acceptance criteria, and comments become the input for spec generation. The spec card then shows a Jira badge that links back to the original issue.

We deliberately started read-only to keep the integration simple and non-destructive. Pushing specs back to Jira (syncing status, creating sub-tasks from the generated tasks.md) is something I'd like to explore — but it opens the "source of truth" question you're describing. For now, Caramelo treats Jira as the origin and specs/ as the working space.

I'd be curious to see how SPECLAN's approval gates compare. The "who can touch what" access control is something we don't have yet, and it could be valuable for larger teams.

Thomas Landgraf • Apr 8

The read-only pull from Jira is smart — starting non-destructive avoids the sync nightmare where you're debugging which direction overwrote what at 2am. I'd probably do the same.

On the approval gates: it's simpler than "access control" in the RBAC sense. Each spec file has a status field in YAML frontmatter (draft → review → approved → locked). The rule is just that once a spec reaches "approved," the editor makes it read-only — you can't silently change it. If something needs to change, you create a Change Request (separate file, references the original by ID, has its own review cycle). It's friction by design, not permissions.

In practice the status ends up encoding ownership implicitly — "draft" means the author is still working on it, "review" means it's someone else's turn to look, "approved" means the PO signed off and now only a formal CR can touch it. No user roles, just the file's lifecycle state driving who acts next.

For the Jira direction: the "source of truth" question is the hard part. If you ever push back, I'd suggest keeping it one-directional — Jira creates the intent (ticket), Caramelo refines it (spec), but the spec never writes back to Jira. Two-way sync between structured specs and Jira tickets is a graveyard of good intentions.