<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Strand</title>
    <description>The latest articles on DEV Community by Strand (@strandnerd).</description>
    <link>https://dev.to/strandnerd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3593736%2Fdcb15278-65aa-438e-aa4b-687249e59150.png</url>
      <title>DEV Community: Strand</title>
      <link>https://dev.to/strandnerd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/strandnerd"/>
    <language>en</language>
    <item>
      <title>Building a Firecracker VM Orchestrator in Go - Part 2: API Server</title>
      <dc:creator>Strand</dc:creator>
      <pubDate>Tue, 31 Mar 2026 02:11:55 +0000</pubDate>
      <link>https://dev.to/strandnerd/building-a-firecracker-vm-orchestrator-in-go-part-2-api-server-28ip</link>
      <guid>https://dev.to/strandnerd/building-a-firecracker-vm-orchestrator-in-go-part-2-api-server-28ip</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;If you missed the first post in this series, &lt;a href="https://dev.to/strandnerd/building-a-firecracker-vm-orchestrator-in-go-part-1-provider-interfaces-11io"&gt;start here&lt;/a&gt;. It covers the foundation: five provider interfaces that decouple the Flames control plane from any specific infrastructure backend.&lt;/p&gt;

&lt;p&gt;This time we're building the first thing that actually &lt;em&gt;does something&lt;/em&gt;: the &lt;strong&gt;API server&lt;/strong&gt;. This is the first real consumer of those provider interfaces. A running binary. &lt;code&gt;go run ./cmd/flames-api&lt;/code&gt;, curl against it, create a VM. That kind of thing.&lt;/p&gt;

&lt;p&gt;But what made this spec interesting wasn't the HTTP handlers; it was the design conversation I had with Claude Code along the way. I pushed back on several decisions, changed the architecture in meaningful ways, and the final result is quite different from what was initially proposed. That's the part worth documenting.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Steering
&lt;/h2&gt;

&lt;p&gt;Here's the thing about working with AI agents on architecture: the first draft is usually reasonable but generic. The value I bring is knowing where generic breaks down. Let me walk through the key moments where I steered the design.&lt;/p&gt;

&lt;h3&gt;
  
  
  "Make transport an interface"
&lt;/h3&gt;

&lt;p&gt;The initial spec described an HTTP API server. Handlers, routes, JSON. Standard stuff. But I've been down this road. You build everything into HTTP handlers, then six months later someone asks for gRPC, and you're refactoring the entire service layer.&lt;/p&gt;

&lt;p&gt;So I pushed back: &lt;strong&gt;the transport needs to be a swappable layer&lt;/strong&gt;. The business logic should live in a &lt;code&gt;Service&lt;/code&gt; struct that speaks only domain types. HTTP is just the first adapter. gRPC, WebSocket, whatever: they're all just different ways to call the same methods.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxpnt4rdube2w34gcn7kk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxpnt4rdube2w34gcn7kk.png" alt="Transport Layers" width="800" height="667"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This changed the entire structure. Instead of one package with handlers that embed business logic, we got three clean layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;api/&lt;/code&gt;&lt;/strong&gt;: Pure business logic. No &lt;code&gt;net/http&lt;/code&gt;, no JSON tags, no transport concepts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;transport/httpapi/&lt;/code&gt;&lt;/strong&gt;: Thin HTTP adapter. Decodes requests, calls service, encodes responses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;cmd/flames-api/&lt;/code&gt;&lt;/strong&gt;: Wires it all together.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  "Why use the colon style?"
&lt;/h3&gt;

&lt;p&gt;The initial design proposed &lt;code&gt;POST /v1/vms/{vm_id}:stop&lt;/code&gt;, following Google's API design convention of using colons for custom actions. I asked a simple question: if it's unusual and requires careful &lt;code&gt;ServeMux&lt;/code&gt; registration, why not just use &lt;code&gt;/stop&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;There was no good reason. We changed it to &lt;code&gt;POST /v1/vms/{vm_id}/stop&lt;/code&gt;. Sometimes the right architectural decision is just removing complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  "Why not add limit to ListControllers now?"
&lt;/h3&gt;

&lt;p&gt;The design had a risk section noting "No pagination on ListControllers, acceptable for now, will need it later." I pointed out that &lt;code&gt;ListEvents&lt;/code&gt; already had filtering via &lt;code&gt;EventFilter&lt;/code&gt;, so why defer the same pattern for controllers?&lt;/p&gt;

&lt;p&gt;That led to adding &lt;code&gt;ControllerFilter&lt;/code&gt; (with &lt;code&gt;Status&lt;/code&gt; and &lt;code&gt;Limit&lt;/code&gt; fields), which meant extending the &lt;code&gt;StateStore&lt;/code&gt; interface from SPEC-001 with a new &lt;code&gt;ListControllers&lt;/code&gt; method. A small change now that avoids a breaking change later.&lt;/p&gt;

&lt;h3&gt;
  
  
  "In-memory for tests, SQLite for dev, Postgres for prod"
&lt;/h3&gt;

&lt;p&gt;The idempotency store risk mentioned "when persistence is added." I corrected the framing. The project isn't going from "in-memory" to "persistent." It's a three-tier model: in-memory for tests, SQLite for local dev, Postgres (or other databases) for prod. Every stateful component should follow this progression.&lt;/p&gt;

&lt;p&gt;This is the kind of context that lives in my head but needs to be explicit in the spec so the AI (and future contributors) make the right calls.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Spec
&lt;/h2&gt;

&lt;p&gt;Like SPEC-001, I shared the main ideas and core points (what the API should do, which endpoints, what the transport story should look like) and Claude Opus 4.6 turned that into three structured spec notes. I steer, it writes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/flames-hq/flames/blob/main/.contextpin/notes/Specs/API%20Server/Requirements.md" rel="noopener noreferrer"&gt;Requirements&lt;/a&gt;&lt;/strong&gt;: 8 user stories, 23 functional requirements split across service layer, HTTP transport, and infrastructure. Plus non-functional requirements and 13 acceptance criteria.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/flames-hq/flames/blob/main/.contextpin/notes/Specs/API%20Server/Design.md" rel="noopener noreferrer"&gt;Design&lt;/a&gt;&lt;/strong&gt;: Package layout, dependency graph, service method signatures, HTTP route mapping, error mapping strategy, idempotency design.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/flames-hq/flames/blob/main/.contextpin/notes/Specs/API%20Server/Tasks.md" rel="noopener noreferrer"&gt;Tasks&lt;/a&gt;&lt;/strong&gt;: 10 tasks across 4 phases with dependency tracking.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A few highlights from the spec:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;User Stories&lt;/strong&gt;: The transport-independence story was the one I added after steering the design:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;US-1: As an operator, I want to create a VM via an API...&lt;/li&gt;
&lt;li&gt;US-7: As a developer, I want to start a fully functional API server with &lt;code&gt;go run ./cmd/flames-api&lt;/code&gt;...&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;US-8: As a platform team, I want to swap the transport layer without rewriting business logic...&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Functional Requirements&lt;/strong&gt;: Split into three sections reflecting the layered architecture:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Section&lt;/th&gt;
&lt;th&gt;Requirements&lt;/th&gt;
&lt;th&gt;Scope&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Service Layer&lt;/td&gt;
&lt;td&gt;FR-001 to FR-012&lt;/td&gt;
&lt;td&gt;Business logic, event emission, limit enforcement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HTTP Transport&lt;/td&gt;
&lt;td&gt;FR-013 to FR-020&lt;/td&gt;
&lt;td&gt;Routes, error mapping, idempotency, healthz&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure&lt;/td&gt;
&lt;td&gt;FR-021 to FR-023&lt;/td&gt;
&lt;td&gt;Entry point, flags, testability&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Non-Functional&lt;/strong&gt;: The stdlib-only constraint got nuanced: the service layer and HTTP transport must be stdlib-only, but future transports (gRPC) may bring their own dependencies scoped to their package.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0js92xfthnxslt1lzjg9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0js92xfthnxslt1lzjg9.png" alt="Architecture" width="800" height="487"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The service is a concrete struct, not an interface. This was a deliberate call in the design. There's only one implementation of the business logic; you don't need polymorphism for "the thing that does the work." Transports call its methods directly. If we ever need a service interface (middleware chaining, for example), we extract it then. Not before.&lt;/p&gt;

&lt;h2&gt;
  
  
  The API
&lt;/h2&gt;

&lt;p&gt;All mutations return &lt;strong&gt;202 Accepted&lt;/strong&gt; because the desired state is recorded, but convergence happens asynchronously. This is fundamental to how Flames works: you tell the system what you want, and the reconciler makes it happen.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fayt4pk7hxnahjszbwepd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fayt4pk7hxnahjszbwepd.png" alt="API Sequence Diagram" width="800" height="558"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Route Map
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Path&lt;/th&gt;
&lt;th&gt;Service Method&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/vms&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CreateVM&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;202&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/vms/{vm_id}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GetVM&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/vms/{vm_id}/stop&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;StopVM&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;202&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;DELETE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/vms/{vm_id}&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;DeleteVM&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;202&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/controllers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RegisterController&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;201&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/controllers/{id}/heartbeat&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Heartbeat&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/controllers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ListControllers&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/v1/events&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ListEvents&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GET&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/healthz&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;-&lt;/td&gt;
&lt;td&gt;200&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Idempotency
&lt;/h3&gt;

&lt;p&gt;Mutation endpoints support an &lt;code&gt;Idempotency-Key&lt;/code&gt; header. Same key + same body replays the original response. Same key + different body returns 409. The implementation uses SHA-256 body hashing with a &lt;code&gt;ResponseWriter&lt;/code&gt; wrapper to capture responses before they're sent. It's a transport concern, lives entirely in &lt;code&gt;transport/httpapi/&lt;/code&gt;, not in the service layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error Responses
&lt;/h3&gt;

&lt;p&gt;Errors from the provider layer map cleanly to HTTP: &lt;code&gt;ErrNotFound&lt;/code&gt; becomes 404, &lt;code&gt;ErrAlreadyExists&lt;/code&gt; and &lt;code&gt;ErrConflict&lt;/code&gt; become 409, anything else is 500.&lt;/p&gt;

&lt;p&gt;Every error response is structured JSON with &lt;code&gt;code&lt;/code&gt;, &lt;code&gt;message&lt;/code&gt;, &lt;code&gt;resource_type&lt;/code&gt;, and &lt;code&gt;resource_id&lt;/code&gt;. No string matching on the client side. This is the same &lt;code&gt;providererr&lt;/code&gt; pattern from SPEC-001 surfaced through HTTP. The structured errors we designed in the foundation layer pay off immediately.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Implementation
&lt;/h2&gt;

&lt;p&gt;10 tasks, 4 phases, all executed in a single session. &lt;strong&gt;7 files created&lt;/strong&gt;, 4 existing files modified. Zero external dependencies. Every test passes with &lt;code&gt;-race&lt;/code&gt;. The binary starts, you curl it, VMs get created.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reflections on the Workflow
&lt;/h2&gt;

&lt;p&gt;This spec was more interesting than SPEC-001 because the steering mattered more. Provider interfaces are fairly standard Go, the AI can nail those with minimal guidance. But an API server involves architectural judgment calls: where does business logic live? What's a transport concern vs. a service concern? Do you build for flexibility now or later?&lt;/p&gt;

&lt;p&gt;The answer was different for different things. Transport abstraction? Build it now, the cost is near zero and the payoff is real. Colon-style routes? Don't bother. Pagination on controllers? Add it now, same pattern already exists.&lt;/p&gt;

&lt;p&gt;These are 30-second decisions in a conversation, but they compound into a fundamentally different codebase. The AI proposes, I steer, the spec captures the decision, and the implementation follows. That's the loop.&lt;/p&gt;

&lt;p&gt;What I like about the Spec-Driven workflow on ContextPin is that these decisions are &lt;em&gt;recorded&lt;/em&gt;. The Requirements note shows &lt;em&gt;what&lt;/em&gt; we decided. The Design note shows &lt;em&gt;why&lt;/em&gt;. If someone reads the spec six months from now, they'll understand not just the code but the reasoning behind it. That's the difference between documentation and a spec.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The API server is running but nobody's talking to it yet. The next specs will bring the system to life: controllers that actually boot Firecracker VMs, a scheduler that assigns VMs to controllers, and persistence so state survives a restart. The provider interfaces and service layer are ready for all of it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post is part of the Flames Spec-Driven Development series.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;This entire project is being built using &lt;strong&gt;ContextPin + Claude Code&lt;/strong&gt;. ContextPin is an ADE (Agentic Development Environment), a GPU-accelerated desktop app designed for AI-assisted development that integrates directly with Claude Code, Codex, Gemini and OpenCode. ContextPin is going open-source soon. If you want to get early access, &lt;a href="https://contextpin.com" rel="noopener noreferrer"&gt;you can join here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>go</category>
      <category>opensource</category>
      <category>ai</category>
      <category>virtualmachine</category>
    </item>
    <item>
      <title>Building a Firecracker VM Orchestrator in Go - Part 1: Provider Interfaces</title>
      <dc:creator>Strand</dc:creator>
      <pubDate>Mon, 30 Mar 2026 00:42:38 +0000</pubDate>
      <link>https://dev.to/strandnerd/building-a-firecracker-vm-orchestrator-in-go-part-1-provider-interfaces-11io</link>
      <guid>https://dev.to/strandnerd/building-a-firecracker-vm-orchestrator-in-go-part-1-provider-interfaces-11io</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I'm building &lt;strong&gt;Flames&lt;/strong&gt;, an open-source control plane for managing microVMs powered by Firecracker and Jailer. The goal is a straightforward API to spin up, orchestrate, and tear down lightweight VMs — the kind of ephemeral, hardware-isolated environments that are becoming critical infrastructure. Especially in the AI ecosystem, where you're running untrusted code, agent workflows, or sandboxed execution, container-level isolation isn't enough. You need real VM boundaries with Jailer-enforced security, and you need it to be fast and programmable. That's what Flames is for.&lt;/p&gt;

&lt;p&gt;I've been coding with AI agents for a while now, but what's different this time is that I'm using &lt;strong&gt;&lt;a href="https://contextpin.com" rel="noopener noreferrer"&gt;ContextPin&lt;/a&gt;&lt;/strong&gt; as my main AI coding workspace — organizing specs, context, and decisions in one place so the AI always has what it needs. Spec-Driven Development, essentially.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb2cdllhaj1yzg3cc01z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frb2cdllhaj1yzg3cc01z.png" alt="Spec-Driven Development Workflow" width="800" height="266"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm documenting the whole journey here.&lt;/p&gt;

&lt;p&gt;The idea is simple: I write the specs, I bring my Go experience to the table, and I let AI handle the bulk of the implementation. This frees me up to spend more time on architecture, code review, and making sure the design decisions are sound — thinking about interfaces, data flow, concurrency trade-offs — while moving through the implementation phase much faster than I could solo.&lt;/p&gt;

&lt;p&gt;This first spec tackles the most foundational piece: the &lt;strong&gt;provider interfaces&lt;/strong&gt;. Five abstractions that decouple the entire control plane from any specific infrastructure backend.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Flames needs to store VM state, cache data, queue background jobs, store blobs, and expose VMs via ingress. But I've been burned before by coupling to a specific database or queue system too early. A developer running &lt;code&gt;go run&lt;/code&gt; locally shouldn't need a database cluster. A production deployment should plug in real backends without changing application code.&lt;/p&gt;

&lt;p&gt;If you've written Go for any amount of time, you know the answer: narrow interfaces with in-memory defaults. The interesting part is getting the contracts right on the first pass — and that's where the spec-driven approach really pays off.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fukkozdqt2d4f4zh590ee.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fukkozdqt2d4f4zh590ee.png" alt=" " width="800" height="204"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The green nodes are the five provider interfaces — the abstraction boundary. The dark nodes are the domain models each interface operates on. The control-plane components at the top consume only interfaces, never concrete implementations.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Spec
&lt;/h2&gt;

&lt;p&gt;Before writing a single line of Go, I wrote three structured notes in ContextPin:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/flames-hq/flames/blob/main/.contextpin/notes/Specs/Provider%20Interfaces/Requirements.md" rel="noopener noreferrer"&gt;Requirements&lt;/a&gt;&lt;/strong&gt; — 7 user stories, 30+ functional requirements, acceptance criteria, and non-functional constraints. This is the contract.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/flames-hq/flames/blob/main/.contextpin/notes/Specs/Provider%20Interfaces/Design.md" rel="noopener noreferrer"&gt;Design&lt;/a&gt;&lt;/strong&gt; — Package layout, dependency graph, data models, interface signatures, implementation notes, and risk analysis. This is the blueprint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://github.com/flames-hq/flames/blob/main/.contextpin/notes/Specs/Provider%20Interfaces/Tasks.md" rel="noopener noreferrer"&gt;Tasks&lt;/a&gt;&lt;/strong&gt; — The ordered implementation checklist derived from the design.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Having these written down before implementation meant I could hand them to Claude Code and say "build this" — and then spend my time reviewing the output against my own spec rather than dictating every line. The spec becomes the shared language between me and the AI.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Five Interfaces
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;StateStore&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;provider/state&lt;/code&gt;) — VM, controller, and event records. Default: &lt;code&gt;memstate&lt;/code&gt; (maps + mutex)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;BlobStore&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;provider/blob&lt;/code&gt;) — Opaque artifact storage. Default: &lt;code&gt;memblob&lt;/code&gt; (byte slices)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;CacheStore&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;provider/cache&lt;/code&gt;) — Ephemeral key-value caching. Default: &lt;code&gt;memcache&lt;/code&gt; (map + TTL)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;WorkQueue&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;provider/queue&lt;/code&gt;) — Background job processing. Default: &lt;code&gt;memqueue&lt;/code&gt; (slices + leases)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;IngressProvider&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;provider/ingress&lt;/code&gt;) — VM service exposure. Default: &lt;code&gt;noop&lt;/code&gt; (no network ops)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each interface lives in its own package, imports only from &lt;code&gt;model/&lt;/code&gt; and &lt;code&gt;provider/providererr/&lt;/code&gt;, and has zero external dependencies in its default implementation. This is the kind of structure I'd set up in any Go project — clean import graphs, no circular dependencies, everything testable in isolation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Design Decisions
&lt;/h2&gt;

&lt;p&gt;These are the calls I made during the spec phase — the kind of decisions that are hard to delegate to AI because they require judgment about where the project is heading.&lt;/p&gt;

&lt;h3&gt;
  
  
  Interface-per-package over monolithic provider
&lt;/h3&gt;

&lt;p&gt;One interface per package rather than a single &lt;code&gt;Provider&lt;/code&gt; mega-interface. I've seen the mega-interface approach in other projects and it always ends the same way: a component that only needs blob storage ends up importing the entire provider dependency tree. Keeping them separate also maps cleanly to future adapter packages — &lt;code&gt;provider/state/postgres&lt;/code&gt; imports &lt;code&gt;provider/state&lt;/code&gt; and nothing else.&lt;/p&gt;

&lt;h3&gt;
  
  
  Non-blocking &lt;code&gt;Dequeue&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;WorkQueue.Dequeue&lt;/code&gt; returns &lt;code&gt;ErrNoJobs&lt;/code&gt; instead of blocking. I went back and forth on this one, but non-blocking is simpler to implement correctly across all backends. The callers (reconciler, scheduler) run on tick-based loops anyway. If we ever need a push model, that's a separate optional interface — not a change to the core contract.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conformance test suites
&lt;/h3&gt;

&lt;p&gt;This is the one I'm most excited about. Every interface has a shared test suite in &lt;code&gt;providertest/&lt;/code&gt; that any adapter can import and run. The in-memory default passes it today. A future Postgres adapter runs the exact same tests. This is what turns interfaces from "type signatures that might work" into actual enforceable contracts. During review, I spent most of my time here — making sure the test cases cover the edge cases that matter.&lt;/p&gt;

&lt;h3&gt;
  
  
  Structured errors with &lt;code&gt;errors.Is&lt;/code&gt; support
&lt;/h3&gt;

&lt;p&gt;Provider errors carry metadata (resource type, ID) and support &lt;code&gt;errors.Is&lt;/code&gt; matching against sentinels like &lt;code&gt;ErrNotFound&lt;/code&gt;, &lt;code&gt;ErrConflict&lt;/code&gt;, &lt;code&gt;ErrCacheMiss&lt;/code&gt;. No string matching, ever. This is a pattern I've used in every serious Go project — the upfront cost is minimal and it saves hours of debugging later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reflections on the Workflow
&lt;/h2&gt;

&lt;p&gt;I didn't write these specs by hand, line by line. I brought my Go and infra expertise — I'd already done a Firecracker + Jailer proof of concept internally — and explained exactly how I wanted the architecture to work. The AI helped me turn that into structured requirements, design docs, and task breakdowns. It's a conversation, not dictation.&lt;/p&gt;

&lt;p&gt;But once the specs were done and living in ContextPin, everything moved fast. The AI produced code that matched my design because the design was explicit, not in my head. My review cycles were focused: does this match the spec? Does it handle the edge cases? Are the error types right?&lt;/p&gt;

&lt;p&gt;I spent almost no time on implementation details and almost all my time on the things that matter for long-term quality: interface design, concurrency safety, test coverage.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;With the provider interfaces in place, the next spec will build on top of them — likely the API server or scheduler, which consume these interfaces via constructor injection. The in-memory defaults mean I can develop and test the next layer without standing up any infrastructure. That's the whole point.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This post is part of the Flames Spec-Driven Development series.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;This entire project is being built using &lt;strong&gt;ContextPin + Claude Code&lt;/strong&gt;. ContextPin is an ADE (Agentic Development Environment) — a GPU-accelerated desktop app designed for AI-assisted development that integrates directly with Claude Code, Codex, Gemini and OpenCode. ContextPin is going open-source soon. If you want to get early access, &lt;a href="https://contextpin.com" rel="noopener noreferrer"&gt;you can join here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>go</category>
      <category>opensource</category>
      <category>showdev</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>What is Spec-Driven Development?</title>
      <dc:creator>Strand</dc:creator>
      <pubDate>Wed, 25 Mar 2026 17:38:15 +0000</pubDate>
      <link>https://dev.to/strandnerd/what-is-spec-driven-development-4oo4</link>
      <guid>https://dev.to/strandnerd/what-is-spec-driven-development-4oo4</guid>
      <description>&lt;p&gt;If you ask a coding agent to build something meaningful, there is a good chance it will do a surprisingly good job. And at this point, I do not even mean only small tasks. These tools can already build large chunks of a product, entire features, and sometimes even a full website or app with very solid results.&lt;/p&gt;

&lt;p&gt;That part is real.&lt;/p&gt;

&lt;p&gt;I use coding agents a lot, and I think this is exactly why so many people get excited right away. You give the agent a direction, it starts moving fast, and the output can be genuinely impressive.&lt;/p&gt;

&lt;p&gt;But the real challenge is not whether AI can handle scope. It clearly can.&lt;/p&gt;

&lt;p&gt;The challenge is how that scope is defined, structured, and kept aligned as the project grows.&lt;/p&gt;

&lt;p&gt;A larger project is not just "more code." It is more decisions, more constraints, more moving parts, more tradeoffs, and more chances for small misunderstandings to compound into bigger problems.&lt;/p&gt;

&lt;p&gt;When I build products, I am not thinking only about the next function or the next component. I am thinking about the shape of the system, the tradeoffs, the constraints, the future changes, the developer experience, the boundaries between parts, and what I want the final product to become.&lt;/p&gt;

&lt;p&gt;To me, that is the difference.&lt;/p&gt;

&lt;p&gt;As a software engineer, I do not want to keep nudging the wheel left and right every five seconds while the car is already moving. I want to design the road.&lt;/p&gt;

&lt;p&gt;That is why spec-driven development feels so important right now.&lt;/p&gt;

&lt;p&gt;It gives me a way to turn intent into something explicit before implementation becomes the source of truth by accident. Instead of jumping straight into code and hoping the agent infers the rest, I can define what should exist, how it should behave, what constraints matter, what is out of scope, and what the architecture should protect.&lt;/p&gt;

&lt;p&gt;That changes everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Spec-driven development is not new
&lt;/h2&gt;

&lt;p&gt;One thing I think is important to say is that spec-driven development is not some brand new AI-era invention.&lt;/p&gt;

&lt;p&gt;The idea of making behavior, contracts, and intent explicit before or alongside implementation has been around for decades. What is new is that AI makes the payoff much more obvious. A spec is no longer just documentation for humans. It is also high-value input for agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  A quick timeline
&lt;/h3&gt;

&lt;p&gt;The idea has shown up in different forms for decades.&lt;/p&gt;

&lt;p&gt;In 1969, Hoare logic introduced formal reasoning about programs: the idea that software could be described in terms of clear expected behavior, not just written and tested afterward.&lt;/p&gt;

&lt;p&gt;Later, formal specification languages such as Z and VDM, and Eiffel's Design by Contract, pushed this idea further by making system behavior and constraints explicit before or alongside implementation.&lt;/p&gt;

&lt;p&gt;Then practices like Test-Driven Development and Behavior-Driven Development made specs more practical for everyday teams by turning expected behavior into tests and human-readable scenarios.&lt;/p&gt;

&lt;p&gt;After that, API-first development with Swagger and OpenAPI brought spec-first thinking into mainstream backend development by making contracts machine-readable and implementation-friendly.&lt;/p&gt;
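&lt;p&gt;As one concrete example, a minimal (and entirely hypothetical) OpenAPI fragment states the contract for an endpoint before any handler exists, in a form both humans and tooling can consume:&lt;/p&gt;

```yaml
# Hypothetical contract-first sketch: the endpoint, its method, and its
# success response are agreed on before a line of handler code is written.
openapi: "3.0.3"
info:
  title: Example API
  version: "1.0.0"
paths:
  /vms:
    post:
      summary: Create a virtual machine
      responses:
        "201":
          description: VM created
```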

&lt;p&gt;Now with AI coding agents, the same idea matters even more. A spec is no longer just helpful for humans. It also gives agents a much clearer structure to follow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters so much now with AI
&lt;/h2&gt;

&lt;p&gt;This is the part that feels different in practice.&lt;/p&gt;

&lt;p&gt;Before AI, a spec was already valuable. It aligned people, clarified behavior, and reduced mistakes.&lt;/p&gt;

&lt;p&gt;But now a spec also acts like a control surface for an agent.&lt;/p&gt;

&lt;p&gt;That means I can do something like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define the problem clearly.&lt;/li&gt;
&lt;li&gt;Design the system shape.&lt;/li&gt;
&lt;li&gt;Break the work into tasks.&lt;/li&gt;
&lt;li&gt;Iterate with the agent.&lt;/li&gt;
&lt;li&gt;Review the result against the spec.&lt;/li&gt;
&lt;li&gt;Refine the spec or the implementation and repeat.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That loop is incredibly powerful.&lt;/p&gt;

&lt;p&gt;Without a spec, I am often asking the model to guess what I mean from a half-formed prompt and a pile of code. Sometimes it still does well. But once the task gets bigger, or touches architecture, data flow, boundaries, naming, migration strategy, edge cases, or product intent, I have seen the quality drop fast.&lt;/p&gt;

&lt;p&gt;The agent starts filling gaps with assumptions.&lt;/p&gt;

&lt;p&gt;And that is the real problem.&lt;/p&gt;

&lt;p&gt;AI is great at execution inside a constrained frame. But if I want the output to match my product vision, I need to provide the frame.&lt;/p&gt;

&lt;p&gt;That is how I think about spec-driven development today. It is not bureaucracy. It is not writing giant documents for the sake of it. It is creating the minimum structured thinking needed so humans and agents can build in the same direction.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I mean by "spec"
&lt;/h2&gt;

&lt;p&gt;When I say spec, I do not just mean one giant requirements doc.&lt;/p&gt;

&lt;p&gt;A spec can be:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a feature brief&lt;/li&gt;
&lt;li&gt;architecture notes&lt;/li&gt;
&lt;li&gt;contracts and interfaces&lt;/li&gt;
&lt;li&gt;acceptance criteria&lt;/li&gt;
&lt;li&gt;non-goals&lt;/li&gt;
&lt;li&gt;task breakdowns&lt;/li&gt;
&lt;li&gt;examples of expected behavior&lt;/li&gt;
&lt;li&gt;migration constraints&lt;/li&gt;
&lt;li&gt;edge cases&lt;/li&gt;
&lt;li&gt;naming decisions&lt;/li&gt;
&lt;li&gt;API schemas&lt;/li&gt;
&lt;li&gt;invariants&lt;/li&gt;
&lt;li&gt;notes about why a tradeoff was made&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, a spec is the explicit shape of intent.&lt;/p&gt;

&lt;p&gt;That shape can be lightweight or deep depending on the problem.&lt;/p&gt;

&lt;p&gt;For a tiny one-off task, I may not need much.&lt;/p&gt;

&lt;p&gt;For a real product, I absolutely do.&lt;/p&gt;

&lt;h2&gt;
  
  
  My experience with this
&lt;/h2&gt;

&lt;p&gt;The more I use coding agents, the more I feel this personally.&lt;/p&gt;

&lt;p&gt;When I go straight from idea to code, things can move quickly, but I also get more drift. Naming drifts. Architecture drifts. The agent solves the local problem but misses the bigger system constraints. I end up spending more time correcting direction later.&lt;/p&gt;

&lt;p&gt;When I spend time on planning first, the entire interaction changes.&lt;/p&gt;

&lt;p&gt;The agent becomes more useful because I am no longer asking it to invent the project structure, the product intent, and the engineering standards from thin air. I am giving it a map.&lt;/p&gt;

&lt;p&gt;And honestly, I think this is where a lot of the real value is.&lt;/p&gt;

&lt;p&gt;Not just "AI writes code."&lt;/p&gt;

&lt;p&gt;More like: "I define the system clearly enough that AI can execute without constantly losing the thread."&lt;/p&gt;

&lt;p&gt;That is a much better workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  This is also why I'm building ContextPin
&lt;/h2&gt;

&lt;p&gt;A big reason I care so much about this is because I think the tooling still feels incomplete.&lt;/p&gt;

&lt;p&gt;A lot of AI coding tools are optimized for fast prompting and code generation, but not enough for long-lived context, specs, notes, design decisions, and local project knowledge that should stay close to the repo.&lt;/p&gt;

&lt;p&gt;I want something better for that workflow.&lt;/p&gt;

&lt;p&gt;So I'm building an ADE, an Agentic Development Environment, around this idea.&lt;/p&gt;

&lt;p&gt;It is GPU-accelerated and native. No Electron, no Tauri, no browser wrapper.&lt;/p&gt;

&lt;p&gt;The focus is spec-driven development.&lt;/p&gt;

&lt;p&gt;The idea is that I can keep notes, context, and structured project thinking inside the repo, versioned with Git, local-first, and ready for both humans and agents. Not hidden in random chat history. Not scattered across docs that never stay in sync. Not locked away from the actual development loop.&lt;/p&gt;

&lt;p&gt;I want the spec, the design notes, the task breakdown, and the implementation context to live closer together.&lt;/p&gt;

&lt;p&gt;That is the workflow I want for myself, so that is what I'm building.&lt;/p&gt;

&lt;p&gt;Early access here: &lt;a href="https://contextpin.com" rel="noopener noreferrer"&gt;contextpin.com&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Final thought
&lt;/h2&gt;

&lt;p&gt;AI can help a lot with implementation, but I still believe the most important work is deciding what should be built, how it should be shaped, what constraints matter, and how the pieces should evolve over time.&lt;/p&gt;

&lt;p&gt;That is the work of designing the road.&lt;/p&gt;

&lt;p&gt;And in a world full of increasingly capable coding agents, I think specs are becoming one of the best ways to make that intent concrete.&lt;/p&gt;

&lt;p&gt;Not because the idea is new.&lt;/p&gt;

&lt;p&gt;But because now the payoff is impossible to ignore.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
      <category>agentskills</category>
    </item>
  </channel>
</rss>
