A Claude Design Alternative
Open-Design: Run a Local AI Design Studio for Free
How to wire up a self-hosted design generation stack that rivals Figma's AI features — without sending your work to a third-party cloud.
There is a quiet corner of the open-source world where two tools have been evolving in parallel, each solving a different half of the same problem. Open-design wants to be the AI-native design canvas. Lynkr wants to be the intelligent router that connects any AI model to any client. Together, they form a surprisingly capable stack: a browser-based design studio powered entirely by models you run yourself.
This article explains what each tool does and why they pair well, then walks you through the exact steps to get both running — from a blank machine to a working design session in under twenty minutes.
Part I — What Is Open-Design?
Open-design (github: nexu-io/open-design) is a web application that lets you describe what you want to build in plain English and receive a live, editable HTML design in return. Think of it as a Figma alternative where the "draw" action is replaced by a conversation.
The core idea
You open a project, type "Create a SaaS pricing page with three tiers, a purple gradient header, and a FAQ section below the fold," and the assistant responds with a fully rendered HTML page sitting in a split pane next to the chat. You can iterate on it — "make the CTA button larger and change the font to Inter" — and the design updates in place.
What makes this different from just asking ChatGPT for HTML and copying the output into a file is the structure open-design wraps around the conversation:
- Project workspaces — each project has its own conversation history, file panel, and design system binding. Your palette and typography rules travel with the project.
- Design files panel — generated HTML is not dropped in the chat as a code block. It is parsed, saved as a project file, and opened as a rendered tab in a file viewer with an iframe preview.
- Skills — prebuilt workflow templates (landing page, deck, dashboard, prototype) that inject opinionated system prompts so the model follows proven patterns rather than improvising.
- Design systems — you can attach a DESIGN.md that defines your color tokens, spacing scale, and component rules. The model reads this as authoritative and binds every artifact to it.
- Live artifacts — a second artifact type for data-driven outputs (dashboards, reports) that can be refreshed on a schedule by re-running the generation with updated data.
Two execution modes
Open-design ships with two ways to run the AI backend.
Daemon mode spins up a local CLI agent (Codex, Claude Code, Gemini CLI) on your machine. The daemon manages the agent process, streams its output, and interprets the tool calls it makes — reading files, writing artifacts, running shell commands. This is the "full agentic" mode where the AI can explore your file system, install dependencies, and produce multi-file outputs.
API mode (also called BYOK — Bring Your Own Key) skips the local agent entirely. You point open-design at any OpenAI-compatible or Anthropic-compatible endpoint, hand it an API key, and it sends requests directly. The AI cannot use tools in this mode; it can only output <artifact> blocks, which open-design parses and routes to the Design Files panel. This is the simpler, faster mode — and it is exactly the mode Lynkr plugs into.
The artifact contract
Both modes share one key convention: the <artifact> tag.
When the AI wants to produce a design, it wraps its HTML inside:
<artifact identifier="landing-page.html" type="text/html" title="Landing Page">
<!DOCTYPE html>
<html lang="en">
...
</html>
</artifact>
Open-design's streaming parser watches the incoming text for this tag. The moment it sees <artifact, it opens a new file entry in the Design panel, streams the inner HTML into an iframe as the model generates it, and — when the closing </artifact> arrives — saves the file to the project folder and opens it as a tab. The user sees a live preview build up line by line, like watching someone draw.
The chat view simultaneously strips the artifact out (showing only any prose the model wrote around it) so the conversation stays clean.
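The incremental detection described above can be sketched in a few lines. This is an illustrative toy, not open-design's actual parser — the class name, chunk handling, and the simplification that a closing tag never splits across chunks are all assumptions:

```python
import re

class ArtifactStreamParser:
    """Toy incremental detector for <artifact ...>...</artifact> in a
    token stream. Sketch only -- not open-design's real implementation.
    Limitation: assumes the closing tag arrives within a single chunk."""
    OPEN = re.compile(r'<artifact\s+([^>]*)>')
    CLOSE = '</artifact>'

    def __init__(self):
        self.buffer = ''        # text accumulated while hunting for the tag
        self.in_artifact = False
        self.attrs = None       # raw attribute string from the open tag
        self.html_parts = []    # inner HTML chunks, flushed for live preview

    def feed(self, chunk):
        """Feed one streamed chunk; returns the finished HTML or None."""
        self.buffer += chunk
        if not self.in_artifact:
            m = self.OPEN.search(self.buffer)
            if m:
                self.in_artifact = True
                self.attrs = m.group(1)
                self.buffer = self.buffer[m.end():]  # keep text after the tag
        if self.in_artifact:
            end = self.buffer.find(self.CLOSE)
            if end != -1:
                self.html_parts.append(self.buffer[:end])
                self.in_artifact = False
                return ''.join(self.html_parts).strip()
            # still streaming: flush what we have (this is the live preview)
            self.html_parts.append(self.buffer)
            self.buffer = ''
        return None

parser = ArtifactStreamParser()
chunks = ['Here it is: <artifact identifier="page.html" type="text/html" ',
          'title="Page"><!DOCTYPE html><html>',
          '<body>Hello</body></html></artifact> Done.']
result = next(r for r in (parser.feed(c) for c in chunks) if r)
```

Note how the open tag itself can be split across chunks — the parser simply keeps buffering until the `>` arrives, which is why the first chunk yields no preview.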
Part II — What Is Lynkr?
Lynkr is an AI proxy server that presents a unified Anthropic-compatible API surface (/v1/messages) while internally routing requests to whichever provider and model makes sense for the task.
The problem it solves
If you run multiple AI providers — Ollama locally, Anthropic in the cloud, maybe Azure OpenAI for enterprise workloads — every client that talks to one of them is tightly coupled to that provider's API format. Swap the provider, update every client.
Lynkr breaks that coupling. Clients always speak Anthropic's message format. Lynkr translates on the fly to whatever the target provider expects and translates the response back.
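The shape of that translation is easy to see with a minimal sketch. This is a simplified illustration of the general Anthropic-to-OpenAI mapping, not Lynkr's actual code — it handles a top-level system prompt and text content only, leaving tool calls out of scope:

```python
def anthropic_to_openai(req):
    """Translate an Anthropic /v1/messages body into an OpenAI
    chat-completions body. Simplified sketch: the system prompt moves
    into the messages array, and content blocks flatten to plain text."""
    messages = []
    if req.get('system'):
        messages.append({'role': 'system', 'content': req['system']})
    for m in req['messages']:
        content = m['content']
        if isinstance(content, list):  # Anthropic content blocks -> flat text
            content = ''.join(b.get('text', '') for b in content)
        messages.append({'role': m['role'], 'content': content})
    return {
        'model': req['model'],
        'messages': messages,
        'max_tokens': req.get('max_tokens', 1024),
        'stream': req.get('stream', False),
    }
```

The reverse direction (provider response back into Anthropic's content-block and SSE format) is the other half of the same mapping.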
What Lynkr actually does
Beyond simple translation, Lynkr runs an agent loop:
- It receives a message request.
- It forwards the request to the target model (Ollama, Anthropic, Azure, OpenRouter).
- If the model responds with tool calls, Lynkr executes them server-side (web search, file reads, subtask delegation) and feeds the results back to the model.
- The loop repeats until the model produces a final text answer or a terminal condition is hit.
- The final response is returned in Anthropic SSE format.
This means clients that speak Anthropic's streaming protocol — including open-design in API mode — get a fully resolved, agentic response even when the underlying model is a local Ollama instance that has no native tool support.
Routing intelligence
Lynkr analyzes each incoming request for complexity (simple lookup vs. multi-step reasoning vs. heavy code generation) and routes to the appropriate tier:
- Simple / fast → local Ollama model (zero cloud cost, low latency)
- Complex / reasoning → cloud model (Anthropic, OpenAI, Azure)
- Agent tasks → multi-step loop with tool execution
All of this is transparent to the client. Open-design sends one request; Lynkr decides where it goes.
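To make the tiering concrete, here is a toy stand-in for that decision. The real analyzer is more sophisticated — the keywords and length threshold below are invented for illustration:

```python
def route(prompt):
    """Toy complexity router in the spirit of Lynkr's analyzer.
    Keywords and the 400-character threshold are invented here."""
    text = prompt.lower()
    reasoning = any(k in text for k in ('step by step', 'architecture',
                                        'refactor', 'debug', 'plan'))
    if reasoning or len(prompt) > 400:
        return 'cloud'      # complex / reasoning tier
    return 'ollama'         # simple / fast tier: zero cost, low latency
```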
Other capabilities
- Token budget enforcement — Lynkr tracks token usage and compresses conversation history before it would overflow the context window, preserving the most recent and most important turns.
- Headroom sidecar — a small Python service that monitors GPU/CPU memory and tells Lynkr how much headroom is available for local inference, enabling dynamic load shedding when the machine is under pressure.
- Session memory — vector-search-backed conversation memory that lets the model recall context from previous turns in long projects.
- Telemetry and tracing — structured logs, latency metrics, and per-provider cost tracking.
- Dashboard — a web UI at /dashboard that shows live request throughput, provider health, and routing decisions.
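The token budget idea is worth a small sketch. This is a deliberate simplification — Lynkr compresses history rather than dropping it, and the four-characters-per-token estimate below is a rough heuristic, not its actual counter:

```python
def trim_history(messages, budget, count=lambda m: len(m['content']) // 4):
    """Sketch of budget enforcement: always keep the system prompt and the
    most recent turns, dropping the oldest turns until the estimate fits.
    (Toy version: Lynkr summarizes/compresses rather than dropping.)"""
    system = [m for m in messages if m['role'] == 'system']
    rest = [m for m in messages if m['role'] != 'system']
    while rest and sum(count(m) for m in system + rest) > budget:
        rest.pop(0)   # the oldest non-system turn goes first
    return system + rest
```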
Part III — Why They Pair Well
Open-design in API mode needs an endpoint that:
- Speaks Anthropic's /v1/messages format with SSE streaming.
- Understands the design system context injected in the system prompt.
- Produces clean <artifact> HTML blocks without hallucinating tool calls or emitting ANSI escape codes into the CSS.
- Routes to whatever model the user has available — local or cloud.
Lynkr provides all four. You point open-design at http://localhost:8081, select "Anthropic" as the protocol, set any string as the API key, and Lynkr handles the rest.
The integration is also meaningful in the other direction. Lynkr needs clients that exercise its capabilities in realistic ways. Open-design's rich system prompts, multi-turn conversations, and artifact streaming are exactly the kind of traffic that surfaces edge cases — model hallucinations, ANSI corruption in streamed HTML, token overflows — that make a proxy useful to stress-test.
Part IV — Setup Guide
Prerequisites
| Tool | Version | Purpose |
|---|---|---|
| Docker | 24+ | Run open-design |
| Node.js | 20+ | Run Lynkr |
| Ollama | 0.4+ | Local model inference |
| Git | any | Clone repos |
You also need at least one model pulled in Ollama. For design generation, minimax-m2.5:cloud gives strong results (it reasons visually and follows HTML conventions well). For lighter machines, qwen2.5-coder:7b or llama3.1:8b are workable.
ollama pull minimax-m2.5:cloud
# or for lighter machines:
ollama pull qwen2.5-coder:7b
Step 1 — Install and start Lynkr
# Clone the repo
git clone https://github.com/Fast-Editor/Lynkr
cd Lynkr
# Install dependencies
npm install
# Start Lynkr on port 8081
node bin/cli.js start --port 8081
Lynkr starts, discovers your local Ollama instance automatically, and is ready to accept requests at http://localhost:8081/v1/messages.
You can verify it is running:
curl -s http://localhost:8081/health | jq .
# { "status": "ok", "providers": ["ollama", ...] }
To check the dashboard, open http://localhost:8081/dashboard in your browser.
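If you prefer to smoke-test the messages endpoint from code rather than curl, here is a sketch. The payload shape follows Anthropic's public messages API; the network call requires Lynkr to be running, so it is kept behind the main guard:

```python
import json
import urllib.request

def build_messages_request(model, prompt, base='http://localhost:8081'):
    """Build a minimal Anthropic-style /v1/messages request aimed at Lynkr.
    Any non-empty x-api-key works for local providers, per the setup above."""
    body = {'model': model,
            'max_tokens': 1024,
            'messages': [{'role': 'user', 'content': prompt}]}
    return urllib.request.Request(
        base + '/v1/messages',
        data=json.dumps(body).encode('utf-8'),
        headers={'Content-Type': 'application/json',
                 'x-api-key': 'local',
                 'anthropic-version': '2023-06-01'},
        method='POST')

if __name__ == '__main__':
    req = build_messages_request('qwen2.5-coder:7b', 'Say hello in one word.')
    with urllib.request.urlopen(req) as resp:  # needs Lynkr running on :8081
        print(json.load(resp))
```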
Step 2 — Run open-design in Docker
# Pull and run open-design
docker run -d \
--name open-design \
-p 7456:7456 \
--add-host=host.docker.internal:host-gateway \
nexuio/open-design:latest
The --add-host flag is critical. Open-design runs inside a container, and it needs to reach Lynkr, which runs on your host machine. host.docker.internal is the hostname that resolves to your host from inside a Docker container.
Open your browser at http://localhost:7456. You should see the open-design welcome screen.
Step 3 — Configure API mode to use Lynkr
In the open-design UI:
- Click the Settings gear icon (top right).
- Under Execution & model, click the BYOK tab (right side — "API provider").
- Select the Anthropic protocol tab.
- Under Quick fill provider, choose Custom provider.
- Set API Key to any non-empty string (e.g. local). Lynkr does not validate it for local providers.
- Set Model to the Ollama model you pulled — e.g. minimax-m2.5:cloud or qwen2.5-coder:7b.
- Set Base URL to http://host.docker.internal:8081.
- Click Test to confirm the connection, then close Settings.

Screenshot: The Settings panel with BYOK selected, Anthropic protocol active, and Base URL pointed at Lynkr. The model shown is claude-opus-4-5 but you can type any Ollama model name — the field accepts free text.
That is it. Open-design will now route all AI requests through Lynkr, which routes them to your local Ollama instance.
Step 4 — Create your first design
- Click New Project and name it anything — "Landing Page Test".
- When prompted for project kind, choose Prototype.
- Type your first prompt in the chat:
Create a modern SaaS landing page with a dark hero section, gradient headline, three feature cards below, and a "Get Early Access" CTA button. Use a deep blue and violet color scheme.
- Hit enter and watch what happens.
Lynkr receives the request, routes it to Ollama, the model generates an extended thinking block, then produces an <artifact> HTML block. Lynkr streams this back to open-design. Open-design's streaming parser detects the <artifact> tag, opens a Design panel tab, and renders the HTML in real time — line by line, as the model writes it.
When the stream ends, the file is saved automatically to the project and the Design tab snaps into focus with your rendered page.

Screenshot: The full open-design workspace after a generation completes. Left: the conversation. Center: the Design Files panel with a structured layers tree. Right: the live rendered preview — this is a 10-slide editorial pitch deck generated from a single prompt in about 60 seconds.
Step 5 — Iterate and refine
The conversation history is preserved per-project. Each turn, open-design builds the full message history and sends it to Lynkr, so the model has context from every prior design decision.
Try follow-up prompts like:
Make the hero headline larger, add a subtle animated gradient background to the hero section, and add a navigation bar at the top with the logo on the left and three nav links on the right.
or:
Add a testimonials section below the feature cards with three customer quotes and avatar placeholders. Keep the same color palette.
Each response produces an updated artifact. Open-design saves the new version and opens it, preserving the previous version in the file history.
Part V — Advanced Configuration
Using a cloud model as fallback
Lynkr supports routing heavy or complex requests to a cloud provider while keeping simple requests local. Add your Anthropic API key to Lynkr's config:
# In lynkr directory
export ANTHROPIC_API_KEY=sk-ant-...
node bin/cli.js start --port 8081
Lynkr's complexity analyzer will automatically route multi-step reasoning requests to claude-sonnet-4-6 and keep lighter requests on Ollama — without any change to your open-design configuration.
Binding a design system
Open-design lets you create a design system with a DESIGN.md that defines your brand tokens. Once bound to a project, this file is injected into every system prompt. The model reads it as authoritative and will not invent colors, fonts, or spacing outside what you defined.
A minimal DESIGN.md looks like:
## Colors
- Primary: #5B21B6 (violet-800)
- Accent: #7C3AED (violet-600)
- Background: #0F0F23
- Surface: #1A1A2E
- Text: #F8FAFC
## Typography
- Headings: Inter, weight 700
- Body: Inter, weight 400
- Code: JetBrains Mono
## Spacing
Base unit: 4px. All spacing is multiples of 4.
## Buttons
- Primary: solid violet-600, white text, 8px radius, 14px 28px padding
- Ghost: transparent, violet-600 border, violet-600 text
With this bound, every artifact the model generates will use exactly these values — no hallucinated hex codes.
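You can even verify that claim mechanically. The helper below is not part of open-design — it is a quick conformance check you could run yourself, with the palette set mirroring the DESIGN.md above:

```python
import re

# The DESIGN.md palette above, as a set for a quick conformance check.
PALETTE = {'#5B21B6', '#7C3AED', '#0F0F23', '#1A1A2E', '#F8FAFC'}

def off_palette_colors(html):
    """Return any six-digit hex colors in generated HTML that are not in
    the bound palette -- a cheap way to spot hallucinated values."""
    found = {c.upper() for c in re.findall(r'#[0-9A-Fa-f]{6}', html)}
    return found - PALETTE
```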
Attaching skills
Skills are workflow templates that inject expert-level instructions for specific artifact types. Open-design ships with several built-in skills. You can also write your own SKILL.md and publish it to the open-design skill registry.
A good skill for landing pages would define the exact section order, component patterns (hero → social proof → features → CTA), and validation checklist the model must pass before emitting the artifact.
Part VI — Troubleshooting
"Done Xs" in the chat but no design appears
This means the model produced output but it was not recognized as an artifact. The most common causes:
The model used tool calls instead of outputting <artifact> blocks. Some models trained on agent data (like MiniMax or Qwen) reflexively try to call file-writing tools instead of producing direct output. Lynkr handles this by detecting the hallucinated tool calls, dropping them, and injecting a redirect message: "You don't have any tools available. Output the result directly as an <artifact> block." The model then produces the correct output on the follow-up. If you see this happening, it is normal — it adds one extra round trip but the design still arrives.
The HTML failed validation. Open-design requires that artifact HTML start with <!DOCTYPE html> or <html. If the model produces a prose response inside <artifact> tags (e.g. "I updated the header section"), it fails the structural check and is not saved. Try a more explicit prompt: "Output the complete HTML document in an artifact block."
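That structural check is simple enough to sketch (a toy version, assuming the rule is exactly as stated above):

```python
def is_valid_artifact_html(text):
    """Sketch of the structural check described above: artifact content
    must start with <!DOCTYPE html> or <html (ignoring leading whitespace)."""
    head = text.lstrip().lower()
    return head.startswith('<!doctype html') or head.startswith('<html')
```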
Connection refused from the Docker container. Make sure you used host.docker.internal as the base URL (not localhost). From inside a Docker container, localhost refers to the container itself, not your host machine.
ANSI escape codes appearing in the generated HTML
This symptom looks like CSS rules such as * { box-sizing: border-box } appearing as ▸ { box-sizing: border-box }, with colored terminal output embedded in the HTML. Lynkr's latest version detects when text content looks like HTML and bypasses the ANSI markdown renderer, keeping the HTML clean. If you see this, make sure you are on the latest Lynkr commit.
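If you ever need a belt-and-braces filter on your own side of the pipe, stripping the escape sequences is one regex. This is a defensive sketch, not Lynkr's actual fix (which bypasses the terminal renderer entirely for HTML-looking content):

```python
import re

# CSI escape sequences like "\x1b[31m" (simplified; OSC sequences not covered).
ANSI = re.compile(r'\x1b\[[0-9;]*[A-Za-z]')

def strip_ansi(text):
    """Remove terminal color codes from streamed model output before it
    is treated as HTML."""
    return ANSI.sub('', text)
```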
The model keeps trying to explore the file system
If your Ollama model was trained on agent data (Claude Code, Codex), its first instinct is to run ls and read files before generating anything. Lynkr injects a system-level note when it detects a tool-less request: "You have NO tools available. Output ONLY text content directly." Combined with open-design's system prompt, this redirects the model within one or two turns.
Part VII — What This Stack Enables
Running open-design with Lynkr gives you a local AI design studio with some properties that are hard to get from fully-managed SaaS alternatives:
Privacy. Your prompts, your designs, and your conversation history never leave your machine (when using local Ollama models). Sensitive product ideas, unreleased brand concepts, confidential UI specs — none of it is sent to a third-party cloud.
Cost control. With Ollama models, inference is free. Lynkr's token budget enforcement and complexity routing mean you only pay cloud API costs for the requests that genuinely need it. A typical design session with minimax-m2.5:cloud on Ollama costs nothing.
Model flexibility. You are not locked to one vendor. If a better open-source design model releases tomorrow, you pull it with ollama pull and update the model name in open-design settings. The rest of the stack does not change.
Composability. Lynkr exposes a standard Anthropic API surface, which means anything that speaks Anthropic can use it — Claude Code, Codex, Continue.dev, your own scripts. Open-design is just one client. You can run others against the same Lynkr instance simultaneously.
Conclusion
Neither open-design nor Lynkr is trying to replace your existing design workflow wholesale. They are building blocks — a canvas that understands artifacts, and a router that understands models. Assembled correctly, they remove the most annoying friction from early-stage design: the gap between I know what I want this to look like and here is the actual HTML.
The integration is not yet seamless for every model (some still need the redirect injection to break out of agent mode), but the core loop — describe it, see it, iterate — works reliably once both services are running.
If you are running a product team and want to prototype faster without signing up for another SaaS tool, or if you are building in public and want full ownership of your AI-generated design assets, this stack is worth an afternoon of setup time.
Both projects are open source. Links below.
Open-design: github.com/nexu-io/open-design
Lynkr: https://github.com/Fast-Editor/Lynkr
Ollama: ollama.ai
If you found this useful, share it with someone building with open-source AI tools. Comments and questions are open below.