Machina Tools

Posted on Jun 21 • Originally published at machina.chat

How I Built PromptBoard — A Visual Canvas for Building AI Prompts

#ai #opensource #devtools #webdev

There's a class of AI prompts that don't fit in a text box.

Not because the ideas are too long — you can always write more. The problem is that the structure of what you want to communicate is inherently visual. You're describing a flow. You're pointing at an image. You're listing constraints that apply to some parts of the context but not others. You're trying to give the AI a briefing, not a paragraph.

The text box forces everything into one dimension. And the AI, however capable, has to reconstruct the structure you had in your head from a flat string of text.

PromptBoard solves this by flipping the approach: you build the prompt visually first, then export it.

The problem with prompting complex tasks

Every developer who uses AI agents regularly hits a pattern like this:

You have a bug to fix. It's not a simple bug — it involves a flow you need to explain, a screenshot of the broken state, three or four constraints the fix has to respect, and a description of what the correct behavior should look like.

You start typing. You write the task description, then realize you need to explain the flow first. You paste in a screenshot and then write around it. You add the constraints at the end but they're not clearly linked to the specific parts they apply to. By the time you hit send, the prompt is a 400-word wall of text with an image in the middle.

The AI can often handle this. But you're asking it to do structural inference that you could have done once, clearly, in a canvas.

The deeper issue is that prompts have a natural graph structure: nodes (concepts, constraints, examples) with labeled relationships between them. A text box serializes that graph into a linear sequence and throws away the relationship labels.

The design: blocks, arrows, export

PromptBoard is built around three concepts:

Blocks are the nodes. There are three types:

Text — free-form content, the main carrier of context. Can have an optional label.
Image — drag-and-drop or paste a screenshot. Gets embedded as base64 in the export.
Flow — a process/decision/terminal node for describing logic visually.

Arrows connect blocks and carry a label. "This constraint applies to this flow step." "This screenshot is evidence for this bug description." The relationships are explicit, not inferred from reading order.

Export serializes the canvas back to text — structured text. Blocks are rendered in top-to-bottom, left-to-right order. Arrows become a ## Flow section listing every connection with its label. Images are embedded as base64. The output is a Markdown file any AI can parse immediately.
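To make the three concepts concrete, here is a rough sketch of a data model plus the top-to-bottom, left-to-right ordering step. Every type and field name below (`Block`, `Arrow`, `readingOrder`, `rowTolerance`) is an assumption for illustration, not PromptBoard's actual internals. The row-grouping tolerance is also an assumption, since the article doesn't say how ties between nearly aligned blocks are broken.

```typescript
// Hypothetical data model: names are illustrative, not taken from PromptBoard's source.
interface BaseBlock { id: string; x: number; y: number; } // canvas position in px
interface TextBlock extends BaseBlock { kind: "text"; label?: string; content: string; }
interface ImageBlock extends BaseBlock { kind: "image"; alt: string; dataUrl: string; }
interface FlowBlock extends BaseBlock {
  kind: "flow";
  shape: "process" | "decision" | "terminal";
  content: string;
}
type Block = TextBlock | ImageBlock | FlowBlock;

// An arrow is a labeled edge between two block ids.
interface Arrow { from: string; to: string; label: string; }

// Order blocks top-to-bottom, then left-to-right within a row.
// Blocks whose y is within `rowTolerance` px of the row's first block share that row.
function readingOrder(blocks: Block[], rowTolerance = 20): Block[] {
  const byY = blocks.slice().sort((a, b) => a.y - b.y); // copy, so the input is untouched
  const rows: Block[][] = [];
  for (const block of byY) {
    const row = rows[rows.length - 1];
    if (row && block.y - row[0].y <= rowTolerance) {
      row.push(block);
    } else {
      rows.push([block]);
    }
  }
  const ordered: Block[] = [];
  for (const row of rows) {
    row.sort((a, b) => a.x - b.x);
    for (const block of row) ordered.push(block);
  }
  return ordered;
}
```

Grouping into rows first avoids a subtle problem with sorting on raw `y` alone: two blocks placed side by side but a few pixels apart vertically would otherwise come out in the wrong order.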

Why a canvas, not a form

The first version of this tool was a form. Title field, description field, constraints field, image upload. Structured, explicit, readable.

It was unusable.

The problem with forms is that they impose a fixed schema. Your context doesn't always have a title and a description and three constraints. Sometimes it's just two things that are connected. Sometimes you have five images and no text yet.

A canvas has no schema. You start with an empty surface and put things where they make sense. The structure emerges from the layout, not from a pre-defined form. That's exactly how you think through a problem before you explain it — spatially, not linearly.

How voice dictation works

PromptBoard has voice dictation on every text block and arrow label. Two modes:

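Chromium (Chrome, Edge): uses the Web Speech API with continuous: true. You click the mic button, talk, and transcribed text appends to the block in real time. There is no server to run and no API key to manage. Recognition is provided by the browser itself; Chrome typically sends the audio to a remote speech service, so this mode generally needs a network connection.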

Firefox and others: MediaRecorder captures the audio, then sends it to a local Whisper server (Transcriber, port 4324) for transcription. If Transcriber isn't running, a dialog appears — you can type what you said, or replay the audio.

The asymmetry is intentional: Chromium's built-in speech recognition is good enough for quick notes, while Whisper handles longer or more technical dictation better.
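The two modes can be sketched roughly as below. `SpeechRecognition` (exposed as `webkitSpeechRecognition` in Chrome), `continuous`, `interimResults`, `onresult`, `resultIndex`, `isFinal`, and `transcript` are real Web Speech API names. Everything else is an assumption: the helper names, the minimal interfaces (declared so the snippet doesn't depend on DOM typings), and the use of feature detection instead of whatever browser check PromptBoard actually performs. Note that feature detection would also route Safari to the Web Speech path.

```typescript
type DictationMode = "web-speech" | "media-recorder" | "manual";

// `env` stands in for the browser's global object so the logic can run anywhere.
interface EnvLike {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
  MediaRecorder?: unknown;
}

function pickDictationMode(env: EnvLike): DictationMode {
  if (env.SpeechRecognition || env.webkitSpeechRecognition) return "web-speech";
  if (env.MediaRecorder) return "media-recorder"; // record, then send to local Whisper
  return "manual"; // assumption: with no capture API, the user just types
}

// Minimal structural types mirroring the Web Speech API result shape.
interface AlternativeLike { transcript: string; }
interface ResultLike { isFinal: boolean; 0: AlternativeLike; }
interface ResultEventLike {
  resultIndex: number;
  results: { length: number; [index: number]: ResultLike };
}

// Collect only the finalized phrases from one `result` event.
function finalTranscript(event: ResultEventLike): string {
  let text = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.isFinal) text += result[0].transcript;
  }
  return text;
}

// Append dictated text to a block's content with a single separating space.
function appendDictation(existing: string, dictated: string): string {
  const addition = dictated.trim();
  if (addition === "") return existing;
  if (existing === "" || /\s$/.test(existing)) return existing + addition;
  return existing + " " + addition;
}

// Browser wiring (hypothetical helper). Returns a stop function, or null if unsupported.
function startDictation(onText: (text: string) => void): (() => void) | null {
  const g = globalThis as any;
  const Ctor = g.SpeechRecognition || g.webkitSpeechRecognition;
  if (!Ctor) return null; // caller falls back to the MediaRecorder path
  const recognition = new Ctor();
  recognition.continuous = true; // keep listening across pauses
  recognition.interimResults = false; // only deliver finalized phrases
  recognition.onresult = (event: ResultEventLike) => {
    const text = finalTranscript(event);
    if (text) onText(text);
  };
  recognition.start();
  return () => recognition.stop();
}
```

In a page, this might be used as `const stop = startDictation(t => { block.content = appendDictation(block.content, t); })`, where `block` is whatever object holds the text being edited.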

The export format

# Fix the checkout form

**Goal**
Fix the checkout form — Cart component won't submit after the last refactor

**Constraints**
No new deps · TypeScript strict · keep under 50 lines

![cart-screenshot](data:image/png;base64,...)

**[▭ Process]** Cart validates form fields

**[◇ Decision]** Payment API responds?

**[○ Terminal]** Show success or error state

## Flow

Cart validates form fields → Payment API responds? (calls POST /api/checkout)
Payment API responds? → Show success or error state (on failure: surface error message)

When the AI reads this, it has: the task in plain language, constraints explicitly stated, the screenshot as direct visual evidence, and the flow as a labeled graph.
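A serializer that produces output in the shape shown above might look roughly like this. The block and arrow types are assumptions (PromptBoard's internal field names aren't given), and the spacing and shape glyphs are inferred from the sample output rather than from the source code.

```typescript
// Hypothetical types; blocks are assumed to arrive already in export order.
type Block =
  | { id: string; kind: "text"; label?: string; content: string }
  | { id: string; kind: "image"; alt: string; dataUrl: string }
  | { id: string; kind: "flow"; shape: "process" | "decision" | "terminal"; content: string };

interface Arrow { from: string; to: string; label: string; }

const SHAPE_TAGS = {
  process: "▭ Process",
  decision: "◇ Decision",
  terminal: "○ Terminal",
};

// Short human-readable name used when an arrow refers to a block.
function blockName(block: Block): string {
  if (block.kind === "image") return block.alt;
  if (block.kind === "text") return block.label || block.content;
  return block.content;
}

function renderBlock(block: Block): string {
  switch (block.kind) {
    case "text":
      return block.label ? `**${block.label}**\n${block.content}` : block.content;
    case "image":
      return `![${block.alt}](${block.dataUrl})`; // dataUrl like "data:image/png;base64,..."
    case "flow":
      return `**[${SHAPE_TAGS[block.shape]}]** ${block.content}`;
  }
}

function exportMarkdown(title: string, blocks: Block[], arrows: Arrow[]): string {
  const sections = [`# ${title}`];
  for (const block of blocks) sections.push(renderBlock(block));

  const flowLines: string[] = [];
  for (const arrow of arrows) {
    const from = blocks.filter((b) => b.id === arrow.from)[0];
    const to = blocks.filter((b) => b.id === arrow.to)[0];
    if (!from || !to) continue; // skip arrows whose endpoints are missing
    const suffix = arrow.label ? ` (${arrow.label})` : "";
    flowLines.push(`${blockName(from)} → ${blockName(to)}${suffix}`);
  }
  if (flowLines.length > 0) sections.push("## Flow\n\n" + flowLines.join("\n"));

  return sections.join("\n\n") + "\n";
}
```

Keeping the arrows in their own section means the block text stays readable on its own, while the relationships remain explicit instead of being implied by reading order.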

Technical architecture

PromptBoard is a single HTML file, around 1,100 lines. No build step, no npm install, no server. Open it in a browser and it works.

State: a single S object holds all blocks, arrows, history stack, and interaction state. Everything is JSON-serializable. Boards are saved to localStorage (up to 20 boards).
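Persisting boards with a cap of 20 could be sketched like this. The storage key, the `SavedBoard` shape, and the eviction policy (drop the oldest board once the cap is exceeded) are all assumptions; the only detail from the description above is the 20-board limit. `StorageLike` is a subset of the Web Storage API (`getItem`/`setItem`), so `window.localStorage` satisfies it in a browser.

```typescript
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface SavedBoard {
  id: string;
  name: string;
  savedAt: number; // timestamp, used to decide which boards are oldest
  state: unknown; // the JSON-serializable board state
}

const BOARDS_KEY = "promptboard.boards"; // hypothetical key name
const MAX_BOARDS = 20;

function loadBoards(storage: StorageLike): SavedBoard[] {
  const raw = storage.getItem(BOARDS_KEY);
  if (raw === null) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch (_err) {
    return []; // corrupted data: start fresh instead of crashing
  }
}

function saveBoard(storage: StorageLike, board: SavedBoard): SavedBoard[] {
  // Replace any earlier copy of the same board, then keep only the newest MAX_BOARDS.
  const others = loadBoards(storage).filter((b) => b.id !== board.id);
  const all = others.concat([board]).sort((a, b) => a.savedAt - b.savedAt);
  const kept = all.slice(Math.max(0, all.length - MAX_BOARDS));
  storage.setItem(BOARDS_KEY, JSON.stringify(kept));
  return kept;
}
```

Because base64 images live inside the state, real boards can get large; a production version would probably also want to handle a quota error from `setItem`.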

Undo/redo: snapshot-based history (JSON.stringify + JSON.parse of the state). Up to 60 snapshots. Ctrl+Z / Ctrl+Y work everywhere outside a text input.
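Snapshot-based undo/redo with a fixed cap can be sketched as follows. The class and method names are hypothetical; the 60-snapshot limit and the JSON.stringify / JSON.parse round trip come from the description above.

```typescript
class SnapshotHistory<T> {
  private past: string[] = [];
  private future: string[] = [];
  private limit: number;

  constructor(limit = 60) {
    this.limit = limit;
  }

  // Call with the current state just before mutating it.
  record(state: T): void {
    this.past.push(JSON.stringify(state));
    if (this.past.length > this.limit) this.past.shift(); // drop the oldest snapshot
    this.future = []; // a fresh edit invalidates the redo stack
  }

  // Returns the previous state, or null if there is nothing to undo.
  undo(current: T): T | null {
    const snapshot = this.past.pop();
    if (snapshot === undefined) return null;
    this.future.push(JSON.stringify(current));
    return JSON.parse(snapshot) as T;
  }

  // Returns the state that was undone, or null if there is nothing to redo.
  redo(current: T): T | null {
    const snapshot = this.future.pop();
    if (snapshot === undefined) return null;
    this.past.push(JSON.stringify(current));
    return JSON.parse(snapshot) as T;
  }

  depth(): number {
    return this.past.length;
  }
}
```

Full snapshots are simple and hard to get wrong, at the cost of memory: each one duplicates every embedded image, which is one reason to cap the stack.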

Arrows: rendered as SVG quadratic Bézier curves with a slight perpendicular offset to avoid overlapping block edges. Hit areas are 14px-wide transparent paths over 1.5px visible paths.
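The geometry can be sketched as a quadratic Bézier whose control point is the segment midpoint pushed out along the perpendicular. The function name and the default offset of 30px are assumptions; only the curve type and the perpendicular offset are described above.

```typescript
interface Point { x: number; y: number; }

// Build the `d` attribute for an SVG <path>: M start, Q control end.
function arrowPath(from: Point, to: Point, offset = 30): string {
  const dx = to.x - from.x;
  const dy = to.y - from.y;
  const length = Math.sqrt(dx * dx + dy * dy);
  if (length === 0) return `M ${from.x} ${from.y}`; // degenerate: nothing to draw

  // Unit vector perpendicular to the from->to direction.
  const px = -dy / length;
  const py = dx / length;

  // Control point: midpoint shifted sideways by `offset` px.
  const cx = (from.x + to.x) / 2 + px * offset;
  const cy = (from.y + to.y) / 2 + py * offset;

  return `M ${from.x} ${from.y} Q ${cx} ${cy} ${to.x} ${to.y}`;
}
```

For the wide hit area, one plausible approach is to draw the same `d` string twice: once with a 14px transparent stroke that receives pointer events, and once with the 1.5px visible stroke on top.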

Canvas: 3000×2000px scrollable area. Blocks are position:absolute divs. Dragging starts with mousedown on a block's header and is tracked with mousemove and mouseup listeners on document.
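The drag logic boils down to remembering where inside the block the pointer grabbed it at mousedown, then reapplying that offset on every mousemove. The sketch below keeps that math in pure functions. All names are hypothetical, and clamping the block to the 3000×2000 canvas is an assumption rather than confirmed behavior. Pointer coordinates are assumed to be already converted to canvas space (client position plus scroll offset).

```typescript
interface Point { x: number; y: number; }
interface Size { width: number; height: number; }
interface DragState { blockId: string; offset: Point; }

const CANVAS: Size = { width: 3000, height: 2000 };

// mousedown on the block header: remember the grab point relative to the block's top-left.
function beginDrag(blockId: string, blockPos: Point, pointer: Point): DragState {
  return {
    blockId,
    offset: { x: pointer.x - blockPos.x, y: pointer.y - blockPos.y },
  };
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// mousemove on document: compute the block's new top-left, kept inside the canvas.
function dragTo(drag: DragState, pointer: Point, blockSize: Size): Point {
  return {
    x: clamp(pointer.x - drag.offset.x, 0, CANVAS.width - blockSize.width),
    y: clamp(pointer.y - drag.offset.y, 0, CANVAS.height - blockSize.height),
  };
}
```

Listening for mousemove and mouseup on document rather than on the block itself means the drag keeps working even when the pointer briefly outruns the block.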

Strengths

No installation. The tool lives in one file. Put it on a USB drive, serve it from any static host, or just keep it in your project folder and open it with a double-click.

Multimodal output. The base64 image embedding means the exported .md is self-contained: images travel with the text. Hand the file to Claude or GPT-4o through a client that decodes embedded images and the screenshots are right there.

Voice-first friendly. For developers who think faster than they type, or who are debugging a live environment and need both hands, voice dictation makes PromptBoard usable without touching a keyboard.

Composable with the rest of Machina. The .md export is the same format BugCapture produces. A natural workflow: BugCapture records the bug, ContextForge adds the git diff and logs, PromptBoard adds the visual structure and constraints.

Try it

PromptBoard is part of Machina — a free, open-source suite of AI developer tools.

git clone https://github.com/machina-tools/machina.git

Then open tools/promptboard/index.html in your browser. No server needed.

→ GitHub | machina.chat

Top comments (2)

Nazar Boyko • Jun 21

The framing that prompts have a graph shape and a text box throws away the edge labels is a genuinely good way to put it. That matches how I sketch a gnarly task before I write it out. One thing I'm curious about: the export flattens blocks top to bottom and left to right, then lists the arrows in a Flow section. Doesn't that lose some of the spatial meaning the canvas gave you, since where you put a block carries intent too? Or do the explicit arrow labels recover enough that reading order stops mattering?

Alex Shev • Jun 22

Visual prompt building makes sense for workflows where relationships matter more than wording. A canvas can expose dependencies, references, and constraints that get buried when everything is flattened into one long text box.