Amery

Posted on Jun 19

How I built a mind map that's just a Markdown list (and why that makes AI streaming almost free)

#typescript #ai #react #opensource

Most mind map tools store their data in a proprietary binary blob or lock you into a WYSIWYG editor. When I started Open MindMap, a React component, I made one decision up front that ended up shaping everything else:

The source of truth is a plain, indented Markdown-like list. Nothing else.

That one constraint turned out to cascade into a bunch of properties I wanted but hadn't fully planned for — diff-friendly maps, trivial programmatic generation, and the thing people ask about most: real-time AI streaming. This post walks through how the pieces fit together, from text to tree to SVG.

If you'd rather just play with it: the demo is at mindmap.u14.app, the code is on GitHub (u14app/mindmap, Apache-2.0), and the package is on npm as @xiangfa/mindmap.

The whole thing in one sentence

text  →  tree  →  layout  →  SVG

Every stage is a pure function of the one before it, and data only flows left to right. The tree also serializes back to text without loss, and that symmetry is the trick that makes the rest easy.

1. The data model

A mind map is a forest of trees, so the node type is about as boring as you'd expect:

interface MindMapData {
  id: string;
  text: string;
  children?: MindMapData[];
  remark?: string;                 // multi-line note attached to a node
  taskStatus?: "todo" | "doing" | "done";
  // plugin-populated fields:
  tags?: string[];
  anchorId?: string;               // cross-link source
  crossLinks?: CrossLink[];        // cross-link targets
  collapsed?: boolean;
  dottedLine?: boolean;
  multiLineContent?: string[];
}

The important detail is what isn't here: no coordinates, no widths, no colors. Position and style are derived later from the tree, never stored. That keeps the data model serializable and keeps the text the single source of truth.

2. Parsing: indentation is the hierarchy

The parser's only real job is converting leading whitespace into parent/child relationships. The heart of it is a stack keyed by indentation depth:

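// Stand-in id generator so the snippet runs on its own.
let nextId = 0;
const uid = (): string => `node-${nextId++}`;

function parseList(source: string): MindMapData[] {
  // blank lines carry no structure, so drop them up front
  const lines = source.split("\n").filter((l) => l.trim().length > 0);
  const roots: MindMapData[] = [];
  // open ancestors of the current line, shallowest first
  const stack: { node: MindMapData; indent: number }[] = [];

  for (const line of lines) {
    const indent = line.length - line.trimStart().length;
    // strip the bullet marker; requiring whitespace (or end of line) after it
    // keeps text such as "**bold**" from losing its first asterisk
    const text = line.trim().replace(/^[-*+](\s+|$)/, "");
    const node: MindMapData = { id: uid(), text, children: [] };

    // climb back up to the correct parent
    while (stack.length && stack[stack.length - 1].indent >= indent) {
      stack.pop();
    }

    if (stack.length === 0) roots.push(node);
    else stack[stack.length - 1].node.children!.push(node);

    stack.push({ node, indent });
  }

  return roots;
}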

That's the entire core idea. A line more indented than the previous one is its child; a line at the same or lower indentation pops the stack until it finds its real parent. Multiple top-level lines (or trees separated by a blank line) simply become multiple roots — which is how you get several independent maps on one canvas.

It's O(n) over lines, single pass, no backtracking. Hold that property in mind — it's what makes streaming cheap in section 5.

3. The two layers of syntax

Plain indentation gets you a tree, but a tree of bare strings is boring. The richer syntax lives in two layers on top of the structural parse:

Line-level markers decide what kind of node a line is, before the text is stored:

- React #framework #frontend      → tags: ["framework", "frontend"]
- [x] Ship the parser             → taskStatus: "done"
- [ ] Write the layout engine     → taskStatus: "todo"
  > remember to handle RTL        → remark on the parent node
- Launch {#launch}                → anchorId: "launch"
  -> {#launch} "depends on"       → crossLink with a label

Inline formatting is parsed lazily, only when a node renders, into a small token stream:

// "**bold** and `code`" → [{type:"bold",...}, {type:"text"," and "}, {type:"code",...}]
parseInlineMarkdown(node.text);

Keeping inline parsing lazy matters: you don't pay to tokenize **bold** for a node that's currently scrolled off-screen or collapsed.
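Here's a minimal sketch of what lazy inline parsing could look like. The token shape ({ type, value }), the parseInline name, and the text-keyed cache are all assumptions for illustration, not the library's actual parseInlineMarkdown. It only handles **bold** and `code`.

```typescript
// Hypothetical token shape; the real token stream may carry more fields.
type InlineToken = { type: "text" | "bold" | "code"; value: string };

// Split a node's text into bold, code, and plain-text tokens.
function parseInline(text: string): InlineToken[] {
  const tokens: InlineToken[] = [];
  const pattern = /\*\*(.+?)\*\*|`([^`]+)`/g;
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    // plain text between the previous match and this one
    if (match.index > last) {
      tokens.push({ type: "text", value: text.slice(last, match.index) });
    }
    if (match[1] !== undefined) {
      tokens.push({ type: "bold", value: match[1] });
    } else {
      tokens.push({ type: "code", value: match[2] ?? "" });
    }
    last = pattern.lastIndex;
  }
  // trailing plain text, including unclosed markers such as "**bo"
  if (last < text.length) {
    tokens.push({ type: "text", value: text.slice(last) });
  }
  return tokens;
}

// Lazy part: tokenize on first request only, then reuse the result.
const tokenCache = new Map<string, InlineToken[]>();

function tokensFor(text: string): InlineToken[] {
  let tokens = tokenCache.get(text);
  if (tokens === undefined) {
    tokens = parseInline(text);
    tokenCache.set(text, tokens);
  }
  return tokens;
}
```

A renderer would call tokensFor(node.text) only for nodes it actually draws, so collapsed or off-screen nodes never reach the regex. An unclosed marker falls through as plain text, which is what you want while a line is still streaming in.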

Each of these extras is a plugin. Seven are built in (tags, folding, cross-links, LaTeX, dotted lines, multi-line, frontmatter), and all of them are on by default. Each one is also tree-shakeable: a plugin you don't import isn't in your bundle. A plugin is essentially a pair of hooks: one that claims a line during parsing and writes a field onto the node, and one that contributes to rendering. The core parser stays oblivious to all of them.
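To make the "claims a line and writes a field" idea concrete, here's a rough sketch of the parsing half of a plugin. The LinePlugin interface, the claim hook, and applyPlugins are hypothetical names invented for this example; the real plugin API isn't shown in this post and may look quite different. Only the two task markers shown above ([x] and [ ]) are handled.

```typescript
interface NodeFields {
  text: string;
  tags?: string[];
  taskStatus?: "todo" | "doing" | "done";
}

// Hypothetical hook shape: inspect the line, write fields onto the node,
// and return whatever text is left for the next plugin.
interface LinePlugin {
  name: string;
  claim(text: string, node: NodeFields): string;
}

const taskPlugin: LinePlugin = {
  name: "task",
  claim(text, node) {
    const m = /^\[([ x])\]\s+(.*)$/.exec(text);
    if (!m) return text; // not a task line, leave it alone
    node.taskStatus = m[1] === "x" ? "done" : "todo";
    return m[2] ?? "";
  },
};

const tagsPlugin: LinePlugin = {
  name: "tags",
  claim(text, node) {
    const tags: string[] = [];
    // "#tag" must start the line or follow whitespace, so "{#anchor}" is untouched
    const rest = text.replace(/(^|\s)#([\w-]+)/g, (_m: string, _lead: string, tag: string) => {
      tags.push(tag);
      return "";
    });
    if (tags.length > 0) node.tags = tags;
    return rest.trim();
  },
};

// The core only loops over whatever plugins it was given.
function applyPlugins(lineText: string, plugins: LinePlugin[]): NodeFields {
  const node: NodeFields = { text: "" };
  let remaining = lineText;
  for (const plugin of plugins) {
    remaining = plugin.claim(remaining, node);
  }
  node.text = remaining;
  return node;
}
```

With an empty plugin list, applyPlugins returns the line text unchanged, which is the property that lets the structural parser stay ignorant of tags and tasks.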

4. Why the round-trip matters

Because position and style are derived rather than stored, the tree holds nothing the text can't express, so the text/tree transforms are reversible:

parseMarkdownList(text)        // text  → MindMapData
toMarkdownList(node)           // MindMapData → text
parseMarkdownMultiRoot(text)   // text  → MindMapData[]
toMarkdownMultiRoot(forest)    // MindMapData[] → text

This is what makes the maps diffable and version-controllable — a one-node change is a one-line diff in git, not an opaque binary delta. It's also why the built-in text editor can flip between a visual map and raw Markdown with no lossy conversion in either direction: they're two views of the same string.
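A stripped-down round-trip, as a sketch: the names parseOutline and toOutline are stand-ins rather than the library's functions, the node type carries only text and children, and the serializer assumes a canonical form of two spaces per level with a "- " bullet.

```typescript
interface OutlineNode {
  text: string;
  children: OutlineNode[];
}

// text -> forest, using the same indentation stack as the parser in section 2
function parseOutline(source: string): OutlineNode[] {
  const roots: OutlineNode[] = [];
  const stack: { node: OutlineNode; indent: number }[] = [];
  for (const line of source.split("\n")) {
    if (line.trim().length === 0) continue;
    const indent = line.search(/\S/); // index of first non-space character
    const node: OutlineNode = {
      text: line.trim().replace(/^[-*+]\s+/, ""),
      children: [],
    };
    while (stack.length > 0 && stack[stack.length - 1]!.indent >= indent) {
      stack.pop();
    }
    const parent = stack[stack.length - 1];
    if (parent) parent.node.children.push(node);
    else roots.push(node);
    stack.push({ node, indent });
  }
  return roots;
}

// forest -> text, one line per node, depth encoded as indentation
function toOutline(forest: OutlineNode[], depth = 0): string {
  return forest
    .map((node) => {
      const line = `${"  ".repeat(depth)}- ${node.text}`;
      const kids = toOutline(node.children, depth + 1);
      return kids ? `${line}\n${kids}` : line;
    })
    .join("\n");
}

const canonical = "- Root\n  - A\n    - A1\n  - B";
const roundTripped = toOutline(parseOutline(canonical));
```

Text already in canonical form comes back byte-for-byte identical. Text with other bullets or wider indentation comes back normalized, but it parses to the same tree, so the diff stays limited to lines that actually changed once a file has been through the serializer.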

5. Streaming: the part that was almost free

Here's the payoff. An LLM emits a mind map as a Markdown list, token by token. At any instant, the buffer you've received so far is already valid Markdown — a partial list is just a smaller list. So the streaming consumer is almost insultingly simple:

let buffer = "";
for await (const chunk of llmStream) {
  buffer += chunk;
  mindMapRef.current?.setMarkdown(buffer); // re-parse the partial text every tick
}

Each tick re-runs the O(n) parser over the whole buffer and hands React a fresh tree. Two things keep this smooth instead of janky:

The parser is cheap and total. It never throws on incomplete input — a half-written line is just a node with short text that'll get longer next tick. There's no "wait for a complete element" state machine to stall on.
The diff between ticks is tiny. Re-parsing produces a tree that's almost identical to the last one, so React's reconciliation and the SVG re-layout only touch what changed. The map visibly grows rather than flashing and redrawing.
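One detail matters for the "tiny diff" claim: a node has to keep the same identity from one tick to the next, or a keyed renderer will treat every re-parse as a brand-new tree. The sketch below is one way to get that, under an assumption of mine: ids are derived from a node's position ("0", "0.1", "0.1.0") instead of a fresh uid() per parse. This is not necessarily how the library assigns ids. The chunks array stands in for the LLM stream.

```typescript
interface StreamNode {
  id: string;
  text: string;
  children: StreamNode[];
}

// Total parser: any prefix of a list parses without throwing.
function parsePartial(buffer: string): StreamNode[] {
  const roots: StreamNode[] = [];
  const stack: { node: StreamNode; indent: number }[] = [];
  for (const line of buffer.split("\n")) {
    if (line.trim().length === 0) continue; // skips a half-typed indent too
    const indent = line.search(/\S/);
    while (stack.length > 0 && stack[stack.length - 1]!.indent >= indent) {
      stack.pop();
    }
    const parent = stack[stack.length - 1]?.node;
    const siblings = parent ? parent.children : roots;
    // Assumption: path-based id, stable because earlier lines never change
    const id = parent ? `${parent.id}.${siblings.length}` : `${siblings.length}`;
    const text = line.trim().replace(/^[-*+](\s+|$)/, "");
    const node: StreamNode = { id, text, children: [] };
    siblings.push(node);
    stack.push({ node, indent });
  }
  return roots;
}

function collectIds(forest: StreamNode[], out: string[] = []): string[] {
  for (const node of forest) {
    out.push(node.id);
    collectIds(node.children, out);
  }
  return out;
}

// Replay chunks the way the streaming loop would, recording ids per tick.
function replay(chunks: string[]): string[][] {
  const snapshots: string[][] = [];
  let buffer = "";
  for (const chunk of chunks) {
    buffer += chunk;
    snapshots.push(collectIds(parsePartial(buffer)));
  }
  return snapshots;
}

const ticks = replay(["- Pla", "n\n  - Par", "ser\n  - Lay", "out\n    - Meas", "ure"]);
```

Each tick's id list is a prefix of the next one, so from the renderer's point of view a tick either appends nodes or lengthens the text of the last one. The last chunk here adds no nodes at all; it only turns "Meas" into "Measure".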

The optional built-in AI input bar wraps this and talks to any OpenAI-compatible endpoint, but the streaming behavior isn't special-cased for it — anything that can call setMarkdown with a growing string gets the same effect. The component never had to know an LLM existed. That's the whole point of making text the interface.

One honest caveat: the AI input bar sends the API key straight from the browser. That's fine for local prototyping, but in production you should put a proxy in front of it so the key stays server-side.

6. Layout: from tree to coordinates

This is where the real work hides, because SVG gives you zero automatic layout — you place every pixel yourself.

The layout pass is a recursive walk that assigns each node an (x, y):

x comes from depth. Each level steps a fixed distance away from the root.
y comes from subtree size. A node is centered against the vertical span of its children, so a parent sits opposite the middle of its branch.

function measureSubtree(node: MindMapData): number {
  if (!node.children?.length || node.collapsed) return NODE_HEIGHT;
  return node.children.reduce((h, c) => h + measureSubtree(c), 0);
}
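Putting the two rules together, a bare-bones version of the layout walk might look like the following. NODE_HEIGHT and LEVEL_GAP are assumed values, every node is treated as the same height, and the subtree is re-measured on each visit for brevity (a real implementation would cache it). It lays out a single side only.

```typescript
interface LayoutNode {
  text: string;
  children?: LayoutNode[];
  collapsed?: boolean;
}

interface Placed {
  text: string;
  x: number;
  y: number;
}

const NODE_HEIGHT = 40; // assumed height of one leaf row, in px
const LEVEL_GAP = 160; // assumed horizontal step per depth level, in px

// A collapsed node is laid out as if it had no children.
function visibleChildren(node: LayoutNode): LayoutNode[] {
  return node.collapsed ? [] : (node.children ?? []);
}

// Vertical space a subtree needs: the sum of its leaves' heights.
function measure(node: LayoutNode): number {
  const kids = visibleChildren(node);
  if (kids.length === 0) return NODE_HEIGHT;
  return kids.reduce((h, c) => h + measure(c), 0);
}

// x from depth, y from the middle of the vertical band the subtree occupies.
function place(node: LayoutNode, depth: number, top: number, out: Placed[]): void {
  out.push({ text: node.text, x: depth * LEVEL_GAP, y: top + measure(node) / 2 });
  let childTop = top;
  for (const child of visibleChildren(node)) {
    place(child, depth + 1, childTop, out);
    childTop += measure(child); // the next sibling starts below this band
  }
}

function layout(root: LayoutNode): Placed[] {
  const out: Placed[] = [];
  place(root, 0, 0, out);
  return out;
}

const sample: LayoutNode = {
  text: "Root",
  children: [
    { text: "A", children: [{ text: "A1" }, { text: "A2" }] },
    { text: "B" },
  ],
};
const placed = layout(sample);
```

For this sample the three leaves take 120px in total, so Root lands at y = 60, A at y = 40 (the middle of its own 80px band), and B at y = 100.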

For the default balanced ("both") layout, the root's top-level branches are split between the left and right sides so the two halves stay roughly even in height. Each side then lays out independently, with the left side mirrored. The "left" and "right" modes simply send every branch to one side.
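One simple way to do that split is a greedy pass: walk the branches in order and hand each one to whichever side is currently shorter. That's an assumption for illustration, not necessarily the rule the library uses, and the Branch type and function names below are made up for the example.

```typescript
type Side = "left" | "right";

interface Branch {
  text: string;
  height: number; // measured subtree height, in px
}

// Greedy split: each branch goes to the currently shorter side (ties go right).
function splitBalanced(branches: Branch[]): { left: Branch[]; right: Branch[] } {
  const left: Branch[] = [];
  const right: Branch[] = [];
  let leftHeight = 0;
  let rightHeight = 0;
  for (const branch of branches) {
    if (rightHeight <= leftHeight) {
      right.push(branch);
      rightHeight += branch.height;
    } else {
      left.push(branch);
      leftHeight += branch.height;
    }
  }
  return { left, right };
}

// Mirroring: the left side uses the same depth steps, pointed the other way.
function xForDepth(depth: number, side: Side, levelGap = 160): number {
  return side === "left" ? -depth * levelGap : depth * levelGap;
}

const halves = splitBalanced([
  { text: "A", height: 120 },
  { text: "B", height: 40 },
  { text: "C", height: 80 },
  { text: "D", height: 40 },
]);
```

Here A and D end up on the right (160px) and B and C on the left (120px). A greedy pass keeps branches in reading order on each side but won't always find the most even split; sorting by height first would balance better at the cost of reordering.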

Each top-level branch also gets a stable branch index, which is how coloring works: the index maps to a CSS variable (--mindmap-branch-0 … -9) and lands on the SVG element as data-branch-index, so you can theme an entire branch with one selector.

Text width is the gnarly bit. SVG won't tell you how wide a string renders until it's in the DOM, so node sizing has to either measure rendered text or approximate it — and getting that wrong means edges that don't quite meet their boxes. It's the least glamorous and most fiddly part of the whole component.
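For the "approximate it" route, a per-character estimate is about the simplest thing that works. The em fractions below are rough guesses for a proportional sans-serif font, not values taken from the library. In a browser, the "measure it" route would use something like SVGTextContentElement.getComputedTextLength() or CanvasRenderingContext2D.measureText() instead.

```typescript
const NARROW = "ilI.,:;'|! ";
const WIDE = "mwMW@";

// Rough width estimate in px: sum per-character em fractions, scale by font size.
function estimateTextWidth(text: string, fontSize = 14): number {
  let ems = 0;
  for (const ch of Array.from(text)) {
    const code = ch.codePointAt(0) ?? 0;
    if (code >= 0x2e80) {
      ems += 1; // CJK and other full-width glyphs: about one em (assumed)
    } else if (NARROW.includes(ch)) {
      ems += 0.3; // narrow glyphs (assumed)
    } else if (WIDE.includes(ch)) {
      ems += 0.85; // wide Latin glyphs (assumed)
    } else {
      ems += 0.55; // average Latin glyph (assumed)
    }
  }
  return ems * fontSize;
}

// A node box is then the text width plus horizontal padding on both sides.
function nodeWidth(text: string, fontSize = 14, padding = 12): number {
  return Math.ceil(estimateTextWidth(text, fontSize)) + padding * 2;
}
```

An estimate like this is good enough to reserve space during a first layout pass, but it will drift from the real rendering, which is exactly where edges stop meeting their boxes. Re-measuring after fonts have loaded is the usual correction.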

7. Rendering: why pure SVG

Everything draws as SVG — no canvas, no external graph/layout library. That's a deliberate tradeoff:

Crisp at any zoom. Vectors don't pixelate, so deep zoom stays sharp.
Themeable with plain CSS. Nodes, edges, and tags are real elements with semantic classes, so you style them with ~30 CSS variables instead of re-rendering a bitmap.
Exportable for free (next section).

Edges are SVG paths — a curve from a parent's anchor point to each child's. Selection, the add-child button, fold toggles, and tags are all just more SVG nodes layered on. The cost you pay for this is that very large trees mean a lot of DOM nodes, so there's a ceiling where canvas would win on raw throughput. For the document-sized maps this is built for, vectors are the right call.
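As a sketch of what one of those edge paths can look like: a cubic Bézier whose two control points sit at the horizontal midpoint, so the curve leaves the parent and enters the child horizontally. The exact curve shape is my assumption; the post only says edges are curved paths between anchor points.

```typescript
interface Point {
  x: number;
  y: number;
}

// Build the "d" attribute for an SVG <path> from a parent anchor to a child anchor.
function edgePath(from: Point, to: Point): string {
  const midX = (from.x + to.x) / 2;
  // M = move to start, C = cubic Bézier with two control points and an end point
  return `M ${from.x} ${from.y} C ${midX} ${from.y}, ${midX} ${to.y}, ${to.x} ${to.y}`;
}

const d = edgePath({ x: 0, y: 60 }, { x: 160, y: 40 });
// d is "M 0 60 C 80 60, 80 40, 160 40"
```

Because the control points depend only on the two anchors, the same function works unchanged for the mirrored left side, where the child's x is simply smaller than the parent's.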

8. Export falls out of the rendering model

Because the live view is SVG, exporting it is mostly serialization:

SVG export walks the rendered tree and inlines a <style> block with the resolved CSS values, plus the same semantic classes and data-branch-index attributes. The result renders correctly as a standalone file with no external stylesheet.
PNG export draws that SVG onto a canvas at high DPI and reads back a blob.
Markdown export is just toMarkdownList again — the round-trip from section 4.

const svg = ref.current.exportToSVG();      // string
const png = await ref.current.exportToPNG(); // Promise<Blob>
const md  = ref.current.exportToOutline();   // markdown list

No separate "export renderer" to keep in sync with the on-screen one, because there's only ever one renderer.

9. What's still hard

In the spirit of being honest rather than selling:

Text measurement is the perennial source of layout bugs, as mentioned. Web fonts loading late can shift sizing after first paint.
Large trees push the DOM-node-count ceiling. Folding helps; virtualization of an SVG tree is a project of its own.
Cross-links break the clean tree model on purpose — they're a second edge layer drawn between arbitrary anchors, and routing them through a dense map without crossings is genuinely unsolved here.
Browser-side API keys for the AI bar need a proxy in real deployments.

None of these are unique to this project; they're the standard taxes you pay for "render a graph in the browser, from text, live."

Takeaways

If there's one transferable idea here, it's this: pick a plain-text representation as your source of truth, and a surprising number of features stop being features and start being consequences. Diffability, version control, programmatic generation, lossless editing, and incremental/streaming rendering all fell out of "it's just a Markdown list" — I didn't build them one by one.

The code is open source (github.com/u14app/mindmap, Apache-2.0) and on npm as @xiangfa/mindmap. I'd genuinely like feedback on the syntax design and the parsing/streaming approach — and if you spot a smarter way to handle the text-measurement problem, please tell me.

DEV Community