
Amery


How I built a mind map that's just a Markdown list (and why that makes AI streaming almost free)

Most mind map tools store their data in a proprietary binary blob or lock you into a WYSIWYG editor. When I started Open MindMap, a React component, I made one decision up front that ended up shaping everything else:

The source of truth is a plain, indented Markdown-like list. Nothing else.

That one constraint turned out to cascade into a bunch of properties I wanted but hadn't fully planned for — diff-friendly maps, trivial programmatic generation, and the thing people ask about most: real-time AI streaming. This post walks through how the pieces fit together, from text to tree to SVG.

If you'd rather just play with it: the demo is at mindmap.u14.app, the code is on GitHub (github.com/u14app/mindmap, Apache-2.0), and the package is on npm as @xiangfa/mindmap.

The whole thing in one sentence

text  →  tree  →  layout  →  SVG

Every stage is a pure function of the one before it, and the tree can always be serialized back to text. That symmetry is the trick that makes the rest easy.

1. The data model

A mind map is a forest of trees, so the node type is about as boring as you'd expect:

interface MindMapData {
  id: string;
  text: string;
  children?: MindMapData[];
  remark?: string;                 // multi-line note attached to a node
  taskStatus?: "todo" | "doing" | "done";
  // plugin-populated fields:
  tags?: string[];
  anchorId?: string;               // cross-link source
  crossLinks?: CrossLink[];        // cross-link targets
  collapsed?: boolean;
  dottedLine?: boolean;
  multiLineContent?: string[];
}

The important detail is what isn't here: no coordinates, no widths, no colors. Position and style are derived later from the tree, never stored. That keeps the data model serializable and keeps the text the single source of truth.
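To make that concrete, here is a small hand-written node. It is only a sketch: the interface is a trimmed copy of the one above, and the ids and texts are made-up sample data. Because nothing derived is stored, the tree survives a plain JSON round-trip unchanged.

```typescript
// Trimmed copy of the node type above (plugin fields omitted for brevity).
interface MindMapData {
  id: string;
  text: string;
  children?: MindMapData[];
  remark?: string;
  taskStatus?: "todo" | "doing" | "done";
  collapsed?: boolean;
}

// Hypothetical sample data; the ids are arbitrary.
const sampleMap: MindMapData = {
  id: "n1",
  text: "Launch",
  children: [
    { id: "n2", text: "Ship the parser", taskStatus: "done" },
    {
      id: "n3",
      text: "Write the layout engine",
      taskStatus: "todo",
      remark: "remember to handle RTL",
    },
  ],
};

// No coordinates, widths, or colors: the whole tree is plain data.
const clonedMap: MindMapData = JSON.parse(JSON.stringify(sampleMap));
```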

2. Parsing: indentation is the hierarchy

The parser's only real job is converting leading whitespace into parent/child relationships. The heart of it is a stack keyed by indentation depth:

let nextId = 0;
const uid = (): string => `n${nextId++}`; // placeholder id helper so the snippet runs standalone

function parseList(source: string): MindMapData[] {
  const lines = source.split("\n").filter((l) => l.trim().length > 0);
  const roots: MindMapData[] = [];
  const stack: { node: MindMapData; indent: number }[] = [];

  for (const line of lines) {
    const indent = line.length - line.trimStart().length;
    // strip the bullet marker; requiring whitespace after it keeps "-> ..." and "**bold**" intact
    const text = line.trim().replace(/^[-*+]\s+/, "");
    const node: MindMapData = { id: uid(), text, children: [] };

    // climb back up to the correct parent
    while (stack.length && stack[stack.length - 1].indent >= indent) {
      stack.pop();
    }

    if (stack.length === 0) roots.push(node);
    else stack[stack.length - 1].node.children!.push(node);

    stack.push({ node, indent });
  }

  return roots;
}

That's the entire core idea. A line more indented than the previous one is its child; a line at the same or lower indentation pops the stack until it finds its real parent. Multiple top-level lines (or trees separated by a blank line) simply become multiple roots — which is how you get several independent maps on one canvas.

It's O(n) over lines, single pass, no backtracking. Hold that property in mind — it's what makes streaming cheap in section 5.
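A quick way to see the stack at work is to run a five-line outline through it and list each node with its depth. The block below is a self-contained sketch: it repeats a compact variant of the parser (this one requires whitespace after the bullet marker), and the `uid` counter, the `toDepthList` helper, and the sample outline are all made up for the illustration.

```typescript
interface MindMapData {
  id: string;
  text: string;
  children?: MindMapData[];
}

let idCounter = 0;
const uid = (): string => "n" + idCounter++; // hypothetical id helper

function parseList(source: string): MindMapData[] {
  const roots: MindMapData[] = [];
  const stack: { node: MindMapData; indent: number }[] = [];
  for (const line of source.split("\n")) {
    if (line.trim().length === 0) continue; // blank lines carry no structure
    const indent = line.length - line.replace(/^\s+/, "").length;
    const text = line.trim().replace(/^[-*+]\s+/, "");
    const node: MindMapData = { id: uid(), text, children: [] };
    // Pop until the top of the stack is strictly less indented: that is the parent.
    while (stack.length > 0 && stack[stack.length - 1].indent >= indent) stack.pop();
    if (stack.length === 0) roots.push(node);
    else stack[stack.length - 1].node.children!.push(node);
    stack.push({ node, indent });
  }
  return roots;
}

// Flatten the forest to "depth:text" strings so the hierarchy is easy to read.
function toDepthList(nodes: MindMapData[], depth: number = 0, out: string[] = []): string[] {
  for (const n of nodes) {
    out.push(depth + ":" + n.text);
    toDepthList(n.children || [], depth + 1, out);
  }
  return out;
}

const sampleOutline = ["- Frontend", "  - React", "    - Hooks", "  - Vue", "- Backend"].join("\n");
const depthList = toDepthList(parseList(sampleOutline));
// depthList: ["0:Frontend", "1:React", "2:Hooks", "1:Vue", "0:Backend"]
```

"Vue" is the interesting line: it pops both "Hooks" and "React" off the stack before landing under "Frontend", and "Backend" empties the stack and becomes a second root.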

3. The two layers of syntax

Plain indentation gets you a tree, but a tree of bare strings is boring. The richer syntax lives in two layers that run after the structural parse:

Line-level markers decide what kind of node a line is, before the text is stored:

- React #framework #frontend      → tags: ["framework", "frontend"]
- [x] Ship the parser             → taskStatus: "done"
- [ ] Write the layout engine     → taskStatus: "todo"
  > remember to handle RTL        → remark on the parent node
- Launch {#launch}                → anchorId: "launch"
  -> {#launch} "depends on"       → crossLink with a label
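As a rough illustration of the line-level layer, the sketch below pulls a task checkbox and `#tags` out of a line whose bullet has already been stripped. The regular expressions and the `parseLineMarkers` name are assumptions made for this example, not the library's actual rules. The examples above show no marker for "doing", so it is not handled here.

```typescript
type TaskStatus = "todo" | "doing" | "done";

interface LineFields {
  text: string;
  tags: string[];
  taskStatus?: TaskStatus;
}

// Hypothetical helper: extracts line-level markers and returns the cleaned text.
function parseLineMarkers(raw: string): LineFields {
  let text = raw.trim();
  const tags: string[] = [];
  let taskStatus: TaskStatus | undefined;

  // "[x] " or "[ ] " at the start of the line marks a task.
  const task = /^\[( |x)\]\s+/.exec(text);
  if (task) {
    taskStatus = task[1] === "x" ? "done" : "todo";
    text = text.slice(task[0].length);
  }

  // "#word" preceded by start-of-line or whitespace is a tag; "C#" is left alone.
  text = text.replace(/(^|\s)#([\w-]+)/g, (_m: string, _lead: string, tag: string): string => {
    tags.push(tag);
    return "";
  });

  const fields: LineFields = { text: text.trim(), tags };
  if (taskStatus) fields.taskStatus = taskStatus;
  return fields;
}

const tagged = parseLineMarkers("React #framework #frontend");
// tagged.text is "React", tagged.tags is ["framework", "frontend"]
const shipped = parseLineMarkers("[x] Ship the parser");
// shipped.taskStatus is "done", shipped.text is "Ship the parser"
```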

Inline formatting is parsed lazily, only when a node renders, into a small token stream:

// "**bold** and `code`" → [{type:"bold",...}, {type:"text"," and "}, {type:"code",...}]
parseInlineMarkdown(node.text);

Keeping inline parsing lazy matters: you don't pay to tokenize **bold** for a node that's currently scrolled off-screen or collapsed.
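Here is a minimal sketch of what lazy inline parsing can look like, assuming only bold and inline code are supported. The token shape (`type` plus `value`), the `tokensFor` cache, and the call counter are inventions for this example. The real `parseInlineMarkdown` may differ in all three.

```typescript
interface InlineToken {
  type: "text" | "bold" | "code";
  value: string;
}

function tokenizeInline(source: string): InlineToken[] {
  const tokens: InlineToken[] = [];
  const pattern = /\*\*(.+?)\*\*|`([^`]+)`/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    // Plain text between the previous match and this one.
    if (m.index > last) tokens.push({ type: "text", value: source.slice(last, m.index) });
    if (m[1] !== undefined) tokens.push({ type: "bold", value: m[1] });
    else tokens.push({ type: "code", value: m[2] });
    last = m.index + m[0].length;
  }
  if (last < source.length) tokens.push({ type: "text", value: source.slice(last) });
  return tokens;
}

// Lazy part: nothing is tokenized until a node asks, and each string is tokenized once.
const inlineCache: Record<string, InlineToken[]> = Object.create(null);
let tokenizeCalls = 0;

function tokensFor(text: string): InlineToken[] {
  let cached = inlineCache[text];
  if (cached === undefined) {
    tokenizeCalls++;
    cached = tokenizeInline(text);
    inlineCache[text] = cached;
  }
  return cached;
}
```

A renderer would call `tokensFor(node.text)` only for nodes it actually draws, so collapsed or off-screen nodes never reach the tokenizer.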

Each of these extras is a plugin. There are seven built in (tags, folding, cross-links, LaTeX, dotted lines, multi-line, frontmatter), they're all on by default, and each one is tree-shakeable — a plugin you don't import isn't in your bundle. A plugin is essentially a pair of hooks: one that claims a line during parsing and writes a field onto the node, and one that contributes to rendering. The core parser stays oblivious to all of them.
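The "pair of hooks" idea can be sketched in a few lines. Everything named here (`MindMapPlugin`, `parseLine`, `render`, `applyPlugins`, the `anchor` class, and the circle markup) is hypothetical and only illustrates the shape. The library's real plugin interface is not shown in this post.

```typescript
interface PluginNode {
  text: string;
  anchorId?: string;
}

// Hypothetical plugin shape: one hook for parsing, one for rendering.
interface MindMapPlugin {
  name: string;
  parseLine(node: PluginNode): void; // may rewrite node.text and set fields
  render(node: PluginNode): string[]; // extra SVG fragments for this node
}

// Claims a trailing "{#id}" and stores it as the node's anchor.
const anchorPlugin: MindMapPlugin = {
  name: "cross-link-anchor",
  parseLine(node) {
    const m = /\s*\{#([\w-]+)\}\s*$/.exec(node.text);
    if (m) {
      node.anchorId = m[1];
      node.text = node.text.slice(0, m.index);
    }
  },
  render(node) {
    return node.anchorId
      ? ['<circle class="anchor" data-anchor="' + node.anchorId + '" r="3" />']
      : [];
  },
};

// The core only loops over whatever plugins it was given.
function applyPlugins(text: string, plugins: MindMapPlugin[]): { node: PluginNode; extras: string[] } {
  const node: PluginNode = { text };
  for (const p of plugins) p.parseLine(node);
  let extras: string[] = [];
  for (const p of plugins) extras = extras.concat(p.render(node));
  return { node, extras };
}

const withAnchor = applyPlugins("Launch {#launch}", [anchorPlugin]);
// withAnchor.node.text is "Launch", withAnchor.node.anchorId is "launch"
const withoutPlugins = applyPlugins("Launch {#launch}", []);
// withoutPlugins.node.text is still "Launch {#launch}"
```

With an empty plugin list the marker is just text, which is the same thing that happens when a plugin is never imported.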

4. Why the round-trip matters

Because position and style are derived, every transform is reversible:

parseMarkdownList(text)        // text  → MindMapData
toMarkdownList(node)           // MindMapData → text
parseMarkdownMultiRoot(text)   // text  → MindMapData[]
toMarkdownMultiRoot(forest)    // MindMapData[] → text

This is what makes the maps diffable and version-controllable — a one-node change is a one-line diff in git, not an opaque binary delta. It's also why the built-in text editor can flip between a visual map and raw Markdown with no lossy conversion in either direction: they're two views of the same string.
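The round-trip is easy to demonstrate with a stripped-down pair of functions. This is a sketch under assumptions: `parseOutline` and `toOutline` are stand-ins for the real `parseMarkdownList` and `toMarkdownList`, nodes carry only text, and the serializer always writes `- ` bullets with two spaces per level.

```typescript
interface OutlineNode {
  text: string;
  children: OutlineNode[];
}

function parseOutline(source: string): OutlineNode[] {
  const roots: OutlineNode[] = [];
  const stack: { node: OutlineNode; indent: number }[] = [];
  for (const line of source.split("\n")) {
    if (line.trim().length === 0) continue;
    const indent = line.length - line.replace(/^\s+/, "").length;
    const node: OutlineNode = { text: line.trim().replace(/^[-*+]\s+/, ""), children: [] };
    while (stack.length > 0 && stack[stack.length - 1].indent >= indent) stack.pop();
    if (stack.length === 0) roots.push(node);
    else stack[stack.length - 1].node.children.push(node);
    stack.push({ node, indent });
  }
  return roots;
}

function toOutline(nodes: OutlineNode[], depth: number = 0): string {
  const pad = new Array(depth + 1).join("  "); // two spaces per level
  const lines: string[] = [];
  for (const n of nodes) {
    lines.push(pad + "- " + n.text);
    if (n.children.length > 0) lines.push(toOutline(n.children, depth + 1));
  }
  return lines.join("\n");
}

const canonical = "- A\n  - B\n    - C\n  - D\n- E";
const roundTripped = toOutline(parseOutline(canonical)); // identical to `canonical`

// Other bullet styles and indent widths normalize on the first pass...
const normalized = toOutline(parseOutline("*   Root\n        * Child")); // "- Root\n  - Child"
// ...and are stable from then on.
const stable = toOutline(parseOutline(normalized));
```

In this sketch the round-trip is exact for text that is already in the serializer's canonical form. Anything else is normalized once and then stays put, which is what keeps diffs down to the lines that actually changed.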

5. Streaming: the part that was almost free

Here's the payoff. An LLM emits a mind map as a Markdown list, token by token. At any instant, the buffer you've received so far is already valid Markdown — a partial list is just a smaller list. So the streaming consumer is almost insultingly simple:

let buffer = "";
for await (const chunk of llmStream) {
  buffer += chunk;
  mindMapRef.current?.setMarkdown(buffer); // re-parse the partial text every tick
}

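Each tick re-runs the O(n) parser over the whole buffer and hands React a fresh tree. Summed over a whole response that is quadratic in the final length, which stays negligible for document-sized outlines. Two things keep this smooth instead of janky: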

  1. The parser is cheap and total. It never throws on incomplete input — a half-written line is just a node with short text that'll get longer next tick. There's no "wait for a complete element" state machine to stall on.
  2. The diff between ticks is tiny. Re-parsing produces a tree that's almost identical to the last one, so React's reconciliation and the SVG re-layout only touch what changed. The map visibly grows rather than flashing and redrawing.
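Both points can be checked without an LLM by replaying a response in arbitrary chunks. The sketch below makes one assumption worth flagging: node ids are derived from the node's position in the tree (`r.0.1` and so on), so a node that already exists keeps the same id on the next tick. That is one simple way to give React stable keys, not necessarily what the library does. The chunk boundaries are made up and deliberately fall mid-word.

```typescript
interface StreamNode {
  id: string;
  text: string;
  children: StreamNode[];
}

function parsePartial(source: string): StreamNode[] {
  const roots: StreamNode[] = [];
  const stack: { node: StreamNode; indent: number }[] = [];
  for (const line of source.split("\n")) {
    if (line.trim().length === 0) continue; // indentation that has no text yet
    const indent = line.length - line.replace(/^\s+/, "").length;
    // A bare "-" (bullet arrived, text has not) becomes a node with empty text.
    const text = line.trim().replace(/^[-*+](\s+|$)/, "");
    while (stack.length > 0 && stack[stack.length - 1].indent >= indent) stack.pop();
    const siblings = stack.length === 0 ? roots : stack[stack.length - 1].node.children;
    const parentId = stack.length === 0 ? "r" : stack[stack.length - 1].node.id;
    // Assumed id scheme: parent id plus sibling position, stable across re-parses.
    const node: StreamNode = { id: parentId + "." + siblings.length, text, children: [] };
    siblings.push(node);
    stack.push({ node, indent });
  }
  return roots;
}

// Simulated token stream; a real one would come from an LLM client.
const streamChunks = ["- Pla", "n\n  - Par", "ser\n  - Lay", "out\n"];
let streamBuffer = "";
const snapshots: StreamNode[][] = [];
for (const chunk of streamChunks) {
  streamBuffer += chunk;
  snapshots.push(parsePartial(streamBuffer)); // every prefix parses; nothing throws
}
// snapshots[0][0].text is "Pla"; by snapshots[3] the root is "Plan" with children "Parser" and "Layout".
```

Between ticks only the last node's text changes or one node is appended, which is why the map appears to grow in place.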

The optional built-in AI input bar wraps this and talks to any OpenAI-compatible endpoint, but the streaming behavior isn't special-cased for it — anything that can call setMarkdown with a growing string gets the same effect. The component never had to know an LLM existed. That's the whole point of making text the interface.

One honest caveat: the AI input bar sends the API key straight from the browser. That's fine for local prototyping, but in production you should put a proxy in front of it so the key stays server-side.

6. Layout: from tree to coordinates

This is where the real work hides, because SVG gives you zero automatic layout — you place every pixel yourself.

The layout pass is a recursive walk that assigns each node an (x, y):

  • x comes from depth. Each level steps a fixed distance away from the root.
  • y comes from subtree size. A node is centered against the vertical span of its children, so a parent sits opposite the middle of its branch.

function measureSubtree(node: MindMapData): number {
  if (!node.children?.length || node.collapsed) return NODE_HEIGHT;
  return node.children.reduce((h, c) => h + measureSubtree(c), 0);
}
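Putting the two rules together, a one-sided layout pass might look like the sketch below. It is an illustration rather than the component's real code: `NODE_HEIGHT = 40` and `LEVEL_GAP = 160` are assumed values, `placeSubtree` is a made-up name, and every node is treated as the same height.

```typescript
interface LayoutNode {
  text: string;
  children?: LayoutNode[];
  collapsed?: boolean;
}

interface PlacedNode {
  text: string;
  x: number;
  y: number;
}

const NODE_HEIGHT = 40; // assumed row height in px
const LEVEL_GAP = 160; // assumed horizontal step per depth level

function measureSubtree(node: LayoutNode): number {
  if (!node.children || node.children.length === 0 || node.collapsed) return NODE_HEIGHT;
  let total = 0;
  for (const c of node.children) total += measureSubtree(c);
  return total;
}

// Places `node` in the vertical band starting at `bandTop`, then stacks its children
// top to bottom inside that same band.
function placeSubtree(node: LayoutNode, depth: number, bandTop: number, out: PlacedNode[]): PlacedNode[] {
  const span = measureSubtree(node);
  out.push({ text: node.text, x: depth * LEVEL_GAP, y: bandTop + span / 2 });
  if (node.children && !node.collapsed) {
    let childTop = bandTop;
    for (const c of node.children) {
      placeSubtree(c, depth + 1, childTop, out);
      childTop += measureSubtree(c);
    }
  }
  return out;
}

const layoutSample: LayoutNode = {
  text: "root",
  children: [{ text: "A", children: [{ text: "A1" }, { text: "A2" }] }, { text: "B" }],
};
const placedNodes = placeSubtree(layoutSample, 0, 0, []);
// root (0, 60), A (160, 40), A1 (320, 20), A2 (320, 60), B (160, 100)
```

This version re-measures subtrees as it descends. Caching each node's height in a first pass would avoid the repeated work on deep trees.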

For the default balanced ("both") layout, the root's top-level branches are split between the left and right sides to keep the two halves roughly even in height, then each side lays out independently — left side mirrored. left and right modes just send every branch one way.
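One simple way to get "roughly even" halves is a greedy pass that always hands the next branch to the shorter side. This is a guess at the general technique, not the library's actual algorithm, and `splitBranches` and the `height` field are hypothetical. Greedy assignment does not guarantee the best possible split, only that the two sides never differ by more than the tallest single branch.

```typescript
interface Branch {
  text: string;
  height: number; // measured subtree height of this top-level branch
}

function splitBranches(branches: Branch[]): { left: Branch[]; right: Branch[] } {
  const left: Branch[] = [];
  const right: Branch[] = [];
  let leftHeight = 0;
  let rightHeight = 0;
  for (const b of branches) {
    // Ties go right, so the first branch lands on the right side.
    if (rightHeight <= leftHeight) {
      right.push(b);
      rightHeight += b.height;
    } else {
      left.push(b);
      leftHeight += b.height;
    }
  }
  return { left, right };
}

const sides = splitBranches([
  { text: "A", height: 80 },
  { text: "B", height: 40 },
  { text: "C", height: 40 },
]);
// sides.right holds A (80); sides.left holds B and C (40 + 40)
```

A side effect of this heuristic is that neighbouring branches can end up on opposite sides. If reading order matters more than balance, cutting the list at the point closest to half the total height keeps branches contiguous.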

Each top-level branch also gets a stable branch index, which is how coloring works: the index maps to a CSS variable (--mindmap-branch-0 through --mindmap-branch-9) and lands on the SVG element as data-branch-index, so you can theme an entire branch with one selector.
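The mapping from index to color can be sketched as below. The variable names and the `data-branch-index` attribute come from the paragraph above. Wrapping back to 0 after the tenth branch is an assumption of this sketch, and `branchColor` and `branchAttrs` are hypothetical helper names.

```typescript
const BRANCH_COLOR_COUNT = 10; // --mindmap-branch-0 through --mindmap-branch-9

// Assumption: indexes past 9 wrap around to the start of the palette.
function branchColor(branchIndex: number): string {
  return "var(--mindmap-branch-" + (branchIndex % BRANCH_COLOR_COUNT) + ")";
}

// Attributes a renderer could put on every element that belongs to the branch.
function branchAttrs(branchIndex: number): { "data-branch-index": string; stroke: string } {
  return { "data-branch-index": String(branchIndex), stroke: branchColor(branchIndex) };
}

const thirdBranch = branchAttrs(2);
// thirdBranch: { "data-branch-index": "2", stroke: "var(--mindmap-branch-2)" }
```

With the attribute in place, a selector such as `[data-branch-index="2"]` is enough to restyle that whole branch from a stylesheet.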

Text width is the gnarly bit. SVG won't tell you how wide a string renders until it's in the DOM, so node sizing has to either measure rendered text or approximate it — and getting that wrong means edges that don't quite meet their boxes. It's the least glamorous and most fiddly part of the whole component.
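For the "approximate it" route, a character-class estimate is about the simplest thing that works. The width ratios below are rough guesses for a proportional sans-serif font, not measured values, and `estimateTextWidth` is a hypothetical helper. In a browser the measuring route would instead use something like `getComputedTextLength()` on a rendered SVG text element or canvas `measureText()`.

```typescript
const NARROW_CHARS = "ilI.,:;'|!";
const WIDE_CHARS = "mwMW@";

// Returns an estimated rendered width in px for `text` at `fontSize` px.
function estimateTextWidth(text: string, fontSize: number): number {
  let ems = 0;
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (text.charCodeAt(i) >= 0x2e80) ems += 1.0; // CJK and other full-width glyphs
    else if (NARROW_CHARS.indexOf(ch) >= 0) ems += 0.3;
    else if (WIDE_CHARS.indexOf(ch) >= 0) ems += 0.9;
    else ems += 0.55; // assumed average for everything else
  }
  return ems * fontSize;
}

// Usage: pad the estimate to get a node box width.
const NODE_PADDING = 24; // assumed horizontal padding in px
const boxWidth = estimateTextWidth("Write the layout engine", 14) + NODE_PADDING;
```

An estimate like this is deterministic and works before fonts load, at the cost of being slightly off for any particular font. Erring on the wide side is safer than clipping text.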

7. Rendering: why pure SVG

Everything draws as SVG — no canvas, no external graph/layout library. That's a deliberate tradeoff:

  • Crisp at any zoom. Vectors don't pixelate, so deep zoom stays sharp.
  • Themeable with plain CSS. Nodes, edges, and tags are real elements with semantic classes, so you style them with ~30 CSS variables instead of re-rendering a bitmap.
  • Exportable for free (next section).

Edges are SVG paths — a curve from a parent's anchor point to each child's. Selection, the add-child button, fold toggles, and tags are all just more SVG nodes layered on. The cost you pay for this is that very large trees mean a lot of DOM nodes, so there's a ceiling where canvas would win on raw throughput. For the document-sized maps this is built for, vectors are the right call.
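A parent-to-child edge can be a single cubic Bézier whose control points sit at the horizontal midpoint, so the curve leaves the parent and enters the child horizontally. This is one common choice, sketched with a hypothetical `edgePath` helper. The post doesn't say which curve the component actually draws.

```typescript
interface Point {
  x: number;
  y: number;
}

// Builds the `d` attribute for an SVG <path> from a parent anchor to a child anchor.
function edgePath(from: Point, to: Point): string {
  const midX = (from.x + to.x) / 2;
  return `M ${from.x} ${from.y} C ${midX} ${from.y}, ${midX} ${to.y}, ${to.x} ${to.y}`;
}

const edge = edgePath({ x: 0, y: 60 }, { x: 160, y: 20 });
// edge: "M 0 60 C 80 60, 80 20, 160 20"
```

The same function covers the mirrored left side, because the midpoint is computed from both endpoints rather than assuming the child is to the right.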

8. Export falls out of the rendering model

Because the live view is SVG, exporting it is mostly serialization:

  • SVG export walks the rendered tree and inlines a <style> block with the resolved CSS values, plus the same semantic classes and data-branch-index attributes. The result renders correctly as a standalone file with no external stylesheet.
  • PNG export draws that SVG onto a canvas at high DPI and reads back a blob.
  • Markdown export is just toMarkdownList again — the round-trip from section 4.

const svg = ref.current.exportToSVG();      // string
const png = await ref.current.exportToPNG(); // Promise<Blob>
const md  = ref.current.exportToOutline();   // markdown list

No separate "export renderer" to keep in sync with the on-screen one, because there's only ever one renderer.

9. What's still hard

In the spirit of being honest rather than selling:

  • Text measurement is the perennial source of layout bugs, as mentioned. Web fonts loading late can shift sizing after first paint.
  • Large trees push the DOM-node-count ceiling. Folding helps; virtualization of an SVG tree is a project of its own.
  • Cross-links break the clean tree model on purpose — they're a second edge layer drawn between arbitrary anchors, and routing them through a dense map without crossings is genuinely unsolved here.
  • Browser-side API keys for the AI bar need a proxy in real deployments.

None of these are unique to this project; they're the standard taxes you pay for "render a graph in the browser, from text, live."

Takeaways

If there's one transferable idea here, it's this: pick a plain-text representation as your source of truth, and a surprising number of features stop being features and start being consequences. Diffability, version control, programmatic generation, lossless editing, and incremental/streaming rendering all fell out of "it's just a Markdown list" — I didn't build them one by one.

The code is open source (github.com/u14app/mindmap, Apache-2.0) and on npm as @xiangfa/mindmap. I'd genuinely like feedback on the syntax design and the parsing/streaming approach — and if you spot a smarter way to handle the text-measurement problem, please tell me.
