# From O(n²) to O(n): Building a Streaming Markdown Renderer for the AI Era
If you've built an AI chat application, you've probably noticed something frustrating: the longer the conversation gets, the slower the rendering becomes.
The reason is simple — every time the AI outputs a new token, traditional markdown parsers re-parse the entire document from scratch. This is a fundamental architectural problem, and it only gets worse as AI outputs get longer.
We built Incremark to fix this.
## The Uncomfortable Truth About AI in 2025
If you've been following AI trends, you know the numbers are getting crazy:
- 2022: GPT-3.5 responses? A few hundred words, no big deal
- 2023: GPT-4 cranks it up to 2,000-4,000 words
- 2024-2025: Reasoning models (o1, DeepSeek R1) are outputting 10,000+ word "thinking processes"
We're moving from 4K token conversations to 32K, even 128K. And here's the thing nobody talks about: rendering 500 words and rendering 50,000 words of Markdown are completely different engineering problems.
Most markdown libraries? They were built for blog posts. Not for AI that thinks out loud.
## Why Your Markdown Parser Is Lying to You
Here's what happens under the hood when you stream AI output through a traditional parser:
```text
Chunk 1:   Parse 100 chars ✓
Chunk 2:   Parse 200 chars (100 old + 100 new)
Chunk 3:   Parse 300 chars (200 old + 100 new)
...
Chunk 100: Parse 10,000 chars 😰
```
Total work: 100 + 200 + 300 + ... + 10,000 = 505,000 character operations.
That's O(n²). The cost doesn't just grow — it explodes.
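A ten-line toy script makes the quadratic blow-up visible (the `parseAll` stub here just counts characters instead of actually parsing):

```ts
// Toy measurement of the re-parse cost. parseAll stands in for any
// traditional parser that must be handed the whole document each time.
let charsParsed = 0
const parseAll = (doc: string) => { charsParsed += doc.length /* ...real parsing here... */ }

let doc = ''
for (let i = 0; i < 100; i++) {
  doc += 'x'.repeat(100) // a 100-character chunk arrives
  parseAll(doc)          // the entire document is re-parsed
}
console.log(charsParsed) // 505,000 for a document of only 10,000 characters
```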
For a ~20 KB AI response, this means:
- ant-design-x: 1,657 ms parsing time
- markstream-vue: 5,755 ms (almost 6 seconds of parsing!)
And these are popular, well-maintained libraries. The problem isn't bad code — it's the wrong architecture.
## The Key Insight
Here's the thing:
Once a markdown block is "complete", it will never change.
Think about it. When the AI outputs:
```md
# Heading

This is a paragraph.

```
After that second blank line, the paragraph is done. Locked in. No matter what comes next — code blocks, lists, more paragraphs — that paragraph will never be touched again.
So why are we re-parsing it 500 times?
## How Incremark Actually Works
We built Incremark around this insight. The core algorithm:
- Detect stable boundaries — blank lines, new headings, fence closings
- Cache completed blocks — never touch them again
- Only re-parse the pending block — the one still receiving input
```text
Chunk 1:   Parse 100 chars → cache stable blocks
Chunk 2:   Parse only ~100 new chars
Chunk 3:   Parse only ~100 new chars
...
Chunk 100: Parse only ~100 new chars
```
Total work: 100 × 100 = 10,000 character operations.
That's roughly 50x less work for this stream. Each character is parsed at most once. That's O(n).
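Here's a minimal sketch of that caching idea in TypeScript. It is deliberately simplified: it treats a blank line as the only stable boundary and takes a generic `parseBlock` function, whereas real boundary detection also has to track open code fences, lists, and headings.

```ts
interface Block { raw: string; html: string }

class IncrementalParser {
  private stable: Block[] = []  // completed blocks: parsed once, never again
  private pending = ''          // the one block still receiving input

  constructor(private parseBlock: (src: string) => string) {}

  write(chunk: string): Block[] {
    this.pending += chunk
    // A blank line is a stable boundary: everything before the last one is final.
    const cut = this.pending.lastIndexOf('\n\n')
    if (cut !== -1) {
      for (const raw of this.pending.slice(0, cut).split('\n\n')) {
        if (raw.trim()) this.stable.push({ raw, html: this.parseBlock(raw) })
      }
      this.pending = this.pending.slice(cut + 2)
    }
    // Only the pending tail is re-parsed on each chunk.
    const tail = this.pending.trim()
      ? [{ raw: this.pending, html: this.parseBlock(this.pending) }]
      : []
    return [...this.stable, ...tail]
  }
}
```

Every call to `write` returns the full block list for rendering, but only the pending tail ever pays for re-parsing.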
## Complete Benchmark Data
We benchmarked 38 real markdown files — AI conversations, docs, code analysis reports. Not synthetic test data. Total: 6,484 lines, 128.55 KB.
Here's an excerpt of the results, plus the totals across all 38 files:
| File | Lines | Size | Incremark | Streamdown | markstream-vue | ant-design-x |
|---|---|---|---|---|---|---|
| test-footnotes-simple.md | 15 | 0.09 KB | 0.3 ms | 0.0 ms | 1.4 ms | 0.2 ms |
| simple-paragraphs.md | 16 | 0.41 KB | 0.9 ms | 0.9 ms | 5.9 ms | 1.0 ms |
| introduction.md | 34 | 1.57 KB | 5.6 ms | 12.6 ms | 75.6 ms | 12.8 ms |
| footnotes.md | 52 | 0.94 KB | 1.7 ms | 0.2 ms | 10.6 ms | 1.9 ms |
| concepts.md | 91 | 4.29 KB | 12.0 ms | 50.5 ms | 381.9 ms | 53.6 ms |
| comparison.md | 109 | 5.39 KB | 20.5 ms | 74.0 ms | 552.2 ms | 85.2 ms |
| complex-html-examples.md | 147 | 3.99 KB | 9.0 ms | 58.8 ms | 279.3 ms | 57.2 ms |
| FOOTNOTE_FIX_SUMMARY.md | 236 | 3.93 KB | 22.7 ms | 0.5 ms | 535.0 ms | 120.8 ms |
| OPTIMIZATION_SUMMARY.md | 391 | 6.24 KB | 19.1 ms | 208.4 ms | 980.6 ms | 217.8 ms |
| BLOCK_TRANSFORMER_ANALYSIS.md | 489 | 9.24 KB | 75.7 ms | 574.3 ms | 1984.1 ms | 619.9 ms |
| test-md-01.md | 916 | 17.67 KB | 87.7 ms | 1441.1 ms | 5754.7 ms | 1656.9 ms |
| Total (38 files) | 6484 | 128.55 KB | 519.4 ms | 3190.3 ms | 14683.9 ms | 3728.6 ms |
## Being Honest: Where We're Slower
You'll notice something weird in the data. For footnotes.md and FOOTNOTE_FIX_SUMMARY.md, Streamdown appears much faster:
| File | Incremark | Streamdown | Why? |
|---|---|---|---|
| footnotes.md | 1.7 ms | 0.2 ms | Streamdown doesn't support footnotes |
| FOOTNOTE_FIX_SUMMARY.md | 22.7 ms | 0.5 ms | Same — it just skips them |
This isn't a performance issue — it's a feature difference.
When Streamdown encounters `[^1]` footnote syntax, it simply ignores it. Incremark fully implements footnotes, and we had to solve a tricky streaming-specific problem along the way:
In streaming scenarios, references often arrive before definitions:
Chunk 1: "See footnote[^1] for details..." // reference arrives first
Chunk 2: "More content..."
Chunk 3: "[^1]: This is the definition" // definition arrives later
Traditional parsers assume you have the complete document. We built "optimistic references" that gracefully handle incomplete links/images during streaming, then resolve them when definitions arrive.
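The shape of that mechanism, sketched in TypeScript (names and data structures are illustrative, not Incremark's internals): a reference renders immediately as an unresolved marker and gets patched the moment the matching definition streams in.

```ts
// Illustrative sketch only: not Incremark's actual internals.
type FootnoteRef = { id: string; resolved: boolean }

const refs = new Map<string, FootnoteRef[]>() // references seen so far
const defs = new Map<string, string>()        // definitions seen so far

// A reference renders immediately, optimistically unresolved...
function onReference(id: string): FootnoteRef {
  const ref: FootnoteRef = { id, resolved: defs.has(id) }
  refs.set(id, [...(refs.get(id) ?? []), ref])
  return ref
}

// ...and is patched as soon as its definition arrives.
function onDefinition(id: string, text: string): void {
  defs.set(id, text)
  for (const ref of refs.get(id) ?? []) ref.resolved = true
}
```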
We chose to fully implement footnotes, math blocks (`$...$`), and custom containers (`:::tip`) because that's what real AI content needs.
## Where We Actually Shine
Excluding footnote files, look at standard markdown performance:
| File | Lines | Incremark | Streamdown | Advantage |
|---|---|---|---|---|
| concepts.md | 91 | 12.0 ms | 50.5 ms | 4.2x |
| comparison.md | 109 | 20.5 ms | 74.0 ms | 3.6x |
| complex-html-examples.md | 147 | 9.0 ms | 58.8 ms | 6.5x |
| OPTIMIZATION_SUMMARY.md | 391 | 19.1 ms | 208.4 ms | 10.9x |
| test-md-01.md | 916 | 87.7 ms | 1441.1 ms | 16.4x |
The pattern is clear: the larger the document, the bigger our advantage.
For the largest file (17.67 KB):
- Incremark: 88 ms
- ant-design-x: 1,657 ms (18.9x slower)
- markstream-vue: 5,755 ms (65.6x slower)
## Why Such a Huge Gap?
This is O(n) vs O(n²) in action.
Traditional parsers re-parse the entire document on every chunk, so a 100-chunk stream costs 100 + 200 + ... + 10,000 = 505,000 character operations. Incremark only processes the new content: 100 × 100 = 10,000 operations for the same stream. That's roughly a 50x gap on a 10 KB document, and because one cost curve is quadratic and the other linear, the gap keeps widening as documents grow.
## When to Use Incremark
✅ Use Incremark for:
- AI chat with streaming output (Claude, ChatGPT, etc.)
- Long-form AI content (reasoning models, code generation)
- Real-time markdown editors
- Content requiring footnotes, math, or custom containers
- 100K+ token conversations
⚠️ Consider alternatives for:
- One-time static markdown rendering (just use marked directly; see the snippet after this list)
- Very small files (<500 characters) — the overhead isn't worth it
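For that static case, marked's standard `parse` API is all you need:

```ts
// One-shot rendering: parse once, no streaming or caching involved.
import { marked } from 'marked'

const html = marked.parse('# Hello\n\nJust render it once.')
```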
## Two Engines, One Goal
Marked or Micromark? Both have tradeoffs.
Marked is blazing fast but lacks advanced features. Micromark is spec-perfect but heavier.
Our answer: support both.
| Engine | Speed | Best For |
|---|---|---|
| Marked (default) | ⚡⚡⚡⚡⚡ | Real-time streaming, AI chat |
| Micromark | ⚡⚡⚡ | Complex docs, strict CommonMark |
We extended Marked with custom tokenizers for footnotes, math, and containers. If you hit edge cases Marked can't handle, switch to Micromark with one config change.
Both engines produce identical mdast output. Your rendering code doesn't care which one is running.
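For illustration, a one-line switch might look like the following; note that `engine` is an assumed option name for this sketch, not confirmed from Incremark's docs:

```ts
// 'engine' is an assumed option name; check the Incremark docs
// for the real configuration key.
const incremarkOptions = {
  gfm: true,
  math: true,
  engine: 'micromark', // assumed; the default engine is Marked
}
```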
## The Typewriter Problem Nobody Talks About
You know that smooth "typing" effect ChatGPT has? Most implementations do this:
```js
displayText = fullText.slice(0, currentIndex)
```
This breaks markdown constantly. You get half-rendered `**bold**` markers, flickering code blocks, syntax that looks drunk.
We moved the animation to the AST level. Our BlockTransformer knows the structure — it animates within nodes, never across them. Result: buttery smooth typing that respects markdown semantics.
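To make that concrete, here's a toy version of the idea in TypeScript (a sketch of AST-level truncation, not Incremark's actual BlockTransformer):

```ts
// Reveal up to `budget.left` characters, but only inside text nodes,
// so inline markers like ** or ` are never split mid-token.
type MdNode = { type: string; value?: string; children?: MdNode[] }

function reveal(node: MdNode, budget: { left: number }): MdNode | null {
  if (node.type === 'text' && node.value != null) {
    if (budget.left <= 0) return null
    const shown = node.value.slice(0, budget.left)
    budget.left -= node.value.length
    return { ...node, value: shown }
  }
  if (node.children) {
    const children = node.children
      .map((child) => reveal(child, budget))
      .filter((child): child is MdNode => child !== null)
    return { ...node, children }
  }
  // Non-text leaves (e.g. thematicBreak) pass through while budget remains.
  return budget.left > 0 ? node : null
}

// Rendering reveal(tree, { left: n }) for a growing n produces a typing
// effect that never emits half-parsed markdown.
```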
## Try It Yourself
```bash
npm install @incremark/vue  # or react, or svelte
```

```vue
<script setup>
import { ref } from 'vue'
import { IncremarkContent } from '@incremark/vue'

const content = ref('')
const isFinished = ref(false)

async function handleStream(stream) {
  for await (const chunk of stream) {
    content.value += chunk
  }
  isFinished.value = true
}
</script>

<template>
  <IncremarkContent
    :content="content"
    :is-finished="isFinished"
    :incremark-options="{ gfm: true, math: true }"
  />
</template>
```
We support Vue 3, React 18, and Svelte 5 with identical APIs. One core, three frameworks, zero behavior differences.
## What's Next
This is version 0.3.0. We're just getting started.
The AI world is moving toward longer outputs, more complex reasoning traces, and richer formatting. Traditional parsers can't keep up — their O(n²) architecture guarantees it.
We built Incremark because we needed it. Hopefully you find it useful too.
📚 Docs: incremark.com
💻 GitHub: kingshuaishuai/incremark
🎮 Live Demos: Vue | React | Svelte
If this saved you debugging time, a ⭐️ on GitHub would mean a lot. Questions? Open an issue or drop a comment below.