mizchi (Kotaro Chikuba)

Building an Incremental Markdown Compiler in Moonbit

I've implemented a Markdown Editor for about the fifth time in my life.

You can try it here.

https://markdown.mizchi.workers.dev/

Whether it's truly the fastest on Earth is hard to say, since performance has many axes, but at least from the perspective of incremental compilation in an editor, I don't think it loses to anything else. (Benchmarks appear later in this article.)

If I may say so myself, it's incredibly fast. Since it uses differential compilation, it can handle large amounts of text, and the automatic synchronization between the preview and the editor is convenient. Even when I threw 20,000 characters at it, there was no display lag and it maintained 60fps.

It's a pure Moonbit implementation without FFI, so it can be used as a library in JS/WASM/native environments.

So I'm writing this article to show everyone the potential of a pure Moonbit implementation. The JS build has also been published to npm as @mizchi/markdown, which is what the playground above uses.

https://github.com/mizchi/markdown.mbt

https://www.npmjs.com/package/@mizchi/markdown

However, I should mention upfront that compliance with CommonMark (the de facto Markdown standard and its test suite) is a modest 207/542. The reason is that I skipped parts that were difficult to optimize for in benchmarks.

That said, the failing cases are mostly nested patterns like *foo [bar](/url)*\n or _____foo_____\n, which I judged to be unproblematic for practical use. Instead, I've added support for GitHub Flavored Markdown extensions like - [x] task lists, tables, and footnotes, which are in higher demand anyway.

The Basic Idea: Building a CST Based on Cursor Position

One of the problems with Markdown editors (and text editors in general) is that the entire text needs to be reparsed on every user input. To maintain 60fps you need to parse the entire text and reflect the result in under 16ms; for 120fps it needs to happen in under 8ms.

With a naive implementation, performance degrades linearly with text size. So what do we do?

First, I decided to adopt a CST (Concrete Syntax Tree). This is the data structure used by Tree-sitter and others; unlike an AST, it preserves the original input without losing any of it.

https://docs.rs/cstree/latest/cstree/
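
To make this concrete, here's a minimal sketch of a span-preserving CST in Moonbit. The types are illustrative, not the library's actual ones: every node records the range of source text it covers, so the original input can always be read back out of the tree.

// Minimal CST sketch (illustrative, not the library's actual types)
struct Span {
  start : Int
  end : Int
}

enum Node {
  // A block node: kind, children, and the source span it covers
  Block(String, Array[Node], Span)
  // A leaf node: just the span of the raw text it covers
  Leaf(Span)
}

// Recovering source text is just reading spans back out of the tree.
fn to_source(node : Node, src : String) -> String {
  match node {
    Leaf(span) => src.substring(start=span.start, end=span.end)
    Block(_, children, _) => {
      let buf = StringBuilder::new()
      for child in children {
        buf.write_string(to_source(child, src))
      }
      buf.to_string()
    }
  }
}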

As for the idea of how to handle differential updates, this article was extremely helpful.

https://josephg.com/blog/crdts-go-brrr/

(This article is really interesting, so please read it)

Here, instead of thinking in terms of programming language parsers, let's think from the perspective of CRDTs for collaborative editing. Generally, simultaneous editing like in Google Docs uses a data structure/algorithm called a CRDT (Conflict-free Replicated Data Type).

https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type

Essentially, it's a structure where change patches are sent back and forth and continuously merged, like a Git branch that can always be fast-forward merged. This allows multiple people to edit simultaneously without locking, even across unsynchronized environments.

The representative JS implementation is yjs, and the CRDTs Go Brrr article is about making a Rust implementation 5000x faster.

I'll skip the detailed techniques, but the content is roughly:

  • yjs creates a lot of small objects on each update, causing memory fragmentation in JS (V8). This was addressed by laying the data out in more efficient structures
    • It's not impossible to do that with ArrayBuffer (Uint8Array) in JS, but it would probably be difficult
  • On input, reparse from the edit point, and rebuild the syntax tree only for the parts immediately before and after the change

Rechecking only the areas immediately before and after the input is quite heuristic, but it's actually a very effective approach.

Setting aside memory fragmentation, rebuilding the Markdown CST in response to changes is the key to this implementation. The flow is:

  1. Identify edit position: EditInfo { offset, old_length, new_length }
  2. Calculate affected range: Identify blocks containing changed lines
  3. Reuse decision: Blocks before the change → reuse as-is
  4. Partial parse: Reparse only blocks containing changes
  5. Merge: Reused blocks + new blocks + subsequent blocks (span adjustment)
// Calling incremental parse
let edit = EditInfo::replace(change_start, old_len, new_len)
let new_doc = parse_incremental(old_doc, old_text, new_text, edit)
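
To sketch how steps 3-5 fit together, here's a simplified Moonbit version of the reuse/reparse/merge logic, reusing the Span type from the CST sketch earlier. EditInfo follows the shape described above, but Block and parse_blocks are hypothetical stand-ins for illustration; the real implementation in markdown.mbt handles far more edge cases.

// Simplified sketch of reuse + partial parse + merge (hypothetical API)
struct EditInfo {
  offset : Int
  old_length : Int
  new_length : Int
}

struct Block {
  span : Span // source range this block covers
}

fn parse_blocks(text : String, start : Int, end : Int) -> Array[Block] {
  // Full block parser over [start, end) of the text; details omitted.
  ...
}

fn merge_blocks(
  old_blocks : Array[Block],
  new_text : String,
  edit : EditInfo
) -> Array[Block] {
  let delta = edit.new_length - edit.old_length
  let edit_end = edit.offset + edit.old_length
  let result : Array[Block] = []
  let mut dirty_start = 0
  let mut dirty_end = new_text.length()
  for b in old_blocks {
    if b.span.end < edit.offset {
      // Block lies entirely before the edit: reuse it as-is.
      result.push(b)
      dirty_start = b.span.end
    } else if b.span.start > edit_end && b.span.start + delta < dirty_end {
      // First block entirely after the edit: the dirty region ends here.
      dirty_end = b.span.start + delta
    }
  }
  // Reparse only the dirty region between the reused prefix and suffix.
  for b in parse_blocks(new_text, dirty_start, dirty_end) {
    result.push(b)
  }
  for b in old_blocks {
    if b.span.start > edit_end {
      // Blocks after the edit are reused with spans shifted by delta.
      result.push({ span: { start: b.span.start + delta, end: b.span.end + delta } })
    }
  }
  result
}

A linear scan over old_blocks is enough for a sketch; in practice you'd want to locate the dirty blocks by binary search over spans and avoid eagerly shifting every suffix span, so that the per-edit cost stays roughly constant as documents grow.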

Implementing This

So, I built it.

https://github.com/mizchi/markdown.mbt

The actual implementation process was:

  • Discuss with ChatGPT 5.2 to do preliminary research on algorithms and confirm my understanding
  • Expand the tests from https://spec.commonmark.org/ into test cases with a script (they become plain test blocks, as sketched after this list)
  • Create a mechanism to give the same input as remark as a reference implementation and check if the output matches
  • Have Claude Code read the CRDTs Go Brrr article as a reference and verify the algorithm's effectiveness with benchmarks on simple test cases
  • mizchi: "Do it" Claude Code: "Yes"
  • Occasionally check failing test cases and prioritize fixes
  • Periodically take benchmarks to confirm nothing has gotten slower
  • Let it do things to some extent, but keep implementation decisions in my own hands
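
For a sense of what the expanded spec tests look like: each CommonMark example pairs a Markdown input with its expected HTML output, so they map naturally onto plain Moonbit test blocks, roughly like this (render is a hypothetical stand-in for the actual entry point; the input/output pair comes from spec.commonmark.org):

// Hypothetical stand-in for the Markdown-to-HTML entry point
fn render(src : String) -> String {
  ...
}

// One expanded CommonMark spec example as a test block
test "commonmark: ATX heading" {
  assert_eq(render("# foo\n"), "<h1>foo</h1>\n")
}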

Moonbit has a built-in moon bench benchmark command.

https://docs.moonbitlang.com/en/latest/language/benchmarks.html
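
Per the linked docs, a benchmark is just a test block that takes a @bench.T parameter, roughly like this (fib is a placeholder workload, and the exact API may differ by version):

// A benchmark test block; `moon bench` runs these and reports timings
fn fib(n : Int) -> Int {
  if n < 2 { n } else { fib(n - 1) + fib(n - 2) }
}

test "bench fib" (b : @bench.T) {
  // keep() prevents the result from being optimized away
  b.bench(fn() { b.keep(fib(20)) })
}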

Once the compiler was mostly complete, I quickly wired the JS build up with preact + vite, and it was done.

https://markdown.mizchi.workers.dev/

Actual Benchmarks

I prepared small, medium, and large document sizes and compared against cmark (also a Moonbit implementation) and markdown-rs (a Rust implementation).

Comparison with Different Libraries

μs/iter, smaller is better.

Parser           Target   small  medium  large
mizchi/markdown  WASM-GC    125     291   1329
rami3l/cmark     WASM-GC     46     120    416
markdown-rs      Rust       202     621   3674

Unfortunately I couldn't beat cmark, but it's faster than markdown-rs. However, both of those implementations should be CommonMark compliant, so it's frustrating to lose to them while cutting corners. cmark is apparently a port of an OCaml implementation, and it's quite fast.

https://github.com/moonbit-community/cmark.mbt

Comparison by Build Target

Next, let's compare mizchi/markdown and cmark across the different build targets.

Parser              Target   small  medium  large
mizchi/markdown     JS         134     356   1870
mizchi/markdown     WASM-GC    125     291   1329
mizchi/markdown     Native      81     315   1547
rami3l/cmark        JS          79     195    854
rami3l/cmark        WASM-GC     46     120    416
rami3l/cmark        Native      26     105    513
markdown-rs (Rust)  Native     202     621   3674

The trend of mizchi/markdown being 2-3x slower than cmark holds here as well.

Looking at Moonbit backends, Native is faster for small sizes, but the gap with JS narrows as the volume increases. I imagine this is due to V8's JIT.

It's unexpected that wasm comes out faster. I had the impression that wasm is about 1.5-2x slower than native for the same implementation due to memory barriers, but it's possible Moonbit's native backend is just less optimized.

Incremental Parse Benchmarks

This is the main topic. We measure how many μs each parser takes to handle a one-character edit in the editor.

Parser              Method       50 paragraphs  100 paragraphs
mizchi/markdown     Incremental             10              10
mizchi/markdown     Full                   210             386
rami3l/cmark        Full                   146             433
markdown-rs (Rust)  Full                   621            3674

mizchi/markdown's incremental parse completes in about 10μs regardless of document size.

Comparing with a 100-paragraph document:

  • mizchi/markdown (incremental): 10 μs
  • rami3l/cmark (full): 433 μs → 43x slower
  • markdown-rs (full): 3674 μs → 367x slower

cmark is the fastest at full parsing, but for an editor's real-time preview use case, incremental parsing wins overwhelmingly. Well, that's what I built it for, so it's expected...

mizchi/markdown's full parse is somewhat slow, but from the second edit onwards it only calculates differences, so the actual editor experience is the fastest. The effect grows as documents get longer, and incremental parse complexity approaches O(1).

I suspect the overhead of constructing the CST data is why I lose to cmark, but I decided to accept it as a necessary cost.

Once I had tuned it this far, Chrome DevTools benchmarks showed that the browser's renderer, rather than JS, had become the bottleneck. The solution would be implementing a virtual table like react-virtualized, but that's a future task.

Thoughts on Moonbit x AI

I really felt that vibe coding performs best when reference implementations and tests are sufficiently prepared. The CommonMark reference collection was a huge help.

It seems that designing APIs and test cases for each module is becoming what humans focus on.

Even if you can't write the details of an algorithm from scratch, it's really important to know how it should behave, to create mechanisms for getting quantitative numbers through benchmarks, and to use unambiguous metrics like test coverage. With quantitative metrics to aim for, human intervention decreases and bottlenecks shrink. The quality of compiler error hints and the richness of linting matter as well.

AI x Moonbit has insufficient training data, so it repeatedly makes syntax mistakes early on, but since the compiler's error messages are excellent, it gets smarter as the context grows. Moonbit code itself is concise and readable, which makes review easy too.

That said, I feel like I spent more time reviewing test cases in unit tests than reviewing implementations. I have a feeling this is how programming will go in the future.

Moonbit's Cross-Target Performance

Since Moonbit is developed with a focus on wasm-gc, the wasm backend's performance is excellent. On the other hand, I thought native could do a bit better.

Even as a TypeScript replacement, the build size is small enough to be practical. That said, using Map or Double::to_string pulls in built-in library code and inflates the build size, so avoiding that is somewhat tricky.

Bundle size is the most important thing in today's JS frontend, and although I wasn't conscious of it during development, the output came to about 92k minified and 20k gzipped: smaller than remark (112k), larger than marked (42k). It should get even smaller with tree-shaking.

The End

For someone like me, who mainly uses TypeScript and has dabbled a bit in Rust, I think Moonbit is the best language for working across layers.

A pure Moonbit implementation lets you choose the runtime platform, which is great. The problems I had when writing TS, where I wished I could port something to native, have been almost completely solved.

The abstraction level is just right: the lifetimes and low-level binary operations that frequently appear in Rust are hidden, allowing high-level descriptions. That makes it a great fit not just for systems programming but also for the application layer.

However, to speed things up further I'd want multithreading, and it's a shame Moonbit doesn't support it yet. When I asked the author, they said it won't make the stable 1.0 release planned for next year, but they'd like to consider it for 2.0.

So everyone, please write Moonbit. It's the best language.
