DEV Community

JAI

I Eliminated Layout Jitter From LLM Streaming — Here's How

Every AI chat app has the same bug. You've felt it. That stuttering scrollbar, the content jumping, the dropped frames when tokens stream in. I spent weeks building a library that makes it physically impossible.


The Problem Nobody Talks About

Open ChatGPT. Claude. Gemini. Any LLM-powered chat interface.

Now watch the scrollbar while the model streams a response.

See it? That micro-stutter. The scrollbar jumps. The content reflows. If you're on a slower device, you'll see actual frame drops. It's subtle on short responses, but stream 500+ tokens and it becomes infuriating.

Why does this happen?

Every single token that arrives triggers the same cascade:

```
Token arrives → DOM mutation → Style recalculation → Layout reflow → Paint → Composite
```

At 50 tokens/second, that's 50 full layout reflows per second. Each one forces the browser to:

  1. Recalculate every CSS property that could be affected
  2. Recompute the geometry of every element in the render tree
  3. Determine what pixels need repainting
  4. Composite the final frame

On a page with 200 DOM elements, each reflow touches dozens of nodes. The browser's layout engine was never designed for this kind of write-heavy, real-time workload.
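To make the cost concrete, here's a toy model (not ZeroJitter code): assume every DOM write triggers one full layout pass, while writes batched per animation frame trigger at most one pass per 16.67ms frame that received tokens.

```typescript
// Toy model: count layout passes for per-token writes vs frame-batched writes.
function layoutPasses(tokenTimesMs: number[], batchPerFrame: boolean): number {
  if (!batchPerFrame) return tokenTimesMs.length; // one reflow per token
  // One reflow per 60fps frame that saw at least one token.
  const frames = new Set(tokenTimesMs.map((t) => Math.floor(t / 16.67)));
  return frames.size;
}

// 50 tokens arriving in bursts of 10 every 200ms (typical SSE chunking):
const bursts = Array.from({ length: 50 }, (_, i) => Math.floor(i / 10) * 200);
console.log(layoutPasses(bursts, false)); // 50 layout passes
console.log(layoutPasses(bursts, true));  // 5 layout passes
```

The per-token count grows linearly with token rate; the batched count is capped by the frame rate.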

The result: Scrollbar jitter. Content jumping. Dropped frames. A "janky" feeling that makes expensive AI products feel cheap.

The Nuclear Option: Bypass the DOM Entirely

I asked a simple question: What if we never trigger a single layout reflow?

The answer was <canvas>.

Canvas rendering uses fillText(), a direct draw call that bypasses the layout engine entirely. No DOM nodes to measure. No CSS to recalculate. No layout to reflow. Just math → pixels.

But "just use canvas" is like saying "just rewrite everything in Assembly." You lose:

  • Text selection
  • Accessibility (screen readers)
  • Responsive reflow on resize
  • Line breaking
  • International text support (CJK, BiDi, Thai)

So I built ZeroJitter — a React component that gives you all of those back while keeping the canvas performance.

Architecture: How ZeroJitter Works

```
┌─ Main Thread ──────────────────────────────────┐
│                                                │
│  LLM tokens → useZeroJitter hook               │
│                    │                           │
│              postMessage()                     │
│                    ▼                           │
│  ┌─ Web Worker ─────────────────────────┐      │
│  │ Intl.Segmenter → measureText()       │      │
│  │ Line breaking • CJK • BiDi • Emoji   │      │
│  │ Returns: lines[], height, widths     │      │
│  └──────────────────────────────────────┘      │
│                    │                           │
│              onmessage()                       │
│                    ▼                           │
│  CanvasRenderer.paint() → <canvas>             │
│  AccessibilityMirror  → <div aria-live>        │
│                                                │
└────────────────────────────────────────────────┘
```

The Key Insight: Measurement ≠ Rendering

The expensive part of text layout isn't painting pixels — it's measuring text. Every time you add a word, the browser needs to figure out: Does this word fit on the current line? Where does the next line start? How tall is the container now?

ZeroJitter moves ALL of this math to a Web Worker using CanvasRenderingContext2D.measureText(). The worker:

  1. Segments text via Intl.Segmenter (handles CJK per-character breaking, Thai word boundaries, Arabic/Hebrew BiDi)
  2. Measures each segment via an OffscreenCanvas measureText() call
  3. Caches measurements — the word "the" at 16px Inter always has the same width
  4. Performs line breaking with pure arithmetic (~0.0002ms per text block)
  5. Returns line data to the main thread

The main thread then just fillText()s each line at its computed position. Zero layout involvement. Zero reflows. Locked 60fps.
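The main-thread half is correspondingly small. A sketch under stated assumptions (`paintLines` and the `TextPainter` interface are illustrative names, typed against the subset of the canvas 2D context actually used):

```typescript
// The minimal surface of CanvasRenderingContext2D this sketch needs.
// A real renderer would also set ctx.font and ctx.fillStyle first.
interface TextPainter {
  clearRect(x: number, y: number, w: number, h: number): void;
  fillText(text: string, x: number, y: number): void;
}

function paintLines(
  ctx: TextPainter,
  lines: string[],
  lineHeight: number,
  width: number,
): void {
  ctx.clearRect(0, 0, width, lines.length * lineHeight);
  lines.forEach((line, i) => {
    // Baseline position is pure arithmetic; nothing here can reflow the page.
    ctx.fillText(line, 0, (i + 1) * lineHeight);
  });
}
```

Because the worker already computed every line's text and position, this loop is just draw calls.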

The Numbers

| Metric | DOM Rendering | ZeroJitter |
|---|---|---|
| Reflows per token | 1 | 0 |
| Layout time | 0.3-2ms | <0.01ms |
| Frame drops (@ 100 tok/s) | 12-30 | 0 |
| FPS | 45-58 | 60 (locked) |
| Scrollbar stability | Jittery | Rock solid |

Usage

```shell
npm install zero-jitter
```
```tsx
import { useEffect, useRef } from 'react';
import { ZeroJitter, useZeroJitter } from 'zero-jitter';

function StreamingChat() {
  const ref = useRef<HTMLDivElement>(null);
  const { append, clear, layout } = useZeroJitter(ref);

  useEffect(() => {
    const sse = new EventSource('/api/chat');
    sse.onmessage = (e) => append(e.data);
    return () => sse.close();
  }, [append]);

  return (
    <ZeroJitter
      ref={ref}
      font="16px Inter"
      maxHeight={400}
      color="#e2e8f0"
    />
  );
}
```

That's it. Drop-in replacement. Your streaming goes from janky to buttery.

What Makes This Different

Not "just a canvas text renderer"

There are canvas text libraries. ZeroJitter is specifically engineered for streaming:

  • Token coalescing: Multiple tokens arriving in the same frame are batched into one worker message via requestAnimationFrame
  • Stale response discarding: Monotonic request IDs ensure out-of-order worker responses don't cause glitches
  • Incremental layout: Only remeasures changed text, not the entire document
  • Viewport culling: O(log n) binary search — only visible lines are painted, even for 10,000-line documents
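The first two bullets can be sketched together. This is an illustrative reconstruction, not ZeroJitter's source: `TokenCoalescer` is a hypothetical name, and the scheduler is injected (the real component would pass `requestAnimationFrame`) so the batching logic runs outside a browser.

```typescript
class TokenCoalescer {
  private pending: string[] = [];
  private scheduled = false;
  private requestId = 0;
  private latestApplied = 0;

  constructor(
    private post: (id: number, batch: string) => void, // worker.postMessage
    private schedule: (cb: () => void) => void,        // requestAnimationFrame
  ) {}

  append(token: string): void {
    this.pending.push(token);
    if (!this.scheduled) {
      this.scheduled = true;
      this.schedule(() => this.flush()); // at most one worker message per frame
    }
  }

  private flush(): void {
    this.scheduled = false;
    this.post(++this.requestId, this.pending.join(''));
    this.pending = [];
  }

  // Called with the id echoed back by the worker; drops out-of-order replies.
  accept(id: number): boolean {
    if (id <= this.latestApplied) return false; // stale — discard
    this.latestApplied = id;
    return true;
  }
}
```

However many tokens land between frames, the worker sees one message, and a slow reply that arrives after a newer one is simply ignored.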

Full accessibility

A visually-hidden <div aria-live="polite"> mirrors the canvas text with a 300ms debounce during streaming. Screen readers announce updates without being flooded by individual tokens.
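The debounce is the interesting part; here is a minimal sketch of it. `createMirror` is a hypothetical name, and the timer functions are injected for testability; the real component would pass `setTimeout`/`clearTimeout` and write into the visually-hidden `<div aria-live="polite">`.

```typescript
function createMirror(
  apply: (text: string) => void, // e.g. div.textContent = text
  timers: { set: (cb: () => void, ms: number) => unknown; clear: (h: unknown) => void },
  delayMs = 300,
) {
  let buffer = '';
  let handle: unknown;
  return (token: string) => {
    buffer += token;
    if (handle !== undefined) timers.clear(handle); // restart the debounce
    // Screen readers get one announcement per quiet period, not one per token.
    handle = timers.set(() => apply(buffer), delayMs);
  };
}
```

Each incoming token resets the timer, so the mirror only updates once the stream pauses for `delayMs`.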

Zero dependencies

The entire text layout engine (based on pretext) is vendored into the library. No external runtime dependencies. Just React as a peer dep.

International text

Built on Intl.Segmenter with full support for:

  • CJK (Chinese, Japanese, Korean) — per-character line breaking with kinsoku rules
  • Arabic/Hebrew — BiDi text with correct segment ordering
  • Thai — proper word segmentation (Thai has no spaces!)
  • Emoji — corrects Chrome/Firefox canvas emoji width inflation

Live Demo

See for yourself: altrusian.com/zero-jitter

The demo streams the same text into both a standard DOM element and a ZeroJitter canvas side-by-side, with real-time metrics. Crank the speed to 150 tok/s and watch the DOM panel fall apart while the canvas stays rock solid.

The Deeper Problem

Layout thrashing isn't a "nice to fix" — it's a trust destroyer.

When users interact with an AI chat app, the streaming response is the primary interface. If that interface stutters, users subconsciously associate the jank with the AI itself. "Is it thinking? Did it freeze? Is something wrong?"

Smooth streaming = perceived intelligence.

Every major AI company is going to need to solve this as models get faster. GPT-4o streams at ~100 tokens/second. The next generation will be 200+. DOM rendering will break completely at those speeds.

ZeroJitter is open source, MIT licensed, and ready for production.


TL;DR: I built a React library that renders streaming LLM text on <canvas> instead of the DOM. Zero layout reflows, locked 60fps, full accessibility, zero dependencies. The scrollbar will never jitter again.
