Utku Akyüz

Virtualizing Huge JSON Files in React: The Technical Breakdown

In my previous article, I detailed the journey that led me to combine the strong diffing logic of json-diff-kit with a completely custom rendering solution. The goal was to build a component that could handle millions of lines of JSON diff data and still provide a responsive, intuitive UX. None of the existing tools I tried delivered that.

This article breaks down the three core engineering challenges I overcame to build virtual-react-json-diff.

Challenge 1: High-Performance Rendering (Goodbye Table, Hello Grid)

The first step was abandoning the rigid <table> structure, which makes both styling and virtualization extremely complex.

I replaced the table with a CSS Grid-based diff layout. This allows fluid layouts and avoids the layout overhead of traditional table rendering.

Here is the essential structure for each row component:

<div
  style={{
    display: "grid",
    gridTemplateColumns: "30px 1fr 30px 1fr",
  }}
>
  <div className="cell line-number">{leftPart.lineNumber}</div>
  <div className="cell">{leftPart.text}</div>
  <div className="cell line-number">{rightPart.lineNumber}</div>
  <div className="cell">{rightPart.text}</div>
</div>

The 30px 1fr 30px 1fr template creates four columns (left line number, left content, right line number, right content), eliminating table layout overhead and simplifying the virtualization process via react-window.

Challenge 2: Virtualization vs. Tree Structures (The Conflict)

This was the most complex hurdle. Virtualization libraries like react-window require stability: they perform best when row heights are known and fixed. However, a JSON viewer must let users expand and collapse nested objects at any time.

Every time a user collapses a 5,000-line object into a single line, the virtualizer's internal calculations for total scroll height and item positioning break.
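
To make the conflict concrete, here is a minimal sketch of the arithmetic a fixed-height virtualizer depends on. It is not react-window's actual code: `rowWindow`, the 20px row height, the 600px viewport, and the `overscan` parameter are all assumptions chosen for illustration. The point is that both the total height and the meaning of each index change as soon as the row count changes.

```typescript
// Hypothetical helper, not part of react-window or virtual-react-json-diff.
interface RowWindow {
  start: number;       // first row index to render (inclusive)
  end: number;         // row index after the last rendered row (exclusive)
  totalHeight: number; // height of the scrollable content in px
}

function rowWindow(
  rowCount: number,
  rowHeight: number,
  scrollTop: number,
  viewportHeight: number,
  overscan = 2,
): RowWindow {
  const totalHeight = rowCount * rowHeight;
  // Clamp scrollTop: after a collapse it can point past the new end of the content.
  const maxScroll = Math.max(0, totalHeight - viewportHeight);
  const clampedTop = Math.min(Math.max(0, scrollTop), maxScroll);
  const first = Math.floor(clampedTop / rowHeight);
  const last = Math.ceil((clampedTop + viewportHeight) / rowHeight);
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(rowCount, last + overscan),
    totalHeight,
  };
}

// 50,000 rows of 20px each, scrolled to row 10,000 in a 600px viewport.
const beforeCollapse = rowWindow(50000, 20, 200000, 600);
// The same scroll offset after a 5,000-line object collapses into one row.
const afterCollapse = rowWindow(45001, 20, 200000, 600);
```

In this example the rendered index range is the same before and after, but the rows behind those indices are different and the scrollable height has shrunk by almost 100,000px, so any cached offsets are stale.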

The Dynamic Segment-Based Solution

Since I couldn't find a library that offered both deep tree navigation and high-performance virtualization, I engineered a solution myself.

I built a dynamic registry using a segment-based rendering system that separates the raw diff data from the view representation.

Key Data Structures:

  1. Raw diff data: Complete diff results from json-diff-kit.

  2. Segments: Logical groupings that determine what content is currently visible/collapsed.

  3. View arrays: Flattened arrays fed to the virtualizer.

Equal lines are grouped into collapsible segments:


// Part of the core logic to group non-changed lines
export function generateSegments(diff: DiffResult[]): SegmentItem[] {
  const segments: SegmentItem[] = [];
  let i = 0;
  while (i < diff.length) {
    if (isNonEmptyEqualLine(diff[i])) {
      let j = i;
      while (j < diff.length && isNonEmptyEqualLine(diff[j])) j++;
      segments.push({ start: i, end: j, isEqual: true });
      i = j;
    } else {
      segments.push({ start: i, end: i + 1, isEqual: false });
      i++;
    }
  }
  return segments;
}

When a user expands a collapsed segment, the system updates the segments and forces the virtualizer to recalculate:


useEffect(() => {
  const leftBuilt = buildViewFromSegments(segments, rawLeftDiff);
  const rightBuilt = buildViewFromSegments(segments, rawRightDiff);
  setLeftView(leftBuilt);
  setRightView(rightBuilt);
  
  // Critical: clear react-window's cached row offsets from index 0 and
  // re-render immediately (second argument) to avoid visual jitter
  listRef.current?.resetAfterIndex(0, true);
}, [segments, rawLeftDiff, rawRightDiff]);

This dynamic registry lets the viewer feel like a lightweight text editor even while it manages millions of hidden nodes, and it keeps scrolling at the 60 FPS I was aiming for.
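
The body of `buildViewFromSegments` is not shown in this article, so the following is only a sketch of what such a flattening step could look like, not the package's real implementation. Every name here (`DiffLine`, `CollapsibleSegment`, `ViewRow`, `buildViewSketch`, `toggleSegment`) and the `collapsed` flag are assumptions. The idea is that a collapsed equal segment contributes one placeholder row to the view array, while everything else contributes one row per raw line.

```typescript
// All names below are illustrative assumptions, not the package's real types.
interface DiffLine {
  type: "equal" | "add" | "remove" | "modify";
  text: string;
}

interface CollapsibleSegment {
  start: number;       // first raw line index (inclusive)
  end: number;         // raw line index after the segment (exclusive)
  isEqual: boolean;
  collapsed?: boolean; // only meaningful for equal segments
}

type ViewRow =
  | { kind: "line"; index: number; line: DiffLine }
  | { kind: "collapsed"; segment: number; hidden: number };

// Flatten segments into the array a virtualizer would receive.
function buildViewSketch(segments: CollapsibleSegment[], raw: DiffLine[]): ViewRow[] {
  const view: ViewRow[] = [];
  segments.forEach((seg, s) => {
    if (seg.isEqual && seg.collapsed) {
      // One placeholder row stands in for all hidden lines.
      view.push({ kind: "collapsed", segment: s, hidden: seg.end - seg.start });
      return;
    }
    for (let i = seg.start; i < seg.end; i++) {
      view.push({ kind: "line", index: i, line: raw[i] });
    }
  });
  return view;
}

// Return a new segment list with one equal segment expanded or collapsed.
function toggleSegment(segments: CollapsibleSegment[], s: number): CollapsibleSegment[] {
  return segments.map((seg, i) =>
    i === s && seg.isEqual ? { ...seg, collapsed: !seg.collapsed } : seg,
  );
}

const rawLines: DiffLine[] = [
  { type: "equal", text: "{" },
  { type: "equal", text: '  "id": 1,' },
  { type: "equal", text: '  "name": "Ada",' },
  { type: "add", text: '  "role": "admin",' },
  { type: "equal", text: '  "active": true' },
  { type: "equal", text: "}" },
];
const segmentsDemo: CollapsibleSegment[] = [
  { start: 0, end: 3, isEqual: true, collapsed: true },
  { start: 3, end: 4, isEqual: false },
  { start: 4, end: 6, isEqual: true, collapsed: false },
];

const collapsedView = buildViewSketch(segmentsDemo, rawLines);
const expandedView = buildViewSketch(toggleSegment(segmentsDemo, 0), rawLines);
```

Because `toggleSegment` returns a new array, a React effect that depends on `segments` would re-run and rebuild the view, which matches the flow described above.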

Challenge 3: Navigation and Search (Don't Get Lost)

Navigating a 50,000-line diff requires tools beyond a standard scrollbar.

The Dual-Minimap

I implemented a dual-minimap system (one for each object) that highlights the changes and synchronizes precisely with the main scrollbar.

The Canvas Optimization: The minimap uses an offscreen canvas for efficient rendering. The diff visualization (coloring the lines) is pre-rendered once (the expensive part). During scrolling, only the lightweight scroll viewport box is composited on top, ensuring smooth performance.


// Pre-render diff on offscreen canvas (only when diff changes)
const diffCanvas = useMemo(() => {
  // ... logic to draw all colored diff lines on canvas
  // The color for each line is based on its type (add, remove, change)
  return offscreen;
}, [leftDiff, rightDiff, height, totalLines]);

// Draw scroll viewport (on every scroll)
const drawScrollBox = useCallback((ctx: CanvasRenderingContext2D) => {
  ctx.clearRect(0, 0, miniMapWidth, height);
  ctx.drawImage(diffCanvas, 0, 0); // Copy pre-rendered diff

  const totalContentHeight = totalLines * ROW_HEIGHT;
  const viewportTop = (currentScrollTop / totalContentHeight) * height;
  ctx.fillStyle = MINIMAP_SCROLL_COLOR;
  ctx.fillRect(0, viewportTop, miniMapWidth, viewportHeight);
}, [diffCanvas, currentScrollTop, totalLines, height, miniMapWidth, viewportHeight]);
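
The excerpt above uses `viewportHeight` without showing where it comes from. As a hedged sketch, the scroll box geometry could be derived like this; `minimapScrollBox`, `listHeight`, and the 8px minimum box height are assumptions for illustration and not taken from the package.

```typescript
// Hypothetical helper; names and the minimum box height are assumptions.
interface MinimapBox {
  top: number;    // y offset of the scroll box on the minimap, in px
  height: number; // height of the scroll box on the minimap, in px
}

function minimapScrollBox(
  scrollTop: number,
  totalLines: number,
  rowHeight: number,
  listHeight: number,    // visible height of the virtualized list, in px
  minimapHeight: number, // height of the minimap canvas, in px
  minBoxHeight = 8,
): MinimapBox {
  const contentHeight = totalLines * rowHeight;
  // Content shorter than the list: the box covers the whole minimap.
  if (contentHeight <= listHeight) {
    return { top: 0, height: minimapHeight };
  }
  // Scale the visible fraction of the content onto the minimap,
  // but keep the box large enough to see and grab.
  const scaled = (listHeight / contentHeight) * minimapHeight;
  const boxHeight = Math.min(minimapHeight, Math.max(minBoxHeight, scaled));
  // Clamp so the box never runs off the bottom of the minimap.
  const rawTop = (scrollTop / contentHeight) * minimapHeight;
  const boxTop = Math.min(Math.max(0, rawTop), minimapHeight - boxHeight);
  return { top: boxTop, height: boxHeight };
}

// 50,000 lines of 20px, a 600px list, and a 500px minimap, scrolled halfway.
const midBox = minimapScrollBox(500000, 50000, 20, 600, 500);
// Scrolled to the very end of the content.
const endBox = minimapScrollBox(999400, 50000, 20, 600, 500);
```

With very long diffs the proportional height drops below one pixel (0.3px in this example), which is why the sketch enforces a minimum and then clamps the top position.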

Search & Indexing

Finding a specific value in a virtualized list is hard because the DOM elements might not exist yet.

I solved this by pre-indexing all search results and building a custom scroll-to-row mechanism that interacts directly with react-window's methods.
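
As a rough illustration of the pre-indexing idea, the sketch below scans the text of every row once and records the indices of matching rows, so navigation never needs the DOM. `buildSearchIndex` and `stepMatch` are hypothetical names, and the case-insensitive substring match is an assumption; the package may match differently.

```typescript
// Illustrative only: both function names are hypothetical.
// Returns the row indices whose text contains the query (case-insensitive).
function buildSearchIndex(rowTexts: string[], query: string): number[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  const matches: number[] = [];
  rowTexts.forEach((text, row) => {
    if (text.toLowerCase().includes(needle)) matches.push(row);
  });
  return matches;
}

// Wrap-around stepping through the match list; -1 means "no matches".
function stepMatch(current: number, total: number, direction: "next" | "prev"): number {
  if (total === 0) return -1;
  const delta = direction === "next" ? 1 : -1;
  return (current + delta + total) % total;
}

const demoRows = ["{", '  "id": 1,', '  "name": "Ada",', '  "id": 2', "}"];
const matchRows = buildSearchIndex(demoRows, "ID");
const wrappedBack = stepMatch(0, matchRows.length, "prev");
const wrappedForward = stepMatch(wrappedBack, matchRows.length, "next");
```

Each entry in the resulting array is a row index, which is the kind of value a call such as `scrollToItem` expects.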


// When navigating matches, we scroll the virtualizer directly to the indexed position
const navigateMatch = useCallback((direction: "next" | "prev") => {
  const total = searchState.results.length;
  if (total === 0) return; // No matches: avoid a modulo by zero

  const newIndex = direction === "next"
    ? (searchState.currentIndex + 1) % total
    : (searchState.currentIndex - 1 + total) % total;

  const matchIndex = searchState.results[newIndex];
  listRef.current?.scrollToItem(matchIndex, "center");
}, [searchState]);

Search results are also visualized on the minimap by overlaying highlights on the pre-rendered canvas.

Conclusion: A Solution Finally Exists

I kept the package management simple, and it remains open to improvement. The process of code review and cross-reviewing with AI tools like Gemini to optimize some of the logic was incredibly enjoyable.

If you are facing a similar problem, where scale is the bottleneck, you don't need to hit the same walls I did. A React component for JSON comparison at this scale now exists.

The virtual-react-json-diff package offers:

Performance: Virtualized rendering for millions of lines.

UX: Minimaps, Search, and clean navigation.

Accuracy: Key-based comparison for arrays.

See the demo and the code
GitHub
Virtual-React-Json-Diff Demo
npm
