The Problem
You're working on a 20,000-line JSONL (JSON Lines) dataset with carefully curated training data. You make changes, but need to verify what actually changed between versions.
The lines are too long. They don't fit on your screen. Each line is a dense, unformatted JSON object.
You reach for your favorite diff tool. And it fails.
Or worse, it shows you a meaningless blob of changes because it's treating your entire JSONL file as a single JSON document.
This shouldn't happen. But it does, constantly, to software engineers and data engineers everywhere.
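To make the failure mode concrete, here is a minimal sketch using Python's standard `json` module (the two-record sample is made up for illustration). It shows why a tool that treats the whole file as one JSON document chokes, while line-by-line parsing works:

```python
import json

# Two JSONL records in a single string (illustrative sample data).
jsonl_text = '{"id":1,"name":"Tom"}\n{"id":2,"name":"Maria"}\n'

# Treating the whole file as ONE JSON document fails: after the first
# object ends, the parser finds more content and gives up.
try:
    json.loads(jsonl_text)
    whole_file_ok = True
    whole_file_error = None
except json.JSONDecodeError as err:
    whole_file_ok = False
    whole_file_error = err.msg  # short reason, without position details

# Treating each line as its own document works.
records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]

print(whole_file_ok, whole_file_error)
print(records)
```

A diff tool built around a whole-document parser hits the same wall, which is why JSONL needs line-aware handling.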
What is JSONL (and Why It Matters)
JSONL (JSON Lines) is deceptively simple: one valid JSON object per line.
{"id":1,"name":"Tom","age":35}
{"id":2,"name":"Maria","age":32}
{"id":3,"name":"Alex","age":28}
It's not the same as pretty-printed JSON with newlines. Each line is completely independent. Parse it, process it, forget it. Next line.
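The "parse it, process it, forget it" pattern can be sketched as a small generator. This is an illustration, not part of any particular tool: `iter_jsonl` is a hypothetical helper name, and skipping blank lines is an assumption made here for convenience.

```python
import io
import json
from typing import Any, Iterator, TextIO


def iter_jsonl(stream: TextIO) -> Iterator[Any]:
    """Yield one parsed value per non-blank line, one line at a time."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            # Report which line is broken instead of failing on the whole file.
            raise ValueError(f"line {line_number}: {err.msg}") from err


# io.StringIO stands in for an open file; only one record is held at a time.
sample = io.StringIO(
    '{"id":1,"name":"Tom","age":35}\n'
    '{"id":2,"name":"Maria","age":32}\n'
    '{"id":3,"name":"Alex","age":28}\n'
)
ages = [record["age"] for record in iter_jsonl(sample)]
print(ages)
```

Because each record is yielded and then dropped, the same loop works on files far larger than available memory.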
This format is everywhere:
- ChatGPT fine-tuning datasets (OpenAI's required format)
- ML training pipelines (streaming data without loading everything into memory)
- Structured logs (each log entry is a JSON object)
How It Works
- Parse each JSONL line independently (validates JSON syntax)
- Align by line number (line 1 vs. line 1, line 2 vs. line 2)
- Transform to pretty-printed JSON arrays (with 2-space indentation)
- Show side-by-side diff using Monaco Editor (VS Code's diff engine)
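The steps above can be approximated in a few lines of Python. This is a rough sketch under stated assumptions, not the tool's actual code: the function names are hypothetical, blank lines are skipped before aligning, and the standard library's `difflib` produces a plain-text unified diff in place of Monaco's side-by-side view.

```python
import difflib
import json
from itertools import zip_longest


def parse_jsonl(text):
    # Step 1: parse each line independently (raises on invalid JSON).
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def align_and_pad(left, right):
    # Step 2: align by position; the shorter side is padded with None,
    # which json.dumps renders as null.
    pairs = list(zip_longest(left, right, fillvalue=None))
    return [a for a, _ in pairs], [b for _, b in pairs]


def pretty(records):
    # Step 3: pretty-print as a JSON array with 2-space indentation.
    return json.dumps(records, indent=2)


def diff_jsonl(left_text, right_text):
    # Step 4: diff the pretty-printed text (difflib stands in for Monaco).
    left, right = align_and_pad(parse_jsonl(left_text), parse_jsonl(right_text))
    return list(difflib.unified_diff(
        pretty(left).splitlines(),
        pretty(right).splitlines(),
        fromfile="left", tofile="right", lineterm="",
    ))


left_text = '{"id":1,"name":"Tom"}\n{"id":2,"name":"Maria"}\n{"id":3,"name":"Alex"}\n'
right_text = '{"id":1,"name":"Tommy"}\n{"id":2,"name":"Maria"}\n'
diff_lines = diff_jsonl(left_text, right_text)
print("\n".join(diff_lines))
```

Pretty-printing before diffing is the key design choice: each field lands on its own line, so a one-field change shows up as a one-line change instead of a whole-record change.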
Result:
- ✅ Readable JSON instead of compact one-liners
- ✅ Clear visual diffs with syntax highlighting
- ✅ Handles different lengths (pads with null)
- ✅ Client-side only (your data never leaves your browser)
- ✅ Drag & drop files or paste directly
- ✅ Free, no signup, no tracking
Real-World Example
Let's say you're comparing two versions of a training dataset. Here's what you paste:
Left (original):
{"id":1,"name":"Tom","age":35,"score":92.5}
{"id":2,"name":"Maria","age":32,"score":88.3}
{"id":3,"name":"Alex","age":28,"score":95.1}
Right (modified):
{"id":1,"name":"Tommy","age":35,"score":92.5}
{"id":2,"name":"Maria","age":33,"score":88.3}
Notice line 3 is missing on the right, and there are changes in lines 1 and 2.
The tool shows:
- Side-by-side pretty-printed JSON arrays
- Line 1: "name": "Tom" → "name": "Tommy" (highlighted in red/green)
- Line 2: "age": 32 → "age": 33 (highlighted)
- Line 3: present on left, null on right (shows missing data)
No squinting. No character-by-character comparison. Just clear diffs.
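For a scripted check of the same example, a field-level report can be sketched as follows. This is an illustration only: `changed_fields` is a hypothetical helper, it assumes every line is a JSON object, and it treats a missing key the same as a key set to null.

```python
import json
from itertools import zip_longest


def changed_fields(left_text, right_text):
    """Return (line_number, key, left_value, right_value) for each difference.

    A record present on only one side is reported with key=None.
    """
    left = [json.loads(l) for l in left_text.splitlines() if l.strip()]
    right = [json.loads(l) for l in right_text.splitlines() if l.strip()]
    report = []
    for n, (a, b) in enumerate(zip_longest(left, right), start=1):
        if a is None or b is None:
            report.append((n, None, a, b))  # whole record missing on one side
            continue
        for key in sorted(a.keys() | b.keys()):
            # dict.get returns None for absent keys (see assumption above).
            if a.get(key) != b.get(key):
                report.append((n, key, a.get(key), b.get(key)))
    return report


left_text = (
    '{"id":1,"name":"Tom","age":35,"score":92.5}\n'
    '{"id":2,"name":"Maria","age":32,"score":88.3}\n'
    '{"id":3,"name":"Alex","age":28,"score":95.1}\n'
)
right_text = (
    '{"id":1,"name":"Tommy","age":35,"score":92.5}\n'
    '{"id":2,"name":"Maria","age":33,"score":88.3}\n'
)
report = changed_fields(left_text, right_text)
for entry in report:
    print(entry)
```

On the sample data this reports the name change on line 1, the age change on line 2, and the record that exists only on the left at line 3.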
Available here: https://www.jsonlify.com/compare-jsonlines