<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tech Labs</title>
    <description>The latest articles on DEV Community by Tech Labs (@lechlabs).</description>
    <link>https://dev.to/lechlabs</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3774793%2F216879cd-9b7c-4ed9-9fb4-9862595cf9e8.jpg</url>
      <title>DEV Community: Tech Labs</title>
      <link>https://dev.to/lechlabs</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lechlabs"/>
    <language>en</language>
    <item>
      <title>Why Your Diff Tool is Failing on JSONL Files</title>
      <dc:creator>Tech Labs</dc:creator>
      <pubDate>Mon, 16 Feb 2026 03:59:09 +0000</pubDate>
      <link>https://dev.to/lechlabs/why-your-diff-tool-is-failing-on-jsonl-files-19k0</link>
      <guid>https://dev.to/lechlabs/why-your-diff-tool-is-failing-on-jsonl-files-19k0</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You're working on a 20,000-line JSONL (JSON Lines) dataset with carefully curated training data. You make changes, but need to verify what actually changed between versions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The lines are too long.&lt;/strong&gt; They don't fit on your screen. Each line is a dense, unformatted JSON object.&lt;/p&gt;

&lt;p&gt;You reach for your favorite diff tool. And it fails.&lt;/p&gt;

&lt;p&gt;Or worse, it shows you a meaningless blob of changes because it treats your entire JSONL file as a single JSON document.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This shouldn't happen.&lt;/strong&gt; But it does, constantly, to software engineers and data engineers everywhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is JSONL (and Why It Matters)
&lt;/h2&gt;

&lt;p&gt;JSONL (JSON Lines) is deceptively simple: &lt;strong&gt;one valid JSON value, usually an object, per line&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"id":1,"name":"Tom","age":35}
{"id":2,"name":"Maria","age":32}
{"id":3,"name":"Alex","age":28}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It's not the same as pretty-printed JSON with newlines. Each line is completely independent. Parse it, process it, forget it. Next line.&lt;/p&gt;
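
&lt;p&gt;As a rough illustration of that "parse it, process it, forget it" pattern, here is a minimal Python sketch. The helper name &lt;code&gt;iter_jsonl&lt;/code&gt; is hypothetical, and skipping blank lines is an assumption made for this example, not something the format requires.&lt;/p&gt;

```python
import io
import json


def iter_jsonl(stream):
    """Yield one parsed record per non-empty line, one line at a time."""
    for line in stream:
        if line.strip():
            # Each line is a complete JSON document on its own.
            yield json.loads(line)


# An in-memory stream stands in for an open file handle.
sample = io.StringIO('{"id":1,"name":"Tom","age":35}\n{"id":2,"name":"Maria","age":32}\n')
names = [record["name"] for record in iter_jsonl(sample)]
```

&lt;p&gt;Because the generator only holds one line at a time, the same loop should work on files far larger than available memory.&lt;/p&gt;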

&lt;p&gt;This format is everywhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI fine-tuning datasets&lt;/strong&gt; (JSONL is the required upload format)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ML training pipelines&lt;/strong&gt; (streaming data without loading everything into memory)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured logs&lt;/strong&gt; (each log entry is a JSON object)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Parse each JSONL line independently&lt;/strong&gt; (validates JSON syntax)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Align by line number&lt;/strong&gt; (line 1 vs. line 1, line 2 vs. line 2)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transform to pretty-printed JSON arrays&lt;/strong&gt; (with 2-space indentation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Show side-by-side diff&lt;/strong&gt; using Monaco Editor (the editor that powers VS Code)&lt;/li&gt;
&lt;/ol&gt;
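
&lt;p&gt;Steps 1 to 3 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code (which runs in the browser): the function names &lt;code&gt;parse_jsonl&lt;/code&gt; and &lt;code&gt;align_and_format&lt;/code&gt; are hypothetical, and blank lines are skipped here for simplicity.&lt;/p&gt;

```python
import json
from itertools import zip_longest


def parse_jsonl(text):
    """Step 1: parse every non-empty line as its own JSON value.

    json.loads raises json.JSONDecodeError on an invalid line,
    which doubles as syntax validation.
    """
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def align_and_format(left_text, right_text):
    """Steps 2 and 3: pair records by position, pad, and pretty-print."""
    # zip_longest pads the shorter side with None, serialized as null.
    pairs = list(zip_longest(parse_jsonl(left_text), parse_jsonl(right_text)))
    left = [l for l, _ in pairs]
    right = [r for _, r in pairs]
    # Both sides become JSON arrays with 2-space indentation.
    return json.dumps(left, indent=2), json.dumps(right, indent=2)
```

&lt;p&gt;The two returned strings always contain the same number of array elements, so any line-based diff viewer can display them side by side.&lt;/p&gt;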

&lt;h3&gt;
  
  
  Result:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Readable JSON&lt;/strong&gt; instead of compact one-liners&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Clear visual diffs&lt;/strong&gt; with syntax highlighting&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Handles files of different lengths&lt;/strong&gt; (pads the shorter side with &lt;code&gt;null&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Client-side only&lt;/strong&gt; (your data never leaves your browser)&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Drag &amp;amp; drop files&lt;/strong&gt; or paste directly&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Free, no signup, no tracking&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real-World Example
&lt;/h2&gt;

&lt;p&gt;Let's say you're comparing two versions of a training dataset. Here's what you paste:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Left (original):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"id":1,"name":"Tom","age":35,"score":92.5}
{"id":2,"name":"Maria","age":32,"score":88.3}
{"id":3,"name":"Alex","age":28,"score":95.1}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Right (modified):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"id":1,"name":"Tommy","age":35,"score":92.5}
{"id":2,"name":"Maria","age":33,"score":88.3}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that line 3 is missing on the right, and that lines 1 and 2 have changed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The tool shows:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Side-by-side pretty-printed JSON arrays&lt;/li&gt;
&lt;li&gt;Line 1: &lt;code&gt;"name": "Tom"&lt;/code&gt; → &lt;code&gt;"name": "Tommy"&lt;/code&gt; (highlighted in red/green)&lt;/li&gt;
&lt;li&gt;Line 2: &lt;code&gt;"age": 32&lt;/code&gt; → &lt;code&gt;"age": 33&lt;/code&gt; (highlighted)&lt;/li&gt;
&lt;li&gt;Line 3: present on left, &lt;code&gt;null&lt;/code&gt; on right (shows missing data)&lt;/li&gt;
&lt;/ul&gt;
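
&lt;p&gt;The same comparison can be reproduced programmatically. The Python sketch below reuses the example data above and reports which fields differ on each line. The function name &lt;code&gt;changed_fields&lt;/code&gt; is hypothetical, and the code assumes every record is a JSON object; it is an illustration of line-aligned comparison, not the tool's implementation.&lt;/p&gt;

```python
import json
from itertools import zip_longest

LEFT = '''{"id":1,"name":"Tom","age":35,"score":92.5}
{"id":2,"name":"Maria","age":32,"score":88.3}
{"id":3,"name":"Alex","age":28,"score":95.1}'''

RIGHT = '''{"id":1,"name":"Tommy","age":35,"score":92.5}
{"id":2,"name":"Maria","age":33,"score":88.3}'''


def changed_fields(left_text, right_text):
    """Map each differing line number to the fields that changed."""
    left = [json.loads(s) for s in left_text.splitlines()]
    right = [json.loads(s) for s in right_text.splitlines()]
    report = {}
    for number, (a, b) in enumerate(zip_longest(left, right), start=1):
        if a is None or b is None:
            # The shorter file was padded at this position.
            report[number] = "missing on one side"
        elif a != b:
            keys = sorted(set(a) | set(b))
            report[number] = {
                k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)
            }
    return report


report = changed_fields(LEFT, RIGHT)
```

&lt;p&gt;For this input, &lt;code&gt;report&lt;/code&gt; flags the &lt;code&gt;name&lt;/code&gt; change on line 1, the &lt;code&gt;age&lt;/code&gt; change on line 2, and the missing record on line 3.&lt;/p&gt;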

&lt;p&gt;&lt;strong&gt;No squinting. No character-by-character comparison. Just clear diffs.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Available here:&lt;/strong&gt; &lt;a href="https://www.jsonlify.com/compare-jsonlines" rel="noopener noreferrer"&gt;https://www.jsonlify.com/compare-jsonlines&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;#MachineLearning #DataEngineering #MLOps #JSONL #OpenAI #GPT #DataScience #WebDev&lt;/p&gt;

</description>
      <category>data</category>
      <category>dataengineering</category>
      <category>softwaredevelopment</category>
      <category>tooling</category>
    </item>
  </channel>
</rss>
