CSV is everywhere. Database exports, spreadsheet downloads, analytics reports — they all default to CSV. But modern APIs and front-end code want JSON. So at some point you will write a CSV-to-JSON parser in JavaScript, and it will work perfectly until it doesn't.
This guide covers the full implementation: the happy path, then every edge case that will break a naive parser.
## The naive approach (and why it breaks)

The simplest CSV parser looks like this:
```javascript
function csvToJson(csv) {
  const lines = csv.split('\n');
  const headers = lines[0].split(',');
  return lines.slice(1).map(line => {
    const values = line.split(',');
    return headers.reduce((obj, header, i) => {
      obj[header.trim()] = values[i]?.trim();
      return obj;
    }, {});
  });
}
```
This works for simple cases. But real CSV files will break it immediately.
## Edge case 1: Quoted fields containing commas
RFC 4180 (the CSV standard) allows any field to be wrapped in double quotes. If a field contains a comma, it must be quoted:
```
name,address,city
Alice,"123 Main St, Apt 4",Boston
```
Splitting on `,` will tear `"123 Main St, Apt 4"` into two fields. You need a proper quoted-field parser.
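To make the failure concrete, here's what the naive split actually returns for that row (a quick sketch):

```javascript
// Naive comma-splitting tears the quoted address into two fragments.
const line = 'Alice,"123 Main St, Apt 4",Boston';
const naive = line.split(',');
console.log(naive.length); // 4 pieces instead of 3
console.log(naive[1]);     // '"123 Main St' — half a field, with a stray quote
```

The parser below fixes this by tracking whether it is inside a quoted region before treating a comma as a separator.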
```javascript
function parseCSVLine(line) {
  const result = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const char = line[i];
    if (char === '"') {
      if (inQuotes && line[i + 1] === '"') {
        // Escaped double-quote: "" inside a quoted field = literal "
        current += '"';
        i++;
      } else {
        inQuotes = !inQuotes;
      }
    } else if (char === ',' && !inQuotes) {
      result.push(current);
      current = '';
    } else {
      current += char;
    }
  }
  result.push(current);
  return result;
}
```
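The `""` escape rule the parser handles mid-stream can also be seen in isolation. For a single field that is already known to be quoted, unescaping is just stripping the outer quotes and collapsing doubled quotes (a sketch of the rule, not a general parser):

```javascript
// RFC 4180 escaping in miniature: inside a quoted field, "" means a literal ".
const field = '"Says ""hello"" a lot."';
const unquoted = field.slice(1, -1).replace(/""/g, '"');
console.log(unquoted); // Says "hello" a lot.
```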
## Edge case 2: Quoted fields containing newlines
A quoted field can span multiple lines:
```
name,bio
Alice,"Software engineer.
Loves hiking."
Bob,"Designer."
```
Splitting the CSV string on `\n` first will break this. You need to parse character by character across the whole string, tracking whether you're inside quotes before deciding where each row ends.
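Here's the breakage in miniature: splitting on `\n` first strands half of Alice's bio on its own "row":

```javascript
// The quoted bio spans two physical lines, so split('\n') miscounts the rows.
const csv = 'name,bio\nAlice,"Software engineer.\nLoves hiking."';
const lines = csv.split('\n');
console.log(lines.length); // 3 "rows" for what is really 2 records
console.log(lines[2]);     // 'Loves hiking."' — half a field, stranded
```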
Here's a full multi-line-safe parser:
```javascript
function parseCSV(text, delimiter = ',') {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (inQuotes) {
      if (c === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // escaped quote
          i++;
        } else {
          inQuotes = false; // end of quoted field
        }
      } else {
        field += c;
      }
    } else {
      if (c === '"') {
        inQuotes = true;
      } else if (c === delimiter) {
        row.push(field);
        field = '';
      } else if (c === '\n' || (c === '\r' && text[i + 1] === '\n')) {
        if (c === '\r') i++; // skip \r in CRLF
        row.push(field);
        field = '';
        if (row.some(f => f !== '')) rows.push(row); // skip empty lines
        row = [];
      } else {
        field += c;
      }
    }
  }
  // Last field + row
  if (field || row.length > 0) {
    row.push(field);
    if (row.some(f => f !== '')) rows.push(row);
  }
  return rows;
}
```
## Edge case 3: Type coercion
Raw CSV values are always strings. For most use cases you want JavaScript-native types:
```javascript
function coerceValue(val) {
  if (val === '' || val === null || val === undefined) return null;
  if (val === 'true') return true;
  if (val === 'false') return false;
  const num = Number(val);
  if (!isNaN(num) && val.trim() !== '') return num;
  return val;
}
```
This turns `"42"` into `42`, `"true"` into `true`, and empty cells into `null`.

Whether to coerce is a judgement call: it's useful for data manipulation, but can cause issues if you need to preserve exact string representations (e.g., leading zeros in postcodes: `"01234"` → `1234`).
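The postcode gotcha comes straight from `Number()`, which the coercion helper relies on:

```javascript
// Number() parses '01234' as decimal 1234, so coercion silently
// destroys leading zeros in postcode- or ID-style strings.
const postcode = '01234';
const coerced = Number(postcode);
console.log(coerced); // 1234 — the leading zero is gone for good
```

If a column holds postcodes, phone numbers, or IDs, disable coercion (or coerce per-column) so they stay strings.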
## Edge case 4: Custom delimiters
Not all "CSV" files use commas. Semicolons are the default in Excel for European locales (where commas are decimal separators). Tabs are common in TSV exports.
Auto-detecting the delimiter is useful:
```javascript
function detectDelimiter(firstLine) {
  const candidates = [',', ';', '\t', '|'];
  const counts = candidates.map(d => ({
    delimiter: d,
    count: firstLine.split(d).length - 1
  }));
  return counts.sort((a, b) => b.count - a.count)[0].delimiter;
}
```
This isn't perfect (a line could have more pipes than commas by coincidence), but it works well in practice for standard exports.
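Two quick checks on representative header lines (the function is repeated here so the snippet runs standalone):

```javascript
// Pick whichever candidate delimiter appears most often in the header line.
function detectDelimiter(firstLine) {
  const candidates = [',', ';', '\t', '|'];
  const counts = candidates.map(d => ({
    delimiter: d,
    count: firstLine.split(d).length - 1
  }));
  return counts.sort((a, b) => b.count - a.count)[0].delimiter;
}

console.log(detectDelimiter('name;age;city'));   // ';' (European-style Excel export)
console.log(detectDelimiter('name\tage\tcity')); // '\t' (TSV)
```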
## Putting it together: CSV to JSON objects
```javascript
function csvToJson(csvText, options = {}) {
  const { coerce = true, delimiter = null } = options;
  // Detect the delimiter first, then parse with it
  // (parseCSV must accept a delimiter parameter for this to take effect)
  const sep = delimiter || detectDelimiter(csvText.split('\n')[0]);
  const rows = parseCSV(csvText, sep);
  if (rows.length === 0) return [];
  const headers = rows[0].map(h => h.trim());
  const results = [];
  for (let i = 1; i < rows.length; i++) {
    const obj = {};
    headers.forEach((header, j) => {
      const raw = rows[i][j] ?? '';
      obj[header] = coerce ? coerceValue(raw.trim()) : raw.trim();
    });
    results.push(obj);
  }
  return results;
}
```
## Testing with a tricky CSV
```javascript
const csv = `name,age,city,bio
Alice,30,Boston,"Loves hiking, camping."
Bob,25,NYC,"Says ""hello"" a lot."
Charlie,,,"
Multi-line
bio"`;

console.log(JSON.stringify(csvToJson(csv), null, 2));
```
Expected output:
```json
[
  { "name": "Alice", "age": 30, "city": "Boston", "bio": "Loves hiking, camping." },
  { "name": "Bob", "age": 25, "city": "NYC", "bio": "Says \"hello\" a lot." },
  { "name": "Charlie", "age": null, "city": null, "bio": "Multi-line\nbio" }
]
```

(Note that Charlie's bio loses its leading newline because `csvToJson` trims each field before coercion.)
## When you don't want to write the parser yourself
If you're dealing with CSV in the browser and just need the JSON output without building a parser, SnappyTools has a free CSV-to-JSON converter — no upload, no signup, runs entirely client-side. Paste your CSV, pick your options, copy the JSON. Useful for quick one-off conversions or testing what your parser should produce.
## Summary
| Edge case | Naive parser | Robust parser |
|---|---|---|
| Comma inside field | ❌ Breaks | ✅ Handles with quoting |
| Escaped double-quotes | ❌ Leaves `""` in output | ✅ Replaces with `"` |
| Newlines inside field | ❌ Splits the row | ✅ Respects quote context |
| Windows line endings (CRLF) | ❌ `\r` in last field | ✅ Strips `\r` |
| Empty cells | ❌ `undefined` | ✅ Returns `null` |
| Type coercion | ❌ All strings | ✅ Native JS types |
The full implementation above handles all of these. For production code, libraries like Papa Parse are battle-tested and worth the dependency. For lightweight or dependency-free environments, the parser above is a solid starting point.