CSV is everywhere. Database exports, spreadsheet downloads, analytics reports — they all default to CSV. But modern APIs and front-end code want JSON. So at some point you will write a CSV-to-JSON parser in JavaScript, and it will work perfectly until it doesn't.
This guide covers the full implementation: the happy path, then every edge case that will break a naive parser.
## The naive approach (and why it breaks)

The simplest CSV parser looks like this:
```javascript
function csvToJson(csv) {
  const lines = csv.split('\n');
  const headers = lines[0].split(',');
  return lines.slice(1).map(line => {
    const values = line.split(',');
    return headers.reduce((obj, header, i) => {
      obj[header.trim()] = values[i]?.trim();
      return obj;
    }, {});
  });
}
```
This works for simple cases. But real CSV files will break it immediately.
## Edge case 1: Quoted fields containing commas
RFC 4180 (the CSV standard) allows any field to be wrapped in double quotes. If a field contains a comma, it must be quoted:
```
name,address,city
Alice,"123 Main St, Apt 4",Boston
```
Splitting on `,` will tear `"123 Main St, Apt 4"` into two fields. You need a proper quoted-field parser.
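To make the failure concrete, here's what the naive split actually returns for that row (a quick sketch):

```javascript
// Naive comma-splitting tears the quoted address into two fragments.
const line = 'Alice,"123 Main St, Apt 4",Boston';
const naive = line.split(',');
console.log(naive.length); // 4 pieces instead of 3
console.log(naive[1]);     // '"123 Main St' — half a field, with a stray quote
```

The parser below fixes this by tracking whether it is inside a quoted region before treating a comma as a separator.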
```javascript
function parseCSVLine(line) {
  const result = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const char = line[i];
    if (char === '"') {
      if (inQuotes && line[i + 1] === '"') {
        // Escaped double-quote: "" inside a quoted field = literal "
        current += '"';
        i++;
      } else {
        inQuotes = !inQuotes;
      }
    } else if (char === ',' && !inQuotes) {
      result.push(current);
      current = '';
    } else {
      current += char;
    }
  }
  result.push(current);
  return result;
}
```
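The `""` escape rule the parser handles mid-stream can also be seen in isolation. For a single field that is already known to be quoted, unescaping is just stripping the outer quotes and collapsing doubled quotes (a sketch of the rule, not a general parser):

```javascript
// RFC 4180 escaping in miniature: inside a quoted field, "" means a literal ".
const field = '"Says ""hello"" a lot."';
const unquoted = field.slice(1, -1).replace(/""/g, '"');
console.log(unquoted); // Says "hello" a lot.
```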
## Edge case 2: Quoted fields containing newlines
A quoted field can span multiple lines:
```
name,bio
Alice,"Software engineer.
Loves hiking."
Bob,"Designer."
```
Splitting the CSV string on `\n` first will break this. You need to parse character by character across the whole string, tracking whether you're inside quotes before deciding where each row ends.
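Here's the breakage in miniature: splitting on `\n` first strands half of Alice's bio on its own "row":

```javascript
// The quoted bio spans two physical lines, so split('\n') miscounts the rows.
const csv = 'name,bio\nAlice,"Software engineer.\nLoves hiking."';
const lines = csv.split('\n');
console.log(lines.length); // 3 "rows" for what is really 2 records
console.log(lines[2]);     // 'Loves hiking."' — half a field, stranded
```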
Here's a full multi-line-safe parser:
```javascript
function parseCSV(text, delimiter = ',') {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const c = text[i];
    if (inQuotes) {
      if (c === '"') {
        if (text[i + 1] === '"') {
          field += '"'; // escaped quote
          i++;
        } else {
          inQuotes = false; // end of quoted field
        }
      } else {
        field += c;
      }
    } else {
      if (c === '"') {
        inQuotes = true;
      } else if (c === delimiter) {
        row.push(field);
        field = '';
      } else if (c === '\n' || (c === '\r' && text[i + 1] === '\n')) {
        if (c === '\r') i++; // skip \r in CRLF
        row.push(field);
        field = '';
        if (row.some(f => f !== '')) rows.push(row); // skip empty lines
        row = [];
      } else {
        field += c;
      }
    }
  }
  // Last field + row
  if (field || row.length > 0) {
    row.push(field);
    if (row.some(f => f !== '')) rows.push(row);
  }
  return rows;
}
```
## Edge case 3: Type coercion
Raw CSV values are always strings. For most use cases you want JavaScript-native types:
```javascript
function coerceValue(val) {
  if (val === '' || val === null || val === undefined) return null;
  if (val === 'true') return true;
  if (val === 'false') return false;
  const num = Number(val);
  if (!isNaN(num) && val.trim() !== '') return num;
  return val;
}
```
This turns `"42"` into `42`, `"true"` into `true`, and empty cells into `null`.

Whether to coerce is a judgement call: it's useful for data manipulation, but can cause issues if you need to preserve exact string representations (e.g., leading zeros in postcodes: `"01234"` → `1234`).
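The postcode gotcha comes straight from `Number()`, which the coercion helper relies on:

```javascript
// Number() parses '01234' as decimal 1234, so coercion silently
// destroys leading zeros in postcode- or ID-style strings.
const postcode = '01234';
const coerced = Number(postcode);
console.log(coerced); // 1234 — the leading zero is gone for good
```

If a column holds postcodes, phone numbers, or IDs, disable coercion (or coerce per-column) so they stay strings.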
## Edge case 4: Custom delimiters
Not all "CSV" files use commas. Semicolons are the default in Excel for European locales (where commas are decimal separators). Tabs are common in TSV exports.
Auto-detecting the delimiter is useful:
```javascript
function detectDelimiter(firstLine) {
  const candidates = [',', ';', '\t', '|'];
  const counts = candidates.map(d => ({
    delimiter: d,
    count: firstLine.split(d).length - 1
  }));
  return counts.sort((a, b) => b.count - a.count)[0].delimiter;
}
```
This isn't perfect (a line could have more pipes than commas by coincidence), but it works well in practice for standard exports.
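Two quick checks on representative header lines (the function is repeated here so the snippet runs standalone):

```javascript
// Pick whichever candidate delimiter appears most often in the header line.
function detectDelimiter(firstLine) {
  const candidates = [',', ';', '\t', '|'];
  const counts = candidates.map(d => ({
    delimiter: d,
    count: firstLine.split(d).length - 1
  }));
  return counts.sort((a, b) => b.count - a.count)[0].delimiter;
}

console.log(detectDelimiter('name;age;city'));   // ';' (European-style Excel export)
console.log(detectDelimiter('name\tage\tcity')); // '\t' (TSV)
```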
## Putting it together: CSV to JSON objects
```javascript
function csvToJson(csvText, options = {}) {
  const { coerce = true, delimiter = null } = options;
  // Detect the delimiter first, then parse with it
  // (parseCSV must accept a delimiter parameter for this to take effect)
  const sep = delimiter || detectDelimiter(csvText.split('\n')[0]);
  const rows = parseCSV(csvText, sep);
  if (rows.length === 0) return [];
  const headers = rows[0].map(h => h.trim());
  const results = [];
  for (let i = 1; i < rows.length; i++) {
    const obj = {};
    headers.forEach((header, j) => {
      const raw = rows[i][j] ?? '';
      obj[header] = coerce ? coerceValue(raw.trim()) : raw.trim();
    });
    results.push(obj);
  }
  return results;
}
```
## Testing with a tricky CSV
```javascript
const csv = `name,age,city,bio
Alice,30,Boston,"Loves hiking, camping."
Bob,25,NYC,"Says ""hello"" a lot."
Charlie,,,"
Multi-line
bio"`;

console.log(JSON.stringify(csvToJson(csv), null, 2));
```
Expected output:
```json
[
  { "name": "Alice", "age": 30, "city": "Boston", "bio": "Loves hiking, camping." },
  { "name": "Bob", "age": 25, "city": "NYC", "bio": "Says \"hello\" a lot." },
  { "name": "Charlie", "age": null, "city": null, "bio": "Multi-line\nbio" }
]
```

(Note that Charlie's bio loses its leading newline because `csvToJson` trims each field before coercion.)
## When you don't want to write the parser yourself
If you're dealing with CSV in the browser and just need the JSON output without building a parser, SnappyTools has a free CSV-to-JSON converter — no upload, no signup, runs entirely client-side. Paste your CSV, pick your options, copy the JSON. Useful for quick one-off conversions or testing what your parser should produce.
## Summary
| Edge case | Naive parser | Robust parser |
|---|---|---|
| Comma inside field | ❌ Breaks | ✅ Handles with quoting |
| Escaped double-quotes | ❌ Leaves `""` in output | ✅ Replaces with `"` |
| Newlines inside field | ❌ Splits the row | ✅ Respects quote context |
| Windows line endings (CRLF) | ❌ `\r` in last field | ✅ Strips `\r` |
| Empty cells | ❌ `undefined` | ✅ Returns `null` |
| Type coercion | ❌ All strings | ✅ Native JS types |
The full implementation above handles all of these. For production code, libraries like Papa Parse are battle-tested and worth the dependency. For lightweight or dependency-free environments, the parser above is a solid starting point.