
ToolDeck for ToolDeck

Posted on • Originally published at tooldeck.dev

How to Convert JSON to CSV in Python (Complete Guide)

Almost every data pipeline eventually hits the same step: an API returns JSON, but the next consumer — a spreadsheet, an import script, a Redshift COPY command — needs CSV. Converting JSON to CSV in Python sounds trivial until you hit nested objects, inconsistent keys, or datetime values that need special handling.

Python gives you two solid paths: the built-in json + csv modules for zero-dependency scripts, and pandas for nested flattening and larger datasets — or the online JSON to CSV converter for quick one-off conversions without any code. This guide covers both approaches end to end, with runnable Python 3.8+ examples.

Key takeaways:

  • csv.DictWriter converts a list of dicts to CSV with zero dependencies — use json.load() to parse, then writeheader() + writerows().
  • Always open CSV files with newline="" on Windows to prevent blank rows between data rows.
  • pd.json_normalize() flattens nested JSON into a flat DataFrame before calling to_csv() — handles multi-level nesting automatically.
  • Pass index=False to DataFrame.to_csv() — without it, pandas writes an unwanted row-number column.
  • For files over 500 MB, use ijson for streaming JSON parsing combined with csv.DictWriter for constant memory usage.

What is JSON to CSV Conversion?

JSON to CSV conversion transforms an array of JSON objects into a tabular format where each object becomes a row and each key becomes a column header. JSON is hierarchical — objects can nest arbitrarily deep. CSV is flat — every value sits in a row-column grid. The conversion works cleanly when every object shares the same set of top-level keys. Nested objects, arrays, and inconsistent keys are where things get interesting.

Before (JSON):

[{"order_id":"ord_91a3","total":149.99,"status":"shipped"},
 {"order_id":"ord_b7f2","total":34.50,"status":"pending"}]

After (CSV):

order_id,total,status
ord_91a3,149.99,shipped
ord_b7f2,34.50,pending

csv.DictWriter — Convert JSON to CSV Without Pandas

The csv module ships with every Python installation. No pip install, no virtual environment gymnastics. csv.DictWriter takes a list of dictionaries and writes each one as a CSV row, mapping dict keys to column headers. The fieldnames parameter controls both the column order and which keys get included.

# Python 3.8+ — minimal json to csv example
import json
import csv

# Sample JSON data — an array of order objects
json_string = """
[
  {"order_id": "ord_91a3", "product": "Wireless Keyboard", "quantity": 2, "unit_price": 74.99},
  {"order_id": "ord_b7f2", "product": "USB-C Hub", "quantity": 1, "unit_price": 34.50},
  {"order_id": "ord_c4e8", "product": "Monitor Stand", "quantity": 3, "unit_price": 29.95}
]
"""

records = json.loads(json_string)

with open("orders.csv", "w", newline="", encoding="utf-8") as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)

# orders.csv:
# order_id,product,quantity,unit_price
# ord_91a3,Wireless Keyboard,2,74.99
# ord_b7f2,USB-C Hub,1,34.50
# ord_c4e8,Monitor Stand,3,29.95

That newline="" argument on open() is not optional on Windows. Without it, you get double carriage returns — which show up as blank rows between every data row in Excel. On macOS and Linux it is harmless, so just always include it.

The code above uses json.loads() for a string. Use json.load() (no trailing s) when reading from a file handle. This trips people up constantly — one reads a string, the other reads a file object.

# Python 3.8+ — read JSON file, write CSV file
import json
import csv

with open("server_metrics.json", encoding="utf-8") as jf:
    metrics = json.load(jf)  # json.load() for file objects

# Explicit fieldnames control column order
columns = ["timestamp", "hostname", "cpu_percent", "memory_mb", "disk_io_ops"]

with open("server_metrics.csv", "w", newline="", encoding="utf-8") as cf:
    writer = csv.DictWriter(cf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(metrics)

# Only the five specified columns appear, in exactly that order

Setting extrasaction="ignore" silently drops any keys in the dicts that are not in your fieldnames list. The default is "raise", which throws a ValueError if any dict has an unexpected key. Pick whichever matches your tolerance for surprises.
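A quick sketch of the difference, using an in-memory buffer (the `debug_trace` key is a hypothetical extra field):

```python
import csv
import io

# One record carrying a key that is not in fieldnames
record = {"order_id": "ord_91a3", "total": 149.99, "debug_trace": "gc pause 120ms"}
fields = ["order_id", "total"]

# Default extrasaction="raise": the unexpected key raises ValueError
strict = csv.DictWriter(io.StringIO(), fieldnames=fields)
try:
    strict.writerow(record)
except ValueError as exc:
    print("raise:", exc)

# extrasaction="ignore": the extra key is silently dropped
out = io.StringIO()
lenient = csv.DictWriter(out, fieldnames=fields, extrasaction="ignore")
lenient.writeheader()
lenient.writerow(record)
print(out.getvalue(), end="")
# order_id,total
# ord_91a3,149.99
```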

Note: csv.DictWriter vs csv.writer: DictWriter maps dict keys to column positions automatically. csv.writer writes raw lists as rows — you handle the column ordering yourself. DictWriter is almost always the right choice for JSON-to-CSV because JSON records are already dictionaries.
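A side-by-side sketch of the same two records written both ways. With csv.writer you supply the header row and the per-row value ordering yourself; DictWriter derives both from fieldnames:

```python
import csv
import io

records = [{"host": "web-1", "cpu": 72.3}, {"host": "web-2", "cpu": 45.1}]

# csv.writer: header and column order managed by hand
buf_writer = io.StringIO()
w = csv.writer(buf_writer)
w.writerow(["host", "cpu"])                 # header, written manually
for r in records:
    w.writerow([r["host"], r["cpu"]])       # values pulled out in the right order

# csv.DictWriter: dict keys map to columns automatically
buf_dict = io.StringIO()
dw = csv.DictWriter(buf_dict, fieldnames=["host", "cpu"])
dw.writeheader()
dw.writerows(records)

print(buf_writer.getvalue() == buf_dict.getvalue())  # True: identical CSV output
```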

Python's csv module ships with three named dialects: excel (comma delimiter, CRLF line endings — the default), excel-tab (tab delimiter, CRLF endings), and unix (LF line endings, quotes all fields). For most JSON-to-CSV workflows the excel dialect is correct, but switch to unix when writing files that will be processed by POSIX tools like awk or sort.
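A small sketch showing how the same row comes out under each named dialect:

```python
import csv
import io

row = {"path": "/var/log/app.log", "count": 42}
outputs = {}

for dialect in ("excel", "excel-tab", "unix"):
    buf = io.StringIO()
    w = csv.DictWriter(buf, fieldnames=["path", "count"], dialect=dialect)
    w.writeheader()
    w.writerow(row)
    outputs[dialect] = buf.getvalue()
    print(dialect, repr(outputs[dialect]))

# excel:     'path,count\r\n/var/log/app.log,42\r\n'
# excel-tab: 'path\tcount\r\n/var/log/app.log\t42\r\n'
# unix:      '"path","count"\n"/var/log/app.log","42"\n'
```

Note how the unix dialect quotes every field and ends lines with a bare LF.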


Handling Non-Standard Types: datetime, UUID, and Decimal

JSON from APIs often contains dates as ISO strings, UUIDs as hyphenated strings, and monetary values as floats. When you parse these into Python objects for processing before writing CSV, you need to convert them back to strings. The csv module calls str() on every value, so most types just work. But datetime objects produce messy default string representations, and Decimal values need explicit formatting to avoid scientific notation.

# Python 3.8+ — pre-process datetime and Decimal before CSV write
import json
import csv
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID

# Simulating parsed API response with Python types
transactions = [
    {
        "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
        "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc),
        "amount": Decimal("1249.99"),
        "currency": "USD",
        "merchant": "CloudHost Inc.",
    },
    {
        "txn_id": UUID("b2c3d4e5-f6a7-8901-bcde-f12345678901"),
        "created_at": datetime(2026, 3, 15, 14, 12, 0, tzinfo=timezone.utc),
        "amount": Decimal("87.50"),
        "currency": "EUR",
        "merchant": "DataSync GmbH",
    },
]

def prepare_row(record: dict) -> dict:
    """Convert non-string types to CSV-friendly strings."""
    return {
        "txn_id": str(record["txn_id"]),
        "created_at": record["created_at"].isoformat(),
        "amount": f"{record['amount']:.2f}",
        "currency": record["currency"],
        "merchant": record["merchant"],
    }

with open("transactions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["txn_id", "created_at", "amount", "currency", "merchant"])
    writer.writeheader()
    for txn in transactions:
        writer.writerow(prepare_row(txn))

# transactions.csv:
# txn_id,created_at,amount,currency,merchant
# a1b2c3d4-e5f6-7890-abcd-ef1234567890,2026-03-15T09:30:00+00:00,1249.99,USD,CloudHost Inc.
# b2c3d4e5-f6a7-8901-bcde-f12345678901,2026-03-15T14:12:00+00:00,87.50,EUR,DataSync GmbH

Warning: If you let Decimal values pass through without formatting, very small or very large numbers may render in scientific notation (e.g., 1.5E+7). Always format Decimal with an explicit f-string like f"{value:.2f}" when writing financial data to CSV.
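To see the failure mode concretely, here is a minimal sketch (the amount is hypothetical) showing what str() produces for a normalized large Decimal versus explicit formatting:

```python
from decimal import Decimal

# Decimal.normalize() strips trailing zeros, which shifts large values to exponent form
amount = Decimal("15000000").normalize()

print(str(amount))      # 1.5E+7  (what str(), and therefore the csv module, would write)
print(f"{amount:.2f}")  # 15000000.00  (explicit formatting restores plain notation)
```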


csv.DictWriter Parameters Reference

The full constructor signature is csv.DictWriter(f, fieldnames, restval="", extrasaction="raise", dialect="excel", **fmtparams). Most of these have sensible defaults. The ones you will actually change are fieldnames, delimiter, and extrasaction.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| f | file object | (required) | Any object with a write() method — typically from open() |
| fieldnames | sequence | (required) | List of keys that defines column order in the CSV output |
| restval | str | "" | Value written when a dict is missing a key from fieldnames |
| extrasaction | str | "raise" | "raise" throws ValueError for extra keys; "ignore" silently drops them |
| dialect | str / Dialect | "excel" | Predefined formatting rules — "excel", "excel-tab", or "unix" |
| delimiter | str | "," | Single character separating fields — use "\t" for TSV output |
| quotechar | str | '"' | Character used to quote fields containing the delimiter |
| quoting | int | csv.QUOTE_MINIMAL | Controls when quoting is applied — MINIMAL, ALL, NONNUMERIC, NONE |
| lineterminator | str | "\r\n" | String appended after each row — override to "\n" for Unix-style output |

pandas — Convert JSON to CSV with DataFrames

If you are already working in a pandas-heavy codebase, or your JSON has nested objects that you need to flatten, the pandas approach is significantly less code than the stdlib version. The tradeoff: pandas is a ~30 MB dependency. For a throwaway script, that is fine. For a Docker image you ship to production, the stdlib approach keeps things lighter.

# Python 3.8+ — pandas read_json then to_csv
import pandas as pd

# Read JSON array directly into a DataFrame
df = pd.read_json("warehouse_inventory.json")

# Write to CSV — index=False prevents the auto-generated row numbers
df.to_csv("warehouse_inventory.csv", index=False)

# That's it. Two lines. pandas infers column types automatically.

The index=False flag is one of those things you look up every single time. Without it, pandas writes a 0, 1, 2, ... column as the first column of your CSV. Nobody wants that.

Flattening Nested JSON with json_normalize

Real API responses are rarely flat. Orders contain shipping addresses, users contain nested preferences, telemetry events contain nested metadata. pd.json_normalize() walks nested dictionaries and flattens them into columns with dot-separated names.

# Python 3.8+ — flatten nested JSON using json_normalize
import json
import pandas as pd

api_response = """
[
  {
    "order_id": "ord_91a3",
    "placed_at": "2026-03-15T09:30:00Z",
    "customer": {
      "name": "Sarah Chen",
      "email": "s.chen@example.com",
      "tier": "premium"
    },
    "shipping": {
      "method": "express",
      "address": {
        "city": "Portland",
        "state": "OR",
        "zip": "97201"
      }
    },
    "total": 299.95
  },
  {
    "order_id": "ord_b7f2",
    "placed_at": "2026-03-15T14:12:00Z",
    "customer": {
      "name": "James Park",
      "email": "j.park@example.com",
      "tier": "standard"
    },
    "shipping": {
      "method": "standard",
      "address": {
        "city": "Austin",
        "state": "TX",
        "zip": "73301"
      }
    },
    "total": 87.50
  }
]
"""

orders = json.loads(api_response)

# json_normalize flattens nested dicts — sep controls the delimiter
df = pd.json_normalize(orders, sep="_")
df.to_csv("flat_orders.csv", index=False)

# Resulting columns:
# order_id, placed_at, customer_name, customer_email, customer_tier,
# shipping_method, shipping_address_city, shipping_address_state,
# shipping_address_zip, total

The sep="_" parameter controls how nested key names are joined. The default is ".", which produces columns like customer.name. I prefer underscores because dots in column names cause trouble with SQL imports and some spreadsheet formulas.

For API responses that wrap the records array under a nested key, use the record_path parameter. If the response looks like {"data": {"orders": [...]}}, pass record_path=["data", "orders"] to navigate to the right list.
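A brief sketch of record_path with a hypothetical wrapped payload:

```python
import pandas as pd

# Hypothetical API envelope: the records live two levels down
payload = {
    "data": {
        "orders": [
            {"order_id": "ord_91a3", "total": 299.95},
            {"order_id": "ord_b7f2", "total": 87.50},
        ]
    }
}

# record_path walks down to the nested list before flattening
df = pd.json_normalize(payload, record_path=["data", "orders"])
print(list(df.columns))  # ['order_id', 'total']
```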


DataFrame.to_csv() Parameters Reference

DataFrame.to_csv() has over 20 parameters. These are the ones that matter for JSON-to-CSV workflows.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| path_or_buf | str / Path / None | None | File path or buffer — None returns CSV as a string |
| sep | str | "," | Field delimiter — use "\t" for TSV |
| index | bool | True | Write row index as the first column — almost always set to False |
| columns | list | None | Subset and reorder columns in the output |
| header | bool / list | True | Write column names — set False when appending to existing file |
| encoding | str | "utf-8" | Output encoding — use "utf-8-sig" for Excel compatibility on Windows |
| na_rep | str | "" | String representation for missing values (NaN, None) |
| quoting | int | csv.QUOTE_MINIMAL | Controls when fields get quoted |

# Python 3.8+ — to_csv with common parameter overrides
import pandas as pd

df = pd.read_json("telemetry_events.json")

# TSV output with explicit encoding and missing value handling
df.to_csv(
    "telemetry_events.tsv",
    sep="\t",
    index=False,
    encoding="utf-8",
    na_rep="NULL",
    columns=["event_id", "timestamp", "source", "severity", "message"],
)

# Write to stdout for piping in shell scripts
print(df.to_csv(index=False))

# Return as string (no file written)
csv_string = df.to_csv(index=False)
print(len(csv_string), "characters")

Convert JSON to CSV from a File and API Response

The two most common real-world scenarios: reading JSON from a file on disk and converting it, or fetching JSON from an HTTP API and saving the result as CSV.

File on Disk — Read, Convert, Save

# Python 3.8+ — convert JSON file to CSV with error handling
import json
import csv
import sys

def json_file_to_csv(input_path: str, output_path: str) -> int:
    """Convert a JSON file containing an array of objects to CSV.
    Returns the number of rows written.
    """
    try:
        with open(input_path, encoding="utf-8") as jf:
            data = json.load(jf)
    except FileNotFoundError:
        print(f"Error: {input_path} not found", file=sys.stderr)
        return 0
    except json.JSONDecodeError as exc:
        print(f"Error: invalid JSON in {input_path}: {exc.msg} at line {exc.lineno}", file=sys.stderr)
        return 0

    if not isinstance(data, list) or not data:
        print(f"Error: expected a non-empty JSON array in {input_path}", file=sys.stderr)
        return 0

    # Collect all unique keys across all records — handles inconsistent schemas
    all_keys: list[str] = []
    seen: set[str] = set()
    for record in data:
        for key in record:
            if key not in seen:
                all_keys.append(key)
                seen.add(key)

    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=all_keys, restval="", extrasaction="ignore")
        writer.writeheader()
        writer.writerows(data)

    return len(data)

rows = json_file_to_csv("deploy_logs.json", "deploy_logs.csv")
print(f"Wrote {rows} rows to deploy_logs.csv")

HTTP API Response — Fetch and Convert

# Python 3.8+ — fetch JSON from API and save as CSV
import json
import csv
import urllib.request
import urllib.error

def api_response_to_csv(url: str, output_path: str) -> int:
    """Fetch JSON from a REST API endpoint and write it as CSV."""
    try:
        req = urllib.request.Request(url, headers={"Accept": "application/json"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            if resp.status != 200:
                print(f"Error: API returned status {resp.status}")
                return 0
            body = resp.read().decode("utf-8")
    except urllib.error.URLError as exc:
        print(f"Error: could not reach {url}: {exc.reason}")
        return 0

    try:
        records = json.loads(body)
    except json.JSONDecodeError as exc:
        print(f"Error: API returned invalid JSON: {exc.msg}")
        return 0

    if not isinstance(records, list) or not records:
        print("Error: expected a non-empty JSON array from the API")
        return 0

    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=records[0].keys())
        writer.writeheader()
        writer.writerows(records)

    return len(records)

Note: The example above uses urllib from the standard library to keep the script dependency-free. If you have requests installed, replace the urllib section with resp = requests.get(url, timeout=30); records = resp.json() — the rest of the CSV writing code stays identical.


Command-Line JSON to CSV Conversion

Sometimes you just need a one-liner in the terminal. Python's -c flag lets you run a quick conversion without creating a script file.

# Python one-liner: reads JSON from stdin, writes CSV to stdout
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
"

# Save output to a file
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
" > orders.csv
# Using jq + csvkit for complex transformations
# Install csvkit: pip install csvkit

# jq flattens and selects fields, in2csv handles the CSV formatting
cat api_response.json | jq '[.[] | {id: .order_id, customer: .customer.name, total}]' | in2csv -f json > orders.csv

# Miller (mlr) is another option for JSON-to-CSV
mlr --ijson --ocsv cat orders.json > orders.csv

Miller (mlr) is a standalone binary that treats JSON, CSV, and TSV as first-class formats with no Python runtime required. The --ijson --ocsv flag pair converts JSON input to CSV in a single pass. Install via Homebrew on macOS (brew install miller) or your Linux package manager.


High-Performance Alternative — pandas with pyarrow

For datasets in the tens-of-millions-of-rows range, pandas with the pyarrow backend reads and writes significantly faster than the default.

pip install pyarrow
# Python 3.8+ — pandas with pyarrow for faster CSV writing
import pandas as pd

# Read JSON with the pyarrow engine (faster parsing for large files;
# requires pandas 2.0+ with pyarrow installed)
df = pd.read_json("sensor_readings.json", engine="pyarrow")

# to_csv doesn't have an engine parameter, but the DataFrame operations
# between read and write benefit from pyarrow's columnar layout
df.to_csv("sensor_readings.csv", index=False)

# For truly large exports, consider writing to Parquet instead of CSV
# — binary format, 5-10x smaller, preserves types
df.to_parquet("sensor_readings.parquet", engine="pyarrow")

If you are processing more than a few hundred MB of JSON and the final consumer accepts Parquet, skip CSV entirely. Parquet is smaller, preserves column types, and both Redshift and BigQuery load it natively. CSV is a lossy format — every value becomes a string.
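The lossiness is easy to demonstrate with the stdlib alone: round-trip one typed row through csv and every value comes back as a string.

```python
import csv
import io

row = {"sensor_id": 17, "reading": 21.5, "active": True}

# Write the typed row to an in-memory CSV
buf = io.StringIO()
w = csv.DictWriter(buf, fieldnames=row.keys())
w.writeheader()
w.writerow(row)

# Read it back: int, float, and bool have all become str
buf.seek(0)
restored = next(csv.DictReader(buf))
print(restored)  # {'sensor_id': '17', 'reading': '21.5', 'active': 'True'}
```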


Working with Large JSON Files

json.load() reads the entire file into memory. For a 200 MB JSON file, that means ~200 MB of raw text plus the Python object overhead — easily 500 MB+ of heap usage. For files over 100 MB, stream the input with ijson and write CSV rows as you go.

pip install ijson

Streaming JSON Array to CSV with ijson

# Python 3.8+ — stream large JSON array to CSV with constant memory
import ijson
import csv

def stream_json_to_csv(json_path: str, csv_path: str) -> int:
    """Convert a large JSON array to CSV without loading it all into memory."""
    with open(json_path, "rb") as jf, open(csv_path, "w", newline="", encoding="utf-8") as cf:
        # ijson.items yields each element of the top-level array one at a time
        records = ijson.items(jf, "item")

        first_record = next(records)
        fieldnames = list(first_record.keys())

        writer = csv.DictWriter(cf, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow(first_record)

        count = 1
        for record in records:
            writer.writerow(record)
            count += 1

    return count

rows = stream_json_to_csv("clickstream_2026_03.json", "clickstream_2026_03.csv")
print(f"Streamed {rows} records to CSV")

NDJSON / JSON Lines — One Object Per Line

NDJSON (Newline-Delimited JSON), also called JSON Lines or .jsonl, stores one valid JSON object per line with no wrapping array. This format is common in log pipelines, event streams (Kafka, Kinesis), and bulk exports from services like Elasticsearch and BigQuery.

# Python 3.8+ — convert NDJSON to CSV line by line
import json
import csv

def ndjson_to_csv(ndjson_path: str, csv_path: str) -> int:
    """Convert a newline-delimited JSON file to CSV, one line at a time."""
    with open(ndjson_path, encoding="utf-8") as nf:
        first_line = nf.readline()
        first_record = json.loads(first_line)
        fieldnames = list(first_record.keys())

        with open(csv_path, "w", newline="", encoding="utf-8") as cf:
            writer = csv.DictWriter(cf, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerow(first_record)

            count = 1
            for line in nf:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                    writer.writerow(record)
                    count += 1
                except json.JSONDecodeError:
                    continue  # skip malformed lines

    return count

rows = ndjson_to_csv("access_log.ndjson", "access_log.csv")
print(f"Converted {rows} log entries to CSV")

Note: Switch to streaming when the JSON file exceeds 100 MB. A 1 GB JSON array loaded with json.load() can consume 3–5 GB of RAM due to Python object overhead. With ijson, memory stays flat regardless of file size. If you just need a quick conversion of a small file, paste it into the JSON to CSV converter instead.


Common Mistakes

❌ Missing newline='' in open() — blank rows on Windows

Problem: The csv module writes \r\n line endings. Without newline='', Python's text mode adds another \r on Windows, producing double-spaced output.

Fix: Always pass newline="" when opening a file for CSV writing. It is harmless on macOS/Linux.

# Before — blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)

# After — clean output on all platforms
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)

❌ Forgetting index=False in pandas to_csv()

Problem: Without index=False, pandas prepends an auto-incrementing row number column (0, 1, 2, ...) that pollutes the CSV with data that was never in the original JSON.

Fix: Pass index=False to to_csv().

# Before
df = pd.read_json("events.json")
df.to_csv("events.csv")
# CSV gets an extra unnamed column: ,event_id,timestamp,...

# After
df = pd.read_json("events.json")
df.to_csv("events.csv", index=False)
# Clean CSV: event_id,timestamp,...

❌ Using records[0].keys() when records have inconsistent keys

Problem: If JSON objects have different keys (some records have optional fields), using the first record's keys as fieldnames silently drops columns that only appear in later records.

Fix: Collect all unique keys across all records before creating the DictWriter.

# Before
records = json.load(f)
writer = csv.DictWriter(out, fieldnames=records[0].keys())
# Misses "discount" field that only appears in records[2]

# After
records = json.load(f)
all_keys = list(dict.fromkeys(k for r in records for k in r))
writer = csv.DictWriter(out, fieldnames=all_keys, restval="")
# Every key from every record is included as a column

❌ Writing nested dicts directly to CSV without flattening

Problem: csv.DictWriter calls str() on nested dicts, producing columns with values like "{'city': 'Portland'}" — raw Python repr, not actual data.

Fix: Flatten nested objects first using pd.json_normalize() or a custom flattening function.

# Before
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
writer = csv.DictWriter(f, fieldnames=["id", "meta"])
writer.writerows(records)
# meta column contains: {'source': 'web', 'region': 'us-west'}

# After
import pandas as pd
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
df = pd.json_normalize(records, sep="_")
df.to_csv("events.csv", index=False)
# Columns: id, meta_source, meta_region

csv.DictWriter vs pandas — Quick Comparison

| Method | Nested JSON | Custom Types | Streaming | Dependencies | Requires Install |
| --- | --- | --- | --- | --- | --- |
| csv.DictWriter | ✗ (manual flatten) | manual str() | ✓ (row by row) | None | No (stdlib) |
| csv.writer | ✗ (manual flatten) | manual str() | ✓ (row by row) | None | No (stdlib) |
| pd.DataFrame.to_csv() | ✗ (flat only) | ✓ (via dtypes) | ✗ | pandas + numpy | pip install |
| pd.json_normalize() + to_csv() | ✓ | ✓ (via dtypes) | ✗ | pandas + numpy | pip install |
| csv.writer + flatten_json | ✓ (via flatten_json) | manual str() | ✓ (row by row) | flatten_json | pip install |
| jq + csvkit (CLI) | ✓ (via jq) | N/A | ✗ | jq, csvkit | System install |

Use csv.DictWriter when you need zero dependencies, your JSON is flat, and the script runs in a restricted environment (CI containers, Lambda functions, embedded Python). Use pd.json_normalize() + to_csv() when the JSON is nested, you need to transform or filter data before export, or you are already in a pandas workflow. For files that do not fit in memory, combine ijson with csv.DictWriter for constant-memory streaming. For quick, no-code conversions, the JSON to CSV converter on ToolDeck handles it without any Python setup.


Frequently Asked Questions

How do I convert JSON to CSV in Python without pandas?

Use the built-in json and csv modules. Call json.load() to parse the JSON file into a list of dicts, extract fieldnames from the first dict's keys, create a csv.DictWriter, call writeheader(), then writerows(). This approach has zero external dependencies and works in any Python 3.x environment.

If your JSON objects have inconsistent keys across records, collect all unique keys first with dict.fromkeys(k for r in records for k in r) before passing them as fieldnames to avoid missing columns.

import json
import csv

with open("orders.json") as f:
    records = json.load(f)

with open("orders.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)

How do I handle nested JSON when converting to CSV?

Flat JSON arrays map directly to CSV rows, but nested objects need flattening first. With pandas, pd.json_normalize() handles this automatically — it joins nested keys with a dot separator (e.g., "address.city"). Without pandas, write a recursive function that walks the dict and concatenates keys with a delimiter.

import pandas as pd

nested_data = [
    {"id": "ord_91a3", "customer": {"name": "Sarah Chen", "email": "s.chen@example.com"}},
]
df = pd.json_normalize(nested_data, sep="_")
# Columns: id, customer_name, customer_email
df.to_csv("flat_orders.csv", index=False)
Enter fullscreen mode Exit fullscreen mode
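Without pandas, the recursive approach mentioned above can be sketched in a few lines (this helper is illustrative, not part of the stdlib):

```python
def flatten(record: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Recursively flatten nested dicts, joining keys with sep."""
    flat = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, new_key, sep))  # recurse into nested dicts
        else:
            flat[new_key] = value
    return flat

nested = {"id": "ord_91a3", "customer": {"name": "Sarah Chen", "address": {"city": "Portland"}}}
print(flatten(nested))
# {'id': 'ord_91a3', 'customer_name': 'Sarah Chen', 'customer_address_city': 'Portland'}
```

Feed the flattened dicts straight into csv.DictWriter as shown earlier in this guide.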

Why does my CSV have blank rows between data rows on Windows?

The csv module writes \r\n line endings by default. On Windows, opening the file in text mode adds another \r, producing \r\r\n — which displays as a blank row. The fix is to always pass newline="" to open().

# Wrong — blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.writer(f)

# Correct — newline="" prevents double \r
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)

How do I append JSON records to an existing CSV file?

Open the file in append mode ("a") and create a DictWriter with the same fieldnames. Skip writeheader() since the header row already exists.

import csv

new_records = [
    {"order_id": "ord_f4c1", "total": 89.50, "status": "shipped"},
]

with open("orders.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "total", "status"])
    writer.writerows(new_records)

What is the fastest way to convert a large JSON file to CSV in Python?

For files under 500 MB, pd.read_json() followed by to_csv() is the fastest single-call approach — pandas uses optimized C code internally. For files above 500 MB, use ijson to stream JSON records and write them to CSV with csv.DictWriter row by row. For NDJSON files, a plain Python for loop over the file handle achieves constant memory without any third-party library.

# Fast for files that fit in memory
import pandas as pd
df = pd.read_json("large_dataset.json")
df.to_csv("large_dataset.csv", index=False)

# Streaming for files that don't fit in memory
import ijson, csv
with open("huge.json", "rb") as jf, open("huge.csv", "w", newline="") as cf:
    records = ijson.items(jf, "item")
    first = next(records)
    writer = csv.DictWriter(cf, fieldnames=first.keys())
    writer.writeheader()
    writer.writerow(first)
    for record in records:
        writer.writerow(record)

Can I write CSV output to stdout instead of a file in Python?

Yes. Pass sys.stdout as the file object to csv.writer() or csv.DictWriter(). With pandas, call to_csv(sys.stdout, index=False) or to_csv(None) to get a string you can print.

import csv
import sys
import json

data = json.loads('[{"host":"web-1","cpu":72.3},{"host":"web-2","cpu":45.1}]')
writer = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
writer.writeheader()
writer.writerows(data)
# host,cpu
# web-1,72.3
# web-2,45.1
