Almost every data pipeline eventually hits the same step: an API returns JSON, but the next consumer — a spreadsheet, an import script, a Redshift COPY command — needs CSV. Converting JSON to CSV in Python sounds trivial until you hit nested objects, inconsistent keys, or datetime values that need special handling.
Python gives you two solid paths: the built-in json + csv modules for zero-dependency scripts, and pandas for nested flattening and larger datasets — or the online JSON to CSV converter for quick one-off conversions without any code. This guide covers both approaches end to end, with runnable Python 3.8+ examples.
Key takeaways:
- csv.DictWriter converts a list of dicts to CSV with zero dependencies — use json.load() to parse, then writeheader() + writerows().
- Always open CSV files with newline="" to prevent blank rows between data rows on Windows.
- pd.json_normalize() flattens nested JSON into a flat DataFrame before calling to_csv() — it handles multi-level nesting automatically.
- Pass index=False to DataFrame.to_csv() — without it, pandas writes an unwanted row-number column.
- For files over 500 MB, use ijson for streaming JSON parsing combined with csv.DictWriter for constant memory usage.
What is JSON to CSV Conversion?
JSON to CSV conversion transforms an array of JSON objects into a tabular format where each object becomes a row and each key becomes a column header. JSON is hierarchical — objects can nest arbitrarily deep. CSV is flat — every value sits in a row-column grid. The conversion works cleanly when every object shares the same set of top-level keys. Nested objects, arrays, and inconsistent keys are where things get interesting.
Before (JSON):
[{"order_id":"ord_91a3","total":149.99,"status":"shipped"},
{"order_id":"ord_b7f2","total":34.50,"status":"pending"}]
After (CSV):
order_id,total,status
ord_91a3,149.99,shipped
ord_b7f2,34.50,pending
csv.DictWriter — Convert JSON to CSV Without Pandas
The csv module ships with every Python installation. No pip install, no virtual environment gymnastics. csv.DictWriter takes a list of dictionaries and writes each one as a CSV row, mapping dict keys to column headers. The fieldnames parameter controls both the column order and which keys get included.
# Python 3.8+ — minimal json to csv example
import json
import csv
# Sample JSON data — an array of order objects
json_string = """
[
{"order_id": "ord_91a3", "product": "Wireless Keyboard", "quantity": 2, "unit_price": 74.99},
{"order_id": "ord_b7f2", "product": "USB-C Hub", "quantity": 1, "unit_price": 34.50},
{"order_id": "ord_c4e8", "product": "Monitor Stand", "quantity": 3, "unit_price": 29.95}
]
"""
records = json.loads(json_string)
with open("orders.csv", "w", newline="", encoding="utf-8") as csvfile:
writer = csv.DictWriter(csvfile, fieldnames=records[0].keys())
writer.writeheader()
writer.writerows(records)
# orders.csv:
# order_id,product,quantity,unit_price
# ord_91a3,Wireless Keyboard,2,74.99
# ord_b7f2,USB-C Hub,1,34.50
# ord_c4e8,Monitor Stand,3,29.95
That newline="" argument on open() is not optional on Windows. Without it, you get double carriage returns — which show up as blank rows between every data row in Excel. On macOS and Linux it is harmless, so just always include it.
The code above uses json.loads() for a string. Use json.load() (no trailing s) when reading from a file handle. This trips people up constantly — one reads a string, the other reads a file object.
# Python 3.8+ — read JSON file, write CSV file
import json
import csv
with open("server_metrics.json", encoding="utf-8") as jf:
metrics = json.load(jf) # json.load() for file objects
# Explicit fieldnames control column order
columns = ["timestamp", "hostname", "cpu_percent", "memory_mb", "disk_io_ops"]
with open("server_metrics.csv", "w", newline="", encoding="utf-8") as cf:
writer = csv.DictWriter(cf, fieldnames=columns, extrasaction="ignore")
writer.writeheader()
writer.writerows(metrics)
# Only the five specified columns appear, in exactly that order
Setting extrasaction="ignore" silently drops any keys in the dicts that are not in your fieldnames list. The default is "raise", which throws a ValueError if any dict has an unexpected key. Pick whichever matches your tolerance for surprises.
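The default is easy to demonstrate. Here is a quick sketch with invented data, writing to an in-memory buffer:
# Python 3.8+ — extrasaction="raise" (default) vs "ignore", illustrative data
import csv
import io

rows = [{"id": 1, "status": "ok", "debug_flag": True}]  # debug_flag is not a declared column

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["id", "status"])  # extrasaction defaults to "raise"
try:
    writer.writerows(rows)
except ValueError as exc:
    print(exc)  # dict contains fields not in fieldnames: 'debug_flag'

writer = csv.DictWriter(buf, fieldnames=["id", "status"], extrasaction="ignore")
writer.writeheader()
writer.writerows(rows)  # debug_flag silently dropped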
Note: csv.DictWriter vs csv.writer: DictWriter maps dict keys to column positions automatically. csv.writer writes raw lists as rows — you handle the column ordering yourself. DictWriter is almost always the right choice for JSON-to-CSV because JSON records are already dictionaries.
Python's csv module ships with three named dialects: excel (comma delimiter, CRLF line endings — the default), excel-tab (tab delimiter, CRLF endings), and unix (LF line endings, quotes every field). For most JSON-to-CSV workflows the excel dialect is correct, but switch to unix when writing files that will be processed by POSIX tools like awk or sort.
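Here is a minimal sketch of switching dialects; the host data is invented for illustration:
# Python 3.8+ — same DictWriter, unix dialect (LF endings, every field quoted)
import csv

hosts = [{"hostname": "web-1", "cpu_percent": 72.3}]

with open("hosts_unix.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["hostname", "cpu_percent"], dialect="unix")
    writer.writeheader()
    writer.writerows(hosts)

# hosts_unix.csv (LF line endings):
# "hostname","cpu_percent"
# "web-1","72.3"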
Handling Non-Standard Types: datetime, UUID, and Decimal
JSON from APIs often contains dates as ISO strings, UUIDs as hyphenated strings, and monetary values as floats. When you parse these into Python objects for processing before writing CSV, you need to convert them back to strings. The csv module calls str() on every value, so most types just work. But datetime objects produce messy default string representations, and Decimal values need explicit formatting to avoid scientific notation.
# Python 3.8+ — pre-process datetime and Decimal before CSV write
import json
import csv
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID
# Simulating parsed API response with Python types
transactions = [
    {
        "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
        "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc),
        "amount": Decimal("1249.99"),
        "currency": "USD",
        "merchant": "CloudHost Inc.",
    },
    {
        "txn_id": UUID("b2c3d4e5-f6a7-8901-bcde-f12345678901"),
        "created_at": datetime(2026, 3, 15, 14, 12, 0, tzinfo=timezone.utc),
        "amount": Decimal("87.50"),
        "currency": "EUR",
        "merchant": "DataSync GmbH",
    },
]

def prepare_row(record: dict) -> dict:
    """Convert non-string types to CSV-friendly strings."""
    return {
        "txn_id": str(record["txn_id"]),
        "created_at": record["created_at"].isoformat(),
        "amount": f"{record['amount']:.2f}",
        "currency": record["currency"],
        "merchant": record["merchant"],
    }

with open("transactions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["txn_id", "created_at", "amount", "currency", "merchant"])
    writer.writeheader()
    for txn in transactions:
        writer.writerow(prepare_row(txn))
# transactions.csv:
# txn_id,created_at,amount,currency,merchant
# a1b2c3d4-e5f6-7890-abcd-ef1234567890,2026-03-15T09:30:00+00:00,1249.99,USD,CloudHost Inc.
# b2c3d4e5-f6a7-8901-bcde-f12345678901,2026-03-15T14:12:00+00:00,87.50,EUR,DataSync GmbH
Warning: If you let Decimal values pass through without formatting, very small or very large numbers may render in scientific notation (e.g., 1.5E+7). Always format Decimal with an explicit f-string like f"{value:.2f}" when writing financial data to CSV.
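The failure mode takes two lines to reproduce:
# Decimal keeps the exponent notation it was constructed with
from decimal import Decimal

print(str(Decimal("1.5E+7")))      # 1.5E+7, and this string is what lands in the CSV
print(f"{Decimal('1.5E+7'):.2f}")  # 15000000.00, explicit formatting fixes it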
csv.DictWriter Parameters Reference
The full constructor signature is csv.DictWriter(f, fieldnames, restval="", extrasaction="raise", dialect="excel", **fmtparams). Most of these have sensible defaults. The ones you will actually change are fieldnames, delimiter, and extrasaction.
| Parameter | Type | Default | Description |
|---|---|---|---|
| f | file object | (required) | Any object with a write() method — typically from open() |
| fieldnames | sequence | (required) | List of keys that defines column order in the CSV output |
| restval | str | "" | Value written when a dict is missing a key from fieldnames |
| extrasaction | str | "raise" | "raise" throws ValueError for extra keys; "ignore" silently drops them |
| dialect | str / Dialect | "excel" | Predefined formatting rules — "excel", "excel-tab", or "unix" |
| delimiter | str | "," | Single character separating fields — use "\t" for TSV output |
| quotechar | str | " | Character used to quote fields containing the delimiter |
| quoting | int | csv.QUOTE_MINIMAL | Controls when quoting is applied — MINIMAL, ALL, NONNUMERIC, NONE |
| lineterminator | str | "\r\n" | String appended after each row — override to "\n" for Unix-style output |
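To make restval and delimiter concrete, here is a short sketch with invented records; the second dict is deliberately missing a key:
# Python 3.8+ — restval fills missing keys, delimiter="\t" switches to TSV
import csv
import sys

rows = [
    {"id": "r1", "note": "complete"},
    {"id": "r2"},  # missing "note", so restval supplies the placeholder
]

writer = csv.DictWriter(sys.stdout, fieldnames=["id", "note"], restval="N/A", delimiter="\t")
writer.writeheader()
writer.writerows(rows)
# id	note
# r1	complete
# r2	N/A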
pandas — Convert JSON to CSV with DataFrames
If you are already working in a pandas-heavy codebase, or your JSON has nested objects that you need to flatten, the pandas approach is significantly less code than the stdlib version. The tradeoff: pandas is a ~30 MB dependency. For a throwaway script, that is fine. For a Docker image you ship to production, the stdlib approach keeps things lighter.
# Python 3.8+ — pandas read_json then to_csv
import pandas as pd
# Read JSON array directly into a DataFrame
df = pd.read_json("warehouse_inventory.json")
# Write to CSV — index=False prevents the auto-generated row numbers
df.to_csv("warehouse_inventory.csv", index=False)
# That's it. Two lines. pandas infers column types automatically.
The index=False flag is one of those things you look up every single time. Without it, pandas writes a 0, 1, 2, ... column as the first column of your CSV. Nobody wants that.
Flattening Nested JSON with json_normalize
Real API responses are rarely flat. Orders contain shipping addresses, users contain nested preferences, telemetry events contain nested metadata. pd.json_normalize() walks nested dictionaries and flattens them into columns with dot-separated names.
# Python 3.8+ — flatten nested JSON using json_normalize
import json
import pandas as pd
api_response = """
[
{
"order_id": "ord_91a3",
"placed_at": "2026-03-15T09:30:00Z",
"customer": {
"name": "Sarah Chen",
"email": "s.chen@example.com",
"tier": "premium"
},
"shipping": {
"method": "express",
"address": {
"city": "Portland",
"state": "OR",
"zip": "97201"
}
},
"total": 299.95
},
{
"order_id": "ord_b7f2",
"placed_at": "2026-03-15T14:12:00Z",
"customer": {
"name": "James Park",
"email": "j.park@example.com",
"tier": "standard"
},
"shipping": {
"method": "standard",
"address": {
"city": "Austin",
"state": "TX",
"zip": "73301"
}
},
"total": 87.50
}
]
"""
orders = json.loads(api_response)
# json_normalize flattens nested dicts — sep controls the delimiter
df = pd.json_normalize(orders, sep="_")
df.to_csv("flat_orders.csv", index=False)
# Resulting columns:
# order_id, placed_at, customer_name, customer_email, customer_tier,
# shipping_method, shipping_address_city, shipping_address_state,
# shipping_address_zip, total
The sep="_" parameter controls how nested key names are joined. The default is ".", which produces columns like customer.name. I prefer underscores because dots in column names cause trouble with SQL imports and some spreadsheet formulas.
For API responses that wrap the records array under a nested key, use the record_path parameter. If the response looks like {"data": {"orders": [...]}}, pass record_path=["data", "orders"] to navigate to the right list.
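A short sketch of that wrapped shape, with an invented payload:
# Python 3.8+ — record_path navigates to a nested records array
import pandas as pd

response = {"data": {"orders": [
    {"order_id": "ord_91a3", "total": 299.95},
    {"order_id": "ord_b7f2", "total": 87.50},
]}}

df = pd.json_normalize(response, record_path=["data", "orders"])
df.to_csv("wrapped_orders.csv", index=False)
# Columns: order_id, total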
DataFrame.to_csv() Parameters Reference
DataFrame.to_csv() has over 20 parameters. These are the ones that matter for JSON-to-CSV workflows.
| Parameter | Type | Default | Description |
|---|---|---|---|
| path_or_buf | str / Path / None | None | File path or buffer — None returns CSV as a string |
| sep | str | "," | Field delimiter — use "\t" for TSV |
| index | bool | True | Write row index as the first column — almost always set to False |
| columns | list | None | Subset and reorder columns in the output |
| header | bool / list | True | Write column names — set False when appending to existing file |
| encoding | str | "utf-8" | Output encoding — use "utf-8-sig" for Excel compatibility on Windows |
| na_rep | str | "" | String representation for missing values (NaN, None) |
| quoting | int | csv.QUOTE_MINIMAL | Controls when fields get quoted |
# Python 3.8+ — to_csv with common parameter overrides
import pandas as pd
df = pd.read_json("telemetry_events.json")
# TSV output with explicit encoding and missing value handling
df.to_csv(
"telemetry_events.tsv",
sep="\t",
index=False,
encoding="utf-8",
na_rep="NULL",
columns=["event_id", "timestamp", "source", "severity", "message"],
)
# Write to stdout for piping in shell scripts
print(df.to_csv(index=False))
# Return as string (no file written)
csv_string = df.to_csv(index=False)
print(len(csv_string), "characters")
Convert JSON to CSV from a File and API Response
The two most common real-world scenarios: reading JSON from a file on disk and converting it, or fetching JSON from an HTTP API and saving the result as CSV.
File on Disk — Read, Convert, Save
# Python 3.8+ — convert JSON file to CSV with error handling
import json
import csv
import sys
def json_file_to_csv(input_path: str, output_path: str) -> int:
    """Convert a JSON file containing an array of objects to CSV.

    Returns the number of rows written.
    """
    try:
        with open(input_path, encoding="utf-8") as jf:
            data = json.load(jf)
    except FileNotFoundError:
        print(f"Error: {input_path} not found", file=sys.stderr)
        return 0
    except json.JSONDecodeError as exc:
        print(f"Error: invalid JSON in {input_path}: {exc.msg} at line {exc.lineno}", file=sys.stderr)
        return 0
    if not isinstance(data, list) or not data:
        print(f"Error: expected a non-empty JSON array in {input_path}", file=sys.stderr)
        return 0
    # Collect all unique keys across all records — handles inconsistent schemas
    all_keys: list[str] = []
    seen: set[str] = set()
    for record in data:
        for key in record:
            if key not in seen:
                all_keys.append(key)
                seen.add(key)
    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=all_keys, restval="", extrasaction="ignore")
        writer.writeheader()
        writer.writerows(data)
    return len(data)
rows = json_file_to_csv("deploy_logs.json", "deploy_logs.csv")
print(f"Wrote {rows} rows to deploy_logs.csv")
HTTP API Response — Fetch and Convert
# Python 3.8+ — fetch JSON from API and save as CSV
import json
import csv
import urllib.request
import urllib.error
def api_response_to_csv(url: str, output_path: str) -> int:
    """Fetch JSON from a REST API endpoint and write it as CSV."""
    try:
        req = urllib.request.Request(url, headers={"Accept": "application/json"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            if resp.status != 200:
                print(f"Error: API returned status {resp.status}")
                return 0
            body = resp.read().decode("utf-8")
    except urllib.error.URLError as exc:
        print(f"Error: could not reach {url}: {exc.reason}")
        return 0
    try:
        records = json.loads(body)
    except json.JSONDecodeError as exc:
        print(f"Error: API returned invalid JSON: {exc.msg}")
        return 0
    if not isinstance(records, list) or not records:
        print("Error: expected a non-empty JSON array from the API")
        return 0
    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=records[0].keys())
        writer.writeheader()
        writer.writerows(records)
    return len(records)
Note: The example above uses urllib from the standard library to keep the script dependency-free. If you have requests installed, replace the urllib section with resp = requests.get(url, timeout=30); records = resp.json() — the rest of the CSV writing code stays identical.
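For reference, a sketch of the requests variant with a placeholder URL, assuming requests is installed:
# Sketch — fetch with requests instead of urllib (pip install requests)
import csv
import requests

resp = requests.get("https://api.example.com/orders", timeout=30)  # placeholder endpoint
resp.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
records = resp.json()    # parses the JSON body directly

with open("orders.csv", "w", newline="", encoding="utf-8") as cf:
    writer = csv.DictWriter(cf, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)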
Command-Line JSON to CSV Conversion
Sometimes you just need a one-liner in the terminal. Python's -c flag lets you run a quick conversion without creating a script file.
# Python one-liner: reads JSON from stdin, writes CSV to stdout
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
"
# Save output to a file
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
" > orders.csv
# Using jq + csvkit for complex transformations
# Install csvkit: pip install csvkit
# jq flattens and selects fields, in2csv handles the CSV formatting
cat api_response.json | jq '[.[] | {id: .order_id, customer: .customer.name, total}]' | in2csv -f json > orders.csv
# Miller (mlr) is another option for JSON-to-CSV
mlr --json2csv cat orders.json > orders.csv
Miller (mlr) is a standalone binary that treats JSON, CSV, and TSV as first-class formats with no Python runtime required. The --json2csv flag converts JSON input to CSV in a single pass. Install via Homebrew on macOS (brew install miller) or your Linux package manager.
High-Performance Alternative — pandas with pyarrow
For datasets in the tens-of-millions-of-rows range, pandas with the pyarrow backend reads and writes significantly faster than the default.
pip install pyarrow
# Python 3.8+ — pandas with pyarrow for faster CSV writing
import pandas as pd
# Read JSON with pyarrow engine (faster parsing for large files)
df = pd.read_json("sensor_readings.json", engine="pyarrow")
# to_csv doesn't have an engine parameter, but the DataFrame operations
# between read and write benefit from pyarrow's columnar layout
df.to_csv("sensor_readings.csv", index=False)
# For truly large exports, consider writing to Parquet instead of CSV
# — binary format, 5-10x smaller, preserves types
df.to_parquet("sensor_readings.parquet", engine="pyarrow")
If you are processing more than a few hundred MB of JSON and the final consumer accepts Parquet, skip CSV entirely. Parquet is smaller, preserves column types, and both Redshift and BigQuery load it natively. CSV is a lossy format — every value becomes a string.
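The lossiness is easy to demonstrate with a round-trip; the account IDs below are invented:
# Sketch — CSV round-trip loses string types (and leading zeros)
import pandas as pd

df = pd.DataFrame({"account_id": ["007", "042"], "balance": [12.5, 99.0]})
df.to_csv("accounts.csv", index=False)

restored = pd.read_csv("accounts.csv")
print(restored["account_id"].tolist())  # [7, 42], re-inferred as int64, zeros gone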
Working with Large JSON Files
json.load() reads the entire file into memory. For a 200 MB JSON file, that means ~200 MB of raw text plus the Python object overhead — easily 500 MB+ of heap usage. For files over 100 MB, stream the input with ijson and write CSV rows as you go.
pip install ijson
Streaming JSON Array to CSV with ijson
# Python 3.8+ — stream large JSON array to CSV with constant memory
import ijson
import csv
def stream_json_to_csv(json_path: str, csv_path: str) -> int:
    """Convert a large JSON array to CSV without loading it all into memory."""
    with open(json_path, "rb") as jf, open(csv_path, "w", newline="", encoding="utf-8") as cf:
        # ijson.items yields each element of the top-level array one at a time
        records = ijson.items(jf, "item")
        first_record = next(records)
        fieldnames = list(first_record.keys())
        writer = csv.DictWriter(cf, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow(first_record)
        count = 1
        for record in records:
            writer.writerow(record)
            count += 1
    return count
rows = stream_json_to_csv("clickstream_2026_03.json", "clickstream_2026_03.csv")
print(f"Streamed {rows} records to CSV")
NDJSON / JSON Lines — One Object Per Line
NDJSON (Newline-Delimited JSON), also called JSON Lines or .jsonl, stores one valid JSON object per line with no wrapping array. This format is common in log pipelines, event streams (Kafka, Kinesis), and bulk exports from services like Elasticsearch and BigQuery.
# Python 3.8+ — convert NDJSON to CSV line by line
import json
import csv
def ndjson_to_csv(ndjson_path: str, csv_path: str) -> int:
    """Convert a newline-delimited JSON file to CSV, one line at a time."""
    with open(ndjson_path, encoding="utf-8") as nf:
        first_line = nf.readline()
        first_record = json.loads(first_line)
        fieldnames = list(first_record.keys())
        with open(csv_path, "w", newline="", encoding="utf-8") as cf:
            writer = csv.DictWriter(cf, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerow(first_record)
            count = 1
            for line in nf:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                    writer.writerow(record)
                    count += 1
                except json.JSONDecodeError:
                    continue  # skip malformed lines
    return count
rows = ndjson_to_csv("access_log.ndjson", "access_log.csv")
print(f"Converted {rows} log entries to CSV")
Note: Switch to streaming when the JSON file exceeds 100 MB. A 1 GB JSON array loaded with json.load() can consume 3–5 GB of RAM due to Python object overhead. With ijson, memory stays flat regardless of file size. If you just need a quick conversion of a small file, paste it into the JSON to CSV converter instead.
Common Mistakes
❌ Missing newline='' in open() — blank rows on Windows
Problem: The csv module writes \r\n line endings. Without newline='', Python's text mode adds another \r on Windows, producing double-spaced output.
Fix: Always pass newline="" when opening a file for CSV writing. It is harmless on macOS/Linux.
# Before — blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)

# After — clean output on all platforms
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)
❌ Forgetting index=False in pandas to_csv()
Problem: Without index=False, pandas prepends an auto-incrementing row number column (0, 1, 2, ...) that pollutes the CSV with data that was never in the original JSON.
Fix: Pass index=False to to_csv().
# Before
df = pd.read_json("events.json")
df.to_csv("events.csv")
# CSV gets an extra unnamed column: ,event_id,timestamp,...
# After
df = pd.read_json("events.json")
df.to_csv("events.csv", index=False)
# Clean CSV: event_id,timestamp,...
❌ Using records[0].keys() when records have inconsistent keys
Problem: If JSON objects have different keys (some records have optional fields), using the first record's keys as fieldnames silently drops columns that only appear in later records.
Fix: Collect all unique keys across all records before creating the DictWriter.
# Before
records = json.load(f)
writer = csv.DictWriter(out, fieldnames=records[0].keys())
# Misses "discount" field that only appears in records[2]
# After
records = json.load(f)
all_keys = list(dict.fromkeys(k for r in records for k in r))
writer = csv.DictWriter(out, fieldnames=all_keys, restval="")
# Every key from every record is included as a column
❌ Writing nested dicts directly to CSV without flattening
Problem: csv.DictWriter calls str() on nested dicts, producing columns with values like "{'city': 'Portland'}" — raw Python repr, not actual data.
Fix: Flatten nested objects first using pd.json_normalize() or a custom flattening function.
# Before
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
writer = csv.DictWriter(f, fieldnames=["id", "meta"])
writer.writerows(records)
# meta column contains: {'source': 'web', 'region': 'us-west'}
# After
import pandas as pd
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
df = pd.json_normalize(records, sep="_")
df.to_csv("events.csv", index=False)
# Columns: id, meta_source, meta_region
csv.DictWriter vs pandas — Quick Comparison
| Method | Nested JSON | Custom Types | Streaming | Dependencies | Requires Install |
|---|---|---|---|---|---|
| csv.DictWriter | ✗ (manual flatten) | ✗ | ✓ (row by row) | None | No (stdlib) |
| csv.writer | ✗ | ✗ | ✓ (row by row) | None | No (stdlib) |
| pd.DataFrame.to_csv() | ✗ (flat only) | ✓ (via dtypes) | ✗ | pandas + numpy | pip install |
| pd.json_normalize() + to_csv() | ✓ | ✓ (via dtypes) | ✗ | pandas + numpy | pip install |
| csv.writer + flatten_json | ✓ | ✗ | ✓ | flatten_json | pip install |
| jq + csvkit (CLI) | ✓ (via jq) | N/A | ✓ | jq, csvkit | System install |
Use csv.DictWriter when you need zero dependencies, your JSON is flat, and the script runs in a restricted environment (CI containers, Lambda functions, embedded Python). Use pd.json_normalize() + to_csv() when the JSON is nested, you need to transform or filter data before export, or you are already in a pandas workflow. For files that do not fit in memory, combine ijson with csv.DictWriter for constant-memory streaming. For quick, no-code conversions, the JSON to CSV converter on ToolDeck handles it without any Python setup.
Frequently Asked Questions
How do I convert JSON to CSV in Python without pandas?
Use the built-in json and csv modules. Call json.load() to parse the JSON file into a list of dicts, extract fieldnames from the first dict's keys, create a csv.DictWriter, call writeheader(), then writerows(). This approach has zero external dependencies and works in any Python 3.x environment.
If your JSON objects have inconsistent keys across records, collect all unique keys first with dict.fromkeys(k for r in records for k in r) before passing them as fieldnames to avoid missing columns.
import json
import csv
with open("orders.json") as f:
records = json.load(f)
with open("orders.csv", "w", newline="", encoding="utf-8") as f:
writer = csv.DictWriter(f, fieldnames=records[0].keys())
writer.writeheader()
writer.writerows(records)
How do I handle nested JSON when converting to CSV?
Flat JSON arrays map directly to CSV rows, but nested objects need flattening first. With pandas, pd.json_normalize() handles this automatically — it joins nested keys with a dot separator (e.g., "address.city"). Without pandas, write a recursive function that walks the dict and concatenates keys with a delimiter.
import pandas as pd
nested_data = [
{"id": "ord_91a3", "customer": {"name": "Sarah Chen", "email": "s.chen@example.com"}},
]
df = pd.json_normalize(nested_data, sep="_")
# Columns: id, customer_name, customer_email
df.to_csv("flat_orders.csv", index=False)
Why does my CSV have blank rows between data rows on Windows?
The csv module writes \r\n line endings by default. On Windows, opening the file in text mode adds another \r, producing \r\r\n — which displays as a blank row. The fix is to always pass newline="" to open().
# Wrong — blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.writer(f)

# Correct — newline="" prevents double \r
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
How do I append JSON records to an existing CSV file?
Open the file in append mode ("a") and create a DictWriter with the same fieldnames. Skip writeheader() since the header row already exists.
import csv
new_records = [
{"order_id": "ord_f4c1", "total": 89.50, "status": "shipped"},
]
with open("orders.csv", "a", newline="", encoding="utf-8") as f:
writer = csv.DictWriter(f, fieldnames=["order_id", "total", "status"])
writer.writerows(new_records)
What is the fastest way to convert a large JSON file to CSV in Python?
For files under 500 MB, pd.read_json() followed by to_csv() is the fastest single-call approach — pandas uses optimized C code internally. For files above 500 MB, use ijson to stream JSON records and write them to CSV with csv.DictWriter row by row. For NDJSON files, a plain Python for loop over the file handle achieves constant memory without any third-party library.
# Fast for files that fit in memory
import pandas as pd
df = pd.read_json("large_dataset.json")
df.to_csv("large_dataset.csv", index=False)
# Streaming for files that don't fit in memory
import ijson, csv
with open("huge.json", "rb") as jf, open("huge.csv", "w", newline="") as cf:
records = ijson.items(jf, "item")
first = next(records)
writer = csv.DictWriter(cf, fieldnames=first.keys())
writer.writeheader()
writer.writerow(first)
for record in records:
writer.writerow(record)
Can I write CSV output to stdout instead of a file in Python?
Yes. Pass sys.stdout as the file object to csv.writer() or csv.DictWriter(). With pandas, call to_csv(sys.stdout, index=False) or to_csv(None) to get a string you can print.
import csv
import sys
import json
data = json.loads('[{"host":"web-1","cpu":72.3},{"host":"web-2","cpu":45.1}]')
writer = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
writer.writeheader()
writer.writerows(data)
# host,cpu
# web-1,72.3
# web-2,45.1
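The pandas version of the same idea, sketched with the data above:
# Sketch — pandas writing CSV straight to stdout or to a string
import sys
import pandas as pd

df = pd.DataFrame([{"host": "web-1", "cpu": 72.3}, {"host": "web-2", "cpu": 45.1}])
df.to_csv(sys.stdout, index=False)   # write directly to the stream
csv_text = df.to_csv(index=False)    # path_or_buf=None returns a string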