DEV Community

Mahizul Islam

How to Fix CSV Encoding Issues (UTF-8, Windows-1252, and More)

If you've ever opened a CSV file and seen broken characters like â€™ instead of an apostrophe, or Ã© instead of é, you've encountered a CSV encoding problem. This is one of the most common issues developers and data analysts face when working with CSV files.

In this guide, I'll explain why encoding issues happen, how to detect them, and how to fix them, with or without writing code.

Why Does CSV Encoding Matter?

CSV files don't store information about their encoding. When you open a CSV, your software has to guess which encoding was used. If it guesses wrong, you get garbled text.
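To see what a wrong guess looks like, here is a minimal Python sketch that encodes a few characters as UTF-8 and then decodes the same bytes as Windows-1252 (named `cp1252` in Python). The helper name `misread` and the sample strings are made up for this illustration.

```python
def misread(text: str, written_as: str = "utf-8", read_as: str = "cp1252") -> str:
    """Encode text one way, then decode the same bytes with a different guess."""
    return text.encode(written_as).decode(read_as)

garbled_e = misread("é")           # "Ã©"
garbled_pound = misread("£")       # "Â£"
garbled_apostrophe = misread("’")  # "â€™"

# When a tool converts text into an encoding that lacks a character,
# it often substitutes "?" and the original character is lost.
lossy = "日本語".encode("cp1252", errors="replace")  # b"???"
```

The bytes on disk never change in the first three cases; only the interpretation does, which is why re-reading the file with the right encoding fixes them.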

The most common culprits:

  • Windows-1252: the default encoding Excel on Windows uses for CSV files in Western European locales. It covers Western European languages but cannot represent characters from most other scripts.
  • ISO-8859-1 (Latin-1): similar to Windows-1252 and commonly used in older systems. The two differ only in the 0x80-0x9F byte range, where Windows-1252 adds characters such as curly quotes and the euro sign.
  • UTF-16: used by some Windows applications; files usually start with a BOM (Byte Order Mark).
  • Shift-JIS, GBK, EUC-KR: common in Japanese, Chinese, and Korean systems respectively.

UTF-8 is the de facto standard. Most modern databases, APIs, and web applications expect UTF-8. If your CSV isn't UTF-8, you can run into import errors, broken characters, and data loss.

How to Detect CSV Encoding

Before fixing, you need to know what encoding your file is using. Look out for these signs:

  • Strange characters like â€™, Ã©, Â£: classic UTF-8 text misread as Windows-1252
  • Question marks ? replacing characters: the text passed through an encoding that cannot represent those characters
  • Extra invisible characters at the start: this is a BOM (Byte Order Mark)
  • Import errors in MySQL, PostgreSQL, or MongoDB

You can check your CSV encoding instantly using the free CSV Encoding Checker — it detects UTF-8, Windows-1252, UTF-16, and more directly in your browser without uploading your file anywhere.
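If you would rather check in code, the signs above can be turned into a rough heuristic. This is only a sketch using Python's standard library: the function name `guess_encoding` is hypothetical, and the Windows-1252 fallback is an assumption, not a real detection.

```python
import codecs

def guess_encoding(data: bytes) -> str:
    """Return a best-guess Python codec name for raw CSV bytes."""
    # A BOM at the start is the only explicit hint a CSV file can carry.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # Non-ASCII text in a legacy single-byte encoding rarely happens to be
    # valid UTF-8, so a strict decode is a useful test.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: treat everything else as Windows-1252. It could just as
        # well be ISO-8859-1, Shift-JIS, GBK, or another legacy encoding.
        return "cp1252"
```

Read the file in binary mode (`open('your-file.csv', 'rb')`) and pass the bytes to the function. A pure-ASCII file is reported as UTF-8, which is harmless because ASCII is a subset of UTF-8.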

How to Fix CSV Encoding

Once you know the encoding, converting to UTF-8 is straightforward.

Option 1: Use a Free Online Tool (No Code)

The easiest way is to use the CSV to UTF-8 Converter. It supports 14 encodings including Windows-1252, ISO-8859-1, Shift-JIS, GBK, and UTF-16. Everything runs in your browser — your file is never uploaded to a server.

Option 2: Python

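import pandas as pd

# Read every column as text so values such as leading zeros or "NA"
# are not reinterpreted while the file passes through pandas
df = pd.read_csv('your-file.csv', encoding='windows-1252', dtype=str, keep_default_na=False)
# Write the same data back out as UTF-8
df.to_csv('fixed-file.csv', encoding='utf-8', index=False)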
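If pandas is not available, or you want to leave the CSV contents byte-for-byte identical apart from the encoding, a plain re-encode with the standard library may be enough. The sketch below assumes the source file is Windows-1252; the function name `convert_to_utf8` is made up for this example.

```python
def convert_to_utf8(src_path: str, dst_path: str, src_encoding: str = "cp1252") -> None:
    """Re-encode a text file to UTF-8 without parsing it as CSV."""
    # newline="" leaves the original line endings untouched on both sides.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)
```

Because the file is streamed line by line and never parsed, quoting, delimiters, and column values stay exactly as they were.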

Option 3: Node.js

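const iconv = require('iconv-lite');
const fs = require('fs');

// Read the raw bytes; passing no encoding returns a Buffer
const input = fs.readFileSync('your-file.csv');
// Interpret those bytes as Windows-1252
const decoded = iconv.decode(input, 'win1252');
// Write the resulting string back out as UTF-8
fs.writeFileSync('fixed-file.csv', decoded, 'utf8');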

Option 4: Excel

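  1. Open Excel → Data → From Text/CSV and select your file
  2. In the import dialog, set File Origin to the encoding the file is actually in, for example 1252: Western European (Windows), or 65001: Unicode (UTF-8) if the file is already UTF-8 and Excel guessed wrong
  3. Load the data, then save it with File → Save As → CSV UTF-8 (Comma delimited)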

The UTF-8 BOM Problem

Even after converting to UTF-8, Excel sometimes still shows garbled characters. This is because Excel on Windows needs a BOM (Byte Order Mark) — a hidden 3-byte marker at the start of the file — to recognize UTF-8.

When downloading from the CSV to UTF-8 Converter, the file automatically includes a BOM so Excel opens it correctly every time.
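If you produce the file yourself, the BOM is simply the three bytes EF BB BF placed before the content. A minimal Python sketch, with hypothetical helper names, for adding or removing it on raw bytes:

```python
import codecs

def add_utf8_bom(data: bytes) -> bytes:
    """Prepend the 3-byte UTF-8 BOM (EF BB BF) unless it is already there."""
    if data.startswith(codecs.BOM_UTF8):
        return data
    return codecs.BOM_UTF8 + data

def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM for consumers that do not expect one."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data
```

Python's built-in `utf-8-sig` codec does the same thing at the text level: it writes the BOM when encoding and skips it when decoding, so `df.to_csv('fixed-file.csv', encoding='utf-8-sig', index=False)` should give a file that Excel on Windows recognizes as UTF-8.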

Quick Reference: Common Encoding Issues

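Broken text              | Original character           | Likely cause
â€™                      | ’ (curly apostrophe)         | UTF-8 file read as Windows-1252
Ã©                       | é                            | UTF-8 file read as Windows-1252
Â£                       | £                            | UTF-8 file read as Windows-1252
?????                    | Japanese/Chinese/Korean text | Converted through an encoding that lacks those characters
Invisible chars at start | (none)                       | BOM (UTF-8 or UTF-16)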
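When text has already been garbled in the way the first three rows show, the damage can often be reversed by undoing the wrong decode. The sketch below assumes the text was UTF-8 that got read as Windows-1252; the function name `repair_mojibake` is made up for this example.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was mistakenly decoded as Windows-1252."""
    try:
        # Turn the characters back into the original bytes, then decode
        # those bytes the way they were meant to be read.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or not damaged at all): leave it alone.
        return text
```

With this sketch, `repair_mojibake("Ã©")` gives back `"é"`, while a correct string such as `"café"` is returned unchanged. It cannot help with question marks, because the original characters were discarded when they were replaced.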

Summary

  1. Check your encoding with a CSV Encoding Checker
  2. Convert to UTF-8 using Python, Node.js, Excel, or an online converter
  3. Include a UTF-8 BOM if opening in Excel on Windows
  4. Always save exports as UTF-8 to avoid future issues

All the tools mentioned in this article are free and browser-based — your data never leaves your device. Check out the full CSV toolkit for more tools like CSV Validator, CSV Formatter, and CSV Duplicate Remover.
