PostgreSQL Error 22P05: untranslatable character
The 22P05 untranslatable_character error occurs when PostgreSQL cannot convert a character from one encoding to another during a read or write operation. This typically happens when the database server encoding and the client encoding are mismatched, and a character exists in the source encoding that has no equivalent in the target encoding. It is one of the most common encoding-related errors in legacy or multi-language database environments.
Top 3 Causes
1. Server and Client Encoding Mismatch
The most frequent cause. If your database is encoded in LATIN1 but your application sends UTF-8 characters (e.g., Korean, emoji, special symbols), PostgreSQL cannot translate them.
-- Check your database encoding
SELECT datname, pg_encoding_to_char(encoding) AS encoding
FROM pg_database
WHERE datname = current_database();
-- Check current session client encoding
SHOW client_encoding;
-- Fix: explicitly set client encoding to match your data
SET client_encoding = 'UTF8';
2. Importing Data with Mixed or Unknown Encoding
During ETL or COPY operations from external sources (CSV, Oracle exports, legacy files), the source encoding may be ambiguous or mixed, causing untranslatable characters to slip through.
-- Always specify encoding explicitly in COPY
COPY my_table (col1, col2)
FROM '/data/import.csv'
WITH (FORMAT csv, HEADER true, ENCODING 'UTF8');
-- Detect rows containing non-ASCII characters
SELECT id, content
FROM my_table
WHERE length(content) != octet_length(content);
-- Convert bytes from a known encoding (e.g., EUC-KR to UTF-8)
SELECT convert_from('\xc7\xd1\xb1\xdb'::bytea, 'EUC_KR') AS result;
3. Inserting Multibyte Characters into a SQL_ASCII Database
SQL_ASCII databases skip encoding conversion entirely. When a UTF-8 client tries to insert multibyte characters, PostgreSQL raises this error because it refuses to silently mangle the data.
-- Test whether a conversion will succeed before committing
DO $$
BEGIN
PERFORM convert('test string'::bytea, 'UTF8', 'LATIN1');
RAISE NOTICE 'Conversion OK';
EXCEPTION WHEN untranslatable_character THEN
RAISE NOTICE 'Untranslatable character detected: %', SQLERRM;
END;
$$;
-- Strip non-ASCII characters as a temporary workaround
UPDATE my_table
SET content = regexp_replace(content, '[^\x00-\x7F]', '', 'g')
WHERE content ~ '[^\x00-\x7F]';
Quick Fix Solutions
-
Set client encoding explicitly at the session or connection level:
SET client_encoding = 'UTF8'; - Create new databases with UTF-8 to avoid the problem entirely:
CREATE DATABASE mydb
ENCODING 'UTF8'
LC_COLLATE = 'en_US.UTF-8'
LC_CTYPE = 'en_US.UTF-8'
TEMPLATE = template0;
- Pre-validate data before loading to catch problematic characters early.
-
Use
convert_from()to safely handle raw byte data from external systems.
Prevention Tips
Standardize on UTF-8 for all databases. UTF-8 covers the full Unicode range including all languages and emoji, eliminating the majority of encoding translation errors. Enforce this as a team or organization standard in your database provisioning scripts.
Always specify encoding in connection strings and COPY commands. Never rely on defaults. In your application config, explicitly set
client_encoding=UTF8. In data pipelines, always pass theENCODINGoption toCOPY. This makes encoding behavior deterministic and auditable regardless of server defaults.
Related Errors
- 22021 (character_not_in_repertoire): Similar to 22P05 but raised during character set validation rather than encoding conversion.
- 22000 (data_exception): The parent error class that 22P05 falls under.
- 08P01 (protocol_violation): Can appear when encoding negotiation fails at the protocol level during connection setup.
📖 Want a more detailed guide?
Check out the full in-depth version (Korean) on oraerror.com — includes detailed analysis, additional SQL examples, and prevention tips.
Top comments (0)