Tokens per Word: GPT-5 vs Claude vs GPT-4, Measured Across 7 Languages

#ai #llm #openai #claude

Most token-cost guides repeat the same rule of thumb: one token is about three quarters of an English word. That figure is roughly right for English on a modern tokenizer, and increasingly wrong for everything else. Published numbers are surprisingly thin, so we measured the ratio ourselves.

Method

We built a 13-sample corpus: one passage, 94 words in English, with human translations into Spanish, Portuguese, French, German, Chinese, and Japanese (so the comparison holds meaning constant, not length), plus Python, JavaScript, JSON, Markdown, emoji-heavy social text, and CSV data.

GPT counts come from tiktoken (o200k_base for GPT-5/4o, cl100k_base for GPT-4, p50k_base for the GPT-3 era), so they are exact. Claude counts come from Anthropic's official count-tokens endpoint with the message envelope calibrated out: we measured the fixed message wrapper at 6 to 7 tokens and subtracted it, and a doubling check came back with zero drift.
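The envelope calibration can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the script used for the dataset: `count_fn` is a hypothetical callable that returns the raw input-token count for a one-message request (for example, a thin wrapper around the count-tokens endpoint), and the probe text is assumed to tokenize the same way when repeated, which ending it with a newline makes more likely.

```python
from typing import Callable

# Hypothetical signature: text -> raw token count, envelope included.
CountFn = Callable[[str], int]


def estimate_envelope(count_fn: CountFn, probe: str) -> int:
    """Estimate the fixed wrapper overhead from a single and a doubled probe.

    Assumes raw(x) = envelope + content(x) and content(x + x) == 2 * content(x),
    which gives envelope = 2 * raw(x) - raw(x + x).
    """
    return 2 * count_fn(probe) - count_fn(probe + probe)


def content_tokens(count_fn: CountFn, text: str, envelope: int) -> int:
    """Raw count minus the fixed envelope."""
    return count_fn(text) - envelope


def doubling_drift(count_fn: CountFn, text: str, envelope: int) -> int:
    """Zero when the calibrated count of doubled text is exactly twice the single count."""
    single = content_tokens(count_fn, text, envelope)
    doubled = content_tokens(count_fn, text + text, envelope)
    return doubled - 2 * single


if __name__ == "__main__":
    # Stand-in counter for a dry run: 7 wrapper tokens plus one token per word.
    fake_count = lambda s: 7 + len(s.split())
    env = estimate_envelope(fake_count, "calibration probe\n")
    print(env, doubling_drift(fake_count, "one two three\n", env))
```

Injecting the counter keeps the arithmetic testable offline. To use it against a real endpoint, pass a function that sends the text as a single user message and returns the reported input-token count.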

Tokens per word, same passage

Language	Words	GPT-5 (o200k) tokens	GPT-5 tokens/word	GPT-4 (cl100k) tokens	Claude Sonnet 4.6 tokens	Claude Opus 4.8 tokens
English	94	110	1.17	110	116	177
Spanish	107	143	1.34	172	184	256
Portuguese	102	137	1.34	176	188	241
French	109	153	1.40	194	207	275
German	93	159	1.71	203	245	324
Chinese	n/a	159	n/a	223	217	216
Japanese	n/a	205	n/a	268	241	240
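
The tokens-per-word column, and the premium each language pays over English, follow directly from the counts in the table. A small sketch using only the numbers above (the names `COUNTS`, `tokens_per_word`, and `premium_vs_english` are illustrative and not part of the published dataset):

```python
# (words, GPT-5 o200k tokens, GPT-4 cl100k tokens), copied from the table above.
# Chinese and Japanese are omitted because the table gives no word counts for them.
COUNTS = {
    "English": (94, 110, 110),
    "Spanish": (107, 143, 172),
    "Portuguese": (102, 137, 176),
    "French": (109, 153, 194),
    "German": (93, 159, 203),
}


def tokens_per_word(words: int, tokens: int) -> float:
    """Tokens needed per whitespace-delimited word."""
    return tokens / words


def premium_vs_english(tokens: int, english_tokens: int) -> float:
    """Extra tokens for the same passage, as a fraction of the English count."""
    return tokens / english_tokens - 1


if __name__ == "__main__":
    _, en_o200k, en_cl100k = COUNTS["English"]
    for lang, (words, o200k, cl100k) in COUNTS.items():
        print(
            f"{lang:<11} {tokens_per_word(words, o200k):.2f} tok/word  "
            f"o200k {premium_vs_english(o200k, en_o200k):+.0%}  "
            f"cl100k {premium_vs_english(cl100k, en_cl100k):+.0%}"
        )
```

The premium compares whole-passage token totals rather than per-word ratios, because the translations differ in word count while carrying the same meaning.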

The four findings that surprised us

Spanish costs 30% more than English on GPT-5, but it used to be much worse. The same passage cost 56% more than English on GPT-4's cl100k and more than double on the GPT-3-era p50k. o200k roughly doubled the vocabulary and spent it on human languages.
Claude has two counting regimes. Anthropic's count-tokens endpoint reports identical numbers for Sonnet 4.6 and Haiku 4.5, but counts about 1.3x to 1.5x higher for Opus 4.8 on Latin-script text (1.53x on the English passage), while Chinese and Japanese barely change. Since billing follows each model's own count, Opus costs about 2.5x Sonnet per English word (1.53 × 1.67) despite a 1.67x sticker ratio.
CSV is the most expensive format we tested. It came to 57 tokens per 100 characters vs 19 for English prose, because digits, dates, and separators fragment into many small tokens.
o200k did not help code. Our JavaScript sample actually costs slightly more on o200k than on cl100k. The multilingual gains did not carry over to source code.
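
The 2.5x figure in the second finding is the token-count ratio multiplied by the price ratio. A quick sketch using the Sonnet 4.6 and Opus 4.8 counts from the table above; the 5/3 price ratio is an assumption standing in for the 1.67x sticker ratio quoted in the text, and the names are illustrative:

```python
from fractions import Fraction

# (Sonnet 4.6 tokens, Opus 4.8 tokens) for the same passage, from the table above.
CLAUDE_COUNTS = {
    "English": (116, 177),
    "Spanish": (184, 256),
    "German": (245, 324),
    "Chinese": (217, 216),
    "Japanese": (241, 240),
}

# Assumed per-token price ratio, Opus to Sonnet (about 1.67x, per the text).
STICKER_RATIO = Fraction(5, 3)


def effective_cost_ratio(sonnet_tokens: int, opus_tokens: int,
                         price_ratio: Fraction = STICKER_RATIO) -> float:
    """Opus cost divided by Sonnet cost for the same text."""
    return float(Fraction(opus_tokens, sonnet_tokens) * price_ratio)


if __name__ == "__main__":
    for lang, (sonnet, opus) in CLAUDE_COUNTS.items():
        print(f"{lang:<9} {effective_cost_ratio(sonnet, opus):.2f}x")
```

On these counts the effective ratio stays close to the sticker ratio for Chinese and Japanese and rises well above it for Latin-script text, which is the gap the finding describes.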

What a million words costs (input, measured)

Language	GPT-5	Claude Haiku 4.5	Claude Sonnet 4.6	Claude Opus 4.8
English	$1.46	$1.23	$3.70	$9.41
Spanish	$1.67	$1.72	$5.16	$11.96
German	$2.14	$2.63	$7.90	$17.42
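
Each cell in this table is tokens per word multiplied by the input price per token. The sketch below rebuilds the rows from the counts in the first table. The prices per million input tokens (1.25, 1.00, 3.00, and 5.00 USD) are not stated in this article; they are assumptions chosen because they reproduce the published figures, so check current vendor pricing before relying on them.

```python
# Assumed USD per million input tokens (not stated in the article).
ASSUMED_PRICE_PER_MTOK = {
    "GPT-5": 1.25,
    "Claude Haiku 4.5": 1.00,
    "Claude Sonnet 4.6": 3.00,
    "Claude Opus 4.8": 5.00,
}

# Word counts and per-model token counts from the first table. Haiku reuses
# Sonnet's counts because the article reports identical numbers for the two.
PASSAGES = {
    "English": {"words": 94, "GPT-5": 110, "Claude Haiku 4.5": 116,
                "Claude Sonnet 4.6": 116, "Claude Opus 4.8": 177},
    "Spanish": {"words": 107, "GPT-5": 143, "Claude Haiku 4.5": 184,
                "Claude Sonnet 4.6": 184, "Claude Opus 4.8": 256},
    "German": {"words": 93, "GPT-5": 159, "Claude Haiku 4.5": 245,
               "Claude Sonnet 4.6": 245, "Claude Opus 4.8": 324},
}


def cost_per_million_words(words: int, tokens: int, price_per_mtok: float) -> float:
    """USD to send one million words as input.

    One million words is (tokens / words) million tokens, priced per million tokens.
    """
    return tokens / words * price_per_mtok


if __name__ == "__main__":
    for lang, row in PASSAGES.items():
        cells = [
            f"${cost_per_million_words(row['words'], row[model], price):.2f}"
            for model, price in ASSUMED_PRICE_PER_MTOK.items()
        ]
        print(lang, *cells, sep="\t")
```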

Reproduce it

The full dataset, corpus, and methodology are free under CC BY 4.0:

Full article with all tables
tokenizer-comparison-2026.csv and JSON
The 13-sample corpus, so you can verify every count

To sanity-check the GPT numbers interactively, our browser-local token counter runs the real o200k encoding client-side, so counts match this dataset exactly and your text never leaves the page.