the bugs that only break in Japanese
Here's a failure mode that almost never shows up in your issue tracker: the code is correct in English, the tests pass, CI is green, and then someone types Japanese and something quietly goes wrong. No stack trace. No error. Just a result that's subtly off.
I call these silent breakages. They survive because the person who could reproduce them isn't the person who wrote the code. I spend a lot of my open-source time finding and fixing exactly this class of bug in other people's repos. Here are the four patterns I look for first, and how I turn one fix into the next.
1. Enter during IME composition
This is the most common one in any web text input. When you type Japanese (or Chinese, or Korean), you go through an IME: you type romaji, a candidate list appears, and you press Enter to confirm the conversion — not to submit the form. If a keydown handler treats that Enter as a submit, the form fires mid-conversion and eats half the input.
The fix is one guard:
input.addEventListener('keydown', (e) => {
  if (e.isComposing) return; // without this, Japanese input breaks
  if (e.key === 'Enter') handleSubmit();
});
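e.isComposing is true while the IME is mid-composition. The deprecated e.keyCode === 229 check is the older fallback, and it is still worth keeping where isComposing is unreliable for the confirming Enter (Safari has been reported to behave this way). Core editor components ship without this check more often than you'd expect.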
2. Byte slicing on multibyte text
Common in Go, Rust, and C CLIs. Most Japanese characters take three bytes in UTF-8, so slicing a string at an arbitrary byte offset can cut through the middle of a character. Go and C quietly hand back invalid UTF-8 that renders as mojibake; Rust panics.
s := "日本語"
fmt.Println(s[0:4]) // "日" plus one stray byte: invalid UTF-8, shown as "日�"
r := []rune(s)
fmt.Println(string(r[0:1])) // "日"
In Rust the same shape is &s[..n] on a byte index instead of iterating .chars(). Anywhere a tool truncates a string "to N characters" by byte length, CJK input is the input that breaks it.
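A minimal Go sketch of truncating "to N characters" without cutting a character in half. The helper names truncateRunes and truncateBytes are hypothetical, made up for illustration rather than taken from any particular tool. The first counts code points; the second honors a byte budget but backs up to the nearest character boundary.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes keeps at most n code points (hypothetical helper).
func truncateRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s { // i is the byte offset where each rune starts
		if count == n {
			return s[:i]
		}
		count++
	}
	return s // s has n runes or fewer
}

// truncateBytes keeps at most max bytes, backing up to a rune boundary
// (hypothetical helper).
func truncateBytes(s string, max int) string {
	if max <= 0 {
		return ""
	}
	if len(s) <= max {
		return s
	}
	for max > 0 && !utf8.RuneStart(s[max]) {
		max-- // s[max] is a continuation byte, so step back
	}
	return s[:max]
}

func main() {
	s := "日本語テスト"
	fmt.Println(truncateRunes(s, 3)) // 日本語
	fmt.Println(truncateBytes(s, 4)) // 日
}
```

Code points are still not the same thing as user-perceived characters (combining marks and emoji sequences span several), so treat this as the floor for correctness rather than a complete answer.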
3. Display width
Terminal tools that draw tables, columns, or progress bars assume one character is one column. CJK characters occupy two terminal columns. Compute width with len() and every row with Japanese in it has a ragged right edge. The fix is to measure with wcwidth / wcswidth, or a port of them in your language, instead of string length.
4. Locale that doesn't propagate
This one affects tools that spawn child processes or containers without passing LANG / LC_ALL through. The parent runs ja_JP.UTF-8, the child falls back to the C locale, and suddenly Japanese filenames sort wrong or an iconv step mangles output. It only shows up in an environment the maintainers don't run.
how I find the next one for free
Once you've seen one of these fixed, the same shape is usually sitting untouched somewhere else in the same repo. I wrote up the general version of this — find the bug a merged fix forgot — but for CJK specifically the move is:
- Search a repo's merged PRs for isComposing, []rune, wcwidth, CJK, or composition.
- Read what the fix changed, then grep the rest of the repo for the same pattern. The byte-slice that got fixed in one file usually has a twin two files over.
- Reproduce it with a test that uses real Japanese text ("日本語テスト", not "あ": a one-character test looks thin and tells you less).
- Open a tiny PR that links the original fix. "Same issue as #NNN, the sibling path it missed." A reviewer can verify it in under a minute.
Every pair I find this way goes into a public dataset (repo, file, bug class, and how long the twin stayed broken): https://github.com/greymoth-jp/sibling-leftover-dataset
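To make the "real Japanese text" step concrete, here is a small Go sketch of why the one-character test tells you less. brokenTruncate is a hypothetical stand-in for the buggy code under test, and invalidOutputs is an illustrative checker, not part of any repo mentioned here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// brokenTruncate stands in for the buggy code under test (hypothetical).
func brokenTruncate(s string, n int) string {
	if len(s) <= n {
		return s
	}
	return s[:n] // byte offset, not a character boundary
}

// invalidOutputs returns the inputs for which fn produced invalid UTF-8.
func invalidOutputs(fn func(string, int) string, inputs []string, n int) []string {
	var bad []string
	for _, in := range inputs {
		if !utf8.ValidString(fn(in, n)) {
			bad = append(bad, in)
		}
	}
	return bad
}

func main() {
	// "あ" is three bytes, so a limit of 4 never reaches the slicing path.
	fmt.Println(len(invalidOutputs(brokenTruncate, []string{"あ"}, 4))) // 0
	// A longer string is cut inside its second character.
	fmt.Println(len(invalidOutputs(brokenTruncate, []string{"日本語テスト"}, 4))) // 1
}
```

The same limit passes with "あ" and fails with "日本語テスト", which is the kind of evidence a reviewer can check in under a minute.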
being honest about it
Not every one of these is welcome, and not every one lands. Some maintainers are wary of drive-by contributions right now, and a forced "sibling" that isn't really there will lose their trust faster than a typo fix. The bar that earns the merge is the opposite of volume: one tiny, obviously-correct change, with the reference fix linked, that respects the reviewer's time.
This is the kind of work I've had merged into raylib, NestJS, bat, es-toolkit, gin, Medusa, Jan and others — about two dozen so far, all listed (read live from the GitHub API, so I can't fudge the count) at my proof dashboard.
If your software has Japanese users and you've never typed Japanese into it, one of these four is probably waiting in your codebase right now.
field notes on the Japan-shaped holes in global software · github.com/greymoth-jp