DEV Community: greymoth

IME bugs, display-width bugs, and bidi bugs are the same bug

greymoth — Sat, 11 Jul 2026 01:29:24 +0000

Payload CMS has a search box in its admin UI. Type a query in Japanese, press Enter to confirm the kanji your IME just converted, and the search fires on the partial string that was still mid-conversion, not the query you meant to type. The Enter that closes an IME conversion and the Enter that submits a form are, to the browser, the exact same keydown event. Nothing in SearchInput's handler checked which one it was looking at.

That's payloadcms/payload#17138, one line: skip the handler when event.nativeEvent?.isComposing is true. I've filed close to identical fixes in dozens of other projects. Not similar bugs. The identical bug, in code that has never seen the others.

I keep a corpus of these. Right now it holds 129 documented bugs across 120 open-source libraries. 115 of those are pull requests I opened myself, 65 already merged; the other 14 cite an existing report someone else filed. Every entry links to a real GitHub PR or issue, and the build fails on any entry whose link isn't one. After sorting the categories for a while, the 129 collapse into three broken assumptions, not twelve.

a keystroke is a submit

This is the Payload bug, and it's the largest bucket by a wide margin: 40 of the 129 entries, 31 percent, more than any other single category in the corpus. It shows up in React, Vue, Svelte, Angular, and every headless component library that lets you bind a handler to Enter. CopilotKit/CopilotKit#5764 has the same shape in an Angular chat widget: the composer submitted the message on the Enter that confirms an IME composition instead of waiting for the real one. Different framework, different UI, same missing isComposing check. The fix is always one line. The bug survives review because most contributors, and every CI runner, type in a language where a keystroke and a finished character are the same thing.
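A minimal sketch of that one-line guard, written as a plain function so it runs without React. The name shouldSubmitOnKeyDown and the two event objects are hypothetical stand-ins for illustration; the only part taken from the fix above is the nativeEvent?.isComposing check.

```javascript
// Hypothetical helper: decides whether a keydown should submit.
// `event` is shaped like a React synthetic event ({ key, nativeEvent }).
function shouldSubmitOnKeyDown(event) {
  // The Enter that confirms an IME conversion arrives with isComposing === true.
  if (event.nativeEvent?.isComposing) return false;
  return event.key === 'Enter';
}

// Simulated keydowns: one confirming a conversion, one pressed after it finished.
const confirming = { key: 'Enter', nativeEvent: { isComposing: true } };
const realSubmit = { key: 'Enter', nativeEvent: { isComposing: false } };

console.log(shouldSubmitOnKeyDown(confirming)); // false
console.log(shouldSubmitOnKeyDown(realSubmit)); // true
```

Both events have the same key, which is why a handler that only looks at key cannot separate them.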

a character is a byte

ratatui's barchart centers a value label in two places. One function measures the label by its display width, correctly; the sibling function right next to it measures the same label by its UTF-8 byte length. "中文" is 4 columns wide on a terminal but 6 bytes long, so the centering math undershoots and the label lands left of where it should sit (ratatui/ratatui#2625). A CJK character isn't one byte, isn't always two bytes, and isn't always one terminal column either, and code that treats those as interchangeable drifts the moment real text shows up.
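The arithmetic is easy to see in a few lines. ratatui itself is Rust; this is a JavaScript sketch of the same calculation, and approxColumns, utf8Bytes, and centerOffset are hypothetical names. The width function is a rough approximation over a handful of wide ranges, not a full East Asian Width implementation.

```javascript
// Rough width estimate: 2 columns for code points in common East Asian wide
// ranges, 1 otherwise. An approximation for illustration only.
function approxColumns(text) {
  let columns = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const wide =
      (cp >= 0x1100 && cp <= 0x115f) || // Hangul Jamo
      (cp >= 0x2e80 && cp <= 0xa4cf) || // CJK radicals through Yi
      (cp >= 0xac00 && cp <= 0xd7a3) || // Hangul syllables
      (cp >= 0xf900 && cp <= 0xfaff) || // CJK compatibility ideographs
      (cp >= 0xff01 && cp <= 0xff60) || // fullwidth forms
      (cp >= 0x20000 && cp <= 0x3fffd); // CJK Extension B and beyond
    columns += wide ? 2 : 1;
  }
  return columns;
}

// Number of bytes the string occupies when encoded as UTF-8.
function utf8Bytes(text) {
  return new TextEncoder().encode(text).length;
}

// Left padding needed to center something `size` wide inside `barWidth` columns.
function centerOffset(barWidth, size) {
  return Math.floor((barWidth - size) / 2);
}

const label = '中文';
console.log(utf8Bytes(label));                       // 6
console.log(approxColumns(label));                   // 4
console.log(centerOffset(10, utf8Bytes(label)));     // 2, one column too far left
console.log(centerOffset(10, approxColumns(label))); // 3
```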

pdf.js has a version of the same mistake one layer down. Font.prototype.encodeString silently drops the character that follows U+FFFE or U+FFFF when saving or printing a PDF, because its surrogate-pair guard treats those two single code units as if they started a surrogate pair, then swallows the next character as the pair's second half (mozilla/pdf.js#21538). The fix is a tighter range check. It's the same family as ratatui's bug: code written against "how many units is this" instead of "what does this actually represent."
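To show the shape of that mistake, here is a hypothetical loop with a guard that is too wide next to one that is tight. This is not the pdf.js source; both functions and the exact conditions are assumptions made up for illustration.

```javascript
// Hypothetical illustration of a surrogate guard that is too wide.
function codePointsLoose(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const cp = str.codePointAt(i);
    // Too wide: also true for U+FFFE and U+FFFF, which are single code units.
    if (cp > 0xd7ff && (cp < 0xe000 || cp > 0xfffd)) i++;
    out.push(cp);
  }
  return out;
}

// Tight version: only a code point above U+FFFF occupies two code units.
function codePointsTight(str) {
  const out = [];
  for (let i = 0; i < str.length; i++) {
    const cp = str.codePointAt(i);
    if (cp > 0xffff) i++;
    out.push(cp);
  }
  return out;
}

const input = '\uFFFEab';
console.log(codePointsLoose(input).length); // 2, the "a" was swallowed
console.log(codePointsTight(input).length); // 3
```

Both versions agree on a real surrogate pair such as "𠮷"; they only diverge on the two code points the loose range should never have matched.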

text is left-to-right

out-of-character exists specifically to catch invisible and abusable Unicode. It already stripped the implicit bidi marks (U+200E, U+200F, U+061C), but it let the explicit bidi embedding, override, and isolate controls (U+202A-U+202E, U+2066-U+2069) pass straight through untouched (spencermountain/out-of-character#90). Those are the characters behind Trojan Source (CVE-2021-42574): they reorder how a line renders without changing a single byte of what's actually there. A library built to catch this class of character missed nine of the code points in its own job description.

bangumi/server-private had the narrower version of the same gap: its printable-character check matched the older bidi marks but not the directional isolates U+2066-U+2069, so a string made entirely of isolate controls passed validation as ordinary text (bangumi/server-private#1700). Bidi and control characters are one of the smaller categories in the corpus, 4 of 129, but it's the one where "harmless-looking bug" and "supply-chain attack vector" turn out to be the same finding.
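A minimal sketch of a check that covers both groups of code points named above. The function names are hypothetical, and this is not the code from either project; the ranges are the ones listed in this section.

```javascript
// Implicit marks: ALM, LRM, RLM.
const IMPLICIT_MARKS = /[\u061C\u200E\u200F]/;
// Explicit controls: embeddings and overrides (U+202A-U+202E),
// isolates (U+2066-U+2069). Nine code points in total.
const EXPLICIT_CONTROLS = /[\u202A-\u202E\u2066-\u2069]/;

// Reports which kinds of bidi control characters a string contains.
function bidiReport(text) {
  return {
    implicit: IMPLICIT_MARKS.test(text),
    explicit: EXPLICIT_CONTROLS.test(text),
  };
}

// Removes every bidi control from both groups.
function stripBidiControls(text) {
  return text.replace(/[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, '');
}

console.log(bidiReport('\u2066\u2069'));        // { implicit: false, explicit: true }
console.log(stripBidiControls('a\u202Eb'));     // "ab"
console.log(bidiReport('日本語'));               // { implicit: false, explicit: false }
```

A string made only of isolates is the case a marks-only check lets through, so it is worth keeping as a fixture.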

using the corpus

Everything above comes straight out of data/corpus.json: one object per bug, with the category, the symptom, a minimal repro, and the fix, checked against the live GitHub API so a closed or reverted PR can't sit there uncorrected. A companion repo, cjk-agent-fixtures, turns the repros into CI fixtures, so a regression on any of these three assumptions gets caught before it ships again instead of waiting for someone with a Japanese keyboard to notice by hand.

The twelve categories in the corpus aren't twelve problems. They're three, wearing whatever framework or language happened to be lying around. If your test suite only ever types in English, at least one of them is sitting in your codebase right now. The corpus is the fastest way to check which.

github.com/greymoth-jp/cjk-failure-corpus

Output is cheap now. Keep the receipts.

greymoth — Thu, 09 Jul 2026 20:00:39 +0000

Caveat first, because most posts like this bury it and I'd rather lead with the part that's actually honest. What I'm building can prove a process happened. It cannot prove a human did the thinking. If you automate the steps, or paste an LLM's answer through them, the record still fills up. So this is not "proof of human." It's proof of process. If that distinction doesn't matter to you, you can close the tab now and we're still friends.

Here's the thing that's been bugging me for months.

A year ago, if someone handed you a tight decision memo — three options weighed, one picked, the reasons written down — the artifact itself was evidence. Producing it cost judgment and time, so having it meant someone spent both. That link is gone. Anyone can generate a plausible version of that memo, or a clean PR description, or a crisp design doc, in about nine seconds. The output stopped being proof of anything.

So what's actually scarce now? Not the answer. The trail to it. What you decided, what you rejected, when, and what happened after reality pushed back on the call. That part AI can't hand you, because it doesn't have your context, your constraints, or your consequences. The decision-to-outcome loop is yours. The problem is almost nobody records it, so it evaporates. You end up with the polished final thing and no memory of the reasoning that got you there.

Why this got urgent

We're drowning in plausible output, and we're starting not to trust any of it. Reviewers can't tell what a person reasoned through from what got autocompleted. The main response so far has been detectors, which are a losing arms race — every detector gets beaten, and worse, they flag careful human writing as fake. I think the more honest move isn't to detect the fake. It's to let the real work keep a receipt.

What I'm building

Working name Glovrex. The short version: you record decisions as you make them. What you chose, the options you killed, the reason. Then it links each decision to what actually happened later — the outcome, not just the intention. The record is tamper-evident, so you can't quietly backdate a call to look smart after the fact, and neither can anyone reading it.

What comes out the other end is a portable log. "On this date I decided X over Y and Z, for these reasons, and here's how it aged." A receipt for your own judgment. It's useful to you, because your past self is a stranger and this is how you audit whether your reasoning was any good or you just got lucky. And it's useful to show other people, because a track record beats the polished final artifact that everyone has learned to distrust. The version of this I already trust most is boring: a GitHub profile full of other people's merged PRs. Nobody can generate that one for you.

That's the visible value. I'm going to stay quiet on how it decides what to keep and surface. That's the part I'm still building and the part that's mine.

The honest limit

Back to the caveat, because it's load-bearing. Provenance proves the process ran: these steps, at these times, in this order, unaltered since. It does not prove the quality or the humanity of the thinking inside. Recording a decision doesn't make it a good decision. And a determined faker can perform the whole ritual with a bot.

What tamper-evidence actually buys you is narrower and more real: the record can't be silently rewritten later. The timeline is honest even when the thinking wasn't. That's a much smaller claim than "verified human work," and I'd rather ship the smaller true claim than the bigger false one. If I ever start selling this as proof a human did the cognitive labor, call me on it.

One more thing worth saying plainly. This sits on top of LLMs, not against them. I use them all day. The point isn't "AI bad." It's that when generation is free, the generated thing carries less information, and the trail around it carries more. Glovrex is a layer for the trail.

Where this is

Pre-launch. No signup wall to shove at you, nothing "revolutionary," no metrics I haven't earned. I'm writing this partly to think out loud and partly to find the people who already feel the problem — engineers, researchers, anyone whose real value is their judgment over time, watching that judgment get harder to prove as the output around it turns to noise.

If that's you, here's the disagreement I actually want: where does "proof of process" stop being useful and start being theater? That's the question I don't have fully answered yet, and it's the one that decides whether this is worth building.

Teaching a grader the difference between pаypаl and paypal

greymoth — Sat, 04 Jul 2026 18:36:48 +0000

Look at these two strings: paypal and pаypаl. In most fonts they render the same. The second one has two Cyrillic а characters standing in for Latin a. A person can't reliably tell them apart on sight. A plain == comparison does see two different strings, because it compares code points rather than glyphs, but it has no way to tell you that one is impersonating the other.
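Here is a small sketch of what the code sees for that pair. The helper names are hypothetical, and the mixed-script test is a simple heuristic chosen for illustration, not the grader's reference scanner; a spoof written entirely in one script would pass it.

```javascript
const latin = 'paypal';
const spoof = 'p\u0430yp\u0430l'; // U+0430 is CYRILLIC SMALL LETTER A

// Lists the code points of a string as U+XXXX labels.
function codePointsHex(text) {
  return Array.from(text, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

// Heuristic: true when a string contains both Latin and Cyrillic letters.
function mixesLatinAndCyrillic(text) {
  return /\p{Script=Latin}/u.test(text) && /\p{Script=Cyrillic}/u.test(text);
}

console.log(latin === spoof);                // false
console.log(codePointsHex(spoof).join(' ')); // U+0070 U+0430 U+0079 U+0070 U+0430 U+006C
console.log(mixesLatinAndCyrillic(spoof));   // true
console.log(mixesLatinAndCyrillic(latin));   // false
```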

That pair is one of 72 test cases in something I finished today: a grader for the kind of text bug that's obvious once you see it and invisible until then. It's built against Prime Intellect's Environments Hub, which collects RL environments and evals that AI labs train and test models against. This one grades text correctness specifically — nothing about the pretty stuff, just: is this string handled right.

I've spent about a year finding these bugs by hand, in real repos, as pull requests. 115 of mine are merged upstream as of this morning (misskey, strapi, MUI, Vue Router, Wails, Tencent's tdesign, and a long tail of smaller ones — zero self-merged, github.com/greymoth-jp if you want to check). Most of what I found reduces to a short list of repeating shapes: an Enter key that submits a form mid-IME-conversion, a .length check that splits a kanji in half, a locale file that silently drifts out of date behind the English source. I keep the CJK/Unicode ones in a public corpus, cjk-failure-corpus — 97 of them now, each linked to a real PR or issue, not written from memory.

At some point the question stopped being "can I find one more of these" and started being "can a program judge whether an answer is correct, the way I've been judging them by hand." That's what a grader is. Building one turned out to be a different skill from finding bugs — but the same underlying judgment, made explicit and checkable.

Three families, 72 cases, no stored answers

tokenization-length (26 cases). Given a string, report its length three different ways: grapheme clusters (what a person sees), Unicode code points, and UTF-16 code units, and flag whether a naive count would get it wrong. 𠮷, the first kanji of 𠮷野家 as Yoshinoya prints it on its own storefront sign, is one code point and two UTF-16 units. A plain .length in JavaScript reports 2 for a string a human reads as one character.
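The grader derives these numbers in Python. As a rough JavaScript analogue, and not the grader's own code, the three counts can be computed like this; threeLengths is a hypothetical name.

```javascript
// Returns the three lengths of a string: user-perceived characters,
// Unicode code points, and UTF-16 code units.
function threeLengths(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return {
    graphemes: Array.from(segmenter.segment(text)).length,
    codePoints: Array.from(text).length,
    utf16Units: text.length,
  };
}

// U+20BB7, the kanji from the example above.
console.log(threeLengths('\u{20BB7}'));
// { graphemes: 1, codePoints: 1, utf16Units: 2 }

// A family emoji: three people joined by two zero-width joiners.
console.log(threeLengths('\u{1F468}\u200D\u{1F469}\u200D\u{1F467}'));
// { graphemes: 1, codePoints: 5, utf16Units: 8 }
```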

encoding-injection (30 cases; the paypal one lives here). Decide whether a string hides something: a homoglyph swap, an invisible character splicing a token in two, a bidirectional-override character (U+202E, the mechanism behind Trojan Source, publicly tracked as CVE-2021-42574) that can make a file actually named txt.exe display as exe.txt, or a fullwidth character that compatibility normalization (NFKC) folds into something dangerous (ｄｅｌｅｔｅ becomes delete). Every positive case is paired with a negative one, such as a legitimate Japanese or Korean string or a real emoji sequence, so a detector that just flags every non-ASCII string scores no better than chance.
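The fullwidth case is the easiest one to reproduce. A minimal sketch, assuming NFKC is the normalization in play; foldsToAscii is a hypothetical helper, not part of the grader.

```javascript
// True when a non-ASCII string turns into pure ASCII under NFKC,
// which is how fullwidth "ｄｅｌｅｔｅ" ends up as "delete".
function foldsToAscii(text) {
  const isAscii = (s) => /^[\x00-\x7F]*$/.test(s);
  return !isAscii(text) && isAscii(text.normalize('NFKC'));
}

console.log('ｄｅｌｅｔｅ'.normalize('NFKC')); // "delete"
console.log(foldsToAscii('ｄｅｌｅｔｅ'));      // true
console.log(foldsToAscii('delete'));            // false
console.log(foldsToAscii('東京'));               // false
```

The last line is the paired negative: ordinary Japanese text is non-ASCII before and after normalization, so it is not flagged.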

rendering-output (16 cases). Text that's fine going in and corrupted coming out: UTF-8 misread as Latin-1 leaves control characters that never occur in real text, or a Private-Use-Area code point that most default font stacks render as a blank box.
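The Latin-1 misread can be simulated directly. This sketch decodes UTF-8 bytes one byte per character, which is what a Latin-1 reader does, then looks for C1 control characters. Both function names are hypothetical, and the check is a heuristic: not every misread string contains a C1 control.

```javascript
// Encodes as UTF-8, then reads each byte back as its own character,
// which is what decoding as Latin-1 (ISO-8859-1) does.
function misreadUtf8AsLatin1(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => String.fromCharCode(b)).join('');
}

// C1 controls (U+0080-U+009F) essentially never appear in real text.
function hasC1Controls(text) {
  return /[\u0080-\u009F]/.test(text);
}

const garbled = misreadUtf8AsLatin1('日本語');
console.log(garbled.length);          // 9, one character per UTF-8 byte
console.log(hasC1Controls(garbled));  // true
console.log(hasC1Controls('日本語'));  // false
```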

The part I care about more than the bug list: nothing in the grader is a stored answer key. Every oracle is re-derived at grading time, in Python — len() for code points, .encode('utf-16-le') for UTF-16 units, the grapheme package for clusters, a small reference scanner for the injection checks. A correct answer scores 1.0 across every class. A naive baseline — roughly the level of check most real code actually ships — scores 0.43 on average. That gap is the training signal.

I deliberately left two things out. The source corpus has a couple of checks (pangu spacing, budoux line-breaking) whose ground truth comes from a JavaScript library. Porting that to Python would mean trusting a second implementation to stay byte-identical to the first forever, which breaks the entire "re-derive, don't store" premise — so I cut them instead of faking the confidence. Two Indic-conjunct cases got cut for the same reason: the Python grapheme library predates the current Unicode rule for that script (UAX #29 GB9c) and would grade them wrong on purpose. Scoping something out because you can't verify it honestly is a different move from shipping it and hoping nobody checks.

What I don't know yet

Whether this earns anything is untested. Prime Intellect runs a funded program for environment submissions — real money, open-tier bounties in the low hundreds and an approval-gated tier in the low thousands — but every credited contributor listed so far is an org (Arcee AI, Hud.so, Groq), not a solo account with no prior relationship. Publishing is one command. Getting paid, or even reviewed, by people who've never seen my name before is the actual experiment here, not this post.

What I do know: the corpus that took a year of manual, unglamorous bug-hunting to build turned out to be exactly the training data a grader like this needs. That wasn't the plan when I started filing PRs into strangers' repos for no clearer reason than "this is wrong and I can fix it." It's the kind of connection you only notice once you've done enough of the boring version by hand.

a width check said the string was safe to cut. it split a kanji in half.

greymoth — Fri, 03 Jul 2026 20:41:00 +0000

a name went into a terminal table and came out broken. the surname was 𠮷田. that first character is not the ordinary 吉 you get from the 吉 key, it is 𠮷 (U+20BB7), a rarer form that real people in Japan actually have on their family register. the table truncated the cell to fit a column, and what printed was 𠮷 followed by a replacement character. the kanji had been cut in half.

the interesting part is where the bug lived. not in the truncation loop. in a one-line shortcut that decided, before truncating, that this particular string was safe to cut by raw index. it was wrong, and it was wrong for a reason that only shows up on the exact character I just described.

three numbers that are usually the same, and one string where they aren't

a JavaScript string has more than one length depending on what you ask.

"𠮷".length is 2. .length counts UTF-16 code units, and 𠮷 lives outside the Basic Multilingual Plane, so it is stored as a surrogate pair: two code units, \uD842\uDFB7.
its code-point count is 1. [..."𠮷"].length is 1.
its display width, the number of terminal columns it occupies, is 2. it is an East Asian wide character.

for plain ASCII these all collapse to the same number. "abc" is 3 code units, 3 code points, 3 columns. that coincidence is what a lot of text code quietly leans on. it holds right up until a character makes two of those numbers agree for different reasons.

𠮷 is exactly that character. two code units because it is a surrogate pair. two columns because it is wide. same number, 2, arrived at two completely different ways. hold onto that, it is the whole bug.

the real code

this is the truncation helper in cli-table3, the library a lot of CLIs use to draw tables. strlen here is display width. it strips ANSI color codes and runs the string through string-width, which counts a wide CJK character as 2. so strlen answers "how many columns," not "how many characters."

function truncateWidth(str, desiredLength) {
  if (str.length === strlen(str)) {
    return str.substr(0, desiredLength);
  }

  while (strlen(str) > desiredLength) {
    str = str.slice(0, -1);
  }

  return str;
}

read the first branch as an optimization. "if the code-unit length equals the display width, then every character is one unit and one column, so there are no wide characters and nothing tricky, I can just cut by index with substr." for "abc" that is true, 3 === 3, cut away.

now feed it "𠮷𠮷". code-unit length is 4. display width is 4. 4 === 4, so the branch fires and it cuts by code unit:

"𠮷𠮷".substr(0, 3)   // "𠮷" + "\uD842"

substr(0, 3) takes three code units: the full first 𠮷, then the high surrogate of the second one. the low surrogate is left behind. you get one clean kanji followed by a lone high surrogate \uD842, which is not a character at all. terminals render it as the replacement box. that is the half a kanji in the table cell.
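a quick way to confirm that result without a terminal. this is a sketch, hasLoneSurrogate is a hypothetical helper, and it uses slice where the library uses substr (for these arguments they return the same thing).

```javascript
// Matches a high surrogate with no low surrogate after it, or a low
// surrogate with no high surrogate before it. No `u` flag, so the
// pattern works on UTF-16 code units.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/;

function hasLoneSurrogate(text) {
  return LONE_SURROGATE.test(text);
}

const twoKanji = '\u{20BB7}\u{20BB7}'; // 4 code units
const cut = twoKanji.slice(0, 3);      // index-based cut, as in the fast path

console.log(cut.length);                     // 3
console.log(cut.charCodeAt(2).toString(16)); // "d842", the stranded high surrogate
console.log(hasLoneSurrogate(cut));          // true
console.log(hasLoneSurrogate(twoKanji));     // false
```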

the shortcut was built for the case where length equals width because everything is one-to-one. a surrogate-pair wide character satisfies length === width too, 2 === 2, but for the opposite reason, both numbers are 2 because the character is doubled on both axes. it walks straight into the fast path and gets sliced by index, which is the one thing that path assumed it would never have to do.

why it survived

the obvious question is how a CJK bug survives in a table library that people clearly use with CJK. the answer is that ordinary Japanese and Chinese text never reaches this branch.

take 漢. it is U+6F22, inside the BMP, so "漢".length is 1. its width is 2. 1 === 2 is false, so 漢 skips the fast path entirely and goes to the while loop below. every common kanji, every kana, every Hangul syllable behaves this way: one code unit, two columns, length never equals width. they are all safe.

the fast path only misfires when a single character is a surrogate pair and wide. that intersection is small. it is CJK Extension B and beyond, the rare kanji that show up in personal names and place names, plus emoji, which are also non-BMP and mostly width 2. so the library worked for years of 東京 and 漢字 and quietly mangled 𠮷田 and anything with an emoji in a narrow column. the common case took a different branch, so the shortcut looked safe.

the slow path had a milder version of the same disease, by the way. str.slice(0, -1) removes one code unit, not one character. hand the loop a string ending in a surrogate pair and it lops off a low surrogate on the first pass and leaves the high one dangling. same family, quieter symptom.

the fix

two changes. guard the fast path so it refuses any string that contains a high surrogate, and make the slow path trim whole code points instead of code units.

function truncateWidth(str, desiredLength) {
  // `str.length === strlen(str)` is also true for surrogate-pair characters
  // (e.g. CJK Extension B or emoji), which count as 2 code units and 2 columns.
  // `substr`/`slice` cut by code unit, so exclude them here and trim by code
  // point below to avoid splitting a surrogate pair into a lone surrogate.
  if (str.length === strlen(str) && !/[\uD800-\uDBFF]/.test(str)) {
    return str.substr(0, desiredLength);
  }

  let chars = Array.from(str);
  while (strlen(chars.join('')) > desiredLength) {
    chars.pop();
  }

  return chars.join('');
}

Array.from(str) iterates by code point, so Array.from("𠮷𠮷") is a two-element array, each element a whole kanji. pop() removes one whole character. the loop can no longer stop in the middle of a surrogate pair because there is no middle to stop in. the fast path stays for the genuinely simple case, ASCII and other strings with no surrogates, where substr is both correct and cheaper.

worth naming the tools. Array.from and the spread operator both split by code point, which fixes surrogate pairs. they do not split by grapheme, so a flag emoji or a family emoji built from several code points joined with zero-width joiners will still come apart. if you need whole user-perceived characters, that is Intl.Segmenter with granularity: 'grapheme'. code point was the right level here because the unit of width is the code point, but know which one you are reaching for.
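the difference between the two levels shows up as soon as a flag is involved. a small sketch with hypothetical helper names, not code from cli-table3:

```javascript
const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });

// drops the last code point, which is what Array.from + pop does
function dropLastCodePoint(text) {
  return Array.from(text).slice(0, -1).join('');
}

// drops the last user-perceived character
function dropLastGrapheme(text) {
  const parts = Array.from(segmenter.segment(text), (s) => s.segment);
  return parts.slice(0, -1).join('');
}

// two regional indicators (J + P) that render as a single flag
const flag = '\u{1F1EF}\u{1F1F5}';

console.log(dropLastCodePoint('a' + flag) === 'a\u{1F1EF}'); // true, half a flag is left
console.log(dropLastGrapheme('a' + flag) === 'a');           // true
```

neither result contains a lone surrogate, so both are valid strings. only the second one is what a person would call removing one character.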

the failing fixture

this is the test that goes red before the fix and green after. it is the whole point, because the fix is one line and the value is keeping it fixed, not finding it once.

it('does not split a surrogate-pair wide char (CJK Ext B)', function () {
  let kanji = String.fromCodePoint(0x20bb7);          // 𠮷
  expect(truncate('a' + kanji + 'bc', 4)).toEqual('a' + kanji + '…');
  expect(truncate('a' + kanji + 'bc', 3)).toEqual('a…');
  expect(truncate(kanji + kanji, 3)).toEqual(kanji + '…');
});

it('does not split a surrogate-pair wide char (emoji)', function () {
  let emoji = String.fromCodePoint(0x1f600);
  expect(truncate('a' + emoji + 'bc', 3)).toEqual('a…');
  expect(truncate('x' + emoji + emoji + 'y', 4)).toEqual('x' + emoji + '…');
});

note the inputs are built with String.fromCodePoint, not pasted glyphs. that keeps the test readable in any editor and makes the code point explicit, so nobody later "cleans up" 𠮷 into 吉 and deletes the coverage without noticing. the assertion that matters most is truncate(kanji + kanji, 3): a width budget that lands between the two columns of the second character. the old code returned a lone surrogate there. that is the exact spot the bug lives.

the check, for the next one

the general shape is bigger than one library. any code that truncates, pads, aligns, or measures text is juggling three different numbers for one string, and it is only correct if it uses the same one throughout:

string	code units (`.length`)	code points	display columns
`abc`	3	3	3
`漢字`	2	2	4
`𠮷`	2	1	2
`😀`	2	1	2

the failure mode is always the same: measure by one number, cut by another. cli-table3 measured width, then cut by code unit, and the two disagreed on the one character where they happened to be equal for different reasons. so the check is a habit, not a rule. when you slice a string with substr, slice, or a bare index, ask what unit that index is in. it is code units. then ask whether the length you compared it against was in the same unit. if you measured display width or code points and then cut by index, you have this bug, and it is invisible until a non-BMP character walks through.

and test it deliberately. one CJK Extension B character, String.fromCodePoint(0x20bb7), and one emoji, at a width that lands mid-character. ASCII will never show you this. you have to hand the function the input it is quietly afraid of.

this one is a single entry in a corpus of 97 real CJK, IME, and Unicode failures I have been collecting, most of them one-line fixes hiding in libraries that work perfectly in English. the same split-a-code-point shape shows up in opentype.js clamping cmap character codes (open), in slate keeping Indic conjuncts together (open), and in web UI truncation and a markdown smart-quotes pass where I filed the same fix and it did not land (clerk and markdown-it, both closed). the corpus and a runnable fixture suite in JS and Go are linked below. don't take my word for the diagnosis, the cli-table3 diff is public, read it and decide if it holds.

corpus: https://greymoth-jp.github.io/cjk-failure-corpus/
fixtures (JS + Go): https://github.com/greymoth-jp/cjk-agent-fixtures
the fix in this post: https://github.com/cli-table/cli-table3/pull/360

— greymoth (@greymoth__)

The Enter key that submits your form while a Japanese user is still typing

greymoth — Thu, 02 Jul 2026 21:05:57 +0000

Here's the whole lesson up front, so you can leave after one paragraph if you want:

If your text field submits on Enter, it almost certainly submits on the Enter a Japanese, Chinese, or Korean user presses to confirm a word. That Enter isn't "send." It's "yes, that kanji." Your handler can't tell the difference unless you check one flag, and your English test suite will pass green forever while this ships. The flag is event.isComposing.

That's it. The rest is why it happens, why CI is blind to it, and a free way to pin it so it doesn't crawl back.

What actually happens

Japanese, Chinese, and Korean don't map one key to one character. You type a phonetic guess, the IME shows candidates, and you press Enter (or Space, then Enter) to pick one. That confirming Enter fires a keydown with key: "Enter", same as any other. If your submit handler only looks at key, it fires. The user was mid-word. Their first attempt is gone.

The tell is that it eats the first one. A Japanese user types a message, hits Enter to confirm the conversion, and the form submits with half a sentence, or the tag commits early, or the command palette runs the highlighted command. They learn to type, confirm somewhere else, then paste. That's the workaround real users invent for your bug.

I hit this in a Vue library, naive-ui. Its n-dynamic-tags committed a tag on the Enter that confirmed an IME conversion, so you couldn't type a multi-character CJK tag without it splitting early. The fix that got merged is small on purpose:

// inside the Enter handler
if (inputInstRef.value?.isCompositing) return

Guard the handler while composition is active, and the confirming Enter does nothing. The real Enter, the one after compositionend, still commits. Twenty-nine lines including the changelog and the test. The bug had been there a while; nobody typing in English would ever meet it.

Why your CI never sees it

This is the part that matters for anyone shipping to a global audience. You don't reproduce this by reading the code. You reproduce it by having an IME on and composing a word. Nobody on the review is typing 日本語 into the field. So the diff looks fine, the tests are green, and the regression ships.

The portable guard, if you're not in a framework that wraps it:

input.addEventListener('keydown', (e) => {
  if (e.isComposing || e.keyCode === 229) return // IME is mid-composition
  if (e.key === 'Enter') submit()
})

e.isComposing is true between compositionstart and compositionend. keyCode === 229 is the legacy signal for the same state and still shows up on older Safari and some Android keyboards. In React you read it off e.nativeEvent.isComposing, because the synthetic event doesn't always carry it. Frameworks differ in the spelling; the idea is identical.

So the fix is trivial. The problem is that "fix it once" and "keep it fixed" are different jobs. There's no lint rule that reliably flags "this Enter handler forgot about composition," and the next refactor that touches the handler can drop the guard, and again, no English-only test goes red. It comes back within a release or two. I've watched it come back.

Pinning it so it can't come back

The only thing that keeps this dead is a test that composes a word and asserts the submit didn't fire. That's a specific, slightly annoying test to write, and it's the same test every project needs, which is exactly the kind of thing worth sharing instead of everyone re-deriving it.

So I put the cases in a small MIT package: @greymoth/cjk-agent-fixtures. It's a runnable regression fixture pack for eleven of these input bugs, in JavaScript (Vitest/Jest) and Go, standard library only. For the IME case it hands you the keyboard/composition event sequence and the correct result, and you replay it against your own handler:

import { editorCases, applyEvents } from '@greymoth/cjk-agent-fixtures'
import { createInput } from '../src/text.js' // your code

it.each(editorCases)('$slug', ({ events, correct }) => {
  const input = applyEvents(createInput(), events)
  expect(input.submitted).toBe(correct.submitted) // false during composition
})

Be clear about what that is. It's not a scanner. It doesn't read your bundle and guess whether you're vulnerable. You point it at your functions, it holds the inputs and the expected answers, and your CI goes red when your handler gets it wrong. Every case also carries the wrong value a common broken handler returns, so you can confirm the test actually bites before you trust the green.

The other ten, briefly

The IME Enter is one of eleven, and they cluster into a few wrong assumptions about text. A quick sense of the neighbours, because if you have one you probably have three:

A byte slice through 日本語 (3 bytes per char) lands mid-character and prints U+FFFD.
str.length over-counts a rare kanji like 𠮷 or any emoji, and a slice at an odd UTF-16 boundary leaves a lone surrogate.
A field of only full-width spaces (　　, U+3000, what the IME types on the space bar) passes a "not empty" check that only trims ASCII whitespace.
Half-width katakana ﾊﾝｶｸ and full-width ハンカク compare unequal, so your "username already taken" check misses the collision.

Same shape every time: code that was written assuming one character is one byte is one column in one encoding, meeting text where none of that holds. The full taxonomy and a receipt (a real PR) for each is in the corpus.
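Three of those neighbours fit in a few lines each. This is a sketch in plain JavaScript, not the fixture pack's API; asciiTrim and sameUsername are hypothetical helpers, and NFKC is an assumed choice of normalization for the comparison.

```javascript
// 1. A byte slice through 日本語 lands mid-character.
const bytes = new TextEncoder().encode('日本語'); // 9 bytes, 3 per character
const sliced = new TextDecoder().decode(bytes.slice(0, 4));
console.log(sliced === '日\uFFFD'); // true, the cut byte decodes to U+FFFD

// 2. An ASCII-only trim leaves full-width spaces in place.
function asciiTrim(text) {
  return text.replace(/^[ \t\r\n]+|[ \t\r\n]+$/g, '');
}
const onlyIdeographicSpaces = '\u3000\u3000';
console.log(asciiTrim(onlyIdeographicSpaces).length); // 2, looks "not empty"
console.log(onlyIdeographicSpaces.trim().length);     // 0, built-in trim knows U+3000

// 3. Half-width and full-width katakana only match after normalization.
function sameUsername(a, b) {
  return a.normalize('NFKC') === b.normalize('NFKC');
}
console.log('ﾊﾝｶｸ' === 'ハンカク');             // false
console.log(sameUsername('ﾊﾝｶｸ', 'ハンカク')); // true
```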

Honest limits

The IME guard has genuine edge cases. Some browsers keep isComposing true after focus leaves mid-composition, so a naive guard can freeze the field until refocus. The fixtures cover that as a separate case (#5), but if you only copy the one-liner above you can trade one bug for another.
Fixtures don't find your bug for you. If your Enter handler lives somewhere the cases can't reach without a five-line adapter, that's real work, not a drop-in.
If your product genuinely has zero CJK/RTL/emoji users and never will, this is ceremony. I don't think that's most products shipping in 2026, but it's a real out.

If one confirming-Enter test saves one Japanese user from losing their first message, it paid for itself. That's the entire pitch. No account, no signup, MIT, works offline.

Three ways CJK text breaks big open-source projects, over and over

greymoth — Thu, 02 Jul 2026 16:30:41 +0000

I keep a small corpus of Japanese/CJK bugs I've found in open-source projects while sending fixes upstream. At some point I stopped looking at them as individual bugs and started looking at them as a small set of repeating shapes. Three of them show up constantly, in codebases with nothing else in common: a federated social network, a CRM, a component library, a commerce platform, a local-AI desktop app, a data-grid, a design system, a headless CMS. Different stacks, same failure.

None of these are exotic. Each one is a real merged fix, and each one is boring enough that it passed code review and CI without anyone noticing, sometimes for years. That's the actual finding: these bugs aren't hard to fix once you see them. They're hard to see, because the systems that would normally catch a regression, tests, linting, review, don't have Japanese input in them.

Pattern 1: IME composition treated as a keystroke

What it is. Typing Japanese, Chinese, or Korean doesn't produce final characters one key at a time. You type romaji, an Input Method Editor shows a preedit string, and you press Enter to confirm the conversion into kanji. That confirming Enter is the same physical key most web apps bind to "submit."

If a keydown handler doesn't check composition state, the confirming Enter fires the handler mid-word: a chat message sends half-typed, a rename commits before the kanji conversion finished, a dropdown closes on the wrong item.

Why it's invisible. It only happens with an IME switched on. Most contributors and most CI runners never turn one on. The input works perfectly for every test that types plain ASCII, which is nearly all of them. No exception is thrown, nothing fails a snapshot test, the bug just silently eats or mangles the user's keystroke.

Real example. misskey-dev/misskey#17646, merged into a repo with over 11,000 stars: the chat composer's onKeydown checked ev.key === 'Enter' and sent the message, with no composition guard at all. Mid-conversion Enter sent a half-typed message. The fix is one line: if (ev.isComposing || ev.key === 'Process' || ev.keyCode === 229) return; before the send logic runs.
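A minimal sketch of that guard pulled out into a shared predicate, so every Enter handler can call the same check. The names isImeKeydown and sendOnEnter are mine for illustration, not Misskey's; the three conditions are the ones quoted above. Plain objects stand in for KeyboardEvent so it runs outside a browser.

```javascript
// Hypothetical helper: true when a keydown belongs to an IME composition.
function isImeKeydown(ev) {
  return ev.isComposing === true || ev.key === "Process" || ev.keyCode === 229;
}

// Hypothetical wrapper: call `send` only for an Enter outside composition.
function sendOnEnter(ev, send) {
  if (isImeKeydown(ev)) return false; // the IME owns this keypress
  if (ev.key !== "Enter") return false;
  send();
  return true;
}

// Event-like objects: one confirming Enter, one real submit.
const sent = [];
sendOnEnter({ key: "Enter", isComposing: true, keyCode: 229 }, () => sent.push("mid-conversion"));
sendOnEnter({ key: "Enter", isComposing: false, keyCode: 13 }, () => sent.push("real submit"));
console.log(sent); // [ 'real submit' ]
```

Keeping the predicate separate from the handler is the point: the secondary inputs can import it instead of re-deriving it.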

It's not a one-off oversight. twentyhq/twenty#22270, a CRM with over 52,000 stars, had the identical gap in two unrelated components at once: the attachment-rename input and the AI chat-thread rename input. Same missing guard, same fix, two files, same PR. And vuetifyjs/vuetify#22974, a component library with over 41,000 stars, already had a shared isComposingIgnoreKey helper elsewhere in the codebase for exactly this problem. VAutocomplete's keydown handler just never called it. The knowledge existed one file over. It didn't reach this one.

How to catch it. Switch your OS keyboard to a Japanese or Chinese IME. Type into every input that reacts to Enter or Escape, and watch what fires before you've confirmed the conversion. Or grep for key === 'Enter' across your codebase and check each hit for a composition guard. The primary composer usually has one. Count how many of the smaller inputs next to it don't.
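The grep half of that can be roughed out as a script. This is a heuristic sketch, not a parser: findUnguardedEnter is a hypothetical name, the three-line look-behind window is an arbitrary assumption, and it will miss guards that live in a named helper (such as Vuetify's isComposingIgnoreKey).

```javascript
// Hypothetical scanner: flags `key === 'Enter'` checks that have no
// composition guard on the same line or in the few lines just before it.
function findUnguardedEnter(source, lookBehind = 3) {
  const lines = source.split("\n");
  const enterCheck = /\.key\s*===?\s*['"]Enter['"]/;
  const guard = /isComposing|keyCode\s*===?\s*229/;
  const hits = [];
  lines.forEach((line, i) => {
    if (!enterCheck.test(line)) return;
    const context = lines.slice(Math.max(0, i - lookBehind), i + 1).join("\n");
    if (!guard.test(context)) hits.push(i + 1); // 1-based line number
  });
  return hits;
}

// Two handlers: the composer is guarded, the rename input is not.
const sample = [
  "composer.addEventListener('keydown', (e) => {",
  "  if (e.isComposing || e.keyCode === 229) return;",
  "  if (e.key === 'Enter') send();",
  "});",
  "",
  "renameInput.addEventListener('keydown', (e) => {",
  "  if (e.key === 'Enter') commitName();",
  "});",
].join("\n");

console.log(findUnguardedEnter(sample)); // [ 7 ]
```

Treat every hit as a place to go type Japanese into, not as a confirmed bug.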

Pattern 2: locale files silently fall behind

What it is. A product gets translated into Japanese once, then the English source keeps shipping new strings. Every string added to en.json after that point exists only in English until someone notices and backfills it. There's no build error, no lint rule, no CI check that a locale file has drifted, because a missing key isn't invalid JSON. It's just a hole.

Why it's invisible. The UI doesn't crash. i18next and most other i18n libraries fall back to the English string (or the raw key) automatically. The product looks fully localized to anyone who isn't reading it in Japanese, including most of the team that shipped it.

Real example. medusajs/medusa#15839, an e-commerce platform with roughly 34,900 stars: the admin dashboard's Japanese locale file was 511 keys behind English. Not mistranslated, just absent, across product options, inventory, order fulfillment, MFA settings, and permissions. Someone had done a full Japanese translation pass at some point; the product just kept growing past it.

Jan, a local-AI desktop client with over 43,000 stars, showed the same drift spread across multiple namespaces rather than one. settings.json alone was 69 keys short with 4 more still sitting in English (janhq/jan#8352), and common.json, the namespace backing search, the providers panel, and toast messages, was 109 strings behind (janhq/jan#8349). It took three separate PRs to bring ja back to parity because the drift had been accumulating across releases, not from one gap.

Sometimes the gap is a handful of keys, not hundreds. mui/mui-x#23001 found that four Data Grid locale strings, including the "no columns" overlay text, had already been translated for zh-CN and ko-KR but had been left commented out for ja-JP ever since the feature shipped. Two other locales got the follow-up treatment. Japanese didn't.

How to catch it. Run a key-diff between your source locale and every target locale on every release, not just at translation time. If ja.json has fewer leaf keys than en.json, you already have this bug, whether or not anyone's filed it.
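A minimal sketch of such a key-diff, assuming nested JSON locale files with string leaves. The function names and the sample objects are made up for illustration. It reports keys missing from the target, plus keys whose value is still identical to the source, which is how the "still sitting in English" strings show up.

```javascript
// Flatten nested locale JSON into dotted leaf keys: { a: { b: "x" } } -> "a.b".
function leafEntries(obj, prefix = "") {
  const out = new Map();
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object") {
      for (const [k, v] of leafEntries(value, path)) out.set(k, v);
    } else {
      out.set(path, value);
    }
  }
  return out;
}

// Hypothetical report: keys absent from the target, and keys whose value is
// identical to the source (maybe untranslated, maybe a brand name that
// should stay as-is, so review these rather than fail the build on them).
function localeDrift(source, target) {
  const src = leafEntries(source);
  const tgt = leafEntries(target);
  const missing = [];
  const identical = [];
  for (const [key, value] of src) {
    if (!tgt.has(key)) missing.push(key);
    else if (tgt.get(key) === value) identical.push(key);
  }
  return { missing, identical };
}

const en = { orders: { title: "Orders", fulfill: "Fulfill items" }, mfa: { enable: "Enable MFA" } };
const ja = { orders: { title: "注文", fulfill: "Fulfill items" } };
console.log(localeDrift(en, ja));
// { missing: [ 'mfa.enable' ], identical: [ 'orders.fulfill' ] }
```

Run in CI on every release, a non-empty missing list is the signal; the identical list is a review queue.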

Pattern 3: translated, but wrong

What it is. The key exists, the string isn't empty, and it's still broken, because the translation carries the wrong meaning into a UI context the translator wasn't shown. This is the pattern that key-diffing and automated QA can't catch at all, because nothing is missing. Everything renders. It's just incorrect.

Why it's invisible. A native Japanese speaker skimming the label in isolation, outside the UI, might not catch it either. The error only shows up when the word sits next to the control it's supposed to describe.

Real example. ant-design/ant-design#58563, a component library with over 98,000 stars: the Typography component's expand/collapse control was labeled 拡大する ("to enlarge/zoom in") for expand and 崩壊 ("collapse," as in a building collapsing or a system failing) for collapse. Both are real, dictionary-correct Japanese words. Neither means "show more text" or "show less text." The fix swapped them for 展開する and 折り畳む, the actual UI-collapse vocabulary.

strapi/strapi#26845, a headless CMS with over 72,000 stars, had the WYSIWYG editor's character counter labeled キャラクター, a loanword that means "character" in the fictional, personified sense (a cartoon character, a game character), not "character" as in a unit of text. The correct word for a text character in this context is 文字. Someone had translated the English word, not the meaning it carried in that specific control.

How to catch it. This one doesn't have a mechanical check. It needs a native speaker actually looking at the rendered UI, not a spreadsheet of key-value pairs, because the failure lives in the gap between a word's dictionary sense and the sense the interface needs at that exact spot.

The actual pattern is one level up

Stack these three next to each other and a shape appears. Composition-state handling, key-completeness checks, and meaning-in-context review are three different kinds of infrastructure, and English-only teams don't build any of them by default, because English doesn't need them. English text is typed one character at a time, English locale files are the source of truth so they can't drift behind themselves, and translation isn't a concept that applies to the language you already wrote the UI in.

So none of this is really about translation quality. Translation is a one-time act on strings. What actually breaks is the surrounding system: does the input layer understand non-Latin text entry, does the release process notice a locale falling behind, does anyone check meaning-in-context instead of string presence. Localization is what happens when all three of those hold at once, continuously, not just on the day someone did a translation pass. Every project above is a well-maintained, actively developed repo. The gap wasn't effort. It was infrastructure nobody had a reason to build until an outsider pointed at the specific line.

I keep a running, searchable corpus of bugs like these, CJK-specific breakage across open-source input handling, locale files, and Unicode edge cases, with repro cases and the fix for each: github.com/greymoth-jp/cjk-failure-corpus. If you maintain something with text input or a translated locale, it's a fast way to check whether your project already has one of these three shapes sitting in it.

More of this kind of thing: github.com/greymoth-jp · glovrex.com

The Sandwich Test: How I Check If A Dev-Tool Idea Is Actually Winnable Before I Build It

greymoth — Thu, 02 Jul 2026 11:26:55 +0000

Four dev-tool ideas this week. Four dead, all from the same cause, and it took me embarrassingly long to see the pattern instead of just the individual rejections.

I'm a solo dev, no team, no funding, building in public-ish. The move I keep reaching for is the classic one: ship something free (a CLI, a GitHub Action, a linter) that devs adopt for free, then sell a paid backend on top: history, dashboards, team alerts, whatever the free tool can't do alone. It's the Sentry/Vercel playbook, scaled down. It's also, as of 2026, mostly a trap if you're doing it alone with no existing audience. Here's the check I wish I'd been running from idea one instead of idea four.

The idea that looked good on paper

The one I actually got excited about: flaky-test analytics. Real, universal pain (every CI setup eventually has a test that fails 1 time in 20 for no reason), and unlike most of my other ideas, there's an actual company charging real money for it. BuildPulse has been selling this since around 2019, three tiers, $99/$249/$499 a month, same structure on the pricing page for years. That's rare. Most "obvious" dev-tool ideas don't have anyone visibly paying for them at all.

So I went looking for the wedge: free CLI/Action reads your JUnit XML, no write access needed, dead simple to adopt. Then I checked who else is standing in that spot.

Trunk.io raised $28.5M in venture funding. Their flaky-test detection is free for any team under 5 monthly active committers, and it already works with GitHub Actions today. That's not a roadmap promise, that's the current pricing page. Datadog bundles flaky-test tracking into CI Visibility at $8/committer/month, money most teams are already spending on Datadog for other reasons. Cypress Cloud includes flake detection starting at $67/mo. Gradle has a named "Flaky Test Detection" feature in Develocity. None of these companies built flaky-test detection as the product. They built it as a reason to keep you inside a bigger bill you already pay.

The wedge I wanted, free CLI for small teams not big enough for Datadog, is exactly the slice Trunk.io just made free. Not "hard to compete with." Actually free, today, for my target customer.

Same shape, different idea

I ran the same check on a completely different idea (cross-repo drift detection, catching when a bugfix in one repo doesn't get propagated to its sibling repos, something I'd noticed doing OSS work across a bunch of related codebases). Different problem, same two walls.

Low end: Renovate (21,901 stars, free, AGPL) and GitHub's own Dependabot already handle dependency-drift-across-repos for zero dollars. Multi-gitter (1,212 stars, free, Apache-2.0) already does bulk cross-repo PRs. High end: Moderne, the closest real competitor, closed a $30M Series B in early 2025, roughly $50M raised total, and their OpenRewrite tech is already embedded in bigger vendors' code-automation stacks. Sourcegraph raised $245M and sits at a $2.6B valuation. Snyk, if you frame it as a security-drift problem instead, has raised $1.6B.

Free OSS eating the bottom, $50M-to-$1.6B-funded companies owning the top via enterprise trust (SSO, compliance, the stuff that makes a security team say yes) that I cannot produce alone. No gap in the middle. Same shape as flaky tests. Same shape, it turns out, as a CI-autofix idea I killed a week earlier too: GitHub's own Copilot Autofix already owns that lane by default, built into the platform, and the two independent players in that space (Sweep, Korbit) didn't survive as independents either. One pivoted its whole product to a JetBrains plugin, the other got folded into a security company two months ago.

I started calling this the sandwich. You're the filling.

The actual check (steal this)

Before I sink a week into a "free tool, paid backend" idea now, I ask two questions, in this order:

Is a funded company already giving away the exact free-tier version of my wedge, on purpose, as customer acquisition for something bigger? Not "could they." Is there a live pricing page right now where my target customer gets it free. This isn't rare in 2026, it's the default move for anyone with a seed round.
Does actually landing a paying customer require trust infrastructure I can't produce solo (SSO, compliance paperwork, security audits, an incident-response story)? If the buyer needs to trust the company as much as the tool, that buyer is not going to hand a credit card to an anonymous solo dev with a GitHub repo, no matter how good the tool is.

If the answer to both is yes, I stop. Not "make the free tier better," not "find a niche within the niche." Stop, because the structure doesn't change with more effort. It changes with more funding or an existing reputation, neither of which building harder gets you.

The part that took me longest to internalize: real pain is not the same thing as willingness to pay. Flaky tests are a genuinely universal complaint. Nobody's going to argue with you that it's annoying. But the fix for that pain is already a checkbox inside four different tools teams already have open bills with — so the pain being real doesn't mean anyone owes a fifth, separate invoice to a stranger. A tool that only flags a problem (a linter, a checker, a "here's what's wrong" CLI) is the weakest version of this trap, because a check is a feature, and features get copied into the next platform release for free. They don't get billed separately. If your whole product is "I noticed the bug," you don't have a product, you have a feature request someone bigger will ship next quarter.

What actually did survive the check

Not everything died, which is the part worth keeping. Two patterns from this same search held up under the same scrutiny, and neither of them is "free tool, hope people upgrade."

IPinfo (IP geolocation data, one guy, Ben Dowling, out of a Stack Overflow post in 2014) never had a free-adoption funnel at all. It's metered API access, paid from day one, grew off SEO and dev search instead of a personal audience. Sidekiq (Mike Perham, background-job processing for Ruby, solo for most of its life, reportedly into seven figures a year) kept the free OSS core but sold the paid layer as a license key for extra features shipped in code, not a hosted SaaS with dashboards and team seats. No infra to run, no "please upgrade" funnel to babysit, no enterprise trust apparatus required because you're not asking anyone to hand you their CI pipeline's write access.

The thing both have in common: neither one is trying to convert someone else's free users. They charge their own users, directly, for a scoped thing, from the start. That's the opposite move from "give it away and hope."

Back to idea five. At least now I've got a filter that kills the bad ones in an afternoon instead of a week.

Written by greymoth. I build developer tools and write about where software quietly breaks: Japanese/CJK edge cases, i18n, the boring infra nobody checks. → glovrex.com · github.com/greymoth-jp

How this page breaks Japanese lines

greymoth — Wed, 01 Jul 2026 22:47:04 +0000

Open a Japanese sentence in a narrow column and watch where the browser breaks it. It will happily split 特定商取引法 into 特定商取引 / 法, or push a 。 to the start of the next line. Japanese has no spaces, so the default line-breaker treats almost every character boundary as fair game. To a Japanese reader that looks broken in the same way impor / tant would look broken to you.

Most sites ship exactly that. It is the kind of thing you only notice if you read the page in Japanese, which is most of the point of this whole site.

The rule we actually want

Japanese wraps at phrase boundaries — 文節, roughly a content word plus its trailing particles. It also follows 禁則: a closing bracket or a 。 never starts a line, an opening bracket never ends one. Those two together are what "set correctly" means.

CSS gives you half of it for free:

.prose {
  line-break: strict;   /* keep 。 、 ) off the start of a line */
  word-break: keep-all; /* never break inside a run of characters */
  overflow-wrap: break-word;
}

line-break: strict handles the kinsoku edge. word-break: keep-all tells the browser to stop breaking between characters at all. But now nothing breaks, and a long sentence overflows the column. We have to hand the browser the break points back — the right ones this time.

Finding the phrases

The break points are the phrase boundaries, and finding them means segmenting Japanese, which is the hard part. I use BudouX, Google's small phrase model. It turns a sentence into chunks:

import { loadDefaultJapaneseParser } from "budoux";

const parser = loadDefaultJapaneseParser();
parser.parse("特定商取引法の表示ページが無い。");
// → ["特定商取引法の", "表示ページが", "無い。"]

Then I join the chunks with <wbr>, the "break here if you must" tag. With word-break: keep-all in force, the browser breaks only at those points:

- <p>特定商取引法の表示ページが無い。</p>
+ <p>特定商取引法の<wbr>表示ページが<wbr>無い。</p>

Notice the 。 stayed glued to 無い. That is the kinsoku rule falling out of phrase segmentation for free — the model never puts a boundary in front of trailing punctuation, so there is nothing to break before it.
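The join step itself is small. A sketch, assuming the chunks have already come out of the parser; joinWithWbr and escapeHtml are hypothetical names, and the chunk list is copied from the example above rather than computed here.

```javascript
// Escape text before it goes into HTML, so a chunk containing "<" stays text.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: join phrase chunks with <wbr> break opportunities.
function joinWithWbr(chunks) {
  return chunks.map(escapeHtml).join("<wbr>");
}

// Chunks copied from the parser output shown above.
const chunks = ["特定商取引法の", "表示ページが", "無い。"];
console.log(`<p>${joinWithWbr(chunks)}</p>`);
// <p>特定商取引法の<wbr>表示ページが<wbr>無い。</p>
```

Escaping each chunk before joining matters because the only markup in the result should be the <wbr> tags the join adds.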

I run this at build time, not in the browser. A small pass walks the rendered HTML, inserts <wbr> into Japanese text, and skips anything inside <code> or <pre> so code samples are left alone. The model stays on the build machine. The reader downloads a few <wbr> tags and no JavaScript.

Where it stops

BudouX is a model, not a rulebook, so it is about right, not exactly right. It occasionally splits a rare compound in a place a typographer wouldn't, and it has nothing to say about full justification or 約物 spacing. For body text at a normal measure I have not needed to correct it by hand yet. If I do, I will say so here.

The honest limit is the usual one: this fixes the mechanical part. It cannot tell you the Japanese was worth reading. That is still a human call.

The Enter key that fires while you're still typing

greymoth — Wed, 01 Jul 2026 10:57:04 +0000

Type きょう into a search box, press the spacebar to convert it to 今日, and press Enter to accept the kanji. On a lot of sites the search fires right then — on きょう, or on nothing, or it submits the whole form. You wanted to pick a word. The page heard go.

If you only ever type English you will never reproduce this, because you never compose. That is exactly why it ships. The person who wrote the handler pressed Enter a thousand times and it always meant submit.

The Enter that confirms is the same Enter you're listening for

An IME turns keystrokes into candidate text and waits for you to confirm. The confirming keypress is usually Enter. The problem is that your keydown listener sees that Enter too, and by default it can't tell "commit this conversion" apart from "submit the form."

The browser does leave you a tell. While the IME is composing, a keydown carries isComposing === true, and — going further back — reports keyCode === 229 instead of the real key. The Enter that closes the conversion is a composing keydown. The Enter you actually want, the one after the word is settled, is not.

The fix is a guard clause

Bail out of the handler while composition is in flight:

input.addEventListener("keydown", (e) => {
  if (e.isComposing || e.keyCode === 229) return; // still converting
  if (e.key === "Enter") submit();
});

isComposing is the modern, readable check. keyCode === 229 covers browsers old enough not to set it. Keeping both costs nothing and the second one has saved me on a stock Android WebView more than once.

React hides the flag one level down

React wraps the DOM event, and on the synthetic event isComposing is not reliably populated. The value you want is on the native event:

- onKeyDown={(e) => { if (e.key === "Enter") search(); }}
+ onKeyDown={(e) => {
+   if (e.nativeEvent.isComposing) return;
+   if (e.key === "Enter") search();
+ }}

Same bug, same one-line fix, just reached through nativeEvent. This is the version I paste into most codebases, because most of them are React and most of them read e.isComposing, find it undefined, and quietly do nothing.

Tracking composition yourself

If you'd rather hold the state explicitly — say you toggle other behavior during composition — the events are compositionstart and compositionend:

let composing = false;
el.addEventListener("compositionstart", () => (composing = true));
el.addEventListener("compositionend", () => (composing = false));
el.addEventListener("keydown", (e) => {
  if (composing) return;
  if (e.key === "Enter") submit();
});

Where it stops

The flag approach has one sharp edge worth knowing. Browsers don't agree on the order of the last two events. In some, compositionend fires before the confirming keydown, so your composing flag is already false and the Enter leaks through as a submit — the exact bug you were fixing. That is why I lead with the per-event isComposing / keyCode 229 check: it reads the state of the keypress itself instead of a flag you have to keep in sync.
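To make the ordering problem concrete, here is a small replay of both guards over a scripted event sequence. It is a simulation under stated assumptions, not a browser trace: the event objects are hand-written, and the end-first sequence assumes the confirming keydown arrives after compositionend with isComposing already false but keyCode still 229.

```javascript
// Replays event-like objects through two guards and counts how many
// Enter keydowns each guard would treat as a submit.
function replay(events) {
  let composing = false; // the hand-kept flag
  const submits = { flag: 0, perEvent: 0 };
  for (const ev of events) {
    if (ev.type === "compositionstart") composing = true;
    else if (ev.type === "compositionend") composing = false;
    else if (ev.type === "keydown" && ev.key === "Enter") {
      if (!composing) submits.flag += 1;
      if (!(ev.isComposing || ev.keyCode === 229)) submits.perEvent += 1;
    }
  }
  return submits;
}

// Assumed ordering: compositionend arrives before the confirming keydown.
const endFirst = [
  { type: "compositionstart" },
  { type: "compositionend" },
  { type: "keydown", key: "Enter", isComposing: false, keyCode: 229 },
];

// Assumed ordering: the confirming keydown arrives while still composing.
const keydownFirst = [
  { type: "compositionstart" },
  { type: "keydown", key: "Enter", isComposing: true, keyCode: 229 },
  { type: "compositionend" },
];

console.log(replay(endFirst));     // { flag: 1, perEvent: 0 }
console.log(replay(keydownFirst)); // { flag: 0, perEvent: 0 }
```

In the end-first sequence the flag guard submits once and the per-event guard does not, which is the leak described above.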

And the honest limit: none of this proves your form works in Japanese. It proves this one keypress does. The only way to know the rest holds is to actually type Japanese into it — which is the thing that never happens in a test suite written by someone who doesn't.

I cataloged 93 CJK and Unicode bugs in open source. Most are the same five mistakes.

greymoth — Tue, 30 Jun 2026 03:52:19 +0000

I keep a Japanese keyboard on while reading other people's code. Not for any noble reason at first, it's just my keyboard. But after a while you start seeing the same small breakages over and over, in libraries that are otherwise excellent and work perfectly in English. So I started writing them down. The list is now 93 entries across 87 libraries, and it's public:

https://greymoth-jp.github.io/cjk-failure-corpus

It's built like caniuse, except instead of "does this browser support X" it's "here is a real text-handling bug, the library it's in, a minimal repro, and the fix." Every row links to an actual pull request or issue. I'll get to why that matters at the end.

The thing I didn't expect: 93 bugs, but they're not 93 different problems. They cluster into about five.

One bug is a third of the list

36 of the 93 are the same bug. When you type Japanese, Chinese, or Korean, you don't type final characters. You type romaji, an IME shows you a preedit, and you press Enter to confirm the conversion into kanji. That confirming Enter is the same physical Enter your form is listening for.

So a user is mid-word, hits Enter to pick the right kanji, and the handler fires onSearch or commitName or handleSave on text that isn't finished. No error, no stack trace, CI green. It only reproduces with an IME on, which most maintainers don't have, so it lives forever.

The fix is one property. While a composition is active, isComposing is true:

// before
if (e.key === 'Enter') commit();

// after
if (e.key === 'Enter' && !e.nativeEvent.isComposing) commit();

The interesting part isn't the fix, it's where it's missing. Codebases usually already know about this. They just stopped one input short. In LibreChat the main message textarea was guarded and there was even a comment explaining it; the prompt-name field, the labels form, and the tag input next to it weren't. Trilium already had an isIMEComposing helper used by the note editor; the board view's card and column editors just never imported it. Same repo, same knowledge, one screen over.

So it's not "teams don't know about IME." The guard lives on the input everyone tests, and the secondary inputs are the ones nobody types Japanese into during review. Search box, inline rename, tag input, modal. Four shapes, over and over.

(One fiddly note if you go fix your own: in React you reach through to e.nativeEvent.isComposing rather than trust the synthetic event, and || e.keyCode === 229 is a legacy fallback for code paths that report 229 instead of setting the flag. There's a genuinely annoying edge right when composition ends where isComposing can already read false on the confirming Enter, browser depending. I haven't found one rule that holds everywhere; checking both is what's survived for me.)

The other four

After IME, the list thins out into four more shapes.

Locale leftovers (24). A key exists in en and never made it to ja, so a string silently falls back to English. select2 had removeItem and search in every locale except ja.js; screen readers read those aloud, so a Japanese user heard English. Or it's a parse table that formats a date but can't read its own output back, because the diacritic or the era character got dropped. A 和暦 library I looked at produced 令和元年5月1日 and then refused to parse it, because the year matcher was [0-9]{1,2} and 元 (the 元 of 元年, gannen, "first year") isn't a digit.
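A sketch of a year matcher that accepts 元, limited to the Reiwa era to keep it short. parseReiwa is a hypothetical function, not the library's API, and it does no range checking on month or day.

```javascript
// Hypothetical parser for Reiwa-era dates such as 令和元年5月1日.
// The year group accepts 元 as well as one or two digits.
const REIWA_DATE = /^令和(元|[0-9]{1,2})年([0-9]{1,2})月([0-9]{1,2})日$/;

function parseReiwa(text) {
  const m = REIWA_DATE.exec(text);
  if (!m) return null;
  const eraYear = m[1] === "元" ? 1 : Number(m[1]);
  // Reiwa year 1 is 2019, so the Gregorian year is 2018 + era year.
  return { year: 2018 + eraYear, month: Number(m[2]), day: Number(m[3]) };
}

console.log(parseReiwa("令和元年5月1日")); // { year: 2019, month: 5, day: 1 }
console.log(parseReiwa("令和2年5月1日"));  // { year: 2020, month: 5, day: 1 }
```

The useful test is the round-trip: whatever the formatter emits for year one has to come back through the parser unchanged.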

Surrogate and grapheme (11). Code that walks text by code unit instead of grapheme cluster. Surrogate pairs split down the middle, ZWJ emoji get mis-counted, combining marks drift off their base, variation selectors get dropped. Anything that does str[i] or .length on user text is a candidate.
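A short sketch of the difference, using the built-in Intl.Segmenter. The helper names are mine. Escape sequences are used for the sample strings so the code does not depend on how the page renders them.

```javascript
// Count user-perceived characters with Intl.Segmenter instead of .length.
const segmenter = new Intl.Segmenter("ja", { granularity: "grapheme" });

function graphemes(text) {
  return Array.from(segmenter.segment(text), (s) => s.segment);
}

// Hypothetical helper: truncate without cutting a cluster in half.
function truncateGraphemes(text, max) {
  return graphemes(text).slice(0, max).join("");
}

const kichi = "\u{20BB7}";   // 𠮷, one character stored as a surrogate pair
const ga = "\u304B\u3099";   // か plus a combining dakuten, rendered as が
const family = "\u{1F468}\u200D\u{1F469}\u200D\u{1F467}"; // ZWJ emoji sequence

console.log(kichi.length, graphemes(kichi).length);   // 2 1
console.log(ga.length, graphemes(ga).length);         // 2 1
console.log(family.length, graphemes(family).length); // 8 1
console.log(truncateGraphemes(kichi + "野家", 1) === kichi); // true
```

By contrast, kichi.slice(0, 1) returns a lone surrogate, which is the split-down-the-middle case.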

Kana and romaji (8). Transliteration tables that drop or reverse a kana. The clean test is a round-trip: convert and convert back, you should land where you started. One library could decompose ヷ and ヺ but passed ヸ and ヹ straight through, the other half of the same wa-row family.

Width and normalization (5). A CJK character renders two cells wide in a monospace terminal, but .length says one. Table formatters and truncation that count characters instead of display width overflow the box every time the text is Japanese.
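A rough sketch of counting cells instead of characters. The range table is a deliberate simplification and an assumption of mine: the real reference is the Unicode East Asian Width data, and this version ignores zero-width combining marks and emoji entirely.

```javascript
// Simplified wide-character ranges, for illustration only.
const WIDE_RANGES = [
  [0x1100, 0x115f],   // Hangul Jamo initial consonants
  [0x2e80, 0x303e],   // CJK radicals, symbols, and punctuation
  [0x3041, 0x33ff],   // Hiragana, Katakana, CJK compatibility
  [0x3400, 0x4dbf],   // CJK Extension A
  [0x4e00, 0x9fff],   // CJK Unified Ideographs
  [0xac00, 0xd7a3],   // Hangul syllables
  [0xf900, 0xfaff],   // CJK compatibility ideographs
  [0xff01, 0xff60],   // Fullwidth forms
  [0xffe0, 0xffe6],   // Fullwidth signs
  [0x20000, 0x3fffd], // Supplementary ideographic planes
];

function displayWidth(text) {
  let width = 0;
  for (const ch of text) { // iterates by code point, not code unit
    const cp = ch.codePointAt(0);
    width += WIDE_RANGES.some(([lo, hi]) => cp >= lo && cp <= hi) ? 2 : 1;
  }
  return width;
}

// Pad a table cell to a column width measured in terminal cells.
function padCell(text, cells) {
  return text + " ".repeat(Math.max(0, cells - displayWidth(text)));
}

console.log("日本語".length, displayWidth("日本語")); // 3 6
console.log(`|${padCell("日本語", 8)}|${padCell("abc", 8)}|`);
```

A formatter that pads with .length would add five spaces after 日本語 here instead of two, and the column edge would drift by three cells.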

That's 84 of the 93 in five buckets. The long tail is numerals (kanji numbers, including the 大字 forms used in contracts), regex round-trips, and a byte-order mark one code path strips and its sibling leaves glued to the first field name.

Why every row links to a PR

The honest part. Most of these entries are pull requests I sent. I only mark one "merged" when the GitHub API says merged, not when I push it and not while it's in review. As I write this, 15 of the 93 have merged; the rest are open. A few entries aren't mine at all, they're cited from other people's bug reports that document the same failure, and those are marked cited and link to the original report.

I built it this way on purpose. The site is one Node script over a JSON file, and the build fails loudly if an entry doesn't point at a real PR or issue. So the page physically can't claim a fix it can't link to. That constraint is the whole value of it as a reference: you don't have to trust me, you click through.
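The shape of that check, as a sketch. The entry fields (id, url) and the example URL are placeholders, since the real schema isn't shown here, and this version only validates the URL's form. Confirming that the PR or issue actually exists would take a call to the GitHub API on top of it.

```javascript
// Hypothetical entry shape: { id, url }. Field names are placeholders.
const GITHUB_REF = /^https:\/\/github\.com\/[\w.-]+\/[\w.-]+\/(pull|issues)\/\d+$/;

// Returns the entries that do not point at a PR or issue URL.
function invalidEntries(entries) {
  return entries.filter((e) => typeof e.url !== "string" || !GITHUB_REF.test(e.url));
}

const entries = [
  { id: "has-link", url: "https://github.com/example-org/example-repo/pull/1" },
  { id: "no-link-yet", url: "" },
];
const bad = invalidEntries(entries);
console.log(bad.map((e) => e.id)); // [ 'no-link-yet' ]
// In a build script, a non-empty `bad` would be the reason to exit non-zero.
```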

What to do with it

If you maintain something with a text input, the ten-minute version is: switch your keyboard to a Japanese IME, then type into every input that does something on Enter and watch what fires before you've confirmed the word. The main composer is probably fine. Try the search box. Try the inline rename. Try the chip input buried in a settings panel.

If you'd rather grep: find every key === 'Enter' and count how many have a composition guard. The main one will. Count the rest.

And if you hit a text-handling bug that isn't in the list, tell me and I'll add it. That's sort of the point of keeping a list instead of re-finding the same thing every month.

https://greymoth-jp.github.io/cjk-failure-corpus

A searchable corpus of CJK and Unicode bugs in open-source libraries

greymoth — Mon, 29 Jun 2026 20:50:00 +0000

A Japanese user types into your search box. They write とうきょう, press Space to convert it to 東京, then press Enter to confirm the candidate. The search fires. The query that went through was the half-finished one, before the conversion committed.

This is the most common internationalization bug I run into, and it is almost always one line to fix. The Enter that confirms an IME conversion is the same Enter your keydown handler is listening for. The guard is to skip the handler while a composition is still active: event.isComposing, or keyCode === 229. In React you have to read it off event.nativeEvent.isComposing, because the synthetic event drops the field.

I kept hitting variations of this across different libraries, so I started writing them down. That list is now a small public reference.

CJK / Unicode Failure Corpus

https://greymoth-jp.github.io/cjk-failure-corpus/

It is a searchable list of real CJK, IME, and Unicode text-handling bugs in open-source libraries. For each entry there is a one-line symptom, a minimal repro, the library it hits, and the fix. Right now it has 89 entries across 84 libraries. 15 of the fixes have merged, the rest are open or were closed.

The point is to have something to reach for when one of these bites you. Search the library or the symptom, get the repro and the one-line fix that already worked somewhere else. Most of these are the same handful of mistakes, made over and over, in code that works fine in English.

A few entries are not my PRs. They are cited upstream issues from the wider ecosystem that document the same failure, marked cited and linked to the original report. Everything else is a PR I opened, with the title, repo, URL, and merge status pulled from the GitHub API rather than written from memory. The build refuses to publish an entry that does not point at a real PR or issue, so the page cannot claim a fix it cannot link to.

Three entries, to show the shape

The IME Enter, in naive-ui (Vue, merged). In n-dynamic-tags, pressing Enter to confirm a kana-to-kanji conversion creates a tag from the in-progress text instead of just finishing the conversion. Repro: render <n-dynamic-tags>, focus the input, type とうきょう with a Japanese IME, Space to get 東京, then Enter to pick the candidate. A tag gets added from the unconfirmed text. Fix: skip tag creation while e.isComposing is true, and only act on the Enter that fires after compositionend. This exact category shows up across React, Vue, Svelte, and Angular, so the corpus tracks it as one pattern with per-framework notes (React needs nativeEvent.isComposing; Svelte exposes the native event directly; Safari and Chromium even disagree on whether the commit keydown reports isComposing or keyCode 229).

A dropped apostrophe, in hepburn (kana to romaji). Katakana ン before a vowel or a Y gets romanized without the syllabic-n apostrophe, unlike hiragana ん. So シンヨウ comes out as SHINYOU when it should be SHIN'YOU, and now it collides with シニョウ.

const { fromKana } = require('hepburn')
fromKana('しんよう') // SHIN'YOU
fromKana('シンヨウ') // SHINYOU  <- apostrophe dropped

Round-trip is the oracle here: kana to romaji and back should be stable, and the hiragana sibling already did it right. The fix is to map katakana ン the same way.
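One way to turn that oracle into a check is to look for collisions: two different kana strings that romanize to the same output cannot both round-trip. A sketch follows, with a stand-in lookup table that mimics the behavior described above. It is not the hepburn library; swap in the real converter to test it.

```javascript
// Hypothetical check: group inputs by romanized output and report any output
// produced by more than one distinct input.
function findCollisions(inputs, toRomaji) {
  const byOutput = new Map();
  for (const input of inputs) {
    const out = toRomaji(input);
    if (!byOutput.has(out)) byOutput.set(out, []);
    byOutput.get(out).push(input);
  }
  return [...byOutput].filter(([, sources]) => sources.length > 1);
}

// Stand-in tables for the described behavior, before and after the fix.
const buggy = { "シンヨウ": "SHINYOU", "シニョウ": "SHINYOU" };
const fixed = { "シンヨウ": "SHIN'YOU", "シニョウ": "SHINYOU" };

console.log(findCollisions(Object.keys(buggy), (k) => buggy[k]));
// [ [ 'SHINYOU', [ 'シンヨウ', 'シニョウ' ] ] ]
console.log(findCollisions(Object.keys(fixed), (k) => fixed[k])); // []
```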

A locale that cannot parse its own output, in date-fns. This one is not even CJK, which is exactly why it is in the list. In the Galician (gl) locale, June formats as xuño, but the June parse pattern is /^xun/i. That matches the abbreviation xun and not the wide form, because the third character is ñ, not n. So format then parse fails, for June only:

import { format, parse } from 'date-fns';
import { gl } from 'date-fns/locale';

const s = format(new Date(2021, 5, 1), 'MMMM', { locale: gl }); // 'xuño'
parse(s, 'MMMM', new Date(), { locale: gl });                   // Invalid Date

The locale's own test snapshot already records Invalid Date for June while the other eleven months parse fine. Fix: widen the pattern to /^xu[nñ]/i, the way Catalan already folds diacritics into its patterns. It belongs next to the CJK entries because it is the same class of bug: text round-tripping that nobody tested in a script with characters outside ASCII.
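The two patterns can be compared directly, without date-fns in the loop. A small sketch; the variable names are mine, and ñ is written as an escape so the precomposed character is unambiguous.

```javascript
const narrowPattern = /^xun/i;     // pattern as described above
const widenedPattern = /^xu[nñ]/i; // widened pattern from the fix

const wide = "xu\u00f1o";          // "xuño", with ñ as U+00F1
const abbreviated = "xun";

console.log(narrowPattern.test(wide), widenedPattern.test(wide));               // false true
console.log(narrowPattern.test(abbreviated), widenedPattern.test(abbreviated)); // true true
```

The widened pattern still matches the abbreviation, so the fix adds a case without taking one away.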

What it is not

It is not a linter and not a guarantee. It tells you that a specific bug existed and how it was fixed. Whether your code has the same one is still something you have to check. The detection is mechanical, the judgment is yours.

And not every PR landed. A few were closed, because the maintainer fixed it another way or did not want the change. Those stay in the list, marked closed, because a closed PR is still a documented failure with a repro attached.

If you maintain a library that takes text input and you want to know whether it has one of these, the fastest path is to search the corpus for your stack and skim the IME and locale-data sections first. That is where most of the bodies are buried.

There is also a companion repo that turns the repros into CI fixtures, so the regressions can be caught automatically instead of rediscovered: https://github.com/greymoth-jp/cjk-agent-fixtures

Corpus: https://greymoth-jp.github.io/cjk-failure-corpus/

Your main input handles IME composition. The rename box next to it doesn't.

greymoth — Mon, 29 Jun 2026 13:39:40 +0000

Almost every app I look at guards its primary text input against IME composition. The search box, the inline rename field, the tag input, the modal next to it: those get forgotten. That's where the same bug keeps living.

I've been sending one-line fixes for this across a bunch of editors and AI tools for a while now, and at this point it's predictable enough that I can usually guess which file the bug is in before I open the repo.

the bug, in 30 seconds

When you type Japanese (or Chinese, or Korean) you don't type final characters. In Japanese, you type romaji, an IME shows a preedit, and you press Enter or Space to confirm the conversion into kanji. That confirming Enter is the same physical Enter your form listens for.

So a user is mid-word, hits Enter to pick the right kanji, and your handler fires onSearch or commitName or handleSave on text that isn't finished yet. No error. No stack trace. CI is green. It only happens with an IME turned on, which most maintainers don't have, so it sits there.

The fix is one property. While a composition is active, isComposing is true:

// before
if (e.key === 'Enter' && value.length > 0) {
  onSearch(value);
}

// after
if (e.key === 'Enter' && !e.nativeEvent.isComposing && value.length > 0) {
  onSearch(value);
}

That's the whole thing. (payloadcms/payload#17138, one line.)

the part I actually want to point at

Here's what made me start writing this down. The codebases usually already know about the bug. They just stopped one input short.

In LibreChat the main message textarea is guarded. The fix I sent for the prompt-name field, the labels form, and the dynamic tag input has a comment I left pointing right at it: Ignore the Enter that commits an IME composition (see useTextarea.ts). The knowledge was in the repo. It just never made it to the three smaller inputs sitting beside the composer. (danny-avila/LibreChat#13996)

Trilium was even clearer. It already had a helper, isIMEComposing, living in services/shortcuts, used by the note editor. The board view's card and column title editors just didn't import it. Same repo, same helper, one screen over, unguarded. (TriliumNext/Trilium#10315)

So this isn't really "teams don't know about IME." It's that the guard lives on the input everyone tests, and the secondary inputs are the ones nobody types Japanese into during review.
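
One way to stop the guard from living on a single input is to put it in a wrapper that every key handler passes through. This is a sketch, not what any of the projects above ship: withCompositionGuard is a hypothetical name, and it assumes the handler receives either a React synthetic event or a plain DOM KeyboardEvent.

```javascript
// Hypothetical helper: wraps a key handler so it never runs for
// keydowns that belong to an active IME composition.
function withCompositionGuard(handler) {
  return (e) => {
    // React hands over a synthetic event with the DOM event on nativeEvent;
    // a plain addEventListener callback gets the DOM event directly.
    const native = e.nativeEvent ?? e;
    if (native.isComposing || native.keyCode === 229) {
      return;
    }
    handler(e);
  };
}

// Example: the rename box gets the same guard as the main composer.
const committed = [];
const onRenameKeyDown = withCompositionGuard((e) => {
  if (e.key === 'Enter') committed.push('rename');
});
```

Because the guard is applied where the handler is defined, a secondary input only skips it if someone deliberately writes a bare handler, which is easier to spot in review than a missing condition.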

where it hides

If you go looking, the spots repeat. In Jan it was the add-project and rename-thread dialogs (menloresearch/jan#8359). In Excalidraw it was the search menu's Enter-to-jump-to-next-match (excalidraw/excalidraw#11573). In Twenty it was attachment rename and the AI chat thread title (twentyhq/twenty#22270).

Search, rename, tag/chip, dialog. Four shapes, over and over. The early-return form is what most of them ended up with:

const onKeyDown = (e) => {
  if (e.nativeEvent.isComposing || e.keyCode === 229) {
    return;
  }
  if (e.key === 'Enter') {
    commit();
  }
};

Two things worth knowing if you go to write this yourself.

In React you reach through to e.nativeEvent.isComposing. Every one of these fixes does that rather than trust the synthetic event. And the || e.keyCode === 229 is a legacy fallback: on some code paths the keydown that fires mid-composition reports keyCode 229 instead of setting isComposing. There's also a genuinely fiddly bit at the exact moment composition ends, where isComposing can already read false on the very Enter that confirms, depending on the browser. I haven't found one clean rule that holds everywhere. The belt-and-suspenders check of both is what's survived for me in practice.
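
For the end-of-composition edge, one workaround is to track the composition events yourself and treat an Enter that lands right after compositionend as part of the composition. The sketch below is an assumption-heavy illustration, not a rule: createImeEnterFilter is a hypothetical name, and the 50 ms grace window is an arbitrary value that would need tuning per browser.

```javascript
// Hypothetical factory: combines isComposing, keyCode 229, and our own
// compositionstart/compositionend bookkeeping into one check.
function createImeEnterFilter({ graceMs = 50, now = () => Date.now() } = {}) {
  let composing = false;
  let endedAt = -Infinity;

  return {
    onCompositionStart() {
      composing = true;
    },
    onCompositionEnd() {
      composing = false;
      endedAt = now();
    },
    // Returns true when this keydown should be ignored as IME input.
    isImeKeydown(e) {
      const native = e.nativeEvent ?? e;
      if (composing || native.isComposing || native.keyCode === 229) {
        return true;
      }
      // Assumed edge case: the confirming Enter arrives just after
      // compositionend with isComposing already false.
      return native.key === 'Enter' && now() - endedAt < graceMs;
    },
  };
}
```

Wire onCompositionStart and onCompositionEnd to the input's composition events and call isImeKeydown at the top of the keydown handler. The cost is that a genuine Enter pressed inside the grace window is swallowed too, so the window should stay as short as the target browsers allow.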

finding it in your own app

You don't need a tool. Switch your keyboard to a Japanese IME, then type into every input that does something on Enter and watch what fires before you've confirmed the word. The composer will probably be fine. Try the search box. Try the inline rename. Try the chip input in a settings panel.

Or grep. Find every key === 'Enter' (or your keymap's equivalent) and check each one for a composition guard. The main one will have it. Count how many of the rest don't.
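
The grep can be roughed out as a small script. This is a heuristic sketch under loose assumptions: findUnguardedEnterChecks is a hypothetical name, it only recognizes the .key === 'Enter' spelling, and it counts a check as guarded if the word "composing" or a keyCode 229 comparison appears within a few lines. It will produce false positives and false negatives; it is a list of places to look, not a verdict.

```javascript
const ENTER_CHECK = /\.key\s*===?\s*['"]Enter['"]/;
const COMPOSITION_GUARD = /composing|keyCode\s*===?\s*229/i;

// Returns 1-based line numbers of Enter checks with no guard nearby.
function findUnguardedEnterChecks(source, contextLines = 4) {
  const lines = source.split('\n');
  const hits = [];
  lines.forEach((line, i) => {
    if (!ENTER_CHECK.test(line)) return;
    const start = Math.max(0, i - contextLines);
    const nearby = lines.slice(start, i + contextLines + 1).join('\n');
    if (!COMPOSITION_GUARD.test(nearby)) hits.push(i + 1);
  });
  return hits;
}

const unguarded = [
  'const onKeyDown = (e) => {',
  "  if (e.key === 'Enter') {",
  '    rename();',
  '  }',
  '};',
].join('\n');

console.log(findUnguardedEnterChecks(unguarded)); // [ 2 ]
```

Run it over each file's contents and read every reported line by hand, since a guard that lives in a shared helper further away will not be seen by a window this small.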

One honest note on numbers, since the links above are the evidence. Two of these have merged as I write this, the Payload and Twenty fixes; the rest are still open. I'd rather point at the ones that landed than claim I swept the ecosystem. The shape is identical in all of them, which is sort of the point: the guard sits on the input everyone tests and stops one box short.

It's a small fix. It stays unfixed because it's invisible to the people writing the code, and the people who hit it ten times a day mostly shrug and don't report it. If you ship anything with a text input, it's worth ten minutes with an IME on.