Rapls

Posted on Apr 19 • Edited on Apr 23 • Originally published at raplsworks.com

Building "あ" -> "雨": Japanese Suggest Without IME Access

#javascript #webdev #i18n #tutorial

This article was originally published on my blog (Rapls Works) in Japanese. I've translated and adapted it for dev.to readers, with extra context for non-Japanese audiences.

You've probably seen this: type あ into Google's Japanese search box, and you instantly see kanji suggestions — 雨 (rain), 赤 (red), 青 (blue), 秋 (autumn). Nice UX. So I tried to build it on my own site.

My first instinct: "I'll just read the IME's conversion candidates from JavaScript."

Turns out, that's impossible by design. Browsers deliberately hide IME internals from the DOM, for some very good security reasons we'll get into.

So I had to route around it. The trick is to maintain your own {kanji, reading} dictionary and do a prefix search against the reading field in hiragana. Instead of piggybacking on the IME, you ignore it and build your own lookup.

Bonus: the techniques in this post apply to Chinese, Korean, and Vietnamese input too — anywhere the browser's composition API is involved.

TL;DR

IME candidates are unreachable from JavaScript for privacy reasons
Build your own reading dictionary and do startsWith prefix search
Use compositionstart / compositionend to gate searches during IME composition
Debounce (~150ms) to avoid flicker during consecutive confirmations

Tested on Chrome 147 / macOS Tahoe 26.4 / ES6+ (April 2026).

Why You Can't Read IME Candidates

Short answer: IMEs run at the OS level, and browsers intentionally don't expose their internal state to JavaScript. This is a security and privacy decision.

Think about it. If a website could read your IME's suggestion list, it would see what you're about to type before you confirm it. Worse, IMEs have learning features — they remember family names, addresses, and employer names you've typed before. Leaking that to websites would be a huge privacy problem.

What browsers give you instead is limited to composition state: events that tell you when composition starts, when the in-progress text changes, and when it finishes. Nothing about the candidates themselves.

So you need a different approach.

Mental Shift: "Conversion" → "Search"

Forget the IME. Maintain a dictionary of {text, reading} pairs yourself, and do prefix matching on the reading.

Dictionary:
  { text: "雨",   reading: "あめ" }   // rain
  { text: "赤",   reading: "あか" }   // red
  { text: "青",   reading: "あお" }   // blue
  { text: "秋",   reading: "あき" }   // autumn

Input "あ"   → all readings starting with "あ" match → 雨, 赤, 青, 秋
Input "あめ" → only readings starting with "あめ" → 雨

Japanese words are typed from the first character onward, so prefix search fits naturally. The candidate list narrows as the user types — exactly how Google's suggestions feel.

Three pieces to build: dictionary, search logic, UI. Let's go.

The `input` Event Misfire Problem

If you're coming from building English autocomplete, you're used to just listening to input. That doesn't work here.

When typing あめ via the IME, a, m, e keypresses fire the input event three times before confirmation:

1. Press "a"     → input fires (field: "あ")    ← not confirmed
2. Press "m"     → input fires (field: "あm")   ← not confirmed
3. Press "e"     → input fires (field: "あめ")  ← not confirmed
4. Press Enter   → input fires (field: "あめ")  ← confirmed

If you search on every input, you'll flash candidates at あ, clear them at あm (no reading starts with that), show them again at あめ, and then run the same search once more on confirmation. The UI jitters horribly.
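To see this sequence on your own machine, a small logger along these lines can help. It is a sketch rather than part of the original code: `logInputEvents` is a hypothetical helper name, and the exact event order and `isComposing` values you observe may differ between browsers and IMEs.

```javascript
// Hypothetical helper: records every input/composition event fired on `target`.
// It accepts any EventTarget, so it can be attached to a real <input> in the browser.
function logInputEvents(target, log = []) {
  const types = ['compositionstart', 'compositionupdate', 'compositionend', 'input'];
  for (const type of types) {
    target.addEventListener(type, (e) => {
      log.push({
        type: e.type,
        // isComposing exists on input events but not on composition events
        isComposing: Boolean(e.isComposing),
        // data carries the in-progress or committed text when the browser provides it
        data: e.data ?? null,
      });
    });
  }
  return log;
}

// Browser usage (assumes an element with id "searchInput" exists):
// const log = logInputEvents(document.getElementById('searchInput'));
// ...type あめ with the IME, confirm, then run console.table(log)
```

Comparing the logged rows against the four steps above is a quick way to check how your target browsers behave before wiring up any search logic.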

The Fix: `compositionstart` / `compositionend`

Browsers expose three events for IME composition state:

compositionstart — IME composition begins
compositionupdate — composition text changes (e.g., romaji → hiragana)
compositionend — composition finishes (user confirms)

The right place to trigger search is compositionend. Search only on confirmation, and the misfire problem is gone.

Cross-Browser Caveat

e.isComposing on the input event can be unreliable, particularly on older Safari and iOS. Keep a manual flag as a backup:

const input = document.getElementById('searchInput');
let isComposing = false;

// Composition start → set flag
input.addEventListener('compositionstart', () => {
  isComposing = true;
});

// Composition end → clear flag → search
input.addEventListener('compositionend', () => {
  isComposing = false;
  performSearch(input.value);
});

// Input event → skip if composing
input.addEventListener('input', (e) => {
  if (e.isComposing || isComposing) {
    return;  // noop during composition
  }
  performSearch(e.target.value);  // non-IME input (e.g., English)
});

The double check e.isComposing || isComposing absorbs browser differences. Works on Chrome, Firefox, Safari, Edge.

Debounce: Stop Flicker on Consecutive Confirmations

We've fixed composition. Next problem: consecutive confirmations.

Typing 天気予報 (weather forecast) typically happens as two IME confirmations: てんき then よほう. If we search on each, the てんき results flash and disappear. Useless flicker.

Debounce: wait for input to stop changing for ~150ms before running the callback. While the user keeps typing, reset the timer. When they settle, run once.

function debounce(func, delay) {
  let timeoutId = null;

  return function(...args) {
    if (timeoutId !== null) {
      clearTimeout(timeoutId);
    }
    timeoutId = setTimeout(() => {
      func.apply(this, args);
    }, delay);
  };
}

const debouncedSearch = debounce(performSearch, 150);

100–300ms is typical. 150ms hits the sweet spot for local searches (dictionary in memory) — essentially instant to the user.
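As a quick sanity check of the behavior described above, the following sketch calls a debounced function three times in quick succession and confirms that only the last call runs. The `calls` array and the timings are illustrative assumptions; `debounce` is repeated so the snippet runs on its own.

```javascript
// Same debounce as above, repeated so this snippet is self-contained.
function debounce(func, delay) {
  let timeoutId = null;
  return function (...args) {
    if (timeoutId !== null) clearTimeout(timeoutId);
    timeoutId = setTimeout(() => func.apply(this, args), delay);
  };
}

const calls = []; // illustrative: records each query that actually gets searched
const debouncedRecord = debounce((query) => calls.push(query), 150);

// Three rapid updates, as if confirmations landed back to back
debouncedRecord('てんき');
debouncedRecord('てんきよ');
debouncedRecord('てんきよほう');

// Only the final value is searched, once the input has been quiet for 150ms
setTimeout(() => {
  console.log(calls); // ['てんきよほう']
}, 300);
```

The trade-off is that every search, including a single isolated confirmation, is delayed by the debounce interval, which is why a short value is preferable for an in-memory dictionary.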

Prefix Search

function searchByReading(query) {
  const q = query.toLowerCase().trim();
  if (!q) return [];

  // Match on reading OR on the text itself
  let results = dictionary.filter(item => {
    return item.reading.startsWith(q) ||
           item.text.toLowerCase().startsWith(q);
  });

  // Exact matches first, then prefer shorter readings
  results.sort((a, b) => {
    const aExact = a.reading === q || a.text.toLowerCase() === q;
    const bExact = b.reading === q || b.text.toLowerCase() === q;
    if (aExact && !bExact) return -1;
    if (!aExact && bExact) return 1;
    return a.reading.length - b.reading.length;
  });

  return results.slice(0, 10);
}

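We match on both reading and text so that アメリカ (America, in katakana) is found whether the user types its hiragana reading あめりか or the katakana アメリカ itself.

Results are sorted with exact matches first, then by shorter reading, since a shorter reading is closer to what the user has typed so far. The list is capped at 10 because too many options hurts usability.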
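One gap in the matching above: readings are stored in hiragana, so a query typed in katakana only matches entries whose text is itself katakana. A possible extension, not part of the original code, is to fold katakana to hiragana before comparing. The sketch below assumes a hypothetical `toHiragana` helper and a tiny sample dictionary; it relies on the fact that katakana U+30A1 to U+30F6 sit exactly 0x60 code points above their hiragana counterparts.

```javascript
// Hypothetical helper: converts katakana (U+30A1 to U+30F6) to hiragana by
// shifting each character down 0x60 code points. Other characters pass through.
function toHiragana(str) {
  return str.replace(/[\u30A1-\u30F6]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x60)
  );
}

// Illustrative two-entry dictionary (same shape as the article's)
const sampleDictionary = [
  { text: '雨', reading: 'あめ' },
  { text: 'アメリカ', reading: 'あめりか' },
];

// Prefix search on the reading after normalizing the query
function searchNormalized(query) {
  const q = toHiragana(query.trim());
  if (!q) return [];
  return sampleDictionary.filter((item) => item.reading.startsWith(q));
}

console.log(toHiragana('アメリカ'));                      // "あめりか"
console.log(searchNormalized('アメ').map((r) => r.text)); // ["雨", "アメリカ"]
```

This sketch leaves the long-vowel mark ー and half-width katakana untouched; whether those need handling depends on your dictionary.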

Don't Skip HTML Escaping

Injecting dictionary entries into the DOM without escaping opens an XSS hole. Even if your dictionary is 100% under your control today, you'll likely add user-submitted entries or API data later. Escape from day one.

function escapeHtml(text) {
  const div = document.createElement('div');
  div.textContent = text;
  return div.innerHTML;
}

// When inserting into DOM
li.innerHTML = `
  ${escapeHtml(item.text)}
  ${escapeHtml(item.reading)}
`;

Assigning to textContent and then reading innerHTML back makes the browser escape <, >, and & for you. Boring but essential.
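Note that the textContent/innerHTML round trip escapes text content only. It does not escape quotes, which matters if a value ever ends up inside an HTML attribute. A string-based alternative, sketched below on the assumption that you want one helper for both contexts, also runs outside the browser (for example in unit tests). The name `escapeHtmlString` is hypothetical.

```javascript
// Hypothetical DOM-free alternative: escapes the five characters that are
// significant in HTML text and in quoted attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

function escapeHtmlString(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtmlString('<img src=x onerror="alert(1)">'));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

If you only ever insert plain text, assigning to textContent directly avoids the need for escaping altogether.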

All Together

// === Dictionary ===
const dictionary = [
  { text: '雨',       reading: 'あめ' },
  { text: '赤',       reading: 'あか' },
  { text: '青',       reading: 'あお' },
  { text: '秋',       reading: 'あき' },
  { text: '朝',       reading: 'あさ' },
  { text: 'アメリカ', reading: 'あめりか' },
  { text: '天気',     reading: 'てんき' },
  { text: '天気予報', reading: 'てんきよほう' },
  // ... customize for your domain
];

// === Utilities ===
function debounce(func, delay) {
  let timeoutId = null;
  return function(...args) {
    if (timeoutId !== null) clearTimeout(timeoutId);
    timeoutId = setTimeout(() => func.apply(this, args), delay);
  };
}

function escapeHtml(text) {
  const div = document.createElement('div');
  div.textContent = text;
  return div.innerHTML;
}

// === Search ===
function searchByReading(query) {
  const q = query.toLowerCase().trim();
  if (!q) return [];

  let results = dictionary.filter(item =>
    item.reading.startsWith(q) || item.text.toLowerCase().startsWith(q)
  );

  results.sort((a, b) => {
    const aExact = a.reading === q || a.text.toLowerCase() === q;
    const bExact = b.reading === q || b.text.toLowerCase() === q;
    if (aExact && !bExact) return -1;
    if (!aExact && bExact) return 1;
    return a.reading.length - b.reading.length;
  });

  return results.slice(0, 10);
}

// === Main ===
const input = document.getElementById('searchInput');
let isComposing = false;

input.addEventListener('compositionstart', () => { isComposing = true; });
input.addEventListener('compositionend', () => {
  isComposing = false;
  // Debounced so back-to-back confirmations (てんき, then よほう) trigger a single search
  debouncedUpdate(input.value);
});

function updateSuggestions(query) {
  const results = searchByReading(query.trim());
  console.log('Results:', results);
  // → next step: render UI
}

const debouncedUpdate = debounce(updateSuggestions, 150);

input.addEventListener('input', (e) => {
  if (e.isComposing || isComposing) return;
  debouncedUpdate(e.target.value);
});

Verify the Behavior

Typing あめ via IME and confirming:

1. Press "a"      → compositionstart → isComposing = true
2. input event    → skipped (isComposing is true)
3. Press "m", "e" → same, skipped
4. Press Enter    → compositionend → isComposing = false → updateSuggestions('あめ')
5. Results: [{ text: '雨', reading: 'あめ' }, { text: 'アメリカ', reading: 'あめりか' }]

Typing "test" without the IME:

1. No compositionstart → isComposing stays false
2. input event → debouncedUpdate called
3. 150ms later → updateSuggestions runs

Both paths work correctly.
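Both traces can also be replayed without a browser or an IME by dispatching synthetic events. The sketch below is an illustration, not the article's code: `attachComposeGate` is a hypothetical helper that mirrors the listeners above, the debounce is left out so the replay stays synchronous, and plain `Event` objects have no `isComposing` property, so only the manual flag is exercised.

```javascript
// Hypothetical helper: calls onSearch(value) only for confirmed text.
// `target` is any EventTarget; `getValue` returns the current field value.
function attachComposeGate(target, getValue, onSearch) {
  let composing = false; // manual flag, as in the article

  target.addEventListener('compositionstart', () => {
    composing = true;
  });
  target.addEventListener('compositionend', () => {
    composing = false;
    onSearch(getValue());
  });
  target.addEventListener('input', (e) => {
    if (e.isComposing || composing) return; // still composing: skip
    onSearch(getValue());
  });
}

// Stand-ins for the input element and its value
const field = new EventTarget();
let value = '';
const searched = [];
attachComposeGate(field, () => value, (v) => searched.push(v));

// Trace 1: あめ typed through the IME
field.dispatchEvent(new Event('compositionstart'));
for (const partial of ['あ', 'あm', 'あめ']) {
  value = partial;
  field.dispatchEvent(new Event('input')); // skipped while composing
}
field.dispatchEvent(new Event('compositionend')); // confirmed: search runs

// Trace 2: "test" typed without the IME
value = 'test';
field.dispatchEvent(new Event('input'));

console.log(searched); // ['あめ', 'test']
```

A replay like this only checks the gating logic. Real IMEs and browsers still need a manual pass, since event order around confirmation is where they differ.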

Works for CJK, Not Just Japanese

One thing worth calling out: this same pattern works for Chinese, Korean, and Vietnamese input too. The composition events are part of the web platform's standard event model, not something specific to Japanese. If you're building a multilingual input UX, the same compositionstart / compositionend logic applies to any language typed through an IME.

Next: The UI Layer

At this point we have the brain — IME handling + search logic. The body — dropdown UI, keyboard navigation (↑/↓/Enter/Escape), focus management, click handling — is coming in the follow-up.

The sequel will also cover server-side integration, external API patterns, and performance tuning (trie structures, Web Workers for >10k entry dictionaries).

👉 Part 2 (originally in Japanese, code blocks in English): Japanese Suggest — Full Version: Keyboard Operations, blur Pitfalls, API Integration

Related deep-dive — I also wrote about a weird macOS bug where the first character of Japanese input sometimes commits as English. It's a good look at how IME composition interacts with OS-level input handling:

👉 The "First Character Stuck in English" Problem on macOS

Summary

"あ → 雨" is solved with prefix search on readings, not IME conversion. IME candidates are off-limits to JavaScript, so maintain your own dictionary and use startsWith for matching.

The Japanese-specific challenge is coexisting with the IME. Use compositionstart / compositionend to gate searches, and add debouncing to prevent flicker. Put those together and you have autocomplete that feels native to Japanese input.

Have you built Japanese or CJK autocomplete before? I'd love to hear about your approach — especially if you've tackled this with a larger dictionary (Trie/WFST structures, etc.) or with a server-side lookup. Drop it in the comments.

This post was originally published at Rapls Works. If you're into WordPress plugin development, Japanese i18n, or edge-case IME behavior, there's more over there.
