This article was originally published on my blog (Rapls Works) in Japanese. I've translated and adapted it for dev.to readers.
When you type あ into Google's search box, you immediately see kanji suggestions like 雨 (rain), 赤 (red), 青 (blue), and 秋 (autumn). Ever wanted to build the same feature on your own site?
My first instinct was: "I'll just grab the IME's conversion candidates from JavaScript." Turns out, this is impossible. Not a bug — an intentional security design.
So how do you actually do it? You maintain your own dictionary of kanji with their readings (pronunciations), and perform a prefix search using hiragana. Instead of piggybacking on the IME, you route around it.
In this post, I'll walk through building this "reading-based search" style of Japanese autocomplete from scratch, using only HTML and JavaScript. We'll cover IME composition event handling, debouncing, and prefix search logic — enough for you to ship a working implementation.
TL;DR
Japanese autocomplete is implemented as a prefix search on reading data, not by tapping into the IME's conversion candidates. Browsers don't expose IME internals. The two keys to a clean implementation are:
- Use compositionstart / compositionend events to detect composition state and only search on confirmation.
- Apply debouncing to avoid spamming searches on consecutive confirmations.
Tested on Chrome 147 / macOS Tahoe 26.4 / JavaScript ES6+ (as of April 2026).
Why You Can't Access IME Candidates
IMEs run at the OS level, and browsers deliberately prevent JavaScript from reading their internal state. This is a security and privacy decision.
Let's clear this up first. "Can't I just read the candidate list from the IME?" — yes, this is what everyone thinks first, but it's technically impossible.
IME (Input Method Editor) is OS-level software. Browsers don't expose its internal state to JavaScript. The reason is security and privacy.
Imagine if a website could read IME candidates. It would see what a user is about to type before they confirm it. Worse, IMEs have learning features — they remember family names, addresses, employer names you've previously typed. Leaking that to websites would be a serious problem.
What browsers do give you is the composition state: when composition starts, when the in-progress text changes, and when it ends. Nothing about the candidates themselves.
The Right Approach: "Prefix Search on Readings"
Forget the IME. Build a dictionary of {kanji, reading} pairs yourself, then do prefix matching with hiragana input. That's the answer.
The mental shift is from "conversion" to "search."
Dictionary examples:
{ text: "雨", reading: "あめ" } (rain)
{ text: "赤", reading: "あか" } (red)
{ text: "青", reading: "あお" } (blue)
{ text: "秋", reading: "あき" } (autumn)
Input "あ" → all readings starting with "あ" match → 雨, 赤, 青, 秋
Input "あめ" → only readings starting with "あめ" match → 雨
Japanese words are entered from the first character onward, so prefix search fits naturally. As the user types more, the list narrows down — exactly how Google's search suggestions feel.
Three pieces needed: dictionary data, search logic, display UI. Let's build them in order.
Understanding IME Composition Events — The Heart of Japanese Autocomplete
Japanese input goes through a "composition" phase via the IME. If you only listen to input events, your autocomplete will misfire constantly. The fix is to use compositionstart / compositionend to detect composition state and search only on confirmation.
For English autocomplete, listening to the input event is enough. Japanese is different. There's a composition step driven by the IME, and ignoring it makes your autocomplete explode.
What Goes Wrong: The input Event Misfire Problem
Consider typing あめ (rain). On the keyboard, you press a, m, e, then Enter to confirm, and the input event fires at every step:
1. Press "a" → input event fires (field: "あ") ← not confirmed yet
2. Press "m" → input event fires (field: "あm") ← not confirmed yet
3. Press "e" → input event fires (field: "あめ") ← not confirmed yet
4. Press Enter → input event fires (field: "あめ") ← confirmed
If you search on every input event, you'll show candidates at あ, then clear them at あm, then show them again at あめ. The UI will jitter horribly.
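To make the jitter concrete, here is a minimal sketch that replays those three field states through a naive prefix search. The names sampleDictionary and naiveSearch are illustrative only, and the four entries are the sample words from earlier in this post.

```javascript
// Illustrative data: the four sample words used earlier in the post
const sampleDictionary = [
  { text: '雨', reading: 'あめ' },
  { text: '赤', reading: 'あか' },
  { text: '青', reading: 'あお' },
  { text: '秋', reading: 'あき' },
];

// Naive approach: search on every input event, ignoring composition state
const naiveSearch = (query) =>
  sampleDictionary
    .filter((item) => item.reading.startsWith(query))
    .map((item) => item.text);

// Field contents after pressing "a", "m", "e"
const fieldStates = ['あ', 'あm', 'あめ'];
const frames = fieldStates.map((value) => naiveSearch(value));

console.log(frames);
// → [ [ '雨', '赤', '青', '秋' ], [], [ '雨' ] ]
```

The empty array in the middle is the flicker: the half-typed "あm" matches nothing, so the list disappears for one keystroke.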
The Fix: Detect Composition State
Browsers expose three events for IME composition state:
- compositionstart: IME composition begins. "I'm about to enter via the IME."
- compositionupdate: fires as the composition text changes (e.g., romaji → hiragana conversion).
- compositionend: composition finishes (user confirms). "Input is finalized."
The best time to run your search is compositionend. If you only search on confirmation, the misfire problem disappears.
Implementation: Tracking Composition with a Flag
You can check e.isComposing on the input event, but older Safari and iOS sometimes get it wrong. Using a manual flag variable alongside is safer.
const input = document.getElementById('searchInput');
let isComposing = false;

// Composition start → set flag
input.addEventListener('compositionstart', () => {
  isComposing = true;
});

// Composition end → clear flag → run search
input.addEventListener('compositionend', () => {
  isComposing = false;
  performSearch(input.value); // ← search here
});

// Input event → skip if composing
input.addEventListener('input', (e) => {
  if (e.isComposing || isComposing) {
    return; // do nothing during composition
  }
  performSearch(e.target.value); // for non-IME input (e.g., English)
});
The double check e.isComposing || isComposing absorbs cross-browser differences. Works on Chrome, Firefox, Safari, and Edge.
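If you want to sanity-check the flag logic without an IME at hand, one option is to dispatch synthetic events at a plain EventTarget. This is only a sketch: attachImeAwareSearch, fakeInput, and fakeValue are hypothetical names, and it assumes an environment with global EventTarget and Event (browsers, or a recent Node.js). A plain Event has no isComposing property, so here the manual flag does all the work.

```javascript
// Hypothetical helper: same listener logic as above, but decoupled from the DOM
function attachImeAwareSearch(target, getValue, onSearch) {
  let composing = false;
  target.addEventListener('compositionstart', () => { composing = true; });
  target.addEventListener('compositionend', () => {
    composing = false;
    onSearch(getValue()); // search on confirmation
  });
  target.addEventListener('input', (e) => {
    if (e.isComposing || composing) return; // skip while composing
    onSearch(getValue());
  });
}

// Simulate typing "あめ" through an IME
const fakeInput = new EventTarget();
let fakeValue = '';
const searches = [];
attachImeAwareSearch(fakeInput, () => fakeValue, (query) => searches.push(query));

fakeInput.dispatchEvent(new Event('compositionstart'));
for (const value of ['あ', 'あm', 'あめ']) {
  fakeValue = value;
  fakeInput.dispatchEvent(new Event('input')); // ignored: still composing
}
fakeInput.dispatchEvent(new Event('compositionend')); // user confirms

console.log(searches);
// → [ 'あめ' ]
```

Three input events went by, but only the confirmed value reached the search callback.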
Debounce — Stopping Wasted Searches on Rapid Input
Debounce is a pattern that waits for input to settle (e.g., 150ms) before running a search. It prevents flickering from consecutive confirmations. For local searches, it feels essentially instantaneous.
We now skip searches during composition. The next problem: consecutive confirmations.
When typing 天気予報 (weather forecast), the user confirms てんき first, then よほう. If we search on each confirmation, the results for てんき flash briefly then vanish. That's a useless flicker for the user.
Debounce waits for input to stop changing for a given interval (e.g., 150ms) before running the callback. While the user keeps typing, the timer keeps resetting. Once they settle, the search fires exactly once.
Debounce Function Implementation
function debounce(func, delay) {
  let timeoutId = null;
  return function(...args) {
    // Cancel the previous timer if it exists
    if (timeoutId !== null) {
      clearTimeout(timeoutId);
    }
    // Set a new timer
    timeoutId = setTimeout(() => {
      func.apply(this, args);
    }, delay);
  };
}

// Usage
const debouncedSearch = debounce(performSearch, 150);
Use setTimeout to delay execution, and clearTimeout to cancel the previous pending call. Simple, but hugely effective.
Delay ranges of 100–300ms are common. 150ms is a sweet spot for local searches (where the dictionary lives in memory), and feels effectively instantaneous.
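As a rough illustration of the timer reset, the sketch below fires two calls back to back, the way two quick confirmations would, and records which queries actually run. executedQueries and debouncedRecord are names made up for this example.

```javascript
// Same shape as the debounce function above, repeated so the sketch runs on its own
function debounce(func, delay) {
  let timeoutId = null;
  return function (...args) {
    if (timeoutId !== null) clearTimeout(timeoutId);
    timeoutId = setTimeout(() => func.apply(this, args), delay);
  };
}

const executedQueries = [];
const debouncedRecord = debounce((query) => executedQueries.push(query), 150);

// Two confirmations arriving within 150ms of each other
debouncedRecord('てんき');
debouncedRecord('てんきよほう');

console.log(executedQueries.length); // 0: the timer is still pending

setTimeout(() => {
  console.log(executedQueries); // [ 'てんきよほう' ]: only the last call ran
}, 300);
```

The search for てんき never executes, so its results never get a chance to flash on screen.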
Prefix Search Logic
Combine JavaScript's filter and startsWith to do prefix matching across both reading and text. Sort exact matches first, and cap results at the top 10.
Now the search logic itself. We'll use filter + startsWith.
Basic Search Function
function searchByReading(query) {
  const normalizedQuery = query.toLowerCase().trim();
  if (!normalizedQuery) {
    return [];
  }

  // Prefix match on reading OR on the text itself
  let results = dictionary.filter(item => {
    return item.reading.startsWith(normalizedQuery) ||
           item.text.toLowerCase().startsWith(normalizedQuery);
  });

  // Sort: exact matches first, then shorter readings
  results.sort((a, b) => {
    const aExact = a.reading === normalizedQuery ||
                   a.text.toLowerCase() === normalizedQuery;
    const bExact = b.reading === normalizedQuery ||
                   b.text.toLowerCase() === normalizedQuery;
    if (aExact && !bExact) return -1;
    if (!aExact && bExact) return 1;
    return a.reading.length - b.reading.length;
  });

  return results.slice(0, 10);
}
We match against both the reading and the text itself so that アメリカ (America) hits on both あめりか and アメリカ.
The sort puts exact matches first, then prefers shorter readings (more specific words). slice(0, 10) caps results at 10, since too many choices make selection harder.
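One gap worth knowing about: a query confirmed in katakana, such as テンキ, matches neither the hiragana reading てんき nor the text 天気. A possible extension, which is not part of the implementation in this post, is to fold katakana into hiragana before searching. The sketch below assumes the query may contain full-width or half-width katakana. katakanaToHiragana and normalizeQuery are hypothetical helper names.

```javascript
// Katakana ァ (U+30A1) through ヶ (U+30F6) sit exactly 0x60 above
// their hiragana counterparts ぁ (U+3041) through ゖ (U+3096).
function katakanaToHiragana(text) {
  return text.replace(/[\u30A1-\u30F6]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) - 0x60)
  );
}

// Hypothetical replacement for the `query.toLowerCase().trim()` step
function normalizeQuery(query) {
  // NFKC folds half-width katakana and full-width Latin into their standard forms
  return katakanaToHiragana(query.normalize('NFKC')).toLowerCase().trim();
}

console.log(katakanaToHiragana('アメリカ')); // → あめりか
console.log(normalizeQuery('ﾃﾝｷ '));         // → てんき
```

With a step like this, the normalized query can be compared against reading alone, at the cost of also normalizing the text field if you still want to match on it.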
Security: Don't Forget HTML Escaping
When inserting candidates into the DOM, skipping escape handling creates an XSS vulnerability. Even with a self-managed dictionary, future additions of user input or API data mean you should escape from day one.
If you drop candidates into the DOM without escaping, you create a cross-site scripting (XSS) vulnerability. Even if today's dictionary is fully under your control, the moment you add user-submitted entries or fetch from an external API, you're exposed. Escape from the start.
function escapeHtml(text) {
  const div = document.createElement('div');
  div.textContent = text;
  return div.innerHTML;
}

// When injecting into DOM
li.innerHTML = `
  ${escapeHtml(item.text)}
  ${escapeHtml(item.reading)}
`;
Assigning to textContent and reading back via innerHTML auto-escapes <, &, etc. Mundane, but essential for production.
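A caveat with the DOM-based trick: serializing a text node escapes &, <, and >, but not quotation marks, so the result is only safe as element content, not inside an attribute value. If you ever need the latter, or want to escape outside a browser, a string-based version is one alternative. This is a sketch. HTML_ESCAPES and escapeHtmlString are names chosen for this example.

```javascript
// Characters that matter in both element content and quoted attribute values
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// String-only escaping: no DOM required
function escapeHtmlString(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtmlString('<img src=x onerror="alert(1)">'));
// → &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```

Building the list items with createElement and textContent, and skipping innerHTML entirely, avoids the escaping question altogether.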
Putting It All Together
Combining IME composition handling, debounce, prefix search, and HTML escaping gives us the full "brain" of the feature. Works for both Japanese and English input.
Full code, end to end:
// === Dictionary ===
const dictionary = [
  { text: '雨', reading: 'あめ' },
  { text: '赤', reading: 'あか' },
  { text: '青', reading: 'あお' },
  { text: '秋', reading: 'あき' },
  { text: '朝', reading: 'あさ' },
  { text: 'アメリカ', reading: 'あめりか' },
  { text: '天気', reading: 'てんき' },
  { text: '天気予報', reading: 'てんきよほう' },
  // ... add entries for your use case
];

// === Utilities ===
function debounce(func, delay) {
  let timeoutId = null;
  return function(...args) {
    if (timeoutId !== null) clearTimeout(timeoutId);
    timeoutId = setTimeout(() => func.apply(this, args), delay);
  };
}

function escapeHtml(text) {
  const div = document.createElement('div');
  div.textContent = text;
  return div.innerHTML;
}

// === Search ===
function searchByReading(query) {
  const q = query.toLowerCase().trim();
  if (!q) return [];
  let results = dictionary.filter(item =>
    item.reading.startsWith(q) || item.text.toLowerCase().startsWith(q)
  );
  results.sort((a, b) => {
    const aExact = a.reading === q || a.text.toLowerCase() === q;
    const bExact = b.reading === q || b.text.toLowerCase() === q;
    if (aExact && !bExact) return -1;
    if (!aExact && bExact) return 1;
    return a.reading.length - b.reading.length;
  });
  return results.slice(0, 10);
}
// === Main ===
const input = document.getElementById('searchInput');
let isComposing = false;

function updateSuggestions(query) {
  const results = searchByReading(query.trim());
  console.log('Results:', results);
  // → next step: render the UI
}

// Debounce both entry points so consecutive confirmations collapse into one search
const debouncedUpdate = debounce(updateSuggestions, 150);

input.addEventListener('compositionstart', () => { isComposing = true; });
input.addEventListener('compositionend', () => {
  isComposing = false;
  debouncedUpdate(input.value);
});
input.addEventListener('input', (e) => {
  if (e.isComposing || isComposing) return;
  debouncedUpdate(e.target.value);
});
Verifying Behavior
Typing あめ via IME and confirming:
1. Press "a" → compositionstart → isComposing = true
2. input event → skipped because isComposing is true
3. Press "m", "e" → same, skipped
4. Press Enter → compositionend → isComposing = false → debouncedUpdate('あめ') → 150ms later, updateSuggestions('あめ')
5. Results: [{ text: '雨', reading: 'あめ' }, { text: 'アメリカ', reading: 'あめりか' }]
Typing test without IME:
1. No compositionstart fires → isComposing stays false
2. input event → debouncedUpdate is called
3. 150ms later → updateSuggestions runs
Correct behavior for both Japanese and English input.
Next Steps: UI and Full Working Demo
Everything above is the "brain" of the feature. The UI — keyboard navigation, focus handling, click handling — is the "body." I cover that in a follow-up post.
At this point the brain (IME handling + search logic) is complete.
In the follow-up, I build the HTML/CSS for the dropdown UI, add keyboard navigation (↑/↓/Enter/Escape), mouse click handling, and focus control. The result is a copy-paste ready working demo. I also cover server-side integration, external API patterns, and performance tuning.
👉 Part 2 (Japanese, with English-friendly code): Japanese Suggest — Full Version: Keyboard Operations, blur Pitfalls, API Integration
Related reading — I also wrote about a quirky macOS issue where the first character of Japanese input sometimes gets committed as English. It's a deep dive into how IME composition interacts with OS-level input handling:
👉 The "First Character Stuck in English" Problem on macOS (Japanese, with code)
Summary
"あ → 雨" is solved with prefix search on readings, not IME conversion. Since IME candidates are off-limits to browsers, maintain your own dictionary and use startsWith for matching.
The Japanese-specific challenge is coexisting with the IME. Use compositionstart / compositionend to detect composition state and skip searches during composition. Combine that with debouncing to avoid flicker from consecutive confirmations. Put these two together and you have a smooth, IME-friendly autocomplete.
This post was originally published at Rapls Works. If you're into WordPress plugin development, Japanese i18n, or weird IME edge cases, there's more over there.