DEV Community

SEN LLC

A Keyboard-Driven HTML Entity Lookup That Lets You Copy All Three Forms

"Is it &copy; or &copyright;?" "What's the decimal code point for ™?" I kept typing these from memory, getting them slightly wrong, and wasting time. So I built a 60-entity reference that you can search by name, character, description, or number, where a single click copies the named, decimal, or hex form. The search input is autofocused on page load.

Writing about HTML or Markdown means you constantly need escape sequences for characters you are talking about literally. I want to type &lt; and display <, not a broken tag. HTML offers three ways to write each entity (named, decimal, hex), and the one that's right depends on context: named for legibility, decimal for XML where named forms aren't defined, hex when cross-referencing Unicode tables.
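As a minimal sketch of that escaping step, the characters HTML treats as markup can be swapped for entity forms before the text goes into a page. The helper name `escapeHtml` and the lookup table are illustrative assumptions, not code from the project.

```javascript
// Hypothetical helper (not part of the project): escape the characters
// that HTML would otherwise interpret as markup.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
}

function escapeHtml(text) {
  // Replace every special character with its entity form in a single pass.
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch])
}

console.log(escapeHtml('<b> & "quotes"'))
// prints: &lt;b&gt; &amp; &quot;quotes&quot;
```

The apostrophe uses the decimal form here because the numeric reference works in both HTML and XML.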

🔗 Live demo: https://sen.ltd/portfolio/html-entities-lookup/
📦 GitHub: https://github.com/sen-ltd/html-entities-lookup

Screenshot

60+ hand-picked entities across six categories (basic / punctuation / currency / math / arrows / Greek). Autofocus on search. Click any of the three copy buttons on a card to copy that form to the clipboard. Zero dependencies, no build.

Why not ship all ~2200 HTML5 entities

The WHATWG entity list has 2200+ entries. Most of them are things like &DiacriticalGrave; or &LeftRightDoubleArrow; you'll use once in your life.

I deliberately curated 60. Reasons:

  1. Browsable scroll length: a 2200-row grid is an undifferentiated wall, not a reference
  2. Real hand-use cases: if I've never typed &hearts; in a blog post, it doesn't belong
  3. Data integrity is testable at 60, not at 2200

Same philosophy as my HTTP status reference (entry #14): curation over completeness. A reference you can browse in three seconds beats a reference that returns every possibly-relevant match.

Five-field data model with multi-axis search

Each entity has name, character, codepoint, category, and a localized description:

{ name: 'copy', char: '©', code: 169, category: 'punct',
  desc: { ja: 'コピーライト', en: 'copyright' } }

Search hits all five fields:

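function matches(entity, query) {
  const q = query.trim()
  if (!q) return true
  // Lowercase only for the text fields; the character itself is compared as typed,
  // so pasting an uppercase symbol such as Ω is not turned into ω first.
  const lower = q.toLowerCase()
  return (
    entity.name.toLowerCase().includes(lower) ||
    entity.char === q ||                          // exact char match
    String(entity.code).includes(lower) ||        // decimal code
    entity.desc.ja.includes(lower) ||
    entity.desc.en.toLowerCase().includes(lower) ||
    entity.category.includes(lower)
  )
}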

Paste © and find it. Type copy or 169 or copyright and find it. Users don't have to learn your search grammar; they type the first thing that comes to mind and get a result.

The one exception is char === q (strict match). A pasted symbol should return exactly the entity for that symbol and nothing else, so the character field uses strict equality; every other field uses substring matching.
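To see the strict and substring rules side by side, here is a small self-contained sketch. The three sample records, their category keys, and the `names` helper are assumptions made up for illustration; `matches` is a condensed restatement of the rules described above, not necessarily the project's exact code.

```javascript
// Hypothetical three-entry sample; the real list has 60+ entries.
const SAMPLE = [
  { name: 'copy', char: '©', code: 169, category: 'punct',
    desc: { ja: 'コピーライト', en: 'copyright' } },
  { name: 'amp', char: '&', code: 38, category: 'basic',
    desc: { ja: 'アンパサンド', en: 'ampersand' } },
  { name: 'rarr', char: '→', code: 8594, category: 'arrows',
    desc: { ja: '右矢印', en: 'right arrow' } },
]

function matches(entity, query) {
  const q = query.trim()
  if (!q) return true
  const lower = q.toLowerCase()
  return (
    entity.char === q ||                          // strict: exact character only
    entity.name.toLowerCase().includes(lower) ||  // substring for everything else
    String(entity.code).includes(lower) ||
    entity.desc.ja.includes(lower) ||
    entity.desc.en.toLowerCase().includes(lower) ||
    entity.category.includes(lower)
  )
}

// Illustrative helper: names of the sample entities matching a query.
const names = (query) => SAMPLE.filter((e) => matches(e, query)).map((e) => e.name)

console.log(names('©'))      // [ 'copy' ]
console.log(names('169'))    // [ 'copy' ]
console.log(names('arrow'))  // [ 'rarr' ]
console.log(names('&'))      // [ 'amp' ]
```

Note that the numeric field is still a substring match, so a short query such as 8 would return both amp (38) and rarr (8594) in this sample.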

Three copy buttons per card

HTML has three ways to represent each entity:

  • Named: &copy; (most readable, HTML 4+)
  • Decimal numeric: &#169; (works in XML, which predefines only five named entities: &lt;, &gt;, &amp;, &quot;, and &apos;)
  • Hex numeric: &#xA9; (easiest to cross-reference against Unicode tables)
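All three forms can be derived from the name and code point of a single record. A minimal sketch, assuming a hypothetical helper called `entityForms` that is not part of the project:

```javascript
// Illustrative helper: build the named, decimal, and hex forms
// from one entity record's name and code point.
function entityForms(entity) {
  return {
    named: `&${entity.name};`,
    decimal: `&#${entity.code};`,
    hex: `&#x${entity.code.toString(16).toUpperCase()};`,
  }
}

console.log(entityForms({ name: 'copy', code: 169 }))
// { named: '&copy;', decimal: '&#169;', hex: '&#xA9;' }
```

Keeping only the decimal code in the data and computing the hex form on demand means the two numeric forms cannot drift apart.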

Each card ships all three as clickable buttons:

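function copyButtons(entity) {
  const hex = entity.code.toString(16).toUpperCase()
  // Escape & as &amp; inside the attribute as well: the HTML parser decodes
  // character references in attribute values, so a raw "&copy;" would be
  // stored as "©" and the button would copy the character, not the entity.
  return `
    <button data-copy="&amp;${entity.name};">&amp;${entity.name};</button>
    <button data-copy="&amp;#${entity.code};">&amp;#${entity.code};</button>
    <button data-copy="&amp;#x${hex};">&amp;#x${hex};</button>
  `
}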

container.addEventListener('click', async (e) => {
  const btn = e.target.closest('[data-copy]')
  if (!btn) return
  await navigator.clipboard.writeText(btn.dataset.copy)
  showToast(`Copied: ${btn.dataset.copy}`)
})

Two patterns worth noting:

  • data-copy attribute stores the exact string to copy, so the click handler is one line regardless of which button.
  • Event delegation: one listener on the grid container, not one per button. 60 cards × 3 buttons = 180 buttons. Attaching a listener to each would work but wastes memory and makes teardown harder. container.addEventListener('click') + e.target.closest('[data-copy]') handles all 180 from a single listener.
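The lookup at the heart of the delegated handler can be pulled out into a pure function, which makes it easy to exercise without a browser. This is a sketch under assumptions: `copyTextFor` and the stand-in objects below are hypothetical and only mimic the two DOM features the handler relies on (`closest` and `dataset`).

```javascript
// Hypothetical helper: given a click target, return the string to copy,
// or null when the click landed outside any [data-copy] button.
function copyTextFor(target) {
  const btn = target.closest('[data-copy]')
  return btn ? btn.dataset.copy : null
}

// In the browser, a single delegated listener could use it like this:
// container.addEventListener('click', async (e) => {
//   const text = copyTextFor(e.target)
//   if (text !== null) await navigator.clipboard.writeText(text)
// })

// Minimal stand-ins for DOM elements, just enough to exercise the helper.
const button = { dataset: { copy: '&copy;' }, closest: () => button }
const icon = { closest: () => button }      // e.g. a child node inside the button
const background = { closest: () => null }  // a click on empty grid space

console.log(copyTextFor(icon))        // &copy;
console.log(copyTextFor(background))  // null
```

Because closest walks up from the clicked node, a click on anything nested inside a button still resolves to that button.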

Keyboard-driven UX

Autofocus the search input on page load:

<input id="search" type="text" autofocus placeholder="Search...">

And a GitHub-style / shortcut to refocus:

document.addEventListener('keydown', (e) => {
  if (e.key === '/' && document.activeElement.tagName !== 'INPUT') {
    e.preventDefault()
    document.getElementById('search').focus()
  }
})

The end-to-end flow becomes: open page → type → click a copy button → done. Three seconds, and no pointer travel until the final click if you already know the entity name. Reference tools live or die on this tight loop.
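The / shortcut check only looks for INPUT, which is enough for a page with a single search box. If other typing contexts were ever added, the decision could be factored into a predicate like the sketch below. The function name `shouldFocusSearch` and the extra TEXTAREA, SELECT, and contenteditable cases are assumptions, not something the project is known to do.

```javascript
// Hypothetical predicate: should pressing this key move focus to the search box?
// Assumption: INPUT, TEXTAREA, SELECT, and contenteditable elements all count
// as "the user is already typing somewhere".
function shouldFocusSearch(key, active) {
  if (key !== '/') return false
  if (!active) return true
  const tag = active.tagName
  if (tag === 'INPUT' || tag === 'TEXTAREA' || tag === 'SELECT') return false
  return !active.isContentEditable
}

console.log(shouldFocusSearch('/', { tagName: 'BODY' }))   // true
console.log(shouldFocusSearch('/', { tagName: 'INPUT' }))  // false
console.log(shouldFocusSearch('a', { tagName: 'BODY' }))   // false
```

In a keydown listener this would be called as shouldFocusSearch(e.key, document.activeElement) before calling preventDefault and focus.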

Tests

13 cases on node --test. Data integrity plus search behavior:

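import { test } from 'node:test'
import assert from 'node:assert/strict'
// ENTITIES and matches come from the project's own modules (import path not shown here).

test('every entity has name, char, code, category, desc', () => {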
  for (const e of ENTITIES) {
    assert.ok(e.name)
    assert.ok(e.char)
    assert.ok(Number.isInteger(e.code))
    assert.ok(e.category)
    assert.ok(e.desc.ja)
    assert.ok(e.desc.en)
  }
})

test('char matches its codepoint', () => {
  for (const e of ENTITIES) {
    assert.equal(e.char.codePointAt(0), e.code)
  }
})

test('no duplicate names', () => {
  const seen = new Set()
  for (const e of ENTITIES) {
    assert.ok(!seen.has(e.name), `dup: ${e.name}`)
    seen.add(e.name)
  }
})

test('search by direct character', () => {
  const r = ENTITIES.filter((e) => matches(e, '©'))
  assert.equal(r.length, 1)
  assert.equal(r[0].name, 'copy')
})

test('search by description', () => {
  const r = ENTITIES.filter((e) => matches(e, 'copyright'))
  assert.ok(r.some((e) => e.name === 'copy'))
})

The char.codePointAt(0) === code assertion is the most valuable one. It cross-validates two fields against each other, catching any typo where you'd written © = 167 instead of © = 169. You rarely have a Unicode table in front of you when typing the data, but codePointAt always reports the real code point.
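To make the cross-validation concrete, here is a sketch of the same check written as a plain function that reports problems instead of asserting. The name `findIntegrityErrors` and the two-record sample with a deliberate typo are illustrative assumptions.

```javascript
// Hypothetical checker: return a list of human-readable integrity problems.
function findIntegrityErrors(entities) {
  const errors = []
  const seen = new Set()
  for (const e of entities) {
    // Cross-check: the character's real code point must equal the stored code.
    const actual = e.char.codePointAt(0)
    if (actual !== e.code) {
      const hex = actual.toString(16).toUpperCase()
      errors.push(`${e.name}: char is U+${hex} (${actual}), code says ${e.code}`)
    }
    // Names must be unique because they become the named entity form.
    if (seen.has(e.name)) errors.push(`${e.name}: duplicate name`)
    seen.add(e.name)
  }
  return errors
}

const sample = [
  { name: 'copy', char: '©', code: 167 }, // deliberate typo: should be 169
  { name: 'sect', char: '§', code: 167 },
]

console.log(findIntegrityErrors(sample))
// [ 'copy: char is U+A9 (169), code says 167' ]
```

The typo is caught even though 167 is a perfectly valid code point, because it belongs to a different character (§).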

Series

This is entry #15 in my 100+ public portfolio series.

If there's a commonly used entity missing, issues welcome.
