Ever wondered why some emojis eat up more of a character count than expected, especially when you are validating input length? It turns out many emojis are actually combinations of simpler ones, usually glued together by an invisible zero-width joiner (U+200D)!
Examples of composite emojis:
- 👨‍👨‍👦‍👦 = 👨 + 👨 + 👦 + 👦 (family: man, man, boy, boy)
- 👩‍💻 = 👩 + 💻 (woman technologist)
- 🏳️‍🌈 = 🏳️ + 🌈 (rainbow flag)
- 👨‍🍳 = 👨 + 🍳 (man cook)
- 🧑🏻‍🎨 = 🧑 + 🏻 + 🎨 (artist: light skin tone)
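To see what is really inside one of these, you can list its code points. This is a minimal sketch; `codePoints` is a hypothetical helper name, and the emojis are written with escape sequences so the invisible characters stay visible in the source.

```javascript
// Hypothetical helper: break a string into its Unicode code points (U+XXXX).
function codePoints(str) {
  // Array.from iterates a string by code point, not by UTF-16 code unit.
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
  );
}

// Woman technologist: woman + zero-width joiner + laptop.
const womanTechnologist = '\u{1F469}\u200D\u{1F4BB}';
console.log(codePoints(womanTechnologist));
// ['U+1F469', 'U+200D', 'U+1F4BB']

// Rainbow flag: white flag + variation selector-16 + zero-width joiner + rainbow.
const rainbowFlag = '\u{1F3F3}\uFE0F\u200D\u{1F308}';
console.log(codePoints(rainbowFlag));
// ['U+1F3F3', 'U+FE0F', 'U+200D', 'U+1F308']
```

The "+" in the list above hides these extra joiner and selector code points, which is why the counts come out higher than the number of visible parts.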
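Developer gotcha: Programming languages measure string length in different units (JavaScript counts UTF-16 code units, Python counts code points, Go's len counts bytes), so the same emoji can have a different length on the frontend and the backend. Always test your character limits with composite emojis.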
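Here is a rough sketch of how far apart those units can be for a single visible emoji. Which unit your backend or database actually uses is something to check for your own stack; the names `familyEmoji` and `counts` are just for illustration.

```javascript
// The family emoji, written with escapes: 4 people joined by 3 zero-width joiners.
const familyEmoji = '\u{1F468}\u200D\u{1F468}\u200D\u{1F466}\u200D\u{1F466}';

const counts = {
  // What JavaScript's .length reports.
  utf16Units: familyEmoji.length,
  // What a code-point-based length (for example Python's len) would report.
  codePoints: [...familyEmoji].length,
  // What a byte-based length (for example a UTF-8 column limit) would see.
  utf8Bytes: new TextEncoder().encode(familyEmoji).length,
};

console.log(counts);
// { utf16Units: 11, codePoints: 7, utf8Bytes: 25 }
```

One emoji on screen, three different "lengths", and none of them is 1.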
JavaScript examples:
In JavaScript, Intl.Segmenter is a great help here:
const family = '👨‍👨‍👦‍👦';
// Default length - counts UTF-16 code units
console.log(family.length); // 11
// Spread - counts Unicode code points, not grapheme clusters
console.log([...family].length); // 7
// See the actual components
console.log(Array.from(family));
// ['👨', '\u200D', '👨', '\u200D', '👦', '\u200D', '👦']
// (the zero-width joiners, U+200D, print as invisible characters)
// For accurate user-visible character count
const segmenter = new Intl.Segmenter('en', {granularity: 'grapheme'});
console.log([...segmenter.segment(family)].length); // 1
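Building on that, here is one way a character limit could be enforced by user-visible characters. It is only a sketch: `graphemeCount` and `truncateGraphemes` are hypothetical helper names, and it assumes a runtime that ships Intl.Segmenter (older browsers and Node versions may not).

```javascript
// Reuse one segmenter; constructing it for every call is unnecessary work.
const graphemeSegmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });

// Hypothetical helper: count user-visible characters (grapheme clusters).
function graphemeCount(text) {
  let count = 0;
  for (const _segment of graphemeSegmenter.segment(text)) count++;
  return count;
}

// Hypothetical helper: keep at most `limit` grapheme clusters,
// never cutting a composite emoji in half.
function truncateGraphemes(text, limit) {
  let result = '';
  let kept = 0;
  for (const { segment } of graphemeSegmenter.segment(text)) {
    if (kept === limit) break;
    result += segment;
    kept++;
  }
  return result;
}

// 'Hi ' + woman technologist + '!'
const bio = 'Hi \u{1F469}\u200D\u{1F4BB}!';
console.log(bio.length);         // 9
console.log(graphemeCount(bio)); // 5
console.log(truncateGraphemes(bio, 4) === 'Hi \u{1F469}\u200D\u{1F4BB}'); // true
```

By contrast, bio.slice(0, 4) cuts on UTF-16 code units and would leave half of a surrogate pair at the end of the string.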
Playground
I got curious about all the possible combinations, so I made "Emoji Architect" to explore these emojis. With this tool you can:
- Browse all composite emojis
- See the breakdown of any emoji
- Filter combinations by base emoji components
https://www.thingsaboutweb.dev/en/emojiarchitect
More explanation and ways of building emojis coming soon...
Reference
The best reference is the spec.