
Nathaniel

Posted on • Originally published at inputoutput.dev

Accessibility issues with stylized unicode characters

This is a piece I originally wrote as a warning on my website. It's aimed at a general audience — not nerds.

Unicode characters like 𝔱𝔥𝔢𝔰𝔢 cause accessibility issues and have a strong association with spam and scams.

You've probably landed on this page because you clicked "learn more" on the warning on one of the unicode text converter tools (or not, because this is dev.to).

That's because the output of those tools causes accessibility issues.

This page exists to explain those issues — and to convince you to either not use those tools, or to use them in a way that doesn't ruin people's experience of the web.

In addition to the accessibility issues, these characters have a strong association with spam and scams. More information can be found at the bottom of the article.

What is unicode?

To a computer everything is a number.

For a computer to work with the alphabet you need to give each letter a unique number.

For different computers to work together — they all need to agree on which letters are assigned which numbers.

Unicode is the system the world uses to make sure every computer agrees on which character is assigned which number.

Here are some examples:

  • capital A is #65
  • capital B is #66
  • lowercase a is #97

The world has thousands of languages which have unicode symbols too:

  • Ђ is #1026
  • က is #4096
  • ㌀ is #13056

Unicode is also used for symbols and Emoji:

  • ✩ is #10025
  • 𝔉 is #120073
  • 😂 is #128514
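If you'd like to check these numbers yourself, here is a minimal sketch in Python (my choice of language, not something the converter tools use). The built-in `ord` turns a character into its number and `chr` turns the number back into the character. The `examples` list is just a handful of the characters listed above.

```python
# A few of the characters from the lists above.
examples = ["A", "B", "a", "Ђ", "က", "😂"]

for character in examples:
    number = ord(character)          # character -> its Unicode number
    assert chr(number) == character  # number -> the same character again
    # Unicode numbers are usually written in hexadecimal, like U+0041.
    print(f"{character} is #{number} (U+{number:04X})")
```

Every computer that follows unicode will give the same numbers for the same characters, which is the whole point of the system.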

Without a system like unicode the internet would not be possible — this page would just be a mess of the wrong symbols.

Unicode mathematical symbols

Unicode has many symbols used in math and phonetics, and these symbols often resemble stylized letters of the Latin alphabet.

These are intended for use in mathematics, but instead people use them to write stylized text online — often on social-media sites that don't have the option to use bold or italic text.

Here's an example of what this text looks like (it may not render on your device). It says "this is some fancy text that looks funky and weird."

ᴛʜɪs ⒤⒮ 𝘀𝗼𝗺𝗲 𝔣𝔞𝔫𝔠𝔶 𝕥𝕖𝕩𝕥 𝓉𝒽𝒶𝓉 🄻🄾🄾🄺🅂 𝚏𝚞𝚗𝚔𝚢 🅐🅝🅓 𝒘𝒆𝒊𝒓𝒅.

It's true. It does look funky and weird — but it also comes with some serious accessibility issues.

Accessibility issues

Screen readers

Screen readers are an assistive technology that reads the contents of the web aloud.

When a screen reader reads these symbols, it doesn't interpret them visually — it correctly interprets them as the symbols that they are.

So a sentence like: "please buy my plastic garbage" written like this:

please buy my plastic 𝔤𝔞𝔯𝔟𝔞𝔤𝔢

is read aloud as "please buy my plastic mathematical fraktur g mathematical fraktur a mathematical fraktur r mathematical fraktur b mathematical fraktur a mathematical fraktur g mathematical fraktur e".
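You can see where those long names come from with a small Python sketch. It is not how any screen reader is actually built. It only looks up the official unicode name of each character using Python's standard `unicodedata` module. The helper name `spoken_names` is made up for this example, and the exact wording a real screen reader speaks may differ.

```python
import unicodedata

def spoken_names(text):
    """Return the official Unicode name of every character in text."""
    # The second argument is a fallback for characters that have no name.
    return [unicodedata.name(ch, "unnamed character").lower() for ch in text]

# Each stylized letter has its own long official name.
for name in spoken_names("𝔤𝔞𝔯𝔟𝔞𝔤𝔢"):
    print(name)
```

Seven short letters become seven long names, and that is what a screen reader user has to sit through.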

Which is incredibly frustrating — and destroys any chance of the author selling their plastic garbage to screen reader users.

Incompatible devices

Unicode contains over 100,000 letters and symbols, and more are added with each new version.

To have an image saved on your device for every single one of these symbols would take up a lot of memory — and be a lot of work for the device creators.

It's not surprising that many devices don't render these symbols at all. Instead they show a bunch of squares.

This is a familiar experience for anyone who has received a message containing a brand new emoji.

How to solve these issues

Social media and messaging apps

Some social media sites and messaging apps do allow you to write bold and italic text, but have unintentionally made it a hidden feature.

However, many sites and apps don't allow users to style text. So the simple solution is don't use these symbols — I hope this article has convinced you not to.

Instead, allow the content of your words to speak for itself. Or just quit social media and go outside and look at a flower.

On websites

If you're new to web development and want to stylize text, then you need to learn some CSS. You can learn more about it at MDN, under CSS: Cascading Style Sheets.

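If you need to use these symbols on your website, as this page does, and want screen readers to read them correctly, wrap them in an element with an aria-label attribute. A plain span has no role of its own, so many screen readers ignore an aria-label on it. Giving it role="img" makes the label stick, like so:

<span role="img" aria-label="your text">𝕪𝕠𝕦𝕣 𝕥𝕖𝕩𝕥</span>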

Or, if the symbols you are using are purely decorative and have no meaning for the reader, use an aria-hidden attribute instead. It needs the value "true" to take effect.

<span aria-hidden="true">✩</span>

Association with spam and scams

In addition to accessibility issues — unicode symbols like these have a strong association with spam and online scams.

Spam

One of the ways email and text message spam filters detect a spam message is by searching the email's content for specific words or phrases.

Some spammers hope to avoid detection by replacing these words or specific letters in these words with stylized characters.

Because of this, many people associate these stylised characters with spam.
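The trick described above stops working against a filter that converts stylized characters back to plain letters before it searches. Here is a minimal sketch of that idea in Python, using the standard `unicodedata` module and unicode's NFKC normalization. The `BLOCKED_PHRASES` list and both function names are made up for illustration, and real spam filters are far more involved than this.

```python
import unicodedata

# Hypothetical blocklist, for illustration only.
BLOCKED_PHRASES = ["plastic garbage"]

def fold_stylized(text):
    """Convert compatibility characters, such as math letters, to plain ones."""
    # NFKC replaces many stylized letters with their ordinary equivalents.
    # casefold() then removes upper/lower case differences.
    return unicodedata.normalize("NFKC", text).casefold()

def looks_like_spam(message):
    """Return True if the plain version of the message has a blocked phrase."""
    folded = fold_stylized(message)
    return any(phrase in folded for phrase in BLOCKED_PHRASES)

print(fold_stylized("please buy my plastic 𝔤𝔞𝔯𝔟𝔞𝔤𝔢"))
print(looks_like_spam("please buy my plastic 𝔤𝔞𝔯𝔟𝔞𝔤𝔢"))
```

NFKC does not catch every lookalike. A Cyrillic letter that resembles a Latin one is left alone, for example, so this is only one layer of a filter.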

Confusables and homoglyph attacks

Scarier than spam. Criminal hackers use these unicode characters to trick you into giving up access to your online accounts.

For example, let's say your example bank has a website with the example URL: example.com

Someone may send you an email pretending to be your bank asking you to log into your account. It looks real, so you click on the link — and the link opens a website that looks identical to your bank's website.

Even the URL looks the same, but it's not the same! Instead of saying example.com it says 𝚎𝚡𝚊𝚖𝚙𝚕𝚎.com, or subtler still, еxample.com (this one uses a Cyrillic е).

This is called a homoglyph attack. A homoglyph is a character that looks very similar to another character.

Unicode keeps a catalogue of all these homoglyphs to help people build tools to prevent such attacks. Unicode refers to them as confusables.
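As a rough sketch of how such a tool might start, the Python below flags a web address when any part of it contains letters that are not Latin. It does not use unicode's confusables catalogue. It uses a much cruder shortcut, taking the first word of each letter's official name as a guess at its script. The function names are made up for this example.

```python
import unicodedata

def scripts_in(label):
    """Rough script guess: the first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in label
        if ch.isalpha()  # ignore digits, hyphens and other non-letters
    }

def looks_suspicious(hostname):
    """Flag a hostname if any dot-separated part has non-Latin letters."""
    return any(scripts_in(part) - {"LATIN"} for part in hostname.split("."))

real = "example.com"
fake = "\u0435xample.com"  # starts with the Cyrillic letter е (U+0435)

print(scripts_in("\u0435xample"))
print(looks_suspicious(real))
print(looks_suspicious(fake))
```

A check this blunt would also flag perfectly honest addresses written entirely in Cyrillic, Greek or any other script, so treat it as an illustration of the idea and not as protection.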

Avoid the association

The association with spam, scams, and online crime makes these unicode characters even more worth avoiding.

You may even find that your emails and text messages go straight to people's junk mail.
