DEV Community

Discussion on: When the white space became a beast

edA‑qa mort‑ora‑y

What problem is the non-breaking-space creating?

Surely the myriad of other Unicode spacing characters would also create similar issues?

tbodt

There are dozens and dozens of Unicode characters that show up as blank space. You might think that \s in a regex would find all of them, since it matches characters with the "separator, space" Unicode property. But not every blank-looking character has that property. Some examples (the character itself is between the parentheses):
U+3164 HANGUL FILLER (ㅤ)
U+1D173 MUSICAL SYMBOL BEGIN BEAM (𝅳) (there are 7 other similar musical symbols)
U+200D ZERO WIDTH JOINER (‍)
U+180E MONGOLIAN VOWEL SEPARATOR (᠎) (only shows up as blank in some fonts)
There's even one character,   (U+1680 OGHAM SPACE MARK) that has the "separator, space" property and doesn't display as whitespace. Hilariously enough, you can use this character as whitespace in JavaScript.
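
If you want to check this yourself, here is a minimal sketch for Node. The `samples` list and the helpers `matchesWhitespace` and `isSpaceSeparator` are names I made up for illustration. NO-BREAK SPACE is added only as a comparison point, and what U+180E reports depends on the Unicode version your engine ships with, so treat that row as engine-dependent.

```javascript
// Code points mentioned above, plus NO-BREAK SPACE for comparison.
const samples = [
  { cp: 0x3164, name: "HANGUL FILLER" },
  { cp: 0x1d173, name: "MUSICAL SYMBOL BEGIN BEAM" },
  { cp: 0x200d, name: "ZERO WIDTH JOINER" },
  { cp: 0x180e, name: "MONGOLIAN VOWEL SEPARATOR" }, // result varies with engine age
  { cp: 0x1680, name: "OGHAM SPACE MARK" },
  { cp: 0x00a0, name: "NO-BREAK SPACE" },
];

// Does \s match this single code point? The u flag makes the regex
// treat astral characters such as U+1D173 as one unit.
function matchesWhitespace(cp) {
  return /^\s$/u.test(String.fromCodePoint(cp));
}

// Does the code point have the "separator, space" (Zs) general category?
function isSpaceSeparator(cp) {
  return /^\p{Zs}$/u.test(String.fromCodePoint(cp));
}

for (const { cp, name } of samples) {
  const hex = cp.toString(16).toUpperCase().padStart(4, "0");
  console.log(
    `U+${hex} ${name}: \\s=${matchesWhitespace(cp)} Zs=${isSpaceSeparator(cp)}`
  );
}

// OGHAM SPACE MARK used as whitespace in actual JavaScript source:
// the function body below is "return 1", U+1680, "+", U+1680, "2".
const sum = new Function("return 1\u1680+\u16802")();
console.log(sum);
```

On a reasonably current engine, the filler, the musical symbol and the joiner should all come back false for \s, while the Ogham space mark and the no-break space come back true.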