Have you seen <meta charset="UTF-8"> in HTML heads? If you already know what this charset does, you can stop reading here; if you don't, you will learn something new.
When you write some text in your HTML code, the browser must know the charset (character encoding) of the document in order to show exactly the same text.
Let me make it easy for you. Just make an HTML file and add <meta charset="ISO-8859-1"> in your HTML head. So now we are using the charset ISO-8859-1, which was widely used before UTF-8.
Now in your HTML file add <p>हर्ष</p>. "हर्ष" is my name written in Hindi (a language spoken in India). Now open the file in your browser. What do you see? Hahaha... you see something like "à¤¹à¤°à¥à¤·". Why is this so? Your editor most likely saved the file as UTF-8, but the browser decodes those bytes as ISO-8859-1, a charset that has no Hindi characters. That's why some random characters appear in the browser instead of my name. However, if you change your charset to "utf-8", it will show you exactly what you wrote in your code.
"utf-8" is the default character encoding for html5, meaning even if you don't declare the charset, browser will consider utf-8.
I hope that you now have the confidence to answer the question "What is utf-8 or charset?".
Top comments (4)
From one of the creators of Stack Overflow: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets - No Excuses!
We should all be using UTF-8 at the very least. It's good to see that browsers will treat documents as UTF-8 by default.
But of course I still get tons of data formats that are not encoded in UTF-8 and it's quite a headache to have to deal with it :)
UTF-16 is not a new version. It's a different representation of the characters, using another structure of data.
Thanks for sharing!