Today I took another look at the HTML meta tag <meta charset="UTF-8"> and learned a few new things.
Two ways to specify the charset in an HTML file
While using VSCode, I came across an interesting abbreviation, meta:utf, which inserted the corresponding code <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"> into the editor. It struck me that this declaration also seems to specify the charset of an HTML file. If so, how does it relate to <meta charset="UTF-8">? I Googled it and found this post on Stack Overflow:
In HTML5, they are equivalent. Use the shorter one, as it is easier to remember and type. Browser support is fine since it was designed for backwards compatibility.
So I should always use <meta charset="UTF-8">.
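To make the equivalence concrete, here is a minimal sketch that pulls the declared charset out of either form. The helper name extractMetaCharset is hypothetical, and the regular expressions are a deliberate simplification: this is not the prescan algorithm that browsers actually run, only an illustration that both declarations carry the same information.

```javascript
// Simplified illustration only: NOT the browser's real charset prescan.
// extractMetaCharset is a hypothetical helper name.
function extractMetaCharset(html) {
  // Short form: <meta charset="UTF-8">
  const shortMatch = /<meta\s+charset\s*=\s*["']?([\w-]+)/i.exec(html);
  if (shortMatch) return shortMatch[1].toLowerCase();

  // Long form: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  const longMatch =
    /<meta\s+http-equiv\s*=\s*["']?content-type["']?\s+content\s*=\s*["'][^"']*charset\s*=\s*([\w-]+)/i.exec(html);
  if (longMatch) return longMatch[1].toLowerCase();

  return null; // no charset declared
}

const shortForm = '<meta charset="UTF-8">';
const longForm =
  '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">';

console.log(extractMetaCharset(shortForm)); // utf-8
console.log(extractMetaCharset(longForm)); // utf-8
```

Both calls report the same charset, which is why the shorter form is the one worth typing.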
HTTP Header Content-Type or <meta charset="UTF-8">
As far as I know, we can also specify the charset of an HTML file with the HTTP response header Content-Type: text/html; charset=UTF-8. I wanted to know which one has higher priority, so I built a simple Node.js server to find out. Here's the code:
const http = require("http");

http
  .createServer((req, res) => {
    // The HTTP response header declares UTF-8 ...
    res.writeHead(200, {
      "Content-Type": "text/html; charset=UTF-8"
    });
    // ... while the meta tag inside the document declares windows-1252.
    res.write(`<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
  </head>
  <body>
    <p>你好</p>
  </body>
</html>`);
    res.end();
  })
  .listen(3000);
Then I opened http://localhost:3000 and the Chinese characters were displayed correctly. So the HTTP response header has higher priority than the meta declaration.
Default character set for an HTML file in Chrome
When no charset is declared, the default character set is windows-1252. If you open the HTML file directly from disk with Chrome, however, it seems to be treated as UTF-8. The same seems to be true in Firefox, but not in Safari, whose default character set is still windows-1252.
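The ordering observed in this experiment can be written down as a tiny function. This is only a sketch of the result above, not how a browser is implemented: resolveCharset and its parameter names are hypothetical, and the windows-1252 fallback is an assumption for the case where nothing is declared.

```javascript
// Hypothetical helper modelling only the ordering seen in the experiment:
// HTTP header first, then the meta declaration, then an assumed fallback.
function resolveCharset({ headerCharset = null, metaCharset = null } = {}) {
  if (headerCharset) return headerCharset; // Content-Type response header wins
  if (metaCharset) return metaCharset; // otherwise the meta tag is used
  return "windows-1252"; // assumed default when nothing is declared
}

// The situation from the test server: header says UTF-8, meta says windows-1252.
console.log(
  resolveCharset({ headerCharset: "UTF-8", metaCharset: "windows-1252" })
); // UTF-8

// Without the header, the meta declaration decides.
console.log(resolveCharset({ metaCharset: "windows-1252" })); // windows-1252
```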
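To see why the default matters, here is a small Node.js sketch that decodes the same bytes two ways. One assumption to note: Node's built-in "latin1" encoding is ISO-8859-1, not windows-1252. The two differ only for bytes 0x80 to 0x9F, and none of the bytes in this example fall in that range, so it stands in for windows-1252 here.

```javascript
// The UTF-8 bytes a server would send for the text in the test page.
const bytes = Buffer.from("你好", "utf8");
console.log(bytes.toString("hex")); // e4bda0e5a5bd

// Decoded with the right charset, the text round-trips.
const asUtf8 = bytes.toString("utf8");
console.log(asUtf8); // 你好

// Decoded one byte per character (latin1 standing in for windows-1252),
// every byte becomes its own character and the text turns into mojibake.
const asSingleByte = bytes.toString("latin1");
console.log(asSingleByte.length); // 6
console.log(asSingleByte);
```

The last line prints six accented and symbol characters instead of two Chinese ones, which is what a page looks like when a browser falls back to windows-1252 for UTF-8 content.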
Best practice
Always specify both the <meta charset="UTF-8"> declaration and the Content-Type response header, and keep in mind that the latter has higher priority.
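A minimal sketch of this practice with Node's built-in http module might look like the following. The names PAGE, handler, and createApp are made up for illustration; the point is only that the header and the meta tag agree.

```javascript
const http = require("http");

// The document declares UTF-8 itself ...
const PAGE = `<!DOCTYPE html>
<html>
  <head>
    <meta charset="UTF-8">
  </head>
  <body>
    <p>你好</p>
  </body>
</html>`;

// ... and the response header declares the same charset.
function handler(req, res) {
  res.writeHead(200, { "Content-Type": "text/html; charset=UTF-8" });
  res.end(PAGE); // strings are sent as UTF-8 by default
}

// Returns a server without starting it, so the caller decides the port.
function createApp() {
  return http.createServer(handler);
}

// To try it out: createApp().listen(3000);
```

With both declarations in agreement, the page decodes the same way whether it is served over HTTP or opened from disk.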
Top comments (2)
Interesting, good read.
You left some typos in your article though: Tody = Today, widows-1252 = windows-1252 and I think you meant Content-type instead of Conent-Type.
Thanks for correcting.