Edit: Still don't know why. But I did find out it only happens if you've got a weird or missing user-agent
string in your request headers.
I've been doing some research on declaring character encodings.
Specifically, do you really need the <meta charset="UTF-8"> tag?
You must declare a character encoding, but by default most servers include this in the HTTP headers, and that's actually better than using a <meta> tag: the earlier it's declared, the sooner the page can render.
A micro-optimisation really.
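To make the header side concrete, here's a minimal sketch of pulling the charset out of a Content-Type header value using Python's standard library. The function name header_charset is my own, not something from any server or browser.

```python
from email.message import Message
from typing import Optional


def header_charset(content_type: str) -> Optional[str]:
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() reads the charset parameter and lower-cases it;
    # it returns None when the header carries no charset at all.
    return msg.get_content_charset()


print(header_charset("text/html; charset=UTF-8"))
print(header_charset("text/html"))
```

If the second call gives you nothing back, the server isn't declaring an encoding in the header and the page is relying on something else, such as the <meta> tag.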
On top of that, for HTML5, utf-8 is the only valid character encoding. So arguably <!doctype html> is implicitly declaring the character encoding too.
<meta charset="UTF-8"> is considered sacred. So before I started telling people it's a useless 22 bytes, I thought I'd see what Google do.
In the Google homepage <head> tags they have:
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
But then in the HTTP headers it's:
Content-Type: text/html; charset=ISO-8859-1
What's going on here?
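If you want to check another site for the same mismatch, here's a rough sketch that compares the two declarations. It takes the header value and the HTML as plain strings, so it doesn't do any fetching itself. The names MetaCharsetFinder and declared_charsets are made up for this example, and it only looks at the two <meta> forms mentioned in this post.

```python
from email.message import Message
from html.parser import HTMLParser
from typing import Optional


def header_charset(content_type: str) -> Optional[str]:
    """Return the lower-cased charset from a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


class MetaCharsetFinder(HTMLParser):
    """Records the charset declared by the first matching <meta> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.charset: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        # HTMLParser already lower-cases tag and attribute names.
        attr = {name: (value or "") for name, value in attrs}
        if "charset" in attr:
            # Short form: <meta charset="UTF-8">
            self.charset = attr["charset"].strip().lower() or None
        elif attr.get("http-equiv", "").lower() == "content-type":
            # Long form: <meta http-equiv="Content-Type" content="...; charset=...">
            self.charset = header_charset(attr.get("content", ""))


def declared_charsets(content_type: str, html: str) -> dict:
    """Compare the charset in the HTTP header with the one in the markup."""
    finder = MetaCharsetFinder()
    finder.feed(html)
    header = header_charset(content_type)
    return {"header": header, "meta": finder.charset, "match": header == finder.charset}


result = declared_charsets(
    "text/html; charset=ISO-8859-1",
    '<head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"></head>',
)
print(result)
```

Fed the two values quoted above, it reports iso-8859-1 for the header, utf-8 for the <meta> tag, and match as False.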
Here are my guesses:
- Maybe it's a backwards compatibility thing. Perhaps browsers that don't understand the <meta> tag also don't understand utf-8?
- Maybe it's a performance optimization. Perhaps it's faster to parse the very first part of the document in ISO-8859-1 then switch to utf-8 for the rest.
What do you think? What does Google know that we don't (besides literally everything)?
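If you'd like to poke at the user-agent angle from the edit at the top, here's a minimal sketch using urllib from the standard library. The function names, the URL and the example user-agent string are all my own placeholders, and I'm not promising what any server will send back.

```python
import urllib.request
from typing import Optional


def opener_with_user_agent(user_agent: Optional[str]) -> urllib.request.OpenerDirector:
    """Build an opener that sends the given User-Agent, or none at all if None."""
    opener = urllib.request.build_opener()
    # urllib normally adds its own "Python-urllib/x.y" User-Agent header;
    # overwrite the default header list to replace it or drop it entirely.
    opener.addheaders = [] if user_agent is None else [("User-Agent", user_agent)]
    return opener


def served_content_type(url: str, user_agent: Optional[str]) -> Optional[str]:
    """Fetch url and return the raw Content-Type response header."""
    opener = opener_with_user_agent(user_agent)
    with opener.open(url, timeout=10) as response:
        return response.headers.get("Content-Type")


# Example usage (makes real network requests, so it is left commented out):
# for ua in ("Mozilla/5.0", None):
#     print(repr(ua), served_content_type("https://www.google.com/", ua))
```

Run it with a browser-like user-agent and then with None, and compare the charset in the two Content-Type values.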