Edit: I still don't know why, but I did find out that the mismatch described below only happens if you've got a weird or missing user-agent string in your request headers.
I've been doing some research on declaring character encodings.
Specifically, do you really need the <meta charset="UTF-8"> tag?
You must declare a character encoding, but many servers already include one in the HTTP Content-Type header by default, and that's actually better than using a <meta> tag: the earlier it's declared, the sooner the page can render.
A micro-optimisation, really.
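If you want to check what a server is declaring, the charset is just a parameter on the Content-Type header. Here's a minimal sketch of pulling it out using Python's standard library. The function name charset_from_content_type is my own, not part of any library.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()


print(charset_from_content_type("text/html; charset=UTF-8"))  # utf-8
print(charset_from_content_type("text/html"))                 # None
```

If this returns None for your own pages, the server isn't declaring an encoding and the <meta> tag is doing the work.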
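On top of that, the HTML standard says utf-8 is the only valid character encoding for HTML5 documents. It's tempting to read <!doctype html> as implicitly declaring the character encoding too, but browsers don't infer it from the doctype: without a declaration they fall back to sniffing or a locale default.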
<meta charset="UTF-8"> is considered sacred. So before I started telling people it's a useless 22 bytes, I thought I'd see what Google does.
In the <head> of the Google homepage they have:
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
But then in the HTTP headers it's:
Content-Type: text/html; charset=ISO-8859-1
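To put the two declarations side by side, here's a rough sketch that reads the charset from a Content-Type header and from the first <meta> tag in a page. All the names here (declared_charsets, meta_charset, fetch and so on) are hypothetical helpers of mine. The "used" value assumes the rule that the HTTP header takes precedence over a <meta> tag, and it ignores byte order marks, so treat it as a simplification rather than what a browser does in full.

```python
import urllib.request
from email.message import Message
from html.parser import HTMLParser


def header_charset(value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


class MetaCharsetFinder(HTMLParser):
    """Records the first charset declared by a <meta> tag."""

    def __init__(self):
        super().__init__()
        self.charset = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.charset is not None:
            return
        attrs = dict(attrs)
        if attrs.get("charset"):
            # Short form: <meta charset="UTF-8">
            self.charset = attrs["charset"].lower()
        elif (attrs.get("http-equiv") or "").lower() == "content-type":
            # Long form: <meta http-equiv="Content-Type" content="...">
            self.charset = header_charset(attrs.get("content") or "")


def meta_charset(html_text):
    finder = MetaCharsetFinder()
    finder.feed(html_text)
    return finder.charset


def declared_charsets(content_type_header, html_text):
    header = header_charset(content_type_header)
    meta = meta_charset(html_text)
    # Assumption: the HTTP header wins over <meta> when both are present.
    return {"header": header, "meta": meta, "used": header or meta}


def fetch(url, user_agent):
    """Network helper (not called here): returns (Content-Type, body text)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req) as resp:
        # latin-1 never fails to decode; we only need the ASCII markup.
        return resp.headers.get("Content-Type", ""), resp.read().decode("latin-1")


page = '<head><meta content="text/html; charset=UTF-8" http-equiv="Content-Type"></head>'
print(declared_charsets("text/html; charset=ISO-8859-1", page))
# {'header': 'iso-8859-1', 'meta': 'utf-8', 'used': 'iso-8859-1'}
```

The fetch helper takes a user-agent string so you can pass its result to declared_charsets and see whether the header changes between, say, a browser-like user agent and an unusual one.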
What's going on here?
Here are my guesses:
- Maybe it's a backwards compatibility thing. Perhaps browsers that don't understand the <meta> tag also don't understand utf-8?
- Maybe it's a performance optimization. Perhaps it's faster to parse the very first part of the document in ISO-8859-1 then switch to utf-8 for the rest.
What do you think? What does Google know that we don't (besides literally everything)?