
Panda Quests


URL encoding

This post was first published on my blog. For more details check it out.

In my last article I talked about what a URL is. In this article I’ll go a little further and explain, in an easy-to-understand way, what URL encoding is and why it is so important.

URLs were designed to be as usable and interoperable as possible. For that reason, the internet standards define so-called “unsafe characters”, which should not appear in a URL as-is.

Examples of unsafe characters are:
The space “ ”, because spaces can disappear when a URL is printed or transcribed, or you can’t tell how many space characters there are.
The pound/hash character “#”, because it is reserved for marking the start of the fragment (we covered what a “fragment” is here already).
The caret “^”, because not all network devices transmit this character correctly.
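
To make the “#” case concrete, here is a minimal sketch using Python’s standard `urllib.parse` module. The file name “notes#1.txt” and the host foo.com are made up for illustration; the point is only that an unencoded “#” makes a URL parser treat everything after it as the fragment.

```python
from urllib.parse import quote, urlsplit

# Hypothetical file name containing the reserved "#" character.
file_name = "notes#1.txt"

# Unencoded: the parser cuts the path at "#" and treats the rest as the fragment.
raw = urlsplit("http://foo.com/" + file_name)
raw_path = raw.path          # "/notes"
raw_fragment = raw.fragment  # "1.txt"

# Percent-encoded: "#" becomes "%23", so the whole name stays in the path.
safe = urlsplit("http://foo.com/" + quote(file_name))
safe_path = safe.path          # "/notes%231.txt"
safe_fragment = safe.fragment  # ""
```

Without encoding, a request for this URL would ask for “/notes”, which is not the file that was meant.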

Which characters may appear unencoded in a URL is defined in RFC 3986 (the term “unsafe” itself comes from the older RFC 1738). RFC stands for Request for Comments, a series of documents published by the IETF (Internet Engineering Task Force). Despite the modest name, RFC 3986 has the status of an Internet Standard.

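RFC 3986 allows the following characters to appear unencoded anywhere in a URL: the US-ASCII letters and digits plus the hyphen “-”, the period “.”, the underscore “_” and the tilde “~”. A few special characters like the colon “:” and the slash “/” are reserved as delimiters, so they can be used as-is only in that role.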

If you want to transmit one of these unsafe characters, you have to “percent-encode” it, which is also called “URL encoding”. For example, if the file “^hello world.txt” is stored on the server foo.com, then a valid URL for it would look like: “http://foo.com/%5Ehello%20world.txt”

As you can see, the caret “^” and the space “ ” have been replaced with “%5E” and “%20”, respectively. The two characters after the percent sign “%” are the character’s hexadecimal code in the US-ASCII character table, i.e. “5E” and “20” stand for “^” and “ ” in the US-ASCII table.
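
The same encoding can be sketched in Python. The helper `percent_encode_char` is a hypothetical name written only for this illustration; `quote` is the standard library function from `urllib.parse` that does the real work for a whole string.

```python
from urllib.parse import quote

def percent_encode_char(ch: str) -> str:
    """Return "%" followed by the two-digit hexadecimal US-ASCII code of ch."""
    return "%{:02X}".format(ord(ch))

caret = percent_encode_char("^")  # "%5E"
space = percent_encode_char(" ")  # "%20"

# quote() encodes every character that is not a letter, digit,
# "-", ".", "_", "~" or (by default) the slash "/".
file_name = "^hello world.txt"
url = "http://foo.com/" + quote(file_name)
# url is now "http://foo.com/%5Ehello%20world.txt"
```

Only the file name is passed to `quote`, not the whole URL, so that the “:” and “/” of “http://” keep their role as delimiters.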

The full US-ASCII table can be found here.

Once the request arrives at the server, the URL is passed through a URL decoder. A URL decoder simply reverses the process of URL encoding. So instead of “%5E” or “%20”, you’ll have the caret “^” or the space “ ” character again.
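
As a rough sketch of that decoding step, Python’s `urllib.parse.unquote` turns the percent-encoded path back into the original characters. How a real web server does this internally will differ; this only shows the round trip.

```python
from urllib.parse import unquote, urlsplit

url = "http://foo.com/%5Ehello%20world.txt"

# urlsplit() separates the URL into its parts but leaves the path encoded.
encoded_path = urlsplit(url).path     # "/%5Ehello%20world.txt"

# unquote() replaces each "%XX" with the character it stands for.
decoded_path = unquote(encoded_path)  # "/^hello world.txt"
```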

Have you used URL encoding before? What kind of project was it? Do you have any questions? Comment below and let me know.

