Ever wondered why your URLs have %20 instead of spaces, or why your API breaks when a parameter contains an &? That's URL encoding at work — and understanding it will save you hours of debugging.
What is URL Encoding?
URL encoding (also called percent encoding) converts characters into a format safe for URLs. Since URLs can only contain a limited set of ASCII characters, anything outside that set gets replaced with one or more %XX groups: a % followed by two hex digits for each byte of the character.
For example:
- Space → %20
- & → %26
- = → %3D
- # → %23
- @ → %40
- + → %2B
This is defined in RFC 3986 and is fundamental to how the web works.
Why URL Encoding Matters
1. Prevents Broken URLs
Consider searching for "rock & roll":
❌ Bad: https://api.example.com/search?q=rock & roll
✅ Good: https://api.example.com/search?q=rock%20%26%20roll
Without encoding, the & is interpreted as a parameter separator, breaking your query into two malformed parameters.
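To see the breakage concretely, here is a small sketch that parses both versions of the query string with Python's standard `urllib.parse` module. The variable names are illustrative, and `keep_blank_values=True` is set only so the stray second parameter shows up in the output.

```python
from urllib.parse import parse_qs

# Unencoded: the & splits the query, and the leftover " roll" has no "=".
broken = parse_qs("q=rock & roll", keep_blank_values=True)

# Encoded: the whole phrase survives as a single value.
fixed = parse_qs("q=rock%20%26%20roll")

print(broken)  # {'q': ['rock '], ' roll': ['']}
print(fixed)   # {'q': ['rock & roll']}
```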
2. Security
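Unencoded user input can change the structure of a URL, for example by injecting extra query parameters or altering the path. Encoding keeps that input as plain data. It is not a complete defense against XSS or SQL injection on its own: those also need HTML escaping and parameterized queries at the point where the data is used.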
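As a sketch of what encoding protects against at the URL level, the example below shows a parameter-injection attempt. The `user_input` value is a made-up hostile string, and the parsing is done with Python's standard library purely for illustration.

```python
from urllib.parse import parse_qs, urlencode

user_input = "x&admin=true"  # hypothetical hostile value

# Naive concatenation lets the input smuggle in a second parameter.
naive_query = "q=" + user_input
naive_params = parse_qs(naive_query)

# urlencode() percent-encodes the value, so it stays a single piece of data.
safe_query = urlencode({"q": user_input})
safe_params = parse_qs(safe_query)

print(naive_params)  # {'q': ['x'], 'admin': ['true']}
print(safe_query)    # q=x%26admin%3Dtrue
print(safe_params)   # {'q': ['x&admin=true']}
```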
3. International Characters
Non-ASCII characters (Chinese, Arabic, emoji) must be encoded using their UTF-8 byte sequences:
café → caf%C3%A9
日本 → %E6%97%A5%E6%9C%AC
🚀 → %F0%9F%9A%80
4. Data Integrity
URLs pass through proxies, load balancers, logging systems, and analytics tools. Encoding ensures nothing gets corrupted along the way.
How Percent Encoding Works
The algorithm is simple:
- Take the character's UTF-8 bytes
- Write % followed by two hex digits for each byte
ASCII characters (1 byte):
| Character | Decimal | Hex | Encoded |
|---|---|---|---|
| Space | 32 | 20 | %20 |
| ! | 33 | 21 | %21 |
| # | 35 | 23 | %23 |
| & | 38 | 26 | %26 |
| + | 43 | 2B | %2B |
| / | 47 | 2F | %2F |
| = | 61 | 3D | %3D |
| ? | 63 | 3F | %3F |
| @ | 64 | 40 | %40 |
Multi-byte characters produce multiple %XX sequences:
é (2 bytes UTF-8) → %C3%A9
中 (3 bytes UTF-8) → %E4%B8%AD
🎉 (4 bytes UTF-8) → %F0%9F%8E%89
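The two-step algorithm can be written out by hand in a few lines. This is a minimal sketch, not a replacement for a library function: `percent_encode` is a hypothetical helper name, and it assumes every character other than ASCII letters, digits, and `- . _ ~` should be encoded.

```python
from urllib.parse import quote

# Characters left untouched (letters, digits, and - . _ ~).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

def percent_encode(text: str) -> str:
    """Percent-encode every character outside the unreserved set."""
    out = []
    for ch in text:
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # One %XX group per UTF-8 byte, in uppercase hex.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)

print(percent_encode("é"))      # %C3%A9
print(percent_encode("中"))     # %E4%B8%AD
print(percent_encode("🎉"))     # %F0%9F%8E%89
print(percent_encode("a b&c"))  # a%20b%26c
```

For these inputs the result matches `quote(text, safe="")` from the standard library, which is what you would normally call instead of hand-rolling the loop.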
Three Categories of Characters
1. Unreserved (Never need encoding)
A-Z a-z 0-9 - . _ ~
These are always safe in any URL position.
2. Reserved (Encode when used as data)
: / ? # [ ] @ ! $ & ' ( ) * + , ; =
These have special meaning in URLs. Encode them when they appear as data (parameter values), not when they serve their structural purpose (delimiters).
3. Everything Else (Always encode)
Spaces, non-ASCII characters, control characters — always encode these.
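The three categories translate directly into a lookup. The sketch below uses the character sets listed above; `classify` is a hypothetical helper written for this article, not a standard-library function.

```python
import string

UNRESERVED = set(string.ascii_letters + string.digits + "-._~")
RESERVED = set(":/?#[]@!$&'()*+,;=")

def classify(ch: str) -> str:
    """Return which of the three categories a single character falls into."""
    if ch in UNRESERVED:
        return "unreserved"  # never needs encoding
    if ch in RESERVED:
        return "reserved"    # encode when used as data
    return "other"           # always encode

for ch in ["a", "~", "&", "/", " ", "é"]:
    print(repr(ch), classify(ch))
# 'a' unreserved
# '~' unreserved
# '&' reserved
# '/' reserved
# ' ' other
# 'é' other
```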
The Space Encoding Confusion
Spaces can be encoded two ways:
| Context | Encoding | Example |
|---|---|---|
| Path segments | %20 | /my%20file.txt |
| Query strings | + or %20 | ?q=hello+world |
The + convention comes from HTML form submissions (application/x-www-form-urlencoded). Outside query strings, always use %20.
This dual convention is the #1 source of URL encoding bugs. 🐛
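A quick way to see the two conventions side by side is to run the same strings through both families of functions in Python's `urllib.parse`. The sample strings are illustrative.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Decoding: "+" only means "space" under the form-encoding convention.
plain = unquote("hello+world")      # leaves the plus sign alone
form = unquote_plus("hello+world")  # treats the plus sign as a space

# Encoding: a literal plus sign in data must become %2B,
# otherwise a form decoder will read it back as a space.
literal_plus = quote_plus("1+1=2")

space_rfc = quote("a b")        # %20 style
space_form = quote_plus("a b")  # + style

print(plain)         # hello+world
print(form)          # hello world
print(literal_plus)  # 1%2B1%3D2
print(space_rfc)     # a%20b
print(space_form)    # a+b
```

The usual bug is a mismatch: one side encodes with the `+` convention and the other decodes without it, or the reverse.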
URL Encoding in JavaScript
// Encode a parameter value (most common)
encodeURIComponent("rock & roll")
// → "rock%20%26%20roll"
// Encode a full URL (preserves structure)
encodeURI("https://example.com/path?q=hello world")
// → "https://example.com/path?q=hello%20world"
// Decode
decodeURIComponent("rock%20%26%20roll")
// → "rock & roll"
Rule of thumb:
- Use encodeURIComponent() for parameter values (99% of cases)
- Use encodeURI() for complete URLs (rare)
URL Encoding in Python
from urllib.parse import quote, quote_plus, urlencode
# For path segments
quote("my file.txt")
# → "my%20file.txt"
# For query string values (spaces → +)
quote_plus("rock & roll")
# → "rock+%26+roll"
# Build a complete query string
urlencode({"q": "rock & roll", "page": "1"})
# → "q=rock+%26+roll&page=1"
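One detail worth knowing about `quote()`: by default it treats `/` as safe and leaves it unencoded, which suits a whole path but not a single segment or a data value that happens to contain a slash. A short sketch with a made-up file path:

```python
from urllib.parse import quote

# Default: "/" is kept, so the path structure is preserved.
whole_path = quote("docs/my file.txt")

# safe="": "/" is encoded too, so the string stays one segment.
one_segment = quote("docs/my file.txt", safe="")

print(whole_path)   # docs/my%20file.txt
print(one_segment)  # docs%2Fmy%20file.txt
```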
URL Encoding in PHP
// Spaces → + (form encoding)
urlencode("rock & roll");
// → "rock+%26+roll"
// Spaces → %20 (RFC 3986)
rawurlencode("rock & roll");
// → "rock%20%26%20roll"
// Build query string from array
http_build_query(["q" => "rock & roll", "page" => 1]);
// → "q=rock+%26+roll&page=1"
URL Encoding vs HTML Encoding
These are completely different — don't confuse them!
| | URL Encoding | HTML Encoding |
|---|---|---|
| Purpose | Safe URLs | Safe HTML |
| Format | %XX | &entity; |
| Space | %20 | (no encoding needed) |
| & | %26 | &amp; |
| < | %3C | &lt; |
| When | Building URLs | Rendering HTML |
Security tip: When embedding a URL in an HTML attribute (like href), you need both encodings — URL-encode the data first, then HTML-encode the complete URL.
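Here is a minimal sketch of that two-step order using Python's standard library. The parameter names, the example.com URL, and the link text are placeholders.

```python
from html import escape
from urllib.parse import urlencode

params = {"q": "rock & roll", "lang": "en"}  # illustrative data

# Step 1: URL-encode the data while building the URL.
url = "https://example.com/search?" + urlencode(params)

# Step 2: HTML-encode the complete URL before placing it in an attribute.
href = escape(url, quote=True)
link = f'<a href="{href}">Search</a>'

print(url)   # https://example.com/search?q=rock+%26+roll&lang=en
print(href)  # https://example.com/search?q=rock+%26+roll&amp;lang=en
```

Note that the `&` separating the two parameters becomes `&amp;` in the HTML, while the `%26` inside the value is left as it is.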
Common URL Encoding Bugs
Bug #1: Double encoding
❌ %2520 (the % in %20 got encoded again)
✅ %20
This happens when you encode an already-encoded URL. Always encode raw data, never encoded strings.
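Double encoding is easy to reproduce. In this sketch the file name is arbitrary; the point is that the second `quote()` call encodes the `%` signs produced by the first.

```python
from urllib.parse import quote, unquote

raw = "my file.txt"
once = quote(raw)    # encoded from raw data: correct
twice = quote(once)  # encoded again: the % itself becomes %25

print(once)            # my%20file.txt
print(twice)           # my%2520file.txt
print(unquote(twice))  # my%20file.txt (one decode is no longer enough)
```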
Bug #2: Using encodeURI instead of encodeURIComponent
// Wrong — doesn't encode & and =
encodeURI("key=value&foo=bar")
// → "key=value&foo=bar" (unchanged!)
// Right — encodes everything
encodeURIComponent("key=value&foo=bar")
// → "key%3Dvalue%26foo%3Dbar"
Bug #3: Not encoding at all
// This WILL break if userInput contains & or =
const brokenUrl = `https://api.com/search?q=${userInput}`;
// Always encode user input
const safeUrl = `https://api.com/search?q=${encodeURIComponent(userInput)}`;
Try It Yourself 🚀
I built a free URL Parser that breaks down any URL into its components: protocol, hostname, path, query parameters, and fragment. Perfect for debugging encoded URLs.
Paste any URL and instantly see every component decoded and separated. 100% client-side — nothing leaves your browser.
Conclusion
URL encoding is one of those web fundamentals that's easy to ignore until it causes a production bug. Remember these rules:
- Always use encodeURIComponent() for parameter values
- Never encode an already-encoded string (double encoding)
- Spaces: %20 in paths, + or %20 in query strings
- URL encoding ≠ HTML encoding: different contexts, different rules
Bookmark this guide and you'll never have a URL encoding bug again.
What's the worst URL encoding bug you've encountered? Share in the comments! 💬