URL Encoding Explained: Why Your API Calls Break on Special Characters

#javascript #tutorial #webdev #beginners

I once spent an hour debugging an API integration that failed silently on certain user inputs. The search endpoint worked fine for "javascript" but returned empty results for "C++ programming." The culprit was the + character being interpreted as a space instead of a literal plus sign. The fix was one function call: encodeURIComponent(). But understanding why that fix works requires knowing how URL encoding actually operates.

Why URLs need encoding

URLs were designed in the early 1990s with a very limited character set. RFC 3986 defines the unreserved characters that can appear anywhere in a URL without encoding: letters, digits, and a handful of special characters (-, _, ., ~). Everything else -- spaces, non-ASCII characters, and reserved delimiters used as data rather than structure -- must be percent-encoded.

Percent encoding replaces each byte of a character with % followed by two hex digits representing that byte's value:

Space   -> %20
#       -> %23
&       -> %26
=       -> %3D
+       -> %2B

The reason is straightforward: characters like &, =, #, and / have special meaning in URLs. & separates query parameters. = separates keys from values. # introduces a fragment. / separates path segments. If your data contains these characters, the parser cannot tell the difference between structure and content.
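To make the ambiguity concrete, here is a minimal sketch using the standard URLSearchParams parser. The search term is a made-up example, not something from a real API:

```javascript
// Hypothetical search term that happens to contain URL delimiters.
const term = "salt&pepper=tasty";

// Unencoded: the parser treats the & and = inside the value as structure.
const brokenEntries = [...new URLSearchParams(`q=${term}`).entries()];
// [["q", "salt"], ["pepper", "tasty"]]

// Encoded: the delimiters travel as data and the value survives intact.
const fixedEntries = [
  ...new URLSearchParams(`q=${encodeURIComponent(term)}`).entries(),
];
// [["q", "salt&pepper=tasty"]]

console.log(brokenEntries, fixedEntries);
```

The unencoded version silently turns one parameter into two, which is exactly the kind of failure that never throws an error.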

encodeURI vs encodeURIComponent

JavaScript gives you two encoding functions, and using the wrong one is one of the most common bugs I see.

encodeURI() encodes a complete URI. It leaves structural characters intact, including :, /, ?, #, &, =, and +. Use it when you have a full URL and want to encode only the parts that are not valid URI characters.

encodeURI("https://example.com/search?q=hello world&lang=en")
// "https://example.com/search?q=hello%20world&lang=en"

The & and = and / are preserved because they are part of the URL structure.

encodeURIComponent() encodes a URI component -- a single value within the URL. It encodes everything except letters, digits, and - _ . ! ~ * ' ( ).

encodeURIComponent("hello world&lang=en")
// "hello%20world%26lang%3Den"

Here the & and = are encoded because within a single parameter value, they are data, not structure.

The rule: use encodeURIComponent() for individual parameter values. Use encodeURI() only when encoding a complete URL where the structure is already correct.

// Correct: encode each value separately
const query = `?name=${encodeURIComponent(name)}&city=${encodeURIComponent(city)}`;

// Wrong: encodeURI leaves & and = unencoded
const brokenQuery = `?name=${encodeURI(name)}&city=${encodeURI(city)}`;
// Breaks if name contains & or =
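If you build query strings in more than one place, it can help to wrap the rule in a small helper. This is a sketch, not a library API: toQueryString is a hypothetical name, and the sample values are invented for illustration.

```javascript
// Hypothetical helper: turns a plain object into a query string,
// encoding every key and value with encodeURIComponent.
function toQueryString(params) {
  return Object.entries(params)
    .map(
      ([key, value]) =>
        `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`
    )
    .join("&");
}

const safe = toQueryString({ name: "Tom & Jerry", city: "a=b" });
// "name=Tom%20%26%20Jerry&city=a%3Db"

// For comparison, encodeURI leaves the & in the value untouched.
const unsafe = encodeURI("Tom & Jerry");
// "Tom%20&%20Jerry"

console.log(safe, unsafe);
```

The only & and = characters left in the output are the ones the helper inserted as structure.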

The space encoding mess

Spaces in URLs have two representations, and the confusion between them causes countless bugs.

%20 is the standard percent-encoding for a space, defined in RFC 3986. It is valid in any part of a URL.

+ is an alternative encoding for spaces, but only in the application/x-www-form-urlencoded format used by HTML forms. It is valid only in query strings, not in path segments or fragments.

When you submit an HTML form with method GET, the browser encodes spaces as + in the query string. When you use encodeURIComponent(), spaces become %20. Both are valid in query parameters, but they are not interchangeable everywhere.

// HTML form submission
// "hello world" -> "hello+world" (in query string)

// encodeURIComponent
encodeURIComponent("hello world") // "hello%20world"

// URLSearchParams (uses form encoding)
new URLSearchParams({ q: "hello world" }).toString() // "q=hello+world"

If your backend interprets + as a literal plus sign instead of a space, form submissions break. If your frontend sends %2B (encoded plus) but the backend double-decodes it, plus signs become spaces. Know which encoding your stack uses and be consistent.
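Here is a small sketch of how the two encodings behave for the "C++ programming" case from the introduction. It uses URLSearchParams as a stand-in for a form-aware backend parser, which is an assumption: your server may decode differently.

```javascript
const raw = "C++ programming";

// Standard percent-encoding: plus signs and the space are all escaped.
const percent = encodeURIComponent(raw);
// "C%2B%2B%20programming"

// Form encoding: plus signs are escaped, the space becomes +.
const form = new URLSearchParams({ q: raw }).toString();
// "q=C%2B%2B+programming"

// A form-aware parser recovers the original from either encoding.
const fromPercent = new URLSearchParams(`q=${percent}`).get("q");
const fromForm = new URLSearchParams(form).get("q");

// Sending the raw value unencoded: each + is read as a space.
const unencoded = new URLSearchParams(`q=${raw}`).get("q");

console.log({ percent, form, fromPercent, fromForm, unencoded });
```

Both encoded forms round-trip cleanly. Only the unencoded value loses its plus signs.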

UTF-8 and international characters

Modern URLs can contain any Unicode character, but they must be encoded as UTF-8 bytes, each byte percent-encoded individually.

encodeURIComponent("cafe")  // "cafe" (no encoding needed for ASCII)
encodeURIComponent("café")  // "caf%C3%A9" (é = 2 UTF-8 bytes)

The browser's address bar often shows decoded Unicode for readability (you see wikipedia.org/wiki/Zürich instead of wikipedia.org/wiki/Z%C3%BCrich), but the actual HTTP request uses the encoded form.
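To see the byte-by-byte behavior, you can reproduce it by hand with TextEncoder. This is an illustrative sketch (percentEncodeBytes is a hypothetical name); unlike encodeURIComponent it escapes every byte, including plain ASCII.

```javascript
// Sketch: percent-encode every byte of a string's UTF-8 representation.
function percentEncodeBytes(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  return Array.from(
    bytes,
    (byte) => "%" + byte.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const eAcute = "\u00e9"; // é (precomposed)
const uUmlaut = "\u00fc"; // ü (precomposed)

console.log(percentEncodeBytes(eAcute)); // "%C3%A9"
console.log(percentEncodeBytes(uUmlaut)); // "%C3%BC"

// For non-ASCII input, encodeURIComponent produces the same bytes.
console.log(encodeURIComponent(eAcute) === percentEncodeBytes(eAcute)); // true
```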

Common mistakes

Double encoding. If a URL already contains %20 and you run it through encodeURIComponent(), it becomes %2520 (the % gets encoded as %25). Always encode raw values, never already-encoded strings.
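A quick sketch of what double encoding looks like in practice:

```javascript
const raw = "hello world";

const once = encodeURIComponent(raw);
// "hello%20world"

// Encoding the already-encoded string escapes the % itself.
const twice = encodeURIComponent(once);
// "hello%2520world"

// A single decode on the receiving end now yields the encoded form, not the data.
const decodedOnce = decodeURIComponent(twice);
// "hello%20world"

console.log(once, twice, decodedOnce);
```

If you see %25 in a URL you did not expect, something upstream probably encoded a value twice.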

Encoding the entire URL with encodeURIComponent. This encodes the : in https: and the / in the path, producing an unusable URL. Use encodeURIComponent only on values, not on the full URL.

Forgetting to encode in template strings. This is the most common version:

// Bug: if username is "john&admin=true", this creates a parameter injection
fetch(`/api/users?name=${username}`)

// Fixed
fetch(`/api/users?name=${encodeURIComponent(username)}`)

Assuming decodeURIComponent handles all encoding. It handles percent-encoding but not the +-as-space convention. If you receive form-encoded data, replace + with spaces before decoding.
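A minimal sketch of that replacement step, assuming the input is a single form-encoded value (decodeFormComponent is a hypothetical helper name):

```javascript
// Hypothetical helper: decodes one application/x-www-form-urlencoded value.
function decodeFormComponent(value) {
  // Turn + into spaces first, while encoded plus signs are still %2B.
  return decodeURIComponent(value.replace(/\+/g, " "));
}

const formValue = "C%2B%2B+programming";

const naive = decodeURIComponent(formValue);
// "C+++programming" (the form-encoded space stays a plus sign)

const correct = decodeFormComponent(formValue);
// "C++ programming"

console.log(naive, correct);
```

The order matters: replacing + after decoding would also turn the real plus signs from %2B into spaces.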

For quick encoding and decoding during development, especially when debugging API calls with special characters, I keep a URL encoder at zovo.one/free-tools/url-encoder that handles both standard percent-encoding and form encoding side by side.

URL encoding is one of those things that seems trivial until it breaks in production. Encode every user-supplied value. Use encodeURIComponent for those values, not encodeURI. And never trust that the data going into a URL is safe just because it looks like plain text.

I'm Michael Lip. I build free developer tools at zovo.one. 350+ tools, all private, all free.