Base64 Is Not Encryption: What Every Developer Gets Wrong


I once reviewed a codebase where API keys were "secured" by encoding them in Base64 before storing them in a config file. The developer genuinely believed this was a form of encryption. It is not. Base64 is an encoding scheme, not a cipher. Anyone can decode it instantly. Understanding what Base64 actually does, and what it does not do, will save you from shipping something embarrassing.

What Base64 actually is

Base64 converts binary data into a string of ASCII characters. That is the entire purpose. It exists because many systems -- email protocols, JSON payloads, HTML data URIs, URL parameters -- were designed to handle text, not arbitrary binary data. If you try to shove raw binary through a text-based protocol, certain bytes get misinterpreted as control characters, strings get cut off at the first null byte, and data gets corrupted.

Base64 solves this by mapping every 6 bits of input to one of 64 printable ASCII characters: A-Z, a-z, 0-9, +, and /. Three bytes of input (24 bits) become four Base64 characters (4 x 6 = 24 bits). The output is therefore about 33% larger than the input (4/3 of the original size, rounded up to a multiple of four characters when padding is used). That overhead is the trade-off for safe text transport.
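To make the 3-bytes-to-4-characters mapping concrete, here is a minimal sketch of encoding a single group by hand. The names encodeGroup and encodedLength are my own for illustration, not part of any standard API, and the sketch deliberately ignores padding.

```javascript
// The 64-character alphabet described above, indexed 0-63.
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: encode exactly three bytes into four characters.
function encodeGroup(b0, b1, b2) {
  // Pack the three bytes into one 24-bit number.
  const bits = (b0 << 16) | (b1 << 8) | b2;
  // Slice that number into four 6-bit indexes, most significant first.
  return (
    ALPHABET[(bits >> 18) & 63] +
    ALPHABET[(bits >> 12) & 63] +
    ALPHABET[(bits >> 6) & 63] +
    ALPHABET[bits & 63]
  );
}

// Hypothetical helper: length of padded Base64 output for n input bytes.
function encodedLength(n) {
  return 4 * Math.ceil(n / 3);
}

console.log(encodeGroup(0x41, 0x42, 0x43)); // "QUJD" (the bytes of "ABC")
console.log(encodedLength(3)); // 4
```

In real code you would call btoa or Buffer instead; the point is only that nothing secret happens in this step.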

The padding mystery

If you have encoded anything to Base64, you have seen the trailing = signs. These exist because Base64 works on 3-byte groups. If your input is not a multiple of 3 bytes, the encoder pads the output.

Input is 1 byte: two Base64 characters + ==
Input is 2 bytes: three Base64 characters + =
Input is 3 bytes: four Base64 characters, no padding

"A"     -> "QQ=="
"AB"    -> "QUI="
"ABC"   -> "QUJD"

Some implementations strip the padding because the decoder can infer how many bytes the final group holds from the length of the encoded string. JWTs, for example, use Base64URL encoding without padding. This is fine as long as the decoder expects it.
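If your decoder insists on padding, you can put it back yourself. This is a small sketch, assuming the input is otherwise valid Base64; restorePadding is a name I made up for the example.

```javascript
// Hypothetical helper: restore the "=" padding that an encoder stripped.
function restorePadding(unpadded) {
  const remainder = unpadded.length % 4;
  // A remainder of 1 cannot come from a valid encoder:
  // one Base64 character carries only 6 bits, less than a full byte.
  if (remainder === 1) {
    throw new Error("Invalid Base64 length");
  }
  return unpadded + "=".repeat((4 - remainder) % 4);
}

console.log(restorePadding("QQ"));   // "QQ=="
console.log(restorePadding("QUI"));  // "QUI="
console.log(restorePadding("QUJD")); // "QUJD"
```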

Base64 in JavaScript

The browser gives you two built-in functions:

btoa("Hello, world"); // "SGVsbG8sIHdvcmxk"
atob("SGVsbG8sIHdvcmxk"); // "Hello, world"

The names are unintuitive. btoa stands for "binary to ASCII" and atob stands for "ASCII to binary." These functions only handle Latin-1 characters, meaning code points up to U+00FF. Try encoding a string that contains anything beyond that range, such as an emoji, and you get a DOMException:

btoa("Hello 🌍"); // DOMException: The string contains characters outside of the Latin1 range

The fix is to encode the string as UTF-8 first:

// Encode Unicode to Base64
function toBase64(str) {
  return btoa(
    new TextEncoder()
      .encode(str)
      .reduce((acc, byte) => acc + String.fromCharCode(byte), "")
  );
}

// Decode Base64 to Unicode
function fromBase64(b64) {
  return new TextDecoder().decode(
    Uint8Array.from(atob(b64), (c) => c.charCodeAt(0))
  );
}

In Node.js, use Buffer:

Buffer.from("Hello, world").toString("base64"); // "SGVsbG8sIHdvcmxk"
Buffer.from("SGVsbG8sIHdvcmxk", "base64").toString(); // "Hello, world"

Node's Buffer handles UTF-8 natively, so you do not need the workaround.
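A quick round-trip with non-ASCII text shows the difference. This sketch is Node.js only (Buffer is not available in browsers), and the sample string is arbitrary.

```javascript
// Node.js only: Buffer encodes strings as UTF-8 by default,
// so characters outside Latin-1 do not need special handling.
const original = "naïve café ✓";

const encoded = Buffer.from(original, "utf8").toString("base64");
const decoded = Buffer.from(encoded, "base64").toString("utf8");

console.log(decoded === original); // true
```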

Base64URL: the variant that matters

Standard Base64 uses + and / as its 63rd and 64th characters. Both have special meaning in URLs, and so does the = used for padding. If you put standard Base64 in a query parameter without percent-encoding it, most parsers decode the + as a space. If the value ends up in a path segment, the / is treated as a path separator.
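You can watch the + problem happen with the built-in URLSearchParams parser. The value below is just a short Base64 string I picked because it contains both special characters.

```javascript
// A standard Base64 value that happens to contain "+" and "/".
const encoded = "a+b/cw==";

// Pasted into a query string without percent-encoding.
const naive = new URLSearchParams("data=" + encoded);
console.log(naive.get("data")); // "a b/cw==" (the "+" became a space)

// Percent-encoding the value first preserves it.
const safe = new URLSearchParams("data=" + encodeURIComponent(encoded));
console.log(safe.get("data")); // "a+b/cw=="
```

Percent-encoding works, but it depends on every layer doing it correctly.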

Base64URL replaces + with - and / with _. It also typically omits the = padding. This is the encoding used in JWTs, and it is the right choice anywhere Base64 data needs to survive a URL round-trip. (Data URIs themselves use standard Base64.)

// Standard Base64 to Base64URL
function toBase64URL(base64) {
  return base64.replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}
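Going the other way is just as mechanical. Here is a sketch of the inverse, assuming the input is valid unpadded Base64URL; fromBase64URL is my own name for it.

```javascript
// Hypothetical inverse: Base64URL back to standard, padded Base64.
function fromBase64URL(base64url) {
  // Swap the URL-safe characters back to the standard alphabet.
  const standard = base64url.replace(/-/g, "+").replace(/_/g, "/");
  // Re-add the padding so strict decoders accept the result.
  return standard + "=".repeat((4 - (standard.length % 4)) % 4);
}

console.log(fromBase64URL("QQ"));   // "QQ=="
console.log(fromBase64URL("-_-_")); // "+/+/"
```

The result can be passed straight to atob or Buffer.from(..., "base64").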

If you are building anything that passes encoded data through URLs, use Base64URL. The number of bugs I have seen from standard Base64 in query strings is genuinely depressing.

Five common mistakes

1. Using Base64 for "security." I said it at the top and I will say it again. Base64 is reversible by anyone with a browser console. It provides zero security. Use proper encryption (AES-256 in an authenticated mode such as GCM, for example) for sensitive data, and keep the key somewhere other than the file you are protecting.

2. Base64-encoding large files in JSON. If you embed a 10MB image as a Base64 string in a JSON payload, it becomes 13.3MB due to the 33% overhead. Your API gets slower, your memory usage spikes, and your database stores bloated documents. Use file uploads with presigned URLs instead.

3. Forgetting about the size increase in data URIs. Inlining small images as data:image/png;base64,... in CSS avoids an HTTP request, which is good for icons under 2KB. But doing this for larger images makes your CSS file enormous, and the images can no longer be cached separately from the stylesheet.

4. Double-encoding. I have seen code that runs Base64 encoding twice, producing a string like U0dWc2JHOD0= instead of SGVsbG8=. Usually this happens when a library encodes automatically and the developer encodes manually before passing the data in. Always check whether your HTTP client or library is already handling the encoding.

5. Ignoring line length limits. The MIME specification (RFC 2045) requires Base64 output to be split into lines of no more than 76 characters, separated by CRLF line breaks. Some encoders do this by default, some do not. If you are generating Base64 for email attachments, you need the line breaks. If you are generating Base64 for a JSON field or URL parameter, you do not.
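For mistake 5, wrapping is easy to do yourself if your encoder does not. This is a minimal sketch, not a full MIME implementation; wrapForMime is a hypothetical helper name.

```javascript
// Hypothetical helper: split a Base64 string into lines of at most
// `width` characters, joined with CRLF as MIME bodies expect.
function wrapForMime(base64, width = 76) {
  const lines = base64.match(new RegExp(`.{1,${width}}`, "g")) || [];
  return lines.join("\r\n");
}

const long = "QUJD".repeat(30); // 120 characters of valid Base64
const wrapped = wrapForMime(long);

console.log(wrapped.split("\r\n").map((line) => line.length)); // [ 76, 44 ]
```

Decoders for JSON or URL contexts generally will not expect these line breaks, so only apply this where MIME is actually involved.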

When to actually use Base64

The legitimate use cases are straightforward. Embedding binary data in JSON or XML. Encoding email attachments (MIME). Data URIs for small inline assets. Transmitting binary data through systems that only support text. Encoding credentials in HTTP Basic Authentication headers (where the security comes from HTTPS, not from the encoding).
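The Basic Authentication case is worth seeing once, because it shows how little the encoding hides. The credentials here are made up for illustration.

```javascript
// Made-up credentials for illustration only.
const username = "alice";
const password = "s3cret";

// Basic auth sends "username:password" as Base64 in the header.
const header = "Basic " + btoa(`${username}:${password}`);

// Anyone who can see the header can reverse it in one call.
const recovered = atob(header.slice("Basic ".length));
console.log(recovered); // "alice:s3cret"
```

Without HTTPS, that header is effectively plaintext on the wire.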

For quick encoding and decoding without writing code, I keep a Base64 encoder at zovo.one/free-tools/base64-encoder bookmarked. It handles both standard and URL-safe variants, which is useful when debugging JWT payloads or data URIs.

Base64 is a tool for data transport, not data protection. Use it for what it is designed for and reach for actual cryptography when security matters.

I'm Michael Lip. I build free developer tools at zovo.one. 350+ tools, all private, all free.