Writing Base64 From Scratch in JavaScript — Why atob Isn't Enough
JavaScript has btoa() and atob(), but they only accept Latin-1. btoa("こんにちは") throws. The URL-safe Base64 variant (- and _ instead of + and /) isn't supported at all. Implementing Base64 manually (read 3 bytes, write 4 characters, handle padding) takes about 40 lines and lets you handle UTF-8, URL-safe encoding, and line wrapping properly.
Base64 is one of those encodings every developer touches but few understand. It's not encryption. It's a way to represent arbitrary bytes as ASCII text — useful for JSON, URLs, email attachments, and data URIs. The math is trivial but the edge cases (padding, variants, Unicode) trip people up.
🔗 Live demo: https://sen.ltd/portfolio/base64-tool/
📦 GitHub: https://github.com/sen-ltd/base64-tool
Features:
- Text mode (UTF-8 encode/decode)
- File mode (drop image → data URL)
- URL-safe variant toggle
- Line wrap toggle (76 chars, MIME format)
- Size comparison (original vs base64)
- Auto-detect encoding direction
- Image preview for decoded data URLs
- Japanese / English UI
- Zero dependencies, 55 tests
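The MIME-style line wrap from the feature list is simple enough to sketch. This is an illustration only, not the tool's actual code: wrapLines is a hypothetical helper name, and the sketch assumes lines of at most 76 characters joined with CRLF, as MIME (RFC 2045) requires.

```javascript
// Hypothetical helper (not from the original article): split a Base64 string
// into lines of at most `width` characters and join them with CRLF.
function wrapLines(base64, width = 76) {
  const lines = [];
  for (let i = 0; i < base64.length; i += width) {
    lines.push(base64.slice(i, i + width));
  }
  return lines.join('\r\n');
}

// 100 characters wrap into one full line and one shorter line.
const wrappedExample = wrapLines('A'.repeat(100));
console.log(wrappedExample.split('\r\n').map((line) => line.length)); // 76 and 24
```

A decoder has to strip that whitespace again before processing, which is why the detection code later in this post removes \s first.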
The 3-to-4 byte conversion
Base64 groups 3 input bytes (24 bits) into 4 output characters (24 / 6 = 4 chars of 6 bits each):
const BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
// BASE64_URL_ALPHABET (shown in the URL-safe section) ends in "-_" instead of "+/".

export function encode(bytes, urlSafe = false) {
  const alpha = urlSafe ? BASE64_URL_ALPHABET : BASE64_ALPHABET;
  let result = '';
  for (let i = 0; i < bytes.length; i += 3) {
    // Missing bytes in the final group are treated as zero.
    const b1 = bytes[i];
    const b2 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b3 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    const c1 = b1 >> 2;                         // top 6 bits of b1
    const c2 = ((b1 & 0x03) << 4) | (b2 >> 4);  // bottom 2 of b1 + top 4 of b2
    const c3 = ((b2 & 0x0F) << 2) | (b3 >> 6);  // bottom 4 of b2 + top 2 of b3
    const c4 = b3 & 0x3F;                       // bottom 6 of b3
    result += alpha[c1] + alpha[c2];
    // Characters with no real input byte behind them become "=" (or nothing in URL-safe mode).
    result += i + 1 < bytes.length ? alpha[c3] : (urlSafe ? '' : '=');
    result += i + 2 < bytes.length ? alpha[c4] : (urlSafe ? '' : '=');
  }
  return result;
}
The bit shuffling is the part that is easiest to get wrong. Each input byte is split across two output characters because 8-bit bytes don't line up with 6-bit groups.
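To make the shifts concrete, here is a small worked example that runs the same four expressions on the ASCII bytes of "Man". The sextets helper is a hypothetical name used only for this illustration.

```javascript
const STANDARD_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical helper: split three bytes (24 bits) into four 6-bit values.
function sextets(b1, b2, b3) {
  return [
    b1 >> 2,                         // top 6 bits of b1
    ((b1 & 0x03) << 4) | (b2 >> 4),  // bottom 2 of b1 + top 4 of b2
    ((b2 & 0x0f) << 2) | (b3 >> 6),  // bottom 4 of b2 + top 2 of b3
    b3 & 0x3f,                       // bottom 6 of b3
  ];
}

// "M" = 0x4D = 01001101, "a" = 0x61 = 01100001, "n" = 0x6E = 01101110
// Regrouped as 6-bit chunks: 010011 010110 000101 101110
const manIndices = sextets(0x4d, 0x61, 0x6e);
const manChars = manIndices.map((i) => STANDARD_ALPHABET[i]).join('');

console.log(manIndices); // 19, 22, 5, 46
console.log(manChars);   // TWFu
```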
Padding
When the input length isn't divisible by 3, you pad with zero bytes and mark the "missing" output characters with =:
- 3 bytes in → 4 chars out, no padding
- 2 bytes in → 3 chars + 1 =
- 1 byte in → 2 chars + 2 =
The URL-safe mode here omits the = padding, which RFC 4648 permits when the decoder can work it out: the encoded length mod 4 tells you how many padding characters were dropped. That means a URL-safe encoded "f" is just "Zg", not "Zg==".
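The length rules above can be written down directly. This is a sketch with hypothetical helper names (encodedLength and restorePadding are not part of the article's code), assuming the padded and unpadded conventions described in this section.

```javascript
// Hypothetical helper: number of Base64 characters produced for `byteCount` bytes.
function encodedLength(byteCount, padded = true) {
  const fullGroups = Math.floor(byteCount / 3);
  const remainder = byteCount % 3; // 0, 1, or 2 leftover bytes
  if (remainder === 0) return fullGroups * 4;
  // A partial group yields remainder + 1 data characters; padding fills it up to 4.
  return fullGroups * 4 + (padded ? 4 : remainder + 1);
}

// Hypothetical helper: re-add the "=" characters an unpadded encoder dropped.
function restorePadding(unpadded) {
  const rest = unpadded.length % 4;
  // One leftover character carries only 6 bits, which is less than a byte.
  if (rest === 1) throw new Error('Not a valid Base64 length');
  return rest === 0 ? unpadded : unpadded + '='.repeat(4 - rest);
}

console.log(encodedLength(1), encodedLength(1, false)); // 4 2
console.log(encodedLength(2), encodedLength(2, false)); // 4 3
console.log(encodedLength(3), encodedLength(3, false)); // 4 4
console.log(restorePadding('Zg'));                      // Zg==
```

The same arithmetic explains the size comparison in the UI: padded output is always 4 characters per 3 input bytes, rounded up to a whole group.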
UTF-8 for text
Since browser btoa only accepts Latin-1, encoding Japanese text requires two steps:
export function encodeText(text, urlSafe = false) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes
  return encode(bytes, urlSafe);
}

export function decodeText(str) {
  const bytes = decode(str);
  return new TextDecoder().decode(bytes);
}
TextEncoder produces a Uint8Array of UTF-8 bytes. The base64 encoder doesn't care about text — it just takes bytes. This two-step approach works for any Unicode input.
Example: "こ" is U+3053, which encodes to 3 UTF-8 bytes 0xE3 0x81 0x93. Base64-encoded those become "44GT". Round-trip works correctly.
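decodeText calls a decode function that this post doesn't show. One possible shape for it, offered as a sketch under assumptions rather than the repository's actual implementation: accept either alphabet, ignore whitespace and trailing padding, and accumulate 6 bits per character until a full byte is available.

```javascript
const DECODE_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Assumed implementation of `decode` (not taken from the article's repo).
function decode(str) {
  // Normalize: drop whitespace, map URL-safe characters back, strip padding.
  const clean = str
    .replace(/\s/g, '')
    .replace(/-/g, '+')
    .replace(/_/g, '/')
    .replace(/=+$/, '');
  if (clean.length % 4 === 1) throw new Error('Invalid Base64 length');

  const bytes = new Uint8Array(Math.floor((clean.length * 3) / 4));
  let buffer = 0; // holds bits that have not been emitted yet
  let bits = 0;   // how many bits in `buffer` are still pending
  let out = 0;
  for (const ch of clean) {
    const value = DECODE_ALPHABET.indexOf(ch);
    if (value === -1) throw new Error('Invalid Base64 character: ' + ch);
    buffer = ((buffer << 6) | value) & 0xffff; // keep only the low bits we need
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes[out++] = (buffer >> bits) & 0xff;
    }
  }
  return bytes;
}

// Round trip for the example in this section.
console.log(Array.from(decode('44GT')));               // 227, 129, 147 (0xE3 0x81 0x93)
console.log(new TextDecoder().decode(decode('44GT'))); // こ
```

indexOf keeps the sketch short; a lookup table would be the usual choice if decoding speed matters.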
URL-safe variant (RFC 4648 §5)
Standard Base64 uses + and /, both of which have special meaning in URLs: / separates path segments, and + is often decoded as a space in query strings. The URL-safe variant replaces them with - and _:
export const BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';
export const BASE64_URL_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';
export function standardToUrlSafe(str) {
  return str.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

export function urlSafeToStandard(str) {
  let s = str.replace(/-/g, '+').replace(/_/g, '/');
  while (s.length % 4 !== 0) s += '=';
  return s;
}
The conversion between variants is just character substitution plus handling the padding. This is how JWTs encode their parts: each segment is URL-safe Base64 without padding, and the segments are joined with periods.
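As a quick illustration of the JWT case, the sketch below splits a token-shaped string and decodes its first two segments. The token is made up for this example and carries no real signature. The sketch also assumes a global atob (browsers and recent Node versions), which is enough here because the decoded JSON is ASCII-only.

```javascript
// Copied from the snippet above so this block runs on its own.
function urlSafeToStandard(str) {
  let s = str.replace(/-/g, '+').replace(/_/g, '/');
  while (s.length % 4 !== 0) s += '=';
  return s;
}

// Made-up token for illustration: header.payload.signature
const token = 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.e30.not-a-real-signature';
const [headerPart, payloadPart] = token.split('.');

// "e30" is 3 characters, so one "=" is restored before decoding: "e30=" -> "{}"
const jwtHeader = JSON.parse(atob(urlSafeToStandard(headerPart)));
const jwtPayload = JSON.parse(atob(urlSafeToStandard(payloadPart)));

console.log(jwtHeader.alg, jwtHeader.typ); // HS256 JWT
console.log(Object.keys(jwtPayload).length); // 0
```

For payloads that contain non-ASCII text, the bytes would need to go through TextDecoder instead of being read straight from atob.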
isBase64 detection
"Does this string look like Base64?" is harder than it seems:
export function isBase64(str) {
  const clean = str.replace(/\s/g, '');
  if (clean.length === 0) return false;
  // Standard: only valid chars + optional = padding, length multiple of 4
  if (/^[A-Za-z0-9+/]+={0,2}$/.test(clean) && clean.length % 4 === 0) return true;
  // URL-safe: with - or _ present
  if (/^[A-Za-z0-9_-]+$/.test(clean) && /[-_]/.test(clean)) return true;
  return false;
}
The "URL-safe" check requires - or _ to be actually present, otherwise every alphanumeric string would match. The auto-direction detection in the UI uses this to guess whether to encode or decode.
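A few inputs show why this can only ever be a heuristic. The function below is copied from the snippet above so the block runs by itself, and the sample strings are my own picks for illustration.

```javascript
function isBase64(str) {
  const clean = str.replace(/\s/g, '');
  if (clean.length === 0) return false;
  if (/^[A-Za-z0-9+/]+={0,2}$/.test(clean) && clean.length % 4 === 0) return true;
  if (/^[A-Za-z0-9_-]+$/.test(clean) && /[-_]/.test(clean)) return true;
  return false;
}

const samples = [
  'SGVsbG8=',   // "Hello" in standard Base64
  '-_8',        // URL-safe, unpadded
  'test',       // plain word that happens to be 4 valid characters
  'snake_case', // plain identifier containing "_"
  'hello',      // 5 characters, no "-" or "_"
  'Zg',         // valid unpadded URL-safe "f", but nothing marks it as such
];

for (const s of samples) {
  console.log(s, isBase64(s));
}
// SGVsbG8= true
// -_8 true
// test true        (false positive)
// snake_case true  (false positive)
// hello false
// Zg false         (false negative)
```

So the auto-detection is a reasonable default for a UI, but anything that must be correct should let the user override the guessed direction.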
Series
This is entry #80 in my 100+ public portfolio series — 80% of the way there.
- 📦 Repo: https://github.com/sen-ltd/base64-tool
- 🌐 Live: https://sen.ltd/portfolio/base64-tool/
- 🏢 Company: https://sen.ltd/
