When an API returns a content field that looks like eyJob3N0IjogImRiLXByb2Qi…, or your secrets manager hands you an encoded credential, or you need to extract a JWT payload — Python base64 decode is your first stop. The built-in base64 module handles all of it, but the small details around bytes vs strings, URL-safe alphabets, and missing padding catch almost every developer at least once.
This guide covers base64.b64decode(), urlsafe_b64decode(), automatic padding repair, decoding from files and HTTP responses, CLI tools, input validation, and four common mistakes with before/after fixes — all runnable Python 3.8+ examples. If you just need a quick one-off decode without writing code, ToolDeck's Base64 Decoder handles both standard and URL-safe Base64 instantly in your browser.
Key takeaways:
- base64.b64decode(s) is built into the Python stdlib, so no install is required; it always returns bytes, not str
- Chain .decode("utf-8") after b64decode() to convert bytes to a Python string
- For URL-safe Base64 (uses - and _), use base64.urlsafe_b64decode(); this variant is standard in JWTs, OAuth tokens, and Google API credentials
- Fix "Incorrect padding" with: padded = s + "=" * (-len(s) % 4)
- Set validate=True on external input to raise binascii.Error on non-Base64 characters
What is Base64 Decoding?
Base64 is an encoding scheme that represents arbitrary binary data as a string of 64 printable ASCII characters: A–Z, a–z, 0–9, +, and /, with = used as padding. Every 4 Base64 characters encode exactly 3 original bytes, so the encoded form is roughly 33% larger than the source. Decoding reverses the process — transforming the ASCII representation back into the original bytes.
Base64 does not encrypt data. It is purely a binary-to-text encoding:
# Before — Base64 encoded
eyJob3N0IjogImRiLXByb2QubXljb21wYW55LmludGVybmFsIiwgInBvcnQiOiA1NDMyLCAidXNlciI6ICJhcHBfc3ZjIn0=
# After — decoded
{"host": "db-prod.mycompany.internal", "port": 5432, "user": "app_svc"}
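To make the 4-characters-per-3-bytes arithmetic concrete, here is a minimal sketch that round-trips the config shown above. The variable names are illustrative only.

```python
import base64

# The decoded config from the example above, as raw bytes
raw = b'{"host": "db-prod.mycompany.internal", "port": 5432, "user": "app_svc"}'

encoded = base64.b64encode(raw)

# Every group of 3 input bytes becomes 4 output characters
# (the last group is rounded up and filled with '=' padding)
expected_len = 4 * ((len(raw) + 2) // 3)
overhead = len(encoded) / len(raw) - 1

print(len(raw), len(encoded), expected_len)
print(f"size overhead: {overhead:.0%}")

# No key or secret is involved: anyone can reverse the encoding
decoded = base64.b64decode(encoded)
assert decoded == raw
```

Because decoding needs nothing but the encoded text, Base64 should never be treated as a way to hide a credential.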
base64.b64decode() — Standard Library Decoding
Python's base64 module ships with the standard library — zero installation, always available. The primary function is base64.b64decode(s, altchars=None, validate=False). It accepts a str, bytes, or bytearray, and always returns bytes.
Minimal working example
import base64
import json
# Encoded database config received from a secrets manager
encoded_config = (
"eyJob3N0IjogImRiLXByb2QubXljb21wYW55LmludGVybmFsIiwgInBvcnQiOiA1NDMyLCAid"
"XNlciI6ICJhcHBfc3ZjIiwgInBhc3N3b3JkIjogInM0ZmVQYXNzITIwMjYifQ=="
)
# Step 1: decode Base64 bytes
raw_bytes = base64.b64decode(encoded_config)
# b'{"host": "db-prod.mycompany.internal", "port": 5432, ...}'
# Step 2: convert bytes → str
config_str = raw_bytes.decode("utf-8")
# Step 3: parse into a dict
config = json.loads(config_str)
print(config["host"]) # db-prod.mycompany.internal
print(config["port"]) # 5432
Note: b64decode() always returns bytes, never a string. If the original data was text, chain .decode("utf-8"). If it was binary (image, PDF, gzip), keep the bytes as-is.
Extended example: strict validation
import base64
import binascii
encoded_event = (
"eyJldmVudCI6ICJvcmRlci5zaGlwcGVkIiwgIm9yZGVyX2lkIjogIk9SRC04ODQ3MiIsICJ"
"0aW1lc3RhbXAiOiAiMjAyNi0wMy0xM1QxNDozMDowMFoiLCAicmVnaW9uIjogImV1LXdlc3QtMSJ9"
)
try:
    # validate=True raises binascii.Error on any non-Base64 character
    raw = base64.b64decode(encoded_event, validate=True)
    event = raw.decode("utf-8")
    print(event)
    # {"event": "order.shipped", "order_id": "ORD-88472", ...}
except binascii.Error as exc:
    print(f"Invalid Base64: {exc}")
except UnicodeDecodeError as exc:
    print(f"Not UTF-8 text: {exc}")
Decoding URL-safe Base64 (base64url)
Standard Base64 uses + and /, which are reserved characters in URLs. The URL-safe variant (RFC 4648 §5, also called "base64url") replaces them with - and _. This is the encoding used in JWT tokens, OAuth 2.0 PKCE challenges, Google Cloud credentials, and most modern web authentication flows.
Use base64.urlsafe_b64decode() — it handles - → + and _ → / substitution automatically.
import base64
import json
# JWT payload segment (the middle part between the two dots)
jwt_payload_b64 = (
"eyJ1c2VyX2lkIjogMjg5MywgInJvbGUiOiAiYWRtaW4iLCAiaXNzIjogImF1dGgubXljb21w"
"YW55LmNvbSIsICJleHAiOiAxNzQwOTAwMDAwLCAianRpIjogImFiYzEyMzQ1LXh5ei05ODc2In0"
)
# Restore padding before decoding (JWT deliberately omits '=')
padded = jwt_payload_b64 + "=" * (-len(jwt_payload_b64) % 4)
payload_bytes = base64.urlsafe_b64decode(padded)
payload = json.loads(payload_bytes.decode("utf-8"))
print(payload["role"]) # admin
print(payload["iss"]) # auth.mycompany.com
print(payload["user_id"]) # 2893
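The expression "=" * (-len(s) % 4) appends however many = characters are needed to reach the next multiple of 4, and is a no-op when the string is already correctly padded. For validly encoded input that is 0, 1, or 2 characters. A result of 3 means the length is 1 more than a multiple of 4, which no valid encoding produces, so the decode still fails. It is the idiomatic Python fix for JWT and OAuth padding issues.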
base64.b64decode() Parameters Reference
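| Parameter | Type | Default | Description |
|---|---|---|---|
| s | bytes \| str \| bytearray | (required) | The Base64-encoded input; a str must contain only ASCII characters |
| altchars | bytes \| None | None | Two-byte sequence that replaces + and / in the alphabet, for example b"-_" for URL-safe input |
| validate | bool | False | When True, raises binascii.Error on non-Base64 characters; when False, characters outside the alphabet (including whitespace and newlines) are silently discarded before decoding |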
The validate=False default is intentional for PEM-formatted data and multi-line Base64. For API payloads or any untrusted input, pass validate=True.
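The effect of validate and altchars is easiest to see side by side. The following is a small sketch with made-up inputs: the strings and variable names are illustrative and do not come from any real payload.

```python
import base64
import binascii

# "Hello World" encoded, with a line break inserted in the middle
wrapped = "SGVsbG8g\nV29ybGQ="

# Default (validate=False): the newline is discarded before decoding
lenient = base64.b64decode(wrapped)
print(lenient)  # b'Hello World'

# validate=True: the same newline is rejected
try:
    base64.b64decode(wrapped, validate=True)
    strict_error = None
except binascii.Error as exc:
    strict_error = exc
print(f"strict mode raised: {strict_error!r}")

# altchars=b"-_" tells b64decode() to read '-' as '+' and '_' as '/'
custom = base64.b64decode("-_8=", altchars=b"-_")
print(custom)  # b'\xfb\xff'
```

The lenient default is convenient for wrapped data, but it also means stray characters in untrusted input are dropped without any warning.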
Python Base64 Decode Padding Error — How to Fix It
The most frequent error when decoding Base64 in Python:
import base64
base64.b64decode("eyJ0eXBlIjogInJlZnJlc2gifQ")
# binascii.Error: Incorrect padding
Base64 input must be a multiple of 4 characters long. JWTs and URLs strip the trailing = padding to save bytes, so a stripped token (26 characters in the example above) fails until the padding is restored.
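The padding arithmetic can be checked in isolation. This is a minimal sketch; pad_base64 is a hypothetical helper name used only for illustration.

```python
def pad_base64(s: str) -> str:
    """Append the '=' characters needed to reach a multiple of 4."""
    return s + "=" * (-len(s) % 4)

# length, length % 4, number of '=' characters appended
for length in range(8, 12):
    print(length, length % 4, -length % 4)
# 8 0 0
# 9 1 3
# 10 2 2
# 11 3 1

padded_examples = [pad_base64(s) for s in ("abcd", "abc", "ab", "a")]
print(padded_examples)  # ['abcd', 'abc=', 'ab==', 'a===']
```

Three appended characters (the length % 4 == 1 row) never yields valid Base64, so input of that length is corrupt or truncated and the decoder will still reject it.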
Option 1: Restore padding inline (recommended)
from __future__ import annotations  # keeps the "str | bytes" hint valid on Python 3.8/3.9
import base64, json
def b64decode_unpadded(data: str | bytes) -> bytes:
    """Decode Base64 with automatic padding correction."""
    if isinstance(data, str):
        data = data.encode("ascii")
    data += b"=" * (-len(data) % 4)
    return base64.b64decode(data)
token_a = "eyJ0eXBlIjogImFjY2VzcyJ9"      # no padding needed (length is a multiple of 4)
token_b = "eyJ0eXBlIjogInJlZnJlc2gifQ"    # 2 chars stripped
token_c = "eyJ0eXBlIjogImFwaV9rZXkifQ=="  # already padded
for token in (token_a, token_b, token_c):
    result = json.loads(b64decode_unpadded(token).decode("utf-8"))
    print(result["type"])
# access
# refresh
# api_key
Option 2: URL-safe decode for JWT / OAuth
import base64, json
def decode_jwt_segment(segment: str) -> dict:
    """Decode a single JWT segment (header or payload)."""
    padded = segment + "=" * (-len(segment) % 4)
    raw = base64.urlsafe_b64decode(padded)
    return json.loads(raw.decode("utf-8"))
id_token_payload = (
"eyJzdWIiOiAiMTEwNTY5NDkxMjM0NTY3ODkwMTIiLCAiZW1haWwiOiAic2FyYS5jaGVuQGV4"
"YW1wbGUuY29tIiwgImhkIjogImV4YW1wbGUuY29tIiwgImlhdCI6IDE3NDA5MDAwMDB9"
)
claims = decode_jwt_segment(id_token_payload)
print(claims["email"]) # sara.chen@example.com
print(claims["hd"]) # example.com
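A complete JWT is three such segments joined by dots. The sketch below builds a throwaway token so it stays self-contained; the helper names, the claims, and the signature bytes are all invented for illustration. It only decodes and does not verify the signature, so the result must not be trusted for authorization decisions.

```python
import base64
import json

def b64url_encode(obj: dict) -> str:
    """Serialize to JSON and base64url-encode without padding, as JWTs do."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def b64url_decode_json(segment: str) -> dict:
    """Restore padding, decode base64url, and parse the JSON inside."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

# Throwaway token: header.payload.signature (the signature here is fake)
header = {"alg": "HS256", "typ": "JWT"}
claims = {"sub": "user-2893", "role": "admin"}
fake_sig = base64.urlsafe_b64encode(b"not-a-real-signature").rstrip(b"=").decode("ascii")
token = ".".join([b64url_encode(header), b64url_encode(claims), fake_sig])

# Split on the dots and decode the first two segments
header_b64, payload_b64, signature_b64 = token.split(".")
decoded_header = b64url_decode_json(header_b64)
decoded_claims = b64url_decode_json(payload_b64)

print(decoded_header["alg"])   # HS256
print(decoded_claims["role"])  # admin
```

For anything beyond inspection, a JWT library that checks the signature and expiry is the safer choice.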
Decode Base64 from a File and API Response
Reading and decoding a Base64 file
import base64, binascii, json
from pathlib import Path
def decode_attachment(envelope_path: str, output_path: str) -> None:
    """
    Read a JSON envelope with a Base64-encoded attachment,
    decode it, and write the binary output to disk.
    """
    try:
        envelope = json.loads(Path(envelope_path).read_text(encoding="utf-8"))
        encoded_data = envelope["attachment"]["data"]
        file_bytes = base64.b64decode(encoded_data, validate=True)
        Path(output_path).write_bytes(file_bytes)
        print(f"Saved {len(file_bytes):,} bytes → {output_path}")
    except FileNotFoundError:
        print(f"Envelope file not found: {envelope_path}")
    except (KeyError, TypeError):
        print("Unexpected envelope structure — 'attachment.data' missing")
    except binascii.Error as exc:
        print(f"Invalid Base64 content: {exc}")
# {"attachment": {"filename": "invoice_2026_03.pdf", "data": "JVBERi0xLjQK..."}}
decode_attachment("order_ORD-88472.json", "invoice_2026_03.pdf")
Decoding Base64 from an HTTP API response
import base64, binascii, json, urllib.error, urllib.request
def fetch_and_decode_secret(vault_url: str, secret_name: str) -> str:
    url = f"{vault_url}/v1/secrets/{secret_name}"
    req = urllib.request.Request(url, headers={"X-Vault-Token": "s.internal"})
    try:
        with urllib.request.urlopen(req, timeout=5) as resp:
            body = json.loads(resp.read().decode("utf-8"))
        # Vault returns: {"data": {"value": "<base64>", "encoding": "base64"}}
        encoded = body["data"]["value"]
        return base64.b64decode(encoded).decode("utf-8")
    except urllib.error.URLError as exc:
        raise RuntimeError(f"Vault unreachable: {exc}") from exc
    except (KeyError, UnicodeDecodeError, binascii.Error) as exc:
        raise ValueError(f"Unexpected secret format: {exc}") from exc
If you use requests, replace urllib.request with resp = requests.get(url, timeout=5, headers=headers) and body = resp.json(). The Base64 decoding logic is identical.
Command-Line Base64 Decoding
# Decode a Base64 string (Linux / macOS)
echo "eyJob3N0IjogImRiLXByb2QubXljb21wYW55LmludGVybmFsIn0=" | base64 --decode
# {"host": "db-prod.mycompany.internal"}
# Decode a file, save output (reading from stdin works with both GNU and BSD/macOS base64)
base64 --decode < encoded_payload.txt > decoded_output.json
# Python's cross-platform CLI decoder (works on Windows too)
python3 -m base64 -d encoded_payload.txt
# Decode a JWT payload segment inline
echo "eyJ1c2VyX2lkIjogMjg5MywgInJvbGUiOiAiYWRtaW4ifQ" | python3 -c "
import sys, base64, json
s = sys.stdin.read().strip()
padded = s + '=' * (-len(s) % 4)
print(json.dumps(json.loads(base64.urlsafe_b64decode(padded)), indent=2))
"
For exploratory work where writing a shell pipeline feels like overkill, paste the string into ToolDeck's Base64 Decoder — it auto-detects URL-safe input and fixes padding on the fly.
Validating Base64 Input Before Decoding
from __future__ import annotations  # keeps the "bytes | None" hint valid on Python 3.8/3.9
import base64, binascii, re
# ── Option A: try/except (recommended) ──────────────────────────────────────
def safe_b64decode(data: str) -> bytes | None:
    """Return decoded bytes, or None if the input is not valid Base64."""
    try:
        padded = data + "=" * (-len(data) % 4)
        return base64.b64decode(padded, validate=True)
    except (binascii.Error, ValueError):
        return None
print(safe_b64decode("not-base64!!")) # None
print(safe_b64decode("eyJ0eXBlIjogInJlZnJlc2gifQ")) # b'{"type": "refresh"}'
# ── Option B: regex pre-validation ──────────────────────────────────────────
_STANDARD_RE = re.compile(r"[A-Za-z0-9+/]*={0,2}")
def is_valid_base64(s: str) -> bool:
    stripped = s.rstrip("=")
    padded = stripped + "=" * (-len(stripped) % 4)
    # fullmatch() also rejects a trailing newline, which "$" would let through
    return bool(_STANDARD_RE.fullmatch(padded))
print(is_valid_base64("SGVsbG8gV29ybGQ=")) # True
print(is_valid_base64("SGVsbG8gV29ybGQ!")) # False
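Both options above target the standard alphabet. For URL-safe input, one possible strict variant passes altchars together with validate=True. This is a sketch under that assumption, and safe_urlsafe_b64decode is a hypothetical helper name, not part of the standard library.

```python
import base64
import binascii
from typing import Optional

def safe_urlsafe_b64decode(data: str) -> Optional[bytes]:
    """Strictly decode base64url text; return None on any invalid input."""
    # '+' and '/' belong to the standard alphabet; reject them explicitly
    # because the altchars translation alone may let them pass
    if "+" in data or "/" in data:
        return None
    padded = data + "=" * (-len(data) % 4)
    try:
        return base64.b64decode(padded, altchars=b"-_", validate=True)
    except (binascii.Error, ValueError):
        return None

print(safe_urlsafe_b64decode("-_8"))    # b'\xfb\xff'
print(safe_urlsafe_b64decode("+/8="))   # None (standard alphabet)
print(safe_urlsafe_b64decode("-_8!"))   # None (invalid character)
```

Returning None keeps the call site simple; raising a custom exception instead would be an equally reasonable design if callers need the failure reason.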
High-Performance Alternative: pybase64
For high-throughput pipelines processing thousands of payloads per second, pybase64 is a C-extension wrapper around libbase64 that is typically 2–5× faster than stdlib on large inputs.
pip install pybase64
import pybase64
# Stand-in input: the Base64 of the 8-byte PNG file signature
encoded_image = "iVBORw0KGgo="
# Same call shape as the stdlib base64.b64decode()
image_bytes = pybase64.b64decode(encoded_image, validate=False)
# URL-safe variant
token_bytes = pybase64.urlsafe_b64decode("eyJpZCI6IDQ3MX0=")
print(token_bytes) # b'{"id": 471}'
pybase64 mirrors the stdlib names used in this guide (b64decode, urlsafe_b64decode, b64encode), so for these calls you can swap the import and leave the rest unchanged. Use it only when profiling confirms Base64 is actually a bottleneck.
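If pybase64 is an optional dependency, a common pattern is to fall back to the standard library when it is not installed. A minimal sketch, where the alias b64 is an arbitrary name chosen for this example:

```python
try:
    import pybase64 as b64  # optional third-party accelerator
except ImportError:
    import base64 as b64    # stdlib fallback, same function names for these calls

# The calling code does not need to know which module was picked
token_bytes = b64.urlsafe_b64decode("eyJpZCI6IDQ3MX0=")
print(token_bytes)  # b'{"id": 471}'
```

This only works for functions both modules provide, so it is worth limiting the alias to the decode and encode calls shown in this guide.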
Common Mistakes
Mistake 1: Forgetting to call .decode() on the result
# ❌ b64decode() returns bytes — this crashes downstream
raw = base64.b64decode("eyJ1c2VyX2lkIjogNDcxLCAicm9sZSI6ICJhZG1pbiJ9")
user_id = raw["user_id"] # TypeError: byte indices must be integers
# ✅ decode bytes → str, then parse
raw = base64.b64decode("eyJ1c2VyX2lkIjogNDcxLCAicm9sZSI6ICJhZG1pbiJ9")
payload = json.loads(raw.decode("utf-8"))
print(payload["user_id"]) # 471
Mistake 2: Using b64decode() on URL-safe Base64 input
# ❌ URL-safe input contains '-' or '_', which are not in the standard alphabet
jwt_segment = "eyJuZXh0IjogIi8_dGFiPTIifQ"
base64.b64decode(jwt_segment) # binascii.Error or silently wrong bytes
# ✅ use urlsafe_b64decode() for any token with '-' or '_'
padded = jwt_segment + "=" * (-len(jwt_segment) % 4)
data = base64.urlsafe_b64decode(padded)
print(json.loads(data.decode("utf-8"))) # {'next': '/?tab=2'}
Mistake 3: Not fixing padding on stripped tokens
# ❌ JWTs strip '=' — this crashes
segment = "eyJ0eXBlIjogImFjY2VzcyIsICJqdGkiOiAiMzgxIn0"
base64.urlsafe_b64decode(segment) # binascii.Error: Incorrect padding
# ✅ always add padding before urlsafe_b64decode()
padded = segment + "=" * (-len(segment) % 4)
result = json.loads(base64.urlsafe_b64decode(padded).decode("utf-8"))
print(result["type"]) # access
Mistake 4: Calling .decode("utf-8") on binary data
# ❌ PDFs, PNGs, ZIPs are not UTF-8 — this crashes
pdf_b64 = "JVBERi0xLjQKJeLjz9MKNyAwIG9iago8PC9U..."
pdf_text = base64.b64decode(pdf_b64).decode("utf-8") # UnicodeDecodeError
# ✅ write binary directly to a file — no .decode() needed
pdf_bytes = base64.b64decode(pdf_b64)
Path("report_q1_2026.pdf").write_bytes(pdf_bytes)
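When it is unclear whether a decoded payload is text or binary, checking the first few bytes against well-known file signatures is a cheap first test. The sketch below covers only four formats, and sniff_format is a hypothetical helper name.

```python
import base64
from typing import Optional

# Well-known file signatures ("magic bytes") for a few binary formats
SIGNATURES = {
    b"%PDF": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
}

def sniff_format(raw: bytes) -> Optional[str]:
    """Return a format name if the bytes start with a known signature."""
    for magic, name in SIGNATURES.items():
        if raw.startswith(magic):
            return name
    return None

# "JVBERi0xLjQK" is the prefix used in the PDF examples above
pdf_head = base64.b64decode("JVBERi0xLjQK")
print(pdf_head)                     # b'%PDF-1.4\n'
print(sniff_format(pdf_head))       # pdf
print(sniff_format(b'{"id": 471}')) # None
```

A None result does not prove the payload is text; it only means none of the listed signatures matched.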
Decoding Large Base64 Files
For files larger than ~50–100 MB, use a chunked approach to avoid loading everything into memory at once:
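import base64
def decode_large_b64_file(input_path: str, output_path: str, chunk_size: int = 65536) -> None:
    """Stream-decode a Base64 file; line breaks are tolerated by carrying leftovers between chunks."""
    leftover = b""
    with open(input_path, "rb") as src, open(output_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            # Drop all ASCII whitespace, then decode only complete 4-character groups
            data = leftover + b"".join(chunk.split())
            usable = len(data) - (len(data) % 4)
            dst.write(base64.b64decode(data[:usable]))
            leftover = data[usable:]
        if leftover:
            # An incomplete final group means the file is truncated; this raises binascii.Error
            dst.write(base64.b64decode(leftover))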
decode_large_b64_file("snapshot_2026_03.b64", "snapshot_2026_03.sql.gz")
For PEM certificates and MIME attachments with line wrapping, use base64.decodebytes() — it silently ignores whitespace and newlines.
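As a quick illustration of the line-wrapped case, the sketch below wraps some arbitrary bytes with base64.encodebytes() and decodes them again. The sample data is generated on the spot rather than taken from a real certificate.

```python
import base64

original = bytes(range(256)) * 2        # 512 bytes of arbitrary binary data

# encodebytes() emits MIME-style output: a newline after every 76 characters
wrapped = base64.encodebytes(original)
line_count = wrapped.count(b"\n")
print(line_count, "lines of wrapped Base64")

# decodebytes() accepts bytes only (not str) and skips the embedded newlines
decoded = base64.decodebytes(wrapped)
assert decoded == original
```

For text read from a file, call .encode("ascii") first or open the file in binary mode, since decodebytes() raises TypeError on str input.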
Python Base64 Decoding Methods — Quick Comparison
| Method | Alphabet | Padding | Best For |
|---|---|---|---|
| base64.b64decode() | Standard (+/) | Required | General-purpose, email, PEM |
| base64.decodebytes() | Standard (+/) | Required (line breaks ignored) | PEM certs, MIME, multiline |
| base64.urlsafe_b64decode() | URL-safe (-_) | Required | JWT, OAuth, Google Cloud |
| base64.b32decode() | 32-char (A–Z, 2–7) | Required | TOTP secrets, DNS-safe IDs |
| base64.b16decode() | Hex (0–9, A–F) | None | Hex checksums |
| pybase64.b64decode() | Standard (+/) | Required | High-throughput pipelines |
Use b64decode() as your default. Switch to urlsafe_b64decode() the moment you see - or _ in the input — those characters are the unmistakable sign of URL-safe Base64. For one-off checks during development, this online Base64 decoder handles both alphabets and auto-repairs padding — no Python environment needed.
Frequently Asked Questions
How do I decode a Base64 string to a regular string in Python?
Call base64.b64decode(encoded) to get bytes, then call .decode("utf-8") on the result. The two steps are always separate because b64decode() only reverses the Base64 alphabet — it does not know whether the original content was UTF-8, Latin-1, or binary.
Why do I get "Incorrect padding" when decoding Base64 in Python?
Base64 strings must be a multiple of 4 characters long. JWTs and URLs strip trailing = padding. Fix it with padded = s + "=" * (-len(s) % 4), which adds 0, 1, or 2 characters for any validly encoded input.
What is the difference between b64decode() and urlsafe_b64decode()?
Both implement the same Base64 algorithm but with different alphabets. b64decode() uses + and /; urlsafe_b64decode() uses - and _. Mixing them up causes either a binascii.Error or silently corrupt output.
How do I decode a Base64-encoded image in Python?
Decode to bytes with base64.b64decode(encoded), then write those bytes directly to a file — do not call .decode("utf-8") on image data. If the input is a data URL (data:image/png;base64,...), strip the prefix first with _, encoded = data_url.split(",", 1).
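A minimal sketch of the data URL case follows. To stay self-contained it builds a tiny data URL from the 8-byte PNG file signature instead of a real image; the variable names are illustrative.

```python
import base64

# Stand-in "image": just the PNG file signature, not a complete picture
png_signature = b"\x89PNG\r\n\x1a\n"
data_url = "data:image/png;base64," + base64.b64encode(png_signature).decode("ascii")

# Split once on the first comma: metadata on the left, Base64 on the right
header, encoded = data_url.split(",", 1)
if not header.endswith(";base64"):
    raise ValueError("data URL is not Base64-encoded")

image_bytes = base64.b64decode(encoded)
print(header)       # data:image/png;base64
print(image_bytes)  # b'\x89PNG\r\n\x1a\n'
```

With a real image, image_bytes would be written to disk with Path(...).write_bytes(image_bytes), exactly as in the PDF example.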
Can I decode Base64 in Python without importing any module?
No reason to. The base64 module is part of Python's standard library: always available, backed by the C-implemented binascii module, zero dependencies.