SEN LLC

Posted on Apr 28

Three Address Checksums, Three Engineering Philosophies — Verifying Base58Check, Bech32m, and EIP-55 in 250 Lines of Browser JS

#javascript #crypto #bitcoin #ethereum

When you paste a wallet address into something, the something either typo-checks it or it doesn't. Most engineers reach for validate_address(addr) from a library and move on. Spend an hour writing the verifiers from scratch and you discover something interesting: Bitcoin and Ethereum have completely different engineering philosophies about address checksums, and the differences are visible at the byte level.

Here's a 250-line browser-only verifier for Base58Check (BTC legacy), Bech32 / Bech32m (BTC SegWit), and EIP-55 (Ethereum), and what writing each one taught.

🔐 Demo: https://sen.ltd/portfolio/address-decoder/
📦 GitHub: https://github.com/sen-ltd/address-decoder

Three formats, three worldviews

Format	Example	Checksum	Year	Stance
Base58Check	`1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa`	4 bytes from SHA256²	2009	Defensive paranoia
EIP-55	`0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed`	1 bit per nibble via letter case	2016	Bolted-on, backwards-compatible
Bech32 / Bech32m	`bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4`	30-bit polynomial mod	2017 / 2021	QR + voice friendly

Working through them in order:

Bitcoin Base58Check — defensive paranoia

Satoshi's original design. An address is exactly:

[version (1 byte)] [payload (20 bytes)] [checksum (4 bytes)] = 25 bytes total

Encoded in Base58 (Bitcoin's alphabet). The checksum is the first 4 bytes of SHA256(SHA256(version || payload)), that is, SHA-256 applied twice. Verification is a few lines plus a helper:

// decoded is the 25-byte result of Base58-decoding the address
const versioned = decoded.slice(0, decoded.length - 4);
const checksum = decoded.slice(decoded.length - 4);
const expected = (await dsha256(versioned)).slice(0, 4);
const valid = bytesEqual(checksum, expected);

async function dsha256(b) {
  return new Uint8Array(
    await crypto.subtle.digest("SHA-256",
      await crypto.subtle.digest("SHA-256", b))
  );
}

The double hash is usually explained as a 2009-era hedge against length-extension attacks on plain SHA-256. By today's threat model a single pass would be fine, but the format is effectively frozen: every wallet that handles legacy addresses depends on it.

4 bytes = 32 bits of checksum means a random corruption slips through with probability 2⁻³², about 1 in 4.3 billion. Effectively zero. The price you pay is that addresses look like radio static (1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa).

The Base58 trap

The Base58 alphabet excludes 0, O, I, and l to avoid visual confusion: 58 = 62 alphanumerics minus those 4 ambiguous characters (equivalently, Base64 minus the same 4 and minus + and /). The implementation gotcha: leading 1 characters encode leading 0x00 bytes, separately from the actual base-58 conversion.

let leadingZeros = 0;
while (leadingZeros < s.length && s[leadingZeros] === "1") leadingZeros++;
// ... base-58 → base-256 conversion of the rest ...
return new Uint8Array([...zeros(leadingZeros), ...converted]);

Forget that and you decode one byte short for any address whose first byte is 0x00. That covers every mainnet P2PKH address, because its version byte is 0x00. I tripped on it on the first test.
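To make the leading-zero handling concrete, here is a minimal, self-contained sketch of a Base58 decoder. The names `B58_ALPHABET` and `base58Decode` are illustrative and not necessarily what decoder.js uses; the byte-array arithmetic is one of several reasonable ways to do the base conversion.

```javascript
const B58_ALPHABET =
  "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Hypothetical helper: decode a Base58 string into a Uint8Array.
function base58Decode(s) {
  // Each leading "1" stands for one literal 0x00 byte.
  let leadingZeros = 0;
  while (leadingZeros < s.length && s[leadingZeros] === "1") leadingZeros++;

  // Big-endian base-256 accumulator for the remaining digits.
  const bytes = [];
  for (let i = leadingZeros; i < s.length; i++) {
    let carry = B58_ALPHABET.indexOf(s[i]);
    if (carry < 0) throw new Error(`invalid Base58 character: ${s[i]}`);
    // bytes = bytes * 58 + digit, starting at the least significant byte.
    for (let j = bytes.length - 1; j >= 0; j--) {
      carry += bytes[j] * 58;
      bytes[j] = carry & 0xff;
      carry >>>= 8;
    }
    // Whatever is left over grows the number by new high-order bytes.
    while (carry > 0) {
      bytes.unshift(carry & 0xff);
      carry >>>= 8;
    }
  }
  return new Uint8Array([...new Array(leadingZeros).fill(0), ...bytes]);
}
```

Decoding the genesis address with this sketch yields 25 bytes with a 0x00 version byte in front; drop the `leadingZeros` handling and the result is 24 bytes, which is exactly the trap described above.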

Ethereum EIP-55 — bolted-on case checksum

When Ethereum launched in 2015, addresses had no checksum. They were literal hex:

0xde709f2102306220921060314715629080e2fb77

Hex is case-insensitive once decoded (0xa and 0xA are the same nibble), so the case of each letter a-f was a free, unused bit of signal. EIP-55 (2016) used it for a checksum:

Lowercase the address, drop the 0x prefix, and treat the remaining 40 hex characters as ASCII bytes; hash them with keccak-256.
Walk each character of the lowercase address. If it's a letter and the corresponding hash nibble (the i-th nibble for the i-th character) is >= 8, uppercase that letter. Digits stay digits.

function checksumHex(lowerHex, hashBytes) {
  let out = "";
  for (let i = 0; i < lowerHex.length; i++) {
    const c = lowerHex[i];
    if (c >= "0" && c <= "9") { out += c; continue; }
    const byte = hashBytes[i >>> 1];
    const nibble = (i % 2 === 0) ? (byte >>> 4) : (byte & 0x0f);
    out += nibble >= 8 ? c.toUpperCase() : c;
  }
  return out;
}

The elegance is in what happens to existing tools:

Pre-EIP-55 wallets send all-lowercase. Receivers treat that as "no checksum present" and accept it.
EIP-55-aware wallets send mixed-case; receivers verify the mixed-case matches the recomputed pattern.
The spec is explicit: addresses uniformly cased one way are accepted without checksum verification.
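The three acceptance rules above can be sketched as a small classifier. This is an illustration under assumptions: `eip55Status` is a hypothetical helper, and `checksummed` is assumed to be the "0x"-prefixed EIP-55 encoding of the same address (for example, produced with the checksumHex function shown earlier).

```javascript
// Hypothetical helper: classify an input address against its recomputed
// EIP-55 form. Returns "malformed", "unchecked", "valid", or "invalid".
function eip55Status(input, checksummed) {
  if (!/^0x[0-9a-fA-F]{40}$/.test(input)) return "malformed";
  const body = input.slice(2);
  // Uniform case carries no checksum information, so nothing can be verified.
  if (body === body.toLowerCase() || body === body.toUpperCase()) {
    return "unchecked";
  }
  // Mixed case must match the recomputed pattern exactly.
  return input === checksummed ? "valid" : "invalid";
}
```

One design consequence worth noting: an address whose correct checksum happens to be all uppercase is reported as "unchecked" by this sketch, because it cannot be told apart from an address that was simply uppercased.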

This is a textbook "don't break the world" upgrade. The cost is that lowercase Ethereum addresses still ship today with no integrity check at all.

The keccak-256 trap — it's not NIST SHA3

Ethereum's "keccak-256" is not NIST SHA3-256. The two share the Keccak-f[1600] permutation underneath but the padding byte differs:

Keccak-256 (Ethereum): 0x01
SHA3-256 (NIST): 0x06

That single-byte difference produces completely different digests, so any SHA3-256 implementation gives the wrong answer for EIP-55. The Web Crypto API is no help either: crypto.subtle.digest() as standardized covers only SHA-1, SHA-256, SHA-384, and SHA-512, so it offers neither SHA-3 nor Keccak. A 150-line hand-rolled implementation is required.
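A minimal sketch of where that byte enters the picture, assuming the 136-byte rate used by both 256-bit variants. The helper name `padForSponge` and the constant names are illustrative; only the two suffix values come from the text above.

```javascript
const RATE_BYTES = 136;     // 1088-bit rate for a 256-bit digest
const KECCAK_SUFFIX = 0x01; // original Keccak, as used by Ethereum
const SHA3_SUFFIX = 0x06;   // NIST SHA-3

// Hypothetical helper: pad a message to a multiple of the rate.
// The suffix byte starts the padding and 0x80 ends it; when only one
// padding byte is needed, the two are combined into that single byte.
function padForSponge(message, suffix) {
  const padLen = RATE_BYTES - (message.length % RATE_BYTES); // 1..136
  const out = new Uint8Array(message.length + padLen);
  out.set(message);
  out[message.length] ^= suffix; // first padding byte
  out[out.length - 1] ^= 0x80;   // last padding byte
  return out;
}
```

Everything after this step (absorbing blocks, running the permutation, squeezing 32 bytes) is identical between the two hashes, which is why a single constant is the whole difference.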

The implementation core:

function keccakF(state) {
  for (let round = 0; round < 24; round++) {
    // θ — column parity, XOR with neighbouring columns
    // ρ + π — lane rotation + permutation across the 5×5 state
    // χ — non-linear within each row: a[x] ^= ~a[x+1] & a[x+2]
    // ι — XOR a per-round constant into lane (0,0)
  }
}

JavaScript's bitwise ops cap at 32 bits, so each 64-bit lane is stored as a (lo, hi) pair. BigInt arithmetic would dominate the cost of the hot loop, so the 64-bit rotations and XORs are done manually on the two halves.
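As an illustration of the (lo, hi) representation, here is a sketch of the one operation that is awkward without native 64-bit integers: a left rotation, which the ρ step needs. `rotl64` is a hypothetical helper name and may not match keccak256.js.

```javascript
// Hypothetical helper: rotate the 64-bit value (hi:lo) left by n bits.
// Returns [lo, hi], each as an unsigned 32-bit number.
function rotl64(lo, hi, n) {
  n &= 63;
  if (n === 0) return [lo >>> 0, hi >>> 0];
  if (n === 32) return [hi >>> 0, lo >>> 0]; // plain swap of the halves
  if (n < 32) {
    // Bits shifted out of one half flow into the bottom of the other.
    return [
      ((lo << n) | (hi >>> (32 - n))) >>> 0,
      ((hi << n) | (lo >>> (32 - n))) >>> 0,
    ];
  }
  // n > 32: swap the halves, then rotate by the remaining n - 32 bits.
  const m = n - 32;
  return [
    ((hi << m) | (lo >>> (32 - m))) >>> 0,
    ((lo << m) | (hi >>> (32 - m))) >>> 0,
  ];
}
```

The n === 0 and n === 32 cases are split out because JavaScript masks shift counts to 5 bits, so a shift by 32 would silently behave like a shift by 0.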

The eight official EIP-55 vectors made debugging painless: each one is an address that should round-trip to itself when re-checksummed. My first version had one wrong round constant — the high half of RC[2] was zero when it should have been 0x80000000 — and all eight vectors failed with different garbage. Fix one byte, all eight pass. Cryptographic test vectors at their best.

const EIP55_VECTORS = [
  "0x52908400098527886E0F7030069857D2E4169EE7",
  "0x8617E340B3D01FA5F11F306F4090FD50E238070D",
  "0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed",
  // ... 5 more
];
for (const addr of EIP55_VECTORS) {
  test(`EIP-55: ${addr.slice(0, 12)}…`, () => {
    const r = decodeEthereumAddress(addr);
    assert.equal(r.eip55_checksum, addr); // recompute matches input
  });
}

Bitcoin SegWit's Bech32 / Bech32m — QR-friendly with a twist

Bech32 (BIP-0173, 2017) was designed for SegWit native addresses (bc1...) with three explicit goals:

Single-case so QR codes stay small (an all-uppercase address fits QR's compact alphanumeric mode, while mixed-case Base58 forces byte mode)
A voice-friendly alphabet of 32 characters (drops b, i, o, and 1)
Guaranteed detection of small numbers of character errors via a polynomial (BCH) checksum

Structure:

[hrp (e.g. "bc")] "1" [data (5-bit groups)] [checksum (6 chars = 30 bits)]

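The 1 is a separator that's been excluded from the data alphabet, so it can never appear in the data or the checksum. The HRP may still contain a 1, which is why a decoder splits on the last 1 in the string. The checksum is a BCH code over GF(32): six 5-bit symbols (30 bits) computed as a remainder modulo a degree-6 generator polynomial: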

function bech32Polymod(values) {
  const GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3];
  let chk = 1;
  for (const v of values) {
    const top = chk >>> 25;
    chk = ((chk & 0x1ffffff) << 5) ^ v;
    for (let i = 0; i < 5; i++) {
      if ((top >>> i) & 1) chk ^= GEN[i];
    }
  }
  return chk >>> 0;
}

Per BIP-0173, this is guaranteed to detect any error affecting at most 4 characters (for strings up to 89 characters), and larger errors slip through with probability below 1 in 10⁹. The choice of generator polynomial isn't arbitrary; it was derived to maximise that error-detection guarantee (sipa's notes on the math are worth a read).
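The polymod alone is not quite a verifier: the HRP has to be expanded into 5-bit values and the data characters mapped through the alphabet first. A minimal sketch of those missing pieces follows, with the polymod repeated so the block is self-contained. `hrpExpand` follows the construction in BIP-0173; `bech32Residue` is a hypothetical helper name.

```javascript
const CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l";
const GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3];

function polymod(values) {
  let chk = 1;
  for (const v of values) {
    const top = chk >>> 25;
    chk = ((chk & 0x1ffffff) << 5) ^ v;
    for (let i = 0; i < 5; i++) {
      if ((top >>> i) & 1) chk ^= GEN[i];
    }
  }
  return chk >>> 0;
}

// High 3 bits of each HRP character, a zero separator, then the low 5 bits.
function hrpExpand(hrp) {
  const codes = [...hrp].map((c) => c.charCodeAt(0));
  return [...codes.map((c) => c >>> 5), 0, ...codes.map((c) => c & 31)];
}

// Hypothetical helper: returns the polymod residue of a Bech32-style string,
// or null if the string is structurally malformed. A residue of 1 means the
// Bech32 checksum is valid.
function bech32Residue(str) {
  const lc = str.toLowerCase();
  if (str !== lc && str !== str.toUpperCase()) return null; // mixed case
  const idx = lc.lastIndexOf("1");
  if (idx < 1 || idx + 7 > lc.length) return null; // need HRP + 6 checksum chars
  const data = [];
  for (const c of lc.slice(idx + 1)) {
    const v = CHARSET.indexOf(c);
    if (v < 0) return null;
    data.push(v);
  }
  return polymod([...hrpExpand(lc.slice(0, idx)), ...data]);
}
```

For example, `bech32Residue("a12uel5l")` evaluates to 1, so that short test string from BIP-0173 carries a valid checksum.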

The Bech32m post-mortem (BIP-0350)

A weakness was then found in Bech32: whenever a string ends in p, inserting or deleting any number of q characters immediately before that p leaves the checksum valid. BIP-0350 (2020) fixed it with a new variant, Bech32m, that uses a different final XOR constant (0x2bc830a3 instead of Bech32's 1). Crucially, the BIP-0350 deployment plan split by witness version:

Witness version	Encoding
0 (P2WPKH, P2WSH)	bech32 (legacy, kept for compatibility)
1+ (Taproot)	bech32m (new)

Both formats coexist in the wild. The implementation must check both polymod constants, then verify the variant matches the witness version:

const variant =
  poly === BECH32_CONST  ? "bech32"  :
  poly === BECH32M_CONST ? "bech32m" : null;
const variantOk =
  (witnessVersion === 0 && variant === "bech32") ||
  (witnessVersion >= 1 && variant === "bech32m");

Accepting bech32m for v0 (or vice versa) is technically a wallet bug — it means an address that looks valid but came from a buggy generator might be accepted, leading to lost funds. Worth the discipline to keep the check strict.
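For the curious, the Bech32 weakness that BIP-0350 describes (a string ending in p stays valid when q characters are inserted right before that p) can be reproduced in a few lines. This is a sketch working on 5-bit symbol values rather than characters; `bech32Checksum` mirrors the checksum construction in BIP-0173, while `wordEndingInP` and `insertQs` are hypothetical helpers written for this demonstration.

```javascript
const GEN = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3];

function polymod(values) {
  let chk = 1;
  for (const v of values) {
    const top = chk >>> 25;
    chk = ((chk & 0x1ffffff) << 5) ^ v;
    for (let i = 0; i < 5; i++) {
      if ((top >>> i) & 1) chk ^= GEN[i];
    }
  }
  return chk >>> 0;
}

function hrpExpand(hrp) {
  const codes = [...hrp].map((c) => c.charCodeAt(0));
  return [...codes.map((c) => c >>> 5), 0, ...codes.map((c) => c & 31)];
}

// Six checksum symbols for the original Bech32 (final constant 1).
function bech32Checksum(hrp, data) {
  const mod = polymod([...hrpExpand(hrp), ...data, 0, 0, 0, 0, 0, 0]) ^ 1;
  return [0, 1, 2, 3, 4, 5].map((i) => (mod >>> (5 * (5 - i))) & 31);
}

function isValidBech32(hrp, symbols) {
  return polymod([...hrpExpand(hrp), ...symbols]) === 1;
}

// Hypothetical helper: find a one-symbol payload whose checksum ends in
// the value 1, which is the character "p" in the Bech32 alphabet.
function wordEndingInP(hrp) {
  for (let a = 0; a < 32; a++) {
    const word = [a, ...bech32Checksum(hrp, [a])];
    if (word[word.length - 1] === 1) return word;
  }
  return null;
}

// Hypothetical helper: insert n zero symbols ("q") before the final symbol.
function insertQs(word, n) {
  return [...word.slice(0, -1), ...new Array(n).fill(0), 1];
}
```

With `const w = wordEndingInP("bc")`, both `isValidBech32("bc", w)` and `isValidBech32("bc", insertQs(w, 3))` hold, even though the second string is three characters longer. For fixed-length v0 programs a length check catches this, which is part of why v0 could stay on Bech32.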

A dispatcher false-positive worth knowing

The tool guesses the format from the input. The first version's bech32 decoder false-fired on Bitcoin Genesis (1A1zP1eP5...) because:

bech32 looks for the last 1 in the input as a separator
Bitcoin legacy P2PKH addresses contain plenty of 1 characters
The mixed-case rejection check fired before the HRP was validated, returning {format: "bech32", valid: false, error: "mixed case is forbidden"}

Fix: reject anything whose computed HRP isn't pure lowercase letters.

const hrp = lc.slice(0, idx);
if (!/^[a-z]+$/.test(hrp)) return null;

The spec technically allows digits (and other printable ASCII) in the HRP, but the address HRPs this tool cares about (bc, tb, bcrt, ltc, tltc) are all pure lowercase letters. For an address verifier, Postel's "be liberal in what you accept" is the wrong default: you want to be strict to avoid sending money to a structurally-valid-but-wrong-format address.
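Putting the fix in context, a format dispatcher along these lines might look as follows. This is a sketch, not the tool's actual dispatcher: `guessFormat` and its return labels are hypothetical, and it only inspects the shape of the input, leaving checksum verification to the per-format decoders.

```javascript
// Hypothetical dispatcher: guess the address format from its shape alone.
// Returns "eip55", "bech32", "base58check", or null.
function guessFormat(input) {
  const s = input.trim();
  if (/^0x[0-9a-fA-F]{40}$/.test(s)) return "eip55";

  // Bech32 candidate: split on the LAST "1" and insist on a letters-only
  // HRP plus at least six characters from the data alphabet.
  const lc = s.toLowerCase();
  const idx = lc.lastIndexOf("1");
  if (idx > 0) {
    const hrp = lc.slice(0, idx);
    const data = lc.slice(idx + 1);
    const dataOk = /^[qpzry9x8gf2tvdw0s3jn54khce6mua7l]{6,}$/.test(data);
    if (/^[a-z]+$/.test(hrp) && dataOk) return "bech32";
  }

  // Base58 alphabet: no 0, O, I, or l.
  if (/^[1-9A-HJ-NP-Za-km-z]+$/.test(s)) return "base58check";
  return null;
}
```

With the HRP check in place, the genesis address falls through to the Base58 branch: its candidate HRP is "1a1zp", which contains digits and so is never treated as Bech32.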

Takeaways

Base58Check (Bitcoin legacy) is paranoid by design: 32 bits of double-SHA-256 let a random typo through only about once in 4.3 billion tries. The ugliness is the cost.
EIP-55 (Ethereum) is pragmatic: a 1-bit-per-letter checksum sneaked into the existing case-insensitive hex format, with a deliberate "all-one-case = no checksum" carve-out that kept the Ethereum ecosystem from breaking when it shipped in 2016.
Bech32 / Bech32m (Bitcoin SegWit) is engineering-textbook: requirements (QR, voice, error detection) drove a custom alphabet, a mathematically-derived polynomial, and, when a flaw was found, a compatible split into a new variant (BIP-0350) tied to witness version.

Three formats, three lessons in cryptographic design under different constraints. The fact that it all fits in 250 lines of browser JavaScript is a reminder that the math is small; the policy is what the spec is really about.

Full source on GitHub — decoder.js (250 lines), keccak256.js (150 lines), 30 tests covering all 8 EIP-55 vectors plus BIP-0173/0350 references plus BTC mainnet/testnet. MIT licensed.

Live demo — six example addresses one click away.

DEV Community