Back in 2014 I was working at YouTube and kept running into the same maddening problem: IDs with characters that looked the same. I was focused on standing up their first external-facing LMS for YouTube Certified Online and the unique identifiers for training courses, but I noticed how everyone around me was constantly messing up the YouTube video IDs themselves. Misreading an O as a 0, squinting at I and l and 1, and doing double takes on B and 8. For dyslexic readers, it was hard to mind the p's and q's with the d's and b's. Support folks dictating codes over the phone, swapping an S for a 5. Printed labels that were genuinely ambiguous depending on the font.
Now I'm not dyslexic. But my brother is, and I have a dear friend who is legally blind, and family members with limitations who I help navigate technology every day. When I was working out which characters to cut, I wasn't optimizing for a theoretical average user; I was thinking about the specific people in my life who get failed by design decisions that nobody bothered to question. Removing the mirror pairs isn't just a "nice to have". For a meaningful portion of your users, it's the difference between an ID that works and one that doesn't.
I looked around for a standard. There wasn't one people were actually using. So I quietly worked out which characters to cut and started using the resulting set for everything I built.
In 2019 I finally wrote it up. I've shared it on Medium, LinkedIn, and wherever else I could get people to look at it since then. The reaction was always the same: "Oh, that's obvious. Why isn't this the default?"
Today I'm sharing it as an actual installable library for the first time.
HardGuard25 is a 25-character alphanumeric set for human-friendly unique IDs.
0 1 2 3 4 5 6 7 8 9 A C D F G H J K M N P R U W Y
Why 11, not 4
Crockford Base32 has been around since 2002. It removes I, L, O, and U: four characters. It's the most common "unambiguous" encoding people reach for.
Four wasn't enough.
HardGuard25 removes every character that creates visual ambiguity, for any reader, at any size, in any common typeface. Here's the full removal list:
Digit lookalikes: O (matches 0), I and L (match 1), S (matches 5), Z (matches 2), B (matches 8). Crockford already drops I, L, and O (it also drops U, which HardGuard25 keeps). HardGuard25 additionally removes S, Z, and B.
Dyslexia mirror pairs: d/b, q/p, 3/E. Dyslexic readers reverse these reliably. No reason to include both sides of a pair, so HardGuard25 keeps D, P, and 3 and drops B, Q, and E.
Operator and context lookalikes: V (mirrors U in some fonts), T (looks like +), X (multiplication symbol, also used as a redaction placeholder). These cause parsing confusion in spreadsheets, URLs, and printed labels.
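As a quick sanity check on the removal list, the 25-character set can be derived by starting from the digits and uppercase letters and dropping the 11 removed letters. This is only an illustrative sketch; the names `REMOVED` and `ALPHABET` are my own for this example and are not taken from the library.

```python
import string

# The 11 removed letters. L is included because the published
# 25-character set omits it (it is a lookalike for 1).
REMOVED = set("BEILOQSTVXZ")

# Start from 0-9 and A-Z, then drop everything on the removal list.
ALPHABET = "".join(
    ch for ch in string.digits + string.ascii_uppercase if ch not in REMOVED
)

print(ALPHABET)       # 0123456789ACDFGHJKMNPRUWY
print(len(ALPHABET))  # 25
```

The result matches the set listed at the top of this post: all ten digits plus 15 letters.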
The tradeoff: HardGuard25 codes run 1 to 2 characters longer than Crockford Base32 for the same entropy, because each character carries about 4.64 bits instead of 5. If your IDs are machine-read 99% of the time, Crockford is fine. If a human ever has to read, type, speak, or transcribe your ID, every one of those 11 characters is a support ticket waiting to happen.
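To see where the extra characters come from, here is a small sketch that finds the shortest 25-symbol ID with at least as many possible values as a 32-symbol ID of a given length. The function name `hg25_length_for` is hypothetical, made up for this illustration, and it uses exact integer arithmetic rather than anything from the spec.

```python
def hg25_length_for(base32_length: int) -> int:
    """Smallest length k such that 25**k >= 32**base32_length."""
    target = 32 ** base32_length
    k = 0
    space = 1  # number of distinct IDs of length k
    while space < target:
        space *= 25
        k += 1
    return k

for n in (4, 8, 12, 16):
    # Prints the Base32 length next to the equivalent 25-symbol length.
    print(n, hg25_length_for(n))
```

For Base32 lengths of 4, 8, 12, and 16 this gives 5, 9, 13, and 18, so the overhead is one character at short lengths and two at longer ones.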
Quickstart
JavaScript
npm install @snapsynapse/hardguard25
import { generate, validate, checkDigit } from '@snapsynapse/hardguard25';
generate(8); // "AC3H7PUW"
generate(8, { checkDigit: true }); // "AC3H7PUWR"
validate("AC3H-7PUW"); // true
checkDigit("AC3H7PUW"); // "R"
Python
pip install hardguard25
from hardguard25 import generate, validate, check_digit
generate(8) # "AC3H7PUW"
generate(8, check_digit=True) # "AC3H7PUWR"
validate("AC3H-7PUW") # True
check_digit("AC3H7PUW") # "R"
Go
import "github.com/snapsynapse/hardguard25/go"
id, _ := hardguard25.Generate(8) // "AC3H7PUW"
id, _ = hardguard25.GenerateWithCheck(8) // "AC3H7PUWR"
ok := hardguard25.Validate("AC3H-7PUW") // true
ch, _ := hardguard25.CheckDigit("AC3H7PUW") // 'R'
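If you're curious what generation and validation roughly involve, here is a minimal Python sketch. It is not the library's implementation: the function names `generate_id`, `normalize`, and `is_valid` are hypothetical, and the normalization rules (strip hyphens and spaces, uppercase the input) are my assumption based on the hyphenated example above. The check digit is left out because its algorithm lives in the spec.

```python
import secrets

ALPHABET = "0123456789ACDFGHJKMNPRUWY"

def generate_id(length: int) -> str:
    """Return a random ID drawn uniformly from the 25-character set."""
    if length < 1:
        raise ValueError("length must be positive")
    # secrets.choice uses the OS's cryptographically strong random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def normalize(raw: str) -> str:
    """Assumed normalization: drop hyphens and spaces, then uppercase."""
    return raw.replace("-", "").replace(" ", "").upper()

def is_valid(raw: str) -> bool:
    """True if the normalized input is non-empty and uses only set characters."""
    cleaned = normalize(raw)
    return bool(cleaned) and all(ch in ALPHABET for ch in cleaned)

print(is_valid("AC3H-7PUW"))  # True
print(is_valid("AC3H-7PUB"))  # False, because B is not in the set
```

Rejecting removed characters outright, rather than silently mapping them to a lookalike, is a design choice for this sketch only; check the spec for how the real library treats them.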
How long do IDs need to be?
| Length | Unique IDs | Good for |
|---|---|---|
| 4 | 390,625 | Small inventory, event tickets |
| 6 | 244 million | Medium business scale |
| 8 | 152 billion | Large systems |
| 16 | 2.33 × 10²² | Cross-system identifiers |
| 20 | 9.09 × 10²⁷ | Public tokens |
Rule of thumb: 4-5 characters for small business, 8+ for large systems, 16-22 for tokens and cross-org use.
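The ID counts are just powers of 25, and if you generate IDs randomly you also care about how quickly collisions become likely. The sketch below computes both, using the standard birthday approximation. The function names are hypothetical and this is not the spec's collision guidance, only a rough way to reason about it.

```python
import math

def capacity(length: int) -> int:
    """Number of distinct IDs of a given length over 25 symbols."""
    return 25 ** length

def collision_probability(length: int, issued: int) -> float:
    """Birthday approximation: 1 - exp(-k(k-1) / (2N)) for k random IDs."""
    n = capacity(length)
    x = issued * (issued - 1) / (2 * n)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-x)

print(capacity(4))  # 390625
print(capacity(6))  # 244140625
# Chance of at least one collision after 100,000 random 8-character IDs
# (roughly 3%).
print(collision_probability(8, 100_000))
```

If that probability is too high for your volume, either add characters or check for uniqueness against a datastore before issuing an ID.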
When not to use it
HardGuard25 is not a general-purpose encoding scheme. Skip it for:
- Cryptographic keys: use proper key derivation
- Blockchain consensus: use domain-specific formats
- Systems requiring UUID guarantees: use UUIDv7 or ULID
It's for IDs that humans interact with. Full stop.
Try it
Interactive generator: hardguard25.com
The full spec is in SPEC.md. It covers entropy math, collision guidance, normalization rules, the check digit algorithm, test vectors, and accessibility notes.
Got issues with this? Please raise them on the GitHub project!
Do you use a custom character set for your IDs, or do you default to UUID/ULID? Curious how many people are still hitting the O/0 problem in production.