"Validate that it's a phone number. How hard can it be?"
I've heard this sentence start many projects that ended in frustration. Phone numbers look simple: we use them every day and we know what they look like, so validating them should be easy.
Very hard, it turns out. Phone numbers exist at the intersection of decades of telecommunications regulation, 200+ countries with different conventions, and constant changes as new number blocks are allocated and old rules are updated.
Let me take you down the rabbit hole.
What Does "Valid" Even Mean?
Start with the basic question: what makes a phone number valid?
Syntactically valid? The string matches some pattern. It has the right length. It contains only digits (and maybe some formatting characters). This is the easiest to check and the least meaningful.
Plausibly valid? The number could exist in some country. The length is right for the claimed country code. The area code is a real area code. This is more useful but still doesn't guarantee the number works.
Currently assigned? The number is actually assigned to a carrier and could, in theory, receive calls. This requires carrier database lookups and changes constantly as numbers are allocated, reassigned, and disconnected.
Actually reachable? You can successfully deliver a call or SMS to this number right now. This is the gold standard, but you can only know for sure by trying.
Most applications care about "plausibly valid" — catching typos and garbage input while accepting numbers that are likely real. Going beyond that gets expensive and complex.
The Format Jungle
Users enter phone numbers in dozens of formats:
5551234567
555-123-4567
(555) 123-4567
555.123.4567
+1 555 123 4567
+15551234567
1-555-123-4567
+1 (555) 123-4567
All the same number. All valid input that your system needs to handle.
Some formats are common in certain countries. Americans write (555) 123-4567. British users write 07911 123456. Germans write 0170 1234567. French users write 06 12 34 56 78 (with spaces between pairs).
Your input field needs to accept whatever format users naturally type. Your storage needs a canonical format that removes ambiguity.
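As a rough sketch of the first step, the formatting characters can be stripped before any real validation happens. The helper name `strip_formatting` is made up for this example, and it only removes punctuation; it does not decide which country the number belongs to.

```python
import re

def strip_formatting(raw: str) -> str:
    """Keep the digits and a single leading '+'; drop spaces, dashes, dots, parentheses."""
    raw = raw.strip()
    plus = "+" if raw.startswith("+") else ""
    return plus + re.sub(r"[^0-9]", "", raw)

# The eight spellings listed above.
samples = [
    "5551234567", "555-123-4567", "(555) 123-4567", "555.123.4567",
    "+1 555 123 4567", "+15551234567", "1-555-123-4567", "+1 (555) 123-4567",
]

# Eight inputs collapse to three shapes: bare national number,
# national number with a leading 1, and full international form.
cleaned = {strip_formatting(s) for s in samples}
```

Even after stripping, three different strings remain for one number, which is why a canonical storage format still needs country information.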
The E.164 Standard
E.164 is the international standard for phone number formatting. A US number in E.164 looks like:
+15551234567
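That's it. Plus sign, country code, national number. No spaces, no dashes, no parentheses. The plus sign indicates it's a complete international number. The 1 is the country code (shared by the US, Canada, and the other members of the North American Numbering Plan). The rest is the national number. The whole thing can be at most 15 digits long.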
Always store phone numbers in E.164. This format is:
- Globally unique (includes country code)
- Unambiguous (no guessing what country)
- Sortable and comparable
- Standard format for APIs and carriers
Display in whatever local format makes sense for the user. Store in E.164 for your systems.
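A cheap guard at the storage boundary is to check that a value at least has the E.164 shape before it is written. This is a minimal sketch, and `is_e164_shaped` is a name invented here. It checks shape only (a plus sign, a non-zero first digit, at most 15 digits in total) and says nothing about whether the number exists.

```python
import re

# '+', then a non-zero digit, then up to 14 more digits (15 digits maximum).
E164_SHAPE = re.compile(r"\+[1-9][0-9]{1,14}")

def is_e164_shaped(value: str) -> bool:
    """True if the string looks like E.164; does not prove the number is real."""
    return E164_SHAPE.fullmatch(value) is not None
```

Anything that fails this check was either never normalized or was normalized incorrectly, so it is worth rejecting before it reaches the database.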
The Country Code Problem
Without a country code, a phone number is ambiguous.
5551234567 — is this an American number? Canadian? Could be either; they share the +1 country code.
7911123456 — British mobile? Russian regional? Hard to tell without context.
You need country context to validate properly. An 11-digit number starting with 07 might be valid in the UK but invalid in the US. Number length varies by country. Area code conventions vary by country.
The best approach: require country selection or infer it reliably (from user profile, IP geolocation, or payment country). Never validate international phone numbers without country context.
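To make the ambiguity concrete, here is a simplified sketch of converting national input to E.164 once a region is known. The `REGIONS` table and the `to_e164` function are hypothetical and cover only four regions; a real system would rely on libphonenumber's metadata rather than a hand-written table like this.

```python
import re

# Hypothetical, deliberately tiny table: region -> (calling code, national trunk prefix).
REGIONS = {
    "US": ("1", "1"),
    "GB": ("44", "0"),
    "DE": ("49", "0"),
    "FR": ("33", "0"),
}

def to_e164(raw: str, region: str) -> str:
    """Convert user input to E.164, using `region` only when no '+' is present."""
    raw = raw.strip()
    digits = re.sub(r"[^0-9]", "", raw)
    if raw.startswith("+"):
        return "+" + digits  # already international; the region hint is ignored
    calling_code, trunk = REGIONS[region]  # raises KeyError for regions not in the table
    if digits.startswith(trunk):
        digits = digits[len(trunk):]  # drop the national trunk prefix
    return "+" + calling_code + digits

# The same ten digits become different numbers under different regions.
as_british = to_e164("7911123456", "GB")
as_american = to_e164("7911123456", "US")
```

The last two lines are the whole point: without the region argument there is no way to choose between the two results.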
Length Isn't Simple Either
Different countries have different valid phone number lengths:
| Country | Example | Digits after country code |
|---|---|---|
| USA | +1 555 123 4567 | 10 |
| UK | +44 7911 123456 | 10 |
| Germany | +49 170 1234567 | 10-11 |
| France | +33 6 12 34 56 78 | 9 |
| India | +91 98765 43210 | 10 |
| Poland | +48 501 234 567 | 9 |
Some countries have variable-length numbers. German mobile numbers, for example, are 10-11 digits after the country code. Other countries have recently changed number lengths, which means validation rules need updating.
Length checking alone isn't enough to validate, but it's a useful first filter. A 7-digit number in a country that requires 10 is definitely wrong.
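A length filter along those lines might look like the sketch below. The lengths are copied from the table above and are illustrative only, since real numbering plans have more cases than one row per country. The names `NATIONAL_LENGTHS`, `split_calling_code`, and `length_plausible` are invented for this example.

```python
from typing import Optional, Tuple

# Digits allowed after the country code, taken from the table above (not authoritative).
NATIONAL_LENGTHS = {
    "1": {10}, "44": {10}, "49": {10, 11}, "33": {9}, "91": {10}, "48": {9},
}

def split_calling_code(e164: str) -> Optional[Tuple[str, str]]:
    """Split '+CCNNN...' into (calling code, national number) using the table."""
    digits = e164.lstrip("+")
    for size in (1, 2, 3):  # calling codes are one to three digits long
        if digits[:size] in NATIONAL_LENGTHS:
            return digits[:size], digits[size:]
    return None

def length_plausible(e164: str) -> Optional[bool]:
    """True/False for known countries, None when the country is not in the table."""
    parts = split_calling_code(e164)
    if parts is None:
        return None
    code, national = parts
    return len(national) in NATIONAL_LENGTHS[code]
```

Returning `None` for unknown countries keeps "we have no rule" separate from "the rule says no", which matters when the table is incomplete.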
Number Types Matter
Not all phone numbers are equal. They come in several types:
Mobile — Can receive SMS. Usually tied to an individual person. Most valuable for verification.
Landline — Voice only (usually). Often shared/business numbers. Can't receive SMS on most landlines.
VoIP — Voice over IP numbers from services like Google Voice, Skype, etc. Can be anywhere, often temporary. May or may not receive SMS.
Toll-free — Inbound only. 1-800, 1-888, etc. Can't receive SMS.
Premium rate — Costs money to call. Sometimes used for scams.
If you're doing SMS verification, mobile numbers are what you want. Sending an SMS to a landline usually fails. Sending to VoIP might work but is less reliable.
Some applications restrict VoIP numbers because they're easier to get anonymously and discard. If you're trying to tie accounts to real identities, VoIP detection matters.
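Once a lookup service has told you the type, the policy itself is small. The sketch below assumes the type is already known (it does not perform any lookup), and the enum and function names are made up for illustration.

```python
from enum import Enum

class NumberType(Enum):
    MOBILE = "mobile"
    LANDLINE = "landline"
    VOIP = "voip"
    TOLL_FREE = "toll_free"
    PREMIUM_RATE = "premium_rate"

def sms_verification_allowed(kind: NumberType, allow_voip: bool = True) -> bool:
    """Decide whether to attempt SMS verification for a number of this type."""
    if kind is NumberType.MOBILE:
        return True
    if kind is NumberType.VOIP:
        return allow_voip  # a product decision, not a technical limit
    return False  # landline, toll-free, premium rate
```

Keeping `allow_voip` as a parameter makes the trade-off explicit: stricter identity requirements flip it to `False`, at the cost of turning away some legitimate users.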
The Disposable Number Problem
Services exist that provide temporary phone numbers. Users get a number, receive a verification SMS, and discard the number. These services include:
- Google Voice (semi-permanent but anonymous)
- TextNow, TextFree, etc.
- Dedicated "burner" apps
- Online SMS receiving services
These are valid phone numbers. They work. They just aren't tied to a real identity the way a carrier-assigned mobile number is.
For fraud prevention, you might want to detect and reject them. Users creating multiple fake accounts typically use disposable numbers because getting multiple real carrier numbers is expensive.
But be careful: legitimate users use Google Voice. Travelers use local SIMs. Privacy-conscious users have valid reasons to want non-carrier numbers. Not every unusual number is fraud.
The International Input Problem
Building an international phone number input is harder than it looks.
Which country codes to include? There are 200+ countries. A dropdown with 200+ options is unwieldy. Do you show all of them? Just the common ones? Let users type country codes directly?
Formatting as users type. Users expect input to format automatically — add dashes, spaces, parentheses as they type. But formatting rules vary by country. US: (555) 123-4567. UK: 07911 123456. Germany: 0170 1234567.
Handling paste. Users paste phone numbers from emails, documents, etc. Your input needs to parse various formats and normalize them.
Validation feedback. When is a number "wrong"? A partial number isn't wrong, just incomplete. You need to validate at the right moment with the right feedback.
Libraries like libphonenumber (originally from Google) handle much of this complexity. They know the formatting and validation rules for every country. Use them rather than reinventing this wheel.
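The validation feedback point is worth a small illustration. This sketch separates "not finished yet" from "wrong" using only the set of valid lengths for the selected country; the function name and the three status strings are assumptions for the example, and a real input would take its rules from a library rather than a bare length set.

```python
def typing_feedback(digits_so_far: str, valid_lengths: set) -> str:
    """Classify a partially typed national number by length alone."""
    n = len(digits_so_far)
    if n in valid_lengths:
        return "complete"
    if n < max(valid_lengths):
        return "incomplete"  # keep quiet, the user is still typing
    return "too_long"        # safe to show an error right away

us_partial = typing_feedback("555123", {10})
de_done = typing_feedback("1701234567", {10, 11})
```

Showing an error only for `too_long` while typing, and checking `incomplete` on blur or submit, avoids nagging users about a number they have not finished entering.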
What Validation Should You Do?
Different validation levels suit different needs:
Level 1: Basic format validation. Is this string plausibly a phone number? Right length, only valid characters, not clearly garbage. You can do this client-side for immediate feedback.
Level 2: Country-aware validation. Is this a plausible number for this country? Right length, valid area code pattern, valid number range. Use libphonenumber or a validation API.
Level 3: Carrier lookup. Is this number currently assigned to a carrier? Which type (mobile/landline/VoIP)? Requires API calls to number intelligence services. Costs money per lookup.
Level 4: Verification. Can we actually send an SMS or call to this number? The only definitive test. Costs money per verification. Use sparingly — at signup, at critical actions.
Most applications should do Level 2 validation for all input (catch typos and garbage) and Level 4 verification for critical flows (new account signup, password resets, phone number changes).
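Level 1 is simple enough to sketch without a library. The version below is one possible reading of "only valid characters, right length": the 15-digit ceiling comes from E.164, while the 7-digit floor and the allowed punctuation set are assumptions chosen for this example.

```python
import re

ALLOWED_CHARS = re.compile(r"[0-9+\-.() ]+")
MIN_DIGITS = 7   # assumption: anything shorter is treated as garbage
MAX_DIGITS = 15  # E.164 upper bound

def passes_level_1(raw: str) -> bool:
    """Cheap sanity check; says nothing about country rules or reachability."""
    raw = raw.strip()
    if not ALLOWED_CHARS.fullmatch(raw):
        return False  # letters or other unexpected characters (or empty input)
    if "+" in raw[1:]:
        return False  # '+' is only meaningful as the first character
    digit_count = sum(ch.isdigit() for ch in raw)
    return MIN_DIGITS <= digit_count <= MAX_DIGITS
```

This is suitable for immediate client-side feedback. It will still accept plenty of numbers that Level 2 rejects, which is the intended division of labor.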
Storage and Indexing
How you store phone numbers affects what you can do with them.
Store E.164 format. +15551234567. This is canonical, unambiguous, and sortable.
Create a normalized index. For searching, you might want to index without formatting: 15551234567. Users searching might type with or without country code, with various formatting.
Store original input. For debugging and analytics, keep the original string the user entered. If validation is wrong, you need to see what they actually typed.
Consider regional display format. If you know the user's region, store their preference for how to display their own number. They entered it in a familiar format; show it back that way.
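Put together, a stored record might look something like this. The field names and the suffix-matching search are assumptions for illustration, not a prescribed schema, and the E.164 value is expected to come from whatever normalization step runs earlier.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class StoredPhone:
    e164: str        # canonical form, e.g. "+15551234567"
    search_key: str  # digits only, for lookups
    raw_input: str   # exactly what the user typed, kept for debugging
    region: str      # region used for display formatting, e.g. "US"

def make_record(raw_input: str, e164: str, region: str) -> StoredPhone:
    return StoredPhone(e164=e164, search_key=e164.lstrip("+"),
                       raw_input=raw_input, region=region)

def matches_search(record: StoredPhone, query: str) -> bool:
    """Match a query typed with or without country code and formatting."""
    digits = re.sub(r"[^0-9]", "", query)
    return bool(digits) and record.search_key.endswith(digits)

record = make_record("(555) 123-4567", "+15551234567", "US")
```

Suffix matching is what lets `555-123-4567` and `+1 555 123 4567` find the same record. It is loose on purpose, so very short queries will match many records and may need a minimum length.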
Common Mistakes
Assuming 10 digits. US numbers are 10 digits (11 with country code). Many countries have different lengths.
Assuming one format. (555) 123-4567 is American. Don't force this format on international users.
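Stripping leading zeros. In many countries, national numbers start with 0 (0171 in Germany, 07911 in UK). This 0 is usually dropped in international format but is critical in national format, so remove it only as part of a country-aware conversion, never as a blanket cleanup step.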
Not storing country code. A number without country code is incomplete. You might infer it now, but inference can be wrong, and you lose the ability to dial internationally.
Over-validating. Rejecting valid numbers because your regex is too strict. Phone number rules are complex and change. Use a library.
Under-validating. Accepting anything that looks vaguely numeric. "Call me at 867-5309" shouldn't pass validation.
Testing Phone Validation
Phone validation code needs tests with international numbers:
- US numbers (easy baseline)
- UK numbers (leading zero, different length)
- German numbers (variable length)
- Numbers with extensions
- Toll-free numbers
- Invalid numbers (too short, too long, invalid area codes)
- Numbers from countries with recent changes to their numbering plans
Keep test numbers updated. Numbering plans change. A test that passed last year might fail this year because the rules changed.
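A table of cases makes those updates cheap, because changing a rule means editing one row. The sketch below uses a toy validator so it can run on its own: `is_plausible`, its two lookup tables, and the sample numbers are all made up for illustration, and the length rules are illustrative rather than authoritative. In practice the validator passed in would be your real one.

```python
import re

# Toy validator under test; swap in the real one.
VALID_LENGTHS = {"US": {10}, "GB": {10}, "DE": {10, 11}}
TRUNK_PREFIX = {"US": "1", "GB": "0", "DE": "0"}

def is_plausible(raw: str, region: str) -> bool:
    digits = re.sub(r"[^0-9]", "", raw)
    if digits.startswith(TRUNK_PREFIX[region]):
        digits = digits[1:]  # drop the national trunk prefix
    return len(digits) in VALID_LENGTHS[region]

# (input, region, expected result); synthetic numbers, not real subscribers.
CASES = [
    ("(555) 123-4567", "US", True),     # easy baseline
    ("07911 123456", "GB", True),       # leading zero
    ("0170 1234567", "DE", True),       # variable length, short form
    ("0170 12345678", "DE", True),      # variable length, long form
    ("555-1234", "US", False),          # too short
    ("07911 1234567890", "GB", False),  # too long
]

def failing_cases(validator, cases):
    """Return the (input, region) pairs where the validator disagrees with the table."""
    return [(raw, region) for raw, region, expected in cases
            if validator(raw, region) != expected]

failures = failing_cases(is_plausible, CASES)
```

An empty `failures` list means the validator and the table agree. When a numbering plan changes, the failing rows show which expectations to revisit.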
Validate and normalize phone numbers with the Phone Number Validator API. Detect temporary and disposable numbers with the Disposable Phone Checker. Get phone number validation right the first time.
Originally published at APIVerve Blog