How email verification works: syntax, MX, and SMTP explained

#email #deliverability #webdev #programming

"Email verification" sounds like one thing, but it's really a stack of checks of increasing depth and cost. Knowing what each layer actually proves helps you pick the right level instead of overpaying for verification you don't need.

Layer 1: syntax

The cheapest check: does the string look like a valid email address? A pragmatic regex catches obvious garbage (asdf, a@@b, trailing spaces). It's instant and free, but weak on its own: nobody@asdf.asdf passes syntax and can't receive a single message.
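As a rough illustration, here is a minimal Python sketch of a pragmatic syntax check. The pattern and the name looks_like_email are assumptions made for this example, not a standard: it is deliberately looser than the full RFC grammar and does not handle internationalized domains.

```python
import re

# Deliberately pragmatic, not a full RFC 5322 parser: exactly one "@",
# a non-empty local part, and a dotted domain made of simple labels.
EMAIL_RE = re.compile(
    r"[^@\s]+"                                          # local part: no "@" or whitespace
    r"@"
    r"(?:[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?\.)+"  # one or more labels, each followed by a dot
    r"[A-Za-z]{2,}"                                     # final label (letters only in this sketch)
)


def looks_like_email(value: str) -> bool:
    """Return True if value passes the cheap syntax check."""
    # 254 characters is the commonly cited upper bound for a full address.
    return len(value) <= 254 and EMAIL_RE.fullmatch(value) is not None
```

Because it uses fullmatch, leading or trailing whitespace fails the check. Note that looks_like_email("nobody@asdf.asdf") still returns True, which is exactly the weakness described above.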

Layer 2: domain and MX records

Next, does the domain actually accept mail? Domains that receive email normally publish MX (mail exchanger) records in DNS pointing to their mail servers, and a quick DNS lookup tells you whether any exist. If there is no MX record, sending servers fall back to the domain's A or AAAA record; if that is missing too, the domain can't receive mail, so the address is undeliverable no matter how it's spelled. This single step removes a large class of fakes and dead domains.
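The decision logic can be sketched in a few lines of Python. This is an illustration under assumptions: can_receive_mail and lookup_mx are names invented here, the lookup relies on the third-party dnspython package (Python's standard library has no MX resolver), and the "null MX" rule from RFC 7505 is an extra case not covered in the text above.

```python
from typing import List, Tuple

MxRecord = Tuple[int, str]  # (preference, exchange hostname)


def can_receive_mail(mx_records: List[MxRecord], has_address_record: bool) -> bool:
    """Decide from already-resolved DNS data whether a domain can receive mail."""
    # A single MX whose target is "." is a "null MX" (RFC 7505):
    # the domain explicitly declares that it accepts no mail.
    if len(mx_records) == 1 and mx_records[0][1].rstrip(".") == "":
        return False
    if mx_records:
        return True
    # No MX at all: senders fall back to the domain's A/AAAA record.
    return has_address_record


def lookup_mx(domain: str) -> List[MxRecord]:
    """Fetch MX records with dnspython (an assumption: pip install dnspython)."""
    import dns.resolver  # imported lazily so the pure logic above has no dependency

    try:
        answers = dns.resolver.resolve(domain, "MX")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return []
    # Timeouts and other resolver errors propagate to the caller in this sketch.
    return sorted((r.preference, str(r.exchange)) for r in answers)
```

Keeping the decision separate from the lookup makes the rule easy to test without touching the network.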

Layer 3: SMTP mailbox check

The deepest level connects to the domain's mail server and starts an SMTP conversation as if it were about to send a message, then stops after the recipient command (RCPT TO) to see whether the server accepts that specific mailbox, without actually delivering anything. It's the only layer that can hint that a particular inbox is real, but it comes with real caveats:

It's slow (a live connection per address).
Many servers are "accept-all" and say yes to everything, so the answer is often meaningless.
Lots of providers block or throttle these probes, and many cloud hosts and residential ISPs block outbound port 25 by default, so it's frequently unavailable anyway.

SMTP checks matter most for cleaning old, cold lists, and far less for stopping junk at signup.
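For the curious, the probe described above can be sketched with Python's standard smtplib. This is a hedged illustration, not a production prober: the function names, the HELO name, and the envelope sender are placeholders invented for this example, and the three-way verdict is one possible way to read the reply codes.

```python
import smtplib


def classify_rcpt(code: int) -> str:
    """Map the reply code of RCPT TO to a coarse verdict."""
    if code in (250, 251):
        return "accepted"   # could still be an accept-all server
    if 500 <= code < 600:
        return "rejected"   # e.g. 550; may also mean the probe itself was refused
    return "unknown"        # 4xx: greylisting, throttling, temporary failure


def probe_mailbox(mx_host: str, address: str,
                  helo_name: str = "verifier.example.com",       # placeholder
                  mail_from: str = "probe@verifier.example.com",  # placeholder
                  timeout: float = 10.0) -> str:
    """Ask mx_host whether it would accept mail for address, then disconnect."""
    try:
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
            smtp.ehlo(helo_name)
            code, _ = smtp.mail(mail_from)
            if code != 250:
                return "unknown"  # server refused our sender; nothing learned
            code, _ = smtp.rcpt(address)
            # Leaving the with-block sends QUIT. DATA is never issued,
            # so no message is delivered.
            return classify_rcpt(code)
    except (smtplib.SMTPException, OSError):
        return "unknown"    # blocked port 25, timeout, refused connection, ...
```

Notice how many paths end in "unknown" or in an "accepted" that proves little; that is the unreliability the caveats describe.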

The heuristics layer

Alongside those, useful verification adds signals that aren't strictly about whether a mailbox can receive mail:

Disposable detection: is it a throwaway provider?
Role detection: is it info@ or admin@ rather than a person?
Typo suggestions: "did you mean gmail.com?" for gmial.com.
A deliverability score: one 0–100 number that rolls it all up so you can just threshold on it.
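Here is a minimal Python sketch of how these signals might be computed and rolled up. Everything specific in it is an assumption for illustration: the tiny domain lists, the 0.8 similarity cutoff, and the score weights are made up, and real services use far larger lists and their own scoring.

```python
import difflib
from typing import Optional

# Tiny illustrative lists; real ones hold thousands of entries.
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com"}
ROLE_LOCAL_PARTS = {"info", "admin", "support", "sales", "noreply"}
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "outlook.com", "hotmail.com"]


def heuristics(address: str) -> dict:
    """Compute the non-deliverability signals for an address."""
    local, _, domain = address.lower().rpartition("@")
    suggestion: Optional[str] = None
    if domain not in COMMON_DOMAINS:
        # Fuzzy-match the domain against well-known providers.
        close = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=0.8)
        suggestion = close[0] if close else None
    return {
        "disposable": domain in DISPOSABLE_DOMAINS,
        "role": local in ROLE_LOCAL_PARTS,
        "did_you_mean": suggestion,
    }


def score(syntax_ok: bool, mx_ok: bool, signals: dict) -> int:
    """Roll everything into one 0-100 number (weights are arbitrary)."""
    if not (syntax_ok and mx_ok):
        return 0
    value = 100
    if signals["disposable"]:
        value -= 60
    if signals["role"]:
        value -= 20
    if signals["did_you_mean"]:
        value -= 30
    return max(value, 0)
```

With these lists, heuristics("bob@gmial.com") suggests gmail.com, and a caller can simply reject or flag anything whose score falls below a chosen threshold.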

Which layers do you actually need?

For the most common job, validating emails at signup, syntax + MX + the heuristics catch the overwhelming majority of bad addresses, instantly and cheaply. Live SMTP probing adds cost and unreliability for little extra benefit in that context, and is mainly worth it for bulk cleaning of stale lists.
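One way to wire this up at signup is to run the checks cheapest-first and stop at the first failure. The sketch below is an assumption-laden illustration: first_failure is a name invented here, and the three checks are offline stand-ins (the "mx" one just consults a hard-coded set) that you would replace with real implementations.

```python
from typing import Callable, List, Optional, Tuple

Check = Tuple[str, Callable[[str], bool]]


def first_failure(address: str, checks: List[Check]) -> Optional[str]:
    """Run checks in order and return the name of the first one that fails."""
    for name, check in checks:
        if not check(address):
            return name  # later, costlier checks are skipped
    return None


# Stand-in checks so the sketch runs offline; swap in real ones.
signup_checks: List[Check] = [
    ("syntax", lambda a: a.count("@") == 1 and " " not in a),
    ("mx", lambda a: a.rpartition("@")[2] in {"example.com", "gmail.com"}),
    ("disposable", lambda a: a.rpartition("@")[2] != "mailinator.com"),
]
```

Ordering by cost means obvious garbage never triggers a DNS query, and a slower check such as an SMTP probe could be appended to the list only for jobs that warrant it.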

So before reaching for the heaviest, slowest option, ask what you're actually trying to prevent. For most signup forms, the first two layers plus a few heuristics are all you need.

Disclosure: I build an email verification API, which is why I spend time thinking about this. The post is vendor-neutral; the layers apply to any tool you choose.