"Valid email format" isn't enough.
hello@gmail.com and hello@gmial.com both parse as RFC-5322 valid. One reaches a real inbox. The other bounces, hurts your sender reputation, and might land you on a blocklist.
Real email-deliverability work is 20+ checks, not 1. Most "email verifier" services are black boxes that return valid or invalid, and you just have to trust them. I wanted something I could reason about end-to-end and that an AI agent could orchestrate piece by piece.
This is what's actually inside the pipeline I shipped (@deliveriq/mcp — open source, MIT, on npm), broken down stage by stage. If you've ever wondered why a single email check takes ~3 seconds and why each piece matters, this is the long version.
The 5 stages
The verification flow runs 5 stages with 21 checks total, in priority order. Checks within a stage run in parallel where possible. A failure at an early stage short-circuits the rest (you don't run SMTP if the syntax is broken).
Stage 1 — Email format validation (2 checks)
1.1 Syntax validation (RFC 5322). Local-part length, character set, proper @ placement, domain format. Catches things like @@domain.com or 80-character local parts before any network calls.
1.2 Typo detection (Sift3 fuzzy match). Compares the domain against a database of 1,200+ known misspellings. gmial.com → gmail.com. outlokk.com → outlook.com. Sift3 is a string-distance algorithm that's faster than Levenshtein for short strings — important when you're running this on every check.
Total cost so far: zero network calls. If syntax is invalid, the pipeline returns immediately.
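The typo check can be sketched with a Sift3-style distance. This is an illustrative simple variant of the algorithm (the pipeline's exact implementation isn't shown here), but it captures why it's cheap: a single cursor pass with a bounded look-ahead window instead of Levenshtein's full O(n×m) matrix.

```python
def sift3(s1: str, s2: str, max_offset: int = 5) -> float:
    """Approximate edit distance; cheaper than Levenshtein on short strings."""
    if not s1:
        return float(len(s2))
    if not s2:
        return float(len(s1))
    c = 0          # shared cursor into both strings
    offset1 = 0    # extra offset into s1 after a resync
    offset2 = 0    # extra offset into s2 after a resync
    lcs = 0        # count of matched characters
    while (c + offset1 < len(s1)) and (c + offset2 < len(s2)):
        if s1[c + offset1] == s2[c + offset2]:
            lcs += 1
        else:
            # On mismatch, scan ahead up to max_offset to resynchronize.
            offset1 = offset2 = 0
            for i in range(max_offset):
                if c + i < len(s1) and s1[c + i] == s2[c]:
                    offset1 = i
                    break
                if c + i < len(s2) and s1[c] == s2[c + i]:
                    offset2 = i
                    break
        c += 1
    return (len(s1) + len(s2)) / 2 - lcs
```

A low distance against a popular domain (gmial.com vs gmail.com) triggers the "did you mean" suggestion; a high distance means it's just a different domain.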
Stage 2 — Domain & provider checks (5 checks)
2.1 Disposable domain detection. A continuously maintained list of 164,000+ throwaway domains (Mailinator, Guerrilla Mail, 10minutemail, and 164k friends). Disposable addresses self-destruct, so they're flagged high-risk by default.
2.2 Role-based detection. 130+ role prefixes — admin@, info@, support@, noreply@, postmaster@, webmaster@, etc. These typically have lower engagement and higher bounce rates than personal addresses.
2.3 Free-provider check. 200+ consumer email domains worldwide. Useful when the use case requires corporate addresses (B2B prospecting, etc.). Not a hard fail — just a signal.
2.4 Alias normalization. Resolves +tag and dot-aliases to canonical form. user+newsletter@gmail.com → user@gmail.com. u.s.e.r@gmail.com → user@gmail.com. This is how you collapse duplicates that look different.
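The normalization rule is provider-specific. A minimal sketch, assuming Gmail semantics (dots ignored, everything after + is a tag); a real implementation keys these rules off the detected provider:

```python
def normalize_alias(email: str) -> str:
    """Collapse +tag and dot aliases to a canonical address (Gmail-style rules)."""
    local, _, domain = email.lower().partition("@")
    local = local.split("+", 1)[0]        # strip the +tag suffix
    if domain in {"gmail.com", "googlemail.com"}:
        local = local.replace(".", "")    # Gmail ignores dots in the local part
    return f"{local}@{domain}"
```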
2.5 Email-pattern analysis. Scores the local part for entropy, keyboard walks (qwerty12@), and bot-style sequences. Distinguishes human-typed from auto-generated addresses.
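Two of those pattern signals are easy to make concrete: Shannon entropy of the local part (random bot-generated strings score high) and keyboard-walk detection. A simplified sketch, with illustrative thresholds:

```python
import math

def local_part_entropy(local: str) -> float:
    """Shannon entropy in bits per character of the local part."""
    if not local:
        return 0.0
    counts = {}
    for ch in local:
        counts[ch] = counts.get(ch, 0) + 1
    n = len(local)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

KEYBOARD_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm", "1234567890"]

def has_keyboard_walk(local: str, min_run: int = 4) -> bool:
    """True if the local part contains a run of adjacent keys ('qwer', '1234', ...)."""
    s = local.lower()
    return any(
        row[i : i + min_run] in s
        for row in KEYBOARD_ROWS
        for i in range(len(row) - min_run + 1)
    )
```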
Stage 3 — Mailbox verification (3 checks)
This is where it gets interesting. The previous 7 checks are local — Stage 3 actually talks to the mail server.
3.1 MX record resolution. DNS MX lookup with A-record fallback. Highest-priority MX is used for SMTP. Capped at the top 2 servers to prevent timeout stacking when a domain has 8 MX records.
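The DNS query itself needs a resolver library (dnspython or similar), but the selection step is pure logic: MX records come back as (priority, host) pairs where a lower priority value means more preferred, and we cap the candidates. A sketch of just that step:

```python
def pick_mx_targets(records, limit=2):
    """Given (priority, host) MX records, return up to `limit` hosts to probe,
    most-preferred first (lowest priority value wins in MX semantics)."""
    return [host for _, host in sorted(records)[:limit]]
```

If the domain has no MX records at all, the A record is used as the implicit mail host, per RFC 5321's fallback rule.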
3.2 ISP identification. MX-pattern matching against known ISPs (Gmail, Outlook, Yahoo, Apple iCloud, etc.). This matters because some major providers aggressively block SMTP probes — for those, the pipeline skips Stage 3.3 and uses heuristic scoring instead.
3.3 SMTP handshake verification. This is the real one. The pipeline opens a TCP connection on port 25 and runs the conversation:
> EHLO verifier.example.com
> MAIL FROM: <neutral@verifier.example.com>
> RCPT TO: <hello@target.com> # the email we're checking
< 250 OK # accepted = exists
> RCPT TO: <random-1234567@target.com> # catch-all probe
< 250 OK # also accepted = catch-all server
> QUIT
A second RCPT TO with a random address detects catch-all servers (where the server accepts every address regardless). 5-second timeout per connection. No email is ever sent — MAIL FROM and RCPT TO are setup commands; we close before DATA.
This is the slowest part of the pipeline. ~1–2 seconds typical, sometimes more if the receiving server greylists.
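The conversation above maps almost directly onto Python's smtplib. A sketch of the probe, not the pipeline's actual code; the `client` parameter is an assumption I added so the logic can be exercised without a live mail server:

```python
import smtplib
from secrets import token_hex

def smtp_probe(email, mx_host, helo="verifier.example.com",
               mail_from="neutral@verifier.example.com", client=None):
    """Run the RCPT TO probe; returns (deliverable, catch_all).

    No message is ever sent: we QUIT before DATA.
    """
    domain = email.split("@", 1)[1]
    smtp = client or smtplib.SMTP(mx_host, 25, timeout=5)  # 5s cap per connection
    try:
        smtp.ehlo(helo)
        smtp.mail(mail_from)
        code, _ = smtp.rcpt(f"<{email}>")
        deliverable = code == 250
        # Second probe with a random local part: if this is also accepted,
        # the server is catch-all and the first 250 proves nothing.
        rand_code, _ = smtp.rcpt(f"<{token_hex(8)}@{domain}>")
        catch_all = rand_code == 250
    finally:
        smtp.quit()
    return deliverable, catch_all
```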
Stage 4 — Reputation & intelligence (7 checks)
4.1 Gravatar lookup. Whether the address has a registered Gravatar profile picture. Positive trust signal — someone set up a public identity with this address.
4.2 DNSBL blacklist check. Queries the domain across 50 categorized DNSBL zones: Spamhaus (SBL/XBL/PBL), SpamCop, SURBL, URIBL, SORBS, Barracuda BRBL, UCEProtect, DroneBL, and many more. Includes domain-based lists and IP-based lists. A single listing isn't a death sentence; multiple listings are.
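Mechanically, a DNSBL check is just a DNS A lookup against a constructed name: IPs are queried with their octets reversed under the zone, domain lists take the domain as-is, and a listing is indicated when the name resolves (typically to a 127.0.0.x code). The name construction is pure and easy to sketch:

```python
def dnsbl_query_name(target: str, zone: str) -> str:
    """Build the DNS name to query against a DNSBL zone.

    IPv4 addresses use the standard reversed-octet convention;
    domains are appended to the zone directly.
    """
    parts = target.split(".")
    if len(parts) == 4 and all(p.isdigit() for p in parts):
        return ".".join(reversed(parts)) + "." + zone
    return f"{target}.{zone}"
```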
4.3 Domain age via RDAP. Registration date, expiration, registrar, DNSSEC status, nameservers. Domains under 30 days are high-risk; under a year, elevated. Pulled from the registry, not WHOIS — RDAP is the modern replacement and returns structured JSON.
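The age thresholds translate to a trivial bucketing function. A sketch using the cutoffs stated above (the bucket names are illustrative):

```python
from datetime import date

def domain_age_risk(registered: date, today: date) -> str:
    """Bucket domain age: <30 days is high-risk, <1 year elevated."""
    age_days = (today - registered).days
    if age_days < 30:
        return "high"
    if age_days < 365:
        return "elevated"
    return "normal"
```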
4.4 HIBP breach check. Have I Been Pwned database lookup. Compromised addresses are more likely to be abandoned or used by spammers leveraging credential dumps.
4.5 DKIM record check. Probes 15 common DKIM selectors (google, default, selector1, s1, mail, k1, etc.) in parallel. Returns key type (RSA / Ed25519) and estimated key size. DKIM presence indicates the domain signs outbound email — a strong legitimacy signal.
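Per RFC 6376, a DKIM public key lives in a TXT record at `<selector>._domainkey.<domain>`, so probing selectors is just querying a list of constructed names. A sketch (the selector list here is a subset for illustration, not the full 15):

```python
COMMON_SELECTORS = ["google", "default", "selector1", "selector2", "s1", "s2", "mail", "k1"]

def dkim_record_names(domain: str, selectors=COMMON_SELECTORS):
    """DNS names to query (as TXT) for each candidate DKIM selector."""
    return [f"{s}._domainkey.{domain}" for s in selectors]
```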
4.6 Infrastructure analysis. Evaluates 6 email-authentication standards in one composite score:
- SPF record presence + syntax
- DKIM key availability
- DMARC policy strength (none/quarantine/reject)
- MTA-STS for transport security
- BIMI for verified-sender brand indicators
- TLS-RPT for TLS reporting
Domains with the full stack score highest.
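The shape of that composite might look like the sketch below. The weights here are my own illustrative guesses, not the pipeline's actual values; the one structural point from the text is that DMARC is graded by policy strength while the others are presence checks:

```python
def infrastructure_score(spf, dkim, dmarc_policy, mta_sts, bimi, tls_rpt):
    """Toy 0-100 composite over the six standards (weights are assumptions)."""
    dmarc_points = {"reject": 25, "quarantine": 18, "none": 8}.get(dmarc_policy, 0)
    score = (25 if spf else 0) + (25 if dkim else 0) + dmarc_points
    score += (10 if mta_sts else 0) + (10 if bimi else 0) + (5 if tls_rpt else 0)
    return score
```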
4.7 MX server reputation. Reverse-DNS (PTR) and IP-DNSBL on the MX server itself. Servers without PTR records or with blacklisted IPs indicate lower-quality infrastructure.
Stage 5 — Scoring & classification (4 checks)
The last stage rolls everything up.
5.1 Spam-trap heuristic scoring. 12 weighted signals:
- Domain age (younger = more suspicious)
- Role-based address
- No Gravatar
- No SMTP deliverability
- Disposable domain
- DNSBL listing
- Catch-all server
- Email-pattern entropy
- MX server reputation
- Local-part entropy
- Email-pattern type
- MX IP blacklist status
Plus classification of the trap type: Pristine (ISP-created, never used by a real person), Recycled (formerly active, repurposed after abandonment), or Typo (misspelling of a legitimate domain). Each comes with a confidence score.
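A weighted-signal model like this reduces to a dot product of signal values against their weights. The weights below are illustrative placeholders (the real ones aren't published); each signal is normalized to [0, 1] before weighting:

```python
# Illustrative weights only; they sum to 1.0 so the result is a probability.
SPAM_TRAP_WEIGHTS = {
    "young_domain": 0.15, "role_based": 0.05, "no_gravatar": 0.05,
    "smtp_undeliverable": 0.15, "disposable": 0.15, "dnsbl_listed": 0.10,
    "catch_all": 0.10, "pattern_entropy": 0.05, "mx_reputation": 0.05,
    "local_entropy": 0.05, "pattern_type": 0.05, "mx_ip_blacklisted": 0.05,
}

def spam_trap_probability(signals):
    """Weighted sum of 0-1 signal values; returns a probability in [0, 1]."""
    return sum(SPAM_TRAP_WEIGHTS[name] * value for name, value in signals.items())
```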
5.2 Domain trust score. A 0–100 composite: age (25 pts) + infrastructure (25 pts) + reputation (25 pts) + trust signals like DNSSEC and registrar lock (25 pts). Maps to five trust levels: Trusted / Positive / Neutral / Suspicious / Malicious.
5.3 Deliverability score. All preceding signals roll into a single 0–100 score weighing SMTP deliverability, infrastructure quality, spam-trap probability, and address characteristics. The model gets one actionable number instead of a dump of 21 booleans.
5.4 Reachability classification. Score maps to Safe (80–100, high confidence inbox) / Risky (40–79, may bounce or land in spam) / Invalid (1–39, almost certain bounce) / Unknown (0, couldn't be determined due to greylisting / catch-all / timeout).
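The final mapping is a straight bucketing of the score using the ranges above:

```python
def classify_reachability(score):
    """Map the 0-100 deliverability score onto the four reachability buckets."""
    if score >= 80:
        return "Safe"
    if score >= 40:
        return "Risky"
    if score >= 1:
        return "Invalid"
    return "Unknown"  # 0: greylisting / catch-all / timeout prevented a verdict
```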
Why MCP, not "another dashboard"?
Email deliverability is one of those problems where the right answer needs 6–8 API calls in sequence, not one call.
You ask "is this list safe to send to?" and the right answer involves: verify each email, then for each invalid → check if the domain is disposable, for each catch-all → check sender reputation, for each unknown → DNSBL the MX, then compute aggregate spam-trap probability for the whole list, then the recommendation.
A dashboard makes you click through that. The model orchestrates it as tool calls — and explains the verdict in plain English at the end.
That's why I exposed it as MCP rather than just a REST API + dashboard. The API is also there, but the MCP layer is what makes it agent-ready.
The 12 MCP tools map to the pipeline:
- deliveriq_verify_email — runs all 5 stages on a single address
- deliveriq_batch_verify — same, async, up to 100K addresses per job
- deliveriq_blacklist_check — Stage 4.2 only
- deliveriq_infrastructure_check — Stage 4.6 only
- deliveriq_spam_trap_analysis — Stage 5.1 only
- deliveriq_domain_intel — Stages 4 + 5.2 (composite)
- deliveriq_find_email — pattern-based discovery
- deliveriq_org_intel — cached organization patterns
- plus batch status, batch download, list jobs, and check credits
When Claude is asked "audit this list before we send", it composes calls across these tools rather than stuffing 21 questions into a single prompt.
Install
# Add to ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "deliveriq": {
      "command": "npx",
      "args": ["-y", "@deliveriq/mcp"],
      "env": { "DELIVERIQ_API_KEY": "lc_your_key" }
    }
  }
}
Free tier with no credit card at https://min8t.com/deliveriq.
- npm: https://www.npmjs.com/package/@deliveriq/mcp
- Repo: https://github.com/Davison-Francis/min8t-sdks
- Glama: https://glama.ai/mcp/servers/Davison-Francis/min8t-sdks
Closing thoughts
Two things I learned building this:
SMTP catch-all servers are why "valid email" is unprovable. A server that returns 250 OK to every RCPT TO is correct from a protocol standpoint — RFC 5321 doesn't require accuracy. So you can never be 100% certain hello@catch-all-domain.com reaches a real person. The best you can do is detect the catch-all behavior and weight it accordingly. Stage 5's "Risky" category is mostly catch-all results.
MCP is genuinely better than a dashboard for this workflow. Composing 6–8 tools in sequence to debug "why isn't this email getting through?" is the kind of agent-task that's incredibly tedious in a dashboard but feels natural when an LLM is driving it.
What MCP tools are you missing for email or deliverability work? Curious what others would add to a pipeline like this.