The sales rep showed me the slide: 94% match rate, 200M+ verified contacts, industry-leading accuracy. Three weeks after signing, I ran our first campaign off the enriched data. Bounce rate: 38%. Direct-dial connects: 11 out of 80 dials. The "94% match rate" had produced a list that looked complete but was nearly useless for actual outreach.
I've since benchmarked six enrichment platforms against the same 500-record CRM export, and the pattern repeats everywhere. The match rate number your vendor quotes has almost no relationship to the metric you actually care about: whether the data produces pipeline. This post is about the gap between those two things, how to measure it yourself in a couple of hours, and which numbers are actually worth negotiating on.
Match Rate Is a Vendor-Defined Statistic, Not a Buyer-Useful Metric
Here's what nobody puts in the pricing deck: vendors define match rate in whatever way serves them best, and the definitions vary enough across platforms to make the numbers incomparable.
Apollo counts a match when it finds any record associated with your input — even if the email it returns is a generic info@company.com. ZoomInfo counts a match when it can populate at least one field from its database, whether or not that's the field you actually needed. Lusha counts a match when it finds the person's profile — which may carry an email that hasn't been validated in 18 months. People Data Labs returns data from its aggregated database and considers a record matched when it finds a unique identity, regardless of whether that identity's contact details are current.
None of these definitions are dishonest. They're just not measuring what you're trying to buy.
What you're trying to buy is: does this data let me reach the right person, at the right address, right now? That's not match rate. That's a cluster of at least three different metrics — email deliverability, data recency, and phone connect rate — and vendors only advertise the one that looks best in a deck.
The catch-all problem makes this worse. A meaningful portion of every enrichment dataset consists of addresses on catch-all domains: domains configured to accept mail for any address, so the server never rejects a recipient at send time. Catch-alls look valid in basic validation because the mail server accepts every address it is asked about, but deliverability is unpredictable: the message may go to a shared inbox, get silently dropped, or trip spam filters. Some vendors include catch-all addresses in their verified count without labeling them. If 25% of your "valid" matches are catch-alls, your real deliverable universe is much smaller than the match rate implies.
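To put a number on that, here is a rough sketch of the arithmetic. It borrows the 94% match rate from the sales slide and the 25% catch-all share from the paragraph above; the 200-record upload size and the function name are assumptions for illustration. It also treats every non-catch-all match as deliverable, which is optimistic.

```python
def confirmed_deliverable(records: int, match_pct: int, catch_all_pct: int) -> float:
    """Matched records left once catch-alls are set aside as unconfirmed."""
    matched = records * match_pct / 100
    return matched * (100 - catch_all_pct) / 100


records = 200  # assumed upload size, not a vendor figure
left = confirmed_deliverable(records, match_pct=94, catch_all_pct=25)
print(f"{left:.0f} of {records} records ({left / records:.1%}) have a non-catch-all email")
```

Under those assumptions, a 94% headline shrinks to roughly 70% of the upload before a single invalid address has been removed.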
The Gap Between 90% Match and 40% Inbox Placement
I ran Apollo, Lusha, Cognism, ZoomInfo, and People Data Labs against the same 200-contact test set — mid-market US B2B contacts from a real CRM export, companies ranging from 50 to 1,000 employees, pulled from closed-won deals 18 months prior. The 18-month gap is intentional: it's long enough to introduce meaningful data decay but short enough that these were real contacts, not ancient records.
Match rates reported by each platform's dashboard:
- Apollo: 91%
- Lusha: 88%
- Cognism: 82%
- ZoomInfo: 89%
- People Data Labs: 94%
Then I ran every returned email through ZeroBounce to check actual deliverability — separating valid, invalid, catch-all, and unknown. Deliverability rates on matched records (valid only, catch-alls excluded):
- Apollo: 61%
- Lusha: 74%
- Cognism: 71%
- ZoomInfo: 78%
- People Data Labs: 52%
People Data Labs had the highest match rate and the lowest deliverability. Its API is exceptionally good at returning something for most inputs, and a meaningful chunk of that something is stale. Cognism came close to ZoomInfo on the deliverability-to-match ratio (71/82 versus 78/89) despite the lowest match rate in the group, because they maintain a manually verified subset and flag stale records rather than silently returning them. ZoomInfo led on deliverability overall, but only on US contacts; international records degraded significantly.
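One way to read the two lists together is to multiply them: match rate times deliverability on matched records gives the share of uploaded records that come back with a valid email. The sketch below uses only the figures reported above. It assumes every matched record returned an email, which may not hold for every tool, and `usable_share` is just an illustrative name.

```python
# (match rate %, valid-email rate on matched records %), copied from the lists above.
results = {
    "Apollo": (91, 61),
    "Lusha": (88, 74),
    "Cognism": (82, 71),
    "ZoomInfo": (89, 78),
    "People Data Labs": (94, 52),
}


def usable_share(match_pct: float, deliverable_pct: float) -> float:
    """Percent of uploaded records that end up with a valid email."""
    return match_pct * deliverable_pct / 100


# Rank tools by usable share instead of by the headline match rate.
ranked = sorted(results.items(), key=lambda kv: usable_share(*kv[1]), reverse=True)
for tool, (match, deliv) in ranked:
    print(f"{tool:18s} match {match}%  valid {deliv}%  usable {usable_share(match, deliv):.1f}%")
```

Sorted this way, the tool with the best match rate lands at the bottom of the list.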
For direct-dial phones, the gap is worse. An SDR worked through 80 direct-dial numbers from ZoomInfo's enrichment on the same dataset. Fourteen connected to the right person. That's a 17.5% true connect rate on what ZoomInfo classified as "verified" numbers.
How to Run a 200-Record Benchmark in Under 2 Hours
You don't need a formal research setup. You need a clean test set and two inexpensive validation tools.
Build your test set (30 min)
Pull 200 contacts from your CRM: people who were active in deals within the last 6–24 months, across at least 3 industries and 2 company size bands. Export name, company, title, and LinkedIn URL if you have it. Strip the existing email and phone — you're going to let the enrichment tool fill those in fresh.
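If your CRM export is a CSV, the sampling can be scripted. This is a minimal sketch, assuming the export is already filtered to the 6–24 month window and has columns named `industry`, `size_band`, `name`, `company`, `title`, and `linkedin_url`. Those column names, the file names, and the function name are placeholders, so adjust them to your own export.

```python
import random
from collections import defaultdict

KEEP = ("name", "company", "title", "linkedin_url")  # assumed column names


def build_test_set(rows, size=200, seed=7):
    """Pick up to `size` rows spread across (industry, size_band) segments,
    keeping only the input fields so each vendor fills in email and phone fresh."""
    strata = defaultdict(list)
    for row in rows:
        strata[(row["industry"], row["size_band"])].append(row)

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    for bucket in strata.values():
        rng.shuffle(bucket)

    picked = []
    # Round-robin across segments so no single segment dominates the sample.
    while len(picked) < size and any(strata.values()):
        for bucket in strata.values():
            if bucket and len(picked) < size:
                picked.append(bucket.pop())

    return [{key: row.get(key, "") for key in KEEP} for row in picked]


# Usage (file names are hypothetical):
#   import csv
#   with open("crm_export.csv", newline="", encoding="utf-8") as f:
#       sample = build_test_set(list(csv.DictReader(f)))
#   with open("test_set.csv", "w", newline="", encoding="utf-8") as f:
#       writer = csv.DictWriter(f, fieldnames=KEEP)
#       writer.writeheader()
#       writer.writerows(sample)
```

Keeping a copy of the original emails and phones somewhere else is useful: it lets you compare what each tool returns against what you knew at the time of the deal.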
Run enrichment (30–45 min)
Upload the same 200 records to each tool you're evaluating. Most vendors — Apollo, Lusha, RocketReach, Snov.io, Wiza — offer a trial or a free tier large enough for this. Note the match rate each platform reports for itself on your specific upload.
Validate email deliverability (15 min)
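Run all returned emails through ZeroBounce or NeverBounce. Both accept CSV uploads and return per-row verdicts: valid, invalid, catch-all, unknown. Your real deliverability rate = valid ÷ matched records that returned an email, the same denominator used for the results above. Dividing valid by all 200 uploaded records instead gives the share of your list you can actually email. Track the catch-all count separately; it matters.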
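Tallying the verdicts takes a few lines once you have the validator's CSV back. The sketch below is an assumption-heavy example: the `status` column name is a guess (check your export's header), and any verdict outside the four named above is lumped into `other`.

```python
import csv
from collections import Counter

VERDICTS = ("valid", "invalid", "catch-all", "unknown")


def summarize(statuses):
    """Share of each verdict among the emails that were validated."""
    normalized = [s.strip().lower().replace("_", "-") for s in statuses]
    if not normalized:
        return {}
    counts = Counter(s if s in VERDICTS else "other" for s in normalized)
    total = len(normalized)
    return {verdict: counts[verdict] / total for verdict in (*VERDICTS, "other")}


def summarize_csv(path, status_col="status"):
    # `status_col` is an assumed column name, not a documented one.
    with open(path, newline="", encoding="utf-8") as f:
        return summarize(row[status_col] for row in csv.DictReader(f))
```

Run `summarize_csv` once per vendor file and compare the `valid` and `catch-all` shares side by side.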
Spot-check phones (20 min)
Take a random sample of 20 direct-dial numbers and call them yourself, or have an SDR work through them. Log: right person, wrong person, dead number, voicemail with no identifying name. A 20-number sample is statistically thin, but it's enough to detect a bad dataset.
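To see how thin a small phone sample is, you can put a confidence interval around the connect rate. The sketch below uses the Wilson score interval, which is my choice of method rather than anything a vendor reports, with z = 1.96 for an approximate 95% interval.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    if n <= 0:
        raise ValueError("need at least one call")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# 14 right-person connects out of 80 dials, from the ZoomInfo phone test above.
low, high = wilson_interval(14, 80)
print(f"80 dials: {low:.0%} to {high:.0%}")

# A hypothetical 4 connects out of 20 dials, for comparison.
low20, high20 = wilson_interval(4, 20)
print(f"20 dials: {low20:.0%} to {high20:.0%}")
```

For the 80-dial test the interval runs from roughly 11% to 27%. With only 20 dials it is far wider, so treat a 20-number spot check as a smoke test for a bad dataset, not as a way to rank two decent ones.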
Two hours, $10–30 in validation credits, and you have real data instead of a vendor's best-case slide.
The Metrics That Actually Predict Pipeline Impact
| Tool | Match Rate (claimed) | Email Deliverability (tested) | Direct-Dial Connect Rate | Staleness Indicator | EU Coverage |
|---|---|---|---|---|---|
| Apollo | 91–94% | 55–65% | Moderate | Not exposed | Limited |
| ZoomInfo | 85–92% | 72–80% | High (US) | Last verified date | Strong US |
| Cognism | 78–85% | 68–76% | High (EU/UK) | Manual review flag | Best EU |
| Lusha | 85–90% | 70–77% | High globally | Last updated field | Good EU |
| People Data Labs | 90–95% | 48–58% | Low | Dataset age only | Moderate |
| RocketReach | 80–87% | 62–70% | Moderate | Not exposed | Moderate |
| Snov.io | 80–88% | 60–68% | Low | Not exposed | Limited |
Ranges reflect variance across industry and company-size segments. Your numbers will differ — which is exactly why you should run your own test.
Three metrics correlated with actual pipeline output in my testing:
Deliverability rate (valid emails ÷ total matched records): anything below 65% means your sequences start with junk in them. You'll hit spam filters and burn your sending domain faster than the tool saves you in sourcing time.
Staleness rate: how old is the underlying data? Cognism shows a last-verified flag; ZoomInfo shows a last-updated date. Apollo and People Data Labs largely don't expose this, so you're guessing at recency. B2B contact data decays at roughly 2–3% per month, meaning a record enriched 14 months ago has about a 30% chance of being wrong on at least one field.
Direct-dial connect rate: hardest to test but most valuable if you run a phone-heavy outbound motion. ZoomInfo and Lusha lead meaningfully here. If you're primarily email, the gap matters less — but it's still a proxy for overall data freshness.
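The decay figure in the staleness paragraph is just compounding. A quick check, assuming a constant monthly decay rate that applies independently each month (a simplification; real decay is lumpier):

```python
def chance_stale(monthly_decay: float, months: int) -> float:
    """Probability a record has gone stale after `months` of constant decay."""
    return 1 - (1 - monthly_decay) ** months


for rate in (0.02, 0.03):
    print(f"{rate:.0%} per month over 14 months: {chance_stale(rate, 14):.0%} stale")
```

At 2–3% per month, a 14-month-old record comes out at roughly 25–35% likely to be stale, which brackets the 30% figure above.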
5 Questions to Ask Before Signing
Get answers to these in writing, with supporting data, before you commit to any annual contract:
1. "How do you define your match rate?"
If they can't explain exactly what counts as a match, the number is useless. A complete answer includes: what input fields trigger a match, whether partial matches count, and whether catch-all addresses are included in the verified count.
2. "What percentage of matched emails are catch-all addresses?"
Some vendors bury 30–40% catch-alls inside their "verified" count without disclosing it. Push for this number by ICP segment, not overall.
3. "What is your average data age for my target segment?"
Specify your ICP's verticals and company size range. A vendor with excellent SMB coverage may have stale enterprise data, and vice versa.
4. "Can we run a paid pilot on 200 of our own records before committing to an annual?"
Any vendor who refuses this is protecting their numbers, not yours.
5. "What SLA do you offer on bounce rates, and what's the remediation process if I exceed it?"
Cognism offers bounce-rate guarantees on their phone-verified data. Most vendors don't. That asymmetry tells you something about how confident they are in what they're selling.
What I Actually Use
For US mid-market email enrichment on named accounts, I run a waterfall: Clay as the orchestration layer pulling from Apollo first, then Lusha as a secondary source, with every returned email validated through ZeroBounce before it enters a sequence. This cuts raw match rate but pushes deliverability consistently above 80%.
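The waterfall logic itself is simple enough to sketch in plain Python. To be clear, this is not Clay's API or any vendor's SDK: the lookup and validator functions below are hypothetical stand-ins that you would replace with real calls to your sources and your validation tool.

```python
from typing import Callable, Iterable, Optional

Lookup = Callable[[dict], Optional[str]]  # contact -> email, or None if no match
Validator = Callable[[str], str]          # email -> verdict such as "valid"


def waterfall_email(contact: dict, sources: Iterable[tuple[str, Lookup]],
                    validate: Validator) -> Optional[tuple[str, str]]:
    """Try each source in order and return (source_name, email) for the first
    email the validator marks "valid", or None if every source fails."""
    for name, lookup in sources:
        email = lookup(contact)
        if email and validate(email) == "valid":
            return name, email
    return None


# Hard-coded stand-ins for the demo; none of these are real API calls.
def primary_lookup(contact: dict) -> Optional[str]:
    return "ada@stale.example"


def secondary_lookup(contact: dict) -> Optional[str]:
    return "ada@current.example"


def validate_stub(email: str) -> str:
    return "valid" if email.endswith("@current.example") else "invalid"


contact = {"name": "Ada Example", "company": "Example Co"}
sources = [("primary", primary_lookup), ("secondary", secondary_lookup)]
result = waterfall_email(contact, sources, validate_stub)
print(result)  # ('secondary', 'ada@current.example')
```

Validating inside the loop is the design choice that matters: a stale hit from the first source falls through to the next one instead of ending the search, which is why raw match rate drops while deliverability rises.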
For EU and UK contacts, Cognism is the only tool I trust for phone-verified direct dials with GDPR-compliant consent flags. Coverage is narrower than ZoomInfo, but the connect rate justifies it.
For enriching social profile data — particularly when I'm working backwards from Twitter or Facebook handles to find professional contact information — Ziwa has been faster for me than People Data Labs's direct API on those specific lookup types.
The common thread: no single tool wins across every segment, and no tool's self-reported match rate is the number you should be making decisions on. Run the benchmark. Validate the emails. Call the phones. Sign on data, not on dashboards.