DEV Community

Zackrag

B2B Contact Data Job Title Accuracy Lag: What Apollo, Lusha, and ZoomInfo Actually Show You

I ran 2,400 records through Apollo, Lusha, and a ZoomInfo export over a 90-day period last year, cross-referencing each enriched job title against the person's actual LinkedIn profile at the time of outreach. The mismatch rate was 34%. Not typos or formatting differences — actual wrong titles. VP of Sales who was now Chief Revenue Officer. Director of Engineering who had moved to a competing firm four months prior. "Head of Growth" who had been laid off and was actively job hunting.

The platforms weren't lying. They were just showing me a photograph of a person taken somewhere between six and eighteen months ago and calling it current.

What "Verified" Actually Means in These Platforms' Documentation

This is where the forensics get interesting. Each platform uses the word "verified" differently, and the gap between how it sounds and what it means mechanically is significant.

Apollo re-crawls LinkedIn and other public sources on a rolling basis, but their own documentation acknowledges that individual contact records may not be refreshed for 90–180 days, depending on how frequently the profile appears in their crawl priority queue. High-traffic profiles (senior executives at large companies) get refreshed more often. A mid-level manager at a 40-person SaaS company might sit untouched for six months or more. "Verified email" in Apollo means their system confirmed that the email's domain has valid MX records and that the address passed a ping check at some point. It does not mean the title is current.

Lusha sources data from a combination of their browser extension network (users who have the extension installed contribute anonymized profile views) and public web crawls. This community-sourcing model means freshness is inversely correlated with obscurity. If nobody with the Lusha extension visited that person's LinkedIn profile recently, the record hasn't been updated. Their "verified" badge on contact data refers primarily to email deliverability, not title currency.

ZoomInfo is the most sophisticated of the three in terms of crawl infrastructure, and they're transparent that their data comes from a combination of crawled public sources, contributed data from their SalesOS users, and third-party data partnerships. Their re-crawl cycle for LinkedIn-sourced title data reportedly runs every 60–90 days for active records, but "active" is defined by their internal scoring, not by whether anything actually changed on the profile. In practice, when I tested 800 ZoomInfo records against live LinkedIn data, 28% of titles were mismatched. That is nine to ten points better than Apollo (37%) and Lusha (38%) in my sample, but it still means more than one title in four was wrong.
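As a quick sanity check on how the per-provider rates relate to the 34% headline figure, here is a minimal sketch. It assumes the 2,400 records were split evenly at 800 per provider; only the ZoomInfo count is stated, so the Apollo and Lusha sample sizes are an assumption.

```javascript
// Mismatch rates reported in this post. The 800-record sample size for
// Apollo and Lusha is an assumption (only the ZoomInfo count is stated).
const samples = [
  { provider: "Apollo", records: 800, mismatchRate: 0.37 },
  { provider: "Lusha", records: 800, mismatchRate: 0.38 },
  { provider: "ZoomInfo", records: 800, mismatchRate: 0.28 },
];

// Weight each provider's rate by its sample size to get the pooled rate.
function pooledMismatchRate(rows) {
  const totalRecords = rows.reduce((sum, r) => sum + r.records, 0);
  const totalMismatched = rows.reduce(
    (sum, r) => sum + r.records * r.mismatchRate,
    0
  );
  return totalMismatched / totalRecords;
}

const pooled = pooledMismatchRate(samples);
console.log(`${(pooled * 100).toFixed(1)}% across ${samples.length} providers`);
// prints "34.3% across 3 providers"
```

Under that even-split assumption, the three per-provider rates pool to roughly the 34% overall mismatch rate quoted at the top.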

None of this is a flaw unique to these companies. LinkedIn throttles external crawlers aggressively, and the rate limits mean no enrichment provider can maintain truly real-time parity with LinkedIn's own data. PDL (People Data Labs) is honest about this — their documentation states clearly that their data represents a historical snapshot and should be treated accordingly. RocketReach and Snov.io have similar constraints and similar lag.

The Mechanics of the Lag — and Who Gets Hurt Most

The 6–18 month range I cited isn't uniform across all contact types, and the best-covered segment falls below it. Here's what I observed when I segmented by seniority and company size:

| Contact Type | Observed Title Lag | Notes |
| --- | --- | --- |
| C-suite, enterprise (1000+ employees) | 3–6 months | High crawl priority, frequent LinkedIn activity |
| VP/Director, mid-market (100–999 employees) | 6–10 months | Mixed crawl frequency |
| Manager/IC, SMB (<100 employees) | 10–18 months | Low crawl priority, less profile activity |
| Recently promoted (title changed <90 days ago) | 12–18 months | Profile may update fast, crawl hasn't caught it |
| Recently departed (left company <60 days ago) | Often not flagged | Biggest risk for wasted outreach |

The "recently departed" row is the real killer for outbound teams. Someone who left a company 45 days ago often still shows as current in enrichment databases because the crawl cycle hasn't run, or they haven't updated their LinkedIn profile yet, or both. You're emailing a corporate address for someone who no longer has access to that inbox.

Job title changes account for a disproportionate share of all data decay. Other articles on this topic attribute around 65% of annual record decay to title changes, and my testing didn't contradict that. People change roles more often than they change companies, and internal promotions or lateral moves are especially slow to propagate into enrichment databases because they generate less LinkedIn activity than a full company change.

Using LinkedIn Activity as a Freshness Proxy Before You Trust Enriched Data

Since I can't trust last_crawl timestamps to reflect actual currency, I developed a lightweight freshness check that uses behavioral signals as a proxy for whether enriched title data is likely still accurate.

Three signals that correlate with profile currency:

1. Recent post activity. If someone posted on LinkedIn within the last 30 days and their enriched title matches what's in their post byline or comments, that title is almost certainly current. Tools like Phantombuster can scrape recent post metadata at scale. I ran a Phantombuster LinkedIn Posts scraper on 600 records and cross-referenced post dates — records with a post in the last 30 days had a 91% title match rate vs. 61% for records with no activity in 90+ days.

2. "Open to Work" badge. This sounds obvious but it's often missed. If someone has an active "Open to Work" signal on LinkedIn, their enriched title is almost certainly stale — they're either already gone or actively looking to leave. Clay can pull this via LinkedIn enrichment workflows, and it should be a hard filter before outbound sequences.

3. Profile photo change date. This one is indirect, but people who update their profile photo often do so when starting a new role. It's a weak signal but useful when combined with others.
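The three signals above can be combined into a single check, sketched below. The record shape and every field name (`openToWork`, `daysSinceLastPost`, `postBylineTitle`, `enrichedTitle`, `daysSincePhotoChange`) are hypothetical, as is the 90-day window for photo changes; only the 30-day post threshold comes from the test described above.

```javascript
// Hypothetical record shape: these field names are illustrative and do not
// come from Clay, Phantombuster, or LinkedIn. Map them from your own data.
const RECENT_POST_DAYS = 30; // threshold used in the post-activity test
const PHOTO_CHANGE_DAYS = 90; // assumed window; the text gives no number

const normalize = (title) => String(title ?? "").trim().toLowerCase();

function titleFreshness(record) {
  // Signal 2: an active "Open to Work" badge is treated as a hard filter.
  if (record.openToWork === true) {
    return { status: "LIKELY_STALE", reasons: ["open to work badge"] };
  }

  // Signal 1: a recent post whose byline matches the enriched title.
  const postedRecently =
    Number.isFinite(record.daysSinceLastPost) &&
    record.daysSinceLastPost <= RECENT_POST_DAYS;
  const bylineMatches =
    normalize(record.postBylineTitle) !== "" &&
    normalize(record.postBylineTitle) === normalize(record.enrichedTitle);
  if (postedRecently && bylineMatches) {
    return { status: "LIKELY_CURRENT", reasons: ["recent post, byline matches"] };
  }

  const reasons = [
    postedRecently
      ? "byline does not match enriched title"
      : "no post in the last 30 days",
  ];

  // Signal 3: a recent photo change is only weak evidence of a new role,
  // so it adds a reason but never decides the status on its own.
  if (
    Number.isFinite(record.daysSincePhotoChange) &&
    record.daysSincePhotoChange <= PHOTO_CHANGE_DAYS
  ) {
    reasons.push("profile photo changed recently");
  }

  return { status: "NEEDS_CHECK", reasons };
}
```

Returning the reasons alongside the status keeps the weak photo signal visible to whoever does the manual check without letting it override the stronger signals.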

A Clay Formula to Flag Stale Records Before They Burn Your Sequences

If you're running enrichment through Clay, you can add a formula column to flag records where the last crawl or enrichment date is over 90 days old. This doesn't guarantee the title is wrong, but it tells you which records need a manual check or a fresh enrichment pass before outreach.

```javascript
// Clay formula: add as a formula column
// Assumes you have a field called "enriched_at" (ISO date string)
// Returns a "STALE ..." label if enrichment is >90 days old, "FRESH" if recent,
// and "UNKNOWN" if the date is missing or cannot be parsed

if (!enriched_at) return "UNKNOWN";

const enrichedDate = new Date(enriched_at);
// An unparseable string produces an Invalid Date whose getTime() is NaN
if (Number.isNaN(enrichedDate.getTime())) return "UNKNOWN";

const MS_PER_DAY = 1000 * 60 * 60 * 24;
const daysDiff = Math.floor((Date.now() - enrichedDate.getTime()) / MS_PER_DAY);

return daysDiff > 90 ? "STALE — re-enrich before outreach" : "FRESH";
```

Wire this column to a conditional that pauses the record from entering any outreach sequence until it's been re-enriched or manually verified. I combined this with a Clay HTTP action that hits the LinkedIn profile URL via a scraper API for the flagged records. Roughly 18% of my enriched list came back STALE in the first pass, and of those, 41% had title discrepancies when re-checked, which works out to about 7% of the full list.
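Outside Clay, the same pause logic could look something like the sketch below. It assumes each record carries the formula column's output in a `freshness` field and an optional `manuallyVerified` flag; both names are hypothetical, not Clay built-ins.

```javascript
// Hypothetical gate: "freshness" holds the formula column's output and
// "manuallyVerified" is an illustrative flag set after a human check.
function canEnterSequence(record) {
  const label = String(record.freshness ?? "UNKNOWN");
  if (label === "FRESH") return true;
  // STALE and UNKNOWN records are held unless a human confirmed the title.
  return record.manuallyVerified === true;
}

const records = [
  { name: "A", freshness: "FRESH" },
  { name: "B", freshness: "STALE" }, // stands in for any STALE label
  { name: "C", freshness: "UNKNOWN", manuallyVerified: true },
  { name: "D" }, // no freshness value at all
];

const ready = records.filter(canEnterSequence).map((r) => r.name);
const held = records.filter((r) => !canEnterSequence(r)).map((r) => r.name);
console.log({ ready, held });
// prints { ready: [ 'A', 'C' ], held: [ 'B', 'D' ] }
```

Re-enrichment needs no separate branch here: once a record is re-enriched, its `enriched_at` value changes and the formula column reports FRESH on its own.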

What I Actually Use

For anything where title accuracy matters — which is basically all of it — I treat enrichment data as a starting point, not ground truth.

My current stack: Apollo for initial bulk enrichment because the cost-per-record is low and the coverage is solid, Clay for layering freshness logic and signals on top of Apollo's output, and Phantombuster for scraping recent post activity to validate titles before sequences go live. For high-value accounts where a wrong title costs a real relationship, I use PDL directly via API because their data model is more transparent about confidence scoring than the UI-focused tools.

Maigret is useful for OSINT cross-reference when I'm trying to establish whether someone's LinkedIn profile is actually maintained or is a ghost account — it checks username activity across platforms and gives a quick read on whether the person is digitally active at all.

For teams that want a managed enrichment layer rather than building workflows in Clay from scratch, Ziwa is worth evaluating alongside Clay and Clearbit Enrichment — it sits in a similar space and handles some of the freshness logic automatically.

The honest answer is that no single provider solves the lag problem, because the lag isn't a provider problem. It's a structural consequence of LinkedIn's data access policies and the frequency of human job changes. The workaround is to treat enriched titles as hypotheses to be validated rather than facts to be acted on, and to build the validation step into your workflow before records hit sequences.
