If you sell to clinics, doctors or dentists in the US, you're sitting on top of one of the cleanest free B2B datasets that exist, and most people ignore it. The NPPES NPI registry is a government database of every healthcare provider in the country. It has a public JSON API with no key, no scraping and no terms-of-service grey area. The catch: it gives you who and where, but not the email. This post is about closing that gap.
What the NPI registry actually is
Every US provider that bills insurance has an NPI (National Provider Identifier). The Centers for Medicare & Medicaid Services (CMS) publishes the full registry through NPPES (the National Plan and Provider Enumeration System), and there's a documented public API: npiregistry.cms.hhs.gov/api-page.
You can query by name, specialty (taxonomy), city, state or postal code. Each record gives you:
- Provider or organization name
- NPI number (so every lead is verifiably real)
- Primary taxonomy / specialty
- Practice address and phone
- Whether it's an individual (NPI-1) or organization (NPI-2)
That's already more structured and more trustworthy than most paid lead lists, because every row maps to a government-issued identifier.
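Before you even hit the API, you can sanity-check an NPI offline. The last digit is a check digit computed with the Luhn algorithm over the number with the prefix 80840 prepended. Here's a minimal sketch; the function name isValidNpi is mine, and passing this check only means the number is well formed, not that it was ever issued.

```javascript
// Check-digit test for a 10-digit NPI: run the Luhn algorithm over the
// digits with the "80840" prefix prepended.
// A passing result means "well formed", not "exists in the registry".
function isValidNpi(npi) {
  const s = String(npi);
  if (!/^\d{10}$/.test(s)) return false;

  const digits = ('80840' + s).split('').map(Number);
  let sum = 0;
  // Walk right to left, doubling every second digit.
  // The check digit itself (rightmost) is not doubled.
  for (let i = digits.length - 1, double = false; i >= 0; i--, double = !double) {
    let d = digits[i];
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}
```

isValidNpi('1234567893') returns true, and changing the last digit (for example '1234567890') returns false. It's a cheap filter for typos in lists you import from elsewhere before you spend requests on them.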
Pulling a target segment
Say you want dentists in Austin, TX. The API takes simple query params:
    const params = new URLSearchParams({
      version: '2.1',
      taxonomy_description: 'Dentist',
      city: 'Austin',
      state: 'TX',
      limit: '200',
    });

    const res = await fetch(`https://npiregistry.cms.hhs.gov/api/?${params}`);
    if (!res.ok) throw new Error(`NPPES request failed: ${res.status}`);
    // `results` may be missing when the API rejects the query, so default to an empty list.
    const { results = [] } = await res.json();

    const leads = results.map(r => ({
      npi: r.number,
      // Organizations carry organization_name; individuals carry first and last name.
      name: r.basic.organization_name
        || `${r.basic.first_name} ${r.basic.last_name}`,
      taxonomy: r.taxonomies.find(t => t.primary)?.desc,
      address: r.addresses.find(a => a.address_purpose === 'LOCATION'),
    }));
The limit parameter caps at 200 per call and results are paginated with the skip parameter, so you loop in pages until you've covered the segment. Be gentle with request rate: it's a public good, not a private API you're paying to abuse.
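Here's a rough sketch of that paging loop. The function name fetchAllPages, the injectable fetchFn, and the 500 ms pause are my own choices for illustration. The maxSkip default of 1000 is an assumption about the API's skip ceiling, so check the API page for the current value. If a segment is bigger than the ceiling allows, split it by postal code or a narrower taxonomy.

```javascript
// Page through one NPPES query using limit + skip.
// fetchFn is injectable so the loop can be exercised without a network;
// maxSkip = 1000 is an assumed ceiling, not a value confirmed by this post.
async function fetchAllPages(
  baseParams,
  { fetchFn = fetch, pageSize = 200, maxSkip = 1000, delayMs = 500 } = {}
) {
  const all = [];
  for (let skip = 0; skip <= maxSkip; skip += pageSize) {
    const params = new URLSearchParams({
      version: '2.1',
      ...baseParams,
      limit: String(pageSize),
      skip: String(skip),
    });
    const res = await fetchFn(`https://npiregistry.cms.hhs.gov/api/?${params}`);
    if (!res.ok) throw new Error(`NPPES request failed: ${res.status}`);

    const { results = [] } = await res.json();
    all.push(...results);

    // A short page means there is nothing left to fetch.
    if (results.length < pageSize) break;

    // Pause between pages to keep the request rate polite.
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  return all;
}
```

Calling fetchAllPages({ taxonomy_description: 'Dentist', city: 'Austin', state: 'TX' }) inside an async function returns the combined results array, which you can feed to the same mapping step as above.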
The missing piece: emails
NPPES deliberately does not publish email addresses. So a raw NPI pull is a list of verified practices with phone + address but no inbox. To turn it into something you can actually run a cold email campaign against, you enrich each record with the practice's own website and the email published on it.
The enrichment chain per lead:
- Find the website. The NPI address + name is usually enough to resolve the practice site (a places lookup, or a constrained web search by name + city).
- Crawl the contact pages. Same trick as any contact scraper: fetch /contact, /about, /appointments etc. first, decode obfuscated "info [at] clinic [dot] com" addresses, and filter image/asset false positives.
- Keep the NPI as the join key. Because every record has a unique NPI, you can dedupe and re-enrich cleanly later without guessing whether two rows are the same practice.
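The crawl step can be sketched roughly like this. The names contactUrls and extractEmails are hypothetical, and the regexes are illustrative rather than exhaustive: real sites obfuscate addresses in more ways than the bracketed "[at]" and "[dot]" forms handled here.

```javascript
// Pages to try first on a practice site (the paths named in the list above).
const CONTACT_PATHS = ['/contact', '/about', '/appointments'];

function contactUrls(siteUrl) {
  return CONTACT_PATHS.map(path => new URL(path, siteUrl).href);
}

// File extensions that make an "email-looking" match an asset, e.g. "logo@2x.png".
const ASSET_EXT = /\.(png|jpe?g|gif|svg|webp|css|js)$/i;

// Pull candidate email addresses out of a page's HTML.
function extractEmails(html) {
  // Normalise bracketed obfuscation: "info [at] clinic [dot] com" -> "info@clinic.com".
  const text = html
    .replace(/\s*[\[(]\s*at\s*[\])]\s*/gi, '@')
    .replace(/\s*[\[(]\s*dot\s*[\])]\s*/gi, '.');

  const matches = text.match(/[a-z0-9._%+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+/gi) || [];

  const emails = matches
    .map(e => e.toLowerCase())
    .filter(e => !ASSET_EXT.test(e));

  // De-duplicate while keeping first-seen order.
  return [...new Set(emails)];
}
```

Store whatever extractEmails returns against the lead's NPI rather than against the practice name, so a later re-crawl overwrites the same row instead of creating a near-duplicate.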
This is the part that's tedious to maintain by hand: resolving the site, handling the 10-20% of practices with no site or a Facebook-only presence, retrying flaky requests, and not getting rate-limited across thousands of small clinic sites.
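For the flaky-request part, a retry wrapper with exponential backoff covers most of it. This is a sketch under my own assumptions: fetchWithRetry, its option names and the default delays are illustrative, and it retries only on HTTP 429, 5xx responses and thrown network errors.

```javascript
// Fetch a URL, retrying transient failures with exponential backoff.
// fetchFn and sleepFn are injectable so the sketch can run without a network.
async function fetchWithRetry(
  url,
  {
    retries = 3,
    baseDelayMs = 500,
    fetchFn = fetch,
    sleepFn = ms => new Promise(resolve => setTimeout(resolve, ms)),
  } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const res = await fetchFn(url);
      // Retry only on rate limiting and server errors.
      // Anything else (including 404) goes straight back to the caller.
      if (res.status !== 429 && res.status < 500) return res;
      lastError = new Error(`HTTP ${res.status} from ${url}`);
    } catch (err) {
      lastError = err; // network-level failure such as DNS or a dropped socket
    }
    // Wait 500 ms, 1000 ms, 2000 ms, ... before the next attempt.
    if (attempt < retries) await sleepFn(baseDelayMs * 2 ** attempt);
  }
  throw lastError;
}
```

A 404 comes back on the first attempt, which is what you want for practices with no site or a dead page: mark the lead as "no website" and move on instead of burning retries.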
I packaged the whole chain — NPI pull plus website resolution plus contact-page crawl — as a hosted, pay-per-lead scraper: Healthcare Provider Leads. You give it a specialty + location, it returns NPI-verified providers with name, specialty, address and phone, plus the emails and socials enriched from their sites. No API key, and you pay per lead delivered rather than per month.
If you want to enrich a list of arbitrary practice websites you already have (not NPI-sourced), the generic Email Scraper & Contact Finder does just the crawl-and-extract step on any URL.
Why this beats buying a list
- Verifiable. Every lead has an NPI you can check against the public registry. Bought lists are full of dead and duplicated rows.
- Free at the source. The provider data is taxpayer-funded and public; you only pay for the enrichment work.
- Segmentable. Taxonomy + geography filtering means you can target "pediatric dentists in Florida" exactly, instead of a generic "healthcare" dump.
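The "check against the public registry" step is scriptable too. This is a minimal sketch that assumes the API accepts the NPI in a number query parameter, as listed on the API page; lookupNpi and the injectable fetchFn are my own names.

```javascript
// Look up a single NPI in the public registry.
// Resolves to the matching record, or null if the registry returns none.
async function lookupNpi(npi, fetchFn = fetch) {
  const params = new URLSearchParams({ version: '2.1', number: String(npi) });
  const res = await fetchFn(`https://npiregistry.cms.hhs.gov/api/?${params}`);
  if (!res.ok) throw new Error(`NPPES request failed: ${res.status}`);

  const { results = [] } = await res.json();
  // Compare as strings so it works whether the API returns the number as text or as a numeric value.
  return results.find(r => String(r.number) === String(npi)) ?? null;
}
```

Run it over a sample of any list you're evaluating: rows that come back null are ones you can't trace to the registry.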
Takeaway
The NPI registry solves the "is this a real, registered provider?" problem for free and at scale (an NPI confirms the provider is enumerated with CMS, not that a license is current). The only thing it's missing is the email, and that's a solvable enrichment step (resolve site, crawl contact pages, keep NPI as the key), not a reason to go buy a stale list. Start from the government data, enrich on top, and every lead in your CRM is one you can trace back to a real identifier.