I ran Maigret against 200 prospect records last quarter before a high-ACV outbound push, and it caught 11 identity mismatches that would have embarrassed my team — wrong person, dead account, or a namesake with a completely different job title. Nobody in sales ops talks about this tool because it lives in the cybersecurity corner of the internet. That's a gap worth closing.
Why Username Pattern Matching Catches What Enrichment Tools Miss
Apollo, RocketReach, and PDL are excellent at returning contact data. They are not good at confirming that the email address and LinkedIn URL they handed you actually belong to the same human being you're targeting. Data decay, common names, and merged company records create silent mismatches — records that look clean but point to the wrong person.
The problem gets worse as deal size increases. If you're personalizing outreach for a $150K ACV deal and you reference a conference talk that a different "Michael Chen" gave, you've already lost credibility before the first reply.
Maigret is an open-source Python tool built originally for journalist and security research workflows. It takes a username, or a set of username variants, and checks for matching accounts across thousands of platforms. The key insight for sales ops is that people are remarkably consistent with usernames across platforms. If PDL or RocketReach returns a LinkedIn handle of `mchen_product`, and Maigret finds that same handle active on GitHub, Medium, and a niche Slack community for product managers, that's strong corroboration. If it finds nothing except a dormant MySpace page, that's a flag.
How to Build Username Variants from Enrichment Output
Before you run Maigret, you need a candidate username list. Here's the pattern I use after pulling a record from PDL or RocketReach:
Start with what enrichment gives you:
- LinkedIn vanity URL slug (e.g., `michael-chen-42b`)
- Twitter/X handle if returned
- GitHub username if returned
- Email prefix (the part before the `@`)
Derive variants programmatically:
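```python
def generate_username_variants(first: str, last: str, email_prefix: str) -> list[str]:
    """Build candidate usernames from a name and an email prefix."""
    f, l = first.strip().lower(), last.strip().lower()
    variants = [
        f"{f}{l}",       # michaelchen
        f"{f}.{l}",      # michael.chen
        f"{f}_{l}",      # michael_chen
        f"{f[:1]}{l}",   # mchen (slicing avoids an IndexError on an empty first name)
        f"{f[:1]}.{l}",  # m.chen
        email_prefix.strip().lower(),
    ]
    # Deduplicate while keeping the order above, and drop empty strings.
    return [v for v in dict.fromkeys(variants) if v]
```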
Feed each variant into Maigret separately. A typical install and run looks like this:
```shell
pip install maigret
maigret mchen --timeout 10 --retries 2 --json simple
```
I run this with `--timeout 10` so the long tail of slow sites cannot hang the process. Note that `--json` takes a report type (`simple` or `ndjson`) rather than a file name; Maigret writes the report into its output folder (`reports` by default). For bulk work, I feed the JSON output into a simple aggregator script that scores each variant by the number of active accounts found and the recency of activity where Maigret can detect it.
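The aggregator step can be sketched roughly as follows. This is a minimal illustration, not my production script: it assumes each JSON report has already been loaded into a dict keyed by site name, and that each entry carries a nested `status` object whose `status` field reads `"Claimed"` when an account exists. That shape is an assumption about the report layout and worth checking against a real report from your Maigret version. The function names and sample data are hypothetical, and recency scoring is left out.

```python
from typing import Any


def count_claimed_accounts(report: dict[str, Any]) -> int:
    """Count sites where the username was reported as claimed.

    Assumed shape: {site_name: {"status": {"status": "Claimed"}, ...}, ...}
    """
    claimed = 0
    for entry in report.values():
        status = entry.get("status") if isinstance(entry, dict) else None
        if isinstance(status, dict) and status.get("status") == "Claimed":
            claimed += 1
    return claimed


def rank_variants(reports: dict[str, dict[str, Any]]) -> list[tuple[str, int]]:
    """Return (variant, claimed_count) pairs, most hits first, ties by name."""
    scored = [(variant, count_claimed_accounts(r)) for variant, r in reports.items()]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical, hand-written sample data in the assumed shape.
sample_reports = {
    "mchen": {
        "GitHub": {"status": {"status": "Claimed"}},
        "Medium": {"status": {"status": "Claimed"}},
        "Reddit": {"status": {"status": "Available"}},
    },
    "michael.chen": {
        "GitHub": {"status": {"status": "Available"}},
    },
}

ranking = rank_variants(sample_reports)
print(ranking)
```

In practice each report would come from `json.load` on a file in the output folder; the variant with the most claimed accounts is the one worth cross-referencing first.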
Cross-Referencing Maigret Output Against PDL and RocketReach Data
This is where the actual verification happens. The goal isn't to find every account a person has ever created. It's to confirm that the professional identity returned by your enrichment tool is coherent with what's publicly findable under the same username.
Here's the cross-reference logic I apply:
| Signal | Assessment | Notes |
|---|---|---|
| LinkedIn slug matches active GitHub/Medium account | High confidence | Same professional identity visible across platforms |
| Email prefix matches username on 3+ professional platforms | High confidence | Strong corroboration |
| Username found but bio/location contradicts PDL data | Medium-high risk | Investigate; may be a namesake |
| No username variants found anywhere | Medium risk | Account may be private or record is stale |
| Username found with different industry/title in bio | High risk | Likely wrong person |
| Only consumer platforms found (gaming, Reddit) | Low risk | Normal; not everyone has a professional presence |
When Maigret returns a result where the bio says "software engineer in Mumbai" and PDL told you this person is a VP of Sales in Austin, that's not a data gap — that's a different human being.
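The table can be turned into a small rule function. The sketch below is one possible encoding, not a fixed algorithm: the `AccountHit` fields, the label strings, and the order in which the rules fire (conflicts outrank corroboration) are all assumptions made for illustration, and the conflict flags would have to be set by whatever compares a profile bio against the enrichment record.

```python
from dataclasses import dataclass


@dataclass
class AccountHit:
    """One account found for a username variant (hypothetical structure)."""
    platform: str
    professional: bool               # e.g. GitHub or Medium vs. gaming or Reddit
    title_conflict: bool = False     # bio shows a different industry or title
    location_conflict: bool = False  # bio or location contradicts enrichment data


def assess(hits: list[AccountHit]) -> str:
    """Map the hits for one prospect onto the assessments in the table."""
    if not hits:
        return "medium_risk"        # no variants found: private or stale
    if any(h.title_conflict for h in hits):
        return "high_risk"          # likely a different person
    if any(h.location_conflict for h in hits):
        return "medium_high_risk"   # investigate, possible namesake
    if any(h.professional for h in hits):
        return "high_confidence"    # professional identity corroborated
    return "low_risk"               # consumer platforms only


# The Mumbai engineer vs. Austin VP of Sales case from the text.
print(assess([AccountHit("GitHub", True, title_conflict=True, location_conflict=True)]))
```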
I ran this cross-reference process on 200 records pulled from RocketReach for accounts over $100K estimated ACV. Results:
- 178 records: username patterns corroborated enrichment data cleanly
- 11 records: meaningful discrepancy (wrong person or clearly stale account — person had left the company)
- 11 records: no signal either way (private or minimal online presence)
That 5.5% mismatch rate on records I would have personalized and sent is not small. At $150K ACV per account, protecting even two of those from a bad first impression pays for the ops overhead many times over.
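For anyone who wants to reproduce the percentages on their own batch, the arithmetic is just each bucket divided by the total. The bucket names in this sketch are labels I made up for the three outcomes above; the counts are the ones reported.

```python
total_records = 200
outcomes = {
    "corroborated": 178,
    "discrepancy": 11,
    "no_signal": 11,
}

# Sanity check: the three buckets should account for every record.
assert sum(outcomes.values()) == total_records

# Share of each bucket as a percentage of the batch.
shares = {name: 100 * count / total_records for name, count in outcomes.items()}

for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```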
Flagging High-Risk Records Before Personalization
I built a lightweight flag system in our CRM (HubSpot, in our case) with a custom property called `identity_confidence`. It has three states: `verified`, `unverified`, and `review`.
Records hit review when:
- Maigret finds the username on a platform but the bio contradicts enrichment data
- The only matching platforms are consumer/gaming (suggests the professional record may be attached to a hobbyist account)
- PDL and RocketReach returned conflicting email domains for the same person
Records in review go to a human (me, or a trained SDR) before any personalized copy is written. The step takes five to ten minutes per record. That's the honest cost of this workflow, and it's why I only apply it to accounts above a defined ACV threshold — for us, that's $75K.
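The review rules above reduce to a short decision function. This is a sketch under assumptions: the three `review` triggers come straight from the list, but the split between `verified` and `unverified` (corroborated vs. not yet corroborated) is my guess at a sensible default, and the parameter names are hypothetical. Writing the value back to the CRM is deliberately left out.

```python
def identity_confidence(
    bio_contradicts_enrichment: bool,
    only_consumer_platforms: bool,
    conflicting_email_domains: bool,
    corroborated: bool,
) -> str:
    """Return the value for the identity_confidence property."""
    # Any one of the three triggers sends the record to a human.
    if bio_contradicts_enrichment or only_consumer_platforms or conflicting_email_domains:
        return "review"
    # Assumption: no trigger plus corroboration counts as verified;
    # anything else stays unverified until more signal turns up.
    return "verified" if corroborated else "unverified"


print(identity_confidence(False, False, True, corroborated=True))
```

A review trigger wins even when other signals corroborate the record, which matches the intent of routing anything ambiguous to a person before copy is written.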
Below that threshold, we run standard enrichment-only verification: email validation via Hunter.io, LinkedIn URL spot-check, and a Clearbit company firmographic sanity check. Fast and good enough.
Here's how the tooling maps across deal sizes:
| ACV Range | Identity Verification Stack | Time per Record |
|---|---|---|
| Under $25K | Apollo + Hunter.io email validation | ~1 min |
| $25K–$75K | Above + RocketReach cross-ref, LinkedIn spot-check | ~3 min |
| $75K–$150K | Above + Maigret username pattern check | ~8 min |
| Over $150K | Full stack + manual review, Maigret on all variants | ~15 min |
This tiered approach keeps the ops cost proportional to deal value. Running Maigret on every inbound demo request would be a waste of analyst time.
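If the tiering lives in a script rather than in someone's head, it can be a plain lookup. The sketch below mirrors the table; how the exact boundary values are treated ($25K and $75K move up a tier, exactly $150K stays in the third tier) is an assumption, since the table does not say, and the `Tier` type and function name are hypothetical.

```python
from typing import NamedTuple


class Tier(NamedTuple):
    acv_range: str
    stack: str
    minutes_per_record: int


def tier_for_acv(acv: float) -> Tier:
    """Pick the verification tier for an estimated ACV in dollars."""
    if acv < 0:
        raise ValueError("ACV cannot be negative")
    if acv < 25_000:
        return Tier("Under $25K", "Apollo + Hunter.io email validation", 1)
    if acv < 75_000:
        return Tier("$25K–$75K", "Above + RocketReach cross-ref, LinkedIn spot-check", 3)
    if acv <= 150_000:
        return Tier("$75K–$150K", "Above + Maigret username pattern check", 8)
    return Tier("Over $150K", "Full stack + manual review, Maigret on all variants", 15)


tier = tier_for_acv(120_000)
print(tier.acv_range, "->", tier.stack, f"(~{tier.minutes_per_record} min)")
```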
What I Actually Use
For the core enrichment layer, PDL is my go-to when I need API access to bulk records — the data coverage for US and EU enterprise contacts is solid, and the identity graph is more reliable than most alternatives I've tested. RocketReach fills gaps on direct dials and smaller company records where PDL thins out. For email finding and validation, Hunter.io remains reliable and the domain search is genuinely useful for mapping org structures.
Phantombuster handles LinkedIn automation for the initial list-building step when I need to pull fresh data from Sales Navigator exports. Clay ties a lot of these enrichment sources together in a workflow I can hand to a less technical SDR without them needing to touch APIs directly.
Maigret sits at the verification layer, not the enrichment layer. It doesn't tell you someone's phone number or email. It tells you whether the person your enrichment tools found is plausibly the person you think they are. For high-ACV prospecting, that's a meaningful distinction.
For teams that want username-based identity verification without running open-source tooling locally, Ziwa is one option in this space worth evaluating alongside Maigret for your workflow.
The honest caveat on Maigret: it's a CLI tool that requires Python, it occasionally hits rate limits on major platforms, and the site database needs periodic updates. It's not a polished SaaS product. If your team can't support a Python dependency, you'll spend more time maintaining it than using it. For a technical sales ops practitioner who's comfortable in a terminal, it earns its place in the stack.
The 11 records it caught for me last quarter were worth every minute of setup time.