Most threat investigators still do this manually. They find a suspicious domain, look up the WHOIS record, copy the registrant email, paste it into a lookup tool, and repeat. One domain at a time.
At 50 domains that's tedious. At 500 it breaks down completely.
I wanted to automate the pivot. Given one registrant email or name, pull every domain they've ever registered across all TLDs and analyze the full portfolio in a single run. Here's how I built it.
The Key Insight
A single threat actor rarely registers just one domain. They register clusters. The same email, org name, or phone number shows up across dozens of registrations spread across different TLDs and different campaign timelines.
Standard WHOIS lookups miss this entirely. They're forward lookups: domain to owner. What you actually need is the reverse: owner to all domains.
That's what the Reverse WHOIS API does. You hand it an email address, registrant name, company, or keyword, and it searches backward through the full WHOIS database and returns every matching domain. WhoisFreaks indexes 3.6B+ WHOIS records across 1,528 TLDs. Full product features and documentation are on the WhoisFreaks Reverse WHOIS API page.
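To make the forward/reverse distinction concrete, here is a toy sketch in plain Python. The records and the build_reverse_index helper are made up for illustration; the real API searches its own index server-side, so this only shows the shape of the idea, not how WhoisFreaks implements it.

```python
from collections import defaultdict

# Hypothetical forward records: domain -> registrant email.
FORWARD = {
    "example-one.com": "a@example.org",
    "example-two.net": "a@example.org",
    "example-three.com": "b@example.org",
}

def build_reverse_index(forward: dict) -> dict:
    """Invert domain -> owner into owner -> sorted list of domains."""
    index = defaultdict(list)
    for domain, owner in forward.items():
        index[owner.lower()].append(domain)
    return {owner: sorted(domains) for owner, domains in index.items()}

reverse_index = build_reverse_index(FORWARD)
print(reverse_index["a@example.org"])
```

A forward lookup answers one key at a time. The inverted mapping answers "everything this owner holds" in a single read, which is the question a pivot needs.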
Setup
git clone https://github.com/WhoisFreaks/reverse-whois-pivot
cd reverse-whois-pivot
pip install requests pandas tabulate
Get a free API key (500 credits, no card needed): https://billing.whoisfreaks.com/signup
API_KEY = "your_api_key_here"
BASE_URL = "https://api.whoisfreaks.com/v1.0/whois"
Step 1: Query the Reverse WHOIS API
The API takes one search identifier per request. You pass it as a named parameter: email, keyword, owner, or company. Pick the one that matches what you have.
import requests
import json

def reverse_whois_by_email(email: str, page: int = 1) -> dict:
    """
    Reverse WHOIS lookup by registrant email address.
    Returns all domains registered using this email.
    """
    params = {
        "apiKey": API_KEY,
        "whois": "reverse",
        "email": email,
        "page": page
    }
    # A timeout keeps a stalled connection from hanging the whole run
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

def reverse_whois_by_owner(owner: str, page: int = 1, exact: bool = False) -> dict:
    """
    Reverse WHOIS lookup by registrant name.
    Set exact=True for strict name matching.
    """
    params = {
        "apiKey": API_KEY,
        "whois": "reverse",
        "owner": owner,
        "exact": "true" if exact else "false",
        "page": page
    }
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

# Try it against a registrant email
result = reverse_whois_by_email("john.smith@protonmail.com")
print(f"Total results : {result.get('total_Result', 0)}")
print(f"Total pages : {result.get('total_Pages', 0)}")
print(f"Current page : {result.get('current_Page', 0)}")
print()
# .get() avoids a KeyError when the query returns no records
print(json.dumps(result.get('whois_domains_historical', [])[:2], indent=2))
Output:
Total results : 14
Total pages : 1
Current page : 1
[
  {
    "domain_name": "fastpaywall.net",
    "create_date": "2021-03-12",
    "expiry_date": "2023-03-12",
    "domain_registrar": {
      "registrar_name": "NameCheap, Inc."
    },
    "registrant_contact": {
      "name": "John Smith",
      "email_address": "john.smith@protonmail.com",
      "country_name": "Russia"
    }
  },
  {
    "domain_name": "securepay-verify.com",
    "create_date": "2021-05-01",
    "expiry_date": "2022-05-01",
    "domain_registrar": {
      "registrar_name": "GoDaddy.com, LLC"
    },
    "registrant_contact": {
      "name": "John Smith",
      "email_address": "john.smith@protonmail.com",
      "country_name": "Russia"
    }
  }
]
Fourteen domains from one email address.
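The two functions above differ only in which identifier they send, so the keyword and company searches can share one parameter builder. This is a minimal sketch, not part of the original script: build_reverse_params and ALLOWED_FIELDS are hypothetical names, and sending exact on company searches the same way as on owner searches is an assumption based on the parameters described in this post.

```python
ALLOWED_FIELDS = {"email", "keyword", "owner", "company"}

def build_reverse_params(api_key: str, field: str, value: str,
                         page: int = 1, exact: bool = False) -> dict:
    """Build the query parameters for one reverse WHOIS request."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unsupported search field: {field!r}")
    params = {"apiKey": api_key, "whois": "reverse", field: value, "page": page}
    # Assumption: exact applies to owner and company searches only.
    if field in {"owner", "company"}:
        params["exact"] = "true" if exact else "false"
    return params

params = build_reverse_params("your_api_key_here", "company",
                              "SmithWeb Services", exact=True)
print(params)
```

The resulting dict can be passed straight to requests.get(BASE_URL, params=params, timeout=30), exactly as in the email and owner functions.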
Step 2: Handle Pagination
Results come back paginated, and each response reports the page count in total_Pages. Large registrant portfolios need every page pulled automatically.
import time

def get_all_domains_by_email(email: str) -> list:
    """Fetch all pages for a registrant email query."""
    all_domains = []
    page = 1
    while True:
        data = reverse_whois_by_email(email, page=page)
        domains = data.get("whois_domains_historical", [])
        if not domains:
            break
        all_domains.extend(domains)
        total_pages = data.get("total_Pages", 1)
        total_results = data.get("total_Result", 0)
        print(f"  Page {page}/{total_pages}: fetched {len(domains)} domains "
              f"({len(all_domains)}/{total_results} total)")
        if page >= total_pages:
            break
        page += 1
        time.sleep(0.3)  # stay polite
    return all_domains

domains = get_all_domains_by_email("john.smith@protonmail.com")
print(f"\nDone. Total domains: {len(domains)}")
Output:
Page 1/1: fetched 14 domains (14/14 total)
Done. Total domains: 14
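The same loop works for any identifier if the page fetcher is passed in as a function, which also makes it possible to dry-run the pagination logic without spending credits. A minimal sketch under that assumption: fetch_all_pages, max_pages, and fake_fetch are hypothetical names, and the fake response only mimics the three fields used above.

```python
import time
from typing import Callable

def fetch_all_pages(fetch_page: Callable[[int], dict],
                    max_pages: int = 50, delay: float = 0.0) -> list:
    """Collect records from every page, stopping at max_pages as a safety cap."""
    records = []
    page = 1
    while page <= max_pages:
        data = fetch_page(page)
        batch = data.get("whois_domains_historical", [])
        if not batch:
            break
        records.extend(batch)
        if page >= data.get("total_Pages", 1):
            break
        page += 1
        time.sleep(delay)
    return records

# Fake two-page response standing in for the API.
def fake_fetch(page: int) -> dict:
    pages = {1: [{"domain_name": "a.com"}, {"domain_name": "b.com"}],
             2: [{"domain_name": "c.net"}]}
    return {"total_Pages": 2, "whois_domains_historical": pages.get(page, [])}

all_records = fetch_all_pages(fake_fetch)
print([r["domain_name"] for r in all_records])
```

Against the live API the call would look like fetch_all_pages(lambda p: reverse_whois_by_email(email, page=p), delay=0.3). The max_pages cap is there so an unexpectedly large portfolio cannot burn through a credit balance in one run.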
Step 3: Structure the Portfolio
Convert the raw records to a DataFrame and derive a few useful fields: status (active vs expired), TLD, lifespan in days.
import pandas as pd
from datetime import datetime

def analyze_portfolio(domains: list) -> pd.DataFrame:
    records = []
    for d in domains:
        # "or {}" guards against keys that are present but null
        registrant = d.get("registrant_contact") or {}
        registrar = d.get("domain_registrar") or {}
        create_date = d.get("create_date", "")
        expiry_date = d.get("expiry_date", "")

        # Determine active vs expired
        status = "active"
        if expiry_date:
            try:
                if datetime.strptime(expiry_date, "%Y-%m-%d") < datetime.now():
                    status = "expired"
            except ValueError:
                pass

        # Lifespan in days
        lifespan = None
        try:
            c = datetime.strptime(create_date, "%Y-%m-%d")
            e = datetime.strptime(expiry_date, "%Y-%m-%d")
            lifespan = (e - c).days
        except (ValueError, TypeError):
            pass

        domain_name = d.get("domain_name", "")
        tld = "." + domain_name.rsplit(".", 1)[-1] if "." in domain_name else "unknown"

        records.append({
            "domain": domain_name,
            "tld": tld,
            "created": create_date,
            "expires": expiry_date,
            "lifespan": lifespan,
            "status": status,
            "registrar": registrar.get("registrar_name", "unknown"),
            "name": registrant.get("name", ""),
            "email": registrant.get("email_address", ""),
            "country": registrant.get("country_name", ""),
            "org": registrant.get("company", ""),
        })

    df = pd.DataFrame(records)
    # An empty result has no columns, so only sort when there are rows
    if not df.empty:
        df = df.sort_values("created", ascending=False)
    return df

df = analyze_portfolio(domains)
print(df[["domain", "tld", "created", "status", "lifespan", "registrar"]].to_string(index=False))
Output:
domain                   tld   created     status   lifespan  registrar
crypto-wallet-pro.net    .net  2022-11-03  active   365       NameCheap, Inc.
securepay-verify.com     .com  2021-05-01  expired  365       GoDaddy.com, LLC
fastpaywall.net          .net  2021-03-12  expired  730       NameCheap, Inc.
walletrecovery24.com     .com  2021-02-28  expired  365       NameCheap, Inc.
bitcoin-claims-now.com   .com  2020-09-17  expired  365       GoDaddy.com, LLC
cryptosafe-id.net        .net  2020-07-05  expired  365       NameCheap, Inc.
paypal-secure-check.com  .com  2020-04-21  expired  365       Tucows Domains Inc.
Pattern already visible: almost every domain registered for exactly 365 days. That's not coincidence.
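That observation can be turned into a number rather than an eyeball check. Here is a minimal standard-library sketch; lifespan_summary is a hypothetical helper, and the input list is illustrative (six one-year registrations and one two-year registration), not output from the API.

```python
from collections import Counter

def lifespan_summary(lifespans: list) -> dict:
    """Count lifespans and report the share that is exactly 365 days."""
    known = [days for days in lifespans if days is not None]  # skip missing dates
    counts = Counter(known)
    share = counts[365] / len(known) if known else 0.0
    return {"counts": dict(counts), "share_365": round(share, 2)}

summary = lifespan_summary([365, 365, 730, 365, 365, 365, 365])
print(summary)
```

With the DataFrame from this step, df["lifespan"].value_counts(normalize=True) gives the same distribution in one line.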
Step 4: Score for Risk
Not every result is adversarial. Score each domain on keyword content, TLD risk, registration lifespan, and status.
SUSPICIOUS_KEYWORDS = [
    "secure", "verify", "login", "account", "wallet", "crypto", "bitcoin",
    "pay", "bank", "update", "confirm", "support", "recover", "claim", "alert"
]
HIGH_RISK_TLDS = {".xyz", ".top", ".click", ".tk", ".ml", ".cf", ".ga", ".pw"}

def score_domain(row: pd.Series) -> int:
    score = 0
    domain_lower = row["domain"].lower()

    # Keyword signals
    for kw in SUSPICIOUS_KEYWORDS:
        if kw in domain_lower:
            score += 15

    # High-risk TLD
    if row["tld"] in HIGH_RISK_TLDS:
        score += 20

    # Short lifespan (registrations of one year or less = disposable infrastructure)
    if row["lifespan"] is not None and row["lifespan"] <= 365:
        score += 25

    # Abandoned infrastructure
    if row["status"] == "expired":
        score += 10

    return min(score, 100)

df["risk_score"] = df.apply(score_domain, axis=1)
df["risk_label"] = pd.cut(
    df["risk_score"],
    bins=[-1, 30, 60, 100],
    labels=["LOW", "MEDIUM", "HIGH"]
)
print(df["risk_label"].value_counts().to_string())
print()
high = df[df["risk_label"] == "HIGH"].sort_values("risk_score", ascending=False)
print(high[["domain", "created", "status", "risk_score"]].to_string(index=False))
Output:
HIGH      9
MEDIUM    3
LOW       2

domain                   created     status   risk_score
securepay-verify.com     2021-05-01  expired  80
walletrecovery24.com     2021-02-28  expired  65
bitcoin-claims-now.com   2020-09-17  expired  65
paypal-secure-check.com  2020-04-21  expired  65
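A single number hides which signals fired, and an analyst triaging the list usually wants to know. The sketch below reuses the weights from score_domain but returns the individual contributions. explain_score is a hypothetical helper, and the keyword and TLD lists are repeated only so the block runs on its own.

```python
SUSPICIOUS_KEYWORDS = [
    "secure", "verify", "login", "account", "wallet", "crypto", "bitcoin",
    "pay", "bank", "update", "confirm", "support", "recover", "claim", "alert"
]
HIGH_RISK_TLDS = {".xyz", ".top", ".click", ".tk", ".ml", ".cf", ".ga", ".pw"}

def explain_score(domain: str, tld: str, lifespan, status: str) -> list:
    """Return (signal, points) pairs using the same weights as score_domain."""
    name = domain.lower()
    signals = [(f"keyword:{kw}", 15) for kw in SUSPICIOUS_KEYWORDS if kw in name]
    if tld in HIGH_RISK_TLDS:
        signals.append(("high_risk_tld", 20))
    if lifespan is not None and lifespan <= 365:
        signals.append(("short_lifespan", 25))
    if status == "expired":
        signals.append(("expired", 10))
    return signals

signals = explain_score("crypto-wallet-pro.net", ".net", 365, "active")
total = min(sum(points for _, points in signals), 100)
print(signals, total)
```

Here two keyword hits (15 each) plus the one-year lifespan (25) give 55, which falls in the MEDIUM band (above 30, up to 60) under the bins used above. Note that the keyword check is a plain substring match, so "pay" also fires on names like "paypal".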
Step 5: Chain the Pivot
Once you have the portfolio, pull every other identifier out of those records. Each one becomes a new search.
def extract_pivot_targets(df: pd.DataFrame) -> dict:
    pivots = {
        "emails": set(df["email"].dropna().unique()),
        "orgs": set(df["org"].dropna().unique()),
        "names": set(df["name"].dropna().unique()),
        "countries": set(df["country"].dropna().unique()),
    }
    # Strip blanks
    return {k: {v for v in vals if str(v).strip()} for k, vals in pivots.items()}

pivots = extract_pivot_targets(df)
for ptype, values in pivots.items():
    print(f"\n{ptype.upper()}:")
    for v in sorted(values):
        print(f"  {v}")
Output:
EMAILS:
  j.smith.domains@gmail.com
  john.smith@protonmail.com

ORGS:
  Privacy Protection LLC
  SmithWeb Services

NAMES:
  John Smith

COUNTRIES:
  Russia
  Ukraine
Two emails. Two orgs. Run get_all_domains_by_email("j.smith.domains@gmail.com") next and you'll find the second campaign cluster.
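Repeating that step by hand gets old quickly, so the chain can be written as a bounded breadth-first search over emails. This is a sketch under stated assumptions: chain_pivot and the FAKE data are hypothetical, the fetcher is injected so the logic can be checked offline, and max_depth exists because every hop costs API credits and widens the net toward unrelated registrants.

```python
from collections import deque
from typing import Callable

def chain_pivot(seed_email: str, fetch_domains: Callable[[str], list],
                max_depth: int = 2) -> dict:
    """Breadth-first pivot over registrant emails, bounded by max_depth."""
    seed = seed_email.lower()
    seen_emails = {seed}
    domains = {}  # domain -> email whose search first surfaced it
    queue = deque([(seed, 0)])
    while queue:
        email, depth = queue.popleft()
        for record in fetch_domains(email):
            name = record.get("domain_name", "")
            if name:
                domains.setdefault(name, email)
            contact = record.get("registrant_contact") or {}
            found = (contact.get("email_address") or "").strip().lower()
            if found and found not in seen_emails and depth < max_depth:
                seen_emails.add(found)
                queue.append((found, depth + 1))
    return {"emails": seen_emails, "domains": domains}

# Fake records standing in for API responses.
FAKE = {
    "a@example.org": [
        {"domain_name": "one.com", "registrant_contact": {"email_address": "a@example.org"}},
        {"domain_name": "two.net", "registrant_contact": {"email_address": "b@example.org"}},
    ],
    "b@example.org": [
        {"domain_name": "three.com", "registrant_contact": {"email_address": "b@example.org"}},
    ],
}
result = chain_pivot("a@example.org", lambda e: FAKE.get(e, []))
print(sorted(result["domains"]), sorted(result["emails"]))
```

In a real run the fetcher would be get_all_domains_by_email. The domains mapping records which identifier surfaced each domain, which helps when explaining the link chain in a report.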
Real Results
I ran this against an email tied to a credential-phishing campaign (sanitized for publication).
47 domains came back. 34 expired, all registered for exactly 365 days, consistent with disposable infrastructure. 8 still active, 3 pointing to live phishing pages at the time I ran it. The remaining 5 had just been registered and weren't resolving yet.
All 8 active domains were registered inside a 6-day window in late 2023. Without the reverse pivot, I'd have found maybe 2 or 3 of them through forward searches. The full historical portfolio put the entire campaign timeline in one view.
The org field pivot found 11 more domains under "SmithWeb Services" that didn't share any email address with the primary cluster. That's the kind of connection that never shows up in manual investigation.
What I Noticed
Exact vs fuzzy matching matters. The exact=true param on owner/company searches cuts noise significantly. For email searches, the API always uses exact match automatically, which is what you want.
365-day lifespan is a stronger signal than keywords. Legitimate domains with "secure" or "verify" in the name tend to be multi-year registrations. Malicious ones are almost always exactly 12 months. That 25-point score bump for lifespans of 365 days or less earns its weight.
Org names are lazier than emails. Threat actors rotate emails between campaigns. They reuse org names constantly. I've seen the same fake LLC appear across three separate campaigns, only findable through a company pivot after the email pivots came back clean.
Privacy protection creates gaps, but not complete ones. About 30% of records had redacted registrant data from privacy proxies. The remaining 70% were enough to build the full graph, and some privacy-protected records had leaked the real contact in earlier WHOIS snapshots.
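Redacted values are also worth filtering out before they become pivot targets, since a proxy name such as "Privacy Protection LLC" would match a huge number of unrelated registrants. A minimal sketch: the marker list is a made-up starting point, not an exhaustive or official one, and the helper names are hypothetical.

```python
# Hypothetical markers; extend with the proxy services you actually encounter.
PRIVACY_MARKERS = ("privacy", "redacted", "whoisguard", "proxy", "not disclosed")

def is_privacy_value(value: str) -> bool:
    """True for blank values or values that look like a privacy proxy."""
    text = (value or "").strip().lower()
    return not text or any(marker in text for marker in PRIVACY_MARKERS)

def drop_privacy_values(values) -> set:
    """Keep only values that are usable as pivot targets."""
    return {v for v in values if not is_privacy_value(v)}

orgs = drop_privacy_values({"Privacy Protection LLC", "SmithWeb Services", ""})
print(orgs)
```

Substring matching is deliberately crude here. It will also drop a legitimate company with "privacy" in its name, so review what gets filtered before trusting it.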
The mode=mini param is worth knowing. It returns a slimmer response (domain name, dates, registrar only), which is faster and cheaper if you're doing a first-pass sweep across many identifiers before pulling full records.
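A first-pass sweep along those lines might rank identifiers by portfolio size before any full records are pulled. This sketch assumes that mini responses still carry the total_Result field seen in the full responses, which should be checked against the API documentation. mini_params, rank_identifiers, and the fake totals are hypothetical.

```python
from typing import Callable

def mini_params(api_key: str, field: str, value: str, page: int = 1) -> dict:
    """Parameters for a slim first-pass query using mode=mini."""
    return {"apiKey": api_key, "whois": "reverse", field: value,
            "mode": "mini", "page": page}

def rank_identifiers(identifiers: list, fetch: Callable[[dict], dict],
                     api_key: str = "your_api_key_here") -> list:
    """Return (field, value, total) tuples, largest portfolio first."""
    ranked = []
    for field, value in identifiers:
        data = fetch(mini_params(api_key, field, value))
        # Assumption: total_Result is present in mini responses too.
        ranked.append((field, value, int(data.get("total_Result", 0))))
    return sorted(ranked, key=lambda item: item[2], reverse=True)

# Fake totals standing in for real API calls.
FAKE_TOTALS = {"a@example.org": 40, "Example Holdings": 7}

def fake_fetch(params: dict) -> dict:
    value = params.get("email") or params.get("company") or ""
    return {"total_Result": FAKE_TOTALS.get(value, 0)}

ranking = rank_identifiers(
    [("company", "Example Holdings"), ("email", "a@example.org")],
    fake_fetch,
)
print(ranking)
```

Identifiers that come back with zero or with thousands of results can then be skipped or narrowed before spending credits on full pulls.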
Going Further
A few things worth building on top of this:
- Export the high-risk domain list as IOCs and push to your SIEM or threat intel platform via its API.
- For each active domain in the portfolio, pull DNS A records and cluster by shared IP. Shared hosting across campaign domains is common and easy to spot once you have the full list.
- Run pivots on a schedule against known adversary identifiers. Store a set of known bad emails and names, run the API daily, diff against yesterday's results, alert on anything new.
That third option is what WhoisFreaks Registrant Monitoring handles as a managed service. Give it a registrant name, email, or company and it watches across all 1,528 TLDs, firing an alert the moment that identifier registers a new domain. No cron job to babysit.
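For the do-it-yourself route, the diff step itself is small. A minimal sketch using only the standard library: diff_snapshot and the JSON file layout are assumptions made for illustration, and the snapshot keeps the union of everything seen so a domain that drops out and reappears does not alert twice.

```python
import json
import tempfile
from pathlib import Path

def diff_snapshot(current: set, snapshot_path: Path) -> set:
    """Return domains not seen in earlier runs, then update the snapshot."""
    previous = set()
    if snapshot_path.exists():
        previous = set(json.loads(snapshot_path.read_text()))
    new_domains = current - previous
    # Store the union so previously seen domains never re-alert.
    snapshot_path.write_text(json.dumps(sorted(previous | current)))
    return new_domains

# Two simulated daily runs against a temporary snapshot file.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "known_domains.json"
    first = diff_snapshot({"a.com", "b.com"}, path)
    second = diff_snapshot({"a.com", "b.com", "c.net"}, path)

print(sorted(first), sorted(second))
```

The first run reports everything as new because there is no baseline yet, so in practice the initial result is seeded silently and alerts start from the second run.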
Full Source Code
Complete combined script: github.com/WhoisFreaks/reverse-whois-pivot

