Sameer Sheikh for WhoisFreaks

Posted on Jun 24

Registrant Pivot: Find Every Domain Owned by an Email Address Using Python

#cybersecurity #security #tutorial #python

Most threat investigators still do this manually. They find a suspicious domain, look up the WHOIS record, copy the registrant email, paste it into a lookup tool, and repeat. One domain at a time.

At 50 domains that's tedious. At 500 it breaks down completely.

I wanted to automate the pivot. Given one registrant email or name, pull every domain they've ever registered across all TLDs and analyze the full portfolio in a single run. Here's how I built it.

The Key Insight

A single threat actor rarely registers just one domain. They register clusters. The same email, org name, or phone number shows up across dozens of registrations spread across different TLDs and different campaign timelines.

Standard WHOIS lookups miss this entirely. They're forward lookups: domain to owner. What you actually need is the reverse: owner to all domains.

That's what the Reverse WHOIS API does. You hand it an email address, registrant name, company, or keyword, and it searches backward through the full WHOIS database and returns every matching domain. WhoisFreaks indexes 3.6B+ WHOIS records across 1,528 TLDs. Full product features and documentation are on the WhoisFreaks Reverse WHOIS API page.

Setup

git clone https://github.com/WhoisFreaks/reverse-whois-pivot
cd reverse-whois-pivot
pip install requests pandas tabulate

Get a free API key (500 credits, no card needed): https://billing.whoisfreaks.com/signup

API_KEY = "your_api_key_here"
BASE_URL = "https://api.whoisfreaks.com/v1.0/whois"

Step 1: Query the Reverse WHOIS API

The API takes one search identifier per request. You pass it as a named parameter: email, keyword, owner, or company. Pick the one that matches what you have.

import requests
import json

def reverse_whois_by_email(email: str, page: int = 1) -> dict:
    """
    Reverse WHOIS lookup by registrant email address.
    Returns all domains registered using this email.
    """
    params = {
        "apiKey": API_KEY,
        "whois":  "reverse",
        "email":  email,
        "page":   page
    }

    # A timeout keeps a stalled connection from hanging the whole run
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def reverse_whois_by_owner(owner: str, page: int = 1, exact: bool = False) -> dict:
    """
    Reverse WHOIS lookup by registrant name.
    Set exact=True for strict name matching.
    """
    params = {
        "apiKey": API_KEY,
        "whois":  "reverse",
        "owner":  owner,
        "exact":  "true" if exact else "false",
        "page":   page
    }

    # A timeout keeps a stalled connection from hanging the whole run
    response = requests.get(BASE_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


# Try it against a registrant email
result = reverse_whois_by_email("john.smith@protonmail.com")

print(f"Total results : {result.get('total_Result', 0)}")
print(f"Total pages   : {result.get('total_Pages', 0)}")
print(f"Current page  : {result.get('current_Page', 0)}")
print()
print(json.dumps(result.get("whois_domains_historical", [])[:2], indent=2))

Output:

Total results : 14
Total pages   : 1
Current page  : 1

[
  {
    "domain_name": "fastpaywall.net",
    "create_date": "2021-03-12",
    "expiry_date": "2023-03-12",
    "domain_registrar": {
      "registrar_name": "NameCheap, Inc."
    },
    "registrant_contact": {
      "name": "John Smith",
      "email_address": "john.smith@protonmail.com",
      "country_name": "Russia"
    }
  },
  {
    "domain_name": "securepay-verify.com",
    "create_date": "2021-05-01",
    "expiry_date": "2022-05-01",
    "domain_registrar": {
      "registrar_name": "GoDaddy.com, LLC"
    },
    "registrant_contact": {
      "name": "John Smith",
      "email_address": "john.smith@protonmail.com",
      "country_name": "Russia"
    }
  }
]

Fourteen domains from one email address.
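
The two helpers above differ only in the identifier they send. A minimal sketch of a single parameter builder, assuming the four identifier names from the text (email, keyword, owner, company) and that exact only applies to owner and company searches. build_reverse_params is a hypothetical helper, not part of the API or the repo, and it makes no network call.

```python
from typing import Optional

# Identifier names taken from the article text; treat the set as an assumption.
ALLOWED_IDENTIFIERS = {"email", "keyword", "owner", "company"}

def build_reverse_params(api_key: str, field: str, value: str,
                         page: int = 1, exact: Optional[bool] = None) -> dict:
    """Build query parameters for one reverse WHOIS request (hypothetical helper)."""
    if field not in ALLOWED_IDENTIFIERS:
        raise ValueError(f"unsupported identifier: {field!r}")
    if not value or not value.strip():
        raise ValueError("search value must not be empty")

    params = {
        "apiKey": api_key,
        "whois": "reverse",
        field: value.strip(),
        "page": page,
    }
    # Only send 'exact' for the name-style searches the article mentions
    if exact is not None and field in {"owner", "company"}:
        params["exact"] = "true" if exact else "false"
    return params


params = build_reverse_params("your_api_key_here", "company", "SmithWeb Services", exact=True)
print(params)
```

The resulting dict can be passed as params= to the same requests.get call used in the helpers above.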

Step 2: Handle Pagination

Results are paginated, 10 per page by default. Large registrant portfolios need every page pulled automatically.

import time

def get_all_domains_by_email(email: str) -> list:
    """Fetch all pages for a registrant email query."""
    all_domains = []
    page = 1

    while True:
        data = reverse_whois_by_email(email, page=page)

        domains = data.get("whois_domains_historical", [])
        if not domains:
            break

        all_domains.extend(domains)

        total_pages = data.get("total_Pages", 1)
        total_results = data.get("total_Result", 0)

        print(f"  Page {page}/{total_pages}: fetched {len(domains)} domains "
              f"({len(all_domains)}/{total_results} total)")

        if page >= total_pages:
            break

        page += 1
        time.sleep(0.3)  # stay polite

    return all_domains


domains = get_all_domains_by_email("john.smith@protonmail.com")
print(f"\nDone. Total domains: {len(domains)}")

Output:

  Page 1/1: fetched 14 domains (14/14 total)

Done. Total domains: 14
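
The same loop can be written once and reused for any identifier. A minimal sketch, assuming the response keys shown earlier (whois_domains_historical, total_Pages); iter_domain_records and its max_pages guard are hypothetical additions. The fetch function is injected, so the example runs against fake pages instead of the live API.

```python
from typing import Callable, Iterator

def iter_domain_records(fetch_page: Callable[[int], dict],
                        max_pages: int = 1000) -> Iterator[dict]:
    """Yield records page by page until the response reports no more pages."""
    page = 1
    while page <= max_pages:  # hard stop in case total_Pages is wrong
        data = fetch_page(page)
        records = data.get("whois_domains_historical") or []
        if not records:
            return
        yield from records
        if page >= int(data.get("total_Pages", 1)):
            return
        page += 1


# Fake two-page response used only to exercise the loop
fake_pages = {
    1: {"total_Pages": 2, "whois_domains_historical": [{"domain_name": "a.com"}]},
    2: {"total_Pages": 2, "whois_domains_historical": [{"domain_name": "b.net"}]},
}
records = list(iter_domain_records(lambda p: fake_pages.get(p, {})))
print([r["domain_name"] for r in records])  # ['a.com', 'b.net']
```

Against the real API the fetcher would be something like lambda p: reverse_whois_by_email(email, page=p), with a short sleep inside it to stay polite.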

Step 3: Structure the Portfolio

Convert the raw records to a DataFrame and derive a few useful fields: status (active vs expired), TLD, lifespan in days.

import pandas as pd
from datetime import datetime

def analyze_portfolio(domains: list) -> pd.DataFrame:
    records = []

    for d in domains:
        registrant = d.get("registrant_contact", {})
        registrar  = d.get("domain_registrar",  {})

        create_date = d.get("create_date", "")
        expiry_date = d.get("expiry_date", "")

        # Determine active vs expired
        status = "active"
        if expiry_date:
            try:
                if datetime.strptime(expiry_date, "%Y-%m-%d") < datetime.now():
                    status = "expired"
            except ValueError:
                pass

        # Lifespan in days
        lifespan = None
        try:
            c = datetime.strptime(create_date, "%Y-%m-%d")
            e = datetime.strptime(expiry_date, "%Y-%m-%d")
            lifespan = (e - c).days
        except (ValueError, TypeError):
            pass

        domain_name = d.get("domain_name", "")
        tld = "." + domain_name.rsplit(".", 1)[-1] if "." in domain_name else "unknown"

        records.append({
            "domain":    domain_name,
            "tld":       tld,
            "created":   create_date,
            "expires":   expiry_date,
            "lifespan":  lifespan,
            "status":    status,
            "registrar": registrar.get("registrar_name", "unknown"),
            "name":      registrant.get("name", ""),
            "email":     registrant.get("email_address", ""),
            "country":   registrant.get("country_name", ""),
            "org":       registrant.get("company", ""),
        })

    df = pd.DataFrame(records)
    # An empty result has no "created" column, so guard before sorting
    if not df.empty:
        df = df.sort_values("created", ascending=False)
    return df


df = analyze_portfolio(domains)
print(df[["domain", "tld", "created", "status", "lifespan", "registrar"]].to_string(index=False))

Output:

                 domain   tld     created   status  lifespan            registrar
  crypto-wallet-pro.net  .net  2022-11-03   active       365      NameCheap, Inc.
   securepay-verify.com  .com  2021-05-01  expired       365     GoDaddy.com, LLC
        fastpaywall.net  .net  2021-03-12  expired       730      NameCheap, Inc.
   walletrecovery24.com  .com  2021-02-28  expired       365      NameCheap, Inc.
 bitcoin-claims-now.com  .com  2020-09-17  expired       365     GoDaddy.com, LLC
      cryptosafe-id.net  .net  2020-07-05  expired       365      NameCheap, Inc.
paypal-secure-check.com  .com  2020-04-21  expired       365  Tucows Domains Inc.

Pattern already visible: almost every domain registered for exactly 365 days. That's not coincidence.
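
To put a number on that pattern, here is a minimal sketch that measures what share of a portfolio is a one-calendar-year registration. It compares calendar dates rather than day counts, because a one-year registration that spans February 29 lasts 366 days. The function names are hypothetical, and the field names (create_date, expiry_date) are the ones from the sample response.

```python
from datetime import date

def is_one_year_registration(created: str, expires: str) -> bool:
    """True when expiry falls one calendar year after creation."""
    try:
        c = date.fromisoformat(created)
        e = date.fromisoformat(expires)
    except (ValueError, TypeError):
        return False  # missing or malformed dates never count
    if (c.month, c.day) == (2, 29):
        # No Feb 29 in the following year, so accept either neighbour
        return e in (date(c.year + 1, 2, 28), date(c.year + 1, 3, 1))
    return e == c.replace(year=c.year + 1)

def one_year_share(records: list) -> float:
    """Fraction of dated records that are one-year registrations."""
    dated = [r for r in records if r.get("create_date") and r.get("expiry_date")]
    if not dated:
        return 0.0
    hits = sum(is_one_year_registration(r["create_date"], r["expiry_date"]) for r in dated)
    return hits / len(dated)


sample = [
    {"create_date": "2021-05-01", "expiry_date": "2022-05-01"},  # one year, 365 days
    {"create_date": "2021-03-12", "expiry_date": "2023-03-12"},  # two years
    {"create_date": "2019-09-17", "expiry_date": "2020-09-17"},  # one year, 366 days
]
print(f"{one_year_share(sample):.0%}")  # prints 67%
```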

Step 4: Score for Risk

Not every result is adversarial. Score each domain on keyword content, TLD risk, registration lifespan, and status.

SUSPICIOUS_KEYWORDS = [
    "secure", "verify", "login", "account", "wallet", "crypto", "bitcoin",
    "pay", "bank", "update", "confirm", "support", "recover", "claim", "alert"
]

HIGH_RISK_TLDS = {".xyz", ".top", ".click", ".tk", ".ml", ".cf", ".ga", ".pw"}

def score_domain(row: pd.Series) -> int:
    score = 0
    domain_lower = row["domain"].lower()

    # Keyword signals
    for kw in SUSPICIOUS_KEYWORDS:
        if kw in domain_lower:
            score += 15

    # High-risk TLD
    if row["tld"] in HIGH_RISK_TLDS:
        score += 20

    # Short lifespan: registered for one year or less (disposable infrastructure)
    if row["lifespan"] is not None and row["lifespan"] <= 365:
        score += 25

    # Abandoned infrastructure
    if row["status"] == "expired":
        score += 10

    return min(score, 100)


df["risk_score"] = df.apply(score_domain, axis=1)
df["risk_label"] = pd.cut(
    df["risk_score"],
    bins=[-1, 30, 60, 100],
    labels=["LOW", "MEDIUM", "HIGH"]
)

print(df["risk_label"].value_counts().to_string())
print()
high = df[df["risk_label"] == "HIGH"].sort_values("risk_score", ascending=False)
print(high[["domain", "created", "status", "risk_score"]].to_string(index=False))

Output:

HIGH      9
MEDIUM    3
LOW       2

                 domain     created   status  risk_score
   securepay-verify.com  2021-05-01  expired          80
   walletrecovery24.com  2021-02-28  expired          65
 bitcoin-claims-now.com  2020-09-17  expired          65
paypal-secure-check.com  2020-04-21  expired          65
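
A bare number is hard to defend in a report, so it helps to see which signals fired. A minimal sketch of a breakdown function that reuses the same weights and lists as score_domain above; explain_score is a hypothetical helper and the example domain is made up.

```python
SUSPICIOUS_KEYWORDS = [
    "secure", "verify", "login", "account", "wallet", "crypto", "bitcoin",
    "pay", "bank", "update", "confirm", "support", "recover", "claim", "alert"
]
HIGH_RISK_TLDS = {".xyz", ".top", ".click", ".tk", ".ml", ".cf", ".ga", ".pw"}

def explain_score(domain: str, lifespan, status: str):
    """Return (capped score, list of (signal, points)) using the article's weights."""
    name = domain.lower()
    reasons = [(f"keyword:{kw}", 15) for kw in SUSPICIOUS_KEYWORDS if kw in name]

    tld = "." + name.rsplit(".", 1)[-1] if "." in name else "unknown"
    if tld in HIGH_RISK_TLDS:
        reasons.append((f"tld:{tld}", 20))
    if lifespan is not None and lifespan <= 365:
        reasons.append(("lifespan<=365", 25))
    if status == "expired":
        reasons.append(("expired", 10))

    total = min(sum(points for _, points in reasons), 100)
    return total, reasons


# Made-up domain: three keywords (45) + high-risk TLD (20) + one-year lifespan (25)
total, reasons = explain_score("secure-login-bank.xyz", 365, "active")
print(total)  # 90
for signal, points in reasons:
    print(f"  +{points:>2}  {signal}")
```

Because keywords are matched as substrings, a name such as "paypal" also counts as a hit for "pay". Printing the breakdown makes that kind of overlap easy to spot.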

Step 5: Chain the Pivot

Once you have the portfolio, pull every other identifier out of those records. Each one becomes a new search.

def extract_pivot_targets(df: pd.DataFrame) -> dict:
    pivots = {
        "emails":    set(df["email"].dropna().unique()),
        "orgs":      set(df["org"].dropna().unique()),
        "names":     set(df["name"].dropna().unique()),
        "countries": set(df["country"].dropna().unique()),
    }
    # Strip blanks
    return {k: {v for v in vals if str(v).strip()} for k, vals in pivots.items()}


pivots = extract_pivot_targets(df)
for ptype, values in pivots.items():
    print(f"\n{ptype.upper()}:")
    for v in sorted(values):
        print(f"  {v}")

Output:

EMAILS:
  j.smith.domains@gmail.com
  john.smith@protonmail.com

ORGS:
  Privacy Protection LLC
  SmithWeb Services

NAMES:
  John Smith

COUNTRIES:
  Russia
  Ukraine

Two emails. Two orgs. Run get_all_domains_by_email("j.smith.domains@gmail.com") next and you'll find the second campaign cluster.
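
Chaining by hand gets error-prone after the second hop. A minimal sketch of a breadth-first pivot over email addresses, with a seen-set so no identifier is queried twice and a depth limit so the crawl stays bounded. chain_pivots is a hypothetical helper; the lookup function is injected so the example runs on fake data. With the real API, lookup could be get_all_domains_by_email.

```python
from collections import deque
from typing import Callable, Iterable

def chain_pivots(seed_email: str,
                 lookup: Callable[[str], Iterable[dict]],
                 max_depth: int = 2):
    """Follow registrant emails outward from a seed; return (domains, emails seen)."""
    seed = seed_email.strip().lower()
    seen_emails = {seed}
    domains = {}                      # domain name -> first record seen for it
    queue = deque([(seed, 0)])

    while queue:
        email, depth = queue.popleft()
        for rec in lookup(email):
            name = rec.get("domain_name")
            if name:
                domains.setdefault(name, rec)
            if depth >= max_depth:
                continue              # collect domains, but stop expanding
            contact = rec.get("registrant_contact") or {}
            found = (contact.get("email_address") or "").strip().lower()
            if found and found not in seen_emails:
                seen_emails.add(found)
                queue.append((found, depth + 1))
    return domains, seen_emails


# Fake lookup table standing in for the API
fake = {
    "a@x.com": [
        {"domain_name": "one.com", "registrant_contact": {"email_address": "a@x.com"}},
        {"domain_name": "two.com", "registrant_contact": {"email_address": "B@y.com"}},
    ],
    "b@y.com": [
        {"domain_name": "three.net", "registrant_contact": {"email_address": "b@y.com"}},
    ],
}
domains, emails = chain_pivots("a@x.com", lambda e: fake.get(e, []))
print(sorted(domains))  # ['one.com', 'three.net', 'two.com']
print(sorted(emails))   # ['a@x.com', 'b@y.com']
```

The same structure extends to org and name pivots by queuing (identifier type, value) pairs instead of bare emails. Keep max_depth low, since every hop costs API credits and privacy-proxy addresses can fan out into unrelated registrants.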

Real Results

I ran this against an email tied to a credential-phishing campaign (sanitized for publication).

47 domains came back. 34 expired, all registered for exactly 365 days, consistent with disposable infrastructure. 8 still active, 3 pointing to live phishing pages at the time I ran it. The remaining 5 had just been registered and weren't resolving yet.

All 8 active domains were registered inside a 6-day window in late 2023. Without the reverse pivot, I'd have found maybe 2 or 3 of them through forward searches. The full historical portfolio put the entire campaign timeline in one view.

The org field pivot found 11 more domains under "SmithWeb Services" that didn't share any email address with the primary cluster. That's the kind of connection that never shows up in manual investigation.
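
Registration bursts like that 6-day window can be surfaced automatically. A minimal sketch using greedy grouping: sort by creation date, start a group at the earliest unassigned date, and keep adding domains while they fall within window_days of that start. registration_bursts and its defaults are hypothetical; the sample records are made up.

```python
from datetime import date

def registration_bursts(records: list, window_days: int = 6, min_size: int = 3) -> list:
    """Group domains whose creation dates fall within window_days of the group's first date."""
    dated = []
    for rec in records:
        try:
            created = date.fromisoformat(rec.get("create_date") or "")
        except (ValueError, TypeError):
            continue  # skip missing or malformed dates
        dated.append((created, rec.get("domain_name") or ""))
    dated.sort()

    bursts, current = [], []
    for created, name in dated:
        # Close the current group once this date is too far from its first member
        if current and (created - current[0][0]).days > window_days:
            if len(current) >= min_size:
                bursts.append([n for _, n in current])
            current = []
        current.append((created, name))
    if len(current) >= min_size:
        bursts.append([n for _, n in current])
    return bursts


sample = [
    {"domain_name": "a.com", "create_date": "2023-11-01"},
    {"domain_name": "b.com", "create_date": "2023-11-03"},
    {"domain_name": "c.com", "create_date": "2023-11-06"},
    {"domain_name": "d.com", "create_date": "2023-11-20"},
    {"domain_name": "e.com", "create_date": "2021-01-01"},
    {"domain_name": "f.com", "create_date": ""},
]
print(registration_bursts(sample))  # [['a.com', 'b.com', 'c.com']]
```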

What I Noticed

Exact vs fuzzy matching matters. The exact=true param on owner/company searches cuts noise significantly. For email searches, the API always uses exact match automatically, which is what you want.
365-day lifespan is a stronger signal than keywords. Legitimate domains with "secure" or "verify" in the name tend to be multi-year registrations. Malicious ones are almost always exactly 12 months. That 25-point score bump for lifespans of 365 days or less earns its weight.
Org names are lazier than emails. Threat actors rotate emails between campaigns. They reuse org names constantly. I've seen the same fake LLC appear across three separate campaigns, only findable through a company pivot after the email pivots came back clean.
Privacy protection creates gaps, but not complete ones. About 30% of records had redacted registrant data from privacy proxies. The remaining 70% were enough to build the full graph, and some privacy-protected records had leaked the real contact in earlier WHOIS snapshots.
The mode=mini param is worth knowing. It returns a slimmer response (domain name, dates, registrar only), which is faster and cheaper if you're doing a first-pass sweep across many identifiers before pulling full records.
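
Privacy-protected records are worth separating out before extracting pivot targets, because a proxy's name or email would otherwise link unrelated registrants. A minimal sketch, assuming a short list of marker substrings; the markers are illustrative guesses, not a list published by WhoisFreaks, and is_privacy_protected and split_by_privacy are hypothetical helpers. Records with no registrant data at all are treated as redacted.

```python
# Illustrative markers only; tune against the records you actually see
PRIVACY_MARKERS = ("privacy", "redacted", "proxy", "whoisguard", "protected")

def is_privacy_protected(record: dict) -> bool:
    """True when the registrant contact is empty or looks like a privacy proxy."""
    contact = record.get("registrant_contact") or {}
    fields = (contact.get("name"), contact.get("company"), contact.get("email_address"))
    text = " ".join(f for f in fields if f).lower()
    return not text or any(marker in text for marker in PRIVACY_MARKERS)

def split_by_privacy(records: list):
    """Return (usable, redacted) lists, preserving input order."""
    usable, redacted = [], []
    for rec in records:
        (redacted if is_privacy_protected(rec) else usable).append(rec)
    return usable, redacted


sample = [
    {"domain_name": "one.com",
     "registrant_contact": {"name": "John Smith", "email_address": "john.smith@protonmail.com"}},
    {"domain_name": "two.com",
     "registrant_contact": {"company": "Privacy Protection LLC"}},
    {"domain_name": "three.net", "registrant_contact": {}},
]
usable, redacted = split_by_privacy(sample)
print([r["domain_name"] for r in usable])    # ['one.com']
print([r["domain_name"] for r in redacted])  # ['two.com', 'three.net']
```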

Going Further

A few things worth building on top of this:

Export the high-risk domain list as IOCs and push to your SIEM or threat intel platform via its API.
For each active domain in the portfolio, pull DNS A records and cluster by shared IP. Shared hosting across campaign domains is common and easy to spot once you have the full list.
Run pivots on a schedule against known adversary identifiers. Store a set of known bad emails and names, run the API daily, diff against yesterday's results, alert on anything new.

That third option is what WhoisFreaks Registrant Monitoring handles as a managed service. Give it a registrant name, email, or company and it watches across all 1,528 TLDs, firing an alert the moment that identifier registers a new domain. No cron job to babysit.

Full Source Code

Complete combined script: github.com/WhoisFreaks/reverse-whois-pivot
