Most "domain finders" on GitHub stop at stage one — they spit out a list of names someone might want to buy. The real work is everything after that: deciding which ones are worth money, not buying the obvious trademark traps, not blowing the monthly budget on a hype day, and actually settling a sale.
I built domain-harness because I wanted to test a small thesis — "AI-generated brandable domains + automated valuation can clear $X / month" — without giving a script direct keys to a registrar and praying.
This post is about the architecture, not the thesis. Specifically: how do you build something that touches money and registrar APIs without losing money the moment the LLM hallucinates?
The pipeline
discover ─► value ─► acquire ─► list ─► negotiate ─► settle
    │           │           │          │          │
 AI gen +    local +     multi-     Dan /     AI reply
 expired     AI Council  registrar  Afternic  + counter
 feeds       (Claude     fallback   + Sedo
             + DeepSeek)
Six stages, all gateable individually, all dry_run by default. The thing that makes it safe is what's between the stages, not the stages themselves.
Defense layer 1: two-tier valuation
Calling an LLM to value every candidate domain is how you waste $50 of tokens on garbage by 11pm. Local heuristic gate first:
- Length penalty over 12 chars
- Hard-cap punctuation / digit counts
- Dictionary-word boost
- Brandable phonotactics scoring (CVC patterns score better than xqkz)
- Hard reject on hyphens-and-numbers cocktails
Only candidates above min_local_score get billed against the AI Council. In practice this filters about 90% of the discovery output, and the surviving 10% is the only thing worth a token spend.
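The heuristic gate can be sketched in a few lines. This is an illustrative version, not the repo's actual scoring code — the function name, weights, and the 40-point cutoff are all assumptions standing in for `min_local_score`:

```python
# Sketch of the local pre-filter described above (names and weights are
# illustrative, not the repo's actual API). Cheap heuristics run first so
# no token is spent on obvious garbage.
import re

VOWELS = set("aeiou")

def local_score(domain: str) -> float:
    name = domain.split(".")[0].lower()
    score = 50.0
    # Hard reject: hyphens-and-numbers cocktails never reach the AI Council.
    if "-" in name and any(c.isdigit() for c in name):
        return 0.0
    # Hard-cap punctuation / digit counts.
    if sum(c.isdigit() for c in name) > 2 or name.count("-") > 1:
        return 0.0
    # Length penalty over 12 chars.
    if len(name) > 12:
        score -= 3.0 * (len(name) - 12)
    # Brandable phonotactics: reward consonant-vowel alternation...
    cv = "".join("v" if c in VOWELS else "c" for c in name if c.isalpha())
    score += 2.0 * len(re.findall(r"cv", cv))
    # ...and penalize unpronounceable consonant pile-ups like "xqkz".
    score -= 6.0 * len(re.findall(r"c{4,}", cv))
    return max(score, 0.0)

candidates = ["zorvia.com", "xqkz-99.net", "superlongdomainnamehere.com"]
survivors = [d for d in candidates if local_score(d) >= 40.0]
```

The point isn't that these exact weights are right — it's that a pure function over the string filters most of the stream for free.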
Defense layer 2: AI Council, not AI Oracle
Every survivor goes to two independent providers — Claude and DeepSeek — with the same prompt:
council = Council(
    [function_voter("claude", claude_valuator, weight=1.0),
     function_voter("deepseek", deepseek_valuator, weight=1.0)],
    threshold=2,  # both must clear per-provider threshold
)
Both must clear their threshold before the harness will spend a cent. One model alone is a hallucination machine; two independent models with the same hallucination at the same time is much rarer (and when it does happen, your loss is capped by layer 3 anyway).
Caveat I learned the hard way in a different project: LLM scores are not alpha. In a trading harness I weighted council confidence directly into position size and the high-confidence picks systematically lost money. So in domain-harness the council is a gate, not a size multiplier. It can say "yes / no, the domain is worth it" — it cannot say "buy 5x more of this one."
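The "gate, not multiplier" rule fits in a few lines. This is a minimal sketch with invented names (`Voter`, `council_approves`, the stub valuators) — not the repo's actual `Council` class — to show that the council's output is strictly boolean:

```python
# Minimal sketch of "gate, not multiplier" (illustrative names, not the
# repo's classes). Each voter returns a 0-100 valuation; the council only
# answers yes/no -- confidence never scales how much gets spent.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Voter:
    name: str
    valuate: Callable[[str], float]  # would wrap a Claude / DeepSeek call
    min_score: float = 60.0          # per-provider threshold

def council_approves(domain: str, voters: list[Voter], threshold: int = 2) -> bool:
    votes = sum(1 for v in voters if v.valuate(domain) >= v.min_score)
    return votes >= threshold  # with threshold=2, both providers must clear

# Stub valuators standing in for the LLM calls:
claude_stub = Voter("claude", lambda d: 72.0)
deepseek_stub = Voter("deepseek", lambda d: 55.0)
council_approves("zorvia.com", [claude_stub, deepseek_stub])
# One provider below its bar -> gate stays closed, zero spend.
```

Note what's absent: the scores are thrown away after the comparison. There is no path from "confidence 92" to "buy more."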
Defense layer 3: hard budget walls
This is the part that lets me sleep:
- daily_cap_usd: $50
- monthly_cap_usd: $1000
- per_domain_cap_usd: $15
These aren't soft warnings. Every acquisition.buy() call goes through budget_guard.check() and an over-cap call raises BudgetExceeded. The buyer literally cannot exceed them. If any single wall trips, the discovery loop stops and the daemon goes idle until a human re-enables it.
I keep a state file data/budget_state.json with today's spend and this month's spend. The CI smoke test verifies every guard:
[5] Budget guard hard brake
    ✓ over-cap single-domain buy blocked ($200 > $15)
    ✓ normal amount passes ($10)
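A sketch of what that wall looks like. The class and method names follow the post's description (`budget_guard.check()`, `BudgetExceeded`, `data/budget_state.json`), but the implementation details are my assumptions, not the repo's code:

```python
# Hedged sketch of the hard budget wall. Raise, don't warn: an over-cap
# call can never proceed, because the exception propagates out of buy().
import json
from pathlib import Path

class BudgetExceeded(Exception):
    pass

class BudgetGuard:
    def __init__(self, state_path: Path, daily_cap: float = 50.0,
                 monthly_cap: float = 1000.0, per_domain_cap: float = 15.0):
        self.daily_cap, self.monthly_cap = daily_cap, monthly_cap
        self.per_domain_cap = per_domain_cap
        self.state_path = state_path
        # Mirrors data/budget_state.json: today's and this month's spend.
        self.state = (json.loads(state_path.read_text()) if state_path.exists()
                      else {"today_spend": 0.0, "month_spend": 0.0})

    def check(self, price: float) -> None:
        if price > self.per_domain_cap:
            raise BudgetExceeded(f"${price} > per-domain cap ${self.per_domain_cap}")
        if self.state["today_spend"] + price > self.daily_cap:
            raise BudgetExceeded("daily cap would be exceeded")
        if self.state["month_spend"] + price > self.monthly_cap:
            raise BudgetExceeded("monthly cap would be exceeded")

    def record(self, price: float) -> None:
        # Only called after a successful buy; persists across restarts.
        self.state["today_spend"] += price
        self.state["month_spend"] += price
        self.state_path.write_text(json.dumps(self.state))
```

Persisting spend to disk matters more than it looks: a crash-and-restart loop must not reset the counters to zero.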
Defense layer 4: trademark + WHOIS
A surprising fraction of "great brandable domain" suggestions from LLMs are subtle trademark violations. mygoogle.com is obvious; paypall.io is less obvious; openai-clone.ai is comically obvious but the council will still rank it because it scores "ai".
So:
- Static blocklist (Fortune 500 + Anthropic / OpenAI / Stripe / etc).
- User-extensible blocklist.
- Substring match against the blocklist (so mygoogle.com and openai-clone.ai fail).
- WHOIS check before any buy attempt (some "available" feeds lie).
(Edit-distance / typo-variant matching is on the roadmap — goggle.com currently slips past unless you add it explicitly.)
[3] Trademark blocklist
    ✓ mygoogle.com blocked
    ✓ openai-clone.ai blocked
    ✓ paywall.com passes (not a trademark)
[4] WHOIS check
    ✓ registered domain recognized (google.com)
Defense layer 5: multi-registrar fallback
When the harness decides to buy, it doesn't fail-stop on a single registrar's API hiccup. The priority chain is Porkbun → Cloudflare: if Porkbun returns "unavailable" or a 5xx, the harness falls through to Cloudflare. If both fail, the candidate goes to a retry queue, not a buy.
Each registrar is a thin adapter — same register(domain, contact) interface, different SDK underneath. The adapter shape is deliberately simple so adding Namecheap or GoDaddy later is a single new file, not a refactor.
Defense layer 6: dry_run by default, opt-in spend
The default mode in .env.example is MODE=dry_run. Every stage knows the mode and short-circuits actual money calls when it's dry_run. The full pipeline runs end-to-end in dry_run, so I can backtest discovery → valuation → would-have-bought against a real registrar SDK without paying. Flipping to live is a one-line change — and the rule is to re-read the smoke test results before making it.
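The short-circuit pattern, sketched with assumed names (`buy`, `registrar_buy` is a hypothetical stand-in for the live SDK path):

```python
# Sketch of the dry_run short-circuit. Every money-touching call checks
# MODE first; the rest of the stage runs identically in both modes.
import os

def registrar_buy(domain: str, price: float) -> str:
    # Live-spend path: a real registrar SDK call would go here
    # (deliberately stubbed out in this sketch).
    raise RuntimeError("live mode disabled in this sketch")

def buy(domain: str, price: float) -> str:
    if os.getenv("MODE", "dry_run") == "dry_run":
        # Default is the safe branch: log the decision, spend nothing.
        return f"DRY-RUN: would buy {domain} at ${price:.2f}"
    return registrar_buy(domain, price)
```

The detail that matters is the default in `os.getenv`: an unset or typo'd MODE falls back to dry_run, so misconfiguration fails toward not spending.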
CI runs the 25-case smoke on every push:
Summary: 25 PASS / 0 FAIL/ERROR / 25 total
That suite covers every gate — discovery → valuation → trademark → WHOIS → budget guard → duplicate guard → registration → portfolio write → listing → AI negotiation → settle. Failing any one of them blocks merge.
What I'd build differently if starting over
- Faster registrar latency profiling. Currently the fallback is static priority. A version 2 should A/B by recent success rate and latency.
- Per-vertical AI prompts. The council right now is generic; "this is a fintech domain" vs "this is a SaaS domain" probably wants different valuation framing.
- Negotiation memory. The reply bot is stateless per inquiry; same buyer asking three times gets three independent counters.
When this is the wrong tool
- You want to register vanity domains for personal projects. This is overkill.
- You're already buying via a managed broker. They handle the registrar / settle layers.
- You don't want any LLM cost. The council is the heart of the value layer.
When this is the right tool
- You want a budget-walled testbed for a domain investing thesis.
- You're tired of "I clicked buy and it took $50 because the dropdown defaulted to 5 years."
- You want a CI-gated pipeline where every buy decision is auditable and reversible (well — except for the buy, but the decision is).
Get started
git clone https://github.com/lfzds4399-cpu/domain-harness.git
cd domain-harness
pip install -r requirements.txt
cp .env.example .env # fill in only what you use
python tests/e2e_smoke.py # 25 PASS in dry_run, no spend
Repo: github.com/lfzds4399-cpu/domain-harness. MIT.
If you're building anything that touches money + LLMs, I'd love to compare notes on which guards you ended up needing vs which ones you removed because they fired too often. Issues / PRs welcome.