<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Osiris Technical Institute</title>
    <description>The latest articles on DEV Community by Osiris Technical Institute (@osiristechnicalinstitute).</description>
    <link>https://dev.to/osiristechnicalinstitute</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3908507%2F857e69a1-1231-4a0f-9d06-e477cd4e838b.jpeg</url>
      <title>DEV Community: Osiris Technical Institute</title>
      <link>https://dev.to/osiristechnicalinstitute</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/osiristechnicalinstitute"/>
    <language>en</language>
    <item>
      <title>5 ways subdomain enumeration breaks (and how to handle each)</title>
      <dc:creator>Osiris Technical Institute</dc:creator>
      <pubDate>Wed, 13 May 2026 13:28:16 +0000</pubDate>
      <link>https://dev.to/osiristechnicalinstitute/5-ways-subdomain-enumeration-breaks-and-how-to-handle-each-3ihn</link>
      <guid>https://dev.to/osiristechnicalinstitute/5-ways-subdomain-enumeration-breaks-and-how-to-handle-each-3ihn</guid>
      <description>&lt;p&gt;Subdomain enumeration looks easy. There's a wordlist. There are CT logs. There's a DNS resolver. Plug them together, return a list. Maybe sort it.&lt;/p&gt;

&lt;p&gt;Then you run it on 50 different domains for the first time and notice that the results are wildly inconsistent. Sometimes you get 3 subdomains. Sometimes you get 30,000. Sometimes you get an empty array on a domain that should have an obvious hit. The tool isn't broken — it's quietly failing in five different ways, depending on the input.&lt;/p&gt;

&lt;p&gt;Here's what each one looks like in production, and how to build a tool that actually returns useful results across arbitrary inputs.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. CT log sources go down silently
&lt;/h2&gt;

&lt;p&gt;The most common single source is crt.sh. It has fantastic coverage, it's free, and its uptime is... not consistent. A naive implementation hits crt.sh, gets a non-200 or an empty array, and treats the result as "no subdomains found." Which is &lt;em&gt;technically&lt;/em&gt; what crt.sh returned. But it's wrong.&lt;/p&gt;

&lt;p&gt;The fix is parallelism plus honesty:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;asyncio&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;enumerate_subdomains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;sources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;crtsh&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;certspotter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hackertarget&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;run_with_timeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;return_exceptions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;failed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sources&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;subdomains&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;warnings&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three independent sources are queried in parallel. The function returns the union of whatever succeeded, plus a &lt;code&gt;warnings&lt;/code&gt; list naming the sources that raised, timed out, or came back empty, so the caller can reason about completeness. If only one source succeeded, the caller knows. If all failed, the caller knows. Silent partial failures are the actual bug, not the upstream outages.&lt;/p&gt;
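
&lt;p&gt;The snippet above calls a &lt;code&gt;run_with_timeout&lt;/code&gt; helper without defining it. A minimal sketch of what it might look like, assuming each source is a coroutine function that takes a domain and returns a list of names (&lt;code&gt;fast_source&lt;/code&gt; and &lt;code&gt;hung_source&lt;/code&gt; are hypothetical stand-ins, not real integrations):&lt;/p&gt;

```python
import asyncio


async def run_with_timeout(source, domain, timeout=10):
    """Await one source coroutine, giving up after `timeout` seconds.

    On timeout, asyncio.wait_for cancels the call and raises
    asyncio.TimeoutError, which gather(..., return_exceptions=True)
    hands back as a value instead of propagating.
    """
    return await asyncio.wait_for(source(domain), timeout=timeout)


# Hypothetical stand-in sources, for illustration only.
async def fast_source(domain):
    return [f"www.{domain}"]


async def hung_source(domain):
    await asyncio.sleep(60)  # simulates an upstream that never answers
    return []


async def demo():
    return await asyncio.gather(
        run_with_timeout(fast_source, "example.com", timeout=0.05),
        run_with_timeout(hung_source, "example.com", timeout=0.05),
        return_exceptions=True,
    )
```

&lt;p&gt;Running &lt;code&gt;asyncio.run(demo())&lt;/code&gt; yields one list of names and one exception object, which is the shape the &lt;code&gt;isinstance(r, Exception)&lt;/code&gt; check expects.&lt;/p&gt;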

&lt;h2&gt;
  
  
  2. Pure CT misses dormant subdomains
&lt;/h2&gt;

&lt;p&gt;Certificate Transparency logs catch subdomains that have been named in a publicly trusted TLS certificate. They don't catch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Internal-only subdomains that never got a public cert&lt;/li&gt;
&lt;li&gt;Subdomains using self-signed certs&lt;/li&gt;
&lt;li&gt;Subdomains provisioned this week that haven't issued a cert yet&lt;/li&gt;
&lt;li&gt;Subdomains behind cloud providers that don't propagate certs to the CT log ecosystem&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The realistic gap is somewhere between 10% and 30% of an organization's actual subdomain inventory, depending on how much internal infrastructure they expose.&lt;/p&gt;

&lt;p&gt;The fix is a DNS bruteforce fallback. Critically, this fallback only runs when the CT sources return suspiciously few results — not on every query, because bruteforce is slow and noisy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;secrets&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fallback_bruteforce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Resolve a random subdomain first - if it returns an IP,
&lt;/span&gt;    &lt;span class="c1"&gt;# the domain has wildcard DNS and bruteforcing is meaningless
&lt;/span&gt;    &lt;span class="n"&gt;nonce&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doesnotexist-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;secrets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;token_hex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;nonce&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# wildcard DNS - skip
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;wordlist&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bigger wordlist = better coverage, slower runtime. The standard wordlists from ProjectDiscovery work well as a starting point.&lt;/p&gt;
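
&lt;p&gt;The function above resolves candidates one at a time, and the "suspiciously few results" check is left to the caller. A rough sketch of both pieces, assuming a &lt;code&gt;resolve&lt;/code&gt; coroutine that returns a truthy value when a name exists; the threshold of 5 and the concurrency cap of 20 are placeholder values, not recommendations:&lt;/p&gt;

```python
import asyncio

MIN_CT_RESULTS = 5  # assumed threshold; tune it for your own targets


def needs_fallback(ct_names, minimum=MIN_CT_RESULTS):
    """Return True when CT output is thin enough to justify bruteforcing."""
    return minimum > len(ct_names)


async def bruteforce(domain, wordlist, resolve, max_in_flight=20):
    """Resolve wordlist candidates concurrently, capped by a semaphore.

    `resolve` is an assumed coroutine: resolve(name) is truthy if the
    name exists. Returns the full hostnames that resolved, sorted.
    """
    gate = asyncio.Semaphore(max_in_flight)

    async def check(word):
        name = f"{word}.{domain}"
        async with gate:  # at most max_in_flight lookups at once
            return name if await resolve(name) else None

    hits = await asyncio.gather(*(check(w) for w in wordlist))
    return sorted(h for h in hits if h is not None)
```

&lt;p&gt;The wildcard probe still has to run first; this sketch only covers the gating decision and the lookup loop.&lt;/p&gt;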

&lt;h2&gt;
  
  
  3. Wildcard DNS poisons results
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;*.example.com&lt;/code&gt; is a legitimate DNS configuration: it makes any subdomain that has no record of its own resolve to the same target. CDNs and SaaS platforms use it routinely. If you run a DNS bruteforce against a domain with a wildcard, every word in your wordlist resolves successfully, and you return a list of 10,000 fake subdomains.&lt;/p&gt;

&lt;p&gt;The fix is the wildcard-detection probe shown above: query a random nonsense subdomain &lt;em&gt;before&lt;/em&gt; running the wordlist. If the nonsense resolves, the domain has a wildcard, and you skip bruteforcing entirely (or aggressively filter to entries whose IP differs from the wildcard's IP — which catches some real subdomains but is fragile).&lt;/p&gt;

&lt;p&gt;This is the failure mode that does the most damage to a tool's credibility. A user who sees 10,000 results, half of which are obviously made up, doesn't trust the output again.&lt;/p&gt;
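
&lt;p&gt;For the "filter by IP" variant mentioned above, one possible shape is sketched below. It assumes you have already collected the addresses for each candidate and for one or more random probe labels; &lt;code&gt;filter_wildcard_hits&lt;/code&gt; is a hypothetical helper name:&lt;/p&gt;

```python
def filter_wildcard_hits(resolved, wildcard_ips):
    """Keep names whose addresses are not fully explained by the wildcard.

    resolved: dict mapping hostname to an iterable of IP strings.
    wildcard_ips: IPs returned for random probe labels on the same domain.
    """
    wildcard = set(wildcard_ips)
    return sorted(
        name
        for name, ips in resolved.items()
        if set(ips) and not set(ips).issubset(wildcard)
    )
```

&lt;p&gt;This is fragile for the reasons already noted: a real subdomain that shares the wildcard's IP is dropped, and a wildcard that rotates through several addresses needs more than one probe before the comparison means anything.&lt;/p&gt;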

&lt;h2&gt;
  
  
  4. Rate limiting and IP bans
&lt;/h2&gt;

&lt;p&gt;crt.sh especially is aggressive about per-IP rate limiting. If your tool runs queries serially with no backoff, you'll hit the limit somewhere around the 5th or 10th query of a session and start getting 429s or 503s — which look identical to "no results" if you're not reading status codes carefully.&lt;/p&gt;

&lt;p&gt;Three pieces of the fix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Per-source timeouts.&lt;/strong&gt; Cap each upstream call so a slow source doesn't block the whole pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exponential backoff with jitter&lt;/strong&gt; on retries. Standard pattern, applied per-source.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bounded parallelism.&lt;/strong&gt; Three sources in parallel is fine. 50 queries against crt.sh in parallel is suicide. Use a semaphore or a small worker pool.&lt;/li&gt;
&lt;/ul&gt;
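
&lt;p&gt;The retry and concurrency pieces can be sketched together. This is an illustration under assumptions, not a drop-in client: &lt;code&gt;RetryableError&lt;/code&gt; is a hypothetical exception your fetch function would raise on a 429 or 503, and the delay parameters are placeholders:&lt;/p&gt;

```python
import asyncio
import random


class RetryableError(Exception):
    """Hypothetical marker for responses worth retrying (e.g. 429, 503)."""


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Full-jitter delay: uniform between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


async def fetch_with_retry(fetch, gate, retries=4, base=0.5):
    """Call `fetch()` under a shared semaphore, retrying RetryableError."""
    for attempt in range(retries + 1):
        try:
            async with gate:  # bounded parallelism per source
                return await fetch()
        except RetryableError:
            if attempt == retries:
                raise  # out of retries: let the caller record a warning
            # Sleep outside the semaphore so the slot is free for others.
            await asyncio.sleep(backoff_delay(attempt, base=base))
```

&lt;p&gt;Create one &lt;code&gt;asyncio.Semaphore&lt;/code&gt; per upstream source and pass it to every call against that source, so the cap applies per source and not globally.&lt;/p&gt;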

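&lt;p&gt;Also, narrow your queries when possible. crt.sh treats &lt;code&gt;%&lt;/code&gt; as a wildcard, so a query scoped to one domain (&lt;code&gt;?q=%.example.com&lt;/code&gt;, with the &lt;code&gt;%&lt;/code&gt; URL-encoded as &lt;code&gt;%25&lt;/code&gt;) returns faster and uses fewer server resources than broader searches.&lt;/p&gt;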

&lt;h2&gt;
  
  
  5. IDN / punycode domains return nothing
&lt;/h2&gt;

&lt;p&gt;Internationalized domain names (IDNs), such as those written in Cyrillic, Chinese, or Arabic script, are encoded as ASCII via punycode (the &lt;code&gt;xn--&lt;/code&gt; prefix) before going into DNS. A user looking up &lt;code&gt;мой-сайт.com&lt;/code&gt; needs the tool to convert that to its &lt;code&gt;xn--&lt;/code&gt; form before any DNS query.&lt;/p&gt;

&lt;p&gt;Most modern libraries handle this — &lt;code&gt;dnspython&lt;/code&gt; and &lt;code&gt;httpx&lt;/code&gt; both do. But if your tool is doing raw string concatenation anywhere in the pipeline (&lt;code&gt;f"{prefix}.{domain}"&lt;/code&gt; without normalization), you'll silently fail on IDNs and the user won't know why. The fix is one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;domain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idna&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ascii&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run that as the first step of every domain-handling function. Always, even if you "don't expect IDNs." You'll get one eventually.&lt;/p&gt;
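
&lt;p&gt;If you would rather not add a dependency, a similar normalization step can be sketched with Python's built-in &lt;code&gt;idna&lt;/code&gt; codec. Note the assumption: the built-in codec implements the older IDNA 2003 rules, while the third-party &lt;code&gt;idna&lt;/code&gt; package used above follows IDNA 2008, and the two can disagree on a small set of characters. &lt;code&gt;normalize_domain&lt;/code&gt; and &lt;code&gt;candidate&lt;/code&gt; are hypothetical helper names:&lt;/p&gt;

```python
def normalize_domain(user_input):
    """Return the ASCII (punycode) form of a hostname.

    Uses the standard library "idna" codec (IDNA 2003). Whitespace and a
    trailing dot are stripped and the name is lowercased first.
    """
    name = user_input.strip().rstrip(".").lower()
    return name.encode("idna").decode("ascii")


def candidate(prefix, domain):
    """Build a lookup name, normalizing the joined result before use."""
    return normalize_domain(f"{prefix}.{domain}")
```

&lt;p&gt;Calling &lt;code&gt;normalize_domain("мой-сайт.com")&lt;/code&gt; returns an ASCII name whose first label starts with &lt;code&gt;xn--&lt;/code&gt;, while plain ASCII input passes through unchanged apart from lowercasing.&lt;/p&gt;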




&lt;h2&gt;
  
  
  The pattern
&lt;/h2&gt;

&lt;p&gt;Every one of these failures is silent. The tool returns &lt;em&gt;something&lt;/em&gt; — usually an empty list or a contaminated one — and the user assumes the tool worked. The single biggest improvement you can make to a subdomain enumeration tool isn't a new source or a faster algorithm. It's surfacing failure honestly. Return what succeeded. Name what didn't. Let the caller decide whether the result is complete enough to act on.&lt;/p&gt;

&lt;p&gt;This is exactly the pattern we built into &lt;a href="https://oti-labs.com/domain-intelligence-api" rel="noopener noreferrer"&gt;Domain Intelligence&lt;/a&gt;, which bundles subdomain enumeration (CT logs + DNS bruteforce fallback, wildcard-aware) with DNS, WHOIS/RDAP, SSL, and email security in one call, and surfaces partial failures in a &lt;code&gt;warnings&lt;/code&gt; field rather than swallowing them. There is a free tier on RapidAPI, and the MIT-licensed source is on &lt;a href="https://github.com/osiris-technical-institute/domain-intelligence-api" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But the principles above hold whether you use a service or roll your own. Honest failures beat confident wrong answers every time.&lt;/p&gt;

</description>
      <category>security</category>
      <category>webdev</category>
      <category>devops</category>
      <category>python</category>
    </item>
  </channel>
</rss>
