You visit a website. Within seconds, you want to know: what's it built with? What CDN? What framework? What analytics?
I needed this for a project — bulk tech stack detection across thousands of domains. Wappalyzer's browser extension is great for one-off lookups, but I needed an API that could handle volume, return structured data, and catch things the browser extension misses.
So I built DetectZeStack, a tech stack detection API in Go. It scans 7,200+ technologies using four detection layers. Here's how it works under the hood.
The Problem With Single-Layer Detection
Most tech detection tools rely on one method: matching patterns in HTML, headers, and JavaScript. That's what Wappalyzer does, and it's genuinely good at it.
But it misses things:
- DNS-level infrastructure (CDNs, hosting providers identified by CNAME records)
- TLS certificate issuers (which tell you who provides the site's certificate: Cloudflare, AWS, Let's Encrypt)
- Infrastructure headers that aren't in the fingerprint database
A site behind Cloudflare with a React frontend might only show "React" with single-layer detection. You'd miss the CDN, the certificate authority, and the hosting provider.
The Four Detection Layers
Layer 1: Wappalyzer Fingerprinting (7,200+ signatures)
The foundation. I use wappalyzergo, which ports Wappalyzer's fingerprint database to Go. It analyzes:
- HTML content (meta tags, script sources, DOM patterns)
- HTTP response headers (Server, X-Powered-By, etc.)
- JavaScript variables and objects
- Cookie names and patterns
This alone catches most frontend frameworks, CMS platforms, analytics tools, and e-commerce platforms.
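To make the idea concrete, here is a deliberately simplified sketch of signature matching against headers and HTML. It is not wappalyzergo's matcher or its API: the `signature` type, the `fingerprint` function, and the three patterns are hypothetical and only illustrate the general technique.

```go
package main

import (
	"fmt"
	"net/http"
	"regexp"
	"sort"
)

// signature is a hypothetical, simplified fingerprint. A technology matches
// if any header pattern or the HTML pattern matches.
type signature struct {
	name    string
	headers map[string]*regexp.Regexp // header name -> value pattern
	html    *regexp.Regexp            // optional body pattern
}

// Illustrative patterns only; the real database holds thousands of entries.
var signatures = []signature{
	{name: "Nginx", headers: map[string]*regexp.Regexp{"Server": regexp.MustCompile(`(?i)nginx`)}},
	{name: "PHP", headers: map[string]*regexp.Regexp{"X-Powered-By": regexp.MustCompile(`(?i)php`)}},
	{name: "WordPress", html: regexp.MustCompile(`(?i)<meta[^>]+name="generator"[^>]+content="WordPress`)},
}

// fingerprint returns the sorted names of all signatures that match.
func fingerprint(h http.Header, body string) []string {
	var found []string
	for _, sig := range signatures {
		matched := sig.html != nil && sig.html.MatchString(body)
		for name, re := range sig.headers {
			if re.MatchString(h.Get(name)) {
				matched = true
			}
		}
		if matched {
			found = append(found, sig.name)
		}
	}
	sort.Strings(found)
	return found
}

func main() {
	h := http.Header{}
	h.Set("Server", "nginx/1.25.3")
	body := `<html><head><meta name="generator" content="WordPress 6.4"></head></html>`
	fmt.Println(fingerprint(h, body))
}
```

JavaScript variables and cookies would need their own matchers; they are left out here to keep the sketch short.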
Layer 2: DNS CNAME/NS Fingerprinting (111 signatures)
Here's where it gets interesting. When you resolve a domain's DNS, the CNAME chain reveals infrastructure:
stripe.com → stripe.com.cdn.cloudflare.net → ...
That CNAME tells you Cloudflare is involved, even if the HTTP headers are scrubbed clean.
I maintain 111 DNS signatures mapping CNAME patterns to technologies:
- *.cloudfront.net → Amazon CloudFront
- *.fastly.net → Fastly
- *.netlify.app → Netlify
- *.vercel-dns.com → Vercel
- *.herokuapp.com → Heroku
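A minimal sketch of how suffix-based CNAME matching could look in Go, using only the five example patterns above. The `cnameSignatures` table and `matchCNAME` function are illustrative names, not the actual DetectZeStack code; `net.LookupCNAME` is the standard library call for resolving the canonical name.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// cnameSignatures maps a CNAME suffix to a technology name. Only the five
// example patterns from the article are included here.
var cnameSignatures = []struct{ suffix, tech string }{
	{".cloudfront.net", "Amazon CloudFront"},
	{".fastly.net", "Fastly"},
	{".netlify.app", "Netlify"},
	{".vercel-dns.com", "Vercel"},
	{".herokuapp.com", "Heroku"},
}

// matchCNAME returns the technology whose suffix matches the canonical name.
func matchCNAME(cname string) (string, bool) {
	// Resolvers return fully qualified names with a trailing dot.
	name := strings.ToLower(strings.TrimSuffix(cname, "."))
	for _, sig := range cnameSignatures {
		if strings.HasSuffix(name, sig.suffix) {
			return sig.tech, true
		}
	}
	return "", false
}

func main() {
	// net.LookupCNAME performs a real DNS query, so this fails offline.
	cname, err := net.LookupCNAME("www.example.com")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	if tech, ok := matchCNAME(cname); ok {
		fmt.Printf("%s -> %s\n", cname, tech)
	} else {
		fmt.Printf("%s -> no DNS signature matched\n", cname)
	}
}
```

Keeping the matcher separate from the lookup makes it easy to test against recorded CNAME values without touching the network.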
The DNS lookup runs in parallel with the HTTP fetch, so it normally adds no latency on top of the fetch itself. If DNS times out (2-second cap), the scan still returns HTTP-based results.
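One way to get this behavior in Go is a goroutine plus a context timeout. The sketch below is an assumption about the shape of such code, not the project's implementation: `lookupFunc`, `fetchFunc`, `scan`, and `simulatedLookup` are hypothetical names, and the lookups are simulated so the example runs without a network.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// lookupFunc and fetchFunc are hypothetical stand-ins for the DNS and HTTP layers.
type lookupFunc func(ctx context.Context) (string, error)
type fetchFunc func() (string, error)

type result struct {
	HTTP string // outcome of the HTTP layer
	DNS  string // empty if DNS failed or timed out
}

// scan starts the DNS lookup in its own goroutine, runs the HTTP fetch on the
// calling goroutine, then waits for DNS only until the cap expires.
func scan(dns lookupFunc, fetch fetchFunc, dnsCap time.Duration) (result, error) {
	ctx, cancel := context.WithTimeout(context.Background(), dnsCap)
	defer cancel()

	dnsCh := make(chan string, 1) // buffered so the goroutine never blocks
	go func() {
		cname, err := dns(ctx)
		if err != nil {
			cname = ""
		}
		dnsCh <- cname
	}()

	body, err := fetch()
	if err != nil {
		return result{}, err
	}

	res := result{HTTP: body}
	select {
	case res.DNS = <-dnsCh:
	case <-ctx.Done(): // DNS exceeded its cap: keep the HTTP-only result
	}
	return res, nil
}

// simulatedLookup fakes a resolver that answers after delay or gives up
// when the context is cancelled.
func simulatedLookup(delay time.Duration, cname string) lookupFunc {
	return func(ctx context.Context) (string, error) {
		select {
		case <-time.After(delay):
			return cname, nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

func main() {
	fetch := func() (string, error) { return "200 OK", nil }
	fast, _ := scan(simulatedLookup(10*time.Millisecond, "example.fastly.net."), fetch, 2*time.Second)
	slow, _ := scan(simulatedLookup(time.Second, "example.fastly.net."), fetch, 50*time.Millisecond)
	fmt.Printf("fast DNS: %+v\nslow DNS: %+v\n", fast, slow)
}
```

With this structure the total scan time is bounded by the slower of the HTTP fetch and the DNS cap, rather than their sum.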
Layer 3: TLS Certificate Analysis
Every HTTPS connection includes a TLS handshake with the server's certificate. The certificate issuer reveals the SSL/TLS provider:
| Certificate Issuer | Technology |
|---|---|
| Cloudflare, Inc. | Cloudflare SSL |
| Amazon | AWS Certificate Manager |
| Let's Encrypt | Let's Encrypt |
| Google Trust Services | Google Cloud |
| DigiCert Inc | DigiCert |
This is essentially free — the cert info is already in the TLS handshake, no extra request needed.
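Here is a rough sketch of reading the issuer from a handshake with Go's `crypto/tls`. The `issuerTech` table copies the mapping above; the substring matching, `techForIssuer`, and `issuerOrgs` are assumptions made for illustration rather than the project's actual code.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"strings"
	"time"
)

// issuerTech maps a lowercase substring of the issuer organization to a
// technology. The entries follow the table in the article.
var issuerTech = []struct{ match, tech string }{
	{"cloudflare", "Cloudflare SSL"},
	{"amazon", "AWS Certificate Manager"},
	{"let's encrypt", "Let's Encrypt"},
	{"google trust services", "Google Cloud"},
	{"digicert", "DigiCert"},
}

// techForIssuer looks up the technology for an issuer organization string.
func techForIssuer(org string) (string, bool) {
	org = strings.ToLower(org)
	for _, e := range issuerTech {
		if strings.Contains(org, e.match) {
			return e.tech, true
		}
	}
	return "", false
}

// issuerOrgs completes a TLS handshake and returns the issuer organizations
// of the leaf certificate.
func issuerOrgs(host string) ([]string, error) {
	dialer := &net.Dialer{Timeout: 5 * time.Second}
	conn, err := tls.DialWithDialer(dialer, "tcp", net.JoinHostPort(host, "443"), &tls.Config{ServerName: host})
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	certs := conn.ConnectionState().PeerCertificates
	if len(certs) == 0 {
		return nil, fmt.Errorf("no peer certificates from %s", host)
	}
	return certs[0].Issuer.Organization, nil
}

func main() {
	// Requires network access; prints an error otherwise.
	orgs, err := issuerOrgs("example.com")
	if err != nil {
		fmt.Println("handshake failed:", err)
		return
	}
	for _, org := range orgs {
		if tech, ok := techForIssuer(org); ok {
			fmt.Printf("%s -> %s\n", org, tech)
		}
	}
}
```

This sketch opens its own connection for clarity. When the page is fetched with `net/http` over HTTPS, the same certificates are available on the response's `TLS` field (`resp.TLS.PeerCertificates`), which avoids a second handshake.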
Layer 4: Custom Header Matching
Some infrastructure providers add unique headers that aren't in Wappalyzer's database:
- X-Railway-Request-Id → Railway (PaaS)
- X-Amz-Cf-Pop → Amazon CloudFront (edge location)
- X-Nf-Request-Id → Netlify
These fill gaps where standard fingerprinting falls short.
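A presence check on response headers is enough for this layer. The sketch below uses the three headers listed above; `headerSignatures` and `detectFromHeaders` are illustrative names, and the header values in `main` are made up.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
)

// headerSignatures maps a response header name to the technology it implies.
var headerSignatures = map[string]string{
	"X-Railway-Request-Id": "Railway",
	"X-Amz-Cf-Pop":         "Amazon CloudFront",
	"X-Nf-Request-Id":      "Netlify",
}

// detectFromHeaders returns the technologies whose marker header is present.
func detectFromHeaders(h http.Header) []string {
	var found []string
	for name, tech := range headerSignatures {
		// Header.Get canonicalizes the requested name before the lookup.
		if h.Get(name) != "" {
			found = append(found, tech)
		}
	}
	sort.Strings(found) // map iteration order is random; sort for stable output
	return found
}

func main() {
	h := http.Header{}
	h.Set("x-nf-request-id", "example-request-id") // Set canonicalizes the key
	h.Set("X-Amz-Cf-Pop", "EXAMPLE-POP")
	fmt.Println(detectFromHeaders(h))
}
```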
Deduplication
When multiple layers detect the same technology, the API deduplicates by name. If Wappalyzer detects "Cloudflare" from headers AND DNS detects "Cloudflare" from CNAME, you get one entry — not two.
Higher-confidence detections take priority. Wappalyzer's pattern match at 100% confidence beats a DNS-only detection at 80%.
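The merge rule can be sketched as a map keyed by technology name that keeps the highest-confidence entry. The `Detection` type and its `Source` field are hypothetical, added here only to show which layer won.

```go
package main

import (
	"fmt"
	"sort"
)

// Detection is a hypothetical record produced by any of the four layers.
type Detection struct {
	Name       string
	Confidence int
	Source     string // which layer produced it (illustrative only)
}

// dedupe keeps one entry per technology name, preferring higher confidence.
// On a tie, the first detection seen is kept.
func dedupe(all []Detection) []Detection {
	best := make(map[string]Detection)
	for _, d := range all {
		if cur, seen := best[d.Name]; !seen || d.Confidence > cur.Confidence {
			best[d.Name] = d
		}
	}
	out := make([]Detection, 0, len(best))
	for _, d := range best {
		out = append(out, d)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	merged := dedupe([]Detection{
		{"Cloudflare", 80, "dns"},
		{"Cloudflare", 100, "wappalyzer"},
		{"React", 100, "wappalyzer"},
	})
	for _, d := range merged {
		fmt.Printf("%s %d%% (%s)\n", d.Name, d.Confidence, d.Source)
	}
}
```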
What the Output Looks Like
Here's a real scan of stripe.com:
curl "https://detectzestack.com/demo?url=stripe.com"
{
  "url": "https://stripe.com",
  "domain": "stripe.com",
  "technologies": [
    {
      "name": "Amazon S3",
      "categories": ["CDN"],
      "confidence": 100,
      "description": "Amazon S3 or Amazon Simple Storage Service...",
      "website": "https://aws.amazon.com/s3/",
      "icon": "Amazon S3.svg"
    },
    {
      "name": "Amazon Web Services",
      "categories": ["PaaS"],
      "confidence": 100,
      "website": "https://aws.amazon.com/"
    },
    {
      "name": "DigiCert",
      "categories": ["SSL/TLS certificate authority"],
      "confidence": 70
    },
    {
      "name": "HSTS",
      "categories": ["Security"],
      "confidence": 100
    },
    {
      "name": "Nginx",
      "categories": ["Web servers", "Reverse proxies"],
      "confidence": 100,
      "cpe": "cpe:2.3:a:f5:nginx:*:*:*:*:*:*:*:*"
    }
  ],
  "categories": {
    "CDN": ["Amazon S3"],
    "PaaS": ["Amazon Web Services"],
    "Security": ["HSTS"],
    "SSL/TLS certificate authority": ["DigiCert"],
    "Web servers": ["Nginx"]
  },
  "meta": {
    "status_code": 200,
    "tech_count": 5,
    "scan_depth": "full"
  }
}
Notice the DigiCert entry with 70% confidence — that came from TLS certificate analysis (Layer 3), not HTML fingerprinting.
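If you consume this from Go, the response maps onto a couple of structs. The types below are inferred from the sample output above, so treat them as a sketch: fields that do not appear in the sample are not covered, and `confidence` is assumed to always be an integer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Technology mirrors one entry of the "technologies" array in the sample.
type Technology struct {
	Name        string   `json:"name"`
	Categories  []string `json:"categories"`
	Confidence  int      `json:"confidence"`
	Description string   `json:"description,omitempty"`
	Website     string   `json:"website,omitempty"`
	Icon        string   `json:"icon,omitempty"`
	CPE         string   `json:"cpe,omitempty"`
}

// ScanResult mirrors the top-level object in the sample response.
type ScanResult struct {
	URL          string              `json:"url"`
	Domain       string              `json:"domain"`
	Technologies []Technology        `json:"technologies"`
	Categories   map[string][]string `json:"categories"`
	Meta         struct {
		StatusCode int    `json:"status_code"`
		TechCount  int    `json:"tech_count"`
		ScanDepth  string `json:"scan_depth"`
	} `json:"meta"`
}

// parseScan decodes a raw JSON response body.
func parseScan(data []byte) (ScanResult, error) {
	var r ScanResult
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	// A trimmed version of the stripe.com sample, for demonstration only.
	sample := []byte(`{
		"url": "https://stripe.com",
		"domain": "stripe.com",
		"technologies": [
			{"name": "Nginx", "categories": ["Web servers", "Reverse proxies"], "confidence": 100},
			{"name": "DigiCert", "categories": ["SSL/TLS certificate authority"], "confidence": 70}
		],
		"meta": {"status_code": 200, "tech_count": 2, "scan_depth": "full"}
	}`)

	r, err := parseScan(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	for _, t := range r.Technologies {
		fmt.Printf("%s: %d%% %v\n", t.Name, t.Confidence, t.Categories)
	}
}
```

Filtering on `Confidence` is then a one-line check, which is handy if you only want to act on pattern-matched detections.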
And here's github.com, which returns 8 technologies:
curl "https://detectzestack.com/demo?url=github.com"
{
  "technologies": [
    { "name": "Amazon S3", "categories": ["CDN"], "confidence": 100 },
    { "name": "Amazon Web Services", "categories": ["PaaS"], "confidence": 100 },
    { "name": "C3.js", "categories": ["JavaScript libraries"], "confidence": 100 },
    { "name": "Contentful", "categories": ["CMS"], "confidence": 100 },
    { "name": "GitHub Pages", "categories": ["PaaS"], "confidence": 100 },
    { "name": "HSTS", "categories": ["Security"], "confidence": 100 },
    { "name": "React", "categories": ["JavaScript frameworks"], "confidence": 100 },
    { "name": "Sectigo", "categories": ["SSL/TLS certificate authority"], "confidence": 70 }
  ]
}
Contentful (CMS), C3.js (charting), React (frontend), Sectigo (TLS) — all from a single API call.
Architecture Decisions
Why Go? Concurrency is first-class. The DNS lookup, HTTP fetch, and TLS extraction all run in parallel goroutines. A typical scan completes in 1-2 seconds.
Why not just wrap the Wappalyzer npm package? Performance and deployability. The Go binary is a single executable, ~15MB, runs on a $3/month Fly.io instance. No Node.js runtime, no headless browser, no Puppeteer.
Why SQLite for storage? The API caches scan results to avoid hammering target sites. SQLite is perfect for this — single-file database, zero configuration, handles thousands of concurrent reads. It runs alongside the API process on the same machine.
Why not headless browser rendering? Some JavaScript-heavy sites would benefit from it, but it would 10x the infrastructure cost and response time. Wappalyzer's static analysis catches the vast majority of technologies. If you need rendered-page analysis, tools like Wappalyzer's browser extension are the right choice.
Try It
The /demo endpoint is free, no signup needed:
# Try it right now
curl "https://detectzestack.com/demo?url=your-site.com"
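The same call from Go might look like the sketch below. The endpoint and the `url` query parameter come from the curl example; the 15-second timeout, the status check, and the function names `demoURL` and `fetchDemo` are assumptions.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"time"
)

const demoEndpoint = "https://detectzestack.com/demo"

// demoURL builds the request URL, escaping the target so that full URLs
// with schemes or paths are passed through safely.
func demoURL(target string) string {
	q := url.Values{}
	q.Set("url", target)
	return demoEndpoint + "?" + q.Encode()
}

// fetchDemo calls the demo endpoint and returns the raw JSON body.
func fetchDemo(target string) ([]byte, error) {
	client := &http.Client{Timeout: 15 * time.Second} // assumed timeout
	resp, err := client.Get(demoURL(target))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	body, err := fetchDemo("stripe.com")
	if err != nil {
		fmt.Fprintln(os.Stderr, "scan failed:", err)
		os.Exit(1)
	}
	fmt.Println(string(body))
}
```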
For production use (higher rate limits, change tracking, history), it's on RapidAPI with a free tier — 100 requests/month, no credit card.
There are also alternatives worth considering: Wappalyzer's npm package if you want to self-host detection, and BuiltWith if you need historical data going back years. DetectZeStack's differentiator is the multi-layer detection approach and the structured API response with confidence scores.
If you're building anything that needs tech stack data — competitive analysis, security auditing, lead enrichment — I'd love to hear about your use case. Drop a comment or find me on Twitter/X.