DEV Community

Amelia
Amelia

Posted on

The Easiest Way to Find Hidden Sitemaps for SEO Audits

Ever spent hours hunting for a website's sitemap.xml only to hit a 404? I’ve been there. It’s frustrating when you’re doing SEO audits or competitor analysis and can’t find the exact URL structure.

Here’s a quick, developer-friendly way to programmatically detect sitemaps—and a free tool that saves you even more time.

First, the manual method. Most sitemaps live at predictable paths. In Python, you can try:

import requests

def find_sitemap(domain):
    common_paths = [
        "/sitemap.xml",
        "/sitemap_index.xml",
        "/sitemap/",
        "/robots.txt"
    ]
    for path in common_paths:
        url = f"https://{domain}{path}"
        try:
            resp = requests.get(url, timeout=5)
            if resp.status_code == 200 and 'xml' in resp.headers.get('Content-Type', ''):
                return url
        except requests.exceptions.RequestException:
            continue
    return None

print(find_sitemap("example.com"))
Enter fullscreen mode Exit fullscreen mode

But what if the site uses dynamic sitemaps or multiple ones? Checking robots.txt is better:

def parse_robots_sitemap(domain):
    url = f"https://{domain}/robots.txt"
    resp = requests.get(url, timeout=5)
    if resp.status_code == 200:
        for line in resp.text.splitlines():
            if line.lower().startswith("sitemap:"):
                return line.split(":", 1)[1].strip()
    return None
Enter fullscreen mode Exit fullscreen mode

This works for most cases, but sometimes you need a broader scan—especially when analyzing hundreds of domains for a client or personal project. That’s when I turn to a dedicated tool.

I’ve been using the SERPSpur Free Sitemap Finder Tool (https://serpspur.com/tool/free-sitemap-finder-tool/) for quick checks. It does exactly what my Python script does, but without the setup. You drop in a URL, it scans common paths and robots.txt, and returns all detected sitemaps in seconds. It’s handy for spot-checking during audits or when you’re on a machine without Python.

The real power comes from combining both approaches. Use the tool for instant checks, then automate bulk analysis with the code above. For example, I run a cron job weekly that uses the Python script to check my client sites and logs any missing sitemaps. Then I use the tool to manually verify edge cases.

Pro tip: Always check for gzip-compressed sitemaps too. Some servers return them with Content-Encoding: gzip. Your requests library handles that automatically, but the tool might not display the raw XML—just the URL.

Whether you’re a dev automating SEO tasks or a marketer doing manual analysis, finding sitemaps shouldn’t be a treasure hunt. Have a reliable method, and keep that tool bookmarked for when you need a quick answer.

Top comments (1)

Collapse
 
burhanchaudhry profile image
Burhan

I appreciate the honesty here. It's refreshing to see someone acknowledge that not every project needs to be a masterpiece from day one. How do you balance iteration speed with code quality in your own work?