Autheri

Automating SEO Audits with Python and Screaming Frog: A Developer’s Guide

Why Automate SEO Audits?

Manual SEO audits are time-consuming, error-prone, and impractical for large websites. By combining Screaming Frog (a powerful crawler) with Python (for data processing), you can:

  • Crawl 1,000+ pages in minutes.
  • Identify critical issues (broken links, duplicate content, slow pages).
  • Generate actionable reports programmatically.
  • Save 10+ hours/month for developers and marketers.
For agencies like FiveUpTech, automation is key to delivering comprehensive SEO audit services at scale. Let’s dive in.

 

Step 1: Crawling Your Site with Screaming Frog

Screaming Frog is the industry standard for SEO crawling. Here’s how to use it for automation:

 

A. Basic Crawl Setup

  1. Download Screaming Frog (free up to 500 URLs).
  2. Enter your domain (e.g., https://fiveuptech.com).
  3. Configure settings:
  • Crawl Depth: Set to “Unlimited” for full site coverage.
  • User-Agent: Use “Googlebot” to mimic search engine crawlers.
  • Export: Save data as a CSV for Python processing.

B. Extracting Critical Data

Export these key CSV files:

  • Internal Links: Identify orphaned pages (see the follow-up snippet in Step 2E).
  • Response Codes: Find 4xx/5xx errors.
  • Metadata: Audit title tags and meta descriptions.
  • Redirects: Spot inefficient chains (e.g., 3+ hops).
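
Before moving on, it's worth a quick sanity check that the exports contain the columns the scripts below rely on. Column names vary across Screaming Frog versions and export types, so treat the names in this sketch as assumptions to verify against your own CSV headers:

import pandas as pd

# Columns the analysis below depends on; adjust if your
# Screaming Frog version labels them differently
expected = {
    'internal_links.csv': {'Source', 'Destination'},
    'response_codes.csv': {'Address', 'Status Code'},
    'metadata.csv': {'Address', 'Title'},
}

for path, columns in expected.items():
    # nrows=0 reads only the header row
    missing = columns - set(pd.read_csv(path, nrows=0).columns)
    if missing:
        print(f"{path} is missing expected columns: {missing}")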

 

Step 2: Analyzing Crawl Data with Python

Python’s pandas library lets you process Screaming Frog data programmatically.

 

A. Setup Your Environment

import pandas as pd

# Used throughout the reports below
domain = "https://fiveuptech.com"

B. Load Crawl Data

# Load CSV exports from Screaming Frog
internal_links = pd.read_csv('internal_links.csv')
response_codes = pd.read_csv('response_codes.csv')
metadata = pd.read_csv('metadata.csv')

C. Identify Broken Links

# Filter for all 4xx/5xx errors (not just a hard-coded list of codes)
broken_links = response_codes[response_codes['Status Code'].between(400, 599)]

# Export to CSV
broken_links.to_csv('broken_links_report.csv', index=False)

D. Audit Duplicate Metadata

# Find duplicate title tags (some Screaming Frog exports name this column 'Title 1')
duplicate_titles = metadata[metadata.duplicated(subset=['Title'], keep=False)]

# Flag titles over 60 characters (commonly truncated in search results)
long_titles = metadata[metadata['Title'].str.len() > 60]
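
Step 1 also flagged meta descriptions for auditing, and the same pattern covers them. The 'Meta Description 1' column name is an assumption here, so check it against your export:

# Duplicate and over-length meta descriptions
# (column name assumed; verify against your CSV header)
desc_col = 'Meta Description 1'
duplicate_descriptions = metadata[metadata.duplicated(subset=[desc_col], keep=False)]
long_descriptions = metadata[metadata[desc_col].str.len() > 160]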

E. Analyze Internal Link Equity

# Count internal links pointing at each page
link_equity = internal_links.groupby('Destination').size().reset_index(name='Internal Links')

# Merge with metadata (pages with no inbound links come through as NaN)
merged_data = pd.merge(metadata, link_equity, left_on='Address', right_on='Destination', how='left')
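
Because the merge uses how='left', pages with no inbound internal links surface as NaN in the Internal Links column, which answers the orphaned-pages question from Step 1:

# Pages no other crawled page links to (the start URL, and pages reached
# only via sitemaps or external links, will also show up here)
orphaned_pages = merged_data[merged_data['Internal Links'].isna()]
orphaned_pages[['Address', 'Title']].to_csv('orphaned_pages_report.csv', index=False)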

Step 3: Generating Automated Reports

Use Python to create stakeholder-ready reports.

A. Build a Summary Dashboard

# Calculate key metrics
total_pages = len(metadata)
total_errors = len(broken_links)
duplicate_titles_count = len(duplicate_titles)

# Print summary (domain was defined in the setup step)
print(f"SEO Audit Summary for {domain}:")
print(f"- Total Pages Crawled: {total_pages}")
print(f"- Critical Errors: {total_errors}")
print(f"- Duplicate Titles: {duplicate_titles_count}")
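
Printed output is fine for a quick terminal check; for anything dashboard-like you probably want the metrics in a machine-readable file. A minimal sketch writing them as JSON (the file name is arbitrary):

import json

summary = {
    "domain": domain,
    "total_pages": total_pages,
    "critical_errors": total_errors,
    "duplicate_titles": duplicate_titles_count,
}

# Point your dashboard or tracking sheet at this file
with open('audit_summary.json', 'w') as f:
    json.dump(summary, f, indent=2)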

B. Export HTML/PDF Reports

Use Jinja2 + WeasyPrint to convert data to PDF:

 

from jinja2 import Template
from weasyprint import HTML

template = Template(open('report_template.html').read())
html_out = template.render(
    domain=domain,
    # Lists of dicts are easier to loop over in a Jinja2 template
    # than raw DataFrames
    broken_links=broken_links.to_dict('records'),
    duplicate_titles=duplicate_titles.to_dict('records'),
)

HTML(string=html_out).write_pdf('seo_audit_report.pdf')
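
If you don't have a report_template.html yet, here is a minimal sketch of the markup the render() call above expects, shown inline as a Python string for brevity (a real project would keep it in the separate file):

TEMPLATE = """
<html>
  <body>
    <h1>SEO Audit: {{ domain }}</h1>
    <h2>Broken Links ({{ broken_links|length }})</h2>
    <ul>
      {% for link in broken_links %}
      <li>{{ link['Address'] }} ({{ link['Status Code'] }})</li>
      {% endfor %}
    </ul>
    <h2>Duplicate Titles ({{ duplicate_titles|length }})</h2>
    <ul>
      {% for page in duplicate_titles %}
      <li>{{ page['Address'] }}: {{ page['Title'] }}</li>
      {% endfor %}
    </ul>
  </body>
</html>
"""

# Template(TEMPLATE) is then a drop-in replacement for reading the file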

Step 4: Advanced Automation via the Command Line

For enterprise sites, schedule crawls instead of launching them by hand. Screaming Frog doesn't expose a public REST API, but the licensed SEO Spider includes a command-line interface that runs headless crawls and exports CSVs, which you can drive from Python:

A. Schedule Weekly Crawls

import subprocess

# Run a headless crawl and export the tabs used in this guide.
# The binary name and flags follow the Screaming Frog CLI documentation
# and can vary by platform and version, so verify against your install.
subprocess.run([
    "screamingfrogseospider",
    "--crawl", "https://fiveuptech.com",
    "--headless",
    "--output-folder", "./crawls",
    "--export-tabs", "Internal:All,Response Codes:All",
], check=True)
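
cron (or Windows Task Scheduler) is the usual way to run that script weekly; if you'd rather stay in Python, the third-party schedule library is one option:

import subprocess
import time

import schedule  # third-party: pip install schedule

def run_weekly_crawl():
    # Same headless crawl as above; flags assumed from the CLI docs
    subprocess.run([
        "screamingfrogseospider",
        "--crawl", "https://fiveuptech.com",
        "--headless",
        "--output-folder", "./crawls",
    ], check=True)

schedule.every().monday.at("06:00").do(run_weekly_crawl)

while True:
    schedule.run_pending()
    time.sleep(60)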

B. Slack/Discord Alerts

# Send broken links to Slack
import requests

slack_webhook = "https://hooks.slack.com/services/..."  # your incoming webhook URL
message = {
    "text": f"{len(broken_links)} broken links found on {domain}!"
}

response = requests.post(slack_webhook, json=message, timeout=10)
response.raise_for_status()  # fail loudly if the webhook rejects the payload

 

Tools and Resources

  • Screaming Frog SEO Spider
  • Python pandas Documentation

 

When to Hire Professionals

While automation catches most routine SEO issues, complex sites still need expert analysis. For example:

 

  • JavaScript-heavy SPAs requiring deep rendering.
  • International SEO with hreflang and CDN configurations.
  • Penalty recovery after Google algorithm updates.

 

If your team lacks bandwidth, FiveUpTech’s SEO experts offer custom automation solutions and manual audits for enterprise clients.

 

Conclusion

Automating SEO audits with Python and Screaming Frog lets you focus on strategic fixes instead of grunt work. Start with basic crawls, layer in Python analysis, and scale with APIs. For more tips, explore FiveUpTech’s blog or contact us for a personalized audit.
