Stop Guessing, Start Measuring: A Python Script to Automate Your SEO Data Collection

Introduction
Hey developers! πŸ‘‹ Ever feel like SEO is a dark art of best guesses and confusing metrics? What if you could use your coding skills to bring some data-driven clarity to the process?

As a developer who also works in SEO, I often found myself manually checking rankings and technical statsβ€”a huge time sink. So, I built a simple Python script to automate the boring parts and free me up for the actual analysis.

In this tutorial, I’ll show you how to build a script that automatically pulls critical SEO data, giving you a custom dashboard of your site's health. Let's bridge the gap between code and SEO.

What We're Building: The SEO Data Fetcher
Our script will use free APIs and libraries to fetch:

Google Search Console-style ranking data (approximate keyword positions via SerpApi).

Core Web Vitals (the real-user performance metrics Google uses for ranking, pulled from the CrUX API).

Basic Backlink Count (a placeholder to monitor your site's authority, ready to wire up to Moz or Ahrefs).

Prerequisites:

Basic knowledge of Python.

A SerpApi account (they have a generous free tier).

A free Google CrUX API key (created in the Google Cloud console) for the Core Web Vitals check.

The following Python packages: requests, python-dotenv
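
Install both with pip:

pip install requests python-dotenv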

The Code: Let's Get Technical
First, set up your environment. Create a .env file to keep your API keys safe:

# .env
SERP_API_KEY=your_serpapi_key_here
CRUX_API_KEY=your_crux_api_key_here

Now, let's dive into the script (seo_dashboard.py):
import os

import requests
from dotenv import load_dotenv

load_dotenv()

class SEODashboard:
    def __init__(self, domain):
        self.domain = domain
        self.serpapi_key = os.getenv('SERP_API_KEY')

    def get_ranking_data(self, query):
        """Fetches approximate ranking data for a keyword using SerpApi."""
        print(f"πŸ” Checking ranking for: '{query}'")
        params = {
            "engine": "google",
            "q": query,
            "num": 100,  # request up to 100 results so "top 100" is accurate
            "api_key": self.serpapi_key
        }

        try:
            response = requests.get('https://serpapi.com/search', params=params, timeout=30)
            results = response.json()
            organic_results = results.get('organic_results', [])

            for index, result in enumerate(organic_results):
                if self.domain in result.get('link', ''):
                    print(f"βœ… Rank #{index + 1} for '{query}'")
                    return index + 1
            print(f"❌ Not in top 100 for '{query}'")
            return None

        except Exception as e:
            print(f"⚠️  SerpApi Error: {e}")
            return None

    def get_core_web_vitals(self, url):
        """Uses the CrUX API to get real-world Core Web Vitals data."""
        print(f"πŸ“Š Checking Core Web Vitals for: {url}")
        api_endpoint = f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={os.getenv('CRUX_API_KEY', '')}"
        payload = {"url": url}

        try:
            response = requests.post(api_endpoint, json=payload, timeout=30)
            data = response.json()

            if 'record' in data:
                metrics = data['record']['metrics']
                print("--- Core Web Vitals Results ---")
                print(f"LCP (Loading): {metrics.get('largest_contentful_paint', {}).get('percentiles', {}).get('p75', 'N/A')}ms")
                # INP replaced FID as the responsiveness Core Web Vital in March 2024
                print(f"INP (Interactivity): {metrics.get('interaction_to_next_paint', {}).get('percentiles', {}).get('p75', 'N/A')}ms")
                print(f"CLS (Visual Stability): {metrics.get('cumulative_layout_shift', {}).get('percentiles', {}).get('p75', 'N/A')}")
            else:
                print("❌ No CrUX data available for this URL.")

        except Exception as e:
            print(f"⚠️  CrUX API Error: {e}")

    def get_backlink_count(self):
        """Placeholder for an approximate backlink count."""
        print(f"πŸ”— Checking backlink count for: {self.domain}")
        # Note: This is a placeholder. For a real implementation, you would use:
        # - Moz's Links API (requires a free account)
        # - or the Ahrefs API (paid)
        # For this demo, we'll simulate the concept.
        print("πŸ’‘ Pro Tip: Integrate with Moz's Links API for actual backlink data.")
        return "Simulated - Integrate with Moz/Ahrefs API"

if __name__ == "__main__":
    # Initialize with your domain
    my_domain = "theseolighthouse.blogspot.com"
    dashboard = SEODashboard(my_domain)

    # Check rankings for target keywords
    keywords = ["SEO tips", "Python SEO automation", "technical SEO"]
    for keyword in keywords:
        dashboard.get_ranking_data(keyword)

    # Check Core Web Vitals
    dashboard.get_core_web_vitals(f"https://{my_domain}")

    # Check backlinks
    dashboard.get_backlink_count()

Running the Script & Expected Output
Save the script and run it from your terminal:
python seo_dashboard.py
You should see output like this:
πŸ” Checking ranking for: 'SEO tips'
βœ… Rank #14 for 'SEO tips'
πŸ” Checking ranking for: 'Python SEO automation'
❌ Not in top 100 for 'Python SEO automation'
πŸ“Š Checking Core Web Vitals for: https://theseolighthouse.blogspot.com
--- Core Web Vitals Results ---
LCP (Loading): 2200ms
INP (Interactivity): 105ms
CLS (Visual Stability): 0.12
πŸ”— Checking backlink count for: theseolighthouse.blogspot.com
πŸ’‘ Pro Tip: Integrate with Moz's Links API for actual backlink data.

Taking It Further: Next Steps
This is just the beginning! Here's how you can extend this script (minimal sketches for three of these follow the list):

Add scheduling with cron to run daily and log results to a CSV.

Integrate with Google Search Console API for your actual click/impression data.

Build a simple Flask dashboard to visualize the trends over time.

Add a broken link checker using the requests library to scan your site.
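
Here's a sketch of the first idea: a small wrapper that appends each day's rankings to a CSV. It assumes seo_dashboard.py from above sits in the same directory; the log filename and column layout are my own choices.

# track_rankings.py - minimal sketch of daily CSV logging.
# Assumes seo_dashboard.py (above) is in the same directory.
import csv
from datetime import date
from pathlib import Path

from seo_dashboard import SEODashboard

LOG_FILE = Path("rankings_log.csv")
KEYWORDS = ["SEO tips", "Python SEO automation", "technical SEO"]

def log_rankings(domain):
    dashboard = SEODashboard(domain)
    is_new = not LOG_FILE.exists()
    with LOG_FILE.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["date", "keyword", "rank"])  # header on first run
        for keyword in KEYWORDS:
            rank = dashboard.get_ranking_data(keyword)
            writer.writerow([date.today().isoformat(), keyword, rank or "not found"])

if __name__ == "__main__":
    log_rankings("theseolighthouse.blogspot.com")

Point a daily cron entry at it (e.g. 0 6 * * * python /path/to/track_rankings.py) and you'll accumulate a time series you can chart later.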
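
For the Search Console integration, here's a hedged sketch using the official google-api-python-client and google-auth packages. It assumes you've enabled the Search Console API in a Google Cloud project, created a service account, and added it as a user on your property; the credentials path and date range are placeholders.

# gsc_clicks.py - sketch of pulling real click/impression data.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "credentials.json", scopes=SCOPES  # path is a placeholder
)
service = build("searchconsole", "v1", credentials=creds)

response = service.searchanalytics().query(
    siteUrl="https://theseolighthouse.blogspot.com/",
    body={
        "startDate": "2024-01-01",
        "endDate": "2024-01-31",
        "dimensions": ["query"],
        "rowLimit": 10,
    },
).execute()

# Each row carries keys (the query), clicks, impressions, ctr, position
for row in response.get("rows", []):
    print(f"{row['keys'][0]}: {row['clicks']} clicks, {row['impressions']} impressions")

Unlike the SerpApi approach, this is your actual data straight from Google, so it's the natural upgrade path once you outgrow approximate rankings.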
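
And the broken link checker can start as small as this. The URL list is hardcoded for illustration; in practice you'd pull it from your sitemap or parse your pages for links.

# link_check.py - tiny broken-link checker sketch.
import requests

urls = [
    "https://theseolighthouse.blogspot.com/",
    "https://theseolighthouse.blogspot.com/p/about.html",  # hypothetical page
]

for url in urls:
    try:
        # Some servers mishandle HEAD; fall back to GET if you see false alarms
        r = requests.head(url, allow_redirects=True, timeout=10)
        status = "OK" if r.status_code < 400 else f"BROKEN ({r.status_code})"
    except requests.RequestException as e:
        status = f"ERROR ({e})"
    print(f"{status}: {url}")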

Conclusion
By combining development skills with SEO, you can create powerful tools that provide real insights, not just guesses. This script is a starting pointβ€”something I've built upon and written about extensively on my own blog, The SEO Lighthouse, where I explore the intersection of code and search marketing.
