DEV Community

Elowen


How to build a Google rank tracker with TalorData (Python)

Tracking where a domain ranks for a list of keywords on Google sounds like a one-afternoon project. It is, until you try to scrape google.com yourself. After a few hundred requests from the same IP you start getting the consent page, then a captcha, then nothing at all. Rotating proxies and headless browsers work for a while, but you end up spending more time keeping the scraper alive than using its data.

Easier path: hand that problem to a SERP API and keep your code focused on the rank itself.

This post walks through a small rank tracker in Python using TalorData's SERP API: make a single request, parse the response, look up a target domain's position in the organic results, then loop over a keyword list and persist the history to SQLite so it can be charted.

Get a token

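TalorData returns Google (and Bing/Yandex/etc.) results as structured JSON. Sign up at talordata.com, grab the bearer token from the dashboard, and export it as TALORDATA_TOKEN, the variable the curl and Python examples read:

export TALORDATA_TOKEN='<your token>'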

First request

The endpoint takes a form-encoded POST. The smallest useful call is a Google search for "serp api", US desktop, top 10:

curl -X POST 'https://serpapi.talordata.net/serp/v1/request' \
  -H "Authorization: Bearer $TALORDATA_TOKEN" \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'engine=google' \
  -d 'q=serp api' \
  -d 'device=desktop' \
  -d 'location=United States' \
  -d 'num=10' \
  -d 'json=1'

Top-level keys in the response:

organic
sponsored_results
pagination
people_also_ask
people_are_saying
related
request_params
search_information
search_metadata

For ranking the only one that matters is organic. The rest are useful for other things — people_also_ask for content ideas, sponsored_results to see who's bidding on the term, pagination to go past page one — but the rank lives in organic.

Parse it in Python

import os
import requests

ENDPOINT = "https://serpapi.talordata.net/serp/v1/request"
TOKEN = os.environ["TALORDATA_TOKEN"]


def search(query: str, location: str = "United States", num: int = 10) -> dict:
    resp = requests.post(
        ENDPOINT,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        data={
            "engine": "google",
            "q": query,
            "device": "desktop",
            "location": location,
            "num": num,
            "json": 1,
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()


data = search("serp api")
for r in data["organic"][:5]:
    print(f'{r["position"]:>2}  {r["display_link"]}')
    print(f'    {r["title"]}')

Output:

 1  https://serpapi.com
    SerpApi: Google Search API
 2  https://brightdata.com› products › serp-api
    SERP API - SERP Scraper API - Free Trial
 3  https://dataforseo.com› APIs
    SERP API You Can Trust
 4  https://serper.dev
    Serper - The World's Fastest and Cheapest Google Search API
 5  https://github.com› serpapi › google-search-results-python
    serpapi/google-search-results-python

Each entry in organic has position, title, link, display_link, description, source, plus sometimes snippet_highlighted_words and redirect_link. display_link is the breadcrumb URL Google renders under the title — close to but not the same as link. For domain matching use the host of link.

Find a target domain's rank

from urllib.parse import urlparse


def find_rank(target_domain: str, organic: list) -> int | None:
    target = target_domain.lower().removeprefix("www.")
    for r in organic:
        host = (urlparse(r["link"]).hostname or "").removeprefix("www.")
        if host == target or host.endswith("." + target):
            return r["position"]
    return None


TARGET_DOMAIN = "github.com"  # change this to the domain you want to track
print(find_rank(TARGET_DOMAIN, data["organic"]))
# 5

Two things worth knowing. The endswith branch handles subdomains, so docs.github.com still counts as a github.com hit — change that behavior if you want exact-host matching. And if the domain isn't in the slice you requested, you get None, not an exception. None means "not in top N", not "doesn't rank" — bump num to 100 if you want to know whether it's anywhere on the first ten pages.
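
If subdomain hits should not count, one way to make the matching rule switchable is sketched below. The include_subdomains flag and the sample list are assumptions made for illustration; only the link and position fields come from the response described above.

```python
from urllib.parse import urlparse


def find_rank(target_domain, organic, include_subdomains=True):
    """Return the position of the first result on target_domain, or None.

    include_subdomains is a hypothetical flag added for this sketch.
    """
    target = target_domain.lower().removeprefix("www.")
    for r in organic:
        host = (urlparse(r["link"]).hostname or "").removeprefix("www.")
        if host == target:
            return r["position"]
        # The leading dot keeps "notgithub.com" from matching "github.com".
        if include_subdomains and host.endswith("." + target):
            return r["position"]
    return None


# Made-up results, shaped like the organic entries described above.
sample = [
    {"position": 1, "link": "https://notgithub.com/page"},
    {"position": 2, "link": "https://docs.github.com/en/rest"},
    {"position": 3, "link": "https://www.github.com/serpapi"},
]

print(find_rank("github.com", sample))                            # 2
print(find_rank("github.com", sample, include_subdomains=False))  # 3
print(find_rank("example.org", sample))                           # None
```

With the flag off, docs.github.com is skipped and the first exact-host result wins. str.removeprefix needs Python 3.9 or newer.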

Track a list of keywords over time

One ranking is a data point. A tracker is a series. SQLite is enough for this — single file, ships with Python, no server.

import sqlite3
from datetime import date

KEYWORDS = [
    "serp api",
    "google search api",
    "scrape google results",
    "rank tracker python",
    "google serp scraper",
    "google search json api",
]

DB = sqlite3.connect("rankings.db")
DB.executescript("""
CREATE TABLE IF NOT EXISTS rankings (
    day      TEXT NOT NULL,
    keyword  TEXT NOT NULL,
    domain   TEXT NOT NULL,
    rank     INTEGER,
    PRIMARY KEY (day, keyword, domain)
);
""")


def track(domain: str, keywords: list[str]) -> None:
    today = date.today().isoformat()
    for kw in keywords:
        data = search(kw, num=100)
        rank = find_rank(domain, data["organic"])
        DB.execute(
            "INSERT OR REPLACE INTO rankings VALUES (?, ?, ?, ?)",
            (today, kw, domain, rank),
        )
        DB.commit()  # commit per keyword so a crash mid-run keeps earlier rows
        print(f"{kw:<28} -> {rank}")


track(TARGET_DOMAIN, KEYWORDS)

INSERT OR REPLACE means you can re-run on the same day without primary-key errors — useful if the run dies halfway through. num=100 is deliberate: most of these keywords won't put the target on page one, and a wall of Nones isn't useful data. One request returns up to a hundred results for the same cost as ten.
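
The re-run behavior can be checked in isolation. This is a minimal sketch against a throwaway in-memory database, using the same table layout; the record helper, the dates, and the ranks are made up for the demo.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # throwaway database for the demo
db.execute(
    "CREATE TABLE rankings ("
    " day TEXT NOT NULL, keyword TEXT NOT NULL, domain TEXT NOT NULL,"
    " rank INTEGER, PRIMARY KEY (day, keyword, domain))"
)


def record(day, keyword, domain, rank):
    """Hypothetical helper: upsert one (day, keyword, domain) row."""
    db.execute(
        "INSERT OR REPLACE INTO rankings VALUES (?, ?, ?, ?)",
        (day, keyword, domain, rank),
    )
    db.commit()


record("2024-01-01", "serp api", "github.com", 5)
record("2024-01-01", "serp api", "github.com", 4)     # same day: overwrites
record("2024-01-02", "serp api", "github.com", None)  # not in top N -> NULL

rows = db.execute("SELECT day, rank FROM rankings ORDER BY day").fetchall()
print(rows)  # [('2024-01-01', 4), ('2024-01-02', None)]
```

The second write for the same day replaces the first instead of raising an integrity error, and a None rank is stored as NULL.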

Sample run:

serp api                     -> 5
google search api            -> 12
scrape google results        -> 8
rank tracker python          -> None
google serp scraper          -> 23
google search json api       -> 7

Chart

Once a few days are in the table, the plot is small:

import matplotlib.pyplot as plt

OFF_RADAR = 101  # plot NULL ranks one step past the tracked range (1-100)


def plot(keyword: str, domain: str) -> None:
    cur = DB.execute(
        "SELECT day, rank FROM rankings "
        "WHERE keyword = ? AND domain = ? ORDER BY day",
        (keyword, domain),
    )
    rows = cur.fetchall()
    days = [r[0] for r in rows]
    ranks = [r[1] if r[1] is not None else OFF_RADAR for r in rows]

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(days, ranks, marker="o")
    ax.set_ylim(0, OFF_RADAR + 5)
    ax.invert_yaxis()  # flip the axis so rank 1 sits at the top
    ax.set_title(f'{domain}: "{keyword}"')
    ax.set_ylabel("Google rank (1 = top)")
    fig.autofmt_xdate()
    fig.tight_layout()
    fig.savefig(f"{keyword.replace(' ', '_')}.png", dpi=120)


plot("serp api", TARGET_DOMAIN)

invert_yaxis is the bit that makes the chart read correctly — rank 1 at the top, a drop in position is a line going down. Mapping None to 101 keeps dropouts visible instead of leaving holes in the series; pick a different cutoff if you only care about the top ten.
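
For a top-ten-only view, the same idea can be applied with a smaller cutoff. A rough sketch, assuming a hypothetical to_plot_values helper and made-up rank history:

```python
def to_plot_values(ranks, cutoff=10):
    """Map None and any rank past the cutoff to cutoff + 1."""
    off_radar = cutoff + 1
    return [off_radar if r is None or r > cutoff else r for r in ranks]


history = [5, 12, 8, None, 23, 7]  # made-up ranks for illustration

print(to_plot_values(history))              # [5, 11, 8, 11, 11, 7]
print(to_plot_values(history, cutoff=100))  # [5, 12, 8, 101, 23, 7]
```

With cutoff=100 this reproduces the None-to-101 mapping; with a smaller cutoff the y-limits in plot would need to shrink to match.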

Wrapping up

The whole thing is about 80 lines of Python. One endpoint, one parser, a SQLite table, a matplotlib chart.

Two obvious extensions: add a second domain so you can plot two lines on the same axes, and split desktop and mobile into their own runs since they don't always agree.
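
For the two-domain extension, most of the work is reshaping rows into one series per domain. The sketch below assumes rows shaped like the result of selecting day, domain, rank for a single keyword, ordered by day; the series_by_domain helper and the sample values are hypothetical.

```python
from collections import defaultdict


def series_by_domain(rows):
    """Group (day, domain, rank) rows into {domain: (days, ranks)}."""
    grouped = defaultdict(lambda: ([], []))
    for day, domain, rank in rows:
        days, ranks = grouped[domain]
        days.append(day)
        ranks.append(rank)
    return dict(grouped)


# Made-up rows for one keyword, already ordered by day.
rows = [
    ("2024-01-01", "github.com", 5),
    ("2024-01-01", "serpapi.com", 1),
    ("2024-01-02", "github.com", 4),
    ("2024-01-02", "serpapi.com", 1),
]

series = series_by_domain(rows)
print(series["github.com"])   # (['2024-01-01', '2024-01-02'], [5, 4])
print(series["serpapi.com"])  # (['2024-01-01', '2024-01-02'], [1, 1])
```

Each (days, ranks) pair can then go to its own ax.plot(days, ranks, marker="o", label=domain) call, followed by ax.legend(), so both lines share one set of axes.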
