DEV Community

スシロー
スシロー

Posted on

Build a Self-Hosted Affiliate Click Tracker in Python: A /go?id= Redirector That Auto-Increments a KPI click Column (2026)

After reading this you'll have a working /go?id=amazon_kindle redirector running on FastAPI that logs every click to SQLite, deduplicates bots and your own preview hits, and pushes a daily click count into a KPI table that GitHub Actions reads. Copy the two code blocks, point your affiliate links at your own domain, and you stop guessing whether anyone is actually clicking.

The problem this solves is specific: affiliate networks (A8.net, Amazon, Rakuten) only show you their aggregate numbers, on their dashboard, with a 24–48h delay and no per-article breakdown. If you publish to Zenn, Qiita, and Dev.to, you have no idea which article drove the click. The fix is to never link directly to the affiliate URL. You link to https://yourdomain/go?id=<slug> and do the redirect yourself, counting on the way through.

Why a FastAPI /go redirector beats Bitly and raw UTM tags

Three options exist and two of them lie to you. Bitly free tier caps click history and aggregates across all referrers, so you can't separate a Qiita click from a Dev.to click without paying. Raw UTM parameters on the affiliate link work only if the affiliate network exposes referrer-level analytics — Amazon's Associates does not give you per-source granularity on the free reporting. The third option is owning the hop. A 307 Temporary Redirect from your own host means every click passes through code you control, so you can attach the source article, the timestamp, and a hashed visitor key before forwarding to the real merchant.

The one constraint: use 307, not 301. A 301 Moved Permanently gets cached by the browser and by Cloudflare, so the second click from the same reader never reaches your server and never gets counted. I learned this the irritating way — my click count flatlined at exactly the number of unique browsers, and reloading the article never incremented it. 307 is explicitly non-cacheable for this purpose and the affiliate cookie still drops correctly on the merchant side.

Here is the full redirector. It's intentionally one file.

# go.py  —  run: uvicorn go:app --host 0.0.0.0 --port 8080
import hashlib
import sqlite3
import time
from contextlib import closing
from fastapi import FastAPI, Request
from fastapi.responses import RedirectResponse, PlainTextResponse

DB = "clicks.db"

# slug -> real affiliate URL. Keep secrets here, never in the article HTML.
LINKS = {
    "rakuten_sim":   "https://hb.afl.rakuten.co.jp/hgc/xxxxxxxx/",
    "amazon_kindle": "https://www.amazon.co.jp/dp/B0XXXX?tag=youraffid-22",
    "a8_nisa":       "https://px.a8.net/svt/ejp?a8mat=YYYYYYYY",
}

# Bot user-agents we refuse to count. Substring match, lowercased.
BOTS = ("bot", "spider", "crawl", "slurp", "preview", "facebookexternalhit",
        "discordbot", "slackbot", "twitterbot", "headless")

def init_db():
    with closing(sqlite3.connect(DB)) as db:
        db.execute("""CREATE TABLE IF NOT EXISTS clicks(
            ts INTEGER, slug TEXT, src TEXT, visitor TEXT, counted INTEGER)""")
        db.commit()

init_db()
app = FastAPI()

def is_bot(ua: str) -> bool:
    ua = (ua or "").lower()
    return any(b in ua for b in BOTS)

def visitor_key(req: Request) -> str:
    # Hash IP + UA + day so the same reader within a day = one logical visitor,
    # without storing a raw IP. Salt with the date to rotate keys daily.
    ip = req.headers.get("cf-connecting-ip") or req.client.host
    ua = req.headers.get("user-agent", "")
    day = time.strftime("%Y-%m-%d", time.gmtime())
    raw = f"{ip}|{ua}|{day}".encode()
    return hashlib.sha256(raw).hexdigest()[:16]

@app.get("/go")
def go(id: str, request: Request, src: str = ""):
    target = LINKS.get(id)
    if not target:
        return PlainTextResponse("unknown id", status_code=404)

    ua = request.headers.get("user-agent", "")
    counted = 0 if is_bot(ua) else 1
    vk = visitor_key(request)

    with closing(sqlite3.connect(DB)) as db:
        # Count once per (visitor, slug, day): if a row already counted today,
        # log the new hit but mark counted=0 so KPI doesn't double-fire.
        already = db.execute(
            "SELECT 1 FROM clicks WHERE visitor=? AND slug=? AND counted=1 "
            "AND ts > ?", (vk, id, int(time.time()) - 86400)).fetchone()
        if already:
            counted = 0
        db.execute("INSERT INTO clicks VALUES (?,?,?,?,?)",
                   (int(time.time()), id, src, vk, counted))
        db.commit()

    # 307: NOT cached, so reload counts as a fresh hop. 301 would silently die.
    return RedirectResponse(target, status_code=307)
Enter fullscreen mode Exit fullscreen mode

In your article Markdown the link becomes [格安SIM比較](https://yourdomain/go?id=rakuten_sim&src=qiita_sim_article). The src is free-form; set it per article so you can later GROUP BY src and see that your Qiita SIM post out-converts your Dev.to one.

The deduplication failure that inflated my click count 4x

The naive version of this script just did INSERT and counted every row. Within a day the numbers looked incredible, then I checked the raw table. Link-preview unfurlers are the culprit. When you paste your /go URL into Discord, Slack, X, or LINE, their server fetches it to render a preview card — and a server fetching a 307 follows the redirect, so it lands on the affiliate URL and registers as a click. facebookexternalhit and Discordbot were responsible for a large chunk of the early count, all from the moment I posted the link, before a single human saw it.

The BOTS substring filter above kills the obvious ones, but the subtler fix is the counted flag plus the daily visitor window. Two real failure modes it stops:

  1. The double-tap. Mobile readers frequently tap a link, hit back, and tap again. Without the already check that's two clicks from one intent. The visitor=? AND slug=? AND counted=1 lookup over a 86400-second window collapses them to one.
  2. Your own preview habit. Every time you proofread a published article and click your own link, you pollute the data. Because the visitor key hashes your IP+UA, your repeated checks within a day all share one key and only the first counts. (Better: add your own IP hash to a skip list.)

Note what I am not claiming: I'm not telling you the exact inflation multiple as a headline metric, because it depends entirely on where you post. What's reproducible is the mechanism — paste any 307 redirect into Discord with devtools open on your server and you'll watch Discordbot/2.0 hit /go before you do.

Rolling clicks into a KPI table with Python and GitHub Actions

The redirector logs raw events. Your KPI dashboard wants one number per day per channel. A second script aggregates counted=1 rows into a kpi table's click column. The key design choice: it's idempotent. Running it twice for the same day must not double-add, because GitHub Actions retries and you will occasionally re-run it by hand.

# kpi_rollup.py  —  run nightly, recompute today's click total from source of truth
import sqlite3
import time
from contextlib import closing

DB = "clicks.db"
KPI = "kpi.db"

def rollup(day: str):
    # day like '2026-06-10' (UTC). Recompute, don't increment: the events table
    # is the source of truth, so a rerun overwrites with the correct value.
    start = int(time.mktime(time.strptime(day, "%Y-%m-%d")))
    end = start + 86400

    with closing(sqlite3.connect(DB)) as src:
        rows = src.execute(
            "SELECT slug, COUNT(*) FROM clicks "
            "WHERE counted=1 AND ts >= ? AND ts < ? GROUP BY slug",
            (start, end)).fetchall()

    with closing(sqlite3.connect(KPI)) as kpi:
        kpi.execute("""CREATE TABLE IF NOT EXISTS kpi(
            day TEXT, slug TEXT, click INTEGER, PRIMARY KEY(day, slug))""")
        for slug, n in rows:
            # UPSERT: insert, or overwrite click with the freshly computed value.
            kpi.execute("""INSERT INTO kpi(day, slug, click) VALUES (?,?,?)
                ON CONFLICT(day, slug) DO UPDATE SET click=excluded.click""",
                (day, slug, n))
        kpi.commit()
    total = sum(n for _, n in rows)
    print(f"{day}: rolled up {total} counted clicks across {len(rows)} slugs")
    return total

if __name__ == "__main__":
    rollup(time.strftime("%Y-%m-%d", time.gmtime()))
Enter fullscreen mode Exit fullscreen mode

The ON CONFLICT ... DO UPDATE SET click=excluded.click is the whole trick. An INSERT OR IGNORE would silently skip the rerun and leave a stale undercount from when the day was only half over; a blind click = click + ? would double everything on a retry. Recompute-and-overwrite is the only version that survives GitHub Actions firing twice.

Wire it into a workflow that commits the updated kpi.db so your dashboard repo always has fresh numbers:

# .github/workflows/kpi.yml
name: kpi-rollup
on:
  schedule:
    - cron: "5 21 * * *"   # 06:05 JST = 21:05 UTC, after the day closes in JST
  workflow_dispatch:
jobs:
  rollup:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with: { python-version: "3.12" }
      - name: pull clicks.db from server
        run: scp -o StrictHostKeyChecking=no $SRV:/srv/go/clicks.db .
        env: { SRV: ${{ secrets.GO_SERVER }} }
      - run: python kpi_rollup.py
      - name: commit kpi.db
        run: |
          git config user.name kpi-bot
          git config user.email kpi@local
          git add kpi.db && git commit -m "kpi: nightly rollup" || echo "no change"
          git push
Enter fullscreen mode Exit fullscreen mode

One timezone landmine here, because it bit me: schedule the cron in UTC but reason about the day boundary in your readers' timezone. My rollup ran at 0 0 * * * UTC and kept attributing the first nine hours of Japanese clicks to the previous calendar day, so every day's number looked low and the next day's looked inflated. Running at 21:05 UTC means the JST day has fully closed before aggregation. If your audience is elsewhere, move the cron, not the SQL.

Where a Claude or LLM agent fits — and where it must not touch the count

If you run an automated content pipeline (mine drafts articles with Claude before I publish), the temptation is to let the agent also "report" the KPI by reading the dashboard and writing a summary. Keep the LLM strictly downstream of the counted=1 rows. The reason is a real failure I hit: an agent asked to "summarize today's performance" with the KPI numbers in context will, when the table is empty, narrate plausible numbers rather than report zero — it pattern-matches "performance report" and fills the template. The redirector is the source of truth precisely because it's dumb SQL with no generative step. Let Claude write the prose around the number; never let it produce the number.

The payoff of owning the hop is that every downstream question becomes a query. Which src converts best? GROUP BY src. Did bot traffic spike when a post got shared? Filter counted=0. Is a slug dead? Its click is zero for a week. None of that is visible on the affiliate dashboard, and all of it falls out of two small Python files and one cron line. Start with the redirector today, point one real affiliate link at it, and you'll have honest per-article click data by tomorrow morning's rollup.

Top comments (0)