Hermes Rodríguez

gghstats: Keep GitHub traffic past 14 days

We've all been there. You ship an open-source project, a tiny CLI, or a docs site. You watch Insights → Traffic for a week: views spike, clones climb, life is good.

Then you come back a month later and ask a simple question: did that blog post actually move the needle over time? GitHub’s answer is blunt: detailed traffic (views and clones) only lives in a rolling 14-day window. Past that, the granularity is gone unless you exported it yourself.

I wanted historical traffic without a SaaS middleman, without babysitting CSV exports, and in a form I could run beside my other self-hosted services. That’s why I built gghstats. The first stable release is v0.1.0 (binaries on the GitHub Releases page, multi-arch image on GHCR).


The problem in one sentence

GitHub is a great place to host code; it is not a long-term analytics warehouse for repository traffic. If you care about trends, seasonality, or “what happened after launch,” you need your own copy of that data.


What gghstats does

gghstats is a small Go service that:

  1. Uses the GitHub API (with a personal access token) to pull traffic metrics on a schedule.
  2. Merges them into a local SQLite database so history accumulates instead of disappearing.
  3. Serves a web UI and JSON API so you can browse aggregates and per-repo charts.
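To make step 1 concrete, here is a rough sketch of what one traffic fetch can look like. It is not the actual gghstats code: the function and type names (newViewsRequest, parseViews, ViewsResponse, DayCount) are made up for illustration. The endpoint path, headers, and response fields follow GitHub's REST API for repository traffic, which, as far as I know, requires push access to the repository.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// DayCount is one daily bucket in GitHub's traffic response.
type DayCount struct {
	Timestamp time.Time `json:"timestamp"`
	Count     int       `json:"count"`
	Uniques   int       `json:"uniques"`
}

// ViewsResponse mirrors the body of GET /repos/{owner}/{repo}/traffic/views.
type ViewsResponse struct {
	Count   int        `json:"count"`
	Uniques int        `json:"uniques"`
	Views   []DayCount `json:"views"`
}

// newViewsRequest builds the authenticated request (hypothetical helper).
func newViewsRequest(owner, repo, token string) (*http.Request, error) {
	url := fmt.Sprintf("https://api.github.com/repos/%s/%s/traffic/views", owner, repo)
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/vnd.github+json")
	req.Header.Set("Authorization", "Bearer "+token)
	return req, nil
}

// parseViews decodes a response body into typed daily buckets.
func parseViews(body []byte) (ViewsResponse, error) {
	var v ViewsResponse
	err := json.Unmarshal(body, &v)
	return v, err
}
```

Clones work the same way against /traffic/clones, where the daily array is named clones instead of views.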

On startup it runs a full sync once (so repo discovery matches your filter right away), then repeats every GGHSTATS_SYNC_INTERVAL (default 1h). No waiting for the first tick to see data.

It’s deliberately boring technology: one binary, one file for the DB, backups = copy gghstats.db.

Live demo (read-only UI): gghstats.hermesrodriguez.com


Stack (opinionated and minimal)

Piece          | Why
Go             | Fast, single binary, easy to ship in Docker.
SQLite         | No separate DB server; ship backups with the rest of your backups.
Chart.js       | Charts in the dashboard without a heavy frontend framework.
Bootstrap grid | Layout and responsive behavior without reinventing CSS. The UI is intentionally neo-brutalist (hard borders, monospace, loud accents) so it feels like a tool, not a marketing site.

“Works on my machine” wasn’t enough

I wanted a production-shaped repo, not just go run:

  • Docker / Docker Compose for local runs.
  • docker-compose.prod.yml with Traefik, Let’s Encrypt, and no public port on the app container — only 80/443 on the proxy.
  • Helm chart under charts/gghstats for Kubernetes.
  • GoReleaser + GitHub Actions for releases, artifacts, and multi-arch images (linux/amd64, linux/arm64).

If you’ve ever maintained a side project, you know the drag of “I’ll dockerize it later.” I put the boring work upfront so future me doesn’t hate present me.


How it works (high level)

[Figure: gghstats high-level data flow: GitHub API, gghstats service, SQLite, web dashboard, JSON API, browser, and scripts]

  1. Fetch — Scheduled sync using your token (scope: repo for private repos you care about).
  2. Store — Upserts into SQLite so you keep a timeline, not a snapshot.
  3. Serve — Dashboard for humans and JSON for scripts.
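Why upserts rather than plain inserts? Every sync returns the last ~14 days again, so consecutive syncs overlap almost completely, and today's bucket is still growing. The sketch below models that with an in-memory map standing in for the SQLite table; the types and the table layout are my assumptions for illustration, not the real gghstats schema. In SQLite the same effect comes from INSERT ... ON CONFLICT(...) DO UPDATE.

```go
package main

// dayKey identifies one stored row: one repository on one UTC day.
// It plays the role of a UNIQUE(repo, day) constraint.
type dayKey struct {
	Repo string
	Day  string // formatted as "2006-01-02"
}

// dayStat is the per-day payload kept for a key.
type dayStat struct {
	Views   int
	Uniques int
}

// history stands in for the SQLite table in this sketch.
type history map[dayKey]dayStat

// upsert inserts days that are new and overwrites days already stored,
// roughly what this SQL would do against a hypothetical table:
//
//	INSERT INTO views(repo, day, views, uniques) VALUES (?, ?, ?, ?)
//	ON CONFLICT(repo, day) DO UPDATE SET
//	  views = excluded.views, uniques = excluded.uniques;
func (h history) upsert(repo string, days map[string]dayStat) {
	for day, s := range days {
		h[dayKey{Repo: repo, Day: day}] = s
	}
}
```

Running two overlapping syncs through upsert leaves one row per day, with the later sync's numbers winning for any day both syncs reported. Nothing is double counted, and days that have fallen out of GitHub's window stay put.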

Filtering (GGHSTATS_FILTER, exclusions like !fork, etc.) lives in env vars so you can keep the sync set tight.


Two numbers that matter (aggregate vs history)

On the main screen you see rollups: totals across the repos you track. That’s the “at a glance” view.

The real payoff is opening one repository: you see interval stats (what GitHub is showing now for the last ~14 days) next to lifetime totals from your database. The gap between “what GitHub is willing to remember” and “what you kept” is the whole point — and the charts (clones, views, stars over time) are where SQLite stops being a file and starts being a memory.

Main dashboard (rollups across tracked repos):

[Figure: gghstats main dashboard with neo-brutalist UI, repo list, and aggregates]

Repository detail — interval stats from GitHub’s window next to lifetime totals from SQLite, plus Chart.js trends (clones, views, stars):

[Figure: gghstats repository detail with interval vs lifetime stats and historical charts]


Who should try it

  • Maintainers who want long-term traffic context.
  • People who already self-host and want data sovereignty (your DB, your VPS, your rules).
  • Anyone allergic to “sign up for another analytics product to see GitHub stats.”

Try it

From the repo (Compose):

git clone https://github.com/hrodrig/gghstats.git && cd gghstats
cp .env.example .env
# set GGHSTATS_GITHUB_TOKEN, tune GGHSTATS_FILTER if needed
docker compose up -d
# open http://localhost:8080

Published image only (no clone — see README for all env vars):

docker pull ghcr.io/hrodrig/gghstats:v0.1.0
# -e NAME with no value forwards NAME from your current shell, so export both variables first
docker run --rm -p 8080:8080 \
  -e GGHSTATS_GITHUB_TOKEN \
  -e GGHSTATS_FILTER \
  -v "$PWD/gghstats-data:/data" \
  ghcr.io/hrodrig/gghstats:v0.1.0

Production-oriented compose (Traefik, TLS) lives in docker-compose.prod.yml — see the repo README for env vars like GGHSTATS_HOSTNAME and ACME_EMAIL.

Repo: github.com/hrodrig/gghstats

Issues and PRs welcome. If this saves you from losing another year of traffic history, it was worth writing.

How do you capture or export GitHub traffic today — CSV dumps, scripts, or nothing? Curious what others do in the comments.


Credits

This article was drafted from my own notes and a long brainstorming thread with Gemini (analysis, structure, and image ideas). The code, the rough edges, and the neo-brutalist UI are mine — blame me for the bugs, not the LLM.