A year ago, I was working with a HealthTech client who needed web analytics. Because of strict patient privacy laws and GDPR, Google Analytics was completely off the table.
We looked at self-hosting great open-source alternatives like Plausible or Umami. But operationally, they felt heavy. To run them reliably, you end up managing a whole stack: some mix of PostgreSQL, ClickHouse, and Redis, plus a Node.js or Elixir runtime.
I didn't want to babysit a database cluster just to count pageviews. I wanted the performance of an enterprise analytics stack, but with the deployment simplicity of a single file.
So, I built HitKeep. It's been ingesting millions of hits in production for that client, and after battling a healthy dose of imposter syndrome, I’ve spent the last few months polishing the core and open-sourcing it.
Here is how I built a full analytics platform inside a single 12MB Go binary, and why data sovereignty is its core feature.
The Architecture: DuckDB + NSQ in Go
To make deployment as simple as downloading a binary (or running one Docker container), everything had to be embedded.
- Storage (Embedded DuckDB): Analytics requires OLAP (Online Analytical Processing) databases. Standard row-based databases like Postgres choke on heavy aggregations. I embedded DuckDB, a lightning-fast columnar database that lives in a single file (hitkeep.db). You can store about 1 million raw hits per ~120MB.
- Ingestion (Embedded NSQ): Writing to a columnar database synchronously per HTTP request creates lock contention. To solve this, HitKeep embeds an NSQ broker in-memory. The HTTP handler enqueues the hit in microseconds, and a background consumer micro-batches the writes to DuckDB.
- Clustering (Memberlist): If you need High Availability, it has native clustering via HashiCorp Memberlist (gossip protocol) for leader election.
- Frontend: The dashboard (Angular) and the 2KB tracking script are compiled directly into the Go binary using embed.FS.
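The enqueue-then-batch idea behind the ingestion path can be sketched in a few lines of Go. This is only an illustration: it swaps the embedded NSQ broker for a plain buffered channel, and the Hit type, the batchHits function, the batch size, and the flush interval are all assumptions made for the example, not HitKeep's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// Hit is a hypothetical, simplified pageview record.
type Hit struct {
	Path string
}

// batchHits drains in and groups hits into batches. A batch is flushed when
// it reaches maxSize, or on each maxWait tick if it is non-empty. flush
// stands in for one bulk write to the database.
func batchHits(in <-chan Hit, maxSize int, maxWait time.Duration, flush func([]Hit)) {
	batch := make([]Hit, 0, maxSize)
	ticker := time.NewTicker(maxWait)
	defer ticker.Stop()

	for {
		select {
		case h, ok := <-in:
			if !ok {
				// Channel closed: write whatever is left and stop.
				if len(batch) > 0 {
					flush(batch)
				}
				return
			}
			batch = append(batch, h)
			if len(batch) >= maxSize {
				flush(batch)
				batch = make([]Hit, 0, maxSize)
			}
		case <-ticker.C:
			if len(batch) > 0 {
				flush(batch)
				batch = make([]Hit, 0, maxSize)
			}
		}
	}
}

func main() {
	// Buffered queue: the request handler only enqueues and returns.
	queue := make(chan Hit, 1024)
	done := make(chan struct{})

	go func() {
		defer close(done)
		batchHits(queue, 100, 50*time.Millisecond, func(b []Hit) {
			fmt.Printf("flushing %d hits\n", len(b))
		})
	}()

	for i := 0; i < 250; i++ {
		queue <- Hit{Path: "/pricing"}
	}
	close(queue)
	<-done
}
```

The point of the design is that the request path never waits on the database: it only pays for a channel send, while the single consumer turns many small writes into a few large ones.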
True Data Sovereignty (The "Takeout" API)
Most "privacy-friendly" tools still lock your data in their ecosystem. I believe your data belongs to you.
HitKeep has a first-class Takeout API. With one click, you can export every single raw data point (hits, events, goals) into open formats: Parquet, CSV, JSON, or NDJSON.
Parquet is especially powerful here—you can take your HitKeep export and immediately query it in Python, Apache Spark, or a data warehouse without any transformation.
Security Isn't an "Enterprise" Feature
Because of its HealthTech roots, security couldn't be an afterthought or a paid upgrade.
- Zero 3rd-Party Requests: The dashboard makes zero third-party requests from your browser. Even site favicons are fetched server-side via DuckDuckGo and proxied, so your browser never leaks IP data. It is fully air-gap compatible.
- WebAuthn & 2FA: Hardware security keys (Passkeys/YubiKey) and TOTP (Authenticator apps) are built-in for account protection.
- API Clients: Built-in bearer token generation for CI/CD pipelines or custom dashboards.
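For the API clients mentioned above, a script or CI job would typically send its token in the Authorization header. The sketch below shows that pattern in Go against a local stand-in server. The /stats path, the token value, and the JSON response are placeholders invented for the example; the real endpoints are in the HitKeep API docs.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchWithToken sends a GET request authenticated with a bearer token
// and returns the status code and response body.
func fetchWithToken(client *http.Client, url, token string) (int, string, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return 0, "", err
	}
	req.Header.Set("Authorization", "Bearer "+token)

	resp, err := client.Do(req)
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	return resp.StatusCode, string(body), err
}

// newFakeAPI starts a local stand-in server that accepts only validToken.
// It is NOT the HitKeep API; it exists so the example runs on its own.
func newFakeAPI(validToken string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("Authorization") != "Bearer "+validToken {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		fmt.Fprint(w, `{"ok":true}`)
	}))
}

func main() {
	srv := newFakeAPI("example-token")
	defer srv.Close()

	// "/stats" is a hypothetical path used only for this demo.
	status, body, err := fetchWithToken(srv.Client(), srv.URL+"/stats", "example-token")
	if err != nil {
		panic(err)
	}
	fmt.Println(status, body) // prints: 200 {"ok":true}
}
```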
What it tracks
It is cookie-less by default and (optionally) respects Do Not Track headers. It tracks exactly what you need to run a website:
- Traffic, Referrers, Devices, and Countries.
- Custom Events & Conversion Goals.
- Multi-step Funnels with drop-off analysis.
- Automatic UTM Campaign attribution.
- Scheduled Email Reports (dispatched via your own SMTP, no external cron needed).
It's also fully translated into English, German, Spanish, French, and Italian.
What's missing?
I want to be transparent about what it doesn't do yet. There is no eCommerce revenue tracking, no cross-device identity stitching (by design, for privacy), and it still needs some UI polish. Up next on the roadmap is bringing your own SSO (OIDC/SAML).
Try it out
You can self-host HitKeep right now for free.
- GitHub (Source & Screenshots): PascaleBeier / hitkeep
Privacy-first web analytics you can self-host or run in managed EU/US cloud.
HitKeep is an open source web analytics platform built for people who want a simpler stack than the usual PostgreSQL, Redis, ClickHouse, and reverse-proxy pileup.
- Single binary runtime
- Embedded DuckDB and NSQ with batched ingest writes
- Privacy-first tracking
- Goals, funnels, ecommerce, AI visibility, AI chatbot analytics, email reports, and API clients
- Self-hosted or managed cloud with EU/US region choice
Website · Cloud · Live Demo · Docs · API · Releases
Why HitKeep
HitKeep is for teams that want product analytics without adopting a full analytics platform stack.
- Simple to run: one binary, one data directory, no external database required
- Efficient write path: NSQ buffers ingest bursts and DuckDB appender batches smooth out disk-heavy per-row inserts
- Privacy-first by default: cookie-less tracking, Do Not Track support, focused data collection
- Useful out of the box: traffic analytics with countries/languages…
- Documentation: https://hitkeep.com
If you like the project, I'd really appreciate a star on GitHub!
☁️ Don't want to manage a server?
HitKeep is proudly Open Source. But if you want the strict privacy, data-ownership, and cookie-less tracking of HitKeep without having to manage a VPS, systemd, or database backups, I am launching HitKeep Cloud.
It will offer fully managed, single-tenant instances hosted strictly in the EU (Frankfurt) or the US.
👉 Join the HitKeep Cloud Early Access Waitlist



