DEV Community

Pascale Beier

From HealthTech to Open Source: Building a sovereign web analytics engine in a single binary

A year ago, I was working with a HealthTech client who needed web analytics. Because of strict patient privacy laws and GDPR, Google Analytics was completely off the table.

We looked at self-hosting great open-source alternatives like Plausible or Umami. But operationally, they felt heavy. To run them reliably, you end up managing a whole stack: PostgreSQL, ClickHouse, Redis, Node.js, or Elixir.

I didn't want to babysit a database cluster just to count pageviews. I wanted the performance of an enterprise analytics stack, but with the deployment simplicity of a single file.

So, I built HitKeep. It's been ingesting millions of hits in production for that client, and after battling a healthy dose of imposter syndrome, I’ve spent the last few months polishing the core and open-sourcing it.

Here is how I built a full analytics platform inside a single 12MB Go binary, and why data sovereignty is its core feature.

The Architecture: DuckDB + NSQ in Go

To make deployment as simple as downloading a binary (or running one Docker container), everything had to be embedded.

Ingesting tens of thousands of hits per hour on a $4 VPS

  1. Storage (Embedded DuckDB): Analytics is an OLAP (Online Analytical Processing) workload. Standard row-based databases like Postgres struggle with heavy aggregations over millions of rows. I embedded DuckDB, a fast columnar database that lives in a single file (hitkeep.db). About 1 million raw hits take roughly 120 MB on disk.
  2. Ingestion (Embedded NSQ): Writing to a columnar database synchronously per HTTP request creates lock contention. To solve this, HitKeep embeds an NSQ broker in-memory. The HTTP handler enqueues the hit in microseconds, and a background consumer micro-batches the writes to DuckDB.
  3. Clustering (Memberlist): If you need High Availability, it has native clustering via HashiCorp Memberlist (gossip protocol) for leader election.
  4. Frontend: The dashboard (Angular) and the 2KB tracking script are compiled directly into the Go binary using embed.FS.
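
The ingestion path from step 2 can be sketched with the standard library alone. This is not HitKeep's code: a buffered Go channel stands in for the embedded NSQ broker and a callback stands in for the DuckDB writer. The names `Hit` and `runBatcher`, the batch size, and the flush interval are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Hit is a hypothetical, simplified pageview record.
type Hit struct {
	Path string
}

// runBatcher drains hits from in and hands them to flush in batches.
// A batch is flushed when it reaches maxBatch or when flushEvery elapses,
// whichever comes first. It returns once in is closed and drained.
func runBatcher(in <-chan Hit, maxBatch int, flushEvery time.Duration, flush func([]Hit)) {
	ticker := time.NewTicker(flushEvery)
	defer ticker.Stop()

	batch := make([]Hit, 0, maxBatch)
	emit := func() {
		if len(batch) == 0 {
			return
		}
		flush(batch)
		batch = make([]Hit, 0, maxBatch) // fresh slice: flush may keep the old one
	}

	for {
		select {
		case h, ok := <-in:
			if !ok {
				emit() // flush whatever is left on shutdown
				return
			}
			batch = append(batch, h)
			if len(batch) >= maxBatch {
				emit()
			}
		case <-ticker.C:
			emit() // time-based flush keeps latency bounded under low traffic
		}
	}
}

func main() {
	// The HTTP handler would only do "queue <- hit" and return immediately.
	queue := make(chan Hit, 1024)
	for i := 0; i < 5; i++ {
		queue <- Hit{Path: fmt.Sprintf("/page/%d", i)}
	}
	close(queue)

	runBatcher(queue, 2, time.Second, func(b []Hit) {
		fmt.Printf("writing %d hits in one transaction\n", len(b))
	})
}
```

The point of the pattern is that the request path never touches the database: one writer goroutine owns all inserts, so there is a single source of writes and far fewer, larger transactions.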

True Data Sovereignty (The "Takeout" API)

Most "privacy-friendly" tools still lock your data in their ecosystem. I believe your data belongs to you.

HitKeep has a first-class Takeout API. With one click, you can export every single raw data point (hits, events, goals) into open formats: Parquet, CSV, JSON, or NDJSON.
Parquet is especially powerful here—you can take your HitKeep export and immediately query it in Python, Apache Spark, or a data warehouse without any transformation.
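
To make the NDJSON format concrete, here is a minimal sketch of how such an export could be produced in Go. The `Hit` fields are hypothetical and the real Takeout schema may differ. The only thing this illustrates is the format itself: one JSON object per line.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"strings"
)

// Hit is a hypothetical export row; the real Takeout schema may differ.
type Hit struct {
	Timestamp string `json:"timestamp"`
	Path      string `json:"path"`
	Referrer  string `json:"referrer"`
	Country   string `json:"country"`
}

// writeNDJSON streams hits to w as newline-delimited JSON.
func writeNDJSON(w io.Writer, hits []Hit) error {
	enc := json.NewEncoder(w) // Encode appends a newline after each value
	for _, h := range hits {
		if err := enc.Encode(h); err != nil {
			return err
		}
	}
	return nil
}

// ndjsonString is a small convenience wrapper for in-memory use.
func ndjsonString(hits []Hit) (string, error) {
	var sb strings.Builder
	err := writeNDJSON(&sb, hits)
	return sb.String(), err
}

func main() {
	hits := []Hit{
		{Timestamp: "2025-01-01T12:00:00Z", Path: "/", Referrer: "", Country: "DE"},
		{Timestamp: "2025-01-01T12:00:05Z", Path: "/pricing", Referrer: "example.org", Country: "FR"},
	}
	if err := writeNDJSON(os.Stdout, hits); err != nil {
		fmt.Fprintln(os.Stderr, "export failed:", err)
		os.Exit(1)
	}
}
```

Because every line parses on its own, an NDJSON export can be streamed, split, or filtered with line-oriented tools without loading the whole file.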

Takeout across sites to Parquet, JSON, Excel...

Security Isn't an "Enterprise" Feature

Because of its HealthTech roots, security couldn't be an afterthought or a paid upgrade.

  • Zero 3rd-Party Requests: Your browser makes no third-party requests when you use the dashboard. Even site favicons are proxied server-side via DuckDuckGo, so your browser never leaks its IP address. Apart from that favicon lookup, it is fully air-gap compatible.
  • WebAuthn & 2FA: Hardware security keys (Passkeys/YubiKey) and TOTP (Authenticator apps) are built-in for account protection.
  • API Clients: Built-in bearer token generation for CI/CD pipelines or custom dashboards.
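
For readers curious what sits behind the TOTP codes mentioned above, the algorithm (RFC 6238 on top of RFC 4226) fits in a few lines of standard-library Go. This is a generic sketch, not HitKeep's implementation: it assumes HMAC-SHA1, a 30-second step, 6 digits, and a raw byte secret, whereas authenticator apps usually exchange the secret base32-encoded.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/binary"
	"fmt"
	"time"
)

// totp computes a 6-digit RFC 6238 code (HMAC-SHA1, 30-second step)
// for the given raw secret and Unix time in seconds.
func totp(secret []byte, unixSeconds int64) string {
	counter := uint64(unixSeconds / 30)

	var msg [8]byte
	binary.BigEndian.PutUint64(msg[:], counter)

	mac := hmac.New(sha1.New, secret)
	mac.Write(msg[:])
	sum := mac.Sum(nil)

	// Dynamic truncation (RFC 4226): the low nibble of the last byte
	// selects a 4-byte window; the top bit of that window is masked off.
	offset := sum[len(sum)-1] & 0x0f
	code := binary.BigEndian.Uint32(sum[offset:offset+4]) & 0x7fffffff

	return fmt.Sprintf("%06d", code%1000000)
}

func main() {
	secret := []byte("12345678901234567890") // RFC test secret, not a real key
	fmt.Println("current code:", totp(secret, time.Now().Unix()))
	fmt.Println("code at t=59:", totp(secret, 59)) // prints 287082 (RFC 4226 vector for counter 1)
}
```

A server verifying codes would typically also accept the adjacent time step to tolerate clock drift and compare codes in constant time.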

Yes, you can log in using only WebAuthn

What it tracks

It is cookie-less by default and (optionally) respects Do Not Track headers. It tracks exactly what you need to run a website:

  • Traffic, Referrers, Devices, and Countries.
  • Custom Events & Conversion Goals.
  • Multi-step Funnels with drop-off analysis.
  • Automatic UTM Campaign attribution.
  • Scheduled Email Reports (dispatched via your own SMTP server, no external cron needed).
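
The optional Do Not Track behavior mentioned above comes down to a header check before a hit is recorded. A minimal sketch, assuming a hypothetical `respectDNT` setting and a hypothetical `collectHandler`, neither of which is taken from HitKeep's code:

```go
package main

import (
	"fmt"
	"net/http"
)

// shouldTrack reports whether a hit may be recorded for this request.
// respectDNT models the optional setting: when it is true, a "DNT: 1"
// request header suppresses tracking.
func shouldTrack(h http.Header, respectDNT bool) bool {
	if respectDNT && h.Get("DNT") == "1" {
		return false
	}
	return true
}

// collectHandler is a hypothetical ingest endpoint. It never sets a cookie;
// it either records the hit or silently drops it.
func collectHandler(respectDNT bool, record func(r *http.Request)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if shouldTrack(r.Header, respectDNT) {
			record(r)
		}
		w.WriteHeader(http.StatusNoContent)
	}
}

func main() {
	h := http.Header{}
	h.Set("DNT", "1")

	fmt.Println(shouldTrack(h, true))  // false: visitor opted out and the setting is on
	fmt.Println(shouldTrack(h, false)) // true: the setting is off, header is ignored
}
```

Responding the same way in both cases means the tracking script cannot tell, and does not need to know, whether the hit was stored.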

It's also fully translated into English, German, Spanish, French, and Italian.

What's missing?

I want to be transparent about what it doesn't do yet. There is no eCommerce revenue tracking, no cross-device identity stitching (by design, for privacy), and it still needs some UI polish. Up next on the roadmap is bringing your own SSO (OIDC/SAML).

Try it out

You can self-host HitKeep right now for free.

  • GitHub (Source & Screenshots): PascaleBeier / hitkeep. A self-hostable, privacy-first web analytics service in a single Go binary (MIT license).
  • Documentation: https://hitkeep.com

If you like the project, I'd really appreciate a star on GitHub!


☁️ Don't want to manage a server?
HitKeep is proudly Open Source. But if you want the strict privacy, data ownership, and cookie-less tracking of HitKeep without having to manage a VPS, systemd, or database backups, I am launching HitKeep Cloud.

It will offer fully managed, single-tenant instances hosted strictly in the EU (Frankfurt) or the US.

👉 Join the HitKeep Cloud Early Access Waitlist

