<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Archibald Duskbottle</title>
    <description>The latest articles on DEV Community by Archibald Duskbottle (@archibald_duskbottle_366f).</description>
    <link>https://dev.to/archibald_duskbottle_366f</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3802502%2F7925ea3d-1e75-4b35-a4c1-4da9951eb2ca.png</url>
      <title>DEV Community: Archibald Duskbottle</title>
      <link>https://dev.to/archibald_duskbottle_366f</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/archibald_duskbottle_366f"/>
    <language>en</language>
    <item>
      <title>How We Built a GA4-Compatible Analytics Pipeline to Escape US Tech Lock-in</title>
      <dc:creator>Archibald Duskbottle</dc:creator>
      <pubDate>Mon, 02 Mar 2026 22:27:50 +0000</pubDate>
      <link>https://dev.to/archibald_duskbottle_366f/how-we-built-a-ga4-compatible-analytics-pipeline-to-escape-us-tech-lock-in-a06</link>
      <guid>https://dev.to/archibald_duskbottle_366f/how-we-built-a-ga4-compatible-analytics-pipeline-to-escape-us-tech-lock-in-a06</guid>
      <description>&lt;h1&gt;
  
  
  How We Built a GA4-Compatible Analytics Pipeline to Escape US Tech Lock-in - d8a.tech
&lt;/h1&gt;

&lt;p&gt;Google Analytics is everywhere. It's also a deal-breaker for a growing number of teams.&lt;/p&gt;

&lt;p&gt;Under GDPR and the post-Schrems II landscape, sending EU visitor data to Google's US infrastructure is legally murky at best. For healthcare organizations under HIPAA or government sites under FedRAMP, it's a non-starter.&lt;/p&gt;

&lt;p&gt;The usual answer is to switch to a privacy-friendly alternative. The problem: most of them require you to throw away your existing tracking plan and start over. If you've invested in a GA4 setup - event taxonomy, GTM configuration, custom dimensions - that's a real switching cost.&lt;/p&gt;

&lt;h2&gt;One hard requirement&lt;/h2&gt;

&lt;p&gt;We built d8a around a single constraint: it had to speak GA4's protocol natively. Same &lt;code&gt;/g/collect&lt;/code&gt; endpoint, same parameters. If you're already sending data to Google, you're already sending it in the right format for d8a. No rewrites, no migration weekend.&lt;/p&gt;

&lt;h2&gt;How it's put together&lt;/h2&gt;

&lt;p&gt;The pipeline has three moving parts: a tracking component that turns HTTP requests into hits, a queue that buffers them, and a processing component that closes sessions and writes to your warehouse.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The transport is pluggable.&lt;/strong&gt; By default the two components communicate through the filesystem - cheap, simple, and sufficient on a single VPS. For HA deployments (this is how our cloud runs it), you swap in object storage and the processes scale independently. A RabbitMQ driver would fit the same interface. The tradeoffs are real: the filesystem is fine for a single node but rules out HA; object storage costs pennies and unlocks horizontal scaling; a message broker gives you minimum latency at high throughput but comes with either maintenance overhead or a bigger cloud bill.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The session engine has a pluggable batched KV interface.&lt;/strong&gt; Session state - grouping hits by visitor, tracking inactivity windows - lives behind an interface covered by a complete black-box test suite. The default implementation uses BoltDB: embedded, no external process, runs anywhere. Swapping it for something distributed (Redis, Cassandra, whatever fits your infra) is a matter of passing a different implementation that satisfies the same contract.&lt;/p&gt;
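
&lt;p&gt;A minimal sketch of what such a batched KV contract could look like, with an in-memory map standing in for BoltDB (the interface and method names are assumptions, not d8a's real signatures):&lt;/p&gt;

```go
package main

import "fmt"

// BatchedKV is a hypothetical sketch of the session engine's storage
// contract. Batching matters: one round-trip can load or persist the
// state of many concurrent sessions at once.
type BatchedKV interface {
	SetBatch(items map[string][]byte) error
	GetBatch(keys []string) (map[string][]byte, error)
}

// memKV is the simplest possible implementation, the kind a black-box
// test suite would exercise alongside the BoltDB-backed one.
type memKV struct {
	data map[string][]byte
}

func newMemKV() *memKV {
	kv := new(memKV)
	kv.data = make(map[string][]byte)
	return kv
}

func (m *memKV) SetBatch(items map[string][]byte) error {
	for k, v := range items {
		m.data[k] = v
	}
	return nil
}

func (m *memKV) GetBatch(keys []string) (map[string][]byte, error) {
	out := make(map[string][]byte)
	for _, k := range keys {
		if v, ok := m.data[k]; ok {
			out[k] = v
		}
	}
	return out, nil
}

func main() {
	var kv BatchedKV = newMemKV()
	kv.SetBatch(map[string][]byte{"session:42": []byte("last_seen=...")})
	got, _ := kv.GetBatch([]string{"session:42"})
	fmt.Println(string(got["session:42"]))
}
```

&lt;p&gt;Running one shared test suite against every implementation is what makes the swap safe: if Redis passes the same black-box tests as BoltDB, the session engine can't tell the difference.&lt;/p&gt;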

&lt;p&gt;&lt;strong&gt;The tracking component is protocol-agnostic.&lt;/strong&gt; GA4 is the default, but the HTTP path-to-protocol mapping is an abstraction - adding a new ingest protocol is a matter of implementing an interface. Matomo, Amplitude, or anything else with a defined HTTP tracking format could be a drop-in. This also means it can act as a self-hosted mirror for teams already on those platforms, intercepting existing tracking calls without touching the client.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Warehouse destinations:&lt;/strong&gt; ClickHouse (fully self-hosted), BigQuery, or CSV files written to S3/MinIO, GCS, or local disk. The file path works with Snowflake Snowpipe, Redshift Spectrum, Databricks Auto Loader, DuckDB - if you already have a warehouse, you can pipe into it.&lt;/p&gt;
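
&lt;p&gt;For the file path, the output is plain CSV that any of those loaders can pick up from object storage or disk; a minimal sketch (column names are illustrative, not d8a's actual schema):&lt;/p&gt;

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// writeHitsCSV renders processed hits as CSV with a header row - the
// lowest-common-denominator format that Snowpipe, Redshift Spectrum,
// Auto Loader, or DuckDB can all ingest without custom glue.
func writeHitsCSV(rows [][]string) (string, error) {
	buf := new(bytes.Buffer)
	w := csv.NewWriter(buf)
	if err := w.Write([]string{"client_id", "event", "ts"}); err != nil {
		return "", err
	}
	if err := w.WriteAll(rows); err != nil { // WriteAll flushes for us
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, _ := writeHitsCSV([][]string{
		{"555.123", "page_view", "2026-03-02T22:27:50Z"},
	})
	fmt.Print(out)
}
```

&lt;p&gt;In a real deployment the string would be streamed to S3/MinIO, GCS, or local disk rather than returned, but the format is the whole contract.&lt;/p&gt;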

&lt;p&gt;Deployment-wise: a single VPS is enough to get started. Our own cloud runs it on Kubernetes with the object storage transport between the two components.&lt;/p&gt;

&lt;h2&gt;Get started&lt;/h2&gt;

&lt;p&gt;d8a is open source, MIT licensed.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.d8a.tech/getting-started" rel="noopener noreferrer"&gt;Getting started guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/d8a-tech/d8a" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A completely free (for now) hosted &lt;a href="https://app.d8a.tech" rel="noopener noreferrer"&gt;cloud at app.d8a.tech&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>analytics</category>
      <category>clickhouse</category>
      <category>eu</category>
      <category>bigquery</category>
    </item>
  </channel>
</rss>
