Density Tech

Posted on Feb 5

We Tried to Build Analytics Without a Database. It Sort of Worked.

#programming #opensource #database #tutorial

The client needed product analytics.

The problem wasn’t scale.
The problem was money.

Snowflake’s minimum was roughly $600 per month. The total infrastructure budget was closer to $200. Given that constraint, debating which data warehouse to use felt like the wrong conversation to have.

So instead, we asked a different question:

What if we didn’t use a database at all?

Dumping Everything into Object Storage

The first decision was straightforward: use object storage.

For this engagement, we chose MinIO. Events were ingested, written out as Parquet files, and stored durably. No long-running compute services. No query engine sitting idle. Just storage.

That immediately raised the obvious concern:
how do you query this without turning it into an unmaintainable pile of files?

That’s where DuckDB entered the picture.

DuckDB is an in-process SQL engine. No server, no cluster, no operational setup. Install it, point it at Parquet files, and start writing SQL.
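To make "point it at Parquet files" concrete, here is a minimal sketch in Python. It assumes the third-party duckdb package, a MinIO endpoint reachable over plain HTTP, and hypothetical column names (event_time, event_name); the bucket path, the secret name minio, and the helper names are illustrative and not taken from the project.

```python
def sql_quote(value: str) -> str:
    """Render a Python string as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def minio_secret_sql(endpoint: str, key_id: str, secret: str) -> str:
    """Build a CREATE SECRET statement that points DuckDB's S3 support at MinIO."""
    return (
        "CREATE OR REPLACE SECRET minio ("
        "TYPE s3, "
        f"KEY_ID {sql_quote(key_id)}, "
        f"SECRET {sql_quote(secret)}, "
        f"ENDPOINT {sql_quote(endpoint)}, "  # host:port, without a scheme
        "URL_STYLE 'path', "                 # MinIO is usually addressed path-style
        "USE_SSL false)"                     # assumption: a local or dev MinIO over HTTP
    )


def daily_counts_sql(parquet_glob: str) -> str:
    """Events per day and event name, read straight from Parquet files."""
    return (
        "SELECT CAST(event_time AS DATE) AS event_day, event_name, count(*) AS events "
        f"FROM read_parquet({sql_quote(parquet_glob)}) "
        "GROUP BY 1, 2 "
        "ORDER BY 1, 2"
    )


def run_daily_counts(endpoint: str, key_id: str, secret: str, parquet_glob: str):
    """Execute the query in-process; there is no server to start or connect to."""
    import duckdb  # third-party (pip install duckdb); imported lazily on purpose

    con = duckdb.connect()        # in-memory database inside this Python process
    con.execute("INSTALL httpfs")  # extension that provides s3:// access
    con.execute("LOAD httpfs")
    con.execute(minio_secret_sql(endpoint, key_id, secret))
    return con.execute(daily_counts_sql(parquet_glob)).fetchall()
```

A call such as run_daily_counts("localhost:9000", key_id, secret, "s3://events/*.parquet") would return a list of (event_day, event_name, events) tuples, provided the bucket and columns exist under those names.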

Initially, it felt too simple to be serious.

We tried it anyway.

Within a day, we had funnel queries, retention calculations, and basic aggregations running directly against Parquet files in S3-compatible storage. No warehouse load jobs. No loaders. No retry logic.
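For readers who have not written one, a funnel query can be sketched roughly as follows. This is an illustration, not the project's query: the three steps (signup, activate, purchase) and the columns user_id, event_name, and event_time are hypothetical. The SQL sticks to standard constructs, so to keep the example self-contained it runs against Python's built-in sqlite3 with in-memory rows rather than DuckDB; with DuckDB, the source argument would be a read_parquet(...) call instead of a table name.

```python
import sqlite3


def funnel_sql(source: str) -> str:
    """Count users reaching each step in order, using each user's FIRST event per step."""
    return f"""
    WITH per_user AS (
        SELECT user_id,
               MIN(CASE WHEN event_name = 'signup'   THEN event_time END) AS t_signup,
               MIN(CASE WHEN event_name = 'activate' THEN event_time END) AS t_activate,
               MIN(CASE WHEN event_name = 'purchase' THEN event_time END) AS t_purchase
        FROM {source}
        GROUP BY user_id
    )
    SELECT COUNT(t_signup) AS signed_up,
           COALESCE(SUM(CASE WHEN t_activate > t_signup THEN 1 ELSE 0 END), 0) AS activated,
           COALESCE(SUM(CASE WHEN t_activate > t_signup
                              AND t_purchase > t_activate THEN 1 ELSE 0 END), 0) AS purchased
    FROM per_user
    """


def demo_funnel() -> tuple:
    """Run the funnel over a few made-up events (ISO timestamps sort as text)."""
    rows = [
        ("u1", "signup", "2024-01-01T10:00"),
        ("u1", "activate", "2024-01-02T10:00"),
        ("u1", "purchase", "2024-01-03T10:00"),
        ("u2", "signup", "2024-01-01T11:00"),
        ("u2", "activate", "2024-01-02T11:00"),
        ("u3", "signup", "2024-01-01T12:00"),
        ("u4", "purchase", "2024-01-01T13:00"),  # never signed up, so never counted
    ]
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE events (user_id TEXT, event_name TEXT, event_time TEXT)")
    con.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    return con.execute(funnel_sql("events")).fetchone()


print(demo_funnel())  # (3, 2, 1)
```

Using the first occurrence of each step keeps the query short, at the cost of missing users who repeat a step out of order; a stricter funnel would look for the first activate after signup.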

It worked far better than expected.

The Iceberg Detour That Changed Everything

About two weeks into the project, Apache Iceberg came up during a casual discussion.

At that point, we weren’t actively looking to change anything. The system was working. But Iceberg promised things that were starting to matter:

ACID semantics
Schema evolution
Snapshot isolation
Table-level operations on object storage

We treated it as an experiment.

That experiment quietly became the foundation.

Once we moved from loosely managed Parquet files to Iceberg tables, several problems disappeared immediately:

Schema changes became manageable
Bad data was no longer permanent
Table state became explicit and queryable

DuckDB’s Iceberg extension made the integration trivial. One ATTACH, and the tables behaved like standard relational tables.

The value became obvious the first time a customer sent several hours of malformed events. Previously, fixing that would have meant manually tracking files and rewriting data. With Iceberg, it was a single DELETE statement.
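The shape of that workflow might look like the sketch below. Everything specific in it is an assumption: the catalog is taken to be an Iceberg REST catalog without authentication, the table name lake.default.events and the source_id column are hypothetical, and the ATTACH option names follow the DuckDB Iceberg extension documentation as best I recall, so check them against the version in use. DELETE on Iceberg tables also requires a recent DuckDB release.

```python
def attach_iceberg_sql(warehouse: str, endpoint: str, alias: str = "lake") -> str:
    """ATTACH statement for an Iceberg REST catalog (assumed: no authentication)."""
    return (
        f"ATTACH '{warehouse}' AS {alias} "
        f"(TYPE iceberg, ENDPOINT '{endpoint}', AUTHORIZATION_TYPE 'none')"
    )


def delete_window_sql(table: str) -> str:
    """DELETE for one source's events inside a half-open time window."""
    return (
        f"DELETE FROM {table} "
        "WHERE source_id = ? AND event_time >= ? AND event_time < ?"
    )


def purge_bad_events(warehouse: str, endpoint: str,
                     source_id: str, start: str, end: str) -> None:
    """Attach the catalog and remove the malformed window in one statement."""
    import duckdb  # third-party (pip install duckdb); imported lazily on purpose

    con = duckdb.connect()
    con.execute("INSTALL iceberg")
    con.execute("LOAD iceberg")
    # Reading the underlying data files from MinIO additionally needs S3
    # credentials (for example an S3 secret); omitted here for brevity.
    con.execute(attach_iceberg_sql(warehouse, endpoint))
    con.execute(delete_window_sql("lake.default.events"), [source_id, start, end])
```

Because Iceberg records the delete as a new snapshot instead of rewriting files by hand, the cleanup stays a table-level operation.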

That alone justified the decision.

Local Development Without Friction

One outcome that surprised me was how much this improved local development.

DuckDB’s UI mode (duckdb -ui) provides a browser-based SQL editor running entirely on a developer’s machine. No credentials to manage. No shared environments. No waiting for services to start.

Even developers who typically avoid SQL were able to explore real analytics data locally.

That level of accessibility is rare in analytics systems—and for an MVP, it mattered more than raw throughput.

The Limitations (Because There Are Always Limitations)

This setup isn’t a silver bullet.

DuckDB is not designed for high concurrency. Once concurrent access increased, caching became necessary. Event ingestion is buffered and batched. Writes are controlled and intentionally infrequent.
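A minimal sketch of those two mitigations, under assumptions: the post does not say what was cached or how batches were flushed, so this version caches query results by SQL text with a time-to-live and flushes buffered events by size or age. The class names, thresholds, and the flush callback (which in practice might write one Parquet file per batch) are all hypothetical.

```python
import time
from typing import Any, Callable


class EventBuffer:
    """Collects events and hands them to `flush` in batches, keeping writes infrequent."""

    def __init__(self, flush: Callable[[list], None], max_events: int = 1000,
                 max_age_s: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self._flush, self._max_events, self._max_age_s = flush, max_events, max_age_s
        self._clock = clock
        self._events: list = []
        self._opened_at = 0.0

    def add(self, event: dict) -> None:
        if not self._events:
            self._opened_at = self._clock()  # a new batch starts with this event
        self._events.append(event)
        too_big = len(self._events) >= self._max_events
        too_old = self._clock() - self._opened_at >= self._max_age_s
        if too_big or too_old:  # age is only checked on add; there is no background timer
            self.flush()

    def flush(self) -> None:
        if self._events:
            batch, self._events = self._events, []
            self._flush(batch)


class QueryCache:
    """Serves repeated queries from memory so concurrent readers rarely hit the engine."""

    def __init__(self, run_query: Callable[[str], Any], ttl_s: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._run_query, self._ttl_s, self._clock = run_query, ttl_s, clock
        self._entries: dict = {}  # sql text -> (stored_at, result)

    def get(self, sql: str) -> Any:
        now = self._clock()
        hit = self._entries.get(sql)
        if hit is not None and now - hit[0] < self._ttl_s:
            return hit[1]  # still fresh
        result = self._run_query(sql)
        self._entries[sql] = (now, result)
        return result
```

The injectable clock exists only to make the time-based behavior testable; a multi-threaded service would also need a lock around both classes.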

This is not a multi-tenant analytics platform.

And it was never meant to be.

What We Actually Delivered

In roughly three weeks, the client had:

A functional product analytics backend
Iceberg-managed tables on object storage
DuckDB-powered analytical queries and derivations
Infrastructure costs under $50 per month
A system solid enough to demo to customers

Most importantly, they avoided committing early to an expensive or complex architecture before the product justified it.

The Real Takeaway

From my perspective, if you’re pre–product-market fit, a data warehouse is often the wrong first move.

You don’t need infinite scale.
You don’t need query queues.
You don’t need vendor contracts.

You need speed, flexibility, and the ability to change direction without regret.

DuckDB plus object storage—and Iceberg when structure starts to matter—isn’t just cheaper. It’s better suited for experimentation.

We’ll move to something heavier when it’s required.

For now, this works. And as an engineer, it’s been a genuinely enjoyable system to build.
