DEV Community

yep
yep

Posted on • Originally published at yepchaos.com

Finding Rigth Database

Most applications need to persist state. In a chat application, that state is massive, constantly growing, and high-frequency. The obvious starting point is a traditional RDBMS — but the specific access patterns of a real-time chat system eventually force a rethink.

The Problem with RDBMS for Chat

I could use PostgreSQL for storing messages. It works, until it doesn't.

Chat is different from most relational data. Messages don't join to other tables. What I actually need is simple: insert a message, fetch messages by room or user. That's it. So the requirements are:

  • It grows fast — millions, then billions of rows
  • No joins needed — just "give me all messages for room X"
  • Reads and writes need to be fast

Traditional databases like PostgreSQL and MySQL weren't designed with this access pattern as the primary use case. Here's why that matters.

Partitioning

As the message table grows, we can partition it — split it into smaller physical chunks based on some key, like room ID or time range. The database only scans the relevant partition instead of the whole table. Postgres supports this natively, but it handles it differently from distributed systems — partitions still live on a single machine, so we’re organizing data, not distributing the load.

The Write Scaling Problem

The bigger issue is writes. PostgreSQL and MySQL use a single-master model — one node handles all writes, replicas handle reads. Every message sent goes through that one master. At high write volume, that becomes bottleneck.

The common solution is sharding: split data across multiple independent database instances, each owning a slice. Hash the room ID to decide which shard it lives on. In theory, clean. In practice, painful — managing shard keys, handling rebalancing when nodes are added, cross-shard queries becoming a nightmare. I decided early on to avoid this entirely by choosing a database built for it natively.

Cassandra and ScyllaDB

This is where wide-column stores like Cassandra — and its C++ reimplementation, ScyllaDB — come in. Same architecture, ScyllaDB just rewrote it in C++ for better performance and lower latency.

The core idea: instead of one master handling writes, Cassandra/ScyllaDB uses a ring topology. Every node in the cluster owns a range of a hash space. When a message is written, the room ID gets hashed and routed to the node that owns that hash range. No single master, no write bottleneck — every node can accept writes.

Replication works naturally on top of this. With a replication factor of 3, a write doesn't just go to the primary node — it also goes to the next 2 nodes on the ring. So there are 3 copies of the data across different nodes. If one goes down, the data is still there. No manual failover, it's built into how the ring works.

The other key advantage is the partition key. By using room ID as the partition key, Cassandra/ScyllaDB guarantees all messages for that room are stored together on the same node. Pair that with a clustering key on timestamp, and messages within a room are physically stored in time order — fetching history becomes one sequential read, already sorted. No ORDER BY, no extra cost.

This turns random I/O into sequential I/O. Fetching chat history means finding the right node and reading one continuous stream. That's a hardware-level optimization that a single-master Postgres setup simply can't match at scale.

The tradeoff: Cassandra/ScyllaDB is bad at full scans and joins, because those require hitting every node. But based on the requirements here, that doesn't matter — joins are never needed.

This isn't just theory. Discord went through this exact problem — first scaling with Cassandra for billions of messages, then eventually migrating to ScyllaDB for better performance at trillions of messages. Worth reading if you want a production-scale perspective:

I'll write a dedicated post on Cassandra/ScyllaDB internals — replication strategies, consistency levels, and multi-DC support deserve their own space.

The Hybrid Architecture

There's no perfect database. Different tools solve different problems, so I use both:

  1. PostgreSQL — relational, "small" data: users, friend lists, room metadata. Needs ACID compliance and complex queries, but doesn't grow at a massive rate.
  2. Cassandra/ScyllaDB — the heavy data: every message ever sent. High-write throughput, fast sequential reads by room, horizontally scalable without a single write master.

Each database does what it's actually good at.

What's Next

There's more to cover here — consistency models, high availability, failover, and distributed systems fundamentals like Raft. I'll get into those in future posts. For now, this is the architectural reasoning behind the storage layer in ASTRING.

Top comments (0)