How Cassandra Achieves Massive Write Scalability (and What You Trade Off)
Designing large-scale systems often comes down to one uncomfortable truth:
You cannot optimize reads, writes, consistency, and simplicity all at once.
When your system is write-heavy—logs, metrics, events, feeds, IoT data—traditional databases often become the bottleneck. This is where Apache Cassandra becomes a compelling choice.
This article explains:
- When Cassandra is the right database
- Why it outperforms traditional databases for writes
- How its internal storage model enables that performance
When Should You Choose Cassandra?
Cassandra is a strong choice when writes dominate your workload and availability matters more than strict consistency.
Choose Cassandra if:
- You have very high write throughput (10k–100k+ writes/sec)
- Data arrives continuously (events, logs, metrics, tracking)
- You need horizontal scaling by adding nodes
- Downtime is unacceptable (multi-node, multi-DC availability)
- Your queries are predictable and can be modeled without JOINs (see the data-model sketch at the end of this section)
- Eventual or tunable consistency is acceptable
Avoid Cassandra if:
- You need complex ad-hoc queries or JOINs
- You rely heavily on transactions across multiple rows/tables
- Reads must be extremely fast and strongly consistent
- Your dataset is small and doesn’t need horizontal scaling
Cassandra is not a “general-purpose” database—it’s a specialized write-scaling machine.
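To make the "predictable queries" and "tunable consistency" points concrete, here is a minimal sketch using the open-source DataStax Python driver (`cassandra-driver`). The keyspace, table, contact point, and data are illustrative assumptions, not a recommendation:

```python
from datetime import datetime, timezone

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])  # contact point; adjust for your cluster
session = cluster.connect()

# Model the table around one known query: "all events for a device,
# newest first". Partition key = device_id, clustering key = event_time.
# No JOINs needed, because the table *is* the query.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS iot
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS iot.events (
        device_id  text,
        event_time timestamp,
        payload    text,
        PRIMARY KEY ((device_id), event_time)
    ) WITH CLUSTERING ORDER BY (event_time DESC)
""")

# Tunable consistency: this write succeeds once ONE replica acknowledges
# it, trading strict consistency for availability and latency.
insert = SimpleStatement(
    "INSERT INTO iot.events (device_id, event_time, payload) VALUES (%s, %s, %s)",
    consistency_level=ConsistencyLevel.ONE,
)
session.execute(insert, ("sensor-42", datetime.now(timezone.utc), "temp=21.5"))
```

Raising the consistency level to QUORUM on the same statement is the "tunable" part: you pay more latency per request instead of redesigning the system.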
Why Traditional Databases Struggle with Writes
Most relational databases (PostgreSQL, MySQL) use B-tree–based storage engines.
How writes work in traditional databases:
- Data is updated in place
- Indexes must be updated immediately
- Writes trigger random disk I/O
- Locks, WAL flushes, and index maintenance add overhead
This works well for mixed workloads, but at scale:
- Random disk seeks become expensive
- Index-heavy schemas slow down writes
- Vertical scaling hits hardware limits
- Sharding adds operational complexity
In short: traditional storage engines put read performance and transactional safety first; raw write throughput comes second.
Why Cassandra Is Faster for Writes
Cassandra flips the design priorities.
Instead of updating data in place, Cassandra uses a Log-Structured Merge Tree (LSM Tree), which turns almost every write into a sequential append.
Key design choice:
Never modify data in place. Always append.
Sequential disk writes are orders of magnitude faster than random writes, which is why Cassandra can sustain massive write throughput on modest hardware.
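To see why append-only wins, compare the two access patterns in a toy sketch. This is plain Python file I/O, nothing Cassandra-specific; the paths and record layout are made up for illustration:

```python
import struct

RECORD = struct.Struct("<16sq")  # 16-byte key, 8-byte signed value

def append_write(log_path: str, key: bytes, value: int) -> None:
    # Sequential: always write at the current end of the file, so the
    # storage device never has to seek between consecutive writes.
    with open(log_path, "ab") as f:
        f.write(RECORD.pack(key, value))

def in_place_update(data_path: str, slot: int, key: bytes, value: int) -> None:
    # Random: seek to the record's slot and overwrite it, the B-tree-style
    # pattern. With many keys, successive updates land at scattered
    # offsets, producing random I/O.
    with open(data_path, "r+b") as f:
        f.seek(slot * RECORD.size)
        f.write(RECORD.pack(key, value))
```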
Cassandra’s Internal Storage Model (LSM Tree)
Cassandra’s storage engine revolves around three core components.
1. Commit Log (Durability First)
- Every write is appended to the commit log
- Acts as a write-ahead log
- Ensures data isn’t lost if a node crashes
- Sequential disk I/O → very fast
This step guarantees durability without slowing down the write path.
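A minimal commit-log sketch, assuming a toy JSON-lines format. Real Cassandra uses a binary segment format and, by default, fsyncs the log periodically rather than on every single write:

```python
import json
import os

class CommitLog:
    def __init__(self, path: str):
        self.path = path
        self._file = open(path, "ab")

    def append(self, mutation: dict) -> None:
        # Sequential append: the only disk work on the hot write path.
        self._file.write((json.dumps(mutation) + "\n").encode())
        self._file.flush()
        os.fsync(self._file.fileno())  # durable before the write is acked

    def replay(self):
        # After a crash, logged mutations are re-applied to rebuild the
        # in-memory Memtable.
        with open(self.path, "rb") as f:
            for line in f:
                yield json.loads(line)
```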
2. Memtable (In-Memory Writes)
- Writes go into an in-memory structure kept sorted by primary key
- Multiple updates to the same key are merged in memory
- No per-write disk I/O beyond the commit-log append
This absorbs write bursts efficiently.
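A toy Memtable sketch. Real Cassandra keeps entries in a concurrent sorted structure; this version sorts only at flush time, and names like `flush_threshold` are invented for illustration:

```python
class Memtable:
    def __init__(self, flush_threshold: int = 10_000):
        self.data = {}  # key -> (timestamp, value); value None = tombstone
        self.flush_threshold = flush_threshold

    def put(self, key, value, ts) -> None:
        current = self.data.get(key)
        if current is None or ts >= current[0]:
            self.data[key] = (ts, value)  # later write replaces earlier, in memory

    def is_full(self) -> bool:
        return len(self.data) >= self.flush_threshold

    def sorted_items(self):
        # Emit entries in primary-key order so a flush writes sequentially.
        return sorted(self.data.items())
```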
3. SSTable (Immutable Disk Storage)
- When the Memtable fills up, it is flushed to disk as an SSTable
- SSTables are:
  - Immutable
  - Sorted by primary key
  - Written sequentially
Because SSTables are never updated, Cassandra avoids random disk writes entirely.
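Continuing the Memtable sketch above, a flush produces an immutable, sorted file. The JSON-lines format is again a stand-in for Cassandra's real binary SSTable format:

```python
import json

def flush_to_sstable(memtable, path: str) -> str:
    # "x" mode creates the file and fails if it already exists: an SSTable
    # is written exactly once and never modified afterwards.
    with open(path, "x") as f:
        for key, (ts, value) in memtable.sorted_items():
            f.write(json.dumps({"key": key, "ts": ts, "value": value}) + "\n")
    return path  # immutable from here on; only compaction ever replaces it
```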
How Cassandra Handles Updates and Deletes
Cassandra treats every change as a new write:
- Updates → new version with a higher timestamp
- Deletes → written as tombstones (delete markers)
The “current state” of a row is determined by timestamps, not by overwriting data.
This design enables fast writes but shifts cleanup work to the background.
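A toy resolution function showing last-write-wins by timestamp, with `None` standing in for a tombstone:

```python
def resolve(versions):
    """versions: iterable of (timestamp, value) pairs for one row;
    value None is a tombstone (delete marker)."""
    ts, value = max(versions, key=lambda v: v[0])  # highest timestamp wins
    return value  # None means the row is deleted

# An insert, an update, then a delete of the same row:
print(resolve([(100, "v1"), (200, "v2")]))               # v2
print(resolve([(100, "v1"), (200, "v2"), (300, None)]))  # None (deleted)
```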
Reading Data: The Trade-Off
Reads are more complex than writes. To answer a query, Cassandra must:
1. Check the Memtable (the freshest data)
2. Use Bloom filters to identify which SSTables might contain the key
3. Read those SSTables from newest to oldest
4. Merge the results to find the latest version
Bloom filters skip most unnecessary disk reads, but point reads can still be slower than in a well-indexed relational database.
This is the intentional trade-off.
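Here is a sketch of that read path, with a toy two-hash Bloom filter. Real Cassandra sizes its filters per SSTable and merges versions column by column; this simplified version returns the newest whole-row value it finds:

```python
import hashlib

class BloomFilter:
    def __init__(self, size: int = 1024):
        self.size = size
        self.bits = bytearray(size)

    def _positions(self, key: str):
        digest = hashlib.sha256(key.encode()).digest()
        yield int.from_bytes(digest[:4], "big") % self.size
        yield int.from_bytes(digest[4:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))

def read(key, memtable, sstables):
    """memtable: dict of key -> value; sstables: list of (bloom, rows)
    pairs ordered newest first, where rows is a dict of key -> value."""
    if key in memtable:
        return memtable[key]          # freshest data, may be a tombstone
    for bloom, rows in sstables:      # newest to oldest
        if not bloom.might_contain(key):
            continue                  # skip this SSTable's disk read entirely
        if key in rows:
            return rows[key]
    return None
```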
Compaction: Paying the Cost Later
To prevent unlimited growth of SSTables and tombstones, Cassandra runs compaction:
- Merges multiple SSTables into fewer ones
- Resolves multiple versions of rows
- Removes deleted and expired data
- Improves read performance over time
Cassandra makes writes cheap now and pays the cost later, during compaction.
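A toy compaction pass over the SSTable-as-dict sketches above. The `gc_grace` parameter mirrors Cassandra's `gc_grace_seconds` (default 864,000 seconds, i.e. 10 days), after which tombstones may be dropped; here both `ts` and `now` are assumed to be Unix seconds:

```python
def compact(sstables, now: int, gc_grace: int = 864_000):
    """sstables: list of dicts mapping key -> (timestamp, value)."""
    merged = {}
    for table in sstables:                      # input order is irrelevant;
        for key, (ts, value) in table.items():  # timestamps pick the winner
            if key not in merged or ts > merged[key][0]:
                merged[key] = (ts, value)
    return {
        key: (ts, value)
        for key, (ts, value) in merged.items()
        # Live rows stay; young tombstones stay (they must still suppress
        # stale replicas); tombstones past the grace period are dropped.
        if value is not None or now - ts < gc_grace
    }
```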
Why This Design Scales So Well
Putting it all together:
| Traditional DB | Cassandra |
|---|---|
| In-place updates | Append-only writes |
| Random disk I/O | Sequential disk I/O |
| Writes contend with reads | Writes isolated from reads |
| Hard to shard | Built-in partitioning |
| Vertical scaling | Horizontal scaling |
Cassandra scales because:
- Writes are fast and predictable
- Nodes are independent (shared-nothing)
- Adding nodes increases total throughput
- Failures don’t stop the system
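The "built-in partitioning" row deserves a sketch of its own: keys hash onto a token ring, each node owns a range, and adding a node splits ranges so capacity grows with the cluster. Cassandra actually uses Murmur3 partitioning with many virtual nodes per machine; this toy version uses MD5 and one token per node:

```python
import bisect
import hashlib

class Ring:
    def __init__(self, nodes):
        # One token per node for simplicity; real clusters use many vnodes.
        self.tokens = sorted((self._hash(n), n) for n in nodes)

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def owner(self, partition_key: str) -> str:
        token = self._hash(partition_key)
        idx = bisect.bisect(self.tokens, (token, "")) % len(self.tokens)
        return self.tokens[idx][1]  # first node clockwise from the token

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.owner("sensor-42"))  # the same key always routes to the same node
```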
Final Takeaway
Cassandra is not faster because it’s “better”—it’s faster because it chooses a different set of trade-offs.
If your workload is write-heavy and needs high availability and horizontal scale, Cassandra’s LSM-based storage model can outperform traditional databases on write throughput by an order of magnitude.
But if reads, transactions, or flexibility matter more—choose something else.
Good system design is not about the best database.
It’s about the right database for the workload.
