When to Choose Cassandra in System Design

How It Achieves Massive Write Scalability (and What You Trade Off)

Designing large-scale systems often comes down to one uncomfortable truth:

You cannot optimize reads, writes, consistency, and simplicity all at once.

When your system is write-heavy—logs, metrics, events, feeds, IoT data—traditional databases often become the bottleneck. This is where Apache Cassandra becomes a compelling choice.

This article explains:

  1. When Cassandra is the right database
  2. Why it outperforms traditional databases for writes
  3. How its internal storage model enables that performance

When Should You Choose Cassandra?

Cassandra is a strong choice when writes dominate your workload and availability matters more than strict consistency.

Choose Cassandra if:

  • You have very high write throughput (10k–100k+ writes/sec)
  • Data arrives continuously (events, logs, metrics, tracking)
  • You need horizontal scaling by adding nodes
  • Downtime is unacceptable (multi-node, multi-DC availability)
  • Your queries are predictable and can be modeled without JOINs
  • Eventual or tunable consistency is acceptable

Avoid Cassandra if:

  • You need complex ad-hoc queries or JOINs
  • You rely heavily on transactions across multiple rows/tables
  • Reads must be extremely fast and strongly consistent
  • Your dataset is small and doesn’t need horizontal scaling

Cassandra is not a “general-purpose” database—it’s a specialized write-scaling machine.
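To make the consistency point concrete: Cassandra lets you tune consistency per query, not per database. Here is a minimal sketch using the DataStax Python driver (`cassandra-driver`); the keyspace, table, and column names (`metrics`, `events`, `device_id`, and so on) are made up for illustration and assume a schema created elsewhere:

```python
# Sketch: per-query tunable consistency with the DataStax Python driver.
# Assumes a local Cassandra node and a keyspace/table created beforehand;
# the names here are illustrative, not from this article.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(["127.0.0.1"])
session = cluster.connect("metrics")

# Fast, availability-friendly write: only one replica must acknowledge.
write = SimpleStatement(
    "INSERT INTO events (device_id, ts, value) VALUES (%s, %s, %s)",
    consistency_level=ConsistencyLevel.ONE,
)
session.execute(write, ("sensor-42", 1700000000, 3.14))

# Stronger read: a majority of replicas must agree before returning.
read = SimpleStatement(
    "SELECT value FROM events WHERE device_id = %s LIMIT 1",
    consistency_level=ConsistencyLevel.QUORUM,
)
rows = session.execute(read, ("sensor-42",))

cluster.shutdown()
```

ONE keeps writes fast and available; QUORUM trades some latency for stronger reads. That dial is what “tunable consistency” means in the checklist above.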


Why Traditional Databases Struggle with Writes

Most relational databases (PostgreSQL, MySQL) use B-tree–based storage engines.

How writes work in traditional databases:

  • Data is updated in place
  • Indexes must be updated immediately
  • Writes trigger random disk I/O
  • Locks, WAL flushes, and index maintenance add overhead

This works well for mixed workloads, but at scale:

  • Random disk seeks become expensive
  • Index-heavy schemas slow down writes
  • Vertical scaling hits hardware limits
  • Sharding adds operational complexity

In short: traditional databases optimize for reads first.


Why Cassandra Is Faster for Writes

Cassandra flips the design priorities.

Instead of updating data in place, Cassandra uses a Log-Structured Merge Tree (LSM Tree), which turns almost every write into a sequential append.

Key design choice:

Never modify data in place. Always append.

Sequential disk writes are orders of magnitude faster than random writes, which is why Cassandra can sustain massive write throughput on modest hardware.
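You can feel this difference with a crude micro-benchmark (mine, not from Cassandra): write the same number of 4 KiB blocks once as appends, once at random offsets in the same file. Absolute numbers depend heavily on your disk and the OS page cache, but appends consistently come out ahead:

```python
# Rough sketch: sequential appends vs. random-offset overwrites.
# Results vary with hardware and OS caching; this only illustrates
# the access-pattern difference Cassandra exploits.
import os
import random
import time

BLOCK = b"x" * 4096          # one 4 KiB "record"
COUNT = 20_000
PATH = "io_test.bin"

def sequential_appends() -> float:
    start = time.perf_counter()
    with open(PATH, "wb") as f:
        for _ in range(COUNT):
            f.write(BLOCK)   # always appended at the end
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start

def random_writes() -> float:
    start = time.perf_counter()
    with open(PATH, "r+b") as f:
        for _ in range(COUNT):
            f.seek(random.randrange(COUNT) * len(BLOCK))
            f.write(BLOCK)   # overwrite in place at a random offset
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start

print(f"sequential: {sequential_appends():.2f}s")
print(f"random:     {random_writes():.2f}s")
os.remove(PATH)
```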


Cassandra’s Internal Storage Model (LSM Tree)

Cassandra’s storage engine revolves around three core components.


1. Commit Log (Durability First)

  • Every write is appended to the commit log
  • Acts as a write-ahead log
  • Ensures data isn’t lost if a node crashes
  • Sequential disk I/O → very fast

This step guarantees durability without slowing down the write path.
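As a toy illustration of the idea (nothing like Cassandra’s real on-disk format or sync policies): every mutation is serialized and appended to a single file, with an fsync so it survives a crash.

```python
# Toy commit log: append-only file with fsync, loosely modeled on the
# write-ahead idea. Cassandra's actual format and sync modes
# (periodic vs. batch) are more sophisticated.
import json
import os

class CommitLog:
    def __init__(self, path: str = "commitlog.bin"):
        self.f = open(path, "ab")

    def append(self, key: str, value: str, ts: int) -> None:
        record = json.dumps({"k": key, "v": value, "ts": ts}) + "\n"
        self.f.write(record.encode())   # sequential append, never a seek
        self.f.flush()
        os.fsync(self.f.fileno())       # survive a process/node crash

log = CommitLog()
log.append("sensor-42", "3.14", 1700000000)
```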


2. Memtable (In-Memory Writes)

  • Writes are stored in an in-memory, sorted structure
  • Sorted by primary key
  • Multiple updates to the same key are merged in memory
  • No disk I/O for each write

This absorbs write bursts efficiently.
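In sketch form, a memtable is little more than a sorted in-memory map where the newest write for a key replaces the older one. (Cassandra actually uses a concurrent skip-list-style structure; this plain dict, sorted at flush time, only shows the idea.)

```python
# Toy memtable: an in-memory map where updates to the same key simply
# replace the older entry, sorted by key when it is time to flush.
class Memtable:
    def __init__(self, flush_threshold: int = 4):
        self.data: dict[str, tuple[str, int]] = {}
        self.flush_threshold = flush_threshold

    def put(self, key: str, value: str, ts: int) -> None:
        self.data[key] = (value, ts)    # merge: last write wins in memory

    def is_full(self) -> bool:
        return len(self.data) >= self.flush_threshold

    def sorted_items(self):
        # SSTables are sorted by key, so sort once at flush time.
        return sorted(self.data.items())

mt = Memtable()
mt.put("b", "1", 100)
mt.put("a", "2", 101)
mt.put("b", "3", 102)   # overwrites the earlier "b" in memory, no disk I/O
```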


3. SSTable (Immutable Disk Storage)

  • When the Memtable fills up, it is flushed to disk as an SSTable
  • SSTables are:
    • Immutable
    • Sorted by primary key
    • Written sequentially

Because SSTables are never updated, Cassandra avoids random disk writes entirely.
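Continuing the toy model: a flush writes the sorted contents sequentially to a brand-new file that is never modified afterwards.

```python
# Toy SSTable flush: write sorted key/value/timestamp records to a
# brand-new file, sequentially, and never touch that file again.
import json

def flush_sstable(memtable: dict[str, tuple[str, int]], path: str) -> None:
    with open(path, "x") as f:          # "x": fail if it exists; SSTables are write-once
        for key in sorted(memtable):    # sorted by key, like an SSTable
            value, ts = memtable[key]
            f.write(json.dumps({"k": key, "v": value, "ts": ts}) + "\n")

flush_sstable({"b": ("3", 102), "a": ("2", 101)}, "sstable-1.db")
```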


How Cassandra Handles Updates and Deletes

Cassandra treats every change as a new write:

  • Updates → new version with a higher timestamp
  • Deletes → written as tombstones (delete markers)

The “current state” of a row is determined by timestamps, not by overwriting data.

This design enables fast writes but shifts cleanup work to the background.
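The resolution rule is simple enough to sketch: among all versions of a row seen across the memtable and SSTables, the highest timestamp wins, and if the winner is a tombstone the row reads as deleted.

```python
# Toy last-write-wins resolution: pick the version with the highest
# timestamp; a winning tombstone means the row reads as deleted.
from typing import Optional

TOMBSTONE = object()   # sentinel delete marker

def resolve(versions: list[tuple[object, int]]) -> Optional[object]:
    """versions: (value_or_TOMBSTONE, timestamp) pairs from memtable/SSTables."""
    value, _ts = max(versions, key=lambda v: v[1])
    return None if value is TOMBSTONE else value

print(resolve([("v1", 100), ("v2", 150)]))        # -> "v2"
print(resolve([("v1", 100), (TOMBSTONE, 200)]))   # -> None (deleted)
```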


Reading Data: The Trade-Off

Reads are more complex than writes:

  1. Check the Memtable (latest data)
  2. Use Bloom filters to identify relevant SSTables
  3. Read SSTables from newest to oldest
  4. Merge results to find the latest version

Bloom filters help skip unnecessary disk reads, but reads are still slower than in well-indexed relational databases.

This is the intentional trade-off.
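A Bloom filter is a small bit array that answers “definitely not in this SSTable” or “maybe in this SSTable”: false positives are possible, false negatives are not, which is exactly what the read path needs to skip files safely. A minimal sketch of the idea (Cassandra’s real filters are tuned per table, this one is hand-rolled):

```python
# Minimal Bloom filter sketch: k hash functions set/test bits in a
# fixed-size bit array.
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1 << 16, hashes: int = 3):
        self.size = size_bits
        self.hashes = hashes
        self.bits = 0                      # big int used as a bit array

    def _positions(self, key: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all(self.bits & (1 << pos) for pos in self._positions(key))

bf = BloomFilter()
bf.add("sensor-42")
print(bf.might_contain("sensor-42"))   # True
print(bf.might_contain("sensor-99"))   # almost certainly False -> skip this SSTable
```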


Compaction: Paying the Cost Later

To prevent unlimited growth of SSTables and tombstones, Cassandra runs compaction:

  • Merges multiple SSTables into fewer ones
  • Resolves multiple versions of rows
  • Removes deleted and expired data
  • Improves read performance over time

Cassandra makes writes cheap now, and pays the cost later via compaction.
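In miniature, compaction is a merge of sorted runs: keep the newest version of each key and purge rows whose winning version is a tombstone. (Real Cassandra holds tombstones for gc_grace_seconds before purging, so deleted data cannot resurrect on lagging replicas; this sketch skips that.)

```python
# Toy compaction: merge several SSTables (dicts of key -> (value, ts)),
# keep only the newest version per key, and drop tombstoned rows.
TOMBSTONE = object()

def compact(sstables: list[dict]) -> dict:
    merged: dict = {}
    for table in sstables:                     # any order; timestamps decide
        for key, (value, ts) in table.items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    # Purge rows whose winning version is a delete marker.
    return {k: v for k, v in merged.items() if v[0] is not TOMBSTONE}

old = {"a": ("1", 100), "b": ("2", 110)}
new = {"a": ("3", 150), "b": (TOMBSTONE, 160)}
print(compact([old, new]))                     # {'a': ('3', 150)}
```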


Why This Design Scales So Well

Putting it all together:

Traditional DB vs. Cassandra:

  • In-place updates → Append-only writes
  • Random disk I/O → Sequential disk I/O
  • Writes block reads → Writes isolated from reads
  • Hard to shard → Built-in partitioning
  • Vertical scaling → Horizontal scaling

Cassandra scales because:

  • Writes are fast and predictable
  • Nodes are independent (shared-nothing)
  • Adding nodes increases total throughput
  • Failures don’t stop the system

Final Takeaway

Cassandra is not faster because it’s “better”—it’s faster because it chooses a different set of trade-offs.

If your system needs heavy write throughput, high availability, and horizontal scale, Cassandra’s LSM-based storage model can outperform traditional databases by an order of magnitude on writes.

But if reads, transactions, or flexibility matter more—choose something else.

Good system design is not about the best database.

It’s about the right database for the workload.
