Thoughts on CockroachDB & Cassandra

A Brief History (not of Time but of distributed databases)

While both Apache Cassandra and CockroachDB are massively scalable, open source, distributed databases, they were designed and architected with different goals in mind. Cassandra can trace its origins back to Facebook where its creators drew inspiration from Dynamo and BigTable white papers. Facebook was looking to create a solution that could ingest huge amounts of data and query the dataset using specific lookup patterns. CockroachDB was founded by former Google employees that had worked on Google Spanner. As such, the focus of CockroachDB was to provide a scalable database solution that was SQL compliant while supporting globally distributed transactions.

Both databases have proven to be very successful at what they do. I have spent many years working on and assisting enterprises in implementing distributed database solutions. This blog is a summary of my observations and experiences over the years working with Apache Cassandra and CockroachDB. I hope you find this content useful!

Since we are talking about distributed databases, I should start by defining exactly what that term means. A distributed database can be thought of as a collection of machines, either physical or virtual, typically called nodes that are loosely joined together and work together to function as a single logical database. One of the primary characteristics of distributed databases is their ability to replicate data which guarantees that if a node within the database should fail, the data contained on that node is not lost. Both CockroachDB and Apache Cassandra replicate data, but the mechanism they use to accomplish this task is where they begin to diverge.

The CAP Theorem

Let us start by looking at the CAP theorem. The CAP theorem states that a distributed database has three critical attributes, Consistency, Availability and the ability to survive network Partitions and that a distributed database is only going to be able to fully satisfy two of these criteria. Since any distributed system needs to be architected to be tolerant of network partitions the difference comes down to whether a database solution shows preference to consistency or availability. This distinction isn’t always as obvious as the terms indicate. For now, I will just state that Apache Cassandra is classified as an AP system meaning it can potentially return inconsistent results under certain failure scenarios. CockroachDB is a CP system meaning it will never return inconsistent results, choosing instead to return an error during a failure situation when quorum, meaning consensus from a majority of nodes, cannot be achieved.

How Data is Stored

To understand the strengths of each database it is necessary to understand at a high level how each system stores its data. Under the covers both databases store data in immutable sorted string or SSTables. Cassandra uses a deterministic hashing algorithm on a table's partition key to randomly and evenly distribute data throughout the cluster. Each node becomes the owner for a subset of the hashed values and each piece of data is stored on additional nodes to satisfy the defined replication factor. This architecture means that every node in the cluster is aware of which nodes hold each piece of data. As a result queries acting against the primary key of a table are extremely fast.

All data in CockroachDB is persisted in a large key value store based on the primary key of the table or index. This key value store is ordered lexicographically by key. The key value store is then divided into segments called ranges and these ranges are what is replicated throughout the cluster to maintain availability and survivability. Ranges can be up to 512MB in size. Cockroach maintains a range index so that each node is aware of the locations of all ranges in the cluster. This means that in addition to delivering fast key lookups, scan operations against a table’s key are also performant.

Data Model Design

With a basic understanding of how each database stores and accesses data we can now turn to the methods used to implement an effective data model. Apache Cassandra uses what is commonly known as query driven design. With query driven design, all data access patterns are determined up front and individual tables are created with the proper primary key to satisfy the conditions on the select query. Remember that Cassandra is optimized to select based on the primary key of a table. Data is partitioned based on the hashed value of the partition key component of the primary key. An attempt to select on a non-keyed value in the table results in a complete table scan across the cluster and is highly inefficient. Indexing data in Cassandra is also problematic due to the partitioning strategy and secondary indexes are essentially hidden tables with a different primary key that duplicate all data from the base table. This makes evolving the data model very difficult as the application changes.

CockroachDB is Postgres wire compatible and more traditional relational database design techniques can be employed. Foreign keys and online schema changes are supported within CockroachDB. An effective data model in CockroachDB will closely resemble a third normal form data model used by traditional non-distributed relational database management systems.

Consistency

There continues to be a lot written about consistency within distributed databases. I won’t get too deep into this conversation, but I will highlight some key differences. CockroachDB supports ACID compliant transactions at the highest serializable isolation level. In fact all mutations in CockroachDB are treated as transactions which allows CockroachDB to maintain consistent data even in a globally distributed deployment. Distributed systems require the use of a consensus algorithm and CockroachDB chose to implement the Raft protocol. Raft elects one of the range replicas as a leaseholder. All reads and writes for a range go through this leaseholder range ensuring data remains consistent. A write is considered successful when a majority of replicas have synchronously committed the mutation to their raft log. In the event that a node containing a leaseholder fails, one of the other replica ranges is elected as the leaseholder. One side effect of this implementation is that retry logic generally needs to be written into the application layer for workloads that may experience transaction contention.

Apache Cassandra applies an eventual consistency methodology. A consistency level can be applied to each read or write request and this consistency level determines how many nodes need to respond to satisfy the request. If reads and writes both use a quorum consistency level then consistent results can be achieved. The big difference here is that Cassandra replicates data asynchronously. In a distributed environment asynchronous replication is vulnerable to inconsistencies due to network issues, node availability, etc. Apache Cassandra addresses this concern by continually running a background process called repair which compares data between the nodes and uses the most recent timestamp or last write wins semantics to resolve discrepancies.

Ease of Use

With all the database choices on the market today, one of the often overlooked capabilities of a database system is how easy it is to install, use and maintain. Historically, Apache Cassandra has been notoriously difficult to install and operate. Configuring Cassandra consists of modifying several different yaml configuration files to meet user requirements. This complexity grew largely due to Cassandra’s open source origins where commits to enable features and functionality were inconsistently applied to the code base. Cassandra is written in Java and as such requires a fairly deep understanding of JVM runtime settings to tune for optimal performance. Further there is no easy to use built in monitoring solution. Cassandra does include JMX monitoring endpoints but they are not always intuitive or easily consumed.

CockroachDB is written in the Go programming language and is deployed as a single executable binary. Included in this binary is a browser based user interface which allows for easy monitoring and maintenance of the database. Configuration is controlled by cluster settings and startup flags that are easily set during initialization.

Conclusion

Apache Cassandra and CockroachDB share many characteristics. They are both highly scalable, open source, cloud native and cloud agnostic database solutions. However, as I mentioned at the outset, Apache Cassandra and CockroachDB were designed from the ground up to solve distributed data challenges differently. Hopefully this article highlights those differences and can assist in evaluating the proper database solution for your next application.