<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jeffrey Carpenter</title>
    <description>The latest articles on DEV Community by Jeffrey Carpenter (@jeffreyscarpenter).</description>
    <link>https://dev.to/jeffreyscarpenter</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F262283%2Fb32d36d5-920a-4173-9690-86fb157c32b5.png</url>
      <title>DEV Community: Jeffrey Carpenter</title>
      <link>https://dev.to/jeffreyscarpenter</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jeffreyscarpenter"/>
    <language>en</language>
    <item>
      <title>Data Services for the Masses</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Thu, 06 Oct 2022 19:58:01 +0000</pubDate>
      <link>https://dev.to/datastax/data-services-for-the-masses-5h58</link>
      <guid>https://dev.to/datastax/data-services-for-the-masses-5h58</guid>
      <description>&lt;p&gt;I’ve held several roles in my career in IT, ranging from software developer to enterprise architect to developer advocate. I’ve always been fascinated by the role that data plays in our applications—putting it into databases, getting it back out quickly, making sure it remains accurate when transferred between systems. Many of the hardest problems I’ve encountered have centered around data. For example: &lt;/p&gt;

&lt;p&gt;Writing a cache eviction algorithm for an application that replayed hours worth of time-series radar data on a loop (ask me about the maintenance nightmare I created) &lt;/p&gt;

&lt;p&gt;Learning how to form queries in Kibana so that we could pull just the right log statements to help us debug interactions between microservices and even down to the database. &lt;/p&gt;

&lt;p&gt;One problem sits at the intersection of technology and the people building it. &lt;strong&gt;&lt;em&gt;There’s an ongoing debate over when developers should be required to access data via APIs and when they should be allowed to write their own database queries directly.&lt;/em&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here, I’ll explore some of the solutions I’ve encountered in past projects and share why I’m excited about the &lt;a href="https://stargate.io/?ref=hackernoon.com" rel="noopener noreferrer"&gt;Stargate&lt;/a&gt; project as a framework for solving this problem for everyone. Stargate is an open-source API gateway for data, built on top of Apache Cassandra.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc8u90w1h9qarpl5gk9wg.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc8u90w1h9qarpl5gk9wg.jpg" alt="image" width="558" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;A brief history of data access design patterns&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Even back when we were writing monolithic applications, many of us committed to creating maintainable code were isolating data access and complex query logic behind object-relational mapping tools or using patterns like data access objects (DAOs). Later, when we started using service-oriented architecture (SOA), similar patterns for abstracting data access appeared.&lt;/p&gt;

&lt;p&gt;While working in the hospitality industry, I helped design a cloud-based reservation system based on a microservices architecture. Following patterns inherited from our legacy SOA system, we found ourselves creating a set of services that we called entity services. Each entity service provided access to a particular data type such as hotels, rates, inventory, or reservations. I shared this architecture at Cassandra Summit and other conferences in 2016:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rjjgaforyfamufd15gk.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2rjjgaforyfamufd15gk.jpg" alt="image" width="622" height="304"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We layered services that implemented business processes on top of the entity services. The shopping service composed data from the hotel, rate, and inventory services to provide hotel and room options given desired dates and travel locations. The booking service wrote records into the reservation service and decremented the available room counts in the inventory service. &lt;/p&gt;

&lt;p&gt;Each entity service was responsible for its own storage, and services were not permitted to access the storage of another service. This meant that each entity service could potentially have a different database, although, in practice, the initial entity services were all implemented on Cassandra, with a different keyspace to contain the tables used by each service.&lt;/p&gt;

&lt;p&gt;Each entity service consisted of a few simple elements: an API layer (typically REST), business logic like data validation, and code to map between the data format presented on the API (typically JSON) and database queries implemented using a driver. I built a reference implementation of an entity service called the &lt;a href="https://github.com/jeffreyscarpenter/reservation-service?ref=hackernoon.com" rel="noopener noreferrer"&gt;Reservation Service&lt;/a&gt; for my O’Reilly Cassandra &lt;a href="https://www.amazon.com/Cassandra-Definitive-Guide-Distributed-Scale/dp/1491933666?ref=hackernoon.com" rel="noopener noreferrer"&gt;book&lt;/a&gt; based on this architecture.  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzrb55xfriv5dce9ngqt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpzrb55xfriv5dce9ngqt.jpg" alt="image" width="586" height="304"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Drawbacks of entity services&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Over time I began to observe that these entity services followed similar patterns in their API templates, validation logic, and database logic. While using frameworks such as Java Spring certainly helped with the API layer, logging, and other concerns, I wondered if there was more we could do to eliminate a lot of similar-looking code. The database access code, in particular, was quite formulaic:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feu716x69p5smndp5t098.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feu716x69p5smndp5t098.jpg" alt="image" width="654" height="319"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We encountered a performance challenge as well. Having multiple layers of microservices meant additional latency as client requests traversed business services, entity services, and the database.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;There are few new problems in computer science&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;It turns out that my team was not the only one encountering these issues and coming up with solutions. At DataStax Accelerate 2019 (our annual Cassandra community gathering), Michael Figuière shared how Instagram introduced the concept of a Cassandra gateway into their architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazhyns73khnvjli79czz.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fazhyns73khnvjli79czz.jpg" alt="image" width="525" height="249"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The motivations for introducing this gateway layer were familiar to my ears: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;A desire to abstract the details of writing application queries from developers&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;A desire to support increased throughput and lower latency&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;A strategy for using familiar APIs to minimize impact to client applications&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In an interesting twist, the “familiar API” was Cassandra’s legacy Thrift-based API. Instagram had a large investment in clients using Thrift. Introducing the gateway promoted client reuse, while providing a translation between Thrift and Cassandra’s more modern CQL API. This layer also made it much easier to upgrade Cassandra versions.&lt;/p&gt;

&lt;p&gt;This highlights an interesting phenomenon: while the desire for an API layer to abstract data access is common across many organizations, each organization tends to have its own unique API requirements. There are often existing services with various API styles (such as REST, gRPC, Thrift) and data formats (such as JSON or protobuf). What if we could avoid the hassle of maintaining a bunch of services that are just thin wrappers around the database?&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Enter Stargate&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This desire to provide a common layer for data access with multiple different API styles inspired the Stargate project. The basic idea is simple: collapse the API layer into the database. As the picture shows, Stargate provides a pluggable framework for adding different API styles on top of Cassandra-compatible databases. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fveq28ruil6lhtty8uy61.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fveq28ruil6lhtty8uy61.jpg" alt="image" width="619" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the time of writing, the following API plugins are supported:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RESTful APIs:&lt;/strong&gt; This plugin exposes existing tables defined via CQL in a connected Cassandra cluster and provides an endpoint for creating a new schema. Data payloads are defined as JSON objects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GraphQL APIs:&lt;/strong&gt; This plugin exposes CQL tables as a GraphQL API. A couple of features I absolutely love about GraphQL are the ability to request a subset of the fields of a returned row of data and the ability to compose data from multiple tables in a single query. &lt;/p&gt;
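&lt;p&gt;For example, a query against a hypothetical reservations table might fetch just two fields (the table and field names here are invented for illustration; Stargate generates the actual GraphQL schema from your CQL tables):&lt;/p&gt;

```graphql
# Fetch only the fields we need, filtered by partition key
query {
  reservations(value: { hotelId: "NY456" }) {
    values {
      confirmNumber
      startDate
    }
  }
}
```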

&lt;p&gt;&lt;strong&gt;Document API:&lt;/strong&gt; This plugin is what I point to when people ask if Cassandra is “schemaless” like MongoDB. Traditionally the answer has been “no”—Cassandra requires a schema defined by CQL. However, the Document API changes all this, enabling you to throw arbitrary JSON documents at Stargate, which stores them and then lets you query documents or sub-documents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CQL API:&lt;/strong&gt; This plugin supports Cassandra’s native query language. You might wonder why you would use this instead of one of the other APIs or just accessing a Cassandra cluster directly. The main reason, in my opinion, is a cool pattern I want to show you now.&lt;/p&gt;
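&lt;p&gt;As a rough sketch of how the first two of these plugins are used (the endpoint paths and auth header follow the Stargate documentation at the time of writing, but verify them against your version; the token, names, and port are placeholders), note that the same JSON payload idea drives both the REST and Document APIs:&lt;/p&gt;

```python
# Sketch: constructing requests against Stargate's HTTP APIs. We only
# build the requests here; sending one requires a running Stargate node
# (urllib.request.urlopen(req) would submit it).
import json
import urllib.request

token = "replace-with-auth-token"   # obtained from Stargate's auth API (placeholder)
base = "http://localhost:8082"      # default REST/Document API port

# REST API: insert a row into an existing CQL table
row = {"confirm_number": "RSV123", "hotel_id": "NY456"}
rest_req = urllib.request.Request(
    f"{base}/v2/keyspaces/reservation/reservations",
    data=json.dumps(row).encode(),
    headers={"X-Cassandra-Token": token, "Content-Type": "application/json"},
    method="POST",
)

# Document API: store an arbitrary JSON document, no schema required
doc = {"guest": {"name": "Ada"}, "nights": 3}
doc_req = urllib.request.Request(
    f"{base}/v2/namespaces/reservation/collections/bookings",
    data=json.dumps(doc).encode(),
    headers={"X-Cassandra-Token": token, "Content-Type": "application/json"},
    method="POST",
)
```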

&lt;h2&gt;
  
  
  &lt;strong&gt;The hidden benefit of an old idea&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;An interesting aspect of Stargate’s architecture is that Stargate nodes are actually Cassandra nodes. They participate in Cassandra’s distributed architecture as nodes that respond to client queries but don’t actually store any data, delegating storage and retrieval to regular Cassandra nodes. This enables a flexible scaling approach that wasn’t possible previously. Now you can scale the number of Stargate nodes to handle your query volume and the number of Cassandra nodes to handle your storage volume.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2apgw5uczme54efe6vvy.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2apgw5uczme54efe6vvy.jpg" alt="image" width="632" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As it turns out, the idea of nodes that participate in a Cassandra cluster but don’t store data is not a new one. For example, longtime community member Eric Lubow introduced a similar concept called “coordinator nodes” (also known as “proxy nodes”) in his talk at Cassandra Summit 2016, based on his work at SimpleReach:  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvgvi7x7w453y5ttqcly.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgvgvi7x7w453y5ttqcly.jpg" alt="image" width="528" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As shown in the figure above, the coordinator nodes can use different instance types than other “data nodes” in the cluster to support optimal use of resources and save on cloud computing costs. This is a benefit that Stargate provides as well, and one that is easily realized when deploying Stargate on Kubernetes as part of the &lt;a href="https://k8ssandra.io/?ref=hackernoon.com" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt; project, just by changing a few values in a YAML config file to specify a different instance type.&lt;/p&gt;
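&lt;p&gt;As a sketch, a K8ssandra Helm values file along these lines scales Stargate and Cassandra nodes independently and pins Stargate to a different instance type (the key names are illustrative and vary by chart version, and the nodeSelector placement is my own example; check the K8ssandra docs for your release):&lt;/p&gt;

```yaml
# Illustrative K8ssandra Helm values; verify key names for your chart version
cassandra:
  datacenters:
    - name: dc1
      size: 6          # storage-oriented Cassandra nodes
stargate:
  enabled: true
  replicas: 3          # API-oriented Stargate nodes, scaled for query volume
  # Example scheduling hint: place Stargate pods on compute-optimized instances
  nodeSelector:
    node.kubernetes.io/instance-type: c5.2xlarge
```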

&lt;h2&gt;
  
  
  &lt;strong&gt;More exciting possibilities ahead&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Because of Stargate’s pluggable architecture, you can extend it for your own API needs and define additional APIs or tailor one of the existing open-source connectors to match your enterprise API standards. The roadmap includes plugins for gRPC as well as streaming interfaces such as Pulsar and Kafka.&lt;/p&gt;

&lt;p&gt;I’m also excited about the possibilities when the Stargate and K8ssandra open source projects are used together. The goal is to provide a production-ready, Cassandra-based data layer that you can install in any Kubernetes environment in minutes, so you can focus on coding your apps. If you'd like to experiment with Cassandra quickly without running Kubernetes, try the managed &lt;a href="https://astra.dev/3rSxXGl" rel="noopener noreferrer"&gt;DataStax Astra DB&lt;/a&gt;, which is built on Apache Cassandra.&lt;/p&gt;

&lt;p&gt;Beyond these two projects, there’s another community of people from different organizations who have come together to dream up the future of cloud-based data infrastructure: the &lt;a href="https://dok.community/?ref=hackernoon.com" rel="noopener noreferrer"&gt;Data on Kubernetes community&lt;/a&gt;. (In fact, this article is based on &lt;a href="https://www.youtube.com/watch?v=PMZ-T3TgDCE&amp;amp;ref=hackernoon.com" rel="noopener noreferrer"&gt;my talk&lt;/a&gt; at the Data on Kubernetes Community Day at KubeCon EU 2021.) We’d love to work with you in any or all of these communities!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Why Kubernetes Is The Best Technology For Running A Cloud-Native Database</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Tue, 20 Sep 2022 17:03:29 +0000</pubDate>
      <link>https://dev.to/datastax/why-kubernetes-is-the-best-technology-for-running-a-cloud-native-database-4ie9</link>
      <guid>https://dev.to/datastax/why-kubernetes-is-the-best-technology-for-running-a-cloud-native-database-4ie9</guid>
      <description>&lt;p&gt;We’ve been talking about migrating workloads to the cloud for a long time, but a look at the application portfolios of many IT organizations demonstrates that there’s still a lot of work to be done. In many cases, challenges with persisting and moving data in clouds continue to be the key limiting factor slowing cloud adoption, despite the fact that databases in the cloud have been available for years.&lt;/p&gt;

&lt;p&gt;For this reason, there has been a surge of recent interest in data infrastructure that is designed to take maximum advantage of the benefits that cloud computing provides. A &lt;a href="https://k8ssandra.io/blog/2021/03/23/the-search-for-a-cloud-native-database/?ref=hackernoon.com" rel="noopener noreferrer"&gt;cloud-native database&lt;/a&gt; is one that achieves the goals of scalability, elasticity, resiliency, observability, and automation; the &lt;a href="https://k8ssandra.io/?utm_medium=referral&amp;amp;utm_source=hackernoon&amp;amp;utm_campaign=k8ssandra&amp;amp;ref=hackernoon.com" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt; project is a great example. It packages Apache Cassandra and supporting tools into a production-ready Kubernetes deployment. &lt;/p&gt;

&lt;p&gt;This raises an interesting question: must a database run on Kubernetes to be considered cloud-native? While Kubernetes was originally designed for stateless workloads, recent improvements in Kubernetes such as StatefulSets and persistent volumes have made it possible to run stateful workloads as well. Even longtime DevOps practitioners &lt;a href="https://thenewstack.io/a-case-for-databases-on-kubernetes-from-a-former-skeptic/?ref=hackernoon.com" rel="noopener noreferrer"&gt;skeptical of running databases on Kubernetes&lt;/a&gt; are beginning to come around, and &lt;a href="https://cloud.google.com/blog/products/databases/to-run-or-not-to-run-a-database-on-kubernetes-what-to-consider?ref=hackernoon.com" rel="noopener noreferrer"&gt;best practices are starting to emerge&lt;/a&gt;. &lt;/p&gt;

&lt;p&gt;But of course grudging acceptance of running databases on Kubernetes is not our goal. If we’re not pushing for greater &lt;a href="https://containerjournal.com/topics/a-maturity-model-for-cloud-native-databases/?ref=hackernoon.com" rel="noopener noreferrer"&gt;maturity in cloud-native databases&lt;/a&gt;, we’re missing a big opportunity. To make databases the most “cloud-native” they can be, we need to embrace everything that Kubernetes has to offer. A truly cloud-native approach means adopting key elements of the Kubernetes design paradigm. A cloud-native database must be one that can run effectively on Kubernetes. Let’s explore a few Kubernetes design principles that point the way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Principle 1: Leverage compute, network, and storage as commodity APIs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;One of the keys to the success of cloud computing is the commoditization of compute, networking, and storage as resources we can provision via simple APIs. Consider this sampling of AWS services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Compute: we allocate virtual machines through EC2 and Auto Scaling groups (ASGs)&lt;/li&gt;
&lt;li&gt;  Network: we manage traffic using Elastic Load Balancers (ELB), Route 53, and VPC peering &lt;/li&gt;
&lt;li&gt;  Storage: we persist data using options such as the Simple Storage Service (S3) for long-term object storage, or Elastic Block Storage (EBS) volumes for our compute instances. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kubernetes offers its own APIs to provide similar services for a world of containerized applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Compute: pods, deployments, and replica sets manage the scheduling and life cycle of containers on computing hardware&lt;/li&gt;
&lt;li&gt;  Network: services and ingress expose a container’s networked interfaces&lt;/li&gt;
&lt;li&gt;  Storage: persistent volumes and stateful sets enable flexible association of containers to storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kubernetes resources promote portability of applications across Kubernetes distributions and service providers. What does this mean for databases? They are simply applications that leverage compute, networking, and storage resources to provide the services of data persistence and retrieval:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Compute: a database needs sufficient processing power to process incoming data and queries. Each database node is deployed as a pod and grouped in StatefulSets, enabling Kubernetes to manage scaling out and scaling in.&lt;/li&gt;
&lt;li&gt;  Network: a database needs to expose interfaces for data and control. We can use Kubernetes Services and Ingress Controllers to expose these interfaces.&lt;/li&gt;
&lt;li&gt;  Storage: a database uses persistent volumes of a specified storage class to store and retrieve data.&lt;/li&gt;
&lt;/ul&gt;
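
&lt;p&gt;Putting those three mappings together, a deliberately simplified deployment of a database might look like the sketch below; in practice a Cassandra cluster would be managed by an operator rather than written by hand (more on that under Principle 2):&lt;/p&gt;

```yaml
# Simplified sketch: a database as compute (StatefulSet), network
# (headless Service), and storage (volumeClaimTemplates) resources.
apiVersion: v1
kind: Service
metadata:
  name: cassandra
spec:
  clusterIP: None            # headless: gives each pod a stable DNS name
  selector:
    app: cassandra
  ports:
    - port: 9042             # CQL port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra
spec:
  serviceName: cassandra
  replicas: 3
  selector:
    matchLabels:
      app: cassandra
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra:4.0
          ports:
            - containerPort: 9042
  volumeClaimTemplates:      # one persistent volume per database node
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
```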

&lt;p&gt;Thinking of databases in terms of their compute, network, and storage needs removes much of the complexity involved in deployment on Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Principle 2: Separate the control and data planes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Kubernetes promotes the separation of control and data planes. The Kubernetes API server is the key control plane interface used to request computing resources, while controllers manage the details of mapping those requests onto an underlying IaaS platform.&lt;/p&gt;

&lt;p&gt;We can apply this same pattern to databases. For example, Cassandra’s data plane consists of the port exposed by each node for clients to access Cassandra Query Language (CQL) and the port used for internode communication. The control plane includes the Java Management Extensions (JMX) interface provided by each Cassandra node. Although JMX is a standard that’s showing its age and has had some security vulnerabilities, it's a relatively simple task to take a more cloud-native approach. In K8ssandra, Cassandra is deployed in a custom container image that adds a RESTful Management API, bypassing the JMX interface.&lt;/p&gt;

&lt;p&gt;The remainder of the control plane consists of logic that leverages the Management API to manage Cassandra nodes. This is implemented via the Kubernetes operator pattern. Operators define custom resources and provide control loops that observe the state of those resources and take actions to move them toward a desired state, helping extend Kubernetes with domain-specific logic.&lt;/p&gt;

&lt;p&gt;The K8ssandra project uses &lt;a href="https://github.com/datastax/cass-operator?ref=hackernoon.com" rel="noopener noreferrer"&gt;cass-operator&lt;/a&gt; to automate Cassandra operations. Cass-operator defines a “CassandraDatacenter” custom resource definition (CRD); each CassandraDatacenter resource represents a top-level failure domain of a Cassandra cluster. This builds a higher-level abstraction on top of StatefulSets and persistent volumes.&lt;/p&gt;
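
&lt;p&gt;A CassandraDatacenter manifest looks roughly like this (field names follow the cass-operator documentation at the time of writing; treat the specific values as placeholders):&lt;/p&gt;

```yaml
apiVersion: cassandra.datastax.com/v1beta1
kind: CassandraDatacenter
metadata:
  name: dc1
spec:
  clusterName: cluster1      # datacenters sharing a clusterName form one cluster
  serverType: cassandra
  serverVersion: "4.0.1"
  size: 3                    # desired node count; the operator reconciles toward it
  storageConfig:
    cassandraDataVolumeClaimSpec:
      storageClassName: standard
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 100Gi
```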

&lt;p&gt;A sample K8ssandra deployment including Apache Cassandra and cass-operator:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau270n756te04k3wz3g9.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fau270n756te04k3wz3g9.jpg" alt="Alt Text" width="800" height="617"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Principle 3: Make observability easy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The three pillars of observable systems are logging, metrics, and tracing. Kubernetes provides a great starting point by exposing the logs of each container to third-party log aggregation solutions. Metrics and tracing require a bit more effort to implement, but there are multiple solutions available.&lt;/p&gt;

&lt;p&gt;The K8ssandra project supports metrics collection using the kube-prometheus-stack. The Metrics Collector for Apache Cassandra (MCAC) is deployed as an agent on each Cassandra node, providing a dedicated metrics endpoint. A ServiceMonitor from the kube-prometheus-stack pulls metrics from each agent and stores them in Prometheus for use by Grafana or other visualization and analysis tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Principle 4: Make the default configuration secure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Kubernetes networking is secure by default: ports must be explicitly exposed in order to be accessible from outside a pod. This sets a useful precedent for database deployment, forcing us to think carefully about how each control plane and data plane interface will be exposed, and which interfaces should be exposed via a Kubernetes Service.&lt;/p&gt;

&lt;p&gt;In K8ssandra, CQL access is exposed as a service for each CassandraDatacenter resource, while the management and metrics APIs on individual Cassandra nodes are accessed by cass-operator and the Prometheus ServiceMonitor, respectively.&lt;/p&gt;

&lt;p&gt;Kubernetes also provides facilities for secret management, including sharing encryption keys and configuring administrative accounts. K8ssandra deployments replace Cassandra’s default administrator account with a new administrator username and password.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Principle 5: Prefer declarative configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In the Kubernetes declarative approach, you specify the desired state of resources, and controllers manipulate the underlying infrastructure to achieve that state. Cass-operator lets you specify the desired number of nodes in a cluster; it manages the details of placing new nodes when scaling up and selecting which nodes to remove when scaling down.&lt;/p&gt;

&lt;p&gt;The next generation of operators should enable us to specify rules for stored data size, number of transactions per second, or both. Perhaps we’ll be able to specify maximum and minimum cluster sizes, and when to move less frequently used data to object storage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The best designs draw on the wisdom of the community&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Hopefully I’ve convinced you that Kubernetes is a great source of best practices for cloud-native database implementations, and the innovation continues. Solutions for federating Kubernetes clusters are still maturing, but will soon make it much simpler to manage multi-data center Cassandra clusters in Kubernetes. In the Cassandra community, we can work to make extensions for management and metrics a part of the core Apache project so that Cassandra is more naturally cloud-native for everyone, right out of the box.&lt;/p&gt;

&lt;p&gt;If you’re excited at the prospect of cloud-native databases on Kubernetes, you’re not alone. A group of like-minded individuals and organizations has assembled as the &lt;a href="https://dok.community/?ref=hackernoon.com" rel="noopener noreferrer"&gt;Data on Kubernetes Community&lt;/a&gt;, which has hosted over 50 meetups in multiple languages since its inception last year. We’re grateful to MayaData for helping to start this community, and are excited to announce that DataStax has joined as a co-sponsor of the DoKC.&lt;/p&gt;

&lt;p&gt;In more great news, the DoKC was &lt;a href="https://community.cncf.io/data-on-kubernetes/?ref=hackernoon.com" rel="noopener noreferrer"&gt;accepted as an official CNCF community group&lt;/a&gt;, and hosted the first ever &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/program/colocated-events/?ref=hackernoon.com#data-on-kubernetes-day" rel="noopener noreferrer"&gt;Data on Kubernetes Day&lt;/a&gt; as part of &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/?ref=hackernoon.com" rel="noopener noreferrer"&gt;KubeCon/CloudNativeCon Europe&lt;/a&gt; on May 3. Rick Vasquez’s &lt;a href="https://www.youtube.com/watch?v=gpz7mgxK5Ww&amp;amp;ref=hackernoon.com" rel="noopener noreferrer"&gt;talk&lt;/a&gt;, “A Call for DBMS to Modernize on Kubernetes,” lays down a challenge to make the architectural changes required to become truly cloud-native. Together, we’ll arrive at the best solutions through collaboration in open source communities like Kubernetes, Data on Kubernetes, Apache Cassandra, and K8ssandra. Let’s lead with code and keep talking! If you'd like to experiment with Cassandra quickly without running Kubernetes, try the managed &lt;a href="https://astra.dev/3RZmeRH" rel="noopener noreferrer"&gt;DataStax Astra DB&lt;/a&gt;, which is built on Apache Cassandra.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How the world caught up with Apache Cassandra</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Thu, 15 Sep 2022 16:50:42 +0000</pubDate>
      <link>https://dev.to/datastax/how-the-world-caught-up-with-apache-cassandra-4cjb</link>
      <guid>https://dev.to/datastax/how-the-world-caught-up-with-apache-cassandra-4cjb</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fptljp57tbtdg8klmkcud.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fptljp57tbtdg8klmkcud.png" alt="Image description" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
The O’Reilly book, &lt;em&gt;Cassandra: The Definitive Guide,&lt;/em&gt; features a quote from Ray Kurzweil, the noted inventor and futurist: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“An invention has to make sense in the world in which it is finished, not the world in which it is started.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This quote has a prophetic ring to it, especially considering my co-author Eben Hewitt included it in the 2010 first edition of this book we wrote, back when Apache Cassandra, the open-source, distributed, and highly scalable NoSQL database, was just on its 0.7 release. &lt;/p&gt;

&lt;p&gt;In those days, other NoSQL databases were appearing on the scene as part of platforms with worldwide scale from companies like Amazon, YouTube, and Facebook. With many competing database projects and a slowly emerging response from relational database vendors, the future of this emerging landscape wasn’t yet clear, and Hewitt qualified his assessment with this summary: “In a world now working at web scale and looking to the future, Apache Cassandra &lt;em&gt;might be one part&lt;/em&gt; of the answer.” (emphasis added)&lt;/p&gt;

&lt;p&gt;While many of those databases from the NoSQL revolution and the &lt;a href="https://en.wikipedia.org/wiki/NewSQL" rel="noopener noreferrer"&gt;NewSQL&lt;/a&gt; counter-revolution have now faded into history, Cassandra has stood the test of time, maturing into a rock-solid database that arguably still scales with performance and reliability better than any other. &lt;/p&gt;

&lt;p&gt;Twelve-plus years after its invention, Cassandra is now used by approximately 90 percent of the Fortune 100, and its appeal is broadening quickly, driven by a rush to harness today’s “data deluge” with apps that are globally distributed and always-on. Add to this recent advances in the Cassandra ecosystem such as &lt;a href="https://stargate.io/" rel="noopener noreferrer"&gt;Stargate&lt;/a&gt;, &lt;a href="https://k8ssandra.io/" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt;, and cloud services like &lt;a href="https://astra.dev/3qGHoI5" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt;, and the cost and complexity barriers to using Cassandra are fading into the past. So while it’s fair to say that Cassandra might have been ahead of its time in 2007, it’s primed and ready for the data demands of the 2020s and beyond.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Cassandra grows up fast&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Cassandra made a lot of sense to its inventors at Facebook when they developed it in 2007 to store and access reams of data for Messenger, which was growing insanely fast. From the start, Cassandra scaled quickly and served huge amounts of data within strict SLAs, something that relational databases and SQL, long the standard tools for accessing and manipulating data, couldn’t do. As it became clear that this technology was suitable for other use cases, Facebook handed Cassandra to the Apache Software Foundation, where it became an open source project (it was voted into a top-level project in 2010).&lt;/p&gt;

&lt;p&gt;The reliability and failover capabilities offered by Cassandra quickly won over some rising web stars. Netflix launched its streaming service in 2007, using an Oracle database in a single data center. As the company’s subscribers, the devices they binge-watched on, and the data those devices generated all grew rapidly, the limitations on scalability and the potential for failures became a serious threat to Netflix’s success. At the time, Netflix’s then-cloud architect Adrian Cockcroft said he viewed the single data center that housed Netflix’s backend as a single point of failure. Cassandra, with its distributed architecture, was a natural choice; by 2013, most of Netflix’s data was housed there, and Netflix still uses Cassandra today.&lt;/p&gt;

&lt;p&gt;Cassandra survived its adolescent years by retaining its position as the database that scales more reliably than anything else, with a continual pursuit of operational simplicity at scale. It demonstrated its value even further by integrating with a broader data infrastructure stack of open source components, including the analytics engine &lt;a href="https://spark.apache.org/" rel="noopener noreferrer"&gt;Apache Spark&lt;/a&gt;, stream-processing platform &lt;a href="https://kafka.apache.org/" rel="noopener noreferrer"&gt;Apache Kafka&lt;/a&gt;, and others.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;The Cassandra constellation&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Cassandra hit a major milestone this month, with the release of 4.0. The members of the Cassandra community pledged to do something that’s unusual for a dot-zero release: make 4.0 so stable that major users would run it in production from the get-go. But the real headline is the overall growth of the Cassandra ecosystem, measured by changes both within the project and in related projects, and by improvements in how Cassandra plays within your infrastructure. &lt;/p&gt;

&lt;p&gt;A host of complementary open-source technologies have sprung up around Cassandra to make it easier for developers to build apps with it. &lt;a href="https://stargate.io/" rel="noopener noreferrer"&gt;Stargate&lt;/a&gt;, for example, is an open source data gateway that provides a pluggable API layer that greatly simplifies developer interaction with any Cassandra database. REST, GraphQL, Document, and gRPC APIs make it easy to just start coding with Cassandra without having to learn the complexities of CQL and Cassandra data modeling.&lt;/p&gt;
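To make this concrete, here is a minimal sketch of what talking to Cassandra through Stargate’s Document API can look like. The endpoint path and the `X-Cassandra-Token` header follow Stargate’s documented conventions, but the URL, token, namespace, and collection names below are placeholders for illustration; the request is built without being sent.

```python
import json
import urllib.request

# Hypothetical deployment details -- replace with your own Stargate endpoint and auth token.
STARGATE_URL = "http://localhost:8082"
AUTH_TOKEN = "replace-with-auth-token"

def build_document_request(namespace, collection, document):
    """Build (but do not send) a Stargate Document API request that stores a JSON document."""
    url = f"{STARGATE_URL}/v2/namespaces/{namespace}/collections/{collection}"
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        headers={
            "X-Cassandra-Token": AUTH_TOKEN,  # Stargate's auth header
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_document_request("library", "books", {"title": "Cassandra: The Definitive Guide"})
print(req.full_url)
```

The point of the sketch is how little Cassandra-specific knowledge it requires: no CQL, no data modeling, just a JSON document and an HTTP request.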

&lt;p&gt;&lt;a href="https://k8ssandra.io/" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt; is another open source project that demonstrates this approachability, making it possible to deploy Cassandra on any Kubernetes engine, from the public cloud providers to VMWare and OpenStack. K8ssandra extends the Kubernetes promise of application portability to the data tier, providing yet another weapon against vendor-lock in.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;What if data wasn’t a problem?&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;There’s a question that Hewitt poses in &lt;em&gt;&lt;a href="https://www.datastax.com/resources/ebook/oreilly-cassandra-definitive-guide" rel="noopener noreferrer"&gt;Cassandra: The Definitive Guide&lt;/a&gt;:&lt;/em&gt; “What kind of things would I do with data if it wasn’t a problem?”&lt;/p&gt;

&lt;p&gt;Netflix asked this question—and ran with the answer—almost a decade ago. The $25-billion company is a paragon of the kind of success that can be built with the right tools and the right strategy at the right time. But today, for a broad spectrum of companies that want to achieve business success, data also can’t be a “problem.” &lt;/p&gt;

&lt;p&gt;Think of the modern applications and workloads that should never go down, like online banking services, or those that operate at huge, distributed scale, such as airline booking systems or popular retail apps. Cassandra’s seamless and consistent ability to scale to hundreds of terabytes, along with its exceptional performance under heavy loads, has made it a key part of the data infrastructures of companies that operate these kinds of applications.&lt;/p&gt;

&lt;p&gt;Across industries, companies have staked their business on the reliability and scalability of Cassandra. Best Buy, the world’s largest multichannel consumer electronics retailer, refers to Cassandra as “flawless” in how it handles massive spikes in holiday purchasing traffic. Bloomberg News has relied on Cassandra since 2016 because it’s easy to use, easy to scale, and always available; the financial news service serves 20 billion requests per day on nearly a petabyte of data (that’s the rough equivalent of over 4,000 digital pictures a day—for every day of an average person’s life). &lt;/p&gt;

&lt;p&gt;But Cassandra isn’t just for big, established sector leaders like Best Buy or Bloomberg. &lt;a href="https://www.datastax.com/enterprise-success/ankeri" rel="noopener noreferrer"&gt;Ankeri&lt;/a&gt;, an Icelandic startup that operates a platform to help cargo shipping operators manage real-time vessel data, chose Cassandra—delivered through DataStax’s &lt;a href="https://astra.dev/3qGHoI5" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt;—in part because of its ability to scale as the company gathers an increasing amount of data from a growing number of ships. It wanted a data platform that wouldn’t make data a problem, and wouldn’t get in the way of its success.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Making Cassandra simpler and more cost-effective&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;A handful of organizations have built services around Cassandra, in an effort to make it more accessible, and to solve some of the inherent challenges that come with operating a robust database.&lt;/p&gt;

&lt;p&gt;One particularly hard nut to crack when it comes to managing databases has been provisioning. With cloud computing services (think AWS Lambda), scaling, capacity planning, and cost management are all automated, resulting in software that’s easy to maintain and cost-effective: “serverless,” in other words. But because modern databases store data by partitioning it across nodes of a database cluster, they’ve proved challenging to make serverless. Doing so requires rebalancing data across nodes when more are added, in order to balance storage and computing capabilities. &lt;/p&gt;
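The rebalancing problem can be illustrated with a toy consistent-hash ring. This is a deliberate simplification: real Cassandra uses virtual nodes, replication, and a different partitioner, and the hash function here is purely a stand-in.

```python
import bisect
import hashlib

def token(value: str) -> int:
    """Map a value onto a fixed ring of 2**32 tokens (toy stand-in for a real partitioner)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % 2**32

class ToyRing:
    """Minimal consistent-hash ring: each key is owned by the next node clockwise."""
    def __init__(self, nodes):
        self.ring = sorted((token(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        tokens = [t for t, _ in self.ring]
        idx = bisect.bisect_right(tokens, token(key)) % len(self.ring)
        return self.ring[idx][1]

keys = [f"user-{i}" for i in range(1000)]
before = ToyRing(["node-a", "node-b", "node-c"])
after = ToyRing(["node-a", "node-b", "node-c", "node-d"])

# Only the keys whose owner changed have to move when a node joins --
# this is the data movement a serverless database service must automate.
moved = sum(1 for k in keys if before.owner(k) != after.owner(k))
print(f"{moved} of {len(keys)} keys moved")
```

Every key that moves lands on the new node, so growing the cluster only relocates a fraction of the data; automating that movement (and its inverse when scaling in) is what makes pay-as-you-go elasticity possible.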

&lt;p&gt;Because of this, enterprises have been required to guess what their peak usage will be—and pay for that level, even if they aren’t using that capacity. That’s why it was a big deal when DataStax announced earlier this year that its Astra DB cloud database built on Cassandra is &lt;a href="https://www.datastax.com/blog/2021/02/datastax-serverless-what-we-did-and-why-its-game-changer" rel="noopener noreferrer"&gt;available&lt;/a&gt; as a serverless, pay-as-you-go service. According to &lt;a href="https://www.datastax.com/gigaom-tco" rel="noopener noreferrer"&gt;recent research&lt;/a&gt; by analyst firm GigaOm, the serverless Astra DB can deliver significant cost savings. And developers will only pay for what they use, no matter how many database clusters they create and deploy. &lt;/p&gt;

&lt;p&gt;Carl Olofson, research vice president at IDC, noted: “A core benefit of the cloud is dynamic scalability, but this has been more difficult to achieve for storage than with compute. By decoupling compute from storage, DataStax’s Astra DB service lets users take advantage of the innate elasticity of the cloud for data, with a cloud agnostic database.”&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;A database for today&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;While Cassandra is more than a decade young, it is a database for today.  If the argument of 2010 was “Cassandra may be the future,” and 2017 “Cassandra is mature,” the 2021 version is “Cassandra is an essential part of any modern data platform.” The developments in Cassandra and its surrounding ecosystem point to a coming wave of new developers and enterprises worldwide for whom Cassandra is not just a sensible choice, but an obvious one.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Want to learn more about DataStax Astra DB, built on Apache Cassandra? &lt;a href="https://www.datastax.com/products/astra/demo" rel="noopener noreferrer"&gt;Sign up&lt;/a&gt; for a free demo.&lt;/strong&gt;
&lt;/h2&gt;

</description>
    </item>
    <item>
      <title>Why a Cloud-Native Database Must Run on K8s</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Tue, 19 Jul 2022 17:44:05 +0000</pubDate>
      <link>https://dev.to/datastax/why-a-cloud-native-database-must-run-on-k8s-1ped</link>
      <guid>https://dev.to/datastax/why-a-cloud-native-database-must-run-on-k8s-1ped</guid>
      <description>&lt;p&gt;We’ve been talking about migrating workloads to the cloud for a long time, but a look at the application portfolios of many IT organizations demonstrates that there’s still a lot of work to be done. In many cases, challenges with persisting and moving data in clouds continue to be the key limiting factor slowing cloud adoption, despite the fact that databases in the cloud have been available for years.&lt;/p&gt;

&lt;p&gt;For this reason, there has been a surge of recent interest in data infrastructure that is designed to take maximum advantage of the benefits that cloud computing provides. A &lt;a href="https://k8ssandra.io/blog/2021/03/23/the-search-for-a-cloud-native-database/" rel="noopener noreferrer"&gt;cloud-native database&lt;/a&gt; is one that achieves the goals of scalability, elasticity, resiliency, observability and automation; the &lt;a href="https://k8ssandra.io/" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt; project is a great example. It packages Apache &lt;a href="https://containerjournal.com/?s=Cassandra" rel="noopener noreferrer"&gt;Cassandra&lt;/a&gt; and supporting tools into a production-ready Kubernetes deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Databases on Kubernetes&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;This raises an interesting question: must a database run on Kubernetes to be considered cloud-native? While Kubernetes was originally designed for stateless workloads, recent improvements in Kubernetes – such as StatefulSets and persistent volumes –  have made it possible to run stateful workloads, as well. Even long-time DevOps practitioners &lt;a href="https://thenewstack.io/a-case-for-databases-on-kubernetes-from-a-former-skeptic/" rel="noopener noreferrer"&gt;skeptical of running databases&lt;/a&gt; on Kubernetes are beginning to come around, and &lt;a href="https://cloud.google.com/blog/products/databases/to-run-or-not-to-run-a-database-on-kubernetes-what-to-consider" rel="noopener noreferrer"&gt;best practices&lt;/a&gt; are starting to emerge.&lt;/p&gt;

&lt;p&gt;But, of course, grudging acceptance of running databases on Kubernetes is not our goal. If we’re not pushing for greater &lt;a href="https://containerjournal.com/topics/a-maturity-model-for-cloud-native-databases/" rel="noopener noreferrer"&gt;maturity in cloud-native databases&lt;/a&gt;, we’re missing a big opportunity. To make databases the most “cloud-native” they can be, we need to embrace everything that Kubernetes has to offer. A truly cloud-native approach means adopting key elements of the Kubernetes design paradigm. A cloud-native database must be one that can run effectively on Kubernetes. Let’s explore a few Kubernetes design principles that point the way.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Principle One: Leverage Compute, Network and Storage as Commodity APIs&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;One of the keys to the success of cloud computing is the commoditization of compute, networking and storage as resources we can provision via simple APIs. Consider this sampling of AWS services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Compute: we allocate virtual machines through EC2 and Autoscaling Groups (ASGs)&lt;/li&gt;
&lt;li&gt;  Network: we manage traffic using Elastic Load Balancers (ELB), Route 53, and VPC peering&lt;/li&gt;
&lt;li&gt;  Storage: we persist data using options such as the Simple Storage Service (S3) for long-term object storage, or Elastic Block Storage (EBS) volumes for our compute instances.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kubernetes offers its own APIs to provide similar services for a world of containerized applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Compute: pods, deployments, and replica sets manage the scheduling and life cycle of containers on computing hardware&lt;/li&gt;
&lt;li&gt;  Network: services and ingress expose a container’s networked interfaces&lt;/li&gt;
&lt;li&gt;  Storage: persistent volumes and stateful sets enable flexible association of containers to storage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Kubernetes resources promote portability of applications across Kubernetes distributions and service providers. What does this mean for databases? They are simply applications that leverage compute, networking and storage resources to provide the services of data persistence and retrieval:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Compute: a database needs sufficient processing power to process incoming data and queries. Each database node is deployed as a pod and grouped in StatefulSets, enabling Kubernetes to manage scaling out and scaling in.&lt;/li&gt;
&lt;li&gt;  Network: a database needs to expose interfaces for data and control. We can use Kubernetes Services and Ingress Controllers to expose these interfaces.&lt;/li&gt;
&lt;li&gt;  Storage: a database uses persistent volumes of a specified storage class to store and retrieve data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thinking of databases in terms of their compute, network and storage needs removes much of the complexity involved in deployment on Kubernetes.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Principle Two: Separate the Control and Data Planes&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Kubernetes promotes the separation of control and data planes. The Kubernetes API server is the key control plane interface used to request computing resources, while its controllers manage the details of mapping those requests onto an underlying IaaS platform.&lt;/p&gt;

&lt;p&gt;We can apply this same pattern to databases. For example, Cassandra’s data plane consists of the port exposed by each node for clients to access Cassandra Query Language (CQL) and the port used for internode communication. The control plane includes the Java Management Extensions (JMX) interface provided by each Cassandra node. Although JMX is a standard that’s showing its age and has had some security vulnerabilities, it’s a relatively simple task to take a more cloud-native approach. In K8ssandra, Cassandra is deployed in a custom container image that adds a RESTful Management API, bypassing the JMX interface.&lt;/p&gt;

&lt;p&gt;The remainder of the control plane consists of logic that leverages the management API to manage Cassandra nodes. This is implemented via the Kubernetes operator pattern. Operators define custom resources and provide control loops that observe the state of those resources and take actions to move them toward a desired state, helping extend Kubernetes with domain-specific logic.&lt;/p&gt;
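The operator pattern’s control loop can be sketched in a few lines. This is an illustration only: the `CassandraDatacenterSpec` class and node names here are toy stand-ins, not cass-operator’s actual CRD schema, and real reconciliation handles far more than node count.

```python
from dataclasses import dataclass

@dataclass
class CassandraDatacenterSpec:
    """Desired state, as a user would declare it in a custom resource (toy schema)."""
    size: int  # desired number of Cassandra nodes

def reconcile(spec, observed_nodes):
    """One pass of an operator-style control loop: compare observed state to
    desired state and return the actions needed to converge toward it."""
    actions = []
    if len(observed_nodes) < spec.size:
        for i in range(len(observed_nodes), spec.size):
            actions.append(f"scale-up: add node dc1-{i}")
    elif len(observed_nodes) > spec.size:
        for node in observed_nodes[spec.size:]:
            actions.append(f"scale-down: decommission {node}")
    return actions  # an empty list means observed state matches desired state

print(reconcile(CassandraDatacenterSpec(size=3), ["dc1-0"]))
```

A real operator runs this loop continuously against the Kubernetes API, so the cluster converges on the declared state even after failures, rather than executing a one-shot script.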

&lt;p&gt;The K8ssandra project uses &lt;a href="https://github.com/datastax/cass-operator" rel="noopener noreferrer"&gt;cass-operator&lt;/a&gt; to automate Cassandra operations. Cass-operator defines a “CassandraDatacenter” custom resource, via a custom resource definition (CRD), to represent each top-level failure domain of a Cassandra cluster. This builds a higher-level abstraction on top of StatefulSets and PersistentVolumes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymy3h26bb7pk5v6tp3vy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymy3h26bb7pk5v6tp3vy.png" alt="Alt Text" width="637" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Principle Three: Make Observability Easy&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The three pillars of observable systems are logging, metrics and tracing. Kubernetes provides a great starting point by exposing the logs of each container to third-party log aggregation solutions. Metrics and tracing require a bit more effort to implement, but there are multiple solutions available.&lt;/p&gt;

&lt;p&gt;The K8ssandra project supports metrics collection using the kube-prometheus-stack. The Metrics Collector for Apache Cassandra (MCAC) is deployed as an agent on each Cassandra node, providing a dedicated metrics endpoint. A ServiceMonitor from the kube-prometheus-stack pulls metrics from each agent and stores them in Prometheus for use by Grafana or other visualization and analysis tools.&lt;/p&gt;
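As a rough illustration of that wiring, a ServiceMonitor selecting the MCAC metrics endpoints might look like the following. The `monitoring.coreos.com/v1` API group comes with the kube-prometheus-stack, but the labels and port name shown here are assumptions for this sketch, not K8ssandra’s actual values.

```python
# Hypothetical ServiceMonitor, shown as the Python structure behind the YAML.
# Labels and the port name are illustrative placeholders.
service_monitor = {
    "apiVersion": "monitoring.coreos.com/v1",
    "kind": "ServiceMonitor",
    "metadata": {"name": "cassandra-metrics"},
    "spec": {
        "selector": {"matchLabels": {"app": "cassandra"}},  # match the metrics Service
        "endpoints": [{"port": "metrics", "interval": "30s"}],  # scrape each MCAC agent
    },
}
```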

&lt;h3&gt;
  
  
  &lt;strong&gt;Principle Four: Make the Default Configuration Secure&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Kubernetes networking is secure by default: ports must be explicitly exposed in order to be accessed externally to a pod. This sets a useful precedent for database deployment, forcing us to think carefully about how each control plane and data plane interface will be exposed, and which interfaces should be exposed via a Kubernetes Service.&lt;/p&gt;

&lt;p&gt;In K8ssandra, CQL access is exposed as a Service for each CassandraDatacenter resource, while the management and metrics APIs of individual Cassandra nodes are accessed by cass-operator and the Prometheus ServiceMonitor, respectively.&lt;/p&gt;

&lt;p&gt;Kubernetes also provides facilities for secrets management, including sharing encryption keys and configuring administrative accounts. K8ssandra deployments replace Cassandra’s default administrator account with a new administrator username and password.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Principle Five: Prefer Declarative Configuration&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In the Kubernetes declarative approach, you specify the desired state of resources and controllers manipulate the underlying infrastructure in order to achieve that state. Cass-operator allows you to specify the desired number of nodes in a cluster, and manages the details of placing new nodes to scale up, and selecting which nodes to remove to scale down.&lt;/p&gt;

&lt;p&gt;The next generation of operators should enable us to specify rules for stored data size, number of transactions per second or both. Perhaps we’ll be able to specify maximum and minimum cluster sizes, and when to move less frequently used data to object storage.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Draw on the Wisdom of the Community&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;I hope I’ve convinced you that Kubernetes is a great source of best practices for cloud-native database implementations, and the innovation continues. Solutions for federating Kubernetes clusters are still maturing, but will soon make it much simpler to manage multi-data center Cassandra clusters in Kubernetes. In the Cassandra community, we can work to make extensions for management and metrics a part of the core Apache project so that Cassandra is more naturally cloud-native for everyone, right out of the box.&lt;/p&gt;

&lt;p&gt;If you’re excited at the prospect of cloud-native databases on Kubernetes, you’re not alone. A group of like-minded individuals and organizations has assembled as the &lt;a href="https://dok.community/" rel="noopener noreferrer"&gt;Data on Kubernetes Community&lt;/a&gt;, which has hosted over 50 meetups in multiple languages since its inception last year. We’re grateful to MayaData for helping to start this community, and are excited to announce that DataStax has joined as a co-sponsor of the DoKC.&lt;/p&gt;

&lt;p&gt;In more great news, the DoKC was accepted as &lt;a href="https://community.cncf.io/data-on-kubernetes/" rel="noopener noreferrer"&gt;an official CNCF community group&lt;/a&gt;, and hosted the first-ever &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/program/colocated-events/#data-on-kubernetes-day" rel="noopener noreferrer"&gt;Data on Kubernetes Day&lt;/a&gt; as part of &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/" rel="noopener noreferrer"&gt;KubeCon/CloudNativeCon Europe&lt;/a&gt; on May 3. Rick Vasquez’s &lt;a href="https://dok.community/dokc-day-schedule/" rel="noopener noreferrer"&gt;talk&lt;/a&gt;, “A Call for DBMS to Modernize on Kubernetes,” lays down a challenge to make the architectural changes required to become truly cloud-native. Together, we’ll arrive at the best solutions through collaboration in open source communities like Kubernetes, Data on Kubernetes, Apache Cassandra and K8ssandra. Let’s lead with code and keep talking! If you’d like to try Cassandra quickly without running it on K8s yourself, try the managed &lt;a href="https://astra.dev/3cqZn1s" rel="noopener noreferrer"&gt;DataStax Astra DB&lt;/a&gt;, which is built on Apache Cassandra.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Put a Database in Kubernetes</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Thu, 10 Mar 2022 22:08:33 +0000</pubDate>
      <link>https://dev.to/datastax/how-to-put-a-database-in-kubernetes-27hl</link>
      <guid>https://dev.to/datastax/how-to-put-a-database-in-kubernetes-27hl</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ep7lk8lm72e0wj1vzbi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ep7lk8lm72e0wj1vzbi.png" alt="Image description" width="768" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Learn the key steps of deploying databases and stateful workloads in Kubernetes and meet the cloud-native technologies, like K8ssandra, that can streamline Apache Cassandra for K8s.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The idea of running a stateful workload in Kubernetes (K8s) can be intimidating, especially if you haven’t done it before. How do you deploy a database? Where is the actual storage? How is the storage mapped to the database or the application using it?&lt;/p&gt;

&lt;p&gt;At &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/" rel="noopener noreferrer"&gt;KubeCon North America 2021&lt;/a&gt;, I’ll be giving a talk on “&lt;a href="https://sched.co/lV3V" rel="noopener noreferrer"&gt;How to put a database in Kubernetes&lt;/a&gt;” where I demystify the deployment of databases and stateful workloads in K8s. Basically, it boils down to a few key steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Get to know the Kubernetes primitives&lt;/li&gt;
&lt;li&gt;Pick a database&lt;/li&gt;
&lt;li&gt;Pick a storage provider&lt;/li&gt;
&lt;li&gt;Pick an operator&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This blog is a sneak preview of my upcoming talk, which will take place in Los Angeles and be streamed online on October 12. If you’d like to join me at this year’s KubeCon, either virtually or in person, &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/register/" rel="noopener noreferrer"&gt;register here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In the meantime, this blog post dives into the key steps of deploying databases and stateful workloads in K8s. You can learn more about them during my talk, as well as in the &lt;a href="https://twitter.com/JessHaberman/status/1425898298959859712" rel="noopener noreferrer"&gt;upcoming O’Reilly book&lt;/a&gt;: &lt;em&gt;Managing Cloud Native Data on Kubernetes&lt;/em&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Get to know the Kubernetes primitives
&lt;/h1&gt;

&lt;p&gt;Simply put: databases are just applications composed of compute, network, and storage. We can deploy them like any other K8s application and take advantage of the resources Kubernetes provides: StatefulSets, Services, StorageClasses, PersistentVolumes, PersistentVolumeClaims, and more.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z46ruganyl915egblik.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1z46ruganyl915egblik.png" alt="Image description" width="800" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Figure 1: Kubernetes resources help us think of applications in terms of compute, network, and storage.&lt;/p&gt;

&lt;p&gt;Getting comfortable with using these primitives will help you understand how databases and other data infrastructure are deployed on K8s. For example, a deployment of &lt;a href="https://cassandra.apache.org/" rel="noopener noreferrer"&gt;Apache Cassandra®&lt;/a&gt; will typically use a StatefulSet to launch pods across available Kubernetes worker nodes, with each Cassandra pod having its own PersistentVolumeClaim that can be preserved and reused if the pod needs to be replaced.&lt;/p&gt;
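That deployment shape can be sketched as a manifest. This is a hedged, minimal illustration of the pattern (names, image tag, port, and storage size are placeholders, not a production configuration), written as the Python structure you could serialize to YAML:

```python
# Minimal sketch of a Cassandra StatefulSet with per-pod persistent storage.
# All names and sizes below are illustrative placeholders.
cassandra_statefulset = {
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {"name": "cassandra"},
    "spec": {
        "serviceName": "cassandra",  # headless Service giving each pod stable DNS
        "replicas": 3,
        "selector": {"matchLabels": {"app": "cassandra"}},
        "template": {
            "metadata": {"labels": {"app": "cassandra"}},
            "spec": {
                "containers": [{
                    "name": "cassandra",
                    "image": "cassandra:4.0",  # placeholder image tag
                    "ports": [{"containerPort": 9042, "name": "cql"}],
                    "volumeMounts": [{"name": "data", "mountPath": "/var/lib/cassandra"}],
                }],
            },
        },
        # Each pod gets its own PersistentVolumeClaim, preserved and reattached
        # if the pod is replaced -- the key to stateful workloads on K8s.
        "volumeClaimTemplates": [{
            "metadata": {"name": "data"},
            "spec": {
                "accessModes": ["ReadWriteOnce"],
                "resources": {"requests": {"storage": "100Gi"}},
            },
        }],
    },
}
```

The `volumeClaimTemplates` section is what distinguishes this from a stateless Deployment: it binds storage identity to pod identity rather than to any particular container instance.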

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ds8tozl1dzeiuki6nta.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ds8tozl1dzeiuki6nta.png" alt="Image description" width="768" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Figure 2: Simple deployment of Cassandra on Kubernetes using a StatefulSet.&lt;/p&gt;

&lt;p&gt;For a great example of using these primitives, check the reference example in the Kubernetes documentation of &lt;a href="https://kubernetes.io/docs/tutorials/stateful-application/cassandra/" rel="noopener noreferrer"&gt;deploying Cassandra using StatefulSets&lt;/a&gt;. We’re also building a &lt;a href="https://dtsx.io/3toxkEb" rel="noopener noreferrer"&gt;collection of examples on GitHub&lt;/a&gt; in association with the book project and would love to see your issues and pull requests.&lt;/p&gt;

&lt;p&gt;Once you’ve familiarized yourself with the basic building blocks of Kubernetes, there are three main considerations when setting up the right database for your application.&lt;/p&gt;

&lt;h1&gt;
  
  
  Pick a database
&lt;/h1&gt;

&lt;p&gt;To start, you’ll want to think about what &lt;em&gt;kind&lt;/em&gt; of database your application needs. To help you make the right choice, consider the following factors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database language:&lt;/strong&gt; does your application need SQL, NoSQL, developer-friendly data APIs?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capacity, performance, and scalability requirements&lt;/strong&gt;: will your data fit on a single node, or will you need a distributed database that can scale as your application grows?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment topology&lt;/strong&gt;: will your application be running in on-premises data centers, public clouds, or a mix of both?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Deciding on a database isn’t entirely independent from other decisions in your application design, and we’ll see more of this below. Note that your needs may also change as your application evolves.&lt;/p&gt;

&lt;h1&gt;
  
  
  Pick a storage provider
&lt;/h1&gt;

&lt;p&gt;Unless the database you choose is just a cache holding ephemeral data, you’ll need to configure your database to use persistent storage. If you’re using one of the public clouds, you’ll have storage options available such as Elastic Block Storage (EBS) volumes in AWS.&lt;/p&gt;

&lt;p&gt;However, there are many other options that are cloud-vendor independent. You can find a thriving ecosystem of K8s storage providers in the &lt;a href="https://landscape.cncf.io/card-mode?category=cloud-native-storage&amp;amp;grouping=category" rel="noopener noreferrer"&gt;Cloud-Native Storage category&lt;/a&gt; of the CNCF Landscape.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcg8gj6yg7auschrbtzt9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcg8gj6yg7auschrbtzt9.png" alt="Image description" width="800" height="231"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Figure 3: Cloud Native Storage projects on the CNCF Landscape as of September 2021.&lt;/p&gt;

&lt;p&gt;These include a number of options for managing both local and networked storage, in formats such as block, file, and object storage. You’ll likely be able to find sample code that shows how to configure your selected database to use your chosen storage provider. For example, here’s a &lt;a href="https://docs.openebs.io/docs/next/cassandra.html" rel="noopener noreferrer"&gt;tutorial on running Apache Cassandra on OpenEBS&lt;/a&gt;, a popular open-source storage provider for K8s that you can run in a variety of environments.&lt;/p&gt;
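In practice, wiring a database to a chosen storage provider usually comes down to naming a StorageClass in a PersistentVolumeClaim. A minimal sketch follows; `openebs-hostpath` is a class OpenEBS installs by default, but treat the class name and size as assumptions to adapt to your environment.

```python
# Hypothetical PVC binding a database pod to a provider-installed StorageClass.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "cassandra-data"},
    "spec": {
        "storageClassName": "openebs-hostpath",  # assumed provider-installed class
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "50Gi"}},
    },
}
```

Swapping storage providers then means changing one field, not redesigning the deployment: the database only ever sees a mounted volume.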

&lt;h1&gt;
  
  
  Pick an operator
&lt;/h1&gt;

&lt;p&gt;If you intend on running more than a small handful of nodes of your selected database, you’ll benefit from automating your operations by using a K8s Operator. You can find a wide variety of operators for databases and other applications at the &lt;a href="https://operatorhub.io/?category=Database" rel="noopener noreferrer"&gt;OperatorHub&lt;/a&gt;. When selecting an operator, you’ll want to make sure it’s open-source, and also check how actively it’s maintained.&lt;/p&gt;

&lt;p&gt;There are operators for most popular databases, such as the Zalando &lt;a href="https://postgres-operator.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;Postgres-operator&lt;/a&gt;, or &lt;a href="https://dtsx.io/3kQTmeW" rel="noopener noreferrer"&gt;Cass-operator&lt;/a&gt;, which the Apache Cassandra community &lt;a href="https://cassandra.apache.org/_/blog/Cassandra-and-Kubernetes-SIG-Update-2.html" rel="noopener noreferrer"&gt;has recently rallied around&lt;/a&gt;. Cass-operator is actually part of a larger project called &lt;a href="https://dtsx.io/2WUnNc4" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt;, which builds on that operator to create a more comprehensive data platform around Cassandra. This includes tooling for maintenance and backups, along with an open-source data gateway called &lt;a href="https://dtsx.io/3teXdq1" rel="noopener noreferrer"&gt;Stargate&lt;/a&gt; that supports a variety of developer-friendly APIs.&lt;/p&gt;

&lt;h1&gt;
  
  
  An alternate approach: Pick a managed service
&lt;/h1&gt;

&lt;p&gt;Of course, even with an operator, running a database in K8s yourself may be more than you want to take on, especially if you’re a smaller team looking to maximize your leverage.&lt;/p&gt;

&lt;p&gt;If this is you, you can still take advantage of one of the many managed database services available. If you need a highly scalable database combined with a great developer experience, &lt;a href="https://astra.dev/3KqAChA" rel="noopener noreferrer"&gt;DataStax Astra DB&lt;/a&gt; is a great choice. Astra DB is a managed Cassandra service that itself happens to be built on top of Kubernetes, and the Stargate APIs are available by default — even with a &lt;a href="https://astra.dev/3KqAChA" rel="noopener noreferrer"&gt;free Astra DB account&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Meet a community of cloud-native data practitioners
&lt;/h1&gt;

&lt;p&gt;No matter what choices you end up making for your K8s-deployed applications, you can find a group of passionate developers pushing the state of the art forward in the &lt;a href="https://dok.community/" rel="noopener noreferrer"&gt;Data on Kubernetes Community&lt;/a&gt; (DoKC). If you’re attending KubeCon North America, join us for &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/program/colocated-events/#data-on-kubernetes-day" rel="noopener noreferrer"&gt;DoK Day&lt;/a&gt; on Tuesday, October 12.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Register &lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/register/" rel="noopener noreferrer"&gt;here&lt;/a&gt; to join KubeCon North America 2021 and &lt;a href="https://dtsx.io/3zLW7V2" rel="noopener noreferrer"&gt;subscribe to our event alert&lt;/a&gt; to get notified about new DataStax workshops for developers, by developers. For exclusive posts on Cassandra, streaming, Kubernetes, and more, follow &lt;a href="https://dtsx.io/3kTIehj" rel="noopener noreferrer"&gt;DataStax on Medium&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Resources
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://astra.dev/3KqAChA" rel="noopener noreferrer"&gt;Astra DB — Managed Apache Cassandra as a Service&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dtsx.io/3teXdq1" rel="noopener noreferrer"&gt;Stargate APIs | GraphQL, REST, Document&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dtsx.io/3toxkEb" rel="noopener noreferrer"&gt;GitHub: Examples for Managing Cloud-Native Data on Kubernetes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dtsx.io/3kQTmeW" rel="noopener noreferrer"&gt;k8ssandra/cass-operator: The DataStax Kubernetes Operator for Apache Cassandra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/" rel="noopener noreferrer"&gt;KubeCon North America 2021&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dtsx.io/38IChhy" rel="noopener noreferrer"&gt;DataStax Academy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dtsx.io/2WWvizo" rel="noopener noreferrer"&gt;DataStax Workshops&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>Multi-cluster Cassandra deployment with Google Kubernetes Engine (Pt. 2)</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Tue, 01 Mar 2022 21:15:25 +0000</pubDate>
      <link>https://dev.to/datastax/multi-cluster-cassandra-deployment-with-google-kubernetes-engine-pt-2-2ffb</link>
      <guid>https://dev.to/datastax/multi-cluster-cassandra-deployment-with-google-kubernetes-engine-pt-2-2ffb</guid>
      <description>&lt;p&gt;This is the second in a series of posts examining patterns for using K8ssandra to create Cassandra clusters with different deployment topologies.&lt;/p&gt;

&lt;p&gt;In the &lt;a href="https://k8ssandra.io/blog/tutorials/deploy-a-multi-datacenter-apache-cassandra-cluster-in-kubernetes/" rel="noopener noreferrer"&gt;first article&lt;/a&gt; in this series, we looked at how you could create a Cassandra cluster with two datacenters in a single cloud region, using separate Kubernetes namespaces in order to isolate workloads. For example, you might want to create a secondary Cassandra datacenter to isolate a read-heavy analytics workload from the datacenter supporting your main application.&lt;/p&gt;

&lt;p&gt;In the rest of this series, we’ll explore additional configurations that promote high availability and accessibility of your data across various different network topologies, including hybrid and multi-cloud deployments. Our focus for this post will be on creating a Cassandra cluster running on Kubernetes clusters in multiple regions within a single cloud provider – in this case Google Cloud. If you worked through the first blog, many of the steps will be familiar.&lt;/p&gt;

&lt;p&gt;Note: For the purposes of this exercise, you’ll create GKE clusters in two separate regions under the same Google Cloud project, which makes it possible for both clusters to share the same network.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Preparing the first GKE Cluster&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;First, you’re going to need a Kubernetes cluster in which you can create the first Cassandra datacenter. To create this first cluster, follow the instructions for &lt;a href="https://docs.k8ssandra.io/install/gke/" rel="noopener noreferrer"&gt;K8ssandra on Google Kubernetes Engine (GKE)&lt;/a&gt;, which reference scripts provided as part of the &lt;a href="https://github.com/k8ssandra/k8ssandra-terraform/tree/main/gcp" rel="noopener noreferrer"&gt;K8ssandra GCP Terraform Example&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;When building this example for myself, I provided values for the environment variables used by the Terraform script to match my desired environment. Notice my initial GKE cluster is in the &lt;code&gt;us-west4&lt;/code&gt; region. You’ll want to change these values for your own environment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export TF_VAR_environment=dev
export TF_VAR_name=k8ssandra
export TF_VAR_project_id=&amp;lt;my project&amp;gt;
export TF_VAR_region=us-west4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After creating the GKE cluster, you can ignore further instructions on the &lt;a href="https://docs.k8ssandra.io/install/gke/" rel="noopener noreferrer"&gt;K8ssandra GKE docs page&lt;/a&gt; (the “Install K8ssandra” section and beyond), since you’ll be doing a custom K8ssandra installation. The Terraform script should automatically change your &lt;code&gt;kubectl&lt;/code&gt; context to the new cluster, but you can make sure by checking the output of &lt;code&gt;kubectl config current-context&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Creating the first Cassandra datacenter&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;First, a bit of upfront planning. It will be easier to manage our K8ssandra installs in different clusters if we use the same administrator credentials in each datacenter. Let’s create a namespace for the first datacenter and add a secret within the namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create namespace us-west4
kubectl create secret generic cassandra-admin-secret --from-literal=username=cassandra-admin --from-literal=password=cassandra-admin-password -n us-west4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that I chose to create a namespace matching the GCP region in which I’m deploying K8ssandra. This is done as part of enabling DNS between the GKE clusters, which is a topic that we’ll discuss in depth in a future post. You’ll want to specify a namespace corresponding to the region you’re using.&lt;/p&gt;

&lt;p&gt;The next step is to create a K8ssandra deployment for the first datacenter. You’ll need Helm installed for this step, as described on the &lt;a href="https://docs.k8ssandra.io/install/gke/" rel="noopener noreferrer"&gt;K8ssandra GKE docs page&lt;/a&gt;. Create the configuration for the first datacenter in a file called &lt;code&gt;dc1.yaml&lt;/code&gt;, making sure to change the affinity labels to match zones used in your GKE cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
 auth:
   superuser:
     secret: cassandra-admin-secret
 cassandraLibDirVolume:
   storageClass: standard-rwo
 clusterName: multi-region
 datacenters:
 - name: dc1
   size: 3
   racks:
   - name: rack1
     affinityLabels:
       failure-domain.beta.kubernetes.io/zone: us-west4-a
   - name: rack2
     affinityLabels:
       failure-domain.beta.kubernetes.io/zone: us-west4-b
   - name: rack3
     affinityLabels:
       failure-domain.beta.kubernetes.io/zone: us-west4-c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In addition to requesting 3 nodes in the datacenter, this configuration specifies an appropriate storage class for the GKE environment (&lt;code&gt;standard-rwo&lt;/code&gt;), and uses affinity to specify how the racks are mapped to GCP zones. Make sure to change the referenced zones to match your configuration. For more details, please reference the first &lt;a href="https://k8ssandra.io/blog/tutorials/deploy-a-multi-datacenter-apache-cassandra-cluster-in-kubernetes/" rel="noopener noreferrer"&gt;blog post&lt;/a&gt; in the series.&lt;/p&gt;
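&lt;p&gt;Before editing the affinity labels, it can help to confirm which zones your worker nodes actually landed in. This one-liner uses the same label key referenced in &lt;code&gt;dc1.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get nodes --label-columns failure-domain.beta.kubernetes.io/zone
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;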

&lt;p&gt;Deploy the release using this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install k8ssandra k8ssandra/k8ssandra -f dc1.yaml -n us-west4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This causes the K8ssandra release named &lt;code&gt;k8ssandra&lt;/code&gt; to be installed in the namespace &lt;code&gt;us-west4&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As with any Cassandra cluster deployment, you’ll want the first datacenter to be completely up before adding a second. Since your next step is creating additional infrastructure for the second datacenter, you probably won’t need to wait, but one simple way to confirm the datacenter is up is to watch until the Stargate pod shows as ready, since it depends on Cassandra being available:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n us-west4 kubectl get pods -n us-west4 --watch --selector app=k8ssandra-dc1-stargate
NAME                                                  READY   STATUS             RESTARTS   AGE
k8ssandra-dc1-stargate-58bf5657ff-ns5r7                     1/1     Running            0          15m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a good point to capture some information you’ll need below to configure the second Cassandra datacenter: seeds. In the first blog post in this series, we took advantage of a headless Kubernetes service that K8ssandra creates called the seed service, which points to a couple of the Cassandra nodes that can be used to bootstrap new nodes or datacenters into a Cassandra cluster. Since the seed nodes are labeled, you can look up their addresses directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n us-west4 -o jsonpath="{.items[*].status.podIP}" --selector cassandra.datastax.com/seed-node=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Which produces output that looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;10.56.6.8 10.56.5.8 10.56.4.7
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Record a couple of these IP addresses to use as seeds further down.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Preparing the second GKE cluster&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now you’ll need a second Kubernetes cluster to host the second Cassandra datacenter. The Terraform scripts used above to create the first GKE cluster also created a network and service account that should be reused for the second cluster. Instead of modifying the Terraform scripts to take these existing resources into account, you can create the new GKE cluster using the console or the &lt;code&gt;gcloud&lt;/code&gt; command line.&lt;/p&gt;

&lt;p&gt;For example, I chose the &lt;code&gt;us-central1&lt;/code&gt; region for my second cluster. First, I explicitly created a subnet in that region as part of the same network that Terraform created for the first datacenter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcloud compute networks subnets create dev-k8ssandra-subnet2 --network=dev-k8ssandra-network --range=10.2.0.0/20 --region=us-central1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then I created the second GKE cluster using that network and the same compute specs as the first cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcloud beta container clusters create "k8ssandra-2" --region "us-central1" --machine-type "e2-highmem-8" --disk-type "pd-standard" --disk-size "100" --num-nodes "1" --network dev-k8ssandra-network --subnetwork dev-k8ssandra-subnet2 --node-locations "us-central1-b","us-central1-c","us-central1-f"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Change the &lt;code&gt;kubectl&lt;/code&gt; context to the second datacenter. Typically you can obtain a command to do this by selecting the cluster in the GCP console and pressing the “Connect” button.&lt;/p&gt;
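&lt;p&gt;If you prefer the command line, &lt;code&gt;gcloud&lt;/code&gt; can update your &lt;code&gt;kubectl&lt;/code&gt; context directly. Here’s a sketch assuming the cluster name and region used above, and that &lt;code&gt;gcloud&lt;/code&gt; is already pointed at the same project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcloud container clusters get-credentials k8ssandra-2 --region us-central1
kubectl config current-context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;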

&lt;p&gt;Then you’ll need to create a firewall rule to allow traffic between the two clusters. I obtained the IP space of each subnet and the IP space of each GKE cluster and created a rule to allow all traffic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcloud compute firewall-rules create k8ssandra-multi-region-rule --direction=INGRESS --network=dev-k8ssandra-network --action=ALLOW --rules=all --source-ranges=10.0.0.0/20,10.2.0.0/20,10.56.0.0/14,10.24.0.0/14
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If desired, you could create a more targeted rule to only allow TCP traffic between ports used by Cassandra.&lt;/p&gt;
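&lt;p&gt;For example, a tighter rule might allow only TCP traffic on Cassandra’s standard ports: 7000 and 7001 for inter-node communication, and 9042 for CQL clients. The rule name here is hypothetical, and the source ranges should match your own subnets and clusters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gcloud compute firewall-rules create k8ssandra-cassandra-ports --direction=INGRESS --network=dev-k8ssandra-network --action=ALLOW --rules=tcp:7000,tcp:7001,tcp:9042 --source-ranges=10.0.0.0/20,10.2.0.0/20,10.56.0.0/14,10.24.0.0/14
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;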

&lt;h2&gt;
  
  
  &lt;strong&gt;Adding a second Cassandra datacenter&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let’s start by creating a namespace for the new datacenter matching the GCP region name. We also need to create administrator credentials to match those created for the first datacenter, since the secrets are not automatically replicated between clusters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create namespace us-central1
kubectl create secret generic cassandra-admin-secret --from-literal=username=cassandra-admin --from-literal=password=cassandra-admin-password -n us-central1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you’ll create a configuration to deploy an additional Cassandra datacenter &lt;code&gt;dc2&lt;/code&gt; in the new GKE cluster. For the nodes in &lt;code&gt;dc2&lt;/code&gt; to be able to join the Cassandra cluster, a few steps are required:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The first is one you’ve already taken care of: using the same Google Cloud network for both GKE clusters means the nodes in the new datacenter will be able to communicate with nodes in the original datacenter.&lt;/li&gt;
&lt;li&gt;Second, make sure to use the same Cassandra cluster name as for the first datacenter.&lt;/li&gt;
&lt;li&gt;Finally, you’ll need to provide the seed nodes you recorded earlier so that the nodes in the new datacenter know how to contact nodes in the first datacenter to join the cluster.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now create a configuration in a file called &lt;code&gt;dc2.yaml&lt;/code&gt;. Here’s what my file looked like; you’ll want to change the additional seeds and affinity labels to match your configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
 auth:
   superuser:
     secret: cassandra-admin-secret
 additionalSeeds: [ 10.56.2.14, 10.56.0.10 ]
 cassandraLibDirVolume:
   storageClass: standard-rwo
 clusterName: multi-region
 datacenters:
 - name: dc2
   size: 3
   racks:
   - name: rack1
     affinityLabels:
       failure-domain.beta.kubernetes.io/zone: us-central1-f
   - name: rack2
     affinityLabels:
       failure-domain.beta.kubernetes.io/zone: us-central1-b
   - name: rack3
     affinityLabels:
       failure-domain.beta.kubernetes.io/zone: us-central1-c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Like the configuration for &lt;code&gt;dc1&lt;/code&gt;, this configuration uses affinity, allocating racks across zones to make sure Cassandra nodes are spread evenly across the worker nodes. Deploy the release using a command such as this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install k8ssandra2 k8ssandra/k8ssandra -f dc2.yaml -n us-central1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you look at the resources in this namespace using a command such as &lt;code&gt;kubectl get services,pods&lt;/code&gt; you’ll note that there are a similar set of pods and services as for &lt;code&gt;dc1&lt;/code&gt;, including Stargate, Prometheus, Grafana, and Reaper. Depending on how you wish to manage your application, this may or may not be to your liking, but you are free to tailor the configuration to disable any components you don’t need.&lt;/p&gt;
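&lt;p&gt;For instance, the chart exposes enable/disable flags for individual components. The exact value names can differ between K8ssandra chart versions, so treat this as a sketch and confirm with &lt;code&gt;helm show values k8ssandra/k8ssandra&lt;/code&gt; first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade k8ssandra2 k8ssandra/k8ssandra -f dc2.yaml -n us-central1 --set reaper.enabled=false --set medusa.enabled=false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;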

&lt;h2&gt;
  
  
  &lt;strong&gt;Configuring Cassandra Keyspaces&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Once the second datacenter comes online, you’ll want to configure Cassandra keyspaces to replicate across both datacenters.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: You’ll likely need to first change your &lt;code&gt;kubectl&lt;/code&gt; context back to the first GKE cluster, for example using the &lt;code&gt;kubectl config use-context&lt;/code&gt; command. You can list existing contexts using &lt;code&gt;kubectl config get-contexts&lt;/code&gt;.&lt;/p&gt;
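&lt;p&gt;A typical sequence looks like the following. The context name shown is hypothetical; GKE context names follow a &lt;code&gt;gke_&amp;lt;project&amp;gt;_&amp;lt;region&amp;gt;_&amp;lt;cluster&amp;gt;&lt;/code&gt; pattern, so copy the exact name from the &lt;code&gt;get-contexts&lt;/code&gt; output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl config get-contexts
kubectl config use-context gke_my-project_us-west4_dev-k8ssandra-cluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;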

&lt;p&gt;To update keyspaces, connect to a node in the first datacenter and execute &lt;code&gt;cqlsh&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec multi-region-dc1-rack1-sts-0 cassandra -it -- cqlsh -u cassandra-admin -p cassandra-admin-password
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use the &lt;code&gt;DESCRIBE KEYSPACES&lt;/code&gt; command to list the keyspaces and the &lt;code&gt;DESCRIBE KEYSPACE &amp;lt;name&amp;gt;&lt;/code&gt; command to identify those using the &lt;code&gt;NetworkTopologyStrategy&lt;/code&gt;. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra-admin@cqlsh&amp;gt; DESCRIBE KEYSPACES
reaper_db      system_auth  data_endpoint_auth  system_traces
system_schema  system       system_distributed
cassandra-admin@cqlsh&amp;gt; DESCRIBE KEYSPACE system_auth
CREATE KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': '3'}  AND durable_writes = true;
…
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Typically you’ll find that the &lt;code&gt;system_auth&lt;/code&gt;, &lt;code&gt;system_traces&lt;/code&gt;, and &lt;code&gt;system_distributed&lt;/code&gt; keyspaces use &lt;code&gt;NetworkTopologyStrategy&lt;/code&gt;, as well as &lt;code&gt;data_endpoint_auth&lt;/code&gt; if you’ve enabled Stargate. You can then update the replication strategy to ensure data is replicated to the new datacenter. You’ll execute something like the following for each of these keyspaces:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ALTER KEYSPACE system_auth WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: Remember to create or alter the replication strategy for any keyspaces you need for your application so that you have the desired number of replicas in each datacenter.&lt;/p&gt;
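&lt;p&gt;For example, to create a hypothetical application keyspace named &lt;code&gt;my_app&lt;/code&gt; with three replicas in each datacenter, you could run a statement like this through &lt;code&gt;cqlsh&lt;/code&gt; on a node in the first datacenter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec -it multi-region-dc1-rack1-sts-0 -n us-west4 -c cassandra -- cqlsh -u cassandra-admin -p cassandra-admin-password -e "CREATE KEYSPACE IF NOT EXISTS my_app WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;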

&lt;p&gt;After exiting &lt;code&gt;cqlsh&lt;/code&gt;, make sure existing data is properly replicated to the new datacenter with the &lt;code&gt;nodetool rebuild&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: Remember to change your &lt;code&gt;kubectl&lt;/code&gt; context back to the second GKE cluster.&lt;/p&gt;

&lt;p&gt;Rebuild needs to be run on each node in the new datacenter, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec multi-region-dc2-rack1-sts-0 -n us-central1 -- nodetool --username cassandra-admin --password cassandra-admin-password rebuild dc1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repeat for the other nodes &lt;code&gt;multi-region-dc2-rack2-sts-0&lt;/code&gt; and &lt;code&gt;multi-region-dc2-rack3-sts-0&lt;/code&gt;.&lt;/p&gt;
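&lt;p&gt;Since the rebuild command is identical for each node, the three invocations can also be scripted as a small loop (the pod names are assumed to follow the rack naming shown above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;for pod in multi-region-dc2-rack1-sts-0 multi-region-dc2-rack2-sts-0 multi-region-dc2-rack3-sts-0; do
  kubectl exec "$pod" -n us-central1 -- nodetool --username cassandra-admin --password cassandra-admin-password rebuild dc1
done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;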

&lt;h2&gt;
  
  
  &lt;strong&gt;Testing the configuration&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let’s verify that the second datacenter has joined the cluster. To do this, pick a Cassandra node and run the &lt;code&gt;nodetool status&lt;/code&gt; command against it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec multi-region-dc2-rack1-sts-0 -n us-central1 cassandra -- nodetool --username cassandra-admin --password cassandra-admin-password status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This will produce output similar to the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns    Host ID                               Rack
UN  10.56.2.8    835.57 KiB  256          ?       8bc5cd4a-7953-497a-8ac0-e89c2fcc8729  rack1
UN  10.56.5.8    1.19 MiB   256          ?       fdd96600-5a7d-4c88-a5cc-cf415b3b79f0  rack2
UN  10.56.4.7    830.98 KiB  256          ?       d4303a9f-8818-40c2-a4b5-e7f2d6d78da6  rack3
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns    Host ID                               Rack
UN  10.24.4.99   418.52 KiB  256          ?       d2e71ab4-6747-4ac6-b314-eaaa76d3111e  rack3
UN  10.24.7.37   418.17 KiB  256          ?       24708e4a-61fc-4004-aee0-6bcc5533a48f  rack2
UN  10.24.1.214  398.22 KiB  256          ?       76c0d2ba-a9a8-46c0-87e5-311f7e05450a  rack1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If everything has been configured correctly, you’ll be able to see both datacenters in the cluster output. Here’s a picture that depicts what you’ve just deployed, focusing on the Cassandra nodes and networking:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fabo2j22g5aza4i27lxjq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fabo2j22g5aza4i27lxjq.png" alt="Image description" width="800" height="573"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What’s next&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In the following posts in this series, we’ll explore additional multi-datacenter topologies across multiple Kubernetes clusters, including Cassandra clusters in hybrid-cloud and multi-cloud deployments. We’ll also dive into more detail on networking and DNS configuration. We’d love to hear about the configurations you build, and please feel free to reach out with any questions on the &lt;a href="https://forum.k8ssandra.io/" rel="noopener noreferrer"&gt;forum&lt;/a&gt; or our &lt;a href="https://discord.gg/qP5tAt6Uwt" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; channel. We recommend trying it on the &lt;a href="https://astra.dev/3Maq8ER" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt; free plan for the fastest setup.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Why we decided to build a K8ssandra Operator - Part 4</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Tue, 22 Feb 2022 17:04:41 +0000</pubDate>
      <link>https://dev.to/datastax/why-we-decided-to-build-a-k8ssandra-operator-part-4-1dj6</link>
      <guid>https://dev.to/datastax/why-we-decided-to-build-a-k8ssandra-operator-part-4-1dj6</guid>
      <description>&lt;p&gt;In the &lt;a href="https://k8ssandra.io/blog/other/why_k8ssandra_operator_part_1/" rel="noopener noreferrer"&gt;first&lt;/a&gt;, &lt;a href="https://k8ssandra.io/blog/articles/why-k8ssandra-operator-part-2/" rel="noopener noreferrer"&gt;second&lt;/a&gt;, and &lt;a href="https://k8ssandra.io/blog/articles/why-we-decided-to-build-a-k8ssandra-operator-part-3/" rel="noopener noreferrer"&gt;third&lt;/a&gt; posts in this series, we’ve shared conversations with K8ssandra core team members on our journey to build a Kubernetes operator for K8ssandra. We’ve discussed the virtues of the Helm package manager versus Kubernetes operators for deploying and managing infrastructure in Kubernetes and some of our implementation choices for the operator.&lt;/p&gt;

&lt;p&gt;In this final post of the series, we pick up from the previous post with a discussion of how we decided to structure our projects in GitHub, how we test the K8ssandra operator, and our hopes for how the operator will expand the K8ssandra developer community.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Implications of operators for project structure&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; There are external projects that K8ssandra manages but that don’t have operators of their own. If I look in GitHub, I see Reaper under The Last Pickle organization, but Reaper Operator under K8ssandra. Is this another case where Stargate isn’t building an operator under its own org, but we’re building a Stargate operator under K8ssandra?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; Yes, but note that while we have separate repositories for Reaper Operator, Medusa Operator, Stargate Operator, we do plan to consolidate those into the K8ssandra operator. We’ll have multiple CRDs and multiple controllers. Because cass-operator is already used independently, it will continue to be independent and will be a dependency pulled into the K8ssandra operator.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; You’re saying there will be separate CRDs associated with Stargate, Reaper, and Medusa, but all managed by the K8ssandra operator. This makes me wonder: is there discussion in the Kubernetes operator world about monoliths versus microservices? Is there concern about building a monolithic operator?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; Absolutely. It’s not a microservice architecture per se, but it is highly decoupled and highly modular. Let’s say we wanted to take the Stargate controller and run that in its own separate pod. We could do that without impacting the code of the Reaper or K8ssandra operator, or the cass-operator controllers, it would just be a matter of repackaging it. They are decoupled and modular in that regard. That’s also driven by having distinct CRDs, because you’ll typically have a separate controller per CRD, and those controllers, for the most part, act in isolation from one another.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;How to test a Kubernetes operator&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; Are there any interesting considerations for testing an operator?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; The multi-cluster testing is going to present some challenges in terms of resource requirements. We’ve done a lot to make sure we can do all our automation and continuous integration with GitHub Actions using the free tier runners in GitHub, but this is not going to be sufficient in terms of resources for multi-cluster.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; We’re using Kind clusters for running most of our tests. We’ve put together some automation, in the form of setup scripts that will deploy and configure multiple Kind clusters for testing multi-cluster, but that’s just going to be too much for those free tier runners in GitHub. That presents some interesting challenges that we need to work through.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; For the CassKop operator from Orange, they’ve used a tool called &lt;a href="https://kuttl.dev/" rel="noopener noreferrer"&gt;Kuttl&lt;/a&gt;, which does full integration tests with YAML files. There was some discussion of this recently on our Discord server, and I think that will be something for us to look at. Not everyone will be a Go programmer or be familiar with the Kubernetes APIs in order to write tests, but everyone using K8ssandra should know at least a little bit about YAML. That would be a really awesome way for people to contribute and add a lot of value to the project without having to have that deep, intimate knowledge. That’s something I’d like to look into.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; Is the idea to describe the desired configuration as YAML, and that’s the spec for a test case?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; Yes, and the verification would be an additional YAML manifest.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; What about making specific Stargate API calls or CQL queries? Could it test those as well?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; No, it’s more along the lines of “here’s what I want to deploy,” like verifying a StatefulSet was created correctly. There are certainly going to be limitations because, in our tests, we do make calls to Stargate and run CQL queries and so forth. That’s beyond the scope of what a tool like Kuttl can do, but it would certainly cover some use cases.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; It sounds like this is more about setting up user-defined configurations, and the test passes once status gets to “Ready”.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; I think that would be a good example. Perhaps it would be a good candidate for user acceptance testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Automating operator testing&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; What amount of testing do you expect to automate? What will the K8ssandra CI/CD pipeline look like with the expected combination of Helm and the K8ssandra operator?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; Yes, there is automation involved. In terms of local development, the other tool that’s considered a counterpart to Helm is Kustomize. This is more of a declarative approach. It’s bundled as part of KubeBuilder and Operator SDK. You’re going to see Kustomize being used with the K8ssandra operator, and we already use it for testing scenarios. Applying this to the scenario of running unit tests locally, there’s a two-step process: first I run the build command to rebuild my operator image, then I’ll run another command that will use Kustomize to redeploy things. So while we can automate those steps, it’s still not as fast of a turnaround in terms of “wall clock” time, because you’re still having to rebuild an image.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; Sure, that’s a key difference between any case where you have a compiled language versus a scripted language.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Expanding the K8ssandra community&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; What does this push to build a K8ssandra operator mean for contributors outside of the core team?&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; Hopefully, this means that we see an increase in contributions, whether that’s in issue activity, on the forums, or on Discord. The evolution of the project is a maturation process. People will be looking to use K8ssandra to solve bigger, harder, more challenging problems. That will help to shape K8ssandra to be the solution to those problems.&lt;br&gt;
&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/Ok0VHiH2Px0"&gt;
&lt;/iframe&gt;
&lt;br&gt;
&lt;em&gt;John Sanda:&lt;/em&gt; Does it mean you have to be fluent in writing Go in order to get involved? Do you have to have experience with writing operators? That’s certainly helpful, but no, these things are not required. K8ssandra is still a big collection tying together various projects. There are many avenues for contributors to get involved. If nothing else, this opens the door for more contributions and hopefully bigger and better things for users and contributors.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; I agree with you. On the one hand, you could make the argument that having to learn Go is an obstacle to contributing. On the other hand, I’m watching some of the help requests that come from our community, and I can attest it can be semi-inscrutable to figure out what is happening with Helm.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; I also remember trying to make a change to see if I could modify the Helm templates to generate multiple Cassandra datacenters, and I thought I had the iterative looping down, but then struggled with the variable scope and pushing down the values that I needed. And that hour I spent was pretty enlightening.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeff Carpenter:&lt;/em&gt; I think that with Go, while you might have to spend some time spinning up on the language, that’s probably something you should learn anyway for modern, cloud-native backend development. For people that need to customize the project, it’s going to be a lot easier to do their own fork, which hopefully turns into a pull request back to the main project. It’s going to be a lot easier for them to do that in Go.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Sanda:&lt;/em&gt; I agree, and I think this is something that Jeff DiNoto brought up when we were trying to decide at what point we should commit to building an operator. For engineers and developers, this is going to resonate more. In terms of development and testing, the libraries and the frameworks you’ll use for writing unit tests in Go code are the same ones that you can use in Kubernetes. Overall, this will make it easier for folks to get involved, and hopefully, submit PRs.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Summary&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;That’s where our conversation ended, and it’s a perfect place to wrap up this series. The K8ssandra team is working hard on implementing the K8ssandra operator for a 2.0 release, but the amount of Go code is still quite manageable to read and learn. This is a great time to get involved in the project, and we’d love to give you a hand with setting it up and testing it out in your own environment. Please reach out in the #k8ssandra-dev channel in our &lt;a href="https://discord.gg/qP5tAt6Uwt" rel="noopener noreferrer"&gt;Discord&lt;/a&gt; server and we’ll help you get started! Curious to learn more about (or play with) Cassandra itself? We recommend trying it on the &lt;a href="https://astra.dev/3BHSCkm" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt; free plan for the fastest setup.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Unboxing K8ssandra: The Data Layer For Your Kubernetes-Powered Applications</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Tue, 01 Feb 2022 17:51:23 +0000</pubDate>
      <link>https://dev.to/datastax/unboxing-k8ssandra-the-data-layer-for-your-kubernetes-powered-applications-4dec</link>
      <guid>https://dev.to/datastax/unboxing-k8ssandra-the-data-layer-for-your-kubernetes-powered-applications-4dec</guid>
      <description>&lt;h4&gt;
  
  
  &lt;strong&gt;A Complimentary Live Webinar, Sponsored by DataStax&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Kubernetes made it easy to deploy and scale out your cloud-native applications. With &lt;a href="https://k8ssandra.io/" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt;, you can now scale application data with the same simplicity and high availability. Join us as we unbox K8ssandra, a cloud-native data layer for Kubernetes, and explore how you can deploy it alongside your applications.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.k8ssandra.io/install/" rel="noopener noreferrer"&gt;Install k8ssandra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Authenticate with &lt;a href="https://stargate.io/" rel="noopener noreferrer"&gt;Stargate&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Query your data via a convenient API (REST, Document, or GraphQL)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not familiar with Cassandra?  &lt;a href="https://astra.dev/3r070zA" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt; is a great (free) place to learn without any of the infrastructure setup or management headaches.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Speakers:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Christopher Bradford, Product Manager at DataStax&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Christopher Bradford is a Product Manager at DataStax with a role in everything Kubernetes. For years he has been immersed in the world of distributed systems and AP databases. Christopher loves a good challenge, and complex deployment models, network topologies, and the stitching together of cloud offerings never leave him in short supply.  &lt;/p&gt;

&lt;p&gt;Recently he has focused on making the deployment and management of Apache Cassandra (and supporting tools) trivial, through the open-source projects K8ssandra and cass-operator. Previous speaking engagements by Christopher include Cassandra Summit, Spark Summit, and Kong DevOps Summit, along with a number of meetups and webinars. Topics have ranged from geographic data replication to ETL pipelines with Cassandra, Spark, and Solr for 200 years of patent data.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Jeffrey Carpenter, Developer Relations at DataStax&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Jeff Carpenter works in Developer Relations at DataStax, where he uses his background in system architecture, microservices and Apache Cassandra to help empower developers and operations engineers to build distributed systems that are scalable, reliable, and secure. Jeff has worked on large-scale systems in the defense and hospitality industries and is co-author of Cassandra: The Definitive Guide.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;See the full workshop info at &lt;a href="https://linuxfoundation.org/webinars/unboxing-k8ssandra-the-data-layer-for-your-kubernetes-powered-applications/" rel="noopener noreferrer"&gt;The Linux Foundation&lt;/a&gt;, or watch the video directly, here:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;iframe width="710" height="399" src="https://www.youtube.com/embed/oFbyYlmDMRw"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

</description>
    </item>
    <item>
      <title>The search for a cloud-native database</title>
      <dc:creator>Jeffrey Carpenter</dc:creator>
      <pubDate>Tue, 09 Nov 2021 04:03:44 +0000</pubDate>
      <link>https://dev.to/datastax/the-search-for-a-cloud-native-database-440k</link>
      <guid>https://dev.to/datastax/the-search-for-a-cloud-native-database-440k</guid>
      <description>&lt;p&gt;The concept of “cloud-native” has come to stand for a collection of best practices for application logic and infrastructure, including databases. However, many of the databases supporting our applications have been around for decades, before the cloud or cloud-native was a thing. The data gravity associated with these legacy solutions has limited our ability to move applications and workloads. As we move to the cloud, how do we evolve our data storage approach? Do we need a cloud-native database? What would it even mean for a database to be cloud-native? Let’s take a look at these questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What is Cloud-Native?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;It’s helpful to start by defining terms. In unpacking “cloud-native”, let’s start with the word “native”. For individuals, the word may evoke thoughts of your first language, or your country of origin – things that feel natural to you. Or in nature itself, we might consider the native habitats inhabited by wildlife, and how each species is adapted to its environment. We can use this as a basis to understand the meaning of cloud-native.&lt;/p&gt;

&lt;p&gt;Here’s how the Cloud Native Computing Foundation (CNCF) &lt;a href="https://github.com/cncf/toc/blob/main/DEFINITION.md" rel="noopener noreferrer"&gt;defines the term&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;“Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.&lt;/p&gt;

&lt;p&gt;These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.”&lt;/p&gt;

&lt;p&gt;This is a rich definition, but it can be a challenge to use this to define what a cloud-native database is, as evidenced by the &lt;strong&gt;Database&lt;/strong&gt; section of the CNCF Landscape Map:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8h2l44quu94awxxdd861.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8h2l44quu94awxxdd861.png" alt="image" width="800" height="575"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Databases are just a small portion of a crowded cloud computing landscape&lt;/p&gt;

&lt;p&gt;Look closely, and you’ll notice a wide range of offerings: both traditional relational databases and NoSQL databases, supporting a variety of different data models including key/value, document, and graph. You’ll also find technologies that layer clustering, querying or schema management capabilities on top of existing databases. And this doesn’t even consider related categories in the CNCF landscape such as &lt;strong&gt;Streaming and Messaging&lt;/strong&gt; for data movement, or &lt;strong&gt;Cloud Native Storage&lt;/strong&gt; for persistence.&lt;/p&gt;

&lt;p&gt;Which of these databases are cloud-native? Only those that were designed for the cloud from the start, or should we include those that can be adapted to work in the cloud? Bill Wilder provides an interesting perspective in his 2012 book, “Cloud Architecture Patterns”, defining “cloud-native” as:&lt;/p&gt;

&lt;p&gt;“Any application that was architected to take full advantage of cloud platforms”&lt;/p&gt;

&lt;p&gt;By this definition, cloud-native databases are those that have been architected to take full advantage of underlying cloud infrastructure. Obvious? Maybe. Contentious? Probably…&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why should I care if my database is cloud-native?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Or to ask a different way, what are the advantages of a cloud-native database? Consider the two main factors driving the popularity of the cloud: cost and time-to-market.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost&lt;/strong&gt; – the ability to pay-as-you-go has been vital in increasing cloud adoption. (But that doesn’t mean that cloud is cheap or that cost management is always straightforward.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time-to-market&lt;/strong&gt; – the ability to quickly spin up infrastructure to prototype, develop, test, and deliver new applications and features. (But that doesn’t mean that cloud development and operations are easy.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These goals apply to your database selection, just as they do to any other part of your stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What are the characteristics of a cloud-native database?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now we can revisit the CNCF definition and extract characteristics of a cloud-native database that will help achieve our cost and time-to-market goals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt; – the system must be able to add capacity dynamically to absorb additional workload&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Elasticity&lt;/strong&gt; – it must also be able to scale back down, so that you only pay for the resources you need&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resiliency&lt;/strong&gt; – the system must survive failures without losing your data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt; – tracking your activity, but also health checking and handling failovers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation&lt;/strong&gt; – implementing operations tasks as repeatable logic to reduce the possibility of error. This characteristic is the most difficult to achieve, but is essential to sustaining a high delivery tempo at scale&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cloud-native databases are designed to embody these characteristics, which distinguish them from “cloud-ready” databases, that is, those that can be deployed to the cloud with some adaptation.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What’s a good example of a cloud-native database?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let’s test this definition of a cloud-native database by applying it to Apache Cassandra™ as an example. While the term “cloud-native” was not yet widespread when Cassandra was developed, it bears many of the same architectural influences, since it was inspired by systems built for public cloud infrastructure, as described in Amazon’s Dynamo paper and Google’s Bigtable paper. Because of this lineage, Cassandra embodies the principles outlined above:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cassandra demonstrates horizontal &lt;strong&gt;scalability&lt;/strong&gt; through adding nodes, and can be scaled down &lt;strong&gt;elastically&lt;/strong&gt; to free resources outside of peak load periods&lt;/li&gt;
&lt;li&gt;By default, Cassandra is an AP system, that is, it prioritizes availability and partition tolerance over consistency, as described in the CAP theorem. Cassandra’s built-in replication, shared-nothing architecture, and self-healing features help guarantee &lt;strong&gt;resiliency&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Cassandra nodes expose logging, metrics, and query tracing, which enable &lt;strong&gt;observability&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automation&lt;/strong&gt; is the most challenging aspect for Cassandra, as is typical for databases.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While automating the initial deployment of a Cassandra cluster is a relatively simple task, other tasks such as scaling up and down or upgrading can be time-consuming and difficult to automate. After all, even single-node database operations can be challenging, as many a DBA can testify. Fortunately, the K8ssandra project provides best practices for deploying Cassandra on Kubernetes, including major strides forward in automating “day 2” operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Does a cloud-native database have to run on Kubernetes?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Speaking of Kubernetes… When we talk about databases in the cloud, we’re really talking about stateful workloads requiring some kind of storage. But in the cloud world, stateful is painful. Data gravity is a real challenge – data may be hard to move due to regulations and laws, and moving it can be quite expensive. This results in a premium on keeping applications close to their data.&lt;/p&gt;

&lt;p&gt;The challenges only increase when we begin deploying containerized applications using Kubernetes, since it was not originally designed for stateful workloads. There’s an emerging push toward deploying databases to run on Kubernetes as well, in order to maximize development and operational efficiencies by running the entire stack on a single platform. What additional requirements does Kubernetes put on a cloud-native database?&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Containerization&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;First, the database must run in containers. This may sound obvious, but some work is required. Storage must be externalized, the memory and other computing resources must be tuned appropriately, and the application logs and metrics must be made available to infrastructure for monitoring and log aggregation.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Storage&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Next, we need to map the database’s storage needs onto Kubernetes constructs. At a minimum, each database node will make a persistent volume claim that Kubernetes can use to allocate a storage volume with appropriate capacity and I/O characteristics. Databases are typically deployed using Kubernetes StatefulSets, which help manage the mapping of storage volumes to pods and maintain a consistent, predictable identity.&lt;/p&gt;
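&lt;p&gt;To make this concrete, here is a minimal volume claim template of the kind a database StatefulSet might declare; the claim name, storage class, and size are illustrative, not project defaults:&lt;/p&gt;

```yaml
# Illustrative volumeClaimTemplates section of a StatefulSet spec.
# Kubernetes cuts one PersistentVolumeClaim per pod from this template,
# so each database node keeps its own volume across rescheduling.
volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]   # a single node mounts the volume read-write
      storageClassName: standard       # assumed storage class name
      resources:
        requests:
          storage: 100Gi               # capacity sized for the workload
```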

&lt;h3&gt;
  
  
  &lt;strong&gt;Automated Operations&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Finally, we need tooling to manage and automate database operations, including installation and maintenance. This is typically implemented via the Kubernetes operator pattern. Operators are basically control loops that observe the state of Kubernetes resources and take actions to help achieve a desired state. In this way they are similar to Kubernetes built-in controllers, but with the key difference that they understand domain-specific state and thus help Kubernetes make better decisions.&lt;/p&gt;

&lt;p&gt;For example, the K8ssandra project uses &lt;a href="https://github.com/datastax/cass-operator" rel="noopener noreferrer"&gt;cass-operator&lt;/a&gt;, which defines a Kubernetes custom resource definition (CRD) called “CassandraDatacenter” to describe the desired state of each top-level failure domain of a Cassandra cluster. This provides a level of abstraction higher than dealing with StatefulSets or individual pods.&lt;/p&gt;
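&lt;p&gt;The control-loop idea can be sketched in a few lines of Go. This toy reconcile function compares desired and observed state and decides the next action, which is the essence of the operator pattern; the &lt;code&gt;DatacenterSpec&lt;/code&gt; type and the scaling messages are illustrative stand-ins, not cass-operator’s actual API:&lt;/p&gt;

```go
package main

import "fmt"

// Desired and observed state for a toy "datacenter" resource,
// a simplified stand-in for a CRD like CassandraDatacenter.
type DatacenterSpec struct{ Replicas int }
type DatacenterStatus struct{ ReadyReplicas int }

// reconcile compares observed state to desired state and returns
// the action needed to converge, mirroring an operator's control loop.
func reconcile(spec DatacenterSpec, status DatacenterStatus) string {
	switch {
	case status.ReadyReplicas < spec.Replicas:
		return fmt.Sprintf("scale up: add %d node(s)", spec.Replicas-status.ReadyReplicas)
	case status.ReadyReplicas > spec.Replicas:
		return fmt.Sprintf("scale down: remove %d node(s)", status.ReadyReplicas-spec.Replicas)
	default:
		return "in sync: nothing to do"
	}
}

func main() {
	spec := DatacenterSpec{Replicas: 3}
	for _, observed := range []DatacenterStatus{{1}, {3}, {4}} {
		fmt.Println(reconcile(spec, observed))
		// prints:
		//   scale up: add 2 node(s)
		//   in sync: nothing to do
		//   scale down: remove 1 node(s)
	}
}
```

&lt;p&gt;A real operator runs this comparison continuously against the cluster, re-queuing whenever the watched resources change, which is what lets it encode domain-specific knowledge the built-in controllers lack.&lt;/p&gt;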

&lt;p&gt;Kubernetes database operators typically help to answer questions like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What happens during failovers? (pods, disks, networks)&lt;/li&gt;
&lt;li&gt;What happens when you scale out? (pod rescheduling)&lt;/li&gt;
&lt;li&gt;How are backups performed?&lt;/li&gt;
&lt;li&gt;How do we effectively detect and prevent failure?&lt;/li&gt;
&lt;li&gt;How is software upgraded? (rolling restarts)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion and what’s next&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A cloud-native database is one that is designed with cloud-native principles in mind, including scalability, elasticity, resiliency, observability, and automation. As we’ve seen with Cassandra, automation is often the final milestone to be achieved, but running databases in Kubernetes can actually help us progress toward this goal of automation.&lt;/p&gt;

&lt;p&gt;What’s next in the maturation of cloud-native databases? We’d love to hear your input as we continue to invent the future of this technology together.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
