<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alexander Dejanovski</title>
    <description>The latest articles on DEV Community by Alexander Dejanovski (@adejanovski).</description>
    <link>https://dev.to/adejanovski</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F755751%2F70e0175a-8701-46c6-83cc-e82e47981fa2.jpeg</url>
      <title>DEV Community: Alexander Dejanovski</title>
      <link>https://dev.to/adejanovski</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/adejanovski"/>
    <language>en</language>
    <item>
      <title>Reaper 3.0 for Apache Cassandra is available</title>
      <dc:creator>Alexander Dejanovski</dc:creator>
      <pubDate>Tue, 15 Mar 2022 19:34:16 +0000</pubDate>
      <link>https://dev.to/datastax/reaper-30-for-apache-cassandra-is-available-302m</link>
      <guid>https://dev.to/datastax/reaper-30-for-apache-cassandra-is-available-302m</guid>
      <description>&lt;p&gt;The &lt;a href="https://dtsx.io/3qPzVGq" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt; team is pleased to announce the release of &lt;a href="http://cassandra-reaper.io/" rel="noopener noreferrer"&gt;Reaper 3.1&lt;/a&gt;. Let’s dive into the features and improvements that 3.0 recently introduced (along with some notable removals) and how the newest update to 3.1 builds on that.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;JDK11 support&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Starting with 3.1.0, Reaper can now compile and run with jdk11. Note that jdk8 is still supported at runtime.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Storage backends&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Over the years, we regularly discussed dropping support for Postgres and H2 within &lt;a href="https://dtsx.io/3t0jB8g" rel="noopener noreferrer"&gt;The Last Pickle&lt;/a&gt; (TLP), now part of &lt;a href="https://dtsx.io/3pUplPa" rel="noopener noreferrer"&gt;DataStax&lt;/a&gt;, the organization leading the open-source development of Reaper. Despite our limited Postgres expertise, maintaining these storage backends took only moderate effort as long as Reaper’s architecture stayed simple. However, complexity grew with each new deployment option, culminating in the addition of the sidecar mode.&lt;/p&gt;

&lt;p&gt;Some features require different consensus strategies depending on the backend, which sometimes led to implementations that worked well with one backend and were buggy with others.&lt;/p&gt;

&lt;p&gt;In order to allow building new features faster, while providing a consistent experience for all users, we decided to drop the Postgres and H2 backends in 3.0.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://cassandra.apache.org/_/index.html" rel="noopener noreferrer"&gt;Apache Cassandra&lt;/a&gt; and the managed &lt;a href="https://astra.dev/3i4sJlO" rel="noopener noreferrer"&gt;DataStax Astra DB&lt;/a&gt; are now the only production storage backends for Reaper. The &lt;a href="https://astra.dev/3i4sJlO" rel="noopener noreferrer"&gt;free tier&lt;/a&gt; of Astra DB will be more than sufficient for most deployments.&lt;/p&gt;

&lt;p&gt;Reaper does not generally require high availability – even complete data loss has mild consequences. Where Astra is not an option, a single Cassandra server can be started on the instance that hosts Reaper, or an existing cluster can be used as a backend data store.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Adaptive Repairs and Schedules&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;One of the pain points we observed when people start using Reaper is understanding the segment orchestration and knowing how the default timeout impacts the execution of repairs.&lt;/p&gt;

&lt;p&gt;Repair is a complex choreography of operations in a distributed system. As such, and especially in the days when Reaper was created, the process could get blocked for several reasons and required a manual restart. The smart folks who designed Reaper at Spotify dealt with such blockages by putting a timeout on segments: any segment exceeding it is terminated and rescheduled.&lt;/p&gt;

&lt;p&gt;Problems arise when segments are too big (or have too much entropy) to process within the default 30-minute timeout, despite not being blocked. They are repeatedly terminated and recreated, and the repair appears to make no progress.&lt;/p&gt;

&lt;p&gt;Reaper did a poor job of dealing with this, for two main reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each retry used the same timeout, so oversized segments could fail forever&lt;/li&gt;
&lt;li&gt;Nothing obvious was reported to explain what was failing or how to fix it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We fixed the former by using a longer timeout on subsequent retries, which is a simple trick to make repairs more “adaptive”. If the segments are too big, they’ll eventually pass after a few retries. It’s a good first step to improve the experience, but it’s not enough for scheduled repairs as they could end up with the same repeated failures for each run.&lt;/p&gt;

&lt;p&gt;This is where we introduce adaptive schedules, which use feedback from past repair runs to adjust either the number of segments or the timeout for the next repair run.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuonqzgx3f0jim1r72vyg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuonqzgx3f0jim1r72vyg.png" alt="Image description" width="800" height="519"&gt;&lt;/a&gt;&lt;br&gt;
Figure 1: Example of how to use adaptive schedules in Reaper.&lt;/p&gt;

&lt;p&gt;Adaptive schedules will be updated at the end of each repair if the run metrics justify it. The schedule can get a different number of segments or a higher segment timeout depending on the latest run.&lt;/p&gt;

&lt;p&gt;The rules are the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If more than 20% of segments were extended, the number of segments on the schedule is raised by 20%.&lt;/li&gt;
&lt;li&gt;If fewer than 20% of segments were extended (and at least one), the timeout is doubled.&lt;/li&gt;
&lt;li&gt;If no segment was extended and the longest segment took less than 5 minutes, the number of segments is reduced by 10%, with a minimum of 16 segments per node.&lt;/li&gt;
&lt;/ul&gt;
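
&lt;p&gt;As a rough sketch, these rules boil down to a small decision function. The following Python is purely illustrative (function and parameter names are hypothetical, not Reaper’s actual code):&lt;/p&gt;

```python
def adapt_schedule(extended_segments, segment_count, max_segment_minutes,
                   timeout_minutes, nodes=1, min_per_node=16):
    """Illustrative version of the adaptive-schedule rules described above."""
    extended_ratio = extended_segments / segment_count
    if extended_ratio > 0.2:
        # Many segments timed out: split the work into 20% more segments.
        segment_count = int(segment_count * 1.2)
    elif extended_segments >= 1:
        # Only a few segments timed out: double the timeout instead.
        timeout_minutes *= 2
    elif max_segment_minutes < 5:
        # Everything finished quickly: use 10% fewer segments,
        # but never fewer than 16 segments per node.
        segment_count = max(int(segment_count * 0.9), min_per_node * nodes)
    return segment_count, timeout_minutes
```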

&lt;p&gt;This feature is disabled by default and is configurable on a per-schedule basis. The timeout can now be set differently for each schedule, from the UI or the REST API, instead of having to change the Reaper config file and restart the process.&lt;/p&gt;
&lt;h1&gt;
  
  
  &lt;strong&gt;Incremental Repair Triggers&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;As we celebrate the long awaited &lt;a href="https://dtsx.io/3JJSmVC" rel="noopener noreferrer"&gt;improvements in incremental repairs&lt;/a&gt; brought by Cassandra 4.0, it was time to embrace them with more appropriate triggers. One metric that incremental repair makes available is the percentage of repaired data per table. When running against too much unrepaired data, incremental repair can put a lot of pressure on a cluster due to the heavy anti-compaction process.&lt;/p&gt;

&lt;p&gt;The best practice is to run it on a regular basis so that the amount of unrepaired data is kept low. Since your throughput may vary from one table/keyspace to the other, it can be challenging to set the right interval for your incremental repair schedules.&lt;/p&gt;

&lt;p&gt;Reaper 3.0 introduced a new trigger for the incremental schedules, which is a threshold of unrepaired data. This allows creating schedules that will start a new run as soon as, for example, 10% of the data for at least one table from the keyspace is unrepaired.&lt;/p&gt;

&lt;p&gt;Those triggers are complementary to the interval in days, which could still be necessary for low traffic keyspaces that need to be repaired to secure tombstones.&lt;/p&gt;
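
&lt;p&gt;Put together, the decision to start a new incremental run can be sketched as follows (hypothetical names and a simplified model, not Reaper’s internals):&lt;/p&gt;

```python
def should_trigger_incremental(percent_repaired_by_table,
                               unrepaired_threshold=10,
                               days_since_last_run=0,
                               max_interval_days=None):
    """Trigger when any table has too much unrepaired data, or when the
    optional interval in days has elapsed (for low-traffic keyspaces)."""
    if any(100 - pct >= unrepaired_threshold
           for pct in percent_repaired_by_table.values()):
        return True
    return max_interval_days is not None and days_since_last_run >= max_interval_days
```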

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojkyknwrg59t8ba1ua0w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fojkyknwrg59t8ba1ua0w.png" alt="Image description" width="800" height="157"&gt;&lt;/a&gt;&lt;br&gt;
Figure 2: Setting interval for incremental repairs.&lt;/p&gt;

&lt;p&gt;These new features will allow you to securely optimize tombstone deletions by enabling the &lt;code&gt;only_purge_repaired_tombstones&lt;/code&gt; compaction subproperty in Cassandra, letting you reduce &lt;code&gt;gc_grace_seconds&lt;/code&gt; &lt;a href="https://dtsx.io/3eS0ftI" rel="noopener noreferrer"&gt;down to three hours&lt;/a&gt; without the concern that deleted data will reappear.&lt;/p&gt;
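
&lt;p&gt;For example, assuming a hypothetical table &lt;code&gt;my_keyspace.my_table&lt;/code&gt;, the subproperty and the reduced grace period could be set like this in cqlsh (keep your table’s existing compaction class; three hours is 10800 seconds):&lt;/p&gt;

```sql
ALTER TABLE my_keyspace.my_table
  WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                     'only_purge_repaired_tombstones': 'true'}
  AND gc_grace_seconds = 10800;
```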
&lt;h1&gt;
  
  
  &lt;strong&gt;Schedules can be edited&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;That may sound like an obvious feature but previous versions of Reaper didn’t allow for editing of an existing schedule. This led to an annoying procedure where you had to delete the schedule (which isn’t made easy by Reaper either) and recreate it with the new settings.&lt;/p&gt;

&lt;p&gt;Version 3.0 fixed that embarrassing situation by adding an edit button to schedules, which allows you to change their mutable settings:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9wgwrz89hlyts8r1ttd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9wgwrz89hlyts8r1ttd.png" alt="Image description" width="800" height="662"&gt;&lt;/a&gt;&lt;br&gt;
Figure 3: Reaper now has the ability to edit the settings for scheduled actions.&lt;/p&gt;
&lt;h1&gt;
  
  
  &lt;strong&gt;CVE fixes&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;With the release of Reaper 3.1.0, we were able to fix more than 80 reported CVEs by upgrading several dependencies to more current versions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.dropwizard.io/en/latest/" rel="noopener noreferrer"&gt;Dropwizard&lt;/a&gt; 2.0.25&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://shiro.apache.org/" rel="noopener noreferrer"&gt;Shiro&lt;/a&gt; 1.8.0&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/asomov/snakeyaml-engine" rel="noopener noreferrer"&gt;SnakeYAML&lt;/a&gt; 1.29&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://netty.io/" rel="noopener noreferrer"&gt;Netty&lt;/a&gt; 4.1.70.Final&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dtsx.io/32UMhVD" rel="noopener noreferrer"&gt;Cassandra Java Driver&lt;/a&gt; 3.11.0&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://eclipse-ee4j.github.io/jersey/" rel="noopener noreferrer"&gt;Jersey&lt;/a&gt; 2.33&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://prometheus.io/" rel="noopener noreferrer"&gt;Prometheus Simple Client&lt;/a&gt; 0.12.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes Reaper more secure and future-proof, as it now enables us to migrate from the deprecated &lt;a href="https://github.com/composable-systems/dropwizard-cassandra" rel="noopener noreferrer"&gt;dropwizard-cassandra&lt;/a&gt; bundle to the &lt;a href="https://github.com/dropwizard/dropwizard-cassandra" rel="noopener noreferrer"&gt;officially supported one&lt;/a&gt;, along with upgrading the Cassandra driver to the latest 4.x.&lt;/p&gt;
&lt;h1&gt;
  
  
  &lt;strong&gt;More improvements&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;In order to protect clusters from running mixed incremental and full repairs in older versions of Cassandra, Reaper would disallow the creation of an incremental repair run/schedule if a full repair had been created on the same set of tables in the past (and vice versa).&lt;/p&gt;

&lt;p&gt;Now that incremental repair is safe for production use, it is necessary to allow such mixed repair types. In case of conflict, Reaper 3.0 displays a pop-up informing you of it and allowing you to force-create the schedule/run:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pag484jynkaryawdspc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9pag484jynkaryawdspc.png" alt="Image description" width="800" height="310"&gt;&lt;/a&gt;&lt;br&gt;
Figure 4: Reaper now shows a pop-up to inform you of a conflict and allow you to force-create the schedule/run.&lt;/p&gt;

&lt;p&gt;We’ve also added a special “schema migration mode” for Reaper, which exits once the schema has been created or upgraded. We use this mode in K8ssandra to prevent schema conflicts: schema creation runs in an init container, which isn’t subject to the liveness probes that could otherwise trigger premature termination of the Reaper pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java -jar path/to/reaper.jar schema-migration path/to/cassandra-reaper.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There are many other improvements and we invite all users to check the changelog in the GitHub repo.&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Upgrade Now&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;We encourage all Reaper users to upgrade to 3.1.0, and recommend that users of Postgres or H2 carefully prepare their migration. Note that there is no export/import feature, so schedules will need to be recreated after the migration.&lt;/p&gt;

&lt;p&gt;All instructions to download, install, configure, and use Reaper 3.1.0 are available on the &lt;a href="https://cassandra-reaper.io/docs/download/" rel="noopener noreferrer"&gt;Reaper website&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Let us know what you think of Reaper 3.1 by joining us on the &lt;a href="https://dtsx.io/32McFBg" rel="noopener noreferrer"&gt;K8ssandra Discord&lt;/a&gt; or &lt;a href="https://dtsx.io/3HAo6us" rel="noopener noreferrer"&gt;K8ssandra Forum&lt;/a&gt; today. For exclusive posts on all things data, follow &lt;a href="https://dtsx.io/3sXkbUj" rel="noopener noreferrer"&gt;DataStax on Medium&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Resources&lt;/strong&gt;
&lt;/h1&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="http://cassandra-reaper.io/" rel="noopener noreferrer"&gt;Reaper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://cassandra-reaper.io/docs/download/" rel="noopener noreferrer"&gt;Reaper Documentation: Downloads and Installation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cassandra.apache.org/_/index.html" rel="noopener noreferrer"&gt;Apache Cassandra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://astra.dev/3i4sJlO" rel="noopener noreferrer"&gt;DataStax Astra DB&lt;/a&gt; &lt;/li&gt;
&lt;li&gt;&lt;a href="https://dtsx.io/3qPzVGq" rel="noopener noreferrer"&gt;K8ssandra&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;TLP Blog: &lt;a href="https://dtsx.io/3JJSmVC" rel="noopener noreferrer"&gt;Incremental Repair Improvements in Cassandra 4&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;TLP Blog: &lt;a href="https://dtsx.io/3eS0ftI" rel="noopener noreferrer"&gt;Hinted Handoff and GC Grace Demystified&lt;/a&gt;
&lt;/li&gt;
&lt;/ol&gt;

</description>
    </item>
    <item>
      <title>Cassandra Database Migration to Kubernetes with Zero Downtime</title>
      <dc:creator>Alexander Dejanovski</dc:creator>
      <pubDate>Tue, 15 Feb 2022 15:22:07 +0000</pubDate>
      <link>https://dev.to/datastax/cassandra-database-migration-to-kubernetes-with-zero-downtime-447k</link>
      <guid>https://dev.to/datastax/cassandra-database-migration-to-kubernetes-with-zero-downtime-447k</guid>
      <description>&lt;p&gt;K8ssandra is a cloud-native distribution of the Apache Cassandra® database that runs on Kubernetes, with a suite of tools to ease and automate operational tasks. In this post, we’ll walk you through a database migration from a Cassandra cluster running in AWS EC2 to a K8ssandra cluster running in Kubernetes on AWS EKS, with zero downtime.&lt;/p&gt;

&lt;p&gt;As an Apache Cassandra user, your expectation should be that migrating to K8ssandra happens without downtime. To achieve that with “classic” clusters running on virtual machines or bare-metal instances, you can use the datacenter (DC) switch technique, commonly used in the Cassandra community to move clusters to different hardware or environments. The good news is that it’s not very different for clusters running in Kubernetes, as most Container Network Interfaces (CNIs) provide routable pod IPs.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Routable pod IPs in Kubernetes&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A common misconception about Kubernetes networking is that services are the only way to expose pods outside the cluster and that pods themselves are only reachable directly from within the cluster.&lt;/p&gt;

&lt;p&gt;Looking at &lt;a href="https://docs.projectcalico.org/networking/determine-best-networking" rel="noopener noreferrer"&gt;the Calico documentation&lt;/a&gt;, we can read the following:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the pod IP addresses are routable outside of the cluster then pods can connect to the outside world without SNAT, and the outside world can connect directly to pods without going via a Kubernetes service or Kubernetes ingress.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The same documentation tells us that the default CNIs used in AWS EKS, Azure AKS, and GCP GKE provide routable pod IPs within a VPC.&lt;/p&gt;

&lt;p&gt;This is necessary because Cassandra nodes in both datacenters will need to be able to communicate with each other without having to go through services. Each Cassandra node stores the list of all the other nodes in the cluster in the &lt;code&gt;system.peers(_v2)&lt;/code&gt; table and communicates with them using the IP addresses that are stored there. If pod IPs aren’t routable, there’s no (easy) way to create a hybrid Cassandra cluster that would span outside of the boundaries of a Kubernetes cluster.&lt;/p&gt;
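
&lt;p&gt;You can see the addresses each node has recorded for its peers directly from cqlsh (output columns abbreviated):&lt;/p&gt;

```
cqlsh> SELECT peer, data_center, rack FROM system.peers;
```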

&lt;h2&gt;
  
  
  &lt;strong&gt;Database Migration using Cassandra Datacenter Switch&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The traditional technique for migrating a cluster to different hardware or a new environment is to add a new datacenter to the cluster, with its nodes located in the target infrastructure; configure keyspaces so that Cassandra replicates data to the new DC; switch traffic to the new DC once it’s up to date; and then decommission the old infrastructure.&lt;/p&gt;

&lt;p&gt;While this procedure was brilliantly documented by my co-worker Alain Rodriguez on &lt;a href="https://thelastpickle.com/blog/2019/02/26/data-center-switch.html" rel="noopener noreferrer"&gt;the TLP blog&lt;/a&gt;, there are some subtleties related to running our new datacenter in Kubernetes, and more precisely using K8ssandra, which we’ll cover in detail here.&lt;/p&gt;

&lt;p&gt;Here are the steps we’ll go through to perform the migration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Restrict traffic to the existing datacenter.&lt;/li&gt;
&lt;li&gt;Expand the Cassandra cluster by adding a new datacenter in a Kubernetes cluster using K8ssandra.&lt;/li&gt;
&lt;li&gt;Rebuild the newly created datacenter.&lt;/li&gt;
&lt;li&gt;Switch traffic over to the K8ssandra datacenter.&lt;/li&gt;
&lt;li&gt;Decommission the original Cassandra datacenter.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Performing the migration&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Initial State&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Our starting point is a Cassandra 4.0-rc1 cluster running in AWS on EC2 instances:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ nodetool status
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.4.217   10.2 GiB  16      100.0%            9a9b5e8f-c0c2-404d-95e1-372880e02c43  us-west-2c
UN  172.31.38.15   10.2 GiB  16      100.0%            1e6a9077-bb47-4584-83d5-8bed63512fd8  us-west-2b
UN  172.31.22.153  10.2 GiB  16      100.0%            d6488a81-be1c-4b07-9145-2aa32675282a  us-west-2a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the AWS console, we can access the details of a node in the EC2 service and locate its VPC id which we’ll need later to create a peering connection with the EKS cluster VPC:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F299kfazalv6d0udz05cz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F299kfazalv6d0udz05cz.png" alt="Image description" width="800" height="376"&gt;&lt;/a&gt;&lt;br&gt;
Finding the VPC id&lt;/p&gt;

&lt;p&gt;The next step is to create an EKS cluster with the right settings so that pod IPs will be reachable from the existing EC2 instances.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Creating the EKS cluster&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;We’ll use the &lt;a href="https://github.com/k8ssandra/k8ssandra-terraform" rel="noopener noreferrer"&gt;k8ssandra-terraform&lt;/a&gt; project to spin up an EKS cluster with 3 nodes (see &lt;a href="https://docs.k8ssandra.io/install/eks/" rel="noopener noreferrer"&gt;https://docs.k8ssandra.io/install/eks/&lt;/a&gt; for more information).&lt;/p&gt;

&lt;p&gt;After cloning the project locally, we initialize a few env variables to get started:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Optional if you're using the default profile
export AWS_PROFILE=eks-poweruser
export TF_VAR_environment=dev
# Must match the existing cluster name
export TF_VAR_name=adejanovski-migration-cluster
export TF_VAR_resource_owner=adejanovski
export TF_VAR_region=us-west-2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We go to the env directory and initialize our Terraform files:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cd env
terraform init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can then update the &lt;code&gt;variables.tf&lt;/code&gt; file and adjust it to our needs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;variable "instance_type" {
 description = "Type of instance to be used for the Kubernetes cluster."
 type        = string
 default     = "r5.2xlarge"
}
variable "desired_capacity" {
 description = "Desired capacity for the autoscaling Group."
 type        = number
 default     = 3
}
variable "max_size" {
 description = "Maximum number of the instances in autoscaling group"
 type        = number
 default     = 3
}
variable "min_size" {
 description = "Minimum number of the instances in autoscaling group"
 type        = number
 default     = 3
}
...
variable "private_cidr_block" {
 description = "List of private subnet cidr blocks"
 type        = list(string)
 default     = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Make sure the private CIDR blocks are different from the ones used in the EC2 cluster VPC, otherwise you may end up with IP address conflicts.&lt;/p&gt;
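
&lt;p&gt;Before applying the plan, you can quickly verify that the candidate blocks don’t overlap the EC2 VPC CIDR, for example with Python’s standard &lt;code&gt;ipaddress&lt;/code&gt; module (the CIDR values below are examples):&lt;/p&gt;

```python
import ipaddress

def overlapping_blocks(ec2_vpc_cidr, eks_private_cidrs):
    """Return the EKS subnet blocks that overlap the existing EC2 VPC CIDR."""
    vpc = ipaddress.ip_network(ec2_vpc_cidr)
    return [cidr for cidr in eks_private_cidrs
            if ipaddress.ip_network(cidr).overlaps(vpc)]

# Default AWS VPCs use 172.31.0.0/16, so the 10.0.x.0/24 blocks are safe.
print(overlapping_blocks("172.31.0.0/16",
                         ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]))
```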

&lt;p&gt;Now create the EKS cluster and the three worker nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform plan
terraform apply
...
# Answer "yes" when asked for confirmation
Do you want to perform these actions in workspace "eks-experiment"?
 Terraform will perform the actions described above.
 Only 'yes' will be accepted to approve.
 Enter a value: yes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The operation will take a few minutes to complete and output something similar to this at the end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Apply complete! Resources: 50 added, 0 changed, 0 destroyed.
Outputs:
bucket_id = "dev-adejanovski-migration-cluster-s3-bucket"
cluster_Endpoint = "https://FB2B5CD5D27F43B69B54.gr7.us-west-2.eks.amazonaws.com"
cluster_name = "dev-adejanovski-migration-cluster-eks-cluster"
cluster_version = "1.20"
connect_cluster = "aws eks --region us-west-2 update-kubeconfig --name dev-adejanovski-migration-cluster-eks-cluster"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the &lt;code&gt;connect_cluster&lt;/code&gt; command which will allow us to create the kubeconfig context entry to interact with the cluster using &lt;code&gt;kubectl&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% aws eks --region us-west-2 update-kubeconfig --name dev-adejanovski-migration-cluster-eks-cluster
Updated context arn:aws:eks:us-west-2:3373455535488:cluster/dev-adejanovski-migration-cluster-eks-cluster in /Users/adejanovski/.kube/config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can now check the list of worker nodes in our k8s cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
ip-10-0-1-107.us-west-2.compute.internal   Ready    &amp;lt;none&amp;gt;   5m   v1.20.4-eks-6b7464
ip-10-0-2-34.us-west-2.compute.internal    Ready    &amp;lt;none&amp;gt;   5m   v1.20.4-eks-6b7464
ip-10-0-3-239.us-west-2.compute.internal   Ready    &amp;lt;none&amp;gt;   5m   v1.20.4-eks-6b7464
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;VPC Peering and Security Groups&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Our Terraform scripts will create a specific VPC for the EKS cluster. In order for our Cassandra nodes to communicate with the K8ssandra nodes, we will need to create a peering connection between both VPCs. Follow the documentation provided by AWS on this topic to create the peering connection: &lt;a href="https://docs.aws.amazon.com/vpc/latest/peering/create-vpc-peering-connection.html" rel="noopener noreferrer"&gt;VPC Peering Connection&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Once the VPC peering connection is created and the route tables are updated in both VPCs, update the inbound rules of the security groups for both the EC2 Cassandra nodes and the EKS worker nodes to accept all TCP traffic on ports 7000 and 7001, which are used by Cassandra nodes to communicate with each other (unless configured otherwise).&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Preparing the Cassandra cluster for the expansion&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73dgarjq3gb84c3m72sm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F73dgarjq3gb84c3m72sm.png" alt="Image description" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
Original Cassandra cluster&lt;/p&gt;

&lt;p&gt;When expanding a Cassandra cluster to another DC, and assuming you haven’t created your cluster with the &lt;code&gt;&lt;a href="https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/architecture/archSnitchSimple.html" rel="noopener noreferrer"&gt;SimpleSnitch&lt;/a&gt;&lt;/code&gt; (otherwise you first have to &lt;a href="https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsSwitchSnitch.html" rel="noopener noreferrer"&gt;switch snitches&lt;/a&gt;), you need to make sure your keyspaces use &lt;code&gt;&lt;a href="https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/architecture/archDataDistributeReplication.html?hl=replication%2Cstrategy#archDataDistributeReplication__nts" rel="noopener noreferrer"&gt;NetworkTopologyStrategy&lt;/a&gt;&lt;/code&gt; (NTS). This replication strategy is the only one that is DC and rack aware. The default &lt;code&gt;&lt;a href="https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/architecture/archDataDistributeReplication.html?hl=replication%2Cstrategy#archDataDistributeReplication__simpleStrategy" rel="noopener noreferrer"&gt;SimpleStrategy&lt;/a&gt;&lt;/code&gt; ignores DCs and behaves as if all nodes were colocated in the same DC and rack.&lt;/p&gt;

&lt;p&gt;We’ll use &lt;code&gt;cqlsh&lt;/code&gt; on one of the EC2 Cassandra nodes to list the existing keyspaces and update their replication strategy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ cqlsh $(hostname)
Connected to adejanovski-migration-cluster at ip-172-31-22-153:9042
[cqlsh 6.0.0 | Cassandra 4.0 | CQL spec 3.4.5 | Native protocol v5]
Use HELP for help.
cqlsh&amp;gt; DESCRIBE KEYSPACES
system       system_distributed  system_traces  system_virtual_schema
system_auth  system_schema       system_views   tlp_stress
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Several system keyspaces use the special &lt;code&gt;LocalStrategy&lt;/code&gt; and are not replicated across nodes. They contain only node specific information and cannot be altered in any way.&lt;/p&gt;
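
&lt;p&gt;To see which strategy each keyspace currently uses, you can query the schema tables from cqlsh:&lt;/p&gt;

```
cqlsh> SELECT keyspace_name, replication FROM system_schema.keyspaces;
```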

&lt;p&gt;We’ll alter the following keyspaces to make them use NTS and only put replicas on the existing datacenter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;system_auth&lt;/code&gt; (contains user credentials for authentication purposes)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;system_distributed&lt;/code&gt; (contains repair history data and MV build status)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;system_traces&lt;/code&gt; (contains probabilistic tracing data)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;tlp_stress&lt;/code&gt; (user-created keyspace)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Add any other user-created keyspace to the list. Here we only have the &lt;code&gt;tlp_stress&lt;/code&gt; keyspace which was created by the &lt;a href="https://thelastpickle.com/tlp-stress/" rel="noopener noreferrer"&gt;tlp-stress&lt;/a&gt; tool to generate some data for the purpose of this migration.&lt;/p&gt;

&lt;p&gt;We will now run the following command on all the above keyspaces using the existing datacenter name, in our case &lt;code&gt;us-west-2&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cqlsh&amp;gt; ALTER KEYSPACE &amp;lt;keyspace_name&amp;gt; WITH replication = {'class': 'NetworkTopologyStrategy', 'us-west-2': 3};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
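&lt;p&gt;As a sketch, the statements for all four keyspaces can be generated with a small shell loop (the keyspace list and datacenter name below are the ones from this example; adjust them for your own cluster) and the output piped into cqlsh:&lt;/p&gt;

```shell
# Sketch: print the ALTER statement for each keyspace to migrate.
# Keyspace list and datacenter name are from this example cluster.
for ks in system_auth system_distributed system_traces tlp_stress; do
  echo "ALTER KEYSPACE ${ks} WITH replication = {'class': 'NetworkTopologyStrategy', 'us-west-2': 3};"
done
```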



&lt;p&gt;Make sure client traffic is pinned to the &lt;code&gt;us-west-2&lt;/code&gt; datacenter by specifying it as the local datacenter. This can be done with the &lt;code&gt;DCAwareRoundRobinPolicy&lt;/code&gt; in older versions of the DataStax drivers, or by specifying the local datacenter when creating a new &lt;code&gt;CqlSession&lt;/code&gt; object in the 4.x branch of the Java driver:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CqlSession session = CqlSession.builder()
   .withLocalDatacenter("us-west-2")
   .build();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;More information can be found in the drivers’ documentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Deploying K8ssandra as a new datacenter&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F10cumgn33yv6rpkvgnxl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F10cumgn33yv6rpkvgnxl.png" alt="Image description" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
Creating a K8ssandra deployment for the new datacenter&lt;/p&gt;

&lt;p&gt;K8ssandra ships with &lt;strong&gt;&lt;a href="https://github.com/k8ssandra/cass-operator" rel="noopener noreferrer"&gt;cass-operator&lt;/a&gt;&lt;/strong&gt;, which orchestrates the Cassandra nodes and handles their configuration. Cass-operator exposes an &lt;code&gt;additionalSeeds&lt;/code&gt; setting that lets us add seed nodes not managed by the local instance of cass-operator and, by doing so, create a new datacenter that expands an existing cluster.&lt;/p&gt;

&lt;p&gt;We will list all our existing Cassandra nodes as additional seeds; you should not need more than three entries in this list, even if your original cluster is larger. The following &lt;code&gt;migration.yaml&lt;/code&gt; values file will be used for our K8ssandra Helm chart:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
 version: "4.0.0"
 clusterName: "adejanovski-migration-cluster"
 allowMultipleNodesPerWorker: false
 additionalSeeds:
 - 172.31.4.217
 - 172.31.38.15
 - 172.31.22.153
 heap:
  size: 31g
 gc:
   g1:
     enabled: true
     setUpdatingPauseTimePercent: 5
     maxGcPauseMillis: 300
 resources:
   requests:
     memory: "59Gi"
     cpu: "7000m"
   limits:
     memory: "60Gi"
 datacenters:
 - name: k8s-1
   size: 3
   racks:
   - name: r1
     affinityLabels:
       topology.kubernetes.io/zone: us-west-2a
   - name: r2
     affinityLabels:
       topology.kubernetes.io/zone: us-west-2b
   - name: r3
     affinityLabels:
       topology.kubernetes.io/zone: us-west-2c
 ingress:
   enabled: false
 cassandraLibDirVolume:
   storageClass: gp2
   size: 3400Gi
stargate:
 enabled: false
medusa:
 enabled: false
reaper-operator:
 enabled: false
kube-prometheus-stack:
 enabled: false
reaper:
 enabled: false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that the cluster name must match the value used for the EC2 Cassandra nodes, and the datacenter must be named differently from the existing one(s). We will only install Cassandra in our K8ssandra datacenter, but other components could be deployed as well during this phase.&lt;/p&gt;

&lt;p&gt;Let’s deploy K8ssandra and have it join the Cassandra cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% helm install k8ssandra charts/k8ssandra -n k8ssandra --create-namespace -f ~/k8ssandra_demo/benchmarks.values.yaml
NAME: k8ssandra
LAST DEPLOYED: Thu Jul  1 09:46:54 2021
NAMESPACE: k8ssandra
STATUS: deployed
REVISION: 1
TEST SUITE: None
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can monitor the logs of the Cassandra pods to see if they’re joining appropriately:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs pod/adejanovski-migration-cluster-k8s-1-r1-sts-0 -c server-system-logger -n k8ssandra --follow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cass-operator will only start one node at a time, so if you see a message like the following, try checking the logs of another pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tail: can't open '/var/log/cassandra/system.log': No such file or directory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If VPC peering was set up correctly, the nodes should join the cluster one by one, and after a while &lt;code&gt;nodetool status&lt;/code&gt; should give an output that looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Datacenter: k8s-1
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.3.10      78.16 KiB  16      0.0%              c63b9b16-24fe-4232-b146-b7c2f450fcc6  r3
UN  10.0.2.66      69.14 KiB  16      0.0%              b1409a2e-cba1-482f-9ea6-c895bf296cd9  r2
UN  10.0.1.77      69.13 KiB  16      0.0%              78c53702-7a47-4629-a7bd-db41b1705bb8  r1
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.4.217   10.2 GiB   16      100.0%            9a9b5e8f-c0c2-404d-95e1-372880e02c43  us-west-2c
UN  172.31.38.15   10.2 GiB   16      100.0%            1e6a9077-bb47-4584-83d5-8bed63512fd8  us-west-2b
UN  172.31.22.153  10.2 GiB   16      100.0%            d6488a81-be1c-4b07-9145-2aa32675282a  us-west-2a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Rebuilding the new datacenter&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwep6y846uqsue3lkcdl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwep6y846uqsue3lkcdl.png" alt="Image description" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
Replicating data to the new datacenter by rebuilding&lt;/p&gt;

&lt;p&gt;Now that our K8ssandra datacenter has joined the cluster, we will alter the replication strategies to create replicas in the &lt;code&gt;k8s-1&lt;/code&gt; DC for the keyspaces we previously altered:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cqlsh&amp;gt; ALTER KEYSPACE &amp;lt;keyspace_name&amp;gt; WITH replication = {'class': 'NetworkTopologyStrategy', 'us-west-2': '3', 'k8s-1': '3'};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After altering all required keyspaces, rebuild the newly added nodes by running the following command for each Cassandra pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl exec -it pod/adejanovski-migration-cluster-k8s-1-r1-sts-0 -c cassandra -n k8ssandra -- nodetool rebuild us-west-2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
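&lt;p&gt;As a sketch, the rebuild command for all three pods can be generated with a loop (the pod names follow the naming pattern of this example cluster; the commands are printed with echo so they can be reviewed, then run by removing the leading echo):&lt;/p&gt;

```shell
# Sketch: print the rebuild command for each Cassandra pod (racks r1 to r3).
# Remove the leading 'echo' to actually execute each rebuild.
for rack in r1 r2 r3; do
  echo kubectl exec -it "pod/adejanovski-migration-cluster-k8s-1-${rack}-sts-0" \
    -c cassandra -n k8ssandra -- nodetool rebuild us-west-2
done
```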



&lt;p&gt;Once all three nodes are rebuilt, the load should be similar on all nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Datacenter: k8s-1
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.3.10      10.3 GiB   16      100.0%            c63b9b16-24fe-4232-b146-b7c2f450fcc6  r3
UN  10.0.2.66      10.3 GiB   16      100.0%            b1409a2e-cba1-482f-9ea6-c895bf296cd9  r2
UN  10.0.1.77      10.3 GiB   16      100.0%            78c53702-7a47-4629-a7bd-db41b1705bb8  r1
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.4.217   10.32 GiB  16      100.0%            9a9b5e8f-c0c2-404d-95e1-372880e02c43  us-west-2c
UN  172.31.38.15   10.32 GiB  16      100.0%            1e6a9077-bb47-4584-83d5-8bed63512fd8  us-west-2b
UN  172.31.22.153  10.32 GiB  16      100.0%            d6488a81-be1c-4b07-9145-2aa32675282a  us-west-2a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;&lt;em&gt;Note that K8ssandra will create a new superuser, and the existing users in the cluster will be retained after the migration. You can force K8ssandra to use the existing superuser credentials in the new datacenter by adding the following block to the “cassandra” section of the Helm values file:&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; auth:
   enabled: true
   superuser:
     secret: "superuser-password"
     username: "superuser-name"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Switching traffic to the new datacenter&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpehak9bnjp5mswv1ey6s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpehak9bnjp5mswv1ey6s.png" alt="Image description" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
Redirecting client traffic to the new datacenter&lt;/p&gt;

&lt;p&gt;Client traffic can now be directed at the &lt;code&gt;k8s-1&lt;/code&gt; datacenter, the same way we previously restricted it to &lt;code&gt;us-west-2&lt;/code&gt;. If your clients are running from within the Kubernetes cluster, use the cassandra service exposed by K8ssandra as a contact point for the driver. If the clients are running outside of the Kubernetes cluster, you’ll need to enable Ingress and configure it appropriately, which is outside the scope of this blog post and will be covered in a future one.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Decommissioning the old datacenter and finishing the migration&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2821jdg4oyx5qenqbwat.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2821jdg4oyx5qenqbwat.png" alt="Image description" width="800" height="600"&gt;&lt;/a&gt;&lt;br&gt;
Decommission the original datacenter&lt;/p&gt;

&lt;p&gt;Once all the client apps/services have been restarted, we can alter our keyspaces to only replicate them on &lt;code&gt;k8s-1&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cqlsh&amp;gt; ALTER KEYSPACE &amp;lt;keyspace_name&amp;gt; WITH replication = {'class': 'NetworkTopologyStrategy', 'k8s-1': '3'};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then &lt;code&gt;ssh&lt;/code&gt; into each of the Cassandra nodes in &lt;code&gt;us-west-2&lt;/code&gt; and run the following command to decommission them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% nodetool decommission
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
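&lt;p&gt;As a sketch, the three decommission commands (using the seed IPs from this example) can be printed for review; run them one node at a time, waiting for each node to fully leave the ring before starting the next:&lt;/p&gt;

```shell
# Sketch: print the decommission command for each EC2 node.
# Run these sequentially, one node at a time, never in parallel.
for ip in 172.31.4.217 172.31.38.15 172.31.22.153; do
  echo ssh "${ip}" nodetool decommission
done
```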



&lt;p&gt;They will appear as leaving (UL) while the decommission is running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Datacenter: k8s-1
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.3.10      10.3 GiB   16      100.0%            c63b9b16-24fe-4232-b146-b7c2f450fcc6  r3
UN  10.0.2.66      10.3 GiB   16      100.0%            b1409a2e-cba1-482f-9ea6-c895bf296cd9  r2
UN  10.0.1.77      10.3 GiB   16      100.0%            78c53702-7a47-4629-a7bd-db41b1705bb8  r1
Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.31.4.217   10.32 GiB  16      0.0%              9a9b5e8f-c0c2-404d-95e1-372880e02c43  us-west-2c
UN  172.31.38.15   10.32 GiB  16      0.0%              1e6a9077-bb47-4584-83d5-8bed63512fd8  us-west-2b
UL  172.31.22.153  10.32 GiB  16      0.0%              d6488a81-be1c-4b07-9145-2aa32675282a  us-west-2a
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The operation should be fairly fast as no streaming will take place since we no longer have keyspaces replicated on &lt;code&gt;us-west-2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Once all three nodes have been decommissioned, we should be left with the &lt;code&gt;k8s-1&lt;/code&gt; datacenter only:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Datacenter: k8s-1
=================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load      Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.3.10  10.3 GiB  16      100.0%            c63b9b16-24fe-4232-b146-b7c2f450fcc6  r3
UN  10.0.2.66  10.3 GiB  16      100.0%            b1409a2e-cba1-482f-9ea6-c895bf296cd9  r2
UN  10.0.1.77  10.3 GiB  16      100.0%            78c53702-7a47-4629-a7bd-db41b1705bb8  r1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As a final step, we can now delete the VPC peering connection which is no longer necessary.&lt;/p&gt;

&lt;p&gt;Note that the cluster can run in hybrid mode for as long as necessary. There’s no requirement to delete the &lt;code&gt;us-west-2&lt;/code&gt; datacenter if it makes sense to keep it alive.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We have seen that it is possible to migrate existing Cassandra clusters to K8ssandra without downtime by leveraging flat networking, allowing Cassandra nodes running in VMs to connect directly to Cassandra pods running in Kubernetes.&lt;/p&gt;

&lt;p&gt;Join &lt;a href="https://forum.k8ssandra.io/" rel="noopener noreferrer"&gt;our forum&lt;/a&gt; if you have any questions about the above procedure, and come speak with us directly &lt;a href="https://discord.com/invite/qP5tAt6Uwt" rel="noopener noreferrer"&gt;in Discord&lt;/a&gt;. Curious to learn more about (or play with) Cassandra itself? We recommend trying it on &lt;a href="https://astra.dev/3HZgdiY" rel="noopener noreferrer"&gt;Astra DB's&lt;/a&gt; free plan for the fastest setup.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Backing up K8ssandra with MinIO</title>
      <dc:creator>Alexander Dejanovski</dc:creator>
      <pubDate>Thu, 03 Feb 2022 22:49:57 +0000</pubDate>
      <link>https://dev.to/datastax/backing-up-k8ssandra-with-minio-265g</link>
      <guid>https://dev.to/datastax/backing-up-k8ssandra-with-minio-265g</guid>
      <description>&lt;p&gt;K8ssandra includes Medusa for Apache Cassandra® to handle backup and restore for your Cassandra nodes. Recently Medusa was upgraded to introduce support for all S3 compatible backends, including &lt;a href="https://min.io/" rel="noopener noreferrer"&gt;MinIO&lt;/a&gt;, the popular k8s-native object storage suite. Let’s see how to set up K8ssandra and MinIO to backup Cassandra in just a few steps.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Deploy MinIO&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Like K8ssandra, MinIO can be deployed simply through Helm.&lt;/p&gt;

&lt;p&gt;First, add the MinIO repository to your local list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add minio https://helm.min.io/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The MinIO Helm charts allow you to do several things at once at install time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set the credentials to access MinIO&lt;/li&gt;
&lt;li&gt;Create a bucket for your backups that can be set as default&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following command creates a &lt;code&gt;k8ssandra-medusa&lt;/code&gt; bucket, sets &lt;code&gt;minio_key/minio_secret&lt;/code&gt; as the credentials, and deploys MinIO in a new namespace called &lt;code&gt;minio&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install --set accessKey=minio_key,secretKey=minio_secret,defaultBucket.enabled=true,defaultBucket.name=k8ssandra-medusa minio minio/minio -n minio --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Creating the bucket is not mandatory at this stage and can be done through MinIO’s UI.&lt;/p&gt;

&lt;p&gt;After the &lt;code&gt;helm install&lt;/code&gt; command has completed, you should see something similar to this in the &lt;code&gt;minio&lt;/code&gt; namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl get all -n minio
NAME                        READY   STATUS    RESTARTS   AGE
pod/minio-5fd4dd687-gzr8j   1/1     Running   0          109s
NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/minio   ClusterIP   10.96.144.61   &amp;lt;none&amp;gt;        9000/TCP   109s
NAME                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/minio   1/1     1            1           109s
NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/minio-5fd4dd687   1         1         1       109s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using port forwarding, you can expose access to the MinIO UI in the browser on port 9000:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl port-forward service/minio 9000 -n minio
Forwarding from 127.0.0.1:9000 -&amp;gt; 9000
Forwarding from [::1]:9000 -&amp;gt; 9000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you can log in to MinIO at &lt;a href="http://localhost:9000/" rel="noopener noreferrer"&gt;http://localhost:9000&lt;/a&gt; using the credentials defined at install time (if you used the command above, they are &lt;code&gt;minio_key&lt;/code&gt; and &lt;code&gt;minio_secret&lt;/code&gt;):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyqfej9h1jpa4yjk06bxf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyqfej9h1jpa4yjk06bxf.png" alt="Image description" width="800" height="625"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once logged in, you can see that the &lt;code&gt;k8ssandra-medusa&lt;/code&gt; bucket was created and is currently empty:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90mty4npxa6muy6oo6h6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90mty4npxa6muy6oo6h6.png" alt="Image description" width="512" height="383"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Deploy K8ssandra&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now that MinIO is up and running, you can create a namespace for your K8ssandra installation and create a secret for Medusa to access the bucket. Create a &lt;code&gt;medusa_secret.yaml&lt;/code&gt; file with the following content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Secret
metadata:
name: medusa-bucket-key
type: Opaque
stringData:
# Note that this currently has to be set to medusa_s3_credentials!
medusa_s3_credentials: |-
  [default]
  aws_access_key_id = minio_key
  aws_secret_access_key = minio_secret
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now create the &lt;code&gt;k8ssandra&lt;/code&gt; namespace and the Medusa secret with the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create namespace k8ssandra
kubectl apply -f medusa_secret.yaml -n k8ssandra
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should now see the &lt;code&gt;medusa-bucket-key&lt;/code&gt; secret in the &lt;code&gt;k8ssandra&lt;/code&gt; namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl get secrets -n k8ssandra
NAME                  TYPE                                  DATA   AGE
default-token-twk5w   kubernetes.io/service-account-token   3      4m49s
medusa-bucket-key     Opaque                                1      45s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can then deploy K8ssandra with the following custom values file (default values will be used for anything not customized here):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;medusa:
 enabled: true
 storage: s3_compatible
 storage_properties:
     host: minio.minio.svc.cluster.local
     port: 9000
     secure: "False"
 bucketName: k8ssandra-medusa
 storageSecret: medusa-bucket-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save the above file as &lt;code&gt;k8ssandra_medusa_minio.yaml&lt;/code&gt; and then install K8ssandra with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install k8ssandra k8ssandra/k8ssandra -f k8ssandra_medusa_minio.yaml -n k8ssandra
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now wait for the Cassandra cluster to be ready by using the following &lt;code&gt;wait&lt;/code&gt; command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl wait --for=condition=Ready cassandradatacenter/dc1 --timeout=900s -n k8ssandra
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should now see a list of pods similar to this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl get pods -n k8ssandra
NAME                                                  READY   STATUS      RESTARTS   AGE
k8ssandra-cass-operator-547845459-dwg68               1/1     Running     0          6m36s
k8ssandra-dc1-default-sts-0                           3/3     Running     0          5m56s
k8ssandra-dc1-stargate-776f88f945-p9twg               0/1     Running     0          6m36s
k8ssandra-grafana-75b9cb64cc-kndtc                    2/2     Running     0          6m36s
k8ssandra-kube-prometheus-operator-5bdd97c666-qz5vv   1/1     Running     0          6m36s
k8ssandra-medusa-operator-d766d5b66-wjt7j             1/1     Running     0          6m36s
k8ssandra-reaper-5f9bbfc989-j59xk                     1/1     Running     0          2m48s
k8ssandra-reaper-operator-858cd89bdd-7gfjj            1/1     Running     0          6m36s
k8ssandra-reaper-schema-4gshj                         0/1     Completed   0          3m3s
prometheus-k8ssandra-kube-prometheus-prometheus-0     2/2     Running     1          6m32s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  &lt;strong&gt;Create some data and back it up&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Extract the username and password to access Cassandra (the password is different for each installation unless it is explicitly set at install time) into variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% username=$(kubectl get secret k8ssandra-superuser -n k8ssandra -o jsonpath="{.data.username}" | base64 --decode)
% password=$(kubectl get secret k8ssandra-superuser -n k8ssandra -o jsonpath="{.data.password}" | base64 --decode)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
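&lt;p&gt;For illustration, here is what the &lt;code&gt;base64 --decode&lt;/code&gt; step does on a sample value (the encoded string below is simply the base64 form of the default superuser name used in this example):&lt;/p&gt;

```shell
# Sketch: decode a base64-encoded secret value, as the pipeline above does.
# 'azhzc2FuZHJhLXN1cGVydXNlcg==' is the base64 encoding of 'k8ssandra-superuser'.
echo "azhzc2FuZHJhLXN1cGVydXNlcg==" | base64 --decode
```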



&lt;p&gt;Connect through CQLSH on one of the nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Copy/paste the following statements into the CQLSH prompt and press enter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CREATE KEYSPACE medusa_test  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
USE medusa_test;
CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT, state TEXT);
INSERT INTO users (email, name, state) VALUES ('alice@example.com', 'Alice Smith', 'TX');
INSERT INTO users (email, name, state) VALUES ('bob@example.com', 'Bob Jones', 'VA');
INSERT INTO users (email, name, state) VALUES ('carol@example.com', 'Carol Jackson', 'CA');
INSERT INTO users (email, name, state) VALUES ('david@example.com', 'David Yang', 'NV');
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check that the rows were properly inserted:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT * FROM medusa_test.users;
email             | name          | state
-------------------+---------------+-------
alice@example.com |   Alice Smith |    TX
  bob@example.com |     Bob Jones |    VA
david@example.com |    David Yang |    NV
carol@example.com | Carol Jackson |    CA
(4 rows)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now back up this data and check that files get created in your MinIO bucket.&lt;/p&gt;

&lt;p&gt;To that end, use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install my-backup k8ssandra/backup -n k8ssandra --set name=backup1,cassandraDatacenter.name=dc1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since the backup operation is asynchronous, you can monitor its completion by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get cassandrabackup backup1 -n k8ssandra -o jsonpath={.status.finishTime}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As long as this doesn’t output a date and time, the backup is still running. Given the small amount of data and the locally accessible backend, it should complete quickly.&lt;/p&gt;

&lt;p&gt;Now refresh the MinIO UI and you should see some files in the &lt;code&gt;k8ssandra-medusa&lt;/code&gt; bucket:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiykkb88r21o46q5ouzav.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiykkb88r21o46q5ouzav.png" alt="Image description" width="512" height="197"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An index folder (Medusa’s backup index) should appear, along with one folder per Cassandra node in the cluster (in this case, a single node).&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Deleting the data and restoring the backup&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;TRUNCATE&lt;/code&gt; the table and verify it is empty:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u $username -p $password
TRUNCATE medusa_test.users;
SELECT * FROM medusa_test.users;
email | name | state
-------+------+-------
(0 rows)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now restore the backup taken previously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install restore-test k8ssandra/restore --set name=restore-backup1,backup.name=backup1,cassandraDatacenter.name=dc1 -n k8ssandra
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This operation will take a little longer, as it requires stopping the StatefulSet pod and performing the restore in an init container before the Cassandra container can start. You can monitor progress using this command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;watch -d kubectl get cassandrarestore restore-backup1 -o jsonpath={.status} -n k8ssandra
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The restore operation is fully completed once the &lt;code&gt;finishTime&lt;/code&gt; value appears in the output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{"finishTime":"2021-03-23T13:58:36Z","restoreKey":"83977399-44dd-4752-b4c4-407273f0339e","startTime":"2021-03-23T13:55:35Z"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Check that you can read the data from the previously truncated table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;% kubectl exec -it k8ssandra-dc1-default-sts-0 -n k8ssandra -c cassandra -- cqlsh -u k8ssandra-superuser -p XHsZ943WBg5RPNhVAT8x -e "SELECT * FROM medusa_test.users"
email             | name          | state
-------------------+---------------+-------
alice@example.com |   Alice Smith |    TX
  bob@example.com |     Bob Jones |    VA
david@example.com |    David Yang |    NV
carol@example.com | Carol Jackson |    CA
(4 rows)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You’ve successfully restored your lost data in just a few commands!&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Many backends available&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;MinIO, while an obvious choice in the Kubernetes world, is not the only S3-compatible backend that K8ssandra can use. K8ssandra has supported AWS S3 and Google Cloud Storage as Medusa backends since 1.0.0. There are also a wide variety of solutions that can run on-prem (including CEPH, Cloudian, Riak S2, and Dell EMC ECS) or in cloud environments (including IBM Cloud Object Storage and OVHcloud Object Storage). See the &lt;a href="https://docs.k8ssandra.io/tasks/backup-restore/" rel="noopener noreferrer"&gt;K8ssandra backup/restore documentation&lt;/a&gt; for more detailed instructions, and let us know if you have questions; we love to help! If you are looking to learn Cassandra, or want to see how backups are handled on a managed Cassandra service, head over to the &lt;a href="https://astra.dev/3Gety66" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt; website and try the free tier.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Requirements for running K8ssandra for development</title>
      <dc:creator>Alexander Dejanovski</dc:creator>
      <pubDate>Thu, 13 Jan 2022 17:27:05 +0000</pubDate>
      <link>https://dev.to/datastax/requirements-for-running-k8ssandra-for-development-4eaf</link>
      <guid>https://dev.to/datastax/requirements-for-running-k8ssandra-for-development-4eaf</guid>
      <description>&lt;p&gt;``# &lt;strong&gt;Managing expectations&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The K8ssandra &lt;a href="https://k8ssandra.io/get-started/" rel="noopener noreferrer"&gt;Quick start&lt;/a&gt; is an excellent guide for doing a full installation of K8ssandra on a dev laptop and trying out the various components of the K8ssandra stack. While this is a great way to get your first hands-on experience with K8ssandra, let’s state the obvious: running K8ssandra locally on a dev laptop is not aimed at performance. In this blog post, we will start Apache Cassandra® locally and then explain how to run benchmarks that help evaluate the level of performance (especially throughput) you can expect from a dev laptop deployment.&lt;/p&gt;

&lt;p&gt;Our goal was to achieve the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Run the whole stack, if possible, with at least three Cassandra nodes and one Stargate node. K8ssandra ships with the following open source components:

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://cassandra.apache.org/" rel="noopener noreferrer"&gt;Apache Cassandra®&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://stargate.io/" rel="noopener noreferrer"&gt;Stargate&lt;/a&gt; : API framework and data gateway (CQL, REST, GraphQL)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/thelastpickle/cassandra-medusa" rel="noopener noreferrer"&gt;Medusa for Apache Cassandra®&lt;/a&gt; : Backup and restore tool for Cassandra&lt;/li&gt;
&lt;li&gt;
&lt;a href="http://cassandra-reaper.io/" rel="noopener noreferrer"&gt;Reaper for Apache Cassandra®&lt;/a&gt; : Repair orchestration tool for Cassandra&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/datastax/metric-collector-for-apache-cassandra" rel="noopener noreferrer"&gt;Metrics Collector for Apache Cassandra&lt;/a&gt; : Metric collection and Dashboards for Apache Cassandra (2.2, 3.0, 3.11, 4.0) clusters.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/datastax/management-api-for-apache-cassandra" rel="noopener noreferrer"&gt;Management API for Apache Cassandra&lt;/a&gt; : Secure Management Sidecar for Apache Cassandra&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/datastax/cass-operator" rel="noopener noreferrer"&gt;Cass Operator&lt;/a&gt; : Kubernetes Operator for Apache Cassandra&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/k8ssandra/medusa-operator" rel="noopener noreferrer"&gt;Medusa Operator&lt;/a&gt; : Kubernetes Operator for Medusa&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/k8ssandra/reaper-operator" rel="noopener noreferrer"&gt;Reaper Operator&lt;/a&gt; : Kubernetes Operator for Reaper&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack" rel="noopener noreferrer"&gt;kube-prometheus-stack&lt;/a&gt; chart:

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://prometheus.io/" rel="noopener noreferrer"&gt;Prometheus&lt;/a&gt; : Monitoring system &amp;amp; time series database&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://grafana.com/" rel="noopener noreferrer"&gt;Grafana&lt;/a&gt; : Fully composable observability stack&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;li&gt;Achieve reasonable startup times&lt;/li&gt;

&lt;li&gt;Specify a dev setup stable enough to sustain moderate workloads (50 to 100 ops/s)&lt;/li&gt;

&lt;li&gt;Come up with some minimum requirements and recommended K8ssandra settings&lt;/li&gt;

&lt;/ul&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Using the right settings&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Cassandra can run with fairly limited resources as long as you don’t put too much pressure on it. For example, for the Reaper project, we run our integration tests with &lt;a href="https://github.com/riptano/ccm" rel="noopener noreferrer"&gt;CCM (Cassandra Cluster Manager)&lt;/a&gt;, configured with a &lt;a href="https://github.com/thelastpickle/cassandra-reaper/blob/master/.github/scripts/configure-ccm.sh#L22" rel="noopener noreferrer"&gt;256MB heap&lt;/a&gt;. This allows the JVM to allocate an additional 256MB of off-heap memory, letting Cassandra use up to 512MB of RAM.&lt;/p&gt;

&lt;p&gt;If we want to run K8ssandra with limited resources, we’ll need to set these appropriately in our Helm values files.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Setting heap sizes in K8ssandra&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The K8ssandra Helm charts allow us to set heap sizes for both the Cassandra and Stargate pods separately.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Cassandra&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;For Cassandra, the heap and new gen sizes can be set at the cluster level, or at the datacenter level (K8ssandra will support multi DC deployments in a future release):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
  version: "3.11.10"
  ...
  ...
  # Cluster level heap settings
  heap: {}
  #size:
  #newGenSize:
  datacenters:
  - name: dc1
    size: 3
    ...
    ...
    # Datacenter level heap settings
    heap: {}
    #size:
    #newGenSize:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;By default, these values aren’t set, which lets Cassandra perform its own computations based on the available RAM, applying the following formula:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The catch when you run several Cassandra nodes on the same machine is that each one sees the total available RAM without being aware that other Cassandra nodes are running alongside it. When allocating 8GB of RAM to Docker, each Cassandra node will compute a 2GB heap for itself. With a three-node cluster, that’s already 6GB of RAM used, not accounting for the additional off-heap memory each JVM can use. That doesn’t leave much RAM for the other components K8ssandra includes, such as Grafana, Prometheus and Stargate.&lt;/p&gt;
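&lt;p&gt;To make that concrete, here is a small sketch (our own illustration, not K8ssandra code) of the default heap formula, showing that each node on a host with 8GB of RAM computes a 2GB heap for itself:&lt;/p&gt;

```python
def default_cassandra_heap_mb(ram_mb):
    # max(min(1/2 ram, 1024MB), min(1/4 ram, 8GB)), as described above
    return max(min(ram_mb // 2, 1024), min(ram_mb // 4, 8192))

# Each of the 3 nodes sees the full 8GB allocated to Docker
per_node_heap = default_cassandra_heap_mb(8 * 1024)
print(per_node_heap)  # 2048, i.e. a 2GB heap per node, 6GB total for 3 nodes
```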

&lt;p&gt;The takeaway here: &lt;strong&gt;leaving heap settings blank is not a good idea for a dev environment&lt;/strong&gt; in particular, where several Cassandra instances are colocated on the same host machine. (By default, K8ssandra does not allow multiple Cassandra nodes on the same Kubernetes worker node. For this post, we’re using kind to run multiple worker nodes on the same OS instance, or virtual machine in the case of Docker Desktop.)&lt;/p&gt;

&lt;p&gt;The chosen heap size directly impacts the throughput you can expect to achieve (although it’s not the only limiting factor). A small heap means more frequent garbage collections, which generate more stop-the-world pauses and directly hurt throughput and latency. It also increases the odds of running out of memory if the workload is too heavy, as objects cannot be reclaimed fast enough for the available heap space.&lt;/p&gt;

&lt;p&gt;Setting the heap size at 500MB with 200MB of new gen globally for the cluster would be done as follows:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
  version: "3.11.10"
  ...
  ...
  # Cluster level heap settings
  heap:
    size: 500M
    newGenSize: 200M
  datacenters:
  - name: dc1
    size: 3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Stargate&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Because Stargate nodes are special coordinator-only Cassandra nodes that run on the JVM, it is also necessary to set their max heap size:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;stargate:
  enabled: true
  version: "1.0.9"
  replicas: 1
  ...
  ...
  heapMB: 256
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Stargate nodes follow the same rule for off-heap memory: the JVM is allowed to use as much RAM off heap as the configured heap size.&lt;/p&gt;

&lt;p&gt;As Stargate serves as a coordinator, it is likely to hold objects on the heap for longer, waiting for all nodes to respond to a query before it can acknowledge it and potentially return a result set to the client. It needs enough heap to do so without excessive garbage collection. Unlike Cassandra, Stargate doesn’t compute a heap size based on the available RAM; the value must be set explicitly.&lt;/p&gt;

&lt;p&gt;During our tests, we observed that 256MB was a good initial value to have stable Stargate pods. In production you might want to tune this value for optimal performance.&lt;/p&gt;
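&lt;p&gt;Applying the same heap-plus-off-heap rule of thumb to the whole cluster gives a rough (and deliberately simplified) memory budget, which explains why 8GB allocated to Docker is comfortable while 4GB is tight:&lt;/p&gt;

```python
# Heap sizes used in this post (MB)
cassandra_heap_mb = 500
stargate_heap_mb = 256

# Rule of thumb from above: each JVM may use about as much off-heap as heap
cassandra_node_mb = 2 * cassandra_heap_mb
stargate_node_mb = 2 * stargate_heap_mb

# Three Cassandra nodes plus one Stargate node
total_mb = 3 * cassandra_node_mb + stargate_node_mb
print(total_mb)  # 3512MB for the JVMs alone, before Prometheus, Grafana, etc.
```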

&lt;h1&gt;
  
  
  &lt;strong&gt;Benchmark Environment&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Our setup for running benchmarks was the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apple MacBook Pro 2019 – i7 (6 cores) – 32GB RAM – 512GB SSD&lt;/li&gt;
&lt;li&gt;Docker desktop 3.1.0&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://kind.sigs.k8s.io/" rel="noopener noreferrer"&gt;Kind&lt;/a&gt; 0.7.0&lt;/li&gt;
&lt;li&gt;Kubernetes 1.17.11&lt;/li&gt;
&lt;li&gt;kubectl v1.20.2&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Note that this is a fairly powerful environment: our tests ran on a 2019 Apple MacBook Pro with a 6-core i7 CPU and 32GB of RAM.&lt;/p&gt;

&lt;p&gt;We used the &lt;a href="https://docs.k8ssandra.io/tasks/connect/ingress/kind-deployment/" rel="noopener noreferrer"&gt;Kind deployment guidelines&lt;/a&gt; found in the &lt;a href="https://docs.k8ssandra.io/" rel="noopener noreferrer"&gt;K8ssandra documentation&lt;/a&gt; to start a k8s cluster with 3 worker nodes.&lt;/p&gt;

&lt;p&gt;Docker Desktop lets you tune its allocated resources: click its icon in the status bar, then go to “Preferences…”:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjore8pit8civ0b37zahj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjore8pit8civ0b37zahj.png" alt="Image description" width="523" height="754"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then click on “Resources” in the left menu, which will allow you to set the number of cores and the amount of RAM Docker can use overall:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbei8axefcxhm2nfys8d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbei8axefcxhm2nfys8d.png" alt="Image description" width="800" height="460"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Running the benchmarks&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;We used &lt;a href="https://github.com/nosqlbench/nosqlbench" rel="noopener noreferrer"&gt;NoSQLBench&lt;/a&gt; to perform moderate load benchmarks. It comes with a convenient Docker image that we could use straight away to run stress jobs in our k8s cluster.&lt;/p&gt;

&lt;p&gt;Here’s the Helm values file we used as a base for spinning up our cluster, which we’ll name &lt;code&gt;three_nodes_cluster_with_stargate.yaml&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cassandra:
  datacenters:
  - name: dc1
    size: 3
  ingress:
    enabled: false
stargate:
  enabled: true
  replicas: 1
  ingress:
    host:
    enabled: true
    cassandra:
      enabled: true
medusa:
  multiTenant: true
  storage: s3
  storage_properties:
    region: us-east-1
  bucketName: k8ssandra-medusa
  storageSecret: medusa-bucket-key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We want Stargate to be our Cassandra gateway, and enabling Medusa requires us to set up a secret (remember, we want to run the whole stack).&lt;/p&gt;

&lt;p&gt;You’ll have to adjust the Medusa storage settings to match your environment (bucket and region), or disable Medusa entirely if you don’t have access to an S3 bucket:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;medusa:
  enabled: false
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In addition to AWS S3, future versions of Medusa will provide support for S3-compatible backends such as MinIO, as well as local storage configurations.&lt;/p&gt;

&lt;p&gt;We can create a secret for Medusa by applying the following yaml:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Secret
metadata:
  name: medusa-bucket-key
type: Opaque
stringData:
  # Note that this currently has to be set to medusa_s3_credentials!
  medusa_s3_credentials: |-
    [default]
    aws_access_key_id = &amp;lt;aws key&amp;gt;
    aws_secret_access_key = &amp;lt;aws secret&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You’ll notice that our Helm values lack heap settings. This was intentional: we set them when invoking &lt;code&gt;helm install&lt;/code&gt;, using different heap values for each of our tests.&lt;/p&gt;

&lt;p&gt;To fully set up our environment, we executed the following steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create the kind cluster:&lt;code&gt;kind create cluster --config ./kind.config.yaml&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.k8ssandra.io/tasks/connect/ingress/kind-deployment/#3-create-traefik-helm-values-file" rel="noopener noreferrer"&gt;Configure and install Traefik&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Create a namespace:&lt;code&gt;kubectl create namespace k8ssandra&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;(If Medusa is enabled) Create the secret:&lt;code&gt;kubectl apply -f medusa_secret.yaml -n k8ssandra&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Deploy K8ssandra with the desired heap settings:&lt;code&gt;helm repo add k8ssandra https://helm.k8ssandra.io/stable&lt;br&gt;helm repo update&lt;br&gt;helm install k8ssandra k8ssandra/k8ssandra -n k8ssandra \&lt;br&gt; -f /path/to/three_nodes_cluster_with_stargate.yaml \&lt;br&gt; --set cassandra.heap.size=500M,cassandra.heap.newGenSize=250M,stargate.heapMB=300&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You’ll have to wait for the &lt;code&gt;cassandradatacenter&lt;/code&gt; resource and then the Stargate pod to be ready before you can start interacting with Cassandra. This usually takes around 7 to 10 minutes.&lt;/p&gt;

&lt;p&gt;You can wait for the &lt;code&gt;cassandradatacenter&lt;/code&gt; to be ready with the following &lt;code&gt;kubectl&lt;/code&gt; command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl wait --for=condition=Ready cassandradatacenter/dc1 --timeout=900s -n k8ssandra
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then wait for Stargate to be ready:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl rollout status deployment k8ssandra-dc1-stargate -n k8ssandra
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Once Stargate is ready, the above command should output something like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;deployment "k8ssandra-dc1-stargate" successfully rolled out.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You can execute a NoSQLBench stress run by creating a k8s &lt;a href="https://kubernetes.io/docs/concepts/workloads/controllers/job/" rel="noopener noreferrer"&gt;job&lt;/a&gt;. You’ll need the superuser credentials so that NoSQLBench can connect to the Cassandra cluster. You can get those credentials with the following commands (requires &lt;code&gt;&lt;a href="https://stedolan.github.io/jq/" rel="noopener noreferrer"&gt;jq&lt;/a&gt;&lt;/code&gt; to be installed):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SECRET=$(kubectl get secret "k8ssandra-superuser" -n k8ssandra -o=jsonpath='{.data}')
echo "Username: $(jq -r '.username' &amp;lt;&amp;lt;&amp;lt; "$SECRET" | base64 -d)"
echo "Password: $(jq -r '.password' &amp;lt;&amp;lt;&amp;lt; "$SECRET" | base64 -d)"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then create the NoSQLBench job which will start automatically:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create job --image=nosqlbench/nosqlbench nosqlbench -n k8ssandra \
   -- java -jar nb.jar cql-iot rampup-cycles=1k cyclerate=100 \
   username=&amp;lt;superuser username&amp;gt; password=&amp;lt;superuser pass&amp;gt; \
   main-cycles=10k write_ratio=7 read_ratio=3 async=100 \
   hosts=k8ssandra-dc1-stargate-service --progress console:1s -v
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This runs a 10k-cycle stress test at 100 ops/s, with 70% writes and 30% reads and up to 100 in-flight async queries. Note that we’re providing the Stargate service as the contact host for NoSQLBench (the exact name will differ depending on your Helm release name).&lt;/p&gt;
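&lt;p&gt;As a back-of-the-envelope check on those parameters, the expected duration and read/write mix of the main phase can be computed directly:&lt;/p&gt;

```python
# Parameters from the NoSQLBench job above
main_cycles = 10_000   # main-cycles=10k
cyclerate = 100        # cyclerate=100 ops/s
write_ratio, read_ratio = 7, 3

duration_s = main_cycles / cyclerate
writes = main_cycles * write_ratio // (write_ratio + read_ratio)
reads = main_cycles * read_ratio // (write_ratio + read_ratio)
print(duration_s, writes, reads)  # 100.0 7000 3000: a ~100s run, 70% writes / 30% reads
```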

&lt;p&gt;While the job is running, you can tail its logs using the following command:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs job/nosqlbench -n k8ssandra --follow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Latency metrics can be found at the end of the run, and since we’re running at a fixed rate we’ll be interested in the response time which takes coordinated omission (&lt;a href="https://www.youtube.com/watch?v=lJ8ydIuPFeU&amp;amp;ab_channel=StrangeLoopConference" rel="noopener noreferrer"&gt;video&lt;/a&gt;, &lt;a href="http://btw2017.informatik.uni-stuttgart.de/slidesandpapers/E4-11-107/paper_web.pdf" rel="noopener noreferrer"&gt;paper&lt;/a&gt;) into account:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs job/nosqlbench -n k8ssandra | grep cqliot_default_main.cycles.responsetime
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Which should output something like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;12:41:18.924 [cqliot_default_main:008] INFO  i.n.e.c.m.PolyglotMetricRegistryBindings -
 timer added: cqliot_default_main.cycles.responsetime
12:42:58.788 [main] INFO  i.n.engine.core.ScenarioResult - type=TIMER,
 name=cqliot_default_main.cycles.responsetime, count=10000, min=1560.064, max=424771.583,
 mean=21894.6342016, stddev=45876.836258003656, median=5842.175, p75=17157.119,
 p95=100499.455, p98=187908.095, p99=263397.375, p999=384827.391, mean_rate=100.03389528501059,
 m1=101.58021531751795, m5=105.18698132587139, m15=106.3340149754869, rate_unit=events/second,
 duration_unit=microseconds
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;As Cassandra operators, we usually focus on p99 latencies: &lt;code&gt;p99=263397.375&lt;/code&gt;. That’s 263ms at p99, which is fine considering our environment (a laptop) and our performance requirements (very low).&lt;/p&gt;
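&lt;p&gt;Since the timer reports &lt;code&gt;duration_unit=microseconds&lt;/code&gt;, converting the percentiles to milliseconds makes them easier to read; a quick sketch using the figures above:&lt;/p&gt;

```python
# Percentiles from the NoSQLBench timer above, in microseconds
percentiles_us = {"median": 5842.175, "p95": 100499.455, "p99": 263397.375}

# Convert to milliseconds for readability
percentiles_ms = {name: round(us / 1000, 1) for name, us in percentiles_us.items()}
print(percentiles_ms)  # {'median': 5.8, 'p95': 100.5, 'p99': 263.4}
```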

&lt;h1&gt;
  
  
  &lt;strong&gt;Benchmark results&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;We ran our benchmarks with the following matrix of settings:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cores: 4 and 8&lt;/li&gt;
&lt;li&gt;RAM: 4GB and 8GB&lt;/li&gt;
&lt;li&gt;Ops rate: 100, 500, 1000 and 1500 ops/s&lt;/li&gt;
&lt;li&gt;Cassandra Heap: 500MB&lt;/li&gt;
&lt;li&gt;Stargate Heap: 300MB and 500MB&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Running the full stack with three Cassandra nodes, one Stargate node and 4GB allocated to Docker failed every stress test attempt, even moderate ones. However, with a single Cassandra node, the stress tests ran successfully with the full stack loaded in 4GB of RAM.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvazfn3d63bodxpqvcqcn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvazfn3d63bodxpqvcqcn.png" alt="Image description" width="639" height="143"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Latencies are very reasonable for all settings at a rate of 100 ops/s. Achieving higher throughput requires at least 8 cores, which allowed us to reach 1000 ops/s with 290ms p99 latencies. None of our tests reached a sustained throughput of 1500 ops/s, as shown by response times exceeding 9 seconds per operation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnv6yubrcbxmlkbuf4uwf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnv6yubrcbxmlkbuf4uwf.png" alt="Image description" width="800" height="581"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h1&gt;

&lt;p&gt;Getting the full K8ssandra experience on a laptop will require at least 4 cores and 8GB of RAM available to Docker and appropriate heap sizes for Cassandra and Stargate. If you don’t have those resources available for Docker on your development machine, you can avoid deploying features such as monitoring, Reaper and Medusa, and also reduce the number of Cassandra nodes. Using heap sizes of 500MB for Cassandra and 300MB for Stargate proved to be enough to sustain workloads between 100 and 500 operations per second, which should be sufficient for development purposes.&lt;/p&gt;

&lt;p&gt;Note that starting the whole stack takes around 7 to 10 minutes at the time of this writing on a fairly recent high end MacBook Pro, so expect your mileage to vary a bit depending on your hardware. Part of this time is spent pulling images from Docker Hub, meaning that your internet connection will play a big role in startup duration. &lt;/p&gt;

&lt;p&gt;Now you know how to configure K8ssandra for your development machine, and you’re ready to start building cloud-native apps! Visit the &lt;a href="https://docs.k8ssandra.io/tasks/" rel="noopener noreferrer"&gt;Tasks&lt;/a&gt; section of our documentation site for detailed instructions on developing against your deployed cluster. If you're looking to learn Cassandra, or learn about the performance of our managed service, we suggest heading to the &lt;a href="https://astra.dev/3mVhzmk" rel="noopener noreferrer"&gt;Astra DB&lt;/a&gt; website.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
