Fabio Ghirardello for Cockroach Labs

Simulating a Multi-Region CockroachDB cluster on Kubernetes with Minikube

Minikube is a great platform for working on Kubernetes locally. It allows you to develop and test your apps on your own computer before you are ready to deploy them in a live cluster.
The following instructions show how to create a 9-node CockroachDB cluster on Kubernetes, locally, using Minikube. This is very similar to the deployment we previously ran using Docker, so check it out, too.

Below is the high-level architecture. We create:

  • 3 Services of type NodePort, one for each region, so we can simulate accessing a particular region.
  • 1 Service of type ClusterIP, to allow all Pods to communicate with each other
  • 9 Pods and 9 PersistentVolumeClaim objects:
    Pod name          region     zone
    roach-seattle-1   us-west2   a
    roach-seattle-2   us-west2   b
    roach-seattle-3   us-west2   c
    roach-newyork-1   us-east4   a
    roach-newyork-2   us-east4   b
    roach-newyork-3   us-east4   c
    roach-london-1    eu-west2   a
    roach-london-2    eu-west2   b
    roach-london-3    eu-west2   c
  • 2 Job objects: one to initialize the cluster and one to run some SQL queries.
  • 3 NetworkChaos objects, to simulate the latency between the regions.
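
The nine Pod names and their localities follow a regular pattern: three cities, each with zones a, b and c. As a small illustration, the Pod list above can be reproduced with a loop. The helper name list_pods is hypothetical and not part of the deployment files.

```shell
# Print "pod-name region zone" for each of the 9 Pods described above.
# list_pods is a hypothetical helper, used here for illustration only.
list_pods() {
  for pair in seattle:us-west2 newyork:us-east4 london:eu-west2; do
    city=${pair%%:*}      # text before the colon, e.g. "seattle"
    region=${pair#*:}     # text after the colon, e.g. "us-west2"
    n=1
    for zone in a b c; do
      echo "roach-${city}-${n} ${region} ${zone}"
      n=$((n + 1))
    done
  done
}

list_pods
```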

Setup

Minikube resources

It's important to set up the Minikube VM with enough resources to run the cluster and the workload you'll be running on top of it. Everyone's environment is different, so this is for reference only.

On my laptop, I have allocated 8 CPUs and 24 GB RAM to Minikube. Ensure you have a similar profile; the default 2 CPUs won't be sufficient to run the cluster smoothly.

minikube config set cpus 8
minikube config set memory 24000
minikube config set driver hyperkit
minikube delete

Ensure the configuration is saved

$ minikube config view
- cpus: 8
- driver: hyperkit
- memory: 24000

All good, you can start Minikube with the new defaults

$ minikube start
😄  minikube v1.14.0 on Darwin 10.15.7
✨  Using the hyperkit driver based on user configuration
👍  Starting control plane node minikube in cluster minikube
🔥  Creating hyperkit VM (CPUs=8, Memory=24000MB, Disk=20000MB) ...
🐳  Preparing Kubernetes v1.19.2 on Docker 19.03.12 ...
🔎  Verifying Kubernetes components...
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" by default

Ensure Minikube and kubectl are configured properly

$ minikube status
minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

$ kubectl cluster-info
Kubernetes master is running at https://192.168.64.3:8443
KubeDNS is running at https://192.168.64.3:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

From the output above, my Minikube cluster has address 192.168.64.3. This will be helpful later when it comes to opening the Admin UI and connecting our SQL client to the CockroachDB cluster.

Please note: The Kubernetes definition files are hosted on GitHub in my Gist repository. We will use this repo later when we get to apply them.

The first step is to create the NodePort Service objects, one for each region, plus a ClusterIP Service so that all nodes can communicate with each other.

Here is the Service object definition for region us-west2. Note the nodePort parameter, which we use to expose the port externally.

---
# us-west2
apiVersion: v1
kind: Service
metadata:
  name: us-west2
  labels:
    app: cockroachdb
spec:
  type: NodePort
  ports:
    # SQL client port
    - name: grpc
      port: 26257
      targetPort: 26257
      nodePort: 31257
    # Admin UI
    - name: http
      port: 8080
      targetPort: 8080
      nodePort: 31080
  selector:
    app: cockroachdb
    region: us-west2
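
The Services for the other two regions differ only in their name, region selector and nodePort values. The sketch below renders the same manifest for any region. The helper name render_service is hypothetical and not part of the published gist; the us-east4 and eu-west2 port numbers are the ones that appear in the kubectl get all output further down.

```shell
# Render a NodePort Service manifest for one region.
# render_service is a hypothetical helper, not part of the published gist.
# Arguments: region name, SQL nodePort, Admin UI nodePort.
render_service() {
  cat <<EOF
---
apiVersion: v1
kind: Service
metadata:
  name: $1
  labels:
    app: cockroachdb
spec:
  type: NodePort
  ports:
    - name: grpc
      port: 26257
      targetPort: 26257
      nodePort: $2
    - name: http
      port: 8080
      targetPort: 8080
      nodePort: $3
  selector:
    app: cockroachdb
    region: $1
EOF
}

render_service us-east4 31258 31180
```

If you wanted to create the object this way instead of using the gist, the output could be piped to kubectl apply -f -.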

Below is the intra-node Service.

---
# intra-node service
apiVersion: v1
kind: Service
metadata:
  name: cockroachdb
  labels:
    app: cockroachdb
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
    prometheus.io/scrape: "true"
    prometheus.io/path: "_status/vars"
    prometheus.io/port: "8080"
spec:
  ports:
    - port: 26257
      targetPort: 26257
      name: grpc
    - port: 8080
      targetPort: 8080
      name: http
  publishNotReadyAddresses: true
  clusterIP: None
  selector:
    app: cockroachdb

We then need the Pods and their PVC objects.

Below is the definition of Pod roach-seattle-1 and its PVC object. Note the hostname and subdomain, and the command used to start the node, specifically the --locality flag.

# roach-seattle-1
apiVersion: v1
kind: Pod
metadata:
  name: roach-seattle-1
  labels:
    app: cockroachdb
    region: us-west2
spec:
  hostname: roach-seattle-1
  subdomain: cockroachdb
  containers:
    - name: roach-seattle-1
      image: cockroachdb/cockroach:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 26257
          name: grpc
        - containerPort: 8080
          name: http
      livenessProbe:
        httpGet:
          path: "/health"
          port: http
        initialDelaySeconds: 30
        periodSeconds: 5
      readinessProbe:
        httpGet:
          path: "/health?ready=1"
          port: http
        initialDelaySeconds: 10
        periodSeconds: 5
        failureThreshold: 2
      volumeMounts:
        - name: datadir
          mountPath: /cockroach/cockroach-data
      env:
        - name: COCKROACH_CHANNEL
          value: kubernetes-insecure
        - name: GOMAXPROCS
          valueFrom:
            resourceFieldRef:
              resource: limits.cpu
              divisor: "1"
        - name: MEMORY_LIMIT_MIB
          valueFrom:
            resourceFieldRef:
              resource: limits.memory
              divisor: "1Mi"
      command:
        - "/bin/bash"
        - "-ecx"
        - exec
          /cockroach/cockroach
          start
          --logtostderr
          --insecure
          --advertise-host $(hostname -f)
          --http-addr 0.0.0.0
          --join roach-seattle-1.cockroachdb,roach-newyork-1.cockroachdb,roach-london-1.cockroachdb
          --cache $(expr $MEMORY_LIMIT_MIB / 4)MiB
          --max-sql-memory $(expr $MEMORY_LIMIT_MIB / 4)MiB
          --locality=region=us-west2,zone=a
  terminationGracePeriodSeconds: 60
  volumes:
    - name: datadir
      persistentVolumeClaim:
        claimName: roach-seattle-1-data

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: roach-seattle-1-data
  labels:
    app: cockroachdb
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  storageClassName: standard
  resources:
    requests:
      storage: 1Gi
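
In the start command, the --cache and --max-sql-memory flags are each set to a quarter of MEMORY_LIMIT_MIB using expr. As a worked example of that arithmetic, assuming a hypothetical value of 8192 MiB (the actual value depends on your environment):

```shell
# Reproduce the sizing arithmetic from the Pod's start command.
# MEMORY_LIMIT_MIB=8192 is an assumed example value, not taken from the article.
MEMORY_LIMIT_MIB=8192
quarter="$(expr $MEMORY_LIMIT_MIB / 4)MiB"
echo "--cache ${quarter} --max-sql-memory ${quarter}"
# prints: --cache 2048MiB --max-sql-memory 2048MiB
```

Together the two settings claim half of the reported memory, leaving the other half for everything else the node does.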

Finally, we have the Job objects.

The first Job object initializes the cluster:

---
apiVersion: batch/v1
kind: Job
metadata:
  name: cluster-init
  labels:
    app: cockroachdb
spec:
  template:
    spec:
      containers:
      - name: cluster-init
        image: cockroachdb/cockroach:latest
        imagePullPolicy: IfNotPresent
        command:
          - "/cockroach/cockroach"
          - "init"
          - "--insecure"
          - "--host=roach-seattle-1.cockroachdb"
      restartPolicy: OnFailure

The second Job object runs the SQL statements to set the geo coordinates of the regions, so the Map knows where to place them (provided you have an enterprise license).

---
apiVersion: batch/v1
kind: Job
metadata:
  name: cluster-sql-init
  labels:
    app: cockroachdb
spec:
  template:
    spec:
      containers:
      - name: cluster-sql-init
        image: cockroachdb/cockroach:latest
        imagePullPolicy: IfNotPresent
        command:
          - "/cockroach/cockroach"
          - "sql"
          - "--insecure"
          - "--url"
          - "postgresql://roach-seattle-1.cockroachdb:26257/defaultdb?sslmode=disable"
          - "-e"
          - "UPSERT into system.locations VALUES ('region', 'us-east4', 37.478397, -76.453077), ('region', 'us-west2', 43.804133, -120.554201), ('region', 'eu-west2', 51.5073509, -0.1277583);"
      restartPolicy: OnFailure

Now that we understand the required objects, apply the full YAML definition file to create the cluster.

kubectl apply -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/crdb-k8s-cluster.yaml

After a few minutes, the cluster should be up and running. Check that the Pods are ready and in a Running state:

$ kubectl get all --show-labels
NAME                         READY   STATUS      RESTARTS   AGE    LABELS
pod/cluster-init-665lj       0/1     Completed   0          118m   controller-uid=ca1a2d5d-38db-48b6-834d-a5ffdbcb9ef8,job-name=cluster-init
pod/cluster-sql-init-7s69s   0/1     Completed   2          118m   controller-uid=960f70fb-710d-4da5-89b5-b7af33cf913f,job-name=cluster-sql-init
pod/roach-london-1           1/1     Running     0          118m   app=cockroachdb,region=eu-west2
pod/roach-london-2           1/1     Running     0          118m   app=cockroachdb,region=eu-west2
pod/roach-london-3           1/1     Running     0          118m   app=cockroachdb,region=eu-west2
pod/roach-newyork-1          1/1     Running     0          118m   app=cockroachdb,region=us-east4
pod/roach-newyork-2          1/1     Running     0          118m   app=cockroachdb,region=us-east4
pod/roach-newyork-3          1/1     Running     0          118m   app=cockroachdb,region=us-east4
pod/roach-seattle-1          1/1     Running     0          118m   app=cockroachdb,region=us-west2
pod/roach-seattle-2          1/1     Running     0          118m   app=cockroachdb,region=us-west2
pod/roach-seattle-3          1/1     Running     0          118m   app=cockroachdb,region=us-west2

NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                          AGE    LABELS
service/cockroachdb   ClusterIP   None             <none>        26257/TCP,8080/TCP               118m   app=cockroachdb
service/kubernetes    ClusterIP   10.96.0.1        <none>        443/TCP                          38d    component=apiserver,provider=kubernetes
service/eu-west2      NodePort    10.103.129.120   <none>        26257:31259/TCP,8080:31280/TCP   118m   app=cockroachdb
service/us-east4      NodePort    10.97.225.172    <none>        26257:31258/TCP,8080:31180/TCP   118m   app=cockroachdb
service/us-west2      NodePort    10.109.153.233   <none>        26257:31257/TCP,8080:31080/TCP   118m   app=cockroachdb

NAME                         COMPLETIONS   DURATION   AGE    LABELS
job.batch/cluster-init       1/1           32s        118m   app=cockroachdb
job.batch/cluster-sql-init   1/1           47s        118m   app=cockroachdb
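
Rather than polling kubectl get all by hand, you can block until everything is ready. This is a minimal sketch that relies on the labels shown in the output above: the nine CockroachDB Pods and both Jobs carry app=cockroachdb, while the Jobs' own Pods do not. The helper name wait_for_cluster is hypothetical and the 300s timeout is an arbitrary choice.

```shell
# Block until the CockroachDB Pods are Ready and both Jobs have completed.
# wait_for_cluster is a hypothetical helper; the timeout is an arbitrary choice.
wait_for_cluster() {
  kubectl wait --for=condition=Ready pod -l app=cockroachdb --timeout=300s &&
    kubectl wait --for=condition=complete job -l app=cockroachdb --timeout=300s
}

# Usage (requires the cluster objects to have been applied):
#   wait_for_cluster
```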

Latency Simulation

Simulate the latency between the regions. We set region latency as follows:

  • us-west2 <=> us-east4: 60ms
  • us-east4 <=> eu-west2: 120ms
  • us-west2 <=> eu-west2: 180ms
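
The three pairs above form a small symmetric lookup table. As an illustration only, it can be written down as a function that is handy when checking measured query times later on. The helper name expected_latency is hypothetical.

```shell
# Look up the simulated latency between two regions, in either order.
# expected_latency is a hypothetical helper, used here for illustration only.
expected_latency() {
  if [ "$1" = "$2" ]; then
    echo 0ms            # same region: no delay is injected
    return
  fi
  case "$1:$2" in
    us-west2:us-east4 | us-east4:us-west2) echo 60ms ;;
    us-east4:eu-west2 | eu-west2:us-east4) echo 120ms ;;
    us-west2:eu-west2 | eu-west2:us-west2) echo 180ms ;;
    *) echo "unknown region pair: $1 $2" >&2; return 1 ;;
  esac
}

expected_latency eu-west2 us-west2   # prints 180ms
```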

We will use a NetworkChaos object for this. NetworkChaos is a custom resource provided by Chaos Mesh, which you need to install first.

To install Chaos Mesh locally, check these instructions for Minikube.

Below is the definition of the NetworkChaos object to set the latency between the us-west2 and us-east4 nodes.

---
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: delay-uswest-useast
  labels:
    app: cockroachdb
spec:
  action: delay # chaos action
  mode: all
  selector:
    pods:
      default: # namespace of the target pods
        - roach-seattle-1
        - roach-seattle-2
        - roach-seattle-3
  delay:
    latency: "60ms"
  direction: to
  target:
    selector:
      pods:
        default: # namespace of the target pods
          - roach-newyork-1
          - roach-newyork-2
          - roach-newyork-3
    mode: all
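
The other two NetworkChaos objects follow the same shape with different Pods and latency values. The sketch below renders the manifest for any pair of cities. The helper name render_delay is hypothetical, and the choice of which city is the source and which is the target for the other two objects is an assumption made by analogy with the definition above, not something confirmed by the gist.

```shell
# Render a NetworkChaos manifest that delays traffic from one city's Pods
# to another's. render_delay is a hypothetical helper; source/target order
# for the other two objects is an assumption.
# Arguments: object name, source city, target city, latency.
render_delay() {
  cat <<EOF
---
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: $1
  labels:
    app: cockroachdb
spec:
  action: delay
  mode: all
  selector:
    pods:
      default:
        - roach-$2-1
        - roach-$2-2
        - roach-$2-3
  delay:
    latency: "$4"
  direction: to
  target:
    selector:
      pods:
        default:
          - roach-$3-1
          - roach-$3-2
          - roach-$3-3
    mode: all
EOF
}

render_delay delay-useast-euwest newyork london 120ms
```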

Create the NetworkChaos objects

kubectl apply -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/chaos.yaml

Confirm the NetworkChaos objects are up

$ kubectl get networkchaos.chaos-mesh.org
NAME                  AGE
delay-useast-euwest   78s
delay-uswest-euwest   78s
delay-uswest-useast   78s

At this point, retrieve from Minikube the URLs of the Services you created.

$ minikube service us-west2 --url
http://192.168.64.3:31257
http://192.168.64.3:31080

$ minikube service us-east4 --url
http://192.168.64.3:31258
http://192.168.64.3:31180

$ minikube service eu-west2 --url
http://192.168.64.3:31259
http://192.168.64.3:31280

As expected, each Service returns two URLs: the first is for SQL client access, the second for the Admin UI.
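
Minikube prints every URL with an http:// scheme, including the SQL port, which actually speaks the PostgreSQL wire protocol. A minimal sketch of turning such a URL into a SQL connection string follows. The helper name sql_url is hypothetical.

```shell
# Turn a URL printed by `minikube service <name> --url` into a SQL
# connection string. sql_url is a hypothetical helper.
sql_url() {
  hostport=${1#http://}   # strip the http:// scheme that Minikube prints
  echo "postgresql://${hostport}/defaultdb?sslmode=disable"
}

sql_url http://192.168.64.3:31257
# prints: postgresql://192.168.64.3:31257/defaultdb?sslmode=disable
```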

In your browser, open the Admin UI using the URL with port 31080, 31180 or 31280. In my case, this is http://192.168.64.3:31080. On the Network Latency page, confirm that you see latency between regions but not within a region.

Very good, as expected!

Open a SQL shell. You can download the cockroach binary, which includes a built-in SQL client, or, thanks to CockroachDB's compliance with the PostgreSQL wire protocol, you can use the psql client.

# ----------------------------
# ports mapping:
# 31257: us-west2
# 31258: us-east4
# 31259: eu-west2
# ----------------------------

# use cockroach sql
cockroach sql --insecure --url "postgresql://192.168.64.3:31257/defaultdb?sslmode=disable"

# or use psql
psql -h 192.168.64.3 -p 31257 -U root defaultdb

You will require an Enterprise license to unlock some of the features described below, like the Map view. You can request a Trial license or, alternatively, just skip the license registration - the deployment will still succeed.

Enter the license registration, if you have one

SET CLUSTER SETTING cluster.organization = 'ABC Corp';
SET CLUSTER SETTING enterprise.license = 'xxxx=yyyy-zzzz';

Confirm you are in the correct region as per port mappings:

SHOW LOCALITY;
         locality
--------------------------
  region=us-west2,zone=b
(1 row)

Time: 759µs
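
To check all three regions in one go, you can run the same statement against each SQL port. This is a sketch only: the helper name show_localities is hypothetical, and it assumes the cockroach binary is installed on your machine. Each regional Service fronts three Pods, so the zone you see can vary between connections, as in the zone=b result above.

```shell
# Ask each regional Service which node answered.
# show_localities is a hypothetical helper; pass the Minikube address.
show_localities() {
  for port in 31257 31258 31259; do
    cockroach sql --insecure \
      --url "postgresql://$1:${port}/defaultdb?sslmode=disable" \
      -e "SHOW LOCALITY;"
  done
}

# Usage (requires the cockroach binary and a running cluster):
#   show_localities 192.168.64.3
```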

Verify that you can see the latency between nodes. In this example, we open a shell in a container in region eu-west2 and, from within it, initiate a SQL connection to the database using one of the Seattle nodes. From there, we issue a simple query and check how long it takes.

Connect to the node in London

kubectl exec -it roach-london-1 -- bash

Once in the London container, connect to the CockroachDB cluster at the Seattle node.

cockroach sql --insecure --url "postgresql://roach-seattle-1.cockroachdb:26257/defaultdb?sslmode=disable"

At the SQL prompt, ask the Seattle node to tell you its locality and check the Time it takes to execute. The query is trivial, so the Time will basically reflect the latency.

SHOW LOCALITY;
         locality
--------------------------
  region=us-west2,zone=a
(1 row)

Time: 181.064515ms

As expected, we get ~180ms latency, which is what we have set up using the NetworkChaos objects. You can do the same exercise for the other nodes, too.

Congratulations! The cluster is ready for your development work!

Clean up

To delete the deployment, delete the objects using the same definition files.

kubectl delete -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/crdb-k8s-cluster.yaml

kubectl delete -f https://gist.githubusercontent.com/fabiog1901/fc09e6fd98d0419b4528ca1c9553d478/raw/chaos.yaml
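
To double-check that nothing is left behind, you can list whatever still carries the app=cockroachdb label, along with any remaining NetworkChaos objects. This is a sketch; the helper name verify_cleanup is hypothetical, and the second command assumes Chaos Mesh is still installed.

```shell
# List anything that survived the delete commands above.
# verify_cleanup is a hypothetical helper, used here for illustration only.
verify_cleanup() {
  kubectl get pods,services,jobs,pvc -l app=cockroachdb
  kubectl get networkchaos.chaos-mesh.org
}

# Usage (run after the two kubectl delete commands):
#   verify_cleanup
```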

To uninstall Chaos Mesh, which provides the NetworkChaos object, follow these instructions.

Top comments (1)

Pratibha

This article is quite descriptive and very helpful.
Is there any specific reason Pods are preferred over StatefulSets for setting up the cluster environment?
I am setting up a similar environment but using a StatefulSet. What additional things do I need to consider while creating sts resources from the same YAML file? I am trying to set the replicas parameter to create multiple nodes in a single region. Can you provide an example configuration with sts and replicas?