Shuhaib K K

Posted on Nov 24, 2023

Running Debezium On Kubernetes

#debezium #kubernetes #cdc #kafkaconnect

In this blog, we will see how to run debezium on kubernetes, For understanding the fundamentals of debezium please refer debezium offical documentations. This blog will guide you through the process of setting up a robust and scalable CDC infrastructure.

Running Kafka Connect On Kubernetes

To run kafka connect on kubernetes first we need to install strimzi operator.

Please follow the offical documentation .

You can download strimzi from the release page

We can deploy strimzi operator to watch single namespace or multiple namespaces. In our case we will be deploying to watch multiple namespaces.

1. Edit the strimzi installation files to use the namespace the cluster operator is going to be installed into.

On Linux, use:

sed -i 's/namespace: .*/namespace: my-cluster-operator-namespace/' install/cluster-operator/*RoleBinding*.yaml

2. Edit the install/cluster-operator/060-deployment-strimzi-cluster-operator. Yaml file to add a list of all the namespaces the cluster operator will watch to the strimzi_namespace environment variable. For example, in this procedure the cluster operator will watch the namespaces watched-namespace-1, watched-namespace-2,
watched-namespace-3.

apiVersion: apps/v1
kind: Deployment
spec:
  # ...
  template:
    spec:
      serviceAccountName: strimzi-cluster-operator
      containers:
      - name: strimzi-cluster-operator
        image: quay.io/strimzi/operator:0.38.0
        imagePullPolicy: IfNotPresent
        env:
        - name: STRIMZI_NAMESPACE
          value: watched-namespace-1,watched-namespace-2, watched-namespace-3

3. For each namespace listed, install the RoleBindings.

In this example, we replace watched-namespace in these commands with the namespaces listed in the previous step, repeating them for watched-namespace-1, watched-namespace-2, watched-namespace-3

  kubectl create -f install/cluster-operator/020-RoleBinding-strimzi-cluster-operator.yaml -n <watched_namespace>
  kubectl create -f install/cluster-operator/023-RoleBinding-strimzi-cluster-operator.yaml -n <watched_namespace>
  kubectl create -f install/cluster-operator/031-RoleBinding-strimzi-cluster-operator-entity-operator-delegation.yaml -n <watched_namespace>

4. Deploy the cluster Operator

kubectl create -f install/cluster-operator -n my-cluster-operator-namespace

5. Check the status of the deployment:

kubectl get deployments -n my-cluster-operator-namespace

Output shows the deployment name and readiness

NAME                      READY  UP-TO-DATE  AVAILABLE
strimzi-cluster-operator  1/1    1           1

The above steps sums up the steps for deploying strimzi operator on Kubernetes to watch multiple namespaces.

Build Docker Images For Debezium

To run kafka connect with debezium we need to copy the debezium plugins to strimzi kafka image.

You can find images from this repo.

Example:

FROM quay.io/strimzi/kafka:0.35.1-kafka-3.4.0
USER root:root
ENV KAFKA_HOME=/opt/kafka

ARG DEBEZIUM_VERSION=2.4.1.Final
ARG CONNECTOR_NAME=debezium-connector-mongodb

RUN mkdir -p /opt/kafka/plugins

RUN curl https://repo1.maven.org/maven2/io/debezium/debezium-connector-mongodb/${DEBEZIUM_VERSION}/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz -o /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz \
    && tar -xzvf /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz -C /opt/kafka/plugins/ \
    && rm /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz

USER 1001

After building the image we need to start running kafka connect on Kubernetes.

You can find examples from the repo.

Example:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: mongo-connect-cluster-connect
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.4.0
  image: << image you built >>
  replicas: 1
  bootstrapServers: << kafka url >>
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
    group.id: mongo-example
    offset.storage.topic: mongo-example-offsets
    config.storage.topic: mongo-example-configs
    status.storage.topic: mongo-example-status
    # -1 means it will use the default replication factor configured in the broker
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1
    key.converter.schemas.enable: true
    value.converter.schemas.enable: true
  jvmOptions:
    "-Xmx": "1g" ##<< change this values accordingly >>
    "-Xms": "1g" ##<< change this values accordingly >>

Apply the kafka connect yaml created.

kubectl apply -f mongo-example.yaml

Now you can verify if the kafka connect is running.

kubectl get pods

NAME                                             READY   STATUS    RESTARTS        AGE
mongo-connect-cluster-connect-848c9bf6b8-25pc6   1/1     Running   0               4h38m

Once kafka connect is started we need to configure a connector. A connector configuration is given below.
You can also find more examples in the repo on configuring connectors including using secrets in place of username and password.

Example for a MongoDB debezium connector

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: mongo-connector-example
  labels:
    strimzi.io/cluster: mongo-connect-cluster-connect ## this value should match the name of kafka connector created
spec:
  class: io.debezium.connector.mongodb.MongoDbConnector
  tasksMax: 1
  config:
    # more property details can be found in debezium documentation
    ##  https://debezium.io/documentation/reference/2.3/connectors/mongodb.html
    mongodb.connection.string: mongodb://<username>:<password>@<host>:<port>/?replicaSet=rs0
    topic.prefix: << topic_prefix >>
    collection.include.list: << change values here >>
    snapshot.mode: never
    topic.creation.default.partitions: 1
    topic.creation.default.replication.factor: 1

Once the connector is created you can apply the connector configuration.

kubectl create -f example-connector.yaml

To verify if a connector is running.

kubectl get kctr

NAME                           CLUSTER                 CONNECTOR CLASS                                   MAX TASKS   READY
mongo-connector-example     mongo-connect-cluster   io.debezium.connector.mongodb.MongoDbConnector       1           True

The 'True' status in READY section indicates the connector is running properly.

To get more details of the connector including worker details or for any troubleshooting issues you can get output of the worker using the following command.

kubectl get kctr mongo-connector-example -o yaml

status:
  conditions:
    status: "True"
    type: Ready
  connectorStatus:
    connector:
      state: RUNNING
      worker_id: 10.124.10.14:8083
    name: mongo-connector-example
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.124.10.14:8083
    type: source
  observedGeneration: 1

DEV Community

Running Debezium On Kubernetes

Running Kafka Connect On Kubernetes

Build Docker Images For Debezium

Top comments (0)

Read next

I am going to become a software engineer - and I'd like to be a good one

Mastering UI Consistency: Elevate Your Application's User Experience

Connecting SQLite Database in Java

3D Car Image Just With One API!