DEV Community

Shuhaib K K
Shuhaib K K

Posted on

Running Debezium On Kubernetes

In this blog, we will see how to run debezium on kubernetes, For understanding the fundamentals of debezium please refer debezium offical documentations. This blog will guide you through the process of setting up a robust and scalable CDC infrastructure.

Running Kafka Connect On Kubernetes

To run kafka connect on kubernetes first we need to install strimzi operator.

Please follow the offical documentation .

You can download strimzi from the release page

We can deploy strimzi operator to watch single namespace or multiple namespaces. In our case we will be deploying to watch multiple namespaces.

1. Edit the strimzi installation files to use the namespace the cluster operator is going to be installed into.

On Linux, use:

sed -i 's/namespace: .*/namespace: my-cluster-operator-namespace/' install/cluster-operator/*RoleBinding*.yaml

Enter fullscreen mode Exit fullscreen mode

2. Edit the install/cluster-operator/060-deployment-strimzi-cluster-operator. Yaml file to add a list of all the namespaces the cluster operator will watch to the strimzi_namespace environment variable. For example, in this procedure the cluster operator will watch the namespaces watched-namespace-1, watched-namespace-2,
watched-namespace-3.

apiVersion: apps/v1
kind: Deployment
spec:
  # ...
  template:
    spec:
      serviceAccountName: strimzi-cluster-operator
      containers:
      - name: strimzi-cluster-operator
        image: quay.io/strimzi/operator:0.38.0
        imagePullPolicy: IfNotPresent
        env:
        - name: STRIMZI_NAMESPACE
          value: watched-namespace-1,watched-namespace-2, watched-namespace-3

Enter fullscreen mode Exit fullscreen mode

3. For each namespace listed, install the RoleBindings.

In this example, we replace watched-namespace in these commands with the namespaces listed in the previous step, repeating them for watched-namespace-1, watched-namespace-2, watched-namespace-3

  kubectl create -f install/cluster-operator/020-RoleBinding-strimzi-cluster-operator.yaml -n <watched_namespace>
  kubectl create -f install/cluster-operator/023-RoleBinding-strimzi-cluster-operator.yaml -n <watched_namespace>
  kubectl create -f install/cluster-operator/031-RoleBinding-strimzi-cluster-operator-entity-operator-delegation.yaml -n <watched_namespace>
Enter fullscreen mode Exit fullscreen mode

4. Deploy the cluster Operator

kubectl create -f install/cluster-operator -n my-cluster-operator-namespace
Enter fullscreen mode Exit fullscreen mode

5. Check the status of the deployment:

kubectl get deployments -n my-cluster-operator-namespace
Enter fullscreen mode Exit fullscreen mode

Output shows the deployment name and readiness

NAME                      READY  UP-TO-DATE  AVAILABLE
strimzi-cluster-operator  1/1    1           1
Enter fullscreen mode Exit fullscreen mode

The above steps sums up the steps for deploying strimzi operator on Kubernetes to watch multiple namespaces.

Build Docker Images For Debezium

To run kafka connect with debezium we need to copy the debezium plugins to strimzi kafka image.

You can find images from this repo.

Example:

FROM quay.io/strimzi/kafka:0.35.1-kafka-3.4.0
USER root:root
ENV KAFKA_HOME=/opt/kafka

ARG DEBEZIUM_VERSION=2.4.1.Final
ARG CONNECTOR_NAME=debezium-connector-mongodb

RUN mkdir -p /opt/kafka/plugins

RUN curl https://repo1.maven.org/maven2/io/debezium/debezium-connector-mongodb/${DEBEZIUM_VERSION}/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz -o /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz \
    && tar -xzvf /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz -C /opt/kafka/plugins/ \
    && rm /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz

USER 1001
Enter fullscreen mode Exit fullscreen mode

After building the image we need to start running kafka connect on Kubernetes.

You can find examples from the repo.

Example:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: mongo-connect-cluster-connect
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.4.0
  image: << image you built >>
  replicas: 1
  bootstrapServers: << kafka url >>
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
    group.id: mongo-example
    offset.storage.topic: mongo-example-offsets
    config.storage.topic: mongo-example-configs
    status.storage.topic: mongo-example-status
    # -1 means it will use the default replication factor configured in the broker
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1
    key.converter.schemas.enable: true
    value.converter.schemas.enable: true
  jvmOptions:
    "-Xmx": "1g" ##<< change this values accordingly >>
    "-Xms": "1g" ##<< change this values accordingly >>
Enter fullscreen mode Exit fullscreen mode

Apply the kafka connect yaml created.

kubectl apply -f mongo-example.yaml
Enter fullscreen mode Exit fullscreen mode

Now you can verify if the kafka connect is running.

kubectl get pods

NAME                                             READY   STATUS    RESTARTS        AGE
mongo-connect-cluster-connect-848c9bf6b8-25pc6   1/1     Running   0               4h38m
Enter fullscreen mode Exit fullscreen mode

Once kafka connect is started we need to configure a connector. A connector configuration is given below.
You can also find more examples in the repo on configuring connectors including using secrets in place of username and password.

Example for a MongoDB debezium connector

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: mongo-connector-example
  labels:
    strimzi.io/cluster: mongo-connect-cluster-connect ## this value should match the name of kafka connector created
spec:
  class: io.debezium.connector.mongodb.MongoDbConnector
  tasksMax: 1
  config:
    # more property details can be found in debezium documentation
    ##  https://debezium.io/documentation/reference/2.3/connectors/mongodb.html
    mongodb.connection.string: mongodb://<username>:<password>@<host>:<port>/?replicaSet=rs0
    topic.prefix: << topic_prefix >>
    collection.include.list: << change values here >>
    snapshot.mode: never
    topic.creation.default.partitions: 1
    topic.creation.default.replication.factor: 1
Enter fullscreen mode Exit fullscreen mode

Once the connector is created you can apply the connector configuration.

kubectl create -f example-connector.yaml
Enter fullscreen mode Exit fullscreen mode

To verify if a connector is running.

kubectl get kctr

NAME                           CLUSTER                 CONNECTOR CLASS                                   MAX TASKS   READY
mongo-connector-example     mongo-connect-cluster   io.debezium.connector.mongodb.MongoDbConnector       1           True
Enter fullscreen mode Exit fullscreen mode

The 'True' status in READY section indicates the connector is running properly.

To get more details of the connector including worker details or for any troubleshooting issues you can get output of the worker using the following command.

kubectl get kctr mongo-connector-example -o yaml

status:
  conditions:
    status: "True"
    type: Ready
  connectorStatus:
    connector:
      state: RUNNING
      worker_id: 10.124.10.14:8083
    name: mongo-connector-example
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.124.10.14:8083
    type: source
  observedGeneration: 1
Enter fullscreen mode Exit fullscreen mode

Top comments (0)