In this blog, we will see how to run Debezium on Kubernetes. For the fundamentals of Debezium, please refer to the official Debezium documentation. This blog will guide you through setting up a robust and scalable change data capture (CDC) infrastructure.
Running Kafka Connect On Kubernetes
To run Kafka Connect on Kubernetes, we first need to install the Strimzi operator.
Please follow the official documentation.
You can download Strimzi from the release page.
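As a rough sketch, the release archive can also be fetched from the command line. The version used here (0.38.0) matches the operator image tag that appears later in this post. The URL pattern, archive name, and extracted directory name are assumptions based on how Strimzi release assets are usually named, so confirm them on the release page first.

```shell
# Assumed URL pattern for Strimzi release archives; verify it on the release page.
strimzi_release_url() {
  version="$1"
  echo "https://github.com/strimzi/strimzi-kafka-operator/releases/download/${version}/strimzi-${version}.tar.gz"
}

# Download and unpack a release archive into the current directory.
download_strimzi() {
  version="$1"
  curl -fSL "$(strimzi_release_url "${version}")" -o "strimzi-${version}.tar.gz" \
    && tar -xzf "strimzi-${version}.tar.gz"
}

# Example (the directory name is an assumption):
#   download_strimzi 0.38.0 && cd strimzi-0.38.0
```

The install/cluster-operator paths used in the steps below are relative to the unpacked release directory.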
We can deploy the Strimzi operator to watch a single namespace or multiple namespaces. In our case, we will deploy it to watch multiple namespaces.
1. Edit the Strimzi installation files to use the namespace the Cluster Operator is going to be installed into.
On Linux, use:
sed -i 's/namespace: .*/namespace: my-cluster-operator-namespace/' install/cluster-operator/*RoleBinding*.yaml
2. Edit the install/cluster-operator/060-Deployment-strimzi-cluster-operator.yaml file to add a list of all the namespaces the Cluster Operator will watch to the STRIMZI_NAMESPACE environment variable. For example, in this procedure the Cluster Operator will watch the namespaces watched-namespace-1, watched-namespace-2, and watched-namespace-3.
apiVersion: apps/v1
kind: Deployment
spec:
  # ...
  template:
    spec:
      serviceAccountName: strimzi-cluster-operator
      containers:
      - name: strimzi-cluster-operator
        image: quay.io/strimzi/operator:0.38.0
        imagePullPolicy: IfNotPresent
        env:
        - name: STRIMZI_NAMESPACE
          value: watched-namespace-1,watched-namespace-2,watched-namespace-3
3. For each namespace listed, install the RoleBindings.
Replace <watched_namespace> in these commands with a namespace from the previous step, and run them once each for watched-namespace-1, watched-namespace-2, and watched-namespace-3.
kubectl create -f install/cluster-operator/020-RoleBinding-strimzi-cluster-operator.yaml -n <watched_namespace>
kubectl create -f install/cluster-operator/023-RoleBinding-strimzi-cluster-operator.yaml -n <watched_namespace>
kubectl create -f install/cluster-operator/031-RoleBinding-strimzi-cluster-operator-entity-operator-delegation.yaml -n <watched_namespace>
4. Deploy the Cluster Operator:
kubectl create -f install/cluster-operator -n my-cluster-operator-namespace
5. Check the status of the deployment:
kubectl get deployments -n my-cluster-operator-namespace
The output shows the deployment name and readiness:
NAME                       READY   UP-TO-DATE   AVAILABLE
strimzi-cluster-operator   1/1     1            1
These steps cover deploying the Strimzi operator on Kubernetes to watch multiple namespaces.
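Step 3 repeats the same three commands for every watched namespace, so a small loop can save some typing. The sketch below assumes it is run from the unpacked Strimzi release directory. The KUBECTL variable is a convenience added here, not part of Strimzi: setting it to "echo kubectl" prints the commands instead of running them.

```shell
# Create the three Strimzi RoleBindings in every namespace passed as an argument.
# KUBECTL is a hypothetical override; set KUBECTL="echo kubectl" to preview the commands.
install_rolebindings() {
  for ns in "$@"; do
    for file in \
      020-RoleBinding-strimzi-cluster-operator.yaml \
      023-RoleBinding-strimzi-cluster-operator.yaml \
      031-RoleBinding-strimzi-cluster-operator-entity-operator-delegation.yaml; do
      ${KUBECTL:-kubectl} create -f "install/cluster-operator/${file}" -n "${ns}"
    done
  done
}

# Example:
#   install_rolebindings watched-namespace-1 watched-namespace-2 watched-namespace-3
```

The namespaces passed to the function should be the same ones listed in the STRIMZI_NAMESPACE value from step 2.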
Build Docker Images For Debezium
To run Kafka Connect with Debezium, we need to copy the Debezium plugins into the Strimzi Kafka image.
You can find example images in this repo.
Example:
FROM quay.io/strimzi/kafka:0.35.1-kafka-3.4.0
USER root:root
ENV KAFKA_HOME=/opt/kafka
ARG DEBEZIUM_VERSION=2.4.1.Final
ARG CONNECTOR_NAME=debezium-connector-mongodb
RUN mkdir -p /opt/kafka/plugins
# -f fails the build on HTTP errors instead of saving an error page as the archive
RUN curl -fSL https://repo1.maven.org/maven2/io/debezium/${CONNECTOR_NAME}/${DEBEZIUM_VERSION}/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz -o /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz \
    && tar -xzvf /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz -C /opt/kafka/plugins/ \
    && rm /opt/kafka/plugins/${CONNECTOR_NAME}-${DEBEZIUM_VERSION}-plugin.tar.gz
USER 1001
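Building and pushing the image could look roughly like the sketch below. The registry and image name are hypothetical placeholders, and tagging the image with the Debezium version is just one possible convention. The DEBEZIUM_VERSION build argument corresponds to the ARG declared in the Dockerfile above.

```shell
# Hypothetical image coordinates; replace them with your own registry and repository.
REGISTRY="${REGISTRY:-registry.example.com/my-team}"
IMAGE_NAME="${IMAGE_NAME:-kafka-connect-debezium-mongodb}"
DEBEZIUM_VERSION="${DEBEZIUM_VERSION:-2.4.1.Final}"

# Print the full image reference, tagged with the Debezium version.
image_ref() {
  echo "${REGISTRY}/${IMAGE_NAME}:${DEBEZIUM_VERSION}"
}

# Build from the Dockerfile in the current directory, then push the result.
build_and_push() {
  docker build --build-arg DEBEZIUM_VERSION="${DEBEZIUM_VERSION}" -t "$(image_ref)" . \
    && docker push "$(image_ref)"
}

# Example:
#   build_and_push
```

The reference printed by image_ref is the value to put in the image field of the KafkaConnect resource.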
After building the image, we need to run Kafka Connect on Kubernetes.
You can find examples in the repo.
Example:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: mongo-connect-cluster-connect
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.4.0
  image: << image you built >>
  replicas: 1
  bootstrapServers: << kafka url >>
  config:
    config.providers: secrets
    config.providers.secrets.class: io.strimzi.kafka.KubernetesSecretConfigProvider
    group.id: mongo-example
    offset.storage.topic: mongo-example-offsets
    config.storage.topic: mongo-example-configs
    status.storage.topic: mongo-example-status
    # -1 means it will use the default replication factor configured in the broker
    config.storage.replication.factor: -1
    offset.storage.replication.factor: -1
    status.storage.replication.factor: -1
    key.converter.schemas.enable: true
    value.converter.schemas.enable: true
  jvmOptions:
    "-Xmx": "1g" # change these values as needed
    "-Xms": "1g" # change these values as needed
Apply the Kafka Connect YAML you created:
kubectl apply -f mongo-example.yaml
Now you can verify that Kafka Connect is running:
kubectl get pods
NAME                                             READY   STATUS    RESTARTS   AGE
mongo-connect-cluster-connect-848c9bf6b8-25pc6   1/1     Running   0          4h38m
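Instead of polling the pod list by hand, kubectl wait can block until the KafkaConnect resource reports the Ready condition. This is a minimal sketch; the 300s default timeout is an arbitrary choice, and the resource name in the example is the one from the YAML above.

```shell
# Block until the KafkaConnect resource reports condition Ready, or the timeout expires.
# The 300s default is an arbitrary value chosen for this sketch.
wait_for_connect() {
  name="$1"
  timeout="${2:-300s}"
  kubectl wait "kafkaconnect/${name}" --for=condition=Ready --timeout="${timeout}"
}

# Example:
#   wait_for_connect mongo-connect-cluster-connect
```

A non-zero exit status means the resource did not become ready in time, which makes the function usable as a gate in a deployment script.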
Once Kafka Connect has started, we need to configure a connector. A connector configuration is given below.
You can also find more examples in the repo on configuring connectors, including using secrets in place of a username and password.
Example for a MongoDB Debezium connector:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: mongo-connector-example
  labels:
    # this value must match the name of the KafkaConnect resource created above
    strimzi.io/cluster: mongo-connect-cluster-connect
spec:
  class: io.debezium.connector.mongodb.MongoDbConnector
  tasksMax: 1
  config:
    # more property details can be found in the Debezium documentation:
    # https://debezium.io/documentation/reference/2.3/connectors/mongodb.html
    mongodb.connection.string: mongodb://<username>:<password>@<host>:<port>/?replicaSet=rs0
    topic.prefix: << topic_prefix >>
    collection.include.list: << change values here >>
    snapshot.mode: never
    topic.creation.default.partitions: 1
    topic.creation.default.replication.factor: 1
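The KafkaConnect resource earlier in this post registers KubernetesSecretConfigProvider under the alias secrets, which is what lets a connector read credentials from a Kubernetes Secret rather than hard-coding them. The sketch below is an assumption-laden illustration: the Secret name mongo-credentials and its keys are hypothetical, and the Kafka Connect service account also needs RBAC permission to read the Secret, which is not shown here.

```shell
# Create a Secret holding the MongoDB credentials.
# The Secret name "mongo-credentials" and its keys are hypothetical.
create_mongo_secret() {
  namespace="$1"
  username="$2"
  password="$3"
  kubectl create secret generic mongo-credentials \
    --namespace "${namespace}" \
    --from-literal=username="${username}" \
    --from-literal=password="${password}"
}

# Print the placeholder that the config provider resolves at runtime:
#   ${secrets:<namespace>/<secret-name>:<key>}
secret_ref() {
  printf '${secrets:%s/%s:%s}\n' "$1" "$2" "$3"
}

# Example:
#   create_mongo_secret my-namespace my-user 'my-password'
#   secret_ref my-namespace mongo-credentials password
```

The printed placeholder can then stand in for the literal credential in the connector config, for example in place of <password> in the connection string.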
Once the connector configuration is ready, create the connector:
kubectl create -f example-connector.yaml
To verify that the connector is running:
kubectl get kctr
NAME                      CLUSTER                 CONNECTOR CLASS                                  MAX TASKS   READY
mongo-connector-example   mongo-connect-cluster   io.debezium.connector.mongodb.MongoDbConnector   1           True
The 'True' status in the READY column indicates the connector is running properly.
To get more details about the connector, including worker details, or to troubleshoot issues, you can print the connector's full status with the following command:
kubectl get kctr mongo-connector-example -o yaml
status:
  conditions:
  - status: "True"
    type: Ready
  connectorStatus:
    connector:
      state: RUNNING
      worker_id: 10.124.10.14:8083
    name: mongo-connector-example
    tasks:
    - id: 0
      state: RUNNING
      worker_id: 10.124.10.14:8083
    type: source
  observedGeneration: 1
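For scripted checks, the same status fields can be pulled out with a JSONPath query instead of reading the full YAML. This is a minimal sketch based on the field layout in the output above; the function names are made up for this example.

```shell
# Print the connector state, e.g. RUNNING.
connector_state() {
  kubectl get kctr "$1" -o jsonpath='{.status.connectorStatus.connector.state}'
}

# Print the state of every task, separated by spaces.
task_states() {
  kubectl get kctr "$1" -o jsonpath='{.status.connectorStatus.tasks[*].state}'
}

# Succeed only if the connector and all of its tasks report RUNNING.
connector_healthy() {
  [ "$(connector_state "$1")" = "RUNNING" ] || return 1
  for state in $(task_states "$1"); do
    [ "${state}" = "RUNNING" ] || return 1
  done
  return 0
}

# Example:
#   connector_healthy mongo-connector-example && echo "connector is healthy"
```

Checking the tasks as well as the connector matters because a connector can report RUNNING while one of its tasks has failed.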