Dominik Gawlik

Posted on Oct 16, 2024

Writing k8s operator in Java

#java #kubernetes #spring #devops

This tutorial is for developers with a Java background who want to learn how to write their first Kubernetes operator fast. Why operators? There are several advantages:

significantly less maintenance, saving keystrokes
resiliency built into whatever system you are creating
the fun of learning and getting serious about Kubernetes nuts and bolts

I will try to limit theory to a minimum and show a foolproof recipe for how to “bake a cake”. I chose Java because it is close to my working experience and, to be honest, it is easier than Go (but some may disagree).

Let’s jump straight to it.

Theory and background

Nobody likes reading lengthy documentation, but let’s quickly get this off our chest, shall we?

What is a pod?

A pod is a group of containers that share a network interface (with a unique IP address) and storage.

What is a replica set?

A replica set controls the creation and deletion of pods so that it keeps exactly the specified number of pods, built from a given template, running.

What is deployment?

A deployment owns a replica set and indirectly owns its pods. When you create a deployment, pods are created; when you delete it, the pods are gone.

What is service?

A service is a SINGLE network endpoint for a bunch of pods (it spreads the load among them). You can expose it to be visible from outside the cluster. It automates the creation of endpoint slices.

The problem with Kubernetes is that from its inception it was designed for stateless workloads. Replica sets don’t track the identities of pods; when a particular pod is gone, a new one is just created. Some use cases, like databases and cache clusters, need state. Stateful sets only partially mitigate the problem.

This is why people started writing operators: to take the burden of maintenance off their shoulders. I won’t go into the depths of the pattern and the various SDKs; you can start from here.

Controllers and reconciliation

Everything that works in Kubernetes, every tiny gear of the machinery, is based on the simple concept of a control loop. For a particular resource type, the control loop checks what is and what should be (as defined in the manifest). If there is a mismatch, it tries to perform some actions to fix it. This is called reconciliation.

Operators are really the same concept, but for custom resources. Custom resources are the means of extending the Kubernetes API with resource types that you define yourself. Once you set up a CRD (custom resource definition) in Kubernetes, all the actions like get, list, update, delete and so on become possible on this resource. And what will do the actual work? That’s right: our operator.
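
To make the control loop idea concrete, here is a minimal in-memory sketch. It does not talk to Kubernetes at all; the class name, the maps and the action strings are made up for illustration. It only shows the "compare what is with what should be, then plan the fixes" step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, in-memory model of one reconciliation pass; no Kubernetes API involved.
final class ReconcileSketch {

    // Compares desired state (what the manifest says) with actual state
    // (what currently exists) and returns the actions needed to converge.
    static List<String> plan(Map<String, String> desired, Map<String, String> actual) {
        List<String> actions = new ArrayList<>();
        // TreeMap gives a deterministic iteration order for the example.
        for (var entry : new TreeMap<>(desired).entrySet()) {
            String current = actual.get(entry.getKey());
            if (current == null) {
                actions.add("create " + entry.getKey());
            } else if (!current.equals(entry.getValue())) {
                actions.add("update " + entry.getKey());
            }
        }
        // Anything that exists but is no longer desired gets removed.
        for (String name : new TreeMap<>(actual).keySet()) {
            if (!desired.containsKey(name)) {
                actions.add("delete " + name);
            }
        }
        return actions;
    }

    public static void main(String[] args) {
        var desired = Map.of("configmap", "v2", "deployment", "v1");
        var actual = Map.of("configmap", "v1", "service", "v1");
        // prints [update configmap, create deployment, delete service]
        System.out.println(plan(desired, actual));
    }
}
```

A real controller runs this kind of comparison again and again, every time something changes, until the plan comes back empty.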

Motivating example and java app

As is typical when trying a technology for the first time, you pick the most basic problem to solve. Because the concept is fairly complex, the hello world in this case will be a little long. Anyway, in most of the sources I have seen, the simplest use case is serving static pages.

So the project is like this: we will define a custom resource that represents two pages we want to serve. After that resource is applied, the operator will automatically set up the serving application in Spring Boot, create a config map with the pages’ content, mount the config map into a volume in the app’s pod and set up a service for that pod. What is fun about this is that if we modify the resource, the operator will rebind everything on the fly and the page changes will be visible almost instantly. The second fun thing is that if we delete the resource, everything gets deleted, leaving our cluster clean.

Serving java app

This will be a really simple static page server in Spring Boot. You will only need spring-boot-starter-web, so go ahead to Spring Initializr and pick:

maven
java 21
newest stable version (3.3.4 for me)
graal vm
and spring boot starter web

The app is just this:

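import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@SpringBootApplication
@RestController
public class WebpageServingApplication {

    // Serves the file /static/<page> as HTML; the directory is mounted from a config map.
    @GetMapping(value = "/{page}", produces = "text/html")
    public String page(@PathVariable String page) throws IOException {
        return Files.readString(Path.of("/static/" + page));
    }

    public static void main(String[] args) {
        SpringApplication.run(WebpageServingApplication.class, args);
    }

}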

Whatever we pass as the path variable will be read from the /static directory (in our case page1 and page2). The /static directory will be mounted from a config map, but more about that later.
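
Because the path variable goes straight into a file path, a defensive check can be worth adding once this stops being a demo. The sketch below is not part of the original app; StaticPathResolver is a hypothetical helper that resolves a page name against a root directory and refuses anything that would end up outside of it.

```java
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper, not part of the original app.
final class StaticPathResolver {

    private final Path root;

    StaticPathResolver(Path root) {
        this.root = root.toAbsolutePath().normalize();
    }

    // Returns the resolved file path, or empty if the name would escape the root.
    Optional<Path> resolve(String page) {
        // normalize() collapses ".." segments; an absolute page name replaces the root entirely.
        Path candidate = root.resolve(page).normalize();
        // Reject the root itself and anything outside of it.
        if (candidate.equals(root) || !candidate.startsWith(root)) {
            return Optional.empty();
        }
        return Optional.of(candidate);
    }
}
```

In the controller this could be used as new StaticPathResolver(Path.of("/static")).resolve(page), returning a 404 when the result is empty.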

So now we have to build a native image and push it to the remote repository.

Tip number 1

<plugin>
    <groupId>org.graalvm.buildtools</groupId>
    <artifactId>native-maven-plugin</artifactId>
    <configuration>
        <buildArgs>
            <buildArg>-Ob</buildArg>
        </buildArgs>
    </configuration>
</plugin>

With GraalVM configured like this (-Ob is the quick build mode, which trades some runtime performance for build speed) you get the fastest build with the lowest memory consumption (around 2GB). For me it was a must, as I only have 16GB of memory and lots of stuff installed.

Tip number 2

<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <image>
            <publish>true</publish>
            <builder>paketobuildpacks/builder-jammy-full:latest</builder>
            <name>ghcr.io/dgawlik/webpage-serving:1.0.5</name>
            <env>
                <BP_JVM_VERSION>21</BP_JVM_VERSION>
            </env>
        </image>
        <docker>
            <publishRegistry>
                <url>https://ghcr.io/dgawlik</url>
                <username>dgawlik</username>
                <password>${env.GITHUB_TOKEN}</password>
            </publishRegistry>
        </docker>
    </configuration>
</plugin>

Use paketobuildpacks/builder-jammy-full:latest while you are testing, because -tiny and -base won’t have bash installed and you won’t be able to attach to the container. Once you are done you can switch.
publish set to true makes the image build push the result to the registry, so go ahead and switch it to your own repo.
BP_JVM_VERSION is the Java version used by the builder image; it should match the Java version of your project. As far as I know, the latest Java available is 21.

So now you do:

mvn spring-boot:build-image

And that’s it.

Operator with Fabric8

Now the fun starts. First you will need this in your pom:

<dependencies>
   <dependency>
      <groupId>io.fabric8</groupId>
      <artifactId>kubernetes-client</artifactId>
      <version>6.13.4</version>
   </dependency>
   <dependency>
      <groupId>io.fabric8</groupId>
      <artifactId>crd-generator-apt</artifactId>
      <version>6.13.4</version>
      <scope>provided</scope>
   </dependency>
</dependencies>

crd-generator-apt is an annotation processor that scans the project, detects CRD POJOs and generates the manifest.

Since I mentioned it, these resources are:

@Group("com.github.webserving")
@Version("v1alpha1")
@ShortNames("websrv")
public class WebServingResource extends CustomResource<WebServingSpec, WebServingStatus> implements Namespaced {
}

public record WebServingSpec(String page1, String page2) {
}

public record WebServingStatus (String status) {
}

What is common to resource manifests in Kubernetes is that most of them have a spec and a status. Here the spec consists of two pages, written in the manifest as multi-line YAML block strings (heredoc style). Now, the proper way to handle things would be to manipulate the status to reflect whatever the operator is doing. If, for example, it is waiting on the deployment to finish, it would have status = “Processing”; once everything is done it would patch the status to “Ready”, and so on. But we will skip that because this is just a simple demo.
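
For the curious, the status idea can be sketched without touching the cluster. The helper below is hypothetical: it only decides which status string fits the observed replica counts, using the two names mentioned above. Writing the result back to the resource is deliberately left out.

```java
// Hypothetical sketch: the status names follow the ones mentioned above,
// the method and its parameters are illustrative assumptions.
final class StatusSketch {

    // Mirrors the shape of the WebServingStatus record from the article.
    record WebServingStatus(String status) {}

    // Derives the status from the replica counts of the managed deployment.
    // readyReplicas is treated as nullable, assuming a deployment without
    // ready pods may not report the field at all.
    static WebServingStatus statusFor(int desiredReplicas, Integer readyReplicas) {
        int ready = readyReplicas == null ? 0 : readyReplicas;
        return new WebServingStatus(ready >= desiredReplicas ? "Ready" : "Processing");
    }

    public static void main(String[] args) {
        System.out.println(statusFor(1, null).status()); // prints Processing
        System.out.println(statusFor(1, 1).status());    // prints Ready
    }
}
```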

The good news is that the logic of the operator is all in the main class and is really short. So step by step, here it is:

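// Assumed setup, not shown in the original snippet: a virtual-thread executor
// and a plain object used as the monitor that wakes up the reconcile loop.
var executor = Executors.newVirtualThreadPerTaskExecutor();
var changes = new Object();

KubernetesClient client = new KubernetesClientBuilder()
    .withTaskExecutor(executor).build();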

var crdClient = client.resources(WebServingResource.class)
    .inNamespace("default");


var handler = new GenericResourceEventHandler<>(update -> {
   synchronized (changes) {
       changes.notifyAll();
   }
});

crdClient.inform(handler).start();

client.apps().deployments().inNamespace("default")
     .withName("web-serving-app-deployment").inform(handler).start();

client.services().inNamespace("default")
   .withName("web-serving-app-svc").inform(handler).start();

client.configMaps().inNamespace("default")
    .withName("web-serving-app-config").inform(handler).start();

So the heart of the program is of course the Fabric8 Kubernetes client built at the top of the snippet. It is convenient to customize it with your own executor. I used the famous virtual threads, so when the code waits on blocking IO, Java suspends the virtual thread and frees the underlying thread for other work.

Now here is a new part. The most basic version would be to run the loop forever and put Thread.sleep(1000) or so in it. But there is a more clever way: Kubernetes informers. An informer is a websocket connection to the Kubernetes API server that informs the client each time the subscribed resource changes. There is more to it that you can read about on the internet, for example how to use the various caches, which fetch updates all at once in a batch. But here it just subscribes directly per resource. The handler is a little bit bloated, so I wrote a helper class, GenericResourceEventHandler.

public class GenericResourceEventHandler<T> implements ResourceEventHandler<T> {

    private final Consumer<T> handler;

    public GenericResourceEventHandler(Consumer<T> handler) {
        this.handler = handler;
    }


    @Override
    public void onAdd(T obj) {
        this.handler.accept(obj);
    }

    @Override
    public void onUpdate(T oldObj, T newObj) {
        this.handler.accept(newObj);
    }

    @Override
    public void onDelete(T obj, boolean deletedFinalStateUnknown) {
        this.handler.accept(null);
    }
}

Since we only need to wake up the loop in all of these cases, we pass it one generic lambda. The idea is that the loop waits on a lock at the end of each iteration, and the informer callback wakes it up each time a change is detected.


for (; ; ) {

    var crdList = crdClient.list().getItems();
    var crd = Optional.ofNullable(crdList.isEmpty() ? null : crdList.get(0));


    var skipUpdate = false;
    var reload = false;

    if (!crd.isPresent()) {
        System.out.println("No WebServingResource found, reconciling disabled");
        currentCrd = null;
        skipUpdate = true;
    } else if (!crd.get().getSpec().equals(
            Optional.ofNullable(currentCrd)
                    .map(WebServingResource::getSpec).orElse(null))) {
        currentCrd = crd.orElse(null);
        System.out.println("Crd changed, Reconciling ConfigMap");
        reload = true;
    }

If there is no custom resource then there is nothing to be done. And if its spec changed then we have to reload everything.

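var currentConfigMap = client.configMaps().inNamespace("default")
        .withName("web-serving-app-config").get();

// Update when the live data DIFFERS from the desired data (note the negation).
// Only the data is compared: the object returned by the API server also carries
// server-populated metadata, so a whole-object equals() would not be meaningful.
if(!skipUpdate && (reload || !desiredConfigMap(currentCrd).getData().equals(
        Optional.ofNullable(currentConfigMap).map(ConfigMap::getData).orElse(null)))) {
    System.out.println("New configmap, reconciling WebServingResource");
    client.configMaps().inNamespace("default").withName("web-serving-app-config")
            .createOrReplace(desiredConfigMap(currentCrd));
    reload = true;
}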

This covers the case where the ConfigMap is changed between iterations. Since it is mounted in the pod, we then have to reload the deployment as well.

var currentServingDeploymentNullable = client.apps().deployments().inNamespace("default")
        .withName("web-serving-app-deployment").get();
var currentServingDeployment = Optional.ofNullable(currentServingDeploymentNullable);

if(!skipUpdate && (reload || !desiredWebServingDeployment(currentCrd).getSpec().equals(
        currentServingDeployment.map(Deployment::getSpec).orElse(null)))) {

    System.out.println("Reconciling Deployment");
    client.apps().deployments().inNamespace("default").withName("web-serving-app-deployment")
            .createOrReplace(desiredWebServingDeployment(currentCrd));
}

var currentServingServiceNullable = client.services().inNamespace("default")
            .withName("web-serving-app-svc").get();
var currentServingService = Optional.ofNullable(currentServingServiceNullable);

if(!skipUpdate && (reload || !desiredWebServingService(currentCrd).getSpec().equals(
        currentServingService.map(Service::getSpec).orElse(null)))) {

    System.out.println("Reconciling Service");
    client.services().inNamespace("default").withName("web-serving-app-svc")
            .createOrReplace(desiredWebServingService(currentCrd));
}

If either the service or the deployment differs from the desired definition, we replace it with the desired one.

synchronized (changes) {
    changes.wait();
}

Then the aforementioned lock.
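
One weakness of a bare wait/notify pair is the lost wake-up: if an informer callback fires while the loop is still busy reconciling, nobody is waiting yet and the notification is simply dropped. A minimal sketch of a fix, assuming a small hypothetical helper class ChangeSignal (not in the original code), is to remember that a change arrived:

```java
// Hypothetical replacement for the bare wait/notify pair.
final class ChangeSignal {

    // True when a change arrived that the loop has not consumed yet.
    private boolean pending = false;

    // Called from the informer callbacks.
    synchronized void signal() {
        pending = true;
        notifyAll();
    }

    // Called by the reconcile loop. Returns immediately if a change arrived
    // while the loop was busy, otherwise blocks until the next signal.
    synchronized void await() throws InterruptedException {
        // The while loop also guards against spurious wake-ups.
        while (!pending) {
            wait();
        }
        pending = false;
    }
}
```

The callback would then call signal() instead of changes.notifyAll(), and the loop would end with await() instead of changes.wait(). Several changes arriving in a burst collapse into one extra iteration, which is fine because each iteration looks at the full current state anyway.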

So now the only thing is to define the desired configmap, service and deployment.

private static Deployment desiredWebServingDeployment(WebServingResource crd) {
    return new DeploymentBuilder()
            .withNewMetadata()
            .withName("web-serving-app-deployment")
            .withNamespace("default")
            .addToLabels("app", "web-serving-app")
            .withOwnerReferences(createOwnerReference(crd))
            .endMetadata()
            .withNewSpec()
            .withReplicas(1)
            .withNewSelector()
            .addToMatchLabels("app", "web-serving-app")
            .endSelector()
            .withNewTemplate()
            .withNewMetadata()
            .addToLabels("app", "web-serving-app")
            .endMetadata()
            .withNewSpec()
            .addNewContainer()
            .withName("web-serving-app-container")
            .withImage("ghcr.io/dgawlik/webpage-serving:1.0.5")
            .withVolumeMounts(new VolumeMountBuilder()
                    .withName("web-serving-app-config")
                    .withMountPath("/static")
                    .build())
            .addNewPort()
            .withContainerPort(8080)
            .endPort()
            .endContainer()
            .withVolumes(new VolumeBuilder()
                    .withName("web-serving-app-config")
                    .withConfigMap(new ConfigMapVolumeSourceBuilder()
                            .withName("web-serving-app-config")
                            .build())
                    .build())
            .withImagePullSecrets(new LocalObjectReferenceBuilder()
                    .withName("regcred").build())
            .endSpec()
            .endTemplate()
            .endSpec()
            .build();
}

private static Service desiredWebServingService(WebServingResource crd) {
    // withNewMetadata()/withNewSpec() start from empty objects, matching the
    // style of the deployment builder; there is nothing existing to "edit" here.
    return new ServiceBuilder()
            .withNewMetadata()
            .withName("web-serving-app-svc")
            .withOwnerReferences(createOwnerReference(crd))
            .withNamespace(crd.getMetadata().getNamespace())
            .endMetadata()
            .withNewSpec()
            .addNewPort()
            .withPort(8080)
            .withTargetPort(new IntOrString(8080))
            .endPort()
            .addToSelector("app", "web-serving-app")
            .endSpec()
            .build();
}

private static ConfigMap desiredConfigMap(WebServingResource crd) {
    return new ConfigMapBuilder()
            .withMetadata(
                    new ObjectMetaBuilder()
                            .withName("web-serving-app-config")
                            .withNamespace(crd.getMetadata().getNamespace())
                            .withOwnerReferences(createOwnerReference(crd))
                            .build())
            .withData(Map.of("page1", crd.getSpec().page1(),
                    "page2", crd.getSpec().page2()))
            .build();
}

private static OwnerReference createOwnerReference(WebServingResource crd) {
    return new OwnerReferenceBuilder()
            .withApiVersion(crd.getApiVersion())
            .withKind(crd.getKind())
            .withName(crd.getMetadata().getName())
            .withUid(crd.getMetadata().getUid())
            .withController(true)
            .build();
}

The magic of the OwnerReference is that it marks which resource is the parent. Whenever you delete the parent, k8s will automatically delete all the dependent resources.
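
To see why one delete is enough, here is a toy model of that cascade. It is only an illustration: the real garbage collection happens inside Kubernetes, and the class, method names and UIDs below are made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Toy model of owner-reference based cleanup; not how Kubernetes implements it.
final class OwnerGcSketch {

    // Maps each resource UID to the UID of its owner (null for top-level resources).
    private final Map<String, String> ownerByUid = new HashMap<>();

    void add(String uid, String ownerUid) {
        ownerByUid.put(uid, ownerUid);
    }

    // Deletes a resource and, transitively, everything that names it as owner.
    void delete(String uid) {
        Deque<String> queue = new ArrayDeque<>();
        queue.add(uid);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            ownerByUid.remove(current);
            // Queue every direct dependent of the resource just removed.
            ownerByUid.forEach((child, owner) -> {
                if (current.equals(owner)) {
                    queue.add(child);
                }
            });
        }
    }

    Set<String> remaining() {
        return Set.copyOf(ownerByUid.keySet());
    }
}
```

In our case the custom resource is the parent, and the config map, deployment and service are its children; the deployment in turn owns its replica set, which owns the pods.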

But you can’t run it yet. You need Docker registry credentials in Kubernetes:

kubectl delete secret regcred --ignore-not-found

kubectl create secret docker-registry regcred \
  --docker-server=ghcr.io \
  --docker-username=dgawlik \
  --docker-password="$GITHUB_TOKEN"

Run this script once. Then we also need to set up the ingress (on minikube, enable the ingress addon first with minikube addons enable ingress):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
spec:
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-serving-app-svc
                port:
                  number: 8080

The workflow

So first you build the operator project. Then you take target/classes/META-INF/fabric8/webservingresources.com.github.webserving-v1.yml and apply it. From now on Kubernetes is ready to accept your custom resource. Here it is:

apiVersion: com.github.webserving/v1alpha1
kind: WebServingResource
metadata:
  name: example-ws
  namespace: default
spec:
  page1: |
    <h1>Hola amigos!</h1>
    <p>Buenos dias!</p>
  page2: |
    <h1>Hello my friend</h1>
    <p>Good evening</p>

You apply the custom resource with kubectl apply -f src/main/resources/crd-instance.yaml. And then you run Main of the operator.

Then watch the pod until it is up. Next, just take the IP of the cluster:

minikube ip

And in your browser navigate to /page1 and /page2 at that address.

Then try to change the custom resource and apply it again. After a second you should see the changes.

The end.

Conclusion

A bright observer will notice that the code has some concurrency issues. A lot can happen between the start and the end of the loop. But there are a lot of cases to consider and I tried to keep it simple. You can fix it as a follow-up exercise.

Likewise for the deployment of the operator itself. Instead of running it in the IDE, you can build an image the same way as for the serving app and write a deployment for it. That is basically the demystification of the operator: it is just a pod like any other.

I hope you found it useful.

Thanks for reading.

I almost forgot - here is the repo:

https://github.com/dgawlik/operator-hello-world