Most of us have probably used operators in Kubernetes, for example the PostgreSQL Operator or the VictoriaMetrics Operator.
But what’s going on under the hood? How and where are CustomResourceDefinitions (CRDs) applied, and what exactly is an “operator”?
And finally, what is the difference between a Kubernetes Operator and a Kubernetes Controller?
In the previous part — Kubernetes: Kubernetes APIs, API Groups, CRDs, etcd — we dug a little deeper into how the Kubernetes API works and what a CRD is, and now we can try to write our own micro-operator, a simple MVP, and use it as an example to understand the details.
Contents
- Kubernetes Controller vs Kubernetes Operator
- What is: Kubernetes Controller
- What is: Kubernetes Operator
- Kubernetes Operator frameworks
- Creating a CustomResourceDefinition
- Creating a Kubernetes Operator with Kopf
- Resource templates: Kopf and Kubebuilder
- And what about in real operators?
Kubernetes Controller vs Kubernetes Operator
So, what is the main difference between Controllers and Operators?
What is: Kubernetes Controller
Simply put, a Controller is just a service that monitors resources in a cluster and brings their actual state in line with the desired state described in the database, etcd.
In Kubernetes, we have a set of default controllers — Core Controllers within the Kube Controller Manager — such as the ReplicaSet Controller, which makes sure the number of Pods in a ReplicaSet matches its replicas value, or the Deployment Controller, which controls the creation and update of ReplicaSets, or the PersistentVolume Controller and PersistentVolumeClaim Binder for working with disks, etc.
In addition to these default controllers, you can create your own controller or use an existing one, such as ExternalDNS Controller. These are examples of custom controllers.
Controllers work in a control loop — a cyclic process in which they constantly check the resources assigned to them — either to change existing resources in the system or to respond to the addition of new ones.
During each check (the *reconciliation loop*), the Controller compares the current state of the resource with the desired state, that is, the parameters specified in its manifest when the resource was created or updated.
If the desired state does not correspond to the current state, the controller performs the necessary actions to bring these states into alignment.
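At a very high level, the reconcile logic of any controller can be sketched like this (schematic Python-like pseudocode; all helper names here are made up for illustration, this is not real controller code):

while True:
    # the desired state: what the resource manifest stored in etcd says
    desired = get_desired_state()
    # the current state: what actually exists in the cluster (or outside of it)
    current = observe_current_state()
    if current != desired:
        # create, update, or delete resources to close the gap
        apply_changes(current, desired)
    # then wait for the next watch event or timeout, and repeat
    wait_for_next_change()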
What is: Kubernetes Operator
Kubernetes Operator, in turn, is a kind of “controller on steroids”: in fact, Operator is a Custom Controller in the sense that it has its own service in the form of a Pod that communicates with the Kubernetes API to receive and update information about resources.
But while ordinary controllers work with “default” resource types (Pod, EndpointSlice, Node, PVC), for an Operator we describe our own custom resources using Custom Resource manifests.
What these resources look like and what parameters they have is defined through CustomResourceDefinitions, which are written to the Kubernetes database and registered in the Kubernetes API, and the Kubernetes API then allows our custom Controller to operate on these resources.
That is:
- Controller is a component, a service, and Operator is a combination of one or more custom Controllers and corresponding CRDs
- a Controller responds to changes in resources, while an Operator adds new resource types (CRDs) plus a controller that manages them
Kubernetes Operator frameworks
There are several solutions that simplify the creation of operators.
The main ones are Kubebuilder, a framework for creating controllers in Go, and Kopf, a framework in Python.
There is also the Operator SDK, which lets you build operators even from Helm charts, without writing any code.
At first, I was thinking of doing it in bare Go, without any frameworks, to better understand how everything works under the hood — but this post started to turn into 95% Golang.
And since the main idea of the post was to show conceptually what a Kubernetes Operator is, what role CustomResourceDefinitions play, and how they interact with each other and allow you to manage resources, I decided to use Kopf because it’s very simple and quite suitable for these purposes.
Creating a CustomResourceDefinition
Let’s start with writing the CRD.
Actually, CustomResourceDefinition is just a description of what fields our custom resource will have so that the controller can use them through the Kubernetes API to create real resources — whether they are some resources in Kubernetes itself, or external ones like AWS Load Balancer or AWS Route 53.
What we will do: we will write a CRD that describes a MyApp resource, and this resource will have fields for the Docker image and a custom field with some text that will then be written to the Kubernetes Pod logs.
Kubernetes documentation on CRD — Extend the Kubernetes API with CustomResourceDefinitions.
Create the file myapp-crd.yaml:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myapps.demo.rtfm.co.ua
spec:
  group: demo.rtfm.co.ua
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string
                banner:
                  type: string
                  description: "Optional banner text for the application"
  scope: Namespaced
  names:
    plural: myapps
    singular: myapp
    kind: MyApp
    shortNames:
      - ma
Here:
- spec.group: demo.rtfm.co.ua: creates a new API Group; all resources of this type will be available at /apis/demo.rtfm.co.ua/...
- versions: the list of versions of the new resource
  - name: v1: we will have only one version
  - served: true: adds the new resource to the Kube API, so you can do kubectl get myapp (GET /apis/demo.rtfm.co.ua/v1/myapps)
  - storage: true: this version will be used for storage in etcd (if several versions are described, only one can have storage: true)
  - schema:
    - openAPIV3Schema: describes the API schema according to OpenAPI v3
      - type: object: describes an object with nested fields (key: value)
      - properties: the fields the object will have
        - spec: what we can set in YAML manifests when creating a resource
          - type: object: describes the nested fields
          - properties:
            - image (type: string): a Docker image
            - banner (type: string): our custom field, through which we will add an entry to the resource’s logs
- scope: Namespaced: all resources of this type will exist in a specific Kubernetes Namespace
- names:
  - plural: myapps: the resources will be available through /apis/demo.rtfm.co.ua/v1/namespaces/<ns>/myapps/; it is also how we "access" the resource (kubectl get myapp) and what is used in RBAC, where you need to specify resources: ["myapps"]
  - singular: myapp: an alias for convenience
  - shortNames: [ma]: a short alias for convenience
Let’s start Minikube:
$ minikube start
Add the CRD:
$ kk apply -f myapp-crd.yaml
customresourcedefinition.apiextensions.k8s.io/myapps.demo.rtfm.co.ua created
Let’s look at the API Groups:
$ kubectl api-versions
...
demo.rtfm.co.ua/v1
...
And a new resource in this API Group:
$ kubectl api-resources --api-group=demo.rtfm.co.ua
NAME SHORTNAMES APIVERSION NAMESPACED KIND
myapps ma demo.rtfm.co.ua/v1 true MyApp
OK — we have created a CRD, and now we can even create a CustomResource (CR).
Create the file myapp-example-resource.yaml:
apiVersion: demo.rtfm.co.ua/v1   # matches the CRD's group and version
kind: MyApp                      # kind from the CRD's 'spec.names.kind'
metadata:
  name: example-app              # name of this custom resource
  namespace: default             # namespace (CRD has scope: Namespaced)
spec:
  image: nginx:latest            # container image to use (from our schema)
  banner: "This pod was created by MyApp operator 🚀"
Deploy:
$ kk apply -f myapp-example-resource.yaml
myapp.demo.rtfm.co.ua/example-app created
And check:
$ kk get myapp
NAME AGE
example-app 15s
But there are no resources of type Pod, because we do not have a controller yet that works with this resource type.
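The custom resource itself, however, is already fully served by the Kubernetes API, and that is exactly what an operator relies on under the hood. As a quick sketch, the same list we got with kubectl can be fetched programmatically with the Python kubernetes client (assuming the same local kubeconfig that kubectl uses):

import kubernetes

# load credentials from the local kubeconfig (the same one kubectl uses)
kubernetes.config.load_kube_config()

# CustomObjectsApi gives generic access to any CRD-backed resource
api = kubernetes.client.CustomObjectsApi()

# GET /apis/demo.rtfm.co.ua/v1/namespaces/default/myapps
myapps = api.list_namespaced_custom_object(
    group="demo.rtfm.co.ua",
    version="v1",
    namespace="default",
    plural="myapps",
)

for item in myapps["items"]:
    print(item["metadata"]["name"], item["spec"])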
Creating a Kubernetes Operator with Kopf
So, we will use Kopf to write an operator that creates a Kubernetes Pod based on our own CRD.
Create a Python virtual environment:
$ python -m venv venv
$ . ./venv/bin/activate
(venv)
Add dependencies to a requirements.txt file:
kopf
kubernetes
PyYAML
Install them with pip or uv:
$ pip install -r requirements.txt
Let’s write the operator code in a myoperator.py file:
import os

import kopf
import kubernetes
import yaml


# use kopf to register a handler for the creation of MyApp custom resources
@kopf.on.create('demo.rtfm.co.ua', 'v1', 'myapps')
# this function will be called when a new MyApp resource is created
def create_myapp(spec, name, namespace, logger, **kwargs):

    # get the image value from the spec of the CustomResource manifest
    image = spec.get('image')
    if not image:
        raise kopf.PermanentError("Field 'spec.image' must be provided.")

    # get the optional banner value from the CR manifest spec
    banner = spec.get('banner')

    # load the Pod template YAML from a file
    path = os.path.join(os.path.dirname(__file__), 'pod.yaml')
    with open(path, 'rt') as f:
        pod_template = f.read()

    # render the Pod YAML with the provided values
    pod_yaml = pod_template.format(
        name=f"{name}-pod",
        image=image,
        app_name=name,
    )

    # create the Pod definition from the rendered YAML:
    # PyYAML parses the YAML string into a Python dictionary,
    # which the Kubernetes API client can then use to create a Pod object
    pod_spec = yaml.safe_load(pod_yaml)

    # inject the banner as an environment variable if provided
    if banner:
        # add a new environment variable into the container spec
        container = pod_spec['spec']['containers'][0]
        env = container.setdefault('env', [])
        env.append({
            'name': 'BANNER',
            'value': banner,
        })

    # create a Kubernetes CoreV1 API client
    # used to interact with the Kubernetes API
    api = kubernetes.client.CoreV1Api()

    try:
        # send a request to the Kubernetes API to create a new Pod:
        # 'create_namespaced_pod' creates the Pod in the given namespace,
        # 'namespace' is where the Pod will be created,
        # 'body' is the Pod specification built from the YAML template
        api.create_namespaced_pod(namespace=namespace, body=pod_spec)
        logger.info(f"Pod {name}-pod created.")
    except kubernetes.client.exceptions.ApiException as e:
        logger.error(f"Failed to create pod {name}-pod: {e}")
Create the pod.yaml template that our Operator will use to create Pods:
apiVersion: v1
kind: Pod
metadata:
  name: {name}
  labels:
    app: {app_name}
spec:
  containers:
    - name: {app_name}
      image: {image}
      ports:
        - containerPort: 80
      env:
        - name: BANNER
          value: ""   # will be overridden in code if provided
      command: ["/bin/sh", "-c"]
      args:
        - |
          if [ -n "$BANNER" ]; then
            echo "$BANNER";
          fi
          exec sleep infinity
Run the operator with kopf run myoperator.py.
We already have a CustomResource created, and the Operator should see it and create a Kubernetes Pod:
$ kopf run myoperator.py --verbose
...
[2025-07-18 13:59:58,201] kopf._cogs.clients.w [DEBUG] Starting the watch-stream for customresourcedefinitions.v1.apiextensions.k8s.io cluster-wide.
[2025-07-18 13:59:58,201] kopf._cogs.clients.w [DEBUG] Starting the watch-stream for myapps.v1.demo.rtfm.co.ua cluster-wide.
[2025-07-18 13:59:58,305] kopf.objects [DEBUG] [default/example-app] Creation is in progress: {'apiVersion': 'demo.rtfm.co.ua/v1', 'kind': 'MyApp', 'metadata': {'annotations': {'kubectl.kubernetes.io/last-applied-configuration': '{"apiVersion":"demo.rtfm.co.ua/v1","kind":"MyApp","metadata":{"annotations":{},"name":"example-app","namespace":"default"},"spec":{"banner":"This pod was created by MyApp operator 🚀","image":"nginx:latest","replicas":3}}\n'}, 'creationTimestamp': '2025-07-18T09:55:42Z', 'generation': 2, 'managedFields': [{'apiVersion': 'demo.rtfm.co.ua/v1', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:annotations': {'.': {}, 'f:kubectl.kubernetes.io/last-applied-configuration': {}}}, 'f:spec': {'.': {}, 'f:banner': {}, 'f:image': {}, 'f:replicas': {}}}, 'manager': 'kubectl-client-side-apply', 'operation': 'Update', 'time': '2025-07-18T10:48:27Z'}], 'name': 'example-app', 'namespace': 'default', 'resourceVersion': '2955', 'uid': '8b674a99-05ab-4d4b-8205-725de450890a'}, 'spec': {'banner': 'This pod was created by MyApp operator 🚀', 'image': 'nginx:latest', 'replicas': 3}}
...
[2025-07-18 13:59:58,325] kopf.objects [INFO] [default/example-app] Pod example-app-pod created.
[2025-07-18 13:59:58,326] kopf.objects [INFO] [default/example-app] Handler 'create_myapp' succeeded.
...
Check the Pod:
$ kk get pod
NAME READY STATUS RESTARTS AGE
example-app-pod 1/1 Running 0 68s
And its logs:
$ kk logs -f example-app-pod
This pod was created by MyApp operator 🚀
So, the Operator launched the Pod from our CustomResource: it took the spec.banner field with the string "This pod was created by MyApp operator 🚀", passed it to the container as the BANNER environment variable, and the container's /bin/sh -c command printed it to the logs.
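Note that this minimal operator only reacts to resource creation. A real operator would also handle updates and deletions. For example, a deletion handler could look roughly like this (just a sketch reusing the imports from myoperator.py; a more robust approach would be to set an ownerReference on the Pod, for example with kopf.adopt(), so Kubernetes garbage-collects it automatically):

# a minimal sketch of a deletion handler for MyApp resources
@kopf.on.delete('demo.rtfm.co.ua', 'v1', 'myapps')
def delete_myapp(name, namespace, logger, **kwargs):
    api = kubernetes.client.CoreV1Api()
    try:
        # delete the Pod that create_myapp() created for this resource
        api.delete_namespaced_pod(name=f"{name}-pod", namespace=namespace)
        logger.info(f"Pod {name}-pod deleted.")
    except kubernetes.client.exceptions.ApiException as e:
        # 404 means the Pod is already gone, nothing to do
        if e.status != 404:
            raise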
Resource templates: Kopf and Kubebuilder
Instead of having a separate pod.yaml template file, we could describe everything directly in the operator code.
That is, you can describe something like this:
...
    # get the optional banner value
    banner = spec.get('banner', '')

    # define the Pod spec as a Python dict
    pod_spec = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"{name}-pod",
            "labels": {
                "app": name,
            },
        },
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "env": [
                        {
                            "name": "BANNER",
                            "value": banner,
                        },
                    ],
                    "command": ["/bin/sh", "-c"],
                    "args": ['echo "$BANNER"; exec sleep infinity'],
                    "ports": [
                        {
                            "containerPort": 80,
                        },
                    ],
                },
            ],
        },
    }

    # create a Kubernetes API client
    api = kubernetes.client.CoreV1Api()
...
And in the case of Kubebuilder, a function is usually created that takes the CustomResource (cr *myappv1.MyApp) and builds an object of type *corev1.Pod using the Go structs corev1.PodSpec and corev1.Container:
...
// newPod is a helper function that builds a Kubernetes Pod object
// based on the custom MyApp resource. It returns a pointer to corev1.Pod,
// which is later passed to controller-runtime's client.Create(...) to create the Pod in the cluster.
func newPod(cr *myappv1.MyApp) *corev1.Pod {
  // `cr` is a pointer to your CustomResource of kind MyApp
  // type MyApp is generated by Kubebuilder and lives in your `api/v1/myapp_types.go`
  // it contains fields like cr.Spec.Image, cr.Spec.Banner, cr.Name, cr.Namespace, etc.
  return &corev1.Pod{
    // corev1.Pod is a Go struct representing the built-in Kubernetes Pod type
    // it's defined in the "k8s.io/api/core/v1" package (aliased here as corev1)
    // we return a pointer to it (`*corev1.Pod`) because client-go methods like
    // `client.Create()` expect pointer types
    ObjectMeta: metav1.ObjectMeta{
      // metav1.ObjectMeta comes from "k8s.io/apimachinery/pkg/apis/meta/v1"
      // it defines metadata like name, namespace, labels, annotations, ownerRefs, etc.
      Name:      cr.Name + "-pod", // generate the Pod name based on the CR's name
      Namespace: cr.Namespace,     // place the Pod in the same namespace as the CR
      Labels: map[string]string{ // set a label for identification or selection
        "app": cr.Name, // e.g., `app=example-app`
      },
    },
    Spec: corev1.PodSpec{
      // corev1.PodSpec defines everything about how the Pod runs,
      // including containers, volumes, restart policy, etc.
      Containers: []corev1.Container{
        // define a single container inside the Pod
        {
          Name:  cr.Name,       // use the CR name as the container name (must be DNS compliant)
          Image: cr.Spec.Image, // container image (e.g., "nginx:1.25")
          Env: []corev1.EnvVar{
            // corev1.EnvVar is a struct that defines environment variables
            {
              Name:  "BANNER",       // name of the variable
              Value: cr.Spec.Banner, // value from the CR spec
            },
          },
          Command: []string{"/bin/sh", "-c"},
          // override the container ENTRYPOINT to run a shell command
          Args: []string{
            // run a command that prints the banner and sleeps forever
            // fmt.Sprintf(...) injects the value into the string at runtime
            fmt.Sprintf(`echo "%s"; exec sleep infinity`, cr.Spec.Banner),
          },
          // optional: could also add ports, readiness/liveness probes, etc.
        },
      },
    },
  }
}
...
And what about in real operators?
But we did this for “internal” Kubernetes resources.
What about external resources?
Here’s just an example — I haven’t tested it, but the general idea is this: take an SDK (in the Python example, it’s boto3), and using the fields from the CustomResource (for example, subnets or scheme), make the appropriate API requests to AWS through the SDK.
An example of such a CustomResource:
apiVersion: demo.rtfm.co.ua/v1
kind: MyIngress
metadata:
  name: myapp
spec:
  subnets:
    - subnet-abc
    - subnet-def
  scheme: internet-facing
And the code that could create an AWS ALB from it:
import kopf
import boto3
import botocore

# create a global boto3 client for the AWS ELBv2 service;
# this client will be reused for all requests from the operator
# NOTE: the region must match where your subnets and VPC exist
elbv2 = boto3.client("elbv2", region_name="us-east-1")


# define a handler that is triggered when a new MyIngress resource is created
@kopf.on.create('demo.rtfm.co.ua', 'v1', 'myingresses')
def create_ingress(spec, name, namespace, status, patch, logger, **kwargs):

    # extract the list of subnet IDs from the CustomResource 'spec.subnets' field;
    # these subnets must belong to the same VPC and be public if scheme=internet-facing
    subnets = spec.get('subnets')

    # extract the optional scheme (default to 'internet-facing' if not provided)
    scheme = spec.get('scheme', 'internet-facing')

    # validate input: an ALB requires at least two subnets in different AZs
    if not subnets or len(subnets) < 2:
        raise kopf.PermanentError("spec.subnets must contain at least two subnets.")

    # attempt to create an ALB in AWS using the provided spec
    # via the boto3 ELBv2 client
    try:
        response = elbv2.create_load_balancer(
            Name=f"{name}-alb",    # the ALB name is derived from the CR name
            Subnets=subnets,       # list of subnet IDs provided by the user
            Scheme=scheme,         # 'internet-facing' or 'internal'
            Type='application',    # we are creating an ALB (not an NLB)
            IpAddressType='ipv4',  # only IPv4 supported here (could be 'dualstack')
            Tags=[                 # add tags for ownership tracking
                {'Key': 'ManagedBy', 'Value': 'kopf'},
            ]
        )
    except botocore.exceptions.ClientError as e:
        # if the AWS API fails (e.g. invalid subnet, quota exceeded), retry later
        raise kopf.TemporaryError(f"Failed to create ALB: {e}", delay=30)

    # parse the ALB metadata from the AWS response
    lb = response['LoadBalancers'][0]  # the list should contain exactly one entry
    dns_name = lb['DNSName']           # external DNS of the ALB (e.g. abc.elb.amazonaws.com)
    arn = lb['LoadBalancerArn']        # unique ARN of the ALB (used for deletion or listeners)

    # log the creation for operator diagnostics
    logger.info(f"Created ALB: {dns_name}")

    # save the ALB info into the CustomResource status field:
    # this updates .status.alb.dns and .status.alb.arn in the CR object
    patch.status['alb'] = {
        'dns': dns_name,
        'arn': arn,
    }

    # return a dict that will be stored in the handler's state
    # and can be used later during deletion to clean up the ALB
    return {'alb-arn': arn}
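And to actually clean up the ALB when the MyIngress resource is deleted, a matching deletion handler could read the ARN saved in the resource status and call delete_load_balancer (again, an untested sketch under the same assumptions as the code above):

# a sketch of the matching deletion handler for MyIngress resources
@kopf.on.delete('demo.rtfm.co.ua', 'v1', 'myingresses')
def delete_ingress(name, status, logger, **kwargs):
    # read the ARN saved earlier into .status.alb by the create handler
    arn = (status.get('alb') or {}).get('arn')
    if not arn:
        logger.warning(f"No ALB ARN found in status of {name}, nothing to delete.")
        return
    try:
        # delete the load balancer by its ARN
        elbv2.delete_load_balancer(LoadBalancerArn=arn)
        logger.info(f"Deleted ALB {arn}")
    except botocore.exceptions.ClientError as e:
        # retry later if AWS is temporarily unavailable
        raise kopf.TemporaryError(f"Failed to delete ALB: {e}", delay=30)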
In the case of Go and Kubebuilder, we would use the aws-sdk-go-v2 library:
import (
  "context"
  "fmt"

  "github.com/aws/aws-sdk-go-v2/aws"
  elbv2 "github.com/aws/aws-sdk-go-v2/service/elasticloadbalancingv2"
  "github.com/aws/aws-sdk-go-v2/service/elasticloadbalancingv2/types"

  networkingv1 "k8s.io/api/networking/v1"
)

func newALB(ctx context.Context, client *elbv2.Client, cr *networkingv1.Ingress) (string, error) {
  // build the input for the ALB
  input := &elbv2.CreateLoadBalancerInput{
    Name:          aws.String(fmt.Sprintf("%s-alb", cr.Name)),
    Subnets:       []string{"subnet-abc123", "subnet-def456"}, // replace with real subnets
    Scheme:        types.LoadBalancerSchemeEnumInternetFacing,
    Type:          types.LoadBalancerTypeEnumApplication,
    IpAddressType: types.IpAddressTypeIpv4,
    Tags: []types.Tag{
      {
        Key:   aws.String("ManagedBy"),
        Value: aws.String("MyIngressOperator"),
      },
    },
  }

  // create the ALB
  output, err := client.CreateLoadBalancer(ctx, input)
  if err != nil {
    return "", fmt.Errorf("failed to create ALB: %w", err)
  }
  if len(output.LoadBalancers) == 0 {
    return "", fmt.Errorf("ALB was not returned by AWS")
  }

  // return the DNS name of the ALB
  return aws.ToString(output.LoadBalancers[0].DNSName), nil
}
In the real AWS ALB Ingress Controller, the creation of an ALB is called in the elbv2.go file:
...
func (c *elbv2Client) CreateLoadBalancerWithContext(ctx context.Context, input *elasticloadbalancingv2.CreateLoadBalancerInput) (*elasticloadbalancingv2.CreateLoadBalancerOutput, error) {
  client, err := c.getClient(ctx, "CreateLoadBalancer")
  if err != nil {
    return nil, err
  }
  return client.CreateLoadBalancer(ctx, input)
}
...
...
Actually, that’s all there is to it.
Originally published at RTFM: Linux, DevOps, and system administration.