Taming the Kubernetes Beast: Your Grand Tour of Custom Controllers
So, you've been navigating the majestic, sometimes bewildering, world of Kubernetes. You're comfortable with Pods, Deployments, and maybe even Services. You're feeling pretty slick, right? But then, you hit a wall. You have a specific, niche operational task that Kubernetes just doesn't handle out-of-the-box. You need something more. Something tailored to your unique flavor of chaos.
Enter the Custom Controller.
Think of Kubernetes as a giant, incredibly powerful operating system for your applications. It has a rich set of built-in tools (those native resources we love) to manage your workloads. But just like any OS, there are always specialized tasks that require custom scripts or utilities. Custom Controllers are Kubernetes' way of letting you write those specialized utilities, empowering you to extend its functionality and automate the heck out of anything you can dream up.
This isn't just about adding a few more YAML files. This is about diving deep into the heart of Kubernetes' control plane, becoming a wizard of its reconciliation loop, and wielding the power to automate complex scenarios. So, buckle up, buttercup, because we're about to embark on a grand tour of writing your very own K8s Custom Controllers.
Why Bother? The Sweet Symphony of Custom Controllers
Before we dive headfirst into the nitty-gritty, let's address the elephant in the room: why would you even want to write a custom controller? What's wrong with the existing tools?
Well, sometimes, the existing tools are like trying to hammer in a screw. They work, but they're not ideal. Custom Controllers offer some serious advantages:
- Tailored Automation: This is the big kahuna. You have a specific workflow, a unique integration, or a complex operational pattern? A custom controller can automate it precisely how you need it. Think about automatically provisioning cloud resources based on a custom object, managing a stateful application with intricate dependencies, or enforcing custom security policies.
- Declarative Dreamland: Kubernetes thrives on the declarative model. You declare the desired state, and Kubernetes makes it happen. Custom Controllers extend this principle. You define your custom resource's desired state in YAML, and your controller ensures the cluster reaches that state. No more imperative scripts that get messy.
- Extending Kubernetes API: Custom Controllers work hand-in-hand with Custom Resource Definitions (CRDs). CRDs allow you to define new object types in Kubernetes, essentially extending its API. Your controller then becomes the "operator" for these new objects, managing their lifecycle.
- Reduced Operational Burden: Automating complex tasks means less manual intervention, fewer errors, and happier ops teams. Imagine not having to manually set up complex networking configurations or provision storage for every new instance of your custom application.
- Vendor-Agnostic Solutions: You can build controllers that abstract away vendor-specific details, providing a consistent Kubernetes experience regardless of the underlying infrastructure.
The Bare Necessities: What You'll Need
Before we start sketching out our controller blueprints, let's make sure you have the right tools and knowledge in your arsenal. Think of this as your pre-flight checklist:
- Kubernetes Cluster Access: Obviously! You'll need a cluster to deploy and test your controller. A local setup like Minikube, Kind, or Docker Desktop with Kubernetes enabled is perfect for development.
- kubectl Familiarity: You should be comfortable using kubectl to interact with your cluster, create resources, and check logs.
- Go Programming Language: While you can write controllers in other languages (Python, for example), Go is the native language of Kubernetes and has excellent tooling and community support for controller development.
- Understanding of Kubernetes Concepts: A solid grasp of Pods, Deployments, Services, Namespaces, RBAC, and the Kubernetes API is crucial.
- Basic Understanding of CRDs: You'll need to know how to define your custom resources.
- Optional but Highly Recommended: Kubebuilder or Operator SDK: These are fantastic frameworks that significantly simplify controller development by providing scaffolding, code generation, and helpful utilities. We'll be referencing their principles, and using them is highly encouraged.
The Dark Side: Potential Pitfalls and Challenges
As with any powerful tool, custom controllers come with their own set of challenges. It's good to be aware of them upfront:
- Complexity: Writing and maintaining custom controllers can be complex. You're dealing with distributed systems, state management, and the intricacies of the Kubernetes API.
- Debugging Headaches: Debugging distributed systems is notoriously difficult. You'll need robust logging and potentially remote debugging capabilities.
- Version Skew: Kubernetes is constantly evolving. You'll need to be mindful of API version changes and ensure your controller is compatible with the Kubernetes versions you're targeting.
- Resource Intensive: Your controller itself will run as a Pod in the cluster, consuming resources. You need to ensure it's efficient and doesn't hog the cluster.
- Security Considerations: Controllers often have elevated privileges to manage cluster resources. Implementing proper RBAC is paramount to prevent unintended consequences.
The Heart of the Matter: How Controllers Work – The Reconciliation Loop
This is where the magic happens. At its core, a Kubernetes controller operates on a principle called the reconciliation loop. Imagine it as a tireless guardian constantly observing your cluster.
Here's the simplified breakdown:
- Watch: The controller constantly "watches" for changes to specific Kubernetes resources (both native ones like Pods and your custom CRDs). It leverages the Kubernetes API server's watch mechanism.
- Compare: When a change is detected, the controller compares the actual state of the resource in the cluster with its desired state. The desired state is typically defined in the spec field of your custom resource.
- Act: If there's a discrepancy, the controller takes action to reconcile the actual state with the desired state. This might involve creating, updating, or deleting other Kubernetes resources (Pods, Services, Deployments, etc.) or performing external actions.
- Repeat: The loop continuously repeats, ensuring that the cluster's state always aligns with the desired state.
Think of it like this: you have a thermostat (your CRD's spec defining the desired temperature). The thermostat (your controller) constantly checks the room temperature (the actual state). If the room is too cold, it turns on the heater (creates a Pod). If it's too hot, it might turn off a fan (deletes a resource).
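To make the analogy concrete, here's a deliberately naive, self-contained Go sketch of the watch/compare/act cycle as a thermostat. It is illustrative only — real controllers react to watch events and use a work queue rather than polling on a timer, as we'll see below — and the Thermostat types are made up for this example.

// Illustrative only: a hand-rolled reconciliation loop.
package main

import (
	"fmt"
	"time"
)

// Hypothetical stand-ins: the spec is the desired state, the status is the
// observed state of the world.
type ThermostatSpec struct{ DesiredTemp int }
type RoomStatus struct{ CurrentTemp int }

// reconcile compares desired vs. actual state and acts to close the gap.
func reconcile(spec ThermostatSpec, status *RoomStatus) {
	switch {
	case status.CurrentTemp < spec.DesiredTemp:
		status.CurrentTemp++ // too cold: "create" some heat
		fmt.Println("heating ->", status.CurrentTemp)
	case status.CurrentTemp > spec.DesiredTemp:
		status.CurrentTemp-- // too hot: "delete" some heat
		fmt.Println("cooling ->", status.CurrentTemp)
	default:
		fmt.Println("in sync at", status.CurrentTemp) // nothing to do
	}
}

func main() {
	spec := ThermostatSpec{DesiredTemp: 21}
	status := &RoomStatus{CurrentTemp: 18}
	for i := 0; i < 5; i++ { // Repeat: a real controller's loop never ends.
		reconcile(spec, status)
		time.Sleep(100 * time.Millisecond)
	}
}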
Bringing Your Custom Resources to Life: CRDs and Controllers in Action
Let's get our hands dirty with a conceptual example. Imagine we want to create a custom resource called MyApp that represents our custom application. We want the controller to ensure that a Deployment and a Service are always running for each MyApp instance.
Step 1: Define Your Custom Resource (CRD)
First, we need to define our MyApp resource using a CRD. This tells Kubernetes about our new object type.
# myapp-crd.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: myapps.example.com # Must be <plural>.<group>
spec:
  group: example.com # API group
  versions:
    - name: v1 # API version
      served: true
      storage: true
      subresources:
        status: {} # Enables the /status subresource so the controller can update status separately
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                image:
                  type: string
                  description: The container image to run for the application.
                replicas:
                  type: integer
                  description: The desired number of replicas for the deployment.
                port:
                  type: integer
                  description: The port the application listens on.
              required:
                - image
                - replicas
                - port
            status:
              type: object
              properties:
                availableReplicas:
                  type: integer
                  description: The number of available replicas for the deployment.
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      status:
                        type: string
                      lastTransitionTime:
                        type: string
                      reason:
                        type: string
                      message:
                        type: string
  scope: Namespaced # Can be Namespaced or Cluster
  names:
    plural: myapps
    singular: myapp
    kind: MyApp # Kind of the custom resource
    shortNames:
      - ma
Once we apply this CRD (kubectl apply -f myapp-crd.yaml), Kubernetes will recognize MyApp objects.
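The controller code in Step 3 will also need Go types that mirror this schema. With Kubebuilder these live in something like api/v1/myapp_types.go and are generated together with deepcopy and scheme-registration code. Here's a minimal sketch of what they might look like (field names chosen to match the CRD above; the module layout is an assumption):

// api/v1/myapp_types.go (minimal sketch; Kubebuilder/controller-gen would also
// generate DeepCopy methods and an AddToScheme function for these types)
package v1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// MyAppSpec mirrors the spec section of the CRD schema.
type MyAppSpec struct {
	Image    string `json:"image"`
	Replicas int32  `json:"replicas"`
	Port     int32  `json:"port"`
}

// MyAppStatus mirrors the status section of the CRD schema.
type MyAppStatus struct {
	AvailableReplicas int32              `json:"availableReplicas,omitempty"`
	Conditions        []metav1.Condition `json:"conditions,omitempty"`
}

// MyApp is the Schema for the myapps API.
type MyApp struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   MyAppSpec   `json:"spec,omitempty"`
	Status MyAppStatus `json:"status,omitempty"`
}

// MyAppList contains a list of MyApp.
type MyAppList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []MyApp `json:"items"`
}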
Step 2: Create an Instance of Your Custom Resource
Now, we can create an instance of our MyApp resource.
# myapp-instance.yaml
apiVersion: example.com/v1
kind: MyApp
metadata:
name: my-awesome-app
namespace: default
spec:
image: nginx:latest
replicas: 3
port: 80
Apply this: kubectl apply -f myapp-instance.yaml.
Step 3: Building the Controller (Conceptual Go Code)
This is where the actual controller logic lives. Using a framework like Kubebuilder, you'd generate a lot of this boilerplate. Here's a simplified conceptual Go snippet to illustrate the core idea.
// main.go (simplified conceptual controller logic)
package main
import (
	"context"
	"fmt"
	"log"
	"os"
	"time"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/util/intstr"
	clientgoscheme "k8s.io/client-go/kubernetes/scheme"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Import your custom resource API types
	examplev1 "your_module_path/api/v1"
)
// MyAppReconciler reconciles a MyApp object
type MyAppReconciler struct {
	client.Client
	Scheme *runtime.Scheme // used for scheme-aware operations such as setting owner references
}
// Reconcile is part of the main kubernetes reconciliation loop
func (r *MyAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
log.Printf("Reconciling MyApp: %s/%s", req.Namespace, req.Name)
// Fetch the MyApp instance
myApp := &examplev1.MyApp{}
if err := r.Get(ctx, req.NamespacedName, myApp); err != nil {
// If MyApp instance not found, we can safely ignore it
// and stop reconciliation.
if client.IgnoreNotFound(err) != nil {
log.Printf("Error fetching MyApp: %v", err)
return ctrl.Result{}, err
}
return ctrl.Result{}, nil
}
// --- Reconciliation Logic ---
// 1. Ensure Deployment exists
deploymentName := fmt.Sprintf("%s-deployment", myApp.Name)
desiredDeployment := &appsv1.Deployment{
ObjectMeta: metav1.ObjectMeta{
Name: deploymentName,
Namespace: myApp.Namespace,
Labels: map[string]string{
"app": myApp.Name,
},
},
Spec: appsv1.DeploymentSpec{
Replicas: &myApp.Spec.Replicas,
Selector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"app": myApp.Name,
},
},
Template: corev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: map[string]string{
"app": myApp.Name,
},
},
Spec: corev1.PodSpec{
Containers: []corev1.Container{{
Name: "app-container",
Image: myApp.Spec.Image,
Ports: []corev1.ContainerPort{{
ContainerPort: int32(myApp.Spec.Port),
}},
}},
},
},
},
}
// Try to get the existing deployment
existingDeployment := &appsv1.Deployment{}
err := r.Get(ctx, client.ObjectKeyFromObject(desiredDeployment), existingDeployment)
if err != nil {
// If deployment doesn't exist, create it
if client.IgnoreNotFound(err) != nil {
log.Printf("Error checking deployment: %v", err)
return ctrl.Result{}, err
}
log.Printf("Creating deployment for %s", myApp.Name)
if err := r.Create(ctx, desiredDeployment); err != nil {
log.Printf("Error creating deployment: %v", err)
return ctrl.Result{}, err
}
} else {
// If deployment exists, update it if necessary
// (simplified: just check replica count for this example)
if *existingDeployment.Spec.Replicas != myApp.Spec.Replicas {
log.Printf("Updating deployment for %s", myApp.Name)
existingDeployment.Spec.Replicas = &myApp.Spec.Replicas
if err := r.Update(ctx, existingDeployment); err != nil {
log.Printf("Error updating deployment: %v", err)
return ctrl.Result{}, err
}
}
}
// 2. Ensure Service exists
serviceName := fmt.Sprintf("%s-service", myApp.Name)
desiredService := &corev1.Service{
ObjectMeta: metav1.ObjectMeta{
Name: serviceName,
Namespace: myApp.Namespace,
},
Spec: corev1.ServiceSpec{
Selector: map[string]string{
"app": myApp.Name,
},
			Ports: []corev1.ServicePort{{
				Port:       int32(myApp.Spec.Port),
				TargetPort: intstr.FromInt(int(myApp.Spec.Port)), // TargetPort is an IntOrString, not a plain int32
			}},
Type: corev1.ServiceTypeClusterIP,
},
}
// Try to get the existing service
existingService := &corev1.Service{}
err = r.Get(ctx, client.ObjectKeyFromObject(desiredService), existingService)
if err != nil {
if client.IgnoreNotFound(err) != nil {
log.Printf("Error checking service: %v", err)
return ctrl.Result{}, err
}
log.Printf("Creating service for %s", myApp.Name)
if err := r.Create(ctx, desiredService); err != nil {
log.Printf("Error creating service: %v", err)
return ctrl.Result{}, err
}
} else {
// Optionally, check and update service if spec changes (e.g., port)
}
// Update the status of the MyApp object (optional but good practice)
myApp.Status.AvailableReplicas = existingDeployment.Status.AvailableReplicas // This would be more complex in reality
if err := r.Status().Update(ctx, myApp); err != nil {
log.Printf("Error updating MyApp status: %v", err)
return ctrl.Result{}, err
}
// Requeue after a short period if something might still need reconciliation
// or if we want to periodically check something.
return ctrl.Result{RequeueAfter: time.Minute}, nil
}
func main() {
	// Setup logger
	setupLog := ctrl.Log.WithName("setup")

	// Build a scheme containing both the built-in types and our custom types.
	scheme := runtime.NewScheme()
	_ = clientgoscheme.AddToScheme(scheme)
	_ = examplev1.AddToScheme(scheme) // generated by Kubebuilder alongside your API types

	// Manager setup (using controller-runtime)
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		Scheme: scheme,
	})
	if err != nil {
		setupLog.Error(err, "unable to start manager")
		os.Exit(1)
	}

	// Create a reconciler and register it with the manager
	reconciler := &MyAppReconciler{
		Client: mgr.GetClient(),
		Scheme: mgr.GetScheme(),
	}
if err = reconciler.SetupWithManager(mgr); err != nil {
setupLog.Error(err, "unable to create controller", "controller", "MyApp")
os.Exit(1)
}
// Start the manager
setupLog.Info("starting manager")
if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
setupLog.Error(err, "problem running manager")
os.Exit(1)
}
}
Key takeaways from the code:
- client.Client: This is your interface to the Kubernetes API for reading and writing objects.
- ctrl.Request: This contains the namespaced name of the resource that triggered the reconciliation.
- r.Get(): Used to fetch existing Kubernetes resources.
- client.IgnoreNotFound(): A handy utility to gracefully handle resources that don't exist yet.
- r.Create(), r.Update(), r.Delete(): Methods for modifying resources.
- metav1.ObjectMeta: Essential for defining metadata like name, namespace, and labels.
- appsv1.Deployment and corev1.Service: Examples of native Kubernetes resources you'll often create or manage.
- myApp.Spec and myApp.Status: Accessing the desired state from your custom resource and updating its actual status.
- ctrl.Result{RequeueAfter: time.Minute}: A common pattern to tell the controller to re-evaluate the resource after a certain period.
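One piece that main() calls but the snippet doesn't define is SetupWithManager. With controller-runtime's builder it's typically only a few lines; here's a minimal sketch assuming the same MyAppReconciler and examplev1 package as above:

// SetupWithManager registers this reconciler with the manager. Watching the
// owned Deployments and Services means changes to them (for example, someone
// deleting one by hand) also trigger a reconcile of the parent MyApp.
func (r *MyAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&examplev1.MyApp{}).    // primary resource: reconcile on MyApp changes
		Owns(&appsv1.Deployment{}). // secondary: reconcile when owned Deployments change
		Owns(&corev1.Service{}).    // secondary: reconcile when owned Services change
		Complete(r)
}

Note that Owns() only routes Deployment and Service events back to the parent MyApp if the controller sets owner references on the objects it creates — more on that in the best-practices section below.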
Orchestrating Your Controller: Deployment and RBAC
Once you've written your controller, you need to deploy it to your Kubernetes cluster. This typically involves:
- Building a Docker Image: Package your controller code into a container image.
- Creating a Deployment: Define a Kubernetes Deployment to run your controller Pods.
- Setting up RBAC: This is CRITICAL. Your controller will need permissions to watch and manipulate other Kubernetes resources. You'll create a ServiceAccount, ClusterRole (or Role), and ClusterRoleBinding (or RoleBinding) to grant these permissions.
Here's a peek at a simplified Deployment for our controller:
# controller-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-controller
namespace: kube-system # Or any other suitable namespace
spec:
replicas: 1
selector:
matchLabels:
app: myapp-controller
template:
metadata:
labels:
app: myapp-controller
spec:
serviceAccountName: myapp-controller-sa # The Service Account your controller will use
containers:
- name: controller
image: your-dockerhub-username/myapp-controller:latest # Replace with your image
command: ["/myapp-controller"] # The entrypoint of your controller
# Add resource requests/limits here
And here's a glimpse of the RBAC:
# rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: myapp-controller-sa
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: myapp-controller-role
rules:
  - apiGroups: ["example.com"]
    resources: ["myapps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["example.com"]
    resources: ["myapps/status"]
    verbs: ["get", "update", "patch"] # Needed because the controller updates the status subresource
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""] # Core API group
    resources: ["services"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""] # Core API group
    resources: ["events"]
    verbs: ["create", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: myapp-controller-binding
subjects:
- kind: ServiceAccount
name: myapp-controller-sa
namespace: kube-system # Namespace where your controller is deployed
roleRef:
kind: ClusterRole
name: myapp-controller-role
apiGroup: rbac.authorization.k8s.io
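If you scaffold your project with Kubebuilder, you usually don't hand-write this RBAC at all: marker comments above your Reconcile method are turned into an equivalent ClusterRole by controller-gen. Here's a sketch of markers roughly matching the rules above (group and resource names taken from this example; the exact set depends on your project):

// Place these marker comments directly above your Reconcile method. Running
// `make manifests` (controller-gen) turns them into RBAC YAML under config/rbac.
// (groups=core is how the markers spell the empty "" core API group.)
//+kubebuilder:rbac:groups=example.com,resources=myapps,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=example.com,resources=myapps/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=services,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=events,verbs=create;patch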
Features and Best Practices for Robust Controllers
As you mature in your controller development, consider these features and best practices:
- Status Updates: Always update the status subresource of your custom object to reflect the actual state of the managed resources. This provides valuable insight to users.
- Events: Emit Kubernetes Events to signal important occurrences (creation, update, errors). This helps with debugging and monitoring.
- Owner References: Ensure your controller sets an OwnerReference on the resources it creates. This allows Kubernetes to garbage collect those resources when the parent custom object is deleted (see the sketch after this list).
- Finalizers: For graceful deletion, implement Finalizers. They prevent an object from being deleted until your controller has performed cleanup tasks (also sketched below).
- Idempotency: Design your reconciliation logic to be idempotent. Running the same reconciliation multiple times should have the same outcome as running it once.
- Error Handling and Retries: Implement robust error handling and leverage ctrl.Result{Requeue: true} or RequeueAfter for transient errors.
- Metrics and Monitoring: Expose Prometheus metrics from your controller to monitor its health and performance.
- Testing: Write unit and integration tests for your controller to ensure its correctness.
- Use a Framework: Seriously, Kubebuilder or Operator SDK will save you immense pain and time.
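To make the owner-reference, finalizer, and idempotency points concrete, here's a minimal sketch of those patterns using controller-runtime's controllerutil helpers, assuming the same MyAppReconciler as in the earlier example. Treat it as a pattern to adapt, not drop-in code.

// Additions to main.go illustrating owner references, idempotent create/update,
// and a finalizer flow. One extra import is assumed:
//   "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"

const myAppFinalizer = "example.com/myapp-cleanup" // hypothetical finalizer name

// ensureDeployment shows the idiomatic CreateOrUpdate + owner-reference pattern.
// Calling it repeatedly with the same inputs converges to the same result.
func (r *MyAppReconciler) ensureDeployment(ctx context.Context, myApp *examplev1.MyApp, dep *appsv1.Deployment) error {
	_, err := controllerutil.CreateOrUpdate(ctx, r.Client, dep, func() error {
		// Mutate dep here to match the desired state derived from myApp.Spec, e.g.:
		//   dep.Spec.Replicas = &myApp.Spec.Replicas
		// The owner reference lets Kubernetes garbage-collect the Deployment when
		// the MyApp is deleted, and lets Owns(&appsv1.Deployment{}) route events back.
		return ctrl.SetControllerReference(myApp, dep, r.Scheme)
	})
	return err
}

// handleFinalizer registers a finalizer while the object is alive and runs
// cleanup before allowing Kubernetes to finish the deletion.
func (r *MyAppReconciler) handleFinalizer(ctx context.Context, myApp *examplev1.MyApp) (deleting bool, err error) {
	if myApp.ObjectMeta.DeletionTimestamp.IsZero() {
		// Not being deleted: make sure our finalizer is present.
		if !controllerutil.ContainsFinalizer(myApp, myAppFinalizer) {
			controllerutil.AddFinalizer(myApp, myAppFinalizer)
			return false, r.Update(ctx, myApp)
		}
		return false, nil
	}
	// Being deleted: perform cleanup (e.g. deprovision external resources),
	// then remove the finalizer so the API server can delete the object.
	controllerutil.RemoveFinalizer(myApp, myAppFinalizer)
	return true, r.Update(ctx, myApp)
}

In practice, the Reconcile function shown earlier would call handleFinalizer near the top and use something like ensureDeployment in place of its manual get/create/update logic.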
The Horizon: Operators and Beyond
Custom Controllers are the foundation for Operators. An Operator is essentially a custom controller that encapsulates operational knowledge for a specific application or service. It knows how to deploy, scale, upgrade, and manage that application within Kubernetes. The Operator SDK is a fantastic tool for building Operators.
Conclusion: Your Journey to K8s Mastery Continues
Writing custom controllers is a significant step in becoming a Kubernetes power user. It's about moving beyond just consuming Kubernetes resources to actively shaping and extending its capabilities. While it can be challenging, the rewards in terms of automation, customization, and operational efficiency are immense.
So, go forth, embrace the reconciliation loop, and start taming the Kubernetes beast to your specific needs. The world of custom controllers awaits, and your journey to deeper K8s mastery has just begun! Happy coding!