Ankur Sinha

From CRDs to Controllers: Building a Kubernetes Custom Controller from Scratch

If you’ve worked with Kubernetes, you’ve probably seen how it can orchestrate complex workflows using custom resources.

But how does this actually work behind the scenes?

To understand this better, I built Mini Task Runner, a simplified Tekton-like task execution system from scratch. This project explores how Kubernetes can be extended using Custom Resource Definitions (CRDs) and custom controllers.

In this post, we will look at how custom resources interact with the Kubernetes API server and etcd, why code generation is required when building controllers in Go, and how an event-driven controller architecture works.


1. Kubernetes Extensibility with CRDs

Out of the box, Kubernetes understands a fixed set of standard resources such as Pods, Deployments, Services, and ReplicaSets.

When you submit a Deployment to the API server, the configuration is stored in etcd, which acts as the cluster’s key-value datastore. Built-in controllers monitor these resources and continuously reconcile the desired state with the actual state of the cluster.

But what if you want Kubernetes to understand concepts like:

  • a CI/CD pipeline
  • a database cluster
  • a workflow execution

This is where Custom Resource Definitions (CRDs) come in.

CRDs allow developers to define their own API objects and extend the Kubernetes API.

For the Mini Task Runner, I defined two custom resources:

Task

A reusable template describing the steps required to execute a task.

Each step specifies:

  • a container image
  • a script to run

TaskRun

An execution instance that triggers a Task.

A TaskRun references a Task and stores execution status such as the current phase, start time, and completion time.

However, defining CRDs alone is not enough. Kubernetes will store these resources in etcd, but nothing will act on them.

To make these resources functional, we need a Custom Controller. The controller watches the API server for changes to these resources and performs actions to reconcile the desired state with the current state.


2. Code Generation for Custom Resources

Before writing the controller logic, our Go program needs to understand the custom API types (Task and TaskRun).

Unlike built-in Kubernetes objects, these types do not exist in standard Kubernetes client libraries. As a result, we must generate supporting code.

Every Kubernetes object must implement certain interfaces, including methods that safely create deep copies of objects in memory. Writing these functions manually would be error-prone and repetitive.

Kubernetes provides code-generation tools that automatically generate the required plumbing.

After defining the Go structs for our custom resources and adding the appropriate annotations, the code generator produces:

  1. DeepCopy methods – required for Kubernetes object handling
  2. Typed clientsets – strongly typed clients for interacting with the API server
  3. Informers and listers – used to build event-driven controllers

After code generation, the controller has two clients available:

  • A core Kubernetes client to create and manage resources such as Pods
  • A custom client to interact with the Task and TaskRun resources

This generated code provides the foundation needed to implement the controller.
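To make the DeepCopy requirement concrete, here is a simplified, hand-written sketch of the kind of method the generator emits. The struct is a hypothetical stand-in for the real TaskRun status type, not the actual generated code:

```go
package main

import "fmt"

// TaskRunStatus is a simplified, hypothetical stand-in for the real
// status struct of the TaskRun custom resource.
type TaskRunStatus struct {
	Phase string
	Steps []string
}

// DeepCopy is the kind of method the code generator emits: it copies
// the struct and every reference-typed field, so the copy shares no
// memory with the original.
func (in *TaskRunStatus) DeepCopy() *TaskRunStatus {
	if in == nil {
		return nil
	}
	out := new(TaskRunStatus)
	out.Phase = in.Phase
	if in.Steps != nil {
		out.Steps = make([]string, len(in.Steps))
		copy(out.Steps, in.Steps)
	}
	return out
}

func main() {
	orig := &TaskRunStatus{Phase: "Running", Steps: []string{"build"}}
	cp := orig.DeepCopy()
	cp.Steps[0] = "test" // mutating the copy must not touch the original
	fmt.Println(orig.Steps[0], cp.Steps[0]) // build test
}
```

Writing this by hand for every field of every type is exactly the repetitive, error-prone work the generator removes.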


3. Controller Architecture

A controller is responsible for observing changes in the cluster and taking actions to move the system toward the desired state.

There are two main approaches for implementing a controller.


Polling-Based Controller

A naive approach would be to continuously poll the Kubernetes API server.

For example, a loop might periodically:

  • list all TaskRuns
  • list all Pods
  • compare the state
  • take action if needed

(Diagram: polling-based controller loop)

Polling is inefficient because it repeatedly queries the API server, increasing load and introducing unnecessary latency.
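The polling loop above can be sketched in plain Go. The list functions are stand-ins, not real client calls; in a real polling controller each one would be a full network round trip to the API server:

```go
package main

import (
	"fmt"
	"time"
)

// Stand-ins for API server LIST calls; in a real polling controller
// each invocation is a full network round trip.
func listTaskRuns() []string { return []string{"default/taskrun-1"} }
func listPods() []string     { return []string{"default/taskrun-1-pod"} }

// poll runs n polling ticks and returns how many API calls were made.
func poll(n int) int {
	ticker := time.NewTicker(time.Millisecond)
	defer ticker.Stop()
	calls := 0
	for i := 0; i < n; i++ {
		<-ticker.C
		_ = listTaskRuns() // full LIST on every tick,
		_ = listPods()     // even when nothing has changed
		calls += 2
		// comparing desired vs. actual state and acting would go here
	}
	return calls
}

func main() {
	fmt.Println("API calls after 3 ticks:", poll(3)) // 6
}
```

The cost grows with the polling frequency and the number of resource types watched, regardless of whether anything changed.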

Kubernetes controllers instead rely on an event-driven architecture.


Event-Driven Controller Architecture

Production-grade Kubernetes controllers rely on four key components:

  • Informers
  • Listers
  • Workqueues
  • Workers

(Diagram: event-driven controller workflow)

Informers

Informers maintain a long-lived watch connection with the Kubernetes API server. Whenever a resource is created, updated, or deleted, the API server sends an event to the informer.

This avoids repeated polling and allows the controller to react immediately to changes.

Listers (Local Cache)

Informers maintain a local cache of resources. Instead of querying the API server for every read operation, the controller reads data from this in-memory cache.

This significantly reduces network calls and improves performance.

Workqueue

When an event occurs, the informer pushes the resource key into a rate-limited workqueue.

The queue acts as a buffer that ensures resources are processed safely and avoids overwhelming the controller during bursts of updates.

Workers

Workers are background goroutines that continuously read items from the workqueue and process them using the controller’s reconciliation logic.

Multiple workers can run in parallel, allowing the controller to handle many events concurrently.
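The real implementation uses client-go's shared informers and rate-limited workqueue; the flow itself, though, is a queue of resource keys drained by concurrent workers, which can be sketched with plain channels and goroutines:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// drain pushes the given namespace/name keys onto a queue and
// processes them with the given number of concurrent workers,
// returning the number of keys handled. In the real controller,
// informer event handlers enqueue keys and each worker invokes
// the reconciliation logic per key.
func drain(keys []string, workers int) int {
	queue := make(chan string, len(keys))
	for _, k := range keys {
		queue <- k
	}
	close(queue)

	var processed atomic.Int64
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for key := range queue {
				_ = key // reconcile(key) would run here
				processed.Add(1)
			}
		}()
	}
	wg.Wait()
	return int(processed.Load())
}

func main() {
	keys := []string{"default/taskrun-a", "default/taskrun-b", "default/taskrun-c"}
	fmt.Println(drain(keys, 2), "keys reconciled") // 3 keys reconciled
}
```

What this sketch omits is exactly what client-go's workqueue adds on top: deduplication of keys, rate limiting, and requeueing on failure.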


4. The Reconciliation Loop

The core logic of the controller is implemented in the reconciliation loop.

A worker retrieves a resource key from the workqueue and performs the following steps:

1. Fetch the Current State

The controller retrieves the corresponding TaskRun from the local cache.

2. Determine the Current Phase

The controller checks the TaskRun status.

  • If the phase is Succeeded or Failed, no further action is required.
  • If the phase is empty, it indicates a new execution request.

3. Start Execution

For a new TaskRun:

  • The referenced Task is retrieved.
  • Task steps are converted into container definitions.
  • A Pod is created with restartPolicy: Never.
  • The TaskRun status is updated to Pending.
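The step-to-container conversion in this flow can be illustrated with simplified local types. The real controller builds the Kubernetes API's container objects; the structs here are stand-ins:

```go
package main

import "fmt"

// Step and Container are simplified local stand-ins for the real
// API types used by the controller.
type Step struct {
	Name   string
	Image  string
	Script string
}

type Container struct {
	Name    string
	Image   string
	Command []string
	Args    []string
}

// toContainers turns each Task step into a container that runs the
// step's script through a shell -- the same idea as the controller's
// Pod builder, reduced to plain structs.
func toContainers(steps []Step) []Container {
	out := make([]Container, 0, len(steps))
	for _, s := range steps {
		out = append(out, Container{
			Name:    s.Name,
			Image:   s.Image,
			Command: []string{"sh", "-c"},
			Args:    []string{s.Script},
		})
	}
	return out
}

func main() {
	cs := toContainers([]Step{{Name: "hello", Image: "alpine", Script: "echo hi"}})
	fmt.Println(len(cs), cs[0].Image) // 1 alpine
}
```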

4. Track Execution

If the TaskRun is in the Pending or Running phase, the controller checks the associated Pod.

  • If the Pod is running → update phase to Running
  • If the Pod succeeds → update phase to Succeeded
  • If the Pod fails → update phase to Failed
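At its heart, this phase mapping is a small pure function. A sketch, with phase names as plain strings where the real code uses the API types' constants:

```go
package main

import "fmt"

// nextPhase maps an observed Pod phase to the TaskRun phase the
// controller should record. Phase names follow the article; the
// function itself is an illustrative sketch, not the project's code.
func nextPhase(podPhase string) string {
	switch podPhase {
	case "Running":
		return "Running"
	case "Succeeded":
		return "Succeeded"
	case "Failed":
		return "Failed"
	default:
		return "Pending" // Pod not yet scheduled or still starting
	}
}

func main() {
	for _, p := range []string{"Pending", "Running", "Succeeded", "Failed"} {
		fmt.Println(p, "->", nextPhase(p))
	}
}
```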

5. Update Status

The TaskRun status is updated through the Kubernetes API server, ensuring the cluster state remains consistent.

If reconciliation fails due to transient issues, the resource is automatically requeued with exponential backoff.

This ensures reliability without overwhelming the API server.


5. Adding a kubectl Plugin

Creating TaskRun resources manually using YAML can be inconvenient. Tools like Tekton provide CLI utilities to simplify this process.

kubectl supports plugins, which allow custom commands to be added easily.

If an executable named kubectl-<command> is placed in the system PATH, kubectl treats it as a native command.

To improve usability, I implemented a small CLI tool:

kubectl task start <task-name>

This command creates a TaskRun resource with a unique name and submits it to the cluster.
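Generating the unique name can be as simple as appending a random suffix to the Task name, similar in spirit to the Kubernetes generateName field. The suffix alphabet and length here are arbitrary choices, not the project's actual scheme:

```go
package main

import (
	"fmt"
	"math/rand"
)

// uniqueName derives a TaskRun name from the Task name plus a short
// random suffix, so repeated `kubectl task start` invocations never
// collide. Lowercase letters and digits keep the result a valid
// Kubernetes object name.
func uniqueName(task string) string {
	const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"
	suffix := make([]byte, 5)
	for i := range suffix {
		suffix[i] = alphabet[rand.Intn(len(alphabet))]
	}
	return fmt.Sprintf("%s-run-%s", task, suffix)
}

func main() {
	fmt.Println(uniqueName("build-task")) // e.g. build-task-run-x7k2p
}
```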

Once the TaskRun is created, the controller detects it through the informer, enqueues the event, and begins executing the corresponding Task.


Want to Try It Yourself?

This article focuses on the architecture and design of the controller.

If you want to run the project yourself, the implementation described in this article is available in the fork1 branch of the repository:

👉 https://github.com/ankrsinha/mini-task/tree/fork1

The repository includes instructions for:

  • installing the CRDs
  • running the code generator
  • starting the controller locally
  • using the kubectl task start plugin

Conclusion

Building Mini Task Runner provided a deeper understanding of how Kubernetes controllers operate internally.

Some key takeaways from this project:

  • CRDs define the data model, but controllers provide the logic that makes them useful.
  • Code generation simplifies controller development by generating clients, informers, and deep-copy methods.
  • Event-driven architectures using informers and workqueues allow controllers to scale efficiently without overloading the API server.

For anyone interested in Kubernetes internals, implementing a custom controller is one of the best ways to understand how the control plane operates.


Next Step

In the next article, I rebuild the same controller using controller-runtime, a framework widely used for implementing Kubernetes controllers.

From client-go to controller-runtime: Rebuilding a Kubernetes Controller https://dev.to/ankrsinha/from-client-go-to-controller-runtime-rebuilding-a-kubernetes-controller-5c20


Authors

  • Ankur Sinha
  • Aditya Shinde
