I actually started to write about creating my own Kubernetes Operator, but decided to make a separate topic out of what a Kubernetes CustomResourceDefinition is, and how creating a CRD works at the level of the Kubernetes API and etcd.
That is, we'll start with how Kubernetes actually works with resources, and what happens when we create or edit them.
The second part: Kubernetes: what is Kubernetes Operator and CustomResourceDefinition.
Contents
- Kubernetes API
- Kubernetes API Groups and Kind
- Kubernetes and etcd
- CustomResourceDefinitions and Kubernetes API
- Kubernetes API Service
Kubernetes API
So, all communication with the Kubernetes Control Plane takes place through its main endpoint — the Kubernetes API, served by the API Server component of the Control Plane — see Cluster Architecture.
Documentation — The Kubernetes API and Kubernetes API Concepts.
Through the API, we communicate with Kubernetes, and all resources and information about them are stored in the etcd database.
Other components of the Control Plane are the Kube Controller Manager, with a set of default controllers that are responsible for working with resources, and the Scheduler, which decides how Pods will be placed on Worker Nodes.
The Kubernetes API is just a regular HTTPS REST API that we can access even with curl.
To access the cluster, we can use kubectl proxy, which will take the parameters from ~/.kube/config with the API Server address and token, and create a tunnel to it.
I have access to AWS EKS configured, so the connection will go to it:
$ kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080
And we turn to the API:
$ curl -s localhost:8080 | jq
{
"paths": [
"/.well-known/openid-configuration",
"/api",
"/api/v1",
"/apis",
...
"/version"
]
}
Actually, what we see is a list of API endpoints supported by the Kubernetes API:
- /api/: information on the Kubernetes API itself and the entry point to the core API Groups (see below)
- /api/v1: the core API group with Pods, ConfigMaps, Services, etc.
- /apis/: an APIGroupList - the rest of the API Groups in the system and their versions, including API Groups created from CRDs; for example, for the API Group operator.victoriametrics.com we can see support for two versions: operator.victoriametrics.com/v1 and operator.victoriametrics.com/v1beta1
- /version: information on the cluster version (an example follows below)
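For example, the /version endpoint returns the build information of the API Server (the field names are fixed, but the values will of course depend on your cluster):
$ curl -s localhost:8080/version | jq
{
"major": "1",
"minor": "32",
"gitVersion": "v1.32.0",
...
}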
And then we can go deeper and see what’s inside each endpoint, for example, to get information about all Pods in the cluster:
$ curl -s localhost:8080/api/v1/pods | jq
...
{
"metadata": {
"name": "backend-ws-deployment-6db58cc97c-k56lm",
...
"namespace": "staging-backend-api-ns"
"labels": {
"app": "backend-ws",
"component": "backend",
...
"spec": {
"volumes": [
{
"name": "eks-pod-identity-token",
...
"containers": [
{
"name": "backend-ws-container",
"image": "492***148.dkr.ecr.us-east-1.amazonaws.com/challenge-backend-api:v0.171.9",
"command": [
"gunicorn",
"websockets_backend.run_api:app",
...
"resources": {
"requests": {
"cpu": "200m",
"memory": "512Mi"
}
},
...
Here we can see information about the Pod named "backend-ws-deployment-6db58cc97c-k56lm", which lives in the Kubernetes Namespace "staging-backend-api-ns", along with the rest of its details: the volumes, containers, resources, and so on.
Kubernetes API Groups and Kind
API Groups are a way to organize resources in Kubernetes: resources are grouped by API Group, version, and resource type (Kind).
That is, the structure of the API is:
- API Group
- versions
- kind
For example, in /api/v1 we see the Kubernetes Core API Group, and in /apis - the API Groups apps, batch, events, and so on.
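A quick way to see just the group names is to filter the APIGroupList with jq:
$ curl -s localhost:8080/apis | jq -r '.groups[].name'
apps
batch
...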
The structure will be as follows:
- /apis/<group> - the group itself and its versions
- /apis/<group>/<version> - a specific version of the group with specific resources (Kind)
- /apis/<group>/<version>/<resource> - access to a specific resource and the objects in it
Note: Kind vs resource: Kind is the name of the object type as specified in the resource's schema, while resource is the name used to build the URI when requesting the API Server.
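This Kind vs resource mapping is easy to see with kubectl api-resources, which prints the resource name used in URIs next to the Kind (output trimmed here):
$ kubectl api-resources | grep -E 'NAME|^deployments '
NAME          SHORTNAMES   APIVERSION   NAMESPACED   KIND
deployments   deploy       apps/v1      true         Deployment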
For example, for the API Group apps we have the version v1:
$ curl -s localhost:8080/apis/apps | jq
{
"kind": "APIGroup",
"apiVersion": "v1",
"name": "apps",
"versions": [
{
"groupVersion": "apps/v1",
"version": "v1"
}
],
...
And inside the version — resources, for example, deployments:
$ curl -s localhost:8080/apis/apps/v1 | jq
{
...
{
"name": "deployments",
"singularName": "deployment",
"namespaced": true,
"kind": "Deployment",
"verbs": [
"create",
"delete",
"deletecollection",
"get",
"list",
"patch",
"update",
"watch"
],
"shortNames": [
"deploy"
],
"categories": [
"all"
],
...
And using this group, version, and specific resource type (kind), we get all the objects:
$ curl -s localhost:8080/apis/apps/v1/deployments/ | jq
{
"kind": "DeploymentList",
"apiVersion": "apps/v1",
"metadata": {
"resourceVersion": "1534"
},
"items": [
{
"metadata": {
"name": "coredns",
"namespace": "kube-system",
"uid": "9d7f6de3-041e-4afe-84f4-e124d2cc6e8a",
"resourceVersion": "709",
"generation": 2,
"creationTimestamp": "2025-07-12T10:15:33Z",
"labels": {
"k8s-app": "kube-dns"
},
...
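The same pattern works for a single object: adding a Namespace and a name to the path returns just that Deployment (assuming the kubectl proxy from above is still running):
$ curl -s localhost:8080/apis/apps/v1/namespaces/kube-system/deployments/coredns | jq .metadata.name
"coredns"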
Okay, so we’ve accessed the API — but where does it get all that data that we’re being shown?
Kubernetes and etcd
For storing data in Kubernetes, we have another key component of the Control Plane — etcd.
Actually, this is just a key:value database with all the data that forms our cluster — all its settings, all resources, all states of these resources, RBAC rules, etc.
When the Kubernetes API Server receives a request, for example, POST /apis/apps/v1/namespaces/default/deployments, it first checks whether the object matches the resource schema (validation), and only then saves it to etcd.
The etcd database consists of a set of keys. For example, a Pod named "nginx-abc" in the default Namespace will be stored under the key /registry/pods/default/nginx-abc.
See the documentation Operating etcd clusters for Kubernetes.
In AWS EKS, we don't have access to etcd (and that's a good thing), but we can start Minikube and take a look at it:
$ minikube start
...
🏄 Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default
Check the system pods:
$ kubectl -n kube-system get pod
NAME READY STATUS RESTARTS AGE
coredns-674b8bbfcf-68q8p 0/1 ContainerCreating 0 57s
etcd-minikube 1/1 Running 0 62s
...
Connect to the cluster:
$ minikube ssh
If we had used minikube start --driver=virtualbox, we would have used minikube ssh to enter the VirtualBox instance. But since we have the default docker driver, we simply enter the minikube container.
Install etcd here to get the etcdctl CLI utility:
docker@minikube:~$ sudo apt update
docker@minikube:~$ sudo apt install etcd
Check it:
docker@minikube:~$ etcdctl --version
etcdctl version: 3.3.25
And now we can see what’s in the database:
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl \
--endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/minikube/certs/etcd/ca.crt \
--cert=/var/lib/minikube/certs/etcd/server.crt \
--key=/var/lib/minikube/certs/etcd/server.key \
get "" --prefix --keys-only
...
/registry/namespaces/kube-system
/registry/pods/kube-system/coredns-674b8bbfcf-68q8p
/registry/pods/kube-system/etcd-minikube
...
/registry/services/endpoints/default/kubernetes
/registry/services/endpoints/kube-system/kube-dns
...
The data in the keys is stored in the Protobuf (Protocol Buffers) format, so with a plain etcdctl get KEY the output will look a bit garbled.
Let's see what the database holds about the Pod of etcd itself:
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/var/lib/minikube/certs/etcd/ca.crt --cert=/var/lib/minikube/certs/etcd/server.crt --key=/var/lib/minikube/certs/etcd/server.key get "/registry/pods/kube-system/etcd-minikube"
The result is the Pod's record in its raw Protobuf form: mostly binary, with only some fields, like names and the image tag, readable as plain text.
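To see the same object in a human-readable form, it's easier to ask the API Server for it (from the host, not from inside the minikube container) - it reads the record from etcd and returns normal YAML:
$ kubectl -n kube-system get pod etcd-minikube -o yaml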
CustomResourceDefinitions and Kubernetes API
So, when we create a CRD, we extend the Kubernetes API by creating our own API Group with our own name, version, and a new resource type (Kind) that is described in the CRD.
Documentation — Extend the Kubernetes API with CustomResourceDefinitions.
Let’s write a simple CRD:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: myapps.mycompany.com
spec:
group: mycompany.com
names:
kind: MyApp
plural: myapps
singular: myapp
scope: Namespaced
versions:
- name: v1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
image:
type: string
Here we:
- use the existing API Group apiextensions.k8s.io and version v1
- take from it the schema of the CustomResourceDefinition object
- based on this schema, create our own API Group named mycompany.com
- in this API Group, describe a single resource type - kind: MyApp
- and one version - v1
- then, using openAPIV3Schema, describe the schema of our resource - which fields it has and their types; here you can also set default values (see OpenAPI Specification and the sketch right after this list)
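For example (a hypothetical fragment, not part of the CRD above), a default for spec.image can be declared right in the schema, and the API Server will fill the field in when a manifest omits it:
schema:
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        properties:
          image:
            type: string
            # assumed value, for illustration only
            default: nginx:latest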
With this CRD, we will be able to create new Custom Resources with a manifest in which we pass the apiVersion, kind, and spec.image fields, the latter coming from the schema.openAPIV3Schema.properties.spec.properties.image of our CRD:
apiVersion: mycompany.com/v1
kind: MyApp
metadata:
name: example
spec:
image: nginx:1.25
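And since the API Server validates objects against this schema before writing them to etcd, a manifest with a wrong field type - say, a hypothetical bad-resource.yaml with image: 123 - will be rejected with an error along these lines (the exact wording may differ between Kubernetes versions):
$ kubectl apply -f bad-resource.yaml
The MyApp "example" is invalid: spec.image: Invalid value: "integer": spec.image in body must be of type string: "integer"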
Create the CRD:
$ kk apply -f test-crd.yaml
customresourcedefinition.apiextensions.k8s.io/myapps.mycompany.com created
Check in the Kubernetes API (you can use the | jq '.groups[] | select(.name == "mycompany.com")' filter):
$ curl -s localhost:8080/apis/ | jq
...
{
"name": "mycompany.com",
"versions": [
{
"groupVersion": "mycompany.com/v1",
"version": "v1"
}
],
...
}
...
And the API Group mycompany.com itself:
$ curl -s localhost:8080/apis/mycompany.com/v1 | jq
{
"kind": "APIResourceList",
"apiVersion": "v1",
"groupVersion": "mycompany.com/v1",
"resources": [
{
"name": "myapps",
"singularName": "myapp",
"namespaced": true,
"kind": "MyApp",
"verbs": [
"delete",
"deletecollection",
"get",
"list",
"patch",
"create",
"update",
"watch"
],
"storageVersionHash": "MZjF6nKlCOU="
}
]
}
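At this point, the endpoint for the objects themselves already works too, even though we haven't created any MyApp resources yet, so the list is simply empty (a sketch; metadata trimmed):
$ curl -s localhost:8080/apis/mycompany.com/v1/myapps | jq
{
"apiVersion": "mycompany.com/v1",
"items": [],
"kind": "MyAppList",
...
}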
And in etcd:
docker@minikube:~$ sudo ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/var/lib/minikube/certs/etcd/ca.crt --cert=/var/lib/minikube/certs/etcd/server.crt --key=/var/lib/minikube/certs/etcd/server.key get "" --prefix --keys-only
/registry/apiextensions.k8s.io/customresourcedefinitions/myapps.mycompany.com
...
/registry/apiregistration.k8s.io/apiservices/v1.mycompany.com
...
Here, the /registry/apiextensions.k8s.io/customresourcedefinitions/myapps.mycompany.com key stores information about the new CRD itself - the CRD structure, its OpenAPI schema, versions, etc. - and the /registry/apiregistration.k8s.io/apiservices/v1.mycompany.com key registers the API Service for this group, so the group can be accessed via the Kubernetes API.
And of course, we can see the CRD with kubectl:
$ kk get crd
NAME CREATED AT
myapps.mycompany.com 2025-07-12T11:23:19Z
Create the CustomResource itself from the manifest we wrote above:
$ kk apply -f test-resource.yaml
myapp.mycompany.com/example created
Test it:
$ kk describe MyApp
Name: example
Namespace: default
Labels: <none>
Annotations: <none>
API Version: mycompany.com/v1
Kind: MyApp
Metadata:
Creation Timestamp: 2025-07-12T13:34:52Z
Generation: 1
Resource Version: 4611
UID: a88e37fd-1477-4a7e-8c00-46c925f510ac
Spec:
Image: nginx:1.25
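The same object is also available directly through the API, like any built-in resource (assuming the kubectl proxy from above is still running):
$ curl -s localhost:8080/apis/mycompany.com/v1/namespaces/default/myapps/example | jq .spec
{
"image": "nginx:1.25"
}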
But for now, this is just data in etcd - we don't have any real Pods, because there is no controller that handles resources of Kind: MyApp.
Note: looking ahead to the next post - actually, a Kubernetes Operator is a set of CRDs plus a controller that "controls" resources with the specified Kind.
Kubernetes API Service
When we add a new CRD, Kubernetes not only has to create a new key in etcd with the new API Group and the corresponding resource schema, but also add a new endpoint to its routes - just as we do in Python with @app.get("/") in FastAPI - so that the API Server knows that a GET request to /apis/mycompany.com/v1/myapps should return resources of this type.
The corresponding API Service will contain a spec with the group and version:
$ kk get apiservice v1.mycompany.com -o yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
creationTimestamp: "2025-07-12T11:53:52Z"
labels:
kube-aggregator.kubernetes.io/automanaged: "true"
name: v1.mycompany.com
resourceVersion: "2632"
uid: 26fc8c6b-6770-422f-8996-3f35d86be6c7
spec:
group: mycompany.com
groupPriorityMinimum: 1000
version: v1
versionPriority: 100
...
That is, when we create a new CRD, the Kubernetes API Server creates an API Service (writing it to /registry/apiregistration.k8s.io/apiservices/v1.mycompany.com) and adds it to its own routes under the /apis endpoint.
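A quick sanity check that the route is actually served - ask the API Server directly with kubectl get --raw:
$ kubectl get --raw /apis/mycompany.com/v1/myapps | jq .kind
"MyAppList"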
And now, having an idea of what the API looks like and the database that stores all the resources, we can move on to creating the CRD and controller, that is, to actually write the Operator itself.
But this is already in the next part.
Originally published at RTFM: Linux, DevOps, and system administration.