Goal
[!TIP]
In a hurry? Jump directly to the Result section to see the outcome of this dive.
Hello everyone! This post's primary goal is to demystify how Kubernetes Controllers work by building a custom Athenz/k8s-athenz-syncer from scratch! Instead of relying on "magic" libraries or copying existing production code, we are diving deep into the core concepts by implementing the operator ourselves.
By doing this, we can:
- Understand the core concepts of Kubernetes Operators and the Reconciliation Loop.
- Learn how to interact with the Athenz ZMS server via its API programmatically.
- Implement logic that automatically maps external identity roles (Athenz) to Kubernetes RBAC.
- Identify the subtle details often overlooked when simply deploying pre-made operators.
- Know exactly what to look for when reviewing an operator, including the details other reviewers would miss.
Table of Contents
- Goal
- Table of Contents
- Result
- What I Learned
- Walkthrough
- Prerequisites & Setup
- a. Local Kubernetes Cluster (Kind)
- b. Deploy Athenz Server
- c. Configure Athenz Domains
- Implementation: The Hard & Clean Way
- 1. Initialize the Project
- 2. Make an operator that works from scratch
- 3. Refactor the operator to make it cleaner
- 4. Write demo/local setup guide in README.md
- 5. Make make run write a config
- Verification: Does it actually work?
- What's next?
- Dive Hours: 28 hours
- Closing
Result
I successfully built a working Kubernetes operator and you can find it here:
- Setup guide: mlajkim/k8s-athenz-syncer-the-hard-clean-way/README.md
- PR: k8s-athenz-syncer-the-hard-clean-way.
The operator, k8s-athenz-syncer-the-hard-clean-way, performs the following actions automatically when a Namespace <ns> is created in the cluster:
- Creates a corresponding Athenz Domain (e.g., eks.users.<ns>).
- Creates Athenz Roles within that domain based on a configuration.
- Creates Kubernetes RBAC Roles that map to those Athenz roles.
- Periodically polls Athenz to sync member changes into Kubernetes RoleBindings.
Demo 1: Auto-creation of Resources in Athenz with k8s namespace only
Demo 2: Giving permission with Syncing Membership (Polling)
The operator k8s-athenz-syncer-the-hard-clean-way periodically polls the Athenz roles under a configured parent domain (e.g., eks.users) and syncs their members into the RoleBindings of the corresponding Kubernetes RBAC Roles, so access control follows Athenz role membership automatically.

The operator also makes sure that when you delete members from an Athenz role, those members are removed from the RoleBindings of the corresponding Kubernetes RBAC Role.
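The add and remove behavior described above boils down to a set difference between what Athenz reports and what Kubernetes currently has bound. Below is a minimal sketch of that idea in plain Go. It is not the operator's actual code: diffMembers is a hypothetical helper, and user.alice and user.bob are made-up members used only for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// diffMembers compares the members Athenz reports for a role (desired) with
// the subjects currently bound in Kubernetes (current). It returns the members
// that should be added to and removed from the RoleBinding.
// Hypothetical helper for illustration only.
func diffMembers(desired, current []string) (toAdd, toRemove []string) {
	want := make(map[string]struct{}, len(desired))
	for _, m := range desired {
		want[m] = struct{}{}
	}
	have := make(map[string]struct{}, len(current))
	for _, m := range current {
		have[m] = struct{}{}
	}
	for m := range want {
		if _, ok := have[m]; !ok {
			toAdd = append(toAdd, m)
		}
	}
	for m := range have {
		if _, ok := want[m]; !ok {
			toRemove = append(toRemove, m)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(toAdd)
	sort.Strings(toRemove)
	return toAdd, toRemove
}

func main() {
	athenzMembers := []string{"user.alice", "user.mlajkim"}
	boundSubjects := []string{"user.mlajkim", "user.bob"}

	add, remove := diffMembers(athenzMembers, boundSubjects)
	fmt.Println("add:", add, "remove:", remove)
	// prints: add: [user.alice] remove: [user.bob]
}
```

A simpler alternative is to overwrite the RoleBinding subjects with the Athenz member list on every pass. The explicit diff is mainly useful for logging and for skipping writes when nothing changed.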
Demo 3: Restricting permission with Syncing Membership (Polling)
What I Learned
Through this "hard way" implementation, I gained several key technical insights:
- The Power of Kubebuilder Scaffolding
Starting from scratch doesn't mean writing boilerplate. kubebuilder abstracts away the complexity of leader election, metrics server, and signal handling, allowing us to focus purely on the Reconcile logic.
- Level-Triggered vs. Edge-Triggered
Kubernetes controllers are primarily level-triggered. While I implemented a polling mechanism for the external Athenz API, the internal Kubernetes state relies on the Reconcile loop constantly attempting to move the current state to the desired state.
I learned that for external resources (like Athenz), we explicitly need to manage the polling interval or set up webhooks, as Kubernetes cannot "watch" an external API by default.
- Security Integration Details
Connecting Kubernetes RBAC with an external system isn't just about mapping strings. It involves handling X.509 certificates for authentication (Athenz) and correctly signing Kubernetes CSRs for user testing (user.mlajkim).
Walkthrough
Here is the step-by-step record of how I achieved the result.
Prerequisites & Setup
a. Local Kubernetes Cluster (Kind)
I used kind to run a cluster locally.
kind create cluster
kubectl cluster-info --context kind-kind
b. Deploy Athenz Server
I utilized @ctyano's athenz-distribution to deploy a local Athenz instance.
# Clone and deploy Athenz
git clone https://github.com/ctyano/athenz-distribution.git athenz_distribution
make -C ./athenz_distribution clean-kubernetes-athenz deploy-kubernetes-athenz
# Port forward the UI
kubectl -n athenz port-forward deployment/athenz-ui 3000:3000 &
open http://localhost:3000
c. Configure Athenz Domains
Set up the ZMS server access and create the Top Level Domain (TLD) for testing.
# Port forward ZMS
kubectl -n athenz port-forward deployment/athenz-zms-server 4443:4443 &
# Create TLD "eks"
curl -k -X POST "https://localhost:4443/zms/v1/domain" \
--cert ./athenz_distribution/certs/athenz_admin.cert.pem \
--key ./athenz_distribution/keys/athenz_admin.private.pem \
-H "Content-Type: application/json" \
-d '{"name": "eks", "adminUsers": ["user.athenz_admin"]}'
# Create Subdomain "eks.users"
curl -k -X POST "https://localhost:4443/zms/v1/subdomain/eks" \
--cert ./athenz_distribution/certs/athenz_admin.cert.pem \
--key ./athenz_distribution/keys/athenz_admin.private.pem \
-H "Content-Type: application/json" \
-d '{"parent": "eks", "name": "users", "adminUsers": ["user.athenz_admin"]}'
Implementation: The Hard & Clean Way
Here is how I built the operator from scratch in a clean way.
1. Initialize the Project
I initialized the project using kubebuilder.
repo="github.com/mlajkim"
mkdir -p k8s-athenz-syncer-the-hard-clean-way && cd k8s-athenz-syncer-the-hard-clean-way
kubebuilder init --domain "ajktown.com" --repo "$repo/k8s-athenz-syncer-the-hard-clean-way"
2. Make an operator that works from scratch
I first built a bare-minimum operator that works and published it in a public repository: k8s-athenz-syncer-the-hard-way
3. Refactor the operator to make it cleaner
However, I found that the code structure needed improvement. So I created a new repository, k8s-athenz-syncer-the-hard-clean-way, where "clean" means that I reorganized the code to make it more modular and readable.
This is the amount of change I made in the PR to make it clean.
4. Write demo/local setup guide in README.md
I realized that visuals speak louder than words when demonstrating infrastructure tools. Instead of greeting users with a wall of text, I structured the README.md to lead with GIFs that showcase the operator's features immediately. Once I've captured the reader's interest, I provide a "copy-paste friendly" local setup guide to make the onboarding process as seamless as possible.
5. Make make run write a config
To make the README.md easier to follow, I made make run write the config file config.yaml automatically via hack/ensure-config.sh, so that users don't have to create a config file by hand without knowing what belongs in it. I also used the bash read command to make the script interactive, so that users can simply paste in the values when they run make run.
Verification: Does it actually work?
See the demos in the "Result" section above for the verification.
What's next?
Now that I have a working prototype built from scratch, I want to bridge the gap between this "clean" version and a robust, production-grade operator. My roadmap for the coming weeks is as follows:
I: The "Weekly Dive": Performance Optimization
I plan to dedicate a full week (my "Weekly Dive") to deeply thinking about performance:
- Optimization Strategy: Instead of blindly fetching full role memberships every time, I want to investigate using HTTP headers (like ETag or Last-Modified) to implement a delta-sync mechanism.
- Scaling Complexity: I am also curious about how Assumed Roles affect performance with expand=true in the API and logic compared to standard domain roles.
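As a starting point for the ETag idea, here is a minimal sketch of a conditional GET in plain Go. The assumptions are loud ones: fetchIfChanged and newFakeServer are hypothetical names, the server is a local fake built with httptest, and I have not yet verified which ZMS endpoints actually return an ETag or honor If-None-Match. That is exactly what the dive is meant to find out.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// fetchIfChanged performs a conditional GET. It sends the previously seen ETag
// in If-None-Match. On 304 Not Modified it returns changed=false and the old
// ETag, so the caller can skip the sync for this pass.
func fetchIfChanged(client *http.Client, url, etag string) ([]byte, string, bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, etag, false, err
	}
	if etag != "" {
		req.Header.Set("If-None-Match", etag)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, etag, false, err
	}
	defer resp.Body.Close()

	switch resp.StatusCode {
	case http.StatusNotModified:
		return nil, etag, false, nil
	case http.StatusOK:
		body, err := io.ReadAll(resp.Body)
		if err != nil {
			return nil, etag, false, err
		}
		return body, resp.Header.Get("ETag"), true, nil
	default:
		return nil, etag, false, fmt.Errorf("unexpected status %s", resp.Status)
	}
}

// newFakeServer stands in for an API that supports ETags. It is NOT ZMS.
func newFakeServer() *httptest.Server {
	const tag = `"v1"`
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Header.Get("If-None-Match") == tag {
			w.WriteHeader(http.StatusNotModified)
			return
		}
		w.Header().Set("ETag", tag)
		fmt.Fprint(w, `{"members":["user.mlajkim"]}`)
	}))
}

func main() {
	srv := newFakeServer()
	defer srv.Close()

	_, etag, changed, err := fetchIfChanged(srv.Client(), srv.URL, "")
	fmt.Println("first fetch changed:", changed, "err:", err)

	_, _, changed, err = fetchIfChanged(srv.Client(), srv.URL, etag)
	fmt.Println("second fetch changed:", changed, "err:", err)
}
```

If the real API cooperates, the polling loop would store one ETag per role (or per domain) and run the membership sync only when changed is true.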
II: Dissecting the Production Code (athenz/k8s-athenz-syncer)
After my independent study, I will deploy the official upstream Athenz/k8s-athenz-syncer locally:
- Documentation: I aim to document exactly what configurations are required to run it and create easy-to-use manifests so anyone can deploy it effortlessly.
- Code Audit: I will conduct a line-by-line analysis of the production code. My goal is to reverse-engineer the "why" behind their design decisions—what edge cases did they handle that I missed? Why did they choose specific architectural patterns?
III: Contributing Back
Finally, I don't just want to be an observer:
- Feedback Loop: If I find performance bottlenecks or logic gaps during my audit, I plan to raise Issues or submit PRs to the upstream repository.
- Community: I hope to start a conversation with the maintainers (Yahoo Inc.) to validate my assumptions and share my findings.
- Guide others: If teams want to use Athenz roles as the single source of truth (SSoT) for Kubernetes RBAC, I want to help them by sharing what I learned and possibly providing a more production-ready version of my "hard & clean way" operator.
Dive Hours: 28 hours
This post took me approximately 28 hours of focused work and development, broken down as follows:
- 12/26 Fri: 4.5h
- 12/27 Sat: 5.5h
- 12/28 Sun: 8.5h
- 12/30 Tue: 9.5h
Closing
If you enjoyed this deep dive, please leave a like & subscribe for more!


