<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: M. Hamzah Khan</title>
    <description>The latest articles on DEV Community by M. Hamzah Khan (@mhamzahkhan).</description>
    <link>https://dev.to/mhamzahkhan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3273796%2Fbc8d6e85-8d76-403b-9ef3-51e3c7b7672a.png</url>
      <title>DEV Community: M. Hamzah Khan</title>
      <link>https://dev.to/mhamzahkhan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mhamzahkhan"/>
    <language>en</language>
    <item>
      <title>Grafana Alloy in My Homelab: Why I Run Three Separate Instances</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Sat, 07 Mar 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/grafana-alloy-in-my-homelab-why-i-run-three-separate-instances-5e7</link>
      <guid>https://dev.to/mhamzahkhan/grafana-alloy-in-my-homelab-why-i-run-three-separate-instances-5e7</guid>
      <description>&lt;p&gt;When I set up Grafana Alloy across my homelab Kubernetes cluster, the first question was: how many instances do I actually need? Most tutorials show a single Alloy deployment handling everything. That works for a proof of concept but it papers over a real architectural question — one that comes down to a single word: &lt;strong&gt;clustering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;My setup runs three separate Alloy deployments inside Kubernetes, plus standalone Alloy on bare-metal nodes outside the cluster. The reasons for the split are not aesthetic. The primary driver is clustering — some collection tasks need it and others must not use it. The secondary driver is resource isolation: if &lt;code&gt;alloy-cluster&lt;/code&gt; starts OOMing under a burst of ServiceMonitor scrapes, I do not want host metrics collection to stop. Keeping them separate means a problem in one deployment cannot starve the others.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Alloy Clustering Actually Does
&lt;/h2&gt;

&lt;p&gt;Alloy clustering uses a gossip protocol to form a peer mesh between instances. When a component like &lt;code&gt;prometheus.operator.servicemonitors&lt;/code&gt; has &lt;code&gt;clustering { enabled = true }&lt;/code&gt;, all Alloy instances in the cluster share a hash ring and each one independently computes which targets it owns. The result is that a set of N replicas collectively scrapes all targets, with each target scraped exactly once.&lt;/p&gt;

&lt;p&gt;Peer discovery works via DNS against a Kubernetes headless Service — which is why clustering requires a StatefulSet. StatefulSets give pods stable DNS identities; Deployments and DaemonSets do not.&lt;/p&gt;
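&lt;p&gt;With the official &lt;code&gt;grafana/alloy&lt;/code&gt; Helm chart, that combination is a few lines of values. This is a sketch only; key names can shift between chart versions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# values.yaml sketch: clustered Alloy
alloy:
  clustering:
    enabled: true

controller:
  type: statefulset   # clustering needs stable per-pod DNS identities
  replicas: 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The chart takes care of the headless Service the replicas discover each other through.&lt;/p&gt;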

&lt;p&gt;This is enormously useful when you want high-availability metrics collection with multiple replicas. Without clustering, N replicas each scrape all targets independently, producing N× duplicate time series and &lt;code&gt;out-of-order sample&lt;/code&gt; errors downstream.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why DaemonSet Pods Do Not Need Clustering
&lt;/h2&gt;

&lt;p&gt;Here is the thing: DaemonSet pods are already partitioned by Kubernetes. There is one pod per node. Each pod only ever scrapes resources local to its own node — the host filesystem, the local kubelet endpoint, the local cAdvisor endpoint. There is no shared pool of targets to distribute.&lt;/p&gt;

&lt;p&gt;Enabling clustering on a DaemonSet would achieve nothing. The &lt;a href="https://grafana.com/docs/alloy/latest/get-started/clustering/" rel="noopener noreferrer"&gt;Grafana Alloy clustering docs&lt;/a&gt; are blunt about this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“A particularly common mistake is enabling clustering on logs collecting DaemonSets. Collecting logs from Pods on the mounted node doesn’t benefit from having clustering enabled since each instance typically collects logs only from Pods on its own node.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Each DaemonSet pod is an entirely independent instance. The work partitioning is handled by Kubernetes, not by Alloy’s gossip protocol.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Deployments
&lt;/h2&gt;

&lt;p&gt;This is where the topology comes from. Tasks divide into three categories: inherently node-local (DaemonSet, no clustering), cluster-wide with HA (StatefulSet, clustering on), and singleton (single Deployment, no clustering). Running one Alloy that tries to do all three would either require clustering on things that don’t need it, or no clustering on things that do. It would also mean a single resource budget covering everything — one OOM kill and both host metrics and cluster-wide scraping go down together.&lt;/p&gt;

&lt;h3&gt;
  
  
  alloy-node — DaemonSet, no clustering
&lt;/h3&gt;

&lt;p&gt;One pod per node. Tolerates all taints so it runs on control plane nodes too. Runs with &lt;code&gt;hostNetwork: true&lt;/code&gt; and &lt;code&gt;hostPID: true&lt;/code&gt; so it can see the host’s process tree and network interfaces.&lt;/p&gt;
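&lt;p&gt;As Helm values for the same chart, the important parts look roughly like this. A sketch, not my exact release — the DNS policy and tolerations shown are assumptions about what such a config needs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;controller:
  type: daemonset
  hostNetwork: true
  hostPID: true
  # with hostNetwork, this keeps cluster DNS resolution working
  dnsPolicy: ClusterFirstWithHostNet
  tolerations:
    - operator: Exists   # tolerate everything, including control plane taints
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;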

&lt;p&gt;What it collects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Host metrics&lt;/strong&gt; via the built-in &lt;code&gt;prometheus.exporter.unix&lt;/code&gt; — Alloy’s native node_exporter. No separate binary needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cAdvisor metrics&lt;/strong&gt; (container CPU/memory) scraped from the local kubelet endpoint only. The key is filtering discovery results to the local node using &lt;code&gt;constants.hostname&lt;/code&gt;, so each pod only scrapes itself:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;discovery&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;relabel&lt;/span&gt; &lt;span class="s2"&gt;"local_node_only_cadvisor"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;discovery&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;targets&lt;/span&gt;

 &lt;span class="nx"&gt;rule&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;source_labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"__meta_kubernetes_node_name"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"keep"&lt;/span&gt;
 &lt;span class="nx"&gt;regex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without this filter, every DaemonSet pod would attempt to scrape every node’s cAdvisor — 7 pods × 7 nodes = 49 scrape attempts for what should be 7.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;kubelet metrics&lt;/strong&gt; using the same local-node-only pattern.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pod logs&lt;/strong&gt; from &lt;code&gt;/var/log/pods/**/*.log&lt;/code&gt; with CRI parsing and label extraction from the file path (namespace, pod name, container name).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;systemd journal logs&lt;/strong&gt; via &lt;code&gt;loki.source.journal&lt;/code&gt; — picks up kubelet, containerd, and any other systemd units on the host.&lt;/li&gt;
&lt;/ul&gt;
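&lt;p&gt;The pod log pipeline can be sketched with three components. Component labels and the &lt;code&gt;loki.write&lt;/code&gt; target are illustrative, and the path-based label extraction is left out here:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;local.file_match "pod_logs" {
  path_targets = [{ __path__ = "/var/log/pods/**/*.log" }]
}

loki.source.file "pod_logs" {
  targets    = local.file_match.pod_logs.targets
  forward_to = [loki.process.pod_logs.receiver]
}

loki.process "pod_logs" {
  // Parse the CRI wire format: timestamp, stream, flags, message
  stage.cri { }

  forward_to = [loki.write.default.receiver]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;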

&lt;p&gt;No clustering block anywhere in this config. Each pod runs entirely independently.&lt;/p&gt;

&lt;h3&gt;
  
  
  alloy-cluster — StatefulSet, clustering on
&lt;/h3&gt;

&lt;p&gt;Two replicas with &lt;code&gt;clustering.enabled: true&lt;/code&gt;. This is for cluster-wide metric collection — anything that requires Kubernetes API access to discover targets and that would produce duplicates if scraped by multiple independent instances.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;servicemonitors&lt;/span&gt; &lt;span class="s2"&gt;"services"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;remote_write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

 &lt;span class="nx"&gt;clustering&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;podmonitors&lt;/span&gt; &lt;span class="s2"&gt;"pods"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;remote_write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

 &lt;span class="nx"&gt;clustering&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two replicas share the ServiceMonitor and PodMonitor workload via the hash ring. If one goes down, the other takes the full load. When it comes back, targets are automatically rebalanced.&lt;/p&gt;

&lt;p&gt;It also handles Mimir rule synchronisation — reading &lt;code&gt;PrometheusRule&lt;/code&gt; CRDs from Kubernetes and syncing them into Mimir’s ruler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;mimir&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rules&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt; &lt;span class="s2"&gt;"local"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;address&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"http://mimir-ruler.mimir-system.svc.cluster.local:8080"&lt;/span&gt;
 &lt;span class="nx"&gt;tenant_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And an OTLP receiver for anything that wants to push telemetry in OpenTelemetry format, exposed via a regular Service backed by both replicas.&lt;/p&gt;
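&lt;p&gt;The receiver itself is small. The endpoints and downstream component names below are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }

  http {
    endpoint = "0.0.0.0:4318"
  }

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
    logs    = [otelcol.exporter.loki.default.input]
  }
}

// Convert OTLP metrics and logs into the existing write paths
otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.default.receiver]
}

otelcol.exporter.loki "default" {
  forward_to = [loki.write.default.receiver]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;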

&lt;p&gt;The StatefulSet uses 1Gi persistent storage (Rook/Ceph SSD) for Alloy’s write-ahead log, which buffers data locally if Mimir or Loki are temporarily unavailable.&lt;/p&gt;

&lt;h3&gt;
  
  
  alloy-kube-events — single Deployment, no clustering
&lt;/h3&gt;

&lt;p&gt;Kubernetes events exist only in the API server and are garbage collected after a short window. This deployment runs a single replica that watches the events API continuously and ships everything to Loki:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes_events&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes_events"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;job_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"integrations/kubernetes/eventhandler"&lt;/span&gt;
 &lt;span class="nx"&gt;log_format&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"json"&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes_events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single replica is correct here. Events are cluster-wide objects, not node-local, so a DaemonSet would forward each event from every node. But you also do not need or want clustering — a second replica with clustering enabled would not help, and a second unclustered replica would duplicate all events. One instance, watching the API, is the right answer.&lt;/p&gt;

&lt;p&gt;Very lightweight: 50m CPU request, 128Mi RAM.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cardinality: An Ongoing Job
&lt;/h2&gt;

&lt;p&gt;Getting the topology right is the structural problem. Cardinality is the operational one. In a Kubernetes cluster with many pods and containers, the default label sets from node_exporter and cAdvisor generate an enormous number of time series — and Mimir has to store all of them. Left unchecked, this drives up memory usage across the whole observability stack.&lt;/p&gt;

&lt;p&gt;The approach is the same everywhere: drop labels and metrics you will never query, as close to the source as possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual network interfaces
&lt;/h3&gt;

&lt;p&gt;A Kubernetes node running many pods will have hundreds of virtual network interfaces — one &lt;code&gt;veth&lt;/code&gt; pair per pod, plus Calico (&lt;code&gt;cali*&lt;/code&gt;) interfaces. The node_exporter &lt;code&gt;netclass&lt;/code&gt; and &lt;code&gt;netdev&lt;/code&gt; collectors would create a separate set of time series for every one of them. They are not useful for node-level network monitoring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt; &lt;span class="s2"&gt;"node_exporter_metrics"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;netclass&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;ignored_devices&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^(veth.*|cali.*|[a-f0-9]{15})$"&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="nx"&gt;netdev&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;device_exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^(veth.*|cali.*|[a-f0-9]{15})$"&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The regex also catches 15-character hex strings — the interface names Kubernetes generates for container network namespaces. Without this, every pod churn event adds and then expires a batch of time series.&lt;/p&gt;

&lt;h3&gt;
  
  
  Container and virtual filesystems
&lt;/h3&gt;

&lt;p&gt;A Kubernetes node also mounts a huge number of ephemeral filesystems: one &lt;code&gt;overlay&lt;/code&gt; mount per container layer, &lt;code&gt;tmpfs&lt;/code&gt; for secrets and service account tokens, &lt;code&gt;cgroup&lt;/code&gt; hierarchies, &lt;code&gt;proc&lt;/code&gt;, &lt;code&gt;devtmpfs&lt;/code&gt;, and so on. None of these are useful for disk space monitoring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;filesystem&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;fs_types_exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$"&lt;/span&gt;
 &lt;span class="nx"&gt;mount_points_exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Dropping unused collectors entirely
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;ipvs&lt;/code&gt; collector is disabled outright — the cluster uses iptables, not IPVS. There is no point scraping metrics for something that is not running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt; &lt;span class="s2"&gt;"node_exporter_metrics"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;disable_collectors&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"ipvs"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  cAdvisor container labels
&lt;/h3&gt;

&lt;p&gt;cAdvisor attaches &lt;code&gt;id&lt;/code&gt; and &lt;code&gt;name&lt;/code&gt; labels to container metrics. The &lt;code&gt;id&lt;/code&gt; label is the full container runtime ID — a long hex string that is unique per container instance and changes every time a pod restarts. Keeping it would mean every pod restart adds a fresh batch of time series that go stale and are never written to again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;relabel&lt;/span&gt; &lt;span class="s2"&gt;"drop_cadvisor"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;rule&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"labeldrop"&lt;/span&gt;
 &lt;span class="nx"&gt;regex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id|name|instance"&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pod log stream labels
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;filename&lt;/code&gt; label on pod logs is the full path on the host: &lt;code&gt;/var/log/pods/&amp;lt;namespace&amp;gt;_&amp;lt;pod-name&amp;gt;_&amp;lt;pod-uid&amp;gt;/&amp;lt;container&amp;gt;/&amp;lt;n&amp;gt;.log&lt;/code&gt;. The pod UID component is unique per pod instance, so every pod restart creates a new log stream label value that Loki has to index. Dropping it keeps the stream cardinality manageable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;stage&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;label_drop&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;values&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"filename"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"flags"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The useful labels — &lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;pod&lt;/code&gt;, &lt;code&gt;container&lt;/code&gt;, &lt;code&gt;stream&lt;/code&gt; — are extracted separately from the file path via a regex stage and kept. Only the high-cardinality junk is dropped.&lt;/p&gt;
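&lt;p&gt;The same idea can also be sketched at discovery time, before the log files are even read, using &lt;code&gt;discovery.relabel&lt;/code&gt; rules over the &lt;code&gt;__path__&lt;/code&gt; label. The regex and component labels here are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;discovery.relabel "pod_logs" {
  targets = local.file_match.pod_logs.targets

  // /var/log/pods/&amp;lt;namespace&amp;gt;_&amp;lt;pod&amp;gt;_&amp;lt;uid&amp;gt;/&amp;lt;container&amp;gt;/&amp;lt;n&amp;gt;.log
  rule {
    source_labels = ["__path__"]
    regex         = "/var/log/pods/([^_]+)_([^_]+)_[^/]+/([^/]+)/.*"
    target_label  = "namespace"
    replacement   = "$1"
  }

  rule {
    source_labels = ["__path__"]
    regex         = "/var/log/pods/([^_]+)_([^_]+)_[^/]+/([^/]+)/.*"
    target_label  = "pod"
    replacement   = "$2"
  }

  rule {
    source_labels = ["__path__"]
    regex         = "/var/log/pods/([^_]+)_([^_]+)_[^/]+/([^/]+)/.*"
    target_label  = "container"
    replacement   = "$3"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;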

&lt;h3&gt;
  
  
  Internal scrape metrics
&lt;/h3&gt;

&lt;p&gt;node_exporter emits &lt;code&gt;node_scrape_collector_*&lt;/code&gt; metrics that track its own internal scrape performance per collector. Useful for debugging node_exporter itself, but not worth storing long-term in Mimir:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;rule&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;source_labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;" __name__"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="nx"&gt;regex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"node_scrape_collector_.+"&lt;/span&gt;
 &lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"drop"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is all incremental work. As new exporters get added or existing dashboards evolve, there are always new labels to audit and unused metrics to prune. The cardinality pressure does not go away — it just needs to be managed continuously.&lt;/p&gt;




&lt;h2&gt;
  
  
  Standalone Alloy on Bare-Metal Nodes
&lt;/h2&gt;

&lt;p&gt;Not everything in the lab runs inside Kubernetes. Proxmox hypervisors, the VyOS router, and Raspberry Pi systems all run standalone Alloy as a systemd service. The config is simpler — no pod log collection, no Kubernetes API access, just host metrics and journal logs forwarded to the same Mimir and Loki endpoints as the cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt; &lt;span class="s2"&gt;"node"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scrape&lt;/span&gt; &lt;span class="s2"&gt;"node"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;targets&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;remote_write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;journal&lt;/span&gt; &lt;span class="s2"&gt;"journal"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="nx"&gt;labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;job&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"integrations/systemd-journal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="nx"&gt;instance&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same &lt;code&gt;datacentre&lt;/code&gt; and &lt;code&gt;cluster&lt;/code&gt; external labels are set on the remote_write and loki.write blocks, so in Grafana I can use a single dashboard and filter between Kubernetes nodes and bare-metal hosts.&lt;/p&gt;
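&lt;p&gt;Those labels are set once on each write component, so every series and stream from the host carries them automatically. The URLs and label values below are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;prometheus.remote_write "default" {
  endpoint {
    url = "http://mimir.example.internal/api/v1/push"
  }

  external_labels = {
    datacentre = "lon1",
    cluster    = "bare-metal",
  }
}

loki.write "default" {
  endpoint {
    url = "http://loki.example.internal/loki/api/v1/push"
  }

  external_labels = {
    datacentre = "lon1",
    cluster    = "bare-metal",
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;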




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph LR
subgraph bare["Bare-metal nodes"]
BM["alloy (systemd)
Proxmox · VyOS · Raspberry Pi"]
end
subgraph k8s["Kubernetes Cluster — lab-lon1-uk"]
subgraph ds["DaemonSet — 1 per node"]
AN["alloy-node
Host metrics · cAdvisor
Kubelet · Pod logs
Journal logs"]
end
subgraph sts["StatefulSet ×2, clustering on"]
AC["alloy-cluster
ServiceMonitors · PodMonitors
Mimir rules sync · OTLP receiver"]
end
subgraph dep["Deployment ×1"]
AE["alloy-kube-events
Kubernetes events"]
end
end
subgraph obs["Observability backends"]
MIMIR[("Mimir")]
LOKI[("Loki")]
GRAFANA["Grafana"]
end
AN --&amp;gt;|metrics| MIMIR
AN --&amp;gt;|logs| LOKI
AC --&amp;gt;|metrics| MIMIR
AC --&amp;gt;|logs| LOKI
AE --&amp;gt;|logs| LOKI
BM --&amp;gt;|metrics| MIMIR
BM --&amp;gt;|logs| LOKI
MIMIR --&amp;gt; GRAFANA
LOKI --&amp;gt; GRAFANA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Deployment&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Clustering&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;alloy-node&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DaemonSet&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Node-local collection — Kubernetes already partitions by node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;alloy-cluster&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;StatefulSet (×2)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Cluster-wide ServiceMonitor/PodMonitor scraping — needs HA without duplicates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;alloy-kube-events&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Deployment (×1)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Single-instance by design — duplicate event forwarding would be wrong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standalone&lt;/td&gt;
&lt;td&gt;systemd&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Bare-metal hosts outside the cluster&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The organising principle is clustering, not aesthetics. If a task is node-local, a DaemonSet handles partitioning naturally. If a task is cluster-wide and you want more than one replica, clustering is what prevents duplicate data. And if a task must run exactly once, you use a single Deployment and keep clustering out of the picture entirely.&lt;/p&gt;




&lt;h2&gt;
  
  
  This Is Also What Grafana Does
&lt;/h2&gt;

&lt;p&gt;It is worth noting that Grafana’s own &lt;a href="https://github.com/grafana/k8s-monitoring-helm" rel="noopener noreferrer"&gt;k8s-monitoring Helm chart&lt;/a&gt; arrives at the same topology. Their chart deploys:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;alloy-metrics&lt;/code&gt; — StatefulSet, for cluster-wide metrics collection&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;alloy-logs&lt;/code&gt; — DaemonSet, for node-local pod and host log collection&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;alloy-singleton&lt;/code&gt; — single Deployment, for cluster events and other once-only tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The names are different and the internals diverge — their chart is considerably more opinionated, with its own abstraction layer over the raw Alloy config — but the underlying reasoning is identical.&lt;/p&gt;

&lt;p&gt;My current setup rolls its own Helm releases and Alloy configs directly. I plan to migrate to the k8s-monitoring chart, which also brings in the &lt;a href="https://grafana.com/docs/grafana-cloud/monitor-infrastructure/kubernetes-monitoring/configuration/helm-chart-config/helm-chart/collector-reference/" rel="noopener noreferrer"&gt;Alloy Operator&lt;/a&gt; for lifecycle management of the collector instances. When that migration happens I will write it up.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>homelab</category>
      <category>monitoring</category>
      <category>devops</category>
    </item>
    <item>
      <title>Parenting Like a DevOps Engineer: Managing the Chaos of Family Life</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Thu, 12 Jun 2025 20:00:00 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/parenting-like-a-devops-engineer-managing-the-chaos-of-family-life-1pfa</link>
      <guid>https://dev.to/mhamzahkhan/parenting-like-a-devops-engineer-managing-the-chaos-of-family-life-1pfa</guid>
      <description>&lt;p&gt;Father’s Day just passed, which got me thinking—not just about fatherhood in general, but how &lt;em&gt;weirdly&lt;/em&gt; useful my job as a DevOps engineer has been in helping me parent. I have three kids: two sons (8 years old, and 6 years old), and one daughter (4 years old). They’re amazing, unpredictable, and chaotic—kind of like a Kubernetes cluster that’s constantly in flux, demanding constant monitoring, quick rollbacks, and a whole lot of automation to keep from spiralling into an unmanageable mess.&lt;/p&gt;

&lt;p&gt;I’m not the world’s greatest parent. Far from it. But I’m learning. Slowly. And somewhere between incident response and bedtime battles, I’ve realised that parenting, like DevOps, is mostly about managing chaos, making tiny, incremental improvements and iterating on what works.&lt;/p&gt;

&lt;p&gt;Just like in DevOps, the key to a happy home is good ‘observability’ – mainly through the faint sounds of mischief from the other room.&lt;/p&gt;

&lt;p&gt;A few months ago, my six-year-old began resisting going to school. Each morning turned into a dramatic struggle. When we asked him why he didn’t want to go, he would simply shrug and mumble, “I don’t like it.” Unfortunately, that didn’t provide us with much actionable information.&lt;/p&gt;

&lt;p&gt;In engineering, when problems arise, we start by gathering context. We don’t jump to conclusions; instead, we observe and investigate. So, one day, I invited him into my home office—my safe space—and told him it was our safe space now. “In here,” I explained, “we’re friends who can talk about anything, from the silliest thing to the craziest. Just us. No pressure.”&lt;/p&gt;

&lt;p&gt;He sat quietly in the chair beside me for a while. Then, finally, he said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I don’t like school because… I don’t know how to talk to the other kids.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That hit me hard. He wasn’t being defiant; he was simply overwhelmed. It particularly resonated with me because it was an issue I struggled with as a child too.&lt;/p&gt;

&lt;p&gt;From there, we were able to speak to his teacher, who gently helped him integrate into games with other children. Now that he has friends, he actually looks forward to seeing them. That breakthrough happened not through interrogation but through observability and patience.&lt;/p&gt;

&lt;p&gt;Using my home office as our safe space has now become a regular occurrence. Strangely, this technique of establishing a room as our “safe space” doesn’t work for my wife.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔁 Blameless Postmortems (Even When Homework is Due Tomorrow)
&lt;/h2&gt;

&lt;p&gt;DevOps culture teaches us to run blameless retrospectives after incidents. Not because we don’t care about what went wrong but because assigning blame prevents learning.&lt;/p&gt;

&lt;p&gt;My 8-year-old has a bad habit of revealing school projects the night before they’re due. No matter how often we ask him, “Any homework?” he’ll respond with an Oscar-worthy performance of “Nope.” Then, at 8:00 PM on a Thursday: “Oh yeah, I need to make a cardboard Roman sword and write about it.”&lt;/p&gt;

&lt;p&gt;The old me would’ve panicked or scolded. But now, I try to treat it like a retro: What were the signals we missed? How can we improve visibility? Do we need a new “homework alerting system” (also known as a whiteboard on the fridge)?&lt;/p&gt;

&lt;p&gt;We still get frustrated. But now it’s frustration aimed at the system, not the child.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧭 Observability: Beyond the Logs (and into the Babychinos)
&lt;/h2&gt;

&lt;p&gt;With my youngest—she’s four—things are different. She’s in the plushie-and-babychino phase of life, so we go on “coffee dates” together. I get a double espresso latte; she gets a babychino and a cinnamon swirl, and we just… sit. She talks about Barbie, how she wants to be a ballerina, and how she wants a real pet sheep she’d call ‘Baa-llerina’ because it’s a sheep, and sheep say “baa,” and she likes ballet.&lt;/p&gt;

&lt;p&gt;She doesn’t say, “Dad, I’m feeling emotionally disconnected and would benefit from some focused one-on-one time.” But I’ve learned to watch the metrics: her mood shifts, clinginess, eye contact, sleep patterns. You get better at reading logs when you stop waiting for alerts.&lt;/p&gt;

&lt;p&gt;Parenting isn’t just about reacting to tantrums—it’s about noticing subtle changes and responding early.&lt;/p&gt;

&lt;p&gt;Observability at home? It’s empathy, finely tuned with instrumentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤖 Automation: The Bedtime Pipeline (and Beyond)
&lt;/h2&gt;

&lt;p&gt;In DevOps, we obsess over automation. Why? Because it reduces friction, ensures consistency, and frees up our engineers for more complex, creative work. Turns out, the same principle applies when you’re trying to get three small humans from hyperactive to horizontal.&lt;/p&gt;

&lt;p&gt;Our bedtime routine, for example, is a finely tuned, automated pipeline: Dinner, PJs, brushing teeth, using the toilet, stories, cuddles, and lights out. When it works, it’s beautiful. Each step flows into the next, reducing decision fatigue for both us and the kids. They know what’s coming, which minimises resistance. We know what’s coming, which minimises parental meltdowns.&lt;/p&gt;

&lt;p&gt;It’s not just bedtime; it works for the morning routine before school or even just having designated spots for shoes and backpacks – these are all tiny automations. They’re like mini-scripts running in the background of our family life, reducing cognitive load and preventing us from constantly having to “manually deploy” every single task. When the system is automated, we have more time and energy for unexpected ‘incidents’ – like explaining for the fifth time why we can’t have a pet unicorn.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔄 Continuous Integration and Daily Stand-Ups
&lt;/h2&gt;

&lt;p&gt;In engineering, Continuous Integration refers to the practice of frequently merging code into a shared project repository. This approach includes automated builds and tests that detect issues early on, assisting in the identification of conflicts before they develop into major problems.&lt;/p&gt;

&lt;p&gt;My wife and I may not be merging lines of code, but we are continually integrating our parenting approaches. We represent two distinct ‘branches’ of the same ‘project,’ and if we don’t regularly synchronise, we risk encountering merge conflicts that affect the entire ‘system’ (i.e., the kids).&lt;/p&gt;

&lt;p&gt;Our daily stand-up usually happens over breakfast or after the kids are asleep. We ask questions like, “How was school pickup?” “Did you talk to him about the math homework?” and “She seems a bit quiet or clingy today; is something wrong?” These are not formal meetings but quick and important check-ins. We share what we notice, align our responses to new behaviours, and bring up any potential issues before they escalate. This keeps our family approach—our shared way of parenting—consistent and harmonious. When we are not on the same page, things become chaotic. One parent says yes, the other says no, and suddenly, our perfectly crafted ‘deployment’ (e.g., getting everyone out the door on time) grinds to a halt. CI, even in parenting, makes for a smoother operation.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 The Monolith vs Microservices Debate (aka Marriage)
&lt;/h2&gt;

&lt;p&gt;My wife and I parent in very different ways. She’s not an engineer. She doesn’t think about “event-driven architecture” or “incident response timelines.” Her approach is more intuitive, relational, and deeply human.&lt;/p&gt;

&lt;p&gt;At first, this led to some friction. Why didn’t she want to optimise bedtime flow with a Kanban board? Why didn’t I just &lt;em&gt;feel&lt;/em&gt; that someone was about to have a meltdown?&lt;/p&gt;

&lt;p&gt;But over time, I’ve realised that our differences are a feature, not a bug. We balance each other out. Like a good system composed of microservices and a stable monolith—you need both agility and cohesion. Flexibility and structure. Love and logic.&lt;/p&gt;

&lt;p&gt;We’re both debugging this system in real-time, just using different tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  🕹 When Roblox Becomes Pair Programming
&lt;/h2&gt;

&lt;p&gt;I don’t particularly enjoy Roblox. The games are confusing, and they give me motion sickness like I just went on a roller coaster.&lt;/p&gt;

&lt;p&gt;But my 6-year-old loves it. He &lt;em&gt;lights up&lt;/em&gt; when we play together.&lt;/p&gt;

&lt;p&gt;The other day, he tried to explain a game to me. I nodded along, trying not to feel sick while hiding from “Scary Larry.” He laughed at how lost I was. I was confused but still there.&lt;/p&gt;

&lt;p&gt;This is what matters. The primary objective of pair programming is to write better code and share knowledge. However, its real strength is in the teamwork and connection built during the process. Similar to Roblox, the most valuable result isn’t always what shows up on the screen.&lt;/p&gt;

&lt;h2&gt;
  
  
  🙃 Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;DevOps didn’t make me a perfect parent, but it gave me a mindset: one that values systems thinking, curiosity, and resilience.&lt;/p&gt;

&lt;p&gt;And fatherhood made me a better engineer, too. It taught me that no system—technical or human—responds well to blame. That emotional outages need graceful recovery.&lt;/p&gt;

&lt;p&gt;So, this Father’s Day, I’m not celebrating my success. I’m celebrating the debugging process. The retros. The messy commits. The half-working prototypes.&lt;/p&gt;

&lt;p&gt;And the three little humans who remind me daily that parenting is the most complex system I’ll ever help build.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>career</category>
      <category>productivity</category>
      <category>watercooler</category>
    </item>
    <item>
      <title>How to Redirect Hardcoded DNS with VyOS (Perfect for Pi-hole or Blocky Setups)</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Thu, 28 Mar 2024 10:25:00 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/how-to-redirect-hardcoded-dns-with-vyos-perfect-for-pi-hole-or-blocky-setups-5670</link>
      <guid>https://dev.to/mhamzahkhan/how-to-redirect-hardcoded-dns-with-vyos-perfect-for-pi-hole-or-blocky-setups-5670</guid>
      <description>&lt;p&gt;Smart devices like Chromecasts and TVs often use hardcoded DNS servers that bypass your custom DNS filters like Pi-hole or Blocky. In this guide, you’ll learn how to configure VyOS NAT rules to &lt;strong&gt;intercept and redirect all DNS requests&lt;/strong&gt; to your preferred DNS server — even if the client tries to bypass it.&lt;/p&gt;

&lt;p&gt;I use &lt;a href="https://0xerr0r.github.io/blocky/" rel="noopener noreferrer"&gt;Blocky&lt;/a&gt; as my DNS server on my home network, but this should work with Pi-hole and any other DNS server as well.&lt;/p&gt;

&lt;p&gt;To stop this behaviour, I set up a few NAT rules on my &lt;a href="https://vyos.io/" rel="noopener noreferrer"&gt;VyOS&lt;/a&gt; router to redirect DNS queries aimed at unknown DNS servers to my Blocky server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Define Allowed DNS Servers
&lt;/h3&gt;

&lt;p&gt;Start by creating an address group containing the allowed DNS servers. This ensures that legitimate DNS queries are not redirected.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cisco_ios"&gt;&lt;code&gt;&lt;span class="k"&gt;mhamzahkhan@homelab-gw:~$&lt;/span&gt; configure
&lt;span class="k"&gt;[edit]&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; firewall group address-group dns-servers address '&lt;span class="m"&gt;10.254.95.3&lt;/span&gt;'
&lt;span class="k"&gt;set&lt;/span&gt; firewall group address-group dns-servers address '&lt;span class="m"&gt;10.254.95.4&lt;/span&gt;'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Redirect Unapproved DNS Requests with NAT
&lt;/h3&gt;

&lt;p&gt;Next, set up a destination NAT rule to redirect DNS queries not intended for the allowed DNS servers to the Blocky DNS server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cisco_ios"&gt;&lt;code&gt;&lt;span class="k"&gt;mhamzahkhan@homelab-gw:~$&lt;/span&gt; configure
&lt;span class="k"&gt;[edit]&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 description 'Captive DNS'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 destination group address-group '!dns-servers'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 destination port '53'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 inbound-interface name 'bond1.90'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 protocol 'tcp_udp'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 translation address '10.254.95.4'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 translation port '53'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, &lt;code&gt;bond1.90&lt;/code&gt; is my internal home network and &lt;code&gt;10.254.95.4&lt;/code&gt; is my Blocky DNS server.&lt;/p&gt;
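&lt;p&gt;To confirm the redirect is working, you can send a query to a resolver that is &lt;em&gt;not&lt;/em&gt; in the allowed group from a client on the internal network. This is just one way to check it, assuming &lt;code&gt;dig&lt;/code&gt; is available on the client; the answer should be served by your filtering DNS server even though the query was addressed to 8.8.8.8:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Query a public resolver that is NOT in the dns-servers group.
# The destination NAT rule silently redirects this to Blocky (10.254.95.4).
dig @8.8.8.8 example.com +short
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If your filter has a query log (Blocky and Pi-hole both do), the query should also show up there, which confirms the interception end to end.&lt;/p&gt;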

</description>
      <category>networking</category>
      <category>homelab</category>
      <category>devops</category>
      <category>dns</category>
    </item>
    <item>
      <title>VyOS - WireGuard based Road Warrior VPN Configuration</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Sat, 16 Sep 2023 22:32:18 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/vyos-wireguard-based-road-warrior-vpn-configuration-3bl</link>
      <guid>https://dev.to/mhamzahkhan/vyos-wireguard-based-road-warrior-vpn-configuration-3bl</guid>
      <description>&lt;p&gt;In our modern, hyper-connected world, where remote work and global access are increasingly vital, the need for secure connectivity to your home or office network has evolved from a luxury to an essential requirement.&lt;/p&gt;

&lt;p&gt;Whether you’re a professional in need of remote access to an office network or a passionate home lab enthusiast managing various services, a road-warrior style VPN is your key to top-tier, secure and hassle-free remote server access from anywhere in the world.&lt;/p&gt;

&lt;p&gt;Regardless of if you are managing a personal web server, delving into home automation experiments, or overseeing your own cloud services, this guide serves as your trusty roadmap, expanding on the principles covered in our previous post about &lt;a href="https://www.hamzahkhan.com/vyos-ospf-wireguard" rel="noopener noreferrer"&gt;establishing a site-to-site VPN with WireGuard and VyOS&lt;/a&gt;. We now shift our focus to the individual user’s perspective, bridging the geographical gap between your current location and the heart of your network from anywhere in the world. Together, we’ll navigate the process of configuring VyOS to function as a WireGuard VPN server, enabling you to access your digital realm with unwavering security and unrivaled ease.&lt;/p&gt;

&lt;p&gt;Let’s dive in and get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Configure the WireGuard Server on VyOS
&lt;/h2&gt;

&lt;p&gt;VyOS’ command line interface simplifies the configuration of a WireGuard server and makes client configuration a breeze as well.&lt;/p&gt;

&lt;p&gt;All of the configuration for WireGuard on VyOS is done with the WireGuard interface configuration commands, which are prefixed with &lt;code&gt;interfaces wireguard $INTERFACE_NAME&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setup Variables
&lt;/h3&gt;

&lt;p&gt;I refer to these variables throughout this guide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SERVER_PUBLIC_IP&lt;/code&gt; - The server’s public IP address&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SERVER_PRIVATE_KEY&lt;/code&gt; - The server’s private key, generated by the &lt;code&gt;generate pki wireguard key-pair&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SERVER_PUBLIC_KEY&lt;/code&gt; - The server’s public key, generated by the &lt;code&gt;generate pki wireguard key-pair&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLIENT_PRIVATE_KEY&lt;/code&gt; - The client’s private key, generated by the &lt;code&gt;generate wireguard client-config&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLIENT_PUBLIC_KEY&lt;/code&gt; - The client’s public key, generated by the &lt;code&gt;generate wireguard client-config&lt;/code&gt; command&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Generate Server Keypair
&lt;/h3&gt;

&lt;p&gt;Generate a keypair for the WireGuard server. Make note of these, as you will need these again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;mhamzahkhan@gw:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;generate pki wireguard key-pair
&lt;span class="gp"&gt;Private key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as $&lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;SERVER_PRIVATE_KEY&lt;span class="o"&gt;}&lt;/span&gt; -&amp;gt;
&lt;span class="gp"&gt;Public key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as $&lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;SERVER_PUBLIC_KEY&lt;span class="o"&gt;}&lt;/span&gt; -&amp;gt;
&lt;span class="go"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configure WireGuard Interfaces
&lt;/h2&gt;

&lt;p&gt;Next we can configure the WireGuard interface.&lt;/p&gt;

&lt;p&gt;I am using the subnet 10.254.254.0/24 for my VPN, but you can use whatever you like.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cisco_ios"&gt;&lt;code&gt;&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 address '&lt;span class="m"&gt;10.254.254.1/24&lt;/span&gt;'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 &lt;span class="k"&gt;description&lt;/span&gt;&lt;span class="c1"&gt; 'VPN'&lt;/span&gt;
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 ip adjust-mss '1380'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 mtu '1420'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 port '51920'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 private-key '${SERVER_PRIVATE_KEY}'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, for each device that will connect to the VPN, we need to add a peer definition. VyOS makes this extremely easy, and even generates a QR code that can be scanned to configure the WireGuard client on a phone, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mhamzahkhan@gw:~$ generate wireguard client-config hamzah-phone interface wg1 server ${VYOS_SERVER_PUBLIC_ADDRESS} address 10.254.254.2/24

WireGuard client configuration for interface: wg1

To enable this configuration on a VyOS router you can use the following commands:

=== VyOS (server) configuration ===

set interfaces wireguard wg1 peer hamzah-phone allowed-ips '10.254.254.2/32'
set interfaces wireguard wg1 peer hamzah-phone public-key '${CLIENT_PUBLIC_KEY}'

=== RoadWarrior (client) configuration ===

[Interface]
PrivateKey = ${CLIENT_PRIVATE_KEY}
Address = 10.254.254.2/32
DNS = 1.1.1.1

[Peer]
PublicKey = ${SERVER_PUBLIC_KEY}
Endpoint = ${SERVER_PUBLIC_IP}:51920
AllowedIPs = 0.0.0.0/0, ::/0

█████████████████████████████████████████████████████████████
█████████████████████████████████████████████████████████████
████ ▄▄▄▄▄ █ ██▀▄█ ▄██▀▀ ▀██▀▀▄▀▄ ▀ ▄█▄▄▀▄█▀▀ ▀██ ▄▄▄▄▄ ████
████ █ █ █ ███▄█▀ ▄█▀▀ ███▀▀ ▀▄▄▄▄ ▀▀▀▀▀▄▀█ █ █ ████
████ █▄▄▄█ █▀█ ▄▀▄▄█▄█▀▄ ██▄ ▄▄▄ ▀▄█▀▀█ ▀▄▄ ▄ ███ █▄▄▄█ ████
████▄▄▄▄▄▄▄█▄▀▄▀▄█ ▀▄▀▄▀▄▀▄▀ █▄█ █▄▀ █▄█ █ █▄▀ █ █▄▄▄▄▄▄▄████
████▄ █▀ ▄▄▀▀▄▀▀ ▀▄ ▄ ▄ ▄ ▄ ▀▄ ▀▄█▄█▀▄█▄ █▀▀█▄█ ▄▄ ████
████▀▀██▄▄▄█▄▄▄█▀ █▄ █▀█ █ ▀█▀█▀▄▀▀ ▀ ██▀█▀▀▄▄▄ █▀ ▄▄█ █ ████
████▄ ▄▀▀▄▄▄▀ ██ ▄▄██▄ ▄█▀▄▄██▄█ ███▀█▀█▀█▄█▀▀██████▀ ████
████▀ ▄▀▀ ▄▀██▄▀▄███▀▀▄ ▀ ▀ ▀▀ ▀▄█▄▀▀▄██▀ ▀▀ ▀██ ▀▀▀▄▀▄ ████
████████▄▄▄▄██▄▄▄▄ ▄▄▄█▀ ▄█ ▄ █ ▀▀█▄ █ ▄ ▄██ ▄▀▀█▀ ▀▀█▄████
████ ▀▄ ▄▄█▄ ▀ ▄ ▄▄██▄ ▀▄▀█▄▄▄█▄ █▀█▄▄ ▄██▄▄ ▀▀█▄▄██▄████
████ ▀█▄▄█▄▀▄▄ █ █▄▀▀▀ ▀ ▀█▄█▀█▄▄█▄ ▄▀█▀ █▀▀▄█ ▀▄▀█ █▀█ ████
████▀▄█ ▀ ▄▄▀▀ █▄█ ▄ ██ ▀ ▄ ▀▄ █▄▄█ ▀ ▀▄▄▀█ ▄█ ▀▄█▀█▄ ████
████▀▄ ▄▄▄ ▀▀ █ ▀█ ▄ ▄▄ ▄▄▄ █▀▀▄▀▄ █▀ █▄ ▄▄▄ ▄▀ █████
█████▀██ █▄█ █ ▀ █▄ ▄ █▀▄▀▀█ █▄█ █▄██▀▀▄▀▀█▄▀ ▄ █▄█ █▄▀▄████
█████ █▀ ▄▄ ▄▄ ▄▄▄▄█▀ ▄ ▄▀▀▄▄ █▄ ██▄▀▀ ▄█ ▄ ▀▄▄ █▀█▄████
████▀▄ ▀█▄▄▀▄█▄▀ ▄ █▀▀▄▀█▀█▄▄█▀▀▀█▄ ▄ ██▀▀ ▄▀ ▄▀█▀▄██ █ █████
████▄█▄ ▄▄▄▀ ▀▄▀▀▀ █▄▄▄█▄ ▀▀▄██ ▀▀▄▀█ ▄ █▀ █▀ ▀▄▄█▀▄▄████
████▄▀▄▀ ▄█▀█ ▄▄█▀ ▀ ████ ██▄▀▀██▀█▀▀▀▀▄█ █ ▀ ▀▄▀▄▀█▀ ▄████
████▄▀ ▄█▄▀█▄▀▀▀▄█▄▀▀▀▄ ███ ▄█▄ ▄▀ ██ █ ▄█▄█▀ ▄▀▄▀▀▀▀█ ████
███████▄ ▄█ ▄█▄ ▀█ ▄ █▄█▀█ █▀▄▀ █▄▀█▀▄ ██▀ ▀██▄▀▄▀▄▄ ████
████▄█▀▀█ ▄ ▀▀▀ ▄ ▀▄ █▄▄▀ █▄▀ █ █▄ █▀▄█ █▀ █▄▄▄█ ▀█▄████
████▄ ▀▄▄▄▄▀████▄▀▀▄█ ██▄█ ▄▄▄ ▄▀▀ ▄▀ █▄▀██▀▄▄█▀ ▄█ ▄▄▀▄ ████
███████▄██▄▄▀ ▄▄ █▄█▀ ▀ ▀ ▄▄▄ █▀▄▀█▀▀ ▀▄▀▀█ ▄ ▄▄▄ ▄▀▀▀████
████ ▄▄▄▄▄ █▀▄ █ █▀▀▄▀▀ █▀ █▄█ ▀█▀▀▀▄▀▀ ▄ ▀█ █ █▄█ ▀▄ █████
████ █ █ █▄▀█▄▄▄▄ █▄▄▀▄▄▄█ ▄▀▀ ▄ █▄▄ ▀ █ ▄ ▄▄▄▄▀▀█████
████ █▄▄▄█ █▀ ▀▀▀ ▄█▀▄ ▄ ███ ██ ▄▄▀▄▄▄█▀ █▀▄▀██▄▀▀ ████
████▄▄▄▄▄▄▄█▄██▄▄██▄██████▄▄▄█████▄▄▄▄██▄▄██▄▄▄█▄█▄█▄██▄█████
█████████████████████████████████████████████████████████████
█████████████████████████████████████████████████████████████

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are configuring the client on a phone, scanning the QR code makes setup incredibly easy. Alternatively, the macOS client lets you simply copy and paste the client configuration shown above the QR code.&lt;/p&gt;
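&lt;p&gt;Since the &lt;code&gt;set&lt;/code&gt; commands above are entered in configuration mode, remember to commit and save them on the router before testing. You can then check that the interface is up with a standard op-mode command (output details vary slightly between VyOS versions):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@gw# commit
mhamzahkhan@gw# save
mhamzahkhan@gw# exit
mhamzahkhan@gw:~$ show interfaces wireguard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;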

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;As we conclude our journey through configuring VyOS as a WireGuard VPN server, you now possess a fully functional WireGuard VPN setup, empowering you to securely access your self-hosted digital resources from anywhere on the planet.&lt;/p&gt;

&lt;p&gt;In our ever-evolving, interconnected world, the demand for secure, remote network access remains as vital as ever. By utilising WireGuard and VyOS, you have armed yourself with the ability to stay seamlessly connected to your internal services and servers, whether you’re managing a personal web server, experimenting with home automation, or trying to access secure files on your office network.&lt;/p&gt;

&lt;p&gt;In my next post, I will be discussing how I use WireGuard to allow me to host services in my home lab, despite being behind CGNAT.&lt;/p&gt;

</description>
      <category>networking</category>
      <category>vpn</category>
      <category>homelab</category>
      <category>devops</category>
    </item>
    <item>
      <title>VyOS - Site-to-Site VPN using Wireguard and OSPF</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Thu, 07 Sep 2023 22:32:18 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/vyos-site-to-site-vpn-using-wireguard-and-ospf-2eco</link>
      <guid>https://dev.to/mhamzahkhan/vyos-site-to-site-vpn-using-wireguard-and-ospf-2eco</guid>
      <description>&lt;p&gt;Connecting two sites securely and efficiently is essential for many businesses and individuals.&lt;/p&gt;

&lt;p&gt;In this post, we’ll explore how to achieve seamless connectivity between two locations using the powerful combination of WireGuard, a modern and high-performance VPN protocol, and VyOS, a robust and versatile network operating system.&lt;/p&gt;

&lt;p&gt;Whether you’re looking to enhance communication between remote offices, create a secure link between your data center and a cloud-based infrastructure, or simply want to connect two geographically separated sites, this guide will walk you through the process, ensuring a reliable and secure connection every step of the way.&lt;/p&gt;

&lt;p&gt;To illustrate this process, I will use my own use case as an example. I manage equipment hosted in a colocation data center, which I affectionately refer to as my ‘colo-lab’, and I also maintain a ‘home-lab’.&lt;/p&gt;

&lt;p&gt;Previously, I relied on GRE over IPsec for connectivity between the two sites, but I’ve recently migrated these over to WireGuard.&lt;/p&gt;

&lt;p&gt;WireGuard boasts a slew of compelling advantages over traditional IPsec, including speed, security, and a refreshingly straightforward setup. Its minimalist design significantly simplifies the configuration process, especially when compared to the complexity of GRE over IPsec.&lt;/p&gt;

&lt;p&gt;Throughout this post, I’ll walk you through the precise steps I took to configure two VyOS routers to seamlessly integrate with WireGuard while enabling efficient route distribution through OSPF. By the end, you’ll be equipped with the knowledge to configure your own WireGuard based site-to-site VPN.&lt;/p&gt;

&lt;h2&gt;
  
  
  Topology
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Colo Lab
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;WireGuard Interface IP: 10.254.2.0/31&lt;/li&gt;
&lt;li&gt;Internal Networks:

&lt;ul&gt;
&lt;li&gt;10.254.112.0/24&lt;/li&gt;
&lt;li&gt;10.254.113.0/24&lt;/li&gt;
&lt;li&gt;10.254.114.0/24&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Internal Network Aggregate: 10.254.112.0/21&lt;/li&gt;

&lt;li&gt;Public IP: Referred to as &lt;code&gt;${COLO_LAB_PUBLIC_IP}&lt;/code&gt;
&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Home Lab
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;WireGuard Interface IP: 10.254.2.1/31&lt;/li&gt;
&lt;li&gt;Internal Networks:

&lt;ul&gt;
&lt;li&gt;10.254.88.0/24&lt;/li&gt;
&lt;li&gt;10.254.89.0/24&lt;/li&gt;
&lt;li&gt;10.254.90.0/24&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Internal Network Aggregate: 10.254.88.0/21&lt;/li&gt;

&lt;li&gt;Public IP: None (It’s behind CGNAT)&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Generate Keypairs
&lt;/h2&gt;

&lt;p&gt;First things first, let’s generate keypairs for both routers. Make note of these, and keep them safe.&lt;/p&gt;

&lt;p&gt;First the cololab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;generate pki wireguard key-pair
Private key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;COLOLAB_PRIVATE_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;
Public key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;COLOLAB_PUBLIC_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the homelab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;generate pki wireguard key-pair
Private key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOMELAB_PRIVATE_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;
Public key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOMELAB_PUBLIC_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configure WireGuard Interfaces
&lt;/h2&gt;

&lt;p&gt;Next, let’s set up the WireGuard interfaces.&lt;/p&gt;

&lt;p&gt;For these interfaces, I’ve chosen a private /31 range, which gives us precisely two IP addresses, perfect for a point-to-point link. In my example, we’ll use 10.254.2.0/31 and 10.254.2.1/31.&lt;/p&gt;

&lt;h3&gt;
  
  
  Colo Lab Router WireGuard Configuration
&lt;/h3&gt;

&lt;p&gt;Please note that because my home lab’s internet connection is behind CGNAT, I haven’t specified the peer address on the Colo Lab router. This means that the connection will be initiated from the home-lab side. If you have a static IP address (or dynamic IP address that doesn’t change much), it would be a good idea to specify the peer address so the connection can be initiated from either side.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;configure
&lt;span class="o"&gt;[&lt;/span&gt;edit]
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 address &lt;span class="s1"&gt;'10.254.2.0/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 description &lt;span class="s1"&gt;'Connection to Home-Lab'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 ip adjust-mss &lt;span class="s1"&gt;'1380'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 mtu &lt;span class="s1"&gt;'1420'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer home-lab allowed-ips &lt;span class="s1"&gt;'0.0.0.0/0'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer home-lab persistent-keepalive &lt;span class="s1"&gt;'10'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer home-lab public-key &lt;span class="s1"&gt;'${HOMELAB_PUBLIC_KEY}'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 port &lt;span class="s1"&gt;'51820'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 private-key &lt;span class="s1"&gt;'${COLOLAB_PRIVATE_KEY}'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Home Lab Router WireGuard Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;configure
&lt;span class="o"&gt;[&lt;/span&gt;edit]
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 address &lt;span class="s1"&gt;'10.254.2.1/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 description &lt;span class="s1"&gt;'Connection to Colo-Lab'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 ip adjust-mss &lt;span class="s1"&gt;'1380'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 mtu &lt;span class="s1"&gt;'1420'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab address &lt;span class="s1"&gt;'${COLO_LAB_PUBLIC_IP}'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab allowed-ips &lt;span class="s1"&gt;'0.0.0.0/0'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab persistent-keepalive &lt;span class="s1"&gt;'10'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab port &lt;span class="s1"&gt;'51820'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab public-key &lt;span class="s1"&gt;'${COLOLAB_PUBLIC_KEY}'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 port &lt;span class="s1"&gt;'51820'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 private-key &lt;span class="s1"&gt;'${HOMELAB_PRIVATE_KEY}'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
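&lt;p&gt;Once the interfaces are defined on both routers, commit and save the configuration on each side (the standard VyOS configuration-mode workflow), for example on the home lab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw# commit
mhamzahkhan@homelab-gw# save
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;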



&lt;h2&gt;
  
  
  Test the WireGuard Connection
&lt;/h2&gt;

&lt;p&gt;At this point, both routers should be able to ping each other via the VPN link:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;ping 10.254.2.1 count 4
PING 10.254.2.1 &lt;span class="o"&gt;(&lt;/span&gt;10.254.2.1&lt;span class="o"&gt;)&lt;/span&gt; 56&lt;span class="o"&gt;(&lt;/span&gt;84&lt;span class="o"&gt;)&lt;/span&gt; bytes of data.
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.339 ms
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.382 ms
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.344 ms
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.347 ms

&lt;span class="nt"&gt;---&lt;/span&gt; 10.254.2.1 ping statistics &lt;span class="nt"&gt;---&lt;/span&gt;
4 packets transmitted, 4 received, 0% packet loss, &lt;span class="nb"&gt;time &lt;/span&gt;3106ms
rtt min/avg/max/mdev &lt;span class="o"&gt;=&lt;/span&gt; 0.339/0.353/0.382/0.017 ms

mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;ping 10.254.2.0 count 4
PING 10.254.2.0 &lt;span class="o"&gt;(&lt;/span&gt;10.254.2.0&lt;span class="o"&gt;)&lt;/span&gt; 56&lt;span class="o"&gt;(&lt;/span&gt;84&lt;span class="o"&gt;)&lt;/span&gt; bytes of data.
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.290 ms
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.227 ms
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.404 ms
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.380 ms

&lt;span class="nt"&gt;---&lt;/span&gt; 10.254.2.0 ping statistics &lt;span class="nt"&gt;---&lt;/span&gt;
4 packets transmitted, 4 received, 0% packet loss, &lt;span class="nb"&gt;time &lt;/span&gt;3078ms
rtt min/avg/max/mdev &lt;span class="o"&gt;=&lt;/span&gt; 0.227/0.325/0.404/0.070 ms

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To gauge the bandwidth between our networks, we can use iPerf3.&lt;/p&gt;

&lt;p&gt;First, start iPerf3 in server mode on one side of the VPN. I’m running it on the colo-lab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;iperf3 &lt;span class="nt"&gt;-s&lt;/span&gt;
&lt;span class="nt"&gt;-----------------------------------------------------------&lt;/span&gt;
Server listening on 5201 &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="c"&gt;#1)&lt;/span&gt;
&lt;span class="nt"&gt;-----------------------------------------------------------&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, start iPerf3 on the home lab router. Let’s start with an upload bandwidth test from the home-lab router to the colo-lab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;iperf3 &lt;span class="nt"&gt;-c&lt;/span&gt; 10.254.2.0
Connecting to host 10.254.2.0, port 5201
&lt;span class="o"&gt;[&lt;/span&gt;5] &lt;span class="nb"&gt;local &lt;/span&gt;10.254.2.1 port 33008 connected to 10.254.2.0 port 5201
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate Retr Cwnd
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-1.00 sec 20.8 MBytes 174 Mbits/sec 99 207 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 1.00-2.00 sec 20.7 MBytes 174 Mbits/sec 0 269 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 2.00-3.00 sec 19.8 MBytes 166 Mbits/sec 131 194 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 3.00-4.00 sec 22.1 MBytes 185 Mbits/sec 0 263 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 4.00-5.00 sec 17.3 MBytes 145 Mbits/sec 195 18.7 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 5.00-6.00 sec 16.4 MBytes 137 Mbits/sec 63 224 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 6.00-7.00 sec 19.9 MBytes 167 Mbits/sec 95 168 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 7.00-8.00 sec 11.3 MBytes 95.2 Mbits/sec 123 123 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 8.00-9.00 sec 18.9 MBytes 158 Mbits/sec 0 202 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 9.00-10.00 sec 20.2 MBytes 169 Mbits/sec 35 207 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate Retr
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.00 sec 187 MBytes 157 Mbits/sec 741 sender
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.01 sec 186 MBytes 156 Mbits/sec receiver

iperf Done.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I’m not sure why there are retransmissions. I still need to investigate that, but it’s maxing out my home connection upload.&lt;/p&gt;

&lt;p&gt;Now, let’s reverse the test, with the colo-lab router sending data to the home-lab router. Use the &lt;code&gt;-R&lt;/code&gt; flag for this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;iperf3 &lt;span class="nt"&gt;-c&lt;/span&gt; 10.254.2.0 &lt;span class="nt"&gt;-R&lt;/span&gt;
Connecting to host 10.254.2.0, port 5201
Reverse mode, remote host 10.254.2.0 is sending
&lt;span class="o"&gt;[&lt;/span&gt;5] &lt;span class="nb"&gt;local &lt;/span&gt;10.254.2.1 port 52016 connected to 10.254.2.0 port 5201
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-1.00 sec 14.8 MBytes 124 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 1.00-2.00 sec 17.4 MBytes 145 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 2.00-3.00 sec 17.6 MBytes 148 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 3.00-4.00 sec 15.5 MBytes 130 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 4.00-5.00 sec 16.3 MBytes 137 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 5.00-6.00 sec 12.2 MBytes 102 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 6.00-7.00 sec 9.33 MBytes 78.3 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 7.00-8.00 sec 7.86 MBytes 65.9 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 8.00-9.00 sec 14.7 MBytes 124 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 9.00-10.00 sec 15.3 MBytes 128 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate Retr
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.01 sec 142 MBytes 119 Mbits/sec 282 sender
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.00 sec 141 MBytes 118 Mbits/sec receiver

iperf Done.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Some tuning may be needed, but for now, these numbers should suffice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configure OSPF
&lt;/h2&gt;

&lt;p&gt;Now, let’s dive into the OSPF configuration. Note that I use OSPF route summarization: the individual /24 subnets on each side are advertised into the backbone as a single summary route, keeping the routing tables small.&lt;/p&gt;
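&lt;p&gt;The &lt;code&gt;range&lt;/code&gt; statements in the configs below are what perform the summarization. As a sketch of what a single /21 summary buys us, Python’s &lt;code&gt;ipaddress&lt;/code&gt; module can confirm that the colo-lab /24s all fall inside it:&lt;/p&gt;

```python
import ipaddress

# The colo-lab /24s advertised into area 0.0.0.1, and the /21 summary
# announced into the backbone by the "range" statement.
summary = ipaddress.ip_network("10.254.112.0/21")
subnets = [ipaddress.ip_network(s) for s in
           ("10.254.112.0/24", "10.254.113.0/24", "10.254.114.0/24")]

# One route covers all three subnets (and leaves room to grow to 10.254.119.0/24).
print(all(s.subnet_of(summary) for s in subnets))  # True
print(summary.num_addresses)                       # 2048
```

The home-lab side works the same way with 10.254.88.0/21 covering its /24s.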

&lt;h3&gt;
  
  
  Colo Lab Router OSPF Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.0 network &lt;span class="s1"&gt;'10.254.2.0/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.112.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.113.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.114.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 range 10.254.112.0/21
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf interface eth0 passive
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf log-adjacency-changes
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf parameters router-id &lt;span class="s1"&gt;'10.254.2.0'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Home Lab Router OSPF Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.0 network &lt;span class="s1"&gt;'10.254.2.0/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.88.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.89.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.90.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 range 10.254.88.0/21
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf interface eth0 passive
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf log-adjacency-changes
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf parameters router-id &lt;span class="s1"&gt;'10.254.2.1'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the OSPF adjacency forms over the tunnel, each side’s summary route should appear in the other’s routing table, as if by magic!&lt;/p&gt;

&lt;h3&gt;
  
  
  Colo Lab Router Verification
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;show ip route 10.254.88.0
Routing entry &lt;span class="k"&gt;for &lt;/span&gt;10.254.88.0/21
 Known via &lt;span class="s2"&gt;"ospf"&lt;/span&gt;, distance 110, metric 2, best
 Last update 11:58:19 ago
 &lt;span class="k"&gt;*&lt;/span&gt; 10.254.2.1, via wg0, weight 1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Home Lab Router Verification
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;show ip route 10.254.88.0
Routing entry &lt;span class="k"&gt;for &lt;/span&gt;10.254.112.0/21
 Known via &lt;span class="s2"&gt;"ospf"&lt;/span&gt;, distance 110, metric 2, best
 Last update 12:00:02 ago
 &lt;span class="k"&gt;*&lt;/span&gt; 10.254.2.0, via wg0, weight 1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With WireGuard providing the tunnel and OSPF exchanging routes over it, the two sites can now communicate seamlessly. This guide lays a solid foundation for a site-to-site VPN, and there’s more to build on in future configurations.&lt;/p&gt;

&lt;p&gt;In my next post, we will configure a VyOS-based WireGuard VPN for road-warrior style clients, enabling secure remote access to your network from virtually anywhere with an internet connection.&lt;/p&gt;

&lt;p&gt;Stay tuned for the next installment, where we continue building on WireGuard and VyOS to expand the network. Subscribe to stay connected!&lt;/p&gt;

</description>
      <category>networking</category>
      <category>devops</category>
      <category>homelab</category>
      <category>vpn</category>
    </item>
    <item>
      <title>Using FreeIPA CA as an ACME Provider for cert-manager</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Wed, 27 Jul 2022 22:32:18 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/using-freeipa-ca-as-an-acme-provider-for-cert-manager-13d</link>
      <guid>https://dev.to/mhamzahkhan/using-freeipa-ca-as-an-acme-provider-for-cert-manager-13d</guid>
      <description>&lt;p&gt;I’m using &lt;a href="https://www.freeipa.org/" rel="noopener noreferrer"&gt;FreeIPA&lt;/a&gt; for authentication services in my home lab. It’s extreme overkill for my situation, as I don’t have many users (mainly just me!) but alas I like overkill. :)&lt;/p&gt;

&lt;p&gt;I am using FreeIPA’s DNS service to host some DNS subdomains for internal services. I have configured these subdomains through DNS delegations, but since my IPA servers are not accessible from the internet, this breaks both the HTTP-01 and DNS-01 verification challenges from &lt;a href="https://letsencrypt.org/" rel="noopener noreferrer"&gt;LetsEncrypt&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Yesterday evening, I was playing around with &lt;a href="https://www.truenas.com/truecommand/" rel="noopener noreferrer"&gt;TrueCommand&lt;/a&gt; and hosted it on one of my internal IPA domains. Since I cannot use LetsEncrypt to issue a certificate for it, I decided to use the CA built into FreeIPA, which also supports ACME.&lt;/p&gt;

&lt;p&gt;As all the machines that will need to use the service are already enrolled in IPA, the IPA CA certificate is also installed on those nodes, meaning any certificates issued by FreeIPA are automatically trusted.&lt;/p&gt;

&lt;p&gt;To get this to work, I had to first enable ACME support from within FreeIPA:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[root@ipa-server ~]# ipa-acme-manage enable

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FreeIPA’s ACME service supports both HTTP-01 and DNS-01 challenges, but I generally prefer DNS-01. For cert-manager to add the _acme-challenge DNS record to FreeIPA, we can use cert-manager’s RFC-2136 provider.&lt;/p&gt;
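&lt;p&gt;For context on what cert-manager will publish: per RFC 8555, a DNS-01 challenge is satisfied by a TXT record at &lt;code&gt;_acme-challenge.&amp;lt;name&amp;gt;&lt;/code&gt; containing the base64url-encoded SHA-256 digest of the key authorization. A minimal sketch, using made-up token and account-thumbprint values:&lt;/p&gt;

```python
import base64
import hashlib

# Hypothetical values for illustration only; the real ones come from the
# ACME server (token) and the ACME account key (thumbprint).
token = "evaGxfADs6pSRb2LAv9IZf17Dt3juxGJ-PCt92wr-oA"
thumbprint = "9jg46WB3rR_AHD-EBXdN7cBkH1WOu0tA3M9fm21mqTI"

key_authorization = f"{token}.{thumbprint}"
digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
# base64url without padding, as the challenge value in the TXT record.
txt_value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

print("_acme-challenge.truecommand.k8s.intahnet.co.uk. 300 IN TXT", txt_value)
```

cert-manager creates and cleans up this record automatically through the RFC-2136 update; the snippet only shows what briefly lands in the zone.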

&lt;p&gt;To do this, we must create a new TSIG key on our IPA server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[root@ipa-server ~]# tsig-keygen -a hmac-sha512 acme-update &amp;gt;&amp;gt; /etc/named/ipa-ext.conf
[root@ipa-server ~]# systemctl restart named-pkcs11.service

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable dynamic updates for the IPA DNS subdomain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[root@ipa-server ~]# ipa dnszone-mod k8s.intahnet.co.uk --dynamic-update=True --update-policy='grant acme-update wildcard * ANY;'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, I had to modify my cert-manager installation slightly to include my own CA certificate bundle, which includes my IPA CA cert. To do this I had to first create the bundle, and then create a Kubernetes ConfigMap for it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[mhamzahkhan@laptop ~]# cat /etc/ipa/ca.crt &amp;gt; ca-certificates.crt
[mhamzahkhan@laptop ~]# kubectl -n cert-manager create configmap ca-bundle --from-file ca-certificates.crt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Info:&lt;/strong&gt; If the machine you are using is enrolled in the IPA domain, you could also just use /etc/pki/tls/certs/ca-bundle.crt, which is actually what I did, since it contains all the other CA certificates that cert-manager may need (for example the ISRG Root X1 CA certificate, which cert-manager needs to properly access the LetsEncrypt ACME servers).&lt;/p&gt;

&lt;p&gt;Next, I had to modify the cert-manager deployment to make use of the ca-bundle. As I am using the cert-manager helm chart, this was quite easy. I added the following to my cert-manager helm values file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
volumes:
 - name: ca-bundle
 configMap:
 name: ca-bundle

volumeMounts:
 - name: ca-bundle
 mountPath: /etc/ssl/certs/ca-certificates.crt
 subPath: ca-certificates.crt
 readOnly: false

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this has been deployed, we need to create a secret in Kubernetes for the TSIG key. Grab the TSIG key we generated earlier from your IPA server (/etc/named/ipa-ext.conf), and create a Kubernetes secret with it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[mhamzahkhan@laptop ~]# kubectl -n cert-manager create secret generic ipa-tsig-secret --from-literal=tsig-secret-key="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, add a new ClusterIssuer for IPA’s ACME service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
 name: ipa
 namespace: cert-manager
spec:
 acme:
 email: admin@ipa.intahnet.co.uk
 server: https://ipa-ca.ipa.intahnet.co.uk/acme/directory
 privateKeySecretRef:
 name: ipa-issuer-account-key
 solvers:
 - dns01:
 rfc2136:
 nameserver: 10.0.0.22
 tsigKeyName: acme-update
 tsigAlgorithm: HMACSHA512
 tsigSecretSecretRef:
 name: ipa-tsig-secret
 key: tsig-secret-key
 selector:
 dnsZones:
 - 'k8s.intahnet.co.uk'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you should be set to request certificates!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
 name: truecommand-certificate
 namespace: default
spec:
 commonName: 'truecommand.k8s.intahnet.co.uk'
 dnsNames:
 - truecommand.k8s.intahnet.co.uk
 issuerRef:
 name: ipa
 kind: ClusterIssuer
 privateKey:
 algorithm: RSA
 encoding: PKCS1
 size: 4096
 secretName: truecommand-tls

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All working:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[mhamzahkhan@laptop ~]# kubectl get certificate
NAME READY SECRET AGE
truecommand-certificate True truecommand-tls 23s

[mhamzahkhan@laptop ~]# kubectl get secrets
NAME TYPE DATA AGE
truecommand-certificate-q8qkh kubernetes.io/tls 2 29s

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s a very similar process to use ExternalDNS with FreeIPA as ExternalDNS also supports RFC2136. I have not set this up yet, but the process is described in this excellent blog post: &lt;a href="https://astrid.tech/2021/04/18/0/k8s-freeipa-dns/" rel="noopener noreferrer"&gt;How to set up Dynamic DNS on FreeIPA for your Kubernetes Cluster&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>acme</category>
      <category>freeipa</category>
    </item>
  </channel>
</rss>
