<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: M. Hamzah Khan</title>
    <description>The latest articles on DEV Community by M. Hamzah Khan (@mhamzahkhan).</description>
    <link>https://dev.to/mhamzahkhan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3273796%2Fbc8d6e85-8d76-403b-9ef3-51e3c7b7672a.png</url>
      <title>DEV Community: M. Hamzah Khan</title>
      <link>https://dev.to/mhamzahkhan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/mhamzahkhan"/>
    <language>en</language>
    <item>
      <title>Grafana Alloy in My Homelab: Why I Run Three Separate Instances</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Sat, 07 Mar 2026 10:00:00 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/grafana-alloy-in-my-homelab-why-i-run-three-separate-instances-5e7</link>
      <guid>https://dev.to/mhamzahkhan/grafana-alloy-in-my-homelab-why-i-run-three-separate-instances-5e7</guid>
      <description>&lt;p&gt;When I set up Grafana Alloy across my homelab Kubernetes cluster, the first question was: how many instances do I actually need? Most tutorials show a single Alloy deployment handling everything. That works for a proof of concept but it papers over a real architectural question — one that comes down to a single word: &lt;strong&gt;clustering&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;My setup runs three separate Alloy deployments inside Kubernetes, plus standalone Alloy on bare-metal nodes outside the cluster. The reasons for the split are not aesthetic. The primary driver is clustering — some collection tasks need it and others must not use it. The secondary driver is resource isolation: if &lt;code&gt;alloy-cluster&lt;/code&gt; starts OOMing under a burst of ServiceMonitor scrapes, I do not want host metrics collection to stop. Keeping them separate means a problem in one deployment cannot starve the others.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Alloy Clustering Actually Does
&lt;/h2&gt;

&lt;p&gt;Alloy clustering uses a gossip protocol to form a peer mesh between instances. When a component like &lt;code&gt;prometheus.operator.servicemonitors&lt;/code&gt; has &lt;code&gt;clustering { enabled = true }&lt;/code&gt;, all Alloy instances in the cluster share a hash ring and each one independently computes which targets it owns. The result is that a set of N replicas collectively scrapes all targets, with each target scraped exactly once.&lt;/p&gt;

&lt;p&gt;Peer discovery works via DNS against a Kubernetes headless Service — which is why clustering requires a StatefulSet. StatefulSets give pods stable DNS identities; Deployments and DaemonSets do not.&lt;/p&gt;
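&lt;p&gt;With the official &lt;code&gt;grafana/alloy&lt;/code&gt; Helm chart, that combination is a few lines of values. This is a sketch only; key names can shift between chart versions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;# values.yaml sketch: clustered Alloy
alloy:
  clustering:
    enabled: true

controller:
  type: statefulset   # clustering needs stable per-pod DNS identities
  replicas: 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The chart takes care of the headless Service the replicas discover each other through.&lt;/p&gt;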

&lt;p&gt;This is enormously useful when you want high-availability metrics collection with multiple replicas. Without clustering, N replicas each scrape all targets independently, producing N× duplicate time series and &lt;code&gt;out-of-order sample&lt;/code&gt; errors downstream.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why DaemonSet Pods Do Not Need Clustering
&lt;/h2&gt;

&lt;p&gt;Here is the thing: DaemonSet pods are already partitioned by Kubernetes. There is one pod per node. Each pod only ever scrapes resources local to its own node — the host filesystem, the local kubelet endpoint, the local cAdvisor endpoint. There is no shared pool of targets to distribute.&lt;/p&gt;

&lt;p&gt;Enabling clustering on a DaemonSet would achieve nothing. The &lt;a href="https://grafana.com/docs/alloy/latest/get-started/clustering/" rel="noopener noreferrer"&gt;Grafana Alloy clustering docs&lt;/a&gt; are blunt about this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“A particularly common mistake is enabling clustering on logs collecting DaemonSets. Collecting logs from Pods on the mounted node doesn’t benefit from having clustering enabled since each instance typically collects logs only from Pods on its own node.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Each DaemonSet pod is an entirely independent instance. The work partitioning is handled by Kubernetes, not by Alloy’s gossip protocol.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Deployments
&lt;/h2&gt;

&lt;p&gt;This is where the topology comes from. Tasks divide into three categories: inherently node-local (DaemonSet, no clustering), cluster-wide with HA (StatefulSet, clustering on), and singleton (single Deployment, no clustering). Running one Alloy that tries to do all three would either require clustering on things that don’t need it, or no clustering on things that do. It would also mean a single resource budget covering everything — one OOM kill and both host metrics and cluster-wide scraping go down together.&lt;/p&gt;

&lt;h3&gt;
  
  
  alloy-node — DaemonSet, no clustering
&lt;/h3&gt;

&lt;p&gt;One pod per node. Tolerates all taints so it runs on control plane nodes too. Runs with &lt;code&gt;hostNetwork: true&lt;/code&gt; and &lt;code&gt;hostPID: true&lt;/code&gt; so it can see the host’s process tree and network interfaces.&lt;/p&gt;
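&lt;p&gt;As Helm values for the same chart, the important parts look roughly like this. A sketch, not my exact release — the DNS policy and tolerations shown are assumptions about what such a config needs:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;controller:
  type: daemonset
  hostNetwork: true
  hostPID: true
  # with hostNetwork, this keeps cluster DNS resolution working
  dnsPolicy: ClusterFirstWithHostNet
  tolerations:
    - operator: Exists   # tolerate everything, including control plane taints
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;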

&lt;p&gt;What it collects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Host metrics&lt;/strong&gt; via the built-in &lt;code&gt;prometheus.exporter.unix&lt;/code&gt; — Alloy’s native node_exporter. No separate binary needed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cAdvisor metrics&lt;/strong&gt; (container CPU/memory) scraped from the local kubelet endpoint only. The key is filtering discovery results to the local node using &lt;code&gt;constants.hostname&lt;/code&gt;, so each pod only scrapes itself:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;discovery&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;relabel&lt;/span&gt; &lt;span class="s2"&gt;"local_node_only_cadvisor"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;discovery&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;nodes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;targets&lt;/span&gt;

 &lt;span class="nx"&gt;rule&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;source_labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"__meta_kubernetes_node_name"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"keep"&lt;/span&gt;
 &lt;span class="nx"&gt;regex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without this filter, every DaemonSet pod would attempt to scrape every node’s cAdvisor — 7 pods × 7 nodes = 49 scrape attempts for what should be 7.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;kubelet metrics&lt;/strong&gt; using the same local-node-only pattern.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pod logs&lt;/strong&gt; from &lt;code&gt;/var/log/pods/**/*.log&lt;/code&gt; with CRI parsing and label extraction from the file path (namespace, pod name, container name).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;systemd journal logs&lt;/strong&gt; via &lt;code&gt;loki.source.journal&lt;/code&gt; — picks up kubelet, containerd, and any other systemd units on the host.&lt;/li&gt;
&lt;/ul&gt;
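&lt;p&gt;The pod log pipeline can be sketched with three components. Component labels and the &lt;code&gt;loki.write&lt;/code&gt; target are illustrative, and the path-based label extraction is left out here:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;local.file_match "pod_logs" {
  path_targets = [{ __path__ = "/var/log/pods/**/*.log" }]
}

loki.source.file "pod_logs" {
  targets    = local.file_match.pod_logs.targets
  forward_to = [loki.process.pod_logs.receiver]
}

loki.process "pod_logs" {
  // Parse the CRI wire format: timestamp, stream, flags, message
  stage.cri { }

  forward_to = [loki.write.default.receiver]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;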

&lt;p&gt;No clustering block anywhere in this config. Each pod runs entirely independently.&lt;/p&gt;

&lt;h3&gt;
  
  
  alloy-cluster — StatefulSet, clustering on
&lt;/h3&gt;

&lt;p&gt;Two replicas with &lt;code&gt;clustering.enabled: true&lt;/code&gt;. This is for cluster-wide metric collection — anything that requires Kubernetes API access to discover targets and that would produce duplicates if scraped by multiple independent instances.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;servicemonitors&lt;/span&gt; &lt;span class="s2"&gt;"services"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;remote_write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

 &lt;span class="nx"&gt;clustering&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;operator&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;podmonitors&lt;/span&gt; &lt;span class="s2"&gt;"pods"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;remote_write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

 &lt;span class="nx"&gt;clustering&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;enabled&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two replicas share the ServiceMonitor and PodMonitor workload via the hash ring. If one goes down, the other takes the full load. When it comes back, targets are automatically rebalanced.&lt;/p&gt;

&lt;p&gt;It also handles Mimir rule synchronisation — reading &lt;code&gt;PrometheusRule&lt;/code&gt; CRDs from Kubernetes and syncing them into Mimir’s ruler:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;mimir&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rules&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes&lt;/span&gt; &lt;span class="s2"&gt;"local"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;address&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"http://mimir-ruler.mimir-system.svc.cluster.local:8080"&lt;/span&gt;
 &lt;span class="nx"&gt;tenant_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"1"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And an OTLP receiver for anything that wants to push telemetry in OpenTelemetry format, exposed via a regular Service backed by both replicas.&lt;/p&gt;
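&lt;p&gt;The receiver itself is small. The endpoints and downstream component names below are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }

  http {
    endpoint = "0.0.0.0:4318"
  }

  output {
    metrics = [otelcol.exporter.prometheus.default.input]
    logs    = [otelcol.exporter.loki.default.input]
  }
}

// Convert OTLP metrics and logs into the existing write paths
otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.default.receiver]
}

otelcol.exporter.loki "default" {
  forward_to = [loki.write.default.receiver]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;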

&lt;p&gt;The StatefulSet uses 1Gi persistent storage (Rook/Ceph SSD) for Alloy’s write-ahead log, which buffers data locally if Mimir or Loki are temporarily unavailable.&lt;/p&gt;

&lt;h3&gt;
  
  
  alloy-kube-events — single Deployment, no clustering
&lt;/h3&gt;

&lt;p&gt;Kubernetes events exist only in the API server and are garbage collected after a short window. This deployment runs a single replica that watches the events API continuously and ships everything to Loki:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes_events&lt;/span&gt; &lt;span class="s2"&gt;"kubernetes_events"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;job_name&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"integrations/kubernetes/eventhandler"&lt;/span&gt;
 &lt;span class="nx"&gt;log_format&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"json"&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kubernetes_events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single replica is correct here. Events are cluster-wide objects, not node-local, so a DaemonSet would forward each event from every node. But you also do not need or want clustering — a second replica with clustering enabled would not help, and a second unclustered replica would duplicate all events. One instance, watching the API, is the right answer.&lt;/p&gt;

&lt;p&gt;Very lightweight: 50m CPU request, 128Mi RAM.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cardinality: An Ongoing Job
&lt;/h2&gt;

&lt;p&gt;Getting the topology right is the structural problem. Cardinality is the operational one. In a Kubernetes cluster with many pods and containers, the default label sets from node_exporter and cAdvisor generate an enormous number of time series — and Mimir has to store all of them. Left unchecked, this drives up memory usage across the whole observability stack.&lt;/p&gt;

&lt;p&gt;The approach is the same everywhere: drop labels and metrics you will never query, as close to the source as possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual network interfaces
&lt;/h3&gt;

&lt;p&gt;A Kubernetes node running many pods will have hundreds of virtual network interfaces — one &lt;code&gt;veth&lt;/code&gt; pair per pod, plus Calico (&lt;code&gt;cali*&lt;/code&gt;) interfaces. The node_exporter &lt;code&gt;netclass&lt;/code&gt; and &lt;code&gt;netdev&lt;/code&gt; collectors would create a separate set of time series for every one of them. They are not useful for node-level network monitoring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt; &lt;span class="s2"&gt;"node_exporter_metrics"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;netclass&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;ignored_devices&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^(veth.*|cali.*|[a-f0-9]{15})$"&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;

 &lt;span class="nx"&gt;netdev&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;device_exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^(veth.*|cali.*|[a-f0-9]{15})$"&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The regex also catches 15-character hex strings — the interface names Kubernetes generates for container network namespaces. Without this, every pod churn event adds and then expires a batch of time series.&lt;/p&gt;

&lt;h3&gt;
  
  
  Container and virtual filesystems
&lt;/h3&gt;

&lt;p&gt;A Kubernetes node also mounts a huge number of ephemeral filesystems: one &lt;code&gt;overlay&lt;/code&gt; mount per container layer, &lt;code&gt;tmpfs&lt;/code&gt; for secrets and service account tokens, &lt;code&gt;cgroup&lt;/code&gt; hierarchies, &lt;code&gt;proc&lt;/code&gt;, &lt;code&gt;devtmpfs&lt;/code&gt;, and so on. None of these are useful for disk space monitoring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;filesystem&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;fs_types_exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|hugetlbfs|iso9660|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$"&lt;/span&gt;
 &lt;span class="nx"&gt;mount_points_exclude&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"^/(dev|proc|sys|var/lib/docker/.+|var/lib/kubelet/.+)($|/)"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Dropping unused collectors entirely
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;ipvs&lt;/code&gt; collector is disabled outright — the cluster uses iptables, not IPVS. There is no point scraping metrics for something that is not running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt; &lt;span class="s2"&gt;"node_exporter_metrics"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;disable_collectors&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"ipvs"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  cAdvisor container labels
&lt;/h3&gt;

&lt;p&gt;cAdvisor attaches &lt;code&gt;id&lt;/code&gt; and &lt;code&gt;name&lt;/code&gt; labels to container metrics. The &lt;code&gt;id&lt;/code&gt; label is the full container runtime ID — a long hex string that is unique per container instance and changes every time a pod restarts. Keeping it would mean every pod restart adds a fresh batch of time series that go stale and are never written to again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;relabel&lt;/span&gt; &lt;span class="s2"&gt;"drop_cadvisor"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;rule&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"labeldrop"&lt;/span&gt;
 &lt;span class="nx"&gt;regex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"id|name|instance"&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pod log stream labels
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;filename&lt;/code&gt; label on pod logs is the full path on the host: &lt;code&gt;/var/log/pods/&amp;lt;namespace&amp;gt;_&amp;lt;pod-name&amp;gt;_&amp;lt;pod-uid&amp;gt;/&amp;lt;container&amp;gt;/&amp;lt;n&amp;gt;.log&lt;/code&gt;. The pod UID component is unique per pod instance, so every pod restart creates a new log stream label value that Loki has to index. Dropping it keeps the stream cardinality manageable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;stage&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;label_drop&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;values&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"filename"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"flags"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The useful labels — &lt;code&gt;namespace&lt;/code&gt;, &lt;code&gt;pod&lt;/code&gt;, &lt;code&gt;container&lt;/code&gt;, &lt;code&gt;stream&lt;/code&gt; — are extracted separately from the file path via a regex stage and kept. Only the high-cardinality junk is dropped.&lt;/p&gt;
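&lt;p&gt;The same idea can also be sketched at discovery time, before the log files are even read, using &lt;code&gt;discovery.relabel&lt;/code&gt; rules over the &lt;code&gt;__path__&lt;/code&gt; label. The regex and component labels here are illustrative:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;discovery.relabel "pod_logs" {
  targets = local.file_match.pod_logs.targets

  // /var/log/pods/&amp;lt;namespace&amp;gt;_&amp;lt;pod&amp;gt;_&amp;lt;uid&amp;gt;/&amp;lt;container&amp;gt;/&amp;lt;n&amp;gt;.log
  rule {
    source_labels = ["__path__"]
    regex         = "/var/log/pods/([^_]+)_([^_]+)_[^/]+/([^/]+)/.*"
    target_label  = "namespace"
    replacement   = "$1"
  }

  rule {
    source_labels = ["__path__"]
    regex         = "/var/log/pods/([^_]+)_([^_]+)_[^/]+/([^/]+)/.*"
    target_label  = "pod"
    replacement   = "$2"
  }

  rule {
    source_labels = ["__path__"]
    regex         = "/var/log/pods/([^_]+)_([^_]+)_[^/]+/([^/]+)/.*"
    target_label  = "container"
    replacement   = "$3"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;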

&lt;h3&gt;
  
  
  Internal scrape metrics
&lt;/h3&gt;

&lt;p&gt;node_exporter emits &lt;code&gt;node_scrape_collector_*&lt;/code&gt; metrics that track its own internal scrape performance per collector. Useful for debugging node_exporter itself, but not worth storing long-term in Mimir:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;rule&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;source_labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;" __name__"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="nx"&gt;regex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"node_scrape_collector_.+"&lt;/span&gt;
 &lt;span class="nx"&gt;action&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"drop"&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is all incremental work. As new exporters get added or existing dashboards evolve, there are always new labels to audit and unused metrics to prune. The cardinality pressure does not go away — it just needs to be managed continuously.&lt;/p&gt;




&lt;h2&gt;
  
  
  Standalone Alloy on Bare-Metal Nodes
&lt;/h2&gt;

&lt;p&gt;Not everything in the lab runs inside Kubernetes. Proxmox hypervisors, the VyOS router, and Raspberry Pi systems all run standalone Alloy as a systemd service. The config is simpler — no pod log collection, no Kubernetes API access, just host metrics and journal logs forwarded to the same Mimir and Loki endpoints as the cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt; &lt;span class="s2"&gt;"node"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;scrape&lt;/span&gt; &lt;span class="s2"&gt;"node"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;targets&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exporter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;unix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;targets&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;prometheus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;remote_write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="err"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;journal&lt;/span&gt; &lt;span class="s2"&gt;"journal"&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;forward_to&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;loki&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;write&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;receiver&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
 &lt;span class="nx"&gt;labels&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
 &lt;span class="nx"&gt;job&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"integrations/systemd-journal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="nx"&gt;instance&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;constants&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hostname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same &lt;code&gt;datacentre&lt;/code&gt; and &lt;code&gt;cluster&lt;/code&gt; external labels are set on the remote_write and loki.write blocks, so in Grafana I can use a single dashboard and filter between Kubernetes nodes and bare-metal hosts.&lt;/p&gt;
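&lt;p&gt;Those labels are set once on each write component, so every series and stream from the host carries them automatically. The URLs and label values below are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight hcl"&gt;&lt;code&gt;prometheus.remote_write "default" {
  endpoint {
    url = "http://mimir.example.internal/api/v1/push"
  }

  external_labels = {
    datacentre = "lon1",
    cluster    = "bare-metal",
  }
}

loki.write "default" {
  endpoint {
    url = "http://loki.example.internal/loki/api/v1/push"
  }

  external_labels = {
    datacentre = "lon1",
    cluster    = "bare-metal",
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;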




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;graph LR
subgraph bare["Bare-metal nodes"]
BM["alloy (systemd)
Proxmox · VyOS · Raspberry Pi"]
end
subgraph k8s["Kubernetes Cluster — lab-lon1-uk"]
subgraph ds["DaemonSet — 1 per node"]
AN["alloy-node
Host metrics · cAdvisor
Kubelet · Pod logs
Journal logs"]
end
subgraph sts["StatefulSet ×2, clustering on"]
AC["alloy-cluster
ServiceMonitors · PodMonitors
Mimir rules sync · OTLP receiver"]
end
subgraph dep["Deployment ×1"]
AE["alloy-kube-events
Kubernetes events"]
end
end
subgraph obs["Observability backends"]
MIMIR[("Mimir")]
LOKI[("Loki")]
GRAFANA["Grafana"]
end
AN --&amp;gt;|metrics| MIMIR
AN --&amp;gt;|logs| LOKI
AC --&amp;gt;|metrics| MIMIR
AC --&amp;gt;|logs| LOKI
AE --&amp;gt;|logs| LOKI
BM --&amp;gt;|metrics| MIMIR
BM --&amp;gt;|logs| LOKI
MIMIR --&amp;gt; GRAFANA
LOKI --&amp;gt; GRAFANA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Deployment&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Clustering&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;alloy-node&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DaemonSet&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Node-local collection — Kubernetes already partitions by node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;alloy-cluster&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;StatefulSet (×2)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Cluster-wide ServiceMonitor/PodMonitor scraping — needs HA without duplicates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;alloy-kube-events&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Deployment (×1)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Single-instance by design — duplicate event forwarding would be wrong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standalone&lt;/td&gt;
&lt;td&gt;systemd&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;Bare-metal hosts outside the cluster&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The organising principle is clustering, not aesthetics. If a task is node-local, a DaemonSet handles partitioning naturally. If a task is cluster-wide and you want more than one replica, clustering is what prevents duplicate data. And if a task must run exactly once, you use a single Deployment and keep clustering out of the picture entirely.&lt;/p&gt;




&lt;h2&gt;
  
  
  This Is Also What Grafana Does
&lt;/h2&gt;

&lt;p&gt;It is worth noting that Grafana’s own &lt;a href="https://github.com/grafana/k8s-monitoring-helm" rel="noopener noreferrer"&gt;k8s-monitoring Helm chart&lt;/a&gt; arrives at the same topology. Their chart deploys:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;alloy-metrics&lt;/code&gt; — StatefulSet, for cluster-wide metrics collection&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;alloy-logs&lt;/code&gt; — DaemonSet, for node-local pod and host log collection&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;alloy-singleton&lt;/code&gt; — single Deployment, for cluster events and other once-only tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The names are different and the internals diverge — their chart is considerably more opinionated, with its own abstraction layer over the raw Alloy config — but the underlying reasoning is identical.&lt;/p&gt;

&lt;p&gt;My current setup rolls its own Helm releases and Alloy configs directly. I plan to migrate to the k8s-monitoring chart, which also brings in the &lt;a href="https://grafana.com/docs/grafana-cloud/monitor-infrastructure/kubernetes-monitoring/configuration/helm-chart-config/helm-chart/collector-reference/" rel="noopener noreferrer"&gt;Alloy Operator&lt;/a&gt; for lifecycle management of the collector instances. When that migration happens I will write it up.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>homelab</category>
      <category>monitoring</category>
      <category>devops</category>
    </item>
    <item>
      <title>Parenting Like a DevOps Engineer: Managing the Chaos of Family Life</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Thu, 12 Jun 2025 20:00:00 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/parenting-like-a-devops-engineer-managing-the-chaos-of-family-life-1pfa</link>
      <guid>https://dev.to/mhamzahkhan/parenting-like-a-devops-engineer-managing-the-chaos-of-family-life-1pfa</guid>
      <description>&lt;p&gt;Father’s Day just passed, which got me thinking—not just about fatherhood in general, but how &lt;em&gt;weirdly&lt;/em&gt; useful my job as a DevOps engineer has been in helping me parent. I have three kids: two sons (8 years old, and 6 years old), and one daughter (4 years old). They’re amazing, unpredictable, and chaotic—kind of like a Kubernetes cluster that’s constantly in flux, demanding constant monitoring, quick rollbacks, and a whole lot of automation to keep from spiralling into an unmanageable mess.&lt;/p&gt;

&lt;p&gt;I’m not the world’s greatest parent. Far from it. But I’m learning. Slowly. And somewhere between incident response and bedtime battles, I’ve realised that parenting, like DevOps, is mostly about managing chaos, making tiny, incremental improvements and iterating on what works.&lt;/p&gt;

&lt;p&gt;Just like in DevOps, the key to a happy home is good ‘observability’ – mainly through the faint sounds of mischief from the other room.&lt;/p&gt;

&lt;p&gt;A few months ago, my six-year-old began resisting going to school. Each morning turned into a dramatic struggle. When we asked him why he didn’t want to go, he would simply shrug and mumble, “I don’t like it.” Unfortunately, that didn’t provide us with much actionable information.&lt;/p&gt;

&lt;p&gt;In engineering, when problems arise, we start by gathering context. We don’t jump to conclusions; instead, we observe and investigate. So, one day, I invited him into my home office—my safe space—and told him it was our safe space now. “In here,” I explained, “we’re friends who can talk about anything, from the silliest thing to the craziest. Just us. No pressure.”&lt;/p&gt;

&lt;p&gt;He sat quietly in the chair beside me for a while. Then, finally, he said:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“I don’t like school because… I don’t know how to talk to the other kids.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That hit me hard. He wasn’t being defiant; he was simply overwhelmed. It particularly resonated with me because it was an issue I struggled with as a child too.&lt;/p&gt;

&lt;p&gt;From there, we were able to speak to his teacher, who gently helped him integrate into games with other children. Now that he has friends, he actually looks forward to seeing them. That breakthrough happened not through interrogation but through observability and patience.&lt;/p&gt;

&lt;p&gt;Using my home office as our safe space has now become a regular occurrence. Strangely, this technique of establishing a room as our “safe space” doesn’t work for my wife.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔁 Blameless Postmortems (Even When Homework is Due Tomorrow)
&lt;/h2&gt;

&lt;p&gt;DevOps culture teaches us to run blameless retrospectives after incidents. Not because we don’t care about what went wrong but because assigning blame prevents learning.&lt;/p&gt;

&lt;p&gt;My 8-year-old has a bad habit of revealing school projects the night before they’re due. No matter how often we ask him, “Any homework?” he’ll respond with an Oscar-worthy performance of “Nope.” Then, at 8:00 PM on a Thursday: “Oh yeah, I need to make a cardboard Roman sword and write about it.”&lt;/p&gt;

&lt;p&gt;The old me would’ve panicked or scolded. But now, I try to treat it like a retro: What were the signals we missed? How can we improve visibility? Do we need a new “homework alerting system” (also known as a whiteboard on the fridge)?&lt;/p&gt;

&lt;p&gt;We still get frustrated. But now it’s frustration aimed at the system, not the child.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧭 Observability: Beyond the Logs (and into the Babychinos)
&lt;/h2&gt;

&lt;p&gt;With my youngest—she’s four—things are different. She’s in the plushie-and-babychino phase of life, so we go on “coffee dates” together. I get a double espresso latte; she gets a babychino and a cinnamon swirl, and we just… sit. She talks about Barbie, how she wants to be a ballerina, and how she wants a real pet sheep she’d call ‘Baa-llerina’ because it’s a sheep, and sheep say “baa,” and she likes ballet.&lt;/p&gt;

&lt;p&gt;She doesn’t say, “Dad, I’m feeling emotionally disconnected and would benefit from some focused one-on-one time.” But I’ve learned to watch the metrics: her mood shifts, clinginess, eye contact, sleep patterns. You get better at reading logs when you stop waiting for alerts.&lt;/p&gt;

&lt;p&gt;Parenting isn’t just about reacting to tantrums—it’s about noticing subtle changes and responding early.&lt;/p&gt;

&lt;p&gt;Observability at home? It’s empathy, finely tuned with instrumentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤖 Automation: The Bedtime Pipeline (and Beyond)
&lt;/h2&gt;

&lt;p&gt;In DevOps, we obsess over automation. Why? Because it reduces friction, ensures consistency, and frees up our engineers for more complex, creative work. Turns out, the same principle applies when you’re trying to get three small humans from hyperactive to horizontal.&lt;/p&gt;

&lt;p&gt;Our bedtime routine, for example, is a finely tuned, automated pipeline: Dinner, PJs, brushing teeth, using the toilet, stories, cuddles, and lights out. When it works, it’s beautiful. Each step flows into the next, reducing decision fatigue for both us and the kids. They know what’s coming, which minimises resistance. We know what’s coming, which minimises parental meltdowns.&lt;/p&gt;

&lt;p&gt;It’s not just bedtime; it works for the morning routine before school or even just having designated spots for shoes and backpacks – these are all tiny automations. They’re like mini-scripts running in the background of our family life, reducing cognitive load and preventing us from constantly having to “manually deploy” every single task. When the system is automated, we have more time and energy for unexpected ‘incidents’ – like explaining for the fifth time why we can’t have a pet unicorn.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔄 Continuous Integration and Daily Stand-Ups
&lt;/h2&gt;

&lt;p&gt;In engineering, Continuous Integration refers to the practice of frequently merging code into a shared project repository. This approach includes automated builds and tests that detect issues early on, assisting in the identification of conflicts before they develop into major problems.&lt;/p&gt;

&lt;p&gt;My wife and I may not be merging lines of code, but we are continually integrating our parenting approaches. We represent two distinct ‘branches’ of the same ‘project,’ and if we don’t regularly synchronise, we risk encountering merge conflicts that affect the entire ‘system’ (i.e., the kids).&lt;/p&gt;

&lt;p&gt;Our daily stand-up usually happens over breakfast or after the kids are asleep. We ask questions like, “How was school pickup?” “Did you talk to him about the math homework?” and “She seems a bit quiet or clingy today; is something wrong?” These are not formal meetings but quick and important check-ins. We share what we notice, align our responses to new behaviours, and bring up any potential issues before they escalate. This keeps our family approach—our shared way of parenting—consistent and harmonious. When we are not on the same page, things become chaotic. One parent says yes, the other says no, and suddenly, our perfectly crafted ‘deployment’ (e.g., getting everyone out the door on time) grinds to a halt. CI, even in parenting, makes for a smoother operation.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 The Monolith vs Microservices Debate (aka Marriage)
&lt;/h2&gt;

&lt;p&gt;My wife and I parent in very different ways. She’s not an engineer. She doesn’t think about “event-driven architecture” or “incident response timelines.” Her approach is more intuitive, relational, and deeply human.&lt;/p&gt;

&lt;p&gt;At first, this led to some friction. Why didn’t she want to optimise bedtime flow with a Kanban board? Why didn’t I just &lt;em&gt;feel&lt;/em&gt; that someone was about to have a meltdown?&lt;/p&gt;

&lt;p&gt;But over time, I’ve realised that our differences are a feature, not a bug. We balance each other out. Like a good system composed of microservices and a stable monolith—you need both agility and cohesion. Flexibility and structure. Love and logic.&lt;/p&gt;

&lt;p&gt;We’re both debugging this system in real-time, just using different tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  🕹 When Roblox Becomes Pair Programming
&lt;/h2&gt;

&lt;p&gt;I don’t particularly enjoy Roblox. The games are confusing, and they give me motion sickness like I just went on a roller coaster.&lt;/p&gt;

&lt;p&gt;But my 6-year-old loves it. He &lt;em&gt;lights up&lt;/em&gt; when we play together.&lt;/p&gt;

&lt;p&gt;The other day, he tried to explain a game to me. I nodded along, trying not to feel sick while hiding from “Scary Larry.” He laughed at how lost I was. I was confused but still there.&lt;/p&gt;

&lt;p&gt;This is what matters. The primary objective of pair programming is to write better code and share knowledge. However, its real strength is in the teamwork and connection built during the process. Similar to Roblox, the most valuable result isn’t always what shows up on the screen.&lt;/p&gt;

&lt;h2&gt;
  
  
  🙃 Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;DevOps didn’t make me a perfect parent, but it gave me a mindset: one that values systems thinking, curiosity, and resilience.&lt;/p&gt;

&lt;p&gt;And fatherhood made me a better engineer, too. It taught me that no system—technical or human—responds well to blame. That emotional outages need graceful recovery.&lt;/p&gt;

&lt;p&gt;So, this Father’s Day, I’m not celebrating my success. I’m celebrating the debugging process. The retros. The messy commits. The half-working prototypes.&lt;/p&gt;

&lt;p&gt;And the three little humans who remind me daily that parenting is the most complex system I’ll ever help build.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>career</category>
      <category>productivity</category>
      <category>watercooler</category>
    </item>
    <item>
      <title>How to Redirect Hardcoded DNS with VyOS (Perfect for Pi-hole or Blocky Setups)</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Thu, 28 Mar 2024 10:25:00 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/how-to-redirect-hardcoded-dns-with-vyos-perfect-for-pi-hole-or-blocky-setups-5670</link>
      <guid>https://dev.to/mhamzahkhan/how-to-redirect-hardcoded-dns-with-vyos-perfect-for-pi-hole-or-blocky-setups-5670</guid>
      <description>&lt;p&gt;Smart devices like Chromecasts and TVs often use hardcoded DNS servers that bypass your custom DNS filters like Pi-hole or Blocky. In this guide, you’ll learn how to configure VyOS NAT rules to &lt;strong&gt;intercept and redirect all DNS requests&lt;/strong&gt; to your preferred DNS server — even if the client tries to bypass it.&lt;/p&gt;

&lt;p&gt;I use &lt;a href="https://0xerr0r.github.io/blocky/" rel="noopener noreferrer"&gt;Blocky&lt;/a&gt; as my DNS server on my home network, but this should work with Pi-hole and any other DNS server as well.&lt;/p&gt;

&lt;p&gt;To stop this behaviour, I set up a few NAT rules on my &lt;a href="https://vyos.io/" rel="noopener noreferrer"&gt;VyOS&lt;/a&gt; router to redirect DNS queries aimed at unknown DNS servers to my Blocky server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: Define Allowed DNS Servers
&lt;/h3&gt;

&lt;p&gt;Start by creating an address group containing the allowed DNS servers. This ensures that legitimate DNS queries are not redirected.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cisco_ios"&gt;&lt;code&gt;&lt;span class="k"&gt;mhamzahkhan@homelab-gw:~$&lt;/span&gt; configure
&lt;span class="k"&gt;[edit]&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; firewall group address-group dns-servers address '&lt;span class="m"&gt;10.254.95.3&lt;/span&gt;'
&lt;span class="k"&gt;set&lt;/span&gt; firewall group address-group dns-servers address '&lt;span class="m"&gt;10.254.95.4&lt;/span&gt;'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Redirect Unapproved DNS Requests with NAT
&lt;/h3&gt;

&lt;p&gt;Next, set up a destination NAT rule to redirect DNS queries not intended for the allowed DNS servers to the Blocky DNS server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cisco_ios"&gt;&lt;code&gt;&lt;span class="k"&gt;mhamzahkhan@homelab-gw:~$&lt;/span&gt; configure
&lt;span class="k"&gt;[edit]&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 description 'Captive DNS'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 destination group address-group '!dns-servers'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 destination port '53'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 inbound-interface name 'bond1.90'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 protocol 'tcp_udp'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 translation address '10.254.95.4'&lt;/span&gt;
&lt;span class="k"&gt;set&lt;/span&gt; nat &lt;span class="k"&gt;destin&lt;/span&gt;&lt;span class="c1"&gt;ation rule 5010 translation port '53'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this example, &lt;code&gt;bond1.90&lt;/code&gt; is my internal home network and &lt;code&gt;10.254.95.4&lt;/code&gt; is my Blocky DNS server.&lt;/p&gt;
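&lt;p&gt;To confirm the redirect is working, you can send a query to a resolver that is &lt;em&gt;not&lt;/em&gt; in the allowed group from a client on the internal network. This is just one way to check it, assuming &lt;code&gt;dig&lt;/code&gt; is available on the client; the answer should be served by your filtering DNS server even though the query was addressed to 8.8.8.8:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Query a public resolver that is NOT in the dns-servers group.
# The destination NAT rule silently redirects this to Blocky (10.254.95.4).
dig @8.8.8.8 example.com +short
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If your filter has a query log (Blocky and Pi-hole both do), the query should also show up there, which confirms the interception end to end.&lt;/p&gt;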

</description>
      <category>networking</category>
      <category>homelab</category>
      <category>devops</category>
      <category>dns</category>
    </item>
    <item>
      <title>VyOS - WireGuard based Road Warrior VPN Configuration</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Sat, 16 Sep 2023 22:32:18 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/vyos-wireguard-based-road-warrior-vpn-configuration-3bl</link>
      <guid>https://dev.to/mhamzahkhan/vyos-wireguard-based-road-warrior-vpn-configuration-3bl</guid>
      <description>&lt;p&gt;In our modern, hyper-connected world, where remote work and global access are increasingly vital, the need for secure connectivity to your home or office network has evolved from a luxury to an essential requirement.&lt;/p&gt;

&lt;p&gt;Whether you’re a professional in need of remote access to an office network or a passionate home lab enthusiast managing various services, a road-warrior style VPN is your key to top-tier, secure and hassle-free remote server access from anywhere in the world.&lt;/p&gt;

&lt;p&gt;Regardless of if you are managing a personal web server, delving into home automation experiments, or overseeing your own cloud services, this guide serves as your trusty roadmap, expanding on the principles covered in our previous post about &lt;a href="https://www.hamzahkhan.com/vyos-ospf-wireguard" rel="noopener noreferrer"&gt;establishing a site-to-site VPN with WireGuard and VyOS&lt;/a&gt;. We now shift our focus to the individual user’s perspective, bridging the geographical gap between your current location and the heart of your network from anywhere in the world. Together, we’ll navigate the process of configuring VyOS to function as a WireGuard VPN server, enabling you to access your digital realm with unwavering security and unrivaled ease.&lt;/p&gt;

&lt;p&gt;Let’s dive in and get started!&lt;/p&gt;

&lt;h2&gt;
  
  
  Configure the WireGuard Server on VyOS
&lt;/h2&gt;

&lt;p&gt;VyOS’ command line interface simplifies the configuration of a WireGuard server and makes client configuration a breeze as well.&lt;/p&gt;

&lt;p&gt;All of the configuration for WireGuard on VyOS is done with the WireGuard interface configuration commands, which are prefixed with &lt;code&gt;interfaces wireguard $INTERFACE_NAME&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setup Variables
&lt;/h3&gt;

&lt;p&gt;I refer to these variables throughout this guide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;SERVER_PUBLIC_IP&lt;/code&gt; - The server’s public IP address&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SERVER_PRIVATE_KEY&lt;/code&gt; - The server’s private key, generated by the &lt;code&gt;generate pki wireguard key-pair&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;SERVER_PUBLIC_KEY&lt;/code&gt; - The server’s public key, generated by the &lt;code&gt;generate pki wireguard key-pair&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLIENT_PRIVATE_KEY&lt;/code&gt; - The client’s private key, generated by the &lt;code&gt;generate wireguard client-config&lt;/code&gt; command&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;CLIENT_PUBLIC_KEY&lt;/code&gt; - The client’s public key, generated by the &lt;code&gt;generate wireguard client-config&lt;/code&gt; command&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Generate Server Keypair
&lt;/h3&gt;

&lt;p&gt;Generate a keypair for the WireGuard server. Make note of these, as you will need these again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;mhamzahkhan@gw:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;generate pki wireguard key-pair
&lt;span class="gp"&gt;Private key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as $&lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;SERVER_PRIVATE_KEY&lt;span class="o"&gt;}&lt;/span&gt; -&amp;gt;
&lt;span class="gp"&gt;Public key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as $&lt;/span&gt;&lt;span class="o"&gt;{&lt;/span&gt;SERVER_PUBLIC_KEY&lt;span class="o"&gt;}&lt;/span&gt; -&amp;gt;
&lt;span class="go"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configure WireGuard Interfaces
&lt;/h2&gt;

&lt;p&gt;Next we can configure the WireGuard interface.&lt;/p&gt;

&lt;p&gt;I am using the subnet 10.254.254.0/24 for my VPN, but you can use whatever you like.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight cisco_ios"&gt;&lt;code&gt;&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 address '&lt;span class="m"&gt;10.254.254.1/24&lt;/span&gt;'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 &lt;span class="k"&gt;description&lt;/span&gt;&lt;span class="c1"&gt; 'VPN'&lt;/span&gt;
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 ip adjust-mss '1380'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 mtu '1420'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 port '51920'
&lt;span class="k"&gt;mhamzahkhan@gw#&lt;/span&gt; set interfaces wireguard wg1 private-key '${SERVER_PRIVATE_KEY}'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, for each device that will connect to the VPN, we need to add a peer definition. VyOS makes this extremely easy, and even generates a QR code that can be scanned to configure the WireGuard client on a phone, for example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mhamzahkhan@gw:~$ generate wireguard client-config hamzah-phone interface wg1 server ${VYOS_SERVER_PUBLIC_ADDRESS} address 10.254.254.2/24

WireGuard client configuration for interface: wg1

To enable this configuration on a VyOS router you can use the following commands:

=== VyOS (server) configuration ===

set interfaces wireguard wg1 peer hamzah-phone allowed-ips '10.254.254.2/32'
set interfaces wireguard wg1 peer hamzah-phone public-key '${CLIENT_PUBLIC_KEY}'

=== RoadWarrior (client) configuration ===

[Interface]
PrivateKey = ${CLIENT_PRIVATE_KEY}
Address = 10.254.254.2/32
DNS = 1.1.1.1

[Peer]
PublicKey = ${SERVER_PUBLIC_KEY}
Endpoint = ${SERVER_PUBLIC_IP}:51920
AllowedIPs = 0.0.0.0/0, ::/0

█████████████████████████████████████████████████████████████
█████████████████████████████████████████████████████████████
████ ▄▄▄▄▄ █ ██▀▄█ ▄██▀▀ ▀██▀▀▄▀▄ ▀ ▄█▄▄▀▄█▀▀ ▀██ ▄▄▄▄▄ ████
████ █ █ █ ███▄█▀ ▄█▀▀ ███▀▀ ▀▄▄▄▄ ▀▀▀▀▀▄▀█ █ █ ████
████ █▄▄▄█ █▀█ ▄▀▄▄█▄█▀▄ ██▄ ▄▄▄ ▀▄█▀▀█ ▀▄▄ ▄ ███ █▄▄▄█ ████
████▄▄▄▄▄▄▄█▄▀▄▀▄█ ▀▄▀▄▀▄▀▄▀ █▄█ █▄▀ █▄█ █ █▄▀ █ █▄▄▄▄▄▄▄████
████▄ █▀ ▄▄▀▀▄▀▀ ▀▄ ▄ ▄ ▄ ▄ ▀▄ ▀▄█▄█▀▄█▄ █▀▀█▄█ ▄▄ ████
████▀▀██▄▄▄█▄▄▄█▀ █▄ █▀█ █ ▀█▀█▀▄▀▀ ▀ ██▀█▀▀▄▄▄ █▀ ▄▄█ █ ████
████▄ ▄▀▀▄▄▄▀ ██ ▄▄██▄ ▄█▀▄▄██▄█ ███▀█▀█▀█▄█▀▀██████▀ ████
████▀ ▄▀▀ ▄▀██▄▀▄███▀▀▄ ▀ ▀ ▀▀ ▀▄█▄▀▀▄██▀ ▀▀ ▀██ ▀▀▀▄▀▄ ████
████████▄▄▄▄██▄▄▄▄ ▄▄▄█▀ ▄█ ▄ █ ▀▀█▄ █ ▄ ▄██ ▄▀▀█▀ ▀▀█▄████
████ ▀▄ ▄▄█▄ ▀ ▄ ▄▄██▄ ▀▄▀█▄▄▄█▄ █▀█▄▄ ▄██▄▄ ▀▀█▄▄██▄████
████ ▀█▄▄█▄▀▄▄ █ █▄▀▀▀ ▀ ▀█▄█▀█▄▄█▄ ▄▀█▀ █▀▀▄█ ▀▄▀█ █▀█ ████
████▀▄█ ▀ ▄▄▀▀ █▄█ ▄ ██ ▀ ▄ ▀▄ █▄▄█ ▀ ▀▄▄▀█ ▄█ ▀▄█▀█▄ ████
████▀▄ ▄▄▄ ▀▀ █ ▀█ ▄ ▄▄ ▄▄▄ █▀▀▄▀▄ █▀ █▄ ▄▄▄ ▄▀ █████
█████▀██ █▄█ █ ▀ █▄ ▄ █▀▄▀▀█ █▄█ █▄██▀▀▄▀▀█▄▀ ▄ █▄█ █▄▀▄████
█████ █▀ ▄▄ ▄▄ ▄▄▄▄█▀ ▄ ▄▀▀▄▄ █▄ ██▄▀▀ ▄█ ▄ ▀▄▄ █▀█▄████
████▀▄ ▀█▄▄▀▄█▄▀ ▄ █▀▀▄▀█▀█▄▄█▀▀▀█▄ ▄ ██▀▀ ▄▀ ▄▀█▀▄██ █ █████
████▄█▄ ▄▄▄▀ ▀▄▀▀▀ █▄▄▄█▄ ▀▀▄██ ▀▀▄▀█ ▄ █▀ █▀ ▀▄▄█▀▄▄████
████▄▀▄▀ ▄█▀█ ▄▄█▀ ▀ ████ ██▄▀▀██▀█▀▀▀▀▄█ █ ▀ ▀▄▀▄▀█▀ ▄████
████▄▀ ▄█▄▀█▄▀▀▀▄█▄▀▀▀▄ ███ ▄█▄ ▄▀ ██ █ ▄█▄█▀ ▄▀▄▀▀▀▀█ ████
███████▄ ▄█ ▄█▄ ▀█ ▄ █▄█▀█ █▀▄▀ █▄▀█▀▄ ██▀ ▀██▄▀▄▀▄▄ ████
████▄█▀▀█ ▄ ▀▀▀ ▄ ▀▄ █▄▄▀ █▄▀ █ █▄ █▀▄█ █▀ █▄▄▄█ ▀█▄████
████▄ ▀▄▄▄▄▀████▄▀▀▄█ ██▄█ ▄▄▄ ▄▀▀ ▄▀ █▄▀██▀▄▄█▀ ▄█ ▄▄▀▄ ████
███████▄██▄▄▀ ▄▄ █▄█▀ ▀ ▀ ▄▄▄ █▀▄▀█▀▀ ▀▄▀▀█ ▄ ▄▄▄ ▄▀▀▀████
████ ▄▄▄▄▄ █▀▄ █ █▀▀▄▀▀ █▀ █▄█ ▀█▀▀▀▄▀▀ ▄ ▀█ █ █▄█ ▀▄ █████
████ █ █ █▄▀█▄▄▄▄ █▄▄▀▄▄▄█ ▄▀▀ ▄ █▄▄ ▀ █ ▄ ▄▄▄▄▀▀█████
████ █▄▄▄█ █▀ ▀▀▀ ▄█▀▄ ▄ ███ ██ ▄▄▀▄▄▄█▀ █▀▄▀██▄▀▀ ████
████▄▄▄▄▄▄▄█▄██▄▄██▄██████▄▄▄█████▄▄▄▄██▄▄██▄▄▄█▄█▄█▄██▄█████
█████████████████████████████████████████████████████████████
█████████████████████████████████████████████████████████████

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are configuring the client on a phone, scanning the QR code makes setup incredibly easy. Alternatively, the macOS client lets you simply copy and paste the client configuration shown above the QR code.&lt;/p&gt;
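&lt;p&gt;Since the &lt;code&gt;set&lt;/code&gt; commands above are entered in configuration mode, remember to commit and save them on the router before testing. You can then check that the interface is up with a standard op-mode command (output details vary slightly between VyOS versions):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@gw# commit
mhamzahkhan@gw# save
mhamzahkhan@gw# exit
mhamzahkhan@gw:~$ show interfaces wireguard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;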

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;As we conclude our journey through configuring VyOS as a WireGuard VPN server, you now possess a fully functional WireGuard VPN setup, empowering you to securely access your self-hosted digital resources from anywhere on the planet.&lt;/p&gt;

&lt;p&gt;In our ever-evolving, interconnected world, the demand for secure, remote network access remains as vital as ever. By utilising WireGuard and VyOS, you have armed yourself with the ability to stay seamlessly connected to your internal services and servers, whether you’re managing a personal web server, experimenting with home automation, or trying to access secure files on your office network.&lt;/p&gt;

&lt;p&gt;In my next post, I will be discussing how I use WireGuard to allow me to host services in my home lab, despite being behind CGNAT.&lt;/p&gt;

</description>
      <category>networking</category>
      <category>vpn</category>
      <category>homelab</category>
      <category>devops</category>
    </item>
    <item>
      <title>VyOS - Site-to-Site VPN using Wireguard and OSPF</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Thu, 07 Sep 2023 22:32:18 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/vyos-site-to-site-vpn-using-wireguard-and-ospf-2eco</link>
      <guid>https://dev.to/mhamzahkhan/vyos-site-to-site-vpn-using-wireguard-and-ospf-2eco</guid>
      <description>&lt;p&gt;Connecting two sites securely and efficiently is essential for many businesses and individuals.&lt;/p&gt;

&lt;p&gt;In this post, we’ll explore how to achieve seamless connectivity between two locations using the powerful combination of WireGuard, a modern and high-performance VPN protocol, and VyOS, a robust and versatile network operating system.&lt;/p&gt;

&lt;p&gt;Whether you’re looking to enhance communication between remote offices, create a secure link between your data center and a cloud-based infrastructure, or simply want to connect two geographically separated sites, this guide will walk you through the process, ensuring a reliable and secure connection every step of the way.&lt;/p&gt;

&lt;p&gt;To illustrate this process, I will use my own use case as an example. I manage equipment hosted in a colocation data center, which I affectionately refer to as my ‘colo-lab’, and I also maintain a ‘home-lab’.&lt;/p&gt;

&lt;p&gt;Previously, I relied on GRE over IPsec for connectivity between the two sites, but I’ve recently migrated these over to WireGuard.&lt;/p&gt;

&lt;p&gt;WireGuard boasts a slew of compelling advantages over traditional IPsec, including speed, security, and a refreshingly straightforward setup. Its minimalist design significantly simplifies the configuration process, especially when compared to the complexity of GRE over IPsec.&lt;/p&gt;

&lt;p&gt;Throughout this post, I’ll walk you through the precise steps I took to configure two VyOS routers to seamlessly integrate with WireGuard while enabling efficient route distribution through OSPF. By the end, you’ll be equipped with the knowledge to configure your own WireGuard based site-to-site VPN.&lt;/p&gt;

&lt;h2&gt;
  
  
  Topology
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Colo Lab
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;WireGuard Interface IP: 10.254.2.0/31&lt;/li&gt;
&lt;li&gt;Internal Networks:

&lt;ul&gt;
&lt;li&gt;10.254.112.0/24&lt;/li&gt;
&lt;li&gt;10.254.113.0/24&lt;/li&gt;
&lt;li&gt;10.254.114.0/24&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Internal Network Aggregate: 10.254.112.0/21&lt;/li&gt;

&lt;li&gt;Public IP: Referred to as &lt;code&gt;${COLO_LAB_PUBLIC_IP}&lt;/code&gt;
&lt;/li&gt;

&lt;/ul&gt;

&lt;h3&gt;
  
  
  Home Lab
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;WireGuard Interface IP: 10.254.2.1/31&lt;/li&gt;
&lt;li&gt;Internal Networks:

&lt;ul&gt;
&lt;li&gt;10.254.88.0/24&lt;/li&gt;
&lt;li&gt;10.254.89.0/24&lt;/li&gt;
&lt;li&gt;10.254.90.0/24&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Internal Network Aggregate: 10.254.88.0/21&lt;/li&gt;

&lt;li&gt;Public IP: None (It’s behind CGNAT)&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Generate Keypairs
&lt;/h2&gt;

&lt;p&gt;First things first, let’s generate keypairs for both routers. Make note of these, and keep them safe.&lt;/p&gt;

&lt;p&gt;First the cololab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;generate pki wireguard key-pair
Private key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;COLOLAB_PRIVATE_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;
Public key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;COLOLAB_PUBLIC_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the homelab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;generate pki wireguard key-pair
Private key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOMELAB_PRIVATE_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;
Public key: &amp;lt;- OMITTED - USE YOUR OWN ONE - I will refer to this as &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;HOMELAB_PUBLIC_KEY&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configure WireGuard Interfaces
&lt;/h2&gt;

&lt;p&gt;Next, let’s set up the WireGuard interfaces.&lt;/p&gt;

&lt;p&gt;For these interfaces, I’ve chosen a private /31 range, which gives us precisely two IP addresses, perfect for a point-to-point link. In my example, we’ll use 10.254.2.0/31 and 10.254.2.1/31.&lt;/p&gt;

&lt;h3&gt;
  
  
  Colo Lab Router WireGuard Configuration
&lt;/h3&gt;

&lt;p&gt;Please note that because my home lab’s internet connection is behind CGNAT, I haven’t specified the peer address on the Colo Lab router. This means that the connection will be initiated from the home-lab side. If you have a static IP address (or dynamic IP address that doesn’t change much), it would be a good idea to specify the peer address so the connection can be initiated from either side.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;configure
&lt;span class="o"&gt;[&lt;/span&gt;edit]
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 address &lt;span class="s1"&gt;'10.254.2.0/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 description &lt;span class="s1"&gt;'Connection to Home-Lab'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 ip adjust-mss &lt;span class="s1"&gt;'1380'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 mtu &lt;span class="s1"&gt;'1420'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer home-lab allowed-ips &lt;span class="s1"&gt;'0.0.0.0/0'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer home-lab persistent-keepalive &lt;span class="s1"&gt;'10'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer home-lab public-key &lt;span class="s1"&gt;'${HOMELAB_PUBLIC_KEY}'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 port &lt;span class="s1"&gt;'51820'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 private-key &lt;span class="s1"&gt;'${COLOLAB_PRIVATE_KEY}'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Home Lab Router WireGuard Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;configure
&lt;span class="o"&gt;[&lt;/span&gt;edit]
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 address &lt;span class="s1"&gt;'10.254.2.1/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 description &lt;span class="s1"&gt;'Connection to Colo-Lab'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 ip adjust-mss &lt;span class="s1"&gt;'1380'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 mtu &lt;span class="s1"&gt;'1420'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab address &lt;span class="s1"&gt;'${COLO_LAB_PUBLIC_IP}'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab allowed-ips &lt;span class="s1"&gt;'0.0.0.0/0'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab persistent-keepalive &lt;span class="s1"&gt;'10'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab port &lt;span class="s1"&gt;'51820'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 peer colo-lab public-key &lt;span class="s1"&gt;'${COLOLAB_PUBLIC_KEY}'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 port &lt;span class="s1"&gt;'51820'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;interfaces wireguard wg0 private-key &lt;span class="s1"&gt;'${HOMELAB_PRIVATE_KEY}'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
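&lt;p&gt;Once the interfaces are defined on both routers, commit and save the configuration on each side (the standard VyOS configuration-mode workflow), for example on the home lab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw# commit
mhamzahkhan@homelab-gw# save
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;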



&lt;h2&gt;
  
  
  Test the WireGuard Connection
&lt;/h2&gt;

&lt;p&gt;At this point, both routers should be able to ping each other via the VPN link:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;ping 10.254.2.1 count 4
PING 10.254.2.1 &lt;span class="o"&gt;(&lt;/span&gt;10.254.2.1&lt;span class="o"&gt;)&lt;/span&gt; 56&lt;span class="o"&gt;(&lt;/span&gt;84&lt;span class="o"&gt;)&lt;/span&gt; bytes of data.
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.339 ms
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.382 ms
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.344 ms
64 bytes from 10.254.2.1: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.347 ms

&lt;span class="nt"&gt;---&lt;/span&gt; 10.254.2.1 ping statistics &lt;span class="nt"&gt;---&lt;/span&gt;
4 packets transmitted, 4 received, 0% packet loss, &lt;span class="nb"&gt;time &lt;/span&gt;3106ms
rtt min/avg/max/mdev &lt;span class="o"&gt;=&lt;/span&gt; 0.339/0.353/0.382/0.017 ms

mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;ping 10.254.2.0 count 4
PING 10.254.2.0 &lt;span class="o"&gt;(&lt;/span&gt;10.254.2.0&lt;span class="o"&gt;)&lt;/span&gt; 56&lt;span class="o"&gt;(&lt;/span&gt;84&lt;span class="o"&gt;)&lt;/span&gt; bytes of data.
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.290 ms
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;2 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.227 ms
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;3 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.404 ms
64 bytes from 10.254.2.0: &lt;span class="nv"&gt;icmp_seq&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4 &lt;span class="nv"&gt;ttl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;64 &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;0.380 ms

&lt;span class="nt"&gt;---&lt;/span&gt; 10.254.2.0 ping statistics &lt;span class="nt"&gt;---&lt;/span&gt;
4 packets transmitted, 4 received, 0% packet loss, &lt;span class="nb"&gt;time &lt;/span&gt;3078ms
rtt min/avg/max/mdev &lt;span class="o"&gt;=&lt;/span&gt; 0.227/0.325/0.404/0.070 ms

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To gauge the bandwidth between our networks, we can use iPerf3.&lt;/p&gt;

&lt;p&gt;First, start iPerf3 in server mode on one side of the VPN. I’m running it on the colo-lab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;iperf3 &lt;span class="nt"&gt;-s&lt;/span&gt;
&lt;span class="nt"&gt;-----------------------------------------------------------&lt;/span&gt;
Server listening on 5201 &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;test&lt;/span&gt; &lt;span class="c"&gt;#1)&lt;/span&gt;
&lt;span class="nt"&gt;-----------------------------------------------------------&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, start iPerf3 on the home lab router. Let’s start with an upload bandwidth test from the home-lab router to the colo-lab router:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;iperf3 &lt;span class="nt"&gt;-c&lt;/span&gt; 10.254.2.0
Connecting to host 10.254.2.0, port 5201
&lt;span class="o"&gt;[&lt;/span&gt;5] &lt;span class="nb"&gt;local &lt;/span&gt;10.254.2.1 port 33008 connected to 10.254.2.0 port 5201
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate Retr Cwnd
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-1.00 sec 20.8 MBytes 174 Mbits/sec 99 207 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 1.00-2.00 sec 20.7 MBytes 174 Mbits/sec 0 269 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 2.00-3.00 sec 19.8 MBytes 166 Mbits/sec 131 194 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 3.00-4.00 sec 22.1 MBytes 185 Mbits/sec 0 263 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 4.00-5.00 sec 17.3 MBytes 145 Mbits/sec 195 18.7 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 5.00-6.00 sec 16.4 MBytes 137 Mbits/sec 63 224 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 6.00-7.00 sec 19.9 MBytes 167 Mbits/sec 95 168 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 7.00-8.00 sec 11.3 MBytes 95.2 Mbits/sec 123 123 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 8.00-9.00 sec 18.9 MBytes 158 Mbits/sec 0 202 KBytes
&lt;span class="o"&gt;[&lt;/span&gt;5] 9.00-10.00 sec 20.2 MBytes 169 Mbits/sec 35 207 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate Retr
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.00 sec 187 MBytes 157 Mbits/sec 741 sender
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.01 sec 186 MBytes 156 Mbits/sec receiver

iperf Done.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I’m not sure why there are retransmissions. I still need to investigate that, but it’s maxing out my home connection upload.&lt;/p&gt;

&lt;p&gt;Now, let’s reverse the test, with the colo-lab router sending data to the home-lab router. Use the &lt;code&gt;-R&lt;/code&gt; flag for this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;iperf3 &lt;span class="nt"&gt;-c&lt;/span&gt; 10.254.2.0 &lt;span class="nt"&gt;-R&lt;/span&gt;
Connecting to host 10.254.2.0, port 5201
Reverse mode, remote host 10.254.2.0 is sending
&lt;span class="o"&gt;[&lt;/span&gt;5] &lt;span class="nb"&gt;local &lt;/span&gt;10.254.2.1 port 52016 connected to 10.254.2.0 port 5201
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-1.00 sec 14.8 MBytes 124 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 1.00-2.00 sec 17.4 MBytes 145 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 2.00-3.00 sec 17.6 MBytes 148 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 3.00-4.00 sec 15.5 MBytes 130 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 4.00-5.00 sec 16.3 MBytes 137 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 5.00-6.00 sec 12.2 MBytes 102 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 6.00-7.00 sec 9.33 MBytes 78.3 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 7.00-8.00 sec 7.86 MBytes 65.9 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 8.00-9.00 sec 14.7 MBytes 124 Mbits/sec
&lt;span class="o"&gt;[&lt;/span&gt;5] 9.00-10.00 sec 15.3 MBytes 128 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
&lt;span class="o"&gt;[&lt;/span&gt;ID] Interval Transfer Bitrate Retr
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.01 sec 142 MBytes 119 Mbits/sec 282 sender
&lt;span class="o"&gt;[&lt;/span&gt;5] 0.00-10.00 sec 141 MBytes 118 Mbits/sec receiver

iperf Done.

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Some tuning may be needed, but for now, these numbers should suffice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configure OSPF
&lt;/h2&gt;

&lt;p&gt;Now, let’s dive into the OSPF configuration. Note that I use OSPF route summarization: the individual /24 subnets on each side are advertised into the backbone as a single summary route, keeping the routing tables small.&lt;/p&gt;
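&lt;p&gt;The &lt;code&gt;range&lt;/code&gt; statements in the configs below are what perform the summarization. As a sketch of what a single /21 summary buys us, Python’s &lt;code&gt;ipaddress&lt;/code&gt; module can confirm that the colo-lab /24s all fall inside it:&lt;/p&gt;

```python
import ipaddress

# The colo-lab /24s advertised into area 0.0.0.1, and the /21 summary
# announced into the backbone by the "range" statement.
summary = ipaddress.ip_network("10.254.112.0/21")
subnets = [ipaddress.ip_network(s) for s in
           ("10.254.112.0/24", "10.254.113.0/24", "10.254.114.0/24")]

# One route covers all three subnets (and leaves room to grow to 10.254.119.0/24).
print(all(s.subnet_of(summary) for s in subnets))  # True
print(summary.num_addresses)                       # 2048
```

The home-lab side works the same way with 10.254.88.0/21 covering its /24s.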

&lt;h3&gt;
  
  
  Colo Lab Router OSPF Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.0 network &lt;span class="s1"&gt;'10.254.2.0/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.112.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.113.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.114.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 range 10.254.112.0/21
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf interface eth0 passive
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf log-adjacency-changes
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf parameters router-id &lt;span class="s1"&gt;'10.254.2.0'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Home Lab Router OSPF Configuration
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.0 network &lt;span class="s1"&gt;'10.254.2.0/31'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.88.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.89.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 network &lt;span class="s1"&gt;'10.254.90.0/24'&lt;/span&gt;
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf area 0.0.0.1 range 10.254.88.0/21
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf interface eth0 passive
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf log-adjacency-changes
&lt;span class="nb"&gt;set &lt;/span&gt;protocols ospf parameters router-id &lt;span class="s1"&gt;'10.254.2.1'&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the OSPF adjacency forms over the tunnel, each side’s summary route should appear in the other’s routing table, as if by magic!&lt;/p&gt;

&lt;h3&gt;
  
  
  Colo Lab Router Verification
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@cololab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;show ip route 10.254.88.0
Routing entry &lt;span class="k"&gt;for &lt;/span&gt;10.254.88.0/21
 Known via &lt;span class="s2"&gt;"ospf"&lt;/span&gt;, distance 110, metric 2, best
 Last update 11:58:19 ago
 &lt;span class="k"&gt;*&lt;/span&gt; 10.254.2.1, via wg0, weight 1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Home Lab Router Verification
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;mhamzahkhan@homelab-gw:~&lt;span class="nv"&gt;$ &lt;/span&gt;show ip route 10.254.88.0
Routing entry &lt;span class="k"&gt;for &lt;/span&gt;10.254.112.0/21
 Known via &lt;span class="s2"&gt;"ospf"&lt;/span&gt;, distance 110, metric 2, best
 Last update 12:00:02 ago
 &lt;span class="k"&gt;*&lt;/span&gt; 10.254.2.0, via wg0, weight 1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;With WireGuard providing the tunnel and OSPF exchanging routes over it, the two sites can now communicate seamlessly. This guide lays a solid foundation for a site-to-site VPN, and there’s more to build on in future configurations.&lt;/p&gt;

&lt;p&gt;In my next post, we will configure a VyOS-based WireGuard VPN for road-warrior style clients, enabling secure remote access to your network from virtually anywhere with an internet connection.&lt;/p&gt;

&lt;p&gt;Stay tuned for the next installment, where we continue building on WireGuard and VyOS to expand the network. Subscribe to stay connected!&lt;/p&gt;

</description>
      <category>networking</category>
      <category>devops</category>
      <category>homelab</category>
      <category>vpn</category>
    </item>
    <item>
      <title>Using FreeIPA CA as an ACME Provider for cert-manager</title>
      <dc:creator>M. Hamzah Khan</dc:creator>
      <pubDate>Wed, 27 Jul 2022 22:32:18 +0000</pubDate>
      <link>https://dev.to/mhamzahkhan/using-freeipa-ca-as-an-acme-provider-for-cert-manager-13d</link>
      <guid>https://dev.to/mhamzahkhan/using-freeipa-ca-as-an-acme-provider-for-cert-manager-13d</guid>
      <description>&lt;p&gt;I’m using &lt;a href="https://www.freeipa.org/" rel="noopener noreferrer"&gt;FreeIPA&lt;/a&gt; for authentication services in my home lab. It’s extreme overkill for my situation, as I don’t have many users (mainly just me!) but alas I like overkill. :)&lt;/p&gt;

&lt;p&gt;I am using FreeIPA’s DNS service to host some DNS subdomains for internal services. I have configured these subdomains through DNS delegations, but since my IPA servers are not accessible from the internet, this breaks both the HTTP-01 and DNS-01 verification challenges from &lt;a href="https://letsencrypt.org/" rel="noopener noreferrer"&gt;LetsEncrypt&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Yesterday evening, I was playing around with &lt;a href="https://www.truenas.com/truecommand/" rel="noopener noreferrer"&gt;TrueCommand&lt;/a&gt; and hosted it on one of my internal IPA domains. Since I cannot use LetsEncrypt to issue a certificate for it, I decided to use the CA built into FreeIPA, which also supports ACME.&lt;/p&gt;

&lt;p&gt;As all the machines that will need to use the service are already enrolled in IPA, the IPA CA certificate is also installed on those nodes, meaning any certificates issued by FreeIPA are automatically trusted.&lt;/p&gt;

&lt;p&gt;To get this to work, I had to first enable ACME support from within FreeIPA:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[root@ipa-server ~]# ipa-acme-manage enable

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FreeIPA’s ACME service supports both HTTP-01 and DNS-01 challenges, but I generally prefer DNS-01. For cert-manager to add the _acme-challenge DNS record to FreeIPA, we can use cert-manager’s RFC-2136 provider.&lt;/p&gt;
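&lt;p&gt;For context on what cert-manager will publish: per RFC 8555, a DNS-01 challenge is satisfied by a TXT record at &lt;code&gt;_acme-challenge.&amp;lt;name&amp;gt;&lt;/code&gt; containing the base64url-encoded SHA-256 digest of the key authorization. A minimal sketch, using made-up token and account-thumbprint values:&lt;/p&gt;

```python
import base64
import hashlib

# Hypothetical values for illustration only; the real ones come from the
# ACME server (token) and the ACME account key (thumbprint).
token = "evaGxfADs6pSRb2LAv9IZf17Dt3juxGJ-PCt92wr-oA"
thumbprint = "9jg46WB3rR_AHD-EBXdN7cBkH1WOu0tA3M9fm21mqTI"

key_authorization = f"{token}.{thumbprint}"
digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
# base64url without padding, as the challenge value in the TXT record.
txt_value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

print("_acme-challenge.truecommand.k8s.intahnet.co.uk. 300 IN TXT", txt_value)
```

cert-manager creates and cleans up this record automatically through the RFC-2136 update; the snippet only shows what briefly lands in the zone.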

&lt;p&gt;To do this, we must create a new TSIG key on our IPA server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[root@ipa-server ~]# tsig-keygen -a hmac-sha512 acme-update &amp;gt;&amp;gt; /etc/named/ipa-ext.conf
[root@ipa-server ~]# systemctl restart named-pkcs11.service

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Enable dynamic updates for the IPA DNS subdomain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[root@ipa-server ~]# ipa dnszone-mod k8s.intahnet.co.uk --dynamic-update=True --update-policy='grant acme-update wildcard * ANY;'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, I had to modify my cert-manager installation slightly to include my own CA certificate bundle, which includes my IPA CA cert. To do this I had to first create the bundle, and then create a Kubernetes ConfigMap for it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[mhamzahkhan@laptop ~]# cat /etc/ipa/ca.crt &amp;gt; ca-certificates.crt
[mhamzahkhan@laptop ~]# kubectl -n cert-manager create configmap ca-bundle --from-file ca-certificates.crt

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Info:&lt;/strong&gt; If the machine you are using is enrolled in the IPA domain, you could also just use /etc/pki/tls/certs/ca-bundle.crt, which is actually what I did, since it contains all the other CA certificates that cert-manager may need (for example the ISRG Root X1 CA certificate, which cert-manager needs to properly access the LetsEncrypt ACME servers).&lt;/p&gt;

&lt;p&gt;Next, I had to modify the cert-manager deployment to make use of the ca-bundle. As I am using the cert-manager helm chart, this was quite easy. I added the following to my cert-manager helm values file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
volumes:
 - name: ca-bundle
 configMap:
 name: ca-bundle

volumeMounts:
 - name: ca-bundle
 mountPath: /etc/ssl/certs/ca-certificates.crt
 subPath: ca-certificates.crt
 readOnly: false

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once this has been deployed, we need to create a secret in Kubernetes for the TSIG key. Grab the TSIG key we generated earlier from your IPA server (/etc/named/ipa-ext.conf), and create a Kubernetes secret with it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[mhamzahkhan@laptop ~]# kubectl -n cert-manager create secret generic ipa-tsig-secret --from-literal=tsig-secret-key="XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, add a new ClusterIssuer for IPA’s ACME service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
 name: ipa
 namespace: cert-manager
spec:
 acme:
 email: admin@ipa.intahnet.co.uk
 server: https://ipa-ca.ipa.intahnet.co.uk/acme/directory
 privateKeySecretRef:
 name: ipa-issuer-account-key
 solvers:
 - dns01:
 rfc2136:
 nameserver: 10.0.0.22
 tsigKeyName: acme-update
 tsigAlgorithm: HMACSHA512
 tsigSecretSecretRef:
 name: ipa-tsig-secret
 key: tsig-secret-key
 selector:
 dnsZones:
 - 'k8s.intahnet.co.uk'

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you should be set to request certificates!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
 name: truecommand-certificate
 namespace: default
spec:
 commonName: 'truecommand.k8s.intahnet.co.uk'
 dnsNames:
 - truecommand.k8s.intahnet.co.uk
 issuerRef:
 name: ipa
 kind: ClusterIssuer
 privateKey:
 algorithm: RSA
 encoding: PKCS1
 size: 4096
 secretName: truecommand-tls

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;All working:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[mhamzahkhan@laptop ~]# kubectl get certificate
NAME READY SECRET AGE
truecommand-certificate True truecommand-tls 23s

[mhamzahkhan@laptop ~]# kubectl get secrets
NAME TYPE DATA AGE
truecommand-certificate-q8qkh kubernetes.io/tls 2 29s

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It’s a very similar process to use ExternalDNS with FreeIPA as ExternalDNS also supports RFC2136. I have not set this up yet, but the process is described in this excellent blog post: &lt;a href="https://astrid.tech/2021/04/18/0/k8s-freeipa-dns/" rel="noopener noreferrer"&gt;How to set up Dynamic DNS on FreeIPA for your Kubernetes Cluster&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>acme</category>
      <category>freeipa</category>
    </item>
  </channel>
</rss>
