<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ivan Porta</title>
    <description>The latest articles on DEV Community by Ivan Porta (@gtrekter).</description>
    <link>https://dev.to/gtrekter</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F282182%2F851e471f-e7fa-4c79-801c-7824c291696d.jpg</url>
      <title>DEV Community: Ivan Porta</title>
      <link>https://dev.to/gtrekter</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gtrekter"/>
    <language>en</language>
    <item>
      <title>Deep Dive Into Linkerd Automated Sidecar Injection Workflow</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Sat, 21 Jun 2025 23:28:24 +0000</pubDate>
      <link>https://dev.to/gtrekter/deep-dive-into-linkerd-automated-sidecar-injection-workflow-4f11</link>
      <guid>https://dev.to/gtrekter/deep-dive-into-linkerd-automated-sidecar-injection-workflow-4f11</guid>
      <description>&lt;p&gt;The Linkerd Proxy-Injector uses a mutating webhook to intercept requests to the Kubernetes API whenever a new Pod is created. If the namespace or Pod is annotated with &lt;code&gt;linkerd.io/inject: enabled&lt;/code&gt;, the webhook automatically injects the Linkerd proxy and ProxyInit containers into the Pod spec. In this article, we will take a guided dive into its source code by walking through a sample application.&lt;/p&gt;

&lt;h1&gt;
  Prerequisites
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;macOS/Linux/Windows with a Unix‑style shell&lt;/li&gt;
&lt;li&gt;k3d (v5+) for local Kubernetes clusters&lt;/li&gt;
&lt;li&gt;kubectl (v1.25+)&lt;/li&gt;
&lt;li&gt;Helm (v3+)&lt;/li&gt;
&lt;li&gt;Smallstep (step) CLI for certificate generation&lt;/li&gt;
&lt;/ul&gt;

&lt;h1&gt;
  Setup
&lt;/h1&gt;

&lt;p&gt;First, we need to spin up a new cluster with k3d. Create the following configuration file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;lt;&amp;lt; 'EOF' &amp;gt; cluster.yaml
apiVersion: k3d.io/v1alpha5
kind: Simple
metadata:
  name: "cluster"
servers: 1
agents: 0
image: rancher/k3s:v1.33.0-k3s1
network: playground
options:
  k3s:
    extraArgs:
      - arg: --disable=traefik
        nodeFilters: ["server:*"]
      - arg: --cluster-cidr=10.23.0.0/16
        nodeFilters: ["server:*"]
      - arg: --service-cidr=10.247.0.0/16
        nodeFilters: ["server:*"]
      - arg: --debug
        nodeFilters: ["server:*"]
ports:
  - port: 8081:80
    nodeFilters: ["loadbalancer"]
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will then use this file to create the cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;k3d cluster create --kubeconfig-update-default -c ./cluster.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that the cluster is running, we can install Linkerd. Linkerd requires a root trust anchor and an intermediate issuer certificate for mTLS identity, which we will generate with the step CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;step certificate create root.linkerd.cluster.local ./certificates/ca.crt ./certificates/ca.key \
    --profile root-ca \
    --no-password \
    --insecure
step certificate create identity.linkerd.cluster.local ./certificates/issuer.crt ./certificates/issuer.key \
    --profile intermediate-ca \
    --not-after 8760h \
    --no-password \
    --insecure \
    --ca ./certificates/ca.crt \
    --ca-key ./certificates/ca.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, install Linkerd with Helm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add linkerd-edge https://helm.linkerd.io/edge
helm repo update
helm install linkerd-crds linkerd-edge/linkerd-crds \
  -n linkerd \
  --create-namespace \
  --set installGatewayAPI=true
helm upgrade --install linkerd-control-plane \
  -n linkerd \
  --set-file identityTrustAnchorsPEM=./certificates/ca.crt \
  --set-file identity.issuer.tls.crtPEM=./certificates/issuer.crt \
  --set-file identity.issuer.tls.keyPEM=./certificates/issuer.key \
  --set controllerLogLevel=debug \
  --set policyController.logLevel=debug \
  linkerd-edge/linkerd-control-plane
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  The Injection Process
&lt;/h1&gt;

&lt;p&gt;First, let’s deploy the following sample application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Namespace
metadata:
  name: simple-app
  annotations:
    linkerd.io/inject: enabled
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-app-v1
  namespace: simple-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: server
      version: v1
  template:
    metadata:
      labels:
        app: server
        version: v1
    spec:
      containers:
        - name: http-app
          image: kong/httpbin:latest
          ports:
            - containerPort: 80
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because the &lt;strong&gt;simple-app&lt;/strong&gt; namespace is annotated with &lt;code&gt;linkerd.io/inject: enabled&lt;/code&gt;, Linkerd’s proxy-injector webhook automatically injects a sidecar into the &lt;strong&gt;simple-app-v1&lt;/strong&gt; Deployment’s Pods. At this point, Kubernetes sends a &lt;strong&gt;CREATE&lt;/strong&gt; request for the Deployment’s ReplicaSet and then for each Pod.&lt;/p&gt;

&lt;h1&gt;
  Interactions with the Kubernetes API
&lt;/h1&gt;

&lt;p&gt;Before creating the actual Pod, Kubernetes processes the resource and invokes any matching mutating webhooks in alphabetical order. In Linkerd’s case, this is the &lt;strong&gt;linkerd-proxy-injector-webhook&lt;/strong&gt; (defined in a &lt;code&gt;MutatingWebhookConfiguration&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get mutatingwebhookconfiguration linkerd-proxy-injector-webhook-config -o yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: linkerd-proxy-injector-webhook-config
webhooks:
- admissionReviewVersions:
  - v1
  - v1beta1
  clientConfig:
    caBundle: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURVakNDQWpxZ0F3SUJBZ0lRSzVya2tEMHVmQVNyRTRPeEV0Q2JTekFOQmdrcWhraUc5dzBCQVFzRkFEQXQKTVNzd0tRWURWUVFERXlKc2FXNXJaWEprTFhCeWIzaDVMV2x1YW1WamRHOXlMbXhwYm10bGNtUXVjM1pqTUI0WApEVEkxTURZd05ERXhNVGcxTVZvWERUSTJNRFl3TkRFeE1UZzFNVm93TFRFck1Da0dBMVVFQXhNaWJHbHVhMlZ5ClpDMXdjbTk0ZVMxcGJtcGxZM1J2Y2k1c2FXNXJaWEprTG5OMll6Q0NBU0l3RFFZSktvWklodmNOQVFFQkJRQUQKZ2dFUEFEQ0NBUW9DZ2dFQkFNOGM3ZXNMNXhNakxFMzlXYUMwZVpJOThtTVhSK24zTUdvWHJJSXc0S3NCeUw1QwpuWHp3Um9ISTV1WnVMR1ZMY0N6L1h0YWozWWp3T0RhL2pLODVKRHZ4ajF2MTFMV3J2NWN5b1ladTBJRm8ybkVLCnpIY21TdVJZSjJwSHFFOHhZQXRmcnh0SktDdldWK3FZTTFLTTI2V1lVT2kzSU9DVGNoV0d4MS9vSENCclFiUnAKalRpSUEvY2d3QU55dXpqQUV3a1ZCRWl4UE92YnduVHl4YmhDZVFBTGZCV2JiM3Z6MGJwTUVKOUxpNkoxVms2egpWOW9ycFA2UW0yam1iNHJ3SElWVGRTN1dXOXU5YWY5SEFGdlozeFdldHhYRXkzRzNvSEl0REFiQ3YyemhaZDNWCkVlYmZHdGR3RDFTQmNqbnlHbTllc1IzSlMySU4vejRKWC9KWmoyVUNBd0VBQWFOdU1Hd3dEZ1lEVlIwUEFRSC8KQkFRREFnV2dNQjBHQTFVZEpRUVdNQlFHQ0NzR0FRVUZCd01CQmdnckJnRUZCUWNEQWpBTUJnTlZIUk1CQWY4RQpBakFBTUMwR0ExVWRFUVFtTUNTQ0lteHBibXRsY21RdGNISnZlSGt0YVc1cVpXTjBiM0l1YkdsdWEyVnlaQzV6CmRtTXdEUVlKS29aSWh2Y05BUUVMQlFBRGdnRUJBR1pxRlFFY0g0THBqK0l5K1dwVFY1VTFuOHFqRGNOMFcyS3AKMHg0T25RaHp3NkZNUm8rR2NBdUR0Nk5kNkROVlZHZjNEdFBtcXhBM21wTUxDTDFSbytnUm9FSWg5N3pxdlZjSQpNalBmeXpkNGhRQ09ocmhyblJFazh2OEN6Rm5YREtPYmkyaUx1THVTNlJtc3I0alpPV2FrdWRKTzlqaUREUmJVCnlvcHhpWTgycW81VmNoT1IvaGg4K1o3S1FKL29lT29BMlp0Zk9QbmZ0VGYvenBwekJPQmtXRUxvYlRRVHRUbUoKbVdNaS9URUQ5QlE4U0NMUU5TUk1SRXpuaElFTGhja0lPVzBqMkNkYmJWWXdkZ2wrTSt1aVYweHdlTk9pQ2RxZApyZm9Yamp6TTh6MWk1Y0FzMS9IcGcvaEt2czlpM2hKNHQwd0NrZ0JQVXcyZzQxN29MU2s9Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0=
    service:
      name: linkerd-proxy-injector
      namespace: linkerd
      path: /
      port: 443
  failurePolicy: Ignore
  matchPolicy: Equivalent
  name: linkerd-proxy-injector.linkerd.io
  namespaceSelector:
    matchExpressions:
    - key: config.linkerd.io/admission-webhooks
      operator: NotIn
      values:
      - disabled
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values:
      - kube-system
      - cert-manager
  objectSelector:
    matchExpressions:
    - key: linkerd.io/control-plane-component
      operator: DoesNotExist
    - key: linkerd.io/cni-resource
      operator: DoesNotExist
  reinvocationPolicy: Never
  rules:
  - apiGroups:
    - ""
    apiVersions:
    - v1
    operations:
    - CREATE
    resources:
    - pods
    - services
    scope: Namespaced
  sideEffects: None
  timeoutSeconds: 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Several selectors determine whether the webhook is triggered. For example, the webhook is skipped if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Pod is in a namespace with the label &lt;code&gt;config.linkerd.io/admission-webhooks=disabled&lt;/code&gt;, &lt;code&gt;kubernetes.io/metadata.name=kube-system&lt;/code&gt;, or &lt;code&gt;kubernetes.io/metadata.name=cert-manager&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The Pod has the label &lt;code&gt;linkerd.io/control-plane-component&lt;/code&gt; or &lt;code&gt;linkerd.io/cni-resource&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
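&lt;p&gt;The semantics of those &lt;code&gt;NotIn&lt;/code&gt; and &lt;code&gt;DoesNotExist&lt;/code&gt; match expressions can be sketched in a few lines of stdlib-only Go. This is a simplified stand-in for the real Kubernetes selector logic in &lt;code&gt;k8s.io/apimachinery&lt;/code&gt;, covering only the two operators used by this webhook; the struct and function names are mine, not Kubernetes APIs.&lt;br&gt;
&lt;/p&gt;

```go
package main

import "fmt"

// matchExpression mirrors the shape of the entries in the webhook's
// namespaceSelector and objectSelector (simplified illustration only).
type matchExpression struct {
	Key      string
	Operator string // "NotIn" or "DoesNotExist"
	Values   []string
}

// matches reports whether a set of labels satisfies every expression.
// Note that NotIn is satisfied when the key is absent entirely, which is
// why unlabeled namespaces still pass the selector and get injected.
func matches(labels map[string]string, exprs []matchExpression) bool {
	for _, e := range exprs {
		switch e.Operator {
		case "NotIn":
			if v, ok := labels[e.Key]; ok {
				for _, excluded := range e.Values {
					if v == excluded {
						return false // label value is in the excluded set
					}
				}
			}
		case "DoesNotExist":
			if _, ok := labels[e.Key]; ok {
				return false // key must be absent
			}
		}
	}
	return true
}

func main() {
	selector := []matchExpression{
		{Key: "config.linkerd.io/admission-webhooks", Operator: "NotIn", Values: []string{"disabled"}},
		{Key: "kubernetes.io/metadata.name", Operator: "NotIn", Values: []string{"kube-system", "cert-manager"}},
	}
	fmt.Println(matches(map[string]string{"kubernetes.io/metadata.name": "simple-app"}, selector))  // webhook runs
	fmt.Println(matches(map[string]string{"kubernetes.io/metadata.name": "kube-system"}, selector)) // webhook skipped
}
```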

&lt;p&gt;As you can see in the &lt;code&gt;webhooks.clientConfig.service&lt;/code&gt; block, the actual webhook logic resides in the &lt;strong&gt;linkerd-proxy-injector&lt;/strong&gt; Service in the &lt;strong&gt;linkerd&lt;/strong&gt; namespace. Next, let’s take a look at the &lt;strong&gt;linkerd-proxy-injector&lt;/strong&gt; Pod, which contains the actual injection container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n linkerd linkerd-proxy-injector-******** -o yaml
apiVersion: v1
kind: Pod
  ...
  name: linkerd-proxy-injector-********
  namespace: linkerd
spec:
  containers:
  ...
  - args:
    - proxy-injector
    - -log-level=debug
    - -log-format=plain
    - -linkerd-namespace=linkerd
    - -enable-pprof=false
    image: ghcr.io/buoyantio/controller:enterprise-2.18.0
    ...
    name: proxy-injector
    ports:
    - containerPort: 8443
      name: proxy-injector
      protocol: TCP
    - containerPort: 9995
      name: admin-http
      protocol: TCP
    volumeMounts:
    - mountPath: /var/run/linkerd/config
      name: config
    - mountPath: /var/run/linkerd/identity/trust-roots
      name: trust-roots
    - mountPath: /var/run/linkerd/tls
      name: tls
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access
      readOnly: true
  volumes:
  - configMap:
      defaultMode: 420
      name: linkerd-config
    name: config
  - configMap:
      defaultMode: 420
      name: linkerd-identity-trust-roots
    name: trust-roots
  - name: tls
    secret:
      defaultMode: 420
      secretName: linkerd-proxy-injector-k8s-tls
  - name: kube-api-access
    ...
  - name: linkerd-identity-token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: identity.l5d.io
          expirationSeconds: 86400
          path: linkerd-identity-token
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Some important details worth mentioning are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Pod mounts a volume backed by the &lt;code&gt;linkerd-config&lt;/code&gt; ConfigMap, which holds the chart‑rendered &lt;code&gt;values.yaml&lt;/code&gt;, including defaults for the proxy image, opaque ports, resource limits, and more.&lt;/li&gt;
&lt;li&gt;It also mounts a volume with the &lt;code&gt;linkerd-identity-trust-roots&lt;/code&gt; ConfigMap, which contains the trust‑anchor certificate.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This information is read at the beginning of execution and will be used later during injection to configure proxy arguments and environment variables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;valuesConfig, err := config.Values(pkgK8s.MountPathValuesConfig)
if err != nil {
  return nil, err
}
caPEM, err := os.ReadFile(pkgK8s.MountPathTrustRootsPEM)
if err != nil {
  return nil, err
}
valuesConfig.IdentityTrustAnchorsPEM = string(caPEM)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, the webhook fetches the namespace object to read any namespace‑level Linkerd annotations. This is important because these values will be propagated to the proxy itself if no annotations in the Deployment override them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ns, err := api.Get(k8s.NS, request.Namespace)
if err != nil {
  return nil, err
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then it constructs a &lt;code&gt;ResourceConfig&lt;/code&gt; struct that carries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The chart values merged with any overrides from Pod annotations.&lt;/li&gt;
&lt;li&gt;All namespace‑level annotations, so that any Linkerd annotation set at the namespace is automatically inherited by each Pod.&lt;/li&gt;
&lt;li&gt;The resource kind, so the code knows where to look for a Pod template if the resource is a Deployment.&lt;/li&gt;
&lt;li&gt;An &lt;code&gt;OwnerRetriever&lt;/code&gt; function that can look up the parent resource (e.g., Deployment → ReplicaSet → Pod) so that injected events can be attached to the highest‑level owner.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resourceConfig := inject.NewResourceConfig(valuesConfig, inject.OriginWebhook, linkerdNamespace).
  WithOwnerRetriever(ownerRetriever(ctx, api, request.Namespace)).
  WithNsAnnotations(ns.GetAnnotations()).
  WithKind(request.Kind.Kind)
...
func NewResourceConfig(values *l5dcharts.Values, origin Origin, ns string) *ResourceConfig {
 config := &amp;amp;ResourceConfig{
  namespace:     ns,
  nsAnnotations: make(map[string]string),
  values:        values,
  origin:        origin,
 }
 config.workload.Meta = &amp;amp;metav1.ObjectMeta{}
 config.pod.meta = &amp;amp;metav1.ObjectMeta{}
 config.pod.labels = map[string]string{k8s.ControllerNSLabel: ns}
 config.pod.annotations = map[string]string{}
 return config
}
func (conf *ResourceConfig) WithOwnerRetriever(f OwnerRetrieverFunc) *ResourceConfig {
 conf.ownerRetriever = f
 return conf
}
func (conf *ResourceConfig) WithNsAnnotations(m map[string]string) *ResourceConfig {
 conf.nsAnnotations = m
 return conf
}
func (conf *ResourceConfig) WithKind(kind string) *ResourceConfig {
 conf.workload.metaType = metav1.TypeMeta{Kind: kind}
 return conf
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, the webhook deserializes the raw JSON bytes from the admission request into typed Kubernetes objects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;report, err := resourceConfig.ParseMetaAndYAML(request.Object.Raw)
if err != nil {
  return nil, err
}
log.Infof("received %s", report.ResName())
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You will see logs like the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;time="2025-06-04T11:19:18Z" level=info msg="received service/simple-app-v1"
time="2025-06-04T11:19:18Z" level=info msg="received admission review request \"919c6889-a59c-4168-be0d-6d448460af98\""
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the resource has a parent, the code looks it up so that any Injected or Skipped Kubernetes Event is attached to the parent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;var parent *metav1.PartialObjectMetadata
var ownerKind string
if ownerRef := resourceConfig.GetOwnerRef(); ownerRef != nil {
  res, err := k8s.GetAPIResource(ownerRef.Kind)
  if err != nil {
    log.Tracef("skipping event for parent %s: %s", ownerRef.Kind, err)
  } else {
    objs, err := api.GetByNamespaceFiltered(res, request.Namespace, ownerRef.Name, labels.Everything())
    if err != nil {
      log.Warnf("couldn't retrieve parent object %s-%s-%s; error: %s", request.Namespace, ownerRef.Kind, ownerRef.Name, err)
    } else if len(objs) == 0 {
      log.Warnf("couldn't retrieve parent object %s-%s-%s", request.Namespace, ownerRef.Kind, ownerRef.Name)
    } else {
      parent = objs[0]
    }
    ownerKind = strings.ToLower(ownerRef.Kind)
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With those pieces in place, the webhook calls &lt;code&gt;report.Injectable()&lt;/code&gt; to decide whether injection should proceed. To do so, it checks the following conditions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HostNetwork&lt;/strong&gt; mode (iptables will not work if the Pod is on the host network).&lt;/li&gt;
&lt;li&gt;An existing sidecar, which would make injection redundant.&lt;/li&gt;
&lt;li&gt;Unsupported resource kinds.&lt;/li&gt;
&lt;li&gt;An explicit annotation that disables injection.&lt;/li&gt;
&lt;li&gt;Whether the Pod automatically mounts its ServiceAccount token (needed for mTLS).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any of these checks fail, the function returns false along with one or more human‑readable reasons.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;injectable, reasons := report.Injectable()
...
func (r *Report) Injectable() (bool, []string) {
    var reasons []string
    if r.HostNetwork {
        reasons = append(reasons, hostNetworkEnabled)
    }
    if r.Sidecar {
        reasons = append(reasons, sidecarExists)
    }
    if r.UnsupportedResource {
        reasons = append(reasons, unsupportedResource)
    }
    if r.InjectDisabled {
        reasons = append(reasons, r.InjectDisabledReason)
    }

    if !r.AutomountServiceAccountToken {
        reasons = append(reasons, disabledAutomountServiceAccountToken)
    }

    if len(reasons) &amp;gt; 0 {
        return false, reasons
    }
    return true, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;Injectable()&lt;/code&gt; returns &lt;strong&gt;true&lt;/strong&gt;, the webhook proceeds with injection and:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adds the &lt;code&gt;linkerd-init&lt;/code&gt; init container (which configures the iptables rules).&lt;/li&gt;
&lt;li&gt;Adds the &lt;code&gt;linkerd-proxy&lt;/code&gt; sidecar container with all required environment variables, volume mounts, and command‑line flags.&lt;/li&gt;
&lt;li&gt;Appends a &lt;code&gt;linkerd.io/created-by&lt;/code&gt; annotation recording the injector version.&lt;/li&gt;
&lt;li&gt;If the Pod does not already have a &lt;code&gt;config.linkerd.io/opaque-ports&lt;/code&gt; annotation, splits the comma‑separated default opaque ports from &lt;code&gt;valuesConfig.Proxy.OpaquePorts&lt;/code&gt;, filters them against the actual container ports in the Pod spec, and then sets the annotation to just the matching ports.&lt;/li&gt;
&lt;li&gt;If a parent was found, emits a Kubernetes Event on the parent resource with reason &lt;strong&gt;Injected&lt;/strong&gt; and message &lt;em&gt;Linkerd sidecar proxy injected&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, it logs the generated patch at &lt;strong&gt;INFO&lt;/strong&gt; level and debug‑prints the full JSON patch.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
if injectable {
  resourceConfig.AppendPodAnnotation(pkgK8s.CreatedByAnnotation, fmt.Sprintf("linkerd/proxy-injector %s", version.Version))
  inject.AppendNamespaceAnnotations(resourceConfig.GetOverrideAnnotations(), resourceConfig.GetNsAnnotations(), resourceConfig.GetWorkloadAnnotations())
  if !resourceConfig.HasWorkloadAnnotation(pkgK8s.ProxyOpaquePortsAnnotation) {
    defaultPorts := strings.Split(resourceConfig.GetValues().Proxy.OpaquePorts, ",")
    filteredPorts := resourceConfig.FilterPodOpaquePorts(defaultPorts)
    if len(filteredPorts) != 0 {
      ports := strings.Join(filteredPorts, ",")
      resourceConfig.AppendPodAnnotation(pkgK8s.ProxyOpaquePortsAnnotation, ports)
    }
  }
  patchJSON, err := resourceConfig.GetPodPatch(true)
  if err != nil {
    return nil, err
  }
  if parent != nil {
    recorder.Event(parent, v1.EventTypeNormal, eventTypeInjected, "Linkerd sidecar proxy injected")
  }
  log.Infof("injection patch generated for: %s", report.ResName())
  log.Debugf("injection patch: %s", patchJSON)
  proxyInjectionAdmissionResponses.With(admissionResponseLabels(ownerKind, request.Namespace, "false", "", report.InjectAnnotationAt, configLabels)).Inc()
  patchType := admissionv1beta1.PatchTypeJSONPatch
  return &amp;amp;admissionv1beta1.AdmissionResponse{
    UID:       request.UID,
    Allowed:   true,
    PatchType: &amp;amp;patchType,
    Patch:     patchJSON,
  }, nil
}
...
func (conf *ResourceConfig) GetPodPatch(injectProxy bool) ([]byte, error) {
    namedPorts := make(map[string]int32)
    if conf.HasPodTemplate() {
        namedPorts = util.GetNamedPorts(conf.pod.spec.Containers)
    }
    values, err := GetOverriddenValues(conf.values, conf.getAnnotationOverrides(), namedPorts)
    values.Proxy.PodInboundPorts = getPodInboundPorts(conf.pod.spec)
    if err != nil {
        return nil, fmt.Errorf("could not generate Overridden Values: %w", err)
    }
    if values.ClusterNetworks != "" {
        for _, network := range strings.Split(strings.Trim(values.ClusterNetworks, ","), ",") {
            if _, _, err := net.ParseCIDR(network); err != nil {
                return nil, fmt.Errorf("cannot parse destination get networks: %w", err)
            }
        }
    }
    patch := &amp;amp;podPatch{
        Values:      *values,
        Annotations: map[string]string{},
        Labels:      map[string]string{},
    }
    switch strings.ToLower(conf.workload.metaType.Kind) {
    case k8s.Pod:
    case k8s.CronJob:
        patch.PathPrefix = "/spec/jobTemplate/spec/template"
    default:
        patch.PathPrefix = "/spec/template"
    }
    if conf.pod.spec != nil {
        conf.injectPodAnnotations(patch)
        if injectProxy {
            conf.injectObjectMeta(patch)
            conf.injectPodSpec(patch)
        } else {
            patch.Proxy = nil
            patch.ProxyInit = nil
        }
    }
    rawValues, err := yaml.Marshal(patch)
    if err != nil {
        return nil, err
    }
    files := []*loader.BufferedFile{
        {Name: chartutil.ChartfileName},
        {Name: "requirements.yaml"},
        {Name: "templates/patch.json"},
    }
    chart := &amp;amp;charts.Chart{
        Name:      "patch",
        Dir:       "patch",
        Namespace: conf.namespace,
        RawValues: rawValues,
        Files:     files,
        Fs:        static.Templates,
    }
    buf, err := chart.Render()
    if err != nil {
        return nil, err
    }
    res := rTrail.ReplaceAll(buf.Bytes(), []byte("\n"))
    return res, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can see the related events in the parent deployment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl describe deployment/simple-app-v1 -n simple-app
...
Events:
  Type    Reason             Age                    From                    Message
  ----    ------             ----                   ----                    -------
  Normal  ScalingReplicaSet  2m26s                  deployment-controller   Scaled up replica set simple-app-v1-658b475d7c from 0 to 1
  Normal  Injected           2m25s (x2 over 2m26s)  linkerd-proxy-injector  Linkerd sidecar proxy injected
  Normal  ScalingReplicaSet  2m25s                  deployment-controller   Scaled up replica set simple-app-v1-76fc99b86b from 0 to 1
  Normal  ScalingReplicaSet  2m19s                  deployment-controller   Scaled down replica set simple-app-v1-658b475d7c from 1 to 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the JSON patch is generated, the webhook returns it as a &lt;code&gt;[]byte&lt;/code&gt; in the admission response, and Kubernetes applies it to the Pod spec. In the logs, you can also see the raw byte array of the incoming admission request, including the trust‑anchor PEM, Linkerd‑related annotations, and Pod spec. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;time="2025-06-04T11:19:18Z" level=debug msg="admission request: &amp;amp;AdmissionRequest{UID:919c6889-a59c-4168-be0d-6d448460af98,Kind:/v1, Kind=Service,Resource:{ v1 services},SubResource:,Name:simple-app-v2,Namespace:simple-app,Operation:CREATE,UserInfo:{system:admin  [system:masters system:authenticated] map[authentication.kubernetes.io/credential-id:[X509SHA256=4bc6de6278f805fc173745e09f4a564b7c7b4fac138201f729d912cd623fa55b]]},Object:{[123 34 97 112 105 86 101 114 115 105 111 110 34 58 34 118 49 34 44 34 107 105 110 100 34 58 34 83 101 114 118 105 99 101 34 44 34 109 101 116 97 100 97 116 97 34 58 123 34 97 110 110 111 116 97 116 105 111 110 115 34 58 123 34 107 117 98 101 99 116 108 46 107 117 98 101 114 110 101 116 101 115 46 105 111 47 108 97 115 116 45 97 112 112 108 105 101 100 45 99 111 110 102 105 103 117 114 97 116 105 111 110 34 58 34 123 92 34 97 112 105 86 101 114 115 105 111 110 92 34 58 92 34 118 49 92 34 44 92 34 107 105 110 100 92 34 58 92 34 83 101 114 118 105 99 101 92 34 44 92 34 109 101 116 97 100 97 116 97 92 34 58 123 92 34 97 110 110 111 116 97 116 105 111 110 115 92 34 58 123 125 44 92 34 110 97 109 101 92 34 58 92 34 115 105 109 112 108 101 45 97 112 112 45 118 50 92 34 44 92 34 110 97 109 101 115 112 97 99 101 92 34 58 92 34 115 105 109 112 108 101 45 97 112 112 92 34 125 44 92 34 115 112 101 99 92 34 58 123 92 34 112 111 114 116 115 92 34 58 91 123 92 34 112 111 114 116 92 34 58 56 48 44 92 34 116 97 114 103 101 116 80 111 114 116 92 34 58 53 54 55 56 125 93 44 92 34 115 101 108 101 99 116 111 114 92 34 58 123 92 34 97 112 112 92 34 58 92 34 115 105 109 112 108 101 45 97 112 112 45 118 50 92 34 44 92 34 118 101 114 115 105 111 110 92 34 58 92 34 118 50 92 34 125 125 125 92 110 34 125 44 34 99 114 101 97 116 105 111 110 84 105 109 101 115 116 97 109 112 34 58 110 117 108 108 44 34 109 97 110 97 103 101 100 70 105 101 108 100 115 34 58 91 123 34 97 112 105 86 101 114 115 105 111 110 34 
58 34 118 49 34 44 34 102 105 101 108 100 115 84 121 112 101 34 58 34 70 105 101 108 100 115 86 49 34 44 34 102 105 101 108 100 115 86 49 34 58 123 34 102 58 109 101 116 97 100 97 116 97 34 58 123 34 102 58 97 110 110 111 116 97 116 105 111 110 115 34 58 123 34 46 34 58 123 125 44 34 102 58 107 117 98 101 99 116 108 46 107 117 98 101 114 110 101 116 101 115 46 105 111 47 108 97 115 116 45 97 112 112 108 105 101 100 45 99 111 110 102 105 103 117 114 97 116 105 111 110 34 58 123 125 125 125 44 34 102 58 115 112 101 99 34 58 123 34 102 58 105 110 116 101 114 110 97 108 84 114 97 102 102 105 99 80 111 108 105 99 121 34 58 123 125 44 34 102 58 112 111 114 116 115 34 58 123 34 46 34 58 123 125 44 34 107 58 123 92 34 112 111 114 116 92 34 58 56 48 44 92 34 112 114 111 116 111 99 111 108 92 34 58 92 34 84 67 80 92 34 125 34 58 123 34 46 34 58 123 125 44 34 102 58 112 111 114 116 34 58 123 125 44 34 102 58 112 114 111 116 111 99 111 108 34 58 123 125 44 34 102 58 116 97 114 103 101 116 80 111 114 116 34 58 123 125 125 125 44 34 102 58 115 101 108 101 99 116 111 114 34 58 123 125 44 34 102 58 115 101 115 115 105 111 110 65 102 102 105 110 105 116 121 34 58 123 125 44 34 102 58 116 121 112 101 34 58 123 125 125 125 44 34 109 97 110 97 103 101 114 34 58 34 107 117 98 101 99 116 108 45 99 108 105 101 110 116 45 115 105 100 101 45 97 112 112 108 121 34 44 34 111 112 101 114 97 116 105 111 110 34 58 34 85 112 100 97 116 101 34 44 34 116 105 109 101 34 58 34 50 48 50 53 45 48 54 45 48 52 84 49 49 58 49 57 58 49 56 90 34 125 93 44 34 110 97 109 101 34 58 34 115 105 109 112 108 101 45 97 112 112 45 118 50 34 44 34 110 97 109 101 115 112 97 99 101 34 58 34 115 105 109 112 108 101 45 97 112 112 34 125 44 34 115 112 101 99 34 58 123 34 105 110 116 101 114 110 97 108 84 114 97 102 102 105 99 80 111 108 105 99 121 34 58 34 67 108 117 115 116 101 114 34 44 34 112 111 114 116 115 34 58 91 123 34 112 111 114 116 34 58 56 48 44 34 112 114 111 116 111 99 111 108 34 58 34 84 67 80 34 44 34 116 
97 114 103 101 116 80 111 114 116 34 58 53 54 55 56 125 93 44 34 115 101 108 101 99 116 111 114 34 58 123 34 97 112 112 34 58 34 115 105 109 112 108 101 45 97 112 112 45 118 50 34 44 34 118 101 114 115 105 111 110 34 58 34 118 50 34 125 44 34 115 101 115 115 105 111 110 65 102 102 105 110 105 116 121 34 58 34 78 111 110 101 34 44 34 116 121 112 101 34 58 34 67 108 117 115 116 101 114 73 80 34 125 44 34 115 116 97 116 117 115 34 58 123 34 108 111 97 100 66 97 108 97 110 99 101 114 34 58 123 125 125 125] &amp;lt;nil&amp;gt;},OldObject:{[] &amp;lt;nil&amp;gt;},DryRun:*false,Options:{[123 34 97 112 105 86 101 114 115 105 111 110 34 58 34 109 101 116 97 46 107 56 115 46 105 111 47 118 49 34 44 34 102 105 101 108 100 77 97 110 97 103 101 114 34 58 34 107 117 98 101 99 116 108 45 99 108 105 101 110 116 45 115 105 100 101 45 97 112 112 108 121 34 44 34 102 105 101 108 100 86 97 108 105 100 97 116 105 111 110 34 58 34 83 116 114 105 99 116 34 44 34 107 105 110 100 34 58 34 67 114 101 97 116 101 79 112 116 105 111 110 115 34 125] &amp;lt;nil&amp;gt;},RequestKind:/v1, Kind=Service,RequestResource:/v1, Resource=services,RequestSubResource:,}"
time="2025-06-04T11:19:18Z" level=debug msg="request object bytes: {\"apiVersion\":\"v1\",\"kind\":\"Service\",\"metadata\":{\"annotations\":{\"kubectl.kubernetes.io/last-applied-configuration\":\"{\\\"apiVersion\\\":\\\"v1\\\",\\\"kind\\\":\\\"Service\\\",\\\"metadata\\\":{\\\"annotations\\\":{},\\\"name\\\":\\\"simple-app-v2\\\",\\\"namespace\\\":\\\"simple-app\\\"},\\\"spec\\\":{\\\"ports\\\":[{\\\"port\\\":80,\\\"targetPort\\\":5678}],\\\"selector\\\":{\\\"app\\\":\\\"simple-app-v2\\\",\\\"version\\\":\\\"v2\\\"}}}\\n\"},\"creationTimestamp\":null,\"managedFields\":[{\"apiVersion\":\"v1\",\"fieldsType\":\"FieldsV1\",\"fieldsV1\":{\"f:metadata\":{\"f:annotations\":{\".\":{},\"f:kubectl.kubernetes.io/last-applied-configuration\":{}}},\"f:spec\":{\"f:internalTrafficPolicy\":{},\"f:ports\":{\".\":{},\"k:{\\\"port\\\":80,\\\"protocol\\\":\\\"TCP\\\"}\":{\".\":{},\"f:port\":{},\"f:protocol\":{},\"f:targetPort\":{}}},\"f:selector\":{},\"f:sessionAffinity\":{},\"f:type\":{}}},\"manager\":\"kubectl-client-side-apply\",\"operation\":\"Update\",\"time\":\"2025-06-04T11:19:18Z\"}],\"name\":\"simple-app-v2\",\"namespace\":\"simple-app\"},\"spec\":{\"internalTrafficPolicy\":\"Cluster\",\"ports\":[{\"port\":80,\"protocol\":\"TCP\",\"targetPort\":5678}],\"selector\":{\"app\":\"simple-app-v2\",\"version\":\"v2\"},\"sessionAffinity\":\"None\",\"type\":\"ClusterIP\"},\"status\":{\"loadBalancer\":{}}}"
time="2025-06-04T11:19:18Z" level=debug msg="/var/run/linkerd/config/values config YAML: clusterDomain: cluster.local\nclusterNetworks: 10.0.0.0/8,100.64.0.0/10,172.16.0.0/12,192.168.0.0/16,fd00::/8\ncniEnabled: false\ncommonLabels: {}\ncontrolPlaneTracing: false\ncontrolPlaneTracingNamespace: linkerd-jaeger\ncontroller:\n  podDisruptionBudget:\n    maxUnavailable: 1\ncontrollerGID: -1\ncontrollerImage: ghcr.io/buoyantio/controller\ncontrollerImageVersion: \"\"\ncontrollerLogFormat: plain\ncontrollerLogLevel: debug\ncontrollerReplicas: 1\ncontrollerUID: 2103\ndebugContainer:\n  image:\n    name: cr.l5d.io/linkerd/debug\n    pullPolicy: \"\"\n    version: edge-25.4.4\ndeploymentStrategy:\n  rollingUpdate:\n    maxSurge: 25%\n    maxUnavailable: 25%\ndestinationController:\n  additionalArgs:\n  - -ext-endpoint-zone-weights\n  livenessProbe:\n    timeoutSeconds: 1\n  podAnnotations: {}\n  readinessProbe:\n    timeoutSeconds: 1\ndisableHeartBeat: false\ndisableIPv6: true\negress:\n  globalEgressNetworkNamespace: linkerd-egress\nenableEndpointSlices: true\nenableH2Upgrade: true\nenablePSP: false\nenablePodAntiAffinity: false\nenablePodDisruptionBudget: false\nenablePprof: false\nidentity:\n  externalCA: false\n  issuer:\n    clockSkewAllowance: 20s\n    issuanceLifetime: 24h0m0s\n    scheme: linkerd.io/tls\n    tls:\n      crtPEM: |\n        -----BEGIN CERTIFICATE-----\n        MIIBsjCCAVigAwIBAgIQG4RR1EkQLvanRZspKw9R3jAKBggqhkjOPQQDAjAlMSMw\n        IQYDVQQDExpyb290LmxpbmtlcmQuY2x1c3Rlci5sb2NhbDAeFw0yNTA2MDQxMTE4\n        NDJaFw0yNjA2MDQxMTE4NDJaMCkxJzAlBgNVBAMTHmlkZW50aXR5LmxpbmtlcmQu\n        Y2x1c3Rlci5sb2NhbDBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABKyE5Px3kwpI\n        ZEGR9Ky0feN3/X/3DQOSDweb3B1O6JK4fAtYDetnyUul+T0zXKtrLX0lrAdRzyaj\n        MLhci5ZMEd6jZjBkMA4GA1UdDwEB/wQEAwIBBjASBgNVHRMBAf8ECDAGAQH/AgEA\n        MB0GA1UdDgQWBBSkrmSMxXmF/CJz14sL5SNbwNh9qjAfBgNVHSMEGDAWgBSw5rC0\n        vxQuKzp3Qyo9+367k6kzMTAKBggqhkjOPQQDAgNIADBFAiA7L9KiSSJdKD8WxSXM\n        
cLcyqPe7Sw9lBko/Wcgcue80iwIhAJjddq/892QBoQspnTBctEfUVovznJCIMSKq\n        P4YtzyEn\n        -----END CERTIFICATE-----\n  kubeAPI:\n    clientBurst: 200\n    clientQPS: 100\n  livenessProbe:\n    timeoutSeconds: 1\n  podAnnotations: {}\n  readinessProbe:\n    timeoutSeconds: 1\n  serviceAccountTokenProjection: true\nidentityTrustAnchorsPEM: |\n  -----BEGIN CERTIFICATE-----\n  MIIBjTCCATSgAwIBAgIRAIMD4XLxwxvmNPAOcIuzz/EwCgYIKoZIzj0EAwIwJTEj\n  MCEGA1UEAxMacm9vdC5saW5rZXJkLmNsdXN0ZXIubG9jYWwwHhcNMjUwNjA0MTEx\n  ODQyWhcNMzUwNjAyMTExODQyWjAlMSMwIQYDVQQDExpyb290LmxpbmtlcmQuY2x1\n  c3Rlci5sb2NhbDBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABLwQ70dJQiN0LHY6\n  q4fvIND1LqcyypW8P+qrhVuIdHThgPx/KXXLa2+KjAbUzzeu8PRagGriwRn6+A69\n  AixeeuKjRTBDMA4GA1UdDwEB/wQEAwIBBjASBgNVHRMBAf8ECDAGAQH/AgEBMB0G\n  A1UdDgQWBBSw5rC0vxQuKzp3Qyo9+367k6kzMTAKBggqhkjOPQQDAgNHADBEAiAt\n  ZkhSf0dHy7c6dDorCcfUiwNVjSdV2Z+Sl2EJ0ZxorgIgO9hII30K/26KlicbXygh\n  CxaYQ3t5qyY437Z08s11FEg=\n  -----END CERTIFICATE-----\nidentityTrustDomain: cluster.local\nimagePullPolicy: IfNotPresent\nimagePullSecrets: []\nkubeAPI:\n  clientBurst: 200\n  clientQPS: 100\nlicenseResources:\n  resources:\n    limits:\n      cpu: 500m\n      memory: 256Mi\n    requests:\n      cpu: 250m\n      memory: 128Mi\nlicenseSecret: null\nlinkerdVersion: enterprise-2.18.0\nmanageExternalWorkloads: true\nnetworkValidator:\n  connectAddr: \"\"\n  enableSecurityContext: true\n  listenAddr: \"\"\n  logFormat: plain\n  logLevel: debug\n  timeout: 10s\nnodeSelector:\n  kubernetes.io/os: linux\npodAnnotations: {}\npodLabels: {}\npodMonitor:\n  controller:\n    enabled: true\n    namespaceSelector: |\n      matchNames:\n        - {{ .Release.Namespace }}\n        - linkerd-viz\n        - linkerd-jaeger\n  enabled: false\n  labels: {}\n  proxy:\n    enabled: true\n  scrapeInterval: 10s\n  scrapeTimeout: 10s\n  serviceMirror:\n    enabled: true\npolicyController:\n  image:\n    name: ghcr.io/buoyantio/policy-controller\n    pullPolicy: \"\"\n    version: \"\"\n  
livenessProbe:\n    timeoutSeconds: 1\n  logLevel: info\n  probeNetworks:\n  - 0.0.0.0/0\n  - ::/0\n  readinessProbe:\n    timeoutSeconds: 1\n  resources:\n    cpu:\n      limit: \"\"\n      request: \"\"\n    ephemeral-storage:\n      limit: \"\"\n      request: \"\"\n    memory:\n      limit: \"\"\n      request: \"\"\npolicyValidator:\n  caBundle: \"\"\n  crtPEM: \"\"\n  externalSecret: false\n  injectCaFrom: \"\"\n  injectCaFromSecret: \"\"\n  namespaceSelector:\n    matchExpressions:\n    - key: config.linkerd.io/admission-webhooks\n      operator: NotIn\n      values:\n      - disabled\npriorityClassName: \"\"\nprofileValidator:\n  caBundle: \"\"\n  crtPEM: \"\"\n  externalSecret: false\n  injectCaFrom: \"\"\n  injectCaFromSecret: \"\"\n  namespaceSelector:\n    matchExpressions:\n    - key: config.linkerd.io/admission-webhooks\n      operator: NotIn\n      values:\n      - disabled\nprometheusUrl: \"\"\nproxy:\n  additionalEnv:\n  - name: BUOYANT_BALANCER_LOAD_LOW\n    value: \"0.1\"\n  - name: BUOYANT_BALANCER_LOAD_HIGH\n    value: \"3.0\"\n  await: true\n  control:\n    streams:\n      idleTimeout: 5m\n      initialTimeout: 3s\n      lifetime: 1h\n  cores: null\n  defaultInboundPolicy: all-unauthenticated\n  disableInboundProtocolDetectTimeout: false\n  disableOutboundProtocolDetectTimeout: false\n  enableExternalProfiles: false\n  enableShutdownEndpoint: false\n  gid: -1\n  image:\n    name: ghcr.io/buoyantio/proxy\n    pullPolicy: \"\"\n    version: \"\"\n  inbound:\n    server:\n      http2:\n        keepAliveInterval: 100s\n        keepAliveTimeout: 100s\n  inboundConnectTimeout: 100ms\n  inboundDiscoveryCacheUnusedTimeout: 90s\n  livenessProbe:\n    initialDelaySeconds: 10\n    timeoutSeconds: 1\n  logFormat: plain\n  logHTTPHeaders: \"off\"\n  logLevel: warn,linkerd=debug,hickory=error,linkerd_proxy_http::client[{headers}]=on\n  metrics:\n    hostnameLabels: false\n  nativeSidecar: false\n  opaquePorts: 25,587,3306,4444,5432,6379,9300,11211\n  
outbound:\n    server:\n      http2:\n        keepAliveInterval: 200s\n        keepAliveTimeout: 200s\n  outboundConnectTimeout: 1000ms\n  outboundDiscoveryCacheUnusedTimeout: 5s\n  outboundTransportMode: transport-header\n  ports:\n    admin: 4191\n    control: 4190\n    inbound: 4143\n    outbound: 4140\n  readinessProbe:\n    initialDelaySeconds: 2\n    timeoutSeconds: 1\n  requireIdentityOnInboundPorts: \"\"\n  resources:\n    cpu:\n      limit: \"\"\n      request: \"\"\n    ephemeral-storage:\n      limit: \"\"\n      request: \"\"\n    memory:\n      limit: \"\"\n      request: \"\"\n  runtime:\n    workers:\n      maximumCPURatio: null\n      minimum: 1\n  shutdownGracePeriod: \"\"\n  startupProbe:\n    failureThreshold: 120\n    initialDelaySeconds: 0\n    periodSeconds: 1\n  uid: 2102\n  waitBeforeExitSeconds: 0\nproxyInit:\n  closeWaitTimeoutSecs: 0\n  ignoreInboundPorts: 4567,4568\n  ignoreOutboundPorts: 4567,4568\n  image:\n    name: ghcr.io/buoyantio/proxy-init\n    pullPolicy: \"\"\n    version: enterprise-2.18.0\n  iptablesMode: legacy\n  kubeAPIServerPorts: 443,6443\n  logFormat: \"\"\n  logLevel: \"\"\n  privileged: false\n  runAsGroup: 65534\n  runAsRoot: false\n  runAsUser: 65534\n  skipSubnets: \"\"\n  xtMountPath:\n    mountPath: /run\n    name: linkerd-proxy-init-xtables-lock\nproxyInjector:\n  caBundle: \"\"\n  crtPEM: \"\"\n  externalSecret: false\n  injectCaFrom: \"\"\n  injectCaFromSecret: \"\"\n  livenessProbe:\n    timeoutSeconds: 1\n  namespaceSelector:\n    matchExpressions:\n    - key: config.linkerd.io/admission-webhooks\n      operator: NotIn\n      values:\n      - disabled\n    - key: kubernetes.io/metadata.name\n      operator: NotIn\n      values:\n      - kube-system\n      - cert-manager\n  objectSelector:\n    matchExpressions:\n    - key: linkerd.io/control-plane-component\n      operator: DoesNotExist\n    - key: linkerd.io/cni-resource\n      operator: DoesNotExist\n  podAnnotations: {}\n  readinessProbe:\n    
timeoutSeconds: 1\n  timeoutSeconds: 10\nrevisionHistoryLimit: 10\nruntimeClassName: \"\"\nspValidator:\n  livenessProbe:\n    timeoutSeconds: 1\n  readinessProbe:\n    timeoutSeconds: 1\nwebhookFailurePolicy: Ignore\n"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook" rel="noopener noreferrer"&gt;https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/linkerd/linkerd2/blob/main/pkg/inject/inject.go" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2/blob/main/pkg/inject/inject.go&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/linkerd/linkerd2/blob/main/controller/proxy-injector/webhook.go" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2/blob/main/controller/proxy-injector/webhook.go&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>linkerd</category>
      <category>kubernetes</category>
      <category>microservices</category>
      <category>webhook</category>
    </item>
    <item>
      <title>From Trust Anchors to SPIFFE IDs: Understanding Linkerd’s Automated Identity Pipeline</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 19 Jun 2025 07:53:23 +0000</pubDate>
      <link>https://dev.to/gtrekter/from-trust-anchors-to-spiffe-ids-understanding-linkerds-automated-identity-pipeline-37k9</link>
      <guid>https://dev.to/gtrekter/from-trust-anchors-to-spiffe-ids-understanding-linkerds-automated-identity-pipeline-37k9</guid>
      <description>&lt;p&gt;Linkerd automatically enables mTLS for all TCP traffic between meshed pods. To do so, it relies on several certificates that must be in place for the control plane to function correctly. You can supply these certificates during installation or generate them with third-party tools such as cert-manager or trust-manager. The required certificates are the &lt;strong&gt;Root Trust Anchor&lt;/strong&gt; and an &lt;strong&gt;Identity Intermediate Issuer Certificate&lt;/strong&gt;, which work together to issue a unique Leaf Certificate for every meshed workload.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnuzs73ws0atnojnegp3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnuzs73ws0atnojnegp3.png" alt="Image description" width="800" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Root Trust Anchor Certificate
&lt;/h2&gt;

&lt;p&gt;Linkerd’s Root Trust Anchor is a public CA certificate that serves as the ultimate trust point for all service-mesh certificates. It never issues workload certificates directly; instead, it signs intermediate CA certificates, which then issue the workload certificates. This separation lets each cluster (or multiple clusters) run its own issuer while still validating against the same root anchor, maintaining mesh-wide trust without exposing the root key in day-to-day workflows.&lt;/p&gt;

&lt;p&gt;The Root Trust Anchor certificate (containing only the public key) is stored in the ConfigMap named linkerd-identity-trust-roots. Since this ConfigMap holds no private key material, it’s safe to store it in plain view and use it to bootstrap trust for all intermediates and end-entity certificates. A common practice for many enterprises is to leverage their own PKI to generate a new intermediate certificate that chains back to this root.&lt;/p&gt;

&lt;p&gt;When a new Linkerd proxy is injected into a workload pod, it receives its Root Trust Anchor certificate through an environment variable and a mounted volume.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd-proxy:
    Container ID:    containerd://f348b4bebec14d557c44951f309e07fac969de2ea93f20e9d1920b4a8e02180e
    Image:           cr.l5d.io/linkerd/proxy:edge-25.5.3
    ...
    Environment:
     ...
      LINKERD2_PROXY_IDENTITY_DIR:                               /var/run/linkerd/identity/end-entity
      LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS:                     &amp;lt;set to the key 'ca-bundle.crt' of config map 'linkerd-identity-trust-roots'&amp;gt;  Optional: false
      LINKERD2_PROXY_IDENTITY_TOKEN_FILE:                        /var/run/secrets/tokens/linkerd-identity-token
      ...
    Mounts:
      /var/run/linkerd/identity/end-entity from linkerd-identity-end-entity (rw)
      /var/run/secrets/tokens from linkerd-identity-token (rw)
...
Volumes:
  trust-roots:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      linkerd-identity-trust-roots
    Optional:  false
  linkerd-identity-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  86400
  linkerd-identity-end-entity:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      Memory
    SizeLimit:   &amp;lt;unset&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At startup, the proxy loads the trust-anchor certificate specified by &lt;code&gt;LINKERD2_PROXY_IDENTITY_TRUST_ANCHORS&lt;/code&gt;, ensures the directory indicated by &lt;code&gt;LINKERD2_PROXY_IDENTITY_DIR&lt;/code&gt; exists, and generates an ECDSA P-256 key pair. The private key is then encoded in PKCS#8 PEM format and written to the &lt;strong&gt;key.p8&lt;/strong&gt; file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func generateAndStoreKey(p string) (key *ecdsa.PrivateKey, err error) {
    key, err = tls.GenerateKey()
    if err != nil {
        return
    }
    pemb := tls.EncodePrivateKeyP8(key)
    err = os.WriteFile(p, pemb, 0600)
    return
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, it generates an X.509 CSR whose CN and DNS SAN are set to the proxy’s identity, saving it as &lt;strong&gt;csr.der&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;func generateAndStoreCSR(p, id string, key *ecdsa.PrivateKey) ([]byte, error) {
    csr := x509.CertificateRequest{
        Subject:  pkix.Name{CommonName: id},
        DNSNames: []string{id},
    }
    csrb, err := x509.CreateCertificateRequest(rand.Reader, &amp;amp;csr, key)
    if err != nil {
        return nil, fmt.Errorf("failed to create CSR: %w", err)
    }
    if err := os.WriteFile(p, csrb, 0600); err != nil {
        return nil, fmt.Errorf("failed to write CSR: %w", err)
    }
    return csrb, nil
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, it starts the Rust identity client, which reads the ServiceAccount JWT via &lt;code&gt;TokenSource::load()&lt;/code&gt;, loads the Root Trust Anchor certificate along with key.p8 and csr.der, and sends the raw CSR in a gRPC CertifyRequest.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;let req = tonic::Request::new(api::CertifyRequest {
  token: token.load()?,                   
  identity: name.to_string(),               
  certificate_signing_request: docs.csr_der.clone(),
});
let api::CertifyResponse { leaf_certificate, intermediate_certificates, valid_until } =
  IdentityClient::new(client).certify(req).await?.into_inner();
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, &lt;code&gt;identity&lt;/code&gt; contains the SPIFFE ID (&lt;code&gt;spiffe://&amp;lt;cluster&amp;gt;/ns/&amp;lt;namespace&amp;gt;/sa/&amp;lt;serviceaccount&amp;gt;&lt;/code&gt;). The control plane uses this value to issue a certificate whose URI SAN matches that SPIFFE ID, ignoring any SANs present in the CSR itself.&lt;/p&gt;

&lt;h1&gt;
  
  
  The Identity Intermediate Issuer Certificate
&lt;/h1&gt;

&lt;p&gt;The intermediate issuer certificate is stored in the linkerd-identity-issuer secret within the linkerd namespace. When the Identity service receives a Certificate Signing Request, it first validates the related ServiceAccount token by submitting a &lt;code&gt;TokenReview&lt;/code&gt; to the Kubernetes API (&lt;code&gt;authentication.k8s.io/v1/tokenreviews&lt;/code&gt;). The request includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the ServiceAccount token included in the CertifyRequest, and&lt;/li&gt;
&lt;li&gt;the &lt;code&gt;identity.l5d.io&lt;/code&gt; audience (so that only tokens issued specifically for Linkerd are accepted).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the token is missing or cannot be authenticated, the request is rejected immediately; otherwise, the API server verifies the token’s signature, expiration, issuer, and intended audience.&lt;/p&gt;

&lt;p&gt;The Identity service parses the ServiceAccount reference (&lt;code&gt;system:serviceaccount:&amp;lt;namespace&amp;gt;:&amp;lt;serviceaccount&amp;gt;&lt;/code&gt;), verifies that each segment is a valid DNS-1123 label, and constructs a SPIFFE URI in the configured trust domain. It then builds an &lt;code&gt;x509.Certificate&lt;/code&gt; template that includes&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the public key from the CSR,&lt;/li&gt;
&lt;li&gt;a SAN set to the SPIFFE URI, and&lt;/li&gt;
&lt;li&gt;a default 24-hour validity period.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The certificate is signed with &lt;code&gt;x509.CreateCertificate(rand.Reader, &amp;amp;template, issuerCert, csr.PublicKey, issuerKey)&lt;/code&gt; and returned to the proxy. You can observe this workflow by increasing the Identity pod’s log level to debug.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl logs -n linkerd       linkerd-identity-56d78cdd86-8c64w 
Defaulted container "identity" out of: identity, linkerd-proxy, linkerd-init (init)
time="2025-05-21T12:11:32Z" level=info msg="running version enterprise-2.17.1"
time="2025-05-21T12:11:32Z" level=info msg="starting gRPC license client" component=license-client grpc-address="linkerd-enterprise:8082"
time="2025-05-21T12:11:32Z" level=info msg="starting admin server on :9990"
time="2025-05-21T12:11:32Z" level=info msg="Using k8s client with QPS=100.00 Burst=200"
time="2025-05-21T12:11:32Z" level=info msg="POST https://10.247.0.1:443/apis/authorization.k8s.io/v1/selfsubjectaccessreviews 201 Created in 1 milliseconds"
time="2025-05-21T12:11:32Z" level=debug msg="Loaded issuer cert: -----BEGIN CERTIFICATE-----\nMIIBsjCCAVigAwIBAgIQZelMfABi9RPUkaa1fEXfIjAKBggqhkjOPQQDAjAlMSMw\nIQYDVQQDExpyb290LmxpbmtlcmQuY2x1c3Rlci5sb2NhbDAeFw0yNTA1MjExMjEx\nMDJaFw0yNjA1MjExMjExMDJaMCkxJzAlBgNVBAMTHmlkZW50aXR5LmxpbmtlcmQu\nY2x1c3Rlci5sb2NhbDBZMBMGByqGSM49AgEGCCqGSM49AwEHA0IABO52MoQ7mva8\nYPg7abR7rqO3UhE0csDoPgFKoqM54JAfQY9/8rwgKWn3AUvH9NKNNy46Nq0MmPFd\nZgz/qSX3i0WjZjBkMA4GA1UdDwEB/wQEAwIBBjASBgNVHRMBAf8ECDAGAQH/AgEA\nMB0GA1UdDgQWBBTSq+l58FRN+T4ZSwqPyX9EFJmysTAfBgNVHSMEGDAWgBQpPJRY\nnNGBgGrC7LAnIDcwXkIHVjAKBggqhkjOPQQDAgNIADBFAiA7bw59dCwkhQ9CSyUN\nLR4/U7nt2mFV519zCtvD5cJmjgIhAKhPME9EJVtN28L6ZpaYSWbnSTyih1aL/b7m\neqW0acqg\n-----END CERTIFICATE-----\n"
time="2025-05-21T12:11:32Z" level=debug msg="Issuer has been updated"
time="2025-05-21T12:11:32Z" level=info msg="starting gRPC server on :8080"
time="2025-05-21T12:11:37Z" level=debug msg="Validating token for linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local"
time="2025-05-21T12:11:37Z" level=info msg="POST https://10.247.0.1:443/apis/authentication.k8s.io/v1/tokenreviews 201 Created in 2 milliseconds"
time="2025-05-21T12:11:37Z" level=info msg="issued certificate for linkerd-identity.linkerd.serviceaccount.identity.linkerd.cluster.local until 2025-05-22 12:11:57 +0000 UTC: a7048ff55002e726894ad92eccfd6738fcbc72b496d58ef3071a73c866c8e311"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  The Proxy Leaf Certificate
&lt;/h1&gt;

&lt;p&gt;After the proxy receives the certificate, it loads it into its in-memory store and immediately uses it for mTLS. It automatically renews the certificate when roughly 70% of its TTL has elapsed, generating a new CSR to rotate the certificate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fn refresh_in(config: &amp;amp;Config, expiry: SystemTime) -&amp;gt; Duration {
    match expiry.duration_since(SystemTime::now()).ok().map(|d| d * 7 / 10) // 70% duration
    {
        None =&amp;gt; config.min_refresh,
        Some(lifetime) if lifetime &amp;lt; config.min_refresh =&amp;gt; config.min_refresh,
        Some(lifetime) if config.max_refresh &amp;lt; lifetime =&amp;gt; config.max_refresh,
        Some(lifetime) =&amp;gt; lifetime,
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The overall flow is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flpe1fvgqlstmqc8crrth.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flpe1fvgqlstmqc8crrth.png" alt="Image description" width="800" height="313"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://linkerd.io/2-edge/tasks/generate-certificates/" rel="noopener noreferrer"&gt;https://linkerd.io/2-edge/tasks/generate-certificates/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-review-v1/" rel="noopener noreferrer"&gt;https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-review-v1/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/proxy/identity-client/src/certify.rs" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/proxy/identity-client/src/certify.rs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/proxy/spire-client/src/lib.rs" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/proxy/spire-client/src/lib.rs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/app/src/identity.rs" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/app/src/identity.rs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/linkerd/linkerd2/blob/main/controller/identity/validator.go" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2/blob/main/controller/identity/validator.go&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/linkerd/linkerd2/blob/main/proxy-identity/main.go" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2/blob/main/proxy-identity/main.go&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>linkerd</category>
      <category>security</category>
    </item>
    <item>
      <title>End-to-End Distributed Tracing: Integrating Linkerd with Splunk Observability Cloud</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 08 May 2025 03:43:57 +0000</pubDate>
      <link>https://dev.to/gtrekter/end-to-end-distributed-tracing-integrating-linkerd-with-splunk-observability-cloud-4hdo</link>
      <guid>https://dev.to/gtrekter/end-to-end-distributed-tracing-integrating-linkerd-with-splunk-observability-cloud-4hdo</guid>
      <description>&lt;p&gt;Observability is critical at multiple layers of an organization. For IT operations teams, whose primary focus is maintaining system uptime and reliability, it provides real-time visibility into system performance, sends alerts when anomalies occur, and offers critical data to quickly diagnose issues. Meanwhile, at the executive level, observability supports strategic decision-making by visualizing KPIs related to customer journeys via a user friendly dashboards. For example, a retail company might track the cost of abandoned shopping carts or the average customer spend. According to a 2024 Dynatrace survey, 77% of technology leaders report that more non-IT teams are now involved in decisions driven by observability insights.&lt;/p&gt;

&lt;p&gt;Data is at the core of observability and monitoring, typically originating from logs, metrics, and traces. As applications add new functionality and user numbers grow, the data volume can skyrocket, making analysis increasingly complex. The challenge becomes even more pronounced when organizations shift from monolithic architectures to microservices-based systems, where an application is composed of numerous loosely coupled services and a single user request may trigger a series of calls across multiple back-end services, each contributing to overall performance and reliability. In fact, 79% of technology leaders report that cloud-native technology stacks generate so much data that it exceeds human capacity to manage.&lt;/p&gt;

&lt;p&gt;Additionally, microservices often communicate via asynchronous or synchronous calls that happen concurrently and independently, complicating troubleshooting efforts. Without distributed tracing, pinpointing delays or failures in this parallel ecosystem would be a daunting task.&lt;/p&gt;

&lt;h1&gt;
  
  
  Distributed Tracing and Observability Tools
&lt;/h1&gt;

&lt;p&gt;To reduce complexity and avoid manually piecing together data from different monitoring systems, many organizations turn to integrated observability platforms. Vendors such as Dynatrace, Cisco, Datadog, and New Relic have developed or acquired solutions that bring logs, metrics, traces, and infrastructure data under one roof. On the open-source side, tools like Zipkin and Jaeger offer user-friendly interfaces for visualizing distributed traces.&lt;/p&gt;

&lt;h1&gt;
  
  
  How Distributed Tracing Works
&lt;/h1&gt;

&lt;p&gt;Distributed tracing typically starts by updating the application code so that each incoming request can be tracked as it travels through multiple services. Many implementations use the OpenTracing API, which supports popular languages like Go, Java, Python, JavaScript, Ruby, and PHP. These libraries automatically create “spans” whenever a request enters a service, eliminating the need for custom tracing logic. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "traceID": "123abc4d5678ef91g234h",
  "spanID": "a12cde34f5gh67",
  "parentSpanID": "a1b2345678c91",
  "operationName": "/API",
  "serviceName": "API",
  "startTime": 1608239395286533,
  "duration": 1000000,
  "logs": [],
  "tags": [
    {
      "http.method": "GET",
      "http.path": "/api"
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Spans are then linked through unique trace and span IDs to form a single trace for each request, revealing the end-to-end path across all involved services. If the services communicate via HTTP, the trace information can be passed through HTTP headers, using open-source standards such as B3. B3 uses headers like &lt;code&gt;X-B3-TraceId&lt;/code&gt;, &lt;code&gt;X-B3-SpanId&lt;/code&gt;, and &lt;code&gt;X-B3-ParentSpanId&lt;/code&gt; to carry identifiers from one service to the next.&lt;/p&gt;

&lt;p&gt;Spans are sent to the collector, which validates them, applies any necessary transformations, and stores them in back-end storage before rendering them in a UI.&lt;/p&gt;

&lt;h1&gt;
  
  
  Distributed Tracing and Linkerd
&lt;/h1&gt;

&lt;p&gt;Linkerd supports distributed tracing by emitting trace spans directly from its data-plane proxies, using either OpenCensus or OpenTelemetry. When the Linkerd proxy detects a tracing header in an incoming HTTP request, it automatically creates a span to capture metrics such as the time spent inside the proxy and other relevant metadata.&lt;/p&gt;

&lt;h1&gt;
  
  
  Integrating Linkerd with Splunk
&lt;/h1&gt;

&lt;p&gt;In this tutorial, you’ll learn how to integrate Linkerd with Splunk, one of the major observability platforms in the Gartner Magic Quadrant. We’ll start by creating a local Kubernetes environment, installing Linkerd, deploying a sample application, and then configuring everything to send traces to Splunk.&lt;br&gt;
We’ll use k3d to create a local Kubernetes cluster:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;k3d cluster create training \
  --agents 0 \
  --servers 1 \
  --image rancher/k3s:v1.30.8-k3s1 \
  --network playground \
  --port 8080:80@loadbalancer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, install Linkerd using Helm and generate your own mTLS certificates with step:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add linkerd-buoyant https://helm.buoyant.cloud
helm repo update

helm install linkerd-crds \
  --create-namespace \
  --namespace linkerd \
  linkerd-buoyant/linkerd-enterprise-crds

step certificate create root.linkerd.cluster.local ca.crt ca.key \
  --profile root-ca \
  --no-password \
  --insecure

step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
  --profile intermediate-ca \
  --not-after 8760h \
  --no-password \
  --insecure \
  --ca ca.crt \
  --ca-key ca.key

helm install linkerd-control-plane \
  --namespace linkerd \
  --set license=$BUOYANT_LICENSE \
  --set-file identityTrustAnchorsPEM=ca.crt \
  --set-file identity.issuer.tls.crtPEM=issuer.crt \
  --set-file identity.issuer.tls.keyPEM=issuer.key \
  linkerd-buoyant/linkerd-enterprise-control-plane
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We’ll use the &lt;a href="https://github.com/BuoyantIO/emojivoto" rel="noopener noreferrer"&gt;Emojivoto&lt;/a&gt; demo application. The source code uses the &lt;code&gt;contrib.go.opencensus.io/exporter/ocagent&lt;/code&gt; library to send OpenCensus traces over gRPC to an agent configured via the &lt;code&gt;OC_AGENT_HOST&lt;/code&gt; environment variable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://run.linkerd.io/emojivoto.yml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, let’s enable automatic sidecar injection for the Emojivoto namespace and restart its deployments so that the pods are recreated with the Linkerd proxy injected.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl annotate ns emojivoto linkerd.io/inject=enabled
kubectl rollout restart deploy -n emojivoto
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Linkerd Jaeger configuration
&lt;/h1&gt;

&lt;p&gt;For this demo, we’ll send proxy traces to the collector over OpenTelemetry and the application traces over OpenCensus. This setup highlights Linkerd’s flexibility in handling different telemetry protocols.&lt;/p&gt;

&lt;p&gt;The Linkerd Jaeger extension has three components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Jaeger (optional):&lt;/strong&gt; UI for rendering traces collected by the linkerd-jaeger collector.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Injector:&lt;/strong&gt; Injects Linkerd proxies with the environment variables that define the protocol and endpoint for sending traces. For instance, &lt;code&gt;webhook.collectorSvcAddr&lt;/code&gt; sets &lt;code&gt;LINKERD2_PROXY_TRACE_COLLECTOR_SVC_ADDR&lt;/code&gt; (the collector endpoint) and &lt;code&gt;webhook.collectorTraceProtocol&lt;/code&gt; specifies which tracing protocol (OpenCensus or OpenTelemetry) to use.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collector (optional):&lt;/strong&gt; Receives traces from the proxies. Its configuration (in the collector-config ConfigMap) includes the list of exporters, endpoints, protocols, and other attributes/metadata.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By default, Linkerd Jaeger configures the proxy to send traces via OpenTelemetry on port &lt;code&gt;55678&lt;/code&gt; and enables only that exporter. Since we want both OpenTelemetry and OpenCensus, we need to customize these settings. Below is an example &lt;code&gt;value.yaml&lt;/code&gt; snippet showing how to enable multiple exporters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jaeger:
  enabled: false
collector:
  config:
    receivers:
      otlp:
        protocols:
          grpc: {}
          http: {}
      opencensus: {}
      zipkin: {}
      jaeger:
        protocols:
          grpc: {}
          thrift_http: {}
          thrift_compact: {}
          thrift_binary: {}
    processors:
      batch: {}
    extensions:
      health_check: {}
    exporters:
      jaeger:
        endpoint: collector.linkerd-jaeger.svc.cluster.local:14250
        tls:
          insecure: true
      otlp:
        endpoint: collector.linkerd-jaeger.svc.cluster.local:4317
        tls:
          insecure: true
    service:
      extensions: [health_check]
      pipelines:
        traces:
          receivers: [otlp, opencensus, zipkin, jaeger]
          processors: [batch]
          exporters: [otlp, jaeger]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, we’ve disabled the Jaeger UI because we plan to visualize data in Splunk. Install and configure Linkerd Jaeger via Helm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add linkerd-edge https://helm.linkerd.io/edge
helm install linkerd-jaeger \
  --create-namespace \
  --namespace linkerd-jaeger \
  --values value.yaml \
  linkerd-edge/linkerd-jaeger 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To send application-level (not proxy) traces via OpenCensus, we will need to set the &lt;code&gt;OC_AGENT_HOST&lt;/code&gt; environment variable to the Jaeger collector endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl -n emojivoto set env --all deploy OC_AGENT_HOST=collector.linkerd-jaeger:55678
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If we inspect the pods, we can confirm that the Linkerd injector has added the tracing environment variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl describe pod -n emojivoto deploy/emoji
...
Containers:
  linkerd-proxy:
    Environment:
      LINKERD2_PROXY_TRACE_ATTRIBUTES_PATH:                      /var/run/linkerd/podinfo/labels
      LINKERD2_PROXY_TRACE_COLLECTOR_SVC_ADDR:                   collector.linkerd-jaeger:55678
      LINKERD2_PROXY_TRACE_PROTOCOL:                             opentelemetry
      LINKERD2_PROXY_TRACE_SERVICE_NAME:                         linkerd-proxy
      LINKERD2_PROXY_TRACE_COLLECTOR_SVC_NAME:                   collector.linkerd-jaeger.serviceaccount.identity.linkerd.cluster.local
      LINKERD2_PROXY_TRACE_EXTRA_ATTRIBUTES:                     k8s.pod.uid=$(_pod_uid)
                                                                 k8s.container.name=$(_pod_containerName)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Distributed Tracing with Linkerd and Splunk Observability Cloud
&lt;/h1&gt;

&lt;p&gt;Before we can send traces to Splunk, we need to install the Splunk OpenTelemetry Collector and agent. To do so:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Log in to &lt;a href="https://app.us1.signalfx.com/" rel="noopener noreferrer"&gt;Splunk Observability Cloud&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkqdkhw3thwugmjiszjv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvkqdkhw3thwugmjiszjv.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go to &lt;strong&gt;Data Management&lt;/strong&gt;, then select &lt;strong&gt;Add Integration&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8n1kzunqr6dj07nxmifp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8n1kzunqr6dj07nxmifp.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Since our cluster is not running in any Cloud Service Provider, choose &lt;strong&gt;Deploy Splunk OpenTelemetry Collector for other environments&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdm8j2y3zkq9b6lsthgjx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdm8j2y3zkq9b6lsthgjx.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Configure the collector for your setup, and click &lt;strong&gt;Next&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fze8hi6qw0w30861aswpt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fze8hi6qw0w30861aswpt.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Splunk will generate the deployment commands, including an access token and other parameters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwtkzjvlsx3sg6ir7qsi1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwtkzjvlsx3sg6ir7qsi1.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Execute the instructions listed in the platform. This will typically create a DaemonSet to run the Splunk OpenTelemetry Collector on each node:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get ds 
NAMESPACE     NAME                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR            AGE
...
default       splunk-otel-collector-agent   1         1         0       1            0           kubernetes.io/os=linux   47s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It will also create the OpenTelemetry Operator and k8s Cluster Receiver deployments:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get deploy -A
NAMESPACE        NAME                                         READY   UP-TO-DATE   AVAILABLE   AGE
...
default          splunk-otel-collector-k8s-cluster-receiver   1/1     1            1           74s
default          splunk-otel-collector-operator               0/1     1            0           74s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the ConfigMaps containing the Splunk OpenTelemetry Collector configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get cm -A
NAMESPACE         NAME                                                   DATA   AGE
default           kube-root-ca.crt                                       1      16m
default           splunk-otel-collector-otel-agent                       2      2m33s
default           splunk-otel-collector-otel-k8s-cluster-receiver        1      2m33s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent will now send information about the cluster health directly to Splunk.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbtbix544sdqdop5vivob.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbtbix544sdqdop5vivob.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;By default, Linkerd Jaeger will export traces to its own Jaeger or OpenTelemetry endpoint. To forward these traces to Splunk, update the &lt;code&gt;exporters&lt;/code&gt; in your &lt;code&gt;value.yaml&lt;/code&gt; (or another Helm values file) to point to the Splunk OpenTelemetry Collector service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;jaeger:
  enabled: false
collector:
  config:
    receivers:
      otlp:
        protocols:
          grpc: {}
          http: {}
      opencensus: {}
      zipkin: {}
      jaeger:
        protocols:
          grpc: {}
          thrift_http: {}
          thrift_compact: {}
          thrift_binary: {}
    processors:
      batch: {}
    extensions:
      health_check: {}
    exporters:
      jaeger:
        endpoint: splunk-otel-collector-agent.default.svc.cluster.local:14250
        tls:
          insecure: true
      otlp:
        endpoint: splunk-otel-collector-agent.default.svc.cluster.local:4317
        tls:
          insecure: true
    service:
      extensions: [health_check]
      pipelines:
        traces:
          receivers: [otlp, opencensus, zipkin, jaeger]
          processors: [batch]
          exporters: [otlp, jaeger]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
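&lt;p&gt;Note that the only change from the earlier &lt;code&gt;value.yaml&lt;/code&gt; is the exporter endpoints, which now target the Splunk collector Service. Cluster-internal endpoints like these follow the usual &lt;code&gt;service.namespace.svc.cluster.local&lt;/code&gt; DNS pattern, which can be sketched as:&lt;/p&gt;

```python
# Build a cluster-internal endpoint using the standard Kubernetes DNS pattern:
# service.namespace.svc.cluster.local, plus the target port.
def cluster_endpoint(service, namespace, port):
    return "{}.{}.svc.cluster.local:{}".format(service, namespace, port)

# The two exporter endpoints used in the values file above.
print(cluster_endpoint("splunk-otel-collector-agent", "default", 4317))
print(cluster_endpoint("splunk-otel-collector-agent", "default", 14250))
```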



&lt;p&gt;After updating the Jaeger collector configuration, restart your pods so they pick up the new telemetry settings. Once everything is running, you should see distributed traces from both the Linkerd proxies and the Emojivoto application in Splunk Observability Cloud.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzx34njpe01l2uie0m9q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzx34njpe01l2uie0m9q.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgchwc99vh8lof6rzxdy9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgchwc99vh8lof6rzxdy9.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  References:
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Linkerd Jaeger Helm Chart:&lt;/strong&gt; &lt;a href="https://artifacthub.io/packages/helm/linkerd2-edge/linkerd-jaeger" rel="noopener noreferrer"&gt;https://artifacthub.io/packages/helm/linkerd2-edge/linkerd-jaeger&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linkerd Official Documentation:&lt;/strong&gt; &lt;a href="https://linkerd.io/2.17/tasks/distributed-tracing" rel="noopener noreferrer"&gt;https://linkerd.io/2.17/tasks/distributed-tracing&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>splunk</category>
      <category>linkerd</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Mesh Expansion with Linkerd, AKS, and Azure Virtual Machines</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Tue, 04 Mar 2025 08:16:54 +0000</pubDate>
      <link>https://dev.to/gtrekter/mesh-expansion-with-linkerd-aks-and-azure-virtual-machines-27b1</link>
      <guid>https://dev.to/gtrekter/mesh-expansion-with-linkerd-aks-and-azure-virtual-machines-27b1</guid>
      <description>&lt;p&gt;Kubernetes adoption continues to grow at an unprecedented pace, especially among larger organizations. According to a recent PortWorx survey of over 500 participants from companies with more than 500 employees, 58% plan to migrate at least some of their VM-managed applications to Kubernetes, while 85% plan to move the majority of their VM workloads to cloud-native platforms. This popularity of container orchestration is driven by scalability, flexibility, operational simplicity, and cost considerations , which make hybrid cloud environments particularly appealing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaif33oy5y6p683p5rod.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaif33oy5y6p683p5rod.png" alt="Image description" width="800" height="296"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the same time, many enterprises still maintain a significant on-premises footprint, and recent uncertainties due to the Broadcom acquisition of VMware have accelerated the push to modernize traditional VM-based workloads. However, as organizations adopt microservices, they often still need to communicate with legacy services running on-premises. This is where Mesh Expansion comes into play. By extending a service mesh beyond the confines of Kubernetes clusters, Mesh Expansion allows modern microservices to seamlessly interact with traditional on-premises services. In this article, I will show you how to expand your mesh using Linkerd Enterprise, Azure Kubernetes Service (AKS), and a Virtual Machine running in Azure.&lt;/p&gt;

&lt;h1&gt;
  
  
  Setup the environment
&lt;/h1&gt;

&lt;p&gt;First, let’s deploy all the resources required for this demonstration. We’ll use Terraform to provision an Azure Resource Group, Virtual Networks (VNets), Subnets, a Kubernetes cluster (AKS), and a Linux Virtual Machine.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2akks4b145f6cdlbsz1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fw2akks4b145f6cdlbsz1.png" alt="Image description" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The following is the related Terraform configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform {
  required_version = "&amp;gt;= 0.13"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "&amp;gt;= 3.0.0"
    }
  }
}

provider "azurerm" {
  features {}
  subscription_id = "c4de0e1c-1377-4248-9beb-e1f803c76248"
}

# -----------------------------------------------------------
# General
# -----------------------------------------------------------
resource "azurerm_resource_group" "resource_group" {
  name     = "rg-training-krc"
  location = "Korea Central"
}

# -----------------------------------------------------------
# Networking
# -----------------------------------------------------------
resource "azurerm_virtual_network" "virtual_network_kuberentes" {
  name                = "vnet-training-aks-krc"
  address_space       = ["10.224.0.0/16"]
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
}

resource "azurerm_virtual_network" "virtual_network_virtual_machine" {
  name                = "vnet-training-vm-krc"
  address_space       = ["10.1.0.0/28"]
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
}

resource "azurerm_subnet" "subnet_kuberentes" {
  name                 = "aks-subnet"
  address_prefixes     = ["10.224.1.0/24"]
  resource_group_name  = azurerm_resource_group.resource_group.name
  virtual_network_name = azurerm_virtual_network.virtual_network_kuberentes.name
}

resource "azurerm_subnet" "subnet_virtual_machine" {
  name                 = "vm-subnet"
  address_prefixes     = ["10.1.0.0/29"]
  resource_group_name  = azurerm_resource_group.resource_group.name
  virtual_network_name = azurerm_virtual_network.virtual_network_virtual_machine.name
}

resource "azurerm_virtual_network_peering" "virtual_network_peering_virtual_machine" {
  name                      = "VirtualMachineToAzureKubernetesService"
  resource_group_name       = azurerm_resource_group.resource_group.name
  virtual_network_name      = azurerm_virtual_network.virtual_network_virtual_machine.name
  remote_virtual_network_id = azurerm_virtual_network.virtual_network_kuberentes.id
}

resource "azurerm_virtual_network_peering" "virtual_network_peering_kuberentes" {
  name                      = "KubernetesToVirtualMachine"
  resource_group_name       = azurerm_resource_group.resource_group.name
  virtual_network_name      = azurerm_virtual_network.virtual_network_kuberentes.name
  remote_virtual_network_id = azurerm_virtual_network.virtual_network_virtual_machine.id
}

resource "azurerm_route_table" "route_table" {
  name                = "rt-training-krc"
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
}

resource "azurerm_subnet_route_table_association" "route_table_association_virtual_machine" {
  subnet_id      = azurerm_subnet.subnet_virtual_machine.id
  route_table_id = azurerm_route_table.route_table.id
}

resource "azurerm_subnet_route_table_association" "route_table_association_kubernetes" {
  subnet_id      = azurerm_subnet.subnet_kuberentes.id
  route_table_id = azurerm_route_table.route_table.id
}

# -----------------------------------------------------------
# Kubernetes
# -----------------------------------------------------------
resource "azurerm_kubernetes_cluster" "kubernetes_cluster" {
  name                = "aks-training-krc"
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
  dns_prefix          = "trainingaks"
  identity {
    type = "SystemAssigned"
  }
  default_node_pool {
    name                         = "default"
    node_count                   = 1
    vm_size                      = "Standard_D2_v2"
    vnet_subnet_id               = azurerm_subnet.subnet_kuberentes.id
  }
}

# -----------------------------------------------------------
# Virtual Machine
# -----------------------------------------------------------
resource "azurerm_network_interface" "network_interface" {
  name                = "nic-training-krc"
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
  ip_configuration {
    name                          = "internal"
    subnet_id                     = azurerm_subnet.subnet_virtual_machine.id
    private_ip_address_allocation = "Dynamic"
    public_ip_address_id          = azurerm_public_ip.public_ip.id
  }
}

resource "azurerm_public_ip" "public_ip" {
  name                = "pip-training-krc"
  resource_group_name = azurerm_resource_group.resource_group.name
  location            = azurerm_resource_group.resource_group.location
  allocation_method   = "Static"
}

resource "azurerm_linux_virtual_machine" "virtual_machine" {
  name                            = "vm-training-krc"
  resource_group_name             = azurerm_resource_group.resource_group.name
  location                        = azurerm_resource_group.resource_group.location
  size                            = "Standard_F2"
  admin_username                  = "adminuser"
  admin_password                  = "Password1234!" 
  disable_password_authentication = false
  network_interface_ids = [
    azurerm_network_interface.network_interface.id,
  ]

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
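&lt;p&gt;A quick sanity check worth running on a configuration like this is that each subnet prefix actually falls inside its Virtual Network’s address space. Python’s standard &lt;code&gt;ipaddress&lt;/code&gt; module can verify the literal CIDRs from the Terraform above:&lt;/p&gt;

```python
import ipaddress

# VNet address spaces and subnet prefixes as declared in the Terraform above.
networks = {
    "vnet-training-aks-krc": ("10.224.0.0/16", "10.224.1.0/24"),
    "vnet-training-vm-krc": ("10.1.0.0/28", "10.1.0.0/29"),
}

for name, (vnet_cidr, subnet_cidr) in networks.items():
    vnet = ipaddress.ip_network(vnet_cidr)
    subnet = ipaddress.ip_network(subnet_cidr)
    # A subnet outside its VNet's address space would be rejected by Azure.
    assert subnet.subnet_of(vnet), name
    print(name, subnet, "is inside", vnet)
```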



&lt;h1&gt;
  
  
  Networking configuration
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Virtual Network Peering
&lt;/h2&gt;

&lt;p&gt;Because we have deployed the Virtual Machine in a different Virtual Network from the AKS nodes, we need Virtual Network Peering so they can communicate. Peering allows traffic to flow through the Microsoft private backbone, making the separate virtual networks appear as one. This enables our Virtual Machine to reach AKS nodes by their private IP addresses and vice versa — without routing over the public internet.&lt;br&gt;
The Terraform configuration above creates two VNet peering resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;VirtualMachineToAzureKubernetesService&lt;/strong&gt; (connects the VM’s VNet to the AKS VNet)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KubernetesToVirtualMachine&lt;/strong&gt; (connects the AKS VNet back to the VM’s VNet)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both are necessary because an Azure peering link must be created from each side. We can test the connectivity by using a privileged debug container on the node to ping the Virtual Machine’s private IP:&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get nodes -o wide
NAME                              STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-default-37266765-vmss000000   Ready    &amp;lt;none&amp;gt;   12m   v1.30.9   10.224.1.4    &amp;lt;none&amp;gt;        Ubuntu 22.04.5 LTS   5.15.0-1079-azure   containerd://1.7.25-1

$ kubectl debug node/aks-default-37266765-vmss000000 -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0
Creating debugging pod node-debugger-aks-default-37266765-vmss000000-x9zjr with container debugger on node aks-default-37266765-vmss000000.
If you don't see a command prompt, try pressing enter.
/ # ping 10.1.0.4
PING 10.1.0.4 (10.1.0.4): 56 data bytes
64 bytes from 10.1.0.4: seq=0 ttl=64 time=5.452 ms
64 bytes from 10.1.0.4: seq=1 ttl=64 time=2.412 ms
64 bytes from 10.1.0.4: seq=2 ttl=64 time=1.018 ms
64 bytes from 10.1.0.4: seq=3 ttl=64 time=0.879 ms
64 bytes from 10.1.0.4: seq=4 ttl=64 time=1.046 ms
64 bytes from 10.1.0.4: seq=5 ttl=64 time=1.007 ms
--- 10.1.0.4 ping statistics ---
6 packets transmitted, 6 packets received, 0% packet loss
round-trip min/avg/max = 0.879/1.969/5.452 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;And SSH into the Virtual Machine using its public IP and pinging the AKS node’s private IP:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ ping  10.224.1.4
PING 10.224.1.4 (10.224.1.4) 56(84) bytes of data.
64 bytes from 10.224.1.4: icmp_seq=1 ttl=64 time=2.03 ms
64 bytes from 10.224.1.4: icmp_seq=2 ttl=64 time=1.30 ms
64 bytes from 10.224.1.4: icmp_seq=3 ttl=64 time=1.14 ms
^C
--- 10.224.1.4 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.142/1.488/2.029/0.387 ms
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
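&lt;p&gt;Keep in mind that Azure VNet peering also requires the two address spaces to be non-overlapping. The ranges used above satisfy this, which we can confirm with the standard &lt;code&gt;ipaddress&lt;/code&gt; module:&lt;/p&gt;

```python
import ipaddress

# Address spaces of the two peered VNets from the Terraform above.
aks_vnet = ipaddress.ip_network("10.224.0.0/16")
vm_vnet = ipaddress.ip_network("10.1.0.0/28")

# Azure rejects peerings between VNets whose address spaces overlap.
assert not aks_vnet.overlaps(vm_vnet)
print("address spaces are disjoint, so the peering can be established")
```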



&lt;h1&gt;
  
  
  Route Tables
&lt;/h1&gt;

&lt;p&gt;Even though our Virtual Machine and the Kubernetes nodes can now communicate via private IP addresses, the VM still cannot resolve Kubernetes Cluster IPs or Pod IPs. This is a critical requirement because, once we install the Linkerd proxy, the VM will need to communicate with Linkerd components running inside the Kubernetes cluster, such as &lt;code&gt;linkerd-destination&lt;/code&gt;, &lt;code&gt;linkerd-identity&lt;/code&gt;, and any target services. These services have internal IPs provided by Kubernetes and rely on &lt;strong&gt;CoreDNS&lt;/strong&gt; (running within the cluster) for name resolution.&lt;br&gt;
To route requests from the VM to Kubernetes services, we can add custom routes in Azure so that any traffic destined for the Kubernetes service or Pod ranges gets forwarded to the AKS node. That node will then use &lt;strong&gt;CoreDNS&lt;/strong&gt; to resolve the service and Pod IPs. In particular, you’ll need routes for the following CIDR ranges:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmpd8wp88hwowbgpt4l5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmpd8wp88hwowbgpt4l5.png" alt="Image description" width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The resulting routes will be the following:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ficf2jts9orta74jo7h16.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ficf2jts9orta74jo7h16.png" alt="Image description" width="800" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Once these rules are in place, the VM will be able to send traffic to Kubernetes services by using their cluster-internal addresses. If we check the services running in the AKS cluster, we will see that &lt;code&gt;kube-dns&lt;/code&gt; is at &lt;code&gt;10.0.0.10&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get svc -A
NAMESPACE     NAME             TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE
default       kubernetes       ClusterIP   10.0.0.1      &amp;lt;none&amp;gt;        443/TCP         48m
kube-system   kube-dns         ClusterIP   10.0.0.10     &amp;lt;none&amp;gt;        53/UDP,53/TCP   47m
kube-system   metrics-server   ClusterIP   10.0.144.73   &amp;lt;none&amp;gt;        443/TCP         47m
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can then test DNS resolution in the Virtual Machine by specifying &lt;code&gt;kube-dns&lt;/code&gt; as our DNS server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ nslookup metrics-server.kube-system.svc.cluster.local 10.0.0.10
Server:  10.0.0.10
Address: 10.0.0.10#53

Name: metrics-server.kube-system.svc.cluster.local
Address: 10.0.144.73
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To avoid having to specify the DNS server, we can configure the VM’s &lt;code&gt;netplan&lt;/code&gt; file so that &lt;code&gt;10.0.0.10&lt;/code&gt; is recognized as a primary DNS server. On Ubuntu, netplan configuration files are typically located in &lt;code&gt;/etc/netplan/&lt;/code&gt;. The file named &lt;code&gt;50-cloud-init.yaml&lt;/code&gt; is a default, auto-generated configuration that describes how the Ubuntu system should bring up its network interfaces, like &lt;code&gt;eth0&lt;/code&gt;, and apply IP addresses, routing, and DNS settings.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ vim /etc/netplan/50-cloud-init.yaml
network:
  version: 2
  ethernets:
    eth0:
      match:
        macaddress: "00:22:48:f6:f2:98"
        driver: "hv_netvsc"
      dhcp4: true
      nameservers:
        addresses:
          - 10.0.0.10
        search:
          - cluster.local
          - svc.cluster.local
      dhcp4-overrides:
        route-metric: 100
      dhcp6: false
      set-name: "eth0"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After editing, apply the new configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ sudo netplan apply
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, DNS resolution on the VM automatically uses &lt;code&gt;10.0.0.10&lt;/code&gt; for Kubernetes service lookups. You can verify this by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ nslookupp metrics-server.kube-system.svc.cluster.local 10.0.0.10
Server:  10.0.0.10
Address: 10.0.0.10#53

Name: metrics-server.kube-system.svc.cluster.local
Address: 10.0.144.73
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Installing Linkerd Enterprise
&lt;/h1&gt;

&lt;p&gt;Now that the networking configuration is complete, we can move forward and install Linkerd. In this demonstration, I will use Helm charts. If you want to learn more about the different ways to install Linkerd, you can read my previous article: &lt;a href="https://dev.to/gtrekter/how-to-install-linkerd-enterprise-via-cli-operator-and-helm-charts-2a8b"&gt;How to Install Linkerd Enterprise via CLI, Operator, and Helm Charts&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;First, we will need to install the Linkerd Custom Resource Definitions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm upgrade --install linkerd-enterprise-crds linkerd-buoyant/linkerd-enterprise-crds \
  --namespace linkerd \
  --create-namespace \
  --set manageExternalWorkloads=true 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we will need to install the Linkerd control plane with the value &lt;code&gt;manageExternalWorkloads&lt;/code&gt; set to &lt;code&gt;true&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-control-plane linkerd-buoyant/linkerd-enterprise-control-plane \
  --version 2.17.1 \
  --namespace linkerd \
  --create-namespace \
  --set-file identityTrustAnchorsPEM=./certificates/ca.crt \
  --set-file identity.issuer.tls.crtPEM=./certificates/issuer.crt \
  --set-file identity.issuer.tls.keyPEM=./certificates/issuer.key \
  --set linkerdVersion=enterprise-2.17.1 \
  --set manageExternalWorkloads=true \
  --set license=**** 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Setting &lt;code&gt;manageExternalWorkloads&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt; deploys the &lt;code&gt;linkerd-autoregistration&lt;/code&gt; service and deployment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get svc -A
NAMESPACE     NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
default       kubernetes                  ClusterIP   10.0.0.1       &amp;lt;none&amp;gt;        443/TCP         147m
kube-system   kube-dns                    ClusterIP   10.0.0.10      &amp;lt;none&amp;gt;        53/UDP,53/TCP   146m
kube-system   metrics-server              ClusterIP   10.0.144.73    &amp;lt;none&amp;gt;        443/TCP         146m
linkerd       linkerd-autoregistration    ClusterIP   10.0.228.3     &amp;lt;none&amp;gt;        8081/TCP        29s
linkerd       linkerd-dst                 ClusterIP   10.0.129.121   &amp;lt;none&amp;gt;        8086/TCP        4m13s
linkerd       linkerd-dst-headless        ClusterIP   None           &amp;lt;none&amp;gt;        8086/TCP        4m13s
linkerd       linkerd-enterprise          ClusterIP   10.0.56.38     &amp;lt;none&amp;gt;        8082/TCP        4m13s
linkerd       linkerd-identity            ClusterIP   10.0.201.170   &amp;lt;none&amp;gt;        8080/TCP        4m13s
linkerd       linkerd-identity-headless   ClusterIP   None           &amp;lt;none&amp;gt;        8080/TCP        4m13s
linkerd       linkerd-policy              ClusterIP   None           &amp;lt;none&amp;gt;        8090/TCP        4m13s
linkerd       linkerd-policy-validator    ClusterIP   10.0.180.41    &amp;lt;none&amp;gt;        443/TCP         4m13s
linkerd       linkerd-proxy-injector      ClusterIP   10.0.61.156    &amp;lt;none&amp;gt;        443/TCP         4m13s
linkerd       linkerd-sp-validator        ClusterIP   10.0.234.34    &amp;lt;none&amp;gt;        443/TCP         4m13s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Workload Authentication and SPIRE
&lt;/h1&gt;

&lt;p&gt;When a Linkerd proxy starts inside a Kubernetes cluster, it generates a private key and submits a Certificate Signing Request (CSR). This CSR includes the service account token, which the Linkerd identity service uses — together with the Kubernetes API — to validate the proxy’s identity before issuing an x509 certificate. This certificate identifies the proxy in a DNS-like form and sets Subject Alternative Name (SAN) fields accordingly.&lt;br&gt;
Outside of Kubernetes, we don’t have service accounts or a default identity mechanism. That’s where SPIFFE and SPIRE come into play:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SPIFFE defines a standard for identifying and securing workloads.&lt;/li&gt;
&lt;li&gt;SPIRE is a production-ready implementation of SPIFFE that many service mesh and infrastructure providers (including Linkerd) can leverage for secure identity management.&lt;/li&gt;
&lt;/ul&gt;
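&lt;p&gt;To make the identity format concrete, here is a small illustrative shell sketch (the ID value is just an example, not something issued by SPIRE here) that splits a SPIFFE ID into its trust domain and workload path:&lt;/p&gt;

```shell
# Illustrative only: dissect a SPIFFE ID of the form spiffe://trust-domain/path
SVID_ID="spiffe://root.linkerd.cluster.local/proxy-harness"

# Strip the scheme, then cut at the first slash to get the trust domain
TRUST_DOMAIN="${SVID_ID#spiffe://}"
TRUST_DOMAIN="${TRUST_DOMAIN%%/*}"

# Everything after the trust domain is the workload path
WORKLOAD_PATH="/${SVID_ID#spiffe://*/}"

echo "$TRUST_DOMAIN"   # root.linkerd.cluster.local
echo "$WORKLOAD_PATH"  # /proxy-harness
```

The trust domain scopes all identities issued by one SPIRE deployment; the path identifies an individual workload within it.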
&lt;h2&gt;
  
  
  SPIRE Architecture
&lt;/h2&gt;

&lt;p&gt;In SPIRE, you run two main components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server: Manages and issues identities based on registration entries. It uses these entries to assign the correct SPIFFE ID to each authenticated agent.&lt;/li&gt;
&lt;li&gt;Agent: Runs on the same node as the workload and exposes a gRPC API for workloads to request identities. The agent “attests” the workload by checking system or container-level attributes — such as Unix user ID, container image, or other selectors — to ensure the workload is truly what it claims to be. Once validated, the agent issues an x509 SVID (SPIFFE Verifiable Identity Document) containing a URI SAN in the form &lt;code&gt;spiffe://trust-domain-name/path&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Install SPIRE
&lt;/h2&gt;

&lt;p&gt;In production, it’s typical to run the SPIRE server on a dedicated node and have multiple SPIRE agents each running on separate nodes where workloads live. For simplicity, we’ll install both server and agent on a single virtual machine. First, we will download the SPIRE binaries and copy them to &lt;code&gt;/opt/spire/&lt;/code&gt;, a common directory for add-on software packages on Linux.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget https://github.com/spiffe/SPIRE/releases/download/v1.11.2/SPIRE-1.11.2-linux-amd64-musl.tar.gz
tar zvxf SPIRE-1.11.2-linux-amd64-musl.tar.gz
cp -r spire-1.11.2/. /opt/spire/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since we are going to integrate SPIRE with the Linkerd Certificate Authority (CA), we will need to upload our trust anchor certificate and key to the VM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ scp ./certificates/ca.key adminuser@20.41.77.179:/home/adminuser/ca.key
adminuser@20.41.77.179's password: ********
ca.key              
                                                                                                   100%  227    24.9KB/s   00:00    
$ scp ./certificates/ca.crt adminuser@20.41.77.179:/home/adminuser/ca.crt
adminuser@20.41.77.179's password: ********
ca.crt      

# On the VM, move the files into place
$ sudo mkdir -p /opt/spire/certs
$ sudo mv /home/adminuser/ca.crt /opt/spire/certs/ca.crt
$ sudo mv /home/adminuser/ca.key /opt/spire/certs/ca.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we are going to create a simple SPIRE server configuration that binds the server API to &lt;code&gt;127.0.0.1:8081&lt;/code&gt;, uses &lt;code&gt;root.linkerd.cluster.local&lt;/code&gt; as the trust domain, and loads the previously uploaded CA certificate and key from &lt;code&gt;/opt/spire/certs/ca.crt&lt;/code&gt; and &lt;code&gt;/opt/spire/certs/ca.key&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt;/opt/spire/server.cfg &amp;lt;&amp;lt;EOL
server {
    bind_address = "127.0.0.1"
    bind_port = "8081"
    trust_domain = "root.linkerd.cluster.local"
    data_dir = "/opt/spire/data/server"
    log_level = "DEBUG"
    ca_ttl = "168h"
    default_x509_svid_ttl = "48h"
}
plugins {
    DataStore "sql" {
        plugin_data {
            database_type = "sqlite3"
            connection_string = "/opt/spire/data/server/datastore.sqlite3"
        }
    }
    KeyManager "disk" {
        plugin_data {
            keys_path = "/opt/spire/data/server/keys.json"
        }
    }
    NodeAttestor "join_token" {
        plugin_data {}
    }
    UpstreamAuthority "disk" {
        plugin_data {
            cert_file_path = "/opt/spire/certs/ca.crt"
            key_file_path = "/opt/spire/certs/ca.key"
        }
    }
}
EOL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
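&lt;p&gt;Before starting the server, we can sanity-check the file with SPIRE’s built-in &lt;code&gt;validate&lt;/code&gt; subcommand, which parses the configuration and reports errors without starting anything (assuming your SPIRE build ships this subcommand, as recent releases do):&lt;/p&gt;

```shell
# Parse and validate the server configuration without starting the server
/opt/spire/bin/spire-server validate -config /opt/spire/server.cfg
```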



&lt;p&gt;Then, it’s time for the SPIRE agent. In this case, we will need to instruct it to communicate with the server at &lt;code&gt;127.0.0.1:8081&lt;/code&gt;, use the same trust domain (&lt;code&gt;root.linkerd.cluster.local&lt;/code&gt;), and use the &lt;code&gt;unix&lt;/code&gt; plugin for workload attestation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt;/opt/spire/agent.cfg &amp;lt;&amp;lt;EOL
agent {
    data_dir = "/opt/spire/data/agent"
    log_level = "DEBUG"
    trust_domain = "root.linkerd.cluster.local"
    server_address = "localhost"
    server_port = 8081
    insecure_bootstrap = true
}
plugins {
   KeyManager "disk" {
        plugin_data {
            directory = "/opt/spire/data/agent"
        }
    }
    NodeAttestor "join_token" {
        plugin_data {}
    }
    WorkloadAttestor "unix" {
        plugin_data {}
    }
}
EOL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can now start the SPIRE server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ ./opt/spire/bin/spire-server run -config ./opt/spire/server.cfg
...
INFO[0000] Using legacy downstream X509 CA TTL calculation by default; this default will change in a future release 
WARN[0000] default_x509_svid_ttl is too high for the configured ca_ttl value. SVIDs with shorter lifetimes may be issued. Please set default_x509_svid_ttl to 28h or less, or the ca_ttl to 288h or more, to guarantee the full default_x509_svid_ttl lifetime when CA rotations are scheduled. 
WARN[0000] Current umask 0022 is too permissive; setting umask 0027 
INFO[0000] Configured                                    admin_ids="[]" data_dir=/opt/spire/data/server launch_log_level=debug version=1.11.2
INFO[0000] Opening SQL database                          db_type=sqlite3 subsystem_name=sql
INFO[0000] Initializing new database                     subsystem_name=sql
INFO[0000] Connected to SQL database                     read_only=false subsystem_name=sql type=sqlite3 version=3.46.1
INFO[0000] Configured DataStore                          reconfigurable=false subsystem_name=catalog
INFO[0000] Configured plugin                             external=false plugin_name=disk plugin_type=KeyManager reconfigurable=false subsystem_name=catalog
INFO[0000] Plugin loaded                                 external=false plugin_name=disk plugin_type=KeyManager subsystem_name=catalog
INFO[0000] Configured plugin                             external=false plugin_name=join_token plugin_type=NodeAttestor reconfigurable=false subsystem_name=catalog
INFO[0000] Plugin loaded                                 external=false plugin_name=join_token plugin_type=NodeAttestor subsystem_name=catalog
INFO[0000] Configured plugin                             external=false plugin_name=disk plugin_type=UpstreamAuthority reconfigurable=false subsystem_name=catalog
INFO[0000] Plugin loaded                                 external=false plugin_name=disk plugin_type=UpstreamAuthority subsystem_name=catalog
DEBU[0000] Loading journal from datastore                subsystem_name=ca_manager
INFO[0000] There is not a CA journal record that matches any of the local X509 authority IDs  subsystem_name=ca_manager
INFO[0000] Journal loaded                                jwt_keys=0 subsystem_name=ca_manager x509_cas=0
DEBU[0000] Preparing X509 CA                             slot=A subsystem_name=ca_manager
DEBU[0000] There is no active X.509 authority yet. Can't save CA journal in the datastore  subsystem_name=ca_manager
INFO[0000] X509 CA prepared                              expiration="2025-03-10 10:57:17 +0000 UTC" issued_at="2025-03-03 10:57:17.573729853 +0000 UTC" local_authority_id=721ccaf61807f4d9d1fe258476359e740feeb15e self_signed=false slot=A subsystem_name=ca_manager upstream_authority_id=737a7f3dfd9afd9669a777208b012bab53bf1164
INFO[0000] X509 CA activated                             expiration="2025-03-10 10:57:17 +0000 UTC" issued_at="2025-03-03 10:57:17.573729853 +0000 UTC" local_authority_id=721ccaf61807f4d9d1fe258476359e740feeb15e slot=A subsystem_name=ca_manager upstream_authority_id=737a7f3dfd9afd9669a777208b012bab53bf1164
INFO[0000] Creating a new CA journal entry               subsystem_name=ca_manager
DEBU[0000] Successfully stored CA journal entry in datastore  ca_journal_id=1 local_authority_id=721ccaf61807f4d9d1fe258476359e740feeb15e subsystem_name=ca_manager
DEBU[0000] Successfully rotated X.509 CA                 subsystem_name=ca_manager trust_domain_id="spiffe://root.linkerd.cluster.local" ttl=604799.402084726
DEBU[0000] Preparing JWT key                             slot=A subsystem_name=ca_manager
WARN[0000] UpstreamAuthority plugin does not support JWT-SVIDs. Workloads managed by this server may have trouble communicating with workloads outside this cluster when using JWT-SVIDs.  plugin_name=disk subsystem_name=ca_manager
DEBU[0000] Successfully stored CA journal entry in datastore  ca_journal_id=1 local_authority_id=721ccaf61807f4d9d1fe258476359e740feeb15e subsystem_name=ca_manager
INFO[0000] JWT key prepared                              expiration="2025-03-10 10:57:17.597952274 +0000 UTC" issued_at="2025-03-03 10:57:17.597952274 +0000 UTC" local_authority_id=6Yb52ncPDI4FDZy2unga1133Vne6HS8d slot=A subsystem_name=ca_manager
INFO[0000] JWT key activated                             expiration="2025-03-10 10:57:17.597952274 +0000 UTC" issued_at="2025-03-03 10:57:17.597952274 +0000 UTC" local_authority_id=6Yb52ncPDI4FDZy2unga1133Vne6HS8d slot=A subsystem_name=ca_manager
DEBU[0000] Successfully stored CA journal entry in datastore  ca_journal_id=1 local_authority_id=721ccaf61807f4d9d1fe258476359e740feeb15e subsystem_name=ca_manager
DEBU[0000] Rotating server SVID                          subsystem_name=svid_rotator
DEBU[0000] Signed X509 SVID                              expiration="2025-03-05T10:57:17Z" spiffe_id="spiffe://root.linkerd.cluster.local/spire/server" subsystem_name=svid_rotator
INFO[0000] Building in-memory entry cache                subsystem_name=endpoints
INFO[0000] Completed building in-memory entry cache      subsystem_name=endpoints
INFO[0000] Logger service configured                     launch_log_level=debug
DEBU[0000] Initializing health checkers                  subsystem_name=health
DEBU[0000] Initializing API endpoints                    subsystem_name=endpoints
INFO[0000] Starting Server APIs                          address="127.0.0.1:8081" network=tcp subsystem_name=endpoints
INFO[0000] Starting Server APIs                          address=/tmp/spire-server/private/api.sock network=unix subsystem_name=endpoints
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once it is up and running, we can generate a one-time join token that the agent will use to attest itself to the server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ /opt/spire/bin/spire-server token generate -spiffeID spiffe://root.linkerd.cluster.local/agent -output json | jq -r '.value'
5f497c6c-4fa5-45bd-b1ce-9d7770a7761b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
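&lt;p&gt;Rather than copying the token by hand, it can be captured directly into a shell variable and passed straight to the agent’s start command (this is just a convenience sketch around the same &lt;code&gt;token generate&lt;/code&gt; call):&lt;/p&gt;

```shell
# Capture the one-time join token for reuse in the agent start command
JOIN_TOKEN=$(/opt/spire/bin/spire-server token generate \
    -spiffeID spiffe://root.linkerd.cluster.local/agent \
    -output json | jq -r '.value')
echo "$JOIN_TOKEN"
```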



&lt;p&gt;Then, we can start the SPIRE agent.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ /opt/spire/bin/spire-agent run -config /opt/spire/agent.cfg -joinToken "5f497c6c-4fa5-45bd-b1ce-9d7770a7761b"
INFO[0000] Creating spire agent UDS directory            dir=/tmp/spire-agent/public
WARN[0000] Current umask 0022 is too permissive; setting umask 0027 
INFO[0000] Starting agent                                data_dir=/opt/spire/data/agent version=1.11.2
INFO[0000] Configured plugin                             external=false plugin_name=disk plugin_type=KeyManager reconfigurable=false subsystem_name=catalog
INFO[0000] Plugin loaded                                 external=false plugin_name=disk plugin_type=KeyManager subsystem_name=catalog
INFO[0000] Plugin loaded                                 external=false plugin_name=join_token plugin_type=NodeAttestor subsystem_name=catalog
INFO[0000] Configured plugin                             external=false plugin_name=unix plugin_type=WorkloadAttestor reconfigurable=false subsystem_name=catalog
INFO[0000] Plugin loaded                                 external=false plugin_name=unix plugin_type=WorkloadAttestor subsystem_name=catalog
INFO[0000] Bundle is not found                           subsystem_name=attestor
DEBU[0000] No pre-existing agent SVID found. Will perform node attestation  subsystem_name=attestor
INFO[0000] SVID is not found. Starting node attestation  subsystem_name=attestor
WARN[0000] Insecure bootstrap enabled; skipping server certificate verification  subsystem_name=attestor
INFO[0000] Node attestation was successful               reattestable=false spiffe_id="spiffe://root.linkerd.cluster.local/spire/agent/join_token/5f497c6c-4fa5-45bd-b1ce-9d7770a7761b" subsystem_name=attestor
DEBU[0000] Entry created                                 entry=6f0e6ebd-cb2f-48ac-9919-30d6e2820ca8 selectors_added=1 spiffe_id="spiffe://root.linkerd.cluster.local/agent" subsystem_name=cache_manager
DEBU[0000] Renewing stale entries                        cache_type=workload count=1 limit=500 subsystem_name=manager
INFO[0000] Creating X509-SVID                            entry_id=6f0e6ebd-cb2f-48ac-9919-30d6e2820ca8 spiffe_id="spiffe://root.linkerd.cluster.local/agent" subsystem_name=manager
DEBU[0000] SVID updated                                  entry=6f0e6ebd-cb2f-48ac-9919-30d6e2820ca8 spiffe_id="spiffe://root.linkerd.cluster.local/agent" subsystem_name=cache_manager
DEBU[0000] Bundle added                                  subsystem_name=svid_store_cache trust_domain_id=root.linkerd.cluster.local
DEBU[0000] Initializing health checkers                  subsystem_name=health
INFO[0000] Starting Workload and SDS APIs                address=/tmp/spire-agent/public/api.sock network=unix subsystem_name=endpoints
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
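&lt;p&gt;Once the agent is running, we can confirm it reports healthy and that the Workload API answers on the socket shown in the logs. Note that &lt;code&gt;api fetch x509&lt;/code&gt; only returns an SVID for a caller whose selectors match a registration entry, so at this point it is mainly useful for checking that the socket is reachable:&lt;/p&gt;

```shell
# Confirm the agent reports healthy
/opt/spire/bin/spire-agent healthcheck -socketPath /tmp/spire-agent/public/api.sock

# Ask the Workload API for an X509-SVID over the same socket
/opt/spire/bin/spire-agent api fetch x509 -socketPath /tmp/spire-agent/public/api.sock
```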



&lt;p&gt;Finally, create a registration entry for the process that will act as the Linkerd proxy outside of Kubernetes. The &lt;code&gt;-selector "unix:uid:998"&lt;/code&gt; means any process running under UID &lt;code&gt;998&lt;/code&gt; on this agent node will receive the SPIFFE ID specified:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/opt/spire/bin/spire-server entry create \
    -spiffeID "spiffe://root.linkerd.cluster.local/proxy-harness" \
    -parentID "spiffe://root.linkerd.cluster.local/agent" \
    -selector "unix:uid:998"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
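&lt;p&gt;We can confirm the entry was stored by listing registration entries on the server, filtering by the same selector:&lt;/p&gt;

```shell
# Show the registration entry we just created for UID 998
/opt/spire/bin/spire-server entry show -selector "unix:uid:998"
```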



&lt;h1&gt;
  
  
  Install the Linkerd Proxy
&lt;/h1&gt;

&lt;p&gt;With the SPIRE agent running and issuing identities, we can now set up Linkerd’s proxy harness on the virtual machine. The harness is a small daemon that installs the Linkerd proxy, configures iptables for traffic redirection, and registers itself with the Linkerd control plane running in the Kubernetes cluster. First we will need to download it with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget https://github.com/BuoyantIO/linkerd-buoyant/releases/download/enterprise-2.17.1/linkerd-proxy-harness-enterprise-2.17.1-amd64.deb
apt-get -y install ./linkerd-proxy-harness-enterprise-2.17.1-amd64.deb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Create a Workload Group in Kubernetes
&lt;/h1&gt;

&lt;p&gt;Then, in the Kubernetes cluster, we will need to deploy an &lt;code&gt;ExternalGroup&lt;/code&gt; resource. This tells the Linkerd control plane that an external workload (running outside of Kubernetes) is part of the service mesh under the namespace &lt;code&gt;training&lt;/code&gt;. The readiness probe ensures that Linkerd can verify when the proxy harness is up and healthy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Namespace
metadata:
  name: training
---
apiVersion: workload.buoyant.io/v1alpha1
kind: ExternalGroup
metadata:
  name: training-vm
  namespace: training
spec:
  probes:
  - failureThreshold: 1
    httpGet:
      path: /ready
      port: 80
      scheme: HTTP
      host: 127.0.0.1
    initialDelaySeconds: 3
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 1
  template:
    metadata:
      labels:
        app: training-app
        location: vm
    ports:
    - port: 80
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
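&lt;p&gt;Assuming the Buoyant CRDs were installed with the control plane, we can verify the resource landed in the cluster (the plural resource name may differ slightly between CRD versions):&lt;/p&gt;

```shell
# Confirm the ExternalGroup resource exists in the training namespace
kubectl get externalgroups -n training
kubectl describe externalgroup training-vm -n training
```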



&lt;p&gt;Next, use &lt;code&gt;harnessctl&lt;/code&gt; (installed with the harness package) to point the harness at your Linkerd control plane:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;harnessctl set-config \
  --workload-group-name=training-vm \
  --workload-group-namespace=training \
  --control-plane-address=linkerd-autoregistration.linkerd.svc.cluster.local.:8081 \
  --control-plane-identity=linkerd-autoregistration.linkerd.serviceaccount.identity.linkerd.cluster.local
Config updated
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, start the daemon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;systemctl start linkerd-proxy-harness
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
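&lt;p&gt;Since the harness ships as a regular systemd unit, the standard commands apply for keeping it running across reboots and confirming it started cleanly:&lt;/p&gt;

```shell
# Start the harness on boot and report its current status
sudo systemctl enable linkerd-proxy-harness
systemctl status linkerd-proxy-harness --no-pager
```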



&lt;p&gt;Running &lt;code&gt;journalctl&lt;/code&gt; shows the harness logs, where we can see it updating the iptables rules so that all traffic goes through the proxy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;journalctl -u linkerd-proxy-harness -f
...
Mar 03 11:36:23 vm-training-krc systemd[1]: Starting Linkerd proxy harness...
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -D PREROUTING -j PROXY_INIT_REDIRECT -m comment --comment proxy-init/install-proxy-init-prerouting"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="iptables v1.8.7 (legacy): Couldn't load target `PROXY_INIT_REDIRECT':No such file or directory\n\nTry `iptables -h' or 'iptables --help' for more information.\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -D OUTPUT -j PROXY_INIT_OUTPUT -m comment --comment proxy-init/install-proxy-init-output"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="iptables v1.8.7 (legacy): Couldn't load target `PROXY_INIT_OUTPUT':No such file or directory\n\nTry `iptables -h' or 'iptables --help' for more information.\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -F PROXY_INIT_OUTPUT"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="iptables: No chain/target/match by that name.\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -F PROXY_INIT_REDIRECT"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="iptables: No chain/target/match by that name.\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -X PROXY_INIT_OUTPUT"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="iptables: No chain/target/match by that name.\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -X PROXY_INIT_REDIRECT"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="iptables: No chain/target/match by that name.\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy-save -t nat"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="# Generated by iptables-save v1.8.7 on Mon Mar  3 11:36:23 2025\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\nCOMMIT\n# Completed on Mon Mar  3 11:36:23 2025\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy-save -t nat"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="# Generated by iptables-save v1.8.7 on Mon Mar  3 11:36:23 2025\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\nCOMMIT\n# Completed on Mon Mar  3 11:36:23 2025\n"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -N PROXY_INIT_REDIRECT"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A PROXY_INIT_REDIRECT -p tcp --match multiport --dports 4567,4568 -j RETURN -m comment --comment proxy-init/ignore-port-4567,4568"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143 -m comment --comment proxy-init/redirect-all-incoming-to-proxy-port"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A PREROUTING -j PROXY_INIT_REDIRECT -m comment --comment proxy-init/install-proxy-init-prerouting"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -N PROXY_INIT_OUTPUT"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -m owner --uid-owner 998 -j RETURN -m comment --comment proxy-init/ignore-proxy-user-id"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -o lo -j RETURN -m comment --comment proxy-init/ignore-loopback"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -p tcp --match multiport --dports 4567,4568 -j RETURN -m comment --comment proxy-init/ignore-port-4567,4568"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140 -m comment --comment proxy-init/redirect-all-outgoing-to-proxy-port"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy -t nat -A OUTPUT -j PROXY_INIT_OUTPUT -m comment --comment proxy-init/install-proxy-init-output"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="/usr/sbin/iptables-legacy-save -t nat"
Mar 03 11:36:23 vm-training-krc harness-init[2131]: time="2025-03-03T11:36:23Z" level=info msg="# Generated by iptables-save v1.8.7 on Mon Mar  3 11:36:23 2025\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\n:PROXY_INIT_OUTPUT - [0:0]\n:PROXY_INIT_REDIRECT - [0:0]\n-A PREROUTING -m comment --comment \"proxy-init/install-proxy-init-prerouting\" -j PROXY_INIT_REDIRECT\n-A OUTPUT -m comment --comment \"proxy-init/install-proxy-init-output\" -j PROXY_INIT_OUTPUT\n-A PROXY_INIT_OUTPUT -m owner --uid-owner 998 -m comment --comment \"proxy-init/ignore-proxy-user-id\" -j RETURN\n-A PROXY_INIT_OUTPUT -o lo -m comment --comment \"proxy-init/ignore-loopback\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m multiport --dports 4567,4568 -m comment --comment \"proxy-init/ignore-port-4567,4568\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m comment --comment \"proxy-init/redirect-all-outgoing-to-proxy-port\" -j REDIRECT --to-ports 4140\n-A PROXY_INIT_REDIRECT -p tcp -m multiport --dports 4567,4568 -m comment --comment \"proxy-init/ignore-port-4567,4568\" -j RETURN\n-A PROXY_INIT_REDIRECT -p tcp -m comment --comment \"proxy-init/redirect-all-incoming-to-proxy-port\" -j REDIRECT --to-ports 4143\nCOMMIT\n# Completed on Mon Mar  3 11:36:23 2025\n"
Mar 03 11:36:23 vm-training-krc systemd[1]: Started Linkerd proxy harness.
Mar 03 11:36:23 vm-training-krc sudo[2160]:     root : PWD=/ ; USER=proxyharness ; COMMAND=/bin/bash -c '\\/bin\\/bash -c \\/var\\/lib\\/linkerd\\/bin\\/harness'
Mar 03 11:36:23 vm-training-krc sudo[2160]: pam_unix(sudo:session): session opened for user proxyharness(uid=998) by (uid=0)
Mar 03 11:36:24 vm-training-krc start-harness.sh[2161]: 2025-03-03T11:36:24.259338Z  INFO harness: Harness admin interface on 127.0.0.1:4192
Mar 03 11:36:24 vm-training-krc start-harness.sh[2161]: 2025-03-03T11:36:24.477490Z  INFO harness: identity used for control: spiffe://root.linkerd.cluster.local/proxy-harness
Mar 03 11:36:24 vm-training-krc start-harness.sh[2161]: 2025-03-03T11:36:24.498363Z  INFO controller{addr=linkerd-autoregistration.linkerd.svc.cluster.local:8081}: linkerd_pool_p2c: Adding endpoint addr=10.0.228.3:8081
Mar 03 11:36:24 vm-training-krc start-harness.sh[2161]: 2025-03-03T11:36:24.552077Z  INFO report_health:controller{addr=linkerd-autoregistration.linkerd.svc.cluster.local:8081}: linkerd_pool_p2c: Adding endpoint addr=10.0.228.3:8081
Mar 03 11:36:24 vm-training-krc start-harness.sh[2164]: [     0.120238s]  INFO ThreadId(01) linkerd2_proxy: release 0.0.0-dev (48069376) by Buoyant, Inc. on 2025-02-04T06:51:38Z
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.276965s]  INFO ThreadId(01) linkerd2_proxy::rt: Using single-threaded proxy runtime
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.831749s]  INFO ThreadId(01) linkerd2_proxy: Admin interface on 0.0.0.0:4191
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.831858s]  INFO ThreadId(01) linkerd2_proxy: Inbound interface on 0.0.0.0:4143
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.831863s]  INFO ThreadId(01) linkerd2_proxy: Outbound interface on 127.0.0.1:4140
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.831866s]  INFO ThreadId(01) linkerd2_proxy: Tap DISABLED
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.831870s]  INFO ThreadId(01) linkerd2_proxy: SNI is training-vm-7ef4eba0.training.external.identity.linkerd.cluster.local
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.831874s]  INFO ThreadId(01) linkerd2_proxy: Local identity is spiffe://root.linkerd.cluster.local/proxy-harness
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.831878s]  INFO ThreadId(01) linkerd2_proxy: Destinations resolved via linkerd-dst-headless.linkerd.svc.cluster.local:8086 (linkerd-destination.linkerd.serviceaccount.identity.linkerd.cluster.local)
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.868536s]  INFO ThreadId(01) policy:controller{addr=linkerd-policy.linkerd.svc.cluster.local:8090}: linkerd_pool_p2c: Adding endpoint addr=10.244.0.200:8090
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.868739s]  INFO ThreadId(01) dst:controller{addr=linkerd-dst-headless.linkerd.svc.cluster.local:8086}: linkerd_pool_p2c: Adding endpoint addr=10.244.0.200:8086
Mar 03 11:36:25 vm-training-krc start-harness.sh[2164]: [     0.889893s]  INFO ThreadId(02) daemon:identity: linkerd_app: Certified identity id=spiffe://root.linkerd.cluster.local/proxy-harness
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Meanwhile, in the SPIRE agent logs, we’ll see entries confirming that the harness process (PID 3817 in the example) is attested under the &lt;code&gt;proxyharness&lt;/code&gt; user (UID 998). The SPIRE agent issues an x509 SVID for &lt;code&gt;spiffe://root.linkerd.cluster.local/proxy-harness&lt;/code&gt;, matching the registration entry we created earlier:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;...
INFO[1013] Creating X509-SVID                            entry_id=5c0955f0-335c-4b5b-a3b4-9c0eae649e39 spiffe_id="spiffe://root.linkerd.cluster.local/proxy-harness" subsystem_name=manager
DEBU[1013] SVID updated                                  entry=5c0955f0-335c-4b5b-a3b4-9c0eae649e39 spiffe_id="spiffe://root.linkerd.cluster.local/proxy-harness" subsystem_name=cache_manager
DEBU[1013] PID attested to have selectors                pid=3817 selectors="[type:\"unix\" value:\"uid:998\" type:\"unix\" value:\"user:proxyharness\" type:\"unix\" value:\"gid:999\" type:\"unix\" value:\"group:proxyharness\" type:\"unix\" value:\"supplementary_gid:999\" type:\"unix\" value:\"supplementary_group:proxyharness\"]" subsystem_name=workload_attestor
DEBU[1013] Fetched X.509 SVID                            count=1 method=FetchX509SVID pid=3817 registered=true service=WorkloadAPI spiffe_id="spiffe://root.linkerd.cluster.local/proxy-harness" subsystem_name=endpoints ttl=172799.338898565
DEBU[1013] PID attested to have selectors                pid=3817 selectors="[type:\"unix\" value:\"uid:998\" type:\"unix\" value:\"user:proxyharness\" type:\"unix\" value:\"gid:999\" type:\"unix\" value:\"group:proxyharness\" type:\"unix\" value:\"supplementary_gid:999\" type:\"unix\" value:\"supplementary_group:proxyharness\"]" subsystem_name=workload_attestor
DEBU[1013] Fetched X.509 SVID                            count=1 method=FetchX509SVID pid=3817 registered=true service=WorkloadAPI spiffe_id="spiffe://root.linkerd.cluster.local/proxy-harness" subsystem_name=endpoints ttl=172799.334645563
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Testing
&lt;/h1&gt;

&lt;p&gt;With all components running, we’re now ready to verify traffic flow between a workload running on our VM and services in the Kubernetes cluster. First, we need to install Docker on the VM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release &amp;amp;&amp;amp; echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list &amp;gt; /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, start a simple HTTP echo service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker run -p 80:80 hashicorp/http-echo:latest echo -text="Welcome from $(hostname)"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To confirm that the workload is serving traffic internally on port 80, we can execute the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl localhost:80
Welcome from vm-training-krc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we’ll define a service in the &lt;code&gt;training&lt;/code&gt; namespace pointing to the &lt;code&gt;ExternalGroup&lt;/code&gt;. We’ll also deploy a test pod that has the Linkerd sidecar injected.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Service
metadata:
  name: training-vm
  namespace: training
spec:
  type: ClusterIP
  selector:
    app: training-app
    location: vm
  ports:
  - port: 80
    protocol: TCP
    name: one
---
apiVersion: v1
kind: Service
metadata:
  name: test-server
spec:
  type: ClusterIP
  selector:
    app: test-server
  ports:
  - port: 80
    protocol: TCP
---
apiVersion: v1
kind: Pod
metadata:
  name: curl-test
  annotations:
    linkerd.io/inject: enabled
spec:
  containers:
  - name: curl
    image: curlimages/curl:latest
    command: ["sleep", "infinity"]
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the &lt;code&gt;curl-test&lt;/code&gt; pod is running, you can &lt;code&gt;exec&lt;/code&gt; into it and issue a request to the VM workload:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl exec curl-test -c curl -- curl http://training-vm.training.svc.cluster.local:80
Welcome from vm-training-krc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The “Welcome from vm-training-krc” response confirms that the &lt;code&gt;ExternalGroup&lt;/code&gt; and Linkerd proxy harness are working correctly, allowing in-cluster traffic to reach the VM workload. Next, we’ll confirm traffic can flow from the VM to a service in Kubernetes. To do so, we will deploy a simple application in the cluster.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f - &amp;lt;&amp;lt;EOF
apiVersion: v1
kind: Namespace
metadata:
  name: simple-app
  annotations:
    linkerd.io/inject: enabled
---
apiVersion: v1
kind: Service
metadata:
  name: simple-app-v1
  namespace: simple-app
spec:
  selector:
    app: simple-app-v1
    version: v1
  ports:
    - port: 80
      targetPort: 5678
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: simple-app-v1
  namespace: simple-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simple-app-v1
      version: v1
  template:
    metadata:
      labels:
        app: simple-app-v1
        version: v1
    spec:
      containers:
        - name: http-app
          image: hashicorp/http-echo:latest
          args:
            - "-text=Simple App v1"
          ports:
            - containerPort: 5678
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the VM, you can send a request to the &lt;code&gt;simple-app-v1.simple-app.svc.cluster.local&lt;/code&gt; service using either &lt;code&gt;curl&lt;/code&gt; or &lt;code&gt;wget&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl -v http://simple-app-v1.simple-app.svc.cluster.local:80
*   Trying 10.0.57.85:80...
* Connected to simple-app-v1.simple-app.svc.cluster.local (10.0.57.85) port 80 (#0)
&amp;gt; GET / HTTP/1.1
&amp;gt; Host: simple-app-v1.simple-app.svc.cluster.local
&amp;gt; User-Agent: curl/7.81.0
&amp;gt; Accept: */*
&amp;gt; 
* Mark bundle as not supporting multiuse
&amp;lt; HTTP/1.1 200 OK
&amp;lt; x-app-name: http-echo
&amp;lt; x-app-version: 1.0.0
&amp;lt; date: Mon, 03 Mar 2025 12:11:19 GMT
&amp;lt; content-length: 29
&amp;lt; content-type: text/plain; charset=utf-8
&amp;lt; 
Simple App v1
* Connection #0 to host simple-app-v1.simple-app.svc.cluster.local left intact
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The “Simple App v1” response confirms that the VM-based application can reach in-cluster services via Linkerd.&lt;/p&gt;

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Portworx Survey:&lt;/strong&gt; &lt;a href="https://www.cncf.io/blog/2024/06/06/the-voice-of-kubernetes-experts-report-2024-the-data-trends-driving-the-future-of-the-enterprise/" rel="noopener noreferrer"&gt;https://www.cncf.io/blog/2024/06/06/the-voice-of-kubernetes-experts-report-2024-the-data-trends-driving-the-future-of-the-enterprise/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linkerd Open Source Documentation:&lt;/strong&gt; &lt;a href="https://linkerd.io/2-edge/tasks/adding-non-kubernetes-workloads/" rel="noopener noreferrer"&gt;https://linkerd.io/2-edge/tasks/adding-non-kubernetes-workloads/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Linkerd Enterprise Harness Documentation:&lt;/strong&gt; &lt;a href="https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/tasks/managing-external-workloads/" rel="noopener noreferrer"&gt;https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/tasks/managing-external-workloads/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>linkerd</category>
      <category>azure</category>
      <category>aks</category>
    </item>
    <item>
      <title>Mastering Service Mesh with Linkerd</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 14 Nov 2024 13:08:54 +0000</pubDate>
      <link>https://dev.to/gtrekter/mastering-service-mesh-with-linkerd-2hmn</link>
      <guid>https://dev.to/gtrekter/mastering-service-mesh-with-linkerd-2hmn</guid>
      <description>&lt;p&gt;As enterprises increasingly adopt microservices architecture to benefit from faster development, independent scalability, easier management, improved fault tolerance, better monitoring, and cost-effectiveness, the demand for robust infrastructure is growing. According to a 2023 Gartner report, 74% of respondents are using microservices, while 23% plan to adopt them.&lt;/p&gt;

&lt;p&gt;However, as these services scale, the architecture can become complex, involving multiple clusters across different locations (on-premise and cloud). Managing networking, monitoring, and security in such environments becomes challenging. This is where a service mesh comes into play. In this article, I will dive into the fundamentals of service mesh technology, focusing on one of the leading players, &lt;strong&gt;Buoyant’s Linkerd&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you’re more interested in the market trends, you’ll be pleased to know that the service mesh sector is experiencing rapid growth. This rise is fueled by the adoption of microservices, increasing cyberattacks, and supportive regulations like the EU’s “Path to the Digital Decade” initiative. Additionally, investments in AI are pushing countries like South Korea to leverage service mesh to transform data into intelligence. The service mesh market, valued at approximately USD 0.24 billion in 2022, is projected to grow to between USD 2.32 billion and USD 3.45 billion by 2030, with a compound annual growth rate (CAGR) of 39.7% from 2023 to 2030, according to reports from Next Move Strategy Consulting and Global Info Research.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vd8acl7kp73p9tj1job.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7vd8acl7kp73p9tj1job.png" alt="Image description" width="720" height="261"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Service Mesh?
&lt;/h2&gt;

&lt;p&gt;A service mesh is an additional infrastructure layer that centralizes the logic governing service-to-service communication, abstracting it away from individual services and managing it at the network layer. This enables developers to focus on building features without worrying about communication failures, retries, or routing, as the service mesh consistently handles these aspects across all services.&lt;/p&gt;

&lt;p&gt;Generally speaking, a service mesh provides the following benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;Automatic retries and circuit breaking:&lt;/strong&gt; It ensures reliable communication by retrying failed requests and prevents cascading failures by applying circuit-breaking patterns.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Automated service discovery:&lt;/strong&gt; It automatically detects services within the mesh without requiring manual configuration.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Improved security:&lt;/strong&gt; Encrypts communication between services by using mTLS.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Fine-grained traffic control:&lt;/strong&gt; It enables dynamic traffic routing, supporting deployment strategies like blue-green deployments, canary releases, and A/B testing without modifying the application’s code.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Observability:&lt;/strong&gt; It enhances monitoring with additional metrics, such as request success rates, failures, and requests per second, collected directly from the proxy.&lt;/li&gt;
&lt;/ul&gt;
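&lt;p&gt;To make the circuit-breaking idea concrete, here is a minimal, hypothetical Python sketch (a service mesh implements this inside the proxy; the class name and thresholds below are illustrative, not Linkerd’s implementation): the breaker opens after a configurable number of consecutive failures and fails fast until a cooldown elapses.&lt;/p&gt;

```python
import time

class CircuitBreaker:
    """Opens after `max_failures` consecutive errors; half-opens after `reset_after` seconds."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the breaker opened

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at >= self.reset_after:
                self.opened_at = None  # half-open: allow a trial request through
            else:
                raise RuntimeError("circuit open: failing fast")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

&lt;p&gt;Because the breaker fails fast while open, a struggling backend gets breathing room instead of a retry storm, which is the cascading-failure scenario the pattern prevents.&lt;/p&gt;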

&lt;h2&gt;
  
  
  How does a service mesh work?
&lt;/h2&gt;

&lt;p&gt;The architecture of a service mesh has two components: a control plane and a data plane.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data plane
&lt;/h3&gt;

&lt;p&gt;The data plane consists of &lt;strong&gt;proxies&lt;/strong&gt; (following the sidecar pattern) that are deployed alongside each application container instance within the same pod. These proxies intercept all inbound and outbound traffic to the service and, acting as intermediaries, implement features such as circuit breaking, request retries, load balancing, and enforcing mTLS (mutual TLS) for secure communication.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkue41q7i426vqj6qd0n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdkue41q7i426vqj6qd0n.png" alt="Image description" width="720" height="434"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Control Plane
&lt;/h3&gt;

&lt;p&gt;The control plane is responsible for managing and configuring the proxies in the data plane. While it doesn’t handle network packets directly, it orchestrates policies and configurations across the entire mesh. It does so by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    Maintaining a service registry and dynamically updating the list of services as they scale, join, or leave.&lt;/li&gt;
&lt;li&gt;    Defining and applying policies like traffic routing, security rules, and rate limiting.&lt;/li&gt;
&lt;li&gt;    Aggregating metrics, logs, and traces from the proxies for monitoring and observability.&lt;/li&gt;
&lt;li&gt;    Handling certificates and issuing cryptographic identities for mTLS to ensure secure communication between services.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Linkerd control plane is composed of the following components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;Destination pod:&lt;/strong&gt; This pod takes care of providing the IP and port of the destination service to a proxy when it’s sending traffic to another service. It also informs the proxy of the TLS identity it should expect on the other end of the connection. Finally, it fetches both policy information — which includes the types of requests allowed by authorization and traffic policies — and service profile information, which defines how the service should behave in terms of retries, per-route metrics, and timeouts.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Identity pod:&lt;/strong&gt; It acts as a Certificate Authority by issuing signed certificates to proxies that send it a Certificate Signing Request (CSR) during their initialization.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Proxy injector pod:&lt;/strong&gt; This Kubernetes admission controller checks for the existence of the annotation &lt;code&gt;linkerd.io/inject: enabled&lt;/code&gt; when a pod is created. If present, it injects the &lt;code&gt;proxy-init&lt;/code&gt; and &lt;code&gt;linkerd-proxy&lt;/code&gt; containers into the pod.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5i7x7emn9d1qnoxhknc7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5i7x7emn9d1qnoxhknc7.png" alt="Image description" width="720" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A little bit of history of Linkerd
&lt;/h2&gt;

&lt;p&gt;The major players in the service mesh market include Red Hat with OpenShift Service Mesh, HashiCorp with Consul, F5 with NGINX Service Mesh, and Istio, originally developed by Google and later transitioned to the Cloud Native Computing Foundation (CNCF).&lt;/p&gt;

&lt;p&gt;We’re focusing on Buoyant’s product, Linkerd, due to its commitment to performance and &lt;strong&gt;minimal resource footprint&lt;/strong&gt;. This focus has made it one of the earliest and most performant service meshes available. Linkerd achieved CNCF Graduated status on July 28, 2021, underscoring its stability and widespread adoption by organizations such as Monzo, Geico, Wells Fargo, and Visa (data from HGData).&lt;/p&gt;

&lt;p&gt;Linkerd is an open-source service mesh that was initially released in 2016, originally built on Finagle, a scalable microservice library. Its first proxy, written in Scala, leveraged Java and Scala’s networking features. However, the JVM dependency, the proxy’s complex surface area, and limitations stemming from design choices in Scala, Netty, and Finagle created friction for adoption in production environments. As a result, in 2017, Buoyant developed a new lightweight proxy written in Rust. At the time, Rust was still an emerging language, but Buoyant chose it over Scala, Go, and C++ for several key reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;Predictable performance:&lt;/strong&gt; Go’s garbage collector can cause latency spikes during collection cycles, which ruled it out. Rust, on the other hand, offers predictable performance with no garbage collection overhead.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Security:&lt;/strong&gt; Many security vulnerabilities, such as buffer overflows (e.g., Heartbleed), stem from unsafe memory management in languages like C and C++. Rust handles memory safety at compile time, significantly reducing the risk of such vulnerabilities.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How does Linkerd do Load Balancing?
&lt;/h2&gt;

&lt;p&gt;When you create a Service in Kubernetes, an associated Endpoint is automatically created based on the selector defined in the service specification. This selector identifies which pods should be part of the service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get svc -n vastaya application-vastaya-svc -o yaml
apiVersion: v1
kind: Service
...
spec:
  ...
  selector:
    app.kubernetes.io/instance: application
    app.kubernetes.io/name: vastaya
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The corresponding Endpoint is populated with the IP addresses of the selected pods:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get endpoints -n vastaya application-vastaya-svc -o yaml
apiVersion: v1
kind: Endpoints
...
subsets:
- addresses:
  - ip: 10.244.1.157
    nodeName: minikube
    targetRef:
      kind: Pod
      name: application-vastaya-dplmt-647b4dbdc-9bnwj
      namespace: vastaya
      uid: e3219642-4428-4bbc-89ec-a892ca571639

$ kubectl get pods -n vastaya -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: Pod
  ...
  status:
    hostIP: 192.168.49.2
    hostIPs:
    - ip: 192.168.49.2
    phase: Running
    podIP: 10.244.1.157
    podIPs:
    - ip: 10.244.1.157
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a container sends a request to a service, the proxy intercepts the request and looks up the target IP address in the Kubernetes API. If the IP corresponds to a Kubernetes Service, the proxy load balances using the EWMA (exponentially weighted moving average) algorithm to send requests to the fastest endpoints associated with that Service. Additionally, it applies any Service Policy (a custom CRD) associated with the service, enabling traffic-management capabilities such as canary deployments.&lt;/p&gt;
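&lt;p&gt;As a simplified illustration of EWMA-based balancing (Linkerd’s actual balancer lives in the Rust proxy and combines a peak-EWMA latency score with power-of-two-choices; the Python below is only a sketch with illustrative names), each endpoint’s observed latency is smoothed into a score, and the lowest-scoring endpoint wins:&lt;/p&gt;

```python
class EwmaBalancer:
    """Tracks a latency EWMA per endpoint and picks the lowest-scoring one."""

    def __init__(self, endpoints, alpha=0.3):
        self.alpha = alpha  # weight given to the newest latency sample
        # Endpoints start at 0.0, so fresh endpoints are tried first.
        self.scores = {ep: 0.0 for ep in endpoints}

    def record(self, endpoint, latency_ms):
        # EWMA update: new score blends the latest sample with history.
        old = self.scores[endpoint]
        self.scores[endpoint] = self.alpha * latency_ms + (1 - self.alpha) * old

    def pick(self):
        # Route to the endpoint with the lowest smoothed latency.
        return min(self.scores, key=self.scores.get)
```

&lt;p&gt;The smoothing means one slow response nudges an endpoint’s score up without permanently blacklisting it, so the balancer adapts as latencies change.&lt;/p&gt;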

&lt;h2&gt;
  
  
  How Does the Linkerd Proxy Intercept Packets?
&lt;/h2&gt;

&lt;p&gt;When a packet arrives at the network interface, it passes through the iptables rules, which contain multiple chains. Once a rule is matched, an action is taken. Linkerd redirects the packets to the proxy using the &lt;code&gt;PREROUTING&lt;/code&gt; and &lt;code&gt;OUTPUT&lt;/code&gt; chains in the nat (Network Address Translation) table. These rules are updated by the &lt;code&gt;linkerd-init&lt;/code&gt; container, which, as an init container, runs before the other containers start. It modifies the pod’s iptables, creating new chains and updating existing ones. You can inspect these changes by checking the container’s logs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; The iptables are specific to the pod’s namespace and are separate from the node’s iptables.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl -n vastaya logs application-vastaya-dplmt-647b4dbdc-9bnwj linkerd-init
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy-save -t nat"
time="2024-09-18T23:45:26Z" level=info msg="# Generated by iptables-save v1.8.10 on Wed Sep 18 23:45:26 2024\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\nCOMMIT\n# Completed on Wed Sep 18 23:45:26 2024\n"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -N PROXY_INIT_REDIRECT"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_REDIRECT -p tcp --match multiport --dports 4190,4191,4567,4568 -j RETURN -m comment --comment proxy-init/ignore-port-4190,4191,4567,4568"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143 -m comment --comment proxy-init/redirect-all-incoming-to-proxy-port"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A PREROUTING -j PROXY_INIT_REDIRECT -m comment --comment proxy-init/install-proxy-init-prerouting"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -N PROXY_INIT_OUTPUT"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -j RETURN -m comment --comment proxy-init/ignore-proxy-user-id"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -o lo -j RETURN -m comment --comment proxy-init/ignore-loopback"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -p tcp --match multiport --dports 4567,4568 -j RETURN -m comment --comment proxy-init/ignore-port-4567,4568"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140 -m comment --comment proxy-init/redirect-all-outgoing-to-proxy-port"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy -t nat -A OUTPUT -j PROXY_INIT_OUTPUT -m comment --comment proxy-init/install-proxy-init-output"
time="2024-09-18T23:45:26Z" level=info msg="/sbin/iptables-legacy-save -t nat"
time="2024-09-18T23:45:26Z" level=info msg="# Generated by iptables-save v1.8.10 on Wed Sep 18 23:45:26 2024\n*nat\n:PREROUTING ACCEPT [0:0]\n:INPUT ACCEPT [0:0]\n:OUTPUT ACCEPT [0:0]\n:POSTROUTING ACCEPT [0:0]\n:PROXY_INIT_OUTPUT - [0:0]\n:PROXY_INIT_REDIRECT - [0:0]\n-A PREROUTING -m comment --comment \"proxy-init/install-proxy-init-prerouting\" -j PROXY_INIT_REDIRECT\n-A OUTPUT -m comment --comment \"proxy-init/install-proxy-init-output\" -j PROXY_INIT_OUTPUT\n-A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -m comment --comment \"proxy-init/ignore-proxy-user-id\" -j RETURN\n-A PROXY_INIT_OUTPUT -o lo -m comment --comment \"proxy-init/ignore-loopback\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m multiport --dports 4567,4568 -m comment --comment \"proxy-init/ignore-port-4567,4568\" -j RETURN\n-A PROXY_INIT_OUTPUT -p tcp -m comment --comment \"proxy-init/redirect-all-outgoing-to-proxy-port\" -j REDIRECT --to-ports 4140\n-A PROXY_INIT_REDIRECT -p tcp -m multiport --dports 4190,4191,4567,4568 -m comment --comment \"proxy-init/ignore-port-4190,4191,4567,4568\" -j RETURN\n-A PROXY_INIT_REDIRECT -p tcp -m comment --comment \"proxy-init/redirect-all-incoming-to-proxy-port\" -j REDIRECT --to-ports 4143\nCOMMIT\n# Completed on Wed Sep 18 23:45:26 2024\n"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Viewing these iptables rules directly can be tricky. You’ll need to run &lt;code&gt;iptables-legacy&lt;/code&gt; as &lt;code&gt;sudo&lt;/code&gt;, but networking restrictions may prevent you from doing this inside a container. Instead, you can access them from the node by using &lt;code&gt;nsenter&lt;/code&gt; to enter the pod’s network namespace. Here’s how you can do it:&lt;/p&gt;

&lt;p&gt;SSH into the node (in this case, using Minikube):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ minikube ssh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Find the container ID:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ docker ps | grep application-vastaya-dplmt-647b4dbdc-9bnwj
3f1f370f44e4   de30a4cc9fd9                "/docker-entrypoint.…"   39 minutes ago   Up 39 minutes             k8s_application-vastaya-cntr_application-vastaya-dplmt-647b4dbdc-9bnwj_vastaya_e3219642-4428-4bbc-89ec-a892ca571639_3
85b5f32df37e   4585864a6b91                "/usr/lib/linkerd/li…"   39 minutes ago   Up 39 minutes             k8s_linkerd-proxy_application-vastaya-dplmt-647b4dbdc-9bnwj_vastaya_e3219642-4428-4bbc-89ec-a892ca571639_3
e81bac0a9f9c   registry.k8s.io/pause:3.9   "/pause"                 39 minutes ago   Up 39 minutes             k8s_POD_application-vastaya-dplmt-647b4dbdc-9bnwj_vastaya_e3219642-4428-4bbc-89ec-a892ca571639_3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Get the process ID of the container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ docker@minikube:~$ docker inspect --format '{{.State.Pid}}' 3f1f370f44e4  
6859
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use nsenter to access the pod’s network namespace:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ sudo nsenter -t 6859 -n
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, you can view the iptables rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ root@minikube:/home/docker# iptables-legacy -t nat -L
Chain PREROUTING (policy ACCEPT)
target               prot opt source               destination         
PROXY_INIT_REDIRECT  all  --  anywhere             anywhere             /* proxy-init/install-proxy-init-prerouting */

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target             prot opt source               destination         
PROXY_INIT_OUTPUT  all  --  anywhere             anywhere             /* proxy-init/install-proxy-init-output */

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination         

Chain PROXY_INIT_OUTPUT (1 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere             owner UID match 2102 /* proxy-init/ignore-proxy-user-id */
RETURN     all  --  anywhere             anywhere             /* proxy-init/ignore-loopback */
RETURN     tcp  --  anywhere             anywhere             multiport dports 4567,4568 /* proxy-init/ignore-port-4567,4568 */
REDIRECT   tcp  --  anywhere             anywhere             /* proxy-init/redirect-all-outgoing-to-proxy-port */ redir ports 4140

Chain PROXY_INIT_REDIRECT (1 references)
target     prot opt source               destination         
RETURN     tcp  --  anywhere             anywhere             multiport dports sieve,4191,4567,4568 /* proxy-init/ignore-port-4190,4191,4567,4568 */
REDIRECT   tcp  --  anywhere 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see from the &lt;code&gt;PROXY_INIT_REDIRECT&lt;/code&gt; chain, it redirects all incoming traffic to port 4143, the port on which the Linkerd proxy container listens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get pods application-vastaya-dplmt-647b4dbdc-9bnwj -n vastaya -o yaml
apiVersion: v1
kind: Pod
metadata:
  ...
  name: application-vastaya-dplmt-647b4dbdc-9bnwj
spec:
  containers:
    image: cr.l5d.io/linkerd/proxy:edge-24.9.2
    name: linkerd-proxy
    ports:
    - containerPort: 4143
      name: linkerd-proxy
      protocol: TCP
    ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The proxy processes the traffic and forwards it on, recovering the original destination IP and port through the &lt;code&gt;SO_ORIGINAL_DST&lt;/code&gt; socket option. Because the outgoing packet is owned by the proxy’s user ID, the &lt;code&gt;ignore-proxy-user-id&lt;/code&gt; rule in the OUTPUT chain returns it without redirecting it again, so the packet reaches the application instead of looping back through the proxy.&lt;/p&gt;
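&lt;p&gt;For illustration, this is roughly how a transparent proxy on Linux recovers the original destination of a redirected connection. The &lt;code&gt;parse_sockaddr_in&lt;/code&gt; helper below is hypothetical; Linkerd’s Rust proxy performs the equivalent lookup natively.&lt;/p&gt;

```python
import socket
import struct

SO_ORIGINAL_DST = 80  # constant from linux/netfilter_ipv4.h

def parse_sockaddr_in(raw):
    # struct sockaddr_in layout: 2-byte family, 2-byte port in network order,
    # 4-byte IPv4 address, then 8 bytes of zero padding (16 bytes total).
    port = struct.unpack("!H", raw[2:4])[0]
    ip = socket.inet_ntoa(raw[4:8])
    return ip, port

def original_dst(conn):
    """Recover the pre-REDIRECT destination of an intercepted TCP connection (Linux only)."""
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)
```

&lt;p&gt;Without this lookup, every redirected connection would appear to target the proxy’s own port (4143), and the proxy would have no way to know where the application actually meant to send the request.&lt;/p&gt;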

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwb650hf3voru7asrrky5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwb650hf3voru7asrrky5.png" alt="Image description" width="720" height="461"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Linkerd
&lt;/h2&gt;

&lt;p&gt;There are two primary ways to install Linkerd as a service mesh: via Helm charts or the Linkerd CLI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Using CLI
&lt;/h3&gt;

&lt;p&gt;Installing Linkerd via the CLI is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl --proto '=https' --tlsv1.2 -sSfL https://run.linkerd.io/install-edge | sh
$ export PATH=$HOME/.linkerd2/bin:$PATH
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the CLI is installed, you can verify that the environment is set up correctly for the installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ linkerd check --pre
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, similar to the Helm Chart installation, you’ll need to install the CRDs first, followed by the control plane:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ linkerd install --crds | kubectl apply -f -
$ linkerd install | kubectl apply -f -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CLI goes beyond just installation, offering fine-grained control over operations and useful extensions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ linkerd
linkerd manages the Linkerd service mesh.

Usage:
  linkerd [command]

Available Commands:
  authz        List authorizations for a resource
  check        Check the Linkerd installation for potential problems
  completion   Output shell completion code for the specified shell (bash, zsh or fish)
  diagnostics  Commands used to diagnose Linkerd components
  help         Help about any command
  identity     Display the certificate(s) of one or more selected pod(s)
  inject       Add the Linkerd proxy to a Kubernetes config
  install      Output Kubernetes configs to install Linkerd
  install-cni  Output Kubernetes configs to install Linkerd CNI
  jaeger       jaeger manages the jaeger extension of Linkerd service mesh
  multicluster Manages the multicluster setup for Linkerd
  profile      Output service profile config for Kubernetes
  prune        Output extraneous Kubernetes resources in the linkerd control plane
  uninject     Remove the Linkerd proxy from a Kubernetes config
  uninstall    Output Kubernetes resources to uninstall Linkerd control plane
  upgrade      Output Kubernetes configs to upgrade an existing Linkerd control plane
  version      Print the client and server version information
  viz          viz manages the linkerd-viz extension of Linkerd service mesh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Using Helm
&lt;/h3&gt;

&lt;p&gt;Unlike the Linkerd CLI, installing Linkerd via Helm requires generating certificates for mutual TLS (mTLS) beforehand. To handle this, we will use the step CLI for certificate management.&lt;/p&gt;

&lt;p&gt;To install the step CLI, run the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ wget &amp;lt;https://dl.smallstep.com/cli/docs-cli-install/latest/step-cli_amd64.deb&amp;gt;
$ sudo dpkg -i step-cli_amd64.deb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, generate the required certificates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ step certificate create root.linkerd.cluster.local ca.crt ca.key --profile root-ca --no-password --insecure
$ step certificate create identity.linkerd.cluster.local issuer.crt issuer.key --profile intermediate-ca --not-after 8760h --no-password --insecure --ca ca.crt --ca-key ca.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After creating the certificates, you can proceed with installing Linkerd in two steps using Helm charts. In this demo, I’ll be installing it on a local Minikube instance. Since Minikube uses Docker as the container runtime, the proxy-init container must run with root privileges (&lt;code&gt;--set runAsRoot=true&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;First, add the Linkerd Helm repository and update it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm repo add linkerd-edge https://helm.linkerd.io/edge
$ helm repo update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, install the CRDs and control plane:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm install linkerd-crds linkerd-edge/linkerd-crds -n linkerd --create-namespace
$ helm install linkerd-control-plane linkerd-edge/linkerd-control-plane -n linkerd --create-namespace --set-file identityTrustAnchorsPEM=certificates/ca.crt --set-file identity.issuer.tls.crtPEM=certificates/issuer.crt --set-file identity.issuer.tls.keyPEM=certificates/issuer.key --set runAsRoot=true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Mesh your services
&lt;/h2&gt;

&lt;p&gt;Once Linkerd’s CRDs and control plane are running in the cluster, you can start meshing your services. The proxy injector in Linkerd is implemented as a Kubernetes admission webhook, which automatically adds the proxy to new pods if the appropriate annotation is present in the namespace, deployment, or pod itself. Specifically, the annotation &lt;code&gt;linkerd.io/inject: enabled&lt;/code&gt; is used to trigger proxy injection.&lt;/p&gt;

&lt;p&gt;This injection adds two additional containers to each meshed pod:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;linkerd-init&lt;/strong&gt;: Configures the iptables to forward all incoming and outgoing TCP traffic through the proxy.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;linkerd-proxy&lt;/strong&gt;: The proxy itself, responsible for managing traffic, security, and observability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; If the annotation is added to an existing namespace, deployment, or pod, the pod must be restarted to apply the changes as Kubernetes only triggers the webhook when creating or updating resources.&lt;/p&gt;
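&lt;p&gt;To make the mechanics concrete, here is a minimal Python sketch of the kind of RFC 6902 JSON Patch a mutating admission webhook returns in its AdmissionReview response. This is an illustrative simplification, not the proxy-injector’s actual source: the real controller also renders volumes, environment variables, and annotations from its configuration.&lt;/p&gt;

```python
import base64
import json

# Illustrative sketch (not the actual proxy-injector code) of the JSON Patch
# a mutating admission webhook returns to add sidecar containers to a Pod.
# Container names and images mirror the `kubectl describe` output below.

def build_injection_patch():
    """Return RFC 6902 patch ops appending the init and proxy containers."""
    return [
        {"op": "add", "path": "/spec/initContainers/-",
         "value": {"name": "linkerd-init",
                   "image": "cr.l5d.io/linkerd/proxy-init:v2.4.1"}},
        {"op": "add", "path": "/spec/containers/-",
         "value": {"name": "linkerd-proxy",
                   "image": "cr.l5d.io/linkerd/proxy:edge-24.9.2"}},
        # "/" in the annotation key is escaped as "~1" per RFC 6901
        {"op": "add", "path": "/metadata/annotations/linkerd.io~1created-by",
         "value": "linkerd/proxy-injector edge-24.9.2"},
    ]

def admission_response(uid, patch_ops):
    """Wrap the patch in the AdmissionReview envelope the API server expects."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": uid,
            "allowed": True,
            "patchType": "JSONPatch",
            # the patch itself is sent base64-encoded
            "patch": base64.b64encode(
                json.dumps(patch_ops).encode()).decode(),
        },
    }
```

&lt;p&gt;Because the API server only invokes the webhook when a Pod is created or updated, annotating an existing workload has no effect until its pods are recreated, which is why the rollout restart is required.&lt;/p&gt;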

&lt;p&gt;To manually enable injection for a deployment, use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl annotate deployment -n vastaya application-vastaya-dplmt linkerd.io/inject=enabled
$ kubectl rollout restart -n vastaya deployment/projects-vastaya-dplmt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can verify the injection by describing the pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
$ kubectl describe pod -n vastaya application-vastaya-dplmt-647b4dbdc-9bnwj 
Name:             application-vastaya-dplmt-647b4dbdc-9bnwj
Namespace:        vastaya
...
Annotations:      linkerd.io/created-by: linkerd/proxy-injector edge-24.9.2
                  linkerd.io/inject: enabled
                  linkerd.io/proxy-version: edge-24.9.2
                  linkerd.io/trust-root-sha256: f6f154536a867a210de469e735af865c87a3eb61c77442bd9988353b4b632663
                  viz.linkerd.io/tap-enabled: true
Status:           Running
IP:               10.244.1.94
IPs:
  IP:           10.244.1.94
Controlled By:  ReplicaSet/application-vastaya-dplmt-647b4dbdc
Init Containers:
  linkerd-init:
    Container ID:    docker://06d8fedeac3d5d84b76aa2c4bb790f05e747402795247fe0a6087a49abd52e7a
    Image:           cr.l5d.io/linkerd/proxy-init:v2.4.1
    Image ID:        docker-pullable://cr.l5d.io/linkerd/proxy-init@sha256:e4ef473f52c453ea7895e9258738909ded899d20a252744cc0b9459b36f987ca
    Port:            &amp;lt;none&amp;gt;
    Host Port:       &amp;lt;none&amp;gt;
    SeccompProfile:  RuntimeDefault
    Args:
      --ipv6=false
      --incoming-proxy-port
      4143
      --outgoing-proxy-port
      4140
      --proxy-uid
      2102
      --inbound-ports-to-ignore
      4190,4191,4567,4568
      --outbound-ports-to-ignore
      4567,4568
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 18 Sep 2024 13:57:24 +0900
      Finished:     Wed, 18 Sep 2024 13:57:24 +0900
    Ready:          True
    Restart Count:  1
    Environment:    &amp;lt;none&amp;gt;
    Mounts:
      /run from linkerd-proxy-init-xtables-lock (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-kq29n (ro)
Containers:
  linkerd-proxy:
    Container ID:    docker://d91e9cdd9ef707538efc9467bc38ee35c95ed9d655d2d8f8a1b2a2834f910af4
    Image:           cr.l5d.io/linkerd/proxy:edge-24.9.2
    Image ID:        docker-pullable://cr.l5d.io/linkerd/proxy@sha256:43d1086980a64e14d1c3a732b0017efc8a9050bc05352e2dbefa9e954d6d607d
    Ports:           4143/TCP, 4191/TCP
    Host Ports:      0/TCP, 0/TCP
    SeccompProfile:  RuntimeDefault
    State:           Running
      Started:       Wed, 18 Sep 2024 13:57:25 +0900
    Last State:      Terminated
      Reason:        Completed
      Exit Code:     0
      Started:       Fri, 13 Sep 2024 20:02:28 +0900
      Finished:      Fri, 13 Sep 2024 22:32:19 +0900
    Ready:           True
    Restart Count:   1
    Liveness:        http-get http://:4191/live delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:       http-get http://:4191/ready delay=2s timeout=1s period=10s #success=1 #failure=3
    Environment:
      _pod_name:                                                 application-vastaya-dplmt-647b4dbdc-9bnwj (v1:metadata.name)
      _pod_ns:                                                   vastaya (v1:metadata.namespace)
      _pod_nodeName:                                              (v1:spec.nodeName)
      LINKERD2_PROXY_SHUTDOWN_ENDPOINT_ENABLED:                  false
      LINKERD2_PROXY_LOG:                                        warn,linkerd=info,hickory=error,[{headers}]=off,[{request}]=off
      ...
  application-vastaya-cntr:
    Container ID:   docker://e078170a412c392412f8f4fe170cdfcd139f212d2bd31dd6afda5801874b9225
    Image:          application:latest
    Image ID:       docker://sha256:de30a4cc9fd90fb6e51d51881747fb9b8a088d374e897a379c3ef87c848ace11
    Port:           80/TCP
    ...
Volumes:
  linkerd-proxy-init-xtables-lock:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  &amp;lt;unset&amp;gt;
  linkerd-identity-end-entity:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  &amp;lt;unset&amp;gt;
  linkerd-identity-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  86400
QoS Class:                   BestEffort
Node-Selectors:              &amp;lt;none&amp;gt;
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason          Age   From     Message
  ----    ------          ----  ----     -------
  Normal  SandboxChanged  92s   kubelet  Pod sandbox changed, it will be killed and re-created.
  Normal  Pulled          92s   kubelet  Container image "cr.l5d.io/linkerd/proxy-init:v2.4.1" already present on machine
  Normal  Created         92s   kubelet  Created container linkerd-init
  Normal  Started         92s   kubelet  Started container linkerd-init
  Normal  Pulled          92s   kubelet  Container image "cr.l5d.io/linkerd/proxy:edge-24.9.2" already present on machine
  Normal  Created         91s   kubelet  Created container linkerd-proxy
  Normal  Started         91s   kubelet  Started container linkerd-proxy
  Normal  Pulled          44s   kubelet  Container image "application:latest" already present on machine
  Normal  Created         44s   kubelet  Created container application-vastaya-cntr
  Normal  Started         44s   kubelet  Started container application-vastaya-cntr
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sometimes, you may want to mesh all pods in a namespace but exclude certain ones. You can achieve this by adding the annotation &lt;code&gt;linkerd.io/inject: disabled&lt;/code&gt; to the pods you want to exclude.&lt;/p&gt;

&lt;p&gt;Alternatively, you can use the Linkerd CLI to inject the proxy directly into existing YAML configurations. For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get -n vastaya deploy -o yaml | linkerd inject - | kubectl apply -f
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Metrics
&lt;/h2&gt;

&lt;p&gt;By default, Kubernetes collects metrics related to resource usage, such as memory and CPU, but it doesn’t gather information about the actual requests made between services. One of the key advantages of using Linkerd’s proxy is the additional metrics it collects and makes available. These include detailed data about the traffic flowing through the proxy, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  The number of requests the proxy has received.&lt;/li&gt;
&lt;li&gt;    Latency in milliseconds for each request.&lt;/li&gt;
&lt;li&gt;    Success and failure rates for service communication.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These metrics provide deeper insights into the behavior of your services and can be crucial for monitoring, troubleshooting, and optimizing performance.&lt;/p&gt;
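&lt;p&gt;As a rough illustration of what these metrics look like once aggregated, the following Python sketch (not Linkerd code; the proxy actually exports Prometheus counters and histograms) reduces per-request records to the headline numbers listed above.&lt;/p&gt;

```python
from statistics import quantiles

# Illustrative aggregation of per-request data into the headline metrics a
# service mesh reports: request count, success rate, and latency percentiles.

def summarize(latencies_ms, successes, failures):
    """Reduce raw per-request records to request count, success rate,
    and p50/p95/p99 latency in milliseconds."""
    total = successes + failures
    cuts = quantiles(latencies_ms, n=100)  # 99 percentile cut points
    return {
        "requests": total,
        "success_rate": successes / total if total else 0.0,
        "p50_ms": cuts[49],
        "p95_ms": cuts[94],
        "p99_ms": cuts[98],
    }

# Example: 100 requests, one failure, latencies of 1..100 ms
report = summarize(list(range(1, 101)), successes=99, failures=1)
```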

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0s0uwzzm1hxk29jrwi2c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0s0uwzzm1hxk29jrwi2c.png" alt="Image description" width="720" height="372"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;**Companies using Linkerd:** [https://discovery.hgdata.com/product/linkerd](https://discovery.hgdata.com/product/linkerd)
**Lesson learned and reason to Linkerd.x2:** [https://www.infoq.com/articles/linkerd-v2-production-adoption/](https://www.infoq.com/articles/linkerd-v2-production-adoption/)
**Reasons why Byoyant choose Rust:** [https://linkerd.io/2020/07/23/under-the-hood-of-linkerds-state-of-the-art-rust-proxy-linkerd2-proxy/](https://linkerd.io/2020/07/23/under-the-hood-of-linkerds-state-of-the-art-rust-proxy-linkerd2-proxy/)
**Business Insight Service Mesh report:** [https://www.businessresearchinsights.com/market-reports/service-mesh-market-100139](https://www.businessresearchinsights.com/market-reports/service-mesh-market-100139)
**Proxy Injection:** [https://linkerd.io/2.16/features/proxy-injection/](https://linkerd.io/2.16/features/proxy-injection/)
**GitHub demo source-code:** [https://github.com/GTRekter/Vastaya](https://github.com/GTRekter/Vastaya)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Zero Downtime Deployments in Kubernetes with Linkerd</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 14 Nov 2024 12:57:33 +0000</pubDate>
      <link>https://dev.to/gtrekter/zero-downtime-deployments-in-kubernetes-with-linkerd-3eo3</link>
      <guid>https://dev.to/gtrekter/zero-downtime-deployments-in-kubernetes-with-linkerd-3eo3</guid>
      <description>&lt;p&gt;Releasing applications into production always comes with a sense of nervousness, no matter how stable the application and automation have been in prior environments. The fear of causing unexpected disruptions for critical clients — and potentially driving them away — is a significant risk. As a result, many businesses still rely on manual, after-hours deployments, following long instruction pages that detail every manual step. While this may minimize the immediate business impact, it remains vulnerable to human error. Engineers can get distracted or miss a crucial step, and even a minor oversight can lead to significant issues — something automated procedures are designed to prevent.&lt;/p&gt;

&lt;p&gt;On the other hand, some companies fully embrace the “fail-fast” approach. For example, Netflix runs its “Simian Army” in production environments to ensure everything is functioning as expected, with their little monkeys trying to break things. However, reaching this level of confidence requires organizational maturity, and it takes time to get there.&lt;/p&gt;

&lt;p&gt;Modern production deployment strategies are evolving to address these challenges through automation, continuous delivery, and the use of advanced tools that implement deployment techniques such as blue-green deployments, canary releases, and progressive rollouts. These strategies not only reduce downtime but also ensure smoother, more reliable transitions for production workloads. Convincing management to adopt these approaches may take time, but a proof of concept (POC) and an adoption plan can help your organization achieve this while saving both engineers and management from sleepless, stressful nights.&lt;/p&gt;

&lt;p&gt;In this article, I will explain and demonstrate how to implement modern deployment strategies like canary deployment, A/B testing, and blue-green deployment in Kubernetes environments using Linkerd.&lt;/p&gt;

&lt;h2&gt;
  
  
  Traffic Management and Linkerd
&lt;/h2&gt;

&lt;p&gt;Kubernetes natively supports traffic management features like timeouts, retries, and mirroring through the Gateway API’s HTTPRoute resource. This resource defines rules and matching conditions to determine which backend services should handle incoming traffic. By using the weight field, you can specify the proportion of requests sent to a particular backend, facilitating traffic splitting across different versions or environments.&lt;/p&gt;

&lt;p&gt;Before version 2.14, Linkerd users had to rely on a custom resource definition (CRD) downstream of &lt;code&gt;httproutes.gateway.networking.k8s.io&lt;/code&gt;, specifically &lt;code&gt;httproutes.policy.linkerd.io&lt;/code&gt;, to instruct the Linkerd proxy on how to route requests. Starting with version 2.14, Linkerd extended its support to the native &lt;code&gt;httproutes.gateway.networking.k8s.io&lt;/code&gt;. This means that regardless of which resource you use, the Linkerd proxy will route traffic based on either the Gateway API's or Linkerd's policy HTTPRoute resource. This functionality also applies to gRPC requests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; By default, during installation, Linkerd attempts to install the Gateway API CRDs. However, if they are already present in the cluster, you can instruct Linkerd to skip this step by setting &lt;code&gt;enableHttpRoutes&lt;/code&gt; to &lt;code&gt;false&lt;/code&gt; in the Helm chart or CLI when installing Linkerd CRDs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get crds | grep gateway
grpcroutes.gateway.networking.k8s.io       2024-09-25T01:01:18Z
httproutes.gateway.networking.k8s.io       2024-09-25T01:01:18Z
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this demonstration, I’ll use NGINX as the Ingress controller. By default, the NGINX Ingress controller retrieves the Endpoint resources for services specified in the Ingress and forwards traffic directly to the IP addresses of the pods. However, this behavior doesn’t align with HTTPRoute policy, which applies to traffic routed through the service itself. To solve this issue, we need to configure the NGINX Ingress controller to forward traffic to the service rather than directly to the pod endpoints. This can be achieved by adding the annotation &lt;code&gt;nginx.ingress.kubernetes.io/service-upstream: "true"&lt;/code&gt; to the Ingress resource.&lt;/p&gt;

&lt;p&gt;Additionally, since it is the Linkerd proxy that handles the redirection of traffic to the backend service, we need to inject it into the Ingress controller pod.&lt;/p&gt;

&lt;p&gt;The overall traffic flow is the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    The user sends a request to the application.&lt;/li&gt;
&lt;li&gt;    The inbound traffic is intercepted by the Linkerd proxy running in the Ingress controller pod and then forwarded to the NGINX Ingress controller for processing.&lt;/li&gt;
&lt;li&gt;    Due to the annotation &lt;code&gt;nginx.ingress.kubernetes.io/service-upstream: "true"&lt;/code&gt;, the Ingress controller forwards the traffic to the service defined in the upstream configuration located at &lt;code&gt;/etc/nginx/nginx.conf&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;    The outbound traffic is intercepted again by the Linkerd proxy, which evaluates the destination based on its in-memory state, which includes discovery results, requests, and connections retrieved from the Linkerd destination service. Unused cached entries are evicted after a certain timeout period.&lt;/li&gt;
&lt;li&gt;    Once the target is determined, the proxy queries the Linkerd policy service for applicable routing policies and applies them as necessary.&lt;/li&gt;
&lt;li&gt;    Finally, the Linkerd proxy forwards the request to the backend defined by the policy — in this case, the canary version of the service.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9og800muwwd6jkdnbuv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe9og800muwwd6jkdnbuv.png" alt="Image description" width="720" height="393"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that we have an idea of what’s happening behind the scenes, let’s dig into the different types of deployment available and what should be expected in the future.&lt;/p&gt;

&lt;h2&gt;
  
  
  Canary Deployment
&lt;/h2&gt;

&lt;p&gt;This deployment strategy involves deploying a new version of the service (referred to as the “canary”) alongside the current stable version running in production. A percentage of traffic is then redirected to the canary. By doing this, the development team can quickly test the service with production traffic and identify any issues with a minimal “blast radius” (the number of users affected by the change). During this triage phase, the team also collects key metrics from the service. Based on these results, they can decide to gradually increase traffic to the new version (e.g., 25%, 75%, 100%) or, if necessary, abort the release.&lt;/p&gt;

&lt;p&gt;Below is an example of an &lt;code&gt;HTTPRoute&lt;/code&gt; configuration using Kubernetes Gateway API to implement a canary deployment where the traffic targeting the service &lt;code&gt;projects-vastaya-svc&lt;/code&gt; is split between two services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;code&gt;projects-vastaya-svc&lt;/code&gt;: Receives 10% of the traffic.&lt;/li&gt;
&lt;li&gt;    &lt;code&gt;projects-canary-vastaya-svc&lt;/code&gt;: Receives 90% of the traffic.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: project-vastaya-split
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
  - backendRefs:
    - name: projects-vastaya-svc
      port: 80
      weight: 10
    - name: projects-canary-vastaya-svc
      port: 80
      weight: 90
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
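&lt;p&gt;To see what those weights mean in practice, the following Python sketch simulates the Gateway API weight semantics: each request is routed to a backend with probability proportional to its weight, so over many requests the canary receives roughly 90% of the traffic. This is a simplified model of the proxy’s behavior, not its implementation.&lt;/p&gt;

```python
import random

# Simplified model of Gateway API weighted routing: each request lands on a
# backend with probability weight / total_weight. Backend names and weights
# match the HTTPRoute manifest above.

BACKENDS = [
    ("projects-vastaya-svc", 10),         # stable
    ("projects-canary-vastaya-svc", 90),  # canary
]

def pick_backend(rng):
    """Choose a backend for one request according to the configured weights."""
    names = [name for name, _ in BACKENDS]
    weights = [weight for _, weight in BACKENDS]
    return rng.choices(names, weights=weights, k=1)[0]

# Route 10,000 simulated requests and count where they land
rng = random.Random(42)
counts = {name: 0 for name, _ in BACKENDS}
for _ in range(10_000):
    counts[pick_backend(rng)] += 1
```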



&lt;p&gt;In the following image, you can see traffic being forwarded to both services by the Linkerd proxy. To visualize the inbound traffic to the services, I used the &lt;code&gt;viz&lt;/code&gt; extension in Linkerd and injected Linkerd into both the canary and stable deployments. This allowed me to observe the traffic distribution using the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd viz top deploy/projects-canary-vastaya-dplmt -n vastaya
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Injecting the Linkerd proxy into the destination pods is not required for traffic redirection, but I did it to collect detailed metrics on service performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcsn38s6abkbwfh5ohcvl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcsn38s6abkbwfh5ohcvl.png" alt="Image description" width="720" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Blue-green Deployment
&lt;/h2&gt;

&lt;p&gt;A Blue-Green deployment is similar to a canary deployment but takes a more drastic approach. Instead of gradually directing an incremental percentage of traffic to the new version, both the old (Blue) and new (Green) versions run in parallel. However, only one version is active and accessible to users at any given time.&lt;/p&gt;

&lt;p&gt;The key difference is that the new version (Green) remains inactive and hidden from users while you make any necessary adjustments to ensure it’s stable and reliable. Once you’re confident in the new version’s performance, you swap all traffic over to it in a single, coordinated switch. This approach minimizes downtime and allows for a quick rollback if issues are detected.&lt;/p&gt;

&lt;p&gt;In contrast to canary deployments — where users actively access both versions as traffic is incrementally shifted — the Blue-Green strategy keeps the new version isolated until it’s fully ready for production use.&lt;/p&gt;

&lt;p&gt;In our case, we’ll implement a Blue-Green deployment by changing the traffic weight from 0 to 1, directing all traffic to the new version. Here’s an example of the &lt;code&gt;HTTPRoute&lt;/code&gt; configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: project-vastaya-split
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
  - backendRefs:
    - name: projects-vastaya-svc
      port: 80
      weight: 0
    - name: projects-canary-vastaya-svc
      port: 80
      weight: 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  A/B Testing
&lt;/h2&gt;

&lt;p&gt;A/B testing is a method of experimentation that involves running two versions of the same environment to collect metrics like conversion rates, performance, and user engagement. Similar to canary deployments, it allows you to compare different versions of a service, but with a focus on gathering specific data from targeted user groups.&lt;/p&gt;

&lt;p&gt;In A/B testing, the second version (the “B” version) targets one or more groups of users defined by predetermined criteria such as location, device type, user behavior, or other factors. This method is widely used in user experience (UX) design. For example, you might notice something appearing on your Netflix dashboard that your friend doesn’t see, or subtle changes in the application’s interface.&lt;/p&gt;

&lt;p&gt;In our case, we can achieve this by adding additional filters to our &lt;code&gt;HTTPRoute&lt;/code&gt;. In the following configuration we will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    Use the matches section to identify requests coming from users who have their locale set to US English (&lt;code&gt;Accept-Language: en-US.*&lt;/code&gt;) and are using Firefox as their web browser (&lt;code&gt;User-Agent: .*Firefox.*&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;    For these users, traffic is split between the stable service (&lt;code&gt;projects-vastaya-svc&lt;/code&gt;), which receives 10% of the traffic, and the canary service (&lt;code&gt;projects-canary-vastaya-svc&lt;/code&gt;), which receives 90%.&lt;/li&gt;
&lt;li&gt;    For all other users, traffic is directed entirely to the stable service.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: project-vastaya-traffic-split
  namespace: vastaya
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
    - matches:
      - headers:
        - name: "User-Agent"
          type: RegularExpression
          value: ".*Firefox.*"
        - name: Accept-Language
          type: RegularExpression
          value: "en-US.*" 
      backendRefs:
        - name: projects-vastaya-svc
          port: 80
          weight: 10
        - name: projects-canary-vastaya-svc
          port: 80
          weight: 90
    - backendRefs:
        - name: projects-vastaya-svc
          port: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By implementing this configuration, you can conduct A/B testing by routing 90% of the targeted users to the canary version while everyone else continues to use the stable version. This allows you to collect specific metrics and assess the performance of the new version among a defined user segment.&lt;/p&gt;
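&lt;p&gt;The header-matching logic can be sketched in a few lines of Python. A rule matches only when every listed header matcher matches; the sketch below assumes full-string regular-expression semantics (which is why the manifest uses patterns like &lt;code&gt;.*Firefox.*&lt;/code&gt;), since the exact regex dialect is implementation-defined in the Gateway API.&lt;/p&gt;

```python
import re

# Sketch of HTTPRoute header matching with type: RegularExpression.
# A request matches the A/B rule only if every matcher matches; otherwise it
# falls through to the catch-all rule that targets the stable service.

MATCHERS = {
    "User-Agent": r".*Firefox.*",
    "Accept-Language": r"en-US.*",
}

def matches_ab_rule(headers):
    """Return True if the request headers satisfy every matcher."""
    return all(
        re.fullmatch(pattern, headers.get(name, ""))
        for name, pattern in MATCHERS.items()
    )
```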

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgvalnypvwgtpcwzgywz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgvalnypvwgtpcwzgywz.png" alt="Image description" width="720" height="385"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Shadow Deployment (Mirrored Deployment)
&lt;/h2&gt;

&lt;p&gt;In shadow deployment, also known as mirrored deployment, a new version of a service runs in the background and receives a copy of real-world traffic. Users are not impacted because only the response from the main (stable) service is considered; responses from the new version are ignored. This method allows the development team to test the new service against production traffic to observe how it behaves under real-world conditions without affecting users.&lt;/p&gt;
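&lt;p&gt;Conceptually, mirroring boils down to the following Python sketch (an illustration of the semantics, not how the proxy is written): the mirror gets a copy of each request, but only the primary backend’s response ever reaches the caller, and mirror failures are ignored.&lt;/p&gt;

```python
# Illustration of request-mirroring semantics: the caller only ever sees the
# primary backend's response; the mirror receives a copy whose response (or
# failure) is discarded.

def handle(request, primary, mirror):
    """Route one request: mirror a copy, return only the primary response."""
    try:
        mirror(request)      # fire-and-forget copy of the request
    except Exception:
        pass                 # a broken mirror must never affect the caller
    return primary(request)  # only this response is returned to the user

# Example: the mirrored canary crashes, yet the caller still gets the
# stable backend's answer (backend names here are placeholders).
seen_by_mirror = []

def stable(req):
    return "200 OK"

def broken_canary(req):
    seen_by_mirror.append(req)
    raise RuntimeError("canary crashed")

result = handle("GET /projects", stable, broken_canary)
```

&lt;p&gt;In the real feature the mirrored call is made asynchronously; the point of the sketch is only that the mirrored backend cannot change what the user receives.&lt;/p&gt;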

&lt;p&gt;As of now, this feature is not fully supported by Linkerd, but the development team is actively working on it. You can track the progress through this GitHub issue: &lt;a href="https://github.com/linkerd/linkerd2/issues/11027" rel="noopener noreferrer"&gt;Linkerd Issue #11027&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Once this feature becomes available, you’ll be able to apply the following configuration without setting up a gateway, and the Linkerd proxy will handle the rest. The traffic sent to the service &lt;code&gt;projects-vastaya-svc&lt;/code&gt; will be mirrored to &lt;code&gt;projects-canary-vastaya-svc&lt;/code&gt;, but only the response from &lt;code&gt;projects-vastaya-svc&lt;/code&gt; will be considered by the users.&lt;/p&gt;

&lt;p&gt;Here’s an example of the &lt;code&gt;HTTPRoute&lt;/code&gt; configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: project-vastaya-traffic-split
  namespace: vastaya
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
  - backendRefs:
    - name: projects-vastaya-svc
      port: 80
      weight: 0
    filters:
    - type: RequestMirror
      requestMirror:
        backendRef:
          name: projects-canary-vastaya-svc
          port: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;Netflix and Canary Deployments:&lt;/strong&gt; &lt;a href="https://netflixtechblog.com/automated-canary-analysis-at-netflix-with-kayenta-3260bc7acc69" rel="noopener noreferrer"&gt;https://netflixtechblog.com/automated-canary-analysis-at-netflix-with-kayenta-3260bc7acc69&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Linkerd 2.14 Release Notes:&lt;/strong&gt; &lt;a href="https://github.com/linkerd/linkerd2/releases/tag/stable-2.14.0" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2/releases/tag/stable-2.14.0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Ingress configuration with Linkerd:&lt;/strong&gt; &lt;a href="https://linkerd.io/2.16/tasks/using-ingress/#nginx-community-version" rel="noopener noreferrer"&gt;https://linkerd.io/2.16/tasks/using-ingress/#nginx-community-version&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Gateway API Traffic Splitting:&lt;/strong&gt; &lt;a href="https://gateway-api.sigs.k8s.io/guides/traffic-splitting/" rel="noopener noreferrer"&gt;https://gateway-api.sigs.k8s.io/guides/traffic-splitting/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Netflix A/B Testing:&lt;/strong&gt; &lt;a href="https://netflixtechblog.com/its-all-a-bout-testing-the-netflix-experimentation-platform-4e1ca458c15" rel="noopener noreferrer"&gt;https://netflixtechblog.com/its-all-a-bout-testing-the-netflix-experimentation-platform-4e1ca458c15&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Proxy Discovery Cache:&lt;/strong&gt; &lt;a href="https://linkerd.io/2.16/tasks/configuring-proxy-discovery-cache/" rel="noopener noreferrer"&gt;https://linkerd.io/2.16/tasks/configuring-proxy-discovery-cache/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Zero Downtime Deployments in Kubernetes with Linkerd</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 14 Nov 2024 12:50:41 +0000</pubDate>
      <link>https://dev.to/gtrekter/ringkeodeu-hwalyong-kubeonetiseuyi-mujungdan-baepo-5n0</link>
      <guid>https://dev.to/gtrekter/ringkeodeu-hwalyong-kubeonetiseuyi-mujungdan-baepo-5n0</guid>
      <description>&lt;p&gt;애플리케이션 제작을 시작하면 이전 환경에서의 애플리케이션과 자동화의 안정성과 상관없이 초조하기 마련이다. 예상치 못하게 주요 고객을 혼란하게 하는 것– 주요 고객을 떠나가게 할 수 있다는 것 -은 매우 위험하다. 따라서 많은 기업들은 여전히 정규 시간외 수동 배포를 하고 있으며, 모든 수동 단계에 대한 자세한 설명서를 따른다. 이렇게 하면 직접적인 비즈니스 영향을 최소화할 수 있는 반면 인적 오류에 취약하다. 엔지니어는 산만해지거나 주요 단계를 놓칠 수 있고, 작은 실수가 중요한 문제-자동화가 방지하고자 하는 것-를 야기할 수 있다.&lt;/p&gt;

&lt;p&gt;반면에 페일페스트(fail-fast) 접근 방법을 적극적으로 수용하는 회사도 있다. 예를 들면, 넷플릭스는 제작 환경에서 시미언 아미(Simian Army)를 실행하여 모든 것이 예상대로 작동하도록 하고, 수작업자는 단계를 나누게 된다. 단, 이러한 신뢰 수준에 도달하려면 조직 성숙도와 시간이 필요하다.&lt;/p&gt;

&lt;p&gt;최신 제작 배포 전략은 자동화와 지속적 인도 이외에 블루그린 배포, 카나리 배포 그리고 점진적 롤아웃 등의 배포 기법을 도입한 최신 도구를 사용하여 이러한 문제를 다룬다. 이러한 전략은 중단 시간을 줄이고, 제작 부하를 한층 부드럽고 확실히 이동시켜 준다. 관리자가 이러한 접근 방법을 채택하도록 설득하려면 시간이 걸리지만 개념 증명(POC)과 채택 계획을 통하여 해당 조직이 이러한 목표를 달성하고, 엔지니어와 관리자를 안심시킬 수 있다.&lt;/p&gt;

&lt;p&gt;이 논문을 통하여 링커드를 활용하는 쿠버네티스 환경에서 카나리 출시, A/B 시험 그리고 블루그린 배포와 같은 최신 배포 전략을 실행하는 방법을 설명하고 논증한다.&lt;/p&gt;

&lt;h2&gt;
  
  
  Traffic Management with Linkerd
&lt;/h2&gt;

&lt;p&gt;Out of the box, Kubernetes supports traffic-management features such as timeouts, retries, and mirroring through the Gateway API's HTTPRoute resource. This resource defines rules and matching conditions that determine which backend service handles incoming traffic. By using the weight field to specify the fraction of requests sent to a particular backend, traffic can be split across multiple versions or environments.&lt;/p&gt;

&lt;p&gt;Before version 2.14, Linkerd users had to use a custom resource definition (CRD) downstream of &lt;code&gt;httproutes.gateway.networking.k8s.io&lt;/code&gt; (specifically, &lt;code&gt;httproutes.policy.linkerd.io&lt;/code&gt;) to tell Linkerd how to route requests. Starting with version 2.14, Linkerd extended its support to the native &lt;code&gt;httproutes.gateway.networking.k8s.io&lt;/code&gt; resource. Regardless of which resource you use, the Linkerd proxy routes traffic based on either the Gateway API or the Linkerd policy HTTPRoute resource. This functionality also applies to gRPC requests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; By default, Linkerd attempts to install the Gateway API CRDs during installation. However, if those CRDs already exist in the cluster, you can instruct Linkerd to skip this step by setting &lt;code&gt;enableHttpRoutes&lt;/code&gt; to &lt;code&gt;false&lt;/code&gt; in the Helm chart or CLI when installing the Linkerd CRDs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get crds | grep gateway
grpcroutes.gateway.networking.k8s.io       2024-09-25T01:01:18Z
httproutes.gateway.networking.k8s.io       2024-09-25T01:01:18Z
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this demonstration, NGINX is used as the ingress controller. By default, the NGINX ingress controller looks up the Endpoints resources of the services specified in the Ingress and sends traffic directly to the pod IP addresses. However, this behavior bypasses the HTTPRoute policies, which apply to traffic routed through the service itself. To solve this, the NGINX ingress controller must be configured to send traffic to the service instead of directly to the pod endpoints, which is done by adding the &lt;code&gt;nginx.ingress.kubernetes.io/service-upstream: "true"&lt;/code&gt; annotation to the Ingress resource.&lt;/p&gt;
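&lt;p&gt;As a sketch, the annotation sits on the Ingress resource itself. The ingress name, path rule, and ingress class below are illustrative assumptions; the service name and namespace come from the demo:&lt;/p&gt;

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: projects-vastaya-ingress   # illustrative name
  namespace: vastaya
  annotations:
    # Route to the Service (so HTTPRoute policies apply) instead of pod endpoints
    nginx.ingress.kubernetes.io/service-upstream: "true"
spec:
  ingressClassName: nginx
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: projects-vastaya-svc
                port:
                  number: 80
```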

&lt;p&gt;In addition, since the Linkerd proxy is what redirects traffic to the backend services, it must be injected into the ingress controller pod.&lt;/p&gt;

&lt;p&gt;The overall traffic flow is as follows.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;    A user sends a request to the application.&lt;/li&gt;
&lt;li&gt;    The Linkerd proxy running in the ingress controller pod intercepts the inbound traffic and forwards it to the NGINX ingress controller for processing.&lt;/li&gt;
&lt;li&gt;    Thanks to the &lt;code&gt;nginx.ingress.kubernetes.io/service-upstream: "true"&lt;/code&gt; annotation, the ingress controller sends the traffic to the service defined in the upstream configuration located in &lt;code&gt;/etc/nginx/nginx.conf&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;    The Linkerd proxy intercepts the outbound traffic and evaluates the destination based on its in-memory state, including discovery results, requests, and connections retrieved from the Linkerd destination service. Unused cached entries are evicted after a timeout period.&lt;/li&gt;
&lt;li&gt;    Once the target is determined, the proxy queries the Linkerd policy service for the applicable routing policies and applies them if needed.&lt;/li&gt;
&lt;li&gt;    Finally, the Linkerd proxy sends the request to the backend defined by the policy (in this case, the canary version of the service).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfqpxkljr8215oshg9pf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frfqpxkljr8215oshg9pf.png" alt="Image description" width="720" height="397"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now that we know what happens behind the scenes, let's look at the available deployment methods and what to expect from each.&lt;/p&gt;

&lt;h2&gt;
  
  
  Canary Deployments
&lt;/h2&gt;

&lt;p&gt;This deployment strategy involves deploying a new version of a service (the "canary") alongside the existing stable version running in production, and redirecting a portion of the traffic to the canary. This lets the development team quickly test the service with production traffic and catch issues with a minimal "blast radius" (the number of users affected by the change). During this vetting phase, the team also collects key metrics from the service. Based on the results, they can gradually shift more traffic to the new version (e.g., 25%, 75%, 100%) or decide to halt the rollout.&lt;/p&gt;

&lt;p&gt;The following is an example of an &lt;code&gt;HTTPRoute&lt;/code&gt; configuration that uses the Kubernetes Gateway API to implement a canary deployment, splitting the traffic targeting the &lt;code&gt;projects-vastaya-svc&lt;/code&gt; service across two services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;projects-vastaya-svc:&lt;/strong&gt; receives 10% of the traffic.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;projects-canary-vastaya-svc:&lt;/strong&gt; receives 90% of the traffic.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: project-vastaya-split
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
  - backendRefs:
    - name: projects-vastaya-svc
      port: 80
      weight: 10
    - name: projects-canary-vastaya-svc
      port: 80
      weight: 90
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
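&lt;p&gt;To build intuition for the weight semantics in the configuration above, here is a minimal Python sketch (not Linkerd code) that routes each request independently with the same 10/90 weights:&lt;/p&gt;

```python
import random

# Weighted backends, mirroring the HTTPRoute backendRefs above
backends = {"projects-vastaya-svc": 10, "projects-canary-vastaya-svc": 90}

def pick_backend(rng):
    # Each request is routed independently, proportionally to its weight
    names = list(backends)
    weights = list(backends.values())
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(42)  # seeded for reproducibility
counts = {name: 0 for name in backends}
for _ in range(10_000):
    counts[pick_backend(rng)] += 1

for name, count in counts.items():
    print(f"{name}: {count / 100:.1f}% of requests")
```

Over many requests the observed split converges on the configured weights, which is exactly what the viz extension shows below.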



&lt;p&gt;In the following image, you can see the Linkerd proxy sending traffic to both services. To visualize the inbound traffic to the services, I used Linkerd's viz extension and injected Linkerd into both the canary and stable deployments. This made it possible to observe the traffic distribution with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd viz top deploy/projects-canary-vastaya-dplmt -n vastaya
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Injecting the Linkerd proxy into the destination pods is not required for traffic redirection, but I injected them anyway to collect detailed metrics about the services' performance.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2l4pr6azb6pk568t9ox0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2l4pr6azb6pk568t9ox0.png" alt="Image description" width="700" height="374"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Blue-Green Deployments
&lt;/h2&gt;

&lt;p&gt;The blue-green approach is similar to a canary deployment but is more drastic. Instead of gradually shifting incremental portions of traffic, both the old (blue) and new (green) versions run in parallel, but only one version is active and reachable by users at any given time.&lt;/p&gt;

&lt;p&gt;The key difference is that the new version (green) stays inactive and hidden from users while the necessary adjustments are made, ensuring stability and reliability. Once you are confident in the new version's performance, all traffic is switched over in a single cutover. This approach minimizes downtime and allows a quick rollback if problems are discovered.&lt;/p&gt;

&lt;p&gt;In contrast to canary deployments, where users actively reach both versions while traffic shifts gradually, the blue-green strategy keeps the new version out of service until it is fully ready for production.&lt;/p&gt;

&lt;p&gt;In this case, we implement a blue-green deployment by flipping the traffic weights between 0 and 1, sending all traffic to the new version. The following is an example of the &lt;code&gt;HTTPRoute&lt;/code&gt; configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: project-vastaya-split
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
  - backendRefs:
    - name: projects-vastaya-svc
      port: 80
      weight: 0
    - name: projects-canary-vastaya-svc
      port: 80
      weight: 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  A/B Testing
&lt;/h2&gt;

&lt;p&gt;A/B testing is an experimentation method in which two versions run in the same environment in order to collect metrics such as conversion rate, performance, and user engagement. Like a canary deployment, it lets you compare different versions of a service, but the focus is on collecting specific data from a targeted group of users.&lt;/p&gt;

&lt;p&gt;In A/B testing, the second version (the "B" version) targets one or more user groups defined by predetermined criteria such as location, device type, user behavior, or other factors. This method is widely used in user experience (UX) design; for example, you might notice subtle changes in the Netflix interface that are visible to you but not to your friends.&lt;/p&gt;

&lt;p&gt;In this case, this can be achieved by adding additional filters to the &lt;code&gt;HTTPRoute&lt;/code&gt;. The following configuration does the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    Uses the &lt;code&gt;matches&lt;/code&gt; section to identify requests from users whose locale is American English (&lt;code&gt;Accept-Language: en-US.*&lt;/code&gt;) and whose browser is Firefox (&lt;code&gt;User-Agent: .*Firefox.*&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;    For these users, traffic is split between the stable service (&lt;code&gt;projects-vastaya-svc&lt;/code&gt;), which receives 10%, and the canary service (&lt;code&gt;projects-canary-vastaya-svc&lt;/code&gt;), which receives 90%.&lt;/li&gt;
&lt;li&gt;    For all other users, traffic is sent entirely to the stable service.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: project-vastaya-traffic-split
  namespace: vastaya
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
    - matches:
      - headers:
        - name: "User-Agent"
          type: RegularExpression
          value: ".*Firefox.*"
        - name: Accept-Language
          type: RegularExpression
          value: "en-US.*" 
      backendRefs:
        - name: projects-vastaya-svc
          port: 80
          weight: 10
        - name: projects-canary-vastaya-svc
          port: 80
          weight: 90
    - backendRefs:
        - name: projects-vastaya-svc
          port: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By applying this configuration, we can run an A/B test that routes 90% of the targeted users to the canary version while the remaining users continue to use the stable service. This lets us collect specific metrics and evaluate the new version's performance for the designated user segment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdlfrpdx64sr9kghg43v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvdlfrpdx64sr9kghg43v.png" alt="Image description" width="700" height="374"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Shadow Deployments (Mirrored Deployments)
&lt;/h2&gt;

&lt;p&gt;In a shadow deployment, also known as a mirrored deployment, the new version of a service runs in the background and receives a copy of the real traffic. Users are not affected, because only the responses from the primary (stable) service are considered; the new version's responses are discarded. This lets the development team test the new service against production traffic and observe how it behaves under real-world conditions without any user impact.&lt;/p&gt;

&lt;p&gt;Linkerd does not fully support this feature yet, but the team is actively working on it. You can track the progress in the related GitHub issue (Linkerd issue &lt;a href="https://github.com/linkerd/linkerd2/issues/11027" rel="noopener noreferrer"&gt;#11027&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Once this feature is available, you will be able to apply the following configuration without setting up a gateway, and the Linkerd proxy will take care of the rest. Traffic sent to the &lt;code&gt;projects-vastaya-svc&lt;/code&gt; service is mirrored to &lt;code&gt;projects-canary-vastaya-svc&lt;/code&gt;, but users only receive the responses from &lt;code&gt;projects-vastaya-svc&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The following is an example of the &lt;code&gt;HTTPRoute&lt;/code&gt; configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: project-vastaya-traffic-split
  namespace: vastaya
spec:
  parentRefs:
    - name: projects-vastaya-svc
      group: core
      kind: Service
      namespace: vastaya
      port: 80
  rules:
  - backendRefs:
    - name: projects-vastaya-svc
      port: 80
      weight: 0
    filters:
    - type: RequestMirror
      requestMirror:
        backendRef:
          name: projects-canary-vastaya-svc
          port: 80
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;Netflix and Canary Deployments:&lt;/strong&gt; &lt;a href="https://netflixtechblog.com/automated-canary-analysis-at-netflix-with-kayenta-3260bc7acc69" rel="noopener noreferrer"&gt;https://netflixtechblog.com/automated-canary-analysis-at-netflix-with-kayenta-3260bc7acc69&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Linkerd 2.14 Release Notes:&lt;/strong&gt; &lt;a href="https://github.com/linkerd/linkerd2/releases/tag/stable-2.14.0" rel="noopener noreferrer"&gt;https://github.com/linkerd/linkerd2/releases/tag/stable-2.14.0&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Configuring Ingress with Linkerd:&lt;/strong&gt; &lt;a href="https://linkerd.io/2.16/tasks/using-ingress/#nginx-community-version" rel="noopener noreferrer"&gt;https://linkerd.io/2.16/tasks/using-ingress/#nginx-community-version&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Gateway API Traffic Splitting:&lt;/strong&gt; &lt;a href="https://gateway-api.sigs.k8s.io/guides/traffic-splitting/" rel="noopener noreferrer"&gt;https://gateway-api.sigs.k8s.io/guides/traffic-splitting/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Netflix A/B Testing:&lt;/strong&gt; &lt;a href="https://netflixtechblog.com/its-all-a-bout-testing-the-netflix-experimentation-platform-4e1ca458c15" rel="noopener noreferrer"&gt;https://netflixtechblog.com/its-all-a-bout-testing-the-netflix-experimentation-platform-4e1ca458c15&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Proxy Discovery Cache:&lt;/strong&gt; &lt;a href="https://linkerd.io/2.16/tasks/configuring-proxy-discovery-cache/" rel="noopener noreferrer"&gt;https://linkerd.io/2.16/tasks/configuring-proxy-discovery-cache/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Optimize Resilience and Reduce Cross-Zone Expenses Using HAZL</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 14 Nov 2024 12:43:34 +0000</pubDate>
      <link>https://dev.to/gtrekter/optimize-resilience-and-reduce-cross-zone-expenses-using-hazl-3f5m</link>
      <guid>https://dev.to/gtrekter/optimize-resilience-and-reduce-cross-zone-expenses-using-hazl-3f5m</guid>
      <description>&lt;p&gt;Unplanned downtime, whether it’s caused by hardware failures, glitches, or cyberattacks, is every organization’s worst nightmare, no matter its size and sectors. Its can not only cause a lost revenue, but also drops in stock value, hit to customer satisfaction, trust and damage to the company’s reputation. According to a Oxford Economics survey, the downtime costs for Global 2000 companies is estimated around $400B annually, which means $200M per company per year, with an average of of $9,000 or $540,000 per hour.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0pqwr0eqlge03wzw72sn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0pqwr0eqlge03wzw72sn.png" alt="Image description" width="720" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Outages are also more common than you might think. A 2022 survey by Acronis showed that 76% of companies experienced downtime. And let’s not forget Meta’s massive 2024 outage, which cost an estimated $100 million in lost revenue, or the $34 million in sales Amazon missed during its 2021 outage. This proves that even with resiliency strategies in place, and thousands of engineers dedicated to avoiding downtime, it’s still a problem that needs to be properly addressed.&lt;/p&gt;

&lt;p&gt;To provide resiliency to their customers, and help them reduce the risk of unplanned downtime, all major cloud providers, from Azure to Tencent Cloud, spread their databases, Kubernetes clusters, and other resources across different data centers (or zones) within a region. These zones have independent power, cooling, and networking, so if one zone goes down, the application can keep running in the other zones.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqdt3fnw0wtowr6x4ae4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqdt3fnw0wtowr6x4ae4.png" alt="Image description" width="720" height="652"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Sounds perfect, right? Well, it’s close, but there’s a catch: some cloud service providers, like AWS and GCP (but not Azure), charge additional data transfer costs for cross-availability-zone communication.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Are Data Transfer Costs for Cross-Availability Zone Communication?
&lt;/h2&gt;

&lt;p&gt;As the name suggests, these costs come from data moving between resources in different availability zones and are usually billed per gigabyte ($/GB). While that might not seem like much at first glance, let’s look at an example based on real-life traffic.&lt;/p&gt;

&lt;p&gt;In October 2023, Coupang.com, one of the main e-commerce platforms in South Korea, had around 127.6 million monthly visits. With an average page size of 4.97 MB and about 12 pages visited per session, the monthly traffic easily reaches several petabytes of data. Even if only half of this traffic involves cross-zone communication, the cost of transferring data between availability zones can quickly reach tens of thousands of dollars in a single month.&lt;/p&gt;
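&lt;p&gt;To make that rough math concrete, here is a back-of-the-envelope sketch in Python. The per-GB price is an assumption (providers commonly charge on the order of $0.01/GB in each direction, so $0.02/GB combined); the traffic figures are the estimates above.&lt;/p&gt;

```python
# Back-of-the-envelope cross-zone transfer cost estimate.
# The $0.02/GB combined (in + out) rate is an assumption, not a quoted price.
monthly_visits = 127.6e6
pages_per_visit = 12
mb_per_page = 4.97
cross_zone_fraction = 0.5   # assume half the traffic crosses zones
price_per_gb = 0.02         # assumed combined ingress + egress $/GB

total_gb = monthly_visits * pages_per_visit * mb_per_page / 1000
cross_zone_gb = total_gb * cross_zone_fraction
monthly_cost = cross_zone_gb * price_per_gb

print(f"Total traffic: {total_gb / 1e6:.1f} PB/month")
print(f"Estimated cross-zone cost: ${monthly_cost:,.0f}/month")
```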

&lt;h2&gt;
  
  
  Kubernetes’ Native Topology Aware Routing (aka Topology Aware Hints)
&lt;/h2&gt;

&lt;p&gt;Starting in version 1.21, Kubernetes introduced Topology Aware Hints to minimize cross-zone traffic within clusters. This routing strategy is built on EndpointSlices, which were first introduced in version 1.17 to improve the scalability of the traditional Endpoints resource. When a new Service is created, Kubernetes automatically generates EndpointSlices, breaking the network endpoints down into manageable chunks. This reduces the overhead on kube-proxy and overcomes the size limitation on objects stored in etcd (max 1.5 MB).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7dv09p85fwbp5wzasu2w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7dv09p85fwbp5wzasu2w.png" alt="Image description" width="720" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;EndpointSlices&lt;/code&gt; don’t just improve performance; they also carry metadata such as zone information. This metadata is critical for &lt;strong&gt;Topology-Aware Routing&lt;/strong&gt; because it allows Kubernetes to make routing decisions based on the topology of the cluster. The mechanism is enabled by the &lt;code&gt;service.kubernetes.io/topology-mode&lt;/code&gt; annotation on a Service, which instructs &lt;code&gt;kube-proxy&lt;/code&gt; to filter the available endpoints according to the topology hints provided by the EndpointSlice controller.&lt;/p&gt;
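&lt;p&gt;As a sketch, enabling it is a single annotation on the Service. The service name and port are borrowed from the manifests in this demo; the selector is illustrative:&lt;/p&gt;

```yaml
apiVersion: v1
kind: Service
metadata:
  name: tasks-vastaya-svc
  namespace: vastaya
  annotations:
    # Ask kube-proxy to prefer endpoints in the client's zone,
    # based on hints written by the EndpointSlice controller
    service.kubernetes.io/topology-mode: Auto
spec:
  selector:
    app: tasks-vastaya   # illustrative selector
  ports:
    - name: http
      port: 80
      protocol: TCP
```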

&lt;p&gt;In the following example, traffic can be routed based on the zone metadata (&lt;code&gt;koreacentral-1&lt;/code&gt;, &lt;code&gt;koreacentral-2&lt;/code&gt;, &lt;code&gt;koreacentral-3&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F98pxvgk8ado4eyzkkqst.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F98pxvgk8ado4eyzkkqst.png" alt="Image description" width="720" height="443"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The related EndpointSlice manifest will be the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  ...
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Service
    name: tasks-vastaya-svc
addressType: IPv4
ports:
- name: http
  port: 80
  protocol: TCP
endpoints:
- addresses:
  - 10.244.3.74
  conditions:
    ready: true
    serving: true
    terminating: false
  nodeName: aks-hazlpoolha-33634351-vmss000000
  targetRef:
    kind: Pod
    name: tasks-vastaya-dplmt-68cd4dd76c-rblxz
    namespace: vastaya
    uid: 8fddbf95-dac8-420c-b0ab-d5076f9f27e9
  zone: koreacentral-1
- addresses:
  - 10.244.2.181
  conditions:
    ready: true
    serving: true
    terminating: false
  nodeName: aks-hazlpoolha-33634351-vmss000001
  targetRef:
    kind: Pod
    name: tasks-vastaya-dplmt-68cd4dd76c-cwshq
    namespace: vastaya
    uid: 8c82addd-1123-4810-ad21-0533e8cd15ee
  zone: koreacentral-2
- addresses:
  - 10.244.1.108
  - 10.244.1.110
  conditions:
    ready: true
    serving: true
    terminating: false
  nodeName: aks-hazlpoolha-33634351-vmss000002
  targetRef:
    kind: Pod
    name: tasks-vastaya-dplmt-68cd4dd76c-dwxg2
    namespace: vastaya
    uid: b5128ae8-6615-41e6-97ec-8db9b81b588e
  zone: koreacentral-3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, while Topology-Aware Routing helps reduce inter-zone traffic, it has some inherent limitations. Endpoint allocation is relatively static, meaning it doesn’t adapt to real-time conditions like traffic load, network latency, or service health beyond basic readiness and liveness probes. This can lead to imbalanced resource utilization, especially in dynamic environments where local endpoints are overwhelmed while remote ones remain underutilized.&lt;/p&gt;

&lt;p&gt;This is where High Availability Zone-aware Load Balancing (HAZL) comes into play.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is HAZL?
&lt;/h2&gt;

&lt;p&gt;High Availability Zone-aware Load Balancing (HAZL) is a load balancer that leverages Topology-aware Routing, as well as the HTTP and gRPC traffic intercepted by the sidecar proxies running in meshed pods, to load balance each request independently, routing to the best available backend based on current conditions. It operates at the request level, unlike traditional connection-level load balancing, where all requests on a connection are sent to the same backend.&lt;/p&gt;

&lt;p&gt;It also monitors the number of in-flight requests (requests waiting for resources or connections) to the services, referred to as “load,” and handles traffic between zones on a per-request basis. If load or latency spikes, a sign that the system is under stress or unhealthy, HAZL adds additional endpoints from other zones. Conversely, when the load decreases, HAZL removes those extra endpoints.&lt;/p&gt;
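&lt;p&gt;That behavior can be sketched as a simple policy loop. This is purely illustrative pseudologic, not HAZL’s actual implementation; the thresholds and function are invented for the example, and the IPs are borrowed from the EndpointSlice manifest above:&lt;/p&gt;

```python
# Illustrative sketch of a HAZL-style policy: add cross-zone endpoints when
# local in-flight load spikes, and remove them again when load drops.
HIGH_LOAD = 50   # in-flight requests above which we spill over (made up)
LOW_LOAD = 10    # in-flight requests below which we shrink back (made up)

def select_endpoints(local_endpoints, remote_endpoints, in_flight, using_remote):
    """Return (endpoints, using_remote) for the next batch of requests."""
    if in_flight > HIGH_LOAD:
        # Local zone is under stress: temporarily include other zones
        return local_endpoints + remote_endpoints, True
    if using_remote and in_flight > LOW_LOAD:
        # Still draining: keep the extra endpoints for now
        return local_endpoints + remote_endpoints, True
    # Healthy again: keep traffic in-zone to avoid transfer costs
    return local_endpoints, False

local = ["10.244.1.108"]
remote = ["10.244.2.181", "10.244.3.74"]
eps, spill = select_endpoints(local, remote, in_flight=80, using_remote=False)
print(eps, spill)   # spills over to all three endpoints
eps, spill = select_endpoints(local, remote, in_flight=5, using_remote=spill)
print(eps, spill)   # shrinks back to the local endpoint
```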

&lt;p&gt;This adaptive approach fills the gaps in Topology-aware Routing, allowing more controlled and more dynamic management of cross-zone traffic and striking a balance between reducing latency, ensuring service reliability, and optimizing resource utilization.&lt;/p&gt;

&lt;p&gt;HAZL is currently available only for Buoyant Enterprise for Linkerd and not for Linkerd Open Source.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Buoyant Enterprise for Linkerd?
&lt;/h2&gt;

&lt;p&gt;Linkerd began its journey in the open-source world in 2016 and has improved immensely since. However, as corporations like Microsoft, Adidas, and Geico started incorporating Linkerd into their architectures, it became necessary to provide enterprise-level services and support beyond what is possible with open source alone. This includes everything from tailored proofs of concept, software bills of materials for all components, and dedicated support channels with private support ticketing, giving customers a direct point of contact instead of relying on public forums, to Service Level Agreements and more.&lt;/p&gt;

&lt;p&gt;However, Buoyant’s commitment to the open-source community is reflected in its pricing model. Anyone can try the enterprise features with non-production traffic, and companies with fewer than 50 employees can use Buoyant Enterprise for Linkerd in production for free, at any scale. Beyond that, there are different pricing tiers depending on the number of meshed pods and the specific features required.&lt;/p&gt;

&lt;p&gt;Enough with the theory — let’s get our hands dirty and see HAZL in action.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demonstration
&lt;/h2&gt;

&lt;p&gt;In this demonstration, I will deploy the following infrastructure from scratch on an AKS cluster using Terraform. Then, I will install Prometheus and Linkerd Enterprise, and use Grafana to collect traffic metrics before and after enabling HAZL.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv81984qqpgz7qofav963.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv81984qqpgz7qofav963.png" alt="Image description" width="720" height="401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Infrastructure
&lt;/h3&gt;

&lt;p&gt;Let’s start with the infrastructure. The following configuration will deploy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;Azure Kubernetes Cluster:&lt;/strong&gt; This resource will have a default node pool where we will run Grafana, Prometheus, and the job to simulate traffic. This pool won’t have any availability zones assigned, as we want to keep the requests coming from the same region.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Cluster Node Pool:&lt;/strong&gt; This pool will host the target services and pods and will have three availability zones, so Azure will automatically distribute the underlying VMs across these availability zones.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Container Registry:&lt;/strong&gt; This is the resource where we will push our container images and from where the cluster will pull them, thanks to a role assignment to the kubelet identity with the AcrPull role.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;provider "azurerm" {
  features {}
}

module "naming" {
  source  = "Azure/naming/azurerm"
  suffix = [ "training", "dev", "kr" ]
}

resource "azurerm_resource_group" "resource_group" {
  name     = "hazl-training-resources"
  location = "Korea Central"
}

resource "azurerm_kubernetes_cluster" "kubernetes_cluster" {
  name                = module.naming.kubernetes_cluster.name
  location            = azurerm_resource_group.resource_group.location
  resource_group_name = azurerm_resource_group.resource_group.name
  default_node_pool {
    name       = "default"
    node_count = 2
    vm_size    = "Standard_D2_v2"
    auto_scaling_enabled = true
  }
  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_kubernetes_cluster_node_pool" "kubernetes_cluster_node_pool" {
  name                  = "hazltrainingnodepool"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.kubernetes_cluster.id
  vm_size               = "Standard_DS2_v2"
  node_count            = 3
  zones                 = ["1", "2", "3"]
}

resource "azurerm_container_registry" "container_registry" {
  name                = module.naming.container_registry.name
  resource_group_name = azurerm_resource_group.resource_group.name
  location            = azurerm_resource_group.resource_group.location
  sku                 = "Premium"
  admin_enabled       = true
}

resource "azurerm_role_assignment" "role_assignment_cluster_container_registry" {
  principal_id                     = azurerm_kubernetes_cluster.kubernetes_cluster.kubelet_identity[0].object_id
  role_definition_name             = "AcrPull"
  scope                            = azurerm_container_registry.container_registry.id
  skip_service_principal_aad_check = true
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Behind the scenes, Azure sets the &lt;code&gt;topology.kubernetes.io/zone&lt;/code&gt; label on each node with the corresponding availability zone.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl describe nodes | grep -e "Name:" -e "topology.kubernetes.io/zone"
Name: aks-agentpool-60539364-vmss000001
 topology.kubernetes.io/zone=0
Name: aks-hazlpoolha-33634351-vmss000000
 topology.kubernetes.io/zone=koreacentral-1
Name: aks-hazlpoolha-33634351-vmss000001
 topology.kubernetes.io/zone=koreacentral-2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that the infrastructure is in place, it’s time to start deploying the applications that we will use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Buoyant Enterprise for Linkerd (BEL)
&lt;/h2&gt;

&lt;p&gt;There are two ways to install BEL: via the operator, or via the CRDs and control plane Helm charts. The advantage of using the operator is that it pulls the configuration for both the CRDs and the control plane for you, and manages installation and upgrades automatically. Both methods require an active account on &lt;a href="https://enterprise.buoyant.io/" rel="noopener noreferrer"&gt;https://enterprise.buoyant.io/&lt;/a&gt;, which provides the license used during the installation of the control plane.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3d8faslvw8leq11j2p2d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3d8faslvw8leq11j2p2d.png" alt="Image description" width="720" height="360"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As there is a lot of documentation about the operator online, in this demo, I will use Helm.&lt;/p&gt;

&lt;p&gt;First, you will need to install the CRDs chart, which installs all the resource definitions necessary for Linkerd to work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-enterprise-crds linkerd-buoyant/linkerd-enterprise-crds \
  --namespace linkerd \
  --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we will need to create a trust anchor certificate that will be used by the identity service to issue certificates and enable mTLS. In this case, I will use the step tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;step certificate create root.linkerd.cluster.local ./certificates/ca.crt ./certificates/ca.key --profile root-ca --no-password --insecure
step certificate create identity.linkerd.cluster.local ./certificates/issuer.crt ./certificates/issuer.key --profile intermediate-ca --not-after 8760h --no-password --insecure --ca ./certificates/ca.crt --ca-key ./certificates/ca.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we can install the control plane Helm chart, which will deploy all the roles, ConfigMaps, services, and components that make up Linkerd:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-enterprise-control-plane linkerd-buoyant/linkerd-enterprise-control-plane \
  --set buoyantCloudEnabled=false \
  --set license=$BUOYANT_LICENSE \
  -f ./helm/linkerd-enterprise/values.yaml \
  --set-file linkerd-control-plane.identityTrustAnchorsPEM=./certificates/ca.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.crtPEM=./certificates/issuer.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.keyPEM=./certificates/issuer.key \
  --namespace linkerd \
  --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Install Linkerd-Viz
&lt;/h3&gt;

&lt;p&gt;Linkerd-viz is an open-source extension that installs and auto-configures a Prometheus instance to scrape metrics from Linkerd. Additionally, it provides a dashboard that users can utilize to gain insights about the meshed pods in the cluster. It has a dedicated Helm chart that you can easily install with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-viz linkerd/linkerd-viz \
  --create-namespace \
  --namespace linkerd-viz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, this extension only keeps metrics data for a brief window of time (6 hours) and does not persist data across restarts. Therefore, in this demo, we will install our own Prometheus instance and federate it with the Linkerd-viz Prometheus instance to persist the metrics.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install Prometheus
&lt;/h3&gt;

&lt;p&gt;By default, Prometheus provides many metrics about the cluster and its resources. However, if you want to scrape additional information about the nodes, such as their labels, you can modify the Prometheus configuration or install additional packages. To obtain the node labels and group the metrics by zone, we will install kube-state-metrics, which exposes this information to Prometheus queries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install kube-state-metrics prometheus-community/kube-state-metrics \
  --set metricLabelsAllowlist.nodes=[*] \
  --create-namespace \
  --namespace monitoring
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
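&lt;p&gt;With the allowlist in place, kube-state-metrics exposes node labels through the &lt;code&gt;kube_node_labels&lt;/code&gt; metric. As a quick sanity check (this sketch assumes your nodes carry the standard &lt;code&gt;topology.kubernetes.io/zone&lt;/code&gt; label), the following query should return one series per zone:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;count by (label_topology_kubernetes_io_zone) (kube_node_labels)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;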



&lt;p&gt;Next, install Prometheus using Helm:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install prometheus prometheus-community/prometheus \
  --create-namespace \
  --namespace monitoring
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we can federate our Prometheus instance with the Linkerd-viz instance so that its data is copied into ours. This gives us access to metrics collected at the transport level, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;tcp_write_bytes_total&lt;/strong&gt;: A counter of the total number of sent bytes. This is updated when the connection closes.&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;tcp_read_bytes_total&lt;/strong&gt;: A counter of the total number of received bytes. This is updated when the connection closes.&lt;/li&gt;
&lt;/ul&gt;
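&lt;p&gt;A simple way to visualize zone-to-zone traffic from these counters is to rate one of them and group it by zone pair; series where &lt;code&gt;src_zone&lt;/code&gt; and &lt;code&gt;dst_zone&lt;/code&gt; differ represent cross-zone traffic (this sketch assumes the proxy populates the zone labels):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sum by (src_zone, dst_zone) (
  rate(tcp_write_bytes_total{direction="outbound"}[5m])
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;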

&lt;p&gt;To set up federation, add the following scrape job to your Prometheus configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- job_name: 'linkerd'
  kubernetes_sd_configs:
  - role: pod
    namespaces:
      names: ['{{.Namespace}}']

  relabel_configs:
  - source_labels:
    - __meta_kubernetes_pod_container_name
    action: keep
    regex: ^prometheus$

  honor_labels: true
  metrics_path: '/federate'

  params:
    'match[]':
      - '{job="linkerd-proxy"}'
      - '{job="linkerd-controller"}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Additionally, apply an &lt;code&gt;AuthorizationPolicy&lt;/code&gt; that will allow Prometheus to access the Linkerd-viz Prometheus metrics endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: prometheus-admin-federate
  namespace: linkerd-viz
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: prometheus-admin
  requiredAuthenticationRefs:
    - group: policy.linkerd.io
      kind: NetworkAuthentication
      name: kubelet
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have done everything correctly, you will be able to see the following target in Prometheus:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb480jdfal0cedfzu322c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb480jdfal0cedfzu322c.png" alt="Image description" width="720" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, all the metrics scraped by the Linkerd-viz Prometheus instance are available in our Prometheus instance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install and configure Grafana
&lt;/h3&gt;

&lt;p&gt;Next, let’s install Grafana with the default configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install grafana grafana/grafana \
  --create-namespace \
  --namespace monitoring
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After logging in, we need to add a new data source pointing to the Prometheus server running in the cluster. To do this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    Expand the Connections option from the side pane and click Add new connection.&lt;/li&gt;
&lt;li&gt;    Click Prometheus and enter the internal DNS endpoint of the Kubernetes cluster (&lt;a href="http://prometheus-server.monitoring.svc.cluster.local" rel="noopener noreferrer"&gt;http://prometheus-server.monitoring.svc.cluster.local&lt;/a&gt;) in the connection input.&lt;/li&gt;
&lt;li&gt;    Click the Save &amp;amp; Test button at the bottom to complete the setup.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgxnv81j2jfs8o3n6b1m7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgxnv81j2jfs8o3n6b1m7.png" alt="Image description" width="720" height="359"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, let’s create a new dashboard containing the visualizations we will use to monitor the traffic to and from our nodes, using the data source we just created. The visualizations we need are the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CPU Usage per Kubernetes Node:&lt;/strong&gt; This query displays the CPU usage percentage for each node. We expect a peak in CPU utilization on one of the nodes when we trigger the jobs that simulate the traffic.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TCP Read Bytes Total (Outbound):&lt;/strong&gt; This query shows the total number of bytes read over TCP connections for outbound traffic in the vastaya namespace, grouped by namespace, pod, instance, destination zone, and source zone. These metrics are collected by the Linkerd proxy.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sum by (namespace, pod, instance, dst_zone, src_zone) (
  tcp_read_bytes_total{direction="outbound", namespace="vastaya"}
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Simulate the traffic with HAZL disabled
&lt;/h3&gt;

&lt;p&gt;With this setup in place, we can trigger a job that creates 5 replicas, each increasing the number of requests to the service every 10 seconds. To ensure the pods are scheduled in the node pool that does not run the application, we also set a node affinity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: batch/v1
kind: Job
metadata:
  name: bot-get-project-report
  namespace: vastaya
spec:
  completions: 5          
  parallelism: 5          
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: agentpool
                operator: In
                values:
                - default
      containers:
      - name: project-creator
        image: curlimages/curl:7.78.0 
        command: ["/bin/sh", "-c"] 
        args:
        - |
          API_URL="http://projects.vastaya.svc.cluster.local/1/report"
          get_report() {
            local num_requests=$1
            echo "Getting $num_requests tasks..."
            for i in $(seq 1 $num_requests); do
              (
                echo "Getting task $i..."
                GET_RESPONSE=$(curl -s -X GET "$API_URL")
                echo $GET_RESPONSE
              ) &amp;amp;
            done
            wait
          }
          wait_time=10
          for num_requests in 5000 10000 15000; do
            echo "Running with $num_requests requests..."
            get_report $num_requests
            echo "Waiting for $wait_time seconds before increasing requests..."
            sleep $wait_time
          done
      restartPolicy: Never
  backoffLimit: 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since we haven’t enabled HAZL yet, Kubernetes will start directing the requests to pods running in different zones, resulting in an increase of cross-zone traffic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7sqlrdi9vemykmudmret.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7sqlrdi9vemykmudmret.png" alt="Image description" width="720" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Enable HAZL
&lt;/h3&gt;

&lt;p&gt;Enabling HAZL with the operator is super easy. All we have to do is update the control plane values with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-enterprise-control-plane linkerd-buoyant/linkerd-enterprise-control-plane \
  --set buoyantCloudEnabled=false \
  --set license=$BUOYANT_LICENSE \
  --set-file linkerd-control-plane.identityTrustAnchorsPEM=./certificates/ca.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.crtPEM=./certificates/issuer.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.keyPEM=./certificates/issuer.key \
  --set controlPlaneConfig.destinationController.additionalArgs="{ -ext-endpoint-zone-weights }" \
  --set controlPlaneConfig.proxy.additionalEnv[0].name=BUOYANT_BALANCER_LOAD_LOW \
  --set controlPlaneConfig.proxy.additionalEnv[0].value="0.8" \
  --set controlPlaneConfig.proxy.additionalEnv[1].name=BUOYANT_BALANCER_LOAD_HIGH \
  --set controlPlaneConfig.proxy.additionalEnv[1].value="2.0" \
  --namespace linkerd \
  --create-namespace \
  -f ./helm/linkerd-enterprise/values.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Simulate the traffic with HAZL enabled
&lt;/h3&gt;

&lt;p&gt;After enabling HAZL, we recreate the job to simulate the traffic again. As you can see, the cross-zone communication has been completely eliminated.&lt;/p&gt;
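&lt;p&gt;Because a Job’s pod template is immutable, recreating it means deleting the old Job and applying the manifest again (the file name here assumes the earlier manifest was saved as job.yaml):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl delete job bot-get-project-report --namespace vastaya
kubectl apply -f job.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;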

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6aw56u7nmmty48mnk10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc6aw56u7nmmty48mnk10.png" alt="Image description" width="720" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;AWS Data Transfer Cost:&lt;/strong&gt; &lt;a href="https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer_within_the_same_AWS_Region" rel="noopener noreferrer"&gt;https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer_within_the_same_AWS_Region&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Linkerd and External Prometheus:&lt;/strong&gt; &lt;a href="https://linkerd.io/2-edge/tasks/external-prometheus/" rel="noopener noreferrer"&gt;https://linkerd.io/2-edge/tasks/external-prometheus/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;HAZL Official Documentation:&lt;/strong&gt; &lt;a href="https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/features/hazl/" rel="noopener noreferrer"&gt;https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/features/hazl/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;BEL Official Documentation:&lt;/strong&gt; &lt;a href="https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/installation/enterprise/" rel="noopener noreferrer"&gt;https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/installation/enterprise/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Demo Source Code:&lt;/strong&gt; &lt;a href="https://github.com/GTRekter/Vastaya" rel="noopener noreferrer"&gt;https://github.com/GTRekter/Vastaya&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>eks</category>
      <category>aks</category>
    </item>
    <item>
      <title>Naver Cloud: A Look Inside South Korea’s Leading Cloud Platform</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 14 Nov 2024 12:31:40 +0000</pubDate>
      <link>https://dev.to/gtrekter/naver-cloud-a-look-inside-the-south-koreas-leading-cloud-platform-gl8</link>
      <guid>https://dev.to/gtrekter/naver-cloud-a-look-inside-the-south-koreas-leading-cloud-platform-gl8</guid>
      <description>&lt;p&gt;In 2019, global pop sensation BTS took the stage for their largest tour to date at Wembley Stadium in London. To promote its emerging venture, Naver broadcast the performance live to over 140,000 viewers worldwide via its Naver V Live platform, which relied on Naver Cloud infrastructure to reach audiences in the United States, Japan, Korea, and beyond. This successful event laid the foundation for what has become South Korea’s most widely used cloud service provider. In this article, we’ll explore what Naver Cloud offers and how it distinguishes itself from its Western counterparts in the competitive CSP landscape.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s Naver Cloud?
&lt;/h2&gt;

&lt;p&gt;Naver Cloud is the cloud service provider created by Naver, the South Korean conglomerate behind the country’s most popular search engine and an ecosystem of widely used apps like Naver Mail, Naver Pay, Naver Maps, Naver Workspace, and more.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh2r7111smys7gopbf0xx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh2r7111smys7gopbf0xx.png" alt="Image description" width="720" height="534"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Officially founded in 2017, Naver Cloud initially offered 22 cloud products, expanding to over 200 products across 18 categories by 2022. Today, more than 60,000 South Korean companies, including 55% of the top 100 firms such as Samsung, SK Telecom, and PUBG Corp, rely on Naver Cloud for their services. Furthermore, government regulations make it the preferred choice for many government offices when it comes to cloud solutions. To put it in perspective, Naver can be seen as the “South Korean Google.”&lt;/p&gt;

&lt;h2&gt;
  
  
  Different Networking
&lt;/h2&gt;

&lt;p&gt;Naver Cloud’s networking options diverge slightly from those offered by Western cloud providers. Depending on the region, users can choose between two types of networking environments: Classic and VPC. Initially, Naver Cloud only provided the Classic environment, where all resources were deployed on a shared network. This setup allowed for private communication between servers created under multiple accounts, making it possible for users to interconnect resources across accounts.&lt;/p&gt;

&lt;p&gt;However, the Classic model comes with certain challenges. For instance, access control requires separate configurations through Access Control Groups or other methods to manage inter-tenant communication, which can become complex, especially when a single account supports multiple tenants. As deployments grow in complexity, maintaining access settings becomes increasingly challenging. Additionally, due to the shared network, each tenant’s network environment cannot be configured identically, and private IPs are assigned randomly, complicating the enforcement of strict access control policies by IP range.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ym3f2n5o4eheodxfpvj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ym3f2n5o4eheodxfpvj.png" alt="Image description" width="720" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To address these limitations, Naver Cloud introduced the VPC environment on September 17, 2020. The VPC setup provides users with a fully isolated network, making network management more straightforward and adaptable to specific organizational needs, effectively resolving the issues associated with the Classic environment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvks0g0wdkyc3sadclne6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvks0g0wdkyc3sadclne6.png" alt="Image description" width="720" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Though Naver Cloud continues to support the Classic environment, there are now limitations on its functionality. For example, users can no longer create new Kubernetes clusters in the Classic setup, and certain newer features are exclusive to VPC. Naver Cloud has also introduced tools to facilitate the migration of workloads and servers from Classic to VPC, indicating the company’s intent to transition users toward the VPC model over time. While many organizations, especially large corporations, are often resistant to major changes, this shift will likely become necessary eventually, though Naver may need to provide additional support and resources to help clients through the transition.&lt;/p&gt;

&lt;h2&gt;
  
  
  Create a new Naver Cloud Platform Account
&lt;/h2&gt;

&lt;p&gt;As of this writing, Naver Cloud supports the direct creation of new business accounts (associated with a company) for users residing primarily in Singapore, South Korea, Bangladesh, Cambodia, Canada, Germany, India, Indonesia, Japan, Malaysia, the Philippines, Taiwan, Thailand, the United States, and Vietnam. To complete the registration process, applicants must provide a certificate of company registration, a valid phone number from the country of residence (VoIP numbers are not accepted), and a valid credit card.&lt;/p&gt;

&lt;p&gt;Personal accounts, however, are available only in South Korea. If you attempt to create a personal account using a temporary South Korean phone number — such as those provided to tourists — it won’t work. This is because account registration requires identity verification through the phone number used to register the credit card. Additionally, without an Alien Registration Number, which tourists do not possess, you will be unable to complete the process.&lt;/p&gt;

&lt;p&gt;If you reside in a country not included in the supported list, don’t worry. You can create an account by directly contacting the Naver Support Center. In your support ticket, include your company’s certificate of registration, a 30-minute time window for verification, a photo of your credit card, and your phone number. The Naver Cloud team will then temporarily add your country to the list of available regions, allowing you to complete the account setup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Training
&lt;/h2&gt;

&lt;p&gt;Upskilling engineers is a critical aspect that every cloud service provider must prioritize for success. To help engineers use its services and incentivize companies to migrate to the platform, Naver Cloud provides a huge amount of educational content, along with regular online webinars and offline training sessions. The offline training sessions, available only in Korean, range from free full-day courses to multi-day, instructor-led programs designed to deepen users’ expertise, held at Naver’s office in Seoul.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkgmv40z0nit1zf41uryd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkgmv40z0nit1zf41uryd.png" alt="Image description" width="720" height="407"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Certifications
&lt;/h2&gt;

&lt;p&gt;Certifications serve as a way to validate an engineer’s skills and attest to their knowledge. Many vendors and organizations, such as Azure, AWS, and the CNCF, offer certification programs that engineers can earn by passing proctored exams. Naver Cloud follows this approach and currently offers the following certifications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;    NAVER CLOUD PLATFORM Certified Associate: This certification verifies a foundational understanding of cloud concepts and the ability to configure basic resources like Compute, Storage, Database, Network, and Media services on Naver Cloud Platform.&lt;/li&gt;
&lt;li&gt;    NAVER CLOUD PLATFORM Certified Professional: This level takes the knowledge deeper, building on the previous resources and adding troubleshooting capabilities. It consists of three exams.&lt;/li&gt;
&lt;li&gt;    NAVER CLOUD PLATFORM Certified Expert: The highest certification level, which requires comprehensive knowledge of Naver Cloud Platform resources, as well as skills in troubleshooting and customization. It consists of four exams.&lt;/li&gt;
&lt;li&gt;    NAVER CLOUD PLATFORM Certified AI: This certification is designed for those working with AI, covering basic knowledge related to machine learning and a solid understanding of CLOVA Studio, based on the HyperCLOVA X engine.&lt;/li&gt;
&lt;li&gt;    NAVER CLOUD PLATFORM Certified Expert AI: The most advanced AI certification, requiring expertise in AI, a thorough understanding of large language models, and skills in RAG configuration and chatbot development using the HyperCLOVA X engine.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvwucrjjyyb2sp0zr76hn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvwucrjjyyb2sp0zr76hn.png" alt="Image description" width="497" height="239"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References:
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;    &lt;strong&gt;VPC Announcement:&lt;/strong&gt; &lt;a href="https://www.ncloud.com/intro/news/554" rel="noopener noreferrer"&gt;https://www.ncloud.com/intro/news/554&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Naver Cloud Support Center:&lt;/strong&gt; &lt;a href="https://www.ncloud.com/support/question/general" rel="noopener noreferrer"&gt;https://www.ncloud.com/support/question/general&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Naver Training Platform:&lt;/strong&gt; &lt;a href="https://g-bizschool.naver.com/" rel="noopener noreferrer"&gt;https://g-bizschool.naver.com/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;    &lt;strong&gt;Naver Education Platform:&lt;/strong&gt; &lt;a href="https://edu.ncloud.com/online" rel="noopener noreferrer"&gt;https://edu.ncloud.com/online&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>cloud</category>
      <category>naver</category>
    </item>
    <item>
      <title>How to Install Linkerd Enterprise Using the CLI, Operator, and Helm Charts</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Thu, 14 Nov 2024 12:27:52 +0000</pubDate>
      <link>https://dev.to/gtrekter/cli-operator-helm-cateureul-iyonghan-linkerd-enterprise-seolci-bangbeob-5g33</link>
      <guid>https://dev.to/gtrekter/cli-operator-helm-cateureul-iyonghan-linkerd-enterprise-seolci-bangbeob-5g33</guid>
      <description>&lt;p&gt;지난 몇 주 동안, Operator나 Helm 차트, 기타 주요 단계를 생략으로 인한 Linkerd 설치 관련 문의가 있었습니다. 이 글을 통해 Linkerd 서비스 메시 엔터프라이즈 버전 설치 방법을 차근차근 알려드리도록 하겠습니다. Kubernetes 클러스터에 Linkerd Enterprise를 설치하는 방법은 크게 세 가지가 있습니다:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;   Linkerd CLI&lt;/li&gt;
&lt;li&gt;   Helm chart&lt;/li&gt;
&lt;li&gt;   Operator&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regardless of the installation method, the first step is to create an account on the Linkerd enterprise platform. Note that you do not need to enable the Buoyant Cloud SaaS platform when installing Linkerd Enterprise.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Get a Linkerd Enterprise License Key
&lt;/h2&gt;

&lt;p&gt;The first step in installing Linkerd Enterprise is obtaining a license key. Follow these steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://enterprise.buoyant.io/" rel="noopener noreferrer"&gt;https://enterprise.buoyant.io/&lt;/a&gt; 에 접속합니다.&lt;/li&gt;
&lt;li&gt;    계정을 생성합니다. 기존 계정이 있다면 로그인합니다.&lt;/li&gt;
&lt;li&gt;    설치 탭에서 &lt;code&gt;API_CLIENT_ID&lt;/code&gt;, &lt;code&gt;API_CLIENT_SECRET&lt;/code&gt;, &lt;code&gt;BUOYANT_LICENSE&lt;/code&gt; 정보가 있는 패널을 확인합니다.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;API_CLIENT_ID&lt;/code&gt; and &lt;code&gt;API_CLIENT_SECRET&lt;/code&gt; are used to connect to Buoyant Cloud, while &lt;code&gt;BUOYANT_LICENSE&lt;/code&gt; is the key required to install Linkerd Enterprise on your cluster.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ccswdg0dut81gz4m1io.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ccswdg0dut81gz4m1io.png" alt="Image description" width="720" height="356"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: Buoyant Enterprise for Linkerd is free for non-commercial traffic, and companies with fewer than 50 employees can use it for free.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  (Optional) Generate the Trust Anchor and Identity Certificates
&lt;/h2&gt;

&lt;p&gt;To secure communication between meshed pods, Linkerd applies mutual TLS (mTLS) to all TCP traffic. For this, Linkerd needs a trust anchor, an identity certificate, and a private key. These are stored as Kubernetes secrets, and the Linkerd control plane uses them to issue certificates to each Linkerd proxy.&lt;/p&gt;

&lt;p&gt;If no certificates are provided, the Linkerd CLI generates a trust anchor and identity certificate valid for one year. However, when installing via the Helm chart or the Operator, you must create these certificates in advance and pass them as parameters. You can generate the trust anchor and identity certificate with the step tool as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;step certificate create root.linkerd.cluster.local ca.crt ca.key \
  --profile root-ca \
  --no-password \
  --insecure

step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
  --profile intermediate-ca \
  --not-after 8760h \
  --no-password \
  --insecure \
  --ca ca.crt \
  --ca-key ca.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can adjust the certificate validity period as needed, but the common name of the trust anchor must be &lt;code&gt;root.linkerd.cluster.local&lt;/code&gt; and the common name of the intermediate identity certificate must be &lt;code&gt;identity.linkerd.cluster.local&lt;/code&gt;.&lt;/p&gt;
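&lt;p&gt;Before installing, you can confirm the common names and validity periods by inspecting the generated certificates with the same step CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;step certificate inspect ca.crt --short
step certificate inspect issuer.crt --short
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;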

&lt;h2&gt;
  
  
  Installing with the Linkerd Enterprise CLI
&lt;/h2&gt;

&lt;p&gt;The Linkerd team provides a powerful CLI that lets you interact with the Linkerd components running in your Kubernetes cluster and perform tasks such as installation, proxy injection, diagnostics, and metrics collection.&lt;/p&gt;

&lt;p&gt;First, download the Linkerd CLI and update your PATH environment variable so that you can run Linkerd commands without navigating to the .linkerd2 directory every time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl --proto '=https' --tlsv1.2 -sSfL https://enterprise.buoyant.io/install | sh
$ export PATH=$HOME/.linkerd2/bin:$PATH
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use the check command to verify that there are no conflicting CRDs, roles, namespaces, or other components that could interfere with the Linkerd installation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd check --pre
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, install the Linkerd custom resources, which include &lt;code&gt;servers.policy.linkerd.io&lt;/code&gt; and &lt;code&gt;httproutes.policy.linkerd.io&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: The CLI does not install Kubernetes resources directly; instead, it outputs their manifests, which you can pipe into kubectl apply.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd install --crds | kubectl apply -f -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the CRDs are in place, install the control plane, the core of Linkerd. This deploys the components that manage service discovery, routing, mTLS, and Linkerd’s other core features.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd install | kubectl apply -f -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
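&lt;p&gt;Once the control plane components are up, you can verify that the installation is healthy with the same CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd check
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;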



&lt;h2&gt;
  
  
  Installing with the Helm Chart
&lt;/h2&gt;

&lt;p&gt;Some organizations prefer Helm charts for compliance or workflow reasons. The process is similar to the CLI installation, but the resources are applied differently. As with the CLI, you must install the CRDs first and then the control plane.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: Starting with version 2.15, the Linkerd Enterprise Helm charts are stored in a traditional Helm registry hosted on ArtifactHub, with the container images hosted on GitHub. Unlike earlier versions, they are no longer OCI-based or managed in an Azure Container Registry.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First, add the Buoyant Helm repository to your local Helm configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add linkerd-buoyant https://helm.buoyant.cloud
helm repo update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, install the Helm chart containing the required CRDs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-enterprise-crds \
  linkerd-buoyant/linkerd-enterprise-crds \
  --namespace linkerd \
  --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, install the control plane. This is the chart where you apply most of your custom configurations, such as enabling the HAZL feature or modifying proxyInit settings.&lt;/p&gt;

&lt;p&gt;For example, you can apply the following configurations during installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  --set proxyInit.runAsRoot=true \
  --set destinationController.additionalArgs[0]=-ext-endpoint-zone-weights \
  --set proxy.additionalEnv[0].name=BUOYANT_BALANCER_LOAD_LOW \
  --set proxy.additionalEnv[0].value='0.1' \
  --set proxy.additionalEnv[1].name=BUOYANT_BALANCER_LOAD_HIGH \
  --set proxy.additionalEnv[1].value='3.0'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a basic installation with default values, you can run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-enterprise-control-plane \
  linkerd-buoyant/linkerd-enterprise-control-plane \
  --set-file linkerd-control-plane.identityTrustAnchorsPEM=./ca.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.crtPEM=./issuer.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.keyPEM=./issuer.key \
  --set buoyantCloudEnabled=false \
  --set license=$BUOYANT_LICENSE \
  --namespace linkerd \
  --create-namespace 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, we are still required to provide the Root Certificate, Issuer Certificate, and Issuer Private Key.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation via Operator
&lt;/h2&gt;

&lt;p&gt;Before diving into the installation process, let’s briefly explain what a Kubernetes operator is.&lt;/p&gt;

&lt;h3&gt;
  
  
  What’s an Operator?
&lt;/h3&gt;

&lt;p&gt;A Kubernetes operator is an application-specific controller that extends the Kubernetes API to manage application instances on behalf of the user. It monitors the desired state of the cluster, compares it to the actual state, and uses control loops to reconcile any differences. This simplifies the management of complex applications in Kubernetes.&lt;/p&gt;

&lt;p&gt;First, add the Buoyant Helm repository to your local Helm configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add linkerd-buoyant https://helm.buoyant.cloud
helm repo update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we can install the Linkerd Enterprise operator. Unlike the CLI or Helm chart-based installation, this is the only chart you need to install. Once configured, the operator automatically handles the installation and configuration of all required resources, including ConfigMaps, CRDs, and other components.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install linkerd-buoyant \
  --create-namespace \
  --namespace linkerd-buoyant \
  --set buoyantCloudEnabled=false \
  --set license=$BUOYANT_LICENSE \
  linkerd-buoyant/linkerd-buoyant
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, create a dedicated Secret to store the Trust Anchor, Identity Certificates, and their associated private keys.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create secret generic linkerd-identity-issuer \
  --namespace=linkerd \
  --from-file=ca.crt=./ca.crt \
  --from-file=tls.crt=./issuer.crt \
  --from-file=tls.key=./issuer.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, the operator has not yet installed the control plane or the CRDs because it lacks the necessary configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get controlplane.linkerd.buoyant.io -A
No resources found

$ helm list -A
NAME                  NAMESPACE       REVISION UPDATED                                  STATUS   CHART                                   APP VERSION      
linkerd-buoyant       linkerd-buoyant 1        2024-10-22 07:04:31.801677526 +0200 CEST deployed linkerd-buoyant-0.32.1                  0.32.1

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To proceed, deploy the ControlPlane resource with the license key, Linkerd version, and Trust Anchor certificate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;lt;&amp;lt;EOF &amp;gt; linkerd-control-plane-config.yaml
apiVersion: linkerd.buoyant.io/v1alpha1
kind: ControlPlane
metadata:
  name: linkerd-control-plane
spec:
  components:
    linkerd:
      version: $LINKERD_VERSION
      license: $BUOYANT_LICENSE
      controlPlaneConfig:
        identityTrustAnchorsPEM: |
$(cat ca.crt | sed 's/^/          /')
        identity:
          issuer:
            scheme: kubernetes.io/tls
EOF
kubectl apply -f linkerd-control-plane-config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since the operator works in cycles, after a few seconds it will install and configure the Helm charts for Linkerd’s CRDs and control plane.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: Creating resources such as the ConfigMap and Secret can take a moment, so running linkerd check before they exist may report errors. Wait briefly and it will work correctly.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm list -A
NAME                  NAMESPACE       REVISION UPDATED                                  STATUS   CHART                                   APP VERSION      
linkerd-buoyant       linkerd-buoyant 1        2024-10-22 07:04:31.801677526 +0200 CEST deployed linkerd-buoyant-0.32.1                  0.32.1           
linkerd-control-plane linkerd         1        2024-10-22 05:05:01.122822879 +0000 UTC  deployed linkerd-enterprise-control-plane-2.16.1 enterprise-2.16.1
linkerd-crds          linkerd         1        2024-10-22 05:04:59.388052991 +0000 UTC  deployed linkerd-enterprise-crds-2.16.1          enterprise-2.16.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Buoyant Enterprise for Linkerd Pricing:&lt;/strong&gt; &lt;a href="https://buoyant.io/pricing" rel="noopener noreferrer"&gt;https://buoyant.io/pricing&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Buoyant Enterprise for Linkerd Official Documentation:&lt;/strong&gt; &lt;a href="https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/installation/enterprise/" rel="noopener noreferrer"&gt;https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/installation/enterprise/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>linkerd</category>
    </item>
    <item>
      <title>How to Install Linkerd Enterprise via CLI, Operator, and Helm Charts</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Tue, 22 Oct 2024 09:24:15 +0000</pubDate>
      <link>https://dev.to/gtrekter/how-to-install-linkerd-enterprise-via-cli-operator-and-helm-charts-2a8b</link>
      <guid>https://dev.to/gtrekter/how-to-install-linkerd-enterprise-via-cli-operator-and-helm-charts-2a8b</guid>
      <description>&lt;p&gt;In the past weeks, I encountered several cases of confusion with the Linkerd installations, especially when people missed key components like operators, Helm charts, or crucial steps. In this article, I’ll walk you through how to install the enterprise version of the Linkerd service mesh.&lt;/p&gt;

&lt;p&gt;There are three main ways to install Linkerd Enterprise in your Kubernetes cluster:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Linkerd CLI&lt;/li&gt;
&lt;li&gt;Helm Charts&lt;/li&gt;
&lt;li&gt;Using an Operator&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Regardless of the method you choose, you must first create an account on the Linkerd Enterprise platform. However, it’s worth noting that installing Linkerd Enterprise does NOT require enabling the Buoyant Cloud SaaS platform.&lt;/p&gt;

&lt;h1&gt;
  
  
  Access Your Linkerd Enterprise License Key
&lt;/h1&gt;

&lt;p&gt;The first step in installing Linkerd Enterprise is obtaining your license key. To do so, follow these steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; Browse to &lt;a href="https://enterprise.buoyant.io/" rel="noopener noreferrer"&gt;https://enterprise.buoyant.io/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; Create an account if you don’t already have one, or log in with your existing credentials.&lt;/li&gt;
&lt;li&gt; In the installation tab, you will see a panel with your &lt;code&gt;API_CLIENT_ID&lt;/code&gt;, &lt;code&gt;API_CLIENT_SECRET&lt;/code&gt;, and &lt;code&gt;BUOYANT_LICENSE&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While the &lt;code&gt;API_CLIENT_ID&lt;/code&gt; and &lt;code&gt;API_CLIENT_SECRET&lt;/code&gt; are used to connect with Buoyant Cloud, the &lt;code&gt;BUOYANT_LICENSE&lt;/code&gt; is the key you'll need to proceed with the installation of Linkerd Enterprise in your cluster.&lt;/p&gt;
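
&lt;p&gt;The installation commands later in this article read the license from the &lt;code&gt;BUOYANT_LICENSE&lt;/code&gt; environment variable, so it is convenient to export it once per shell session. A minimal sketch (the value below is a hypothetical placeholder, not a real key):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;export BUOYANT_LICENSE="your-license-key"   # hypothetical placeholder value
echo "BUOYANT_LICENSE is set"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;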

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdvpbu4n953il5l9tmfn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxdvpbu4n953il5l9tmfn.png" alt="Image description" width="720" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: Buoyant Enterprise for Linkerd is free for non-production traffic, and companies with fewer than 50 employees can use it for free, regardless of scale.&lt;/strong&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  (Optional) Generating Trust Anchor and Identity Certificate
&lt;/h1&gt;

&lt;p&gt;To secure communication between meshed pods, Linkerd applies mutual TLS (mTLS) to all TCP communications. For this to work, Linkerd requires a Trust Anchor, Identity Certificates, and the associated private keys. These certificates are stored as Kubernetes secrets and are used by the Linkerd control plane to issue certificates to each Linkerd proxy.&lt;/p&gt;
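
&lt;p&gt;For reference, the issuer credentials end up in a Kubernetes Secret shaped roughly like the sketch below. The key names mirror the &lt;code&gt;kubectl create secret&lt;/code&gt; command used later in this article; the base64 payloads are truncated placeholders, not real data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Secret
metadata:
  name: linkerd-identity-issuer
  namespace: linkerd
type: Opaque
data:
  ca.crt: LS0tLS1CRUdJTi...    # base64-encoded Trust Anchor PEM (truncated placeholder)
  tls.crt: LS0tLS1CRUdJTi...   # base64-encoded issuer certificate (truncated placeholder)
  tls.key: LS0tLS1CRUdJTi...   # base64-encoded issuer private key (truncated placeholder)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;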

&lt;p&gt;By default, if no certificates are provided, the Linkerd CLI will generate a Trust Anchor and Identity certificate with a validity of one year. However, if you’re using Helm charts or an operator for installation, you must generate these certificates beforehand and pass them as parameters. You can generate the Trust Anchor and Identity certificates using the &lt;code&gt;step&lt;/code&gt; tool as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;step certificate create root.linkerd.cluster.local ca.crt ca.key \
  --profile root-ca \
  --no-password \
  --insecure

step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
  --profile intermediate-ca \
  --not-after 8760h \
  --no-password \
  --insecure \
  --ca ca.crt \
  --ca-key ca.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can adjust the certificate duration as needed, but it’s critical that the Trust Anchor has the Common Name &lt;code&gt;root.linkerd.cluster.local&lt;/code&gt; and the identity intermediate certificate has the Common Name &lt;code&gt;identity.linkerd.cluster.local&lt;/code&gt;.&lt;/p&gt;
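
&lt;p&gt;If you want to double-check those Common Names before installing, one option is to print each certificate’s subject with openssl (a quick sanity check, assuming openssl is available and the files were generated as above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;openssl x509 -in ca.crt -noout -subject      # expect CN root.linkerd.cluster.local
openssl x509 -in issuer.crt -noout -subject  # expect CN identity.linkerd.cluster.local
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;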

&lt;h1&gt;
  
  
  Installing via Linkerd Enterprise CLI
&lt;/h1&gt;

&lt;p&gt;The Linkerd development team has built a powerful CLI that lets you interact with the Linkerd components running in your Kubernetes cluster and perform various operations, from installation and proxy injection to diagnostics and metrics collection.&lt;/p&gt;

&lt;p&gt;First, download the Linkerd CLI and update your &lt;code&gt;PATH&lt;/code&gt; environment variable so you can run the Linkerd commands without navigating to the &lt;code&gt;.linkerd2&lt;/code&gt; directory every time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ curl --proto '=https' --tlsv1.2 -sSfL https://enterprise.buoyant.io/install | sh
$ export PATH=$HOME/.linkerd2/bin:$PATH
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use the &lt;code&gt;check&lt;/code&gt; command to ensure that there are no conflicts with CRDs, roles, namespaces, and other components that will prevent Linkerd from being installed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd check --pre
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, deploy the Linkerd custom resource definitions, such as &lt;code&gt;servers.policy.linkerd.io&lt;/code&gt; and &lt;code&gt;httproutes.policy.linkerd.io&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: The CLI won’t directly install the Kubernetes resources but will output their manifests. You can pipe this output to kubectl apply to install them.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd install --crds | kubectl apply -f -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the CRDs are in place, proceed with installing the heart of Linkerd: the control plane. The control plane will deploy several components that manage service discovery, routing, mTLS, and other core functions of Linkerd.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;linkerd install | kubectl apply -f -
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Installation via Helm Charts
&lt;/h1&gt;

&lt;p&gt;Some organizations might have compliance policies or workflows that steer them toward the usage of Helm charts. The process is similar to the CLI installation, with the main difference being how resources are applied. Just like the CLI installation, you will need to install the CRDs first, followed by the control plane.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: As of version 2.15, Linkerd Enterprise Helm charts are stored in traditional Helm registries hosted on ArtifactHub, with container images hosted in GitHub. This differs from previous releases, where Helm charts and container images were stored in OCI-based and Azure Container Registries.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;First, add the Buoyant Helm repository to your local Helm configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add linkerd-buoyant https://helm.buoyant.cloud
helm repo update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next step is to install the Helm chart that contains the necessary CRDs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-enterprise-crds \
  linkerd-buoyant/linkerd-enterprise-crds \
  --namespace linkerd \
  --create-namespace
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, we can move forward installing the control plane. This is the chart where you will apply most of your custom configurations, such as enabling features like HAZL or modifying proxyInit settings.&lt;/p&gt;

&lt;p&gt;For example, you can apply the following configurations during installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  --set proxyInit.runAsRoot=true \
  --set destinationController.additionalArgs[0]=-ext-endpoint-zone-weights \
  --set proxy.additionalEnv[0].name=BUOYANT_BALANCER_LOAD_LOW \
  --set proxy.additionalEnv[0].value='0.1' \
  --set proxy.additionalEnv[1].name=BUOYANT_BALANCER_LOAD_HIGH \
  --set proxy.additionalEnv[1].value='3.0'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a basic installation with default values, you can run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm upgrade --install linkerd-enterprise-control-plane \
  linkerd-buoyant/linkerd-enterprise-control-plane \
  --set-file linkerd-control-plane.identityTrustAnchorsPEM=./ca.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.crtPEM=./issuer.crt \
  --set-file linkerd-control-plane.identity.issuer.tls.keyPEM=./issuer.key \
  --set buoyantCloudEnabled=false \
  --set license=$BUOYANT_LICENSE \
  --namespace linkerd \
  --create-namespace 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, we are still required to provide the Root Certificate, Issuer Certificate, and Issuer Private Key.&lt;/p&gt;

&lt;h1&gt;
  
  
  Installation via Operator
&lt;/h1&gt;

&lt;p&gt;Before moving into the installation process, let’s briefly explain what a Kubernetes operator is.&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s an Operator?
&lt;/h2&gt;

&lt;p&gt;A Kubernetes operator is an application-specific controller that extends the Kubernetes API to manage instances of applications on behalf of the user. It monitors the desired state of the cluster and compares it to the actual state, taking action to reconcile any differences using control loops. This simplifies complex application management tasks in Kubernetes.&lt;/p&gt;
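
&lt;p&gt;The control loop described above can be sketched in a few lines of shell. This is a conceptual illustration only; a real operator watches the API server through client libraries rather than polling variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;desired=3   # state declared in the resource spec (e.g. replicas)
actual=1    # state currently observed in the cluster
while [ "$actual" -ne "$desired" ]; do
  # take an action that moves the actual state toward the desired state
  actual=$((actual + 1))
done
echo "reconciled: actual=$actual"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;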

&lt;p&gt;First, add the Buoyant Helm repository to your local Helm configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add linkerd-buoyant https://helm.buoyant.cloud
helm repo update
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we can install the Linkerd Enterprise operator. Unlike the CLI or Helm chart-based installation, this is the only chart you’ll need to install. Once the operator is configured, it will handle the installation and configuration of all necessary resources, including ConfigMaps, CRDs, and other components, automatically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm install linkerd-buoyant \
  --create-namespace \
  --namespace linkerd-buoyant \
  --set buoyantCloudEnabled=false \
  --set license=$BUOYANT_LICENSE \
  linkerd-buoyant/linkerd-buoyant
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we will need to create a dedicated Secret to store the Trust Anchor, Identity Certificates, and their related private keys.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl create secret generic linkerd-identity-issuer \
  --namespace=linkerd \
  --from-file=ca.crt=./ca.crt \
  --from-file=tls.crt=./issuer.crt \
  --from-file=tls.key=./issuer.key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, the operator has not yet installed the control plane or the CRDs because it lacks the necessary configuration.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get controlplane.linkerd.buoyant.io -A
No resources found

$ helm list -A
NAME                  NAMESPACE       REVISION UPDATED                                  STATUS   CHART                                   APP VERSION      
linkerd-buoyant       linkerd-buoyant 1        2024-10-22 07:04:31.801677526 +0200 CEST deployed linkerd-buoyant-0.32.1                  0.32.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To proceed, deploy the control plane resource with the License key, Linkerd Version and Trust Anchor certificate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;lt;&amp;lt;EOF &amp;gt; linkerd-control-plane-config.yaml
apiVersion: linkerd.buoyant.io/v1alpha1
kind: ControlPlane
metadata:
  name: linkerd-control-plane
spec:
  components:
    linkerd:
      version: $LINKERD_VERSION
      license: $BUOYANT_LICENSE
      controlPlaneConfig:
        identityTrustAnchorsPEM: |
$(cat ca.crt | sed 's/^/          /')
        identity:
          issuer:
            scheme: kubernetes.io/tls
EOF
kubectl apply -f linkerd-control-plane-config.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
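
&lt;p&gt;The &lt;code&gt;sed 's/^/          /'&lt;/code&gt; substitution in the command above indents every line of the certificate by ten spaces so the PEM body nests correctly under the &lt;code&gt;identityTrustAnchorsPEM: |&lt;/code&gt; block. A standalone sketch with a dummy file (&lt;code&gt;demo.txt&lt;/code&gt; is a hypothetical stand-in for &lt;code&gt;ca.crt&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;printf 'BEGIN\nEND\n' &amp;gt; demo.txt   # stand-in for ca.crt
sed 's/^/          /' demo.txt       # each line is now prefixed with ten spaces
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;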



&lt;p&gt;The operator works in cycles, so after a few seconds, it will begin installing the necessary resources, including Helm charts for Linkerd’s CRDs and control plane.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note: It might take a couple of seconds before the operator creates the needed resources, so commands such as linkerd check may report errors if run too early.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm list -A
NAME                  NAMESPACE       REVISION UPDATED                                  STATUS   CHART                                   APP VERSION      
linkerd-buoyant       linkerd-buoyant 1        2024-10-22 07:04:31.801677526 +0200 CEST deployed linkerd-buoyant-0.32.1                  0.32.1           
linkerd-control-plane linkerd         1        2024-10-22 05:05:01.122822879 +0000 UTC  deployed linkerd-enterprise-control-plane-2.16.1 enterprise-2.16.1
linkerd-crds          linkerd         1        2024-10-22 05:04:59.388052991 +0000 UTC  deployed linkerd-enterprise-crds-2.16.1          enterprise-2.16.1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Resources
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;Buoyant Enterprise for Linkerd Pricing: &lt;a href="https://buoyant.io/pricing" rel="noopener noreferrer"&gt;https://buoyant.io/pricing&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Buoyant Enterprise for Linkerd Official Documentation: &lt;a href="https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/installation/enterprise/" rel="noopener noreferrer"&gt;https://docs.buoyant.io/buoyant-enterprise-linkerd/latest/installation/enterprise/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>linkerd</category>
      <category>microservices</category>
    </item>
    <item>
      <title>Leveraging AI for Kubernetes Troubleshooting via K8sGPT</title>
      <dc:creator>Ivan Porta</dc:creator>
      <pubDate>Tue, 02 Jul 2024 08:42:14 +0000</pubDate>
      <link>https://dev.to/gtrekter/leveraging-ai-for-kubernetes-troubleshooting-via-k8sgpt-1jgf</link>
      <guid>https://dev.to/gtrekter/leveraging-ai-for-kubernetes-troubleshooting-via-k8sgpt-1jgf</guid>
      <description>&lt;p&gt;Nowadays, there is a lot of excitement around AI and its new applications. For instance, in April/May 2024, there were at least four AI conventions in Seoul with thousands of attendees. So, what about Kubernetes? Can AI help us manage Kubernetes? The answer is yes. In this article, I will introduce K8sGPT.&lt;/p&gt;

&lt;h1&gt;
  
  
  What does GPT stand for?
&lt;/h1&gt;

&lt;p&gt;GPT stands for Generative Pre-trained Transformer. It’s a deep learning architecture that relies on a neural network pre-trained on a massive dataset of unlabeled text from various sources such as books, articles, websites, and other digital texts. This enables it to generate coherent and contextually relevant text. The first GPT was introduced in 2018 by OpenAI.&lt;/p&gt;

&lt;p&gt;GPT models are based on the transformer architecture, developed by Google, which uses a multi-head attention mechanism. Text is converted into numerical representations called tokens, which are often the unit by which usage of these models is priced when offered as a service.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexd2g2c3pnzfrwjlquyi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fexd2g2c3pnzfrwjlquyi.png" alt="Image description" width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each token is transformed into a vector via a lookup from a word embedding table based on a pre-trained matrix where each row corresponds to a token and contains a vector representing the token in a high-dimensional space, preserving the semantic information about the token.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Token ID&lt;/th&gt;
&lt;th&gt;Embedding Vector&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;[0.12456, -0.00324, 0.45238,...]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;[-0.28345, 0.13245, 0.02938,...]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;[0.11234, -0.05678, 0.19834,...]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;82&lt;/td&gt;
&lt;td&gt;[0.09876, 0.23456, -0.11234,...]&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;67474&lt;/td&gt;
&lt;td&gt;[0.56438, -0.23845, 0.04238,...]&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
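
&lt;p&gt;The embedding table above is essentially a key–value lookup from token ID to vector. A toy illustration in shell, reusing the invented three-component vectors from the table (real models use hundreds or thousands of dimensions):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt; embeddings.txt &amp;lt;&amp;lt;'EOT'
11 0.12456 -0.00324 0.45238
19 -0.28345 0.13245 0.02938
EOT
# print the embedding vector stored for token ID 19
awk '$1 == 19 { print $2, $3, $4 }' embeddings.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;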

&lt;p&gt;At each layer, each token is then contextualized with the other tokens within the context window through a parallel multi-head attention mechanism.&lt;/p&gt;

&lt;h1&gt;
  
  
  What is K8sGPT?
&lt;/h1&gt;

&lt;p&gt;K8sGPT is an open-source project written in Go that uses different providers (called backends) to access various AI language models. It scans the Kubernetes cluster to discover issues and provides the results, causes, and solutions in simple sentences. The target audience for this tool is SRE Engineers, whose duty is to maintain and improve service stability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installation and Configuration
&lt;/h2&gt;

&lt;p&gt;Before performing any queries, you must install the tool in an environment with kubectl and set up the backend that will handle your queries. In this example, I will install K8sGPT on Ubuntu x64:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl -LO https://github.com/k8sgpt-ai/k8sgpt/releases/download/v0.3.24/k8sgpt_amd64.deb
sudo dpkg -i k8sgpt_amd64.deb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once installed, we can configure it with the desired provider that will interact with the AI service’s APIs. In this example, I will use OpenAI.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Browse to the OpenAI platform &lt;a href="https://platform.openai.com/api-keys" rel="noopener noreferrer"&gt;https://platform.openai.com/api-keys&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Select the &lt;strong&gt;API Keys&lt;/strong&gt; option in the side menu, and click &lt;strong&gt;Create new Secret key&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbm4plwkyqjq8pw5fve7t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbm4plwkyqjq8pw5fve7t.png" alt="Image description" width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, add the secret key to K8sGPT so that it can authenticate to the AI service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ k8sgpt auth add
Warning: backend input is empty, will use the default value: openai
Warning: model input is empty, will use the default value: gpt-3.5-turbo
Enter openai Key: 
openai added to the AI backend provider list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By default, it will use OpenAI, but you can list the available providers and change the default by executing the following commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ k8sgpt auth list
Default:
&amp;gt; openai
Active:
&amp;gt; openai
Unused:
&amp;gt; localai
&amp;gt; azureopenai
&amp;gt; noopai
&amp;gt; cohere
&amp;gt; amazonbedrock
&amp;gt; amazonsagemaker

$ k8sgpt auth default --provider amazonsagemaker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  Analyze the cluster
&lt;/h1&gt;

&lt;p&gt;K8sGPT uses analyzers to triage and diagnose issues in the cluster. Each one of them will result in a series of requests (and subsequent usage of tokens) to the AI service’s APIs. To review which analyzers are enabled, execute the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ k8sgpt filter list
Active:
&amp;gt; Pod
&amp;gt; ValidatingWebhookConfiguration
&amp;gt; Deployment
&amp;gt; CronJob
&amp;gt; PersistentVolumeClaim
&amp;gt; ReplicaSet
&amp;gt; Ingress
&amp;gt; Node
&amp;gt; MutatingWebhookConfiguration
&amp;gt; Service
Unused:
&amp;gt; HTTPRoute
&amp;gt; StatefulSet
&amp;gt; Gateway
&amp;gt; HorizontalPodAutoScaler
&amp;gt; Log
&amp;gt; PodDisruptionBudget
&amp;gt; NetworkPolicy
&amp;gt; GatewayClass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By enabling and disabling these analyzers, you can limit the requests sent to the AI service APIs and focus on specific types of services. In this demo, we will analyze data coming from the logs and disable the Pods analyzer. To do so, I will execute the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ k8sgpt filter remove Pod
$ k8sgpt filter add Log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that K8sGPT is configured, we can start analyzing the cluster. In this example, I will deploy two pods with incorrect configurations and then analyze the cluster with K8sGPT. The first will be an nginx image with a non-existent tag, and the second a mysql image without the mandatory parameters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl run nginx --image=nginx:invalid_tag
$ kubectl run mysql --image=mysql:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you check the pods running on the cluster, you will see that something went wrong:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ kubectl get pods
NAME    READY   STATUS         RESTARTS   AGE
mysql   0/1     Error          0          6s
nginx   0/1     ErrImagePull   0          17s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s move forward and analyze the cluster with K8sGPT by executing the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ k8sgpt analyze -e --no-cache --with-doc
 100% |█████████████████████████████████████████████████████████████████████████████████████████████████| (5/5, 34 it/min)
AI Provider: openai

Warnings :
- [HTTPRoute] failed to get API group resources: unable to retrieve the complete list of server APIs: gateway.networking.k8s.io/v1: the server could not find the requested resource

0 default/mysql(mysql)
- Error: 2024-07-01 07:28:34+00:00 [ERROR] [Entrypoint]: Database is uninitialized and password option is not specified
Error: Database is uninitialized and password option is not specified.
Solution:
1. Specify the password option for the database.
2. Initialize the database to resolve the uninitialized state.

1 default/nginx(nginx)
- Error: Error the server rejected our request for an unknown reason (get pods nginx) from Pod nginx
Error: The server rejected the request for an unknown reason when trying to get pods for the nginx Pod.
Solution:
1. Check the Kubernetes cluster logs for more details on the rejection.
2. Verify the permissions and access rights for the user making the request.
3. Ensure the Kubernetes API server is running and reachable.
4. Retry the request after resolving any issues.
2 kube-system/coredns-7db6d8ff4d-p8bxj(Deployment/coredns)

- Error: [INFO] plugin/kubernetes: pkg/mod/k8s.io/client-go@v0.27.4/tools/cache/reflector.go:231: failed to list *v1.Namespace: Get "https://10.96.0.1:443/api/v1/namespaces?limit=500&amp;amp;resourceVersion=0": dial tcp 10.96.0.1:443: connect: connection refused
Error: Unable to list namespaces in Kubernetes due to connection refusal.
Solution:
1. Check if the Kubernetes API server is running.
2. Verify the network connectivity between the client and API server.
3. Ensure the API server IP and port are correct.
4. Restart the API server if needed.
3 kube-system/kube-controller-manager-minikube(kube-controller-manager-minikube)

- Error: I0701 05:34:48.925627       1 actual_state_of_world.go:543] "Failed to update statusUpdateNeeded field in actual state of world" logger="persistentvolume-attach-detach-controller" err="Failed to set statusUpdateNeeded to needed true, because nodeName=\"minikube\" does not exist"
Error: Failed to update statusUpdateNeeded field in actual state of world because nodeName "minikube" does not exist.
Solution:
1. Check if the node "minikube" exists in the Kubernetes cluster.
2. If the node does not exist, create a new node with the name "minikube".
3. Update the statusUpdateNeeded field in the actual state of world.
4 kube-system/kube-scheduler-minikube(kube-scheduler-minikube)

- Error: W0701 05:34:34.522300       1 authentication.go:368] Error looking up in-cluster authentication configuration: configmaps "extension-apiserver-authentication" is forbidden: User "system:kube-scheduler" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
Error: The user "system:kube-scheduler" is forbidden to access the configmaps resource in the kube-system namespace.
Solution:
1. Check the RBAC permissions for the user "system:kube-scheduler".
2. Grant the necessary permissions to access the configmaps resource.
3. Verify the changes by attempting to access the configmaps resource again.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As you can see, it returns a list of errors. While some of them are downstream consequences of the real root cause, the analyzers also provide accurate explanations of the Pods’ misconfigurations.&lt;/p&gt;

&lt;h1&gt;
  
  
  Conclusions
&lt;/h1&gt;

&lt;p&gt;This tool should not be treated as the sole source of truth, but rather as a good starting point for troubleshooting: it narrows the path to discovering the problem in the cluster. Organizations that don’t want to share their cluster data with OpenAI can instead configure K8sGPT to use a locally hosted AI backend.&lt;/p&gt;
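&lt;p&gt;For instance, the OpenAI backend can be swapped for a locally hosted, OpenAI-compatible endpoint via the &lt;code&gt;localai&lt;/code&gt; backend. A minimal sketch (the model name and base URL below are placeholder assumptions; substitute the ones served by your local model runtime):&lt;/p&gt;

```shell
# Register a local backend; --baseurl must point at your locally hosted
# OpenAI-compatible API (model name and URL are placeholders).
k8sgpt auth add --backend localai --model ggml-gpt4all-j --baseurl http://localhost:8080/v1

# Run the same analysis, but generate explanations with the local backend
# so no cluster data leaves your environment.
k8sgpt analyze --explain --backend localai --no-cache
```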

&lt;h1&gt;
  
  
  References
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Mathematics Underlying Transformers and ChatGPT:&lt;/strong&gt; &lt;a href="https://webpages.charlotte.edu/yonwang/papers/mathTransformer.pdf" rel="noopener noreferrer"&gt;https://webpages.charlotte.edu/yonwang/papers/mathTransformer.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;K8sGPT:&lt;/strong&gt; &lt;a href="https://k8sgpt.ai/" rel="noopener noreferrer"&gt;https://k8sgpt.ai/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>ai</category>
      <category>k8s</category>
      <category>chatgpt</category>
    </item>
  </channel>
</rss>
