TL;DR: Kubernetes v1.36 releases April 22, 2026. The headline features are DRA GPU partitioning, workload-aware preemption for AI/ML jobs, and the permanent removal of the gitRepo volume plugin. Ingress-nginx is also officially retired. If you run AI inference workloads or care about cluster security, this release is not optional reading.
Why This Release Matters More Than Most
The CNCF's 2025 annual survey dropped a number that stopped a lot of people mid-scroll: 66% of organizations hosting generative AI models now use Kubernetes for some or all of their inference workloads. That's not a trend, that's a fait accompli. Kubernetes is the AI compute substrate whether you planned for it or not.
v1.36 is the release that leans into that reality. The bulk of the new work is in Dynamic Resource Allocation (DRA), gang scheduling, and topology-aware placement, all of which exist because running distributed AI/ML jobs on Kubernetes has historically been painful. This release makes it less painful.
But there are also breaking changes and security fixes that affect everyone, not just the ML crowd. Let me walk through what actually matters.
The Breaking Changes First
gitRepo Volume Plugin: Gone for Good
If you're still using gitRepo volumes, stop reading and go fix that right now. The plugin has been deprecated since v1.11 and is now permanently disabled in v1.36. No feature flag, no workaround.
The reason it's gone is serious: gitRepo allowed attackers to run code as root on the node. It was a known attack vector for years. The right replacement is an init container running git clone, or a git-sync sidecar. Both are well-documented and production-proven.
```yaml
# Before (broken in v1.36)
volumes:
  - name: code
    gitRepo:
      repository: "https://github.com/example/repo"
      revision: "main"
```

```yaml
# After: use an init container
initContainers:
  - name: git-sync
    image: registry.k8s.io/git-sync/git-sync:v4.2.1
    args:
      - --repo=https://github.com/example/repo
      - --ref=main      # git-sync v4 replaced --branch/--rev with --ref
      - --root=/git
      - --one-time
    volumeMounts:
      - name: code
        mountPath: /git
```
Ingress-NGINX Is Retired
SIG Network and the Security Response Committee retired ingress-nginx on March 24, 2026. No more releases, no more security patches. Existing deployments keep running, but you're on your own for CVEs from here.
The community's recommended alternatives are Envoy Gateway (CNCF graduated), Cilium Gateway API, and Traefik. If you're on ingress-nginx in production, this is your migration window. Don't wait for the next CVE to force your hand.
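To get a feel for the migration, here's a minimal sketch of an Ingress rule re-expressed as Gateway API resources. The names are placeholders, and the `gatewayClassName` of `eg` matches Envoy Gateway's default; adjust for whichever controller you pick.

```yaml
# Hypothetical example: Gateway API equivalent of a basic Ingress rule.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: web-gateway
spec:
  gatewayClassName: eg        # Envoy Gateway's default class; controller-specific
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
spec:
  parentRefs:
    - name: web-gateway
  hostnames:
    - "app.example.com"       # placeholder hostname
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: app-service   # placeholder backend Service
          port: 8080
```

The structural shift is that routing rules (HTTPRoute) are decoupled from the listener (Gateway), so app teams can own their routes while the platform team owns the entry point.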
service.spec.externalIPs Deprecated
The externalIPs field in Service specs is being deprecated (full removal planned for v1.43). It's been a known vector for man-in-the-middle attacks since CVE-2020-8554. You'll see deprecation warnings starting in v1.36. Migrate to LoadBalancer services, NodePort, or Gateway API.
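The migration is usually mechanical. A hedged before/after sketch, with made-up service and IP values:

```yaml
# Before: deprecated externalIPs field
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  externalIPs:
    - 203.0.113.10   # deprecated: kube-proxy routes any traffic arriving for this IP
---
# After: let a cloud or bare-metal load balancer controller assign the address
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```

On bare metal, the `LoadBalancer` type needs an implementation such as MetalLB to actually hand out addresses.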
The AI/ML Features That Actually Change How You Work
DRA: Partitionable Devices (Beta)
This is the one I'm most excited about. v1.36 promotes DRA support for partitionable devices to beta, meaning it's enabled by default. A single GPU can now be split into multiple logical units and allocated to different workloads.
Before this, if you had an H100 and a workload that only needed 20% of it, you either wasted 80% or ran a separate MIG configuration outside Kubernetes. Now the scheduler handles it natively.
```yaml
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: partial-gpu
spec:
  devices:
    requests:
      - name: gpu-slice
        deviceClassName: nvidia.com/gpu
        count: 1
        # Request a partition, not the whole device
        selectors:
          - cel:
              expression: device.attributes["nvidia.com/gpu"].partitionable == true
```
For platform teams running shared GPU clusters, this is a significant cost lever. You can pack more inference workloads onto the same hardware without sacrificing isolation.
Workload-Aware Preemption (Alpha)
Standard Kubernetes preemption works pod-by-pod. For distributed AI/ML jobs, that's a disaster: preempt one pod from a training job and the whole job stalls, wasting all the resources it's still holding.
v1.36 introduces workload-aware preemption via PodGroups. The scheduler now treats a group of related pods as a single entity. When it needs to make room for a high-priority job, it preempts entire groups rather than individual pods.
```yaml
apiVersion: scheduling.k8s.io/v1alpha1
kind: PodGroup
metadata:
  name: training-job-a
spec:
  minMember: 8
  priorityClassName: high-priority
  gangSchedulingPolicy:
    disruptionMode: PodGroup  # preempt the whole group, not individual pods
```
This is alpha, so it's off by default. But if you're running Kueue or JobSet for batch AI workloads, this is worth enabling in a test cluster now.
Pod-Level Resource Managers (Alpha)
For HPC and AI/ML workloads, NUMA alignment matters. Previously, the Topology Manager only worked at the container level. If you had a training container plus logging and monitoring sidecars in the same pod, you couldn't guarantee they all landed on the same NUMA node.
v1.36 adds pod-scope resource management: you can now set pod.spec.resources and have the Topology Manager treat the entire pod as a single scheduling unit. All containers get resources from the same NUMA node.
```yaml
spec:
  resources:
    requests:
      cpu: "16"
      memory: "64Gi"
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: topology.kubernetes.io/numa-node
      whenUnsatisfiable: DoNotSchedule
```
DRA Resource Availability Visibility (Alpha)
Finally, a native way to answer "how many GPUs are free in this cluster?" without writing custom tooling.
```shell
kubectl create -f - <<EOF
apiVersion: resource.k8s.io/v1alpha1
kind: ResourcePoolStatusRequest
metadata:
  name: check-gpus
spec:
  driver: nvidia.com/gpu
EOF

kubectl get rpsr/check-gpus -o yaml
# Returns: totalDevices, allocatedDevices, availableDevices per node
```
This is alpha, but it's the kind of operational visibility that platform teams have been hacking around for years.
The Stability Improvements
SELinux Volume Labeling: Now GA
Faster pod startup on SELinux-enforcing systems. This replaces recursive file relabeling with a single mount-time label, which can cut pod startup time significantly on large volumes. It's been in beta since v1.28 and is now stable and on by default.
If you're running RHEL or any SELinux-enforcing OS, you'll notice this immediately.
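Nothing changes in your manifests; pods that already set SELinux options just mount faster, because the label is now applied with a single `-o context` mount option instead of a recursive relabel. For context, the label comes from the pod's security context. An illustrative snippet (the MCS level and claim name are examples, not recommendations):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: selinux-demo
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456"   # example MCS label; applied at mount time in v1.36
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-pvc   # hypothetical claim name
```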
External ServiceAccount Token Signing: GA
The kube-apiserver can now delegate token signing to external KMS or HSM systems. For clusters with strict key management requirements (financial services, healthcare, government), this removes a significant compliance gap.
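Configuration happens on the API server rather than in-cluster: instead of handing kube-apiserver a signing key file, you point it at an external signer over a gRPC socket. A sketch only, based on the upstream KEP; verify the exact flag names against your version's kube-apiserver reference before relying on them:

```shell
# Sketch: delegate ServiceAccount token signing to an external signer
# (e.g. KMS/HSM-backed) listening on a unix socket.
kube-apiserver \
  --service-account-signing-endpoint=unix:///var/run/signer/signer.sock \
  --service-account-issuer=https://kubernetes.default.svc
  # ...remaining flags unchanged; this replaces --service-account-signing-key-file
```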
Graceful Leader Transition (Alpha)
Control plane components (kube-controller-manager, kube-scheduler) used to call os.Exit() when losing leader election, forcing a full restart. v1.36 introduces graceful transitions: the component moves to follower state and re-enters the election without restarting. Faster failover, less noise in your control plane logs.
Stale Controller Mitigation (Alpha)
Large clusters with high churn have always had a subtle bug: a controller creates a resource, its cache hasn't updated yet, and it tries to create the same resource again. v1.36 adds cache freshness tracking so controllers check whether their local state is current before reconciling. Fewer duplicate creates, fewer spurious errors in busy clusters.
HPA Scale-to-Zero (Alpha)
The Horizontal Pod Autoscaler can now scale deployments to zero replicas based on external metrics (queue depth, custom metrics). When the queue is empty, the deployment goes to zero. When work arrives, it scales back up. This is the missing piece for event-driven workloads that don't need to run 24/7.
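A hedged sketch of what that looks like with an external queue-depth metric. The deployment and metric names are placeholders, and `minReplicas: 0` requires the alpha feature gate (`HPAScaleToZero`) plus a metrics adapter serving the external metric:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker        # hypothetical deployment
  minReplicas: 0              # scale-to-zero: needs the HPAScaleToZero feature gate
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          name: queue_depth   # hypothetical metric from your metrics adapter
        target:
          type: AverageValue
          averageValue: "10"
```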
What to Do Before April 22
- **Audit gitRepo volumes.** Run:

  ```shell
  kubectl get pods -A -o json | jq '.items[].spec.volumes[]? | select(.gitRepo != null)'
  ```

  If you get output, you have work to do.
- **Plan your ingress-nginx migration.** Check:

  ```shell
  kubectl get ingressclass
  kubectl get pods -A | grep ingress-nginx
  ```

  If you're running it, pick a replacement and start testing.
- **Check for externalIPs usage.**

  ```shell
  kubectl get svc -A -o json | jq '.items[] | select(.spec.externalIPs != null) | .metadata.name'
  ```

- **Enable DRA partitionable devices in staging.** If you run GPU workloads, this is worth testing before it becomes the default everywhere.
- **Read the full changelog.** The CHANGELOG-1.36.md is dense but worth scanning for anything specific to your stack.
The Bigger Picture
v1.36 isn't a flashy release. There's no single feature that rewrites how Kubernetes works. Instead, it's a release that takes the AI/ML workload story seriously at the scheduler and resource allocation level, while cleaning up years of accumulated security debt.
The gitRepo removal and ingress-nginx retirement are overdue. The DRA work is genuinely new capability. And the gang scheduling improvements are the kind of thing that makes distributed training jobs actually reliable on Kubernetes instead of just theoretically possible.
If you're running AI inference at scale, v1.36 is the release you've been waiting for. If you're running anything else, it's a solid maintenance release with a few security items you can't ignore.