DevOps Start

Posted on • Originally published at devopsstart.com

Kubernetes v1.36: Features, Deprecations & Upgrade Guide

Preparing for the move to Kubernetes v1.36? This guide, originally published on devopsstart.com, breaks down the most critical changes and upgrade steps you need to know.

Introduction

Kubernetes v1.36 is on the horizon, bringing with it a mix of stability enhancements, new capabilities, and, as always, a few critical deprecations that demand your attention. Each release is a balancing act, pushing forward with new innovations while carefully sunsetting older, less efficient, or less secure features. For DevOps engineers and Site Reliability Engineers (SREs), staying on top of these changes isn't just about reading release notes; it's about understanding the practical implications for your production clusters and strategic planning.

This article cuts through the noise to focus on what truly matters for your day-to-day operations. You’ll learn about the most impactful new features graduating to General Availability, get a sneak peek at promising alpha and beta capabilities, and, critically, internalize the deprecations that could break your existing workflows if not addressed proactively. We’ll also cover best practices for a smooth upgrade, including pre-flight checks, testing strategies, and invaluable advice on mitigating risks from removed APIs. By the end, you'll have a clear roadmap for approaching your Kubernetes v1.36 upgrade with confidence.

MutatingAdmissionPolicy Graduates to GA

One of the most significant advancements in Kubernetes v1.36 is the graduation of MutatingAdmissionPolicy to General Availability (GA). This is a big deal for anyone seeking more robust and flexible admission control beyond just validating resources. Previously, you'd rely on MutatingAdmissionWebhook for dynamic mutations, which involves setting up and managing external webhooks. While powerful, external webhooks introduce network latency, potential points of failure, and operational overhead.

MutatingAdmissionPolicy effectively brings the mutation capability inside the API server. It allows you to define declarative policies, using Common Expression Language (CEL), that can inspect and modify incoming API requests before they are persisted to etcd. Imagine automatically injecting sidecar containers, adding specific labels or annotations, or setting default resource limits for pods based on certain criteria, all without an external service. This native approach offers better performance, reduced complexity, and improved security, as the logic executes directly within the trusted API server process.

To enable MutatingAdmissionPolicy, ensure the MutatingAdmissionPolicy feature gate is enabled on your API server (GA features are enabled by default, but it is worth checking if you have previously overridden feature gates). Then, you can define your policies. Here’s a simple example that automatically adds a created-by: admission-policy label to any new Pod:

```yaml
# policy.yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingAdmissionPolicy
metadata:
  name: add-creation-label
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
  failurePolicy: Ignore
  reinvocationPolicy: IfNeeded
  mutations:
    - patchType: ApplyConfiguration
      applyConfiguration:
        # Server-side-apply semantics: the label map is merged, so existing labels are kept
        expression: >
          Object{
            metadata: Object.metadata{
              labels: {"created-by": "admission-policy"}
            }
          }
```

Note: The field names above follow the MutatingAdmissionPolicy API as it existed through its alpha and beta iterations (matchConstraints for resource selection, mutations with ApplyConfiguration-style CEL patches). Field names occasionally shift during graduation, so verify the example against the final v1.36 API reference before applying it.

Applying this policy is straightforward:

```bash
$ kubectl apply -f policy.yaml
mutatingadmissionpolicy.admissionregistration.k8s.io/add-creation-label created
```

Now, when you create a pod without that label, the policy will inject it automatically. You can verify this by checking the pod's labels:

```bash
$ kubectl run my-nginx --image=nginx:1.25.3 --restart=Never
pod/my-nginx created

$ kubectl get pod my-nginx -o jsonpath='{.metadata.labels}'
{"created-by":"admission-policy","run":"my-nginx"}
```

For more in-depth examples and advanced usage, refer to the official Kubernetes documentation on Admission Controllers. This GA move signifies a mature, production-ready feature that every Kubernetes administrator should consider integrating into their security and governance strategy; for many common use cases it removes the need for an external mutating webhook entirely.
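The defaulting use case mentioned earlier can be expressed the same way. The sketch below sets a default memory limit on any container that does not declare one; it assumes the GA schema matches the alpha API and that server-side-apply list merging (keyed by container name) leaves other containers untouched, so treat the CEL and field names as assumptions to verify against the final v1.36 reference:

```yaml
# default-memory-limit.yaml -- illustrative sketch, not a verified GA manifest
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingAdmissionPolicy
metadata:
  name: default-memory-limit
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
  failurePolicy: Ignore
  mutations:
    - patchType: ApplyConfiguration
      applyConfiguration:
        # Only patch containers that declare no limits; the merge is keyed by name.
        expression: >
          Object{
            spec: Object.spec{
              containers: object.spec.containers
                .filter(c, !has(c.resources.limits))
                .map(c, Object.spec.containers{
                  name: c.name,
                  resources: Object.spec.containers.resources{
                    limits: {"memory": "512Mi"}
                  }
                })
            }
          }
```

Because failurePolicy is Ignore, a CEL error skips the mutation rather than rejecting the Pod, which is a reasonable stance for a convenience default.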

Critical Deprecation: gitRepo Volumes Removal

Every Kubernetes release brings changes, and v1.36 is no exception in its deprecations. The most impactful removal for many users will be the complete removal of the gitRepo volume type. This has been deprecated since Kubernetes v1.11, and its final removal means that any existing Pods or Deployments relying on gitRepo volumes will simply fail to provision or mount correctly after the upgrade to v1.36.

The gitRepo volume type allowed a Pod to mount an empty directory and then clone a specific Git repository into it. While convenient for quick experimentation or simple injection of configurations or scripts, it suffered from several drawbacks: lack of security (no authentication), poor performance for large repositories, and being tightly coupled to a specific Git client version within the kubelet. More importantly, it blurred the lines between application concerns and infrastructure provisioning, leading to less portable and harder-to-debug workloads.

If you are using gitRepo volumes, you must migrate before upgrading to v1.36. There are several robust alternatives, depending on your use case:

  1. Init Containers: For cloning a repository at Pod startup, an init container is the recommended and most flexible approach: you have full control over the Git client, authentication, and error handling.

    ```yaml
    # pod-with-init-container.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: my-app-with-git-init
    spec:
      initContainers:
      - name: clone-git-repo
        image: alpine/git:v2.40.1 # Use a specific Git client version
        command: ["git", "clone", "https://github.com/kubernetes/website.git", "/app/repo"]
        volumeMounts:
        - name: workdir
          mountPath: "/app/repo"
        securityContext: # Best practice: run as non-root
          runAsNonRoot: true
          runAsUser: 1000
          allowPrivilegeEscalation: false
      containers:
      - name: my-app
        image: busybox:1.36.1
        command: ["sh", "-c", "echo 'Git repo content:' && ls -la /app/repo && sleep 3600"]
        volumeMounts:
        - name: workdir
          mountPath: "/app/repo" # Mount the same volume for the main container
      volumes:
      - name: workdir
        emptyDir: {}
    ```

    With this configuration, the `clone-git-repo` init container runs first, populates the `workdir` `emptyDir` volume, and then the `my-app` container starts, finding the cloned content already present. This ensures the main application container has access to the Git repository.

  2. ConfigMaps or Secrets: For small amounts of configuration files or scripts (e.g., less than 1MB), embed them directly into `ConfigMaps` or `Secrets` and mount these into your Pods. This is ideal for static, version-controlled configuration and ensures sensitive data is handled securely via Secrets.

  3. Persistent Volumes: For larger datasets, or when the cloned content needs to persist across Pod restarts or be shared across Pods, consider a `PersistentVolume` provisioned by your storage class. It can be populated by a separate process, a one-time Job, or a dedicated Git synchronization tool.

  4. Sidecar Containers: Similar to init containers, a sidecar container can keep a repository synchronized in real time if continuous updates are required. This adds complexity and is rarely necessary for simple cloning; it is typically reserved for advanced scenarios like dynamic configuration updates.
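As a concrete sketch of option 2 (all names here are hypothetical), a small startup script can live in a ConfigMap and be mounted read-only into the Pod:

```yaml
# configmap-script.yaml -- hypothetical names, for illustration only
apiVersion: v1
kind: ConfigMap
metadata:
  name: startup-scripts
data:
  init.sh: |
    #!/bin/sh
    echo "running startup script"
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app-with-config
spec:
  containers:
  - name: my-app
    image: busybox:1.36.1
    command: ["sh", "/scripts/init.sh"]
    volumeMounts:
    - name: scripts
      mountPath: /scripts
      readOnly: true
  volumes:
  - name: scripts
    configMap:
      name: startup-scripts
      defaultMode: 0755
```

Unlike a gitRepo volume, the content is versioned with your manifests, so what runs in the Pod is exactly what you reviewed.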

The critical takeaway: audit your clusters *now* for `gitRepo` volume usage. A quick first pass is `kubectl get pods -A -o json | grep '"gitRepo"'` (a more precise `jq` query appears in the upgrade checklist below). Don't wait until the upgrade breaks your deployments, as this could lead to significant production downtime.

Upgrade Planning and Best Practices

Upgrading a Kubernetes cluster is never a trivial task, and for a production environment, it requires careful planning, thorough testing, and a solid rollback strategy. Kubernetes v1.36 introduces changes that can impact your cluster's stability if not handled correctly. Here's a structured approach to ensure a smooth transition.

Pre-Upgrade Checklist

Before you even think about upgrading, perform these essential checks:

  1. Review Release Notes: Read the official v1.36 release notes and changelog diligently. Pay special attention to breaking changes, deprecations, and any new feature gates that might impact your setup. The Kubernetes project always provides detailed documentation.
  2. API Deprecation Scan: Use tools like `kube-no-trouble` or `pluto` to scan your cluster for deprecated APIs that will be removed in v1.36. For example, explicitly search for `gitRepo` volume usage. Address all identified deprecated APIs *before* the upgrade.


    ```bash
    # Example using kubectl and jq to find pods with gitRepo volumes
    # Ensure you have jq installed: sudo apt-get install jq (or brew install jq)
    kubectl get pods --all-namespaces -o json | jq -r '.items[] | select(.spec.volumes[]? | has("gitRepo")) | "Namespace: " + .metadata.namespace + ", Pod: " + .metadata.name'
    ```

    If this command returns any output, you have `gitRepo` volumes that must be migrated.
  3. Component Compatibility: Verify that all your critical ecosystem components (e.g., CNI plugins, storage drivers, ingress controllers, service meshes, monitoring agents, CI/CD tools) are compatible with Kubernetes v1.36. Check their official compatibility matrices, as an incompatible component can render your cluster unusable.
  4. Backup etcd: Perform a full backup of your etcd database. This is your ultimate rollback point and is absolutely critical. Ensure your backup procedure is well-documented and regularly tested, validating that you can restore from it successfully.
  5. Test Environment: Have a dedicated non-production environment (staging or dev) that mirrors your production setup for testing the upgrade process and validating workloads. This allows you to identify issues without impacting live services.
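For the etcd backup step, a minimal snapshot sketch looks like the following; the endpoints and certificate paths are typical kubeadm defaults and are assumptions to adjust for your control plane layout:

```shell
#!/bin/sh
# Take a dated etcd snapshot; paths below are kubeadm defaults -- verify yours.
BACKUP_FILE="/var/backups/etcd-$(date +%F).db"

ETCDCTL_API=3 etcdctl snapshot save "$BACKUP_FILE" \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot is readable before trusting it as a rollback point.
etcdutl snapshot status "$BACKUP_FILE" --write-out=table
```

Store the snapshot off the node; a backup that lives only on the control plane it protects is not a rollback point.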

Upgrade Strategy

When performing the upgrade, follow these steps to minimize disruption:

  1. Incremental Upgrade: Avoid jumping multiple minor versions at once. Always upgrade one minor version at a time (e.g., v1.35 to v1.36, not v1.34 to v1.36). Skipping versions can lead to unexpected issues and unmanaged API changes.
  2. Control Plane First: Upgrade your control plane components (API server, controller manager, scheduler, etcd) before upgrading your worker nodes. Ensure the control plane is stable and fully functional after the upgrade before proceeding further.
  3. Cordon and Drain Nodes: When upgrading worker nodes, always cordon and drain them gracefully to allow pods to terminate and reschedule safely onto other available nodes. This prevents service interruptions.

    ```bash
    kubectl cordon <node-name>
    kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data --force
    ```

    The `--force` flag allows draining pods that are not managed by a controller; use it deliberately, because such pods are deleted and not rescheduled. Once the node is upgraded, uncordon it to allow new pods to be scheduled:

    ```bash
    kubectl uncordon <node-name>
    ```
  4. Monitor Closely: During and after the upgrade, monitor your cluster health, application logs, and resource utilization using your observability stack (e.g., Prometheus, Grafana, ELK, Datadog). Watch for increased error rates, pending pods, controller failures, or unusual resource spikes as these can indicate problems.
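For kubeadm-managed clusters, the command flow per node is roughly the following; the package names and version-pinning syntax are assumptions for a Debian-based distro, so adapt them to your environment:

```shell
#!/bin/sh
# On the first control-plane node: preview the upgrade, then apply it.
kubeadm upgrade plan
sudo kubeadm upgrade apply v1.36.0

# On each remaining node (after cordon/drain): update the node's kubelet
# configuration, then the kubelet package itself, and restart the service.
sudo kubeadm upgrade node
sudo apt-get install -y kubelet=1.36.0-*   # assumed package naming for apt
sudo systemctl daemon-reload
sudo systemctl restart kubelet
```

Managed offerings (EKS, GKE, AKS) replace this flow with their own upgrade APIs, but the control-plane-first ordering still applies.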

Post-Upgrade Validation

After the upgrade completes, a thorough validation process is crucial:

  • Run Conformance Tests: If you have a custom set of conformance or integration tests for your applications, execute them across the updated cluster. This confirms that your applications behave as expected.
  • Health Checks: Verify the health and functionality of all critical cluster components and deployed applications. Check all namespaces, services, ingress routes, and persistent volumes, ensuring they are operational.
  • Network Connectivity: Confirm that internal and external network connectivity is functioning as expected across all nodes and services.
  • Resource Consumption: Monitor cluster resource usage to detect any unexpected changes in CPU, memory, or disk I/O, which could indicate performance regressions or misconfigurations.
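The first two checks can be scripted as a quick sweep; this assumes `kubectl` points at the upgraded cluster and `jq` is installed:

```shell
#!/bin/sh
# All nodes should be Ready and report the new kubelet version.
kubectl get nodes -o wide

# API server readiness, check by check.
kubectl get --raw='/readyz?verbose'

# List every pod that is neither Running nor Succeeded; ideally this is empty.
kubectl get pods -A -o json | jq -r '
  .items[]
  | select(.status.phase != "Running" and .status.phase != "Succeeded")
  | "\(.metadata.namespace)/\(.metadata.name): \(.status.phase)"'
```

Anything the last command prints deserves a `kubectl describe` before you declare the upgrade done.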

Rollback Plan

Despite all precautions, things can go wrong. A clear, tested rollback plan is essential:

  • Document Procedure: Have a documented, tested plan for reverting to the previous Kubernetes version or restoring from your etcd backup. Practicing this in a test environment is highly recommended.
  • Identify Triggers: Define clear triggers for when to initiate a rollback (e.g., sustained application errors for more than 5 minutes, control plane instability, critical component failures, or a significant percentage of pods in a Pending state).
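If a trigger fires and the previous version cannot simply be reinstalled, restoring the etcd snapshot is the fallback. The file name and data directory below are placeholders, and the steps for stopping and repointing etcd depend on how your control plane runs (static pods vs. systemd):

```shell
#!/bin/sh
# Restore the snapshot into a fresh data directory; never restore over a live one.
etcdutl snapshot restore /var/backups/etcd-snapshot.db \
  --data-dir=/var/lib/etcd-restored

# Then stop the API server and etcd, point etcd at /var/lib/etcd-restored
# (e.g. via the static pod manifest's hostPath), and start both again.
```

Practice this end to end in the test environment; a restore procedure that has never been run is not a plan.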

By following these best practices, you can minimize downtime and ensure a smooth, confident transition to Kubernetes v1.36.

FAQ

Q1: What's the biggest risk when upgrading to Kubernetes v1.36?

The biggest immediate risk for many will be the complete removal of the gitRepo volume type, which was deprecated back in v1.11. If you have legacy workloads using this, they will fail to start or operate correctly after the upgrade. It's crucial to identify and migrate these volumes to alternative solutions like init containers before upgrading to avoid service interruptions.

Q2: How does MutatingAdmissionPolicy differ from MutatingAdmissionWebhook?

MutatingAdmissionPolicy is a native, in-process admission controller within the Kubernetes API server itself, using Common Expression Language (CEL) for declarative mutation logic. MutatingAdmissionWebhook, on the other hand, relies on external HTTP webhooks, requiring you to deploy and manage a separate service that performs the mutation logic. MutatingAdmissionPolicy offers better performance by eliminating network hops, less operational overhead, and tighter integration with the API server.

Q3: Are there any new alpha features I should pay attention to in v1.36?

The official release notes for v1.36 are still pending finalization, so treat any list of alpha features as provisional. New alpha work typically lands in areas like resource management, scheduling enhancements, and admission control. Once the release notes and the enhancements tracking board are published, review the alpha section there; alpha features ship disabled behind feature gates, so they are safe to evaluate selectively in a test cluster.

Q4: How can I test my existing applications against v1.36 without a full production upgrade?

The most effective way is to set up a dedicated test cluster running a release candidate of v1.36. Deploy a representative subset of your critical applications and services to this test cluster. Conduct thorough integration, performance, and chaos tests. Pay close attention to application logs for any deprecation warnings, unexpected behavior, increased error rates, or unusual resource consumption. This approach surfaces most issues before they can impact your production environment.

Q5: What is the benefit of kube-no-trouble or pluto for deprecation scanning?

Tools like kube-no-trouble and pluto are invaluable because they actively scan your cluster's deployed resources (Pod definitions, Deployments, etc.) and compare them against the Kubernetes API versions targeted for removal or deprecation in the upcoming release (v1.36 in this case). This provides an actionable list of resources that will break if not updated, saving hours of manual review and preventing unexpected failures during an upgrade.
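As a sketch of a typical pluto invocation (flags taken from recent pluto releases; verify against your installed version):

```shell
#!/bin/sh
# Scan a local manifests directory against APIs removed by the target version.
pluto detect-files -d ./manifests --target-versions k8s=v1.36.0 -o wide

# Scan what is actually deployed via Helm releases in the current cluster.
pluto detect-helm --target-versions k8s=v1.36.0 -o wide
```

Both commands are easy to wire into CI as a pre-upgrade gate, since pluto exits non-zero when it finds affected resources.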

Conclusion

Kubernetes v1.36 continues the platform's evolution by consolidating mature features and paving the way for future innovations. Key takeaways from this release include embracing the GA MutatingAdmissionPolicy for declarative, in-cluster request mutations, which enhances security and automation without external dependencies. Critically, you must address the removal of gitRepo volumes; audit your clusters now and migrate any affected workloads to init containers or other suitable alternatives. Failing to do so will result in deployment failures post-upgrade.

Your actionable next steps should be: first, thoroughly review the upcoming official v1.36 release notes for a comprehensive list of all changes, paying close attention to any changes impacting your specific workloads. Second, run a deprecation scan on your existing clusters to identify and remediate any API uses slated for removal. Finally, plan and execute your upgrade in a dedicated test environment, following a meticulous process of etcd backup, incremental upgrades, and rigorous post-upgrade validation. By proactively tackling these points, you'll ensure a smooth, stable, and secure transition to Kubernetes v1.36, leveraging its benefits while avoiding potential pitfalls and maintaining continuous service availability.
