This blog post covers the real issues I ran into while upgrading Karpenter from v0.25.0 → v1.5.0 in production, why they happened, and the exact fixes. If you're planning this upgrade, this guide will save you hours of debugging.
Background
Upgrading Karpenter from 0.25.0 to 1.5.0 is not a simple version bump. It requires migrating from the v1alpha5 APIs to the new v1 APIs, which is a breaking change.
Starting point:
- Karpenter v0.25.0
- EKS 1.31
- v1alpha5 CRDs (Provisioner, AWSNodeTemplate)
Target:
- Karpenter v1.5.0
- v1 CRDs (NodePool, EC2NodeClass)
- EKS 1.32 compatibility
Skipping the CRD migration step leads to controller crashes, stuck resources, and broken uninstalls, all of which I learned the hard way.
Problems I Faced During Migration
Problem 1: Chart not found in Helm repository
Error:
helm upgrade karpenter karpenter/karpenter --version 1.5.0
# Error: chart "karpenter" matching 1.5.0 not found
Why: The old Helm repo (charts.karpenter.sh) only hosts chart versions up to v0.16.3. Newer releases, including v1.x, are published to an OCI registry instead.
Fix:
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
--version 1.5.0 --namespace karpenter --wait
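If the deprecated chart repository is still registered locally, it's worth removing it so future installs resolve against the OCI registry only (the repo alias "karpenter" here is an assumption; check helm repo list for the actual name):
# Remove the legacy chart repo if it is still configured
helm repo remove karpenter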
Problem 2: OCI registry tag not found
Error:
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter --version v1.5.0
# Error: ... v1.5.0: not found
Why: Starting in v0.35.0, OCI tags no longer use the v prefix.
Fix:
--version 1.5.0 # NOT v1.5.0
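A quick sanity check before upgrading is to pull only the chart metadata; this should succeed with 1.5.0 and fail with v1.5.0:
helm show chart oci://public.ecr.aws/karpenter/karpenter --version 1.5.0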
Problem 3: Controller crash on startup (missing CRDs)
Error:
# ERROR: no matches for kind "NodeClaim" in version "karpenter.sh/v1"
# panic: unable to retrieve the complete list of server APIs
Why: The v1.5.0 controller requires new v1 CRDs (NodePool, NodeClaim, EC2NodeClass). Your cluster still contains only v1alpha5 CRDs.
Fix: Install CRDs first
helm upgrade --install karpenter-crd \
oci://public.ecr.aws/karpenter/karpenter-crd \
--version 1.5.0 \
--namespace karpenter --create-namespace
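Before moving on to the controller, confirm the v1 CRDs actually landed; the names below match what the karpenter-crd chart installs:
kubectl get crd nodepools.karpenter.sh nodeclaims.karpenter.sh ec2nodeclasses.karpenter.k8s.aws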
Then upgrade the controller:
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
--version 1.5.0 --namespace karpenter --wait
This is the most common migration failure: the CRDs must be upgraded before the controller.
Problem 4: IAM permission denied
Error:
"error": "not authorized to perform: ec2:DescribeImages"
Why: Karpenter v1 introduces new instance profile and AMI discovery workflows.
Fix: Add this to the Karpenter controller IAM role:
{
  "Effect": "Allow",
  "Action": [
    "ec2:DescribeImages",
    "iam:GetInstanceProfile",
    "iam:CreateInstanceProfile",
    "iam:DeleteInstanceProfile",
    "iam:AddRoleToInstanceProfile",
    "iam:RemoveRoleFromInstanceProfile"
  ],
  "Resource": "*"
}
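One way to attach it is as an inline policy on the controller role with aws iam put-role-policy; the role, policy, and file names below are placeholders for your setup:
# Wrap the statement above in {"Version": "2012-10-17", "Statement": [ ... ]}
# and save it, e.g. as karpenter-v1-extra.json (hypothetical filename)
aws iam put-role-policy \
  --role-name <karpenter-controller-role> \
  --policy-name karpenter-v1-migration \
  --policy-document file://karpenter-v1-extra.json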
Restart the deployment:
kubectl rollout restart deployment karpenter -n karpenter
Step-by-Step Migration
1. Backup existing resources
kubectl get provisioners -A -o yaml > provisioners-backup.yaml
kubectl get awsnodetemplates -A -o yaml > awsnodetemplates-backup.yaml
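I also keep a copy of the current Helm values and the installed CRD list, so there is a known-good reference if anything has to be rolled back:
helm get values karpenter -n karpenter > karpenter-values-backup.yaml
kubectl get crd | grep karpenter > karpenter-crds-before.txt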
2. Install v1 CRDs
helm upgrade --install karpenter-crd \
oci://public.ecr.aws/karpenter/karpenter-crd \
--version 1.5.0 --namespace karpenter
3. Update IAM permissions
(Add the policy from Problem 4.)
4. Upgrade the controller
helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
--version 1.5.0 --namespace karpenter --wait
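Before moving on, wait for the rollout to finish and make sure the new controller pods are healthy:
kubectl rollout status deployment/karpenter -n karpenter
kubectl get pods -n karpenter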
5. Convert v1alpha5 → v1 resources
Provisioner → NodePool
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
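For comparison, the v1alpha5 Provisioner this NodePool replaces looked roughly like the sketch below (simplified; your fields may differ). The requirements move under spec.template.spec, providerRef becomes nodeClassRef, and consolidation.enabled maps to disruption.consolidationPolicy:
# Old v1alpha5 resource, shown only to illustrate the field mapping
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  requirements:
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
  providerRef:
    name: default
  consolidation:
    enabled: true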
AWSNodeTemplate → EC2NodeClass
apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - alias: al2@latest
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  role: "karpenter-node-role-name"
6. Apply and verify
kubectl apply -f ec2nodeclass.yaml
kubectl apply -f nodepool.yaml
kubectl get ec2nodeclass
kubectl get nodepools
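Listing the objects isn't enough on its own; I also check that both report a Ready condition before scheduling workloads against them:
kubectl describe ec2nodeclass default | grep -A 10 Conditions
kubectl describe nodepool default | grep -A 10 Conditions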
7. Migrate nodes
# Test provisioning (kubectl run no longer supports --requests, so apply a pod manifest with resource requests instead; see the sketch below)
# Drain old nodes once the replacement capacity is Ready
kubectl drain <old-node-name> --ignore-daemonsets --delete-emptydir-data
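A minimal pod with resource requests is enough to force a scale-up; the file and pod names here are placeholders:
# karpenter-test-pod.yaml (hypothetical filename); the requests make the pod hard to fit on existing capacity, so Karpenter provisions a node
apiVersion: v1
kind: Pod
metadata:
  name: karpenter-test
spec:
  containers:
    - name: nginx
      image: nginx
      resources:
        requests:
          cpu: "1"
          memory: 1Gi
Apply it, watch kubectl get nodeclaims for a new claim, then delete the pod once the node has joined.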
8. Clean up old resources
kubectl delete provisioner default
kubectl delete awsnodetemplate default
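If either old resource hangs in Terminating (the stuck-resources failure mode mentioned at the start), one blunt workaround is to clear the finalizers, but only after every node owned by the old Provisioner is gone:
kubectl patch provisioner default --type merge -p '{"metadata":{"finalizers":null}}'
kubectl patch awsnodetemplate default --type merge -p '{"metadata":{"finalizers":null}}'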
Results
- Karpenter v1.5.0 running smoothly
- All nodes migrated to NodePools
- Cluster ready for EKS 1.32
- Zero downtime
Useful Resources
Monitor logs during upgrade:
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f