Adedamola Ajibola

How to Fix Karpenter Migration Issues During Upgrade (v0.25.0 → v1.5.0)

This blog post covers the real issues I ran into while upgrading Karpenter from v0.25.0 to v1.5.0 in production, why they happened, and the exact fixes. If you're planning this upgrade, this guide should save you hours of debugging.

Background

Upgrading Karpenter from v0.25.0 to v1.5.0 is not a simple version bump. It requires migrating from the v1alpha5 APIs to the new v1 APIs, which is a breaking change.

Starting point:

  • Karpenter v0.25.0
  • EKS 1.31
  • v1alpha5 CRDs (Provisioner, AWSNodeTemplate)

Target:

  • Karpenter v1.5.0
  • v1 CRDs (NodePool, EC2NodeClass)
  • EKS 1.32 compatibility

Skipping the CRD migration step leads to controller crashes, stuck resources, and broken uninstalls, all of which I learned the hard way.

Problems I Faced During Migration

Problem 1: Chart not found in Helm repository

Error:

helm upgrade karpenter karpenter/karpenter --version 1.5.0
# Error: chart "karpenter" matching 1.5.0 not found

Why: The old Helm repo only contains versions up to 0.16.3. Later releases, including v1.x, are published to an OCI registry instead.

Fix:

helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version 1.5.0 --namespace karpenter --wait

Problem 2: OCI registry tag not found

Error:

helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter --version v1.5.0
# Error: ... v1.5.0: not found

Why: Starting in v0.35.0, OCI tags no longer use the v prefix.

Fix:

--version 1.5.0   # NOT v1.5.0
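Because release names are usually written with a leading v while the OCI tag is not, a small wrapper can strip the prefix before calling Helm. This is only a sketch: `chart_version` is a hypothetical helper name I made up for illustration, not part of Helm or Karpenter.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip a single leading "v" so the value matches
# the OCI tag format (1.5.0 rather than v1.5.0).
chart_version() {
  local version="$1"
  printf '%s\n' "${version#v}"
}

# Example usage (commented out because it needs helm and cluster access):
# helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
#   --version "$(chart_version v1.5.0)" --namespace karpenter --wait
```

Passing a version that already lacks the prefix leaves it unchanged, so the helper is safe to use in either case.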

Problem 3: Controller crash on startup (missing CRDs)

Error:

# ERROR: no matches for kind "NodeClaim" in version "karpenter.sh/v1"
# panic: unable to retrieve the complete list of server APIs

Why: The v1.5.0 controller requires new v1 CRDs (NodePool, NodeClaim, EC2NodeClass). Your cluster still contains only v1alpha5 CRDs.

Fix: Install CRDs first

helm upgrade --install karpenter-crd \
  oci://public.ecr.aws/karpenter/karpenter-crd \
  --version 1.5.0 \
  --namespace karpenter --create-namespace

Then upgrade the controller:

helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version 1.5.0 --namespace karpenter --wait

This is the most common migration failure: the CRDs must be upgraded before the controller.
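To guard against upgrading the controller too early, a pre-flight check can confirm that the three v1 CRDs exist. The sketch below assumes the CRD names `nodepools.karpenter.sh`, `nodeclaims.karpenter.sh`, and `ec2nodeclasses.karpenter.k8s.aws`; `missing_v1_crds` is a hypothetical helper name, and it reads the CRD list from stdin so it can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read CRD names on stdin (one per line, as printed by
# `kubectl get crd -o name`) and print every required v1 CRD that is absent.
missing_v1_crds() {
  local installed required
  # Drop the "customresourcedefinition.apiextensions.k8s.io/" style prefix.
  installed="$(sed 's|^.*/||')"
  for required in \
    nodepools.karpenter.sh \
    nodeclaims.karpenter.sh \
    ec2nodeclasses.karpenter.k8s.aws; do
    printf '%s\n' "$installed" | grep -qxF "$required" || printf '%s\n' "$required"
  done
}

# Example usage against a live cluster (commented out; needs kubectl access):
# missing="$(kubectl get crd -o name | missing_v1_crds)"
# [ -z "$missing" ] || { echo "Install karpenter-crd first; missing: $missing" >&2; exit 1; }
```

An empty result means all three CRDs are present and the controller upgrade can proceed.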

Problem 4: IAM permission denied

Error:

"error": "not authorized to perform: ec2:DescribeImages"

Why: Karpenter v1 introduces new instance profile and AMI discovery workflows.

Fix: Add this to the Karpenter controller IAM role:

{
  "Effect": "Allow",
  "Action": [
    "ec2:DescribeImages",
    "iam:GetInstanceProfile",
    "iam:CreateInstanceProfile",
    "iam:DeleteInstanceProfile",
    "iam:AddRoleToInstanceProfile",
    "iam:RemoveRoleFromInstanceProfile"
  ],
  "Resource": "*"
}

Restart the deployment:

kubectl rollout restart deployment karpenter -n karpenter
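Before restarting, it can help to confirm that the policy document you are about to attach really lists every action from the fix above. The following is a rough sketch, not an IAM evaluator: `missing_iam_actions` is a hypothetical helper, `controller-policy.json` is an assumed local copy of the policy, and the check is a plain string match that does not expand wildcards such as `ec2:*`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each action from the fix above that does not
# appear as a quoted literal in the given policy JSON file.
# Limitation: wildcard grants such as "ec2:*" or "iam:*" are not expanded.
missing_iam_actions() {
  local policy_file="$1" action
  for action in \
    ec2:DescribeImages \
    iam:GetInstanceProfile \
    iam:CreateInstanceProfile \
    iam:DeleteInstanceProfile \
    iam:AddRoleToInstanceProfile \
    iam:RemoveRoleFromInstanceProfile; do
    grep -qF "\"$action\"" "$policy_file" || printf '%s\n' "$action"
  done
}

# Example usage (controller-policy.json is an assumed file name):
# missing_iam_actions controller-policy.json
```

No output means every listed action is present as a literal string in the file.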

Step-by-Step Migration

1. Backup existing resources

kubectl get provisioners -A -o yaml > provisioners-backup.yaml
kubectl get awsnodetemplates -A -o yaml > awsnodetemplates-backup.yaml

2. Install v1 CRDs

helm upgrade --install karpenter-crd \
  oci://public.ecr.aws/karpenter/karpenter-crd \
  --version 1.5.0 --namespace karpenter

3. Update IAM permissions

(Add the policy from Problem 4.)

4. Upgrade the controller

helm upgrade karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version 1.5.0 --namespace karpenter --wait

5. Convert v1alpha5 → v1 resources

Provisioner → NodePool

apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized

AWSNodeTemplate → EC2NodeClass

apiVersion: karpenter.k8s.aws/v1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiSelectorTerms:
    - alias: al2@latest
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  role: "karpenter-node-role-name"

6. Apply and verify

kubectl apply -f ec2nodeclass.yaml
kubectl apply -f nodepool.yaml

kubectl get ec2nodeclass
kubectl get nodepools

7. Migrate nodes

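# Test provisioning (recent kubectl releases no longer accept --requests on `kubectl run`)
kubectl create deployment test --image=nginx
kubectl set resources deployment test --requests=cpu=1,memory=1Gi

# Drain old nodes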
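For the drain step, one cautious approach is to generate the drain commands first and review them before running anything. The sketch below assumes that nodes launched by the old Provisioner still carry the `karpenter.sh/provisioner-name` label; `print_drain_commands` is a hypothetical helper that only prints commands and never executes them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read node names on stdin (one per line, as printed by
# `kubectl get nodes -o name`) and print one drain command per node.
# It only prints the commands, so they can be reviewed before being run.
print_drain_commands() {
  local node
  while IFS= read -r node; do
    [ -n "$node" ] || continue   # skip blank lines
    printf 'kubectl drain %s --ignore-daemonsets --delete-emptydir-data\n' "${node#node/}"
  done
}

# Example usage (commented out; needs kubectl access, and the label selector
# is an assumption about how the old nodes are labelled):
# kubectl get nodes -l karpenter.sh/provisioner-name -o name | print_drain_commands
```

Draining one node at a time and watching replacement nodes come up from the new NodePool keeps the blast radius small if something is misconfigured.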

8. Clean up old resources

kubectl delete provisioner default
kubectl delete awsnodetemplate default

Results

  • Karpenter v1.5.0 running smoothly
  • All nodes migrated to NodePools
  • Cluster ready for EKS 1.32
  • Zero downtime

Useful Resources

Monitor logs during upgrade:

kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f
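The raw log stream is noisy during an upgrade, so filtering it down to errors makes problems like the ones above easier to spot. This sketch assumes the controller writes JSON log lines with an uppercase `"level"` field; `only_errors` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only log lines whose JSON "level" field is ERROR.
# Assumes JSON-formatted controller logs with an uppercase level value.
only_errors() {
  # "|| true" keeps the exit status at zero when no line matches.
  grep -E '"level":[[:space:]]*"ERROR"' || true
}

# Example usage (commented out; needs kubectl access):
# kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter -f | only_errors
```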
