Valerii Vainkop

Helm v4 Is Here — What Actually Breaks When You Upgrade


Helm v4.1.1 is the current stable release. If you're still on v3 (most teams are), the migration is more than a binary swap. Some things break silently. Some break loudly. A few require changes to how your whole chart release pipeline works.

This is what I found going through the migration, focused on what actually matters for production platform teams.


The Short Version

Change | Impact | Action required?
OCI registry support is now the default | Medium | Remove HELM_EXPERIMENTAL_OCI=1 from scripts and CI
Resource waiting (--wait) now fails fast on failed resources | High | Test rollout hooks and health checks
Deprecated flags and backwards-compatibility aliases removed | Low | Update scripts that use them
Go module path changed to helm.sh/helm/v4 | Medium | Update any Go code importing Helm libraries
Plugin protocol updated | Medium | Ensure plugins are updated before upgrading

What Actually Changed in v4

1. OCI is now the primary registry model

In Helm v3, OCI support sat behind the HELM_EXPERIMENTAL_OCI=1 environment variable for years, then was promoted to stable in v3.8.0 but still felt second-class. In v4, OCI is the default recommended distribution method. HTTP-based chart repos still work, but OCI is where development focus lives.

What breaks: Scripts or CI pipelines that still set HELM_EXPERIMENTAL_OCI=1. The variable is gone, and in some contexts you'll get an error about an unknown environment variable.

Fix: Remove the variable. OCI commands work without it in v4.

# v3 before 3.8.0: needed this
export HELM_EXPERIMENTAL_OCI=1
helm push mychart.tgz oci://registry.example.com/charts

# v4 — just works
helm push mychart.tgz oci://registry.example.com/charts
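
To find leftover uses of the variable before upgrading, a recursive search is usually enough. This is a minimal sketch: the function name `find_oci_flag` and the default search root are illustrative assumptions, not part of Helm.

```shell
#!/usr/bin/env bash
# List every file under a directory that still mentions HELM_EXPERIMENTAL_OCI.
# The default root "." is an assumption; pass your repo or CI config directory.
find_oci_flag() {
  local root="${1:-.}"
  # -r: recurse, -l: print file names only, -I: skip binary files
  grep -rlI --exclude-dir=.git 'HELM_EXPERIMENTAL_OCI' "$root" || true
}

# Usage: find_oci_flag ./ci
```

An empty result means nothing under that directory still sets the variable.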

2. Resource waiting behavior changed

This one bites teams that don't read changelogs.

In v4.1, the kstatus-based resource waiting got two fixes that change behavior:

  • Resources that fail (not just timeout) now exit waiting immediately instead of waiting for the timeout to expire
  • Fine-grained context cancellation was added

What this means in practice: if you have health checks or rollout hooks that were accidentally succeeding via timeout (the resource failed but the wait timed out and the next step continued), those will now fail fast and visibly.

This is the correct behavior — but if you're discovering it in production for the first time, it's not a fun way to find out your health checks were wrong.

Fix: Before upgrading, audit your Helm hooks and --wait usage. Run a staging deploy and watch what actually happens when a pod fails to start.
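
One way to run that staging check is to wrap the upgrade and record both the exit code and how long the wait took. The sketch below is hedged: `staging_wait_check`, the `HELM_BIN` variable, and the release, chart, and namespace arguments are placeholders I made up for illustration. Only `upgrade --install`, `--namespace`, `--wait`, and `--timeout` are real Helm flags.

```shell
#!/usr/bin/env bash
# Staging check: run an upgrade with --wait and report how it ended.
# HELM_BIN and all arguments are placeholders (assumptions), not Helm conventions.
HELM_BIN="${HELM_BIN:-helm}"

staging_wait_check() {
  local release="$1" chart="$2" namespace="$3" timeout="${4:-5m}"
  local start end rc=0
  start="$(date +%s)"
  "$HELM_BIN" upgrade --install "$release" "$chart" \
    --namespace "$namespace" --wait --timeout "$timeout" || rc=$?
  end="$(date +%s)"
  if [ "$rc" -eq 0 ]; then
    echo "result=success"
  else
    echo "result=failed"
  fi
  echo "exit_code=$rc"
  # A failure that arrives well before the timeout suggests the fail-fast path.
  echo "elapsed_seconds=$((end - start))"
  return "$rc"
}

# Usage: staging_wait_check my-release ./charts/my-chart staging 5m
```

Running it once with the v3 binary and once with v4 against a deliberately broken image tag makes the difference in elapsed time visible.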

3. helm install and helm upgrade flag changes

Several deprecated flags were removed in v4. The ones most likely to catch teams:

  • --use-deprecated-chart-hooks — gone
  • Some flag aliases that existed for backwards compatibility — gone
  • Behavior around --atomic and --wait combinations changed slightly, and v4 renames --atomic to --rollback-on-failure

Fix: Run helm install --help and helm upgrade --help on your v4 binary and compare against your current scripts. Flag mismatches fail loudly at invocation, so at least they're easy to find.
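
The help comparison can be scripted instead of eyeballed. The following is a rough sketch, assuming the v4 binary is installed as `helm4` (the side-by-side name used in the upgrade steps later in this post). `flag_supported` and `report_flags` are hypothetical helper names. A flag that is hidden from help output may still be accepted, so treat "missing" as a prompt to test, not as proof of removal.

```shell
#!/usr/bin/env bash
# Check whether flags used in existing scripts still appear in the v4 help text.
# HELM_BIN defaults to "helm4"; that binary name is an assumption.
HELM_BIN="${HELM_BIN:-helm4}"

# flag_supported <subcommand> <flag>: succeeds if the flag is listed in --help.
flag_supported() {
  local subcommand="$1" flag="$2"
  "$HELM_BIN" "$subcommand" --help 2>&1 \
    | grep -qE -- "(^|[[:space:],])${flag}([[:space:]=,]|\$)"
}

# report_flags <subcommand> <flag>...: print one status line per flag.
report_flags() {
  local subcommand="$1" flag
  shift
  for flag in "$@"; do
    if flag_supported "$subcommand" "$flag"; then
      echo "ok       $flag"
    else
      echo "missing  $flag"
    fi
  done
}

# Usage: report_flags upgrade --wait --atomic --timeout
```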

4. Go module path changed

If you import Helm as a Go library (custom tooling, operators, controllers), the module path changed from helm.sh/helm/v3 to helm.sh/helm/v4.

Fix:

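# Find all imports
grep -rn "helm.sh/helm/v3" . --include="*.go"

# Rewrite the import paths (GNU sed; on macOS/BSD use: sed -i '' ...)
find . -name "*.go" -exec sed -i 's|helm\.sh/helm/v3|helm.sh/helm/v4|g' {} +

# Pull in the v4 module, then drop the unused v3 requirement from go.mod
go get helm.sh/helm/v4@latest
go mod tidy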

5. Plugin compatibility

Helm plugins use a protocol to communicate with the CLI. v4 updated this protocol. Plugins built for v3 may not work correctly with v4.

Fix: Before upgrading, check every plugin you use (helm plugin list) and verify that the plugin maintainer has released a v4-compatible version. For critical plugins, test in staging first.
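
A quick smoke test can flag plugins that do not even load under v4. This is only a sketch under several assumptions: that `plugin list` prints a header row with the plugin name in the first column, that each plugin is invoked as a subcommand of the same name, and that it answers `--help`. The `HELM3_BIN` and `HELM4_BIN` variables and both function names are hypothetical. A passing result shows the plugin starts, not that it behaves correctly.

```shell
#!/usr/bin/env bash
# Smoke-test installed plugins against the v4 binary.
# HELM3_BIN / HELM4_BIN are assumed names for the two binaries.
HELM3_BIN="${HELM3_BIN:-helm}"
HELM4_BIN="${HELM4_BIN:-helm4}"

# Print plugin names: skip the header row, take the first column.
plugin_names() {
  "$1" plugin list 2>/dev/null | awk 'NR > 1 && NF > 0 { print $1 }' | sort
}

# Run "<plugin> --help" under v4 for every plugin that v3 lists.
smoke_test_plugins() {
  plugin_names "$HELM3_BIN" | while read -r name; do
    # </dev/null keeps the plugin from consuming the loop's input.
    if "$HELM4_BIN" "$name" --help >/dev/null 2>&1 </dev/null; then
      echo "ok    $name"
    else
      echo "FAIL  $name"
    fi
  done
}

# Usage: smoke_test_plugins
```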


The Upgrade Path

My recommended sequence:

  1. Inventory your current state
   helm version
   helm plugin list
   helm list --all-namespaces

  2. Download the v4 binary side-by-side (don't replace v3 yet)
   curl -fsSL https://get.helm.sh/helm-v4.1.1-linux-amd64.tar.gz | tar xz
   mv linux-amd64/helm /usr/local/bin/helm4
   helm4 version

  3. Test against your existing releases: the helm4 status and helm4 history commands should work against v3-deployed releases

  4. Run a staging helm4 upgrade for your most complex chart and watch for flag errors, hook behavior changes, and wait timeout differences

  5. Update any plugins you depend on

  6. Update CI/CD pipelines: find all uses of helm in your pipeline definitions and update flags

  7. Swap the binary once staging is clean
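
Checking existing releases with the side-by-side binary can be looped over everything the v3 binary lists. The sketch below assumes the default table output of `helm list` (a header row, then NAME and NAMESPACE as the first two columns). `HELM3_BIN`, `HELM4_BIN`, and `check_releases_with_v4` are names I chose for illustration.

```shell
#!/usr/bin/env bash
# For every release the v3 binary lists, confirm the v4 binary can read its status.
# Assumes default table output: header row, then NAME and NAMESPACE columns first.
HELM3_BIN="${HELM3_BIN:-helm}"
HELM4_BIN="${HELM4_BIN:-helm4}"

check_releases_with_v4() {
  local failures=0 name namespace rest
  while read -r name namespace rest; do
    [ -n "$name" ] || continue   # skip the blank line produced by an empty list
    if "$HELM4_BIN" status "$name" --namespace "$namespace" >/dev/null 2>&1 </dev/null; then
      echo "ok    $namespace/$name"
    else
      echo "FAIL  $namespace/$name"
      failures=$((failures + 1))
    fi
  done <<EOF
$("$HELM3_BIN" list --all-namespaces | awk 'NR > 1 && NF > 1')
EOF
  # Succeed only if every release was readable.
  [ "$failures" -eq 0 ]
}

# Usage: check_releases_with_v4 || echo "some releases need a closer look"
```

This only reads release state, so it is safe to run against production before any upgrade is attempted.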


Is the Upgrade Worth It?

For most teams: yes, eventually, but there's no urgency if v3 is stable for you. The Helm project's stated support window has v3 receiving bug fixes until July 8, 2026 and security fixes until November 11, 2026.

If you're building new infrastructure or greenfielding a new cluster: start with v4. The OCI-first model is cleaner and where the ecosystem is heading.

If you have existing production releases: stage it, test the hook behavior change specifically, and upgrade during a maintenance window. It's not risky if you're methodical — but the wait behavior change has real potential to expose latent issues in your health checks.


tl;dr Checklist

  • [ ] Remove HELM_EXPERIMENTAL_OCI=1 from all scripts and CI
  • [ ] Audit --wait, --atomic, and hook behavior in staging
  • [ ] Check plugin compatibility before upgrading
  • [ ] Update Go module imports if you use Helm as a library
  • [ ] Run helm4 upgrade --dry-run against your critical releases before cutting over

The upgrade is manageable. Just don't skip staging.

