Muskan

Terraform Lock-In Is Real: Here's How to Get Out

The License Change That Made Lock-In Visible

In August 2023, HashiCorp changed everything. Terraform moved from the Mozilla Public License (MPL 2.0) to the Business Source License (BSL 1.1). The new license prohibits using Terraform to build a product that competes with HashiCorp's own offerings. For most engineering teams, that restriction meant nothing in practice. But the change did something more important: it made visible how dependent the infrastructure world had become on a single vendor's tool.

Teams that had been writing HCL for three years suddenly had a reason to ask: what would it take to leave? The answer, for most of them, was uncomfortable.

Within weeks of the BSL announcement, a community fork appeared, and by September 2023 it had joined the Linux Foundation as OpenTofu. OpenTofu reached general availability in January 2024. The speed of the fork was remarkable. The reason was simple: the infrastructure community had built too much on Terraform to let a license dispute make it a dead end, but they also weren't ready to hand full control to a single commercial entity.

The fork created an exit ramp. But whether you take it, or switch to something else entirely, starts with understanding what you're actually locked into.

What Terraform Lock-In Actually Looks Like

Lock-in is not one thing. It is four things stacked on top of each other.

HCL. HashiCorp Configuration Language is declarative, readable, and entirely specific to the HashiCorp toolchain. Skills transfer to Vault and Consul, but not to Pulumi, CDK, or Crossplane. Every engineer who knows Terraform knows HCL. That knowledge does not move.

Providers. There are over 3,000 providers in the Terraform registry, covering AWS, GCP, Azure, Datadog, PagerDuty, Cloudflare, and hundreds of niche services. These providers are written against the Terraform plugin SDK. OpenTofu maintains full compatibility here. Pulumi and CDK do not: they have their own provider layers (Pulumi's includes a bridge to Terraform providers), with varying coverage.

State. This is where migration actually gets painful. Terraform state files are JSON, but they are keyed against specific provider schema versions. Moving 500 resources from one tool to another means reconciling every resource's state representation, one at a time. There is no bulk converter that works reliably at scale.
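To make the state problem concrete, here is a minimal sketch that summarizes one state document: how many managed resource instances it holds, of which types, and which schema versions they were written with. It assumes the version 4 state layout (a top-level resources list whose entries carry mode, type, and instances, each instance with a schema_version). The function name and the sample data are illustrative, not taken from any real project.

```python
import json
from collections import Counter


def summarize_state(state_text):
    """Summarize a state document (assumed to be format version 4)."""
    state = json.loads(state_text)
    by_type = Counter()
    schema_versions = {}
    for resource in state.get("resources", []):
        if resource.get("mode") != "managed":
            continue  # data sources are re-read on refresh, not migrated
        instances = resource.get("instances", [])
        by_type[resource["type"]] += len(instances)
        for instance in instances:
            versions = schema_versions.setdefault(resource["type"], set())
            versions.add(instance.get("schema_version", 0))
    return {
        "total_instances": sum(by_type.values()),
        "by_type": dict(by_type),
        "schema_versions": {t: sorted(v) for t, v in schema_versions.items()},
    }


# Made-up sample state, trimmed to the fields the summary reads.
AWS = 'provider["registry.terraform.io/hashicorp/aws"]'
SAMPLE_STATE = json.dumps({
    "version": 4,
    "resources": [
        {"mode": "managed", "type": "aws_s3_bucket", "name": "logs",
         "provider": AWS,
         "instances": [{"schema_version": 0, "attributes": {}}]},
        {"mode": "managed", "type": "aws_instance", "name": "web",
         "provider": AWS,
         "instances": [{"schema_version": 1, "attributes": {}},
                       {"schema_version": 1, "attributes": {}}]},
        {"mode": "data", "type": "aws_ami", "name": "ubuntu",
         "provider": AWS,
         "instances": [{"schema_version": 0, "attributes": {}}]},
    ],
})

summary = summarize_state(SAMPLE_STATE)
print(summary)
```

Running something like this across every state file gives a rough count of how many resource representations a non-compatible target tool would have to reconcile.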

Module registry coupling. Community modules on the Terraform registry assume HCL syntax, Terraform variable conventions, and tfstate output formats. If you switch tools, you lose access to this ecosystem unless the new tool explicitly provides compatibility layers.

Understanding which of these four you are most exposed to determines which exit path makes sense.

The Real Cost of Leaving

Teams underestimate migration cost because they measure the wrong thing. The question is not "how many lines of HCL do we have?" It is "how many distinct state files do we have, and how complex is each one?"

| Team Size | Module Count | Estimated Migration Effort | Realistic Timeline |
|---|---|---|---|
| 5 engineers | <20 modules | 2-4 weeks | 1 month |
| 15 engineers | 20-80 modules | 2-4 months | 3-6 months |
| 30+ engineers | 80-200 modules | 6-12 months | 6-18 months |
| Platform team | 200+ modules | 12-24 months | 18+ months |

These numbers assume a clean migration path, not a big-bang rewrite. Teams that attempt a big-bang switch consistently overshoot timelines by 2x. The ones that migrate incrementally, module by module, stay close to estimates.

Beyond engineering hours, there are three hidden costs most teams miss. CI/CD pipelines need rewiring: every plan/apply workflow in GitHub Actions, GitLab CI, or Jenkins has Terraform-specific steps. Secrets management integrations often use Terraform provider-level credential injection. And runbooks, documentation, and onboarding guides are written for Terraform. These take time to update and they are easy to forget until they break something in production.
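One way to size the CI/CD rewiring is to count where pipelines actually invoke the CLI. The sketch below walks a repository and reports lines that call a terraform subcommand. The file patterns and the subcommand list are assumptions about a typical setup, so adjust both to match yours.

```python
import re
from pathlib import Path

# Subcommands worth flagging; this list is an assumption, extend as needed.
TERRAFORM_CALL = re.compile(
    r"\bterraform\s+(init|plan|apply|destroy|validate|fmt|import|state|output)\b"
)

# Where CI definitions are assumed to live (GitHub Actions, GitLab CI, Jenkins).
CI_FILE_PATTERNS = ("*.yml", "*.yaml", "Jenkinsfile")


def find_terraform_calls(root):
    """Return (path, line_number, line) for each terraform CLI call under root."""
    hits = []
    for pattern in CI_FILE_PATTERNS:
        for path in sorted(Path(root).rglob(pattern)):
            if not path.is_file():
                continue
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if TERRAFORM_CALL.search(line):
                    hits.append((str(path), number, line.strip()))
    return hits
```

Calling find_terraform_calls(".") from a repository root gives a first inventory. It only catches direct CLI calls, so wrapper actions and shared pipeline templates still need a manual pass.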

The license concern alone is not worth a 12-month migration for most teams. The migration becomes worth it when you also want better language support, stronger testing primitives, or a genuinely open governance model.

Your Three Exit Paths

Not all exits look the same. The right path depends on what you are trying to solve.

| Criterion | OpenTofu | Pulumi | CDK / Crossplane |
|---|---|---|---|
| Migration effort | Lowest (binary swap) | High (full rewrite) | High (paradigm shift) |
| State compatibility | Full (wire-compatible) | None (manual import) | None (manual import) |
| Language | HCL (same as Terraform) | TS, Python, Go, C# | TS/Python (CDK); YAML (Crossplane) |
| Provider coverage | All Terraform providers | Pulumi registry + Terraform bridge | AWS CDK: AWS only; Crossplane: varies |
| Governance | Linux Foundation | Commercial (Pulumi Corp) | AWS (CDK); CNCF (Crossplane) |
| Maturity | High (1.6+ GA) | High (production-proven) | Medium-High |

OpenTofu is the right choice if your primary concern is the BSL license, or if you want to stay on HCL and avoid a rewrite. In most cases the migration amounts to replacing the terraform binary with tofu. State files are compatible. Providers are compatible. CI/CD pipelines often need only one-line changes. Teams of 50+ with hundreds of modules should default here.

Pulumi is the right choice if HCL is itself the problem. Pulumi lets you write infrastructure in TypeScript, Python, Go, or C#. You get loops, conditionals, unit tests, and type checking without workarounds. The tradeoff is a full rewrite. There is no fully automated path from HCL to Pulumi: the pulumi convert tool handles simple cases but fails on complex modules. Budget for a full migration effort.

CDK / Crossplane is the right choice if you want to change the model, not just the language. Crossplane fits teams that are already Kubernetes-native and want infrastructure managed via operators and GitOps: it treats every cloud resource as a Kubernetes custom resource. AWS CDK fits AWS-centric teams: it generates CloudFormation from general-purpose code. Both require rethinking how you model infrastructure, not just what language you use.

How to Migrate Without Breaking Production

The migration that does not break production is incremental. Never migrate a running system by switching the tool underneath it in one step.

Phase 1: Audit. Run terraform state list against every state file in your infrastructure. Count total managed resources. Map which state files are shared across teams. Identify state files that manage production database clusters versus those that manage S3 buckets for dev environments. Start with the latter.
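The audit step can be scripted once the output of terraform state list has been captured for each state. Below is a rough sketch that counts managed resources per type from that output and orders states so that low-risk ones come first. The HIGH_RISK_TYPES set and the function names are hypothetical. The address parsing is deliberately simple: it strips index brackets and module prefixes and skips data sources.

```python
import re
from collections import Counter

# Hypothetical: resource types this team treats as too risky to migrate first.
HIGH_RISK_TYPES = {"aws_db_instance", "aws_rds_cluster"}

# Matches index suffixes such as [0] or ["key"] in a state address.
INDEX = re.compile(r"\[[^\]]*\]")


def resource_type(address):
    """Return the resource type for a state address, or None for data sources."""
    tokens = INDEX.sub("", address.strip()).split(".")
    while len(tokens) >= 2 and tokens[0] == "module":
        tokens = tokens[2:]  # drop each "module.<name>" prefix
    if len(tokens) < 2 or tokens[0] == "data":
        return None
    return tokens[0]


def audit(state_list_output):
    """Summarize the text printed by `terraform state list` for one state."""
    counts = Counter()
    for line in state_list_output.splitlines():
        rtype = resource_type(line)
        if rtype:
            counts[rtype] += 1
    return {
        "total": sum(counts.values()),
        "by_type": dict(counts),
        "high_risk": sorted(set(counts) & HIGH_RISK_TYPES),
    }


def migration_order(audits):
    """Order state names: no high-risk types first, then by resource count."""
    return sorted(
        audits,
        key=lambda name: (bool(audits[name]["high_risk"]), audits[name]["total"]),
    )
```

Feeding each state's listing through audit and the results through migration_order gives a first-cut queue in which dev buckets land ahead of production databases.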

Phase 2: Greenfield new. Stop writing new Terraform. All new infrastructure gets created in the target tool. This limits the blast radius of any migration mistakes and lets your team build muscle memory before touching production state.

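Phase 3: Migrate state. For OpenTofu: run tofu init in the module directory, then tofu plan. If the plan reports no changes, you are done. For Pulumi: use pulumi import to pull existing resources into Pulumi state, and run both tools in parallel for one sprint to validate that outputs match. Only then cut over: remove the resources from Terraform state with terraform state rm before deleting the corresponding Terraform config. Deleting the config while the resources are still in state means the next terraform apply will plan to destroy them.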
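For the OpenTofu path, one way to confirm that a swapped binary sees the same infrastructure is to run a plan with -detailed-exitcode, which exits 0 when nothing would change, 2 when changes are pending, and 1 on error. The wrapper below is a sketch. The function names are illustrative, and it assumes the tofu binary is on PATH and that backend credentials are already configured.

```python
import subprocess


def interpret_plan_exit_code(code):
    """Map the exit code of `plan -detailed-exitcode` to a short status."""
    if code == 0:
        return "no-changes"
    if code == 2:
        return "changes-pending"
    return "error"


def verify_no_drift(module_dir, binary="tofu"):
    """Run init and plan in module_dir and report whether the plan is empty.

    Nothing is applied; the plan is only used as a compatibility check.
    """
    subprocess.run([binary, "init", "-input=false"], cwd=module_dir, check=True)
    result = subprocess.run(
        [binary, "plan", "-detailed-exitcode", "-input=false"],
        cwd=module_dir,
    )
    return interpret_plan_exit_code(result.returncode)
```

A result of "changes-pending" right after the swap is a signal to stop and read the plan before touching anything, since nothing in the configuration has changed.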

Phase 4: Retire old. Delete Terraform configs for migrated modules. Remove Terraform-specific CI jobs. Update your runbooks. The migration is not complete until the old toolchain is gone, not just unused.

For teams using tfmigrate, the tool can batch state moves across module boundaries, which is useful when reorganizing how state is split during migration.

Preventing the Next Lock-In

Switching tools is expensive. The way to avoid doing it again is to stop letting tool APIs leak into every module.

Application teams should call internal modules with domain-meaningful inputs: database_size = "medium", environment = "staging". They should not call provider-specific resources directly. The internal module translates domain inputs to provider calls.

When the provider changes or the tool changes, you update the internal module. Application teams do not change. This pattern does not eliminate migration cost, but it concentrates it in one place instead of spreading it across every team's configuration.
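The same idea can be sketched in a general-purpose language. Here the application team's request carries only domain inputs, and a single internal function translates them into provider-level arguments. The size catalogue, the field names, and the output keys are all hypothetical, chosen only to illustrate the boundary.

```python
from dataclasses import dataclass

# Hypothetical size catalogue owned by the platform team.
DATABASE_SIZES = {
    "small": {"instance_class": "db.t3.medium", "storage_gb": 50},
    "medium": {"instance_class": "db.m5.large", "storage_gb": 200},
    "large": {"instance_class": "db.m5.2xlarge", "storage_gb": 1000},
}


@dataclass(frozen=True)
class DatabaseRequest:
    """What an application team asks for: domain terms only."""
    name: str
    database_size: str
    environment: str


def to_provider_args(request):
    """Translate a domain request into provider-level arguments.

    This is the only place that knows provider vocabulary, so a provider
    or tool change is absorbed here rather than in every caller.
    """
    if request.database_size not in DATABASE_SIZES:
        raise ValueError(f"unknown database_size: {request.database_size!r}")
    spec = DATABASE_SIZES[request.database_size]
    return {
        "identifier": f"{request.name}-{request.environment}",
        "instance_class": spec["instance_class"],
        "allocated_storage": spec["storage_gb"],
        "multi_az": request.environment == "production",
    }


args = to_provider_args(DatabaseRequest("orders", "medium", "staging"))
print(args)
```

If the underlying tool or provider changes, only to_provider_args and the catalogue need to change. Callers keep passing database_size and environment.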

Terraform made infrastructure programmable at scale. That contribution is real. But every tool that succeeds at scale creates its own gravity. Understanding that gravity, knowing what it costs to escape, and building enough abstraction to make the next exit cheaper: that is the work.
