Ask any platform engineer who runs Terraform at scale and you will hear the same stories. The state file that got corrupted mid-apply. The terraform apply that took down more than it was supposed to. The drift that nobody noticed until an incident did. Terraform is excellent. But it was designed for a different job than what platform teams are doing today.
The job has changed. Platform teams in 2026 aren't just provisioning infrastructure for themselves. They're building infrastructure APIs for other teams across multiple clouds, with developers who want a VPC without caring what a subnet group is. Terraform's model wasn't built for this. Crossplane's was.
This post explains the shift, why teams are moving, what Crossplane actually is, and what you need to understand to evaluate it honestly.
The Real Problems With Terraform at Scale
Terraform gets unfairly blamed for problems that are actually symptoms of a deeper mismatch. Here's what that mismatch looks like in practice.
State is fragile and it grows: Your entire infrastructure truth lives in a .tfstate file. Back it up to S3, rely on its native locking, and it still breaks in ways that cost you hours of debugging. Two pipelines racing for a lock. A corrupted state from an interrupted apply. A stale local state that someone ran apply against without refreshing. And as your infrastructure grows, so does the state file: hundreds of resources, thousands of attributes, all in one JSON blob. A terraform refresh against a large state can take minutes, and every plan has to load and parse the whole thing. The bigger your platform gets, the slower and more fragile the state becomes.
There's no reconciliation loop: Terraform does what you tell it. If you say create the bucket, it will create the bucket. But its job ends there, and it stops watching. If someone changes a security group in the console, Terraform has no idea. Drift accumulates silently between runs. You only find out during the next plan, or during an incident. For a platform managing hundreds of resources, this is a slow-burning reliability problem.
Multi-cloud and multi-region setups are painful: Terraform can target multiple clouds and regions, but it doesn't make it easy. You end up juggling separate provider blocks, separate backend configs, separate workspaces, or even separate repos for each region. A three-region, two-cloud setup means coordinating state files, provider versions, and pipeline runs across all of them. There's no unified view of your infrastructure.
It gets worse operationally. Even after setting up your main AWS Terraform installation with IAM roles, adding Azure or GCP means spinning up entirely separate Terraform setups: separate backends, separate auth configurations, separate provider configurations. Developers have to manually switch between them depending on what they're targeting. us-east-1? One workspace. ap-south-1? Switch context. Azure resources? A different setup entirely. It's a collection of separately managed silos that happen to be run by the same tool.
There's no native RBAC: The Terraform CLI has no concept of "this developer can only run plan, not apply" or "this team can only touch resources in their namespace." Access control lives entirely outside Terraform: in your CI/CD system, in your cloud IAM policies, in whatever conventions your team agrees on and hopes people follow. A junior engineer experimenting and a senior engineer making a production change look identical to Terraform. You build guardrails yourself, or you don't have them at all.
What Crossplane Actually Is
Crossplane is a Kubernetes extension that turns your cluster into a universal control plane for cloud infrastructure. You define cloud resources (S3 buckets, RDS instances, VPCs) as Kubernetes custom resources. Provider controllers watch those resources and continuously reconcile actual cloud state against your declared desired state.
The key word is continuously.
There's no terraform apply equivalent. No pipeline gate between intent and reality. You declare what you want, submit it to the cluster, and the reconciler owns it from there. If a resource drifts (someone deletes it manually or changes a setting in the console), the controller corrects it automatically. The cluster is your state. Not a file. A living, watching control loop.
apiVersion: s3.aws.upbound.io/v1beta1
kind: Bucket
metadata:
  name: crossplane-bucket-xxx1234
spec:
  forProvider:
    region: ap-south-1
  providerConfigRef:
    name: aws-provider
Apply this once. The bucket is created and kept in sync with this spec for as long as the resource exists. No manual reconciliation. No wondering if what's in your state file matches what's actually in AWS. By default, Crossplane provider controllers rerun the reconciliation loop about every 10 minutes, so if someone deletes a resource manually or changes a setting in the console, the next pass corrects it. No human intervention, no pipeline trigger, no alert you have to act on. The controller just fixes it.
And there is no state file to worry about. Ever. Want to know the current state of your infrastructure? Just run kubectl get. What you see is the live, reconciled truth. The cluster itself is the state. You're not querying a JSON file that might be stale or locked or corrupted. You're querying the same API server that's actively managing your resources. That's a fundamentally different and far more trustworthy relationship with your infrastructure's ground truth.
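The Bucket above points at a ProviderConfig named aws-provider, which is where the provider learns how to authenticate. A minimal sketch of what that object might look like, assuming static credentials stored in a Kubernetes Secret (the Secret name aws-secret and key creds are hypothetical, not from this post):

```yaml
# Sketch only: assumes the Upbound AWS provider family is installed and
# that a Secret named "aws-secret" (hypothetical) holds an AWS credentials file.
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: aws-provider        # matches providerConfigRef.name in the Bucket
spec:
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: aws-secret      # hypothetical Secret name
      key: creds            # hypothetical key inside the Secret
```

Static credentials are the simplest way to get started. On EKS, most teams would likely move to an IAM-role-based source instead.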
The Building Blocks Worth Understanding
Crossplane has a layered model that takes a bit of time to understand. These four concepts are the ones that matter.
1. Managed Resources
A Managed Resource is a one-to-one mapping to a single cloud resource: an RDS instance, a GCS bucket, an Azure VNet. This is the lowest level. You can use Managed Resources directly, but most platform teams don't expose them to developers. They're the raw material, not the finished product.
2. CompositeResourceDefinitions (XRDs) — Your API
An XRD lets your platform team define a custom infrastructure API: a new Kubernetes resource with a schema you design. You decide what fields are exposed, what's required, what has defaults. Everything else is an implementation detail that developers never see.
For example: a Database with just three fields: engine, environment, and size. That's all a developer needs to ask for a database. Which VPC it lands in, which subnet, what encryption standard, what backup retention policy, whether it's Multi-AZ: none of that is their problem. Your platform team owns those decisions and bakes them into the Composition. The developer gets a database. The platform team maintains the standard.
apiVersion: apiextensions.crossplane.io/v2
kind: CompositeResourceDefinition
metadata:
  name: databases.platform.example.com
spec:
  scope: Namespaced
  group: platform.example.com
  names:
    kind: Database
    plural: databases
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required:
                - parameters
              properties:
                parameters:
                  type: object
                  required:
                    - engine
                    - size
                    - environment
                  properties:
                    engine:
                      type: string
                      enum: [postgres, mysql]
                    size:
                      type: string
                      enum: [small, medium, large]
                    environment:
                      type: string
                      enum: [staging, production]
With scope: Namespaced, the Database resource is available directly in each team's namespace. In Crossplane v2, Claims are gone; developers apply the XR directly. A team in the payments namespace just does:
apiVersion: platform.example.com/v1alpha1
kind: Database
metadata:
  name: payments-db
  namespace: payments
spec:
  parameters:
    engine: postgres
    size: medium
    environment: production
That's it. kubectl apply and the platform takes it from there.
3. Compositions — The Implementation Behind the API
A Composition defines what actually gets created when someone applies a Database. It maps the high-level fields to the real Managed Resources underneath: an RDS instance, a parameter group, a subnet group, a secret in Secrets Manager. One Composition can target AWS; another, for the same XRD, can target GCP Cloud SQL. Same developer-facing API, different cloud underneath.
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: database-aws-postgres
  labels:
    provider: aws
spec:
  compositeTypeRef:
    apiVersion: platform.example.com/v1alpha1
    kind: Database
  mode: Pipeline
  pipeline:
    - step: patch-and-transform
      functionRef:
        name: function-patch-and-transform
      input:
        apiVersion: pt.fn.crossplane.io/v1beta1
        kind: Resources
        resources:
          - name: rds-instance
            base:
              # Worth verifying for your versions: with a namespaced XR in
              # Crossplane v2, composed resources are expected to be
              # namespaced too, which may mean using the provider's
              # namespaced API group instead of this cluster-scoped one.
              apiVersion: rds.aws.upbound.io/v1beta1
              kind: Instance
              spec:
                forProvider:
                  # Platform-owned defaults that developers never see
                  region: ap-south-1
                  dbSubnetGroupName: platform-subnet-group
                  vpcSecurityGroupIds:
                    - sg-0abc1234def56789a
                  storageEncrypted: true
                  backupRetentionPeriod: 7
                providerConfigRef:
                  name: aws-prod
            patches:
              # Copy the developer-facing engine straight through
              - type: FromCompositeFieldPath
                fromFieldPath: spec.parameters.engine
                toFieldPath: spec.forProvider.engine
              # Translate the t-shirt size into an RDS instance class
              - type: FromCompositeFieldPath
                fromFieldPath: spec.parameters.size
                toFieldPath: spec.forProvider.instanceClass
                transforms:
                  - type: map
                    map:
                      small: db.t3.micro
                      medium: db.t3.medium
                      large: db.r6g.xlarge
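Since this Composition carries the label provider: aws, an XR can select it by label when more than one Composition exists for the same XRD. A hedged sketch, assuming Crossplane v2, where the Crossplane-specific fields of an XR sit under spec.crossplane (in v1 they sat directly under spec); check the field path against the version you run:

```yaml
# Sketch only: selects a Composition by label rather than by name.
# Assumes Crossplane v2 field layout (spec.crossplane.*).
apiVersion: platform.example.com/v1alpha1
kind: Database
metadata:
  name: payments-db
  namespace: payments
spec:
  crossplane:
    compositionSelector:
      matchLabels:
        provider: aws       # matches metadata.labels on the Composition
  parameters:
    engine: postgres
    size: medium
    environment: production
```

A platform team that prefers to hide the cloud choice entirely could leave the selector out of the developer-facing examples and set a default Composition on the XRD instead.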
Your platform team owns the XRD (the contract) and the Composition (the fulfillment). App teams own the XRs they apply in their namespaces. That separation is enforced by the Kubernetes API, not by convention or documentation.
Compare this to Terraform modules. Modules are reusable, but they leak details. The caller still needs to know the variables, wire up outputs, understand dependencies. There's no real API boundary, no RBAC, no clean separation between platform and application concerns. Crossplane gives you that boundary natively.
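Because a Database is just another Kubernetes resource, that boundary can be expressed with ordinary Kubernetes RBAC. A minimal sketch, assuming a group named payments-team exists in your identity provider (the group and object names here are hypothetical):

```yaml
# Sketch only: lets one team manage Database XRs in its own namespace
# and nothing else. Names are illustrative.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: database-editor
  namespace: payments
rules:
  - apiGroups: ["platform.example.com"]
    resources: ["databases"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: payments-database-editors
  namespace: payments
subjects:
  - kind: Group
    name: payments-team     # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: database-editor
  apiGroup: rbac.authorization.k8s.io
```

The team gets no rule for Compositions, XRDs, or the underlying Managed Resources, so the platform's implementation stays out of reach by default.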
4. Functions — Logic Inside Compositions
Early Compositions were purely declarative, which meant no conditionals, no loops, no real logic. Functions fix that. A Function is a small container that runs in the Composition pipeline: it receives the observed state and the desired state built so far, and returns an updated set of desired resources.
The widely used function-patch-and-transform (installed as a package, like any other Function) handles most common cases: field mapping and value transformation. For heavier logic, you can write Functions in Go, Python, or KCL.
pipeline:
  - step: size-to-instance-class
    functionRef:
      name: function-patch-and-transform
    input:
      apiVersion: pt.fn.crossplane.io/v1beta1
      kind: Resources
      resources:
        - name: rds-instance
          patches:
            - type: FromCompositeFieldPath
              fromFieldPath: spec.parameters.size
              toFieldPath: spec.forProvider.instanceClass
              transforms:
                - type: map
                  map:
                    small: db.t3.micro
                    medium: db.t3.medium
                    large: db.r6g.xlarge
Functions keep Compositions programmable without abandoning the declarative model that GitOps depends on.
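A pipeline step refers to a Function by name through functionRef, so the Function has to be installed in the cluster first. A minimal sketch of that install; the registry path and version tag are assumptions, so check the function's release page for the current ones:

```yaml
# Sketch only: installs the Function that the pipeline steps reference.
# The package path and tag below are assumptions, not taken from this post.
apiVersion: pkg.crossplane.io/v1
kind: Function
metadata:
  name: function-patch-and-transform   # must match functionRef.name
spec:
  package: xpkg.crossplane.io/crossplane-contrib/function-patch-and-transform:v0.8.2
```

Older Crossplane releases served Function under a beta API version, so the apiVersion may need adjusting on clusters that haven't been upgraded.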
So How Does It All Fit Together?
Think of it as two worlds, the developer's and the platform's, with Crossplane sitting in the middle translating between them.
The developer lives entirely in the top layer. They apply a Database or a CICDPipeline or an ObjectStorage resource in their namespace. Three fields, maybe four. They don't see XRDs, Compositions, Managed Resources, or Functions.
The platform team lives in the bottom layer. They define the XRD, the schema that describes what developers can ask for. They write the Composition, the blueprint that maps every developer-facing field to the actual cloud resources needed to fulfill it. Once that's done, they step back.
Crossplane is the bridge. The moment a developer applies their Database resource, the controller picks it up, looks at the Composition, and starts mapping: engine: postgres becomes spec.forProvider.engine: postgres on the RDS Managed Resource, size: large gets transformed to spec.forProvider.instanceClass: db.r6g.xlarge via a Function, and encryption, subnet, and backup retention get injected from the Composition's defaults. The developer just sees their Database move to Ready: True.
Nobody handed credentials around. Nobody ran a pipeline. Nobody switched workspace contexts. The developer asked for infrastructure in the language of their platform. The platform delivered it in the language of AWS.
GitOps Without the Plumbing
With Terraform, GitOps is something you construct. Pipelines, plan gates, apply stages, approval workflows, notifications: all manually assembled, all maintained by your team. There's Atlantis, Spacelift, env0, Terraform Cloud, each doing CI/CD differently, each with its own opinions, its own pricing, its own quirks. Every platform team is solving the same problem in a slightly different way, none of it portable, all of it requiring maintenance.
With Crossplane, GitOps is the default. Every infrastructure resource is a Kubernetes manifest. Put them in Git, point Argo CD or Flux at the repo, and Crossplane handles continuous reconciliation from there. Your entire infrastructure history is diffable and revertable. Pull requests are your change management. Rollbacks are git revert.
The operational model you were building on top of Terraform is built into Crossplane.
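To make "point Argo CD at the repo" concrete, here is a minimal sketch of an Argo CD Application that syncs a directory of Crossplane manifests. The repository URL, path, and Application name are hypothetical:

```yaml
# Sketch only: Argo CD keeps the cluster in sync with Git,
# and Crossplane keeps the cloud in sync with the cluster.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: platform-infrastructure
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/platform-infra.git  # hypothetical repo
    targetRevision: main
    path: infrastructure                                        # hypothetical path
  destination:
    server: https://kubernetes.default.svc
  syncPolicy:
    automated:
      prune: true      # remove resources that were deleted from Git
      selfHeal: true   # revert manual edits made directly in the cluster
```

With prune enabled, deleting a manifest from Git deletes the XR, and Crossplane in turn deletes the cloud resources behind it, so teams may want that switched off for stateful resources until they trust the workflow.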
The Stack That's Emerging
Teams making this shift are converging on a coherent platform stack: Crossplane for infrastructure provisioning and drift correction, Argo CD for GitOps delivery, Backstage for the developer-facing self-service portal, and Kubernetes as the control plane tying it all together.
It's not a perfect stack. But it's cohesive: every tool has a clear role, and they compose naturally. That's something Terraform + custom pipelines + internal tooling + Slack approvals rarely manages to be.
The Honest Caveats
You need Kubernetes. Crossplane without K8s isn't a thing. If your team isn't already running Kubernetes as a foundation, the overhead of adopting it just for Crossplane is real. Terraform is the right call until you get there.
Provider coverage is still catching up. AWS and GCP providers from Upbound are production-ready. But Terraform's ecosystem has a decade of community investment. If you depend on niche services or SaaS providers, check coverage before committing.
The learning curve is real. XRDs, Compositions, and Functions are a new mental model. Your first few Compositions will feel verbose. It clicks eventually, but it takes longer than picking up Terraform modules.
Your control plane cluster lives in a cloud too. Crossplane doesn't float in a neutral zone; your cluster runs on EKS, GKE, AKS, or somewhere. AWS-native authentication via IRSA (IAM Roles for Service Accounts) is frictionless on EKS. But authenticating to Azure or GCP from an EKS cluster still requires Workload Identity Federation or static credentials managed through the External Secrets Operator (ESO). Cross-cloud credential setup is not zero-effort.
Debugging is different. Terraform errors are immediate and Googleable. Crossplane issues surface as Kubernetes events and controller logs. The observability tooling is improving, but it's not as beginner-friendly yet.
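The cross-cloud credential caveat shows up directly in the ProviderConfigs. A hedged sketch of the asymmetry on an EKS-hosted control plane: AWS can use the pod's IAM role, while GCP here falls back to a key stored in a Secret. The project ID, Secret name, and key are hypothetical, and the IAM role and service account annotations that IRSA needs are not shown:

```yaml
# Sketch only: same cluster, two very different authentication stories.
apiVersion: aws.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: aws-prod
spec:
  credentials:
    source: IRSA            # uses the IAM role bound to the provider's service account
---
apiVersion: gcp.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: gcp-prod
spec:
  projectID: example-project   # hypothetical GCP project
  credentials:
    source: Secret
    secretRef:
      namespace: crossplane-system
      name: gcp-secret         # hypothetical Secret, e.g. synced by ESO
      key: creds
```

The second object is the one that needs a rotation story, which is the effort the caveat is pointing at.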
The Bottom Line
Terraform is the right tool for a team managing its own infrastructure. It's battle-tested, broadly supported, and approachable for any engineer.
Crossplane is the right tool for a platform team building infrastructure APIs for other teams, with self-healing resources, native GitOps, Kubernetes-native RBAC, and an abstraction model (XRDs + Compositions + Functions) that creates real API boundaries between platform and application teams.
The shift happening across the industry isn't about Terraform being bad. It's about the job changing. Platform teams aren't just provisioning infrastructure anymore; they're building products. And Crossplane was designed for exactly that job.
Start with the Crossplane docs. Spend time on Compositions and Functions; that's where the model really clicks.
And if you want a hands-on walkthrough of building a real IDP on top of Crossplane (from XRDs to a working developer portal), let me know in the comments. I'm planning that as the next post.