Stop the 'ticket-ops' madness! This guide, originally published on devopsstart.com, shows you how to combine Backstage and Crossplane to build a true self-service Internal Developer Platform.
Introduction
Stop forcing your developers to learn the intricacies of cloud provider consoles or struggle with 500-line Terraform modules just to get a database. The gap between raw infrastructure and developer productivity is where "ticket ops" thrives, slowing down deployment cycles and frustrating engineers. To solve this, you need an Internal Developer Platform (IDP) that abstracts infrastructure complexity into a self-service experience.
An IDP allows developers to provision resources via a simplified interface without needing to be cloud experts. In this guide, you will learn how to build a production-ready IDP by combining Backstage and Crossplane. Backstage acts as your front-end portal, providing a unified interface for service discovery and software templates. Crossplane serves as the back-end control plane, turning Kubernetes into a universal API for managing cloud resources.
By the end of this article, you will understand the architecture required to move from manual Infrastructure as Code (IaC) to a scalable Infrastructure as a Service (IaaS) model. You'll see exactly how to map a button click in a UI to a live AWS RDS instance via GitOps, reducing the cognitive load on your developers while maintaining strict governance for your platform team. For more on managing the underlying clusters, you can check out Kubernetes for Beginners: Deploy Your First Application.
The Architecture: Connecting Backstage to Crossplane
Building an IDP isn't about one tool; it's about the pipeline. The most common mistake is trying to connect Backstage directly to a cloud API. That is a security nightmare and lacks auditability. Instead, use a GitOps-driven control plane architecture. In this flow, Backstage doesn't "create" the infrastructure; it "requests" it by committing a manifest to Git.
The sequence works as follows: a developer selects a "Provision Postgres" template in the Backstage Scaffolder. Backstage then triggers a commit of a simple YAML file to a Git repository. An automated GitOps controller, such as ArgoCD, detects this change and syncs the manifest to a Kubernetes cluster. Inside that cluster, Crossplane v1.14.x sees the new Custom Resource (CR) and communicates with the cloud provider's API to provision the actual resource.
This ensures that your Git history is the single source of truth, which is critical for compliance and disaster recovery. To ensure these deployments are handled reliably, you should learn How to Set Up Argo CD GitOps for Kubernetes Automation.
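As a rough sketch of the sync step described above, an Argo CD Application pointing at the repository that Backstage commits to might look like the following. The repository URL, application name, and namespaces are assumptions made for illustration, not values from a real setup.

```yaml
# Hypothetical Argo CD Application that syncs committed claims into the cluster.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: order-service-infra          # hypothetical name
  namespace: argocd                  # assumes Argo CD runs in the argocd namespace
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/order-service-db-infra.git   # hypothetical repo
    targetRevision: main
    path: .
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD itself runs in
    namespace: order-service-prod
  syncPolicy:
    automated:
      selfHeal: true   # revert manual changes made directly in the cluster
      prune: false     # do not delete claims just because a file left Git
```

Leaving prune off is a conservative starting point for production; dev environments may prefer prune: true so that removing a manifest also removes the resource.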
The "connective tissue" here is the YAML schema. Backstage must output a manifest that exactly matches the CompositeResourceDefinition (XRD) you've defined in Crossplane. If the Scaffolder outputs storage: 20 but the XRD expects storageGb: 20, the claim will either be rejected when it is applied or lose the unrecognized field, and the database will never become Ready. You must treat your XRDs as the API contract between your platform team and your developers.
Abstracting Cloud Complexity with Crossplane Compositions
If you give developers raw Crossplane resources, you've just traded Terraform for Kubernetes YAML, which does not reduce cognitive load. The real power of Crossplane lies in Compositions. A Composition allows you to bundle multiple low-level resources (like a VPC, a Subnet, and an RDS instance) into a single, high-level "Composite Resource" (XR) that developers can actually understand.
For example, instead of requiring a developer to fill in an rds.aws.upbound.io/v1beta1 Instance with 20 mandatory fields, you create a CompositeDatabase definition. The developer only provides a name and a size. Your platform team defines the "blueprint" that maps size: small to a db.t3.micro instance with 20GB of encrypted gp3 storage.
Here is an example of a simplified CompositeResourceDefinition (XRD) that defines the API your developers will use:
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  # The name must be <plural>.<group>
  name: xpostgresdatabases.platform.example.org
spec:
  group: platform.example.org
  names:
    kind: XPostgresDatabase
    plural: xpostgresdatabases
  # claimNames exposes the namespaced PostgresDatabase claim that developers create
  claimNames:
    kind: PostgresDatabase
    plural: postgresdatabases
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                storageGb:
                  type: integer
                region:
                  type: string
And here is how the developer's request (the "Claim") looks. This is the exact YAML that Backstage will generate:
apiVersion: platform.example.org/v1alpha1
kind: PostgresDatabase
metadata:
  name: order-service-db
  namespace: order-service-prod
spec:
  storageGb: 20
  region: us-east-1
By using this approach, you eliminate the need for developers to know AWS-specific jargon. You can change the underlying instance type or backup policy in the Composition without ever touching the developer's manifest.
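To make the "blueprint" idea concrete, here is a minimal sketch of a Composition for the XRD above. It assumes the Upbound AWS RDS provider and its Instance resource; the username, secret name, and fixed instance class are hypothetical choices, and a real Composition would also need networking (subnet group, security groups) that is left out here.

```yaml
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: xpostgresdatabases.aws.platform.example.org   # hypothetical name
spec:
  compositeTypeRef:
    apiVersion: platform.example.org/v1alpha1
    kind: XPostgresDatabase
  resources:
    - name: rds-instance
      base:
        apiVersion: rds.aws.upbound.io/v1beta1
        kind: Instance
        spec:
          forProvider:
            # Platform-owned defaults the developer never sees
            engine: postgres
            instanceClass: db.t3.micro
            storageType: gp3
            storageEncrypted: true
            publiclyAccessible: false
            username: appadmin                 # hypothetical
            autoGeneratePassword: true
            passwordSecretRef:
              name: order-service-db-password  # hypothetical; static for brevity
              namespace: crossplane-system
              key: password
      patches:
        # Copy the two developer-facing fields onto the managed resource
        - type: FromCompositeFieldPath
          fromFieldPath: spec.storageGb
          toFieldPath: spec.forProvider.allocatedStorage
        - type: FromCompositeFieldPath
          fromFieldPath: spec.region
          toFieldPath: spec.forProvider.region
```

Because the instance class and encryption settings live only in the base, the platform team can change them here without any developer manifest changing.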
Implementing the Backstage Scaffolder for Self-Service
The Backstage Scaffolder is the engine that turns a user's form input into a Git commit. To make this work with Crossplane, you create a template.yaml file. This template defines the UI form (the questions you ask the developer) and the "steps" required to process the answer.
In a production setup, your template should not just create a file; it should validate the input. For example, if a developer requests 10,000GB of storage, your template or a validating admission webhook in Kubernetes should catch it. The template uses "Nunjucks" templating to inject the form values into the Crossplane Claim YAML.
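One lightweight way to catch an oversized request at the form level is to put JSON Schema bounds on the parameter itself. The fragment below is a sketch; the limits of 20 and 500 are made-up values, and it complements rather than replaces validation inside the cluster.

```yaml
# Fragment of spec.parameters[].properties in template.yaml; bounds are hypothetical.
storageGb:
  type: integer
  title: Storage Size (GB)
  default: 20
  minimum: 20    # reject requests below the smallest supported size
  maximum: 500   # a 10,000GB request fails in the form, before any commit
```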
Below is a snippet of a Backstage software template designed to provision a Crossplane database:
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: provision-rds-postgres
  title: Provision RDS Postgres
  description: Creates a production-ready Postgres DB via Crossplane
spec:
  owner: platform-team   # placeholder owner
  type: resource         # placeholder template type
  parameters:
    - title: Database Details
      required:
        - dbName
        - environment
      properties:
        dbName:
          type: string
          title: Database Name
        storageGb:
          type: integer
          title: Storage Size (GB)
          default: 20
        environment:
          type: string
          title: Environment
          enum: [dev, staging, prod]
  steps:
    - id: fetch-base
      action: fetch:template
      input:
        # Location of the skeleton files, relative to this template.yaml
        url: ./templates/infrastructure/rds
        values:
          name: ${{ parameters.dbName }}
          storage: ${{ parameters.storageGb }}
          env: ${{ parameters.environment }}
    - id: publish
      action: publish:github
      input:
        allowedHosts: ['github.com']
        repoUrl: github.com?owner=my-org&repo=${{ parameters.dbName }}-infra
When the developer clicks "Create," Backstage renders the template and the publish:github action pushes the resulting YAML to a new repository. To add the claim to an existing repository instead, use an action such as publish:github:pull-request. The critical part is the fetch:template step. It takes the generic claim.yaml from your template repository and fills it with the user's specific requirements. This largely removes the possibility of syntax errors in the YAML, as the developer never actually writes the code.
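For illustration, the generic claim.yaml that fetch:template fills in could look like the sketch below. The file layout and the namespace-per-environment convention are assumptions; the variable names match the values passed in the template's fetch step.

```yaml
# claim.yaml in the skeleton directory read by fetch:template (hypothetical layout)
apiVersion: platform.example.org/v1alpha1
kind: PostgresDatabase
metadata:
  name: ${{ values.name }}
  # Hypothetical convention: one namespace per database and environment
  namespace: ${{ values.name }}-${{ values.env }}
spec:
  storageGb: ${{ values.storage }}
  region: us-east-1   # fixed here because the form does not ask for a region
```

Keeping the region out of the form is a deliberate simplification in this sketch; adding it as an enum parameter would follow the same pattern as environment.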
The GitOps Feedback Loop and Production Gotchas
A major pain point in IDPs is the "black hole" effect: a developer clicks a button in Backstage, the commit happens, and then nothing. They have no idea if the database is actually ready or if the Crossplane provider is stuck in a back-off loop. To solve this, you must implement a feedback loop.
One effective method is using the Backstage Kubernetes plugin combined with the Crossplane status fields. Crossplane updates the status section of the Claim resource once the cloud provider confirms the resource is Ready: True. You can configure Backstage to surface these Kubernetes resource statuses directly on the service's catalog page. If a resource is failing, the developer sees a "Warning" status in the portal, which links them to the logs.
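A minimal sketch of that wiring, assuming the standard Backstage Kubernetes plugin, has two parts: telling the plugin to fetch the claim's custom resource type, and annotating the catalog entity so resources can be matched to it. The component name and owner are hypothetical, and cluster connection settings are omitted.

```yaml
# app-config.yaml (fragment): have the Kubernetes plugin fetch PostgresDatabase claims
kubernetes:
  customResources:
    - group: platform.example.org
      apiVersion: v1alpha1
      plural: postgresdatabases
---
# catalog-info.yaml for the owning service (hypothetical component)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: order-service
  annotations:
    backstage.io/kubernetes-id: order-service
spec:
  type: service
  lifecycle: production
  owner: team-a
```

For the claim to appear on the catalog page, it would also need the label backstage.io/kubernetes-id: order-service in its metadata, which the Scaffolder skeleton can add.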
In clusters with >100 nodes, you'll notice that Crossplane's reconciliation loop can put significant pressure on the Kubernetes API server. I've seen cases where frequent status updates on 500+ cloud resources caused API latency. To mitigate this, tune the poll interval of your Crossplane providers, which is typically a controller flag (for example, --poll) rather than a field on each resource. Don't check every 60 seconds if a database is ready; 5 or 10 minutes is usually sufficient for infrastructure that takes 15 minutes to provision.
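As a sketch of how the poll interval might be set on Crossplane v1.14, a DeploymentRuntimeConfig can pass extra arguments to the provider's controller. This assumes the provider accepts a --poll flag, as the Upbound AWS providers do; check the flags of the provider you run. The package tag is only illustrative.

```yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: slow-poll            # hypothetical name
spec:
  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          containers:
            - name: package-runtime   # the provider's controller container
              args:
                - --poll=10m          # assumed flag; verify for your provider
---
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws-rds
spec:
  package: xpkg.upbound.io/upbound/provider-aws-rds:v0.47.0   # illustrative tag, pin your own
  runtimeConfigRef:
    name: slow-poll
```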
Another production gotcha is accidental deletion. If a developer deletes the manifest from Git, Argo CD (with pruning enabled) deletes the Claim from Kubernetes, and Crossplane deletes the RDS instance. This is great for dev environments but catastrophic for production. You must set a deletion policy on the managed resources in your Compositions: use deletionPolicy: Orphan for production workloads. This ensures that if the YAML is accidentally deleted, the actual cloud resource remains intact, although Crossplane no longer manages it.
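In a Composition, the policy sits on the managed resource inside the base, not on the claim. The fragment below is a sketch that assumes the Upbound RDS Instance resource; the resource name is hypothetical.

```yaml
# Fragment of spec.resources in a production Composition
- name: rds-instance
  base:
    apiVersion: rds.aws.upbound.io/v1beta1
    kind: Instance
    spec:
      # Keep the RDS instance in AWS even if the managed resource is deleted
      deletionPolicy: Orphan
      forProvider:
        engine: postgres
```

A separate Composition for dev environments can keep the default behavior, so that deleting the manifest still cleans up the database.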
Best Practices for Platform Engineering
Implementing an IDP is more of an organizational challenge than a technical one. If you build a perfect platform that no one uses, you've failed. Follow these principles to ensure adoption:
- Start with the "Golden Path": Do not try to automate every possible cloud resource on day one. Identify the three most requested resources (for example, S3 buckets, Postgres DBs, and Redis caches) and build high-quality templates for those. This provides immediate value and builds trust.
- Enforce Governance via Compositions: Use Crossplane Compositions to bake in security. Ensure every S3 bucket is encrypted and every RDS instance is in a private subnet by default. The developer shouldn't even see the "Encryption" checkbox; it should be mandatory and invisible.
- Treat your IDP as a Product: Your developers are your customers. Conduct user interviews to find where the friction is. If they find the Backstage form too long, simplify it. If they need more visibility into costs, integrate a cost-tracking plugin.
- Implement Strong RBAC: Use Kubernetes namespaces to isolate claims. Ensure that a developer in the team-a namespace cannot modify a PostgresDatabase claim in the team-b namespace. Use a tool like Kyverno to enforce these boundaries.
- Version your Compositions: When you update a Composition (for example, upgrading the RDS instance class), don't just push it to production. Version your XRDs and Compositions so you can migrate services gradually rather than forcing a global update.
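The namespace isolation from the RBAC point above can be sketched with plain Kubernetes RBAC: a Role scoped to team-a that covers only PostgresDatabase claims, bound to that team's group. The group name team-a-developers is a hypothetical identity from your SSO provider.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: postgres-claims-editor
  namespace: team-a
rules:
  - apiGroups: ["platform.example.org"]
    resources: ["postgresdatabases"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-postgres-claims
  namespace: team-a
subjects:
  - kind: Group
    name: team-a-developers            # hypothetical group name
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: postgres-claims-editor
  apiGroup: rbac.authorization.k8s.io
```

Because there is no matching binding in team-b, members of this group cannot read or change claims there.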
FAQ
How does this approach differ from using Terraform with a CI/CD pipeline?
Traditional Terraform requires a "push" model where a pipeline runs terraform apply. This often leads to state locking issues and configuration drift. The Backstage + Crossplane approach uses a "pull" model (Control Plane). Crossplane constantly monitors the state of the cloud and automatically corrects drift without needing a manual pipeline trigger.
Does this mean I have to migrate all my existing Terraform code to Crossplane?
No. You can run them side-by-side. Use Crossplane for new, self-service workloads while keeping your core networking and foundation (VPCs, IAM roles) in Terraform. You can even use the Terraform provider for Crossplane to manage existing Terraform modules through the Kubernetes API.
What happens if the cloud provider API is down during provisioning?
Crossplane employs an exponential back-off strategy. If the AWS API returns a 500 error, Crossplane will keep retrying the request. The Kubernetes resource will stay in a Synced: False state. Because you have a GitOps audit trail, you can easily see which resources are stuck.
Is Backstage overkill for small teams?
If you have fewer than five developers, a simple README and a set of shared Terraform modules might suffice. However, once you hit a scale where the platform team becomes a bottleneck for "simple" requests, the investment in Backstage pays off by eliminating the ticket queue.
Conclusion
Combining Backstage and Crossplane allows you to move from a culture of "ticket-based infrastructure" to true self-service. By using Backstage as the user interface and Crossplane as the control plane, you create a system where developers can provision production-ready resources in minutes, not days. This doesn't just speed up delivery; it allows your platform engineers to stop performing repetitive manual tasks and start focusing on high-value architectural improvements.
To get started, your first actionable step is to install Crossplane v1.14.x on a development cluster and create your first CompositeResourceDefinition for a simple resource, like an S3 bucket. Once the API is working, set up a basic Backstage instance and create a software template that outputs the YAML required by that XRD. Start small, validate the "Golden Path" with one team, and then scale the platform to the rest of your organization.