varun varde

Posted on May 25

Building an Internal Developer Portal with Backstage: A Production Deployment Guide

#devops #kubernetes #programming #python

Internal Developer Portals became inevitable the moment engineering organisations crossed a certain complexity threshold.

At 20 engineers, tribal knowledge still works.
At 80 engineers, documentation begins fracturing.
At 200 engineers, platform entropy becomes existential.

Teams stop knowing:

Which services exist
Who owns them
How deployments work
Where documentation lives
Which Kubernetes clusters matter
Which CI/CD templates are approved
Which APIs are deprecated
Which observability dashboards to trust

The result is operational drag masquerading as engineering complexity.

This is precisely why Backstage became the dominant Internal Developer Portal (IDP) platform. It unified service cataloguing, documentation, Golden Path workflows, Kubernetes visibility, and developer self-service into a single extensible platform.

But most Backstage tutorials stop at:

npx @backstage/create-app

Production deployments are where the real engineering begins.

This guide covers the practical architecture, operational tradeoffs, adoption strategies, and deployment patterns required to run Backstage successfully in medium-to-large engineering organisations.

It draws on production implementations at organisations ranging from 100 to 800 engineers.

Why Backstage Won the IDP Category and What It Doesn't Do

Backstage succeeded because it solved the fragmentation problem.

Before Internal Developer Portals, engineering ecosystems looked like this

CI/CD → Jenkins
Docs → Confluence
Kubernetes → kubectl + dashboards
Ownership → spreadsheets
APIs → wiki pages
Templates → tribal knowledge
Monitoring → scattered Grafana links

Developers spent more time navigating tooling than shipping software.

Backstage unified discovery.

What Backstage Does Exceptionally Well

Backstage excels at:

Software cataloguing
Golden Path standardisation
Developer self-service
Documentation centralisation
Platform discoverability
Plugin extensibility

It becomes the operational interface layer for your platform.

What Backstage Does NOT Do

This distinction matters enormously.

Backstage is NOT:

A CI/CD engine
A Kubernetes platform
A monitoring system
A secrets manager
An infrastructure orchestrator

It orchestrates developer experience across those systems.

Think of it as the engineering control plane UI.

Architecture Decisions: Backstage Deployment Patterns for Production

Most failed Backstage deployments fail architecturally before adoption problems even begin.

Deployment Model 1: Single Container (Good for POCs)

Simple deployment

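apiVersion: apps/v1
kind: Deployment
metadata:
  name: backstage
spec:
  replicas: 1
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: backstage  # label name is illustrative
  template:
    metadata:
      labels:
        app: backstage
    spec:
      containers:
      - name: backstage
        image: backstage:latest
        ports:
        - containerPort: 7007  # port the ingress example targets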

Suitable for:

Small engineering organisations
POCs
Internal experimentation

Not suitable for production scale.

Deployment Model 2: Split Frontend and Backend

Recommended production architecture:

Frontend (React UI)
↓
Backend API
↓
Plugins + Database + External Integrations

Benefits:

Independent scaling
Better caching
Reduced blast radius
Improved deployment flexibility

Recommended Kubernetes Architecture

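apiVersion: apps/v1
kind: Deployment
metadata:
  name: backstage-backend
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels
  selector:
    matchLabels:
      app: backstage-backend  # label name is illustrative
  template:
    metadata:
      labels:
        app: backstage-backend
    spec:
      containers:
      - name: backend
        # Pin a version tag rather than using latest
        image: your-org/backstage-backend:v1.0.0
        ports:
        - containerPort: 7007
        env:
        - name: POSTGRES_HOST
          value: postgres.platform.svc.cluster.local
        # Secrets come from a Kubernetes Secret, never from the manifest
        - name: AUTH_GITHUB_CLIENT_ID
          valueFrom:
            secretKeyRef:
              name: backstage-secrets
              key: github-client-id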

Database Choice: PostgreSQL Only

Avoid SQLite from the start. It suits local development, not a backend running several replicas against shared state.

Production Backstage requires:

Concurrent plugin access
Reliable catalog indexing
Transaction consistency
Search scalability

Recommended:

PostgreSQL
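As a rough sketch, pointing the backend at PostgreSQL in app-config.production.yaml could look like the following. POSTGRES_HOST matches the environment variable in the Deployment above; POSTGRES_PORT, POSTGRES_USER and POSTGRES_PASSWORD are assumed names that you would inject the same way, ideally from a Kubernetes Secret.

```yaml
backend:
  database:
    # 'pg' selects the PostgreSQL driver
    client: pg
    connection:
      host: ${POSTGRES_HOST}
      # The three variables below are assumptions, not part of the manifests above
      port: ${POSTGRES_PORT}
      user: ${POSTGRES_USER}
      password: ${POSTGRES_PASSWORD}
```

Backstage substitutes the ${...} placeholders from the process environment at startup, so no credentials need to live in the config file itself.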

Ingress and Authentication

Recommended auth providers:

GitHub OAuth
Okta
Google Workspace
Azure AD

Avoid anonymous access.
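For GitHub OAuth, a minimal provider block might look like this. It assumes the client ID from the Deployment above plus an AUTH_GITHUB_CLIENT_SECRET variable (an assumed name, supplied from the same Secret), and that GitHub usernames match the names of User entities in your catalog; pick a different sign-in resolver if they do not.

```yaml
auth:
  environment: production
  providers:
    github:
      production:
        clientId: ${AUTH_GITHUB_CLIENT_ID}
        # Assumed variable name; not defined in the manifests above
        clientSecret: ${AUTH_GITHUB_CLIENT_SECRET}
        signIn:
          resolvers:
            # Maps the GitHub username to a User entity with the same name
            - resolver: usernameMatchingUserEntityName
```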

Example ingress

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: backstage
spec:
  ingressClassName: nginx
  rules:
  - host: backstage.internal.company.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: backstage
            port:
              number: 7007
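The ingress above routes to a Service named backstage on port 7007, which the manifests in this guide do not define. A minimal sketch of that Service follows; the selector label app: backstage-backend is an assumption and must match whatever labels your backend pods actually carry.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: backstage
spec:
  # Assumed label; align it with the labels on your backend pod template
  selector:
    app: backstage-backend
  ports:
  - name: http
    port: 7007
    targetPort: 7007
```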

The Plugin Selection Framework: Core vs Custom vs Community

Backstage plugin sprawl becomes dangerous quickly.

One client installed 47 plugins in six months.

Nobody maintained them.

Half broke after upgrades.

The Three Plugin Categories

1. Core Plugins

These are essential.

Recommended:

Catalog
TechDocs
Scaffolder
Kubernetes
Search

These create the foundation.

2. Community Plugins

Useful but operationally risky.

Examples:

Jira
ArgoCD
PagerDuty
SonarQube

Rule:

Only install plugins with active maintainers.

3. Custom Plugins

Necessary eventually.

Examples:

Internal deployment workflows
Compliance dashboards
Internal APIs
Platform-specific automation

Plugin Evaluation Checklist

Before installing any plugin

Question	Why It Matters
Is it actively maintained?	Prevent abandonment
Does it reduce cognitive load?	Avoid UI clutter
Does it duplicate existing workflows?	Prevent fragmentation
Is ownership assigned?	Avoid orphaned integrations

Software Catalog: Getting 100% Entity Coverage Without a Mandate

The catalog becomes useless if incomplete.

But forcing teams to manually register services never scales.

The Metadata Problem

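Most teams will not voluntarily maintain a descriptor like this:

apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-api
spec:
  type: service
  lifecycle: production
  owner: payments-team  # hypothetical team name

unless the value is immediate.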

The Successful Pattern

Auto-discovery first. Manual enrichment second.

GitHub Discovery Integration

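Example:

catalog:
  providers:
    github:
      yourOrg:
        organization: your-org
        catalogPath: /catalog-info.yaml
        # How often to rescan; the values are illustrative
        schedule:
          frequency: { minutes: 30 }
          timeout: { minutes: 3 }

With the GitHub catalog provider installed in the backend, this scans every repository in the organisation for a catalog-info.yaml file and registers what it finds.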

Incentivise Coverage Through Utility

Engineers maintain metadata when it unlocks:

Deployment automation
Kubernetes visibility
Ownership clarity
Documentation indexing
Golden Path templates

Not because leadership mandates compliance.

TechDocs: Making Documentation a First-Class Engineering Practice

Documentation systems fail because writing docs feels disconnected from engineering workflows.

TechDocs fixes this by treating documentation like code.

Recommended TechDocs Architecture

Markdown in Git
↓
CI/CD build
↓
Static site generation
↓
Indexed inside Backstage
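On the repository side, the first step of that flow usually comes down to two small files. The sketch below shows both as one multi-document YAML listing; payments-api and payments-team are placeholder names, and the docs are assumed to live in a docs/ folder next to mkdocs.yml.

```yaml
# File 1: mkdocs.yml at the repository root
site_name: payments-api
nav:
  - Home: index.md
plugins:
  # Bundles the MkDocs plugins and theme that TechDocs expects
  - techdocs-core
---
# File 2: catalog-info.yaml, where the annotation tells TechDocs
# that the docs sit in the same directory as this file
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-api
  annotations:
    backstage.io/techdocs-ref: dir:.
spec:
  type: service
  lifecycle: production
  owner: payments-team
```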

Example TechDocs Configuration

techdocs:
  builder: 'external'
  publisher:
    type: 'awsS3'
    awsS3:
      bucketName: backstage-techdocs
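With builder set to external, Backstage only reads from the bucket, so something else has to build and publish the site. One way to do that is a CI job along these lines. GitHub Actions, the secret names, the region and the payments-api entity are all assumptions for illustration; the same two techdocs-cli commands work from any CI system.

```yaml
name: Publish TechDocs
on:
  push:
    branches: [main]
    paths: ['docs/**', 'mkdocs.yml']
jobs:
  publish-techdocs:
    runs-on: ubuntu-latest
    env:
      # Assumed secret names and region
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      AWS_REGION: eu-west-1
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install MkDocs with the TechDocs core plugin
        run: pip install mkdocs-techdocs-core
      - name: Generate the static site into ./site
        run: npx @techdocs/cli generate --no-docker --verbose
      - name: Publish to the bucket Backstage reads from
        run: >
          npx @techdocs/cli publish
          --publisher-type awsS3
          --storage-name backstage-techdocs
          --entity default/Component/payments-api
```

The --entity value follows the namespace/kind/name pattern and must match the catalog entity, otherwise Backstage will look for the docs under a different path in the bucket.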

Why Docs-as-Code Works

Advantages:

PR reviews apply to documentation
Versioning becomes automatic
Ownership becomes explicit
Drift decreases dramatically

The Documentation Coverage Problem

Most organisations have

Critical systems
+
Zero operational documentation

Backstage exposes these gaps visibly.

Which is operationally valuable.

Scaffolder Templates: Building Your Golden Path Self-Service Workflows

This is where Backstage becomes transformational.

The Scaffolder creates operational consistency at scale.

Golden Path Philosophy

Developers should not repeatedly solve:

CI/CD setup
Observability wiring
Terraform structure
Security defaults
Kubernetes manifests

The platform should solve these once.

Example Production-Ready Template

apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: golden-path-service
spec:
  owner: platform-team
  type: service
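The snippet above is only the header of a template. A fuller sketch, using the built-in fetch:template, publish:github and catalog:register actions, might look like the following. The ./skeleton directory, the parameter names and the github.com host are assumptions; a real Golden Path skeleton would also carry the CI/CD, Terraform and observability files.

```yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: golden-path-service
  title: Golden Path Service
spec:
  owner: platform-team
  type: service
  # Form shown to the developer
  parameters:
    - title: Service details
      required: [name, owner, repoUrl]
      properties:
        name:
          type: string
          title: Service name
        owner:
          type: string
          title: Owning team
          ui:field: OwnerPicker
        repoUrl:
          type: string
          title: Repository location
          ui:field: RepoUrlPicker
          ui:options:
            allowedHosts: [github.com]
  steps:
    # Copy the skeleton and fill in the template variables
    - id: fetch
      name: Fetch skeleton
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
    # Create the repository and push the generated code
    - id: publish
      name: Publish to GitHub
      action: publish:github
      input:
        repoUrl: ${{ parameters.repoUrl }}
        description: Service ${{ parameters.name }}
    # Register the new service so it appears in the catalog immediately
    - id: register
      name: Register in catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
  output:
    links:
      - title: Open in catalog
        entityRef: ${{ steps.register.output.entityRef }}
```

Registering the entity as the last step means every service created this way is in the catalog from day one, which supports the auto-discovery-first approach described earlier.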

What the Best Templates Include

Every generated service should automatically include:

CI/CD pipeline
Terraform module
Kubernetes manifests
Observability integration
Security scanning
Logging standards
SLO defaults

The Real Goal

Reduce

Decision fatigue

Not flexibility.

Kubernetes Plugin: Real-Time Service Health in the Developer Portal

The Kubernetes plugin dramatically increases operational discoverability.

What Developers Actually Need

Not raw Kubernetes complexity.

They need:

Deployment status
Restart visibility
Pod health
Namespace ownership
Service mapping

Kubernetes Plugin Configuration

kubernetes:
  serviceLocatorMethod:
    type: 'multiTenant'
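On its own, the service locator does not tell Backstage which clusters to query. A more complete sketch adds a cluster locator and an entity annotation, shown here as one multi-document listing. The cluster name, the K8S_CLUSTER_URL and K8S_SERVICE_ACCOUNT_TOKEN variables and the payments-api entity are assumptions.

```yaml
# Part 1: app-config.yaml
kubernetes:
  serviceLocatorMethod:
    type: 'multiTenant'
  clusterLocatorMethods:
    - type: 'config'
      clusters:
        - name: production
          url: ${K8S_CLUSTER_URL}
          authProvider: 'serviceAccount'
          # Token for a read-only service account
          serviceAccountToken: ${K8S_SERVICE_ACCOUNT_TOKEN}
          skipTLSVerify: false
---
# Part 2: catalog-info.yaml for the service
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-api
  annotations:
    # Backstage shows workloads labelled backstage.io/kubernetes-id=payments-api
    backstage.io/kubernetes-id: payments-api
spec:
  type: service
  lifecycle: production
  owner: payments-team
```

The Deployments, Pods and Services for the component need the matching backstage.io/kubernetes-id label, which is a good candidate for a Golden Path template default.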

Recommended Features

Expose:

Pod health
Replica status
Rollout history
Resource consumption
Deployment age

Avoid exposing excessive cluster internals.

The Biggest UX Mistake

Turning Backstage into a thin wrapper around kubectl.

Developers want abstraction.

Not Kubernetes archaeology.

Search: Making Platform Knowledge Discoverable

Search quality determines portal usefulness more than most teams realise.

Poor search destroys trust quickly.

What Should Be Searchable

Search should index:

Services
Documentation
APIs
Runbooks
Ownership
Terraform modules
CI/CD templates

Elasticsearch Integration

Recommended at scale:

Elasticsearch

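Example (the engine itself is enabled by installing the Elasticsearch search backend module; the config supplies connection details):

search:
  elasticsearch:
    # Placeholder variables; supply real values from the environment
    node: ${ELASTICSEARCH_URL}
    auth:
      apiKey: ${ELASTICSEARCH_API_KEY}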

Search Quality Rules

Good search requires:

Consistent metadata
Strong ownership tagging
Naming conventions
Documentation hygiene

Search quality reflects platform maturity.

Developer Adoption: The 90-Day Rollout Plan That Works

Most Backstage failures are adoption failures.

Not technical failures.

Phase 1 — Seed Critical Value (Days 1–30)

Launch with:

Service catalog
Ownership visibility
Kubernetes status
TechDocs

Avoid feature overload.

Phase 2 — Introduce Self-Service (Days 30–60)

Add:

Scaffolder templates
Deployment workflows
Golden Path automation

This creates habitual usage.

Phase 3 — Expand Platform Integrations (Days 60–90)

Integrate:

Incident systems
Monitoring
Cost visibility
Security tooling

Now Backstage becomes operationally indispensable.

The Biggest Adoption Mistake

Treating Backstage as

A documentation portal

instead of

A workflow accelerator

Measuring Backstage Success: The Metrics That Matter

Avoid vanity metrics like

Daily active users

Measure operational outcomes instead.

Key Backstage Metrics

Time to First Production Deployment

Target

< 1 day

Self-Service Rate

Measure

Infrastructure requests completed
without platform tickets

Target

> 80%

Golden Path Adoption

Target

> 90% of new services

Documentation Coverage

Measure

Catalog entities with TechDocs

Platform NPS

Critical indicator of developer trust.

Operating Backstage as a Product

This is the single most important principle.

Backstage is not an internal tool.

It is an internal product.

Product Thinking Changes Everything

Platform teams must manage:

Roadmaps
User feedback
Feature prioritisation
UX quality
Adoption metrics
Reliability

Exactly like customer-facing products.

Establish Platform Ownership

Recommended structure

Responsibility	Owner
Infrastructure	Platform engineering
Plugin lifecycle	Plugin owners
Documentation standards	Developer enablement
UX and adoption	Platform product owner

Create a Feedback Loop

Run:

Quarterly DX surveys
Office hours
Team interviews
Usage analytics reviews

Without feedback loops, Backstage decays rapidly.

Upgrade Strategy

Backstage evolves quickly.

Recommended:

Monthly dependency reviews
Quarterly platform upgrades
Dedicated staging environment
Plugin compatibility testing

Never allow upgrades to drift indefinitely.

Common Failure Modes

Failure Mode 1 — Trying to Solve Everything
Start small.

Expand gradually.

Failure Mode 2 — Weak Ownership

No ownership guarantees entropy.

Failure Mode 3 — No Golden Path

A portal without workflows becomes passive documentation.

Failure Mode 4 — Ignoring Developer Experience

Engineers abandon tools that increase friction.

Immediately.

The most successful Backstage deployments do not succeed because of plugin count or UI polish.

They succeed because they reduce cognitive load.

They make:

Ownership obvious
Documentation discoverable
Infrastructure self-service
Operational workflows consistent

Most importantly, they create a unified developer experience layer across increasingly fragmented engineering ecosystems.

That is why Backstage became the Internal Developer Portal standard.

Not because it centralised tools.

Because it simplified engineering flow.