Sodiq Jimoh

Deploying Backstage on Kubernetes with the Helm Chart: The Infrastructure-First Guide

Who this is for: Engineers deploying Backstage on Kubernetes via the
official Helm chart who want a working portal, not just a running pod.
This guide starts where most tutorials end — after helm install succeeds
but before anything actually works.

A few weeks ago I published an article called
"Nine Ways Backstage Breaks Before Your Developer Portal Works".
A Backstage maintainer read it and gave me structured feedback. The core of
it was this: several of the failures I documented were caused by not
following the official getting-started documentation before using the Helm
chart, and by using the demo image as if it were a production-ready base.

They were right. This article is the follow-up they suggested — and the one
I should have written first.

It does not repeat the previous article. It starts earlier, goes deeper on
Helm-specific configuration, and correctly attributes failures to their
actual causes rather than blaming Backstage for things that are ArgoCD,
Traefik, or operator error.

Official resources you should read alongside this guide:

Project repo referenced throughout:
github.com/sodiq-code/neuroscale-platform


The one thing you must understand before installing the Helm chart

The Backstage Helm chart uses a demo image by default. The chart README
contains this explicit warning:

The Backstage chart is not an official Backstage project and is not
supported by the Backstage core team. The default image used in this chart
is for demo purposes only.

This single fact explains most of the configuration friction you will
encounter. The demo image does not behave like a real Backstage application
created with npx @backstage/create-app. It has different startup characteristics,
different configuration defaults, and different failure modes.

What this means practically:

If you are building a real developer portal — not just running a demo — you
should follow the official getting started guide
to create your own Backstage application first, build a custom Docker image
from it, and then use the Helm chart to deploy that image. The chart's
image.repository and image.tag values are where you point to your
own image.
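As a sketch, using the nesting this guide describes, pointing the chart at your own image looks like the following. The registry, repository, and tag here are placeholders for your own values:

```yaml
backstage:
  backstage:
    image:
      registry: ghcr.io                        # wherever you push your image
      repository: your-org/your-backstage-app  # your custom-built image
      tag: "1.0.0"                             # pin a version; avoid latest
```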

If you are experimenting, learning, or building an integration platform
where Backstage is one component (as in the NeuroScale project), the demo
image path is workable — but you need to understand its limitations and
configure it correctly.

This guide covers the Helm chart path specifically, with the official docs
as the reference point throughout.


The values hierarchy that breaks everything silently

This is the most important configuration concept in the entire Helm chart.
Get this wrong and every override you write will be silently ignored.

The Backstage Helm chart is a wrapper chart — Backstage itself is a
dependency inside it. The dependency is named backstage. This means
configuration for the Backstage application container must be nested under
backstage.backstage.*, not backstage.*.

Wrong — values are silently ignored:

# This looks correct but is placed at the wrong hierarchy level
backstage:
  appConfig:
    app:
      title: My Platform
  startupProbe:
    initialDelaySeconds: 120
  resources:
    requests:
      cpu: 100m

Correct — values reach the Backstage container:

backstage:
  backstage:           # <-- this second level is required
    appConfig:
      app:
        title: My Platform
    startupProbe:
      initialDelaySeconds: 120
    resources:
      requests:
        cpu: 100m

The Helm chart processes the outer backstage key as the dependency name.
Values placed directly under backstage.* are interpreted as chart-level
configuration, not as container configuration. Kubernetes then uses chart
defaults — including probe timings — rather than your overrides.

How to verify your values are actually applied:

Render the Helm chart before applying it and inspect the output Deployment
spec directly:

helm template neuroscale-backstage backstage/backstage \
  -f infrastructure/backstage/values.yaml \
  --namespace backstage \
  | grep -A 30 "startupProbe"

If you see initialDelaySeconds: 120 in the output, your probe override
reached the container. If you see initialDelaySeconds: 5 or a very small
number, your values are at the wrong nesting level.

This verification step should be part of your CI pipeline. In the NeuroScale
platform, scripts/ci/render_backstage.sh runs this check on every PR:

#!/bin/bash
# scripts/ci/render_backstage.sh
set -euo pipefail

helm template neuroscale-backstage backstage/backstage \
  -f infrastructure/backstage/values.yaml \
  --namespace backstage \
  | grep "initialDelaySeconds" \
  | grep -q "120" || {
    echo "ERROR: startupProbe initialDelaySeconds not set correctly" >&2
    exit 1
  }
echo "Helm values nesting verified"

Required configuration keys for the demo image

The demo image requires specific configuration keys to be present at
startup. Missing any of them causes the frontend to crash on load with a
JavaScript error that is only visible in browser developer tools — the page
itself shows a blank white screen with no visible error.

The minimum required appConfig block:

backstage:
  backstage:
    appConfig:
      app:
        title: Your Platform Name    # required — crash if absent
        baseUrl: http://localhost:7010
      backend:
        baseUrl: http://localhost:7010
        cors:
          origin: http://localhost:7010
        database:
          client: better-sqlite3
          connection: ':memory:'

Why baseUrl matters:

The app.baseUrl and backend.baseUrl values must match the URL you are
actually using to access Backstage. If you port-forward on port 7010 but
the config says port 7007, the frontend React app loads but all API calls
fail — the UI appears to work while the backend connection is broken.
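One way to catch a mismatch before it bites: render the chart and grep for every baseUrl in the output. This assumes the chart renders your appConfig into the manifests, as in the verification step used elsewhere in this guide; every line printed should show the same host and port you actually use to reach Backstage.

```shell
# Print every baseUrl in the rendered chart output; app.baseUrl,
# backend.baseUrl, and the URL you port-forward to should all agree
helm template backstage backstage/backstage \
  -f infrastructure/backstage/values.yaml \
  --namespace backstage \
  | grep "baseUrl"
```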

Why better-sqlite3 for local deployments:

The demo image ships with SQLite support. For local Kubernetes deployments
where you want zero external dependencies, the in-memory SQLite connection
is sufficient. For production, replace this with a PostgreSQL connection
pointing at a managed database service. The chart includes optional
PostgreSQL deployment — see
the chart's database configuration docs.
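If you do enable the chart's bundled PostgreSQL, a minimal sketch looks like this. The postgresql.auth keys and the backstage-postgresql service hostname are assumptions based on the chart's Bitnami-style PostgreSQL subchart; verify them against the chart's values reference before relying on them:

```yaml
backstage:
  backstage:
    appConfig:
      backend:
        database:
          client: pg
          connection:
            host: backstage-postgresql    # assumed subchart service name
            port: 5432
            user: backstage
            password: ${POSTGRES_PASSWORD}

  postgresql:
    enabled: true
    auth:
      username: backstage
      password: ${POSTGRES_PASSWORD}      # source from a secret, not values
      database: backstage
```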


Probe timings: the demo image starts slowly

Backstage is a Node.js application. The demo image takes approximately 60
to 90 seconds to complete startup on a typical Kubernetes node. Kubernetes
default probe timings assume a 2-second initial delay. The result is
predictable: the startup probe fires before the application is ready, the
pod fails the probe, Kubernetes kills it, and the pod enters
CrashLoopBackOff.

This is not a Backstage bug. It is a configuration requirement that the
Helm chart does not prominently surface. The correct probe settings:

backstage:
  backstage:
    startupProbe:
      httpGet:
        path: /healthcheck
        port: 7007
      initialDelaySeconds: 120    # give Node.js time to start
      periodSeconds: 10
      failureThreshold: 30        # 30 × 10s = 5 minutes maximum wait
    readinessProbe:
      httpGet:
        path: /healthcheck
        port: 7007
      initialDelaySeconds: 120
      periodSeconds: 10
      failureThreshold: 3
    livenessProbe:
      httpGet:
        path: /healthcheck
        port: 7007
      initialDelaySeconds: 300    # only check liveness after 5 minutes
      periodSeconds: 30
      failureThreshold: 3

How to diagnose probe failures:

# Watch pod status in real time
kubectl get pods -n backstage -w

# When you see CrashLoopBackOff, describe the pod
kubectl describe pod -n backstage <pod-name>

# Look for this in Events:
# Warning  Unhealthy  kubelet  Startup probe failed: connection refused

# Check logs from the previous container instance
kubectl logs -n backstage <pod-name> --previous --tail=100

If you see Startup probe failed: connection refused in events but the
previous container logs show normal Node.js startup messages, the
application is starting correctly — the probe is just firing too early.
Increase initialDelaySeconds.
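To see only the probe failures without scrolling through full describe output, a filtered events query works. These are standard kubectl flags; the grep at the end keeps just the probe-related lines:

```shell
# Show only Warning events in the namespace, oldest first, and keep
# the probe-related lines (Startup/Readiness/Liveness probe failures)
kubectl get events -n backstage \
  --field-selector type=Warning \
  --sort-by=.lastTimestamp \
  | grep -i "probe"
```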

A full incident postmortem for this specific failure, including the exact
Kubernetes events and the Helm values diff before and after the fix, is in
infrastructure/INCIDENT_BACKSTAGE_CRASHLOOP_RCA.md.


Authentication: local dev vs production

The Backstage new backend architecture (introduced in version 1.x) includes
an internal authentication policy that requires all service-to-service calls
to include a valid Backstage token. This affects how the scaffolder frontend
talks to the scaffolder backend — a call that was unauthenticated in older
versions.

For local development only, the quickest fix is to use the guest auth
provider:

backstage:
  backstage:
    appConfig:
      auth:
        providers:
          guest:
            dangerouslyAllowOutsideDevelopment: true

This keeps the auth subsystem active and provides a real
user:default/guest identity to all plugins — which is safer than
disabling auth entirely with dangerouslyDisableDefaultAuthPolicy: true.
Plugins that assume a user context will behave correctly.

For production, use the GitHub OAuth provider:

# infrastructure/backstage/values-prod.yaml
backstage:
  backstage:
    appConfig:
      auth:
        environment: production
        providers:
          github:
            production:
              clientId: ${GITHUB_CLIENT_ID}
              clientSecret: ${GITHUB_CLIENT_SECRET}

Store GITHUB_CLIENT_ID and GITHUB_CLIENT_SECRET as Kubernetes secrets,
not in values.yaml. The Helm chart's extraEnvVarsSecrets field handles
this:

backstage:
  backstage:
    extraEnvVarsSecrets:
      - backstage-secrets

Then create the secret:

kubectl create secret generic backstage-secrets \
  -n backstage \
  --from-literal=GITHUB_CLIENT_ID="your-client-id" \
  --from-literal=GITHUB_CLIENT_SECRET="your-client-secret"

How to verify auth is configured correctly:

# Check the scaffolder actions API directly
curl http://localhost:7010/api/scaffolder/v2/actions

# If you get 401: auth is not configured for your environment
# If you get 200 with a JSON list of actions: auth is working

If you get a 401 with {"error":{"name":"AuthenticationError","message":"Missing credentials"}},
the scaffolder page will still load (the HTML itself returns HTTP 200), but
the form renders blank because the actions request behind it fails. That
failed request is only visible in browser developer tools.


Catalog configuration: registering templates

The Backstage catalog applies security rules to what entity kinds are
accepted from each registered location. The default allow list for
repository-based locations does not include Template.

This is documented in
the catalog rules documentation
and the
adding templates documentation.

The registration pattern that works:

backstage:
  backstage:
    appConfig:
      catalog:
        locations:
          - type: url
            target: https://github.com/your-org/your-repo/blob/main/backstage/templates/your-template/template.yaml
            rules:
              - allow: [Template]

Without the rules: - allow: [Template] block, the entity is silently
rejected at ingestion time. The only signal is a warning in the Backstage
server logs — nothing appears in the UI.

How to diagnose catalog ingestion failures:

kubectl logs -n backstage deploy/backstage --tail=100 \
  | grep -i "warn\|error\|forbidden\|NotAllowedError"

Look for NotAllowedError: Forbidden: entity of kind Template is not
allowed from that location. If you see this, your rules block is missing
or at the wrong YAML nesting level.

After updating the config, restart Backstage to re-ingest:

kubectl rollout restart deploy/backstage -n backstage
kubectl rollout status deploy/backstage -n backstage --timeout=300s

The template should appear in /create within 60 seconds of the pod
becoming ready.
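You can also ask the catalog API directly whether the Template was ingested. The filter query parameter is part of the catalog REST API; the port assumes the port-forward used throughout this guide:

```shell
# List only Template entities; an empty items array means ingestion
# failed or has not run yet
curl "http://localhost:7010/api/catalog/entities?filter=kind=template"
```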

You can validate your app-config.yaml structure using the Backstage CLI:

npx @backstage/cli config:check --config app-config.yaml

GitHub integration: the token secret

The scaffolder requires a GitHub token to open pull requests. The token
must be present as an environment variable in the running Backstage pod.

backstage:
  backstage:
    appConfig:
      integrations:
        github:
          - host: github.com
            token: ${GITHUB_TOKEN}

Store the token as a Kubernetes secret:

# Create or update the secret
kubectl create secret generic backstage-github-token \
  -n backstage \
  --from-literal=GITHUB_TOKEN="ghp_your_token_here" \
  --dry-run=client -o yaml | kubectl apply -f -

# Restart to reload the environment variable
kubectl rollout restart deploy/backstage -n backstage

Critical: environment variables from Kubernetes secrets are injected at
pod start time. Updating the secret does not update the running pod. You
must restart the deployment after updating the secret for the new value
to take effect.

How to verify the token is present without exposing the value:

# Check character length — a valid GitHub token is 40+ characters
kubectl exec -n backstage deploy/backstage -- \
  sh -c 'echo "Token length: ${#GITHUB_TOKEN}"'

If this returns Token length: 0 or Token length: 16 (the length of a
placeholder like <YOUR_TOKEN_HERE>), the secret was not updated correctly
or the pod was not restarted after the update.
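When the length looks wrong, comparing what the secret stores against what the pod sees narrows down whether the problem is the secret itself or a stale pod. The secret and deployment names here are the ones used in this guide:

```shell
# Length of the value stored in the secret itself
kubectl get secret backstage-github-token -n backstage \
  -o jsonpath='{.data.GITHUB_TOKEN}' | base64 -d | wc -c

# Length of the value the running container actually has
kubectl exec -n backstage deploy/backstage -- \
  sh -c 'echo "${#GITHUB_TOKEN}"'

# If the first number is right and the second is not, the pod predates
# the secret update: restart the deployment to pick it up
```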


A working minimal values.yaml for local development

This is the minimum configuration that produces a functioning Backstage
portal on a local Kubernetes cluster with the demo image:

# infrastructure/backstage/values.yaml
backstage:
  backstage:
    image:
      registry: ghcr.io
      repository: backstage/backstage
      tag: latest           # pin to a specific version in production

    appConfig:
      app:
        title: Your Platform
        baseUrl: http://localhost:7010

      backend:
        baseUrl: http://localhost:7010
        cors:
          origin: http://localhost:7010
        database:
          client: better-sqlite3
          connection: ':memory:'

      auth:
        providers:
          guest:
            dangerouslyAllowOutsideDevelopment: true

      integrations:
        github:
          - host: github.com
            token: ${GITHUB_TOKEN}

      catalog:
        locations:
          - type: url
            target: https://github.com/your-org/your-repo/blob/main/backstage/templates/your-template/template.yaml
            rules:
              - allow: [Template]

    startupProbe:
      httpGet:
        path: /healthcheck
        port: 7007
      initialDelaySeconds: 120
      periodSeconds: 10
      failureThreshold: 30

    readinessProbe:
      httpGet:
        path: /healthcheck
        port: 7007
      initialDelaySeconds: 120
      periodSeconds: 10
      failureThreshold: 3

    livenessProbe:
      httpGet:
        path: /healthcheck
        port: 7007
      initialDelaySeconds: 300
      periodSeconds: 30
      failureThreshold: 3

    resources:
      requests:
        cpu: 100m
        memory: 512Mi

    extraEnvVarsSecrets:
      - backstage-github-token

  postgresql:
    enabled: false    # using in-memory SQLite for local dev

Deploying and verifying

Install:

helm repo add backstage https://backstage.github.io/charts
helm repo update

kubectl create namespace backstage

# Create the GitHub token secret first
kubectl create secret generic backstage-github-token \
  -n backstage \
  --from-literal=GITHUB_TOKEN="your-token"

# Install
helm install backstage backstage/backstage \
  -n backstage \
  -f infrastructure/backstage/values.yaml

Watch the startup:

kubectl get pods -n backstage -w

Expect the pod to stay in Running 0/1 for 60–120 seconds while Node.js
starts. Do not interpret this as a failure. The startup probe will not
pass until the application is ready.
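Rather than watching manually, you can block until the pod reports ready. The label selector here assumes the chart applies the standard app.kubernetes.io/name label; the timeout matches the startup probe's maximum budget (30 failures × 10s):

```shell
# Block (up to 5 minutes) until the startup and readiness probes pass
kubectl wait --for=condition=Ready pod \
  -l app.kubernetes.io/name=backstage \
  -n backstage \
  --timeout=300s
```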

Access the portal:

kubectl -n backstage port-forward svc/backstage 7010:7007
# Open: http://localhost:7010

Verify the backend is responding:

curl http://localhost:7010/healthcheck
# Expected: {"status":"ok"}

curl http://localhost:7010/api/scaffolder/v2/actions
# Expected: JSON list of available scaffolder actions

Verify catalog ingestion:

kubectl logs -n backstage deploy/backstage --tail=50 \
  | grep -i "processed\|warn\|error"

Look for Processed N entities with no NotAllowedError lines.


The production values profile

Separate your dev and prod configuration into two files. The difference is
significant enough that sharing a single file creates dangerous defaults
in production.

# infrastructure/backstage/values-prod.yaml
backstage:
  backstage:
    image:
      registry: ghcr.io
      repository: your-org/your-backstage-app   # your own image
      tag: "1.2.3"                               # pinned, never latest

    replicaCount: 2

    appConfig:
      app:
        baseUrl: https://backstage.your-domain.com
      backend:
        baseUrl: https://backstage.your-domain.com
        database:
          client: pg
          connection:
            host: ${POSTGRES_HOST}
            port: 5432
            user: ${POSTGRES_USER}
            password: ${POSTGRES_PASSWORD}
            database: backstage

      auth:
        environment: production
        providers:
          github:
            production:
              clientId: ${GITHUB_CLIENT_ID}
              clientSecret: ${GITHUB_CLIENT_SECRET}

    startupProbe:
      initialDelaySeconds: 60     # your own image starts faster
      failureThreshold: 18        # 3 minutes maximum

    resources:
      requests:
        cpu: 200m
        memory: 512Mi
      limits:
        cpu: "1"
        memory: 1Gi

Apply both files together:

helm upgrade backstage backstage/backstage \
  -n backstage \
  -f infrastructure/backstage/values.yaml \
  -f infrastructure/backstage/values-prod.yaml

The prod values file overrides only what it specifies. Everything else
comes from the base values.yaml.
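You can confirm the layering behaved as expected by rendering with both files and checking that a prod-only value won. Later -f files take precedence in Helm, so the prod file's pinned tag should appear in the output:

```shell
# The prod file pins the image tag; if the render still shows the base
# file's tag, the override did not land
helm template backstage backstage/backstage \
  -f infrastructure/backstage/values.yaml \
  -f infrastructure/backstage/values-prod.yaml \
  --namespace backstage \
  | grep "image:"
```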


Diagnostic command reference

# Pod status and events
kubectl get pods -n backstage
kubectl describe pod -n backstage <pod-name>

# Application logs
kubectl logs -n backstage deploy/backstage --tail=100
kubectl logs -n backstage deploy/backstage --previous --tail=100

# Catalog ingestion errors
kubectl logs -n backstage deploy/backstage --tail=200 \
  | grep -i "warn\|error\|forbidden"

# Verify rendered Helm values
helm template backstage backstage/backstage \
  -f infrastructure/backstage/values.yaml \
  --namespace backstage \
  | grep -A 5 "startupProbe"

# Verify token is loaded in the running container
kubectl exec -n backstage deploy/backstage -- \
  sh -c 'echo "GITHUB_TOKEN length: ${#GITHUB_TOKEN}"'

# Health check endpoints
curl http://localhost:7010/healthcheck
curl http://localhost:7010/api/catalog/entities?limit=1
curl http://localhost:7010/api/scaffolder/v2/actions

What this guide does not cover

This guide covers the Helm chart deployment path specifically. It does not
cover:



Jimoh Sodiq Bolaji | Platform Engineer | Technical Content Engineer
| Abuja, Nigeria
NeuroScale Platform
· Dev.to
