DEV Community

Artur Havrylov

Per-PR Preview Environments with FluxCD

You want a preview environment for every pull request - open a PR, get a working URL. On Kubernetes with Flux, there's no built-in way to get one: Kustomization and HelmRelease don't template themselves per PR.

The usual workarounds each have a catch:

  • a kubectl apply CI job - breaks GitOps, since the cluster drifts from git
  • Argo CD ApplicationSet - now you're running two GitOps controllers
  • a custom controller that watches your PRs - works, but it's yours to maintain

There's a lighter option: the Flux Operator's ResourceSet and ResourceSetInputProvider CRDs. Two YAML files per service, no custom code, no second control plane. This post is the full setup - copy-paste-able - using a backend service called api as the example.

What you'll build

  • A preview environment for every open PR, on its own URL (api-1234.preview.example.com).
  • Deploy gated on the image build - no ImagePullBackOff while CI is still running.
  • Automatic teardown when the PR closes - no cleanup job.
  • All in plain YAML, reconciled by Flux.

How it works

Three pieces make it work:

  • A watcher lists the PRs that are ready to deploy - it keys off a label your CI adds once the image is built.
  • A templater stamps out a full copy of your app for each PR: its own name, URL, and image tag.
  • Flux deploys those copies and keeps the cluster in sync as PRs open and close.

PR opened + labeled "build/image-ready"
        |
        v
Watcher  (ResourceSetInputProvider, polls every 5m)
  lists each open PR as { id, sha }
        |
        v
Templater  (ResourceSet)
  stamps out one environment per PR
        |
        v
Deployed preview env  (api-1234.preview.example.com)
  auto-deleted when the PR closes

The rest of this post is just wiring those three pieces in YAML.

Prerequisites

  • A cluster already running Flux (flux-system GitRepository + controllers).
  • The Flux Operator installed on top - it's a separate project from Flux core:
  helm install flux-operator oci://ghcr.io/controlplaneio-fluxcd/charts/flux-operator \
    --namespace flux-system
  • Provider auth: a PAT (code-read scope) referenced via secretRef - encrypt it (for example with Sealed Secrets or SOPS) so it can live in git - or workload identity (serviceAccountName + cloud IAM) to avoid storing a token at all.

The two CRDs

The watcher and templater are two CRDs from the Flux Operator:

  • ResourceSetInputProvider - the watcher. Polls a source (Azure DevOps, GitHub, GitLab) on an interval and emits one input per open PR, each with fields like inputs.id (PR number) and inputs.sha (head commit).
  • ResourceSet - the templater. Takes those inputs and renders a set of Kubernetes resources once per input, via << inputs.id >> substitution.

Because the watcher and the template are separate, you can swap an Azure DevOps source for a GitHub one without touching what gets deployed.

Step 1: Watch PRs

The input provider for the api service - one repo, polled every five minutes:

apiVersion: fluxcd.controlplane.io/v1
kind: ResourceSetInputProvider
metadata:
  name: api-prs
  namespace: app-dev
  annotations:
    fluxcd.controlplane.io/reconcileEvery: "5m"
spec:
  type: AzureDevOpsPullRequest
  url: https://dev.azure.com/{org}/{project}/_git/api
  secretRef:
    name: azure-devops-auth
  skip:
    labels:
      - "!build/image-ready"
  • type: AzureDevOpsPullRequest queries active PRs. GitHubPullRequest / GitLabMergeRequest work the same way, so nothing else in the setup changes.
  • skip.labels: ["!build/image-ready"] is the key line. The leading ! inverts the match: skip any PR that does NOT have build/image-ready. Net effect - only labeled PRs become inputs.
  • That label is the deploy-gates-on-build trick. Your CI sets build/image-ready only after it has built and pushed the image. Until then the provider doesn't see the PR, so the deploy can't race the build. No webhooks, no retry loops, no ImagePullBackOff.
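
To make the inverted match concrete, here is a small Python model of the skip rule as described above. It is a sketch of the semantics only, not the Flux Operator's code: the PullRequest class, the function names, and the sample SHAs are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    id: str
    sha: str
    labels: set[str] = field(default_factory=set)


def is_skipped(pr: PullRequest, skip_labels: list[str]) -> bool:
    """Return True if any skip rule matches the PR.

    A plain rule ("wip") matches when the PR HAS the label.
    A negated rule ("!build/image-ready") matches when the PR LACKS it.
    """
    for rule in skip_labels:
        if rule.startswith("!"):
            if rule[1:] not in pr.labels:
                return True
        elif rule in pr.labels:
            return True
    return False


def emit_inputs(prs: list[PullRequest], skip_labels: list[str]) -> list[dict]:
    """Keep only the PRs that no skip rule matches, shaped as {id, sha} inputs."""
    return [{"id": pr.id, "sha": pr.sha} for pr in prs if not is_skipped(pr, skip_labels)]


prs = [
    PullRequest("1234", "a1b2c3d", {"build/image-ready"}),  # CI finished
    PullRequest("1289", "e4f5a6b"),                         # CI still building: no label yet
]
inputs = emit_inputs(prs, ["!build/image-ready"])
print(inputs)  # [{'id': '1234', 'sha': 'a1b2c3d'}]
```

PR 1289 only shows up as an input once CI adds the label, which is the whole gate.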

Step 2: Template the Kustomization

The ResourceSet consumes those inputs and renders one Kustomization per PR:

apiVersion: fluxcd.controlplane.io/v1
kind: ResourceSet
metadata:
  name: api-pr-envs
  namespace: app-dev
spec:
  inputsFrom:
    - apiVersion: fluxcd.controlplane.io/v1
      kind: ResourceSetInputProvider
      name: api-prs
  resources:
    - apiVersion: kustomize.toolkit.fluxcd.io/v1
      kind: Kustomization
      metadata:
        name: api-pr-<< inputs.id >>
        namespace: app-dev
      spec:
        dependsOn:
          - name: app-common-dev      # shared namespace, ingress class, configmaps
            namespace: flux-system
        interval: 10m
        retryInterval: 1m
        prune: true                   # cascade-delete on teardown
        wait: false
        sourceRef:
          kind: GitRepository
          name: flux-system
          namespace: flux-system
        path: ./clusters/eks/apps/api/pr-template
        postBuild:
          substitute:
            PR_NUMBER: << inputs.id | quote >>
            COMMIT_SHA: "<< inputs.sha >>"

If 7 PRs are open, this renders 7 Kustomization objects: api-pr-1234, api-pr-1289, and so on - each one you can inspect, retry, or delete on its own. postBuild.substitute passes PR_NUMBER and COMMIT_SHA down into the manifests at the templated path. dependsOn makes each per-PR env wait for shared infra so they don't race during cluster bootstrap.
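
If the "once per input" rendering feels abstract, this toy Python renderer shows the idea. It is only a sketch: the real operator uses its own template engine, and this version understands just the two placeholder forms used in the manifest above (plain, and piped through quote). The input values are invented.

```python
import re

# The templated fields from the ResourceSet above, as plain strings.
TEMPLATE = {
    "name": "api-pr-<< inputs.id >>",
    "PR_NUMBER": "<< inputs.id | quote >>",
    "COMMIT_SHA": '"<< inputs.sha >>"',
}

# Matches "<< inputs.KEY >>" and "<< inputs.KEY | quote >>".
PLACEHOLDER = re.compile(r"<<\s*inputs\.(\w+)\s*(\|\s*quote\s*)?>>")


def render(text: str, inputs: dict) -> str:
    """Substitute one input set into one templated string."""
    def replace(match: re.Match) -> str:
        value = str(inputs[match.group(1)])
        return f'"{value}"' if match.group(2) else value

    return PLACEHOLDER.sub(replace, text)


def render_all(template: dict, all_inputs: list[dict]) -> list[dict]:
    """Render the whole template once per input set."""
    return [{key: render(value, inp) for key, value in template.items()} for inp in all_inputs]


all_inputs = [{"id": "1234", "sha": "a1b2c3d"}, {"id": "1289", "sha": "e4f5a6b"}]
rendered = render_all(TEMPLATE, all_inputs)
for obj in rendered:
    print(obj["name"], obj["PR_NUMBER"], obj["COMMIT_SHA"])
```

Both quoting styles end up as a quoted string in the rendered YAML, which keeps a numeric PR id from being parsed as an integer.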

Step 3: The per-PR overlay

This is the piece most write-ups skip. The path above points at a Kustomize overlay that turns one base manifest set into a PR-scoped copy:

# ./clusters/eks/apps/api/pr-template/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: app-dev

resources:
  - ../../../../base/api          # the normal, un-PR'd manifests

nameSuffix: -${PR_NUMBER}         # api -> api-1234, on EVERY resource
commonLabels:
  pr-env: "pr-${PR_NUMBER}"       # one label to find/watch the whole env

images:
  - name: registry.example.com/api
    newTag: pr-${PR_NUMBER}       # pull the image CI built for this PR

replicas:
  - name: api-deployment
    count: 1                      # previews don't need HA

patches:
  - path: pr-ingress-patch.yaml
  - path: pr-env-vars-patch.yaml
  # cluster-scoped resources can't be duplicated per PR - strip them
  - target: { kind: ClusterRole }
    patch: |
      $patch: delete
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata: { name: unused }
  - target: { kind: ClusterRoleBinding }
    patch: |
      $patch: delete
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata: { name: unused }

The mechanics that make this work:

  • nameSuffix: -${PR_NUMBER} is the isolation engine. Kustomize appends it to every resource name, so api becomes api-1234 across Deployment, Service, Ingress - no per-resource editing. Everything lives in one shared namespace; the suffix keeps PRs from colliding. (Simpler than namespace-per-PR, and you don't pay namespace setup cost on every env.)
  • images.newTag sets the image tag the Kustomize way instead of string-replacing inside the manifest - cleaner, and only images matching the given name are touched.
  • $patch: delete on cluster-scoped kinds is the gotcha nobody warns you about: ClusterRole and ClusterRoleBinding are cluster-scoped, so per-PR copies would live outside the shared namespace and either collide or leak. (ServiceAccount is namespaced, so the suffix handles it like any other resource.) Strip the cluster-scoped kinds from the overlay and provision them once in shared infra instead.
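
One detail worth spelling out: the suffix is applied in two phases. Kustomize appends the literal text -${PR_NUMBER} during the build, and Flux's postBuild substitution fills in the variable afterwards. The Python below models that order with the resource names from this post; the helper names are hypothetical and the regex only covers the simple ${VAR} form.

```python
import re

# Only the braced ${VAR} form is modelled here.
VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def kustomize_build(base_names: list[str], name_suffix: str) -> list[str]:
    """Phase 1 (model of nameSuffix): append the suffix, placeholder and all."""
    return [name + name_suffix for name in base_names]


def post_build_substitute(text: str, variables: dict[str, str]) -> str:
    """Phase 2 (model of postBuild.substitute): fill in ${VAR} placeholders.

    This sketch raises KeyError on an undefined variable; that is a choice
    made for the example, not a statement about Flux's behaviour.
    """
    return VAR.sub(lambda m: variables[m.group(1)], text)


base = ["api", "api-deployment", "api-ingress"]
built = kustomize_build(base, "-${PR_NUMBER}")
final = [post_build_substitute(name, {"PR_NUMBER": "1234"}) for name in built]

print(built)  # ['api-${PR_NUMBER}', 'api-deployment-${PR_NUMBER}', 'api-ingress-${PR_NUMBER}']
print(final)  # ['api-1234', 'api-deployment-1234', 'api-ingress-1234']
```

This is also why the patches keep targeting the base names (api-deployment, api-ingress): they are matched before the suffix lands.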

The two patches are small. Ingress gives the PR its own hostname:

# pr-ingress-patch.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
spec:
  rules:
    - host: api-${PR_NUMBER}.preview.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port: { number: 80 }
  tls:
    - hosts: [api-${PR_NUMBER}.preview.example.com]

And the env patch stamps the commit SHA as a pod annotation:

# pr-env-vars-patch.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-deployment
spec:
  template:
    metadata:
      annotations:
        app/commit-sha: "${COMMIT_SHA}"   # changes on each push -> forces rollout
    spec:
      containers:
        - name: api
          env:
            - name: PUBLIC_URL
              value: "https://api-${PR_NUMBER}.preview.example.com"

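Why bother? When you push a new commit to the same PR, the image tag stays the same (pr-1234), so the pod template wouldn't change and Kubernetes wouldn't roll the Deployment. Stamping the new COMMIT_SHA into a pod-template annotation changes the template and forces a fresh rollout on every push. Because the tag is reused, also set imagePullPolicy: Always on the container; otherwise a node that already cached pr-1234 can start the new pods from the old image.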
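
A quick way to convince yourself: fingerprint the pod template with and without the annotation. This Python sketch uses a SHA-256 of the template as a stand-in for "did spec.template change?"; it is not how Kubernetes computes its own template hash, and the commit SHAs are invented.

```python
import hashlib
import json
from typing import Optional


def pod_template(image_tag: str, commit_sha: Optional[str] = None) -> dict:
    """Minimal pod template; the annotation is only added when a SHA is given."""
    annotations = {"app/commit-sha": commit_sha} if commit_sha else {}
    return {
        "metadata": {"annotations": annotations},
        "spec": {
            "containers": [
                {"name": "api", "image": f"registry.example.com/api:{image_tag}"}
            ]
        },
    }


def fingerprint(template: dict) -> str:
    """Stable digest of the template; any field change gives a new digest."""
    payload = json.dumps(template, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]


# Same tag, no annotation: two pushes look identical, so nothing would roll.
plain_first = fingerprint(pod_template("pr-1234"))
plain_second = fingerprint(pod_template("pr-1234"))

# Same tag, SHA annotation: each push changes the template.
stamped_first = fingerprint(pod_template("pr-1234", "a1b2c3d"))
stamped_second = fingerprint(pod_template("pr-1234", "e4f5a6b"))

print(plain_first == plain_second)      # True
print(stamped_first == stamped_second)  # False
```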

Step 4: Cleanup is automatic

There's no teardown job. When PR #1234 closes (or loses its build/image-ready label):

  1. On the next 5-minute poll, the input provider stops emitting an input for #1234.
  2. The ResourceSet sees api-pr-1234 no longer maps to a live input and deletes that Kustomization.
  3. Because that Kustomization has prune: true, it cascade-deletes everything it owns - Deployment, Service, Ingress, ConfigMap - all gone.

The whole environment unwinds without any manual step, typically within one poll interval of the PR closing. "What should exist" is derived from "what PRs are open," and Flux drives the cluster toward it.
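
The teardown logic is a set difference. Here is a minimal Python model of the plan a reconcile would arrive at, assuming the api-pr-<id> naming from Step 2; the function names and the sample inputs are illustrative, not the operator's internals.

```python
def desired_names(inputs: list[dict]) -> set[str]:
    """Names the template would render for the current inputs."""
    return {f"api-pr-{inp['id']}" for inp in inputs}


def plan(existing: set[str], inputs: list[dict]) -> dict[str, list[str]]:
    """Compare what exists with what the inputs call for."""
    desired = desired_names(inputs)
    return {
        "create": sorted(desired - existing),
        "delete": sorted(existing - desired),
    }


# PR 1234 was closed, PR 1301 was just labeled, PR 1289 is unchanged.
existing = {"api-pr-1234", "api-pr-1289"}
current_inputs = [{"id": "1289", "sha": "e4f5a6b"}, {"id": "1301", "sha": "0c9d8e7"}]

result = plan(existing, current_inputs)
print(result)  # {'create': ['api-pr-1301'], 'delete': ['api-pr-1234']}
```

Deleting api-pr-1234 is what triggers the prune: true cascade described in step 3.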

Gotchas worth knowing

  • Gate the deploy on the build. The !build/image-ready skip rule is doing real work - without it, Flux tries to deploy before CI pushes the image. Set the label as the last CI step.
  • Strip cluster-scoped resources ($patch: delete). They can't be safely suffixed per PR; share them from common infra.
  • Force rollouts with a SHA annotation if your image tag is stable per PR.
  • Poll latency is your iteration loop. 5m is a reasonable default - push, wait for CI, get a deploy. Shorter means more API calls to your provider.
  • dependsOn shared infra, or every preview env races namespace/ingress setup on bootstrap.
  • If you use a PAT: code-read scope is enough; keep it sealed.
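
For the poll-latency trade-off, some back-of-the-envelope arithmetic helps. This sketch assumes one poll per provider per interval (a real poll may take more than one API request) and a hypothetical fleet of 10 providers.

```python
def polls_per_day(interval_minutes: int, providers: int = 1) -> int:
    """Polls issued per day, assuming one poll per provider per interval."""
    return (24 * 60 // interval_minutes) * providers


def worst_case_pickup_minutes(interval_minutes: int) -> int:
    """A label added right after a poll waits one full interval to be seen."""
    return interval_minutes


for interval in (1, 5, 15):
    print(interval, polls_per_day(interval, providers=10), worst_case_pickup_minutes(interval))
# 1 14400 1
# 5 2880 5
# 15 960 15
```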

Wrap-up

The same shape - watch a dynamic source, render a templated resource per item, reconcile both directions - shows up in Argo ApplicationSet, Crossplane Composition, and Terraform for_each. What the Flux Operator does well is keep it to two CRDs and pure YAML. If you're already on Flux, the cost is one Helm install plus a provider, a ResourceSet, and a small overlay per service - cheap enough that per-PR previews stop being a "someday" item.
