GitOps Patterns & Best Practices Guide
A comprehensive guide to implementing GitOps with ArgoCD, FluxCD, and Kustomize.
Table of Contents
- What Is GitOps?
- ArgoCD vs FluxCD
- Repository Strategies
- App-of-Apps Pattern
- Environment Promotion
- Secret Management
- Multi-Cluster GitOps
- Rollback Strategies
- Observability
- Common Pitfalls
What Is GitOps?
GitOps is an operational framework where the entire system state is described declaratively in Git. A GitOps operator (ArgoCD or FluxCD) continuously reconciles the cluster state with the desired state in the repository.
Core principles:
- Declarative -- The entire system is described declaratively (YAML, HCL, etc.)
- Versioned and immutable -- Git as the single source of truth
- Pulled automatically -- Agents in the cluster pull the desired state; CI does not push it
- Continuously reconciled -- Drift is detected and corrected automatically
Developer --> Git Commit --> Git Repo <-- GitOps Agent --> Kubernetes
                                                |
                                         Reconcile Loop
                                         (detect drift)
ArgoCD vs FluxCD
| Feature | ArgoCD | FluxCD |
|---|---|---|
| Architecture | Centralized server + UI | Distributed controllers |
| UI | Rich web UI built-in | No built-in UI (use Weave GitOps) |
| Multi-tenancy | AppProjects with RBAC | Namespace isolation |
| Helm support | Native + Kustomize | HelmRelease CRD |
| Diff preview | Built-in diff view | CLI only |
| Notifications | argocd-notifications | notification-controller |
| Image automation | argocd-image-updater | image-automation-controller |
| RBAC | Built-in with SSO | Kubernetes RBAC |
| Scale | Hub-and-spoke: one instance manages many clusters | Per-cluster installs, no central instance |
| Learning curve | Lower (UI helps) | Higher (CLI-first) |
When to choose ArgoCD:
- You need a visual dashboard for developers
- Your team prefers a centralized management plane
- You want built-in RBAC with SSO integration
- You manage fewer than 10 clusters
When to choose FluxCD:
- You prefer a lightweight, controller-based approach
- You need native multi-cluster support
- You want tight integration with Helm and Kustomize
- You prioritize GitOps purity (no imperative UI actions)
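The "HelmRelease CRD" entry in the comparison table refers to Flux's declarative wrapper around a Helm install. The following is a minimal sketch, assuming a Flux release that serves the GA `helm.toolkit.fluxcd.io/v2` API (older releases use `v2beta1` or `v2beta2`). The chart choice, version constraint, and values are illustrative assumptions, not part of the original setup:

```yaml
# Where to fetch the chart from (illustrative: the upstream ingress-nginx repo)
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 1h
  url: https://kubernetes.github.io/ingress-nginx
---
# The release itself; helm-controller installs and upgrades it on each interval
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 10m
  targetNamespace: ingress-nginx
  install:
    createNamespace: true
  chart:
    spec:
      chart: ingress-nginx
      version: "4.x"        # assumed constraint; pin to what you have tested
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
  values:
    controller:
      replicaCount: 2       # assumed value for illustration
```

In ArgoCD the same chart would typically be referenced directly from an Application's `source`, which is what "Native" means in the table.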
Repository Strategies
Monorepo (Recommended for Small Teams)
gitops-repo/
├── apps/
│   ├── myapp/
│   │   ├── base/
│   │   └── overlays/
│   │       ├── dev/
│   │       ├── staging/
│   │       └── prod/
│   └── another-app/
├── infrastructure/
│   ├── cert-manager/
│   ├── ingress-nginx/
│   └── monitoring/
└── clusters/
    ├── dev-cluster/
    ├── staging-cluster/
    └── prod-cluster/
Pros: Simple, all changes in one place, easy cross-cutting changes.
Cons: Blast radius, noisy commit history, harder access control.
Polyrepo (Recommended for Large Teams)
# Repo 1: app-gitops (per application team)
app-gitops/
├── base/
└── overlays/
    ├── dev/
    ├── staging/
    └── prod/

# Repo 2: infra-gitops (platform team)
infra-gitops/
├── cert-manager/
├── ingress-nginx/
└── monitoring/

# Repo 3: cluster-gitops (platform team)
cluster-gitops/
├── dev-cluster/
├── staging-cluster/
└── prod-cluster/
Pros: Clear ownership, independent release cycles, fine-grained access.
Cons: More repos to manage, cross-repo coordination is harder.
App-of-Apps Pattern
The app-of-apps pattern uses a parent Application (or ApplicationSet) to manage child Applications. This is the recommended way to bootstrap a cluster.
ArgoCD App-of-Apps
# Root application that manages all other applications
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/gitops-repo.git
    path: apps  # Directory containing Application manifests
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: argocd
  syncPolicy:
    automated:
      selfHeal: true
      prune: true
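Where the parent is an ApplicationSet rather than an Application, the child Applications are generated from a template instead of being written by hand. The following is a hedged sketch using the Git directory generator; the `workloads/*` layout (one renderable app per directory) is a hypothetical assumption, chosen so it does not clash with the `apps` directory of Application manifests used by the root app:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: workloads
  namespace: argocd
spec:
  generators:
    # One Application is generated per directory matching the glob
    - git:
        repoURL: https://github.com/your-org/gitops-repo.git
        revision: main
        directories:
          - path: workloads/*      # hypothetical layout
  template:
    metadata:
      name: '{{path.basename}}'    # directory name becomes the app name
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/gitops-repo.git
        targetRevision: main
        path: '{{path}}'
      destination:
        server: https://kubernetes.default.svc
        namespace: '{{path.basename}}'
      syncPolicy:
        automated:
          selfHeal: true
          prune: true
```

With this approach, adding a new directory is enough to onboard an app; no new Application manifest has to be committed.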
FluxCD Equivalent
# Root Kustomization that manages all others
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps
  prune: true
  dependsOn:
    - name: infrastructure
Key benefit: Bootstrap the entire cluster with a single kubectl apply.
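The `dependsOn` entry in the Flux example refers to a second Kustomization named `infrastructure`, which the original does not show. One plausible shape for it, assuming it lives in the same `flux-system` namespace and points at the `infrastructure/` directory from the monorepo layout (the interval and timeout values are assumptions):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./infrastructure
  prune: true
  # Report Ready only once the applied resources pass their health checks,
  # so that `apps` does not start before cert-manager, ingress, etc. are up
  wait: true
  timeout: 5m
```

Without `wait: true`, the dependency is satisfied as soon as the manifests are applied, which may be too early for workloads that need the infrastructure to be running.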
Environment Promotion
Strategy 1: Branch-per-Environment (Not Recommended)
main ──── dev deployment
staging ── staging deployment
prod ───── prod deployment
Problems: Merge conflicts, drift between branches, hard to compare.
Strategy 2: Directory-per-Environment (Recommended)
main branch:
  kustomize/overlays/dev/      ← dev config
  kustomize/overlays/staging/  ← staging config
  kustomize/overlays/prod/     ← prod config
Promotion is a commit that updates the image tag in the target overlay:
# Automated by CI or the promote.sh script
./scripts/promote.sh --from=dev --to=staging --image=myapp:v1.2.3
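What a promotion commit changes can be sketched with the overlay's `kustomization.yaml`. This assumes the overlay sits two levels below a shared `base/` directory and that the base references an image named `myapp`; the internals of `promote.sh` are not shown in the original, so this only illustrates the resulting file, not the script:

```yaml
# kustomize/overlays/staging/kustomization.yaml (assumed location)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
images:
  # Promotion rewrites only this tag; everything else stays untouched
  - name: myapp
    newTag: v1.2.3
```

Because the diff is a single line, the promotion is easy to review and easy to revert.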
Strategy 3: Git Tag Promotion
v1.2.3-dev ← deployed to dev
v1.2.3-staging ← deployed to staging
v1.2.3-prod ← deployed to prod
Each environment tracks a different tag pattern.
Recommended Workflow
1. Developer pushes code
2. CI builds image (myapp:v1.2.3)
3. CI updates dev overlay automatically
4. GitOps deploys to dev
5. After validation, promote.sh updates staging overlay (commit)
6. GitOps deploys to staging
7. After approval, promote.sh creates PR for prod
8. Reviewer approves PR → merge → GitOps deploys to prod
Secret Management
Never store plain-text secrets in Git. Use one of these approaches:
SOPS (Mozilla) + Age/PGP
Encrypts secret values in-place. The GitOps controller decrypts at apply time.
# Encrypt a secret
sops --encrypt --age age1... secrets.yaml > secrets.enc.yaml
# FluxCD decryption config
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
spec:
  decryption:
    provider: sops
    secretRef:
      name: sops-age-key
Pros: Secrets in Git (encrypted), diff-friendly, works with any tool.
Cons: Key management; anyone who encrypts needs the public key, and the private key must be stored securely in the cluster.
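To encrypt values in place while leaving the manifest metadata readable, SOPS can be driven by a `.sops.yaml` file at the repository root. A minimal sketch; the path pattern is an assumption about how secret files are named, and the Age recipient is left elided exactly as in the command above:

```yaml
# .sops.yaml
creation_rules:
  # Apply this rule to files whose path contains "secrets" and ends in .yaml
  - path_regex: .*secrets.*\.yaml$
    # Encrypt only the Secret payload; kind, metadata, etc. remain plain text
    encrypted_regex: ^(data|stringData)$
    age: age1...
```

With this file in place, `sops --encrypt secrets.yaml` picks up the recipient from the matching rule, so the key does not have to be passed on the command line.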
Sealed Secrets (Bitnami)
Encrypts secrets client-side with a cluster-specific key. Only the controller can decrypt.
# Seal a secret
kubeseal --format=yaml < secret.yaml > sealed-secret.yaml
Pros: No external dependencies, cluster-scoped encryption.
Cons: Cluster-specific keys, can't decrypt locally.
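For orientation, the output of `kubeseal` is a SealedSecret resource roughly like the sketch below. The names and namespace are hypothetical, and the ciphertext is a placeholder rather than real output:

```yaml
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: myapp-secrets
  namespace: myapp
spec:
  encryptedData:
    # Placeholder: kubeseal writes a long base64 ciphertext here
    db-password: "<ciphertext produced by kubeseal>"
  template:
    # Metadata and type of the Secret the controller creates after decrypting
    metadata:
      name: myapp-secrets
      namespace: myapp
    type: Opaque
```

This file is safe to commit; the controller in the target cluster turns it into a regular Secret with the same name.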
External Secrets Operator
Syncs secrets from external vaults (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault).
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: ClusterSecretStore
  target:
    name: myapp-secrets
  data:
    - secretKey: db-password
      remoteRef:
        key: myapp/prod/db-password
Pros: Centralized secret management, rotation, audit trail.
Cons: External dependency, more moving parts.
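The ExternalSecret above references a ClusterSecretStore named `aws-secrets-manager` that the original does not define. One possible definition, assuming AWS Secrets Manager with authentication through a Kubernetes service account (IRSA); the region and the service account name and namespace are hypothetical:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: aws-secrets-manager
spec:
  provider:
    aws:
      service: SecretsManager
      region: us-east-1                # assumed region
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets     # assumed service account
            namespace: external-secrets
```

Keeping the store cluster-scoped lets several namespaces share one vault connection, while each ExternalSecret still controls which keys it pulls.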
Multi-Cluster GitOps
ArgoCD: Hub-and-Spoke
One ArgoCD instance manages multiple clusters:
# Register a remote cluster
argocd cluster add my-remote-cluster
# Application targeting remote cluster
spec:
  destination:
    server: https://remote-cluster-api:6443
    namespace: myapp
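With several registered clusters, writing one Application per cluster gets repetitive. The sketch below uses an ApplicationSet cluster generator to stamp out the same app on every matching cluster. It assumes the registered clusters carry a hypothetical `env: prod` label and that the app lives at `kustomize/overlays/prod`:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: myapp-prod
  namespace: argocd
spec:
  generators:
    # Selects registered clusters by label (label is an assumption)
    - clusters:
        selector:
          matchLabels:
            env: prod
  template:
    metadata:
      name: 'myapp-{{name}}'          # {{name}} is the registered cluster name
    spec:
      project: default
      source:
        repoURL: https://github.com/your-org/gitops-repo.git
        targetRevision: main
        path: kustomize/overlays/prod
      destination:
        server: '{{server}}'          # API server URL of the matched cluster
        namespace: myapp
```

Registering a new cluster with the matching label is then enough for the hub to deploy the app there.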
FluxCD: Per-Cluster Bootstrap
Each cluster has its own Flux installation pointing to the same repo:
gitops-repo/
├── clusters/
│   ├── us-east-1/        ← Flux on cluster 1 watches this
│   │   ├── flux-system/
│   │   └── apps.yaml
│   ├── eu-west-1/        ← Flux on cluster 2 watches this
│   │   ├── flux-system/
│   │   └── apps.yaml
│   └── base/             ← Shared across clusters
└── apps/
    └── myapp/
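Each cluster's `apps.yaml` can point at the shared `apps/` tree and inject only what differs per cluster. A hedged sketch of `clusters/us-east-1/apps.yaml`; the `cluster_region` variable is a hypothetical name that the manifests under `apps/myapp` would reference as `${cluster_region}`:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
  path: ./apps/myapp
  prune: true
  postBuild:
    # Replaces ${cluster_region} in the rendered manifests before applying
    substitute:
      cluster_region: us-east-1
```

The `eu-west-1` copy would differ only in the substituted value, which keeps per-cluster drift visible in a one-line diff.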
Rollback Strategies
Git Revert (Preferred)
# Revert the promotion commit
git revert HEAD
git push
# GitOps controller reconciles to previous state
ArgoCD Manual Rollback
# View deployment history
argocd app history myapp
# Rollback to specific revision (requires automated sync to be disabled on the app)
argocd app rollback myapp <revision-id>
FluxCD Suspend + Manual Fix
# Suspend auto-reconciliation
flux suspend ks myapp-prod
# Fix the issue in Git
git revert HEAD && git push
# Resume reconciliation
flux resume ks myapp-prod
Best practice: Always roll forward via Git. Manual rollbacks bypass the GitOps audit trail.
Observability
Metrics to Monitor
| Metric | Description |
|---|---|
| argocd_app_sync_total | Total sync operations |
| argocd_app_health_status | Application health state |
| gotk_reconcile_duration_seconds | Flux reconciliation time |
| gotk_reconcile_condition | Flux reconciliation status |
Recommended Dashboards
- ArgoCD: Import Grafana dashboard ID 14584
- FluxCD: Import Grafana dashboard ID 16714
- Custom: Track deployment frequency, lead time, MTTR, failure rate (DORA metrics)
Alert Rules
# Alert if app is out of sync for >10 minutes
- alert: ArgoAppOutOfSync
  expr: argocd_app_info{sync_status="OutOfSync"} == 1
  for: 10m
  labels:
    severity: warning
# Alert if Flux reconciliation fails
- alert: FluxReconciliationFailed
  expr: gotk_reconcile_condition{type="Ready",status="False"} == 1
  for: 5m
  labels:
    severity: critical
Common Pitfalls
1. Mixing Imperative and Declarative
Don't run kubectl apply manually alongside GitOps. With self-heal or drift correction enabled, the controller reverts your changes on the next reconciliation cycle; without it, the cluster silently diverges from Git.
2. Not Using Sync Waves
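Resources have ordering dependencies. A CustomResourceDefinition, for example, must exist before any custom resource that uses it. Use ArgoCD sync waves (the argocd.argoproj.io/sync-wave annotation) or FluxCD dependencies (dependsOn) to control the apply order.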
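As an illustration of sync waves, the sketch below orders three resources within one ArgoCD Application. Lower waves are applied first, and ArgoCD waits for each wave to become healthy before moving on. The resource names, image, and config key are hypothetical:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: myapp
  annotations:
    argocd.argoproj.io/sync-wave: "-1"   # applied first
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
  namespace: myapp
  annotations:
    argocd.argoproj.io/sync-wave: "0"    # applied once the namespace exists
data:
  LOG_LEVEL: info
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
  annotations:
    argocd.argoproj.io/sync-wave: "1"    # applied last; consumes the ConfigMap
spec:
  replicas: 1
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:v1.2.3
          envFrom:
            - configMapRef:
                name: myapp-config
```

Note that the wave value is a quoted string; an unquoted number is rejected because annotation values must be strings.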
3. Ignoring Drift Detection
Enable self-heal (ArgoCD) or drift detection (FluxCD). Without it, manual changes persist until the next Git commit.
4. Storing Secrets in Plain Text
Use SOPS, Sealed Secrets, or External Secrets Operator. Never commit plain-text secrets.
5. Not Testing Kustomize Locally
Always validate before pushing:
# Build and verify Kustomize output
kustomize build kustomize/overlays/dev | kubectl apply --dry-run=client -f -
# Validate with kubeval/kubeconform
kustomize build kustomize/overlays/prod | kubeconform -strict
6. Infinite Reconciliation Loops
Some controllers (e.g., HPA, Istio) mutate resources. Use ignoreDifferences (ArgoCD) or exclude fields from drift detection (FluxCD) to prevent infinite sync loops.
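For the HPA case, a common arrangement is to tell ArgoCD to ignore the replica count that the autoscaler owns. A minimal sketch, assuming an Application named `myapp` that deploys the prod overlay; the repository path and namespace are assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/gitops-repo.git
    path: kustomize/overlays/prod
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: myapp
  ignoreDifferences:
    # The HPA changes replicas at runtime; do not treat that as drift
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas
  syncPolicy:
    automated:
      selfHeal: true
      prune: true
    syncOptions:
      # Also skip the ignored fields when syncing, not only when diffing
      - RespectIgnoreDifferences=true
```

It may also help to omit `replicas` from the Deployment manifest in Git altogether, so there is no desired value for the controller to fight over.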
Part of the DevOps Toolkit by Datanest Digital