People talk about GitOps like it is the final form of delivery. In real life, it depends a lot on scale.
I have spent years helping teams go from one multi-tenant instance to hundreds of single-tenant instances. GitOps was useful early. At large scale, though, it became a constant fight for me.
One formula captures it well: P(failure) = 1 - p^n.
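Here p is the chance that each individual change works, n is how many moving parts you have to coordinate, and P(failure) is the chance that at least one of them fails, assuming they succeed or fail independently. As n grows, failure risk climbs fast even if each single change is "pretty safe."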
For example: you are deploying one release to 100 single-tenant customer environments, and each environment sync has a 99% success rate.
- p = 0.99 (one environment sync succeeds 99% of the time)
- n = 100 (100 environment syncs in the rollout wave)
P(failure) = 1 - 0.99^100 ≈ 0.634
So that rollout has about a 63% chance that at least one customer environment fails to deploy cleanly on the first pass.
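To get a feel for how fast this grows, here is a minimal Python sketch of the formula above. The function name is mine, and it assumes every sync succeeds or fails independently of the others, which real fleets only approximate.

```python
def rollout_failure_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent steps fails.

    p is the per-step success probability, n the number of steps.
    """
    return 1.0 - p ** n


# Same per-step reliability, growing fleet size.
for n in (1, 10, 50, 100, 200):
    risk = rollout_failure_probability(0.99, n)
    print(f"n={n:>3}  P(failure)={risk:.3f}")
```

Running it with your own p and n is a quick sanity check before you commit to a rollout plan.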
[Plot: P(failure) on the vertical axis (0.00 to 1.00) against n on the horizontal axis (0 to 160); the risk rises as n grows.]
Formula-wise, you only have two levers:
- reduce n (fewer independent steps per rollout)
- increase p (make each step more reliable)
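Both levers can be read straight off the formula by solving it for n or for p. The sketch below does that in Python. The function names and the 10% risk budget are illustrative assumptions, not numbers from any real fleet.

```python
import math


def max_steps_for_target(p: float, max_failure: float) -> int:
    """Largest n for which 1 - p**n stays at or below max_failure."""
    return math.floor(math.log(1.0 - max_failure) / math.log(p))


def required_step_reliability(n: int, max_failure: float) -> float:
    """Smallest per-step p for which 1 - p**n stays at or below max_failure."""
    return (1.0 - max_failure) ** (1.0 / n)


# Lever 1: with p fixed at 0.99, how many steps fit in a 10% risk budget?
print(max_steps_for_target(p=0.99, max_failure=0.10))

# Lever 2: with n fixed at 100, how reliable must each step be?
print(round(required_step_reliability(n=100, max_failure=0.10), 5))
```

The second number is the uncomfortable one: at n = 100, "99% reliable" per step is nowhere near enough.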
GitOps alone does not raise p for you. To improve p, you need other tooling and controls like preflight checks, dependency validation, rollout orchestration, retries, and policy guardrails.
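Retries are the easiest of those controls to put a number on. Here is a rough sketch, assuming each attempt is independent. That only holds for transient failures; a bad config fails the same way every time, so treat the result as an upper bound. The function names are mine.

```python
def reliability_with_retries(p: float, attempts: int) -> float:
    """Chance a step succeeds within `attempts` independent tries."""
    return 1.0 - (1.0 - p) ** attempts


def rollout_failure_probability(p: float, n: int) -> float:
    """Chance that at least one of n independent steps fails."""
    return 1.0 - p ** n


base_p = 0.99
p_with_one_retry = reliability_with_retries(base_p, attempts=2)

print(f"no retry:  {rollout_failure_probability(base_p, 100):.3f}")
print(f"one retry: {rollout_failure_probability(p_with_one_retry, 100):.3f}")
```

Under those assumptions, one retry per environment moves the 100-environment rollout from roughly a coin flip to about a 1% chance of a failed environment.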
Where GitOps works
GitOps is great when:
- you have a small number of environments
- ownership is clear
- changes are low risk
- teams are disciplined with review and automation
In that setup, Git gives you clean history, solid audit trails, and predictable rollouts.
small scale
dev -> PR -> merge -> deploy -> done
(few moving parts, easy to reason about)
Where it starts to hurt
Once you have a big fleet, a few things happen fast.
Pull requests become your release system
Every deployment turns into repo choreography. More branches, more approvals, more waiting. You start optimizing for merge flow instead of delivery outcomes.
large scale
CI -> PR -> approval -> merge -> sync
        \-> policy check -> rebase -> approval -> merge -> sync
                \-> hotfix PR -> cherry-pick -> re-sync
Rollbacks are not simple anymore
Rolling back one service is easy. Rolling back a whole environment with dependencies is not. Git can show you what changed, but it cannot restore all runtime conditions.
Config sprawl gets expensive
At scale, you end up with endless overrides: customer-specific, region-specific, compliance-specific, and emergency patches. The issue is not YAML itself. The issue is how much state humans must keep in their heads.
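To make that concrete, here is a toy sketch of layered overrides in Python. The layer names, keys, and values are all made up, and the merge is a shallow "last layer wins" rule, which is simpler than what most real tooling does.

```python
from typing import Any


def resolve_config(*layers: dict[str, Any]) -> dict[str, Any]:
    """Merge layers left to right; a later layer overrides an earlier one."""
    resolved: dict[str, Any] = {}
    for layer in layers:
        resolved.update(layer)
    return resolved


# Hypothetical layers for one customer environment.
defaults = {"replicas": 3, "log_level": "info", "tls": "1.2"}
region_eu = {"data_residency": "eu"}
compliance = {"tls": "1.3"}
customer_acme = {"replicas": 5}
emergency_patch = {"log_level": "debug"}

effective = resolve_config(
    defaults, region_eu, compliance, customer_acme, emergency_patch
)
print(effective)
```

Five layers and five keys are still easy. The point is that nobody can answer "what is actually running for this customer?" without replaying the whole stack, and at fleet scale that replay happens in someone's head.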
Out-of-band changes become normal
This is the part people avoid saying out loud.
At scale, teams will make changes outside GitOps. During incidents, during customer escalations, during vendor outages. Not because they are careless, but because they are solving an immediate problem.
If your model assumes that never happens, it is too idealistic for enterprise operations.
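A model that accepts out-of-band changes needs, at minimum, a way to see them. Here is a minimal drift check, assuming both the desired state and the live state can be flattened into plain dictionaries. That is a big simplification, and the keys below are hypothetical.

```python
ABSENT = "<absent>"  # marker for a key that exists on only one side


def detect_drift(desired: dict, live: dict) -> dict:
    """Return {key: (desired_value, live_value)} for every key that differs."""
    drift = {}
    for key in desired.keys() | live.keys():
        want = desired.get(key, ABSENT)
        have = live.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


desired = {"image": "app:1.4.2", "replicas": 3}
live = {"image": "app:1.4.2", "replicas": 6, "debug": True}

# Someone scaled up and enabled debug during an incident.
print(detect_drift(desired, live))
```

What you do with the result (revert it, adopt it into Git, or page someone) is a policy decision, and that is exactly the part a repo cannot make for you.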
The split you see in GitOps opinions
GitOps lovers and GitOps haters are usually dealing with different scales.
At small scale, GitOps feels clean.
At enterprise scale, repo-centric workflows become too low-level for the job.
That is the real mismatch.
What actually works better
Do not throw away GitOps. Just stop treating Git as the entire control plane.
Use Git for intent and auditability. Add platform-level orchestration for:
- preflight checks (before rollout starts, not after breakage)
- strong defaults (safe rollout strategy, retries, timeouts, guardrails)
- dependency validation (service A should not move before dependency B is healthy)
desired model
Git (intent) ---> Orchestrator ---> Fleet of environments
                  |    |            |   |   |
                  |    +-> policy   e1  e2  e3 ... eN
                  +-> rollout waves
                  +-> drift detection
                  +-> recovery paths
- rollout waves
- dependency ordering
- policy enforcement
- drift detection
- safe recovery after out-of-band changes
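Rollout waves with preflight checks can be sketched in a few lines of Python. This is not how any particular orchestrator works. The `preflight` and `deploy` callbacks, the wave size, and the failure threshold are placeholders you would wire to your own tooling.

```python
from typing import Callable


def rollout_in_waves(
    environments: list[str],
    wave_size: int,
    preflight: Callable[[str], bool],
    deploy: Callable[[str], bool],
    max_failures_per_wave: int = 0,
) -> tuple[list[str], list[str], list[str]]:
    """Deploy wave by wave; stop before the next wave if a wave fails too often.

    Returns (succeeded, failed, not_attempted).
    """
    succeeded: list[str] = []
    failed: list[str] = []
    for start in range(0, len(environments), wave_size):
        wave = environments[start:start + wave_size]
        wave_failures = 0
        for env in wave:
            # Preflight runs first; deploy is skipped if it fails.
            if preflight(env) and deploy(env):
                succeeded.append(env)
            else:
                failed.append(env)
                wave_failures += 1
        if wave_failures > max_failures_per_wave:
            break  # halt the rollout; later waves stay untouched
    attempted = len(succeeded) + len(failed)
    return succeeded, failed, environments[attempted:]


envs = [f"e{i}" for i in range(1, 7)]
result = rollout_in_waves(
    envs,
    wave_size=2,
    preflight=lambda env: env != "e3",  # pretend e3 fails its preflight
    deploy=lambda env: True,
)
print(result)
```

The useful property is the third list: environments that were never touched because an earlier wave tripped the threshold.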
This is the key point: you need a tool that can actively orchestrate and enforce these runtime controls. GitOps alone cannot provide that. Git can store desired state. It does not run rollout logic, cross-environment safety checks, or live dependency coordination by itself.
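As one example of logic that has to live outside the repo, here is a sketch of dependency-ordered deployment using `graphlib` from the Python standard library (3.9+). The service names and the `deploy` and `is_healthy` callbacks are hypothetical stand-ins for real platform calls.

```python
from graphlib import TopologicalSorter
from typing import Callable


def rollout_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Order services so each one comes after everything it depends on.

    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


def deploy_in_dependency_order(
    depends_on: dict[str, set[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> list[str]:
    """Deploy in order, stopping as soon as a service's dependency is unhealthy."""
    deployed: list[str] = []
    for service in rollout_order(depends_on):
        deps = depends_on.get(service, set())
        if not all(is_healthy(dep) for dep in deps):
            break
        deploy(service)
        deployed.append(service)
    return deployed


# Hypothetical service graph: web needs api, api needs db and cache.
graph = {"api": {"db", "cache"}, "web": {"api"}, "db": set(), "cache": set()}
order = rollout_order(graph)
print(order)

# Pretend api comes up unhealthy: web must not move.
deployed = deploy_in_dependency_order(
    graph, deploy=lambda s: None, is_healthy=lambda s: s != "api"
)
print(deployed)
```

Git can store the `graph` dictionary. Something else has to walk it against live health signals.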
That is the practical model: GitOps as an input, not the whole operating system.
Final take
GitOps is good. Pure GitOps at enterprise scale usually is not.
The bigger you get, the more you need orchestration that lives above pull requests.