Running NiFi 2 on Kubernetes without ZooKeeper simplifies your infrastructure — but it shifts the responsibility for cluster stability onto a probe configuration most teams get wrong.
Why ZooKeeper-less?
Apache NiFi 2 introduced KubernetesLeaseLeaderElectionProvider, which lets NiFi use Kubernetes-native leader election instead of relying on an external ZooKeeper ensemble. Fewer moving parts, less infrastructure to manage, no separate ZooKeeper StatefulSet to maintain.
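As a sketch, the switch happens in nifi.properties. The exact property value depends on your NiFi 2.x release (the admin guide for some versions refers to the class as KubernetesLeaderElectionManager), so verify against your version's documentation:

```properties
# Use Kubernetes Leases for leader election instead of an external ZooKeeper
# NOTE: the value shown is an assumption — check your NiFi 2.x admin guide
nifi.cluster.leader.election.implementation=KubernetesLeaderElectionManager
# The ZooKeeper connect string can then be left empty
nifi.zookeeper.connect.string=
```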
The tradeoff: without an external coordinator, the Pods themselves are responsible for forming quorum. That makes your readinessProbe configuration far more critical than it would be in a traditional NiFi deployment.
Get it wrong and you face an uncomfortable choice: rolling updates that cause service outages, or full restarts that compromise data consistency.
The dilemma: lenient or strict?
When configuring the Kubernetes readiness probe for NiFi, you quickly run into a technical crossroads.
The risk of a lenient probe
The first thing most people try is checking that the Jetty server responds — a simple HTTP request to the NiFi API. It seems reasonable, but in a distributed environment it's dangerous.
During a rolling update, Kubernetes restarts pods sequentially. With a lenient probe, Kubernetes sees that the new pod (e.g. nifi-2) has a responsive Jetty server and immediately marks it as ready, then proceeds to terminate the next active pod (nifi-1).
The problem: nifi-2 may have started its Jetty server but hasn't joined the cluster or synchronized the flow yet. You've killed an active pod before the replacement is actually functional. In seconds, you lose quorum and the service goes down.
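Such a lenient probe typically looks like this (port and path assume a stock HTTPS NiFi deployment; shown only to illustrate the anti-pattern):

```yaml
# Anti-pattern: marks the pod Ready as soon as Jetty answers,
# even if the node has not joined the cluster yet
readinessProbe:
  httpGet:
    path: /nifi-api
    port: 8443
    scheme: HTTPS
  initialDelaySeconds: 60
  periodSeconds: 10
```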
The problem with a strict probe
To avoid that, the logical fix is to be strict: "the pod is only Ready if it's CONNECTED to the cluster." This solves rolling updates but breaks full restarts.
When starting from scratch with podManagementPolicy: OrderedReady (the default), nifi-0 starts first. Alone, it can't connect to a cluster that doesn't exist yet.
Here's what happens: nifi-0 sits isolated, waiting for peers that won't arrive, until its election timeout expires (nifi.cluster.flow.election.max.wait.time, default 5 minutes). Only then does it declare itself sole leader.
This doesn't just delay startup unnecessarily — it breaks consensus. nifi-0 imposes its version of flow.json.gz without comparing it with anyone, creating a real risk of data inconsistency or loss of recent changes.
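The timeout in question lives in nifi.properties (the values shown are the documented defaults for these two election properties):

```properties
# How long a node waits for peers before electing a flow on its own
nifi.cluster.flow.election.max.wait.time=5 mins
# Optionally end the election early once this many nodes have voted (empty = wait the full time)
nifi.cluster.flow.election.max.candidates=
```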
Using podManagementPolicy: Parallel does allow all pods to start simultaneously on a fresh restart, but it introduces its own dependencies and failure modes that deserve a separate discussion.
The solution: ordinal-aware probe logic
The answer isn't to pick one approach or the other — it's to apply each one based on the pod's role in the StatefulSet.
nifi-0 and nifi-1+ have fundamentally different responsibilities:
- nifi-0 must prioritize startup. We need Kubernetes to bring up its peers as quickly as possible so that leader election happens democratically.
- nifi-1+ must prioritize stability. They should not receive traffic or allow the rollout to proceed until they are fully integrated into the cluster.
Here's the hybrid probe that implements this:
```yaml
readinessProbe:
  exec:
    command:
      - /bin/bash
      - -c
      - |
        FQDN=$(hostname -f)
        API_URL="https://${FQDN}:8443/nifi-api/controller/cluster"
        # Step 1: base check — is the NiFi API reachable? (applies to all pods)
        # -f makes curl fail on HTTP error responses, not just network errors
        RESPONSE=$(curl -sf -m 5 $CERT_ARGS "$API_URL") || exit 1
        # Step 2: hybrid logic based on StatefulSet ordinal
        # Matching the "-0" suffix targets ordinal 0 regardless of the
        # StatefulSet name (and cannot accidentally match e.g. nifi-10)
        if [[ "$(hostname)" == *-0 ]]; then
          # nifi-0: Ready as soon as the API responds.
          # This immediately unblocks the startup of nifi-1 and nifi-2,
          # allowing democratic leader election instead of a solo timeout.
          exit 0
        else
          # nifi-1+: strict validation.
          # Must be CONNECTED before receiving traffic or allowing the
          # rollout to proceed.
          echo "$RESPONSE" | grep -q '"status":"CONNECTED"' || exit 1
        fi
  initialDelaySeconds: 90
  periodSeconds: 10
  failureThreshold: 6
```
Note on $CERT_ARGS: this variable should contain your TLS certificate arguments for curl (e.g. --cacert, --cert, --key). Define it in your pod environment or expand it inline based on your certificate setup.
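The branching rule at the heart of the probe can be isolated into a small function and exercised outside the cluster. The hostnames and statuses below are illustrative:

```shell
#!/bin/bash
# probe_decision HOSTNAME STATUS -> prints "ready" or "not-ready"
# Mirrors the hybrid rule: ordinal 0 is lenient, all others require CONNECTED.
probe_decision() {
  local host="$1" status="$2"
  if [[ "$host" == *-0 ]]; then
    echo "ready"       # lenient: API reachability alone suffices
  elif [[ "$status" == "CONNECTED" ]]; then
    echo "ready"       # strict: node has fully joined the cluster
  else
    echo "not-ready"
  fi
}

probe_decision "nifi-0" "CONNECTING"   # → ready (lenient rule)
probe_decision "nifi-1" "CONNECTING"   # → not-ready
probe_decision "nifi-2" "CONNECTED"    # → ready
```

Keeping the decision in one function makes it easy to unit-test the probe logic before shipping it inside a container image.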
Why this works better
During a rolling update
Kubernetes rolls pods from highest to lowest ordinal. nifi-2 is restarted first and must reach CONNECTED before Kubernetes touches nifi-1, and so on. The probe on nifi-1+ acts as a gate — the rollout cannot advance until the current pod is genuinely integrated.
```text
nifi-2 restarted
   ↓
nifi-2 API up → probe passes base check
   ↓
nifi-2 joins cluster → status: CONNECTED → probe passes strict check → Ready
   ↓
Kubernetes proceeds to restart nifi-1
   ↓
nifi-1 joins cluster → status: CONNECTED → probe passes strict check → Ready
   ↓
Kubernetes proceeds to restart nifi-0
   ↓
nifi-0 API up → probe passes base check → Ready (lenient rule)
   ↓
Rolling update complete — zero downtime
```
The only side effect: you may not be able to make flow changes in the NiFi UI while a pod is joining the cluster — a minor and temporary constraint.
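This gating relies on the StatefulSet's default ordering behavior. Spelled out explicitly (the resource name is illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nifi
spec:
  podManagementPolicy: OrderedReady  # default: pods start in order 0, 1, 2, ...
  updateStrategy:
    type: RollingUpdate              # default: pods restart in reverse order 2, 1, 0
```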
During a full restart
```text
Without hybrid probe:            With hybrid probe:

nifi-0 starts alone              nifi-0 starts → API up → Ready immediately
        ↓                                ↓
waits 5 min timeout              nifi-1 starts right after
        ↓                                ↓
imposes its flow.json.gz         both compare flow.json.gz
(no consensus)                   → democratic leader election
```
nifi-0 tells Kubernetes a white lie — "I'm ready" — as soon as the API responds. This triggers the immediate startup of the rest of the cluster. With nifi-0 and nifi-1 coming up almost simultaneously, leader election and flow.json.gz comparison happen by real consensus rather than a unilateral decision made after a timeout.
Conclusion
Running NiFi on Kubernetes without ZooKeeper is fully viable and operationally simpler — but it requires your readinessProbe to be aware of the StatefulSet topology. Don't treat all your pods equally: give nifi-0 the freedom to start the party, and require the others to join it properly.
Kubernetes has no visibility into NiFi's internal cluster state. Your readiness probe does.
And that difference is what keeps the cluster stable.
Have questions or a different approach to this problem? Happy to discuss in the comments.