Discussion on: What the heck are Conditions in Kubernetes controllers?

View post

Hello @Mäel. First of all really nice article. I do not remember when it was the last time I read something with this kind of good research and concrete/trustable references.

This discussion about Conditions and Status and Events lead me to a cul-de-sac when it comes to an event-driven design since Watchers in Kubernetes are related to Events, but as Brian Grant says, they aren't intended to be used for event-driven automation.
Let's imagine that we have a controller and a CRD. That controller does the reconciliation but, in case of problems, there is another component in the cluster that finishes the lifecycle, for example rolling back changes in case of certain "back-off" threshold is reached. Let's call it "distributed controller".

In that scenario, what should the other component listen to? That is, theoretically speaking events won't provide the actual state, only a series of "logs" but to get conditions the other component should be polling the API regularly? - It does not scale.

So, in a nutshell, the questions is: what do you think is the best solution when having an event-driven approach in which a controller is a bunch of independent components reacting to state changes?
(don't get me wrong, I am referring to state using the idea of Conditions explained in the article, so in this context a state could be the latest condition)

Thanks! :)

Maël Valais • Feb 26 '23 • Edited

What should the other component listen to?

Each component (or "control loop") should reconcile everytime something "happened". The control loop doesn't know what happened, it has to figure it out by itself (control loops are "level triggered" as opposed to being "edge triggered").

Now, how should one control loop "communicate" with the other control loop?

First, it is OK to have two control loops reconciling the same object as long as they are doing it in an orthogonal way (the control loops are "decoupled"). The pod object is a good example of an object being reconciled by orthogonal controllers: kube-scheduler and kubelets both reconcile pods. Neither controller needs information from the other control loop.

The opposite ("coupled" control loops) also exist: cert-manager has a 4 different control loops reconciling the Certificate object. The coupling is due to a condition that gets added by a control loop (the "trigger controller") and removed by another (the "readiness controller"). In my opinion, this is a mistake we made in the cert-manager codebase. These controllers are not orthogonal, and understand one controller means that you also need to understand how the other controllers behave.

In your example, the second controller will be triggered everytime something happens, and it will need to figure out by itself that the object has entered the backoff threshold. I imagine that the first controller stores some form time to keep track of the backoff, something like this:

status:
  conditions:
    - type: BackoffThresholdReached
      status: "True"

The second controller would wait for this condition to perform operations. This bit of state can't be guessed by the second controller, so the first controller is definitely passing state to the second controller.

I think it is OK to do that as long as the second controller doesn't change that same condition.

Maël Valais • Nov 4 '20 • Edited

Hi! Thank you so much for your comment! Your question is very interesting, I need a little time to think in order to (try to) answer. 😅

krishnan-recrooz • Feb 5 '21

Wouldn't that be the use case for leveraging the "phase".