DEV Community

Sreekanth Kuruba


From Process Management to State Reconciliation

I used to restart servers at 2AM… Kubernetes made that job disappear

02:15 AM — Pager goes off
“nginx is down on web-01”

You wake up.
Grab your laptop.
SSH into the server.
Run a few commands. Restart the process.

02:22 AM — It’s back.

Try to sleep again.

This used to be normal.

Then Kubernetes changed the rules.


🧱 The old world: Process-driven operations

Before Kubernetes, everything revolved around processes.

A service was:

  • A Linux process
  • Running on a specific machine
  • Identified by a PID
  • Restarted manually (or via basic supervisors)

The assumptions were simple:

  • Machines are stable
  • Failures are rare
  • Humans fix problems

And when something broke…
👉 you fixed it

Availability depended on:

How fast someone could wake up and respond.


🐳 Containers helped… but didn’t solve the real problem

With tools like Docker, things improved:

  • Consistent environments
  • Faster deployments
  • Fewer “works on my machine” issues

But let’s be honest…

If a container crashed:

  • Maybe it restarted
  • Maybe it didn’t

If the node died?

  • You’re still in trouble

If dependencies failed?

  • Still your problem

👉 Containers improved portability
👉 They did NOT guarantee reliability


🔄 Kubernetes changed the question

Kubernetes doesn’t ask:

“Is this process running?”

It asks:

“Is the system in the state I declared?”

That’s a massive shift.

Instead of managing processes…
you define desired state.


⚙️ The magic: State reconciliation

You declare:

  • “I want 3 replicas”
  • “They should always be running”
  • “They should be healthy”

Kubernetes continuously checks:

  • Current state
  • Desired state

If something breaks…
👉 it fixes it automatically

Not later.
Not after a pager alert.
Continuously.
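
Declaring that state looks like this in practice. A minimal sketch of a Deployment manifest — the name `web`, the `nginx:1.27` image, and the probe path are illustrative choices, not from the original post:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                  # "I want 3 replicas"
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.27
          livenessProbe:       # "They should be healthy"
            httpGet:
              path: /
              port: 80
```

You never say *how* to get to 3 replicas. You only say that 3 is the goal, and the controller closes the gap.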


🔄 Traditional vs Kubernetes mindsets

| Traditional ops | Kubernetes |
| --- | --- |
| Track processes by PID | Track desired state |
| A human restarts failures | A controller replaces failures |
| Availability depends on response time | Availability depends on declared intent |


🧠 Why Kubernetes doesn’t care about PIDs

In traditional systems:

  • PID = identity

In Kubernetes:

  • PID = irrelevant

Because a PID is:

  • Local to a machine
  • Temporary
  • Lost on restart

Kubernetes doesn’t track processes.

It tracks:

Desired outcomes

You don’t ask:

  • “What’s the PID?”

You ask:

  • “Do I have 3 healthy pods?”

👉 That’s the difference between instance thinking and system thinking


💥 The real shift: Replace, don’t repair

Old mindset:

  • Fix the broken process

New mindset:

  • Replace it
👉 Failure is handled through replacement, not repair.

Kubernetes doesn’t try to “save” things.

It simply ensures:

The system matches your declared state
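
Under the hood, this is a control loop: observe current state, compare to desired state, replace what's broken. A toy sketch in Python — this is not Kubernetes code; `reconcile` and the replica names are invented purely to show the idea:

```python
def reconcile(desired: int, current: list[str]) -> list[str]:
    """One pass of a reconciliation loop: drive `current` toward `desired`."""
    # Replace, don't repair: failed replicas are simply dropped...
    healthy = [r for r in current if not r.endswith("!failed")]
    # ...and fresh ones are created until the desired count is met.
    while len(healthy) < desired:
        healthy.append(f"replica-{len(healthy)}")
    # Scale down if there are more healthy replicas than desired.
    return healthy[:desired]

# Two healthy pods, one failed, desired state = 3.
state = reconcile(3, ["replica-0", "replica-1", "replica-2!failed"])
print(state)  # three healthy replicas
```

A real controller runs this comparison continuously, not once — which is why recovery happens without anyone being paged.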


🧪 Jobs are different too

Before:

  • Run jobs manually
  • Monitor externally
  • Retry manually

Now:

  • Define a Job
  • Kubernetes ensures completion
  • Retries automatically
  • Tracks success/failure

👉 You define intent.
👉 System enforces outcome.
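
A minimal Job sketch — the name `report`, the `busybox` image, and the command are illustrative, not from the post. `backoffLimit` caps the automatic retries, and `completions` defines success:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: report
spec:
  completions: 1         # the Job succeeds once one pod completes
  backoffLimit: 4        # retry a failed pod up to 4 times
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: report
          image: busybox:1.36
          command: ["sh", "-c", "echo generating report"]
```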


⚠️ Failure is not an exception anymore

At scale, failure is constant.

Systems like Google’s Borg (Kubernetes’ ancestor) proved this:

  • Machines fail
  • Networks break
  • Processes crash

Not if
But how often

Kubernetes is built for this reality.

It assumes:

  • Nodes will disappear
  • Pods will die
  • Networks will glitch

And it’s okay with that.


🔁 What actually changed?

Before Kubernetes:

  • You maintained systems
  • You fixed failures
  • You reacted

After Kubernetes:

  • You define intent
  • The system maintains itself
  • Recovery is automatic

👉 Your job shifts from:
operator → system designer


🏁 Final thought

Kubernetes doesn’t remove failure.

It removes panic.

The system doesn’t ask:

“Who will fix this?”

It asks:

“What should this look like?”

And then it makes it happen.


💬 Your turn

What’s the last thing you had to fix manually at 2AM?

And could Kubernetes have handled it for you?
