Something I keep explaining in architecture reviews:

#devops #sre #kubernetes #terraform

LinkedIn Draft — Workflow (2026-04-05)

Secrets management: designing for rotation, not just storage

Most orgs solve 'where do we store secrets securely.' The teams that get paged at 2am are the ones who never solved 'how do we rotate them without downtime.'

Storage-only design:        Rotation-aware design:

Secret ──▶ Vault            Secret ──▶ Vault ──▶ Agent Injector
              │                                        │
         Pod (env var)                           Pod (file mount)
              │                                        │
         Restart to           Auto-reload ◀────── Lease renewer
         get new value        (zero downtime)
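
To make the "Auto-reload" box concrete, here is a minimal sketch of the application side, assuming the secret arrives as a file mounted into the pod. The class name `FileSecret` and the file path are hypothetical, not from any library. It re-reads the file whenever the file's metadata changes, so a rotated value is picked up without a restart.

```python
from pathlib import Path


class FileSecret:
    """Reads a secret from a mounted file and re-reads it when the file changes.

    Hypothetical helper for illustration; not part of Vault or Kubernetes.
    """

    def __init__(self, path):
        self.path = Path(path)
        self._stamp = None   # (mtime_ns, size, inode) seen at the last read
        self._value = None

    def get(self):
        # stat() follows symlinks, so an atomic swap of the mounted file
        # shows up as a changed mtime, size, or inode.
        st = self.path.stat()
        stamp = (st.st_mtime_ns, st.st_size, st.st_ino)
        if stamp != self._stamp:
            self._value = self.path.read_text().strip()
            self._stamp = stamp
        return self._value
```

Calling `get()` on each new connection, rather than caching the value at startup, is what turns rotation into a non-event for the pod. An env var has no equivalent hook because the process only sees the value it was started with.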

Where it breaks:
▸ A process reads env vars once at startup, so a secret delivered that way only changes when the pod restarts. Rotation becomes a deployment event, with a deployment's blast radius.
▸ When a Vault lease expires partway through a long-running job, the credential stops working mid-run and the job fails with auth errors that look like app bugs, not infra failures.
▸ Secret sprawl across namespaces means rotation happens in 12 places — and one always gets missed.
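
For the lease-expiry failure mode above, a rough sketch of the bookkeeping a long-running job could do itself. `Lease`, `renew_at`, and `lease_state` are hypothetical names, and renewing two-thirds of the way through the TTL is an arbitrary choice for illustration, not a Vault default I am asserting. The actual renewal call is left out on purpose.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    issued_at: float     # epoch seconds when the lease was granted or last renewed
    ttl_seconds: float   # lease duration reported by the secrets backend


def renew_at(lease, fraction=2 / 3):
    """Time at which to renew: a fixed fraction of the way through the TTL."""
    return lease.issued_at + lease.ttl_seconds * fraction


def lease_state(lease, now):
    """Classify a lease as 'ok', 'renew', or 'expired' at time `now`."""
    if now >= lease.issued_at + lease.ttl_seconds:
        return "expired"
    if now >= renew_at(lease):
        return "renew"
    return "ok"
```

Logging "expired" as its own state before the next database call is the useful part: the resulting auth failure then points at the lease, not at the application code.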

The rule I keep coming back to:
→ Design rotation before you design storage. If you can't rotate a secret in under 10 minutes with no downtime, the design isn't production-ready.

How I sanity-check it:
▸ Vault Agent Injector or External Secrets Operator, with the secret mounted as a file and not injected as an env var: this decouples secret delivery from the pod lifecycle.
▸ Monthly secret access log audit — stale consumers are how you discover forgotten service accounts before attackers do.
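
The monthly audit can start as something this small. This is a sketch under assumptions: the log entries are dicts with hypothetical `consumer` and `time` fields (a real audit log will need parsing into that shape), and `known_consumers` is whatever inventory of service accounts you already keep.

```python
from datetime import datetime, timedelta


def stale_consumers(access_log, known_consumers, now, max_idle=timedelta(days=30)):
    """Return {consumer: last_seen or None} for consumers idle longer than max_idle."""
    # Latest access time per consumer found in the log.
    last_seen = {}
    for entry in access_log:
        who, when = entry["consumer"], entry["time"]
        if who not in last_seen or when > last_seen[who]:
            last_seen[who] = when

    # A consumer is stale if it never appears or its last access is too old.
    stale = {}
    for who in known_consumers:
        seen = last_seen.get(who)
        if seen is None or now - seen > max_idle:
            stale[who] = seen
    return stale
```

Consumers that come back with `None` never read the secret in the window at all, which is usually where the forgotten service accounts are hiding.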

Reliability is a product feature. The engineers who treat it that way are the ones who get asked into the room.

Deep dive: https://neeraja-portfolio-v1.vercel.app/workflows/secrets-management-designing-for-rotation-not-just-storage

If this triggered a war story, I'd genuinely love to hear it.