Uber · Security · 18 May 2026
150,000 secrets. 25 separate vaults. Hundreds of teams managing their own credentials in their own ways, some in plain text in version control. At Uber's scale — 5,000 microservices, 5,000 databases, 500,000 analytical jobs per day — secrets sprawl is not a compliance problem. It is an incident waiting to happen. A team of ten engineers decided to fix it.
- 150,000 secrets managed
- 25 vaults → 6 managed vaults
- 5,000 microservices secured
- 20,000 automated rotations/month
- 90% fewer secrets in pipelines
- Team of 10 engineers
The Story
Secrets sprawl is the entropy of infrastructure security. Left to its own devices, every team builds its own vault, stores credentials however is convenient, and shares secrets in whatever way is fastest. At a startup with ten engineers, this is manageable. At Uber, with 5,000 microservices, 5,000 databases, 400+ third-party integrations, and 500,000 analytical jobs per day, it becomes a systemic security risk. By the time Uber's Secrets team began their consolidation project, the company had 150,000 secrets scattered across 25 separate vault systems, operated by different teams, with different security standards, different rotation practices, and inconsistent access controls. Some secrets were in plain text in codebases. Others lived in databases that had never been audited for credential exposure. Cyberattacks targeting exposed credentials were rising industry-wide. The question was not whether Uber should fix this but how.
🔐
Before the consolidation, Uber's 25 separate vault systems were operated by various teams across engineering. Some were standard HashiCorp Vault (an open-source secrets management tool that provides a secure, centralized store for tokens, passwords, certificates, and encryption keys) deployments. Others were custom databases. Others were cloud-specific secret managers for AWS, GCP, and Azure. None of them talked to each other. None of them had a unified view of what credentials existed where.
The Secrets team's strategy had two phases. Phase 1 was consolidation: take ownership of all vault infrastructure, standardize on a small number of canonical vault systems (one per cloud provider plus one on-premises HashiCorp Vault), and migrate all secrets from the 25 fragmented vaults into these six. This was the foundation work — unglamorous, involving hundreds of engineers across different teams, and requiring careful coordination to avoid breaking services that depended on existing vault paths. Phase 2 was the platform: building a Secret Management Platform on top of the consolidated vaults — a metadata model, lifecycle automation, unified API, and real-time scanning — that turned six vaults into a governed, auditable, self-service system.
THE FIVE PROBLEMS THEY HAD TO SOLVE
As the Secrets team consolidated vaults, five common problem patterns emerged that any future platform would need to address:
- No unified metadata model: no way to know what a secret was for, who owned it, or when it was last rotated
- No cross-vault CRUD: managing secrets across different vault types required different tools and APIs
- No developer self-service: engineers filed tickets to create or rotate secrets
- No inventory: no way to generate a complete list of secrets for security incident response
- No automated rotation: credential rotation required manual coordination, so it was delayed or skipped
Problem
Secrets Sprawl: 150,000 Credentials, No Visibility
Uber's infrastructure had grown faster than its secrets governance. 25 vault systems operated by different teams meant no single team had visibility into the company's complete credential inventory. Shadow IT vaults with no central oversight created audit gaps. Secrets were shared insecurely, rotated rarely, and sometimes stored in version control. With cyberattacks targeting credential exposure rising industry-wide, the status quo was untenable.
Cause
Scale + Decentralization = Governance Collapse
At Uber's scale, decentralized secrets management doesn't produce diversity and resilience — it produces inconsistency and risk. Each of 25 vaults had its own standards, its own rotation schedule (usually none), its own access model. There was no way to answer basic security questions: who has access to which credentials? When were they last rotated? Are any credentials in source code? The scale that made the problem urgent also made it hard to fix without a dedicated team and platform.
Solution
Consolidation + Secret Management Platform
Phase 1 consolidated 25 vaults into 6 centrally managed vaults (one per cloud provider plus on-prem HashiCorp Vault). Phase 2 built the Secret Management Platform: a metadata model, a unified API abstracting across all vault types, a Cadence-orchestrated Secret Lifecycle Manager (Uber's automation system that handles the complete lifecycle of secrets — creation, rotation, distribution to workloads, and eventual decommissioning — using Uber's Cadence workflow engine), real-time scanning across git/Slack/CI pipelines, and self-service developer tooling.
Result
20,000 Automated Rotations Per Month, 90% Fewer Exposed Secrets
A team of 10 engineers now drives 20,000 automated monthly secret rotations — up from manual rotation that happened rarely. Secrets exposed in CI pipelines dropped by 90%. The platform generates a complete inventory of all 150,000 secrets on demand, enabling rapid response to security incidents. Uber is actively pursuing secretless authentication — replacing long-lived credentials with ephemeral, automatically-issued tokens wherever possible.
⚠️
The Migration Scale Problem
Migrating secrets from 25 vaults to 6 involved hundreds of engineers whose workloads depended on existing vault paths. A secret migration is not just a data copy — it is a coordination problem. Every service reading a secret from vault path A needs to be updated to read from vault path B. In a monolith, that's one codebase. Across Uber's 5,000 microservices, that's 5,000 potential update targets. The team built tooling to discover which services were reading from which vault paths, generated migration checklists automatically, and used feature flags to switch services over gradually with rollback capability.
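The discovery, checklist, and feature-flag steps described above can be sketched roughly as follows. This is a minimal illustration and not Uber's tooling: the access-log record shape, the `PATH_MAPPING` table, the example paths, and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical mapping from legacy vault paths to their new canonical paths.
PATH_MAPPING = {
    "legacy-vault-7/payments/db": "onprem/payments/db_password",
    "team-vault/maps/api_key": "gcp/maps/api_key",
}


def discover_consumers(access_log):
    """Group services by the legacy vault path they were seen reading."""
    consumers = defaultdict(set)
    for record in access_log:
        consumers[record["path"]].add(record["service"])
    return consumers


def build_checklist(access_log, migrated):
    """Return one checklist row per (service, legacy path) pair, sorted for stable output."""
    rows = []
    for old_path, services in discover_consumers(access_log).items():
        new_path = PATH_MAPPING.get(old_path)  # None means no target chosen yet
        for service in services:
            rows.append({
                "service": service,
                "old_path": old_path,
                "new_path": new_path,
                "done": (service, old_path) in migrated,
            })
    return sorted(rows, key=lambda r: (r["service"], r["old_path"]))


def resolve_path(service, old_path, flags):
    """Read the new path only while the service's flag is on; turning it off rolls back."""
    if flags.get(service, False) and old_path in PATH_MAPPING:
        return PATH_MAPPING[old_path]
    return old_path
```

Keeping the switch in a per-service flag means a bad cutover is undone by flipping the flag rather than by redeploying the service.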
The metadata model (a structured representation of a secret's properties, such as owner, purpose, rotation schedule, associated services, expiry date, and security classification, that enables automated governance and incident response) was the architectural cornerstone of the Secret Management Platform. Before consolidation, a secret was just a key-value pair in a vault with no context. After the platform was built, every secret had a structured record: who owned it, which services used it, when it was last rotated, what its rotation policy was, and what its security classification was. This metadata made automated governance possible: the platform could identify secrets that hadn't been rotated in 90 days, generate compliance reports, and automatically alert owners of soon-to-expire credentials. It also made incident response practical: when a security team needed to identify all credentials that could have been exposed in a compromise, they could query the inventory rather than interviewing 250 engineering teams.
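To make the idea concrete, here is a minimal sketch of what such a record and two of the queries it enables might look like. The field names, the `SecretMetadata` class, and both helper functions are assumptions for illustration; the article does not publish Uber's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class SecretMetadata:
    """Illustrative metadata record; field names are assumed, not Uber's schema."""
    secret_id: str
    owner: str
    purpose: str
    classification: str
    rotation_period_days: int
    last_rotated: datetime
    consumers: list = field(default_factory=list)


def stale_secrets(catalog, now, max_age_days=90):
    """Governance query: secrets not rotated within the last max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [m for m in catalog if m.last_rotated < cutoff]


def secrets_used_by(catalog, service):
    """Incident-response query: ids of secrets a given service consumes."""
    return [m.secret_id for m in catalog if service in m.consumers]
```

With records like these, "which credentials could a compromised service have seen?" becomes a list comprehension over the catalog instead of a round of interviews.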
ℹ️
Real-Time Scanning: Catching Secrets Before They Ship
One of the most impactful platform features was real-time scanning across Uber's code repositories, CI pipelines, and internal Slack messages. The scanner looks for patterns matching API keys, database passwords, and private key formats. When detected, it automatically revokes the exposed credential and alerts the owning team. Before the platform, a credential committed to git might live there for months — or forever. Now, exposure is measured in seconds. The 90% reduction in secrets found in pipelines reflects this detection-and-revocation automation.
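A pattern-based scanner of this kind can be sketched in a few lines. The three regular expressions below are common illustrative patterns, not Uber's rule set, and the `revoke` and `alert` callbacks are hypothetical hooks standing in for whatever the platform calls on detection.

```python
import re

# Illustrative patterns only; a production scanner would carry many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[=:]\s*['\"][^'\"]{8,}['\"]"),
}


def scan_text(text):
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, kind))
    return findings


def handle_change(text, revoke, alert):
    """Scan a commit diff or chat message, then revoke and alert for each finding."""
    findings = scan_text(text)
    for line_no, kind in findings:
        revoke(kind, line_no)  # hypothetical hook: invalidate the exposed credential
        alert(kind, line_no)   # hypothetical hook: notify the owning team
    return findings
```

Regex matching alone produces false positives, so a real deployment would likely confirm a match against the secret inventory before revoking anything.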
❌
Shadow IT Vaults: The Security Debt Multiplier
Perhaps the most dangerous aspect of Uber's pre-platform state was what the team called shadow IT vaults: secret storage systems created by individual teams outside the knowledge of the central Secrets team. These vaults had no security baseline review, no rotation policy, no access audit, and no inventory. When a team built a shadow vault, they optimized for their immediate convenience and, in doing so, created a security liability that the company didn't know existed. You cannot rotate credentials you don't know about. You cannot audit access to vaults you don't know exist. Shadow IT vaults are the point where 'move fast' becomes 'incur unquantifiable risk.'
WHY 400 THIRD-PARTY INTEGRATIONS MATTER
Uber's 400+ third-party vendor integrations are a significant factor in the secrets management challenge. Each integration requires credentials — API keys, OAuth tokens, database passwords — that must be rotated when vendors change their systems or when Uber's access policy changes. Before the platform, vendor credential rotation required manual coordination: someone had to get the new credentials from the vendor, find which services used them, update each service's configuration, and verify nothing broke. At 400 integrations, this manual process consumed disproportionate engineering time and rotations were often delayed. The Secret Lifecycle Manager automated the rotation for most standard integrations.
🔄
Before the Secret Management Platform, secret rotation at Uber required a service owner to coordinate with the Secrets team, obtain a new credential from the upstream provider, update their service's configuration, and verify the rotation succeeded. At 150,000 secrets across 5,000 services, this process ran rarely — not because security was a low priority but because the operational overhead was prohibitive at scale. Most secrets were rotated only when forced by a security incident or vendor requirement. The platform inverts this: rotation is the default, manual coordination is the exception.
ℹ️
Kubernetes Native Injection
One of the most seamless developer experiences in the platform is Kubernetes-native secret injection. Rather than requiring services to call an API to retrieve their credentials at startup, the platform can inject secrets directly as environment variables or mounted files into Kubernetes pods at deploy time. This is transparent to application code — the service sees its credentials as normal environment variables, with no awareness of which vault they came from or how they were rotated. When a rotation occurs, the platform can trigger a pod restart with the new credentials injected automatically.
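From the application's side, consuming an injected secret might look like the sketch below. The `load_secret` helper, the upper-cased environment variable naming, and the `/etc/secrets` mount directory are assumptions for illustration; the article does not specify Uber's conventions.

```python
import os
from pathlib import Path


def load_secret(name, mount_dir="/etc/secrets", environ=None):
    """Read an injected secret from an env var first, then from a mounted file.

    The application never talks to a vault; it only sees what was injected
    into the pod at deploy time.
    """
    environ = os.environ if environ is None else environ
    env_key = name.upper()
    if env_key in environ:
        return environ[env_key]
    secret_file = Path(mount_dir) / name
    if secret_file.is_file():
        return secret_file.read_text().strip()
    raise KeyError(f"secret {name!r} was not injected as an env var or file")
```

Environment variables are fixed when the process starts, which is consistent with the pod restart described above for picking up a rotated value.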
The Fix
The Secret Lifecycle Manager
The Secret Lifecycle Manager (SLM) is the operational core of Uber's Secret Management Platform. Built on Cadence (Uber's open-source distributed workflow engine, designed for long-running, fault-tolerant business processes — the same engine that powers Uber's ride dispatch and payment workflows), SLM orchestrates the complete lifecycle of every secret: initial creation, distribution to consuming services, periodic rotation, and eventual decommissioning. Using Cadence's durable workflow model means that secret rotation operations are fault-tolerant — if the rotation workflow fails midway through, it can resume from where it left off rather than leaving credentials in a half-rotated, potentially inconsistent state.
- 25→6 — Vault systems consolidated — from 25 team-operated vaults with inconsistent standards to 6 centrally managed vaults with uniform security baselines
- 20,000 — Automated secret rotations per month — up from rare manual rotation that required coordination between the Secrets team and service owners
- 90% — Reduction in secrets found exposed in CI/CD pipelines — achieved through real-time scanning with automatic revocation on detection
- 10 — Engineers on the Secrets team that built and now operates the entire platform — evidence that well-designed automation multiplies individual team capacity dramatically
```python
# Simplified Secret Lifecycle Manager rotation workflow (conceptual)
# Real implementation uses Cadence's durable workflow primitives
from datetime import datetime, timezone

from cadence.workflow import workflow_method


def now():
    # Placeholder clock; real workflow code would use the engine's deterministic time.
    return datetime.now(timezone.utc)


class SecretRotationWorkflow:
    # The helpers called below (generate_new_credential, write_to_vault, ...)
    # stand in for workflow activities and are not shown here.

    @workflow_method
    async def rotate_secret(self, secret_id: str):
        """Cadence records each completed step so a failed run can resume."""
        # Step 1: Generate new credential from upstream provider
        new_credential = await self.generate_new_credential(secret_id)

        # Step 2: Write new credential to canonical vault
        # (Durable: if this step completes, Cadence records it)
        await self.write_to_vault(
            secret_id=secret_id,
            value=new_credential,
            version='new',  # old version still readable during transition
        )

        # Step 3: Signal consuming services to reload credential
        # Each service has a registered reload handler
        consuming_services = await self.get_consumers(secret_id)
        for service in consuming_services:
            await self.signal_reload(service, secret_id)

        # Step 4: Verify all services are using new credential
        # (Wait for health checks to confirm)
        await self.verify_rotation_complete(secret_id, consuming_services)

        # Step 5: Expire old credential in upstream provider
        await self.revoke_old_credential(secret_id)

        # Step 6: Update metadata: last_rotated, next_rotation_due
        await self.update_metadata(secret_id, rotated_at=now())
        # Cadence schedules next rotation based on policy
```
SECRETLESS AUTHENTICATION: THE NEXT FRONTIER
The logical endpoint of Uber's secrets management journey is secretless authentication: a model where services don't hold long-lived credentials at all. Instead, they use their identity (a Kubernetes service account, a SPIFFE/SPIRE identity, a cloud provider IAM role) to dynamically request short-lived tokens at runtime. When a token expires within an hour, a stolen copy is useful only briefly, and there is no long-lived credential left to rotate or inventory. Uber is actively building toward this model as the long-term replacement for static credential management. The Secret Management Platform is both the current solution and the bridge to the secretless future.
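The short-lived token idea can be illustrated with a toy issuer and verifier built on the Python standard library. This is not SPIFFE, JWT, or Uber's implementation: the token format, the function names, and the shared signing key are simplifying assumptions that show only the expiry mechanics.

```python
import base64
import hashlib
import hmac
import json


def issue_token(identity, signing_key, now, ttl_seconds=3600):
    """Issue a signed token that names the workload and expires after ttl_seconds."""
    payload = json.dumps({"sub": identity, "exp": now + ttl_seconds}, sort_keys=True).encode()
    signature = hmac.new(signing_key, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_token(token, signing_key, now):
    """Return the workload identity if the token is authentic and unexpired, else None."""
    payload_b64, signature_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(payload_b64)
    expected = hmac.new(signing_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(signature_b64)):
        return None
    claims = json.loads(payload)
    if now >= claims["exp"]:
        return None  # expired: a stolen copy is no longer useful
    return claims["sub"]
```

The caller passes `now` explicitly (seconds since the epoch) so the expiry behavior is easy to test; a real issuer would read a trusted clock and use asymmetric signatures rather than a shared key.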
✅
Self-Service Developer Tooling
Before the platform, creating a new secret required filing a ticket with the Secrets team. The turnaround could be days. After the platform, developers can create, update, and delete secrets through a self-service API, CLI, and web UI — all of which enforce the metadata requirements and policy compliance automatically. The Secrets team's workload shifted from manual secret operations (which scaled linearly with the number of services) to platform maintenance and governance (which scales much more slowly). A team of 10 can now serve 5,000 microservices because the services serve themselves.
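Enforcing metadata requirements at creation time is what lets self-service stay governed. A minimal sketch of such a check follows; the required fields, allowed classifications, and 90-day ceiling are assumed policy values for illustration, not Uber's published rules.

```python
# Hypothetical policy values for the example.
REQUIRED_FIELDS = ("owner", "purpose", "classification", "rotation_period_days")
ALLOWED_CLASSIFICATIONS = {"public", "internal", "restricted"}
MAX_ROTATION_DAYS = 90


def validate_create_request(request):
    """Return a list of policy violations; an empty list means the request is accepted."""
    errors = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if request.get(name) in (None, "")
    ]
    classification = request.get("classification")
    if classification and classification not in ALLOWED_CLASSIFICATIONS:
        errors.append(f"unknown classification: {classification}")
    period = request.get("rotation_period_days")
    if isinstance(period, int) and period > MAX_ROTATION_DAYS:
        errors.append(
            f"rotation period {period}d exceeds policy maximum of {MAX_ROTATION_DAYS}d"
        )
    return errors
```

Because the same check can sit behind the API, the CLI, and the web UI, a developer gets an immediate, specific rejection instead of a ticket that bounces days later.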
ℹ️
The Multi-Vault Abstraction Layer
Uber's infrastructure spans AWS, GCP, Azure, and on-premises HashiCorp Vault. Each environment has its own native secret manager with a different API. The Secret Management Platform includes a unified abstraction layer that presents a single API for secret CRUD operations regardless of which underlying vault the secret lives in. Application code interacts with the platform API; the platform handles routing the operation to the correct vault (AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, or HashiCorp Vault) and translating the response. This abstraction decouples application code from vault topology — when Uber migrates a secret from one vault to another, no application code changes.
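One way to picture this layer is a client that looks up the owning backend in a catalog and delegates to it. In the sketch below, `VaultBackend`, `InMemoryBackend`, and `SecretsClient` are hypothetical names, and the in-memory backend stands in for real adapters to AWS Secrets Manager, GCP Secret Manager, Azure Key Vault, or HashiCorp Vault.

```python
from abc import ABC, abstractmethod


class VaultBackend(ABC):
    """Interface each vault adapter would implement (hypothetical)."""

    @abstractmethod
    def read(self, path): ...

    @abstractmethod
    def write(self, path, value): ...


class InMemoryBackend(VaultBackend):
    """Stand-in for a real vault adapter, used here so the example runs locally."""

    def __init__(self):
        self._data = {}

    def read(self, path):
        return self._data[path]

    def write(self, path, value):
        self._data[path] = value


class SecretsClient:
    """Single entry point: routes each operation to the vault named in the catalog."""

    def __init__(self, backends, catalog):
        self._backends = backends  # backend name -> VaultBackend
        self._catalog = catalog    # secret path -> backend name

    def get(self, path):
        return self._backends[self._catalog[path]].read(path)

    def put(self, path, value):
        self._backends[self._catalog[path]].write(path, value)

    def migrate(self, path, target):
        """Copy a secret to another backend and repoint the catalog entry."""
        self._backends[target].write(path, self.get(path))
        self._catalog[path] = target
```

After `migrate`, callers still use the same `get(path)` call and receive the same value, which is the decoupling from vault topology the paragraph describes.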
ℹ️
The Unified API: One SDK, Four Vaults
Uber's unified abstraction layer exposes a single SDK that application developers use regardless of which underlying vault stores their secret. The SDK handles routing: an AWS-deployed service's database password might live in AWS Secrets Manager; an on-prem service's certificate might live in HashiCorp Vault. The developer writes secrets.get('myservice/db_password') and receives the credential: the SDK consults the metadata catalog to find which vault holds that secret and retrieves it via the appropriate vault API. Application code is decoupled from vault topology, making future vault migrations transparent.
✅
Compliance Reporting on Demand
Before the metadata model, answering a compliance auditor's question — 'show me all credentials with access to our payment processing systems' — would have required interviewing dozens of engineering teams over days. After the platform, the same question is answered by a metadata query: filter by associated_system='payment_processing', return all matching secrets with their rotation history, access policies, and owner contacts. Compliance reporting that took days now takes seconds. The metadata model was built for developer self-service but it turns out to be equally valuable for security operations and compliance.
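The auditor's question above reduces to a filter over the metadata catalog. The sketch below assumes the catalog is a list of dictionaries with an `associated_system` field, as named in the text; the other field names and both functions are illustrative.

```python
def query_secrets(catalog, **filters):
    """Return catalog records whose fields equal every given filter value."""
    return [
        record for record in catalog
        if all(record.get(key) == value for key, value in filters.items())
    ]


def compliance_report(catalog, associated_system):
    """Project the matching records down to the columns an auditor asks for."""
    return [
        {
            "secret_id": record["secret_id"],
            "owner": record["owner"],
            "last_rotated": record["last_rotated"],
        }
        for record in query_secrets(catalog, associated_system=associated_system)
    ]
```

Calling `compliance_report(catalog, "payment_processing")` returns one row per matching secret, ready to export.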
Architecture
The Secret Management Platform sits as an orchestration layer above Uber's six canonical vault systems. Applications and services no longer talk directly to specific vaults — they interact with the platform's unified API or use Kubernetes-native injection (where secrets are automatically mounted into pods at deployment time). The platform maintains the metadata catalog, handles lifecycle automation via the Secret Lifecycle Manager, runs real-time scanning, and provides the developer self-service tools. The vault systems themselves are the authoritative stores; the platform is the governance and automation layer on top.
Before: 25 Fragmented Vaults, No Governance (interactive diagram available on TechLogStack)
After: Unified Secret Management Platform (interactive diagram available on TechLogStack)
CADENCE: WHY UBER CHOSE WORKFLOWS FOR ROTATION
Secret rotation is a multi-step process with real failure modes: the upstream provider might be unavailable, the vault write might fail, a downstream service might not acknowledge the new credential. A simple cron job or Lambda function that fails midway leaves the system in an unknown state: is the old credential still valid? Is the new one active? Cadence's durable workflow model addresses this by recording each completed step in the workflow's history, so a workflow that fails partway through resumes from the last successful step instead of starting over. Individual steps may still be retried after a failure, so each one should be safe to repeat. This makes secret rotation reliable enough to run 20,000 times per month without manual oversight.
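The resume-from-the-last-step behavior can be imitated with a toy journal that remembers which steps have finished. This is plain Python meant to illustrate the idea, not the Cadence API; `StepJournal`, `rotate`, and the shape of the `actions` dictionary are all invented for the example.

```python
class StepJournal:
    """Records the result of each completed step so a rerun can skip it."""

    def __init__(self):
        self.completed = {}

    def run(self, step_name, func):
        if step_name in self.completed:
            return self.completed[step_name]  # replay: do not execute the step again
        result = func()                       # may raise; nothing is recorded then
        self.completed[step_name] = result
        return result


def rotate(journal, actions):
    """Run the rotation steps in order, skipping any the journal already holds."""
    credential = journal.run("generate", actions["generate"])
    journal.run("write_vault", lambda: actions["write_vault"](credential))
    journal.run("reload", lambda: actions["reload"](credential))
    journal.run("revoke_old", actions["revoke_old"])
    return credential
```

If the reload step raises on the first attempt, calling `rotate` again with the same journal reuses the already-generated credential and continues from the reload step, so no second credential is minted.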
⚠️
The Migration Coordination Challenge
Migrating a secret from its old vault path to the new centralized platform sounds like a database copy operation. In practice it's a distributed coordination problem across hundreds of teams. Every service reading the secret needs to be updated simultaneously (or with a dual-read transition period). Uber built tooling to discover all consumers of a vault path, generate migration checklists, and track completion status. Services that hadn't migrated within the target window were flagged for the owning team. The tooling made a migration problem tractable across an organization of thousands of engineers.
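The dual-read transition mentioned above can be sketched as a read that prefers the new location and falls back to the legacy one. `DictStore`, `SecretNotFound`, and `dual_read` are hypothetical names, and the in-memory store stands in for real vault clients.

```python
class SecretNotFound(Exception):
    """Raised by a store when the requested path does not exist."""


class DictStore:
    """In-memory stand-in for a vault client, used so the example runs locally."""

    def __init__(self, data):
        self._data = dict(data)

    def read(self, path):
        try:
            return self._data[path]
        except KeyError:
            raise SecretNotFound(path) from None


def dual_read(new_store, new_path, old_store, old_path):
    """Prefer the migrated location; fall back to the legacy one during the transition.

    Returns (value, source) so callers can report which services still hit the old path.
    """
    try:
        return new_store.read(new_path), "new"
    except SecretNotFound:
        return old_store.read(old_path), "old"
```

Logging the returned source gives the migration tooling a live signal of which consumers have not yet moved, which is the kind of completion tracking the paragraph describes.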
🛡️
SPIFFE/SPIRE: The Path to Secretless
Uber's secretless authentication initiative builds on the SPIFFE/SPIRE framework — an open standard for issuing cryptographic workload identities. Every service at Uber has a unique SPIFFE identity that is automatically issued, cryptographically verifiable, and short-lived. Services that can authenticate using their SPIFFE identity don't need to hold long-lived credentials at all — the identity proves who they are, and the system issues time-limited tokens dynamically. As more of Uber's infrastructure adopts SPIFFE-based authentication, the number of long-lived secrets that need to be managed by the platform shrinks toward zero.
Lessons
Uber's secrets management story is about the organizational and engineering cost of decentralization without governance — and the compounding returns of building the right platform once.
- 01. Consolidate ownership before building automation. Uber's two-phase approach — consolidate 25 vaults into 6, then build the platform — was the right sequence. Building a governance platform on top of 25 independent vaults would have required integrating 25 different systems. Building it on 6 centrally owned vaults meant one integration per vault type. Consolidation first is harder organizationally but dramatically simpler technically.
- 02. A metadata model (a structured record of each secret's properties — owner, purpose, associated services, rotation policy, security classification — that enables automated governance, inventory, and incident response) is the prerequisite for all other automation. Without metadata, you cannot automate rotation (you don't know the rotation policy), you cannot generate inventory (you don't know what secrets are for), and you cannot respond to incidents (you don't know which services are affected). Build the metadata model before building any automation on top of it.
- 03. Real-time scanning with automatic revocation changes the economics of credential exposure. When exposure is detected in seconds and the credential is automatically revoked, a developer accidentally committing a credential to git causes a 30-second incident rather than a multi-month exposure. The scanning + revocation loop is the highest-leverage security improvement for teams still relying on manual credential hygiene.
- 04. Use durable workflow systems (like Cadence or Temporal) for secret rotation, not scripts or cron jobs. Rotation is a multi-step process with real failure modes at each step. A workflow system that durably records each completed step and automatically resumes after a failure makes rotation reliable enough to run at scale without manual oversight, provided the individual steps are safe to retry. A cron job that fails halfway through a rotation leaves credentials in an unknown state.
- 05. Self-service developer tooling is what makes centralized governance scale. A centralized Secrets team without self-service tooling becomes a bottleneck — every credential operation requires a ticket. A centralized Secrets team with self-service tooling becomes a platform team — they build and maintain the guardrails, and developers operate within them autonomously. The goal is governance at scale, not control at the cost of velocity.
✅
10 Engineers, 5,000 Microservices
The most striking number in Uber's secrets story is the ratio: 10 engineers managing secrets governance for 5,000 microservices. This 500:1 leverage ratio is only possible because the platform does the work that used to require human coordination. Automated rotation, self-service tooling, policy enforcement in the platform layer — all of these shift work from the coordination model (each secret operation requires a human) to the automation model (each secret operation executes itself). Platform teams that want to scale should measure their leverage ratio and ask what automation would improve it.
SECRETS IN VERSION CONTROL ARE NOT A MISTAKE; THEY'RE AN ARCHITECTURE PROBLEM
Every engineering organization has discovered credentials accidentally committed to git. The standard response is to educate developers about the risk. Uber's analysis found that the root cause was not developer carelessness — it was the absence of a convenient alternative. When getting a credential into a service requires filing a ticket and waiting days, developers find a shortcut: put it in the config file. The secure path needs to be the easy path. The self-service API and Kubernetes injection that the Secret Management Platform provides made the secure approach easier than the shortcut.
Uber built a platform to manage 150,000 secrets at scale — and the most important feature turned out to be a metadata field that just says 'who owns this thing?'
TechLogStack — built at scale, broken in public, rebuilt by engineers
This case is a plain-English retelling of publicly available engineering material.
Read the full case on TechLogStack → (interactive diagrams, source links, and the full reader experience).