DEV Community: Stephane Nangue

good to know

Stephane Nangue — Tue, 17 Mar 2026 22:58:39 +0000

Stop Giving Secrets to Your Workloads: From Long-Lived Credentials to Identity-Aware Egress

Stephane Nangue ・ Mar 17

#cloud #identity #security #kubernetes

good to read

Stephane Nangue — Tue, 17 Mar 2026 22:57:36 +0000

Stop Giving Secrets to Your Workloads: From Long-Lived Credentials to Identity-Aware Egress

Stephane Nangue — Tue, 17 Mar 2026 22:54:12 +0000

How WIMSE rethinks credential exchange in multi-cloud environments — and how to implement it today

Every modern cloud application eventually faces the same uncomfortable truth: to call an external API, it needs a secret. That secret — an API key, an access token, a cloud credential — has to live somewhere. Today, the standard approach is to store that secret in a secret manager and have each workload fetch it at startup. From that point on, the secret lives in the application's memory for as long as the workload runs — ready to be used for every outbound API call. This model is simple to implement, but it carries a category of risk that quietly scales with your infrastructure.

The more services you run, the more copies of those credentials exist. The more cloud providers you integrate, the wider the attack surface grows. A single leaked key — in a log, in a crash dump, over an insecure channel — can give an attacker everything they need to move laterally across your systems.

The IETF is actively working on a standard called WIMSE (Workload Identity in a Multi-System Environment) that addresses this problem at its root: instead of distributing long-lived credentials to workloads, you issue short-lived, request-scoped access credentials derived from the workload's verified identity. This article explains what WIMSE proposes, where it still falls short, and how an identity-aware egress gateway can implement the same trust model today — without waiting for the standard to mature.

Understanding the Building Blocks

Before diving into the architecture, it helps to be precise about some terms that are often conflated.

A workload is an independently addressable and executable software entity — a microservice, a container, a virtual machine, a serverless function — that initiates and receives network communication. A workload instance is a single running instantiation of that workload at a given point in time.

An identity credential is a document asserting who an entity is, with a cryptographic binding that lets a relying party verify the claim without trusting the document alone. The issuer signs the document; the relying party verifies the signature against a known public key or JWKS endpoint. JWT/OIDC tokens, SPIFFE SVIDs, client certificates, and SAML assertions are all examples.
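
The sign-and-verify relationship described above can be sketched in a few lines of Go. This is a minimal illustration, not a real token format: it signs a raw claims string with Ed25519 from the standard library, and the function names (newIssuerKeys, issue, verify) and the example subject are assumptions made for the sketch. Real systems use JWTs or X.509 certificates and obtain the issuer's public key from a JWKS endpoint or a trust bundle.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// newIssuerKeys generates the issuer's key pair. The public half is what a
// relying party would normally obtain from a JWKS endpoint or trust bundle.
func newIssuerKeys() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

// issue signs the claims document with the issuer's private key.
func issue(priv ed25519.PrivateKey, claims []byte) []byte {
	return ed25519.Sign(priv, claims)
}

// verify lets a relying party check the claims against the issuer's public
// key, so it does not have to trust the document on its own.
func verify(pub ed25519.PublicKey, claims, sig []byte) bool {
	return ed25519.Verify(pub, claims, sig)
}

func main() {
	pub, priv := newIssuerKeys()

	// Hypothetical claims; a real credential would be a JWT or certificate.
	claims := []byte(`{"sub":"spiffe://example.org/payments-service"}`)
	sig := issue(priv, claims)
	fmt.Println("genuine document verifies:", verify(pub, claims, sig))

	// Changing the subject invalidates the signature.
	tampered := []byte(`{"sub":"spiffe://example.org/admin"}`)
	fmt.Println("tampered document verifies:", verify(pub, tampered, sig))
}
```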

A workload identity credential combines both concepts: it is an identity credential whose subject is a workload, identified by a portable, structured URI called a WIMSE-ID or SPIFFE-ID that is globally meaningful across trust domains. It may be cryptographically bound to a key pair, requiring the presenter to prove possession of the private key. SPIFFE X.509-SVIDs and JWT-SVIDs are canonical examples. This is the foundational primitive that WIMSE builds on — and what distinguishes a proper workload identity system from the platform-specific attestation tokens that GitHub Actions, Kubernetes, or AWS issue today.

An access credential is a piece of data granting permissions to a specific resource or API. It carries entitlements that target services honor. Unlike workload identity credentials, access credentials are typically not cryptographically bound to the identity of the holder, which means anyone holding one can exercise its permissions. API keys, AWS access keys used for SigV4 signing, and OAuth access tokens are examples.

The distinction matters because most security incidents involving cloud credentials are not identity credential leaks — they are access credential leaks. Fixing the problem means changing how access credentials are issued, scoped, and consumed.

The Problem: How Credentials Are Managed Today

Consider a concrete example: an e-commerce application called Just-Buy-It running on Kubernetes. It has an ingress controller, a payments service, and a secret manager. The payments service needs to call the Stripe API.

At step 0, at startup, the payments-service pod fetches a long-lived Stripe API key from the secret manager and stores it in memory.

A mobile application authenticates and obtains an access token to call the payments-service API.
The mobile application calls the payments-service through the ingress controller.
The ingress controller forwards the request to the payments-service over HTTPS (optionally with mTLS).
The payments-service authenticates the user and uses the Stripe API key it loaded at startup to call Stripe.

This flow has several compounding problems:

Distributed attack surface: One copy of the access credential is deployed to every instance of the payments-service. The more instances you run, the more places the credential can leak from.
Long-lived exposure: The credential persists for as long as the workload runs. If it leaks, an attacker can reuse it until it is manually rotated — a window that is often measured in months.
Coarse-grained authorization: The credential is scoped to a broad set of Stripe API operations, not to a specific request. Every call in the credential's lifetime uses the same permissions, dramatically increasing the blast radius of a compromise.
Context-unaware authorization: Stripe authorizes the call based on credential permissions alone — not on the caller's identity, the time of day, the originating IP, or any other contextual signal.
Credential sprawl: Each new external service requires a new long-lived credential in memory. In multi-cloud environments, this multiplies rapidly.
Weak audit and attribution: Multiple instances share the same credential. There is no per-instance identity, so tracing a specific API call back to a specific workload instance is difficult or impossible.
Possession equals identity: Anyone holding the access credential can call Stripe. There is no proof of who — or what — is presenting it, making workload impersonation trivially easy.

These issues do not just add up linearly — they compound. In a multi-cloud, multi-system environment with dozens of services and hundreds of instances, they become an unmanageable liability.

The WIMSE Approach: Identity-First Credential Exchange

WIMSE (Workload Identity in a Multi-System Environment) is a specification being developed at the IETF to standardize how workloads obtain and use access credentials securely. Its core insight is straightforward: instead of distributing a long-lived access credential to a workload at startup, derive a short-lived, request-scoped access credential from the workload's verified identity at call time.

WIMSE introduces the concept of a trust domain, a logical grouping of systems sharing a common set of security controls and policies. In our Just-Buy-It example, there are two trust domains: just-buy-it.prod.com and stripe.com. Within each trust domain, workload identity credentials are issued by a CA/Credential Service, either X.509 certificate-based or JWT-based. Three additional components orchestrate the credential exchange:

Context Service: Exchanges the mobile app's access token for a security context token containing verified information about the calling entity.
Token Service: Accepts a workload identity credential and a security context token, and issues a WIMSE token.
External Token Service: Operated by the external provider (Stripe in our example), it accepts a WIMSE token and returns a short-lived, request-scoped access credential.

Here is how the revised flow works:

The mobile application authenticates and obtains an access token.
The mobile application calls the payments-service through the Gateway Service, which now acts as an identity proxy.
The Gateway Service exchanges the mobile app's access token for a security context token from the Context Service.
The Gateway authenticates to the payments-service using its workload identity credential and forwards the request along with the security context token. Requests without a security context token are automatically denied.
The payments-service authenticates to the Token Service using its workload identity credential and passes the security context token. The Token Service issues a WIMSE token.
The payments-service presents the WIMSE token to Stripe's External Token Service, which returns a short-lived stripe_api_key scoped to the current request.
The payments-service uses the short-lived, scoped stripe_api_key to call the Stripe API.
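
The last three steps of this flow can be simulated in memory. The sketch below is only an illustration of the exchange, under loud assumptions: the type names, field names, and functions are hypothetical and do not come from the WIMSE drafts, and all cryptography (signatures, proof of possession) is left out. It shows that no token is issued without a security context, and that the resulting credential is limited to one action and a short lifetime.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ContextToken stands in for the security context token (hypothetical shape).
type ContextToken struct {
	User   string
	Action string // the single operation this request is about
}

// WIMSEToken stands in for the token issued by the Token Service.
type WIMSEToken struct {
	WorkloadID string
	Context    ContextToken
}

// AccessCredential is the short-lived, request-scoped credential.
type AccessCredential struct {
	Scope     string
	ExpiresAt time.Time
}

// issueWIMSEToken plays the Token Service: it needs both a workload
// identity and a security context token before it issues anything.
func issueWIMSEToken(workloadID string, ctx *ContextToken) (WIMSEToken, error) {
	if workloadID == "" {
		return WIMSEToken{}, errors.New("missing workload identity")
	}
	if ctx == nil {
		return WIMSEToken{}, errors.New("missing security context token")
	}
	return WIMSEToken{WorkloadID: workloadID, Context: *ctx}, nil
}

// exchange plays the External Token Service: the credential it returns is
// scoped to the action in the context and expires after ttl.
func exchange(tok WIMSEToken, now time.Time, ttl time.Duration) AccessCredential {
	return AccessCredential{Scope: tok.Context.Action, ExpiresAt: now.Add(ttl)}
}

// valid reports whether the credential may be used for action at time at.
func (c AccessCredential) valid(action string, at time.Time) bool {
	return c.Scope == action && at.Before(c.ExpiresAt)
}

func main() {
	ctx := &ContextToken{User: "alice", Action: "charges:create"}
	tok, err := issueWIMSEToken("spiffe://example.org/payments-service", ctx)
	if err != nil {
		panic(err)
	}
	now := time.Now()
	cred := exchange(tok, now, 30*time.Second)

	fmt.Println("same action, in time: ", cred.valid("charges:create", now))
	fmt.Println("different action:     ", cred.valid("refunds:create", now))
	fmt.Println("same action, too late:", cred.valid("charges:create", now.Add(time.Minute)))
}
```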

The improvements are significant. Access credentials are short-lived, so a leaked one is useful to an attacker only for a brief window. They are issued per request and scoped to that request, enabling fine-grained authorization. They are bound to the security context, so a leaked credential cannot be replayed without the matching context. The same workload identity credential can be used to request access credentials from multiple external systems, enabling a unified authentication posture across clouds. And because each workload instance receives its own unique credential, every API call becomes attributable.

Where WIMSE Still Falls Short

WIMSE is a meaningful step forward, but it is still a draft specification. It is unlikely to reach standard status before 2028, with broad ecosystem adoption likely arriving around 2029–2030. More importantly, even once standardized, it leaves several gaps that enterprise customers will find difficult to accept:

Credentials still reach workloads: Even though they are short-lived, WIMSE still hands access credentials to workloads. Credential theft, replay attacks, and misuse before expiration remain possible.
Issuance-time control only: Once an access credential is issued, none of the WIMSE components are in the request path anymore. If a credential is misused during its validity window, there is no mechanism to intervene in real time.
Fragmented audit trails: WIMSE logs credential issuance events; workloads log API calls. Producing a complete, correlated audit trail requires joining logs from multiple systems — a significant operational burden.
Distributed security logic: Every workload must independently implement the full protocol — credential exchange, token refresh, caching, error handling, retry logic. In a system with hundreds of services, this produces a distributed, inconsistently maintained security implementation instead of a single enforcement layer.

Warden: The WIMSE Trust Model, Available Today

Warden, an open-source identity-aware egress gateway, is not a replacement for WIMSE. It implements the same trust model (workload identity credentials, per-request scoped access, and cross-domain authorization) but as a gateway rather than a distributed protocol. When WIMSE becomes a standard, Warden will be a natural implementation of it. In the meantime, it solves the problem today, with existing infrastructure, without waiting for ecosystem-wide adoption.
In our use case, Warden replaces the Context Service and the Token Service.

Here is how the flow works with Warden:

The mobile application authenticates and obtains an access token.
The mobile application calls the payments-service through the Gateway Service.
The Gateway authenticates to the payments-service using its workload identity credential and forwards the request. The payments-service determines it needs to call the Stripe API. The Stripe endpoint is configured on the payments-service to point to Warden.
The payments-service authenticates to Warden using its workload identity credential and makes the call exactly as it would to Stripe directly. Warden evaluates a policy based on the full request context, including a role passed by the payments-service, to decide whether to allow the request.
Based on the request context, Warden uses a privileged access credential stored in its vault (long-lived, but rotated automatically on a regular schedule) to mint a short-lived stripe_api_key scoped to the current request.
Warden forwards the request to the Stripe API using the short-lived, scoped stripe_api_key. The credential never leaves Warden.
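
The gateway pattern in these steps can be sketched with Go's standard reverse proxy. This is not Warden's actual code: newGateway, workloadID, and the allow and mint hooks are hypothetical names, the upstream is a fake transport so the example runs without network access, and the demo fabricates the TLS state that a real mTLS listener would populate after verifying the client certificate. The point is the shape: authenticate the caller, evaluate policy on the request, then attach a credential that only the gateway ever holds.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// workloadID reads the caller's identity from the URI SAN of its client
// certificate. It assumes the TLS listener already verified the certificate.
func workloadID(r *http.Request) string {
	if r.TLS == nil || len(r.TLS.PeerCertificates) == 0 {
		return ""
	}
	uris := r.TLS.PeerCertificates[0].URIs
	if len(uris) == 0 {
		return ""
	}
	return uris[0].String()
}

// newGateway authenticates the caller, applies a policy, and forwards the
// request upstream with a credential that never reaches the workload.
// allow and mint are hypothetical hooks; transport is injectable for tests.
func newGateway(upstream *url.URL, transport http.RoundTripper,
	allow func(id string, r *http.Request) bool,
	mint func(id string, r *http.Request) string) http.Handler {

	proxy := &httputil.ReverseProxy{
		Transport: transport,
		Rewrite: func(pr *httputil.ProxyRequest) {
			pr.SetURL(upstream)
			// Overwrite whatever the workload sent with the minted credential.
			pr.Out.Header.Set("Authorization", "Bearer "+mint(workloadID(pr.In), pr.In))
		},
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := workloadID(r)
		if id == "" {
			http.Error(w, "no workload identity", http.StatusUnauthorized)
			return
		}
		if !allow(id, r) {
			http.Error(w, "denied by policy", http.StatusForbidden)
			return
		}
		proxy.ServeHTTP(w, r)
	})
}

// fakeUpstream stands in for the external API and records the credential it saw.
type fakeUpstream struct{ sawAuth string }

func (f *fakeUpstream) RoundTrip(r *http.Request) (*http.Response, error) {
	f.sawAuth = r.Header.Get("Authorization")
	return &http.Response{
		StatusCode: http.StatusOK,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader("ok")),
		Request:    r,
	}, nil
}

// demo sends one allowed and one denied request through the gateway and
// returns both status codes plus the credential the upstream received.
func demo() (allowed, denied int, upstreamAuth string) {
	target, _ := url.Parse("https://api.example.test") // placeholder upstream
	up := &fakeUpstream{}
	gw := newGateway(target, up,
		func(id string, r *http.Request) bool { return r.Method == http.MethodPost },
		func(id string, r *http.Request) string { return "short-lived-for-" + r.Method },
	)
	send := func(method string) int {
		req := httptest.NewRequest(method, "/v1/charges", nil)
		id, _ := url.Parse("spiffe://example.org/payments-service")
		req.TLS = &tls.ConnectionState{
			PeerCertificates: []*x509.Certificate{{URIs: []*url.URL{id}}},
		}
		rec := httptest.NewRecorder()
		gw.ServeHTTP(rec, req)
		return rec.Code
	}
	return send(http.MethodPost), send(http.MethodDelete), up.sawAuth
}

func main() {
	fmt.Println(demo())
}
```

Because the decision and the credential injection happen in the same handler, a denied request never causes a credential to be minted, and the workload's own code contains nothing but an ordinary HTTP call to a different base URL.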

Warden's Advantages Over a Naive WIMSE Implementation

No credential distribution: Warden calls cloud APIs on behalf of workloads. Credentials never reach workload processes — not even short-lived ones. The attack surface is reduced to a single, hardened component.
Continuous control: Because Warden sits directly in the request path, it can inspect every request, enforce policy, and block calls in real time — not just at credential issuance time.
Complete audit trails: Every outbound API call passes through Warden with a unique access credential bound to a specific workload identity. The audit trail is unified and attributable by construction.
Works with existing systems: WIMSE requires standardization, broad ecosystem adoption, and client-side support from every workload and external provider. Warden works immediately with existing infrastructure.
Reduced operational complexity: Workloads do not implement credential exchange. They simply send requests, and Warden handles everything. Policy enforcement is centralized; debugging is straightforward; upgrades are isolated to one component.
Better developer experience: Security happens transparently. Developers write normal HTTP calls; the gateway handles the rest.

Conclusion

The credential distribution problem is not new, but it is getting harder to ignore. As cloud architectures grow more distributed — more services, more providers, more workload instances — the gap between what today's secret-manager model can safely support and what modern infrastructure demands continues to widen.

WIMSE offers the right conceptual framework: derive short-lived, request-scoped access credentials from verifiable workload identities, and never hand long-lived secrets to workloads. The standard is still years away from broad adoption, but the architecture does not have to wait.

Warden implements this trust model today as an open-source, identity-aware egress gateway for cloud APIs. It currently proxies AWS, Azure, GCP, HashiCorp Vault, OpenBao, GitLab, GitHub, Mistral AI and OpenAI — with a roadmap targeting 100+ cloud providers, SaaS products and AI providers. The goal is to make the WIMSE model operationally available to any team, regardless of whether their cloud providers and service mesh have adopted the emerging standard.

If you are building infrastructure where workload identity and credential security matter, Warden is worth a look: https://github.com/stephnangue/warden

I Spent 4 Years Running HashiCorp Vault in Banks. Here's What It Can't Do.

Stephane Nangue — Wed, 25 Feb 2026 16:09:23 +0000

I've deployed and operated HashiCorp Vault in financial institutions across Europe for more than four years. Vault is an incredible piece of software. I've built my career around it. But after watching the same pattern play out at every organization, I realized Vault solves only half the problem — and the other half is about to get much worse.

Vault solves the secrets problem. It doesn't solve the access problem.

Vault is brilliant at what it does: store secrets, rotate credentials, issue short-lived tokens, manage encryption keys. It's the foundation of secrets management for good reason.

But here's what I kept seeing in every deployment, from mid-size fintechs to major European banks:

Teams would spend months setting up Vault. Secrets engines configured. Policies written. AppRole or Kubernetes auth wired up. Audit logs enabled. Everything by the book.

And then a service would authenticate to Vault, receive AWS credentials, and... that was it. The service walked away with real credentials and did whatever it wanted. No one knew which API calls it actually made. No one could distinguish between Service A and Service B if they shared an IAM role. And if something went wrong, CloudTrail would tell you the role name — not which of your twelve microservices actually called s3:DeleteObject at 3 AM.

Vault's job ends at the moment it hands over the credential. What happens after that is a blind spot.

The audit gap

Let me be specific about what's missing. Imagine you have five services that authenticate to Vault and receive AWS credentials via the AWS secrets engine. Vault's audit log tells you:

Service A requested credentials at 10:00
Service B requested credentials at 10:05
Service C requested credentials at 10:12

Great. Now one of those services makes an unusual API call, say iam:CreateUser in a production account. CloudTrail shows that the IAM role made the call. But which service? You can't tell. The credentials Vault issued all map to the same IAM role. You're left correlating timestamps and hoping for the best.

This isn't a Vault bug. It's a gap in the model. Vault manages credential lifecycle — issuance, rotation, revocation. It doesn't manage credential usage.

Short-lived credentials are not enough

The industry has converged on "short-lived credentials" as the answer to secrets sprawl. And it's a good answer — a credential that expires in 15 minutes is better than one that lives forever.

But a credential that lives for 15 minutes can still do unlimited damage in 15 minutes. Scope is defined at issuance, not at the moment of each request. Once the credential is in the service's hands, it can make any API call the IAM policy allows, as many times as it wants, until expiration.

In a financial institution, the question isn't just "did this service have valid credentials?" It's "exactly what API calls did this service make, when, and were they all expected?" Short-lived credentials don't answer that.

Enter AI agents — and the problem gets urgent

For the last four years, this audit gap was an annoyance. Something you worked around with convention, extra tooling, and a lot of manual correlation. Conventional services, written and deployed by humans, are predictable enough that you can mostly get away with it.

AI agents change everything.

We're now giving autonomous software access to production cloud accounts. An AI coding agent that needs to push to GitHub, read from S3, and deploy to Azure. A data analysis agent that queries DynamoDB. An infrastructure agent that provisions resources via Terraform.

These agents are non-deterministic. They make decisions at runtime. They might call an API you didn't expect, in an order you didn't predict. And they're multiplying — one team might run ten agents, each with cloud access.

The question isn't whether AI agents will have cloud credentials. They already do. The question is whether anyone knows what they're doing with them.

With Vault alone, the answer is no.

What's missing: request-level visibility

The gap is between credential issuance and credential usage. Vault handles the first part well. Nothing handles the second.

What you actually need is:

Per-request identity — know which specific workload or agent made each individual API call, not just which IAM role
Per-request audit — log every API call with the workload identity attached, not just credential issuance events
Per-request policy — enforce what API calls are allowed at the moment they happen, not just at credential issuance
Zero credential exposure — the workload shouldn't hold credentials at all, eliminating the risk of credential theft or misuse
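
To make "per-request policy" concrete, here is a minimal default-deny check, for example one that lets agents read a single subtree of an API but not modify it. The Rule type, its fields, and the allowed function are hypothetical and are not Warden's policy language. The sketch normalizes the path before matching so that ".." segments cannot escape the allowed subtree; it does not decode percent-encoded paths, which a real gateway would also have to handle.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Rule is a hypothetical allow rule; anything not matched is denied.
type Rule struct {
	WorkloadPrefix string   // workload IDs this rule applies to
	Methods        []string // HTTP methods the rule allows
	PathPrefix     string   // allowed subtree, without a trailing slash
}

func hasMethod(methods []string, m string) bool {
	for _, x := range methods {
		if x == m {
			return true
		}
	}
	return false
}

// allowed evaluates one request against the rules at the moment of the call.
func allowed(rules []Rule, workloadID, method, reqPath string) bool {
	p := path.Clean("/" + reqPath) // collapse "..", "." and duplicate slashes
	for _, r := range rules {
		if !strings.HasPrefix(workloadID, r.WorkloadPrefix) {
			continue
		}
		if !hasMethod(r.Methods, method) {
			continue
		}
		if p == r.PathPrefix || strings.HasPrefix(p, r.PathPrefix+"/") {
			return true
		}
	}
	return false
}

func main() {
	rules := []Rule{{
		WorkloadPrefix: "spiffe://example.org/agents/",
		Methods:        []string{"GET"},
		PathPrefix:     "/repos/acme/app/contents/src",
	}}
	agent := "spiffe://example.org/agents/coder-1"

	fmt.Println(allowed(rules, agent, "GET", "/repos/acme/app/contents/src/main.go"))
	fmt.Println(allowed(rules, agent, "DELETE", "/repos/acme/app/contents/src/main.go"))
	fmt.Println(allowed(rules, agent, "GET", "/repos/acme/app/contents/src/../secrets.txt"))
}
```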

This is why I built Warden

Warden is an open-source identity-aware egress gateway. It sits in the request path between your workloads and cloud services — not at credential issuance time, but at the moment of every API call.

Here's how it works:

Your workload makes a normal API call (say, to AWS S3)
Warden intercepts the outbound request
Warden verifies the workload's identity — via mTLS, JWT, Kubernetes service account, SPIFFE, or cloud machine identity
Warden mints short-lived credentials on the fly (via STS, Vault, or the provider's IAM)
Warden injects the credentials into the request and forwards it to the cloud provider
Every request is logged with the specific workload identity that made it
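
The last step is what closes the audit gap: each log line carries the workload identity, not just a role name. As an illustration only, a per-request audit record might look like the sketch below. The AuditEvent type and its field names are assumptions for this example and do not describe Warden's real log schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// AuditEvent is a hypothetical per-request audit record: one entry per
// outbound API call, tied to the workload that made it.
type AuditEvent struct {
	Time       time.Time `json:"time"`
	WorkloadID string    `json:"workload_id"`
	Provider   string    `json:"provider"`
	Action     string    `json:"action"`
	Decision   string    `json:"decision"`
}

// encode renders the event as a single JSON line, ready for a log pipeline.
func encode(e AuditEvent) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(encode(AuditEvent{
		Time:       time.Now().UTC(),
		WorkloadID: "spiffe://example.org/agents/coder-1",
		Provider:   "aws",
		Action:     "s3:DeleteObject",
		Decision:   "deny",
	}))
}
```

With records of this shape, the question "which of my twelve microservices called s3:DeleteObject at 3 AM?" becomes a filter on workload_id and action rather than a timestamp correlation exercise.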

The workload never sees, holds, or handles real cloud credentials. They exist only inside the proxy, for the duration of a single request.

Warden complements Vault — it doesn't replace it

This is important: Warden isn't a Vault replacement. Vault is still the best secrets manager out there. Warden can use Vault as a credential source — in fact, that's one of the recommended configurations.

The difference is where they sit in the stack:

Vault manages credential lifecycle: storage, issuance, rotation, revocation
Warden manages credential usage: per-request injection, per-request audit, per-request policy

Think of it this way: Vault is the bank that holds the money. Warden is the accountant that records every transaction and makes sure each one is authorized.

What it looks like in practice

Warden currently supports AWS, Azure, GCP, HashiCorp Vault, GitHub, and GitLab as providers. The workload doesn't need any code changes — if your tool supports a custom endpoint URL, it works with Warden.

A few things that become possible with Warden in the path:

"Which agent called s3:PutObject on the production bucket at 2:47 PM?" You can answer this instantly, because every request is logged with the workload identity.
"Block all destructive operations from AI agents." Per-request policy enforcement means you can allow reads but block deletes, regardless of what the IAM role permits.
"This agent should only access files under src/ in a GitHub repo." Warden can enforce path-level restrictions that GitHub's own token scoping can't express.
"Prove to the auditor that no unauthorized API calls were made last quarter." The audit log is complete, per-request, and tied to identities.

The hard part: proxying cloud APIs

Building Warden has been an exercise in pain, mostly around AWS SigV4 signature verification from the proxy side. Signing a request is well-documented. Verifying one from a man-in-the-middle position — so you can strip the original auth, inject new credentials, and re-sign — is a minefield of undocumented SDK behaviors, URI encoding edge cases, and service-specific quirks.
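
For context, the well-documented half looks like this. The sketch below shows only the SigV4 signing-key derivation and the final HMAC step, which signing and verification share; it deliberately leaves out building the canonical request and the string to sign, which is where the edge cases described above live. The function names and the example secret are placeholders, and the sketch assumes the verifier knows the secret the workload signed with. It makes no claim about how Warden itself is implemented.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

func hmacSHA256(key []byte, data string) []byte {
	h := hmac.New(sha256.New, key)
	h.Write([]byte(data))
	return h.Sum(nil)
}

// signingKey derives the SigV4 signing key from the secret access key,
// the date (YYYYMMDD), the region, and the service name.
func signingKey(secret, date, region, service string) []byte {
	kDate := hmacSHA256([]byte("AWS4"+secret), date)
	kRegion := hmacSHA256(kDate, region)
	kService := hmacSHA256(kRegion, service)
	return hmacSHA256(kService, "aws4_request")
}

// signature computes the hex signature over an already-built string to sign.
func signature(key []byte, stringToSign string) string {
	return hex.EncodeToString(hmacSHA256(key, stringToSign))
}

// verify recomputes the signature and compares it in constant time.
func verify(key []byte, stringToSign, presented string) bool {
	return hmac.Equal([]byte(signature(key, stringToSign)), []byte(presented))
}

func main() {
	key := signingKey("EXAMPLE-SECRET", "20260225", "us-east-1", "s3")

	// Placeholder string to sign; the last line would be the SHA-256 hash
	// of the canonical request.
	sts := "AWS4-HMAC-SHA256\n20260225T000000Z\n20260225/us-east-1/s3/aws4_request\n<hash>"
	sig := signature(key, sts)

	fmt.Println("matches:", verify(key, sts, sig))
	fmt.Println("matches after tampering:", verify(key, sts+"x", sig))
}
```

Any disagreement between the proxy and the client SDK about URI encoding, header normalization, or payload hashing changes the string to sign, so the comparison fails even though the key derivation is identical on both sides.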

S3 alone has path-style vs virtual-hosted buckets, unsigned payloads, chunked signing, and presigned URLs with auth in query parameters. Directory buckets and table buckets use a separate SigV4-S3Express auth flow that's next on the roadmap.
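
As a small example of the first quirk, a proxy has to work out the bucket and key differently depending on the addressing style. The sketch below is a simplified assumption of how that split could be done. The parseS3 function is hypothetical, and it covers only plain regional and global endpoints, not dual-stack, accelerate, access point, or directory bucket hostnames.

```go
package main

import (
	"fmt"
	"strings"
)

// parseS3 splits a request into bucket and key for the two classic styles:
//   path-style:     s3.<region>.amazonaws.com/<bucket>/<key>
//   virtual-hosted: <bucket>.s3.<region>.amazonaws.com/<key>
func parseS3(host, urlPath string) (style, bucket, key string) {
	p := strings.TrimPrefix(urlPath, "/")
	if strings.HasPrefix(host, "s3.") {
		// The bucket is the first path segment; the rest is the key.
		bucket, key, _ = strings.Cut(p, "/")
		return "path", bucket, key
	}
	// LastIndex tolerates bucket names that themselves contain dots.
	if i := strings.LastIndex(host, ".s3."); i > 0 {
		return "virtual-hosted", host[:i], p
	}
	return "unknown", "", ""
}

func main() {
	fmt.Println(parseS3("s3.us-east-1.amazonaws.com", "/my-bucket/reports/q1.csv"))
	fmt.Println(parseS3("my-bucket.s3.us-east-1.amazonaws.com", "/reports/q1.csv"))
}
```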

But the proxy architecture is what makes everything else possible. If you're not in the request path, you can't do per-request anything.

What's next

Warden supports six providers today. The roadmap is 100+ — every major cloud, SaaS, and AI service. The goal is to become the standard identity-aware gateway for how workloads and AI agents access external services.

The project is open-source, written in Go, and available on GitHub: github.com/stephnangue/warden

If you've been running Vault and felt the same gap I described — or if you're deploying AI agents and wondering how to govern their cloud access — I'd love to hear from you. Open an issue, start a discussion, or reach out directly.

Stephane has been deploying and operating HashiCorp Vault in financial institutions across Europe for over four years. He is the creator of Warden, an open-source identity-aware egress gateway for cloud and SaaS services.