Ramasankar Molleti

Posted on May 6

Anthropic Just Killed the API Key: A Deep Dive into Workload Identity Federation for Claude

#ai #kubernetes #devops #security

TL;DR: Anthropic shipped Workload Identity Federation (WIF) for the Claude API. Your workloads now exchange a short-lived OIDC JWT from your IdP (EKS IRSA, GKE, AKS, GitHub Actions, Kubernetes, SPIFFE/SPIRE, Okta, Entra ID) for a short-lived sk-ant-oat01-... token via the RFC 7523 jwt-bearer grant. Zero static secrets. But it's workload identity, not user delegation, and that distinction is where confused-deputy bugs are about to start showing up.

Why this matters (and why I'm writing a sequel)

A few weeks back I wrote about draft-klrc-aiagent-auth — the IETF blueprint for agentic identity from engineers at AWS, Zscaler, Ping Identity, and Defakto Security. The thesis was straightforward: most teams securing AI agents with API keys are one breach away from disaster, and the fix is an 8-layer Agent Identity Management System (AIMS) built on SPIFFE for workload identity, WIMSE for proof tokens across proxies, OAuth Token Exchange for delegation, and Transaction Tokens for operation-scoped authorization.

That post was about the standard. This post is about the first major LLM provider to ship a production implementation of the bottom half of that stack.

If you're running Claude in a regulated environment — financial services, healthcare, gov — and you've been waiting for the day you can stop baking sk-ant-... keys into Kubernetes secrets, that day is here. But there's a subtle architectural trap, and it's easy to miss.

Let's walk through what shipped, the exchange flow, the SPIFFE integration, and the confused-deputy footgun.

What Anthropic actually built

The mental model is clean:

A service account has credentials minted for it on demand, instead of being a credential.

That one sentence captures the whole shift. An API key is a credential — possessing it is sufficient. A service account is a principal that gets credentials minted on demand from an attested workload identity. Possession of the principal isn't a thing. You have to be the workload.

Three resources in the Claude Console express the trust relationship:

1. Service Account (`svac_...`)

A non-human principal in your Anthropic org. No email, no password, no Console login. It's the identity a federated token acts as. Joins workspaces like a human member. Minted tokens inherit that workspace's rate limits and usage attribution.

2. Federation Issuer (`fdis_...`)

Registers an OIDC provider with two key fields:

Issuer URL — must match the iss claim in your IdP's JWTs exactly.
JWKS source — discovery (default, hits /.well-known/openid-configuration), explicit_url, or inline for air-gapped clusters.

One issuer per environment. Your prod EKS, your staging EKS, and GitHub Actions are three separate issuers.
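To make the two fields concrete, here is a minimal sketch of how a discovery URL is derived from an issuer and why "exactly" matters for the iss comparison. The function names are mine, purely illustrative; the `/.well-known/openid-configuration` suffix is the one mentioned above, and the strict string comparison is my reading of "must match exactly".

```python
def discovery_url(issuer_url: str) -> str:
    """Build the OIDC discovery document URL for an issuer."""
    # Drop a trailing slash so we never produce a double slash in the path.
    return issuer_url.rstrip("/") + "/.well-known/openid-configuration"


def issuer_matches(registered_issuer: str, token_iss: str) -> bool:
    """Exact string comparison: no normalization, no trailing-slash tolerance."""
    return registered_issuer == token_iss


registered = "https://oidc-discovery.prod.example.com"
print(discovery_url(registered))
print(issuer_matches(registered, "https://oidc-discovery.prod.example.com"))   # True
print(issuer_matches(registered, "https://oidc-discovery.prod.example.com/"))  # False
```

A stray trailing slash is enough to fail the comparison, which is worth checking before blaming the rule.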

3. Federation Rule (`fdrl_...`)

The bridge between issuer and service account: "when a JWT from issuer X has claims matching Y, mint a token for service account Z."

Match conditions:

subject_prefix — exact or trailing-* match
exact audience
exact claim values (key/value map)
a CEL condition expression for complex logic

All matchers must pass. There is no implicit rule search — the client specifies the rule ID in the exchange request, and Anthropic verifies the JWT satisfies that rule. This is a deliberate design choice that prevents "rule confusion" attacks where a token accidentally matches a more permissive rule.
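The "all matchers must pass" semantics can be sketched in a few lines. This is my own illustration, not Anthropic's implementation: the function names and the dict shape of the rule are assumptions, and the CEL condition is left out entirely.

```python
from typing import Any, Mapping, Optional


def subject_matches(pattern: str, sub: str) -> bool:
    """Exact match, or prefix match when the pattern ends with '*'."""
    if pattern.endswith("*"):
        return sub.startswith(pattern[:-1])
    return sub == pattern


def rule_matches(rule: Mapping[str, Any], claims: Mapping[str, Any]) -> bool:
    """Return True only if every matcher configured on the rule passes."""
    pattern: Optional[str] = rule.get("subject_prefix")
    if pattern is not None and not subject_matches(pattern, str(claims.get("sub", ""))):
        return False

    audience = rule.get("audience")
    if audience is not None:
        # The JWT 'aud' claim may be a single string or a list of strings.
        aud = claims.get("aud", [])
        aud_list = [aud] if isinstance(aud, str) else list(aud)
        if audience not in aud_list:
            return False

    # Exact claim values: every configured key must be present and equal.
    for key, expected in rule.get("claims", {}).items():
        if claims.get(key) != expected:
            return False
    return True
```

Note there is no loop over rules here. The caller names one rule and the token either satisfies it or it doesn't, which is the point of the no-implicit-search design.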

The exchange flow

┌──────────────┐  1. Get JWT     ┌───────────┐
│   Workload   │ ──────────────▶ │  Your IdP │
│  (in pod)    │ ◀────────────── │  (SPIRE,  │
└──────┬───────┘   JWT-SVID      │   EKS,    │
       │                          │   GHA…)   │
       │ 2. POST /v1/oauth/token  └───────────┘
       │    (jwt-bearer grant)
       ▼
┌──────────────────────────────────────┐
│  Anthropic token endpoint            │
│  - Verify signature against JWKS     │
│  - Check exp/nbf/iat                 │
│  - Match against federation rule     │
│  - Mint sk-ant-oat01-... (≤ rule TTL)│
└──────────────────────────────────────┘
       │
       │ 3. Bearer token on every API call
       ▼
   api.anthropic.com/v1/messages

Concretely, the SDK construction looks like this:

from anthropic import Anthropic, WorkloadIdentityCredentials, IdentityTokenFile

client = Anthropic(
    credentials=WorkloadIdentityCredentials(
        identity_token_provider=IdentityTokenFile(
            "/var/run/secrets/anthropic.com/token"
        ),
        federation_rule_id="fdrl_...",
        organization_id="00000000-0000-0000-0000-000000000000",
        service_account_id="svac_...",
        workspace_id="wrkspc_...",
    ),
)

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude"}],
)

In production you'd skip the explicit constructor entirely and let the SDK resolve from environment variables — that's the recommended pattern. Ship the same container image everywhere, inject the env per environment:

ANTHROPIC_FEDERATION_RULE_ID=fdrl_...
ANTHROPIC_ORGANIZATION_ID=00000000-...
ANTHROPIC_SERVICE_ACCOUNT_ID=svac_...
ANTHROPIC_WORKSPACE_ID=wrkspc_...
ANTHROPIC_IDENTITY_TOKEN_FILE=/var/run/secrets/anthropic.com/token

Then in code: client = Anthropic(). Done. The SDK reads the file, exchanges for a token, refreshes before expiry, retries on rotation.
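If you're curious what the SDK is doing in step 2 of the diagram, here is a rough sketch of a jwt-bearer request body. The grant_type value and the assertion parameter come from RFC 7523. The `federation_rule_id` field name and the form encoding are my assumptions, not something I've confirmed against the API reference, so treat this as an illustration of the grant shape rather than the real wire format.

```python
from urllib.parse import urlencode

# Grant type URN defined by RFC 7523.
JWT_BEARER_GRANT = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def build_exchange_body(identity_jwt: str, federation_rule_id: str) -> str:
    """Form-encode a jwt-bearer grant request that names the rule explicitly."""
    params = {
        "grant_type": JWT_BEARER_GRANT,
        "assertion": identity_jwt,  # RFC 7523 parameter carrying the IdP JWT
        "federation_rule_id": federation_rule_id,  # hypothetical field name
    }
    return urlencode(params)


body = build_exchange_body("eyJ...header.payload.signature", "fdrl_...")
print(body)
```

In practice you let the SDK do this. The sketch is only here to show that the exchange carries two things: the identity token and the rule it claims to satisfy.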

Token lifetime — the smart part

Most "short-lived token" systems get this wrong. Anthropic got it right.

The minted token's lifetime is the lesser of:

The rule's token_lifetime_seconds (60s to 24h, default 1h)

Twice the remaining IdP JWT lifetime, with a 60-second floor

That second bound is what matters. It prevents an Anthropic token from significantly outliving the upstream identity it was derived from. If your SPIRE JWT-SVID has a 5-minute TTL (SPIRE's default), a token exchanged right after issuance can live at most 10 minutes regardless of what the rule says, and less if the SVID is already partway through its lifetime (down to the 60-second floor).

Upstream attestation is the binding constraint — exactly the property you want.
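The arithmetic is small enough to write down. A minimal sketch, assuming the 60-second floor applies to the "twice the remaining lifetime" bound (that's how I read it) and that everything is measured in whole seconds. The function name is mine.

```python
def minted_token_lifetime(rule_ttl_seconds: int, idp_jwt_remaining_seconds: int) -> int:
    """Lesser of the rule TTL and twice the remaining IdP JWT lifetime (60s floor)."""
    upstream_bound = max(60, 2 * idp_jwt_remaining_seconds)
    return min(rule_ttl_seconds, upstream_bound)


# Rule allows 1h, fresh 5-minute SVID: the upstream bound wins.
print(minted_token_lifetime(3600, 300))  # 600
# Same rule, SVID with 10 seconds left: the 60-second floor applies.
print(minted_token_lifetime(3600, 10))   # 60
# Tight rule, fresh SVID: the rule wins.
print(minted_token_lifetime(120, 300))   # 120
```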

The SDK runs a two-tier refresh modeled on botocore:

Tier	Trigger	Behavior
Advisory	`expiry - 120s`	Best-effort exchange. Falls back to cached token on failure.
Mandatory	`expiry - 30s`	Failed exchange raises an error. Cached token too close to expiry.

And it re-reads ANTHROPIC_IDENTITY_TOKEN_FILE on every exchange, so rotated projected tokens (Kubernetes service-account tokens, SPIFFE JWT-SVIDs from spiffe-helper) get picked up transparently. No app restart. No human in the loop.
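Here's a sketch of what a two-tier refresh looks like in code. This is not the SDK's implementation: the class, the injected `exchange` callable, and the injectable clock are all hypothetical, and only the 120s and 30s windows come from the table above.

```python
import time
from typing import Callable, Optional, Tuple

ADVISORY_WINDOW = 120   # seconds before expiry: try to refresh, tolerate failure
MANDATORY_WINDOW = 30   # seconds before expiry: refresh must succeed


class TokenCache:
    """Caches a token and refreshes it using advisory and mandatory windows."""

    def __init__(
        self,
        exchange: Callable[[], Tuple[str, float]],  # returns (token, expiry_epoch)
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._exchange = exchange
        self._clock = clock
        self._token: Optional[str] = None
        self._expiry = 0.0

    def get(self) -> str:
        remaining = self._expiry - self._clock()
        if self._token is not None and remaining > ADVISORY_WINDOW:
            return self._token  # still fresh, no exchange needed
        try:
            self._token, self._expiry = self._exchange()
        except Exception:
            # Advisory tier: fall back to the cached token.
            # Mandatory tier (or nothing cached yet): surface the failure.
            if self._token is None or remaining <= MANDATORY_WINDOW:
                raise
        return self._token
```

The `exchange` callable is where a real client would re-read the identity token file and call the token endpoint, so a rotated file gets picked up on the next refresh.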

SPIFFE on Anthropic — the cleanest path

If you're running SPIRE, Anthropic has a first-class SPIFFE provider and the integration is genuinely well-designed. Here's the full setup.

SPIRE side

server.conf:

server {
    trust_domain         = "prod.example.com"
    jwt_issuer           = "https://oidc-discovery.prod.example.com"
    default_jwt_svid_ttl = "5m"
}

Two non-obvious things here:

jwt_issuer MUST equal the OIDC Discovery Provider's public URL — that exact string is what you register with Anthropic. Mismatch = 400 invalid_grant. This is the #1 cause of failed setups.
default_jwt_svid_ttl ≤ 1 hour. Anthropic's token-exchange endpoint rejects identity tokens with longer lifetimes. SPIRE's default is fine.

OIDC Discovery Provider config:

domains = ["oidc-discovery.prod.example.com"]

server_api {
    address = "unix:///run/spire/sockets/private/api.sock"
}

acme {
    email        = "..."
    tos_accepted = true
}

Workload registration entry:

spire-server entry create \
    -spiffeID spiffe://prod.example.com/ns/inference/sa/worker \
    -parentID spiffe://prod.example.com/spire/agent/k8s_psat/prod-cluster/NODE_UID \
    -selector k8s:ns:inference \
    -selector k8s:sa:worker

For cluster-wide registration, parent to a node alias instead of a single agent ID — otherwise you're pinned to one node.

spiffe-helper sidecar config:

agent_address = "/run/spire/sockets/agent.sock"
cert_dir      = "/var/run/secrets/anthropic.com"
daemon_mode   = true

jwt_svids = [{
    jwt_audience       = "https://api.anthropic.com"
    jwt_svid_file_name = "token"
}]

Anthropic side

Federation issuer:

{
  "name": "spire-prod",
  "issuer_url": "https://oidc-discovery.prod.example.com",
  "jwks_source": "discovery"
}

Federation rule:

{
  "name": "spire-inference-worker",
  "issuer_id": "fdis_...",
  "match": {
    "subject_prefix": "spiffe://prod.example.com/ns/inference/sa/worker",
    "audience": "https://api.anthropic.com"
  },
  "target": {
    "type": "service_account",
    "service_account_id": "svac_..."
  },
  "workspace_id": "wrkspc_...",
  "oauth_scope": "workspace:developer",
  "token_lifetime_seconds": 600
}

Kubernetes deployment — the volume detail nobody mentions

This is the one operational detail people miss:

volumes:
  - name: anthropic-token
    emptyDir:
      medium: Memory   # ← THIS LINE

Use a memory-backed emptyDir shared between spiffe-helper and the application container. The bearer JWT-SVID never touches the node's disk. Same pattern as Vault Agent token sinks. Same reason: bearer tokens on persistent storage are a postmortem waiting to happen.

Validation before wiring up the SDK

Always validate the JWT-SVID claims before you trust your federation rule:

spire-agent api fetch jwt \
    -audience https://api.anthropic.com \
    -socketPath /run/spire/sockets/agent.sock \
  | awk '/^[[:space:]]*eyJ/{print $1; exit}' \
  | jq -rR 'split(".")[1] | gsub("-";"+") | gsub("_";"/") | @base64d | fromjson'

Check:

iss matches the OIDC Discovery Provider URL you registered
sub is the workload's SPIFFE ID
aud contains https://api.anthropic.com

If any of those don't match what your federation rule expects, the exchange returns 400 invalid_grant with no useful diagnostic on the client side. Validate the claims first.
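If you'd rather do the same check from Python (say, in a startup probe), here's a minimal sketch. It decodes the payload without verifying the signature, so it's a debugging aid only. The helper names and the expected values are mine, taken from the example config in this post.

```python
import base64
import json
from typing import Any, Dict, List


def decode_jwt_claims(token: str) -> Dict[str, Any]:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload_b64 = token.split(".")[1]
    # base64url segments in JWTs are unpadded; restore padding before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def check_claims(
    claims: Dict[str, Any], expected_iss: str, expected_sub: str, expected_aud: str
) -> List[str]:
    """Return a list of human-readable mismatches (empty means all good)."""
    problems = []
    if claims.get("iss") != expected_iss:
        problems.append(f"iss is {claims.get('iss')!r}, expected {expected_iss!r}")
    if claims.get("sub") != expected_sub:
        problems.append(f"sub is {claims.get('sub')!r}, expected {expected_sub!r}")
    aud = claims.get("aud", [])
    aud_list = [aud] if isinstance(aud, str) else list(aud)
    if expected_aud not in aud_list:
        problems.append(f"aud {aud_list!r} does not contain {expected_aud!r}")
    return problems
```

Usage: read the token file, call `decode_jwt_claims`, and fail loudly if `check_claims` returns anything. That gives you the diagnostic the 400 response doesn't.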

Three SPIFFE gotchas

1. Always set the audience matcher on the rule. Without it, the rule accepts JWT-SVIDs minted for any relying party. If the same workload also calls some other SaaS via SPIFFE, a token meant for that SaaS could exchange for an Anthropic token. Always pin audience.

2. Inline JWKS = you own rotation. SPIRE rotates signing keys frequently. If you registered the issuer with inline JWKS (air-gapped clusters), you must add new keys before workloads present them, and remove superseded keys after tokens signed with them expire. Stale keys in inline JWKS remain trusted indefinitely.

3. One issuer per trust domain. Each SPIRE trust domain has its own signing keys and OIDC Discovery Provider. Register each as a separate Anthropic federation issuer.

Mapping onto the IETF AIMS stack

This is where it gets interesting for anyone tracking the agentic identity standards work.

In draft-klrc-aiagent-auth, AIMS is 8 layers:

Identifiers → Credentials → Attestation → Provisioning → Authentication → Authorization → Observability → Policy

Anthropic WIF doesn't implement all eight. But it implements the bottom five correctly, and that's exactly the foundation the upper layers need.

AIMS Layer	What it requires	Anthropic WIF
Identifiers	Cryptographic, runtime-issued	SPIFFE ID via `sub` claim, or IdP-native subject
Credentials	Short-lived, attested, no secrets at rest	OIDC JWT exchanged for OAuth access token. Zero static secrets.
Attestation	Identity bound to what the workload is	Inherited from upstream IdP (SPIRE selectors, IRSA pod identity, GHA repo+workflow claims)
Provisioning	Federated trust, declarative	Console-configured issuer + rule. CEL for complex policy.
Authentication	Standards-based, verifiable	RFC 7523 JWT-bearer. JWKS validated. `iss`/`aud`/`exp`/`nbf`/`iat` enforced.
Authorization	Scoped, least-privilege	`workspace:developer` scope, workspace-bound, rate-limited
Observability	Audit chain	Service account attribution per request
Policy	Centralized enforcement	CEL match expressions, per-rule scoping

What's notably not here: the upper-layer agentic authorization primitives — OAuth Token Exchange with act claims, Transaction Tokens, Rich Authorization Requests (RAR), CAEP for real-time revocation.

That's not a criticism. Those belong at your gateway, not at the LLM provider's auth endpoint.

Which brings me to the most important point in this whole post.

The trap: this is workload auth, NOT user delegation

Here's the single most consequential thing to understand about Anthropic WIF, and it's hiding in plain sight: the caller is treated as a workload. There is no user delegation semantics here.

This isn't OAuth's authorization code flow. There's no user identity riding through the exchange. The federated token represents the workload that called Anthropic — not the user that asked the agent to do something. And in any real agentic deployment, a user is almost always at the top of the call chain.

That gap is where confused deputy bugs live.

Let me make this concrete.

Anthropic WIF answers: "Is this workload allowed to call Claude on behalf of this Anthropic service account, with these rate limits, in this workspace?"

Anthropic WIF does NOT answer: "Is Alice allowed to ask Claude to summarize Bob's salary data?"

There is no act claim. No user identity propagated to Anthropic. From Anthropic's perspective, every request from your gateway looks like the same service account. The user is invisible to them — by design, because that's how a workload-to-API trust boundary should work.

This is the classic confused deputy setup:

Your AgentGateway holds workload credentials (now: WIF tokens) that grant access to Claude.
Users delegate tasks to agents. Agents call through the gateway.
If the gateway doesn't enforce user authorization at its own boundary, an authenticated agent acting for a low-privilege user can ask the LLM to operate on data that user shouldn't see.
The LLM has no way to know.

The upstream WIF token only proves "the gateway said this is a legit workload call." It says nothing about which user triggered the call, what they're allowed to do, or whether the prompt content respects their authorization scope.
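To show where the check has to live, here's a deliberately tiny sketch of a gateway that authorizes the user before it spends its workload credential. Everything here is hypothetical: the `AgentRequest` shape, the in-memory ACL, and the `call_claude` callable stand in for whatever your gateway actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass(frozen=True)
class AgentRequest:
    user: str       # the human at the top of the call chain
    resource: str   # the data the prompt will operate on
    prompt: str


class Forbidden(Exception):
    """Raised when the user is not allowed to touch the requested resource."""


def authorize_then_forward(
    request: AgentRequest,
    acl: Dict[str, Set[str]],
    call_claude: Callable[[str], str],
) -> str:
    """Enforce user-level authorization BEFORE using the workload token."""
    allowed = acl.get(request.user, set())
    if request.resource not in allowed:
        # Deny at the gateway: the LLM provider never sees the user.
        raise Forbidden(f"{request.user} may not access {request.resource}")
    return call_claude(request.prompt)
```

The WIF token only ever appears inside `call_claude`. If the check above is missing, every user effectively inherits the gateway's access.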

The layered model that actually works

┌───────────────────────────────────────────────────────────┐
│  USER                                                     │
│  ↓ (OIDC auth code flow, MFA, IdP session)                │
│  AGENT-FACING APP                                         │
│  ↓ (OAuth Token Exchange — adds `act` claim)              │
│  AGENT  ←── Transaction Token (RAR-scoped: "summarize     │
│  ↓                              doc X, max 1 LLM call")   │
│  AGENTGATEWAY  ←── enforces user policy + scope intersect │
│  ↓ (Anthropic WIF: SPIFFE JWT-SVID → sk-ant-oat01-...)    │
│  CLAUDE API                                               │
└───────────────────────────────────────────────────────────┘

Read top-to-bottom: user identity rides through OAuth Token Exchange with act claims, Transaction Tokens scope the specific operation, the gateway enforces user-level authorization, and Anthropic WIF handles the workload-to-LLM hop. Each layer answers a different question. None are interchangeable.

If you skip the user layer, WIF is still a massive upgrade over API keys — you've eliminated stored secrets, gained short-lived tokens, gained per-workload attribution. But you have not solved agentic identity. You've solved infrastructure identity.

The migration trap that will bite you

Buried in the docs:

ANTHROPIC_API_KEY sits above the federation tiers, so a leftover key in the environment silently shadows federation.

Translation: you can configure WIF perfectly, deploy, smoke-test, ship — and still be using the old API key. Because credential precedence puts ANTHROPIC_API_KEY above the federation env vars, the federation code path simply never runs.

The migration sequence that actually works:

1. Stand up federation in parallel. Leave ANTHROPIC_API_KEY in place.
2. Run `ant auth status` from inside the workload.
   At this stage: the API key wins. That's expected.
3. Unset ANTHROPIC_API_KEY EVERYWHERE:
     - CI secrets
     - Container env (Deployment manifests, Helm values)
     - Shell profiles
     - Any sidecar that injects it
4. Re-run `ant auth status`. Confirm the federation source is selected.
5. ONLY NOW: revoke the API key in the Console.

Step 3 is the high-risk step: one leftover injection anywhere keeps the old key in play, and federation never runs. I'd add: instrument your gateway logs to alert on requests carrying the sk-ant-api03-... prefix after cutover. If that prefix shows up after step 5, you have a stowaway. Could be a CronJob, a CI workflow, a debug pod, a contractor's laptop.
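A cheap guard you can run at container startup, alongside `ant auth status`: a sketch that mirrors the precedence described above. It is not the SDK's resolver. The function names are mine, and treating all five federation variables as required is an assumption.

```python
from typing import Mapping

# Federation variables from the env example earlier in this post.
FEDERATION_VARS = (
    "ANTHROPIC_FEDERATION_RULE_ID",
    "ANTHROPIC_ORGANIZATION_ID",
    "ANTHROPIC_SERVICE_ACCOUNT_ID",
    "ANTHROPIC_WORKSPACE_ID",
    "ANTHROPIC_IDENTITY_TOKEN_FILE",
)


def effective_credential_source(env: Mapping[str, str]) -> str:
    """Report which credential source would win, given the stated precedence."""
    if env.get("ANTHROPIC_API_KEY"):
        return "api_key"      # a leftover key shadows federation
    if all(env.get(name) for name in FEDERATION_VARS):
        return "federation"
    return "none"


def is_static_api_key(credential: str) -> bool:
    """True for the static-key prefix you should never see after cutover."""
    return credential.startswith("sk-ant-api03-")
```

Call `effective_credential_source(os.environ)` at startup and refuse to boot in prod if it returns `"api_key"`. The prefix check is the same idea applied to gateway logs.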

What this means for platform architecture

If you're running Claude in production today, three things change:

1. The threat model shifts from "key custody" to "issuer trust"

You're no longer worried about a static key leaking from a Vault KV mount, a CI log, a Slack message, or a developer laptop. You're worried about whether your IdP is correctly attesting workload identity.

The blast radius of compromise goes from "anyone with the key can be us" to "the attacker needs to compromise our IdP and satisfy the federation rule's match conditions during a short token window".

2. Audit and attribution become per-workload by default

Service account IDs flow into Anthropic's usage and rate limit attribution. Combined with your gateway logs, you can trace a single Claude inference back to: which workspace, which service account, which workload (via the federation rule + JWT sub), which user request (via correlation ID).
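On the gateway side, that trace is just one structured log line per inference. A minimal sketch, with field names I made up for illustration. Your log schema will differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InferenceAuditRecord:
    correlation_id: str       # ties the call back to the user request
    workspace_id: str
    service_account_id: str
    federation_rule_id: str
    workload_subject: str     # the JWT 'sub', e.g. a SPIFFE ID


def to_log_line(record: InferenceAuditRecord) -> str:
    """Serialize the record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = InferenceAuditRecord(
    correlation_id="req-123",
    workspace_id="wrkspc_...",
    service_account_id="svac_...",
    federation_rule_id="fdrl_...",
    workload_subject="spiffe://prod.example.com/ns/inference/sa/worker",
)
print(to_log_line(record))
```

The correlation ID is the only field Anthropic never sees. It exists purely so your logs can be joined with their usage attribution.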

That's the audit chain regulators will eventually require for AI inference in regulated industries.

3. The gateway's job gets more important, not less

WIF closes the workload-to-LLM hop. The user-to-agent and agent-to-tool hops are still yours to enforce.

AgentGateway, with SPIFFE for workload identity and OAuth Token Exchange + RAR for user delegation, is where confused-deputy attacks get prevented. WIF is necessary but not sufficient.

Closing thought

Anthropic shipped Workload Identity Federation. RFC 7523 JWT-bearer grant. First-class support for AWS, GCP, Azure, GitHub Actions, Kubernetes, SPIFFE, Okta. Service account model. Short-lived tokens. Two-tier SDK refresh. Per-workload attribution.

For platform teams running Claude in regulated environments — especially on Kubernetes with SPIRE — this is the API key killer we've been waiting for. It maps cleanly onto the bottom five layers of the IETF AIMS stack.

But it is workload identity, not user delegation. The agent calling through your gateway still needs OAuth Token Exchange with act claims for user context, Transaction Tokens for operation scoping, and gateway-level policy enforcement to prevent confused-deputy bugs.

Static credentials die at the gateway. Dynamic attested tokens live at every hop.

The shift continues:

Stop asking "does it have the right key?"
Start asking "what IS this entity, do we trust it, what do we expect from it?"

If you found this useful, follow along — the next post in this series digs into AgentGateway implementation patterns: SPIFFE attestation, WIMSE proof tokens across proxies, and OAuth Token Exchange with act claim chaining for user delegation. The piece WIF doesn't solve.
