<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: kt</title>
    <description>The latest articles on DEV Community by kt (@kanywst).</description>
    <link>https://dev.to/kanywst</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3700180%2F04651b63-c6a1-4069-b356-a0f85c17e0bb.png</url>
      <title>DEV Community: kt</title>
      <link>https://dev.to/kanywst</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kanywst"/>
    <language>en</language>
    <item>
      <title>JWT (Access Token) vs X.509 Deep Dive: How to Choose What You Present as a Credential</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Mon, 15 Jun 2026 13:54:03 +0000</pubDate>
      <link>https://dev.to/kanywst/jwt-access-token-vs-x509-deep-dive-how-to-choose-what-you-present-as-a-credential-34ij</link>
      <guid>https://dev.to/kanywst/jwt-access-token-vs-x509-deep-dive-how-to-choose-what-you-present-as-a-credential-34ij</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;I was poking around a service mesh when something struck me as odd.&lt;/p&gt;

&lt;p&gt;Inside the mesh, Pod-to-Pod traffic authenticated with &lt;strong&gt;mTLS&lt;/strong&gt;: both sides showed each other an X.509 certificate. But the same request, when it came from outside through an API Gateway, carried a &lt;strong&gt;JWT&lt;/strong&gt; in &lt;code&gt;Authorization: Bearer eyJ...&lt;/code&gt;. Same job ("who are you, and may you come in"), yet the thing being handed over changed depending on where you stood.&lt;/p&gt;

&lt;p&gt;At first I assumed it was just historical mess. The more I dug, though, the more it turned out to be a clean split that comes down to &lt;strong&gt;one axis&lt;/strong&gt;: "bearer, or proof-of-possession?"&lt;/p&gt;

&lt;p&gt;This article lines up JWT (access tokens) and X.509 (mTLS) from the angle of "what do you present as a credential," and walks through when to pick which, top to bottom. It starts from the basics, so you can follow it even if you are not deep into OAuth or TLS.&lt;/p&gt;




&lt;h2&gt;
  
  
  Setup: what "present a credential to get authorized" actually means
&lt;/h2&gt;

&lt;p&gt;Let me get the vocabulary straight first. The "getting authorized" scene that runs through this whole article has three actors.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F01-setup-three-actors.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F01-setup-three-actors.png" alt="Three actors: issuer, presenter, verifier, and the allow/deny decision" width="800" height="1453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Issuer&lt;/strong&gt;: creates the credential and hands it out. For a JWT that is the IdP (authorization server); for X.509 it is the Certificate Authority (CA).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Presenter&lt;/strong&gt;: holds the credential it received and shows it to the other side on every request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verifier&lt;/strong&gt;: checks whether the presented credential is genuine and whose it is, then decides to let it through or reject it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A "credential" is the &lt;strong&gt;ID badge&lt;/strong&gt; you present in step 2. It is structurally the same as showing your driver's license to prove your age. The question is what you use as that badge. Here the path forks into two families.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JWT (access token)&lt;/strong&gt;: a signed JSON, sent in the &lt;code&gt;Authorization&lt;/code&gt; header.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X.509 certificate&lt;/strong&gt;: a digital certificate presented during the TLS handshake. The "m" in mTLS (mutual) means both sides present one, the client as well as the server.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The two differ in how they look and where they ride, but the essential difference is a single thing. That is next.&lt;/p&gt;




&lt;h2&gt;
  
  
  The core: bearer, or proof-of-possession
&lt;/h2&gt;

&lt;p&gt;Every difference grows from here. Credentials split into two kinds by "can you use it just by holding it."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bearer type&lt;/strong&gt;: anyone holding the thing can use it. Like &lt;strong&gt;cash&lt;/strong&gt;. Drop your cash and whoever picks it up can spend it. JWT access tokens are basically this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proof-of-possession type&lt;/strong&gt;: showing the thing is not enough. You have to &lt;strong&gt;prove every time&lt;/strong&gt; that it is really yours. Closer to a &lt;strong&gt;bank card with a PIN&lt;/strong&gt;. Steal the card and you still cannot move money without the PIN (the private key). X.509 + mTLS is this.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F02-bearer-vs-pop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F02-bearer-vs-pop.png" alt="Bearer type is usable once stolen; proof-of-possession needs the private key" width="800" height="805"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This one picture is the most important in the article. Almost everything else is a consequence of it.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If a JWT is stolen, the thief gets to act as "you" directly (a replay attack). That is why the standard move for JWTs is to &lt;strong&gt;keep their lifetime short&lt;/strong&gt; so the damage window stays small.&lt;/li&gt;
&lt;li&gt;An X.509 certificate is more or less public information, but you cannot open a TLS session without holding the &lt;strong&gt;matching private key&lt;/strong&gt;. Sniff the wire and copy the certificate, and you still cannot impersonate "you," because you lack the private key.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;"So just always use X.509, right?" Not quite. Proof-of-possession carries its own costs (distributing, storing, and rotating private keys, plus the proxy problem covered later). That is exactly why you need to choose.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cracking each one open
&lt;/h2&gt;

&lt;p&gt;Now that the axis is clear, let me look at the two real artifacts.&lt;/p&gt;

&lt;h3&gt;
  
  
  JWT (access token): a signed JSON flowing through the app layer
&lt;/h3&gt;

&lt;p&gt;A JWT is a string split into three parts by &lt;code&gt;.&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9 . eyJzdWIiOiJzdmMtYSIsImF1ZCI6... . SflKxwRJSMeKKF2QT4f...
|------------ header -----------|    |--------- payload --------|    |----- signature ----|
  alg (sign method), typ              sub (who), aud (for whom), exp    signed with private key
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
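
&lt;p&gt;To make the three-part layout concrete, here is a minimal Python sketch that splits a token and base64url-decodes the header and payload. It reuses the header segment shown above; the payload values and the function names (&lt;code&gt;b64url_decode&lt;/code&gt;, &lt;code&gt;peek_jwt&lt;/code&gt;) are my own placeholders. It only peeks at the contents and does not verify the signature, so never trust its output for an access decision.&lt;/p&gt;

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment, restoring the '=' padding first."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def b64url_encode(data: bytes) -> str:
    """Encode bytes as base64url without padding, the form JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def peek_jwt(token: str) -> tuple[dict, dict]:
    """Return (header, payload) WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload


# Header segment copied from the example above; the payload is made up.
ARTICLE_HEADER = "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9"
demo_claims = {"sub": "svc-a", "aud": "inventory", "exp": 1750000000}
demo_payload = b64url_encode(json.dumps(demo_claims).encode("utf-8"))
demo_token = f"{ARTICLE_HEADER}.{demo_payload}.placeholder-signature"

header, payload = peek_jwt(demo_token)
print(header)  # {'alg': 'RS256', 'typ': 'JWT'}
print(payload["sub"], payload["aud"])  # svc-a inventory
```

&lt;p&gt;The takeaway is that the header and payload are only encoded, not encrypted. Anyone who sees the token can read it; the signature is what makes it trustworthy.&lt;/p&gt;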



&lt;p&gt;The verifier checks the signature using the key set the issuer publishes (the &lt;strong&gt;JWKS&lt;/strong&gt;: JSON Web Key Set). The point is that the &lt;strong&gt;verifier does not have to call the issuer on every request&lt;/strong&gt;: it fetches the public keys, caches them, and verifies locally. If the signature checks out, it can trust the token's contents (who, until when, for whom). This is &lt;strong&gt;stateless verification&lt;/strong&gt;. Delivery is via an HTTP header.&lt;/p&gt;

&lt;p&gt;Some access tokens are not JWTs at all: these are &lt;strong&gt;opaque tokens&lt;/strong&gt;. For those the verifier calls the issuer's introspection endpoint on every request to confirm validity (stateful). When this article says "access token," it means the &lt;strong&gt;JWT form&lt;/strong&gt; that supports the stateless verification above.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET /api/orders HTTP/1.1
Host: api.example.com
Authorization: Bearer eyJhbGciOiJSUzI1NiI...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since this lives at the application layer (L7), anything that speaks HTTP can carry it. A proxy or a Gateway can pass it straight through as "just a header." That matters later.&lt;/p&gt;

&lt;h3&gt;
  
  
  X.509 certificate: an identity proven during the TLS handshake
&lt;/h3&gt;

&lt;p&gt;X.509 is presented in the middle of opening a TLS connection. Ordinary HTTPS has only the server present a certificate, but in &lt;strong&gt;mTLS&lt;/strong&gt; the client presents one too, and on top of that &lt;strong&gt;proves possession by signing with its private key&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F03-mtls-handshake.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F03-mtls-handshake.png" alt="mTLS handshake: the client proves key possession before any HTTP data flows" width="800" height="661"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The verifier walks the certificate up to a CA it trusts (the &lt;strong&gt;trust bundle&lt;/strong&gt;) to confirm legitimacy, and then uses the &lt;code&gt;CertificateVerify&lt;/code&gt; signature to confirm the peer "really holds the private key." The defining trait is that authentication finishes &lt;strong&gt;before a single byte of application data flows&lt;/strong&gt;. This happens in the TLS layer beneath HTTP, closer to the transport layer (L4) than to the application layer (L7).&lt;/p&gt;
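
&lt;p&gt;On the verifier side, "require a client certificate" usually comes down to one setting. The sketch below uses Python's standard &lt;code&gt;ssl&lt;/code&gt; module; the function name and the file paths in the comments are hypothetical, and a real server would also need its own certificate and a trust bundle loaded before accepting connections.&lt;/p&gt;

```python
import ssl


def make_mtls_server_context() -> ssl.SSLContext:
    """Server-side TLS context that refuses clients without a valid certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a trusted CA and proves it holds the key.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


ctx = make_mtls_server_context()
# In a real deployment (paths are hypothetical):
# ctx.load_cert_chain("server.crt", "server.key")        # this server's identity
# ctx.load_verify_locations(cafile="trust-bundle.pem")   # CAs allowed to sign clients
print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
```

&lt;p&gt;Because the check lives in the handshake, application code behind this context never sees a request from an unauthenticated peer.&lt;/p&gt;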

&lt;p&gt;That pins down where each one lives: a JWT is &lt;strong&gt;data flowing through the app layer&lt;/strong&gt;, and X.509 is &lt;strong&gt;an identity carved into the connection itself&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lining them up on seven axes
&lt;/h2&gt;

&lt;p&gt;With both artifacts in view, let me put them side by side. First a table for the big picture, then notes on the axes that matter most.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;JWT (access token)&lt;/th&gt;
&lt;th&gt;X.509 (mTLS)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Trust model&lt;/td&gt;
&lt;td&gt;Bearer&lt;/td&gt;
&lt;td&gt;Proof-of-possession&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Theft resistance&lt;/td&gt;
&lt;td&gt;Weak (steal it, use it)&lt;/td&gt;
&lt;td&gt;Strong (useless without the private key)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layer it works at&lt;/td&gt;
&lt;td&gt;App layer (L7 / HTTP header)&lt;/td&gt;
&lt;td&gt;Transport layer (TLS handshake)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Survives a proxy / LB&lt;/td&gt;
&lt;td&gt;Yes (just a header)&lt;/td&gt;
&lt;td&gt;No, if the proxy terminates TLS (the mTLS session ends there)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verification&lt;/td&gt;
&lt;td&gt;Check signature via JWKS (stateless)&lt;/td&gt;
&lt;td&gt;CA chain (trust bundle) + signature check&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Revocation&lt;/td&gt;
&lt;td&gt;Let it expire via a short &lt;code&gt;exp&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;CRL / OCSP, or short-lived certs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Operational cost&lt;/td&gt;
&lt;td&gt;Low (no key distribution)&lt;/td&gt;
&lt;td&gt;High (distribute, store, renew keys). Automated in a mesh&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Three axes are worth expanding.&lt;/p&gt;

&lt;h3&gt;
  
  
  Axis 1: can it cross a proxy
&lt;/h3&gt;

&lt;p&gt;This is the one that bites hardest in production. mTLS is authentication bound to "&lt;strong&gt;this point-to-point connection&lt;/strong&gt;." Put an L7 load balancer or API Gateway in the middle that terminates TLS, and the client's identity &lt;strong&gt;vanishes&lt;/strong&gt; there. From the point of view of a service behind the Gateway, the peer is the Gateway, not the original client.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F04-proxy-traversal.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F04-proxy-traversal.png" alt="At a TLS-terminating proxy, mTLS identity vanishes while a JWT passes through as a header" width="800" height="758"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A JWT is "data," so the Gateway can forward it deeper as a header untouched. So &lt;strong&gt;if you want to carry identity end to end across several proxy hops, use a JWT&lt;/strong&gt;. That is the answer to the mystery from the opening: cross the Gateway and it becomes a JWT.&lt;/p&gt;

&lt;h3&gt;
  
  
  Axis 2: how you revoke
&lt;/h3&gt;

&lt;p&gt;When you want to cancel a credential, the two strategies are inverted.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JWT&lt;/strong&gt;: once a signed token is out, the verifier never calls the issuer, so "revoke it after the fact" is hard. So you make the lifetime &lt;strong&gt;short from the start&lt;/strong&gt; (minutes to tens of minutes) and wait for a leaked one to expire on its own. A long-lived JWT is a red flag.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X.509&lt;/strong&gt;: you can announce "this certificate is no longer valid" via a revocation list (&lt;strong&gt;CRL&lt;/strong&gt;) or &lt;strong&gt;OCSP&lt;/strong&gt;. But the verifier has to go check that, which is operationally heavy. So rather than leaning on CRLs, the current trend is to make the &lt;strong&gt;certificate itself short-lived&lt;/strong&gt; (SPIRE defaults to one hour, for example) and reissue it frequently.&lt;/li&gt;
&lt;/ul&gt;
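
&lt;p&gt;For the JWT side, "short-lived" can be enforced by the verifier rather than just hoped for. Here is a minimal sketch, assuming the token carries both &lt;code&gt;exp&lt;/code&gt; and &lt;code&gt;iat&lt;/code&gt; claims and that the signature has already been verified. The 15-minute cap and the function name are my own illustrative choices, not a standard.&lt;/p&gt;

```python
import time
from typing import Optional

# Hypothetical policy: refuse tokens that were minted to live longer than this.
MAX_LIFETIME_SECONDS = 15 * 60


def check_lifetime(claims: dict, now: Optional[float] = None) -> None:
    """Raise ValueError if the token is expired or was issued with too long a lifetime."""
    current = time.time() if now is None else now
    exp, iat = claims["exp"], claims["iat"]
    if current >= exp:
        raise ValueError("token expired")
    if exp - iat > MAX_LIFETIME_SECONDS:
        raise ValueError("token lifetime exceeds policy")


# A 5-minute token checked 100 seconds after issuance passes silently.
check_lifetime({"iat": 1000, "exp": 1300}, now=1100)
```

&lt;p&gt;The second check is the interesting one: it turns "a long-lived JWT is a red flag" into something the verifier rejects outright.&lt;/p&gt;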

&lt;p&gt;The funny part is that both roads end up in the same place: "&lt;strong&gt;keep it short-lived to dodge the problem.&lt;/strong&gt;"&lt;/p&gt;

&lt;h3&gt;
  
  
  Axis 3: operational cost
&lt;/h3&gt;

&lt;p&gt;X.509 means handing a private key to each workload, storing it safely, and renewing it on a schedule. That is the real source of "certificates are painful." But today a service mesh like Istio or Linkerd, or SPIFFE/SPIRE, does &lt;strong&gt;this distribution and renewal fully automatically&lt;/strong&gt;. The old common sense that "mTLS is operationally heavy," from the days of handing out certificates by hand, has largely faded in a mesh.&lt;/p&gt;

&lt;p&gt;The JWT side does not need to distribute a private key to clients (only the issuer holds the signing key), and the verifier just pulls the public key from the JWKS. The barrier to adoption is low.&lt;/p&gt;




&lt;h2&gt;
  
  
  Know how each one breaks
&lt;/h2&gt;

&lt;p&gt;Before you choose, it helps to know how each gets defeated, so your judgment does not wobble. The entry point for an attacker is completely different between the two.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JWT &lt;code&gt;alg:none&lt;/code&gt;&lt;/strong&gt;: the biggest landmine in JWT. Set the header &lt;code&gt;alg&lt;/code&gt; to &lt;code&gt;none&lt;/code&gt; and a buggy implementation may accept the token as valid with no signature at all, a classic vulnerability. You close it with a sane library and an operational rule that pins the allowed &lt;code&gt;alg&lt;/code&gt; values (the SPIFFE JWT-SVID spec limits &lt;code&gt;alg&lt;/code&gt; to nine values and rejects &lt;code&gt;none&lt;/code&gt; and the symmetric &lt;code&gt;HS*&lt;/code&gt; family).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JWT theft and replay&lt;/strong&gt;: the bearer curse. Minimize the damage with a short &lt;code&gt;exp&lt;/code&gt;, sending over TLS, and never logging it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X.509 private key leak&lt;/strong&gt;: since possession of the private key is the whole basis of proof, a leaked key is game over. So keep keys off plain disk, ideally locked inside a TPM / HSM or in memory, and rotate them by keeping them short-lived.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X.509 CA compromise&lt;/strong&gt;: take over the CA and you can mint any fake identity. Managing the trust bundle matters.&lt;/li&gt;
&lt;/ul&gt;
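
&lt;p&gt;Pinning &lt;code&gt;alg&lt;/code&gt; is a few lines of code. The sketch below checks the header against an allowlist before any signature work happens. The nine names are the asymmetric RS/ES/PS algorithms as I understand the JWT-SVID spec to list them; treat the exact set as an assumption and check it against the spec you actually follow. The function name is mine.&lt;/p&gt;

```python
# Assumed allowlist: asymmetric algorithms only, so "none" and HS* never pass.
ALLOWED_ALGS = frozenset({
    "RS256", "RS384", "RS512",
    "ES256", "ES384", "ES512",
    "PS256", "PS384", "PS512",
})


def require_allowed_alg(header: dict) -> str:
    """Return the header's alg if it is on the allowlist, else raise ValueError."""
    alg = header.get("alg")
    if alg not in ALLOWED_ALGS:
        raise ValueError(f"rejected alg: {alg!r}")
    return alg


print(require_allowed_alg({"alg": "ES256", "typ": "JWT"}))  # ES256
```

&lt;p&gt;Most JWT libraries expose the same idea as a parameter (PyJWT's &lt;code&gt;decode&lt;/code&gt; takes an &lt;code&gt;algorithms&lt;/code&gt; list, for example). The rule is to have the verifier decide which algorithms are acceptable and never let the token decide.&lt;/p&gt;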

&lt;p&gt;X.509 says "as long as you protect the private key, being watched on the wire is fine." JWT says "as long as you protect the channel (TLS) it is easy to adopt, but a leak means instant impersonation." The point you have to defend is different.&lt;/p&gt;




&lt;h2&gt;
  
  
  The hybrid: bearer ergonomics + proof-of-possession strength
&lt;/h2&gt;

&lt;p&gt;"I want the convenience of a JWT, but I do not want it used after being stolen." Two mechanisms answer that greed. Both bolt proof-of-possession onto a JWT.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F05-hybrid-pop.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F05-hybrid-pop.png" alt="Cert-bound tokens and DPoP both add proof-of-possession to a plain JWT" width="799" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Certificate-bound access token (RFC 8705)&lt;/strong&gt;: at issuance, the hash (thumbprint) of the client certificate is stamped into the token. The verifier checks that "the thumbprint written in the token" matches "the certificate on the mTLS connection it currently holds." A stolen token alone is useless because you cannot open the matching mTLS connection. Used in high-security domains like Open Banking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DPoP (RFC 9449)&lt;/strong&gt;: for situations where you cannot run mTLS, it binds the token, at the app layer, to a key the client holds. A "proof" signed with that key is attached to each request, and the verifier cross-checks it against the token.&lt;/li&gt;
&lt;/ul&gt;
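
&lt;p&gt;The certificate-bound check is small enough to sketch. RFC 8705 puts the thumbprint in the token's &lt;code&gt;cnf&lt;/code&gt; claim under &lt;code&gt;x5t#S256&lt;/code&gt;, as the base64url-encoded SHA-256 of the DER-encoded certificate. The function names below are mine, the demo bytes are not a real certificate, and the sketch assumes the token signature was already verified.&lt;/p&gt;

```python
import base64
import hashlib
import hmac


def x5t_s256(cert_der: bytes) -> str:
    """base64url(SHA-256(DER certificate)) without padding."""
    digest = hashlib.sha256(cert_der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_cert(claims: dict, peer_cert_der: bytes) -> bool:
    """True only if the token's cnf thumbprint matches the mTLS peer certificate."""
    expected = claims.get("cnf", {}).get("x5t#S256")
    if not isinstance(expected, str):
        return False  # a plain bearer token carries no binding
    actual = x5t_s256(peer_cert_der)
    # Constant-time comparison on bytes.
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))


# Placeholder bytes standing in for a DER-encoded client certificate.
client_cert = b"placeholder-der-bytes-for-client"
other_cert = b"placeholder-der-bytes-for-thief"
claims = {"sub": "svc-a", "cnf": {"x5t#S256": x5t_s256(client_cert)}}

print(token_bound_to_cert(claims, client_cert))  # True
print(token_bound_to_cert(claims, other_cert))   # False
```

&lt;p&gt;In a Python server the DER bytes would typically come from &lt;code&gt;getpeercert(binary_form=True)&lt;/code&gt; on the TLS socket. A thief holding only the token arrives on a different connection, with a different certificate or none, and fails the comparison.&lt;/p&gt;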

&lt;p&gt;In short, "bearer or PoP" is not 0 or 1. You can &lt;strong&gt;add PoP to a JWT and slide toward the middle&lt;/strong&gt;. But every addition costs implementation and operational effort, so for the first decision the plain two-way split is enough.&lt;/p&gt;




&lt;h2&gt;
  
  
  Choosing by example: the SPIFFE guidance makes it concrete
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;SPIFFE&lt;/strong&gt;, the standard for workload identity, carries both formats: the certificate flavor &lt;strong&gt;X509-SVID&lt;/strong&gt; and the JWT flavor &lt;strong&gt;JWT-SVID&lt;/strong&gt;. And the guidance in its docs is blunt, which makes a ready-to-use decision rule. Summarized:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Because JWT-SVIDs are susceptible to replay attacks, use X.509-SVIDs whenever possible. Use JWT-SVID when mTLS is not practical, such as when an L7 proxy or load balancer sits between workloads.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Walk a worked example. An order service calls an inventory service.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inside the mesh, Pod to Pod, direct&lt;/strong&gt;: no L7 proxy in between, and the sidecar rotates certificates for you. Use &lt;strong&gt;X509-SVID (mTLS)&lt;/strong&gt;. Sniffing is useless without the private key, so it resists replay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Across an API Gateway from outside, or with an L7 LB in between&lt;/strong&gt;: mTLS gets terminated at the Gateway and identity vanishes. Put a &lt;strong&gt;JWT-SVID&lt;/strong&gt; in a header and carry it end to end. In return, keep &lt;code&gt;exp&lt;/code&gt; short (a few minutes) to bound the theft risk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F06-decision-tree.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fjwt-vs-x509-credential%2Fdiagrams%2F06-decision-tree.png" alt="Decision tree: proxy in between, key automation, and replay defense lead to JWT or X.509" width="800" height="976"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In real production the standard is a hybrid: "JWT (OAuth) at the user edge, mTLS between services." You do not have to commit to one. The right answer is to &lt;strong&gt;choose per boundary using the decision tree above&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick reference
&lt;/h2&gt;

&lt;p&gt;To put the decision in one place.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;th&gt;Pick&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Service to service, direct inside a mesh&lt;/td&gt;
&lt;td&gt;X.509 + mTLS&lt;/td&gt;
&lt;td&gt;Auto-renewal works, resists replay&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Crossing an L7 proxy / Gateway&lt;/td&gt;
&lt;td&gt;JWT&lt;/td&gt;
&lt;td&gt;mTLS is terminated and identity vanishes; a JWT survives as a header&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;From a browser / mobile&lt;/td&gt;
&lt;td&gt;JWT&lt;/td&gt;
&lt;td&gt;Distributing a private key to the client is impractical&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Many services trust the same issuer&lt;/td&gt;
&lt;td&gt;JWT&lt;/td&gt;
&lt;td&gt;Verifiable via JWKS, no certificate distribution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Strong defense against replay needed&lt;/td&gt;
&lt;td&gt;X.509, or a cert-bound / DPoP JWT&lt;/td&gt;
&lt;td&gt;Close the bearer weakness with proof-of-possession&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Must revoke immediately&lt;/td&gt;
&lt;td&gt;X.509 (CRL/OCSP), or a short-lived JWT&lt;/td&gt;
&lt;td&gt;JWT is bad at after-the-fact revocation; substitute short lifetimes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
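
&lt;p&gt;If it helps, the table can be read as a small function you run once per boundary. This is a deliberately simplified sketch of the rows above, not a complete policy: the &lt;code&gt;Boundary&lt;/code&gt; fields and the returned labels are my own naming, and it ignores the revocation row.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """One hop between a presenter and a verifier (field names are illustrative)."""
    tls_terminating_proxy: bool        # an L7 proxy / Gateway sits in between
    client_is_browser_or_mobile: bool  # handing the client a private key is impractical
    needs_strong_replay_defense: bool = False


def pick_credential(boundary: Boundary) -> str:
    """Map a boundary to the credential family the quick reference suggests."""
    mtls_impractical = (
        boundary.tls_terminating_proxy or boundary.client_is_browser_or_mobile
    )
    if not mtls_impractical:
        return "X.509 + mTLS"
    if boundary.needs_strong_replay_defense:
        return "cert-bound / DPoP JWT"
    return "short-lived JWT"


# The worked example: order service calling inventory service.
print(pick_credential(Boundary(False, False)))  # X.509 + mTLS
print(pick_credential(Boundary(True, False)))   # short-lived JWT
```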

&lt;p&gt;And the one line to remember goes back to the first picture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A JWT works if you hold it (bearer). X.509 works if you hold it and can prove it with the private key (proof-of-possession).&lt;/strong&gt; Almost every other difference follows from that.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The two families of credential you hand over for authorization, JWT (access token) and X.509 (mTLS), can be organized along one axis: &lt;strong&gt;bearer or proof-of-possession&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;A JWT is easy and crosses proxies, but a stolen one gets used. So you protect it by keeping it &lt;strong&gt;short-lived&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;X.509 is strong: a sniffed certificate is useless without the private key. Its weaknesses are that &lt;strong&gt;identity is cut off at a proxy&lt;/strong&gt; and that &lt;strong&gt;keys need operating&lt;/strong&gt;. In a mesh, automation makes the operations light.&lt;/li&gt;
&lt;li&gt;If you want both, you can bolt proof-of-possession onto a JWT with &lt;strong&gt;RFC 8705 certificate-bound tokens&lt;/strong&gt; or &lt;strong&gt;DPoP&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;In practice it is not either-or; you &lt;strong&gt;choose per boundary with the decision tree&lt;/strong&gt;. SPIFFE's "X.509 whenever possible, JWT when a proxy is in the way" works directly as the rule.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Next time you look at your own system, go boundary by boundary and check, "is what flows here a bearer credential, or a proof-of-possession one?" If a long-lived bearer token is flowing along a near-plaintext path, that is your most dangerous spot.&lt;/p&gt;

</description>
      <category>security</category>
      <category>authentication</category>
      <category>jwt</category>
      <category>tls</category>
    </item>
    <item>
      <title>a2claude: Turn Claude Code Into a Server Other AI Agents Can Call</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Sun, 14 Jun 2026 07:54:12 +0000</pubDate>
      <link>https://dev.to/kanywst/a2claude-turn-claude-code-into-a-server-other-ai-agents-can-call-1mf6</link>
      <guid>https://dev.to/kanywst/a2claude-turn-claude-code-into-a-server-other-ai-agents-can-call-1mf6</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;Running a single AI agent is starting to feel dated. You have an agent that researches, one that plans, one that writes code, each with a role, handing work to each other. When I tried to wire up a setup like that, I got stuck on one question: what do I use for the "writes code" slot?&lt;/p&gt;

&lt;p&gt;I already had Claude Code. It actually runs tools, edits files, and runs tests for you. The catch is that Claude Code is built to be driven by a human sitting at a terminal. There is no built-in door for another agent to call it automatically.&lt;/p&gt;

&lt;p&gt;So I wrote a small tool that exposes Claude Code as an &lt;strong&gt;A2A protocol server&lt;/strong&gt;. It is called &lt;code&gt;a2claude&lt;/code&gt;. Another agent sends "implement this feature" over A2A, a real Claude Code session runs against the project you pointed it at, and the result streams back. And what comes back is not a blob of text: it is structured information about which tools ran, which files changed and how, what it cost, and which actions need approval.&lt;/p&gt;

&lt;p&gt;The repo is here: &lt;a href="https://github.com/kanywst/a2claude" rel="noopener noreferrer"&gt;github.com/kanywst/a2claude&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article is written so that reading top to bottom takes you from knowing nothing about A2A or the internals of Claude Code to understanding exactly what a2claude bridges and how.&lt;/p&gt;

&lt;h2&gt;
  
  
  Background 1: What A2A is
&lt;/h2&gt;

&lt;p&gt;A2A (Agent2Agent) is a &lt;strong&gt;communication standard for AI agents to talk to each other&lt;/strong&gt;. Google announced it in 2025, donated it to the Linux Foundation the same year, and it is now governed there under neutral stewardship (v1.0 landed in 2026). It runs over HTTP using one of three bindings, JSON-RPC, gRPC, or REST (the spec requires all three to expose the same operations), and it standardizes the exchange where one agent asks another to do a job.&lt;/p&gt;

&lt;p&gt;There are only four terms you need to hold onto.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent Card&lt;/strong&gt;: the business card that states what an agent can do. It lives at a fixed path, &lt;code&gt;/.well-known/agent-card.json&lt;/code&gt;, and a caller reads it first to decide whether this agent can handle what it wants done.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Task&lt;/strong&gt;: one request. It carries state and moves like &lt;code&gt;submitted&lt;/code&gt; (received) → &lt;code&gt;working&lt;/code&gt; (in progress) → &lt;code&gt;completed&lt;/code&gt; (done). Along the way it can also hit &lt;code&gt;input-required&lt;/code&gt; (waiting on input) or &lt;code&gt;failed&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Artifact&lt;/strong&gt;: the output of a Task. Generated text, files, and so on. It can be returned incrementally (streamed).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;contextId&lt;/strong&gt;: the conversation handle. Send the next Task with the same contextId and it is treated as a continuation of the previous one.&lt;/li&gt;
&lt;/ul&gt;
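
&lt;p&gt;To anchor those terms, here is a minimal Python sketch of the Agent Card location and the Task states named above. It is my own modeling for illustration, not code from a2claude or an A2A SDK: the host name is made up, and which states count as "finished" here is an assumption based only on the descriptions in this list.&lt;/p&gt;

```python
from enum import Enum
from urllib.parse import urljoin

# Fixed path where a caller looks for the Agent Card.
AGENT_CARD_PATH = "/.well-known/agent-card.json"


class TaskState(str, Enum):
    SUBMITTED = "submitted"            # received
    WORKING = "working"                # in progress
    INPUT_REQUIRED = "input-required"  # paused, waiting on the caller
    COMPLETED = "completed"            # done
    FAILED = "failed"


# Assumption: completed and failed end a Task; input-required only pauses it.
FINISHED_STATES = frozenset({TaskState.COMPLETED, TaskState.FAILED})


def agent_card_url(base_url: str) -> str:
    """Build the Agent Card URL for an agent's base URL."""
    return urljoin(base_url, AGENT_CARD_PATH)


def is_finished(state: TaskState) -> bool:
    """True if no further updates are expected for a Task in this state."""
    return state in FINISHED_STATES


print(agent_card_url("https://agent.example.com"))
# https://agent.example.com/.well-known/agent-card.json
print(is_finished(TaskState("input-required")))  # False
```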

&lt;p&gt;The Task state transitions look like this. It pays to keep this in your head, because how a2claude uses these states matters later.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F01-task-lifecycle.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F01-task-lifecycle.png" alt="A2A Task lifecycle, with input-required highlighted" width="800" height="846"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A2A has other states too, such as &lt;code&gt;rejected&lt;/code&gt; (the agent declines the request), but the ones a2claude actually uses are in this diagram. The one in red, &lt;code&gt;input-required&lt;/code&gt;, becomes the star of the second half.&lt;/p&gt;

&lt;p&gt;The point is that A2A is a shared language between agents: it does not matter what an agent runs on the inside (Claude or some other LLM), as long as it honors the Agent Card and the Task exchange, the conversation works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Background 2: What Claude Code is
&lt;/h2&gt;

&lt;p&gt;Claude Code is Anthropic's coding agent. It is not a chat that only returns prose. It also:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;runs commands with &lt;code&gt;Bash&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;reads and writes files with &lt;code&gt;Read&lt;/code&gt; / &lt;code&gt;Edit&lt;/code&gt; / &lt;code&gt;Write&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;stops to ask for approval&lt;/strong&gt; before risky actions&lt;/li&gt;
&lt;li&gt;reports the cost and turn count of a run&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, it actually moves its hands. The thing to notice for this article is that when Claude Code runs, &lt;strong&gt;a lot of information beyond text comes out&lt;/strong&gt;: which tool it called, which file it changed and how, what it cost. a2claude does not throw that away. It puts it on A2A.&lt;/p&gt;

&lt;h2&gt;
  
  
  Background 3: Why a bridge is needed
&lt;/h2&gt;

&lt;p&gt;A2A is the protocol for agents to talk. Claude Code is the powerful tool that actually writes code. The two do not line up. Claude Code has no A2A server door, and from the A2A world Claude Code is invisible.&lt;/p&gt;

&lt;p&gt;So you need a translator in between. It takes an A2A Task, drives Claude Code, and translates the events Claude Code emits back into the language of A2A. That is what a2claude does.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F02-bridge-overview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F02-bridge-overview.png" alt="a2claude sitting between the calling agent and Claude Code" width="800" height="1231"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How this differs from the usual approach
&lt;/h2&gt;

&lt;p&gt;Adapters that "wrap a coding agent in A2A" are not rare. But most of them &lt;strong&gt;flatten both input and output to text&lt;/strong&gt;. You hand over a prompt and get text back. That erases everything that happened in between. From the calling agent's point of view, you cannot see whether a file was rewritten, what spent money, or whether an action needed approval.&lt;/p&gt;

&lt;p&gt;a2claude keeps the structure that comes out of Claude Code and carries it onto A2A. The difference between the left and right of the diagram below is the whole reason this project exists.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F03-flatten-vs-structured.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F03-flatten-vs-structured.png" alt="Text-only flattening versus structured events kept by a2claude" width="799" height="339"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What maps to what
&lt;/h2&gt;

&lt;p&gt;The heart of a2claude comes down to one mapping table: "what Claude Code emits" onto "which A2A surface it lands on". This table, which is also in the README, is the core of the design.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What Claude Code emits&lt;/th&gt;
&lt;th&gt;The A2A surface it lands on&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Assistant text&lt;/td&gt;
&lt;td&gt;A streamed artifact (&lt;code&gt;append&lt;/code&gt; / &lt;code&gt;last_chunk&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A tool call (Bash, Edit, ...)&lt;/td&gt;
&lt;td&gt;A &lt;code&gt;working&lt;/code&gt; status update&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A file edit&lt;/td&gt;
&lt;td&gt;A named artifact carrying the diff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Run result&lt;/td&gt;
&lt;td&gt;Cost / turns / usage on the completion message&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session id&lt;/td&gt;
&lt;td&gt;Mapped to the A2A &lt;code&gt;contextId&lt;/code&gt; to resume next time&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This mapping is confined to one place in the code (&lt;code&gt;executor.py&lt;/code&gt;). That is what pays off in the next section.&lt;/p&gt;
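
&lt;p&gt;To make the table concrete, here is a minimal sketch of it as a lookup. The event kind names, the &lt;code&gt;Event&lt;/code&gt; class, and &lt;code&gt;to_a2a_surface&lt;/code&gt; are hypothetical names chosen for illustration; this is not the actual content of &lt;code&gt;executor.py&lt;/code&gt;.&lt;/p&gt;

```python
from dataclasses import dataclass

# Hypothetical event kinds; the real names in a2claude may differ.
SURFACES = {
    "assistant_text": "streamed artifact (append / last_chunk)",
    "tool_call": "working status update",
    "file_edit": "named artifact carrying the diff",
    "run_result": "cost / turns / usage on the completion message",
    "session_id": "A2A contextId, stored to resume next time",
}


@dataclass(frozen=True)
class Event:
    kind: str
    payload: dict


def to_a2a_surface(event):
    """Return the A2A surface an event lands on, per the table above."""
    try:
        return SURFACES[event.kind]
    except KeyError:
        # Refuse to drop unknown events silently.
        raise ValueError(f"unmapped event kind: {event.kind}") from None


print(to_a2a_surface(Event("tool_call", {"tool": "Bash"})))
```

&lt;p&gt;Keeping the mapping in a single table like this means an unmapped event fails loudly instead of being flattened away.&lt;/p&gt;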

&lt;h2&gt;
  
  
  Architecture: the A2A layer does not know Claude directly
&lt;/h2&gt;

&lt;p&gt;a2claude is split into layers. The single most important design decision is that &lt;strong&gt;the layer doing the A2A translation never imports the SDK that drives Claude Code&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In between sits an abstraction called a "backend". A backend's job is to drive Claude Code and emit &lt;strong&gt;normalized events&lt;/strong&gt; (text, tool calls, file changes, permission requests, run results). The translation layer looks only at those events and maps them onto A2A. How Claude is invoked (through the SDK today, through the raw CLI later) is none of the translation layer's business.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F04-architecture-layers.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F04-architecture-layers.png" alt="Layered architecture: the A2A layer talks to a backend abstraction, not the SDK" width="800" height="1401"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Because of this split there are two backends.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;echo&lt;/code&gt;&lt;/strong&gt;: no API key and no Claude install required. A dummy that just mirrors the input. It lets you exercise the server, the protocol mapping, and the CLI &lt;strong&gt;end to end, offline&lt;/strong&gt;. It reproduces every path, including the permission round trip that comes up later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;claude&lt;/code&gt;&lt;/strong&gt;: drives the real Claude Code through the Claude Agent SDK.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why is "the translation layer not knowing the SDK" worth it? Because testing gets easy. The &lt;code&gt;echo&lt;/code&gt; backend alone can verify all of the A2A behavior. Even in CI, with no API key for calling Claude, you can check that the design has not broken.&lt;/p&gt;
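
&lt;p&gt;The split can be sketched roughly as follows. This is an illustration under assumptions, not a2claude's real interface: the &lt;code&gt;EchoBackend&lt;/code&gt; class, the event dictionaries, and &lt;code&gt;translate&lt;/code&gt; are invented here. The point is only that the translating function touches normalized events and nothing else.&lt;/p&gt;

```python
import asyncio


class EchoBackend:
    """Hypothetical dummy backend: mirrors the prompt, needs no API key."""

    async def drive(self, prompt):
        yield {"kind": "text", "text": prompt}
        yield {"kind": "result", "cost_usd": 0.0, "turns": 1}


async def translate(backend, prompt):
    """Translation layer: sees only normalized events, never an SDK."""
    updates = []
    async for event in backend.drive(prompt):
        if event["kind"] == "text":
            updates.append(("artifact", event["text"]))
        elif event["kind"] == "result":
            updates.append(("completed", event["turns"]))
    return updates


print(asyncio.run(translate(EchoBackend(), "fix the failing test")))
```

&lt;p&gt;Any object with a compatible &lt;code&gt;drive&lt;/code&gt; could be passed in its place, which is what lets a test suite run without credentials.&lt;/p&gt;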

&lt;h2&gt;
  
  
  Inside: an event queue and "parking"
&lt;/h2&gt;

&lt;p&gt;This is the most interesting part of a2claude. A backend's &lt;code&gt;drive&lt;/code&gt; runs as a &lt;strong&gt;background task&lt;/strong&gt; and pushes the events it produces onto a queue. The translation layer (the consumer) pulls them off in order with &lt;code&gt;drain&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;While things are flowing normally it is just a producer / consumer. The interesting case is when you hit a tool that needs approval. Claude Code stops and waits. At that point a2claude &lt;strong&gt;parks the background task right at the permission request&lt;/strong&gt;, and pushes only a "please approve this" event onto the queue. &lt;code&gt;drain&lt;/code&gt; returns that one event and then halts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F05-queue-and-park.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F05-queue-and-park.png" alt="The event queue: drain stops when a permission request parks the background task" width="800" height="609"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;While it is halted, the Claude session does not die: it waits. Later, when the caller sends its answer, the parked task resumes and starts flowing again. This trick of "keeping the session alive across two separate calls" is what makes the next section's permission exchange possible.&lt;/p&gt;
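
&lt;p&gt;A minimal &lt;code&gt;asyncio&lt;/code&gt; sketch of the parking idea, assuming a future is used as the parking spot. The function bodies, the &lt;code&gt;DONE&lt;/code&gt; sentinel, and the event shapes are assumptions for illustration; only the names &lt;code&gt;drive&lt;/code&gt; and &lt;code&gt;drain&lt;/code&gt; come from the description above.&lt;/p&gt;

```python
import asyncio

DONE = object()  # sentinel marking the end of the run


async def drive(queue, decisions):
    """Producer: runs as a background task and pushes events."""
    await queue.put({"kind": "text", "text": "about to run a command"})
    decision = asyncio.get_running_loop().create_future()
    decisions.append(decision)
    await queue.put({"kind": "permission", "tool": "Bash"})
    allowed = await decision  # parked here until the caller answers
    await queue.put({"kind": "text", "text": "ran" if allowed else "skipped"})
    await queue.put(DONE)


async def drain(queue):
    """Consumer: pull events in order; halt after a permission request."""
    events = []
    while True:
        event = await queue.get()
        if event is DONE:
            return events
        events.append(event)
        if event["kind"] == "permission":
            return events


async def main():
    queue, decisions = asyncio.Queue(), []
    task = asyncio.create_task(drive(queue, decisions))
    first = await drain(queue)        # first call: stops at the request
    decisions.pop().set_result(True)  # second call: the caller approves
    second = await drain(queue)       # the parked task resumes
    await task
    return first, second


first, second = asyncio.run(main())
print([e["kind"] for e in first], [e["text"] for e in second])
```

&lt;p&gt;The background task is never cancelled between the two &lt;code&gt;drain&lt;/code&gt; calls; it simply waits on the future, which is why the session survives the gap.&lt;/p&gt;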

&lt;h2&gt;
  
  
  Permissions: do not auto-approve, do not silently skip
&lt;/h2&gt;

&lt;p&gt;How you handle actions that need approval is the most nerve-wracking part of opening a coding agent up to someone else (another agent). A sloppy implementation either "auto-approves everything" or "silently skips anything that needs approval". Both are dangerous.&lt;/p&gt;

&lt;p&gt;a2claude uses the &lt;strong&gt;A2A &lt;code&gt;input-required&lt;/code&gt; state&lt;/strong&gt;. This is not something a2claude invented: A2A defines it precisely for human-in-the-loop, where a human or the caller makes a decision mid-task. Per the spec, when a Task reaches &lt;code&gt;input-required&lt;/code&gt; the processing stops there and control returns to the caller. a2claude rides directly on that.&lt;/p&gt;

&lt;p&gt;When a tool that needs approval shows up, the Task stops at &lt;code&gt;input-required&lt;/code&gt; and asks the caller "approve this action?". The caller sends its answer as a message &lt;strong&gt;on the same Task&lt;/strong&gt;. &lt;code&gt;allow&lt;/code&gt; (or &lt;code&gt;yes&lt;/code&gt;, &lt;code&gt;approve&lt;/code&gt;, &lt;code&gt;ok&lt;/code&gt;) approves; anything else denies.&lt;/p&gt;

&lt;p&gt;Thanks to the park from the previous section, the Claude session is still alive while stopped, so when the approval comes back it picks up &lt;strong&gt;right where it left off&lt;/strong&gt;.&lt;/p&gt;
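
&lt;p&gt;The reply rule is deny-by-default, which can be sketched in a few lines. The function name &lt;code&gt;is_approved&lt;/code&gt; is hypothetical, and trimming whitespace and ignoring case are assumptions of this sketch rather than documented a2claude behavior.&lt;/p&gt;

```python
# The four approval words listed above; everything else is a denial.
APPROVALS = {"allow", "yes", "approve", "ok"}


def is_approved(reply):
    """True only for an explicit approval word; anything else denies."""
    return reply.strip().lower() in APPROVALS


for reply in ("allow", "ok", "sure, go ahead", ""):
    print(repr(reply), is_approved(reply))
```

&lt;p&gt;Matching against an allowlist of words, rather than looking for a "no", means an ambiguous or empty answer can never be read as consent.&lt;/p&gt;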

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F06-permission-round-trip.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F06-permission-round-trip.png" alt="Permission round-trip across two calls while the session stays parked" width="800" height="835"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One more important point: &lt;strong&gt;the server does not inherit the personal Claude settings of whoever runs it&lt;/strong&gt;. Your local Claude Code may have a pre-approved tool allowlist ("this tool is always OK"), but the server does not load it. So when it acts on behalf of another agent, any action needing approval always routes back to the caller for a decision. Read-only actions that are already considered safe still run without a prompt, as before.&lt;/p&gt;

&lt;h2&gt;
  
  
  Session continuity: remembering "the previous turn"
&lt;/h2&gt;

&lt;p&gt;Claude Code holds a conversation session and can keep working with the prior context. a2claude &lt;strong&gt;ties that Claude session id to the A2A contextId&lt;/strong&gt; and remembers it. Send the next Task with the same contextId and the same Claude conversation resumes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F07-session-continuity.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2claude-claude-code-as-a2a-agent%2Fdiagrams%2F07-session-continuity.png" alt="Session continuity: the same contextId resumes the same Claude conversation" width="800" height="846"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;"Add a &lt;code&gt;/health&lt;/code&gt; endpoint" → "now add a test for it" goes through as one continuous piece of work, not two separate requests.&lt;/p&gt;
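
&lt;p&gt;The bookkeeping behind this can be as small as a dictionary. The sketch below is an assumption about the shape of the idea, not a2claude's code: &lt;code&gt;SessionStore&lt;/code&gt; and its methods are hypothetical, and an in-memory map is used for brevity.&lt;/p&gt;

```python
class SessionStore:
    """Remembers which Claude session belongs to which A2A contextId."""

    def __init__(self):
        self._by_context = {}

    def resume_id(self, context_id):
        """Session to resume for this context, or None for a fresh one."""
        return self._by_context.get(context_id)

    def remember(self, context_id, session_id):
        """Record the session id reported at the end of a run."""
        self._by_context[context_id] = session_id


store = SessionStore()
print(store.resume_id("ctx-1"))         # first Task: nothing to resume
store.remember("ctx-1", "session-abc")
print(store.resume_id("ctx-1"))         # follow-up Task: same conversation
```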

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;From here on we actually run it. You need three things.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.13 or newer&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.astral.sh/uv/" rel="noopener noreferrer"&gt;uv&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;(only for the &lt;code&gt;claude&lt;/code&gt; backend) the Claude Code CLI on your &lt;code&gt;PATH&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;First clone and install dependencies.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/kanywst/a2claude
&lt;span class="nb"&gt;cd &lt;/span&gt;a2claude
uv &lt;span class="nb"&gt;sync&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  First, offline (the echo backend)
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;echo&lt;/code&gt; backend needs no API key and no Claude install, so running the whole path offline first is a low-risk way to check the setup. Start the server in the background, then call it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run a2claude serve &lt;span class="nt"&gt;--backend&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &amp;amp;
uv run a2claude call &lt;span class="s2"&gt;"fix the failing test"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What comes back looks like this. A Task and context id are assigned, and after the stream the completion metadata (cost and turn count) is attached.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;task 189b1c63-1a7b-4908-87c4-c8f3bba8f6b5
context 0b2a901e-2b6f-4c56-bba2-d0da546936e9

  · Echo
fix the failing test
[completed] $0.0 · 1 turns
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Point it at a real project (the claude backend)
&lt;/h3&gt;

&lt;p&gt;Now the &lt;code&gt;claude&lt;/code&gt; backend. Use &lt;code&gt;--cwd&lt;/code&gt; to set the directory Claude Code works in.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run a2claude serve &lt;span class="nt"&gt;--backend&lt;/span&gt; claude &lt;span class="nt"&gt;--cwd&lt;/span&gt; /path/to/project &amp;amp;
uv run a2claude call &lt;span class="s2"&gt;"add a /health endpoint"&lt;/span&gt; &lt;span class="nt"&gt;--url&lt;/span&gt; http://localhost:9100/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Ask for a follow-up
&lt;/h3&gt;

&lt;p&gt;Pass the &lt;code&gt;context&lt;/code&gt; that came back from the previous turn and it becomes a continuation of the same conversation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run a2claude call &lt;span class="s2"&gt;"now add a test for it"&lt;/span&gt; &lt;span class="nt"&gt;--context&lt;/span&gt; &amp;lt;context-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Try the permission exchange
&lt;/h3&gt;

&lt;p&gt;Send an action that needs approval and it stops at &lt;code&gt;input-required&lt;/code&gt; and asks for an answer. The answer goes to the same Task.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run a2claude call &lt;span class="s2"&gt;"sudo reboot"&lt;/span&gt;
&lt;span class="c"&gt;# ... [input-required] Permission requested for Bash: $ sudo reboot&lt;/span&gt;
&lt;span class="c"&gt;#       reply: a2claude call "allow" --task &amp;lt;id&amp;gt; --context &amp;lt;id&amp;gt;&lt;/span&gt;
uv run a2claude call &lt;span class="s2"&gt;"allow"&lt;/span&gt; &lt;span class="nt"&gt;--task&lt;/span&gt; &amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nt"&gt;--context&lt;/span&gt; &amp;lt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;echo&lt;/code&gt; backend also asks for approval when the prompt contains &lt;code&gt;sudo&lt;/code&gt;, so you can verify &lt;strong&gt;just this exchange&lt;/strong&gt; without driving Claude.&lt;/p&gt;

&lt;h3&gt;
  
  
  Peek at the Agent Card
&lt;/h3&gt;

&lt;p&gt;This is the business card the other agent reads first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv run a2claude card
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;a2claude does not lump Claude Code's abilities into one "chat box". It lists them on the Agent Card as six &lt;strong&gt;discrete skills&lt;/strong&gt;: code generation, refactor, debug, review, test, and code explanation. The caller can deliberately target a request at one of them ("this is a debug job").&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;What a2claude does comes down to one mapping table: it maps the structured events Claude Code emits onto A2A's Task, Artifact, status, and contextId. It does not flatten them to text.&lt;/p&gt;

&lt;p&gt;Three design points to single out:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The translation layer does not know Claude's SDK.&lt;/strong&gt; With a backend abstraction in between, &lt;code&gt;echo&lt;/code&gt; alone verifies the whole path offline, and swapping out how Claude is invoked later leaves the translation layer untouched.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permissions map to &lt;code&gt;input-required&lt;/code&gt;.&lt;/strong&gt; Keeping a parked session alive across two calls means it never auto-approves and never silently skips. The decision always returns to the caller.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;contextId makes the conversation continuous.&lt;/strong&gt; Mapping the Claude session id to the A2A context remembers "the previous turn".&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you are looking for a "writes code" slot in a setup where agents hand work to each other, you can drop Claude Code straight in as a component. The code is small enough to read in full, so take a look, and if something is off, open an issue.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/kanywst/a2claude" rel="noopener noreferrer"&gt;github.com/kanywst/a2claude&lt;/a&gt;&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>ai</category>
      <category>agents</category>
      <category>python</category>
    </item>
    <item>
      <title>A2A Protocol Auth, Taken Apart: Why the Spec Is Thin and Where That Leaves Holes</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Sat, 13 Jun 2026 14:26:48 +0000</pubDate>
      <link>https://dev.to/kanywst/a2a-protocol-auth-taken-apart-why-the-spec-is-thin-and-where-that-leaves-holes-22ii</link>
      <guid>https://dev.to/kanywst/a2a-protocol-auth-taken-apart-why-the-spec-is-thin-and-where-that-leaves-holes-22ii</guid>
      <description>&lt;h2&gt;
  
  
  How this started: I opened the auth section and it was nearly empty
&lt;/h2&gt;

&lt;p&gt;A2A (Agent2Agent) keeps showing up as the protocol for letting AI agents talk to each other. Google announced it in April 2025, and the Linux Foundation runs it now.&lt;/p&gt;

&lt;p&gt;"If one agent calls another, there has to be authn and authz in there," I figured, and opened the auth section of the spec. It was an anticlimax. There is no new authentication mechanism defined anywhere. What it says is: "use OAuth2 or OpenID Connect or mTLS," "send credentials in HTTP headers," "advertise your requirements in the Agent Card." That is it.&lt;/p&gt;

&lt;p&gt;My first reaction was "is this just lazy?" But the more I read, the clearer it got that the thinness is intentional. A2A defines only the &lt;strong&gt;frame&lt;/strong&gt; for authn and authz and delegates the contents to existing standards. The catch is that the way it delegates creates holes.&lt;/p&gt;

&lt;p&gt;Everything I have written before about AI agent auth (WIMSE, SPIFFE, ID-JAG, Identity Chaining, Transaction Tokens) shows up here as A2A's "delegation target."&lt;/p&gt;

&lt;h2&gt;
  
  
  Background: just three players to remember
&lt;/h2&gt;

&lt;p&gt;Before the auth details, the minimum cast. A2A is a protocol for one agent to hand a task to another agent. The transport is JSON-RPC over HTTP (there are gRPC and HTTP+JSON bindings too).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Player&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Client Agent (caller)&lt;/td&gt;
&lt;td&gt;The agent handing off a task. The source of the HTTP request&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Remote Agent / A2A Server (callee)&lt;/td&gt;
&lt;td&gt;The agent that receives and works the task. Acts as an HTTP server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agent Card (business card)&lt;/td&gt;
&lt;td&gt;JSON where the Remote Agent publishes its capabilities and "auth requirements." Lives at &lt;code&gt;/.well-known/agent-card.json&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The Agent Card is the one to internalize. "What credential do I need to call this agent?" is written entirely on this card. The Client reads the card first, sets up the right auth, then sends the real request.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F01-players-overview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F01-players-overview.png" alt="A2A players and where auth happens" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The core of A2A auth is already in this picture. &lt;strong&gt;Tokens are issued outside A2A (by an external IdP)&lt;/strong&gt;, and A2A itself only "states requirements in the card" and "receives credentials in headers."&lt;/p&gt;

&lt;h2&gt;
  
  
  When is it actually A2A: the difference from a plain API call
&lt;/h2&gt;

&lt;p&gt;Before the auth internals, let's pin down what is A2A and what is not. If these stay blurred, the later design decisions have nothing to anchor to. Here are the three things that get lumped together.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What you call&lt;/th&gt;
&lt;th&gt;What you pass&lt;/th&gt;
&lt;th&gt;What it is called&lt;/th&gt;
&lt;th&gt;A2A?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;An LLM (an inference API)&lt;/td&gt;
&lt;td&gt;a prompt&lt;/td&gt;
&lt;td&gt;just a model API&lt;/td&gt;
&lt;td&gt;No. The other side is not an agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A tool / data (DB, external API, files)&lt;/td&gt;
&lt;td&gt;a function call&lt;/td&gt;
&lt;td&gt;MCP territory (agent to tool)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Another autonomous agent&lt;/td&gt;
&lt;td&gt;a &lt;strong&gt;task&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;A2A (agent to agent)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Calling an LLM directly is "make the model think." The other side is not an agent. MCP is "use your own hands and feet (tools)," where the other side is subordinate. Only A2A is "&lt;strong&gt;hand work to someone else's agent&lt;/strong&gt;."&lt;/p&gt;

&lt;h3&gt;
  
  
  What is different from a plain API call
&lt;/h3&gt;

&lt;p&gt;To be honest, A2A is mechanically nothing more than HTTP + JSON-RPC. Nothing magic beyond "a service calls a service" happens. What differs is not the wiring but &lt;strong&gt;who the other side is and how you ask&lt;/strong&gt;. The same job, "is this expense (EUR 420, client entertainment) within policy, and convert it to USD while you're at it," looks different written two ways.&lt;/p&gt;

&lt;p&gt;The plain-API way (you know the whole contract and wire it tightly yourself):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;GET  fx-api.com/rate?from=EUR&amp;amp;to=USD
POST policy-service/check  {amount: 420, category: "entertainment"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The A2A way (you do not know the internals; you read the card at runtime and send intent):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. At runtime, read the Agent Card of the other side
   (another team's / company's "expense compliance agent")
   -&amp;gt; "I can do policy checks and currency conversion. Auth is Bearer token."
2. Send intent as a structured message:
   "Is this expense (EUR 420, entertainment) within policy? Also convert to USD."
3. The LLM inside that agent decides on its own which internal APIs to hit
   and how to judge, then returns a reasoned answer plus an attachment (artifact).
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The differences are these three:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You send "intent," not an exact command.&lt;/strong&gt; You do not spell out steps like &lt;code&gt;?from=EUR&amp;amp;to=USD&lt;/code&gt;. You hand over a goal and the other side works out the steps (because it has an LLM).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You do not know the internals / you do not pre-wire.&lt;/strong&gt; You do not build against the other side's API contract. You read the Agent Card at runtime to learn "what it can do and how to auth" on the spot.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The other side is someone else's property.&lt;/strong&gt; You cannot &lt;code&gt;import&lt;/code&gt; it into your code. It is another team's agent running somewhere else.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The one reason it exists: killing the N×N wiring
&lt;/h3&gt;

&lt;p&gt;If the other side is your own service, you do not need A2A; just call the API. The one reason A2A earns its keep is &lt;strong&gt;avoiding the N×N problem&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F02-nxn-wiring.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F02-nxn-wiring.png" alt="Plain APIs N by N wiring versus the A2A common socket" width="800" height="721"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When companies each have agents and want to hand work to each other, plain APIs need wiring per pair, and the number of pairs grows quadratically with the number of agents. If everyone aligns on A2A's one card format and one calling convention, you can hand a task to an agent you have never met before, with no pre-wiring. It is like USB-C: the standard itself is not new technology, the value is that &lt;strong&gt;everyone aligned on the same socket&lt;/strong&gt;. There is also a business angle: vendors do not want to expose raw APIs, but they will open up as an "agent."&lt;/p&gt;
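
&lt;p&gt;As a back-of-the-envelope check on how fast per-pair wiring grows, assume every pair of agents needs its own integration. The function name below is made up for this illustration.&lt;/p&gt;

```python
def pairwise_integrations(n):
    """Number of distinct pairs among n agents: n * (n - 1) / 2."""
    return n * (n - 1) // 2


for n in (3, 10, 50):
    # With a shared protocol, each agent implements it once: n in total.
    print(n, "agents:", pairwise_integrations(n), "pairs vs", n, "implementations")
```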

&lt;h3&gt;
  
  
  Reality: still pre-adoption
&lt;/h3&gt;

&lt;p&gt;Honestly, as of 2026 there are still few sharp "this is the A2A use case" examples. Production adoption is past 150 organizations and shows up in supply chain, financial services, insurance, and IT operations, but the public material stops at adoption counts and verticals, and concrete workflow case studies are thin. The canonical demos are cross-vendor delegation: travel booking (a planner delegates to airline, hotel, and rental-car agents from different companies) and hiring (a manager delegates to sourcing and scheduling agents). Most "multi-agent" today is in-process sub-agents, and for that you do not need A2A. A2A pays off when "make agents across org boundaries talk" becomes real, and it is a standard betting on that.&lt;/p&gt;

&lt;p&gt;That covers "what A2A is and when to use it." Now the real subject: authn and authz. The auth design takes the shape it does precisely because crossing boundaries is the premise.&lt;/p&gt;

&lt;h2&gt;
  
  
  The design philosophy: "treat agents as ordinary apps"
&lt;/h2&gt;

&lt;p&gt;A2A's auth design follows from a single principle.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Treat agents not as something special but as &lt;strong&gt;ordinary enterprise applications&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;How do ordinary web apps authenticate when they call each other's APIs? Put an OAuth token in the &lt;code&gt;Authorization: Bearer&lt;/code&gt; header, put an API key in a header, present a client cert with mTLS. A2A just applies that to agents as-is. It invents nothing new.&lt;/p&gt;

&lt;p&gt;From this principle, four concrete design decisions fall out. This is the whole shape of A2A auth. Going through them in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Don't put identity in the payload. Establish it at the HTTP transport layer.&lt;/li&gt;
&lt;li&gt;Advertise auth requirements in the Agent Card.&lt;/li&gt;
&lt;li&gt;Acquire credentials out-of-band (outside A2A's scope).&lt;/li&gt;
&lt;li&gt;Authorization is per-skill scope based. But actual enforcement is left to external infrastructure.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Decision 1: don't put identity in the payload
&lt;/h3&gt;

&lt;p&gt;The spec says this.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A2A protocol payloads (JSON-RPC messages) don't carry user or client identity information directly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;"Not in the payload" is easy to misread, so look at the actual HTTP request. A2A traffic is JSON-RPC carried over HTTP, and it splits like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;POST&lt;/span&gt; &lt;span class="nn"&gt;/a2a&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;remote-agent.example.com&lt;/span&gt;
&lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bearer eyJhbGci...     # "who is calling" (identity) goes here&lt;/span&gt;
&lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"message/send"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Summarize this PDF"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The payload (the JSON-RPC body) carries only "what you want done." "Who" is carried by the &lt;code&gt;Authorization&lt;/code&gt; token in the HTTP header (or the mTLS client cert).&lt;/strong&gt; That is what "don't put identity in the payload" means.&lt;/p&gt;
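
&lt;p&gt;The same split, sketched from the client side with only the standard library. The helper name &lt;code&gt;build_a2a_request&lt;/code&gt; is hypothetical, the token is the placeholder from the example above, the &lt;code&gt;id&lt;/code&gt; field is added because JSON-RPC requests carry one, and the message shape mirrors this article's simplified example rather than the full A2A message schema. The request is built but not sent.&lt;/p&gt;

```python
import json
import urllib.request


def build_a2a_request(url, token, text):
    """Build (but do not send) a request: identity in header, intent in body."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "message/send",
        "params": {"message": {"text": text}},  # "what", and nothing else
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # "who" lives here
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_a2a_request(
    "https://remote-agent.example.com/a2a", "eyJhbGci...", "Summarize this PDF"
)
print(req.get_header("Authorization"))
print("callerId" in json.loads(req.data)["params"])
```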

&lt;p&gt;The contrast with the design A2A &lt;strong&gt;does not&lt;/strong&gt; use makes it clear. If you did put identity in the payload, it would look like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"method"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"message/send"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"params"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"callerId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"agent-A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"onBehalfOf"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user-123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Summarize this PDF"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A2A avoided this shape. There are three reasons, and they are what make it a "design decision."&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A "who" inside the body is nothing but self-assertion.&lt;/strong&gt; Anyone can write &lt;code&gt;callerId: "admin"&lt;/code&gt;. The header token, on the other hand, is signed by an IdP and cannot be forged. The identity you can actually verify only rides on the header side.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You can offload verification to infrastructure.&lt;/strong&gt; Reading the &lt;code&gt;Authorization&lt;/code&gt; header is the specialty of an API Gateway / reverse proxy / IAM, and none of them has to parse the body (app-specific JSON) to do it. So the A2A SDK itself never needs to know "who is calling" (it stays identity-agnostic).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No double bookkeeping.&lt;/strong&gt; HTTP already has a place for auth (the header). Put it in the body too and you have two places that can disagree.&lt;/li&gt;
&lt;/ol&gt;
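
&lt;p&gt;As a rough client-side sketch of that split (the endpoint URL, the token value, and the helper name &lt;code&gt;build_a2a_request&lt;/code&gt; are made up for illustration, and the body is simplified the same way as the examples above), the request could be assembled like this. The request is only built, never sent.&lt;/p&gt;

```python
import json
import urllib.request


def build_a2a_request(endpoint, access_token, text):
    """Build (but do not send) an A2A-style JSON-RPC request.

    Hypothetical helper: identity goes in the HTTP header,
    the JSON-RPC body only says what should be done.
    """
    body = {
        "jsonrpc": "2.0",
        "id": 1,  # illustrative request id
        "method": "message/send",
        "params": {"message": {"text": text}},
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # "Who" rides on the envelope, not in the payload.
            "Authorization": "Bearer " + access_token,
        },
        method="POST",
    )


req = build_a2a_request(
    "https://agent.example.com/a2a",  # placeholder endpoint
    "example-token",                  # placeholder token
    "Summarize this PDF",
)
print(req.get_header("Authorization"))
print(json.loads(req.data)["params"])
```

&lt;p&gt;Nothing in &lt;code&gt;params&lt;/code&gt; names the caller, so a gateway in front of the agent can decide "who" from the header alone.&lt;/p&gt;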

&lt;p&gt;The "transport layer" here does not mean OSI L4 in the strict sense. It is more like "the HTTP request envelope (headers) that wraps the app message (JSON-RPC)." ID on the envelope, business in the contents.&lt;/p&gt;

&lt;p&gt;This separation also feeds the "holes" discussed later. If identity does not ride in the payload, the A2A message itself cannot carry the context "acting on behalf of user-123." So when you delegate in hops, A to B and B to C, the job of conveying "who the original user is" falls outside A2A (Identity Chaining / Transaction Tokens).&lt;/p&gt;

&lt;h3&gt;
  
  
  Decision 2: advertise auth requirements in the Agent Card
&lt;/h3&gt;

&lt;p&gt;"To call this agent you need a Bearer token (scope &lt;code&gt;read:tasks&lt;/code&gt;)" or "no, we use an API key": requirements differ per agent. A2A makes you declare this in the Agent Card's &lt;code&gt;securitySchemes&lt;/code&gt; and &lt;code&gt;security&lt;/code&gt; fields. Read the card and you know what to bring. Details below.&lt;/p&gt;

&lt;h3&gt;
  
  
  Decision 3: acquire credentials out-of-band
&lt;/h3&gt;

&lt;p&gt;The spec decides how credentials are "used" but not how they are "obtained."&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Credentials for a client agent to connect to a remote agent are obtained by the client agent through an out-of-band process outside the scope of the A2A protocol.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;How you get the token (run an OAuth authorization-code flow, or grab one with client credentials) is outside A2A. The agent operates on the premise that it "already has a token."&lt;/p&gt;

&lt;h3&gt;
  
  
  Decision 4: per-skill scope authorization, enforced externally
&lt;/h3&gt;

&lt;p&gt;An A2A agent lists its "skills" (what it can do) in the Agent Card. Authorization can be applied per skill.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Access can be controlled on a per-skill basis ... specific OAuth scopes should grant an authenticated client access to invoke certain skills but not others.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But the implementation that "checks the scope and rejects" is not something A2A does. That too is external work (a Gateway or IAM, or the app's middleware). The A2A spec only says "do it."&lt;/p&gt;

&lt;h2&gt;
  
  
  Sorting "what A2A defines" from "what it delegates"
&lt;/h2&gt;

&lt;p&gt;Sort the decisions so far cleanly into "what A2A decides itself" and "what it delegates to external standards," and the essence of A2A auth shows.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F03-define-vs-delegate.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F03-define-vs-delegate.png" alt="What A2A defines itself versus what it delegates" width="800" height="1271"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Left (green) is A2A's "frame," right (yellow) is the "delegation targets." &lt;strong&gt;Understanding A2A auth is almost entirely about understanding the left frame&lt;/strong&gt;, and the right is the very standards covered in other articles (ID-JAG, Identity Chaining, WIMSE, and so on).&lt;/p&gt;

&lt;p&gt;From here, take the left frame apart in order.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dissecting the Agent Card: 5 SecurityScheme types
&lt;/h2&gt;

&lt;p&gt;Look at a real Agent Card that declares auth requirements. &lt;code&gt;securitySchemes&lt;/code&gt; defines "what auth methods exist" by name, and the &lt;code&gt;security&lt;/code&gt; array specifies "which methods are required." This notation is borrowed straight from OpenAPI 3.x Security Schemes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Document Processing Agent"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"An agent that analyzes documents"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"capabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"streaming"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"extendedAgentCard"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"securitySchemes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"oauth2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"oauth2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"flows"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"authorizationCode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"authorizationUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://auth.example.com/oauth/authorize"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"tokenUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://auth.example.com/oauth/token"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"scopes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"read:tasks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Read task information"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"write:tasks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Modify tasks"&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"apiKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"apiKey"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"in"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"header"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"X-API-Key"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"security"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"oauth2"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"read:tasks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"write:tasks"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"apiKey"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"skills"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"analyze-document"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Analyze document content"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"signatures"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"protected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How to read this card:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;securitySchemes&lt;/code&gt; defines two methods, &lt;code&gt;oauth2&lt;/code&gt; and &lt;code&gt;apiKey&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;security&lt;/code&gt; array is read as &lt;strong&gt;OR&lt;/strong&gt;. It means "come in with &lt;code&gt;oauth2&lt;/code&gt; (with scopes), or come in with &lt;code&gt;apiKey&lt;/code&gt;."&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;signatures&lt;/code&gt; is for tamper detection on the card itself (more below).&lt;/li&gt;
&lt;/ul&gt;
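
&lt;p&gt;To make the OR reading concrete, here is a minimal sketch of how a client might pick a usable entry from &lt;code&gt;security&lt;/code&gt;. The function name &lt;code&gt;usable_requirements&lt;/code&gt; and the shape of &lt;code&gt;available&lt;/code&gt; are my own assumptions, and it also assumes A2A follows the OpenAPI convention that several schemes inside one entry are all required (AND).&lt;/p&gt;

```python
def usable_requirements(card, available):
    """Return the entries of card["security"] the client can satisfy.

    `available` maps a scheme name to the set of scopes the client holds
    (an empty set for schemes without scopes, such as an API key).
    """
    matches = []
    for requirement in card.get("security", []):
        # Entries are alternatives (OR); every scheme inside one entry
        # must be satisfied together (AND, assumed from OpenAPI).
        ok = all(
            name in available and set(scopes).issubset(available[name])
            for name, scopes in requirement.items()
        )
        if ok:
            matches.append(requirement)
    return matches


card = {
    "security": [
        {"oauth2": ["read:tasks", "write:tasks"]},
        {"apiKey": []},
    ]
}

# A client holding only an API key can use the second entry.
print(usable_requirements(card, {"apiKey": set()}))
# A client whose OAuth token lacks write:tasks matches nothing.
print(usable_requirements(card, {"oauth2": {"read:tasks"}}))
```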

&lt;p&gt;The spec fixes SecurityScheme at &lt;strong&gt;exactly 5 types&lt;/strong&gt;. These are the entirety of A2A's auth methods.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;type&lt;/th&gt;
&lt;th&gt;what it is&lt;/th&gt;
&lt;th&gt;main fields&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;apiKey&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;API key&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;in&lt;/code&gt; (header/query/cookie), &lt;code&gt;name&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;http&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;HTTP auth (Basic/Bearer, etc.)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;scheme&lt;/code&gt; (e.g. "bearer"), &lt;code&gt;bearerFormat&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;oauth2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OAuth 2.0&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;flows&lt;/code&gt; (listed below)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;openIdConnect&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;OIDC discovery&lt;/td&gt;
&lt;td&gt;&lt;code&gt;openIdConnectUrl&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;mtls&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;mutual TLS&lt;/td&gt;
&lt;td&gt;(no extra fields)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The spec also limits what you can write in &lt;code&gt;oauth2&lt;/code&gt;'s &lt;code&gt;flows&lt;/code&gt; to &lt;strong&gt;just three&lt;/strong&gt;: &lt;code&gt;authorizationCode&lt;/code&gt;, &lt;code&gt;clientCredentials&lt;/code&gt;, and &lt;code&gt;deviceCode&lt;/code&gt;. OpenAPI's &lt;code&gt;implicit&lt;/code&gt; and &lt;code&gt;password&lt;/code&gt; are not adopted in A2A (current OAuth security guidance discourages both, so it is a sensible cut). For agent-to-agent traffic (no human in the loop) you pick &lt;code&gt;clientCredentials&lt;/code&gt;, and for human delegation you pick &lt;code&gt;authorizationCode&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Worth noting: &lt;strong&gt;&lt;code&gt;mtls&lt;/code&gt; is in this list of 5&lt;/strong&gt;. Beyond OAuth tokens, certificate-based mutual auth is in the "frame" from the start. That turns out to be the key to closing holes later.&lt;/p&gt;

&lt;h2&gt;
  
  
  The auth flow: read the card, then send the real request
&lt;/h2&gt;

&lt;p&gt;With the cast and card understood, walk the actual auth flow end to end. A2A auth is &lt;strong&gt;discovery-driven&lt;/strong&gt; (it starts from the card). The client first fetches the card with no auth, reads the requirements written there, then sets up auth.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F04-auth-flow-sequence.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F04-auth-flow-sequence.png" alt="A2A discovery-driven auth flow with 401 and 403 branches" width="800" height="746"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three points.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Phase 1 needs no auth.&lt;/strong&gt; Anyone can read the card. So you must not write secrets in the card, and the card's authenticity has to be protected by other means (signing).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phase 2 is outside A2A.&lt;/strong&gt; A2A has nothing to do with how the token is obtained. The diagram uses &lt;code&gt;client_credentials&lt;/code&gt; as an example, but this is entirely the OAuth world.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credentials always go in HTTP headers.&lt;/strong&gt; The spec is blunt: "Credentials MUST be transmitted in standard HTTP headers." Never in the JSON-RPC body. Validation failure is &lt;code&gt;401&lt;/code&gt;, insufficient permission is &lt;code&gt;403&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
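
&lt;p&gt;The 401 / 403 branch in point 3 can be sketched as a tiny decision function. This is only an illustration of the distinction, not anything the A2A SDK provides: &lt;code&gt;decide_status&lt;/code&gt; is a hypothetical name, and it assumes token validation has already happened elsewhere and produced either &lt;code&gt;None&lt;/code&gt; or a claims dict with a space-separated &lt;code&gt;scope&lt;/code&gt; string.&lt;/p&gt;

```python
def decide_status(token_claims, required_scopes):
    """Return the HTTP status a gateway might answer with.

    `token_claims` is None when the credential is missing or failed
    validation, otherwise a dict of already-verified claims.
    """
    if token_claims is None:
        return 401  # could not establish who the caller is
    granted = set(token_claims.get("scope", "").split())
    if not set(required_scopes).issubset(granted):
        return 403  # caller is known but lacks permission
    return 200


print(decide_status(None, ["read:tasks"]))
print(decide_status({"scope": "read:tasks"}, ["read:tasks", "write:tasks"]))
print(decide_status({"scope": "read:tasks write:tasks"}, ["write:tasks"]))
```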

&lt;h2&gt;
  
  
  Extended Agent Card: authenticate and the card grows
&lt;/h2&gt;

&lt;p&gt;A2A has a neat trick: the &lt;strong&gt;Extended Agent Card&lt;/strong&gt;. The public card (readable by anyone) carries only the minimum, and &lt;strong&gt;only authenticated clients get an "extended card" with additional skills and config&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This helps when you want to hide even the existence of a capability. Show only base skills to the outside, and show management skills to authenticated internal agents.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F05-extended-card-flow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F05-extended-card-flow.png" alt="Extended Agent Card retrieval flow" width="800" height="1002"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Spec rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Usable only if the public card has &lt;code&gt;capabilities.extendedAgentCard: true&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The auth for fetching the extended card uses &lt;strong&gt;a scheme declared in the public card's &lt;code&gt;security&lt;/code&gt;&lt;/strong&gt;. In other words, the public card itself tells you which key opens the extended one.&lt;/li&gt;
&lt;li&gt;A client that gets the extended card &lt;strong&gt;replaces&lt;/strong&gt; its cached public card with it (for the duration of the authenticated session, or until the version changes).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What is interesting is that a form of authorization, "the visible capabilities change with authentication," is built into the protocol itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Signing the Agent Card: the card itself is an attack surface
&lt;/h2&gt;

&lt;p&gt;You may have noticed by now. All the auth requirements are written on the card. So &lt;strong&gt;what happens if the card is tampered with&lt;/strong&gt;?&lt;/p&gt;

&lt;p&gt;If an attacker rewrites the card's &lt;code&gt;tokenUrl&lt;/code&gt; to point at their own server, the client goes to the attacker to fetch a token. Plant a prompt injection in the card's &lt;code&gt;description&lt;/code&gt; and you can warp the behavior of a victim agent that reads it. The card is the root of trust, yet in Phase 1 it is handed out with no auth. That is the weak point.&lt;/p&gt;

&lt;p&gt;For this, A2A provides &lt;strong&gt;Agent Card signing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F06-card-signing.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F06-card-signing.png" alt="Agent Card signing on the issuer side and verification on the client side" width="800" height="1700"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The mechanism is two stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Canonicalize with JCS&lt;/strong&gt; (RFC 8785, JSON Canonicalization Scheme): the same JSON content can produce different bytes depending on key order and whitespace. Canonicalize before signing so anyone who processes it gets the same byte string. Skip this and you get the accident where the meaning is identical but signature verification fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sign with JWS&lt;/strong&gt; (RFC 7515): sign the canonicalized bytes and put &lt;code&gt;protected&lt;/code&gt; (the JWS protected header) and &lt;code&gt;signature&lt;/code&gt; into &lt;code&gt;signatures[]&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;
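
&lt;p&gt;Why stage 1 matters is easier to see in code. The sketch below only approximates JCS with the standard library: it is adequate for cards made of strings, booleans, integers, lists, and objects with ASCII keys, but it does not implement RFC 8785's number serialization or its exact key-ordering rule, so treat it as a demonstration rather than a conforming implementation. Dropping &lt;code&gt;signatures&lt;/code&gt; before canonicalizing is my assumption (a signature cannot cover itself), and &lt;code&gt;canonical_bytes&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
import json


def canonical_bytes(card):
    """Approximate JCS for simple cards (not a full RFC 8785 implementation)."""
    # Assumption: the signed content excludes the signatures field itself.
    unsigned = {k: v for k, v in card.items() if k != "signatures"}
    return json.dumps(
        unsigned,
        sort_keys=True,          # fixed key order
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # emit UTF-8 instead of \u escapes
    ).encode("utf-8")


# Same content, different key order and whitespace.
a = json.loads('{"name": "Doc Agent", "capabilities": {"streaming": true}}')
b = json.loads('{ "capabilities": {"streaming": true},   "name": "Doc Agent" }')

print(canonical_bytes(a) == canonical_bytes(b))
```

&lt;p&gt;Both inputs collapse to the same byte string, and those bytes are what stage 2 would sign.&lt;/p&gt;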

&lt;p&gt;But in the spec this is &lt;strong&gt;MAY&lt;/strong&gt; (optional). It is "you may sign," not "you must sign." That becomes one of the holes discussed next.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the holes are: risks born from A2A's delegation design
&lt;/h2&gt;

&lt;p&gt;A2A's "define only the frame, delegate the contents" design is safe if the implementer fills it in properly, but &lt;strong&gt;forget to fill it and it becomes a hole&lt;/strong&gt;. The parts the spec does not bind with MUST become attack surface directly. Here are the main attack surfaces and their roots (how to close them is diagrammed in the next section).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Card tampering / impersonation&lt;/strong&gt;: the signature (&lt;code&gt;signatures&lt;/code&gt;) is optional, and the spec does not even mandate "how to verify the card." An implementation that trusts an unsigned card as-is gets lured to the attacker's server by a rewritten card.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replay&lt;/strong&gt;: &lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt; is a bearer token. Steal it and the thief acts as the legitimate client. A2A itself has no mechanism to "bind a token to a specific client."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Credential leak down the delegation chain&lt;/strong&gt;: in a chain where agent A calls B and B calls C, if A's token unintentionally flows to C, permission leaks. The spec says "credentials SHOULD be bound to the agent which originated the request," but that is a SHOULD, not a MUST.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scope bypass / confused deputy&lt;/strong&gt;: per-skill authorization is also up to the app/Gateway to enforce. Forget the "can this token call this skill" check and any request that passes authentication sails straight past authorization. And when an agent executes someone else's request with "its own strong permissions," you get a textbook confused deputy.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short, whether A2A auth ends up safe hinges on &lt;strong&gt;whether the implementer can promote the parts the spec let off with MAY / SHOULD up to MUST&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing the holes: mTLS + signed cards + existing identity standards
&lt;/h2&gt;

&lt;p&gt;So how do you fill it in concretely? OAuth bearer tokens alone cannot close the holes above. The combination that comes up over and over in practice is the three-piece set "mutual TLS + signed Agent Card + PKI-backed machine identity." These three supply the "proof of the sender" and "authenticity of the card" that a bearer token structurally cannot hold. Mapping this onto A2A's frame and the existing standards covered in other articles looks like this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F07-holes-fixes-standards.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fa2a-protocol-auth-deep-dive%2Fdiagrams%2F07-holes-fixes-standards.png" alt="Holes mapped to fixes and the delegated standards that back them" width="800" height="2063"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Brought down to concrete implementation policy:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Always sign the card. Do not accept an unsigned card.&lt;/strong&gt; Promote the spec's MAY to MUST in your own system. Put issuer verification on PKI (a trusted CA, or a SPIFFE trust domain).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use mTLS for agent-to-agent paths with no human in the loop.&lt;/strong&gt; A2A has the &lt;code&gt;mtls&lt;/code&gt; SecurityScheme from the start. With client certs, a stolen token alone is not enough to impersonate the client. Even when using OAuth, kill bearer replay with sender-constrained tokens. This is a mechanism that binds the token to "a specific client's key or cert," so stealing only the token is useless without the matching key. The two implementation styles are mTLS binding (RFC 8705) and DPoP (RFC 9449).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leave chain propagation to standards outside A2A.&lt;/strong&gt; Bind the credential to the originator, and for crossing domains use ID-JAG / Identity Chaining, for safe propagation within a chain use Transaction Tokens. A2A is only the "frame" to carry these, so bring the contents in from standards.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Always enforce per-skill authorization at a Gateway / IAM.&lt;/strong&gt; "Authentication passed" and "has permission to call that skill" are different things. Always place a layer that does not let the latter through unchecked.&lt;/li&gt;
&lt;/ol&gt;
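
&lt;p&gt;For item 2, the core check behind mTLS-bound tokens (RFC 8705) is small enough to sketch. RFC 8705 puts the SHA-256 thumbprint of the client certificate into the token's &lt;code&gt;cnf&lt;/code&gt; claim under &lt;code&gt;x5t#S256&lt;/code&gt;, and the resource server compares it with the certificate actually presented on the TLS connection. The function names here are hypothetical, and the sketch assumes the token signature was already verified and the DER bytes of the client certificate were handed over by the TLS layer.&lt;/p&gt;

```python
import base64
import hashlib
import hmac


def cert_thumbprint(cert_der):
    """base64url (no padding) SHA-256 of the DER-encoded certificate."""
    digest = hashlib.sha256(cert_der).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_bound_to_cert(token_claims, cert_der):
    """True only if the token's cnf claim names the presented client cert."""
    expected = token_claims.get("cnf", {}).get("x5t#S256")
    if not expected:
        return False  # plain bearer token: nothing ties it to a key
    return hmac.compare_digest(expected, cert_thumbprint(cert_der))


# Stand-in bytes; a real server would use the peer certificate in DER form.
client_cert = b"stand-in for the client's DER certificate"
thief_cert = b"stand-in for some other certificate"
claims = {"sub": "agent-A", "cnf": {"x5t#S256": cert_thumbprint(client_cert)}}

print(token_bound_to_cert(claims, client_cert))
print(token_bound_to_cert(claims, thief_cert))
```

&lt;p&gt;A thief who copies the token but connects with a different certificate fails the second check, which is exactly the property a plain bearer token lacks.&lt;/p&gt;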

&lt;p&gt;This is where the "delegation targets" (the blue on the right) all connect to existing standards covered in other articles. A2A is the &lt;strong&gt;plumbing (frame)&lt;/strong&gt; of AI agent auth, and the water flowing through it (the identity contents) is the very standards I have been writing about. That is the structure this mapping makes clear.&lt;/p&gt;

&lt;h2&gt;
  
  
  See it for real: turn a coding CLI into an A2A server and the holes appear
&lt;/h2&gt;

&lt;p&gt;The "holes" so far may look abstract, but stand up one existing OSS project and they show up right in front of you. A Remote Agent is not a special product. It is &lt;strong&gt;just an HTTP server that wraps an agent you have on hand as an A2A server&lt;/strong&gt;. There are several OSS projects you can actually try.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/a2aproject/a2a-samples" rel="noopener noreferrer"&gt;a2aproject/a2a-samples&lt;/a&gt;: the official hello-world Remote Agent (5 languages). The minimal server that handles &lt;code&gt;message/send&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/hybroai/a2a-adapter" rel="noopener noreferrer"&gt;hybroai/a2a-adapter&lt;/a&gt;: a Python SDK that converts n8n / LangGraph / CrewAI / &lt;strong&gt;Claude Code&lt;/strong&gt; / Codex / Ollama and more into A2A servers (published on PyPI)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/firstintent/a2a-bridge" rel="noopener noreferrer"&gt;firstintent/a2a-bridge&lt;/a&gt;: connects CLI agents like Claude Code / Codex / Gemini CLI / Zed to each other through a daemon&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, turning Claude Code into an A2A server with &lt;code&gt;a2a-adapter&lt;/code&gt; is just this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;a2a_adapter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ClaudeCodeAdapter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;serve_agent&lt;/span&gt;

&lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ClaudeCodeAdapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;working_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/path/to/project&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;serve_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Claude Code comes up as an A2A server
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This stands up a server that accepts a task over the network, lets Claude Code read and write files and run commands, and returns the result. The problem starts here. The &lt;code&gt;a2a-adapter&lt;/code&gt; README says this.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Without a pre-configured Claude Code permissions file, tool-use is blocked. You can bypass it with &lt;code&gt;skip_permissions=True&lt;/code&gt; (or the env var &lt;code&gt;A2A_CLAUDE_SKIP_PERMISSIONS=1&lt;/code&gt;).&lt;br&gt;
&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ClaudeCodeAdapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;working_dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;skip_permissions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The moment you add &lt;code&gt;skip_permissions=True&lt;/code&gt;, &lt;strong&gt;Claude Code runs tools (file edits, command execution) without human confirmation&lt;/strong&gt;. And the trigger that launches it is a task coming over A2A. If you have not tightened the front-door auth (who can call this server), anyone who reaches the endpoint gets all the way to command execution. This is exactly the attack surface listed earlier in the article.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No auth in front -&amp;gt; unauthenticated remote code execution&lt;/li&gt;
&lt;li&gt;Token only, no mTLS -&amp;gt; replay with a stolen token&lt;/li&gt;
&lt;li&gt;Per-skill scope not enforced -&amp;gt; "read only" turns into write and execute (confused deputy)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the previous section's fixes (restrict callers with mTLS, sign the Agent Card, enforce per-skill scope at a Gateway) are not slogans. They become &lt;strong&gt;mandatory&lt;/strong&gt; the moment you ship even one such adapter to production. The auth responsibility that A2A "delegated" ultimately lands on whoever stood up this server.&lt;/p&gt;
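
&lt;p&gt;The per-skill scope enforcement mentioned above can be sketched as a deny-by-default check. This is only an illustration: it assumes a Gateway has already verified the token and extracted its scopes, and the skill names, scope strings, and &lt;code&gt;is_allowed&lt;/code&gt; helper are hypothetical. None of them come from the A2A spec or from &lt;code&gt;a2a-adapter&lt;/code&gt;.&lt;/p&gt;

```python
# Hypothetical skill-to-scope table; every name here is illustrative only.
REQUIRED_SCOPE = {
    "read_file": "repo:read",
    "edit_file": "repo:write",
    "run_command": "exec:run",
}


def is_allowed(skill, token_scopes):
    """Deny by default: unknown skills and missing scopes are both rejected."""
    required = REQUIRED_SCOPE.get(skill)
    if required is None:
        return False  # a skill nobody registered is never callable
    return required in set(token_scopes)


# A token minted for reading must not reach command execution.
print(is_allowed("read_file", ["repo:read"]))
print(is_allowed("run_command", ["repo:read"]))
```

&lt;p&gt;The point of the shape is that "read only" stays read only: a caller holding just the read scope is refused for any skill that maps to a different scope, and for any skill missing from the table.&lt;/p&gt;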

&lt;h2&gt;
  
  
  Design-decision cheat sheet
&lt;/h2&gt;

&lt;p&gt;For reference while implementing, here are the A2A auth decision points on one page.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Topic&lt;/th&gt;
&lt;th&gt;What A2A specifies&lt;/th&gt;
&lt;th&gt;Strength&lt;/th&gt;
&lt;th&gt;Recommendation in practice&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Where identity lives&lt;/td&gt;
&lt;td&gt;HTTP header / transport layer. Not in the payload&lt;/td&gt;
&lt;td&gt;design&lt;/td&gt;
&lt;td&gt;Follow as-is. Push verification to Gateway/IAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Conveying auth requirements&lt;/td&gt;
&lt;td&gt;Agent Card &lt;code&gt;securitySchemes&lt;/code&gt; / &lt;code&gt;security&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;defined&lt;/td&gt;
&lt;td&gt;Mind the OR semantics. Define least-privilege scopes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth methods&lt;/td&gt;
&lt;td&gt;apiKey / http / oauth2 / openIdConnect / mtls, 5 types&lt;/td&gt;
&lt;td&gt;defined&lt;/td&gt;
&lt;td&gt;For M2M paths use &lt;code&gt;mtls&lt;/code&gt; or client_credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credential transmission&lt;/td&gt;
&lt;td&gt;HTTP headers required&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;td&gt;Never in the body&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transport&lt;/td&gt;
&lt;td&gt;HTTPS required in prod, TLS 1.2+ recommended&lt;/td&gt;
&lt;td&gt;MUST/SHOULD&lt;/td&gt;
&lt;td&gt;Use TLS 1.3, verify the server cert&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Card signing&lt;/td&gt;
&lt;td&gt;JWS (RFC 7515) + JCS (RFC 8785)&lt;/td&gt;
&lt;td&gt;MAY&lt;/td&gt;
&lt;td&gt;Promote to MUST in your own system&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Authorization granularity&lt;/td&gt;
&lt;td&gt;per-skill scope&lt;/td&gt;
&lt;td&gt;SHOULD&lt;/td&gt;
&lt;td&gt;Always enforce at Gateway/IAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binding the credential&lt;/td&gt;
&lt;td&gt;bind to the originating agent&lt;/td&gt;
&lt;td&gt;SHOULD&lt;/td&gt;
&lt;td&gt;Enforce with sender-constrained tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-domain propagation&lt;/td&gt;
&lt;td&gt;unspecified (outside A2A)&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;Bring in ID-JAG / Identity Chaining&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Replay defense&lt;/td&gt;
&lt;td&gt;unspecified&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;Kill bearer theft with mTLS / DPoP&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Where the "Strength" column is MUST, A2A has your back. Where it is MAY / SHOULD / none, &lt;strong&gt;you tighten it yourself&lt;/strong&gt;. A2A auth security comes down to how much of the bottom-right of this table you fill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;A2A auth is "thin." But that is not laziness. It is the result of a clear design decision: "treat agents as ordinary enterprise apps and delegate the auth contents to existing standards."&lt;/p&gt;

&lt;p&gt;The three things to hold onto:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What A2A defines itself is only the "frame."&lt;/strong&gt; Agent Card &lt;code&gt;securitySchemes&lt;/code&gt; / &lt;code&gt;security&lt;/code&gt;, the 5 SecurityScheme types, the no-identity-in-payload design, the Extended Agent Card, JWS signing. Token issuance, validation, and propagation are all outside.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Holes appear in the parts the spec let off with MAY / SHOULD.&lt;/strong&gt; Card signing is optional, replay defense is absent, scope enforcement is external. Unless the implementer promotes these to MUST, attacks sail straight through.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The way to close them is mTLS + signed cards + existing identity standards.&lt;/strong&gt; SPIFFE / WIMSE / ID-JAG / Identity Chaining / Transaction Tokens are the water flowing through the A2A plumbing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Read A2A as a "new auth protocol" and it is an anticlimax. Read it as a "frame for wiring existing auth standards between AI agents" and both the rationale of the design and the gaps you have to fill yourself come into focus. Next time you stand up an A2A server, change the bottom-right of this article's cheat sheet (MAY / SHOULD / none) into MUST, one row at a time.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://a2a-protocol.org/latest/specification/" rel="noopener noreferrer"&gt;Agent2Agent (A2A) Protocol Specification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://a2a-protocol.org/latest/topics/enterprise-ready/" rel="noopener noreferrer"&gt;Enterprise-Ready Features - Agent2Agent Protocol (A2A)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.rfc-editor.org/rfc/rfc7515" rel="noopener noreferrer"&gt;RFC 7515: JSON Web Signature (JWS)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.rfc-editor.org/rfc/rfc8785" rel="noopener noreferrer"&gt;RFC 8785: JSON Canonicalization Scheme (JCS)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.rfc-editor.org/rfc/rfc8705" rel="noopener noreferrer"&gt;RFC 8705: OAuth 2.0 Mutual-TLS Client Authentication&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.rfc-editor.org/rfc/rfc9449" rel="noopener noreferrer"&gt;RFC 9449: OAuth 2.0 Demonstrating Proof of Possession (DPoP)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>security</category>
      <category>oauth</category>
    </item>
    <item>
      <title>gRPC Deep Dive: Stubs, HTTP/2 Frames, and Why Netflix, Spotify, and Mercari Switched</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Wed, 10 Jun 2026 15:37:14 +0000</pubDate>
      <link>https://dev.to/kanywst/grpc-deep-dive-stubs-http2-frames-and-why-netflix-spotify-and-mercari-switched-12if</link>
      <guid>https://dev.to/kanywst/grpc-deep-dive-stubs-http2-frames-and-why-netflix-spotify-and-mercari-switched-12if</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;While writing my xDS post (the one about istiod shipping protobuf to Envoy over a gRPC stream), I noticed something uncomfortable. I use gRPC every day. And if someone had asked me "so what is gRPC, exactly", I would have said "HTTP/2" and "Protocol Buffers" and then run out of sentences.&lt;/p&gt;

&lt;p&gt;So I stopped and rebuilt my understanding from the ground floor. This post walks through it in order, and every term gets explained at the spot where it first appears. The only assumption is that you have written an HTTP API that returns JSON at some point.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What RPC means (the "RPC" in gRPC)&lt;/li&gt;
&lt;li&gt;The two parts that make up gRPC: Protocol Buffers and HTTP/2&lt;/li&gt;
&lt;li&gt;The four communication patterns&lt;/li&gt;
&lt;li&gt;Running it locally and measuring everything (Go)&lt;/li&gt;
&lt;li&gt;Adoption stories: what each company was migrating away from&lt;/li&gt;
&lt;li&gt;Weak points, and when to pick gRPC over REST&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One running example carries the whole post: &lt;strong&gt;gadgefre&lt;/strong&gt;, a fictional flea-market app for used gadgets. It has exactly four characters, and they show up in every diagram.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The mobile app (what the user touches)&lt;/li&gt;
&lt;li&gt;The order service (accepts orders)&lt;/li&gt;
&lt;li&gt;The stock service (tracks inventory)&lt;/li&gt;
&lt;li&gt;The payment service (moves money)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  RPC: network calls dressed up as function calls
&lt;/h2&gt;

&lt;p&gt;Before gRPC, RPC. Remote Procedure Call means exactly what the name says: &lt;strong&gt;calling code that runs on another machine, written as if it were a local function call&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Think about what calling a REST API looks like in code. You build a URL, pick an HTTP method, encode a JSON body, parse the JSON response, branch on the status code. The thing you wanted was "reserve 2 units of this item", and most of the code is transport logistics.&lt;/p&gt;

&lt;p&gt;RPC hides all of that. The order service just writes this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// The stock service runs on another machine, but this reads like a local call&lt;/span&gt;
&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stockClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Reserve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReserveRequest&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ItemId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gx100"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Quantity&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here is what actually happens underneath:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F01-rpc-stub-call.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F01-rpc-stub-call.png" alt="How an RPC call travels through generated stubs from a Go client to a Java server" width="800" height="1530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The interesting box is the &lt;strong&gt;stub&lt;/strong&gt;. A stub is the code that does the transport logistics for you (serialize, send, receive, deserialize), and &lt;strong&gt;nobody writes it by hand. A tool generates it.&lt;/strong&gt; The diagram has the order service in Go and the stock service in Java on purpose: stubs are generated per language, so the caller and callee do not need to agree on one.&lt;/p&gt;

&lt;p&gt;RPC itself is an old idea with plenty of implementations (Java RMI, Thrift, JSON-RPC). gRPC is the modern one, open sourced by Google in 2015. Internally Google had been running an RPC system called Stubby for over a decade, handling on the order of tens of billions of calls per second across its datacenters, and gRPC is the rebuild of that system on open standards. In 2017 it moved to the CNCF (the same foundation as Kubernetes), where it lives today as an Incubating project with a release roughly every six weeks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 1: Protocol Buffers (what gets sent)
&lt;/h2&gt;

&lt;p&gt;gRPC is two parts glued together. &lt;strong&gt;What gets sent&lt;/strong&gt; is decided by Protocol Buffers (protobuf for short). &lt;strong&gt;How it travels&lt;/strong&gt; is decided by HTTP/2. Protobuf first.&lt;/p&gt;

&lt;p&gt;Protobuf has two jobs:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;An IDL (interface definition language)&lt;/strong&gt;: you describe the shape of your API in a plain-text &lt;code&gt;.proto&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A serialization format&lt;/strong&gt;: it turns data into a binary blob smaller than JSON&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Protobuf as an IDL
&lt;/h3&gt;

&lt;p&gt;Here is the gadgefre order service API as a &lt;code&gt;.proto&lt;/code&gt; file. This one file is the spine of the whole post; the hands-on later uses it unchanged.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="na"&gt;syntax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"proto3"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;gadgefre&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order.v1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;service&lt;/span&gt; &lt;span class="n"&gt;OrderService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Unary: create one order&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;CreateOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CreateOrderRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CreateOrderResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// Server streaming: push order status changes as they happen&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;WatchOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;WatchOrderRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;OrderEvent&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;CreateOrderRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;item_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int32&lt;/span&gt;  &lt;span class="na"&gt;quantity&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int64&lt;/span&gt;  &lt;span class="na"&gt;user_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;CreateOrderResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt;      &lt;span class="na"&gt;order_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;OrderStatus&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;int64&lt;/span&gt;       &lt;span class="na"&gt;total_yen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;WatchOrderRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;OrderEvent&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt;      &lt;span class="na"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;OrderStatus&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;OrderStatus&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;ORDER_STATUS_UNSPECIFIED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="na"&gt;ORDER_STATUS_PENDING&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="na"&gt;ORDER_STATUS_CONFIRMED&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="na"&gt;ORDER_STATUS_SHIPPED&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things to know when reading it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;service&lt;/code&gt; block is the list of callable functions, each in the form &lt;code&gt;rpc Name(Input) returns (Output)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;message&lt;/code&gt; blocks are the input/output types. The &lt;code&gt;= 1&lt;/code&gt;, &lt;code&gt;= 2&lt;/code&gt; are &lt;strong&gt;field numbers&lt;/strong&gt;: on the wire, these numbers stand in for the field names (that is where the size savings come from)&lt;/li&gt;
&lt;li&gt;The first enum value being &lt;code&gt;_UNSPECIFIED = 0&lt;/code&gt; is a protobuf convention. For a plain proto3 field (one not marked &lt;code&gt;optional&lt;/code&gt;) there is no way to tell "field was never set" apart from "field was set to value 0", so value 0 is reserved to mean "not specified"&lt;/li&gt;
&lt;/ul&gt;
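
&lt;p&gt;The last two bullets can be made concrete with a tiny hand-rolled encoder. This is a sketch, not the protobuf library: &lt;code&gt;encode_varint&lt;/code&gt; and &lt;code&gt;encode_varint_field&lt;/code&gt; are names made up for this example, only non-negative integer and enum fields are handled, and the wire format it relies on (a tag built from the field number, then the value) is explained byte by byte later in the post.&lt;/p&gt;

```python
VARINT_WIRE_TYPE = 0  # wire type used for int32, int64, and enum fields


def encode_varint(value):
    """Encode a non-negative int as a base-128 varint, low 7 bits first."""
    if value != abs(value):
        raise ValueError("negative values are outside this sketch")
    out = bytearray()
    while True:
        value, low7 = divmod(value, 128)
        if value == 0:
            out.append(low7)        # last byte: continuation bit clear
            return bytes(out)
        out.append(low7 + 128)      # more bytes follow: continuation bit set


def encode_varint_field(field_number, value):
    """Encode one field; a plain proto3 field equal to 0 is not written at all."""
    if value == 0:
        return b""
    tag = field_number * 8 + VARINT_WIRE_TYPE
    return encode_varint(tag) + encode_varint(value)


ORDER_STATUS_UNSPECIFIED = 0
ORDER_STATUS_PENDING = 1

# status is field 2 in CreateOrderResponse.
print(encode_varint_field(2, ORDER_STATUS_UNSPECIFIED).hex())
print(encode_varint_field(2, ORDER_STATUS_PENDING).hex())
```

&lt;p&gt;The first call produces zero bytes, which is exactly why the receiver cannot distinguish "never set" from "set to 0", and why giving 0 a real business meaning is a trap.&lt;/p&gt;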

&lt;p&gt;Feed this file to the &lt;code&gt;protoc&lt;/code&gt; compiler and it generates stubs for whatever languages you need:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F02-proto-codegen-fanout.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F02-proto-codegen-fanout.png" alt="One order.proto file generating stubs for Go, Java, Swift, and TypeScript" width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This diagram is the benefit that adopting companies bring up first. The API definition lives in one &lt;code&gt;.proto&lt;/code&gt; file, and every language's client and server code is generated from it mechanically. Two chronic problems ("the docs drifted from the implementation" and "we hand-write a client per language") disappear structurally. Mercari, which shows up in the adoption section, keeps every service's &lt;code&gt;.proto&lt;/code&gt; in a single repository and has CI regenerate the Go, Python, Java, and Node.js code on merge.&lt;/p&gt;

&lt;h3&gt;
  
  
  Protobuf as a serialization format
&lt;/h3&gt;

&lt;p&gt;The second job is binary serialization. I serialized the same payload as protobuf and as JSON and measured (the measurement code appears in the hands-on):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;protobuf: 24 bytes  0a 11 6f 72 64 5f 67 78 31 30 30 5f 31 32 33 34 35 36 37 10 01 18 e8 4d
json:     81 bytes  {"order_id":"ord_gx100_1234567","status":"ORDER_STATUS_PENDING","total_yen":9960}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same content, &lt;strong&gt;24 bytes of protobuf versus 81 bytes of JSON&lt;/strong&gt;, a 3.4x difference. The bytes are worth reading because they show how the trick works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;0a&lt;/code&gt; says "field number 1, length-delimited type". That is &lt;code&gt;order_id&lt;/code&gt;. The 8-character string &lt;code&gt;"order_id"&lt;/code&gt; appears nowhere in the payload&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;11&lt;/code&gt; is length 17, and the next 17 bytes are the ASCII of &lt;code&gt;ord_gx100_1234567&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;10 01&lt;/code&gt; is field 2 (status) set to 1 (PENDING). One byte, not the enum name as a string&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;18 e8 4d&lt;/code&gt; is field 3 (total_yen) set to 9960, encoded as a varint (variable-length integer) in 2 bytes&lt;/li&gt;
&lt;/ul&gt;
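
&lt;p&gt;The walkthrough above can be checked mechanically. The following sketch decodes the exact 24 bytes from the measurement without any protobuf library. It only understands the two wire types that appear in this payload (varint and length-delimited), and the helper names &lt;code&gt;read_varint&lt;/code&gt; and &lt;code&gt;decode_fields&lt;/code&gt; are made up for this example.&lt;/p&gt;

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    scale = 1
    while True:
        byte = buf[pos]
        pos += 1
        value += (byte % 128) * scale  # low 7 bits carry the payload
        scale *= 128
        if byte // 128 == 0:           # continuation bit clear: last byte
            return value, pos


def decode_fields(buf):
    """Return {field_number: value} for a flat message with no repeated fields."""
    fields = {}
    pos = 0
    while pos != len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = divmod(tag, 8)
        if wire_type == 0:             # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:           # length-delimited (strings, bytes)
            length, pos = read_varint(buf, pos)
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is outside this sketch")
        fields[field_number] = value
    return fields


PAYLOAD = bytes.fromhex(
    "0a 11 6f 72 64 5f 67 78 31 30 30 5f 31 32 33 34 35 36 37 10 01 18 e8 4d"
)
print(len(PAYLOAD), decode_fields(PAYLOAD))
```

&lt;p&gt;The result is keyed by field number, not by name: the decoder recovers the order ID under 1, the status under 2, and 9960 under 3, and it takes the &lt;code&gt;.proto&lt;/code&gt; file to know that those numbers mean &lt;code&gt;order_id&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, and &lt;code&gt;total_yen&lt;/code&gt;.&lt;/p&gt;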

&lt;p&gt;The field names, quotes, and braces that JSON re-sends on every single message simply do not exist in protobuf, because the field numbers are pinned in the &lt;code&gt;.proto&lt;/code&gt;. When your services exchange tens of thousands of messages per second, that difference is bandwidth and parsing CPU, paid continuously.&lt;/p&gt;

&lt;p&gt;There is a cost: the payload is binary, so you cannot &lt;code&gt;curl&lt;/code&gt; it and read it with your eyes. The weak points section deals with that, tooling included.&lt;/p&gt;

&lt;h2&gt;
  
  
  Part 2: HTTP/2 (how it travels)
&lt;/h2&gt;

&lt;p&gt;gRPC runs on HTTP/2 as its transport. If your mental model of HTTP/2 is "HTTP/1.1 but faster", gRPC's design decisions look arbitrary, so this section takes its time. HTTP/1.1 and HTTP/2 share the same &lt;strong&gt;semantics&lt;/strong&gt; (methods, headers, status codes) but differ completely in &lt;strong&gt;architecture&lt;/strong&gt;: how those semantics are laid out as bytes on a connection.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP/1.1's architecture: text letters, one at a time
&lt;/h3&gt;

&lt;p&gt;An HTTP/1.1 request is plain text separated by newlines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /reserve HTTP/1.1
Host: stock.gadgefre.internal
Content-Type: application/json
Content-Length: 32

{"item_id":"gx100","quantity":2}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The smallest unit in this protocol is the whole message. Text streams in top to bottom, and the receiver has no way to tell "which request does this line belong to", so &lt;strong&gt;one TCP connection can carry only one request at a time&lt;/strong&gt;. Want parallelism, open more connections. That is literally what browsers have done for decades: about six connections per host.&lt;/p&gt;

&lt;p&gt;For service-to-service traffic this hurts. The order service constantly wants to ask the stock service and the payment service things at the same time, and every extra connection costs a TCP handshake plus a TLS negotiation (several round trips) to establish.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP/2's architecture: binary frames interleaved on one connection
&lt;/h3&gt;

&lt;p&gt;HTTP/2 rebuilt exactly this part. Messages are chopped into &lt;strong&gt;frames&lt;/strong&gt;, small binary boxes, and every frame is tagged with the number of the &lt;strong&gt;stream&lt;/strong&gt; (the logical conversation) it belongs to. The receiver sorts frames by stream number and reassembles. Result: &lt;strong&gt;one TCP connection carries many conversations at once&lt;/strong&gt;. That is multiplexing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F03-http2-streams-and-frames.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F03-http2-streams-and-frames.png" alt="Three concurrent RPCs as logical streams, and the same RPCs as interleaved frames on the wire" width="799" height="521"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The colors line up between the two halves: collect only the blue frames and you have reassembled stream 1, only the purple ones and you have stream 5. That picture is most of what HTTP/2 is. Frames come in a handful of types, and these are the ones gRPC traffic actually consists of:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Frame&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;How gRPC uses it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;HEADERS&lt;/td&gt;
&lt;td&gt;Carries a block of headers&lt;/td&gt;
&lt;td&gt;Opens an RPC (method name etc.) and closes it (the trailer, see below)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DATA&lt;/td&gt;
&lt;td&gt;Carries body bytes&lt;/td&gt;
&lt;td&gt;The protobuf messages&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SETTINGS&lt;/td&gt;
&lt;td&gt;Connection parameter exchange&lt;/td&gt;
&lt;td&gt;Both sides send it right after connecting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WINDOW_UPDATE&lt;/td&gt;
&lt;td&gt;Flow control (receive buffer space)&lt;/td&gt;
&lt;td&gt;Backpressure for streaming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PING&lt;/td&gt;
&lt;td&gt;Liveness check&lt;/td&gt;
&lt;td&gt;Keepalive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RST_STREAM&lt;/td&gt;
&lt;td&gt;Kill one stream only&lt;/td&gt;
&lt;td&gt;RPC cancellation, deadline exceeded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GOAWAY&lt;/td&gt;
&lt;td&gt;Announce connection shutdown&lt;/td&gt;
&lt;td&gt;Graceful shutdown&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Look at RST_STREAM and WINDOW_UPDATE for a moment. "Cancel one RPC out of the hundred in flight" and "slow down only the stream whose receiver is falling behind" are operations &lt;strong&gt;built into the protocol layer&lt;/strong&gt;. On HTTP/1.1 your only lever is killing the whole connection. The reason cancellation and flow control behave consistently across every gRPC language is not heroic framework code. It is that HTTP/2 is shaped like this.&lt;/p&gt;
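
&lt;p&gt;To make "every frame is tagged with its stream number" concrete, here is a minimal sketch of the fixed 9-byte HTTP/2 frame header and of sorting frames back into streams. It is deliberately incomplete: it skips the connection preface, does not decode HPACK, and ignores padding. &lt;code&gt;make_frame&lt;/code&gt;, &lt;code&gt;parse_frame_header&lt;/code&gt;, and &lt;code&gt;split_by_stream&lt;/code&gt; are names invented for this example, and the payloads are placeholder bytes rather than real gRPC messages.&lt;/p&gt;

```python
FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x3: "RST_STREAM", 0x4: "SETTINGS",
    0x6: "PING", 0x7: "GOAWAY", 0x8: "WINDOW_UPDATE",
}
HEADER_SIZE = 9


def make_frame(frame_type, flags, stream_id, payload):
    """Build one frame: 24-bit length, 8-bit type, 8-bit flags, 32-bit stream field."""
    header = len(payload).to_bytes(3, "big") + bytes([frame_type, flags])
    return header + stream_id.to_bytes(4, "big") + payload


def parse_frame_header(header):
    if len(header) != HEADER_SIZE:
        raise ValueError("truncated frame header")
    return {
        "length": int.from_bytes(header[0:3], "big"),
        "type": FRAME_TYPES.get(header[3], "UNKNOWN"),
        "flags": header[4],
        # The top bit of the last four bytes is reserved; keep the low 31 bits.
        "stream_id": int.from_bytes(header[5:9], "big") % 2**31,
    }


def split_by_stream(wire):
    """Walk back-to-back frames and collect (type, payload) pairs per stream."""
    streams = {}
    pos = 0
    while pos != len(wire):
        info = parse_frame_header(wire[pos:pos + HEADER_SIZE])
        start = pos + HEADER_SIZE
        payload = wire[start:start + info["length"]]
        streams.setdefault(info["stream_id"], []).append((info["type"], payload))
        pos = start + info["length"]
    return streams


# Two conversations interleaved on one connection, as in the diagram.
wire = (
    make_frame(0x0, 0, 1, b"a1")
    + make_frame(0x0, 0, 5, b"b1")
    + make_frame(0x0, 0x1, 1, b"a2")  # flag 0x1 on a DATA frame is END_STREAM
)
print(split_by_stream(wire))
```

&lt;p&gt;Because the stream number sits in every header, cancelling one conversation is just one more small frame (RST_STREAM) carrying that number, and the other streams never notice.&lt;/p&gt;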

&lt;p&gt;Headers got an upgrade too: they travel compressed with HPACK. A header sent once gets an entry in a per-connection dictionary, and from then on it is a one-or-two-byte index. Nobody re-sends the string &lt;code&gt;content-type: application/grpc&lt;/code&gt; ten thousand times.&lt;/p&gt;

&lt;p&gt;One more mechanism matters later: the &lt;strong&gt;trailer&lt;/strong&gt;. HTTP/2 lets a peer send &lt;strong&gt;one more HEADERS frame after the body&lt;/strong&gt;, closing the message with headers at the end. That tail block is the trailer, and it exists to carry "here is how things turned out" information that is only known after the body finished. For a stream that pushed gigabytes before failing, a status in the leading headers is physically impossible; the end of the message is the only honest place. gRPC puts the RPC's outcome there. And this trailer is the entire reason for the "browser wall" coming up in the weak points section.&lt;/p&gt;

&lt;h3&gt;
  
  
  How gRPC maps onto HTTP/2
&lt;/h3&gt;

&lt;p&gt;Here is the full assignment of gRPC concepts to HTTP/2 machinery:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;gRPC concept&lt;/th&gt;
&lt;th&gt;What it is on HTTP/2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;One RPC&lt;/td&gt;
&lt;td&gt;One stream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Method selection&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;:path&lt;/code&gt; header (e.g. &lt;code&gt;/gadgefre.order.v1.OrderService/CreateOrder&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A request/response message&lt;/td&gt;
&lt;td&gt;"5-byte prefix + protobuf" inside DATA frames&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RPC outcome&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;grpc-status&lt;/code&gt; in the trailer (0 means OK)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deadline&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;grpc-timeout&lt;/code&gt; header&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The four patterns (next section)&lt;/td&gt;
&lt;td&gt;Just how many DATA frames flow in each direction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;A channel (the client-side connection object)&lt;/td&gt;
&lt;td&gt;One or more TCP connections plus SETTINGS/PING bookkeeping&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
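
&lt;p&gt;Two rows of the table are simple enough to sketch directly: the &lt;code&gt;:path&lt;/code&gt; value and the 5-byte message prefix (a 1-byte compressed flag followed by a 4-byte big-endian length). The function names below are made up for this example, and the sketch assumes it is handed the already-concatenated DATA payloads of one stream; it does no partial-message buffering.&lt;/p&gt;

```python
import struct

PREFIX = "!BI"  # network byte order: 1-byte compressed flag, 4-byte length


def method_path(service, method):
    """Build the :path value from the fully qualified service name and method."""
    return "/" + service + "/" + method


def frame_message(payload, compressed=False):
    """Prepend the 5-byte gRPC prefix to one serialized message."""
    return struct.pack(PREFIX, int(compressed), len(payload)) + payload


def unframe_messages(body):
    """Split one stream's DATA bytes back into (compressed, message) pairs."""
    messages = []
    pos = 0
    while pos != len(body):
        flag, length = struct.unpack(PREFIX, body[pos:pos + 5])
        start = pos + 5
        messages.append((bool(flag), body[start:start + length]))
        pos = start + length
    return messages


print(method_path("gadgefre.order.v1.OrderService", "CreateOrder"))
body = frame_message(b"first") + frame_message(b"second message")
print(unframe_messages(body))
```

&lt;p&gt;The length prefix is what lets a stream carry several messages back to back: the receiver always knows where one protobuf blob ends and the next begins, regardless of how HTTP/2 happened to slice the bytes into DATA frames.&lt;/p&gt;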

&lt;p&gt;And the frame-by-frame shape of a single unary RPC:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F04-unary-rpc-frame-sequence.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F04-unary-rpc-frame-sequence.png" alt="Frame sequence of one unary RPC: HEADERS and DATA from the client, then HEADERS, DATA, and trailer from the server" width="800" height="579"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That is the theory. Whether it is true gets checked at the end of the hands-on, where I capture the actual frames off a live connection.&lt;/p&gt;

&lt;p&gt;In case you are wondering about HTTP/3: as of June 2026, ecosystem-wide HTTP/3 support is still at the &lt;a href="https://github.com/grpc/proposal/blob/master/G2-http3-protocol.md" rel="noopener noreferrer"&gt;official proposal stage (G2)&lt;/a&gt;, with grpc-dotnet running a trial implementation. Production gRPC traffic today is overwhelmingly HTTP/2.&lt;/p&gt;

&lt;h2&gt;
  
  
  The four communication patterns
&lt;/h2&gt;

&lt;p&gt;A REST API has essentially one shape: one request, one response. gRPC gives you four. As the mapping table said, they are not four mechanisms; they differ only in how many DATA frames each side sends on the one stream.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F05-four-rpc-patterns.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F05-four-rpc-patterns.png" alt="The four gRPC patterns: unary, server streaming, client streaming, bidirectional" width="799" height="293"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Mapped onto gadgefre:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;In the proto&lt;/th&gt;
&lt;th&gt;gadgefre use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Unary&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rpc CreateOrder(Req) returns (Res)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Creating an order. Every ordinary API call&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server streaming&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rpc WatchOrder(Req) returns (stream Event)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Live order status. Push the moment it ships&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Client streaming&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rpc UploadPhotos(stream Chunk) returns (Res)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Photo upload in chunks, one result at the end&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bidirectional&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rpc Chat(stream Msg) returns (stream Msg)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Buyer-seller chat&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The only syntax involved is where you put the &lt;code&gt;stream&lt;/code&gt; keyword. Here is &lt;code&gt;WatchOrder&lt;/code&gt; (server streaming) over time:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F06-watchorder-server-streaming.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F06-watchorder-server-streaming.png" alt="WatchOrder server streaming: one request, then PENDING, CONFIRMED, and SHIPPED events pushed on a single stream" width="800" height="743"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Doing this with REST means the client polls with GET every few seconds, or you bolt on a separate WebSocket layer. In gRPC you write the word &lt;code&gt;stream&lt;/code&gt; in the proto and the generated stubs, flow control, and all four languages' implementations come with it. This is one of the reasons ABEMA (more on them later) picked gRPC for a latency-sensitive video service.&lt;/p&gt;

&lt;h2&gt;
  
  
  The big picture so far
&lt;/h2&gt;

&lt;p&gt;All the parts are on the table, so here is gadgefre in one diagram:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F07-gadgefre-architecture.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F07-gadgefre-architecture.png" alt="gadgefre architecture: REST/JSON on the outside, gRPC between internal services" width="800" height="995"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There is a design decision buried in this diagram, and nearly every adopter made the same one: &lt;strong&gt;the outside edge (phones, browsers) speaks REST/JSON, and only the service-to-service interior speaks gRPC&lt;/strong&gt;. Mercari, ABEMA, and Netflix all look like this. The reasons are the browser problem explained in the weak points section, plus the fact that external developers expect REST. Lining up the case studies makes the pattern obvious: gRPC spread as &lt;strong&gt;the tool for service-to-service traffic&lt;/strong&gt;, not as a REST replacement.&lt;/p&gt;

&lt;h2&gt;
  
  
  Hands-on: run it and measure it
&lt;/h2&gt;

&lt;p&gt;Time to touch it. Environment: Apple Silicon Mac, Go 1.26.4, libprotoc 35.0, grpc-go v1.81.1, grpcurl 1.9.3. The proto is the &lt;code&gt;order.proto&lt;/code&gt; from earlier, unchanged.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Generate the code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;protobuf protoc-gen-go protoc-gen-go-grpc grpcurl

&lt;span class="nb"&gt;mkdir &lt;/span&gt;grpc-demo &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;grpc-demo
go mod init example.com/gadgefre
&lt;span class="c"&gt;# save the proto from above as proto/order.proto, then:&lt;/span&gt;
protoc &lt;span class="nt"&gt;--go_out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--go_opt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;module&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;example.com/gadgefre &lt;span class="se"&gt;\&lt;/span&gt;
       &lt;span class="nt"&gt;--go-grpc_out&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--go-grpc_opt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;module&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;example.com/gadgefre &lt;span class="se"&gt;\&lt;/span&gt;
       proto/order.proto
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two files appear under &lt;code&gt;gen/orderpb/&lt;/code&gt;: &lt;code&gt;order.pb.go&lt;/code&gt; holds the message types (structs plus serialization), &lt;code&gt;order_grpc.pb.go&lt;/code&gt; holds the service stubs (client and server interfaces). Open them and you can see that the "stub" from the first diagram is just ordinary Go code.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. The server
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;
    &lt;span class="s"&gt;"net"&lt;/span&gt;
    &lt;span class="s"&gt;"time"&lt;/span&gt;

    &lt;span class="s"&gt;"google.golang.org/grpc"&lt;/span&gt;
    &lt;span class="s"&gt;"google.golang.org/grpc/reflection"&lt;/span&gt;

    &lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="s"&gt;"example.com/gadgefre/gen/orderpb"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;orderServer&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UnimplementedOrderServiceServer&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;orderServer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;CreateOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateOrderRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateOrderResponse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateOrderResponse&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;OrderId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ord_%s_%d"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ItemId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UserId&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;Status&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderStatus_ORDER_STATUS_PENDING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;TotalYen&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4980&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="kt"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Quantity&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;orderServer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;WatchOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WatchOrderRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerStreamingServer&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderEvent&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;statuses&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderStatus&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderStatus_ORDER_STATUS_PENDING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderStatus_ORDER_STATUS_CONFIRMED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderStatus_ORDER_STATUS_SHIPPED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;statuses&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderEvent&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;OrderId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Status&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;});&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;300&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Millisecond&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;net&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Listen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"tcp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;":50061"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RegisterOrderServiceServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;orderServer&lt;/span&gt;&lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="n"&gt;reflection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"listening on :50061"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Serve&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice what is missing: not one line of HTTP/2 handling or serialization. Only logic. &lt;code&gt;reflection.Register(s)&lt;/code&gt; is for grpcurl later; it lets the server tell clients "here is the proto I implement". (The port is 50061 because something on my machine was already squatting on 50051, which is the conventional one.)&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The client, and real numbers
&lt;/h3&gt;

&lt;p&gt;The client side is just calls on the generated stub. The relevant excerpt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"localhost:50061"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;grpc&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTransportCredentials&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;insecure&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewCredentials&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewOrderServiceClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CreateOrderRequest&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ItemId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"gx100"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Quantity&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;UserId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1234567&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WatchOrder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WatchOrderRequest&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;OrderId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderId&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Recv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
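    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EOF&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;break&lt;/span&gt; &lt;span class="c"&gt;// the server finished the stream normally&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// any other error leaves ev nil, so stop before using it&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;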
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"WatchOrder  -&amp;gt; %s is now %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ev&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ev&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output, including a latency loop I added at the end (10000 unary calls on a warmed-up connection, sorted, percentiles taken):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CreateOrder -&amp;gt; order_id=ord_gx100_1234567 status=ORDER_STATUS_PENDING total_yen=9960
WatchOrder  -&amp;gt; ord_gx100_1234567 is now ORDER_STATUS_PENDING
WatchOrder  -&amp;gt; ord_gx100_1234567 is now ORDER_STATUS_CONFIRMED
WatchOrder  -&amp;gt; ord_gx100_1234567 is now ORDER_STATUS_SHIPPED
unary x10000: p50=50.334µs p99=165.667µs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Localhost, sure, but that includes serialization and HTTP/2 framing both ways: &lt;strong&gt;p50 of 50 microseconds per call&lt;/strong&gt;. Worth keeping as a gut number for how light the RPC machinery itself is.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Debugging with grpcurl
&lt;/h3&gt;

&lt;p&gt;This is the answer to "binary means no curl". With reflection enabled on the server, grpcurl fetches the proto and does the JSON conversion for you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;grpcurl &lt;span class="nt"&gt;-plaintext&lt;/span&gt; localhost:50061 list
gadgefre.order.v1.OrderService
grpc.reflection.v1.ServerReflection

&lt;span class="nv"&gt;$ &lt;/span&gt;grpcurl &lt;span class="nt"&gt;-plaintext&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"item_id":"gx100","quantity":2,"user_id":1234567}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    localhost:50061 gadgefre.order.v1.OrderService/CreateOrder
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"orderId"&lt;/span&gt;: &lt;span class="s2"&gt;"ord_gx100_1234567"&lt;/span&gt;,
  &lt;span class="s2"&gt;"status"&lt;/span&gt;: &lt;span class="s2"&gt;"ORDER_STATUS_PENDING"&lt;/span&gt;,
  &lt;span class="s2"&gt;"totalYen"&lt;/span&gt;: &lt;span class="s2"&gt;"9960"&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Small thing that trips everyone up once: &lt;code&gt;"totalYen": "9960"&lt;/code&gt; is a string, and that is by spec. Protobuf's JSON mapping always renders int64 as a JSON string, because JavaScript's Number is a 64-bit float and stops representing every integer exactly past 2&lt;sup&gt;53&lt;/sup&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Looking at the actual HTTP/2 frames
&lt;/h3&gt;

&lt;p&gt;Time to check Part 2's diagrams against reality. I wrote an 80-line proxy that sits between client and server, passes every byte through untouched, and feeds a copy into &lt;code&gt;golang.org/x/net/http2&lt;/code&gt;'s Framer to log what it sees (listens on 50071, forwards to 50061). The whole trick is this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;framer&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;http2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewFramer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Discard&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// r carries the raw connection bytes (client side: after the 24-byte connection preface)&lt;/span&gt;
&lt;span class="n"&gt;framer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadMetaHeaders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hpack&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewDecoder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;4096&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;framer&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadFrame&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c"&gt;// f is a *http2.MetaHeadersFrame, *http2.DataFrame, etc. Log by type&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every frame from one &lt;code&gt;CreateOrder&lt;/code&gt; call through that proxy (connection setup SETTINGS, keepalive PING, and flow-control WINDOW_UPDATE omitted):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[client→server] HEADERS stream=1 END_STREAM=false
         :method=POST
         :scheme=http
         :path=/gadgefre.order.v1.OrderService/CreateOrder
         :authority=localhost:50071
         content-type=application/grpc
         user-agent=grpc-go/1.81.1
         te=trailers
[client→server] DATA stream=1 len=18 END_STREAM=true
         00 00 00 00 0d 0a 05 67 78 31 30 30 10 02 18 87 ad 4b
[server→client] HEADERS stream=1 END_STREAM=false
         :status=200
         content-type=application/grpc
[server→client] DATA stream=1 len=29 END_STREAM=false
         00 00 00 00 18 0a 11 6f 72 64 5f 67 78 31 30 30 5f 31 32 33 34 35 36 37 10 01 18 e8 4d
[server→client] HEADERS (trailer) stream=1 END_STREAM=true
         grpc-status=0
         grpc-message=
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What to look at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The order matches Part 2's sequence diagram exactly: HEADERS, DATA, then HEADERS, DATA, trailer coming back&lt;/li&gt;
&lt;li&gt;The first 5 bytes of the request DATA, &lt;code&gt;00 00 00 00 0d&lt;/code&gt;, are the "5-byte prefix" in the flesh. Byte one is the compression flag (0 = uncompressed), the next four are the message length (0x0d = 13). The remaining 13 bytes are the &lt;code&gt;CreateOrderRequest&lt;/code&gt; protobuf: &lt;code&gt;0a 05 67 78 31 30 30&lt;/code&gt; reads "field 1, length 5, gx100"&lt;/li&gt;
&lt;li&gt;The 24 protobuf bytes inside the response DATA are byte-for-byte identical to the hex dump from the size measurement section&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;te=trailers&lt;/code&gt; in the request headers is the client declaring "I can receive trailers". A browser's fetch API cannot make that declaration (this becomes the next section)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;grpc-status=0&lt;/code&gt; in the final trailer means the RPC succeeded. Note that it is a different thing from &lt;code&gt;:status=200&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last point has operational teeth. A failing gRPC call still carries HTTP &lt;code&gt;:status=200&lt;/code&gt;; only &lt;code&gt;grpc-status&lt;/code&gt; in the trailer changes. If your L7 access logs equate 200 with healthy, you are blind to every gRPC error. Monitor on &lt;code&gt;grpc-status&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adoption: what was each company escaping from?
&lt;/h2&gt;

&lt;p&gt;Now that the machine is understood, the users. A bare list of logos teaches nothing, so the axis here is &lt;strong&gt;what they ran before, and what hurt&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Company&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;What decided it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Google&lt;/td&gt;
&lt;td&gt;In-house Stubby&lt;/td&gt;
&lt;td&gt;gRPC is Stubby's open rebuild; everything internal is RPC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Netflix&lt;/td&gt;
&lt;td&gt;In-house HTTP/1.1 stack (Ribbon)&lt;/td&gt;
&lt;td&gt;Cost of maintaining their own; new Java services start on gRPC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Spotify&lt;/td&gt;
&lt;td&gt;In-house RPC (Hermes)&lt;/td&gt;
&lt;td&gt;"The community caught up and surpassed us"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dropbox&lt;/td&gt;
&lt;td&gt;In-house RPC frameworks&lt;/td&gt;
&lt;td&gt;Could keep existing protobufs; HTTP/2 multiplexing and streaming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Uber&lt;/td&gt;
&lt;td&gt;Server-Sent Events over HTTP/1.1 (push)&lt;/td&gt;
&lt;td&gt;Bidirectional streaming, cross-language stubs, QUIC interop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Salesforce&lt;/td&gt;
&lt;td&gt;JSON/REST&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;.proto&lt;/code&gt; as a fixed contract between teams&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mercari / Merpay&lt;/td&gt;
&lt;td&gt;(greenfield)&lt;/td&gt;
&lt;td&gt;Standardized on gRPC while splitting into microservices&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ABEMA&lt;/td&gt;
&lt;td&gt;(greenfield)&lt;/td&gt;
&lt;td&gt;Low latency, fit with GCP + Kubernetes + Go&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ikyu&lt;/td&gt;
&lt;td&gt;REST&lt;/td&gt;
&lt;td&gt;Speed; built a parallel REST fallback and never needed it&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A few worth unpacking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Netflix&lt;/strong&gt; ran service-to-service traffic on its own HTTP/1.1-based stack (Ribbon and friends, parts of it open sourced) until around 2015, then moved to gRPC when the maintenance bill came due. Today a large share of their internal traffic is gRPC, and new Java development starts gRPC-first. The interesting bit: the driver was not speed. It was &lt;strong&gt;wanting to stop maintaining a bespoke RPC framework&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Spotify&lt;/strong&gt; is the same story with different names: their in-house Hermes got replaced by gRPC plus Envoy. Their engineer Dave Zolotusky summarized the whole industry arc in one line: they had built their own tools because nothing handled their scale, "but then the community kind of caught up and surpassed us." Every company that went microservices early, around 2015, eventually faced that decision, and nearly all of them landed on gRPC.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dropbox&lt;/strong&gt; documented its migration in detail as &lt;a href="https://dropbox.tech/infrastructure/courier-dropbox-migration-to-grpc" rel="noopener noreferrer"&gt;Courier&lt;/a&gt;: hundreds of services in multiple languages exchanging millions of requests per second. Two details stand out. They picked gRPC partly because &lt;strong&gt;they could carry their existing protobuf definitions over unchanged&lt;/strong&gt;, and Courier itself is not a new protocol; it is gRPC wired into their existing auth, service discovery, and tracing. Their closing lesson applies to any migration: it takes longer than the development itself, and it is only finished after the cleanup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Uber&lt;/strong&gt; is the streaming showcase. Their mobile push platform (internally called RAMEN) originally delivered updates over Server-Sent Events on HTTP/1.1; &lt;a href="https://www.uber.com/blog/ubers-next-gen-push-platform-on-grpc/" rel="noopener noreferrer"&gt;they rebuilt it on gRPC bidirectional streaming&lt;/a&gt;, citing the standardized cross-language implementations and the ability to ride Cronet's QUIC sessions on mobile. If your mental image of gRPC is "internal microservice plumbing", Uber pushing to phones over it is the counterexample.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mercari / Merpay&lt;/strong&gt; (Japan's largest C2C marketplace and its payments arm) is the best-documented case in the Japanese-language sphere, and the operational details translate well. When they split the monolith to scale the org toward 1000 engineers, they standardized inter-service traffic on gRPC:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Every microservice's &lt;code&gt;.proto&lt;/code&gt; lives in one repository; CI generates the Go, Python, Java, and Node.js code on merge&lt;/li&gt;
&lt;li&gt;API design debates happen on &lt;code&gt;.proto&lt;/code&gt; pull requests, so interfaces get reviewed before implementation starts&lt;/li&gt;
&lt;li&gt;They went further and built &lt;a href="https://engineering.mercari.com/blog/entry/20241204-developing-bff-using-grpc-federation/" rel="noopener noreferrer"&gt;gRPC Federation&lt;/a&gt;, an OSS tool that generates an entire BFF (the aggregation layer in front of mobile clients) from options written in the proto&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;ABEMA&lt;/strong&gt; (a Japanese streaming TV service) launched in 2016 on GCP + Kubernetes + Go + gRPC, with roughly 40 microservices talking gRPC to each other. Video is latency-sensitive, and protobuf's encode/decode speed and density were the deciding factors. For external APIs they use grpc-gateway (a tool that generates a REST proxy from the proto), making them a clean example of "gRPC inside, REST outside" done by code generation.&lt;/p&gt;

&lt;p&gt;Squeeze the cases and four patterns fall out. If you are deciding whether gRPC belongs in your stack, it comes down to whether these apply:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Too many teams to keep inter-service contracts as verbal agreements&lt;/strong&gt; (Salesforce, Mercari)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An in-house RPC layer you are tired of maintaining&lt;/strong&gt; (Netflix, Spotify, Dropbox)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clients needed in several languages, none hand-written&lt;/strong&gt; (everyone)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time or latency requirements that polling cannot meet&lt;/strong&gt; (Uber's push platform, ABEMA)&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Weak points, with fixes
&lt;/h2&gt;

&lt;p&gt;It has been a friendly story so far, so here are the traps, honestly. Each comes with a workaround.&lt;/p&gt;

&lt;h3&gt;
  
  
  Weak point 1: browsers cannot speak it
&lt;/h3&gt;

&lt;p&gt;Remember the trailer, explained in Part 2 and caught on the wire in the hands-on (the final HEADERS frame carrying &lt;code&gt;grpc-status&lt;/code&gt;). gRPC reports the outcome of every RPC there, and &lt;strong&gt;the browser fetch API cannot read trailers&lt;/strong&gt;. The &lt;code&gt;te=trailers&lt;/code&gt; declaration visible in the frame capture is one a browser will never send. So plain gRPC from browser JavaScript is off the table.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F08-browser-wall-three-routes.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F08-browser-wall-three-routes.png" alt="Three routes from a browser: plain gRPC blocked, gRPC-Web through a proxy, Connect directly" width="799" height="302"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three families of workarounds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gRPC-Web&lt;/strong&gt;: a browser-safe variant of the protocol; a proxy (typically Envoy) translates to real gRPC. Longest track record&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;grpc-gateway&lt;/strong&gt;: generates a REST/JSON API from the proto and runs it as a proxy (the ABEMA approach)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connect RPC&lt;/strong&gt;: the newer option, from Buf, accepted into the CNCF in 2024. One server speaks gRPC, gRPC-Web, and plain HTTP+JSON on the same port, so the translating proxy disappears entirely. Browsers call it with ordinary fetch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Starting fresh today and wanting protobuf types in the browser, I would look at Connect first. Deleting a proxy tier from your architecture is a big operational win.&lt;/p&gt;

&lt;p&gt;Trailers are awkward even outside browsers, by the way. When Cloudflare added gRPC support to their edge in 2020, a large chunk of &lt;a href="https://blog.cloudflare.com/road-to-grpc/" rel="noopener noreferrer"&gt;the work&lt;/a&gt; was that their NGINX-based proxies barely supported HTTP trailers and their origin-facing connections were HTTP/1.1. If a CDN had to build a new proxy platform for this, your middleboxes deserve a look too: every hop between client and server must speak HTTP/2 and pass trailers through.&lt;/p&gt;

&lt;h3&gt;
  
  
  Weak point 2: load balancing skews on Kubernetes
&lt;/h3&gt;

&lt;p&gt;This is the trap people hit in production. HTTP/2's greatest strength, one long-lived connection reused for everything, collides head-on with how Kubernetes load balances by default. A Service (ClusterIP) picks a backend &lt;strong&gt;once, at connection time&lt;/strong&gt;. A long-lived gRPC connection therefore glues itself to whichever Pod it first landed on, and every subsequent request rides that connection to the same Pod.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F09-k8s-lb-skew.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fgrpc-introduction%2Fdiagrams%2F09-k8s-lb-skew.png" alt="One long-lived HTTP/2 connection pinning all traffic to a single Pod while two others idle" width="800" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The symptom: you scale the stock service to 3 Pods and one Pod melts while two idle. The fix is always some layer that picks a backend &lt;strong&gt;per request instead of per connection&lt;/strong&gt;, and there are three:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A service mesh / L7 proxy&lt;/strong&gt;: Istio's Envoy sidecars or Linkerd's own proxy balance per request. If you already run a mesh, you get this for free&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client-side load balancing&lt;/strong&gt;: built into grpc-go and friends; point it at a headless Service (&lt;code&gt;clusterIP: None&lt;/code&gt;) so the client sees every Pod IP, connects to all, and round-robins&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;xDS&lt;/strong&gt;: the client gets routing info straight from a control plane, speaking the same protocol Envoy does (proxyless gRPC). Datadog runs this setup&lt;/li&gt;
&lt;/ul&gt;
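&lt;p&gt;To make the skew concrete, here is a minimal simulation, not a real load balancer. The Pod names are made up, and "the Pod the connection first landed on" is modelled as simply the first one in the list. It only contrasts the two picking strategies described above.&lt;/p&gt;

```python
from collections import Counter
from itertools import cycle

PODS = ["pod-a", "pod-b", "pod-c"]  # hypothetical Pod names


def per_connection(requests, pods):
    """Pick a backend once, when the connection opens, then reuse it."""
    pinned = pods[0]  # stand-in for whichever Pod the connection landed on
    return Counter(pinned for _ in range(requests))


def per_request(requests, pods):
    """Pick a backend again for every request (plain round robin)."""
    picker = cycle(pods)
    return Counter(next(picker) for _ in range(requests))


if __name__ == "__main__":
    print("per connection:", dict(per_connection(300, PODS)))
    print("per request:   ", dict(per_request(300, PODS)))
```

&lt;p&gt;With 300 requests, the per-connection picker sends all 300 to one Pod, while the per-request picker gives each of the three Pods 100. All three fixes in the list are ways of moving the pick from the first function to the second.&lt;/p&gt;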

&lt;h3&gt;
  
  
  Weak point 3: humans cannot read it without tools
&lt;/h3&gt;

&lt;p&gt;It is binary; tcpdump and curl show you noise. The baseline fix is what the hands-on did: &lt;strong&gt;reflection on the server, grpcurl in your hand&lt;/strong&gt;. Postman supports gRPC if you want a GUI, and there is also &lt;a href="https://engineering.mercari.com/blog/entry/grpc_and_evans/" rel="noopener noreferrer"&gt;Evans&lt;/a&gt;, a REPL-style client that came out of Mercari. If you are rolling gRPC out to a team, make "reflection enabled on every server, at least outside production" a written rule early. It pays off weekly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Weak point 4: schema evolution needs discipline
&lt;/h3&gt;

&lt;p&gt;Field numbers are the binary compatibility contract, so &lt;strong&gt;a number, once used, can never change meaning or be recycled&lt;/strong&gt;. Deleting a field means writing &lt;code&gt;reserved 4;&lt;/code&gt; to leave a tombstone. Discipline like this should be enforced by a linter, not by memory: &lt;code&gt;buf breaking&lt;/code&gt; checks "does this change break wire compatibility" in CI. Starting with buf instead of raw protoc saves you the incident later.&lt;/p&gt;
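&lt;p&gt;The rule is mechanical enough to sketch. The snippet below is a toy that compares two versions of one message, each written as a hypothetical &lt;code&gt;{field_number: (type, name)}&lt;/code&gt; map. It illustrates the idea behind a wire-compatibility check and is not how &lt;code&gt;buf breaking&lt;/code&gt; is implemented; the field names are invented.&lt;/p&gt;

```python
def breaking_changes(old, new, reserved=frozenset()):
    """Compare two {field_number: (type, name)} maps for one message.

    Returns a list of human-readable problems. A toy illustration only,
    not the rule set of any real linter.
    """
    problems = []
    for number, (old_type, old_name) in sorted(old.items()):
        if number in new:
            new_type, _new_name = new[number]
            if new_type != old_type:
                problems.append(
                    f"field {number} changed type: {old_type} to {new_type}"
                )
        elif number not in reserved:
            problems.append(f"field {number} ({old_name}) deleted without reserved")
    return problems


# Hypothetical schema versions.
OLD = {1: ("string", "name"), 4: ("int32", "port")}
RECYCLED = {1: ("string", "name"), 4: ("string", "region")}  # number 4 reused
DROPPED = {1: ("string", "name")}                            # number 4 removed

if __name__ == "__main__":
    print(breaking_changes(OLD, RECYCLED))
    print(breaking_changes(OLD, DROPPED))
    print(breaking_changes(OLD, DROPPED, reserved={4}))
```

&lt;p&gt;Recycling number 4 and dropping it without a tombstone both get flagged; dropping it with &lt;code&gt;reserved={4}&lt;/code&gt; passes. A real checker covers far more cases, which is the argument for using one instead of writing your own.&lt;/p&gt;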

&lt;h2&gt;
  
  
  The decision table
&lt;/h2&gt;

&lt;p&gt;To wrap up the design guidance. "Everything becomes gRPC" is not the lesson; the adopters' own architecture (gRPC inside, REST outside) says so.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Situation&lt;/th&gt;
&lt;th&gt;Pick&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Internal service-to-service&lt;/td&gt;
&lt;td&gt;gRPC&lt;/td&gt;
&lt;td&gt;Typed contracts, performance, stubs in every language&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Public API for arbitrary consumers&lt;/td&gt;
&lt;td&gt;REST + OpenAPI&lt;/td&gt;
&lt;td&gt;curl-ability, ecosystem reach&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typed contracts in browser/mobile&lt;/td&gt;
&lt;td&gt;Connect or gRPC-Web&lt;/td&gt;
&lt;td&gt;Browsers cannot read trailers, so plain gRPC is out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Real-time server-to-client push&lt;/td&gt;
&lt;td&gt;gRPC server streaming&lt;/td&gt;
&lt;td&gt;No polling; one &lt;code&gt;stream&lt;/code&gt; keyword in the proto&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Internal, but consumers only speak curl&lt;/td&gt;
&lt;td&gt;gRPC + grpc-gateway&lt;/td&gt;
&lt;td&gt;Generate the REST facade from the proto&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And the tools that appeared along the way:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Job&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;protoc&lt;/code&gt; + &lt;code&gt;protoc-gen-go&lt;/code&gt; etc.&lt;/td&gt;
&lt;td&gt;Generate per-language stubs from &lt;code&gt;.proto&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;buf&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Modern protoc frontend; linting and breaking-change checks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;grpcurl&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;curl for gRPC, pairs with server reflection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Evans&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Interactive REPL gRPC client&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;grpc-gateway&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generate a REST proxy from the proto&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Connect RPC&lt;/td&gt;
&lt;td&gt;gRPC-compatible framework family with native browser support&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The question that started this post ("so what is gRPC, exactly") now has a one-sentence answer I can stand behind: &lt;strong&gt;a framework that generates every language's communication code from a contract written in a &lt;code&gt;.proto&lt;/code&gt; file, and carries the messages as protobuf over HTTP/2 streams&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Two things stuck with me from running it. First, how short the distance is from writing a proto to a working client. Second, how light the machinery is: a p50 of 50µs per call, and a wire format where I could account for every single byte. On the flip side, the browser wall and the Kubernetes balancing skew are both "trivial if you know, an outage if you don't" traps, so if you take one section into a migration meeting, take the weak points.&lt;/p&gt;

&lt;p&gt;If you want the next layer up, my xDS deep dive is the same story from the service mesh side: istiod pushing protobuf to Envoy over one long-lived gRPC stream.&lt;/p&gt;

</description>
      <category>grpc</category>
      <category>microservices</category>
      <category>go</category>
      <category>http2</category>
    </item>
    <item>
      <title>What xDS Actually Ships: Your Control Plane Sends protobuf, Not YAML</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Tue, 09 Jun 2026 17:09:28 +0000</pubDate>
      <link>https://dev.to/kanywst/what-xds-actually-ships-your-control-plane-sends-protobuf-not-yaml-2oje</link>
      <guid>https://dev.to/kanywst/what-xds-actually-ships-your-control-plane-sends-protobuf-not-yaml-2oje</guid>
      <description>&lt;h2&gt;
  
  
  The thing I had wrong for years
&lt;/h2&gt;

&lt;p&gt;When I started with Istio, one mental model stuck and it was wrong. I knew &lt;code&gt;istiod&lt;/code&gt; pushed "config" to the Envoy sidecars. Edit a &lt;code&gt;VirtualService&lt;/code&gt;, routing changes with no restart. It looked like magic.&lt;/p&gt;

&lt;p&gt;The picture in my head was this: istiod generates Envoy YAML, ships that YAML to Envoy, and Envoy loads it. Half right, half wrong, and the wrong half is the interesting one. &lt;strong&gt;istiod does not send YAML. It sends protobuf, a binary format, over a gRPC stream.&lt;/strong&gt; YAML only shows up at the human-facing edges.&lt;/p&gt;

&lt;p&gt;This post follows the payload from the top down. Words like &lt;code&gt;xDS&lt;/code&gt;, &lt;code&gt;protobuf&lt;/code&gt;, &lt;code&gt;gRPC&lt;/code&gt;, &lt;code&gt;Any&lt;/code&gt;, and &lt;code&gt;type_url&lt;/code&gt; show up along the way, and each one gets explained where it lands. At the end I stand up Istio locally on kind and pull the actual bytes off the wire to look at them.&lt;/p&gt;

&lt;p&gt;By the time you finish, one sentence should feel obvious: the control plane ships Envoy-schema protobuf messages over xDS.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cast of characters
&lt;/h2&gt;

&lt;p&gt;Jumping straight into xDS is a good way to get lost, so first the players. A service mesh has two layers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F01-cast-control-data-plane.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F01-cast-control-data-plane.png" alt="Control plane and data plane" width="800" height="611"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here is who is who.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;What it is&lt;/th&gt;
&lt;th&gt;Concrete example here&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Data plane&lt;/td&gt;
&lt;td&gt;The layer that actually moves traffic&lt;/td&gt;
&lt;td&gt;The Envoy sidecar in each Pod&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Control plane&lt;/td&gt;
&lt;td&gt;The layer that tells the data plane how to behave&lt;/td&gt;
&lt;td&gt;istiod&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Envoy&lt;/td&gt;
&lt;td&gt;A programmable proxy that accepts connections and forwards requests&lt;/td&gt;
&lt;td&gt;The sidecar itself&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sidecar&lt;/td&gt;
&lt;td&gt;A helper container that rides along with the app Pod&lt;/td&gt;
&lt;td&gt;Holds Envoy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;istiod&lt;/td&gt;
&lt;td&gt;Istio's control plane binary (single process)&lt;/td&gt;
&lt;td&gt;The source of config&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key point: &lt;strong&gt;Envoy does not know what to do on its own&lt;/strong&gt;. Which port to listen on, where to forward, who to trust, all of it comes from the control plane. The mechanism for "telling it" is xDS.&lt;/p&gt;

&lt;h2&gt;
  
  
  What problem this even solves
&lt;/h2&gt;

&lt;p&gt;Why ship config at all? Picture running Envoy on a static file and the answer falls out.&lt;/p&gt;

&lt;p&gt;In its simplest setup, Envoy boots from a single static YAML file (&lt;code&gt;envoy.yaml&lt;/code&gt;). You write "listen on 8080, forward to the &lt;code&gt;reviews&lt;/code&gt; service" and it runs. Fine while things are small.&lt;/p&gt;

&lt;p&gt;The trouble starts in a moving environment like Kubernetes. Pods come and go by the second. Their forwarding IPs are not stable. Routing rules change on every deploy. mTLS certificates (the certs that let services authenticate each other in both directions) rotate on a schedule. Doing all of that by rewriting YAML and restarting Envoy means dropping connections on every restart, and you never keep up with the churn.&lt;/p&gt;

&lt;p&gt;So the idea: receive config through an API instead of a file. When something changes, the control plane streams the delta in, and Envoy never restarts. That mechanism is xDS.&lt;/p&gt;

&lt;h2&gt;
  
  
  What xDS is
&lt;/h2&gt;

&lt;p&gt;xDS stands for "&lt;strong&gt;x&lt;/strong&gt; Discovery Service". The &lt;code&gt;x&lt;/code&gt; is a placeholder that gets filled in with a letter like &lt;code&gt;L&lt;/code&gt; (Listener) or &lt;code&gt;C&lt;/code&gt; (Cluster). Mentally expand it to "the family of APIs that ship config to Envoy".&lt;/p&gt;

&lt;p&gt;The mechanism is plain: &lt;strong&gt;Envoy dials the control plane over gRPC and receives config&lt;/strong&gt;. For now, treat gRPC as "a way to exchange messages over HTTP/2" (its real shape shows up later with protobuf). Config is not pushed at Envoy out of nowhere; Envoy asks for it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F02-xds-pull-subscribe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F02-xds-pull-subscribe.png" alt="Envoy subscribes to the xDS server" width="800" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The natural next question: who is the "xDS server" in Istio terms?&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;In Istio&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Control plane process&lt;/td&gt;
&lt;td&gt;istiod (single binary)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The piece inside it that serves xDS&lt;/td&gt;
&lt;td&gt;Pilot (a subsystem of istiod; &lt;code&gt;DiscoveryServer&lt;/code&gt; in the code)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The side that receives config (the xDS client)&lt;/td&gt;
&lt;td&gt;Each Pod's Envoy sidecar&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Since Istio 1.5 (2020) istiod is a "modular monolith": Pilot, Citadel, and Galley, once separate processes, were folded into one binary. The part that serves xDS is &lt;strong&gt;Pilot&lt;/strong&gt;. So "what part of the control plane serves xDS?" answers as "the xDS server (discovery server)" in general, or "istiod's Pilot" in Istio.&lt;/p&gt;

&lt;h2&gt;
  
  
  xDS comes in five flavors
&lt;/h2&gt;

&lt;p&gt;xDS is not one API. It splits by the kind of thing being shipped. Five do the heavy lifting, and they compose inside Envoy to serve a single request.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F03-five-xds-types.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F03-five-xds-types.png" alt="The five xDS discovery services" width="800" height="1014"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Read top to bottom and you see one request being resolved.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Kind&lt;/th&gt;
&lt;th&gt;What it ships&lt;/th&gt;
&lt;th&gt;Analogy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LDS&lt;/td&gt;
&lt;td&gt;Listeners (the address and port to accept on)&lt;/td&gt;
&lt;td&gt;The front door&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDS&lt;/td&gt;
&lt;td&gt;Routes (where each path or host goes)&lt;/td&gt;
&lt;td&gt;The reception map&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CDS&lt;/td&gt;
&lt;td&gt;Clusters (the group of upstream targets)&lt;/td&gt;
&lt;td&gt;The destination department&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EDS&lt;/td&gt;
&lt;td&gt;Endpoints (the real Pod IPs and ports)&lt;/td&gt;
&lt;td&gt;The home addresses of people in that department&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SDS&lt;/td&gt;
&lt;td&gt;Secrets (TLS certs and keys)&lt;/td&gt;
&lt;td&gt;The ID card&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;EDS updates the most often, because the set of endpoints changes every time a workload scales or a Pod is replaced.&lt;/p&gt;

&lt;h3&gt;
  
  
  ADS: one stream to carry them all
&lt;/h3&gt;

&lt;p&gt;A fair question: with five types, does Envoy open five connections to istiod? Usually no. Istio uses &lt;strong&gt;ADS (Aggregated Discovery Service)&lt;/strong&gt; to bundle all of them onto &lt;strong&gt;a single gRPC stream&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The reason is ordering. With separate streams, nothing guarantees which type arrives first. If a route (RDS) shows up before the cluster it points at (CDS), or a cluster goes live before its endpoints (EDS) do, there is a brief window where config references something that is not there yet, and traffic can drop. One stream lets istiod control the order of delivery.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F04-ads-single-stream.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F04-ads-single-stream.png" alt="ADS multiplexes every type on one stream" width="800" height="312"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;"How does one stream keep five types apart?" That is what &lt;code&gt;type_url&lt;/code&gt; is for. Its real shape arrives with protobuf below. For now it is just a per-type label.&lt;/p&gt;
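&lt;p&gt;As a rough sketch of that labelling, the snippet below sorts messages from one stream into per-type buckets by their &lt;code&gt;type_url&lt;/code&gt;. The type URLs are the Envoy v3 resource names; the stream contents and payload bytes are placeholders I made up, not real serialized protobuf, and real Envoy does far more than bucket messages.&lt;/p&gt;

```python
from collections import defaultdict

# Type URLs of the five Envoy v3 resource types discussed above.
PREFIX = "type.googleapis.com/"
KIND_BY_TYPE_URL = {
    PREFIX + "envoy.config.listener.v3.Listener": "LDS",
    PREFIX + "envoy.config.route.v3.RouteConfiguration": "RDS",
    PREFIX + "envoy.config.cluster.v3.Cluster": "CDS",
    PREFIX + "envoy.config.endpoint.v3.ClusterLoadAssignment": "EDS",
    PREFIX + "envoy.extensions.transport_sockets.tls.v3.Secret": "SDS",
}


def demux(stream):
    """Sort (type_url, payload) pairs from one stream into per-kind buckets."""
    buckets = defaultdict(list)
    for type_url, payload in stream:
        buckets[KIND_BY_TYPE_URL[type_url]].append(payload)
    return dict(buckets)


# Hypothetical responses as they might interleave on a single stream.
# Payloads are placeholder bytes, not real serialized protobuf.
STREAM = [
    (PREFIX + "envoy.config.cluster.v3.Cluster", b"cluster-1"),
    (PREFIX + "envoy.config.endpoint.v3.ClusterLoadAssignment", b"endpoints-1"),
    (PREFIX + "envoy.config.listener.v3.Listener", b"listener-1"),
    (PREFIX + "envoy.config.endpoint.v3.ClusterLoadAssignment", b"endpoints-2"),
]

if __name__ == "__main__":
    for kind, payloads in demux(STREAM).items():
        print(kind, payloads)
```

&lt;p&gt;The label travels with every message, so the receiver never has to guess which of the five kinds it is holding, no matter how they interleave.&lt;/p&gt;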

&lt;h2&gt;
  
  
  The misconception worth killing
&lt;/h2&gt;

&lt;p&gt;I kept saying "ship config". Back to the wrong mental model from the top.&lt;/p&gt;

&lt;p&gt;For a long time I assumed istiod generated Envoy YAML and sent it. &lt;code&gt;istioctl proxy-config&lt;/code&gt; spits out JSON, Envoy config is "YAML", so of course that is what flows. Right?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrong.&lt;/strong&gt; What istiod sends over gRPC is neither YAML nor JSON. It is &lt;strong&gt;protobuf&lt;/strong&gt;, a binary format.&lt;/p&gt;

&lt;p&gt;YAML shows up in exactly two places. One is when a human hand-writes &lt;code&gt;envoy.yaml&lt;/code&gt; (the bootstrap config at startup). The other is when protobuf gets rendered back into something readable (&lt;code&gt;istioctl proxy-config ... -o yaml&lt;/code&gt; and friends). Nowhere on the delivery path does YAML or JSON appear.&lt;/p&gt;

&lt;p&gt;To get why, you need to know what protobuf is, which is where this goes next.&lt;/p&gt;

&lt;h2&gt;
  
  
  What protobuf is
&lt;/h2&gt;

&lt;p&gt;protobuf is short for Protocol Buffers. It is a Google format, and in one line it is &lt;strong&gt;a language-neutral serialization format, plus a schema language, for packing structs into bytes and back&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Serialization, briefly
&lt;/h3&gt;

&lt;p&gt;An object (a struct) in memory cannot go on the wire as is. You convert it to a form that can travel (a byte string). That is serialization. The receiver does the reverse (deserialization) to get the struct back.&lt;/p&gt;

&lt;p&gt;JSON, YAML, and XML aim at the same thing. Here is the difference.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Format&lt;/th&gt;
&lt;th&gt;Kind&lt;/th&gt;
&lt;th&gt;Human-readable&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;th&gt;Parse speed&lt;/th&gt;
&lt;th&gt;Schema&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON&lt;/td&gt;
&lt;td&gt;text&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;large&lt;/td&gt;
&lt;td&gt;slow&lt;/td&gt;
&lt;td&gt;optional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YAML&lt;/td&gt;
&lt;td&gt;text&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;large&lt;/td&gt;
&lt;td&gt;slow&lt;/td&gt;
&lt;td&gt;optional&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;protobuf&lt;/td&gt;
&lt;td&gt;binary&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;small&lt;/td&gt;
&lt;td&gt;fast&lt;/td&gt;
&lt;td&gt;required&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;protobuf is unreadable to humans but small and fast, which suits machine-to-machine traffic. When a control plane ships config to thousands of Envoys at high frequency, that difference earns its keep.&lt;/p&gt;

&lt;h3&gt;
  
  
  You write the schema first (.proto)
&lt;/h3&gt;

&lt;p&gt;protobuf's defining trait is "define the type first". You write the type in a &lt;code&gt;.proto&lt;/code&gt; file. A small example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;Listener&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;Address&lt;/span&gt; &lt;span class="na"&gt;address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;repeated&lt;/span&gt; &lt;span class="n"&gt;FilterChain&lt;/span&gt; &lt;span class="na"&gt;filter_chains&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;How to read it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;message&lt;/code&gt; defines a struct (a class). It is one type.&lt;/li&gt;
&lt;li&gt;Each line is the triple &lt;code&gt;type name = number;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;= 1&lt;/code&gt;, &lt;code&gt;= 2&lt;/code&gt; are &lt;strong&gt;field numbers (tags)&lt;/strong&gt;. On the wire the field is identified by this number, not by its name. That keeps it small, and a number, once chosen, must never change.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;repeated&lt;/code&gt; is an array. &lt;code&gt;filter_chains&lt;/code&gt; can hold many.&lt;/li&gt;
&lt;/ul&gt;
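&lt;p&gt;To see what "identified by number, not by name" means in bytes, here is a hand-rolled encoder for a single string field, written with the standard library only. It is a sketch of the wire format for this one case (a length-delimited field), not a replacement for generated code; the value &lt;code&gt;"web"&lt;/code&gt; is just an example.&lt;/p&gt;

```python
import json


def varint(n):
    """Encode a non-negative int as a protobuf varint (7 bits per byte)."""
    out = bytearray()
    while True:
        low7, n = n % 128, n // 128  # take the low 7 bits, then drop them
        if n == 0:
            out.append(low7)
            return bytes(out)
        out.append(low7 + 128)  # more bytes follow: set the continuation bit


def string_field(number, text):
    """Encode one string field: tag, length, then the UTF-8 bytes."""
    data = text.encode("utf-8")
    tag = number * 8 + 2  # field number shifted left by 3, plus wire type 2
    return varint(tag) + varint(len(data)) + data


wire = string_field(1, "web")  # string name = 1;
as_json = json.dumps({"name": "web"}).encode("utf-8")

if __name__ == "__main__":
    print(wire.hex(), len(wire), len(as_json))
```

&lt;p&gt;The field comes out as &lt;code&gt;0a03776562&lt;/code&gt;, five bytes: one for the tag (field 1, length-delimited), one for the length, three for the text. The name &lt;code&gt;name&lt;/code&gt; appears nowhere. The same value as JSON is 15 bytes, most of it the key and punctuation.&lt;/p&gt;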

&lt;p&gt;Run this &lt;code&gt;.proto&lt;/code&gt; through the &lt;code&gt;protoc&lt;/code&gt; compiler and it generates classes in each language.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F05-proto-codegen.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F05-proto-codegen.png" alt="protoc generates Go and C++ classes from one proto" width="800" height="411"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is the payoff. &lt;strong&gt;istiod (Go) and Envoy (C++) compile the same &lt;code&gt;.proto&lt;/code&gt;.&lt;/strong&gt; They share the exact same type, so bytes packed by one deserialize straight into the other's class. No intermediate text like YAML is needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  How gRPC fits
&lt;/h3&gt;

&lt;p&gt;To close the loop: &lt;strong&gt;gRPC is an RPC framework that carries protobuf&lt;/strong&gt;. The messages gRPC exchanges are defined in protobuf. xDS is "stream protobuf over a bidirectional gRPC connection", so protobuf and gRPC always show up together.&lt;/p&gt;

&lt;h3&gt;
  
  
  Any and type_url get special treatment
&lt;/h3&gt;

&lt;p&gt;If you read xDS, you cannot avoid &lt;code&gt;google.protobuf.Any&lt;/code&gt;. A normal field has a fixed type, but &lt;code&gt;Any&lt;/code&gt; is &lt;strong&gt;a box whose content type is decided at runtime&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;Filter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;google.protobuf.Any&lt;/span&gt; &lt;span class="na"&gt;typed_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An &lt;code&gt;Any&lt;/code&gt; looks like this inside:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  type_url: "type.googleapis.com/envoy....Router"
  value:    &amp;lt; bytes of a serialized Router &amp;gt;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;type_url&lt;/code&gt; declares "this is a Router", and &lt;code&gt;value&lt;/code&gt; holds that type packed into bytes. Envoy reads &lt;code&gt;type_url&lt;/code&gt; and concludes "then deserialize this as Router".&lt;/p&gt;

&lt;p&gt;The reason this exists: filters and plugins are open-ended. The Listener &lt;code&gt;.proto&lt;/code&gt; cannot enumerate every type in advance, so it keeps a "type decided at runtime" box. The &lt;code&gt;type_url&lt;/code&gt; from the ADS section is exactly this mechanism, telling the stream "the type flowing right now is X".&lt;/p&gt;
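&lt;p&gt;The "read the label, then pick the decoder" step can be sketched in a few lines. Everything here is a stand-in: &lt;code&gt;AnyBox&lt;/code&gt; mimics the shape of &lt;code&gt;google.protobuf.Any&lt;/code&gt;, the &lt;code&gt;Router&lt;/code&gt; class, its field, the type URL, and the one-byte encoding are all hypothetical, and real code would use the generated protobuf classes instead.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Router:
    """Stand-in for a filter config type; the field is hypothetical."""
    start_child_span: bool


@dataclass
class AnyBox:
    """Mimics the shape of google.protobuf.Any: a label plus opaque bytes."""
    type_url: str
    value: bytes


# Hypothetical type URL and one-byte encoding, for illustration only.
ROUTER_URL = "type.example.com/demo.Router"

# Registry of known types: the label selects the decoder.
DECODERS = {
    ROUTER_URL: lambda raw: Router(start_child_span=(raw == b"\x01")),
}


def pack(router):
    """Serialize a Router and label the bytes with its type URL."""
    return AnyBox(ROUTER_URL, b"\x01" if router.start_child_span else b"\x00")


def unpack(box):
    """Read type_url first, then decode value as that type."""
    decoder = DECODERS.get(box.type_url)
    if decoder is None:
        raise ValueError("unknown type_url: " + box.type_url)
    return decoder(box.value)


if __name__ == "__main__":
    print(unpack(pack(Router(start_child_span=True))))
```

&lt;p&gt;The outer message never needs to know about &lt;code&gt;Router&lt;/code&gt;. Adding a new filter type means adding one entry to the registry, which is the open-endedness the box buys; an unknown label fails loudly instead of being misread.&lt;/p&gt;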

&lt;h2&gt;
  
  
  Reading the real listener.proto
&lt;/h2&gt;

&lt;p&gt;That was the setup. Now the real thing. I pulled Envoy's &lt;code&gt;listener.proto&lt;/code&gt; from upstream. It runs 450 lines, so I will pick out the syntax highlights with their actual line numbers.&lt;/p&gt;

&lt;p&gt;The header first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="na"&gt;syntax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"proto3"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                              &lt;span class="c1"&gt;// line 1: protobuf version&lt;/span&gt;
&lt;span class="kn"&gt;package&lt;/span&gt; &lt;span class="nn"&gt;envoy&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config.listener.v3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;               &lt;span class="c1"&gt;// line 3: namespace&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"envoy/config/core/v3/address.proto"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;    &lt;span class="c1"&gt;// line 6: import an Envoy type&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"google/protobuf/wrappers.proto"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// line 16: import a Google standard type&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"xds/type/matcher/v3/matcher.proto"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;// line 19: import an xds.* type&lt;/span&gt;
&lt;span class="k"&gt;option&lt;/span&gt; &lt;span class="na"&gt;go_package&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;".../listener/v3;listenerv3"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// line 30: Go package on generation&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An important fact is already visible. The imported types fall into &lt;strong&gt;three families&lt;/strong&gt;: &lt;code&gt;envoy/*&lt;/code&gt;, &lt;code&gt;google/protobuf/*&lt;/code&gt;, and &lt;code&gt;xds/*&lt;/code&gt;. The difference matters later, so keep it in the back of your mind. The &lt;code&gt;listenerv3&lt;/code&gt; alias in &lt;code&gt;go_package&lt;/code&gt; is the package name of the generated Go code, which is where the &lt;code&gt;listenerv3.Listener{}&lt;/code&gt; I used earlier comes from.&lt;/p&gt;

&lt;p&gt;Now the body. An excerpt from &lt;code&gt;message Listener&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// [#next-free-field: 39]            // line 64: next free field number is 39&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;Listener&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;DrainType&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;                   &lt;span class="c1"&gt;// line 68: an enum nested inside the message&lt;/span&gt;
    &lt;span class="na"&gt;DEFAULT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                     &lt;span class="c1"&gt;// line 71: proto3 enums must start at 0&lt;/span&gt;
    &lt;span class="na"&gt;MODIFY_ONLY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;reserved&lt;/span&gt; &lt;span class="mi"&gt;14&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;23&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                   &lt;span class="c1"&gt;// line 140: numbers 14 and 23 are retired, never reuse&lt;/span&gt;

  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                   &lt;span class="c1"&gt;// line 145&lt;/span&gt;
  &lt;span class="n"&gt;core.v3.Address&lt;/span&gt; &lt;span class="na"&gt;address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// line 157: type from another package&lt;/span&gt;
  &lt;span class="k"&gt;repeated&lt;/span&gt; &lt;span class="n"&gt;FilterChain&lt;/span&gt; &lt;span class="na"&gt;filter_chains&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// line 176: array&lt;/span&gt;
  &lt;span class="n"&gt;google.protobuf.BoolValue&lt;/span&gt; &lt;span class="na"&gt;use_original_dst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// line 207: a wrapper type&lt;/span&gt;

  &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;listener_specifier&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;         &lt;span class="c1"&gt;// line 416: only one of these may be set&lt;/span&gt;
    &lt;span class="n"&gt;InternalListenerConfig&lt;/span&gt; &lt;span class="na"&gt;internal_listener&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;27&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things that look strange on first read.&lt;/p&gt;

&lt;h3&gt;
  
  
  reserved (retired numbers)
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;reserved 14, 23;&lt;/code&gt; (line 140) seals off field numbers that were used in the past. The field number is the on-wire identifier, so if an old client that still treats 14 as its original type is around, reusing 14 for something else makes that client misread the bytes. So the number is reserved and never used again. The weight of a number here is nothing like a JSON key name.&lt;/p&gt;
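&lt;p&gt;Why the number carries so much weight is easier to see in bytes. In the protobuf wire format, every field is preceded by a key: the field number shifted left by three bits, combined with a wire type, written as a varint. The field name never appears. The sketch below computes that key with Go's standard library; &lt;code&gt;EncodeTag&lt;/code&gt; and &lt;code&gt;DecodeTag&lt;/code&gt; are helper names I chose for illustration, not functions from the protobuf library.&lt;/p&gt;

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Wire types from the protobuf encoding spec.
const (
	WireVarint = 0 // int32, bool, enum, ...
	WireBytes  = 2 // string, bytes, embedded messages
)

// EncodeTag builds the key written before every field on the wire:
// the field number shifted left by three bits, OR-ed with the wire type.
func EncodeTag(fieldNumber, wireType uint64) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, fieldNumber*8|wireType)
	return buf[:n]
}

// DecodeTag recovers the field number and wire type from the key.
func DecodeTag(b []byte) (fieldNumber, wireType uint64) {
	key, _ := binary.Uvarint(b)
	return key >> 3, key % 8
}

func main() {
	tag := EncodeTag(14, WireBytes)
	field, wire := DecodeTag(tag)
	fmt.Println(tag, field, wire)
}
```

&lt;p&gt;A reader that sees this key only learns "field 14, length-delimited". What those bytes mean comes entirely from the reader's own copy of the schema, which is why a retired number must stay retired.&lt;/p&gt;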

&lt;h3&gt;
  
  
  Wrapper types (the one that matters most)
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;google.protobuf.BoolValue use_original_dst = 4;&lt;/code&gt; (line 207). Why &lt;code&gt;BoolValue&lt;/code&gt; instead of &lt;code&gt;bool&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;A plain proto3 scalar (&lt;code&gt;bool&lt;/code&gt;, &lt;code&gt;int32&lt;/code&gt;, and so on, declared without the &lt;code&gt;optional&lt;/code&gt; keyword) &lt;strong&gt;cannot tell "unset" from "the zero value"&lt;/strong&gt;. &lt;code&gt;bool&lt;/code&gt; defaults to &lt;code&gt;false&lt;/code&gt;, so "explicitly false" and "never set" are indistinguishable.&lt;/p&gt;

&lt;p&gt;Envoy config has many fields that need all three states (true / false / unset). The fix is to wrap &lt;code&gt;bool&lt;/code&gt; in a message, &lt;code&gt;BoolValue&lt;/code&gt;. &lt;strong&gt;A message can express presence&lt;/strong&gt;, so you get the distinction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F06-bool-vs-boolvalue.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F06-bool-vs-boolvalue.png" alt="bool versus BoolValue and the three states" width="800" height="1304"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;UInt32Value&lt;/code&gt; follows the same logic, and message-typed fields such as &lt;code&gt;Duration&lt;/code&gt; get presence for free because they are messages. When you hit an &lt;code&gt;XxxValue&lt;/code&gt; in Envoy &lt;code&gt;.proto&lt;/code&gt;, it is a field that needs to tell "unset" apart from every real value, the zero value included.&lt;/p&gt;
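&lt;p&gt;On the Go side, a wrapper field becomes a pointer, and &lt;code&gt;nil&lt;/code&gt; is the "unset" state. The sketch below shows the three states with a hand-written stand-in: &lt;code&gt;BoolValue&lt;/code&gt;, &lt;code&gt;Listener&lt;/code&gt;, &lt;code&gt;Bool&lt;/code&gt;, and &lt;code&gt;Describe&lt;/code&gt; here are simplified illustrations, not the generated types. Real code would use the generated &lt;code&gt;Listener&lt;/code&gt; and the &lt;code&gt;wrapperspb&lt;/code&gt; package.&lt;/p&gt;

```go
package main

import "fmt"

// BoolValue stands in for google.protobuf.BoolValue: a message wrapping one bool.
type BoolValue struct {
	Value bool
}

// Listener keeps only the field relevant to this sketch.
type Listener struct {
	UseOriginalDst *BoolValue // nil means "unset"
}

// Bool allocates a wrapper, playing the role of wrapperspb.Bool in real code.
func Bool(v bool) *BoolValue {
	b := new(BoolValue)
	b.Value = v
	return b
}

// Describe reports which of the three states the field is in.
func Describe(l Listener) string {
	if l.UseOriginalDst == nil {
		return "unset"
	}
	if l.UseOriginalDst.Value {
		return "true"
	}
	return "false"
}

func main() {
	fmt.Println(Describe(Listener{}))
	fmt.Println(Describe(Listener{UseOriginalDst: Bool(false)}))
	fmt.Println(Describe(Listener{UseOriginalDst: Bool(true)}))
}
```

&lt;p&gt;With a plain &lt;code&gt;bool&lt;/code&gt; field, the first two cases would collapse into one.&lt;/p&gt;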

&lt;h3&gt;
  
  
  oneof (mutually exclusive)
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;oneof listener_specifier&lt;/code&gt; (line 416) means &lt;strong&gt;at most one&lt;/strong&gt; of its fields is set. You cannot hold two at once: setting a second one clears the first, and leaving all of them unset is also valid. It expresses "A or B, not both" at the schema level.&lt;/p&gt;
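&lt;p&gt;In Go, a &lt;code&gt;oneof&lt;/code&gt; is typically modelled as a single interface-typed field, so at most one case can be held at a time. Here is a hand-rolled sketch of that shape. &lt;code&gt;ExampleCase&lt;/code&gt; is a hypothetical second case I added only to show exclusivity; it is not a field of the real &lt;code&gt;Listener&lt;/code&gt;, and the other names are simplified rather than the generated ones.&lt;/p&gt;

```go
package main

import "fmt"

// ListenerSpecifier plays the role of the oneof: one interface-typed field
// can hold at most one concrete case at a time.
type ListenerSpecifier interface {
	isListenerSpecifier()
}

// InternalListener mirrors the case shown in the excerpt.
type InternalListener struct{}

// ExampleCase is a hypothetical second case, added only to show exclusivity.
type ExampleCase struct{}

func (InternalListener) isListenerSpecifier() {}
func (ExampleCase) isListenerSpecifier()      {}

// Listener keeps only the oneof field for this sketch.
type Listener struct {
	Specifier ListenerSpecifier // nil means no case is set
}

// Which names the case currently held by the oneof.
func Which(l Listener) string {
	switch l.Specifier.(type) {
	case InternalListener:
		return "internal_listener"
	case ExampleCase:
		return "example_case"
	default:
		return "none"
	}
}

func main() {
	var l Listener
	fmt.Println(Which(l))
	l.Specifier = InternalListener{}
	l.Specifier = ExampleCase{} // replaces the previous case
	fmt.Println(Which(l))
}
```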

&lt;h2&gt;
  
  
  The xDS proto and the Envoy proto are not the same
&lt;/h2&gt;

&lt;p&gt;Back to the imports splitting into three families. This is the structure I most want to land. "Is the xDS proto different from the Envoy proto?" Yes, &lt;strong&gt;and it splits into three layers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F07-three-layers.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F07-three-layers.png" alt="The three layers of xDS protos" width="800" height="849"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As a table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;Where defined&lt;/th&gt;
&lt;th&gt;Envoy-only&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Transport&lt;/td&gt;
&lt;td&gt;The subscribe and deliver envelope (the xDS protocol itself)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;envoy/service/discovery/v3&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No: the protocol is general, but the proto lives in the envoy repo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Shared data model&lt;/td&gt;
&lt;td&gt;Base types like matcher&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;xds.*&lt;/code&gt; (cncf/xds)&lt;/td&gt;
&lt;td&gt;No, vendor-neutral&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource schema&lt;/td&gt;
&lt;td&gt;The contents of Listener, Cluster, etc&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;envoy.config.*&lt;/code&gt; (envoy)&lt;/td&gt;
&lt;td&gt;Yes, Envoy-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The history makes it click. xDS types all used to live in Envoy's repo under &lt;code&gt;envoy.api.v2.*&lt;/code&gt; (the &lt;code&gt;previous_message_type = "envoy.api.v2.Listener"&lt;/code&gt; inside &lt;code&gt;listener.proto&lt;/code&gt; is the fossil of that). Then there was a push to make xDS a general API usable beyond Envoy, because gRPC itself wants to speak xDS without a proxy ("proxyless gRPC"). The shared parts were carved out into &lt;code&gt;cncf/xds&lt;/code&gt; (formerly UDPA, the Universal Data Plane API). The &lt;code&gt;xds/core/v3&lt;/code&gt; and &lt;code&gt;xds/type/matcher/v3&lt;/code&gt; imports are that.&lt;/p&gt;

&lt;p&gt;But &lt;strong&gt;the shape of resource bodies like Listener and Cluster is still defined by Envoy&lt;/strong&gt;, because what the data plane can interpret is Envoy's call to make.&lt;/p&gt;

&lt;p&gt;So the precise statement:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;xDS is "the delivery protocol" plus "vendor-neutral base types". The contents of Listener and Cluster are Envoy's proto. What istiod ships is "the xDS transport envelope, stuffed with Envoy-schema resources".&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What ties the transport layer (①) and the resource schema (③) together is &lt;code&gt;type_url&lt;/code&gt;. The envelope &lt;code&gt;DiscoveryResponse&lt;/code&gt; (layer ①) holds each resource as an &lt;code&gt;Any&lt;/code&gt;, and its &lt;code&gt;type_url&lt;/code&gt; points at &lt;code&gt;type.googleapis.com/envoy.config.listener.v3.Listener&lt;/code&gt; (layer ③). ① is the envelope, ③ is the letter inside.&lt;/p&gt;

&lt;h2&gt;
  
  
  When does the protobuf get filled in
&lt;/h2&gt;

&lt;p&gt;An important distinction. The &lt;code&gt;.proto&lt;/code&gt;, and the Go/C++ classes generated from it, are &lt;strong&gt;types only, with zero bytes of data&lt;/strong&gt;. So when does the content get filled in?&lt;/p&gt;

&lt;p&gt;At runtime, while istiod is running. On a timeline, building the type and filling the data are clearly separate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F08-build-vs-runtime.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F08-build-vs-runtime.png" alt="Types at build time, data filled at runtime" width="800" height="878"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Inside istiod, information read from Kubernetes gets poured into the empty types. The code looks roughly like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;lis&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;listenerv3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Listener&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt;            &lt;span class="c"&gt;// new up the type (still empty)&lt;/span&gt;
&lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"0.0.0.0_8080"&lt;/span&gt;                &lt;span class="c"&gt;// fill a value&lt;/span&gt;
&lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Address&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buildAddress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;svc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c"&gt;// fill from the Service&lt;/span&gt;
&lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FilterChains&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;buildChains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eps&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c"&gt;// fill from VirtualService and Endpoints&lt;/span&gt;

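&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;anypb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;              &lt;span class="c"&gt;// turn it into bytes here, wrapped in an Any (error ignored for brevity)&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;discoveryv3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DiscoveryResponse&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c"&gt;// the xDS envelope&lt;/span&gt;
    &lt;span class="n"&gt;TypeUrl&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TypeUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Resources&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;anypb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                        &lt;span class="c"&gt;// send to Envoy&lt;/span&gt;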
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Phase by phase, the data state changes like this.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;th&gt;What happens&lt;/th&gt;
&lt;th&gt;Data state&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Definition&lt;/td&gt;
&lt;td&gt;When you write the &lt;code&gt;.proto&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Decide the type's shape&lt;/td&gt;
&lt;td&gt;No data (blueprint)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Code generation&lt;/td&gt;
&lt;td&gt;Build time (protoc)&lt;/td&gt;
&lt;td&gt;Turn the type into language classes&lt;/td&gt;
&lt;td&gt;No data (empty classes)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filling&lt;/td&gt;
&lt;td&gt;While istiod runs&lt;/td&gt;
&lt;td&gt;Pour Kubernetes info into the type&lt;/td&gt;
&lt;td&gt;Values present&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sending&lt;/td&gt;
&lt;td&gt;On each fill&lt;/td&gt;
&lt;td&gt;Marshal and push&lt;/td&gt;
&lt;td&gt;Becomes bytes on the wire&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  What goes on inside istiod
&lt;/h3&gt;

&lt;p&gt;A bit deeper. istiod has several subsystems, and filling the config happens through their interplay.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F09-istiod-internals.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F09-istiod-internals.png" alt="istiod internal subsystems" width="800" height="1079"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;istiod reads 20-plus resource types (VirtualService, Service, and the rest) and aggregates them into an internal snapshot called &lt;code&gt;PushContext&lt;/code&gt;. Then a per-type &lt;code&gt;Generator&lt;/code&gt; (&lt;code&gt;RouteGenerator&lt;/code&gt; and friends) builds config tailored to the Envoy that connected.&lt;/p&gt;

&lt;p&gt;The interesting part: &lt;strong&gt;generation depends on which Envoy connected&lt;/strong&gt;. istiod does not translate inputs to xDS mechanically. It looks at the client's labels and computes "the set of policies that apply to this proxy" before building config. The cost of that flexibility is that config translation eats most of istiod's resources. When istiod pins a CPU on a large mesh, this generation step is usually why.&lt;/p&gt;

&lt;h3&gt;
  
  
  What triggers a fill
&lt;/h3&gt;

&lt;p&gt;Two main triggers.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;When Envoy connects and subscribes&lt;/strong&gt;: Envoy boots, opens a gRPC stream to istiod, and sends a &lt;code&gt;DiscoveryRequest&lt;/code&gt; ("give me LDS"). istiod fills the latest snapshot into the type and returns it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When config changes&lt;/strong&gt;: someone runs &lt;code&gt;kubectl apply&lt;/code&gt;, a Pod scales and endpoints change, and so on. istiod's watch fires, and it recomputes and pushes to the affected Envoys.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It does not fire blindly. istiod batches changes for a moment (debounce) and only sends the kinds that changed. Endpoints only? Just EDS. A VirtualService? RDS and maybe CDS, and so on.&lt;/p&gt;

&lt;p&gt;There are also two delivery styles: "State of the World (SotW)", which sends everything, and "Delta (Incremental) xDS", which sends only the diff. &lt;strong&gt;Delta became the default in Istio 1.22&lt;/strong&gt;, because at scale SotW means re-sending every resource each time, which is heavy on the network and the control plane.&lt;/p&gt;
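&lt;p&gt;The difference between the two styles is just "everything" versus "a diff". Here is a minimal sketch, assuming a snapshot can be modelled as a map from resource name to version. That model and the names &lt;code&gt;Snapshot&lt;/code&gt;, &lt;code&gt;SotW&lt;/code&gt;, and &lt;code&gt;Delta&lt;/code&gt; are mine, not istiod's. The two return values of &lt;code&gt;Delta&lt;/code&gt; loosely correspond to the &lt;code&gt;resources&lt;/code&gt; and &lt;code&gt;removed_resources&lt;/code&gt; fields of a delta response.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sort"
)

// Snapshot maps a resource name to its version (illustrative model).
type Snapshot map[string]string

// SotW returns every resource name in the new snapshot: the full set is re-sent.
func SotW(next Snapshot) []string {
	names := make([]string, 0, len(next))
	for name := range next {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// Delta returns only what changed: names added or updated, and names removed.
func Delta(prev, next Snapshot) (changed, removed []string) {
	for name, version := range next {
		if old, ok := prev[name]; !ok || old != version {
			changed = append(changed, name)
		}
	}
	for name := range prev {
		if _, ok := next[name]; !ok {
			removed = append(removed, name)
		}
	}
	sort.Strings(changed)
	sort.Strings(removed)
	return changed, removed
}

func main() {
	prev := Snapshot{"a": "1", "b": "1", "c": "1"}
	next := Snapshot{"a": "1", "b": "2", "d": "1"}
	fmt.Println(SotW(next))
	fmt.Println(Delta(prev, next))
}
```

&lt;p&gt;With thousands of clusters and one changed endpoint, the first function returns thousands of names and the second returns one. That gap is the whole argument for Delta.&lt;/p&gt;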

&lt;h2&gt;
  
  
  What Envoy does with it
&lt;/h2&gt;

&lt;p&gt;Switch to Envoy's side. When Envoy receives the protobuf bytes, it does not convert them to YAML. It deserializes the bytes straight into the matching type (a C++ class) it already holds, and that becomes the live &lt;code&gt;listener&lt;/code&gt; or &lt;code&gt;cluster&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;istiod turns a Go object into bytes; Envoy turns bytes back into a C++ object. Neither side ever touches a text form (YAML or JSON). The JSON you see in &lt;code&gt;istioctl proxy-config&lt;/code&gt; is Envoy rendering the binary into something readable for debugging, not the engine running on JSON.&lt;/p&gt;

&lt;h3&gt;
  
  
  ACK/NACK rejects broken config
&lt;/h3&gt;

&lt;p&gt;Delivery is not fire and forget. Envoy validates what it receives and replies ACK if it applied, NACK if it could not.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F10-ack-nack.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F10-ack-nack.png" alt="ACK and NACK flow" width="800" height="849"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;nonce&lt;/code&gt; is an identifier stamped on each delivery, used to match a reply to the response it answers. On a NACK, Envoy drops the new config and keeps running the old one. That makes it a safety net: pushing broken config does not instantly kill traffic. A NACK in istiod's logs usually means the generated config is invalid for the Envoy version that sidecar runs.&lt;/p&gt;
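&lt;p&gt;The handshake can be sketched as a tiny state machine. The reply to a response is simply the next request on the stream: it echoes the nonce, and carries either the newly applied version (ACK) or the last good version plus an error (NACK). The structs below keep only those fields, &lt;code&gt;ErrorDetail&lt;/code&gt; is simplified to a string, and &lt;code&gt;validate&lt;/code&gt; is a placeholder check I invented, so treat this as an illustration of the rule rather than Envoy's implementation.&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// Response keeps only the fields the ACK/NACK handshake needs.
type Response struct {
	VersionInfo string
	Nonce       string
	Config      string
}

// Request is the client's reply to a Response.
type Request struct {
	VersionInfo   string // last version the client successfully applied
	ResponseNonce string // nonce of the response being answered
	ErrorDetail   string // non-empty marks a NACK
}

// Client tracks the config currently in effect.
type Client struct {
	Version string
	Config  string
}

// validate is a placeholder; a real proxy validates the decoded resource.
func validate(config string) error {
	if config == "" {
		return errors.New("empty config")
	}
	return nil
}

// Handle applies a response if it is valid and builds the reply.
func (c *Client) Handle(r Response) Request {
	if err := validate(r.Config); err != nil {
		// NACK: keep the old config and report the last good version.
		return Request{VersionInfo: c.Version, ResponseNonce: r.Nonce, ErrorDetail: err.Error()}
	}
	// ACK: switch to the new config and echo the new version.
	c.Version, c.Config = r.VersionInfo, r.Config
	return Request{VersionInfo: c.Version, ResponseNonce: r.Nonce}
}

func main() {
	c := new(Client)
	fmt.Println(c.Handle(Response{VersionInfo: "1", Nonce: "a", Config: "listener-v1"}))
	fmt.Println(c.Handle(Response{VersionInfo: "2", Nonce: "b", Config: ""}))
	fmt.Println(c.Config)
}
```

&lt;p&gt;After the second call the client is still running &lt;code&gt;listener-v1&lt;/code&gt;, which is the safety net described above.&lt;/p&gt;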

&lt;h2&gt;
  
  
  Applying it: which layer does metadata go in
&lt;/h2&gt;

&lt;p&gt;You now know what flows. The question that always comes up in real design is: when you want to inject some metadata (this Pod is v2, this path needs stronger auth, and so on), which of LDS/RDS/CDS/EDS is the right place? There is a clear rule for this, and each layer has things only it can express.&lt;/p&gt;

&lt;h3&gt;
  
  
  The two questions
&lt;/h3&gt;

&lt;p&gt;Envoy processes one request from downstream (the client side) to upstream (the backend side) in this order.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F11-request-pipeline.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-protobuf-deep-dive%2Fdiagrams%2F11-request-pipeline.png" alt="Envoy request processing pipeline" width="800" height="1921"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Where metadata goes is decided by asking, in order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;What does this metadata vary by?&lt;/strong&gt; Per port means LDS, per service means CDS, per path or header means RDS, per Pod means EDS. The smallest grain at which it differs is the layer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Who reads it, and when?&lt;/strong&gt; Place it at the layer where the consuming filter or load balancer runs, or at an earlier stage of the pipeline, or it cannot be read.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  What only each layer can express
&lt;/h3&gt;

&lt;p&gt;"Only" is not a figure of speech here. It means structurally there is nowhere else to put it.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Where it goes (proto)&lt;/th&gt;
&lt;th&gt;What only this layer can express&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;LDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Listener.metadata&lt;/code&gt; / FilterChain&lt;/td&gt;
&lt;td&gt;Whether a filter is installed at all. Later layers can only override what this installs. L4 decisions like SNI and original destination&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Route.metadata&lt;/code&gt; / &lt;code&gt;typed_per_filter_config&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Decisions based on HTTP request attributes (path, header). Per-path filter overrides (rate limit, ext_authz) live only here&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Cluster.metadata&lt;/code&gt; / &lt;code&gt;lb_subset_config&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;The subset "key definition", load balancing policy, circuit breakers, upstream TLS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;LbEndpoint.metadata&lt;/code&gt; / &lt;code&gt;LocalityLbEndpoints&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Per-target values (version=v2 and such), locality/zone, per-endpoint weight and health&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Concrete "physically nowhere else" cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A value that differs per Pod&lt;/strong&gt; (this Pod is v2, this one is in Zone-A) attaches only to an endpoint. EDS is the only home.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Behavior per HTTP path&lt;/strong&gt; (only &lt;code&gt;/admin&lt;/code&gt; gets stronger auth) can only see the request at the route, so it is RDS-only.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Whether a filter exists at all&lt;/strong&gt; is decided at LDS (the filter chain). Trying to override a filter via RDS &lt;code&gt;typed_per_filter_config&lt;/code&gt; that LDS never installed is "overriding a filter that is not there", and it does nothing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How the wrong layer breaks
&lt;/h3&gt;

&lt;p&gt;This is the most common trap in practice. Pick the wrong layer and you get "config applies but does nothing".&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Put a per-endpoint value on the Cluster (CDS) and it becomes shared across all Pods, so the per-Pod distinction vanishes.&lt;/li&gt;
&lt;li&gt;Put &lt;code&gt;typed_per_filter_config&lt;/code&gt; on a route but never install that filter in the LDS filter chain, and it is silently ignored (sometimes without even a NACK).&lt;/li&gt;
&lt;li&gt;Put a subset match on RDS but leave &lt;code&gt;subset_selectors&lt;/code&gt; undefined on CDS, or leave that key off the EDS endpoint metadata, and nothing matches, so you get a 503.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Worked example: canary by version (three layers split the job)
&lt;/h3&gt;

&lt;p&gt;"Split &lt;code&gt;reviews&lt;/code&gt; into v1/v2/v3 by version label, and send a slice of traffic to v2." This one feature decomposes across three layers, each doing the part only it can.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CDS (Cluster.lb_subset_config):
    subset_selectors: [{ keys: ["version"] }]      # declare: build subsets by version

EDS (LbEndpoint.metadata):
    10.244.0.11 -&amp;gt; {"envoy.lb": {"version": "v1"}}  # tag each Pod with its label value
    10.244.0.20 -&amp;gt; {"envoy.lb": {"version": "v2"}}

RDS (Route.route.metadata_match):
    { "envoy.lb": {"version": "v2"} }              # select: route to the v2 subset
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



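&lt;p&gt;In Istio terms, the same three jobs map onto these inputs. One caveat: as far as I can tell, Istio itself realizes &lt;code&gt;DestinationRule&lt;/code&gt; subsets as one Envoy cluster per subset (named like &lt;code&gt;outbound|9080|v2|reviews.default.svc.cluster.local&lt;/code&gt;) rather than through &lt;code&gt;lb_subset_config&lt;/code&gt;, so read the middle column as the Envoy-native equivalent of each job, not a literal dump of what istiod emits.&lt;/p&gt;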

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What you write in Istio&lt;/th&gt;
&lt;th&gt;The xDS it generates&lt;/th&gt;
&lt;th&gt;The job it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;DestinationRule.subsets&lt;/code&gt; (version key)&lt;/td&gt;
&lt;td&gt;CDS &lt;code&gt;subset_selectors&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Declares the "split by version" structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The Pod's &lt;code&gt;version: v2&lt;/code&gt; label&lt;/td&gt;
&lt;td&gt;EDS endpoint metadata&lt;/td&gt;
&lt;td&gt;Tags each target with the value&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The &lt;code&gt;VirtualService&lt;/code&gt; route&lt;/td&gt;
&lt;td&gt;RDS &lt;code&gt;metadata_match&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Selects the destination subset per request&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The point: one concern, version, decomposes by grain into three layers. The value differs per Pod so it is EDS, "split by version" is a property of the whole Cluster so it is CDS, and the actual selection happens at request time so it is RDS. &lt;strong&gt;Routing one concern to the right layer by its grain is what designing with xDS is.&lt;/strong&gt;&lt;/p&gt;
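&lt;p&gt;To see why all three layers have to agree, here is a deliberately simplified model of subset selection. It ignores fallback policies and the way Envoy precomputes subsets, and &lt;code&gt;Endpoint&lt;/code&gt; and &lt;code&gt;Pick&lt;/code&gt; are names I made up. It only captures the rule that a route's match counts when the cluster declared the key and the endpoint carries the value.&lt;/p&gt;

```go
package main

import "fmt"

// Endpoint is one EDS entry: an address plus its "envoy.lb" metadata.
type Endpoint struct {
	Addr string
	LB   map[string]string
}

// Pick returns the endpoints matching the route's metadata_match, but only
// for keys the cluster declared in subset_selectors.
func Pick(selectorKeys []string, endpoints []Endpoint, match map[string]string) []string {
	declared := map[string]bool{}
	for _, k := range selectorKeys {
		declared[k] = true
	}
	var out []string
	for _, ep := range endpoints {
		ok := true
		for k, want := range match {
			if !declared[k] || ep.LB[k] != want {
				ok = false
			}
		}
		if ok {
			out = append(out, ep.Addr)
		}
	}
	return out
}

func main() {
	eps := []Endpoint{
		{Addr: "10.244.0.11", LB: map[string]string{"version": "v1"}},
		{Addr: "10.244.0.20", LB: map[string]string{"version": "v2"}},
	}
	want := map[string]string{"version": "v2"}
	fmt.Println(Pick([]string{"version"}, eps, want)) // CDS, EDS, and RDS all agree
	fmt.Println(Pick(nil, eps, want))                 // selector missing on CDS: nothing matches
}
```

&lt;p&gt;Drop any one of the three inputs and the result is empty, which in a real mesh shows up as the 503 from the failure list above.&lt;/p&gt;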

&lt;h3&gt;
  
  
  Quick reference: where does it go
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What you want to inject&lt;/th&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;proto field&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Per-Pod tags, zone, weight&lt;/td&gt;
&lt;td&gt;EDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;LbEndpoint.metadata&lt;/code&gt; / &lt;code&gt;LocalityLbEndpoints&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service-wide LB policy, subset definition, TLS&lt;/td&gt;
&lt;td&gt;CDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Cluster.metadata&lt;/code&gt; / &lt;code&gt;lb_subset_config&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-path or per-header filter behavior&lt;/td&gt;
&lt;td&gt;RDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Route.typed_per_filter_config&lt;/code&gt; / &lt;code&gt;metadata&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Port, filter setup, L4 decisions&lt;/td&gt;
&lt;td&gt;LDS&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Listener.metadata&lt;/code&gt; / FilterChain&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Hands-on: pull a raw DiscoveryResponse
&lt;/h2&gt;

&lt;p&gt;I have been saying "protobuf flows" in words. Now confirm it with your own eyes. Stand up Kubernetes on kind, install Istio, and pull the actual bytes istiod ships, then decode them with &lt;code&gt;protoc&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;You need &lt;code&gt;docker&lt;/code&gt;, &lt;code&gt;kind&lt;/code&gt;, &lt;code&gt;istioctl&lt;/code&gt;, &lt;code&gt;kubectl&lt;/code&gt;, &lt;code&gt;grpcurl&lt;/code&gt;, and &lt;code&gt;protoc&lt;/code&gt;. On a Mac, &lt;code&gt;brew install kind grpcurl protobuf&lt;/code&gt; plus &lt;code&gt;istioctl&lt;/code&gt; from the official instructions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 1: stand up the cluster and Istio
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# create the kind cluster&lt;/span&gt;
kind create cluster &lt;span class="nt"&gt;--name&lt;/span&gt; xds-demo

&lt;span class="c"&gt;# install Istio with the demo profile&lt;/span&gt;
istioctl &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--set&lt;/span&gt; &lt;span class="nv"&gt;profile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;demo &lt;span class="nt"&gt;-y&lt;/span&gt;

&lt;span class="c"&gt;# deploy a sample app (Bookinfo) with sidecar injection; the path is relative to the Istio release directory&lt;/span&gt;
kubectl label namespace default istio-injection&lt;span class="o"&gt;=&lt;/span&gt;enabled
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; samples/bookinfo/platform/kube/bookinfo.yaml

&lt;span class="c"&gt;# wait for the Pods to come up&lt;/span&gt;
kubectl &lt;span class="nb"&gt;wait&lt;/span&gt; &lt;span class="nt"&gt;--for&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;condition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ready pod &lt;span class="nt"&gt;--all&lt;/span&gt; &lt;span class="nt"&gt;--timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;180s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: the human-facing view with istioctl
&lt;/h3&gt;

&lt;p&gt;Start with ordinary debugging. See what Listeners istiod generated for a given Envoy. Every output from here on is the real thing, captured by running these steps.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# grab the productpage Pod name&lt;/span&gt;
&lt;span class="nv"&gt;POD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;kubectl get pod &lt;span class="nt"&gt;-l&lt;/span&gt; &lt;span class="nv"&gt;app&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;productpage &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="nv"&gt;jsonpath&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'{.items[0].metadata.name}'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# the Listener list for that Envoy&lt;/span&gt;
istioctl proxy-config listeners &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$POD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It comes out as a readable table (excerpt).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ADDRESSES      PORT   MATCH                                   DESTINATION
10.96.0.10     53     ALL                                     Cluster: outbound|53||kube-dns.kube-system.svc...
0.0.0.0        80     Trans: raw_buffer; App: http/1.1,h2c    Route: 80
10.96.0.1      443    ALL                                     Cluster: outbound|443||kubernetes.default.svc...
10.96.151.152  443    ALL                                     Cluster: outbound|443||istiod.istio-system.svc...
0.0.0.0        9080   Trans: raw_buffer; App: http/1.1,h2c    Route: 9080
0.0.0.0        9080   ALL                                     PassthroughCluster
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add &lt;code&gt;-o yaml&lt;/code&gt; to see the protobuf dumped into a YAML view. Remember that this YAML is a readable rendering of the protobuf, not what went over the wire.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;istioctl proxy-config listeners &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$POD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; yaml | &lt;span class="nb"&gt;head&lt;/span&gt; &lt;span class="nt"&gt;-50&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: see the type_url in the open
&lt;/h3&gt;

&lt;p&gt;Next, pull the config dump from Envoy's admin endpoint and list the &lt;code&gt;@type&lt;/code&gt; values.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# port-forward Envoy's admin API&lt;/span&gt;
kubectl port-forward &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$POD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 15000:15000 &amp;amp;

&lt;span class="c"&gt;# list the @type values (the type_url) inside config_dump&lt;/span&gt;
curl &lt;span class="nt"&gt;-s&lt;/span&gt; localhost:15000/config_dump | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.configs[]."@type"'&lt;/span&gt; | &lt;span class="nb"&gt;sort&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;type.googleapis.com/envoy.admin.v3.BootstrapConfigDump
type.googleapis.com/envoy.admin.v3.ClustersConfigDump
type.googleapis.com/envoy.admin.v3.ListenersConfigDump
type.googleapis.com/envoy.admin.v3.RoutesConfigDump
type.googleapis.com/envoy.admin.v3.ScopedRoutesConfigDump
type.googleapis.com/envoy.admin.v3.SecretsConfigDump
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



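&lt;p&gt;That &lt;code&gt;type.googleapis.com/...&lt;/code&gt; is &lt;code&gt;type_url&lt;/code&gt;. It is how a protobuf &lt;code&gt;Any&lt;/code&gt; declares "what type is inside this". The values listed here are the admin config-dump wrappers (&lt;code&gt;ListenersConfigDump&lt;/code&gt;, &lt;code&gt;ClustersConfigDump&lt;/code&gt;, and so on), not the xDS resource types themselves, but you can already see the Listener, Cluster, Route, and Secret sections each carrying a different &lt;code&gt;type_url&lt;/code&gt;.&lt;/p&gt;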

&lt;h3&gt;
  
  
  Step 4: pull a raw DiscoveryResponse from istiod
&lt;/h3&gt;

&lt;p&gt;This is the part the setup was for. istiod also speaks xDS on port &lt;code&gt;15010&lt;/code&gt; (plaintext), so you can hit the ADS stream directly with &lt;code&gt;grpcurl&lt;/code&gt; and pull a raw &lt;code&gt;DiscoveryResponse&lt;/code&gt;. You pretend to be Envoy and ask for Listeners.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# borrow the node ID from the real Envoy bootstrap&lt;/span&gt;
&lt;span class="nv"&gt;NODEID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;istioctl proxy-config bootstrap &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$POD&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; json | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.bootstrap.node.id'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# port-forward istiod's xDS port&lt;/span&gt;
kubectl &lt;span class="nt"&gt;-n&lt;/span&gt; istio-system port-forward deploy/istiod 15010:15010 &amp;amp;

&lt;span class="c"&gt;# ask ADS for Listeners and inspect the structure of the envelope that comes back&lt;/span&gt;
grpcurl &lt;span class="nt"&gt;-plaintext&lt;/span&gt; &lt;span class="nt"&gt;-max-time&lt;/span&gt; 8 &lt;span class="nt"&gt;-d&lt;/span&gt; @ localhost:15010 &lt;span class="se"&gt;\&lt;/span&gt;
  envoy.service.discovery.v3.AggregatedDiscoveryService/StreamAggregatedResources &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="no"&gt;EOF&lt;/span&gt;&lt;span class="sh"&gt; &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="sh"&gt;
  | jq '{typeUrl, nonce, count: (.resources|length), first_type: .resources[0]."@type", first_name: .resources[0].name}'
{"node":{"id":"&lt;/span&gt;&lt;span class="nv"&gt;$NODEID&lt;/span&gt;&lt;span class="sh"&gt;"},"typeUrl":"type.googleapis.com/envoy.config.listener.v3.Listener"}
&lt;/span&gt;&lt;span class="no"&gt;EOF
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real thing that came back (&lt;code&gt;grpcurl&lt;/code&gt; renders the protobuf as JSON):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;NODEID = sidecar~10.244.0.11~productpage-v1-xxxxx.default~default.svc.cluster.local

{
  "typeUrl": "type.googleapis.com/envoy.config.listener.v3.Listener",
  "nonce": "2026-06-09T14:23:1...",
  "count": 18,
  "first_type": "type.googleapis.com/envoy.config.listener.v3.Listener",
  "first_name": "10.96.174.178_15021"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the envelope, the &lt;code&gt;DiscoveryResponse&lt;/code&gt;, in the flesh. &lt;code&gt;typeUrl&lt;/code&gt; declares "the stream is carrying Listener type right now", &lt;code&gt;nonce&lt;/code&gt; is the identifier for matching ACK/NACK, and &lt;code&gt;resources&lt;/code&gt; holds 18 entries, each wrapped in an &lt;code&gt;Any&lt;/code&gt; whose &lt;code&gt;@type&lt;/code&gt; points at Listener. The earlier "stuff Envoy-schema resources into the transport envelope" is right there in the open.&lt;/p&gt;
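&lt;p&gt;The same envelope check can be sketched outside &lt;code&gt;jq&lt;/code&gt;. The snippet below is a minimal illustration, not part of the original run: &lt;code&gt;SAMPLE&lt;/code&gt; is a hand-written stand-in for the JSON that &lt;code&gt;grpcurl&lt;/code&gt; prints (two resources instead of 18, and a placeholder nonce), and &lt;code&gt;summarize&lt;/code&gt; is a hypothetical helper name. It reproduces the summary fields and additionally flags any resource whose &lt;code&gt;@type&lt;/code&gt; disagrees with the envelope's &lt;code&gt;typeUrl&lt;/code&gt;.&lt;/p&gt;

```python
import json

# Hand-written stand-in for grpcurl's JSON output; values are illustrative only.
SAMPLE = """
{
  "typeUrl": "type.googleapis.com/envoy.config.listener.v3.Listener",
  "nonce": "example-nonce",
  "resources": [
    {"@type": "type.googleapis.com/envoy.config.listener.v3.Listener",
     "name": "10.96.174.178_15021"},
    {"@type": "type.googleapis.com/envoy.config.listener.v3.Listener",
     "name": "0.0.0.0_9080"}
  ]
}
"""


def summarize(response):
    """Summarize a DiscoveryResponse rendered as JSON (hypothetical helper)."""
    resources = response.get("resources", [])
    envelope_type = response.get("typeUrl")
    return {
        "typeUrl": envelope_type,
        "nonce": response.get("nonce"),
        "count": len(resources),
        "names": [r.get("name") for r in resources],
        # Resources whose Any type does not match what the envelope declares.
        "mismatched": [r.get("name") for r in resources
                       if r.get("@type") != envelope_type],
    }


summary = summarize(json.loads(SAMPLE))
print(json.dumps(summary, indent=2))
```

&lt;p&gt;An empty &lt;code&gt;mismatched&lt;/code&gt; list is what you would expect from a healthy response: the envelope and every &lt;code&gt;Any&lt;/code&gt; inside it agree on the type.&lt;/p&gt;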

&lt;h3&gt;
  
  
  Step 5: see field numbers with protoc --decode_raw
&lt;/h3&gt;

&lt;p&gt;Last, confirm by hand that protobuf "identifies fields by number". &lt;code&gt;grpcurl&lt;/code&gt; formats to JSON, so here I build a tiny proto that mirrors only the first fields of &lt;code&gt;listener.proto&lt;/code&gt;, encode it, and feed the raw bytes to &lt;code&gt;protoc --decode_raw&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# a minimal proto with the same field numbers as listener.proto&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; mini.proto &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;PROTO&lt;/span&gt;&lt;span class="sh"&gt;'
syntax = "proto3";
message SocketAddress { string address = 1; uint32 port_value = 2; }
message Address { SocketAddress socket_address = 1; }
message Listener { string name = 1; Address address = 2; }
&lt;/span&gt;&lt;span class="no"&gt;PROTO

&lt;/span&gt;&lt;span class="c"&gt;# fill values and encode to binary&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; listener.txtpb &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="no"&gt;TXT&lt;/span&gt;&lt;span class="sh"&gt;'
name: "0.0.0.0_9080"
address { socket_address { address: "0.0.0.0" port_value: 9080 } }
&lt;/span&gt;&lt;span class="no"&gt;TXT
&lt;/span&gt;protoc &lt;span class="nt"&gt;--encode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Listener mini.proto &amp;lt; listener.txtpb &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; listener.bin

&lt;span class="c"&gt;# decode the raw bytes with no type info&lt;/span&gt;
protoc &lt;span class="nt"&gt;--decode_raw&lt;/span&gt; &amp;lt; listener.bin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real output. No field names, just &lt;strong&gt;the field numbers as keys&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1: "0.0.0.0_9080"
2 {
  1 {
    1: "0.0.0.0"
    2: 9080
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;1:&lt;/code&gt; is the Listener &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;2 {&lt;/code&gt; is &lt;code&gt;address&lt;/code&gt;, the &lt;code&gt;1 {&lt;/code&gt; inside is &lt;code&gt;socket_address&lt;/code&gt;, and &lt;code&gt;2: 9080&lt;/code&gt; is the port. Recall the protos: &lt;code&gt;string name = 1;&lt;/code&gt; and &lt;code&gt;core.v3.Address address = 2;&lt;/code&gt; on Envoy's &lt;code&gt;Listener&lt;/code&gt;, and &lt;code&gt;port_value = 2&lt;/code&gt; on the mini &lt;code&gt;SocketAddress&lt;/code&gt; (Envoy's real &lt;code&gt;core.v3.SocketAddress&lt;/code&gt; numbers its fields differently; the mini proto is simplified). Those numbers show up in the decode exactly. &lt;strong&gt;That is the hard proof that what flows is not YAML but protobuf packed by field number.&lt;/strong&gt;&lt;/p&gt;
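&lt;p&gt;To see where those numbers sit in the bytes, here is a minimal sketch that builds the same message by hand, following the protobuf wire format: each field starts with a varint key equal to &lt;code&gt;field_number * 8 + wire_type&lt;/code&gt;. It uses the simplified numbering from &lt;code&gt;mini.proto&lt;/code&gt; above, handles only wire types 0 (varint) and 2 (length-delimited), and all helper names are made up for illustration.&lt;/p&gt;

```python
def encode_varint(n):
    """Encode a non-negative int as a protobuf varint (7 bits per byte)."""
    out = bytearray()
    while True:
        low = n % 128
        n //= 128
        if n:
            out.append(low + 128)  # continuation bit set: more bytes follow
        else:
            out.append(low)
            return bytes(out)


def key(field_number, wire_type):
    """The field key: field number and wire type packed into one varint."""
    return encode_varint(field_number * 8 + wire_type)


def length_delimited(field_number, payload):
    """Wire type 2: strings, bytes, and nested messages."""
    return key(field_number, 2) + encode_varint(len(payload)) + payload


def varint_field(field_number, value):
    """Wire type 0: integers such as uint32."""
    return key(field_number, 0) + encode_varint(value)


def decode_varint(buf, pos):
    result, scale = 0, 1
    while True:
        byte = buf[pos]
        pos += 1
        result += (byte % 128) * scale
        scale *= 128
        if byte // 128 == 0:
            return result, pos


def top_level_fields(buf):
    """Return (field_number, wire_type, value) for each top-level field."""
    pos, fields = 0, []
    while pos != len(buf):
        k, pos = decode_varint(buf, pos)
        number, wire_type = k // 8, k % 8
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            size, pos = decode_varint(buf, pos)
            value, pos = buf[pos:pos + size], pos + size
        else:
            raise ValueError("wire type not handled in this sketch")
        fields.append((number, wire_type, value))
    return fields


# Same values as listener.txtpb, same field numbers as mini.proto.
socket_address = length_delimited(1, b"0.0.0.0") + varint_field(2, 9080)
address = length_delimited(1, socket_address)
listener = length_delimited(1, b"0.0.0.0_9080") + length_delimited(2, address)

print(listener.hex(" "))
print([(number, wire_type) for number, wire_type, _ in top_level_fields(listener)])
```

&lt;p&gt;No field name appears anywhere in the output bytes. The only thing identifying &lt;code&gt;name&lt;/code&gt; or &lt;code&gt;address&lt;/code&gt; is the number packed into each key, which is why &lt;code&gt;--decode_raw&lt;/code&gt; can print numbers but not names.&lt;/p&gt;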

&lt;h3&gt;
  
  
  Step 6: tear down
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kind delete cluster &lt;span class="nt"&gt;--name&lt;/span&gt; xds-demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A script that reproduces this hands-on end to end lives at &lt;code&gt;articles/assets/xds-protobuf-deep-dive/run.sh&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;It was a long road, so here is a retrace from the top.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A service mesh has a &lt;strong&gt;control plane (istiod)&lt;/strong&gt; and a &lt;strong&gt;data plane (Envoy)&lt;/strong&gt;. Envoy does not know what to do on its own and gets config from the control plane.&lt;/li&gt;
&lt;li&gt;Shipping config over an API instead of a file is &lt;strong&gt;xDS&lt;/strong&gt;. In Istio, &lt;strong&gt;Pilot&lt;/strong&gt; inside istiod serves it, and each Envoy sidecar receives it.&lt;/li&gt;
&lt;li&gt;The contents come in &lt;strong&gt;LDS/RDS/CDS/EDS/SDS&lt;/strong&gt;, bundled onto one stream by &lt;strong&gt;ADS&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;What flows is &lt;strong&gt;protobuf, not YAML&lt;/strong&gt;. YAML lives only at the human-writing edge and the debug view.&lt;/li&gt;
&lt;li&gt;protobuf means &lt;strong&gt;defining the type in &lt;code&gt;.proto&lt;/code&gt; first&lt;/strong&gt;, then generating language classes with protoc. The &lt;code&gt;.proto&lt;/code&gt; is type only; &lt;strong&gt;the data is filled in at runtime by istiod&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;xDS is three layers. The &lt;strong&gt;delivery (transport)&lt;/strong&gt;, the &lt;strong&gt;vendor-neutral base types (cncf/xds)&lt;/strong&gt;, and the &lt;strong&gt;Envoy-specific resource schema&lt;/strong&gt; are distinct, and &lt;code&gt;type_url&lt;/code&gt; ties the envelope to the contents.&lt;/li&gt;
&lt;li&gt;Where metadata goes is set by &lt;strong&gt;the grain it varies by&lt;/strong&gt; and &lt;strong&gt;who reads it&lt;/strong&gt;. Per Pod is EDS, per service is CDS, per path is RDS, per port is LDS. Routing one concern to the right layer by grain is the essence of designing with xDS.&lt;/li&gt;
&lt;li&gt;Finally, decoding the raw bytes on kind shows the field numbers from &lt;code&gt;listener.proto&lt;/code&gt; appearing verbatim, confirming with your own eyes that protobuf is what flows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;"The control plane ships Envoy-schema protobuf messages over xDS." If that sentence lands, the post did its job. Next time you reach for &lt;code&gt;istioctl proxy-config&lt;/code&gt; during an Istio incident, you can see what is happening behind it.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol" rel="noopener noreferrer"&gt;xDS REST and gRPC protocol (Envoy documentation)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/operations/dynamic_configuration" rel="noopener noreferrer"&gt;xDS configuration API overview (Envoy documentation)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-v3/service/discovery/v3/discovery.proto" rel="noopener noreferrer"&gt;Common discovery API components, discovery.proto (Envoy documentation)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/istio/istio/blob/master/architecture/networking/pilot.md" rel="noopener noreferrer"&gt;istio/architecture/networking/pilot.md&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tetrate.io/blog/istio-service-mesh-delta-xds" rel="noopener noreferrer"&gt;Istio Delta xDS Now on by Default: What's New in Istio 1.22&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/cncf/xds" rel="noopener noreferrer"&gt;cncf/xds repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://protobuf.dev/" rel="noopener noreferrer"&gt;Protocol Buffers documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>envoy</category>
      <category>istio</category>
      <category>kubernetes</category>
      <category>grpc</category>
    </item>
    <item>
      <title>xDS Deep Dive, Part 2: Designing What You Ship Over the Universal Data Plane API</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Sun, 07 Jun 2026 08:25:40 +0000</pubDate>
      <link>https://dev.to/kanywst/xds-deep-dive-part-2-designing-what-you-ship-over-the-universal-data-plane-api-3hef</link>
      <guid>https://dev.to/kanywst/xds-deep-dive-part-2-designing-what-you-ship-over-the-universal-data-plane-api-3hef</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In &lt;a href="https://dev.to/kanywst/xds-deep-dive-dissecting-the-nervous-system-of-the-service-mesh-3m5i"&gt;xDS Deep Dive: Dissecting the "Nervous System" of the Service Mesh&lt;/a&gt; I dug into the dependency chain of &lt;code&gt;LDS / RDS / CDS / EDS / SDS&lt;/code&gt;, the robustness of &lt;code&gt;ACK/NACK&lt;/code&gt;, why &lt;code&gt;ADS&lt;/code&gt; exists, and the evolution from &lt;code&gt;SotW&lt;/code&gt; to &lt;code&gt;Delta&lt;/code&gt;. In short, it was about &lt;strong&gt;how Envoy consumes xDS&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But there was one line I tossed off at the very end:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;xDS isn't Envoy-only anymore. The CNCF xDS API Working Group is standardizing it as the "Universal Data Plane API".&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I waved at that and never came back to it. This time I take it seriously, and I change my stance from Part 1. Part 1 read xDS from the &lt;strong&gt;consuming side: how Envoy eats it&lt;/strong&gt;. This one is about the &lt;strong&gt;producing side: how you design what gets shipped over xDS, and how&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why frame it that way? Because every decision you face when designing xDS resources (what to name them, how to split them, how to express variants, which CP owns them) is already spelled out in the &lt;code&gt;github.com/cncf/xds&lt;/code&gt; protos and the xRFC documents. &lt;code&gt;xdstp://&lt;/code&gt;, Authority, Dynamic Parameters: read straight, they look like feature descriptions. Read through a designer's eyes, they're &lt;strong&gt;levers, each one saying "here is where you choose, and how"&lt;/strong&gt;. I'll re-read them one by one as design choices.&lt;/p&gt;

&lt;p&gt;And I won't stop at theory. At the end I use &lt;code&gt;go-control-plane&lt;/code&gt; to &lt;strong&gt;build a Listener → Route → Cluster graph by hand and serve it&lt;/strong&gt;, then confirm in the output that what I designed actually lands on the wire. I cloned the protos locally too, so I'll keep the definitions open beside me.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/cncf/xds ~/xds
&lt;span class="nb"&gt;ls&lt;/span&gt; ~/xds/xds/core/v3/
&lt;span class="c"&gt;# authority.proto       collection_entry.proto  context_params.proto&lt;/span&gt;
&lt;span class="c"&gt;# extension.proto       resource.proto          resource_locator.proto&lt;/span&gt;
&lt;span class="c"&gt;# resource_name.proto   cidr.proto              ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The protos sitting in there are the proxy-neutral core: types that don't depend on Envoy at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vocabulary to keep handy
&lt;/h3&gt;

&lt;p&gt;I'm writing this assuming you've read Part 1, but here's a quick glossary of terms that come up a lot. Come back here when you get stuck.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LDS / RDS / CDS / EDS / SDS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The five core xDS services. Linked by reference: Listener → Route → Cluster → Endpoint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ADS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Multiplexes the above onto a single gRPC stream. Required when ordering matters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Node&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The identity + metadata a client sends to the server at the start of the stream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;SotW&lt;/code&gt; / &lt;code&gt;Delta&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Two modes: send every resource each time, or send only the diff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ACK / NACK&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;How a client reports whether it could apply the config, via &lt;code&gt;version_info&lt;/code&gt; and &lt;code&gt;nonce&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;LRS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;An xDS-family service where the client reports endpoint load back to the CP&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  0. The premise: xDS is nothing but "protobuf flowing over a gRPC stream"
&lt;/h2&gt;

&lt;p&gt;In Part 1 I explained &lt;code&gt;LDS / RDS / CDS / EDS&lt;/code&gt;, but it hit me that I never once showed &lt;strong&gt;the shape of the xDS interface itself&lt;/strong&gt;. If that hasn't clicked, everything that follows (&lt;code&gt;xdstp://&lt;/code&gt;, ORCA, all of it) floats in the air. So let me drop down a level and look at the xDS transport plainly.&lt;/p&gt;

&lt;h3&gt;
  
  
  xDS is a single gRPC service
&lt;/h3&gt;

&lt;p&gt;There's no standalone "xDS protocol". &lt;strong&gt;xDS is one gRPC service definition.&lt;/strong&gt; The ADS from Part 1 is, concretely, this proto in envoy's &lt;code&gt;service/discovery/v3&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// envoy/service/discovery/v3/ads.proto&lt;/span&gt;
&lt;span class="kd"&gt;service&lt;/span&gt; &lt;span class="n"&gt;AggregatedDiscoveryService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;StreamAggregatedResources&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;DiscoveryRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;DiscoveryResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;DeltaAggregatedResources&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;DeltaDiscoveryRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;DeltaDiscoveryResponse&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A bidirectional streaming RPC, &lt;code&gt;stream ... returns (stream ...)&lt;/code&gt;, and that's it. The client (the proxy) holds a single stream open to the CP, pushes &lt;code&gt;DiscoveryRequest&lt;/code&gt; upstream and receives &lt;code&gt;DiscoveryResponse&lt;/code&gt; downstream, forever. LDS, CDS, EDS aren't "different protocols": they're &lt;strong&gt;just messages with different &lt;code&gt;type_url&lt;/code&gt; flowing over the same stream&lt;/strong&gt;. Even &lt;code&gt;SotW&lt;/code&gt; / &lt;code&gt;Delta&lt;/code&gt; from Part 1 are nothing more than two dialects on this one stream.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F01-xds-ads-stream.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F01-xds-ads-stream.png" alt="xDS ADS bidirectional stream carrying ACK and NACK" width="800" height="957"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The crux is that &lt;strong&gt;there's only one stream&lt;/strong&gt;. You don't open a connection per type; a single bidirectional stream, multiplexed by &lt;code&gt;type_url&lt;/code&gt;, carries Clusters, Listeners, Endpoints, and both ACKs and NACKs all mixed together. Exactly which fields express that &lt;code&gt;ACK&lt;/code&gt; / &lt;code&gt;NACK&lt;/code&gt; becomes clear the moment you open up the messages.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why is it all protobuf
&lt;/h3&gt;

&lt;p&gt;"Why is the config protobuf instead of YAML or JSON" answers itself once you're here. &lt;strong&gt;Because xDS is gRPC.&lt;/strong&gt; gRPC's IDL is protobuf, and you can only define a service's arguments and return values as protobuf messages. &lt;code&gt;DiscoveryRequest&lt;/code&gt; / &lt;code&gt;DiscoveryResponse&lt;/code&gt; being protobuf isn't even a choice; it's a consequence of deciding to use gRPC. The resource bodies (Listener, Cluster) ride inside these messages, so they have to be protobuf too.&lt;/p&gt;

&lt;p&gt;Look inside the request / response and you'll see the entire behavior of xDS collapses into the fields of these two messages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// upstream: client -&amp;gt; CP&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;DiscoveryRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;version_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;            &lt;span class="c1"&gt;// the version I'm currently ACKing&lt;/span&gt;
  &lt;span class="n"&gt;Node&lt;/span&gt; &lt;span class="na"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                      &lt;span class="c1"&gt;// my identity, sent at the start of the stream&lt;/span&gt;
  &lt;span class="k"&gt;repeated&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;resource_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// "give me these"&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;type_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                &lt;span class="c1"&gt;// which kind (Listener? Cluster?)&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;response_nonce&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// which response this replies to&lt;/span&gt;
  &lt;span class="n"&gt;google.rpc.Status&lt;/span&gt; &lt;span class="na"&gt;error_detail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// the reason, when this is a NACK&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// downstream: CP -&amp;gt; client&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;DiscoveryResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;version_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;repeated&lt;/span&gt; &lt;span class="n"&gt;google.protobuf.Any&lt;/span&gt; &lt;span class="na"&gt;resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// &amp;lt;- resource bodies are Any&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;type_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;nonce&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In Part 1 I wrote "&lt;code&gt;ACK/NACK&lt;/code&gt; is returned via &lt;code&gt;version_info&lt;/code&gt; and &lt;code&gt;nonce&lt;/code&gt;". Those &lt;code&gt;version_info&lt;/code&gt; / &lt;code&gt;nonce&lt;/code&gt; / &lt;code&gt;error_detail&lt;/code&gt; are literally fields on these two messages (on the request side, the nonce is echoed back as &lt;code&gt;response_nonce&lt;/code&gt;). An &lt;code&gt;ACK&lt;/code&gt; is a &lt;code&gt;DiscoveryRequest&lt;/code&gt; with &lt;code&gt;error_detail&lt;/code&gt; empty; a &lt;code&gt;NACK&lt;/code&gt; is a &lt;code&gt;DiscoveryRequest&lt;/code&gt; with &lt;code&gt;error_detail&lt;/code&gt; set. That's all.&lt;/p&gt;
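&lt;p&gt;As a rough sketch of that rule, the two replies can be written as plain dictionaries that mirror the &lt;code&gt;DiscoveryRequest&lt;/code&gt; fields. The function names are hypothetical, the status code is an arbitrary illustrative choice, and the detail that a NACK keeps reporting the last accepted version comes from the xDS protocol documentation rather than from the proto excerpt above.&lt;/p&gt;

```python
def ack(type_url, response):
    """Accept: echo the response's version and nonce, leave error_detail unset."""
    return {
        "version_info": response["version_info"],
        "type_url": type_url,
        "response_nonce": response["nonce"],
    }


def nack(type_url, response, last_good_version, reason):
    """Reject: echo the nonce, keep the last accepted version, set error_detail."""
    return {
        "version_info": last_good_version,
        "type_url": type_url,
        "response_nonce": response["nonce"],
        # Shaped like google.rpc.Status; code 3 is INVALID_ARGUMENT (illustrative).
        "error_detail": {"code": 3, "message": reason},
    }


def is_nack(request):
    """The only thing separating a NACK from an ACK is error_detail."""
    return "error_detail" in request


LISTENER = "type.googleapis.com/envoy.config.listener.v3.Listener"
incoming = {"version_info": "2", "nonce": "n-42", "type_url": LISTENER}

accepted = ack(LISTENER, incoming)
rejected = nack(LISTENER, incoming, last_good_version="1", reason="bad filter chain")
print(is_nack(accepted), is_nack(rejected))
```

&lt;p&gt;Both replies carry the same &lt;code&gt;response_nonce&lt;/code&gt;, so the CP can tell which response is being answered; only &lt;code&gt;error_detail&lt;/code&gt; and the reported version differ.&lt;/p&gt;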

&lt;h3&gt;
  
  
  The core: resource bodies are wrapped in &lt;code&gt;google.protobuf.Any&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The field that matters most is the type of &lt;code&gt;resources&lt;/code&gt;: &lt;code&gt;repeated google.protobuf.Any&lt;/code&gt;. &lt;code&gt;Any&lt;/code&gt; is a pair of "a &lt;code&gt;type_url&lt;/code&gt; string + serialized bytes", &lt;strong&gt;a box that can wrap any protobuf message regardless of type&lt;/strong&gt;. So xDS isn't "the protocol that carries Listeners" or "the protocol that carries Clusters". It's &lt;strong&gt;a type-neutral config bus that names the type via &lt;code&gt;type_url&lt;/code&gt; and wraps the body in &lt;code&gt;Any&lt;/code&gt;&lt;/strong&gt;.&lt;/p&gt;
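&lt;p&gt;A minimal sketch of that "box" idea, assuming nothing beyond the two fields named above: &lt;code&gt;AnyBox&lt;/code&gt; is a stand-in for &lt;code&gt;google.protobuf.Any&lt;/code&gt;, and the decoders are hypothetical placeholders where real code would call the generated protobuf classes. The point is that the receiving side picks a decoder purely from &lt;code&gt;type_url&lt;/code&gt; and never has to know in advance what the stream will carry.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnyBox:
    """Stand-in for google.protobuf.Any: a type name plus opaque bytes."""
    type_url: str
    value: bytes


PREFIX = "type.googleapis.com/"
LISTENER = PREFIX + "envoy.config.listener.v3.Listener"
CLUSTER = PREFIX + "envoy.config.cluster.v3.Cluster"

# Hypothetical decoders keyed by type_url; each just reports what it was handed.
DECODERS = {
    LISTENER: lambda raw: ("Listener", len(raw)),
    CLUSTER: lambda raw: ("Cluster", len(raw)),
}


def unpack(box):
    """Route one box to its decoder by type_url alone."""
    decoder = DECODERS.get(box.type_url)
    if decoder is None:
        raise KeyError("no decoder registered for " + box.type_url)
    return decoder(box.value)


# Two different resource types travelling in the same list of boxes.
resources = [
    AnyBox(LISTENER, b"\x0a\x0c0.0.0.0_9080"),
    AnyBox(CLUSTER, b"\x0a\x03foo"),
]
decoded = [unpack(box) for box in resources]
print(decoded)
```

&lt;p&gt;Supporting a new resource type in this sketch means adding one entry to &lt;code&gt;DECODERS&lt;/code&gt;; the box and the routing code stay untouched, which is the property the rest of this article leans on.&lt;/p&gt;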

&lt;p&gt;This "type-neutral box" property is exactly where &lt;strong&gt;Universal Data Plane&lt;/strong&gt; starts. If the box doesn't care about the type, then standardizing just the "name" and "type" you ship opens the door to a world that isn't Envoy-specific. What &lt;code&gt;cncf/xds&lt;/code&gt; is trying to do is precisely this: universalize &lt;code&gt;resource_names&lt;/code&gt; and &lt;code&gt;type_url&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is an "xDS client", really
&lt;/h3&gt;

&lt;p&gt;Now the thing the phrase "xDS client" points to is clear. It's &lt;strong&gt;a state machine living inside the proxy (Envoy itself, or the gRPC library) that manages this bidirectional stream&lt;/strong&gt;. What it does:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Holds a single &lt;code&gt;StreamAggregatedResources&lt;/code&gt; stream to the CP (the one that holds is the client; the one that waits, the CP, is the server)&lt;/li&gt;
&lt;li&gt;Names itself by sending &lt;code&gt;Node&lt;/code&gt; at the start, and subscribes to the &lt;code&gt;resource_names&lt;/code&gt; it wants via &lt;code&gt;DiscoveryRequest&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Routes the incoming &lt;code&gt;DiscoveryResponse&lt;/code&gt; &lt;code&gt;Any&lt;/code&gt; by &lt;code&gt;type_url&lt;/code&gt;, unpacks it, and bakes it into internal config&lt;/li&gt;
&lt;li&gt;If it applied, &lt;code&gt;ACK&lt;/code&gt; via &lt;code&gt;version_info&lt;/code&gt; + &lt;code&gt;nonce&lt;/code&gt;; if it broke, &lt;code&gt;NACK&lt;/code&gt; with &lt;code&gt;error_detail&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Keeps a local cache of "which resource at which version I currently hold"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In §3 I actually poke at grpc-go's &lt;code&gt;internal/xds/bootstrap&lt;/code&gt;, and that turns out to be the config that decides &lt;strong&gt;which CP this xDS client opens a stream to at startup&lt;/strong&gt;. And every feature of &lt;code&gt;cncf/xds&lt;/code&gt; this article reads (&lt;code&gt;xdstp://&lt;/code&gt;, Authority, Dynamic Parameters, ORCA) is, almost entirely, about &lt;strong&gt;expanding the vocabulary of "what this client subscribes to and how the CP answers"&lt;/strong&gt;. In fact, the TP1 &lt;code&gt;ResourceLocator&lt;/code&gt; and the TP3 &lt;code&gt;ResourceError&lt;/code&gt; that show up later &lt;strong&gt;already exist as fields&lt;/strong&gt; on the latest &lt;code&gt;DiscoveryRequest&lt;/code&gt; / &lt;code&gt;DiscoveryResponse&lt;/code&gt; (&lt;code&gt;resource_locators&lt;/code&gt; / &lt;code&gt;resource_errors&lt;/code&gt;). That's proof the standard is already descending onto the wire.&lt;/p&gt;

&lt;h3&gt;
  
  
  A map for reading this as a designer
&lt;/h3&gt;

&lt;p&gt;For a designer, what you ship is ultimately just the proto you stuff into that &lt;code&gt;Any&lt;/code&gt; and the name you attach to it. So design boils down to deciding &lt;strong&gt;"which proto, named how, at what granularity, served from which CP"&lt;/strong&gt;. The chapters that follow knock out those decision points one at a time.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Chapter&lt;/th&gt;
&lt;th&gt;Design decision&lt;/th&gt;
&lt;th&gt;Lever&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;§2&lt;/td&gt;
&lt;td&gt;What name do you give a resource&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;xdstp://&lt;/code&gt; URI / id / context_params&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§3&lt;/td&gt;
&lt;td&gt;How many CPs, and where you draw the boundary&lt;/td&gt;
&lt;td&gt;Authority / federation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§4&lt;/td&gt;
&lt;td&gt;Whether to bake references and failover into names&lt;/td&gt;
&lt;td&gt;Resource Locator (&lt;code&gt;alt=&lt;/code&gt; / &lt;code&gt;entry=&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§5&lt;/td&gt;
&lt;td&gt;Subscribe one at a time, or ship in bundles&lt;/td&gt;
&lt;td&gt;singleton / collection / glob&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§6&lt;/td&gt;
&lt;td&gt;Bake variants into the name, or keep them out&lt;/td&gt;
&lt;td&gt;Context Parameters / Dynamic Parameters&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§7&lt;/td&gt;
&lt;td&gt;What granularity to pick for errors and resources&lt;/td&gt;
&lt;td&gt;TP3 Resource Error / NACK blast radius&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§8&lt;/td&gt;
&lt;td&gt;Whether to wire telemetry into the design&lt;/td&gt;
&lt;td&gt;ORCA / LRS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§9&lt;/td&gt;
&lt;td&gt;Where to hold matching, declaratively&lt;/td&gt;
&lt;td&gt;Unified Matcher / CEL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;§11&lt;/td&gt;
&lt;td&gt;Actually build and ship all of the above&lt;/td&gt;
&lt;td&gt;go-control-plane snapshot&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each chapter ends with a &lt;strong&gt;"design call"&lt;/strong&gt;: the options, the criterion for choosing, and how it breaks when you get it wrong, condensed onto one card. The goal is for this to work not just as reading but as a checklist when you design your own mesh.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why does &lt;code&gt;cncf/xds&lt;/code&gt; exist in the first place
&lt;/h2&gt;

&lt;p&gt;Envoy has long had a mountain of protos at &lt;code&gt;envoyproxy/envoy/api&lt;/code&gt;: that place where &lt;code&gt;envoy.config.listener.v3.Listener&lt;/code&gt;, &lt;code&gt;envoy.config.cluster.v3.Cluster&lt;/code&gt;, and friends live. So a natural question is: don't we already have those, why a separate repo?&lt;/p&gt;

&lt;p&gt;The answer is one sentence in the README:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;We will evolve the xDS APIs to support additional clients, for example
data plane proxies beyond Envoy, proxyless service mesh libraries,
hardware load balancers, mobile clients and beyond.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As long as xDS lives inside the Envoy repo, the &lt;code&gt;envoy.config.*&lt;/code&gt; namespace tags along forever. When proxyless gRPC speaks xDS, when Istio's ztunnel speaks xDS, when a load balancer speaks xDS, everyone ends up importing &lt;code&gt;envoy.config.*&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Structurally that makes it &lt;strong&gt;"everyone else eats Envoy's leftovers"&lt;/strong&gt;, and the standardization story doesn't hold up. So the WG has been incrementally carving the Envoy-independent parts out into &lt;code&gt;cncf/xds&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;While we're here: there's also an old repo &lt;code&gt;cncf/udpa&lt;/code&gt;, but it's retired. &lt;code&gt;udpa/README.md&lt;/code&gt; says it bluntly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;THESE PROTOS ARE DEPRECATED
We are no longer using the "UDPA" name, and we are moving away from the
protos in this tree. Users should prefer the corresponding protos in
the xds tree instead.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So when we talk xDS from here on, you can ignore &lt;code&gt;udpa&lt;/code&gt;. Just look at &lt;code&gt;cncf/xds&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F02-cncf-xds-evolution.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F02-cncf-xds-evolution.png" alt="Evolution from udpa and envoy api into cncf/xds" width="800" height="601"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's the preamble. The real subject is &lt;strong&gt;reading what's inside &lt;code&gt;cncf/xds&lt;/code&gt;&lt;/strong&gt;. A quick roll call of what lives there:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;What it is&lt;/th&gt;
&lt;th&gt;xRFC&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;xdstp://&lt;/code&gt; URI / Authority&lt;/td&gt;
&lt;td&gt;A universal namespace stamped on every resource&lt;/td&gt;
&lt;td&gt;TP1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Context Parameters&lt;/td&gt;
&lt;td&gt;Embed a resource "variant" into the URI&lt;/td&gt;
&lt;td&gt;TP1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource Locator + directive&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;alt=&lt;/code&gt; (failover), &lt;code&gt;entry=&lt;/code&gt; (inline reference)&lt;/td&gt;
&lt;td&gt;TP1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Glob Collections&lt;/td&gt;
&lt;td&gt;Wildcard subscription of the form &lt;code&gt;xdstp://.../foo/*&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;TP1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dynamic Parameters&lt;/td&gt;
&lt;td&gt;Express variants without putting them in the name (no name pollution)&lt;/td&gt;
&lt;td&gt;TP2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resource Error&lt;/td&gt;
&lt;td&gt;Return per-resource errors without tearing down the stream&lt;/td&gt;
&lt;td&gt;TP3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ORCA&lt;/td&gt;
&lt;td&gt;A separate service family carrying load metrics backend -&amp;gt; client&lt;/td&gt;
&lt;td&gt;(xds.service)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unified Matcher API&lt;/td&gt;
&lt;td&gt;A shared matching tree across all extensions&lt;/td&gt;
&lt;td&gt;(xds.type)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CelExpression&lt;/td&gt;
&lt;td&gt;A type letting matchers and extensions call CEL (Common Expression Language)&lt;/td&gt;
&lt;td&gt;(xds.type)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;I'll knock these out top to bottom.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. Why resource names had to become URIs
&lt;/h2&gt;

&lt;p&gt;Here's the real start. &lt;strong&gt;Unless I first explain why &lt;code&gt;xdstp://&lt;/code&gt; was born&lt;/strong&gt;, the reason for the following chapters (Authority, Context Parameters, Resource Locator) won't land at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  What a legacy xDS resource name actually was
&lt;/h3&gt;

&lt;p&gt;I touched on this in Part 1, but an xDS resource used to be identified by three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Resource name&lt;/strong&gt;: an arbitrary opaque string. e.g. &lt;code&gt;foo&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;type URL&lt;/strong&gt;: the resource's proto type. e.g. &lt;code&gt;type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;the &lt;code&gt;Node&lt;/code&gt; message&lt;/strong&gt;: identity info about the node (locality, metadata, etc.), sent once at the start of the stream&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The problem: control-plane implementations started looking at &lt;code&gt;Node&lt;/code&gt; metadata and &lt;strong&gt;returning different bodies for the same name &lt;code&gt;foo&lt;/code&gt;&lt;/strong&gt;. That spawns a nasty side effect.&lt;/p&gt;

&lt;h3&gt;
  
  
  What breaks: caching
&lt;/h3&gt;

&lt;p&gt;At medium scale and up, you want to drop a &lt;strong&gt;cache layer&lt;/strong&gt; in the middle of xDS. Classic examples are &lt;code&gt;xds-relay&lt;/code&gt; (a relay / caching server for xDS published by the Envoy project) or a setup like &lt;code&gt;Envoy Mobile&lt;/code&gt;, which embeds Envoy in the mobile app's own process (one client per device, O(millions) scale). Once your client count crosses a threshold, the CP can't keep up without a relay.&lt;/p&gt;

&lt;p&gt;But the moment &lt;code&gt;Node&lt;/code&gt; gets tangled into the cache key, it's over.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F03-node-cache-poisoning.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F03-node-cache-poisoning.png" alt="Node-keyed cache poisoning in an xDS relay" width="800" height="560"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A single key &lt;code&gt;foo&lt;/code&gt; doesn't say whose &lt;code&gt;foo&lt;/code&gt; it is. Mix &lt;code&gt;Node&lt;/code&gt; into the cache key and you get a separate entry per Envoy, which defeats the cache entirely. That's the starting point for TP1.&lt;/p&gt;

&lt;h3&gt;
  
  
  The answer: cram the needed context into the name itself
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;xdstp://&lt;/code&gt; URI format, quoted verbatim from the comment in &lt;code&gt;xds/core/v3/resource_name.proto&lt;/code&gt;, is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;xdstp://{authority}/{type_url}/{id}?{context_params}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F04-xdstp-uri-breakdown.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F04-xdstp-uri-breakdown.png" alt="Breakdown of an xdstp:// resource URI" width="800" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The point is that &lt;strong&gt;this one URI uniquely determines the resource&lt;/strong&gt;. Without consulting &lt;code&gt;Node&lt;/code&gt; metadata, you can pin down &lt;code&gt;foo&lt;/code&gt; as "whose, for what, in which state". The relay can do a cache lookup on the URI alone without peeking at &lt;code&gt;Node&lt;/code&gt;, so for the first time caching means something.&lt;/p&gt;

&lt;p&gt;In proto, it looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// xds/core/v3/resource_name.proto&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;ResourceName&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                    &lt;span class="c1"&gt;// "api-fe"&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;authority&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;             &lt;span class="c1"&gt;// "traffic-director.gcp.io"&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;resource_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;         &lt;span class="c1"&gt;// "envoy.config.listener.v3.Listener"&lt;/span&gt;
  &lt;span class="n"&gt;ContextParams&lt;/span&gt; &lt;span class="na"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// {env: prod, region: us}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// xds/core/v3/context_params.proto&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;ContextParams&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;&lt;/span&gt; &lt;span class="na"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;context&lt;/code&gt; takes arbitrary key-values. By convention the &lt;code&gt;xds.resource.*&lt;/code&gt; prefix is reserved; for example &lt;code&gt;xds.resource.listening_address&lt;/code&gt; (e.g. &lt;code&gt;"10.1.1.3:8080"&lt;/code&gt;) is defined for Listeners.&lt;/p&gt;
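&lt;p&gt;To make the four slots concrete, here is a minimal sketch using only Go's standard library (&lt;code&gt;net/url&lt;/code&gt;). The &lt;code&gt;xdstpName&lt;/code&gt; type and the &lt;code&gt;parseXdstp&lt;/code&gt; / &lt;code&gt;cacheKey&lt;/code&gt; helpers are names I made up for illustration, not the parser grpc-go or Envoy ships, and the sketch ignores directives and percent-encoding corner cases. It shows why a relay can key its cache on the name alone: the key is rebuilt from the parsed slots with the context params sorted, so parameter order stops mattering.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// xdstpName mirrors the four slots of xds.core.v3.ResourceName.
// Illustrative type only; not the struct grpc-go or Envoy uses.
type xdstpName struct {
	Authority     string
	Type          string
	ID            string
	ContextParams map[string]string
}

// parseXdstp splits xdstp://{authority}/{type_url}/{id}?{context_params}.
func parseXdstp(raw string) (xdstpName, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return xdstpName{}, err
	}
	if u.Scheme != "xdstp" {
		return xdstpName{}, fmt.Errorf("not an xdstp URI: %q", raw)
	}
	// The path is "/{type_url}/{id}"; the id may itself contain slashes.
	parts := strings.SplitN(strings.TrimPrefix(u.Path, "/"), "/", 2)
	if len(parts) != 2 {
		return xdstpName{}, fmt.Errorf("missing type or id in %q", raw)
	}
	params := map[string]string{}
	for key, values := range u.Query() {
		params[key] = values[0] // keep the first value if a key repeats
	}
	return xdstpName{Authority: u.Host, Type: parts[0], ID: parts[1], ContextParams: params}, nil
}

// cacheKey rebuilds a canonical string from the parsed slots.
// url.Values.Encode sorts by key, so parameter order stops mattering.
func (n xdstpName) cacheKey() string {
	q := url.Values{}
	for key, value := range n.ContextParams {
		q.Set(key, value)
	}
	return n.Authority + "/" + n.Type + "/" + n.ID + "?" + q.Encode()
}

func main() {
	n, err := parseXdstp("xdstp://traffic-director.gcp.io/envoy.config.listener.v3.Listener/api-fe?env=prod")
	if err != nil {
		panic(err)
	}
	fmt.Printf("authority=%q type=%q id=%q ctx=%v\n", n.Authority, n.Type, n.ID, n.ContextParams)
	fmt.Println("cache key:", n.cacheKey())
}
```

&lt;p&gt;Nothing in &lt;code&gt;cacheKey&lt;/code&gt; touches &lt;code&gt;Node&lt;/code&gt;, which is the whole point: every client asking for the same name hits the same entry.&lt;/p&gt;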

&lt;p&gt;At this point the new worldview lands: &lt;strong&gt;an xDS resource name is not a plain string, it's a URI.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design call: what goes where in the name&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because the URI has four slots (authority / type / id / context_params), as a designer you decide every time which piece of info goes in which slot. The guidance:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;id&lt;/strong&gt;: only &lt;strong&gt;the stable identity&lt;/strong&gt; of the resource. A logical name like &lt;code&gt;api-fe&lt;/code&gt;. Do not put variable axes like &lt;code&gt;env&lt;/code&gt; or &lt;code&gt;version&lt;/code&gt; here. Change the id and it's a different resource.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;context_params&lt;/strong&gt;: the axes where &lt;strong&gt;the body changes per client or environment&lt;/strong&gt; (&lt;code&gt;env&lt;/code&gt;, &lt;code&gt;region&lt;/code&gt;). This becomes the cache key and triggers the viral effect below, so don't overload it (§6).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;authority&lt;/strong&gt;: &lt;strong&gt;which CP owns&lt;/strong&gt; the resource (§3).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: fold an environment into the id like &lt;code&gt;api-fe-prod&lt;/code&gt; and names proliferate per environment, and the upstream references (&lt;code&gt;RDS → CDS&lt;/code&gt;) split per environment too. Keep "id immutable, variation in context_params" and you confine that proliferation to one place.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. Authority and federation
&lt;/h2&gt;

&lt;p&gt;The first component of the URI, right after the scheme, is &lt;code&gt;authority&lt;/code&gt;, and that placement is meaningful.&lt;/p&gt;

&lt;h3&gt;
  
  
  One client speaking to multiple control planes
&lt;/h3&gt;

&lt;p&gt;Legacy xDS implicitly assumed &lt;strong&gt;one control plane per client&lt;/strong&gt;. &lt;code&gt;ConfigSource&lt;/code&gt; was basically a single source. But reality isn't a single source:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want a CPaaS (Control Plane as a Service) as primary, with your own on-prem CP as failover&lt;/li&gt;
&lt;li&gt;Multi-cloud, with separate CPs on the AWS and GCP sides while Envoy runs across clusters&lt;/li&gt;
&lt;li&gt;Pull only specific resource types from a different CP managed by a different team&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By building &lt;code&gt;Authority&lt;/code&gt; into the URI, TP1 lets you put a mapping of "authority name -&amp;gt; physical CP address" into the client's bootstrap. For gRPC this exists for real as the &lt;code&gt;authorities&lt;/code&gt; map in the bootstrap JSON. Here's the actual format grpc-go can parse:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xds_servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"client_default_listener_resource_name_template"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"xdstp://traffic-director.gcp.io/envoy.config.listener.v3.Listener/%s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authorities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"traffic-director.gcp.io"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"xds_servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"server_uri"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"trafficdirector.googleapis.com:443"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"channel_creds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"google_default"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"server_features"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"xds_v3"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"onprem-cp.internal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"xds_servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"server_uri"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"istiod.mesh.svc:15010"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"channel_creds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"insecure"&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"server_features"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"xds_v3"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now writing &lt;code&gt;xdstp://traffic-director.gcp.io/...&lt;/code&gt; in a resource URI sends the query to the former, and &lt;code&gt;xdstp://onprem-cp.internal/...&lt;/code&gt; to the latter. One Envoy / gRPC client can &lt;strong&gt;speak to multiple CPs at once, keyed on a "logical authority"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F05-authority-federation.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F05-authority-federation.png" alt="One client federating across two authorities" width="799" height="449"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's the seed of &lt;strong&gt;xDS federation&lt;/strong&gt;. In the gRPC world the client-side bootstrap spec is nailed down in the &lt;code&gt;A47-xds-federation&lt;/code&gt; proposal. Look inside grpc-go's &lt;code&gt;internal/xds/bootstrap&lt;/code&gt; and the &lt;code&gt;Authorities&lt;/code&gt; field is already implemented, structured to hold a list of &lt;code&gt;ServerConfig&lt;/code&gt; per authority.&lt;/p&gt;
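&lt;p&gt;As a rough, stdlib-only sketch of that authority lookup (not grpc-go's actual code): the struct names, &lt;code&gt;loadBootstrap&lt;/code&gt;, &lt;code&gt;serverFor&lt;/code&gt;, and the &lt;code&gt;default-cp.example:443&lt;/code&gt; placeholder standing in for the elided top-level &lt;code&gt;xds_servers&lt;/code&gt; are all mine. It assumes, as I read gRFC A47, that an authority with no server list of its own falls back to the top-level one and that an authority missing from the bootstrap is an error.&lt;/p&gt;

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// Only the fields this sketch needs; the real bootstrap has many more.
type serverConfig struct {
	ServerURI string `json:"server_uri"`
}

type authorityConfig struct {
	XDSServers []serverConfig `json:"xds_servers"`
}

type bootstrapConfig struct {
	XDSServers  []serverConfig             `json:"xds_servers"`
	Authorities map[string]authorityConfig `json:"authorities"`
}

// loadBootstrap decodes the subset of the bootstrap JSON modeled above.
func loadBootstrap(raw string) (bootstrapConfig, error) {
	cfg := new(bootstrapConfig)
	if err := json.Unmarshal([]byte(raw), cfg); err != nil {
		return bootstrapConfig{}, err
	}
	return *cfg, nil
}

// serverFor picks the control plane address for a resource name.
// Legacy names and authority-less xdstp names use the top-level servers;
// an authority without its own xds_servers also falls back to them (assumption).
func (b bootstrapConfig) serverFor(name string) (string, error) {
	servers := b.XDSServers
	if strings.HasPrefix(name, "xdstp://") {
		u, err := url.Parse(name)
		if err != nil {
			return "", err
		}
		if u.Host != "" {
			a, ok := b.Authorities[u.Host]
			if !ok {
				return "", fmt.Errorf("authority %q is not in the bootstrap", u.Host)
			}
			if len(a.XDSServers) != 0 {
				servers = a.XDSServers
			}
		}
	}
	if len(servers) == 0 {
		return "", fmt.Errorf("no xds_servers available for %q", name)
	}
	return servers[0].ServerURI, nil
}

// default-cp.example is a placeholder for the top-level server list.
const sample = `{
  "xds_servers": [{"server_uri": "default-cp.example:443"}],
  "authorities": {
    "traffic-director.gcp.io": {"xds_servers": [{"server_uri": "trafficdirector.googleapis.com:443"}]},
    "onprem-cp.internal": {"xds_servers": [{"server_uri": "istiod.mesh.svc:15010"}]}
  }
}`

func main() {
	cfg, err := loadBootstrap(sample)
	if err != nil {
		panic(err)
	}
	names := []string{
		"xdstp://traffic-director.gcp.io/envoy.config.listener.v3.Listener/api-fe",
		"xdstp://onprem-cp.internal/envoy.config.listener.v3.Listener/api-fe?env=prod",
		"api-fe", // legacy opaque name
	}
	for _, name := range names {
		addr, err := cfg.serverFor(name)
		fmt.Printf("%s is served by %q (err: %v)\n", name, addr, err)
	}
}
```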

&lt;h3&gt;
  
  
  Run it to check: feed it to the bootstrap parser
&lt;/h3&gt;

&lt;p&gt;"It's implemented" sounds like a cop-out in words, so I fed the bootstrap above to grpc-go's real parser (&lt;code&gt;internal/xds/bootstrap&lt;/code&gt;). It does three things only: (1) parse the bootstrap and look up authority -&amp;gt; CP address, (2) watch the legacy logical target &lt;code&gt;xds:///api-fe&lt;/code&gt; expand into &lt;code&gt;xdstp://&lt;/code&gt; via &lt;code&gt;client_default_listener_resource_name_template&lt;/code&gt;, (3) parse the resulting URI back apart with &lt;code&gt;ParseName()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;bootstrap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewConfigFromContents&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bs&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="c"&gt;// bs = the JSON above&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Authorities&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"%-24s -&amp;gt; %s&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;XDSServers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServerURI&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c"&gt;// xds:///api-fe expands into an xdstp:// name via the template&lt;/span&gt;
&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bootstrap&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PopulateResourceTemplate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClientDefaultListenerResourceNameTemplate&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s"&gt;"api-fe"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="c"&gt;// parse the resulting URI back apart&lt;/span&gt;
&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;xdsresource&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ParseName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"xdstp://onprem-cp.internal/envoy.config.listener.v3.Listener/api-fe?env=prod"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"authority=%q type=%q id=%q ctx=%v&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Authority&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ContextParams&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clone &lt;code&gt;grpc-go&lt;/code&gt;, run this (confirmed on &lt;code&gt;grpc-go 0f3086d&lt;/code&gt; / &lt;code&gt;go1.26.4&lt;/code&gt;), and the actual output is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;== parsed authorities ==
  traffic-director.gcp.io  -&amp;gt; trafficdirector.googleapis.com:443
  onprem-cp.internal       -&amp;gt; istiod.mesh.svc:15010

== xds:///api-fe expands via client_default template ==
  xds:///api-fe  =&amp;gt;  xdstp://traffic-director.gcp.io/envoy.config.listener.v3.Listener/api-fe

== ParseName() splits an xdstp URI back into parts ==
  scheme="xdstp" authority="onprem-cp.internal"
  type="envoy.config.listener.v3.Listener" id="api-fe" ctx=map[env:prod]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The payoff is in the second and third blocks. &lt;strong&gt;A perfectly ordinary target &lt;code&gt;xds:///api-fe&lt;/code&gt; turns, internally, into an authority-qualified URI &lt;code&gt;xdstp://traffic-director.gcp.io/.../api-fe&lt;/code&gt;&lt;/strong&gt;. And &lt;code&gt;ParseName()&lt;/code&gt; cleanly splits an &lt;code&gt;xdstp://&lt;/code&gt; URI into &lt;code&gt;authority / type / id / context_params&lt;/code&gt; (note &lt;code&gt;?env=prod&lt;/code&gt; getting picked up as &lt;code&gt;ctx=map[env:prod]&lt;/code&gt;). In §2 I wrote "a resource name isn't a string, it's a URI", and here that is literally true at the code level.&lt;/p&gt;
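&lt;p&gt;Step (2), the template expansion, is small enough to sketch by hand. This is a stdlib approximation, assuming (per my reading of gRFC A47) that for &lt;code&gt;xdstp:&lt;/code&gt; templates the target is percent-encoded with &lt;code&gt;/&lt;/code&gt; left alone. &lt;code&gt;expandTemplate&lt;/code&gt; is a made-up name, and &lt;code&gt;url.PathEscape&lt;/code&gt; may not match grpc-go's own encoder character for character.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// expandTemplate substitutes target for "%s" in a listener name template.
// For xdstp: templates the target is percent-encoded segment by segment,
// leaving "/" alone. url.PathEscape approximates the real encoding rules.
func expandTemplate(template, target string) string {
	if !strings.Contains(template, "%s") {
		return template
	}
	if strings.HasPrefix(template, "xdstp:") {
		segments := strings.Split(target, "/")
		for i, s := range segments {
			segments[i] = url.PathEscape(s)
		}
		target = strings.Join(segments, "/")
	}
	return strings.ReplaceAll(template, "%s", target)
}

func main() {
	const tmpl = "xdstp://traffic-director.gcp.io/envoy.config.listener.v3.Listener/%s"
	fmt.Println(expandTemplate(tmpl, "api-fe"))
	fmt.Println(expandTemplate(tmpl, "team a/api-fe"))
	fmt.Println(expandTemplate("%s", "api-fe")) // non-xdstp template: plain substitution
}
```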

&lt;p&gt;&lt;strong&gt;Design call: how many CPs, and where to cut the authority boundary&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Authority isn't a physical CP address; it's &lt;strong&gt;the logical boundary of "who owns this set of resources"&lt;/strong&gt;. So you draw the line along the org chart and trust boundaries.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;When to split&lt;/strong&gt;: (1) different managing team, (2) different trust boundary (your own CP vs a vendor CP), (3) you want fault isolation (escape to your own CP when CPaaS is down), (4) different lifecycle (config that changes constantly vs config that almost never does)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When not to split&lt;/strong&gt;: merely a different physical DC or region. That should be a context_param (&lt;code&gt;region=us&lt;/code&gt;); splitting the authority for it is overkill&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: split too much and the bootstrap bloats while cross-authority resource references multiply, complicating operations. Split too little and a single CP becomes the SPOF / scaling ceiling. "Split only where ownership splits" is the sweet spot&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Resource Locator: failover and inline references
&lt;/h2&gt;

&lt;p&gt;By looks alone &lt;code&gt;xdstp://&lt;/code&gt; resembles a URL so closely it's easy to miss, but the &lt;code&gt;fragment&lt;/code&gt; (after &lt;code&gt;#&lt;/code&gt;) hides an extension with real teeth: &lt;strong&gt;directives&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;xdstp://{authority}/{type_url}/{id}?{context_params}{#directive,*}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In proto that's &lt;code&gt;ResourceLocator&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// xds/core/v3/resource_locator.proto&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;ResourceLocator&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;Scheme&lt;/span&gt; &lt;span class="na"&gt;scheme&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// XDSTP / HTTP / FILE&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;authority&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;resource_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;context_param_specifier&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ContextParams&lt;/span&gt; &lt;span class="na"&gt;exact_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;repeated&lt;/span&gt; &lt;span class="n"&gt;Directive&lt;/span&gt; &lt;span class="na"&gt;directives&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;Directive&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;directive&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;ResourceLocator&lt;/span&gt; &lt;span class="na"&gt;alt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// failover target&lt;/span&gt;
      &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;entry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// inline reference&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let me kill one confusing naming collision here. This &lt;code&gt;ResourceLocator&lt;/code&gt; / &lt;code&gt;ResourceName&lt;/code&gt; is from &lt;strong&gt;&lt;code&gt;xds.core.v3&lt;/code&gt;&lt;/strong&gt; (cncf/xds): the core type that defines "the grammar of a name". Separately, the &lt;code&gt;ResourceLocator&lt;/code&gt; carrying &lt;code&gt;dynamic_parameters&lt;/code&gt; and the &lt;code&gt;ResourceName&lt;/code&gt; carrying &lt;code&gt;dynamic_parameter_constraints&lt;/code&gt; that show up in §0 and §6 are the same-named messages on the &lt;strong&gt;&lt;code&gt;envoy.service.discovery.v3&lt;/code&gt;&lt;/strong&gt; side: a different thing, for transport (subscribe / deliver). Same names, different layers (one is "the type of the name itself", the other "the type of the discovery message that carries that name"). This chapter is reading the former, the core type.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;alt=&lt;/code&gt;: failover
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;alt&lt;/code&gt; directive is an instruction meaning &lt;strong&gt;"if you can't fetch this resource, try the alternate URI"&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;xdstp://gcp-cp/envoy.config.endpoint.v3.ClusterLoadAssignment/foo#alt=xdstp://onprem-cp/envoy.config.endpoint.v3.ClusterLoadAssignment/bar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;"If the GCP-side CP is unreachable, fall back to the on-prem CP" becomes something &lt;strong&gt;you can write in a single string&lt;/strong&gt;. The client only needs both authorities registered in its bootstrap.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F06-alt-failover-sequence.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F06-alt-failover-sequence.png" alt="Resource Locator alt= failover sequence" width="800" height="756"&gt;&lt;/a&gt;&lt;/p&gt;
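&lt;p&gt;To make the failover concrete, here is a minimal Python sketch of how a client could honor &lt;code&gt;alt=&lt;/code&gt;. It is not taken from any real xDS client: &lt;code&gt;split_alt&lt;/code&gt;, &lt;code&gt;fetch_with_failover&lt;/code&gt; and the &lt;code&gt;fetch&lt;/code&gt; callable are hypothetical names, and it assumes the fragment carries only a single &lt;code&gt;alt=&lt;/code&gt; directive and that an unreachable authority surfaces as &lt;code&gt;ConnectionError&lt;/code&gt;.&lt;/p&gt;

```python
from urllib.parse import urlsplit


def split_alt(uri):
    """Return (primary_uri, alt_uri_or_None) for an xdstp:// locator."""
    parts = urlsplit(uri)
    if parts.scheme != "xdstp":
        raise ValueError("not an xdstp URI: " + uri)
    primary = uri.split("#", 1)[0]  # everything before the directive
    alt = None
    # Assumption: the fragment holds exactly one directive, alt=...
    if parts.fragment.startswith("alt="):
        alt = parts.fragment[len("alt="):]
    return primary, alt


def fetch_with_failover(uri, fetch):
    """Try the primary locator; on ConnectionError, follow alt=."""
    while uri is not None:
        primary, alt = split_alt(uri)
        try:
            return fetch(primary)  # fetch is a caller-supplied callable
        except ConnectionError:
            uri = alt  # None ends the loop when no alternate is left
    raise ConnectionError("no reachable authority")
```

&lt;p&gt;The point of the sketch is that the failover decision lives entirely in the string: the caller passes one URI and never names the backup authority in code.&lt;/p&gt;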

&lt;h3&gt;
  
  
  &lt;code&gt;entry=&lt;/code&gt;: inline reference
&lt;/h3&gt;

&lt;p&gt;When a List collection (below) inline-expands several resources, this is the fragment to &lt;strong&gt;reference one specific entry inside the collection by name&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;xdstp://some-authority/envoy.config.listeners.v3.ListenerCollection/foo#entry=bar
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means "the item named &lt;code&gt;bar&lt;/code&gt; inside collection &lt;code&gt;foo&lt;/code&gt;". After you pull the whole collection, you can address the entries inside it by URI again; this recursive use is allowed.&lt;/p&gt;
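&lt;p&gt;As a rough illustration of that lookup, the Python sketch below resolves an &lt;code&gt;#entry=&lt;/code&gt; reference against collections the client has already fetched. The &lt;code&gt;resolve_entry&lt;/code&gt; function and the shape of &lt;code&gt;collections&lt;/code&gt; (collection URI mapped to a dict of entry name to body) are assumptions made for the example, not part of any xDS library.&lt;/p&gt;

```python
from urllib.parse import urlsplit


def resolve_entry(uri, collections):
    """Look up '#entry=NAME' inside an already-fetched collection.

    collections: dict mapping collection URI -> {entry name: body}.
    """
    parts = urlsplit(uri)
    collection_uri = uri.split("#", 1)[0]
    entries = collections[collection_uri]
    # Assumption: the fragment is either empty or a single entry= directive.
    if not parts.fragment.startswith("entry="):
        return entries  # no directive: the caller gets the whole collection
    return entries[parts.fragment[len("entry="):]]
```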

&lt;p&gt;&lt;strong&gt;Design call: whether to bake failover and references into the name&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A directive is a tool for "declaratively baking behavior into the name". Handy, but bake too much in and the naming scheme turns rigid.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;alt=&lt;/code&gt;&lt;/strong&gt;: for failover where the alternate is &lt;strong&gt;statically determined&lt;/strong&gt; (primary CP -&amp;gt; backup CP). No client-side logic; the intent fits in one URI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use &lt;code&gt;entry=&lt;/code&gt;&lt;/strong&gt;: when you want to reference and reuse an entry inside a collection by name after fetching it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: if "where to fail over to" changes dynamically and you bake it into the name with &lt;code&gt;alt=&lt;/code&gt;, the resource name has to change on every switch and the naming scheme turns rigid. Keep dynamic decisions in the client / LB, and bake only static things into the name&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. Collections and glob: formalizing bulk subscription
&lt;/h2&gt;

&lt;p&gt;This one quietly pays off. As I wrote in Part 1, in old xDS &lt;strong&gt;only LDS / CDS were special&lt;/strong&gt;: empty &lt;code&gt;resource_names&lt;/code&gt; meant the implicit rule "wildcard = give me everything". RDS / EDS / SDS, meanwhile, were explicit-subscription. That &lt;strong&gt;mix of special-casing and implicit rules&lt;/strong&gt; had become technical debt.&lt;/p&gt;

&lt;p&gt;TP1 cleaned this up by &lt;strong&gt;making collections (sets) a first-class concept&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  List collection
&lt;/h3&gt;

&lt;p&gt;Alongside the singleton URI, which references one resource, there is also a collection URI, which references a set.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# singleton (one resource)
xdstp://auth/envoy.config.listeners.v3.Listener/api-fe

# list collection (a set of resources)
xdstp://auth/envoy.config.listeners.v3.ListenerCollection/my-listeners
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The server can answer in two ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Return a list of Locators&lt;/strong&gt;: "the bodies are at other URIs, come fetch them yourself"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embed bodies via InlineEntry&lt;/strong&gt;: "I want to save round-trips, so here are the bodies too"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The proto is &lt;code&gt;CollectionEntry&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// xds/core/v3/collection_entry.proto&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;CollectionEntry&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;resource_specifier&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ResourceLocator&lt;/span&gt; &lt;span class="na"&gt;locator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;// 1. points to another URI&lt;/span&gt;
    &lt;span class="n"&gt;InlineEntry&lt;/span&gt; &lt;span class="na"&gt;inline_entry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;    &lt;span class="c1"&gt;// 2. hands you the body right here&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Glob collection: formalizing the wildcard
&lt;/h3&gt;

&lt;p&gt;When you want to subscribe to "everything matching a given prefix", use a &lt;strong&gt;glob&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;xdstp://auth/envoy.config.listeners.v3.Listener/my-listeners/*?node_type=ingress
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trailing &lt;code&gt;/*&lt;/code&gt; is the glob. You can send this as a subscription, and the server returns every matching resource. A context parameter like &lt;code&gt;node_type=ingress&lt;/code&gt; narrows it further.&lt;/p&gt;
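&lt;p&gt;The matching rule can be sketched in a few lines of Python. This is a simplified model, not the server-side algorithm from TP1: &lt;code&gt;parse_name&lt;/code&gt; and &lt;code&gt;glob_matches&lt;/code&gt; are hypothetical helpers, and the sketch assumes a resource matches when the authority is equal, the path sits under the glob prefix, and the context parameters are exactly equal.&lt;/p&gt;

```python
from urllib.parse import urlsplit, parse_qsl


def parse_name(uri):
    """Split an xdstp URI into (authority, path, context-parameter dict)."""
    parts = urlsplit(uri)
    return parts.netloc, parts.path.lstrip("/"), dict(parse_qsl(parts.query))


def glob_matches(subscription, resource_uri):
    """True if resource_uri falls under a '/*' glob subscription."""
    sub_auth, sub_path, sub_ctx = parse_name(subscription)
    res_auth, res_path, res_ctx = parse_name(resource_uri)
    if not sub_path.endswith("/*"):
        return subscription == resource_uri  # singleton: exact name only
    prefix = sub_path[:-1]  # drop '*', keep the trailing '/'
    return (
        sub_auth == res_auth
        and res_path.startswith(prefix)
        and len(res_path) > len(prefix)  # something must follow the prefix
        and sub_ctx == res_ctx  # assumption: exact-match context parameters
    )
```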

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F07-glob-collection.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F07-glob-collection.png" alt="Glob collection subscription filtered by context parameter" width="800" height="552"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This kills the LDS / CDS "empty string = wildcard" black magic. Everything is now expressed through structured URIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design call: subscribe one at a time, or ship in bundles&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;How you present resources to the client is your call.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;singleton (explicit subscribe)&lt;/strong&gt;: when the client &lt;strong&gt;knows the names it needs in advance&lt;/strong&gt;. Simplest when the count is bounded and relationships are static&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;list collection&lt;/strong&gt;: when the server wants to manage "this set". Send bodies via &lt;code&gt;inline_entry&lt;/code&gt; to &lt;strong&gt;save round-trips&lt;/strong&gt;, or return only &lt;code&gt;locator&lt;/code&gt; for &lt;strong&gt;lazy fetch&lt;/strong&gt;. Use this when set members change often&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;glob (&lt;code&gt;/*&lt;/code&gt;)&lt;/strong&gt;: when the client &lt;strong&gt;doesn't know the individual names / they're dynamically added and removed&lt;/strong&gt; (e.g. all ingresses). Narrow with &lt;code&gt;?node_type=ingress&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: a too-broad glob ships unneeded resources and eats bandwidth and memory. Conversely, making everything a singleton makes the subscription list a chore, requiring client-side subscription updates every time you add a resource&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  6. Context Parameters and Dynamic Parameters: how to express "variants"
&lt;/h2&gt;

&lt;p&gt;Things get a bit more advanced here. How do you answer the demand: &lt;strong&gt;"same resource named &lt;code&gt;foo&lt;/code&gt;, but I want a different body per client"&lt;/strong&gt;?&lt;/p&gt;

&lt;h3&gt;
  
  
  The Context Parameters (TP1) way
&lt;/h3&gt;

&lt;p&gt;As we saw, you put it in the URI's query string.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;xdstp://auth/RouteConfiguration/foo?env=prod&amp;amp;version=v1
xdstp://auth/RouteConfiguration/foo?env=canary&amp;amp;version=v1
xdstp://auth/RouteConfiguration/foo?env=prod&amp;amp;version=v2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are treated as &lt;strong&gt;completely separate resources&lt;/strong&gt;. Different cache key, different subscription.&lt;/p&gt;

&lt;p&gt;But this approach has a fatal weakness, which the TP2 document articulates well: the phenomenon of &lt;strong&gt;virality&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The xDS reference graph runs top-down, &lt;code&gt;LDS → RDS → CDS → EDS&lt;/code&gt;. Variants spread &lt;strong&gt;in the opposite direction&lt;/strong&gt;. If EDS has two variants &lt;code&gt;env=prod&lt;/code&gt; and &lt;code&gt;env=canary&lt;/code&gt;, the CDS that references it splits into two, each pointing at one variant's URI. Once CDS splits into two, the RDS pointing at it needs two as well. And the LDS above that, two. In other words, &lt;strong&gt;"an EDS variant climbs up the dependency graph and swaps out the upstream resources wholesale"&lt;/strong&gt;. The diagram below is for a single &lt;code&gt;env&lt;/code&gt; axis (green -&amp;gt; orange -&amp;gt; blue -&amp;gt; purple is the upstream direction).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F08-context-params-viral.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F08-context-params-viral.png" alt="Context-parameter variants spreading virally upstream" width="800" height="752"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On top of that, context parameters are exact-match only, so adding axes causes a combinatorial blowup. Let me count it concretely. Just wanting &lt;code&gt;env={prod,canary,test}&lt;/code&gt; x &lt;code&gt;version={v1,v2,v3}&lt;/code&gt;, two axes of three values each, gives &lt;code&gt;3 x 3 = 9&lt;/code&gt; variants. And because it's viral, those 9 spread across every layer of the dependency graph.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Count the CP holds under TP1 (context params)&lt;/th&gt;
&lt;th&gt;Breakdown&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;EDS&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;&lt;code&gt;env x version&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CDS (-&amp;gt; EDS)&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;one per EDS variant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RDS (-&amp;gt; CDS)&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;spread upstream&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LDS (-&amp;gt; RDS)&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;spread further up&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;36&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;logically one service&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Logically it's "one service with an env and version axis", yet as xDS resources it's 36. Adding a single EDS variant forces a matching variant in every layer above it: CDS, RDS, and LDS. That's the weakness of TP1.&lt;/p&gt;
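&lt;p&gt;The arithmetic is easy to reproduce. The short Python sketch below recomputes the count from the axes and layers used in the table; &lt;code&gt;tp1_resource_count&lt;/code&gt; is a name invented for this example, and it assumes every variant spreads to every layer, as in the worst case described here.&lt;/p&gt;

```python
from math import prod


def tp1_resource_count(axes, layers):
    """Variants = product of per-axis value counts; each layer holds all of them."""
    variants = prod(len(values) for values in axes.values())
    return variants * len(layers)


axes = {"env": ["prod", "canary", "test"], "version": ["v1", "v2", "v3"]}
layers = ["EDS", "CDS", "RDS", "LDS"]
print(tp1_resource_count(axes, layers))  # 36
```

&lt;p&gt;Add a fourth value to &lt;code&gt;env&lt;/code&gt; and the same function returns 48, which is the growth pattern the table warns about.&lt;/p&gt;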

&lt;h3&gt;
  
  
  The Dynamic Parameters (TP2) way
&lt;/h3&gt;

&lt;p&gt;TP2 flips this. &lt;strong&gt;It evicts "the info used to select a variant" from the resource name and into a separate field that only the transport layer sees.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// envoy/service/discovery/v3/discovery.proto (real; also generated in go-control-plane)&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;DynamicParameterConstraints&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;SingleConstraint&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;constraint_type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;        &lt;span class="c1"&gt;// exact value match&lt;/span&gt;
      &lt;span class="n"&gt;Exists&lt;/span&gt; &lt;span class="na"&gt;exists&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// key-existence check&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;SingleConstraint&lt;/span&gt; &lt;span class="na"&gt;constraint&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;ConstraintList&lt;/span&gt; &lt;span class="na"&gt;or_constraints&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;ConstraintList&lt;/span&gt; &lt;span class="na"&gt;and_constraints&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;DynamicParameterConstraints&lt;/span&gt; &lt;span class="na"&gt;not_constraints&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the wire the flow splits in two directions. The client puts &lt;code&gt;dynamic_parameters&lt;/code&gt; (&lt;code&gt;map&amp;lt;string, string&amp;gt;&lt;/code&gt;) on the subscribing &lt;code&gt;ResourceLocator&lt;/code&gt; to say "these are the parameters I hold". The server puts &lt;code&gt;dynamic_parameter_constraints&lt;/code&gt; on the returned &lt;code&gt;ResourceName&lt;/code&gt; to say "this resource is for clients satisfying this constraint". The important part is that this constraint is &lt;strong&gt;not part of the resource's id (the URI string)&lt;/strong&gt;. The EDS a CDS points at stays a single name, yet the EDS side can serve out the &lt;code&gt;env=prod&lt;/code&gt; variant and the &lt;code&gt;env=canary&lt;/code&gt; variant separately.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F09-dynamic-params-nonviral.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F09-dynamic-params-nonviral.png" alt="Dynamic Parameters keep a single name with constraints" width="800" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So what happens to that "36"? Under Dynamic Parameters the &lt;strong&gt;resource names stay the four &lt;code&gt;LDS / RDS / CDS / EDS&lt;/code&gt;&lt;/strong&gt;, and the &lt;code&gt;env&lt;/code&gt; / &lt;code&gt;version&lt;/code&gt; axes exist only as constraints attached to EDS. Variants don't pollute the namespace, so the upstream viral spread stops. The constraint's expressive power is &lt;code&gt;value&lt;/code&gt; exact-match, &lt;code&gt;Exists&lt;/code&gt; (key existence), and AND / OR / NOT combinations, so you can express exactly the variants you actually need via constraints. And a &lt;strong&gt;caching xDS proxy&lt;/strong&gt; only needs to look at the constraint to remember multiple variants, without touching the data-model graph.&lt;/p&gt;
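&lt;p&gt;Since the serving behavior is not implemented anywhere yet, the following Python sketch is only a model of how constraint evaluation could work. Constraints are written as nested dicts whose keys mirror the proto field names above (&lt;code&gt;constraint&lt;/code&gt;, &lt;code&gt;or_constraints&lt;/code&gt;, &lt;code&gt;and_constraints&lt;/code&gt;, &lt;code&gt;not_constraints&lt;/code&gt;); the inner &lt;code&gt;constraints&lt;/code&gt; list, the &lt;code&gt;satisfies&lt;/code&gt; and &lt;code&gt;pick_variant&lt;/code&gt; functions, and the "first match wins" rule are assumptions of this example.&lt;/p&gt;

```python
def satisfies(constraint, params):
    """Evaluate a DynamicParameterConstraints-shaped dict against client params."""
    if "constraint" in constraint:
        single = constraint["constraint"]
        if "exists" in single:
            return single["key"] in params  # key-existence check
        return params.get(single["key"]) == single["value"]  # exact match
    if "or_constraints" in constraint:
        children = constraint["or_constraints"]["constraints"]
        return any(satisfies(child, params) for child in children)
    if "and_constraints" in constraint:
        children = constraint["and_constraints"]["constraints"]
        return all(satisfies(child, params) for child in children)
    if "not_constraints" in constraint:
        return not satisfies(constraint["not_constraints"], params)
    raise ValueError("empty or unknown constraint")


def pick_variant(variants, params):
    """variants: list of (constraint, body). Assumption: first match wins."""
    for constraint, body in variants:
        if satisfies(constraint, params):
            return body
    return None
```

&lt;p&gt;Note that the resource name never appears in this code: one EDS name can carry a prod body and a canary body, and only the client's &lt;code&gt;dynamic_parameters&lt;/code&gt; decide which one it receives.&lt;/p&gt;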

&lt;p&gt;That said, while the proto type has landed in envoy, TP2's own &lt;code&gt;Implementation&lt;/code&gt; section is still &lt;code&gt;TBD (Will probably be implemented in gRPC before Envoy)&lt;/code&gt;. The type exists, but the behavior that actually serves out variants is, like TP3, expected to come to gRPC first, and is still ahead of us.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design call: bake variants into the name, or keep them out (the chapter's crux)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the single biggest design fork in this article. Which one you pick changes how things break when variants grow.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pick Context Parameters&lt;/strong&gt;: when variant axes are &lt;strong&gt;few&lt;/strong&gt; and the upstream viral spread is tolerable. A simple, exact-match-only mechanism that &lt;strong&gt;works today on both Envoy and gRPC&lt;/strong&gt;. Clean cache keys too. For a small-to-medium single &lt;code&gt;env&lt;/code&gt; axis, this is plenty&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plan for Dynamic Parameters&lt;/strong&gt;: when variant axes are &lt;strong&gt;many&lt;/strong&gt; and the resource count (the product of each axis's value count, times the number of layers) blows up combinatorially. It doesn't pollute the name, so virality stops. But it needs constraint-evaluation logic, and &lt;strong&gt;the behavior is currently unimplemented&lt;/strong&gt; (gRPC expected first). You can't deploy it today, but you can choose to not pollute your name design now, on the premise of moving there later&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A numeric criterion&lt;/strong&gt;: as counted earlier in this section, just two axes of three values each (&lt;code&gt;env x version&lt;/code&gt;) already balloon to 36 resources. &lt;strong&gt;The moment "the variant count looks like it'll hit double digits" is the danger signal for a context_params-only approach.&lt;/strong&gt; At that point, hold the line on "don't bake variants into the id" and deliberately keep the variant count down until the dynamic-parameters implementation catches up&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: carelessly grow context_params and the entire reference graph splits per variant and cache efficiency collapses. Conversely, deploy dynamic parameters today "because it's new" and there's no serving side to implement it, so it simply doesn't work&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  7. TP3: returning per-resource errors without tearing down the stream
&lt;/h2&gt;

&lt;p&gt;Unglamorous, but it bites once you operate this stuff.&lt;/p&gt;

&lt;p&gt;In legacy xDS there was effectively one way to say "can't fetch this resource": &lt;strong&gt;tear down the whole stream with a non-OK status&lt;/strong&gt;. That's way too blunt. One NOT_FOUND out of 100 resources stops the other 99 updates too.&lt;/p&gt;

&lt;p&gt;Worse, the SotW protocol had no way to even express "the resource doesn't exist"; the client had to wait for a &lt;strong&gt;15-second does-not-exist timer&lt;/strong&gt; to fire. This is a spec hole (also documented in &lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol" rel="noopener noreferrer"&gt;Envoy's official docs&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;TP3 solves it by adding a &lt;code&gt;resource_errors&lt;/code&gt; field to &lt;code&gt;DiscoveryResponse&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;ResourceError&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;ResourceName&lt;/span&gt; &lt;span class="na"&gt;resource_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;google.rpc.Status&lt;/span&gt; &lt;span class="na"&gt;error_detail&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;DiscoveryResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ...existing fields elided...&lt;/span&gt;
  &lt;span class="k"&gt;repeated&lt;/span&gt; &lt;span class="n"&gt;ResourceError&lt;/span&gt; &lt;span class="na"&gt;resource_errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The client branches on the status code:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;status code&lt;/th&gt;
&lt;th&gt;client behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;UNAVAILABLE&lt;/code&gt; / &lt;code&gt;INTERNAL&lt;/code&gt; / &lt;code&gt;UNKNOWN&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;treat as transient. Keep using the last good config&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;NOT_FOUND&lt;/code&gt; / &lt;code&gt;PERMISSION_DENIED&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;treat as a data error. Free to drop the cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;anything else&lt;/td&gt;
&lt;td&gt;undefined. SHOULD treat the same as transient&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
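&lt;p&gt;The table translates into a very small piece of client logic. The Python sketch below is a model, not code from gRPC or Envoy: &lt;code&gt;apply_resource_errors&lt;/code&gt; is a hypothetical function, the cache is a plain dict of resource name to last good config, and errors arrive as &lt;code&gt;(name, status_code)&lt;/code&gt; pairs with the code given as a string.&lt;/p&gt;

```python
TRANSIENT = {"UNAVAILABLE", "INTERNAL", "UNKNOWN"}
DATA_ERROR = {"NOT_FOUND", "PERMISSION_DENIED"}


def classify(code):
    """Map a status code name to the handling class from the table."""
    if code in DATA_ERROR:
        return "data_error"
    # Listed transient codes and any undefined code are treated alike.
    return "transient"


def apply_resource_errors(cache, resource_errors):
    """Apply per-resource errors to a cache of name -> last good config."""
    for name, code in resource_errors:
        if classify(code) == "data_error":
            cache.pop(name, None)  # free to drop the cached copy
        # transient: keep serving the last good config, nothing to do
    return cache
```

&lt;p&gt;Resources that are not named in the error list are untouched, which is the whole point: one bad resource no longer takes the others down with it.&lt;/p&gt;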

&lt;p&gt;The benefit comes down to two things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No tearing down the stream&lt;/strong&gt;. With &lt;code&gt;subscribe [foo, bar, baz]&lt;/code&gt;, if only &lt;code&gt;baz&lt;/code&gt; goes NOT_FOUND, the &lt;code&gt;foo&lt;/code&gt; and &lt;code&gt;bar&lt;/code&gt; updates keep running on the same stream&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SotW can express does-not-exist instantly&lt;/strong&gt;. No waiting on the 15-second timer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The TP3 text says &lt;code&gt;This will probably be implemented in gRPC before Envoy.&lt;/code&gt;, so gRPC is expected to land first, with Envoy following.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design call: what granularity to pick for resources (the NACK blast radius)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TP3 is about "error granularity", but what a designer actually controls is &lt;strong&gt;the "resource granularity" upstream of it&lt;/strong&gt;. The more you cram into one resource, the wider the collateral damage when it NACKs.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Make it fat (all vhosts in one big Route)&lt;/strong&gt;: easy to manage, but one invalid spot &lt;strong&gt;NACKs the whole resource&lt;/strong&gt;, halting updates for unrelated paths too. Before TP3, this was even a full stream teardown&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Split it fine (per-vhost / per-service)&lt;/strong&gt;: small NACK blast radius. Carve the fragile, high-churn parts into separate resources and a NACK there leaves the rest running&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TP3 presence changes your tolerance&lt;/strong&gt;: if the delivery target is gRPC (TP3 first), per-resource errors shrink the collateral, so you can stomach a bit of fatness. If it's Envoy-heavy without TP3 yet, &lt;strong&gt;design granularity finer to physically shrink the collateral of an error&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: a giant resource invites "one typo fully outages it". Over-splitting invites a blowup in subscription count and reference-graph complexity. Split fine only where "change frequency x blast impact" is high&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  8. ORCA: the moment xDS outgrew "shipping config"
&lt;/h2&gt;

&lt;p&gt;The lineage changes here. So far it's been "how to ship resources". ORCA is a protocol for &lt;strong&gt;carrying load info from the backend to the client / LB&lt;/strong&gt;, and it lives in &lt;code&gt;xds/service/orca/v3/orca.proto&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why it's needed
&lt;/h3&gt;

&lt;p&gt;For an LB to route smartly, it wants to know whether a backend is "heavy or light right now": CPU usage, memory, a custom business cost metric, whatever. If each backend can hand this to the client / LB periodically or per-response, you get smarter load balancing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Two modes
&lt;/h3&gt;

&lt;p&gt;ORCA has two delivery styles, laid out in &lt;a href="https://grpc.io/docs/guides/custom-backend-metrics/" rel="noopener noreferrer"&gt;grpc.io's Custom Backend Metrics guide&lt;/a&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;When it sends&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per-query (in-band)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;rides the trailer at RPC end&lt;/td&gt;
&lt;td&gt;short unary RPCs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OOB (Out-of-Band)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;pushed periodically on a separate stream&lt;/td&gt;
&lt;td&gt;streaming RPCs, works at zero QPS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The OOB service definition is just this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// xds/service/orca/v3/orca.proto&lt;/span&gt;
&lt;span class="kd"&gt;service&lt;/span&gt; &lt;span class="n"&gt;OpenRcaService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;rpc&lt;/span&gt; &lt;span class="n"&gt;StreamCoreMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OrcaLoadReportRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;returns&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;stream&lt;/span&gt; &lt;span class="n"&gt;xds.data.orca.v3.OrcaLoadReport&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;OrcaLoadReportRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;google.protobuf.Duration&lt;/span&gt; &lt;span class="na"&gt;report_interval&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;repeated&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="na"&gt;request_cost_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the client asks "give me metrics every 2 seconds", the backend server-streams load reports at that interval for as long as the stream stays open.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F10-orca-load-reporting.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F10-orca-load-reporting.png" alt="ORCA load reporting from backends to the LB and CP" width="800" height="622"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The body of &lt;code&gt;xds.data.orca.v3.OrcaLoadReport&lt;/code&gt; carries CPU and memory utilization, plus maps such as &lt;code&gt;request_cost&lt;/code&gt; and &lt;code&gt;named_metrics&lt;/code&gt; for app-specific values you can add freely. Reading the proto directly, the fields are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// xds/data/orca/v3/orca_load_report.proto&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;OrcaLoadReport&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="na"&gt;cpu_utilization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="na"&gt;mem_utilization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;&lt;/span&gt; &lt;span class="na"&gt;request_cost&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;     &lt;span class="c1"&gt;// per-RPC cost&lt;/span&gt;
  &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;&lt;/span&gt; &lt;span class="na"&gt;utilization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;      &lt;span class="c1"&gt;// arbitrary 0..1 metrics&lt;/span&gt;
  &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="na"&gt;rps_fractional&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="na"&gt;eps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                            &lt;span class="c1"&gt;// errors per second&lt;/span&gt;
  &lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;double&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;&lt;/span&gt; &lt;span class="na"&gt;named_metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;    &lt;span class="c1"&gt;// app-defined metrics&lt;/span&gt;
  &lt;span class="kt"&gt;double&lt;/span&gt; &lt;span class="na"&gt;application_utilization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can put any app-specific unit on &lt;code&gt;named_metrics&lt;/code&gt;, like "this RPC occupied a GPU for 1.2 sec".&lt;/p&gt;

&lt;h3&gt;
  
  
  Run it to check: hit the OOB stream directly
&lt;/h3&gt;

&lt;p&gt;grpc-go's &lt;code&gt;examples/features/orca&lt;/code&gt; includes an OOB server: a demo that just toggles CPU usage between &lt;code&gt;0.5&lt;/code&gt; and &lt;code&gt;0.9&lt;/code&gt; every two seconds. Against it I wrote a ~30-line client that calls &lt;code&gt;StreamCoreMetrics&lt;/code&gt; directly with &lt;code&gt;report_interval=2s&lt;/code&gt;. The core of it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;cli&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;orcav3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewOpenRcaServiceClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cli&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StreamCoreMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;orcav3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OrcaLoadReportRequest&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ReportInterval&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;durationpb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c"&gt;// "give it to me every 2s"&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;rep&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;stream&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Recv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OrcaLoadReport: cpu=%.2f mem=%.2f"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rep&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetCpuUtilization&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;rep&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GetMemUtilization&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start the server (&lt;code&gt;go run server/main.go&lt;/code&gt;), point this client at it, and it really keeps flowing (&lt;code&gt;grpc-go 0f3086d&lt;/code&gt; / &lt;code&gt;go1.26.4&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;19:13:18 OrcaLoadReport: cpu=0.90 mem=0.00
19:13:21 OrcaLoadReport: cpu=0.50 mem=0.00
19:13:24 OrcaLoadReport: cpu=0.50 mem=0.00
19:13:27 OrcaLoadReport: cpu=0.90 mem=0.00
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open the stream once, keep calling &lt;code&gt;Recv()&lt;/code&gt;, and the backend sends an &lt;code&gt;OrcaLoadReport&lt;/code&gt; at a steady cadence until you hang up; in the log above the reports arrive 3 seconds apart. You can see the server-side CPU toggle (&lt;code&gt;0.9&lt;/code&gt; / &lt;code&gt;0.5&lt;/code&gt;) showing up at the client as-is. Unlike xDS config, which the client fetches, &lt;strong&gt;here the backend pushes load at you&lt;/strong&gt;. That's why I said up front this has a different flavor from "shipping config".&lt;/p&gt;

&lt;h3&gt;
  
  
  Combining with LRS
&lt;/h3&gt;

&lt;p&gt;The gRPC &lt;code&gt;A64-lrs-custom-metrics&lt;/code&gt; proposal formalizes &lt;strong&gt;sending client-aggregated ORCA metrics back to the control plane via LRS (Load Reporting Service)&lt;/strong&gt;. LRS is a bidirectional reporting service in the xDS family, originally a mechanism for the client to report "how many RPS I sent to which endpoint" back to the CP. A64 put custom metrics onto that payload.&lt;/p&gt;

&lt;p&gt;So in the direction &lt;code&gt;Backend → Client → Control Plane&lt;/code&gt;, metrics flow end-to-end without leaving xDS. The CP gains a global view of "which backend is heavy overall" and can reflect it into the weights of the EDS it ships next. The point is that &lt;strong&gt;this isn't confined to Envoy's world&lt;/strong&gt;: the picture is identical for gRPC Proxyless. xDS, which was supposed to be a config-shipping protocol, is reaching out to swallow the telemetry uplink too.&lt;/p&gt;
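
&lt;p&gt;To make the "client-aggregated" part concrete, here is a minimal sketch in plain Go of folding per-report &lt;code&gt;named_metrics&lt;/code&gt; maps into one count-and-total summary per backend. The names &lt;code&gt;metricStat&lt;/code&gt; and &lt;code&gt;aggregator&lt;/code&gt;, the backend addresses, and the &lt;code&gt;gpu_seconds&lt;/code&gt; metric are all invented for the example; this is not the grpc-go or LRS API, only the shape of the bookkeeping.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"sort"
)

// metricStat is a hypothetical per-metric accumulator: how many reports
// carried the metric, and the sum of the values they carried.
type metricStat struct {
	Count int
	Total float64
}

// aggregator folds the named_metrics maps of many reports into one
// summary per backend. It is an illustration, not the grpc-go API.
type aggregator struct {
	perBackend map[string]map[string]metricStat
}

func newAggregator() aggregator {
	return aggregator{perBackend: map[string]map[string]metricStat{}}
}

// add records the named metrics of one report received from backend.
func (a aggregator) add(backend string, named map[string]float64) {
	stats, ok := a.perBackend[backend]
	if !ok {
		stats = map[string]metricStat{}
		a.perBackend[backend] = stats
	}
	for name, v := range named {
		s := stats[name]
		s.Count++
		s.Total += v
		stats[name] = s
	}
}

// summary renders the aggregate in a stable order, roughly the shape a
// client could send upstream at the end of a reporting interval.
func (a aggregator) summary() []string {
	var out []string
	for backend, stats := range a.perBackend {
		for name, s := range stats {
			out = append(out, fmt.Sprintf("%s %s count=%d total=%.1f", backend, name, s.Count, s.Total))
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	agg := newAggregator()
	agg.add("10.0.0.1:8080", map[string]float64{"gpu_seconds": 1.2})
	agg.add("10.0.0.1:8080", map[string]float64{"gpu_seconds": 0.3})
	agg.add("10.0.0.2:8080", map[string]float64{"gpu_seconds": 4.0})
	for _, line := range agg.summary() {
		fmt.Println(line)
	}
}
```

&lt;p&gt;Keeping a count next to the total lets the receiver compute an average per report without the client shipping every individual sample.&lt;/p&gt;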

&lt;p&gt;&lt;strong&gt;Design call: whether to wire load info into the design&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you want to decide EDS weights (the weighted cluster in §11) "smartly", you need to include in your design how the load info that feeds it is gathered.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;per-query (in-band)&lt;/strong&gt;: when short unary RPCs dominate. Rides the trailer, no extra stream needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OOB (&lt;code&gt;StreamCoreMetrics&lt;/code&gt;)&lt;/strong&gt;: steady-state monitoring. Works at zero QPS, so you can observe even idle long-tail backends&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;aggregate up to the CP via LRS&lt;/strong&gt;: when you want to decide weights globally on the CP side. &lt;code&gt;A64&lt;/code&gt; lets custom metrics ride too&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: gather no telemetry at all and the CP can only ship static weights, causing the "spray evenly onto a genuinely heavy backend" accident. If you're talking about load balancing, the ORCA/LRS path is part of the design&lt;/li&gt;
&lt;/ul&gt;
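
&lt;p&gt;As a rough sketch of what "deciding weights smartly" can mean, the following plain-Go function turns one load report into a relative weight: throughput divided by utilization, with errors counted as extra utilization. The formula's shape follows my understanding of gRPC's weighted round robin design, so treat it as an assumption rather than a spec quote. The &lt;code&gt;loadReport&lt;/code&gt; struct and the &lt;code&gt;errorPenalty&lt;/code&gt; parameter are hypothetical stand-ins, not generated proto types.&lt;/p&gt;

```go
package main

import "fmt"

// loadReport is a hypothetical, trimmed-down stand-in for the fields of
// OrcaLoadReport that a weight calculation needs.
type loadReport struct {
	CPUUtilization         float64
	ApplicationUtilization float64
	RPS                    float64 // corresponds to rps_fractional
	EPS                    float64 // errors per second
}

// weight returns a relative weight for one backend. It prefers the
// application-reported utilization and falls back to CPU utilization.
// It returns 0 when the report is unusable, so the caller can apply a default.
func weight(r loadReport, errorPenalty float64) float64 {
	util := r.ApplicationUtilization
	if !(util > 0) {
		util = r.CPUUtilization
	}
	if !(r.RPS > 0) || !(util > 0) {
		return 0
	}
	return r.RPS / (util + (r.EPS/r.RPS)*errorPenalty)
}

func main() {
	idle := loadReport{CPUUtilization: 0.5, RPS: 100}
	busy := loadReport{CPUUtilization: 0.9, RPS: 100}
	fmt.Printf("idle=%.1f busy=%.1f\n", weight(idle, 1.0), weight(busy, 1.0))
}
```

&lt;p&gt;With equal throughput, the backend reporting lower utilization ends up with the larger weight, which is exactly the signal a static-weight CP never sees.&lt;/p&gt;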

&lt;h2&gt;
  
  
  9. Unified Matcher API: a shared matching tree across all extensions
&lt;/h2&gt;

&lt;p&gt;Now the data-model side.&lt;/p&gt;

&lt;p&gt;Historically Envoy's filters each had &lt;strong&gt;their own matching machinery&lt;/strong&gt;: HTTP header match, RBAC principal match, access-log filter, access tags, external authorization. Each had its own proto and its own match logic. &lt;code&gt;xds.type.matcher.v3.Matcher&lt;/code&gt; unifies that.&lt;/p&gt;

&lt;h3&gt;
  
  
  Structure
&lt;/h3&gt;

&lt;p&gt;It's expressed as a tree. Pulling just the essentials of the proto:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight protobuf"&gt;&lt;code&gt;&lt;span class="c1"&gt;// xds/type/matcher/v3/matcher.proto&lt;/span&gt;
&lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;Matcher&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;message&lt;/span&gt; &lt;span class="nc"&gt;OnMatch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;oneof&lt;/span&gt; &lt;span class="n"&gt;on_match&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;Matcher&lt;/span&gt; &lt;span class="na"&gt;matcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;                      &lt;span class="c1"&gt;// a nested matcher&lt;/span&gt;
      &lt;span class="n"&gt;core.v3.TypedExtensionConfig&lt;/span&gt; &lt;span class="na"&gt;action&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// the action to run&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// ...field matcher list / exact match tree ...&lt;/span&gt;
  &lt;span class="n"&gt;OnMatch&lt;/span&gt; &lt;span class="na"&gt;on_no_match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;N&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A &lt;code&gt;Matcher&lt;/code&gt; holds either a &lt;code&gt;MatcherList&lt;/code&gt; (each entry is a predicate plus an on_match) or a &lt;code&gt;MatcherTree&lt;/code&gt; (a fast exact-match branch); reach a leaf and the &lt;code&gt;action&lt;/code&gt; runs. &lt;code&gt;on_no_match&lt;/code&gt; is the default branch taken when nothing matched.&lt;/p&gt;
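
&lt;p&gt;A minimal sketch of that evaluation order, using plain Go structs instead of the generated xds types. Everything here (&lt;code&gt;matcher&lt;/code&gt;, &lt;code&gt;onMatch&lt;/code&gt;, &lt;code&gt;fieldMatcher&lt;/code&gt;, the header names and the action strings) is a simplified model made up for illustration. In particular, how a real implementation treats a nested matcher that finds nothing is not modeled beyond using that nested matcher's own fallback.&lt;/p&gt;

```go
package main

import "fmt"

// onMatch is either a nested matcher or a terminal action (a oneof in the proto).
type onMatch struct {
	Nested *matcher
	Action string
}

// fieldMatcher is one entry of a matcher list: a predicate plus what to do on a hit.
type fieldMatcher struct {
	Predicate func(headers map[string]string) bool
	OnMatch   onMatch
}

// matcher holds a list (evaluated in order) or an exact-match tree keyed on
// one header, plus the fallback used when nothing matched.
type matcher struct {
	List      []fieldMatcher
	TreeKey   string
	Tree      map[string]onMatch
	OnNoMatch onMatch
}

// nested wraps a matcher so it can be used as the result of a match.
func nested(m matcher) onMatch {
	p := new(matcher)
	*p = m
	return onMatch{Nested: p}
}

// resolve descends into a nested matcher or returns the leaf action.
func (om onMatch) resolve(headers map[string]string) (string, bool) {
	if om.Nested != nil {
		return om.Nested.evaluate(headers)
	}
	return om.Action, om.Action != ""
}

// evaluate tries the list first, then the exact-match tree, then on_no_match.
func (m matcher) evaluate(headers map[string]string) (string, bool) {
	for _, fm := range m.List {
		if fm.Predicate(headers) {
			return fm.OnMatch.resolve(headers)
		}
	}
	if om, ok := m.Tree[headers[m.TreeKey]]; ok {
		return om.resolve(headers)
	}
	return m.OnNoMatch.resolve(headers)
}

func main() {
	isProd := func(h map[string]string) bool { return h["x-env"] == "prod" }
	root := matcher{
		List: []fieldMatcher{{
			Predicate: isProd,
			OnMatch: nested(matcher{
				TreeKey:   ":path",
				Tree:      map[string]onMatch{"/admin": {Action: "deny"}},
				OnNoMatch: onMatch{Action: "allow"},
			}),
		}},
		OnNoMatch: onMatch{Action: "log-only"},
	}
	for _, h := range []map[string]string{
		{"x-env": "prod", ":path": "/admin"},
		{"x-env": "prod", ":path": "/users"},
		{"x-env": "dev", ":path": "/admin"},
	} {
		action, _ := root.evaluate(h)
		fmt.Println(h["x-env"], h[":path"], "->", action)
	}
}
```

&lt;p&gt;The three requests resolve to &lt;code&gt;deny&lt;/code&gt;, &lt;code&gt;allow&lt;/code&gt;, and &lt;code&gt;log-only&lt;/code&gt; respectively: a list hit descends into the nested tree, and anything that misses falls through to the nearest fallback.&lt;/p&gt;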

&lt;h3&gt;
  
  
  Embedding CEL
&lt;/h3&gt;

&lt;p&gt;The standout is being able to call &lt;strong&gt;CEL (Common Expression Language)&lt;/strong&gt; inside a predicate. The type for that is &lt;code&gt;xds.type.v3.CelExpression&lt;/code&gt;, and you can write things like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;request.headers['x-env'] == 'prod'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This directly drives a branch. Under Envoy's Unified Matcher, CEL matchers became usable in Access Log, RBAC, and external authorization. CEL itself is a language Google has run for years in policy evaluation for internal IAM and the like, so this isn't some unproven toy being shoved into Envoy.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F11-unified-matcher-cel.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F11-unified-matcher-cel.png" alt="Unified Matcher decision tree with CEL predicates" width="799" height="674"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why is this nice? The big operational win is &lt;strong&gt;"reusing the same matching language across filters"&lt;/strong&gt;. The access-log filter and the RBAC policy can be written in the same DSL, so an operator only keeps one mental model. A declarative match language that is self-contained in the proto is unglamorous but effective.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Design call: where and how to hold matching, declaratively&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Do you write match conditions for routing, authorization, and logging separately per filter, or unify them across the board?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lean on Unified Matcher + CEL&lt;/strong&gt;: when multiple filters (RBAC / access log / external authz) want to &lt;strong&gt;reuse the same condition expression&lt;/strong&gt;. Write one CEL like &lt;code&gt;request.headers['x-env'] == 'prod'&lt;/code&gt; and share it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stay with per-filter matching&lt;/strong&gt;: when it's a simple single-filter condition not worth bringing CEL in for&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How it breaks&lt;/strong&gt;: write match conditions in separate DSLs per filter and the same "prod only" condition ends up subtly different everywhere, splitting the operator's mental model and breeding mistakes. Centralize cross-cutting conditions in Unified Matcher as the single source of truth&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  10. Putting it together: the world gRPC Proxyless sees
&lt;/h2&gt;

&lt;p&gt;The pieces introduced above are exactly what gRPC's Proxyless xDS &lt;strong&gt;assembles and uses on the client side&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;authorities&lt;/code&gt; map in the bootstrap file identifies multiple CPs (&lt;code&gt;A47-xds-federation&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;An existing &lt;code&gt;xds:///my-service&lt;/code&gt; target URI expands internally into an xdstp-form Listener name, but only when the bootstrap's &lt;code&gt;client_default_listener_resource_name_template&lt;/code&gt; is set to &lt;code&gt;xdstp://&lt;/code&gt; form (if the template contains &lt;code&gt;%s&lt;/code&gt;, the service authority is embedded at that position)&lt;/li&gt;
&lt;li&gt;Load can be pulled directly from the backend via ORCA (&lt;code&gt;A51-custom-backend-metrics&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;The aggregate can be returned to the control plane via LRS (&lt;code&gt;A64-lrs-custom-metrics&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;
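
&lt;p&gt;The template expansion in the second bullet can be sketched in a few lines of Go. This is a simplified illustration, not the grpc-go implementation: the function name &lt;code&gt;listenerResourceName&lt;/code&gt; and the authority &lt;code&gt;cp.example.com&lt;/code&gt; are hypothetical, and percent-encoding the name only for &lt;code&gt;xdstp:&lt;/code&gt; templates reflects my reading of the federation proposal rather than a quote from it.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// listenerResourceName expands a bootstrap-style template for a target such
// as xds:///my-service. The target name is percent-encoded only when the
// template is in xdstp form. Simplified sketch, not the grpc-go code.
func listenerResourceName(template, target string) string {
	name := target
	if strings.HasPrefix(template, "xdstp:") {
		name = url.PathEscape(target)
	}
	return strings.ReplaceAll(template, "%s", name)
}

func main() {
	// Old-style name: the template is just "%s".
	fmt.Println(listenerResourceName("%s", "my-service"))
	// xdstp-style name; the authority below is a made-up example value.
	fmt.Println(listenerResourceName(
		"xdstp://cp.example.com/envoy.config.listener.v3.Listener/%s", "my-service"))
}
```

&lt;p&gt;The application keeps dialing &lt;code&gt;xds:///my-service&lt;/code&gt;; only the bootstrap decides whether that becomes a flat name or an xdstp one.&lt;/p&gt;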

&lt;p&gt;So a good chunk of the cncf/xds components dissected in the previous chapters is already at the point of &lt;strong&gt;running inside the gRPC library without any sidecar&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F12-proxyless-grpc-assembly.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fxds-universal-data-plane-deep-dive%2Fdiagrams%2F12-proxyless-grpc-assembly.png" alt="Proxyless gRPC assembling the xDS pieces in-process" width="800" height="732"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The "Proxyless gRPC" I introduced at the end of Part 1 runs as the result of assembling, on the client side, the pieces we've read here. It doesn't have Envoy's L7 extension points (Wasm, HTTP filter chains, etc.) so the coverage differs, but the standards-track features of xDS are landing on the gRPC side at a pace that keeps step with Envoy.&lt;/p&gt;

&lt;h2&gt;
  
  
  11. Designing as the shipper: actually writing a control plane
&lt;/h2&gt;

&lt;p&gt;I've lined up the design levers. Now let's not leave them on paper: let's run them. The brief is one worked example: design &lt;strong&gt;"ship the &lt;code&gt;api&lt;/code&gt; service as prod 90% / canary 10%"&lt;/strong&gt; as xDS resources, and serve it from a real control plane.&lt;/p&gt;

&lt;h3&gt;
  
  
  Design: build the resource graph as proto objects
&lt;/h3&gt;

&lt;p&gt;In §3 I raised "do you design from JSON?", but a production CP doesn't route through JSON. What go-control-plane's snapshot cache takes is &lt;code&gt;proto.Message&lt;/code&gt; itself (&lt;code&gt;types.Resource = proto.Message&lt;/code&gt;), and the designer &lt;code&gt;new&lt;/code&gt;s proto structs directly in Go. The canary split is expressed via an &lt;strong&gt;RDS weighted cluster&lt;/strong&gt;, splitting prod / canary into separate Clusters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// the designer builds the resource graph here (excerpt)&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;designCluster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cluster&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cluster&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;                 &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;ConnectTimeout&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;       &lt;span class="n"&gt;durationpb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;ClusterDiscoveryType&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cluster_Type&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cluster_LOGICAL_DNS&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;LbPolicy&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;             &lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cluster_ROUND_ROBIN&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;LoadAssignment&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;       &lt;span class="c"&gt;/* endpoint host:8080 */&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// prod 90% / canary 10% weighted route&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;designRoute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;routeName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;canary&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RouteConfiguration&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RouteConfiguration&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;routeName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;VirtualHosts&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;VirtualHost&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt;
            &lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"api"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Domains&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;"*"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="n"&gt;Routes&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Route&lt;/span&gt;&lt;span class="p"&gt;{{&lt;/span&gt;
                &lt;span class="n"&gt;Match&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RouteMatch&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;PathSpecifier&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RouteMatch_Prefix&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Prefix&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;}},&lt;/span&gt;
                &lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Route_Route&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Route&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RouteAction&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;ClusterSpecifier&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RouteAction_WeightedClusters&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;WeightedClusters&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WeightedCluster&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;Clusters&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;route&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WeightedCluster_ClusterWeight&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;prod&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Weight&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;wrapperspb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UInt32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;90&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
                            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;canary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Weight&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;wrapperspb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UInt32&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
                        &lt;span class="p"&gt;},&lt;/span&gt;
                    &lt;span class="p"&gt;}},&lt;/span&gt;
                &lt;span class="p"&gt;}},&lt;/span&gt;
            &lt;span class="p"&gt;}},&lt;/span&gt;
        &lt;span class="p"&gt;}},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stuff the built graph into a snapshot and serve it as an ADS server. This is the step that "puts what you designed onto the wire".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;snap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cachev3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSnapshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;resourcev3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt;&lt;span class="p"&gt;][]&lt;/span&gt;&lt;span class="n"&gt;types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Resource&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;resourcev3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ClusterType&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;designCluster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api-prod"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"prod.api.svc"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                             &lt;span class="n"&gt;designCluster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api-canary"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"canary.api.svc"&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
    &lt;span class="n"&gt;resourcev3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RouteType&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;designRoute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"api-route"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"api-prod"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"api-canary"&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;cache&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cachev3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewSnapshotCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cachev3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IDHash&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetSnapshot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s"&gt;"edge-proxy-1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;snap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;srv&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;serverv3&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;discovery&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RegisterAggregatedDiscoveryServiceServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;grpcServer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;srv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// grow an ADS endpoint&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Confirm: did what I designed actually land on the wire
&lt;/h3&gt;

&lt;p&gt;Pull it back from the serving side with a raw ADS client (just open &lt;code&gt;StreamAggregatedResources&lt;/code&gt; and send &lt;code&gt;DiscoveryRequest{TypeUrl: ClusterType}&lt;/code&gt;). The actual output (&lt;code&gt;go-control-plane v0.14.0&lt;/code&gt; / &lt;code&gt;envoy v1.37.0&lt;/code&gt; / &lt;code&gt;go1.26.4&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;== CDS (type_url=type.googleapis.com/envoy.config.cluster.v3.Cluster, version=v1, 2 resources) ==
  cluster api-prod     lb=ROUND_ROBIN endpoint=prod.api.svc
  cluster api-canary   lb=ROUND_ROBIN endpoint=canary.api.svc
== RDS (type_url=type.googleapis.com/envoy.config.route.v3.RouteConfiguration, version=v1, 1 resources) ==
  route   api-route    -&amp;gt; api-prod weight=90
  route   api-route    -&amp;gt; api-canary weight=10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two Clusters I designed, &lt;code&gt;api-prod&lt;/code&gt; / &lt;code&gt;api-canary&lt;/code&gt;, and the &lt;code&gt;90 / 10&lt;/code&gt; weighted route, came back on the wire as-is, &lt;code&gt;type_url&lt;/code&gt; and all. The &lt;strong&gt;&lt;code&gt;resources []Any&lt;/code&gt; from §0 contains exactly the proto structs I just &lt;code&gt;new&lt;/code&gt;d&lt;/strong&gt;. The whole loop, design -&amp;gt; snapshot -&amp;gt; ADS -&amp;gt; fetch, fits inside one Go program.&lt;/p&gt;

&lt;p&gt;Here every thread laid since §0 gets pulled together: what I shipped was a proto stuffed in &lt;code&gt;Any&lt;/code&gt; (§0), with an id &lt;code&gt;api-prod&lt;/code&gt; attached (§2), served to one node &lt;code&gt;edge-proxy-1&lt;/code&gt; (the authority/node of §3), with the canary variant &lt;strong&gt;expressed not by baking it into the name but as an RDS weight&lt;/strong&gt; (the §6 practice of "don't put variants in the id"). All the design levers show up in these 30 lines.&lt;/p&gt;
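
&lt;p&gt;For intuition about what the receiving data plane does with those two weights, here is a minimal sketch of weighted selection in plain Go: draw a number in &lt;code&gt;[0, total weight)&lt;/code&gt; and walk the cumulative weights. The &lt;code&gt;clusterWeight&lt;/code&gt; type and &lt;code&gt;pick&lt;/code&gt; function are hypothetical names for this example; Envoy's and gRPC's real selection code is more involved.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"math/rand"
)

// clusterWeight mirrors the name/weight pair of a weighted route entry.
type clusterWeight struct {
	Name   string
	Weight uint32
}

// pick maps n, a number in [0, total weight), onto a cluster by walking the
// cumulative weights. It returns "" if n is outside that range.
func pick(clusters []clusterWeight, n uint32) string {
	var upper uint32
	for _, c := range clusters {
		upper += c.Weight
		if upper > n {
			return c.Name
		}
	}
	return ""
}

func main() {
	clusters := []clusterWeight{{"api-prod", 90}, {"api-canary", 10}}
	rng := rand.New(rand.NewSource(1)) // fixed seed so runs are repeatable
	counts := map[string]int{}
	for i := 0; i != 10000; i++ {
		counts[pick(clusters, uint32(rng.Intn(100)))]++
	}
	// Roughly nine in ten requests should land on api-prod.
	fmt.Println("api-prod:", counts["api-prod"], "api-canary:", counts["api-canary"])
}
```

&lt;p&gt;Because the split lives in the route's weights, moving the canary from 10% to 50% is a one-field change in the next snapshot, with no renamed resources.&lt;/p&gt;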

&lt;h3&gt;
  
  
  A few things change if the target is gRPC Proxyless
&lt;/h3&gt;

&lt;p&gt;If you ship the same canary design to a &lt;strong&gt;gRPC Proxyless client instead of Envoy&lt;/strong&gt;, the skeleton of the resource graph (&lt;code&gt;LDS → RDS → CDS → EDS&lt;/code&gt;) is the same, but three things change. It comes down to who you are shipping to: the client side from §10 has its own constraints.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Listener is built as an API listener, not a socket&lt;/strong&gt;. The Listener I built in §11 above was a socket listener with an &lt;code&gt;address&lt;/code&gt; (Envoy actually binds a port). A gRPC client has no port to bind, so you have to ship an &lt;strong&gt;API listener&lt;/strong&gt; with an &lt;code&gt;HttpConnectionManager&lt;/code&gt; stuffed directly into the &lt;code&gt;Listener.api_listener&lt;/code&gt; field. grpc-go's &lt;code&gt;unmarshal_lds.go&lt;/code&gt; literally branches on &lt;code&gt;if lis.GetApiListener() != nil&lt;/code&gt; to decide whether a Listener is client-side, so if you hand it a socket listener it won't treat it as a client one. The designer ends up branching how the Listener is built on "whether the target is proxyless"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The extension set is a subset. Build it rich and it NACKs&lt;/strong&gt;. The HTTP filters gRPC interprets are roughly &lt;code&gt;router&lt;/code&gt; / &lt;code&gt;fault&lt;/code&gt; / &lt;code&gt;rbac&lt;/code&gt; / &lt;code&gt;ext_proc&lt;/code&gt;; there's no Wasm or Envoy-specific filter. And gRPC &lt;strong&gt;doesn't silently ignore unknown / unsupported fields, it rejects them&lt;/strong&gt; (e.g. it bounces a nonzero &lt;code&gt;xff_num_trusted_hops&lt;/code&gt;). Ship a Listener you padded out for Envoy straight to proxyless and it NACKs. Narrow "the set of extensions you may use" per target at design time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom LB is specified via the Cluster's &lt;code&gt;load_balancing_policy&lt;/code&gt;&lt;/strong&gt;. In §11 I hard-coded &lt;code&gt;LbPolicy: ROUND_ROBIN&lt;/code&gt; on the Cluster, but for fancier load balancing on proxyless you put an &lt;code&gt;envoy.extensions.load_balancing_policies.*&lt;/code&gt; proto (&lt;code&gt;ring_hash&lt;/code&gt;, &lt;code&gt;wrr_locality&lt;/code&gt;, &lt;code&gt;client_side_weighted_round_robin&lt;/code&gt;, etc.) on the Cluster's &lt;code&gt;load_balancing_policy&lt;/code&gt;. grpc-go converts it into an internal balancer tree via &lt;code&gt;xdslbregistry&lt;/code&gt;. Pick &lt;code&gt;client_side_weighted_round_robin&lt;/code&gt; to feed it the §8 ORCA metrics, and that's where &lt;strong&gt;ORCA design and LB design connect&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
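
&lt;p&gt;The per-target narrowing above can be sketched as a pre-flight check on the control-plane side. This is a toy model, not grpc-go's validation code: the proxyless filter set is the four names listed above, while the &lt;code&gt;SUPPORTED_FILTERS&lt;/code&gt; table and both function names are hypothetical, and Envoy is treated as unrestricted purely for illustration.&lt;/p&gt;

```python
# Toy capability table. The proxyless set is the one named in the text;
# None stands for "no restriction modelled here" (an assumption for Envoy).
SUPPORTED_FILTERS = {
    "envoy": None,
    "proxyless": {"router", "fault", "rbac", "ext_proc"},
}


def listener_kind(target):
    """Pick the Listener shape: a socket listener binds a port, an API listener does not."""
    return "api_listener" if target == "proxyless" else "socket_listener"


def unsupported_filters(target, filters):
    """Return the filters the target would reject, in the order given."""
    supported = SUPPORTED_FILTERS[target]
    if supported is None:
        return []
    return [name for name in filters if name not in supported]


# A filter chain padded out for Envoy (hypothetical example input).
wanted = ["wasm", "rbac", "router"]
for target in ("envoy", "proxyless"):
    print(target, listener_kind(target), unsupported_filters(target, wanted))
```

&lt;p&gt;A non-empty list here is the design-time stand-in for the NACK: it is cheaper to catch it before the snapshot ships than to read it back off the stream.&lt;/p&gt;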

&lt;p&gt;So designing "what you ship" &lt;strong&gt;branches on the target (Envoy or proxyless)&lt;/strong&gt;. Same canary, but a socket Listener + filter chain for Envoy, an API listener + limited filters + an LB-policy proto for proxyless. Even if cncf/xds provides a type-neutral box, you still have to design the box's contents to fit the receiver's capabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Part 1 was about learning to read xDS. This one was about being able to design what you ship over xDS. To close, here's every chapter's "design call" folded onto one card. The intent is: when you design your own mesh, run down this table top to bottom and you can decide "which lever to throw, and how", with nothing missed.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Design lever&lt;/th&gt;
&lt;th&gt;Options&lt;/th&gt;
&lt;th&gt;Decision criterion&lt;/th&gt;
&lt;th&gt;How it breaks when you throw it wrong&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Name (§2)&lt;/td&gt;
&lt;td&gt;id / context_params / authority&lt;/td&gt;
&lt;td&gt;id is the immutable logical name, variable axes go in context_params&lt;/td&gt;
&lt;td&gt;bake env into id and names proliferate, references split&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CP boundary (§3)&lt;/td&gt;
&lt;td&gt;split authority / don't&lt;/td&gt;
&lt;td&gt;split only where ownership / trust / fault isolation splits&lt;/td&gt;
&lt;td&gt;over-split=bloated bootstrap / under-split=SPOF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failover (§4)&lt;/td&gt;
&lt;td&gt;bake into name via &lt;code&gt;alt=&lt;/code&gt; / client-side&lt;/td&gt;
&lt;td&gt;use &lt;code&gt;alt=&lt;/code&gt; declaratively when the alternate is static&lt;/td&gt;
&lt;td&gt;bake a dynamic switch into the name and it stiffens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Subscription unit (§5)&lt;/td&gt;
&lt;td&gt;singleton / collection / glob&lt;/td&gt;
&lt;td&gt;know the name=singleton, dynamic churn=glob&lt;/td&gt;
&lt;td&gt;glob too broad=wasted bandwidth / too fine=chore subscriptions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Variant (§6)&lt;/td&gt;
&lt;td&gt;context_params / dynamic_params&lt;/td&gt;
&lt;td&gt;few axes=context, hits double digits=plan for dynamic&lt;/td&gt;
&lt;td&gt;grow context too far=36 blowup / dynamic is unimplemented&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity (§7)&lt;/td&gt;
&lt;td&gt;fat / fine&lt;/td&gt;
&lt;td&gt;split fine only where change frequency / blast impact is high&lt;/td&gt;
&lt;td&gt;one typo takes down the whole giant resource&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Telemetry (§8)&lt;/td&gt;
&lt;td&gt;ORCA in-band / OOB / LRS aggregate&lt;/td&gt;
&lt;td&gt;short unary=in-band, steady monitoring=OOB&lt;/td&gt;
&lt;td&gt;gather none and load balancing goes blind&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Matching (§9)&lt;/td&gt;
&lt;td&gt;per-filter / Unified Matcher + CEL&lt;/td&gt;
&lt;td&gt;use Unified when you want the same DSL across the board&lt;/td&gt;
&lt;td&gt;scattered DSLs split the operator's mental model&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Behind this table is the one fact from §0: &lt;strong&gt;xDS is a type-neutral bus that can carry anything via &lt;code&gt;type_url&lt;/code&gt; + &lt;code&gt;Any&lt;/code&gt;, and design is deciding the proto and the name you stuff into that box.&lt;/strong&gt; Each xRFC of &lt;code&gt;cncf/xds&lt;/code&gt; is the history of how that "naming and shipping" gets standardized.&lt;/p&gt;

&lt;p&gt;xDS, which was supposed to be a "config shipper", is morphing into &lt;strong&gt;a unified gRPC API family for remote-controlling a fleet of proxies&lt;/strong&gt;: federation via Authority, cache-friendly variants via Dynamic Parameters, bidirectional telemetry via ORCA / LRS, a declarative match language via Unified Matcher and CEL. The model "xDS = Envoy's config protocol" is too narrow as of 2026. And as §11 showed, that design and delivery is something you can pull into your own hands in &lt;strong&gt;30 lines of Go&lt;/strong&gt;. Read the &lt;code&gt;cncf/xds&lt;/code&gt; protos and xRFCs as design levers, and the rest is just assembly.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/cncf/xds" rel="noopener noreferrer"&gt;CNCF xDS API Repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/cncf/xds/blob/main/proposals/TP1-xds-transport-next.md" rel="noopener noreferrer"&gt;TP1: xdstp:// structured resource naming, caching and federation support&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/cncf/xds/blob/main/proposals/TP2-dynamically-generated-cacheable-xds-resources.md" rel="noopener noreferrer"&gt;TP2: Dynamically Generated Cacheable xDS Resources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/cncf/xds/blob/main/proposals/TP3-xds-error-propagation.md" rel="noopener noreferrer"&gt;TP3: xds-error-propagation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/grpc/proposal/blob/master/A47-xds-federation.md" rel="noopener noreferrer"&gt;A47: xDS Federation (gRPC proposal)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/grpc/proposal/blob/master/A51-custom-backend-metrics.md" rel="noopener noreferrer"&gt;A51: Custom Backend Metrics (gRPC proposal)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/grpc/proposal/blob/master/A64-lrs-custom-metrics.md" rel="noopener noreferrer"&gt;A64: LRS Custom Metrics (gRPC proposal)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol" rel="noopener noreferrer"&gt;Envoy xDS REST and gRPC Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/common/matcher/v3/matcher.proto" rel="noopener noreferrer"&gt;Envoy Unified Matcher API&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://grpc.io/docs/guides/custom-backend-metrics/" rel="noopener noreferrer"&gt;Custom Backend Metrics (grpc.io)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/envoyproxy/go-control-plane" rel="noopener noreferrer"&gt;envoyproxy/go-control-plane (xDS server / snapshot cache)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>envoy</category>
      <category>servicemesh</category>
      <category>kubernetes</category>
      <category>grpc</category>
    </item>
    <item>
      <title>Why AWS IAM Is So Hard</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Sat, 06 Jun 2026 10:58:20 +0000</pubDate>
      <link>https://dev.to/kanywst/why-aws-iam-is-so-hard-58bp</link>
      <guid>https://dev.to/kanywst/why-aws-iam-is-so-hard-58bp</guid>
      <description>&lt;h2&gt;
  
  
  Where it starts
&lt;/h2&gt;

&lt;p&gt;The first thing that beats you up when you start using AWS is IAM. It got me too.&lt;/p&gt;

&lt;p&gt;You see &lt;code&gt;AccessDenied&lt;/code&gt;. You check the policy and &lt;code&gt;"Effect": "Allow"&lt;/code&gt; is right there. Denied anyway. You &lt;code&gt;AssumeRole&lt;/code&gt;, then run &lt;code&gt;aws sts get-caller-identity&lt;/code&gt; and you are still the same old you, with the same permissions. There are two similar-looking JSON blobs called a trust policy and a permission policy, and you have no idea which one to write what in. Someone asks you the difference between a User and a Role, and you sort of know, but you cannot say it in one sentence.&lt;/p&gt;

&lt;p&gt;What I eventually realized is this: &lt;strong&gt;IAM is hard not because the mechanics are complex, but because it is a minefield of two kinds of trap: "same word, different thing," and a handful of asymmetric rules that fight your intuition.&lt;/strong&gt; Each trap is the kind you understand instantly once someone explains it. The flip side: defuse them one at a time and IAM becomes shockingly obedient.&lt;/p&gt;

&lt;p&gt;This is not a reference that covers every corner of IAM. It is the mines a beginner will definitely step on, broken apart by their true cause, and lined up so that reading top to bottom makes each one go "oh, that's all it was."&lt;/p&gt;




&lt;h2&gt;
  
  
  1. What problem is IAM actually solving
&lt;/h2&gt;

&lt;p&gt;Before the traps, let me pin down in one line what IAM is for. If this drifts, everything after it goes blurry.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;IAM decides who (authentication) can do what (authorization). Every operation in AWS is an HTTPS API call, and IAM stands in front of every single one of them.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Clicking around in the console, running &lt;code&gt;aws s3 ls&lt;/code&gt;, applying Terraform: under the hood they all become the same thing, an API request to AWS. For every one of those requests, IAM asks two questions.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F01-authn-authz-overview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F01-authn-authz-overview.png" alt="AuthN and AuthZ overview" width="800" height="1072"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Authentication (AuthN)&lt;/strong&gt;: from the signature on the request, confirm that "this caller really is alice." This is the world of signing (SigV4). Beginners rarely get stuck here; the trouble mostly comes after it, in authorization.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authorization (AuthZ)&lt;/strong&gt;: for a confirmed caller, decide whether a policy exists that permits this operation. &lt;strong&gt;Almost all of IAM's difficulty lives on this side.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When I say "hard" in this article, I mean authorization, nearly every time. I will leave the math of signing to another article and focus here on how you write "who can do what" and how it gets evaluated.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Get the vocabulary straight
&lt;/h2&gt;

&lt;p&gt;The number one reason IAM blows up on you is that &lt;strong&gt;there are too many concepts with similar names.&lt;/strong&gt; So let me fix the names first. If a term shows up later that you do not recognize, come back here.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;In one line&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWS account&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The container for resources and IAM. Identified by a 12-digit ID like &lt;code&gt;123456789012&lt;/code&gt;, and it is also the billing unit. Users and Roles are created &lt;strong&gt;inside&lt;/strong&gt; it and are invisible from other accounts. This becomes the key fact later&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Principal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The subject of a request. The "who." A User, a Role, an AWS service, and so on&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IAM User&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A &lt;strong&gt;permanent identity&lt;/strong&gt; you create inside an account, tied to a person or a machine. Holds a password or long-lived access keys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IAM Role&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A &lt;strong&gt;role&lt;/strong&gt; that anyone (who meets the conditions) can temporarily become. Holds no long-lived keys, built around short-lived credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IAM Group&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A container that bundles Users. Attach a policy to a Group and it applies to every member. A Group itself cannot log in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IAM Policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;JSON that says "what is allowed or denied." On its own it floats in the air; it only means something once you attach it to someone&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AssumeRole&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The STS API that takes on a Role and hands back short-lived credentials. The central act of IAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Short-lived credentials&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The disposable keys you get back from &lt;code&gt;AssumeRole&lt;/code&gt;, expiring in about an hour (a three-piece set: access key, secret, session token)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;identity-based policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A policy &lt;strong&gt;attached to the identity side&lt;/strong&gt;: a User, Role, or Group. "What can this identity do?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;resource-based policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A policy &lt;strong&gt;attached to the resource side&lt;/strong&gt;, like an S3 bucket. "Who is this resource willing to let touch it?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;trust policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A special resource-based policy attached to a Role. It states only one thing: "who is allowed to Assume this Role?"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;SCP&lt;/strong&gt; (Service Control Policy)&lt;/td&gt;
&lt;td&gt;An Organizations feature that spans multiple accounts. Attached to an OU (a folder that groups accounts) or to an account, it &lt;strong&gt;lowers the ceiling&lt;/strong&gt; on permissions. Think of it as restrict-only: it can take permission away but never grants any&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Permission Boundary&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A ceiling attached to an individual User or Role. Where an SCP is the per-account version, this is per-identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;STS&lt;/strong&gt; (Security Token Service)&lt;/td&gt;
&lt;td&gt;The service that issues short-lived credentials. &lt;code&gt;AssumeRole&lt;/code&gt; is one of its APIs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The three pairs below are the especially confusing ones. This article defuses them one at a time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;User vs Role&lt;/strong&gt; (both are "identities" but their nature is the exact opposite)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;identity-based policy vs trust policy&lt;/strong&gt; (a Role needs both)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;same-account vs cross-account&lt;/strong&gt; (the same operation needs different things)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. Difficulty 1: what is the difference between a User and a Role
&lt;/h2&gt;

&lt;p&gt;The first mine. Both are "identities that can be a Principal," so beginners stall on "so which one am I supposed to use?" The difference boils down to &lt;strong&gt;how the keys are held.&lt;/strong&gt; Here it is in a table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IAM User&lt;/th&gt;
&lt;th&gt;IAM Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Key lifetime&lt;/td&gt;
&lt;td&gt;Long-lived (valid until you revoke it)&lt;/td&gt;
&lt;td&gt;Short-lived (auto-expires in ~1 hour)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Key owner&lt;/td&gt;
&lt;td&gt;The User keeps holding it&lt;/td&gt;
&lt;td&gt;No owner. Borrowed fresh each time you use it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Binding&lt;/td&gt;
&lt;td&gt;Pinned to a specific person or machine&lt;/td&gt;
&lt;td&gt;Anyone (who meets the conditions) can become it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Damage if leaked&lt;/td&gt;
&lt;td&gt;Big (usable until you deactivate or rotate it)&lt;/td&gt;
&lt;td&gt;Small (expires shortly)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Today's guidance&lt;/td&gt;
&lt;td&gt;Avoid where possible (emergency / legacy)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Default to this&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In plain analogy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;User&lt;/strong&gt; = a photo ID badge. Pinned to one person; if lost, it can be abused until you reissue it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Role&lt;/strong&gt; = a visitor pass you borrow at the front desk. Anyone (if allowed) can borrow it, and it goes invalid automatically at the end of the day.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why is a Role the default now? Simple: &lt;strong&gt;so you do not scatter long-lived keys all over the world.&lt;/strong&gt; Writing an access key into code, pushing it to GitHub, leaking it: that is the classic accident. So the goal is a world where only disposable, expiring keys are ever in circulation. That is why human logins, EC2, Lambda, CI/CD all converge on assuming a Role and taking short-lived credentials.&lt;/p&gt;

&lt;p&gt;Let me defuse one behavior right here that every beginner trips on: the "&lt;strong&gt;I assumed the Role but my permissions did not change&lt;/strong&gt;" one.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;AssumeRole&lt;/code&gt; &lt;strong&gt;does not rewrite your current credentials.&lt;/strong&gt; It only "hands back" a fresh set of short-lived credentials. You set those into environment variables or a profile, and only then, starting with your next request, do you act as the Role. There is no magic switch at the instant you assume. The reason &lt;code&gt;aws sts get-caller-identity&lt;/code&gt; returns the same old you is that you have not used the returned keys yet.&lt;/p&gt;
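
&lt;p&gt;To make the "hands back" step concrete, here is a minimal sketch that takes a response in the JSON shape &lt;code&gt;aws sts assume-role&lt;/code&gt; prints and maps the three-piece set onto the standard AWS environment variables. Every value in the sample response is a fake placeholder, and &lt;code&gt;credentials_to_env&lt;/code&gt; is a hypothetical helper name, not part of any SDK.&lt;/p&gt;

```python
import json

# Placeholder response in the shape `aws sts assume-role` prints as JSON.
# All values are fake.
response_text = """
{
  "Credentials": {
    "AccessKeyId": "ASIAEXAMPLEKEYID",
    "SecretAccessKey": "example-secret",
    "SessionToken": "example-session-token",
    "Expiration": "2026-06-06T12:00:00Z"
  }
}
"""


def credentials_to_env(response):
    """Map the returned three-piece set onto the standard AWS environment variables."""
    creds = response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


env = credentials_to_env(json.loads(response_text))
for name, value in env.items():
    # Only prints shell export lines; nothing changes until a later call uses them.
    print(f"export {name}={value}")
```

&lt;p&gt;Until those three variables (or an equivalent profile) are actually in effect for a later call, every request is still signed with the old keys.&lt;/p&gt;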




&lt;h2&gt;
  
  
  4. Difficulty 2: why does a Role carry two policies
&lt;/h2&gt;

&lt;p&gt;This is the biggest climb in IAM. &lt;strong&gt;A Role has two policies of different natures hanging off it.&lt;/strong&gt; Beginners conflate the two and stall on "which one do I write what in?"&lt;/p&gt;

&lt;p&gt;The reason is simple: &lt;strong&gt;a Role is a split personality that is both an "identity" and a "resource" at the same time.&lt;/strong&gt; Each of the two policies belongs to one of those two faces.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Policy&lt;/th&gt;
&lt;th&gt;Which face of the Role&lt;/th&gt;
&lt;th&gt;Question it answers&lt;/th&gt;
&lt;th&gt;Analogy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;trust policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The resource face (resource-based)&lt;/td&gt;
&lt;td&gt;Who is allowed into this Role&lt;/td&gt;
&lt;td&gt;The &lt;strong&gt;front-door key&lt;/strong&gt; (who gets in)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;permission policy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The identity face (identity-based)&lt;/td&gt;
&lt;td&gt;What you can do once inside&lt;/td&gt;
&lt;td&gt;The &lt;strong&gt;list of things you may touch&lt;/strong&gt; in the room&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You need both. Neither works alone.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No (or mismatched) trust policy → you cannot Assume at all. Stopped at the door.&lt;/li&gt;
&lt;li&gt;No permission policy → you can Assume, but you cannot touch anything in the room you entered. Empty permissions.&lt;/li&gt;
&lt;/ul&gt;
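
&lt;p&gt;Side by side, the two documents look roughly like this. It is a minimal sketch: the account ID, the user &lt;code&gt;alice&lt;/code&gt;, and the bucket &lt;code&gt;example-bucket&lt;/code&gt; are placeholders, and the policies are written as Python dicts only so they can be printed as JSON.&lt;/p&gt;

```python
import json

# Trust policy: the resource face. It names WHO may assume the Role.
# The account ID and user name are placeholders.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:user/alice"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permission policy: the identity face. It names WHAT the Role may do.
# The bucket name is a placeholder.
permission_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

for label, policy in (("trust", trust_policy), ("permission", permission_policy)):
    print(f"--- {label} policy ---")
    print(json.dumps(policy, indent=2))
```

&lt;p&gt;The quickest way to tell them apart: the trust policy has a &lt;code&gt;Principal&lt;/code&gt; and no &lt;code&gt;Resource&lt;/code&gt;, while the permission policy has a &lt;code&gt;Resource&lt;/code&gt; and no &lt;code&gt;Principal&lt;/code&gt;.&lt;/p&gt;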

&lt;h3&gt;
  
  
  Watch it across the Assume flow
&lt;/h3&gt;

&lt;p&gt;Tracing where each of the two takes effect along the timeline organizes the whole thing at once.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F02-assume-role-sequence.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F02-assume-role-sequence.png" alt="AssumeRole two-policy sequence" width="800" height="594"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The door check (trust policy) takes effect &lt;strong&gt;at the moment of Assume.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;The contents check (permission policy) takes effect &lt;strong&gt;after you assume, at the moment you actually hit the API.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you do not know about this two-stage structure, you read the error message wrong.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;is not authorized to perform: sts:AssumeRole&lt;/code&gt; → &lt;strong&gt;stopped at the door.&lt;/strong&gt; Look at the trust policy, or the caller's own AssumeRole permission.&lt;/li&gt;
&lt;li&gt;Assume went through but &lt;code&gt;s3:GetObject&lt;/code&gt; returns &lt;code&gt;AccessDenied&lt;/code&gt; → &lt;strong&gt;stopped at the contents.&lt;/strong&gt; Look at the permission policy.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The trust policy is the more dangerous one
&lt;/h3&gt;

&lt;p&gt;A permission policy that is too wide means "the person who got in does too much." A trust policy that is too wide means "people who should never have gotten in can get in." The latter is the more serious accident. Carelessly setting &lt;code&gt;Principal&lt;/code&gt; to &lt;code&gt;"*"&lt;/code&gt; (anyone) turns that Role into a privilege-escalation hole that any AWS account on Earth can Assume. The trust policy is less flashy than the permission policy, but it is the place to write more carefully.&lt;/p&gt;
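
&lt;p&gt;A check for that mistake is easy to sketch. The function below is a hypothetical lint, not an AWS tool: it only looks for a literal &lt;code&gt;"*"&lt;/code&gt; in the &lt;code&gt;Principal&lt;/code&gt; of Allow statements and deliberately ignores any &lt;code&gt;Condition&lt;/code&gt; that might narrow it, so treat a hit as "go read this statement," not as a verdict.&lt;/p&gt;

```python
def wildcard_principals(trust_policy):
    """Return the Allow statements whose Principal matches anyone."""
    risky = []
    for statement in trust_policy.get("Statement", []):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal")
        if principal == "*":
            risky.append(statement)
        elif isinstance(principal, dict):
            aws = principal.get("AWS")
            values = aws if isinstance(aws, list) else [aws]
            if "*" in values:
                risky.append(statement)
    return risky


# Two placeholder trust policies: one wide open, one scoped to a single account.
too_wide = {
    "Statement": [
        {"Effect": "Allow", "Principal": {"AWS": "*"}, "Action": "sts:AssumeRole"}
    ]
}
scoped = {
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::123456789012:root"},
            "Action": "sts:AssumeRole",
        }
    ]
}

print("too_wide:", len(wildcard_principals(too_wide)), "risky statement(s)")
print("scoped:", len(wildcard_principals(scoped)), "risky statement(s)")
```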




&lt;h2&gt;
  
  
  5. Difficulty 3: there are too many policy types
&lt;/h2&gt;

&lt;p&gt;So far we have seen three: identity-based, resource-based, and trust. AWS officially classifies policies into &lt;strong&gt;seven types&lt;/strong&gt; (identity-based, resource-based, permission boundary, SCP, RCP, ACL, session). A beginner who sees that list loses heart on the spot.&lt;/p&gt;

&lt;p&gt;You do not need to memorize all of them as equals. Two clarifications up front make it much lighter.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The trust policy is not a separate type.&lt;/strong&gt; It is a kind of resource-based policy, dedicated to the Assume door of a Role (the one from the last section). An ACL is also a relative of resource-based, and it is treated as legacy now, so do not use it for anything new. So in practice you only need to look at "identity side / resource side / ceiling family."&lt;/li&gt;
&lt;li&gt;The rest organize instantly once you &lt;strong&gt;split them into two groups: things that add permissions, and things that lower the ceiling on permissions.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F03-policy-two-groups.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F03-policy-two-groups.png" alt="Two groups of policy types" width="800" height="699"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(The trust policy is not in this diagram. As covered above, it is dedicated to the Assume door and plays a different role from the permission math we do here. SCP and RCP are ceilings that only show up if you use Organizations; a personal account has none.)&lt;/p&gt;

&lt;p&gt;The clearest way to picture the two groups is addition and multiplication.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Group&lt;/th&gt;
&lt;th&gt;Which ones&lt;/th&gt;
&lt;th&gt;How they combine&lt;/th&gt;
&lt;th&gt;Intuition&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;The adding group&lt;/td&gt;
&lt;td&gt;identity-based / resource-based&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;union (addition).&lt;/strong&gt; A single Allow anywhere permits it&lt;/td&gt;
&lt;td&gt;Pushes toward more permission&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The ceiling group&lt;/td&gt;
&lt;td&gt;SCP / RCP / Boundary / session&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;intersection (multiplication).&lt;/strong&gt; Denied unless all permit&lt;/td&gt;
&lt;td&gt;Pushes toward less. One veto and you are out&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The consequence here is the one beginners get caught on the most.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Writing an Allow into an SCP or a Permission Boundary does not add a single bit of permission.&lt;/strong&gt; These are a "ceiling," not a "permission." Actual permission is granted by the adding group (identity / resource). The ceiling group only "trims the part of the granted permission that goes over the ceiling."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So "I wrote Allow in the SCP but my usable permissions did not grow" is correct behavior. An SCP's Allow is nothing but setting a ceiling that says "you may permit up to here." Actual permission has to be granted separately, by something like an identity-based policy.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Difficulty 4: why do I get denied when I wrote Allow
&lt;/h2&gt;

&lt;p&gt;"I clearly wrote &lt;code&gt;Allow&lt;/code&gt; in the policy but I get &lt;code&gt;AccessDenied&lt;/code&gt;." The most common way to get stuck in IAM. The cause is &lt;strong&gt;the order of evaluation.&lt;/strong&gt; AWS looks at all policies in this order.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F04-evaluation-order.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F04-evaluation-order.png" alt="IAM policy evaluation order" width="800" height="856"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are three iron rules to read off this diagram.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The default is deny (implicit Deny).&lt;/strong&gt; Write nothing and you can do nothing. Permission is born only once you spell out at least one Allow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;An explicit Deny beats everything.&lt;/strong&gt; If any applicable policy has a Deny that matches the request, the request is denied no questions asked, even with Allows lined up across every other policy. Evaluation stops there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The ceiling group is a "cutoff," not a "grant."&lt;/strong&gt; As covered last section, an operation the SCP / Boundary / session do not permit will not go through no matter how many Allows the identity-based policy has.&lt;/li&gt;
&lt;/ol&gt;
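
&lt;p&gt;The three rules can be written down as one small function. This is a sketch of the simplified model only, not AWS's evaluation engine: each policy is reduced to a list of &lt;code&gt;(effect, action)&lt;/code&gt; pairs, actions match as exact strings, and resources, conditions, and wildcards are ignored. The function name and the returned reason strings are invented for the example.&lt;/p&gt;

```python
def evaluate(action, adding_group, ceiling_group):
    """Return a decision string for one action.

    Each policy is a list of (effect, action) pairs.
    """
    every_policy = adding_group + ceiling_group

    # Rule 2: one matching explicit Deny anywhere ends the evaluation.
    for policy in every_policy:
        if ("Deny", action) in policy:
            return "Deny (explicit)"

    # Rule 3: every ceiling that is present must permit the action.
    for policy in ceiling_group:
        if ("Allow", action) not in policy:
            return "Deny (ceiling)"

    # Some policy in the adding group has to actually grant it.
    for policy in adding_group:
        if ("Allow", action) in policy:
            return "Allow"

    # Rule 1: nothing granted it, so the implicit Deny applies.
    return "Deny (implicit)"


identity = [
    ("Allow", "s3:GetObject"),
    ("Allow", "s3:DeleteObject"),
    ("Allow", "ec2:RunInstances"),
]
scp = [
    ("Allow", "s3:GetObject"),
    ("Allow", "s3:PutObject"),
    ("Allow", "s3:DeleteObject"),
    ("Deny", "s3:DeleteObject"),
]

for action in ("s3:GetObject", "s3:DeleteObject", "ec2:RunInstances", "s3:PutObject"):
    print(action, evaluate(action, [identity], [scp]))
```

&lt;p&gt;Three of the four requests are denied, each for a different reason, which is the point: the same &lt;code&gt;AccessDenied&lt;/code&gt; can come from three different places.&lt;/p&gt;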

&lt;p&gt;(This diagram is simplified for beginners. Strictly, a Permission Boundary does not constrain permissions that a resource-based policy grants directly to a Principal. A path explicitly permitted by name on the resource side can slip past the boundary ceiling. At first, forget this exception and just learn "the ceiling group is the overall ceiling.")&lt;/p&gt;

&lt;p&gt;When you knock out the typical patterns of "I wrote Allow but got denied," the culprit is usually one of these.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is an &lt;strong&gt;explicit Deny&lt;/strong&gt; somewhere (common in an SCP the org attached).&lt;/li&gt;
&lt;li&gt;You are hitting the &lt;strong&gt;ceiling of an SCP or Permission Boundary&lt;/strong&gt; and are simply outside the limit.&lt;/li&gt;
&lt;li&gt;It is cross-account and the &lt;strong&gt;other account has no permission&lt;/strong&gt; (next section).&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;Condition&lt;/code&gt; (an IP restriction, MFA required, and so on) is not satisfied, so that Allow never fires.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The presence of &lt;code&gt;Allow&lt;/code&gt; is a necessary condition for permission, not a sufficient one. Suspecting "is there a Deny?" and "am I inside the ceiling?" first is the shortcut to debugging.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Difficulty 5: behavior changes between same-account and cross-account
&lt;/h2&gt;

&lt;p&gt;This is the asymmetric rule you will never figure out unless someone tells you. &lt;strong&gt;The exact same operation needs different things depending on whether the other side is the same account or a different account.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F05-same-vs-cross-account.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F05-same-vs-cross-account.png" alt="Same-account vs cross-account" width="800" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To pin it in words:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Within the same account&lt;/strong&gt;: if either the identity-based policy or the resource-based policy permits it, you are through (union).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-account&lt;/strong&gt;: company A's Allow on the identity side &lt;strong&gt;and&lt;/strong&gt; company B's Allow on the resource side are &lt;strong&gt;both&lt;/strong&gt; required (miss either one and you are out).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Why is it like this? Cross-account is the act of "reaching for stuff in someone else's house," so it is natural to think both parties must consent: &lt;strong&gt;the reaching side (your account permits it)&lt;/strong&gt; and &lt;strong&gt;the reached side (their account permits it).&lt;/strong&gt; Within one account there is a single homeowner, so one side's consent is enough.&lt;/p&gt;

&lt;p&gt;On top of this rides one more asymmetry. The "Role trust policy" from the earlier section also tangles with this same/cross distinction. To assume another company's Role cross-account, their trust policy must name your Principal, and you must have the &lt;code&gt;sts:AssumeRole&lt;/code&gt; permission on your side: both sides must line up. This is why, when you "cannot assume the other party's Role," looking only at your own permissions often does not solve it.&lt;/p&gt;
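
&lt;p&gt;Here is a minimal sketch of both halves lining up. The account IDs &lt;code&gt;111111111111&lt;/code&gt; (yours) and &lt;code&gt;222222222222&lt;/code&gt; (theirs) and the Role name &lt;code&gt;ExampleCrossAccountRole&lt;/code&gt; are placeholders assumed for illustration. First, the trust policy on their Role names your account as the Principal (a specific User or Role ARN also works):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::111111111111:root" },
      "Action": "sts:AssumeRole"
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Then, on your side, an identity-based policy lets your User or Role call &lt;code&gt;sts:AssumeRole&lt;/code&gt; on that Role:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::222222222222:role/ExampleCrossAccountRole"
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Take away either document and the cross-account AssumeRole call is denied.&lt;/p&gt;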




&lt;h2&gt;
  
  
  8. Difficulty 6: why is "IAM Policy alone" not enough
&lt;/h2&gt;

&lt;p&gt;The final climb. By now you might think "what I actually want is just 'who can do what,' so as long as I can write an IAM Policy, that is enough, right?" In reality this is the structural trap that catches teams in the field most often.&lt;/p&gt;

&lt;p&gt;The key is the fact we touched on in the vocabulary section. &lt;strong&gt;An IAM Policy floats in the air on its own; it only takes effect once you attach it to an "identity" (a User or a Role).&lt;/strong&gt; And &lt;strong&gt;that identity can only exist inside one specific account.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The instant you add accounts, this bares its fangs. Say a company splits into dev / staging / prod / sandbox and so on, ten accounts. Each account is an independent IAM space. To log alice into every account, the naive approach gives you this.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F06-iam-user-per-account.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F06-iam-user-per-account.png" alt="An IAM User per account" width="800" height="464"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IAM Users and long-lived access keys multiply by headcount times account count.&lt;/strong&gt; 10 accounts × 50 people = 500 Users and keys. What makes this hell:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A flood of long-lived keys gets scattered (the leak surface is headcount times account count).&lt;/li&gt;
&lt;li&gt;MFA has to be set up on every one of them.&lt;/li&gt;
&lt;li&gt;When alice leaves, you go around deleting all ten. Forget even one and a hole stays open.&lt;/li&gt;
&lt;li&gt;To change permissions, you do every account by hand.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This very property of "identities being bound per account" is why IAM Policy alone cannot run a real organization. AWS provides the fix: &lt;strong&gt;IAM Identity Center.&lt;/strong&gt; It manages employee identities in one place and connects to your in-house &lt;strong&gt;IdP&lt;/strong&gt; (Identity Provider: a platform such as Okta, Microsoft Entra ID, or Google Workspace that centrally manages employee accounts).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F07-identity-center.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F07-identity-center.png" alt="IAM Identity Center fan-out" width="800" height="914"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The crux of the mechanism comes down to two words.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Permission Set&lt;/strong&gt;: a &lt;strong&gt;template&lt;/strong&gt; for permissions. Define &lt;code&gt;ReadOnly&lt;/code&gt; or &lt;code&gt;Admin&lt;/code&gt; once in the center and reuse it across multiple accounts. The contents are just a bundle of ordinary IAM Policies. You do not have to write from scratch; you can pick AWS-made &lt;strong&gt;predefined&lt;/strong&gt; ones like &lt;code&gt;AdministratorAccess&lt;/code&gt; / &lt;code&gt;PowerUserAccess&lt;/code&gt; / &lt;code&gt;ReadOnlyAccess&lt;/code&gt; as-is. Only when you need something fancy do you mix in a custom policy of your own.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Assignment&lt;/strong&gt;: the tuple that says "which person (or Group) gets which Permission Set in which account."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When you assign, Identity Center &lt;strong&gt;creates an IAM Role inside that account automatically.&lt;/strong&gt; So the IAM constraint that "each account needs its own Role" has not changed. What changed is that &lt;strong&gt;you no longer create that Role by hand; it is stamped out automatically from a central template.&lt;/strong&gt; And alice has a single identity, in the IdP. When she leaves, disable that one identity in the IdP and she is shut out of every account at once.&lt;/p&gt;

&lt;p&gt;To organize it:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;IAM User with a policy attached directly&lt;/th&gt;
&lt;th&gt;Identity Center + Permission Set&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Where the identity lives&lt;/td&gt;
&lt;td&gt;Scattered across each account&lt;/td&gt;
&lt;td&gt;Consolidated into one in the IdP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Credentials&lt;/td&gt;
&lt;td&gt;Long-lived access keys&lt;/td&gt;
&lt;td&gt;Short-lived (issued fresh by &lt;code&gt;aws sso login&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On departure&lt;/td&gt;
&lt;td&gt;Delete the User in every account&lt;/td&gt;
&lt;td&gt;Disable one in the IdP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Changing permissions&lt;/td&gt;
&lt;td&gt;Every account by hand&lt;/td&gt;
&lt;td&gt;Change the template / Assign centrally&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  You see the difference when you get your hands dirty
&lt;/h3&gt;

&lt;p&gt;This article mostly skips code, but this is the one place where lining up the actual commands makes the difference obvious at a glance. Let me run the same scenario, "alice uses two accounts, dev and prod," both ways.&lt;/p&gt;

&lt;p&gt;The IAM User way makes the admin repeat the same work &lt;strong&gt;as many times as there are accounts.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# as the admin of the dev account&lt;/span&gt;
aws iam create-user &lt;span class="nt"&gt;--user-name&lt;/span&gt; alice
aws iam attach-user-policy &lt;span class="nt"&gt;--user-name&lt;/span&gt; alice &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--policy-arn&lt;/span&gt; arn:aws:iam::aws:policy/PowerUserAccess
aws iam create-access-key &lt;span class="nt"&gt;--user-name&lt;/span&gt; alice   &lt;span class="c"&gt;# long-lived keys come out -&amp;gt; hand them to alice&lt;/span&gt;

&lt;span class="c"&gt;# do exactly the same thing again in the prod account (another set of keys)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On alice's side, she writes the &lt;strong&gt;long-lived keys she received as-is&lt;/strong&gt; into &lt;code&gt;~/.aws/credentials&lt;/code&gt; and uses them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[dev]&lt;/span&gt;
&lt;span class="py"&gt;aws_access_key_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;AKIA...DEV...&lt;/span&gt;
&lt;span class="py"&gt;aws_secret_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;...DEV...&lt;/span&gt;

&lt;span class="nn"&gt;[prod]&lt;/span&gt;
&lt;span class="py"&gt;aws_access_key_id&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;AKIA...PROD...&lt;/span&gt;
&lt;span class="py"&gt;aws_secret_access_key&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;...PROD...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Identity Center way, on the other hand, has no &lt;code&gt;create-user&lt;/code&gt; and no &lt;code&gt;create-access-key&lt;/code&gt;. alice configures once.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws configure sso        &lt;span class="c"&gt;# register the SSO URL once. log in via browser (MFA here)&lt;/span&gt;
                         &lt;span class="c"&gt;# every account/permission you can reach is listed, and profiles are written automatically&lt;/span&gt;
aws sso login            &lt;span class="c"&gt;# about once a day. short-lived credentials rain down&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;--profile&lt;/span&gt; dev  &lt;span class="c"&gt;# same login, just switch between dev and prod&lt;/span&gt;
aws s3 &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;--profile&lt;/span&gt; prod
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look at &lt;code&gt;~/.aws/config&lt;/code&gt; and the point is that &lt;strong&gt;not a single key is written there.&lt;/strong&gt; All that is written is "which account, which Permission Set," and the actual keys are issued fresh and short-lived on every &lt;code&gt;aws sso login&lt;/code&gt;. This is the decisive difference from the former, which parks long-lived keys in a file.&lt;/p&gt;

&lt;p&gt;This is where the essence we pinned at the start pays off: "&lt;strong&gt;every access in AWS ultimately converges on assuming a Role and taking short-lived credentials.&lt;/strong&gt;" Identity Center and Permission Sets are not a new authorization mechanism; they are nothing but &lt;strong&gt;a connector that wires the human sign-in to assuming a Role in each account.&lt;/strong&gt; What you hold at the end is the usual short-lived credentials of a Role.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Summary: one mental model
&lt;/h2&gt;

&lt;p&gt;Let me fold all the difficulty so far into a single picture. When IAM stops making sense, come back to this drawing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F08-mental-model.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-why-its-hard-deep-dive%2Fdiagrams%2F08-mental-model.png" alt="One-page IAM mental model" width="800" height="996"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each difficulty in one line:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;User vs Role&lt;/strong&gt;: the difference is key lifetime. A User keeps a long-lived key; a Role borrows a short-lived one each time. A Role is the default now.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two policies&lt;/strong&gt;: a Role has a split personality, "identity" and "resource." The door (trust policy) and the contents (permission policy) answer different questions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Policy types&lt;/strong&gt;: officially there are seven, but split them into "the group that grants by adding" and "the group that lowers the ceiling by multiplying" and they fall into place. An Allow in the ceiling group adds no permission.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evaluation order&lt;/strong&gt;: write nothing, denied. Deny beats everything. "I wrote Allow" is only a necessary condition for permission.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Same vs cross&lt;/strong&gt;: same account, one side's Allow is enough. Different account, you need both sides' Allow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IAM Policy alone is not enough&lt;/strong&gt;: identities are bound to accounts. When accounts grow, use Identity Center to consolidate identities centrally and hand out Roles automatically.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;IAM beats you up at first sight because there are so many concepts, but most of the difficulty is the kind that disappears once explained: "same word, different thing," and "a handful of asymmetric rules that fight intuition." Step on these six mines once and the next time you see &lt;code&gt;AccessDenied&lt;/code&gt;, you will be able to narrow down the culprit in order.&lt;/p&gt;

&lt;p&gt;Next time you get stuck, suspect first "is there a Deny?" and "am I inside the ceiling (SCP / Boundary)?" If that still does not explain it, then it is either the same/cross asymmetry or you are stopped at the trust policy door. Once you know where the mines are, IAM is not scary.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>iam</category>
      <category>security</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Transaction Tokens Deep Dive: The OAuth Spec That Carries 'Who, and Why' Across Your Microservices</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Wed, 03 Jun 2026 16:17:13 +0000</pubDate>
      <link>https://dev.to/kanywst/transaction-tokens-deep-dive-the-oauth-spec-that-carries-who-and-why-across-your-microservices-7k1</link>
      <guid>https://dev.to/kanywst/transaction-tokens-deep-dive-the-oauth-spec-that-carries-who-and-why-across-your-microservices-7k1</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;A modern system processes a single request across many cooperating microservices. A user clicks "Buy" on a trading site, and behind that one click the API gateway, the order service, the risk service, the payment service, and the notification service all fire in a chain.&lt;/p&gt;

&lt;p&gt;That chain hides a basic question.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"What happens if you take the user's OAuth access token that arrived at the first API gateway, and keep forwarding it as-is to every internal service?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F01-token-forwarding-problem.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F01-token-forwarding-problem.png" alt="Forwarding the OAuth token unchanged down the call chain" width="800" height="2002"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Plenty goes wrong.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You are leaking an external token into the internals.&lt;/strong&gt; If that access token leaks, every internal service is exposed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You lose track of who started the request.&lt;/strong&gt; The risk service cannot tell whether this call came through the user's gateway or whether someone is hitting it directly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Any service can impersonate another.&lt;/strong&gt; Once the internal network is breached, a malicious workload can pretend to be a legitimate request and call other services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token theft.&lt;/strong&gt; Steal the OAuth token flowing internally and you can reach external resources too.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Transaction Tokens (Txn-Tokens)&lt;/strong&gt; are the answer to all of this.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Transaction Tokens are
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;draft-ietf-oauth-transaction-tokens-08&lt;/strong&gt; (Txn-Tokens for short) is a spec moving through the IETF OAuth WG. As of June 2026 the latest version is Draft 08, and it has entered &lt;strong&gt;WG Last Call (WGLC)&lt;/strong&gt;, the final stage of the working group's process. The authors are engineers from CrowdStrike, Practical Identity, and Defakto Security, and the spec grew out of running microservices at real scale.&lt;/p&gt;

&lt;p&gt;In one sentence:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A Txn-Token is a short-lived JWT that tells every workload inside a Trust Domain, in a tamper-proof way, who a request was started for and what it was started to do.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F02-trust-domain-overview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F02-trust-domain-overview.png" alt="Trust Domain with a TTS issuing Txn-Tokens" width="800" height="923"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The contrast:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Forwarding the OAuth access token as-is&lt;/th&gt;
&lt;th&gt;Using Txn-Tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Leak risk&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;an externally-facing token flows through the internals&lt;/td&gt;
&lt;td&gt;only an internal, short-lived token flows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;lost along the way&lt;/td&gt;
&lt;td&gt;every workload can verify the same context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tamper detection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;the TTS signature makes any tampering detectable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Impersonation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;hard to stop&lt;/td&gt;
&lt;td&gt;the TTS only issues for legitimate transactions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Blast radius&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;unbounded&lt;/td&gt;
&lt;td&gt;narrowed by tight scopes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Terminology first
&lt;/h2&gt;

&lt;p&gt;Before reading the spec, nail down the terms that are specific to Txn-Tokens.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Trust Domain&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A set of systems under a common security policy. This is the boundary where a Txn-Token is valid.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;External Endpoint&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The entry point into a Trust Domain (an API gateway and the like), where external tokens arrive.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Workload&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A unit of execution: a container, a microservice, a managed database.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Call Chain&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The sequence of workloads invoked one after another for a single transaction.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TTS (Txn-Token Service)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The special service inside a Trust Domain that issues Txn-Tokens.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Txn-Token&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A short-lived, signed JWT holding user ID, workload ID, and authorization context, flowing through the whole Call Chain.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A &lt;strong&gt;Trust Domain&lt;/strong&gt; is a group of systems that share a common security policy and controls. Two or more workloads on a physically or virtually isolated network form one Trust Domain, and access to a workload is restricted to its published interfaces only.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;TTS&lt;/strong&gt; is the special service inside a Trust Domain that issues Txn-Tokens, and there is logically one per Trust Domain. "Logically one" means you can run multiple instances for availability, as long as they are unified under one trust policy.&lt;/p&gt;




&lt;h2&gt;
  
  
  What kind of JWT a Txn-Token is
&lt;/h2&gt;

&lt;p&gt;A Txn-Token is just a signed JWT (JSON Web Token) carrying a specific set of claims.&lt;/p&gt;

&lt;h3&gt;
  
  
  JWT header
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"typ"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"txntoken+jwt"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"alg"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RS256"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"kid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"identifier-to-key"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;typ&lt;/code&gt; of &lt;code&gt;txntoken+jwt&lt;/code&gt; marks this token as a Txn-Token. The draft also requests registration of that media type with IANA.&lt;/p&gt;

&lt;h3&gt;
  
  
  JWT body claims
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Claim&lt;/th&gt;
&lt;th&gt;Required&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;iat&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;issued-at time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;aud&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;Trust Domain identifier (unusable outside this domain)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;exp&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;expiry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;txn&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;transaction-unique ID (per RFC 8417 Section 2.2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sub&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;the subject of the transaction (a user or workload identifier)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;scope&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;the purpose and permission range of this transaction (per RFC 8693 Section 4.2)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;req_wl&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;identifier of the workload that requested the Txn-Token&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;iss&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;issuer (omit when the signing key is already known)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;rctx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;requester environment context (source IP, auth method, ...)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;tctx&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;transaction context (values that stay constant across the Call Chain)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The two that matter most are &lt;code&gt;tctx&lt;/code&gt; (Transaction Context) and &lt;code&gt;rctx&lt;/code&gt; (Requester Context).&lt;/p&gt;

&lt;h4&gt;
  
  
  tctx (Transaction Context)
&lt;/h4&gt;

&lt;p&gt;Holds the parts of the transaction that &lt;strong&gt;do not change&lt;/strong&gt; across the call chain. The TTS sets this as authoritative data, and services read it to make authorization decisions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"tctx"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BUY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ticker"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"MSFT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"100"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"customer_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"geo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VIP"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  rctx (Requester Context)
&lt;/h4&gt;

&lt;p&gt;Holds the &lt;strong&gt;environment&lt;/strong&gt; of the request: the source IP, auth method, and so on of the original caller.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"rctx"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"req_ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"69.151.72.123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"urn:ietf:rfc:6749"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  A real Txn-Token (the stock-trade case)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iat"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1686536226&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"trust-domain.example"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"exp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1686536586&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"txn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"97053963-771d-49cc-a4e3-20aad399c312"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"d084sdrt234fsaw34tr23t"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"req_wl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"apigateway.trust-domain.example"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rctx"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"req_ip"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"69.151.72.123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"authn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"urn:ietf:rfc:6749"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scope"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"trade.stocks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tctx"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BUY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"ticker"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"MSFT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"100"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"customer_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"geo"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"US"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"level"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"VIP"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice &lt;code&gt;exp&lt;/code&gt; is &lt;code&gt;iat&lt;/code&gt; + 360 seconds (6 minutes). &lt;strong&gt;A Txn-Token must be short-lived.&lt;/strong&gt; Set it to a few minutes at most.&lt;/p&gt;




&lt;h2&gt;
  
  
  The basic flow: handling an external request
&lt;/h2&gt;

&lt;p&gt;Let's walk through how Transaction Tokens get used, with a sequence diagram.&lt;/p&gt;

&lt;h3&gt;
  
  
  Basic flow (external request)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F03-external-request-flow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F03-external-request-flow.png" alt="Sequence: handling an external request" width="800" height="701"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The key part is &lt;strong&gt;step 3, the call chain.&lt;/strong&gt; The Txn-Token is &lt;strong&gt;forwarded unchanged&lt;/strong&gt; between internal services. You must not modify it. Because each service verifies the signature independently, every one of them can confirm that the TTS authorized this transaction.&lt;/p&gt;

&lt;h3&gt;
  
  
  When the request starts internally
&lt;/h3&gt;

&lt;p&gt;An external API call is not the only way a transaction begins. A scheduler, a batch job, or any internal workload can kick off a transaction on its own.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F04-internal-trigger-flow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F04-internal-trigger-flow.png" alt="Sequence: a transaction started internally" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With no external OAuth token in hand, the workload generates a &lt;strong&gt;Self-Signed JWT&lt;/strong&gt; and presents it to the TTS. The TTS verifies it, confirms the request comes from a legitimate workload, and then issues a Txn-Token.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Txn-Token Service (TTS) in detail
&lt;/h2&gt;

&lt;p&gt;The TTS is the heart of the spec. It is implemented as a profile of RFC 8693 (OAuth 2.0 Token Exchange). Token Exchange is the OAuth extension for "trade a token you already hold for a new token with a different purpose or scope," using &lt;code&gt;grant_type=urn:ietf:params:oauth:grant-type:token-exchange&lt;/code&gt;. The TTS rides on that mechanism to convert an OAuth access token (or the Self-Signed JWT above) into a Txn-Token.&lt;/p&gt;

&lt;h3&gt;
  
  
  Request to the TTS (Token Exchange)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;POST&lt;/span&gt; &lt;span class="nn"&gt;/txn-token-service/token_endpoint&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;txn-token-service.trust-domain.example&lt;/span&gt;
&lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/x-www-form-urlencoded&lt;/span&gt;

grant_type=urn:ietf:params:oauth:grant-type:token-exchange
&amp;amp;requested_token_type=urn:ietf:params:oauth:token-type:txn_token
&amp;amp;audience=http://trust-domain.example
&amp;amp;scope=trade.stocks
&amp;amp;subject_token=eyJhbGciOiJFUzI1NiIsImtpZCI...
&amp;amp;subject_token_type=urn:ietf:params:oauth:token-type:access_token
&amp;amp;request_context={"req_ip":"69.151.72.123","authn":"urn:ietf:rfc:6749"}
&amp;amp;request_details={"action":"BUY","ticker":"MSFT","quantity":"100"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What each parameter means:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Required&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;grant_type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;the fixed Token Exchange value from RFC 8693&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;requested_token_type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;set to the &lt;code&gt;txn_token&lt;/code&gt; URN&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;audience&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;the Trust Domain identifier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;scope&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;the purpose and permission of this transaction (keep it tight)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;subject_token&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;a token proving the subject (OAuth access token, Self-Signed JWT, ...)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;subject_token_type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;yes&lt;/td&gt;
&lt;td&gt;a URI naming the subject_token type&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;request_context&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;request environment (IP, ...); lands in rctx&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;request_details&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;no&lt;/td&gt;
&lt;td&gt;request details (the action, ...); used to build tctx&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
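
&lt;p&gt;To make the parameter table concrete, here is a minimal sketch of how a workload might assemble that request body with the Python standard library. The function name &lt;code&gt;build_txn_token_request&lt;/code&gt; and the placeholder subject token are hypothetical, and sending the two JSON parameters as percent-encoded JSON text is an assumption; check the draft for the exact encoding it requires.&lt;/p&gt;

```python
import json
from urllib.parse import parse_qs, urlencode

TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
TXN_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:txn_token"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_txn_token_request(subject_token, scope, audience,
                            request_context=None, request_details=None):
    """Return the form-encoded body for a Txn-Token request (hypothetical helper)."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "requested_token_type": TXN_TOKEN_TYPE,
        "audience": audience,
        "scope": scope,
        "subject_token": subject_token,
        "subject_token_type": ACCESS_TOKEN_TYPE,
    }
    # The two optional parameters are only sent when the caller supplies them.
    # Assumption: they travel as compact JSON text, percent-encoded by urlencode.
    if request_context is not None:
        params["request_context"] = json.dumps(request_context, separators=(",", ":"))
    if request_details is not None:
        params["request_details"] = json.dumps(request_details, separators=(",", ":"))
    return urlencode(params)


body = build_txn_token_request(
    subject_token="example-subject-token",  # placeholder, not a real token
    scope="trade.stocks",
    audience="http://trust-domain.example",
    request_context={"req_ip": "69.151.72.123", "authn": "urn:ietf:rfc:6749"},
    request_details={"action": "BUY", "ticker": "MSFT", "quantity": "100"},
)
# Round-trip the body to confirm the JSON parameter survives encoding.
print(json.loads(parse_qs(body)["request_details"][0]))
```

&lt;p&gt;Letting &lt;code&gt;urlencode&lt;/code&gt; do the escaping matters here, because the JSON values contain braces, quotes, and colons that are not safe to place in a form body by hand.&lt;/p&gt;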

&lt;h3&gt;
  
  
  Kinds of subject_token
&lt;/h3&gt;

&lt;p&gt;The TTS accepts several kinds of subject_token.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;subject_token&lt;/th&gt;
&lt;th&gt;
&lt;code&gt;subject_token_type&lt;/code&gt; URI&lt;/th&gt;
&lt;th&gt;Typical use&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OAuth access token&lt;/td&gt;
&lt;td&gt;&lt;code&gt;urn:ietf:params:oauth:token-type:access_token&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;common at an external endpoint&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ID token (OIDC)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;urn:ietf:params:oauth:token-type:id_token&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;a subject already authenticated via OIDC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SAML assertion&lt;/td&gt;
&lt;td&gt;&lt;code&gt;urn:ietf:params:oauth:token-type:saml2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;SAML-based authentication&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Self-Signed JWT&lt;/td&gt;
&lt;td&gt;&lt;code&gt;urn:ietf:params:oauth:token-type:self_signed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;sign-and-present for internal triggers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unsigned JSON object&lt;/td&gt;
&lt;td&gt;&lt;code&gt;urn:ietf:params:oauth:token-type:unsigned_json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;the simplest internal trigger&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Beyond the token types defined in RFC 8693, you can use &lt;code&gt;self_signed&lt;/code&gt; / &lt;code&gt;unsigned_json&lt;/code&gt; for internal triggers, plus any custom URN the parties agree on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: a Refresh Token must not be used as a &lt;code&gt;subject_token&lt;/code&gt;. Txn-Tokens are never minted from a Refresh Token.&lt;/p&gt;

&lt;h3&gt;
  
  
  The TTS processing flow
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F05-tts-processing-flow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F05-tts-processing-flow.png" alt="The TTS request processing flow" width="800" height="1164"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Response from the TTS
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt; &lt;span class="m"&gt;200&lt;/span&gt; &lt;span class="ne"&gt;OK&lt;/span&gt;
&lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
&lt;span class="na"&gt;Cache-Control&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;no-store&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"token_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"N_A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"issued_token_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"urn:ietf:params:oauth:token-type:txn_token"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"access_token"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"eyJCI6IjllciJ9...Qedw6rx"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



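&lt;p&gt;The &lt;code&gt;token_type&lt;/code&gt; of &lt;code&gt;N_A&lt;/code&gt; matters. A Txn-Token is not an OAuth access token that you present in an &lt;code&gt;Authorization&lt;/code&gt; header. It is an internal-only carrier of authorization context. RFC 8693 (Section 2.2.1) reserves &lt;code&gt;N_A&lt;/code&gt; for exactly this case, where the issued token is not an access token, so the TTS returns it even though the token itself travels in the &lt;code&gt;access_token&lt;/code&gt; field.&lt;/p&gt;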
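
&lt;p&gt;A client can check those response fields before trusting what it got back. The sketch below is one way to do that, assuming the response body is the JSON shown above; &lt;code&gt;extract_txn_token&lt;/code&gt; is a hypothetical helper name and the sample token is a placeholder.&lt;/p&gt;

```python
import json

TXN_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:txn_token"


def extract_txn_token(response_body):
    """Return the Txn-Token from a TTS response body, or raise ValueError."""
    data = json.loads(response_body)
    # A Txn-Token response is expected to say N_A, not Bearer.
    if data.get("token_type") != "N_A":
        raise ValueError("unexpected token_type")
    if data.get("issued_token_type") != TXN_TOKEN_TYPE:
        raise ValueError("unexpected issued_token_type")
    # The token is carried in the access_token field of the response.
    token = data.get("access_token")
    if not token:
        raise ValueError("missing access_token")
    return token


sample = json.dumps({
    "token_type": "N_A",
    "issued_token_type": TXN_TOKEN_TYPE,
    "access_token": "header.payload.signature",  # placeholder value
})
print(extract_txn_token(sample))
```

&lt;p&gt;This only checks the envelope. Verifying the Txn-Token's signature and claims is a separate step on the receiving side.&lt;/p&gt;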




&lt;h2&gt;
  
  
  Using the Txn-Token: internal service-to-service calls
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Carrying it in an HTTP header
&lt;/h3&gt;

&lt;p&gt;A workload puts the Txn-Token in a dedicated HTTP header called &lt;code&gt;Txn-Token&lt;/code&gt; and passes it to the next workload.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;POST&lt;/span&gt; &lt;span class="nn"&gt;/api/check-risk&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;1.1&lt;/span&gt;
&lt;span class="na"&gt;Host&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;risk-service.trust-domain.example&lt;/span&gt;
&lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
&lt;span class="na"&gt;Txn-Token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eyJhbGciOiJSUzI1NiIsInR5cCI6InR4bnRva2VuK2p3dCJ9...&lt;/span&gt;
&lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bearer &amp;lt;workload-own-access-token&amp;gt;&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"order_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ord-12345"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: do not put the Txn-Token in the &lt;code&gt;Authorization&lt;/code&gt; header (the spec says &lt;code&gt;MUST NOT&lt;/code&gt;, Section 13). The thing to notice is that &lt;code&gt;Txn-Token&lt;/code&gt; and &lt;code&gt;Authorization&lt;/code&gt; sit side by side in the example above. That is not a mistake, it is the intended shape: two layers run at once inside one request.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Authorization&lt;/code&gt;: carries the calling workload's own authentication, plus &lt;strong&gt;coarse service-to-service authorization&lt;/strong&gt; (is this service allowed to call this API).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Txn-Token&lt;/code&gt;: carries &lt;strong&gt;on whose behalf the request was started and what it is meant to do&lt;/strong&gt; (user ID, the action, and other fine-grained immutable context).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;Authorization&lt;/code&gt; header may already be in use for some other purpose, so piggy-backing the Txn-Token on it would collide. That is why it gets its own &lt;code&gt;Txn-Token&lt;/code&gt; header.&lt;/p&gt;
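
&lt;p&gt;In code, the two layers come down to setting two separate headers on the outbound call. The following is a minimal sketch, assuming the inbound headers are available as a plain dictionary; &lt;code&gt;build_outbound_headers&lt;/code&gt; and the sample token values are hypothetical.&lt;/p&gt;

```python
def build_outbound_headers(inbound_headers, workload_access_token):
    """Build headers for the next hop, forwarding the Txn-Token unchanged."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in inbound_headers.items()}
    txn_token = lowered.get("txn-token")
    if not txn_token:
        raise ValueError("inbound request carried no Txn-Token header")
    return {
        "Content-Type": "application/json",
        # Forwarded exactly as received: the transaction context.
        "Txn-Token": txn_token,
        # The calling workload's own credential, never the Txn-Token.
        "Authorization": "Bearer " + workload_access_token,
    }


headers = build_outbound_headers(
    {"txn-token": "header.payload.signature"},  # placeholder token
    "workload-own-access-token",                # placeholder credential
)
print(sorted(headers))
```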

&lt;h3&gt;
  
  
  Verification on the receiving workload
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F06-receiver-verification.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F06-receiver-verification.png" alt="Verification on the receiving workload" width="800" height="1301"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this point, &lt;strong&gt;you must not modify the Txn-Token.&lt;/strong&gt; Forward it exactly as received. Passing the TTS-signed token along byte for byte is what keeps the signature verifiable and the context consistent across the whole call chain.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pairing it with workload authentication
&lt;/h2&gt;

&lt;p&gt;Public-key-based mutual authentication between the workload and the TTS is recommended. Options include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;mTLS (RFC 8705)&lt;/strong&gt;: mutual TLS certificates so the workload and the TTS authenticate each other&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SPIFFE / SPIRE&lt;/strong&gt;: the workload identity standard, proven with an SVID&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WIMSE Mutual TLS&lt;/strong&gt; (draft-ietf-wimse-mutual-tls)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WIMSE HTTP Signatures&lt;/strong&gt; (draft-ietf-wimse-http-signature)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WIMSE Workload Proof Token&lt;/strong&gt; (draft-ietf-wimse-wpt)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Client Authentication JWT (RFC 7523)&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The TTS authenticates to the workload and the workload authenticates to the TTS (mutual). If the workload does not authenticate the TTS, it risks leaking its OAuth access token to a rogue TTS.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lifetime and replay defense
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Designing the lifetime
&lt;/h3&gt;

&lt;p&gt;The design principle is simple: a Txn-Token only needs to live as long as the request takes to process. The spec spells it out: the lifetime &lt;code&gt;MUST be kept short&lt;/code&gt; (on the order of minutes or less). Even for a long-running batch, you do not make the Txn-Token long-lived. You re-issue a fresh short-lived Txn-Token per transaction. And if the presented &lt;code&gt;subject_token&lt;/code&gt; has already expired, the TTS must not issue a Txn-Token.&lt;/p&gt;
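
&lt;p&gt;A receiving workload can enforce the short-lifetime rule itself rather than trusting the issuer blindly. This is a rough sketch under stated assumptions: the 360-second ceiling mirrors the 6-minute example earlier in the article, the &lt;code&gt;iat&lt;/code&gt; value is simply that example's &lt;code&gt;exp&lt;/code&gt; minus 360, &lt;code&gt;check_lifetime&lt;/code&gt; is a hypothetical helper, and clock skew is ignored.&lt;/p&gt;

```python
MAX_LIFETIME_SECONDS = 360  # assumption: the 6-minute window from the example


def check_lifetime(claims, now, max_lifetime=MAX_LIFETIME_SECONDS):
    """Return None if the token's lifetime is acceptable, else a rejection reason."""
    iat = claims["iat"]
    exp = claims["exp"]
    if now >= exp:
        return "expired"
    if iat > now:
        return "issued in the future"
    # Reject tokens minted with a longer window than local policy allows.
    if exp - iat > max_lifetime:
        return "lifetime too long"
    return None


claims = {"iat": 1686536226, "exp": 1686536586}
print(check_lifetime(claims, now=1686536300))
```

&lt;p&gt;The third check is the interesting one: even an unexpired token is refused if its total window is longer than the policy, which catches a misconfigured issuer early.&lt;/p&gt;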

&lt;h3&gt;
  
  
  Replay detection via the txn claim
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;txn&lt;/code&gt; claim is a transaction-unique UUID. A workload that caches &lt;code&gt;txn&lt;/code&gt; values briefly can &lt;strong&gt;detect reuse of the same Txn-Token, that is, a replay attack.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F07-replay-detection.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F07-replay-detection.png" alt="Replay detection via the txn claim" width="800" height="762"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The catch: this is hard when multiple instances share no state. In practice you combine a short lifetime with correlating &lt;code&gt;txn&lt;/code&gt; in the logs.&lt;/p&gt;
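
&lt;p&gt;For a single instance, the brief cache can be as small as a dictionary with expiry times. The sketch below assumes an in-memory, per-process cache and a TTL equal to the token lifetime; &lt;code&gt;TxnReplayCache&lt;/code&gt; is a hypothetical name, and as noted above this does not help when several instances share no state.&lt;/p&gt;

```python
import time


class TxnReplayCache:
    """Remember txn values for a short time and flag repeats (per-process only)."""

    def __init__(self, ttl_seconds=360.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._seen = {}  # txn value mapped to the time its entry expires

    def check_and_record(self, txn):
        """Return True if txn is new, False if it was already seen within the TTL."""
        now = self._clock()
        # Drop entries whose expiry time has passed.
        self._seen = {t: exp for t, exp in self._seen.items() if exp > now}
        if txn in self._seen:
            return False
        self._seen[txn] = now + self._ttl
        return True


cache = TxnReplayCache()
txn = "97053963-771d-49cc-a4e3-20aad399c312"
print(cache.check_and_record(txn), cache.check_and_record(txn))
```

&lt;p&gt;Entries only need to live as long as the Txn-Token itself, since an expired token is rejected on lifetime grounds anyway. That keeps the cache small.&lt;/p&gt;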




&lt;h2&gt;
  
  
  Security considerations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Never embed an access token inside a Txn-Token
&lt;/h3&gt;

&lt;p&gt;Putting the original OAuth access token inside a Txn-Token is strictly forbidden. A JWT payload is only base64url-encoded, not encrypted, so anyone who gets hold of the Txn-Token can read the access token straight out of it. If that access token is still valid after the Txn-Token expires, an attacker can use it to reach external resources.&lt;/p&gt;

&lt;h3&gt;
  
  
  A Txn-Token is not an authentication credential
&lt;/h3&gt;

&lt;p&gt;A Txn-Token does not prove a workload's identity. It is a carrier of authorization context. A workload still needs a separate mechanism (mTLS and the like) to authenticate itself to the TTS.&lt;/p&gt;

&lt;h3&gt;
  
  
  Prevent scope amplification
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F08-scope-amplification.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F08-scope-amplification.png" alt="Scope amplification is rejected by the TTS" width="799" height="496"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Never log a raw Txn-Token
&lt;/h3&gt;

&lt;p&gt;Do not write a complete Txn-Token to your logs: it becomes replay material. Log one of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;only a hash of the Txn-Token (for correlation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;or&lt;/strong&gt; the payload with the JWS signature stripped&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Be extra careful when PII is involved.&lt;/p&gt;
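
&lt;p&gt;Both logging options are a few lines each. Here is a minimal sketch using only the standard library; the helper names &lt;code&gt;token_fingerprint&lt;/code&gt; and &lt;code&gt;strip_signature&lt;/code&gt; are hypothetical, and SHA-256 is an assumed choice of hash.&lt;/p&gt;

```python
import hashlib


def token_fingerprint(txn_token):
    """Return a SHA-256 hex digest of the token, usable for log correlation."""
    return hashlib.sha256(txn_token.encode("utf-8")).hexdigest()


def strip_signature(txn_token):
    """Return header.payload with the JWS signature part removed."""
    parts = txn_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWS compact serialization")
    # Without the third part the value can no longer be replayed as a token.
    return parts[0] + "." + parts[1]


token = "header.payload.signature"  # placeholder, not a real Txn-Token
print(token_fingerprint(token)[:16], strip_signature(token))
```

&lt;p&gt;The stripped form still exposes every claim in the payload, so the hash is the safer default whenever the token carries PII.&lt;/p&gt;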




&lt;h2&gt;
  
  
  How Txn-Tokens relate to WIMSE
&lt;/h2&gt;

&lt;p&gt;WIMSE (Workload Identity in Multi-System Environments) is the IETF WG standardizing workload identity and workload-to-workload authentication for microservice environments.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F09-wimse-relationship.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fdraft-ietf-oauth-transaction-tokens-deep-dive%2Fdiagrams%2F09-wimse-relationship.png" alt="How WPT and Txn-Tokens complement each other" width="799" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;WIMSE and Txn-Tokens complement each other. WIMSE WPT establishes a workload's own identity, and Txn-Tokens supply the context of the request that workload is processing. Used together, they give you the authentication and authorization model a zero-trust microservice architecture wants.&lt;/p&gt;




&lt;h2&gt;
  
  
  Identity Chaining vs Txn-Tokens
&lt;/h2&gt;

&lt;p&gt;A related spec is &lt;strong&gt;draft-ietf-oauth-identity-chaining&lt;/strong&gt;. Both aim to propagate request context, but they target different problems. Identity Chaining propagates user identity &lt;strong&gt;across Trust Domains&lt;/strong&gt; (cross-domain federation), while Transaction Tokens propagate the full request context &lt;strong&gt;within a single Trust Domain&lt;/strong&gt; (consistency inside the call chain).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Axis&lt;/th&gt;
&lt;th&gt;Identity Chaining&lt;/th&gt;
&lt;th&gt;Transaction Tokens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scope of use&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;between Trust Domains (cross-domain)&lt;/td&gt;
&lt;td&gt;within a Trust Domain (same domain)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Main goal&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;access a resource in another domain&lt;/td&gt;
&lt;td&gt;keep context consistent across the call chain&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Token basis&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;RFC 8693 + RFC 7523 combined&lt;/td&gt;
&lt;td&gt;a profile of RFC 8693&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context carried&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;mainly user identity&lt;/td&gt;
&lt;td&gt;user identity + action + environment, all of it&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Wrap-up
&lt;/h2&gt;

&lt;p&gt;The problems inside microservices were these: external tokens circulating through the internals, request context lost along the way, impersonation you cannot detect. Txn-Tokens solve them by converting the external token into an internal-only Txn-Token, keeping one consistent context across the call chain, blocking tampering and impersonation with the TTS signature, and shrinking the blast radius with short lifetimes and tight scopes.&lt;/p&gt;

&lt;p&gt;The benefits, summarized:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Short-lived + per-transaction binding&lt;/strong&gt; → lower replay risk&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tight scope&lt;/strong&gt; → limits lateral movement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTS signature&lt;/strong&gt; → blocks tampering inside the call chain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Independent verification at each workload&lt;/strong&gt; → rejects unauthorized direct calls&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Only workloads with the right permission can obtain one&lt;/strong&gt; → contains the impact of a compromised service&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The spec is at Draft 08 and in WG Last Call. That is late in the working group's process, though IETF Last Call, IESG review, and the RFC Editor queue still come before publication as an RFC. Standardization is moving ahead with WIMSE integration in view. If you are serious about applying zero trust to your microservices, this is one of the specs to track now.&lt;/p&gt;




&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-ietf-oauth-transaction-tokens/" rel="noopener noreferrer"&gt;draft-ietf-oauth-transaction-tokens-08&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://datatracker.ietf.org/doc/html/rfc8693" rel="noopener noreferrer"&gt;RFC 8693 - OAuth 2.0 Token Exchange&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://datatracker.ietf.org/doc/html/rfc7519" rel="noopener noreferrer"&gt;RFC 7519 - JSON Web Token (JWT)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://datatracker.ietf.org/doc/html/rfc8417" rel="noopener noreferrer"&gt;RFC 8417 - Security Event Token (SET)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-ietf-wimse-arch/" rel="noopener noreferrer"&gt;draft-ietf-wimse-arch - WIMSE Architecture&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>oauth</category>
      <category>security</category>
      <category>microservices</category>
      <category>identity</category>
    </item>
    <item>
      <title>ID-JAG, Transaction Tokens, WIF: The Three Layers of AI Agent Auth</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Wed, 03 Jun 2026 11:12:33 +0000</pubDate>
      <link>https://dev.to/kanywst/the-three-layers-of-ai-agent-authentication-what-id-jag-transaction-tokens-and-wif-actually-1mbk</link>
      <guid>https://dev.to/kanywst/the-three-layers-of-ai-agent-authentication-what-id-jag-transaction-tokens-and-wif-actually-1mbk</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Something clicked when I looked at the &lt;a href="https://platform.claude.com/docs/en/manage-claude/workload-identity-federation" rel="noopener noreferrer"&gt;Workload Identity Federation&lt;/a&gt; that Anthropic shipped in May 2026. Delete the &lt;code&gt;sk-ant-...&lt;/code&gt; API key, have a k8s service account JWT mint a short-lived &lt;code&gt;sk-ant-oat01-...&lt;/code&gt; token, call &lt;code&gt;anthropic.Anthropic()&lt;/code&gt; with no arguments, and it just works. The security posture clearly went up.&lt;/p&gt;

&lt;p&gt;But the actual day-to-day flow, where an agent "runs Cursor on Alice's GitHub PAT and opens a PR," did not change one bit. Cursor, Claude Code, Comet: all of them ultimately hold the user's own credentials and act with them. Around the same time I was following the IETF drafts for ID-JAG (Identity Assertion JWT Authorization Grant) and Transaction Tokens, and I kept feeling this gap: the specs are converging, yet nobody is adopting them.&lt;/p&gt;

&lt;p&gt;That raises one obvious question. &lt;strong&gt;"Tighter scope," "auditable," "smaller blast radius" are all true, and yet the agent UX you use every day does not change. So is this stuff actually going to catch on?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This article takes that question head on. The conclusions up front:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The UX not changing is, like refresh tokens, evidence that the design is correct&lt;/li&gt;
&lt;li&gt;ID-JAG, Transaction Tokens, and WIF each protect a different layer; lining them up to compare is a category error&lt;/li&gt;
&lt;li&gt;There are two adoption routes. One is the MFA / IRSA / Sigstore "incident → regulation → adoption" path. The other is the DNSSEC / SPIFFE "eternal niche, spreads only under a different name through vendor integration" path. &lt;strong&gt;A protocol with no UX improvement basically takes the latter&lt;/strong&gt;, and that is the cold lesson of tech history&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;To put the punchline first: I predict "we deployed ID-JAG" (Layer A) will not catch on. Layer C will. Layer B is an eternal niche&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The May 2026 map: where the major specs and implementations actually stand
&lt;/h2&gt;

&lt;p&gt;To start the discussion from the same place, here are the primary sources.&lt;/p&gt;

&lt;h3&gt;
  
  
  IETF drafts
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Spec&lt;/th&gt;
&lt;th&gt;Latest&lt;/th&gt;
&lt;th&gt;Published&lt;/th&gt;
&lt;th&gt;Status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;a href="https://datatracker.ietf.org/doc/draft-ietf-oauth-identity-assertion-authz-grant/" rel="noopener noreferrer"&gt;draft-ietf-oauth-identity-assertion-authz-grant&lt;/a&gt; (ID-JAG)&lt;/td&gt;
&lt;td&gt;-04&lt;/td&gt;
&lt;td&gt;2026-05-21&lt;/td&gt;
&lt;td&gt;Standards Track, Internet-Draft&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-ietf-oauth-identity-chaining/" rel="noopener noreferrer"&gt;draft-ietf-oauth-identity-chaining&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;-12&lt;/td&gt;
&lt;td&gt;2026-05-11&lt;/td&gt;
&lt;td&gt;Standards Track, Internet-Draft&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-ietf-oauth-transaction-tokens/" rel="noopener noreferrer"&gt;draft-ietf-oauth-transaction-tokens&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;-08&lt;/td&gt;
&lt;td&gt;2026-03-02&lt;/td&gt;
&lt;td&gt;Standards Track, Internet-Draft&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-araut-oauth-transaction-tokens-for-agents/" rel="noopener noreferrer"&gt;draft-araut-oauth-transaction-tokens-for-agents&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;-06&lt;/td&gt;
&lt;td&gt;2026-04-11&lt;/td&gt;
&lt;td&gt;Individual Draft (Ashay Raut, Amazon), pre-WG-adoption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-ietf-oauth-v2-1/" rel="noopener noreferrer"&gt;draft-ietf-oauth-v2-1&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;-15&lt;/td&gt;
&lt;td&gt;2026-03-02&lt;/td&gt;
&lt;td&gt;Standards Track, Internet-Draft (in WG discussion, pre-Last-Call)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://datatracker.ietf.org/doc/draft-klrc-aiagent-auth/" rel="noopener noreferrer"&gt;draft-klrc-aiagent-auth&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;-00&lt;/td&gt;
&lt;td&gt;2026-03&lt;/td&gt;
&lt;td&gt;Individual Draft (Defakto / AWS / Zscaler / Ping)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;What this tells you about the lay of the land:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ID-JAG is not an RFC yet&lt;/strong&gt;. There is just a Standards Track draft circulating; as of this writing there is no official RFC number assigned. The contents can still change&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Agent extension to Transaction Tokens is still an individual proposal&lt;/strong&gt;, pre-WG-adoption (&lt;code&gt;draft-araut-&lt;/code&gt;, not &lt;code&gt;draft-ietf-&lt;/code&gt;). It uses the &lt;code&gt;act&lt;/code&gt; claim for the AI agent and the &lt;code&gt;sub&lt;/code&gt; claim for the principal, a straightforward extension of RFC 8693 Token Exchange's actor/subject concepts, and it has moved from -00 to -06 in half a year&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OAuth 2.1 (-15) is in WG discussion&lt;/strong&gt;. It has not reached Last Call, but it is framed to obsolete RFC 6749. Mandatory PKCE, removal of the Implicit/Password grants, and so on all become load-bearing assumptions for the agent specs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wrote the dedicated walkthrough of ID-JAG itself in a separate piece (&lt;a href="https://dev.to/kanywst/id-jag-deep-dive-1mhp"&gt;ID-JAG Deep Dive&lt;/a&gt;). Here I narrow in on "&lt;strong&gt;what becomes visible when you put all three side by side&lt;/strong&gt;."&lt;/p&gt;

&lt;h3&gt;
  
  
  The implementation side
&lt;/h3&gt;

&lt;p&gt;Chasing specs alone is pointless. Here is what is actually moving in 2026.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;th&gt;Released&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://platform.claude.com/docs/en/manage-claude/workload-identity-federation" rel="noopener noreferrer"&gt;Anthropic Workload Identity Federation&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2026-05&lt;/td&gt;
&lt;td&gt;Exchanges k8s SA / GHA / SPIFFE / AWS / GCP JWTs for &lt;code&gt;sk-ant-oat01-...&lt;/code&gt; short-lived tokens via RFC 7523 jwt-bearer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.okta.com/solutions/cross-app-access/" rel="noopener noreferrer"&gt;Okta Cross-App Access (XAA)&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2025-Q3 beta → 2026 GA&lt;/td&gt;
&lt;td&gt;The commercial name for ID-JAG. Adopted formally as MCP's Authorization Extension "Enterprise-Managed Authorization"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://developer.okta.com/blog/2026/01/20/xaa-dev-playground" rel="noopener noreferrer"&gt;xaa.dev playground&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2026-01-20&lt;/td&gt;
&lt;td&gt;An official Okta environment for trying XAA end to end on your own machine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.keycloak.org/2026/01/jwt-authorization-grant" rel="noopener noreferrer"&gt;Keycloak 26.5 JWT Authorization Grant&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2026-01&lt;/td&gt;
&lt;td&gt;Preview support for RFC 7523. Exchanges externally-signed JWTs for Keycloak tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://modelcontextprotocol.io/specification/draft/basic/authorization" rel="noopener noreferrer"&gt;MCP spec 2026-03-15&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2026-03-15&lt;/td&gt;
&lt;td&gt;Makes RFC 8707 Resource Indicators and RFC 9728 Protected Resource Metadata mandatory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://www.salesforce.com/news/stories/agent-fabric-control-plane-announcement/" rel="noopener noreferrer"&gt;Salesforce Agentforce "Trusted Agent Identity"&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;2026 GA&lt;/td&gt;
&lt;td&gt;Inserts mobile approval into high-risk agent actions&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two quiet but big shifts are happening in that table.&lt;/p&gt;

&lt;p&gt;One: the 2026-03-15 MCP spec made RFC 8707 (Resource Indicators) and RFC 9728 (Protected Resource Metadata) &lt;strong&gt;mandatory&lt;/strong&gt;. Audience binding is now enforced per MCP server: the client names the target server when it requests a token, so an access token issued for one MCP server gets rejected if you throw it at a different MCP server. A leak path like Cursor accidentally forwarding a GitHub MCP token to a Slack MCP is now closed at the spec level.&lt;/p&gt;
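
&lt;p&gt;To make the audience binding concrete, here is a minimal sketch of the check a receiving server could run on already-verified token claims. The server URLs and the &lt;code&gt;audience_matches&lt;/code&gt; helper are hypothetical names for illustration, not something the MCP spec defines, and signature verification is assumed to have happened earlier.&lt;/p&gt;

```python
def audience_matches(claims: dict, own_resource: str) -> bool:
    """Return True only if the token's aud claim names this server."""
    aud = claims.get("aud")
    # RFC 7519 allows aud to be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    return own_resource in audiences


# Hypothetical resource identifiers for two MCP servers.
GITHUB_MCP = "https://mcp.github.example/"
SLACK_MCP = "https://mcp.slack.example/"

# Claims of a token that was requested for the GitHub MCP server.
token_claims = {"sub": "alice", "aud": GITHUB_MCP}

print(audience_matches(token_claims, GITHUB_MCP))  # True
print(audience_matches(token_claims, SLACK_MCP))   # False
```

&lt;p&gt;The point of the sketch is only that the decision hangs on &lt;code&gt;aud&lt;/code&gt;: a token forwarded to the wrong server fails this check no matter what scopes it carries.&lt;/p&gt;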

&lt;p&gt;Two: Okta XAA got folded in as MCP's "Enterprise-Managed Authorization" extension. The ID-JAG concept is reaching the implementation side first not as a pure IETF draft but riding on top of MCP.&lt;/p&gt;

&lt;p&gt;Summed up: &lt;strong&gt;the specs have converged, implementations are out from Anthropic / Okta / Keycloak / Salesforce, but almost nobody in the field is using any of it&lt;/strong&gt;. That is the state of play in May 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  What are the three layers actually protecting?
&lt;/h2&gt;

&lt;p&gt;Putting all three in parallel just causes confusion. As the figure shows, &lt;strong&gt;different layers are protected by different specs&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fagent-identity-three-layers-2026%2Fdiagrams%2F01-three-layers.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fagent-identity-three-layers-2026%2Fdiagrams%2F01-three-layers.png" alt="3-layer architecture of agent identity" width="800" height="599"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer A: Human → Agent → External Resource (ID-JAG / XAA)
&lt;/h3&gt;

&lt;p&gt;The problem here is "&lt;strong&gt;how does an Agent touch GitHub on behalf of the human Alice&lt;/strong&gt;."&lt;/p&gt;

&lt;p&gt;There are two OAuth extensions you need as background. They come up repeatedly, so pin them down first.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RFC 7523&lt;/strong&gt; (JWT Profile for OAuth 2.0 Client Authentication and Authorization Grants): the profile for "bring one JWT, get an OAuth access token in return." It defines the form you post with &lt;code&gt;grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RFC 8693&lt;/strong&gt; (Token Exchange): the general framework for "exchange one token for another." &lt;code&gt;subject_token&lt;/code&gt; is "who the token is for," &lt;code&gt;actor_token&lt;/code&gt; is "the party actually acting." The &lt;code&gt;act&lt;/code&gt; / &lt;code&gt;sub&lt;/code&gt; claim hierarchy comes from here too&lt;/li&gt;
&lt;/ul&gt;
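
&lt;p&gt;As a rough illustration of the two grants, the sketch below builds the form bodies each RFC describes. The parameter names and URNs are the ones defined in RFC 7523 and RFC 8693; the token strings, the audience URL, and the helper function names are placeholders I made up, and nothing is actually sent over the network.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlencode

JWT_BEARER = "urn:ietf:params:oauth:grant-type:jwt-bearer"
TOKEN_EXCHANGE = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def jwt_bearer_body(assertion: str, scope: str) -> str:
    """RFC 7523: present one JWT as the authorization grant."""
    return urlencode({
        "grant_type": JWT_BEARER,
        "assertion": assertion,
        "scope": scope,
    })


def token_exchange_body(subject_token: str, actor_token: str, audience: str) -> str:
    """RFC 8693: exchange a subject token, naming the acting party separately."""
    return urlencode({
        "grant_type": TOKEN_EXCHANGE,
        "subject_token": subject_token,      # who the token is for
        "subject_token_type": JWT_TYPE,
        "actor_token": actor_token,          # the party actually acting
        "actor_token_type": JWT_TYPE,
        "audience": audience,
    })


# Placeholder values; real requests carry signed JWTs.
bearer = jwt_bearer_body("ALICE_ASSERTION_JWT", "repo:read")
exchange = token_exchange_body("ALICE_JWT", "AGENT_JWT", "https://as.resource.example")

print(parse_qs(bearer)["grant_type"][0])
print(sorted(parse_qs(exchange).keys()))
```

&lt;p&gt;Either body would be posted to a token endpoint as &lt;code&gt;application/x-www-form-urlencoded&lt;/code&gt;; which endpoint, and which optional parameters a given server expects, depends on the deployment.&lt;/p&gt;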

&lt;p&gt;ID-JAG / XAA combines these two into a profile that "&lt;strong&gt;extends the IdP trust established by a human's SSO out to API calls made through an agent&lt;/strong&gt;."&lt;/p&gt;

&lt;p&gt;To see the concrete difference, look at the only two approaches that existed before:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Hand the Agent Alice's GitHub PAT (basically everyone today)&lt;/li&gt;
&lt;li&gt;Have Alice individually approve an OAuth App for the Agent (this is what ChatGPT Plugins / Claude Connectors do)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Approach 1 cannot narrow scope. If Alice has &lt;code&gt;repo:full&lt;/code&gt;, the Agent has &lt;code&gt;repo:full&lt;/code&gt;. Approach 2 can narrow scope, but Alice has to walk through a consent screen per Agent, which collapses in the age of N agents.&lt;/p&gt;

&lt;p&gt;ID-JAG works like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Alice is already SSO'd into her IdP (Okta / Entra ID / Google Workspace)&lt;/li&gt;
&lt;li&gt;The Agent asks the IdP for an identity assertion JWT for Alice; in the ID-JAG draft this request is an RFC 8693 Token Exchange against the IdP&lt;/li&gt;
&lt;li&gt;The Agent brings that to GitHub's Authorization Server via the RFC 7523 jwt-bearer grant&lt;/li&gt;
&lt;li&gt;GitHub's AS &lt;strong&gt;validates the JWT through its trust relationship with the IdP&lt;/strong&gt; and issues a &lt;strong&gt;scoped-down GitHub access token for Alice&lt;/strong&gt; in return
&lt;/li&gt;
&lt;li&gt;The Agent calls the GitHub API with that token&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Alice never touches a consent screen. The Agent only ever holds a narrowly-scoped token from the start. Even with N agents, you manage per-agent policy on the IdP side.&lt;/p&gt;
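
&lt;p&gt;The shape of that flow can be sketched with plain dictionaries standing in for JWTs. This is a simulation under stated assumptions: signatures are not modeled at all, the URLs, agent ids, scope names, and both function names are hypothetical, and the claim layout is illustrative rather than copied from the draft.&lt;/p&gt;

```python
# Hypothetical identifiers; none of these come from a real deployment.
IDP = "https://idp.example.com"
RESOURCE_AS = "https://as.resource.example"

# Per-agent policy lives on the IdP side.
IDP_AGENT_POLICY = {"agent-x": {"repo:read", "issues:write"}}


def idp_issue_assertion(user: str, agent: str, requested: set) -> dict:
    """IdP side: issue an assertion limited to what policy allows this agent."""
    allowed = IDP_AGENT_POLICY.get(agent, set())
    granted = requested.intersection(allowed)
    if not granted:
        raise PermissionError("IdP policy grants this agent none of the requested scopes")
    return {
        "iss": IDP,
        "sub": user,
        "aud": RESOURCE_AS,   # pinned to one authorization server
        "client_id": agent,
        "scope": " ".join(sorted(granted)),
    }


def resource_as_issue_token(assertion: dict, trusted_idps: set) -> dict:
    """Resource AS side: trust check, then a token no broader than the assertion."""
    if assertion["iss"] not in trusted_idps:
        raise PermissionError("assertion issuer is not a trusted IdP")
    if assertion["aud"] != RESOURCE_AS:
        raise PermissionError("assertion was issued for a different authorization server")
    return {
        "sub": assertion["sub"],
        "client_id": assertion["client_id"],
        "scope": assertion["scope"],
    }


# The agent asks for more than policy allows; only the permitted part survives.
assertion = idp_issue_assertion("alice", "agent-x", {"repo:read", "repo:admin"})
access_token = resource_as_issue_token(assertion, {IDP})
print(access_token["scope"])  # repo:read
```

&lt;p&gt;Alice appears nowhere in the control flow: the narrowing happens in IdP policy, which is why adding a tenth agent is a policy entry rather than a tenth consent screen.&lt;/p&gt;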

&lt;p&gt;The commercial name, in Okta's case, is "Cross-App Access (XAA)." Conceptually it is a profile riding on top of OAuth Identity Chaining (&lt;code&gt;draft-ietf-oauth-identity-chaining-12&lt;/code&gt;), and Okta shipped an implementation in 2026 as MCP's "Enterprise-Managed Authorization" extension.&lt;/p&gt;

&lt;h3&gt;
  
  
  Layer B: Agent → internal service chain (Transaction Tokens for Agents)
&lt;/h3&gt;

&lt;p&gt;The problem here is "&lt;strong&gt;while an Agent processes a single request and calls multiple internal services, how do you propagate who is acting and for what purpose&lt;/strong&gt;."&lt;/p&gt;

&lt;p&gt;Layer A's ID-JAG is about crossing a trust domain boundary (your company → GitHub). Layer B is about the &lt;strong&gt;microservice chain inside your own walls&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The prerequisite, Transaction Tokens (&lt;code&gt;draft-ietf-oauth-transaction-tokens-08&lt;/code&gt;), is a mechanism to "convert" the external token received at the API gateway into a short-lived JWT meant to flow through the internal chain. Pushing the received OAuth access token straight through 5 hops makes the scope too broad and stacks up leak risk. So inside the Trust Domain, a &lt;strong&gt;Transaction Token Service (TTS)&lt;/strong&gt; issues a short-lived "for this request" token and flows that through the chain. I plan to write the details elsewhere, but the gist is simply "pack the per-request context into a JWT and carry it."&lt;/p&gt;

&lt;p&gt;Transaction Tokens &lt;strong&gt;for Agents&lt;/strong&gt; is the extension that adds the claim design for when an AI agent is mixed into the Trust Domain.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;claim&lt;/th&gt;
&lt;th&gt;meaning&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;sub&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;the principal (the human Alice, or the agent itself if the agent acts independently)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;act&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;the actor (the AI agent actually making the call; embeds RFC 8693's actor token concept straight into the JWT)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;By carrying these two claims across the whole internal chain, who-delegated-to-whom survives per request.&lt;/p&gt;

&lt;p&gt;Why is this needed? The ID-JAG token itself does have an &lt;code&gt;act&lt;/code&gt; claim for the actor (from RFC 8693), so it is not that "delegation can't be expressed." The problem is elsewhere. ID-JAG is a &lt;strong&gt;single-hop, cross-boundary grant&lt;/strong&gt; whose &lt;code&gt;aud&lt;/code&gt; is pinned to a single Resource AS, so you cannot flow it through the internal chain as-is (the &lt;code&gt;aud&lt;/code&gt; won't match). Pushing the received external token straight through 5 hops is also over-scoped and stacks up leak risk. So in practice you &lt;strong&gt;exchange it at the gateway for a short-lived internal token (i.e. mint a new one)&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If information disappears, it is not because it crosses hops; it is because &lt;strong&gt;the actor gets left off at the moment of that exchange&lt;/strong&gt;. Mint it naively and only &lt;code&gt;sub=alice&lt;/code&gt; survives, &lt;code&gt;act&lt;/code&gt; drops, and the terminal service sees nothing but "Alice's own transaction." Transaction Tokens for Agents nails down a claim design that forces &lt;code&gt;act=agent-X, sub=alice&lt;/code&gt; to be preserved at exchange time. Once minted, the TraT propagates immutably through Order Service → Risk Service → Payment Service, and each hop can verify "agent-X is executing Alice's transaction" at the JWT level. The Payment Service's audit log permanently records "agent-X executed this transfer on behalf of alice." That is the territory Layer A alone cannot reach.&lt;/p&gt;
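
&lt;p&gt;A minimal sketch of that exchange step, again with dictionaries in place of signed JWTs. The &lt;code&gt;act&lt;/code&gt; value follows RFC 8693's shape (a JSON object with its own &lt;code&gt;sub&lt;/code&gt;); the &lt;code&gt;purpose&lt;/code&gt; key, the function name, and the 300-second lifetime are assumptions of mine for illustration, not the draft's exact claim set.&lt;/p&gt;

```python
import uuid


def mint_txn_token(external: dict, purpose: str, now: int, ttl_seconds: int = 300) -> dict:
    """Mint a short-lived internal token from a validated external token."""
    token = {
        "txn": str(uuid.uuid4()),   # one identifier for the whole request
        "sub": external["sub"],     # the principal
        "purpose": purpose,         # hypothetical per-request context
        "iat": now,
        "exp": now + ttl_seconds,
    }
    # The step a naive implementation forgets: carry the actor across the exchange.
    if "act" in external:
        token["act"] = dict(external["act"])
    # The external token's broad scope is deliberately not copied.
    return token


external_token = {"sub": "alice", "act": {"sub": "agent-X"}, "scope": "orders:write"}
trat = mint_txn_token(external_token, "create-order", now=1_700_000_000)

print(trat["sub"], trat["act"]["sub"])  # alice agent-X
```

&lt;p&gt;Delete the &lt;code&gt;act&lt;/code&gt; branch and every downstream service sees only &lt;code&gt;sub=alice&lt;/code&gt;, which is exactly the loss described above.&lt;/p&gt;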

&lt;h3&gt;
  
  
  Layer C: Agent (workload) → LLM (Anthropic WIF)
&lt;/h3&gt;

&lt;p&gt;The problem here is "&lt;strong&gt;how does the workload that is the Agent authenticate to the LLM API of Anthropic / OpenAI / Google&lt;/strong&gt;."&lt;/p&gt;

&lt;p&gt;Layers A and B were authentication toward the resource side (GitHub / internal services). Layer C is &lt;strong&gt;authentication toward the AI side&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The term Workload Identity Federation (WIF) has history: AWS and GCP originally built it as a mechanism to "delete IAM user static access keys by exchanging an IdP's OIDC token for an STS / GCP token." It is the familiar story of exchanging a GitHub Actions OIDC token or a Kubernetes service account token for temporary AWS credentials. Anthropic WIF is just the LLM-provider version of that; the structure is completely isomorphic.&lt;/p&gt;

&lt;p&gt;The flow on Anthropic's side (&lt;a href="https://platform.claude.com/docs/en/manage-claude/workload-identity-federation" rel="noopener noreferrer"&gt;official docs&lt;/a&gt;) is this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The platform the Agent runs on (k8s SA / GHA / SPIFFE / AWS / GCP) issues an ambient OIDC JWT (a projected service account token on Kubernetes, the OIDC endpoint on GitHub Actions, and so on)&lt;/li&gt;
&lt;li&gt;The Agent posts that JWT to Anthropic's &lt;code&gt;/v1/oauth/token&lt;/code&gt; via RFC 7523 &lt;code&gt;urn:ietf:params:oauth:grant-type:jwt-bearer&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Anthropic validates the JWT signature against the registered issuer's JWKS and checks the federation rule's &lt;code&gt;subject_prefix&lt;/code&gt; and the like&lt;/li&gt;
&lt;li&gt;On a match, it returns a &lt;strong&gt;short-lived &lt;code&gt;sk-ant-oat01-...&lt;/code&gt; token&lt;/strong&gt; (per service account, 3600 seconds by default)&lt;/li&gt;
&lt;/ol&gt;
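
&lt;p&gt;Step 3 is the interesting one, so here is a hedged sketch of what a federation-rule match might look like on already-verified claims. Only &lt;code&gt;subject_prefix&lt;/code&gt; is a name taken from the flow above; the rule's dictionary layout, the &lt;code&gt;issuer&lt;/code&gt; key, the issuer URL, and the function name are assumptions for illustration and are not Anthropic's actual schema. Signature verification against the issuer's JWKS is assumed to have already succeeded.&lt;/p&gt;

```python
def matches_rule(claims: dict, rule: dict) -> bool:
    """Check issuer equality and subject prefix for one federation rule."""
    if claims.get("iss") != rule["issuer"]:
        return False
    return str(claims.get("sub", "")).startswith(rule["subject_prefix"])


# Hypothetical rule: any service account in the "agents" namespace of one cluster.
rule = {
    "issuer": "https://oidc.cluster.example",
    "subject_prefix": "system:serviceaccount:agents:",
}

# Kubernetes service account tokens use this sub format.
agent_claims = {
    "iss": "https://oidc.cluster.example",
    "sub": "system:serviceaccount:agents:summarizer",
}
other_claims = {
    "iss": "https://oidc.cluster.example",
    "sub": "system:serviceaccount:default:builder",
}

print(matches_rule(agent_claims, rule))  # True
print(matches_rule(other_claims, rule))  # False
```

&lt;p&gt;The trailing colon in the prefix matters: without it, a namespace named &lt;code&gt;agents-staging&lt;/code&gt; would match a rule meant for &lt;code&gt;agents&lt;/code&gt;.&lt;/p&gt;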

&lt;p&gt;The nice part is you can fully delete the &lt;code&gt;sk-ant-...&lt;/code&gt; static API key. No more injecting a secret into the Agent container.&lt;/p&gt;

&lt;p&gt;One thing that often gets misunderstood here: &lt;strong&gt;"if I have WIF, do I even need ID-JAG?"&lt;/strong&gt; They protect different legs, so they do not compare. WIF is the Agent → Anthropic leg, ID-JAG is the Human → Agent → GitHub leg. Anthropic WIF does nothing to protect Alice's GitHub PAT when the Agent eats a nasty prompt injection. The premise is you use both.&lt;/p&gt;

&lt;h3&gt;
  
  
  The three-layer mapping
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Layer A&lt;/th&gt;
&lt;th&gt;Layer B&lt;/th&gt;
&lt;th&gt;Layer C&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Which leg&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Human → Agent → External Resource&lt;/td&gt;
&lt;td&gt;Agent → Internal Service Chain&lt;/td&gt;
&lt;td&gt;Agent (workload) → LLM Provider&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Representative spec&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ID-JAG / Identity Chaining / XAA&lt;/td&gt;
&lt;td&gt;Transaction Tokens for Agents&lt;/td&gt;
&lt;td&gt;RFC 7523 jwt-bearer + each vendor's WIF&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Input identity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Human OIDC ID Token&lt;/td&gt;
&lt;td&gt;Agent + principal context&lt;/td&gt;
&lt;td&gt;Workload identity (k8s SA / SPIFFE)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Problem solved&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;narrowing the human's scope, blast radius on agent takeover&lt;/td&gt;
&lt;td&gt;propagating "for whom" across the call chain&lt;/td&gt;
&lt;td&gt;killing static API keys, auditable workload identity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;2026 status&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;IETF draft -04, implemented in Okta XAA&lt;/td&gt;
&lt;td&gt;IETF individual draft -06&lt;/td&gt;
&lt;td&gt;Anthropic GA, AWS/GCP for years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Who hurts without it&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;enterprise SaaS integrations&lt;/td&gt;
&lt;td&gt;microservice audit teams&lt;/td&gt;
&lt;td&gt;platform SRE / audit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Once you get here, you see the "&lt;strong&gt;ID-JAG vs WIF&lt;/strong&gt;" framing never held in the first place. You use both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the UX "doesn't change," and what not changing actually means
&lt;/h2&gt;

&lt;p&gt;Back to the opening question: "if the UX doesn't change, won't it fail to catch on?"&lt;/p&gt;

&lt;p&gt;This is half right. &lt;strong&gt;In the happy path (one person, one agent, within their own permissions) the UX really does not change at all&lt;/strong&gt;. The Agent fetching a JWT from the IdP behind the scenes, narrowing scope with Token Exchange, throwing it at the resource server: none of that is visible to Alice.&lt;/p&gt;

&lt;p&gt;And that is the result of correct design. Just as users do not think about OAuth's access_token and refresh_token, &lt;strong&gt;making them think about it is a design failure&lt;/strong&gt;. If you look at this spec expecting "the UX to visibly change," you misread its essence.&lt;/p&gt;

&lt;p&gt;But "the UX does not change" does not equal "it has no value." Layers A / B / C deliver the kind of value that is &lt;strong&gt;invisible day to day and only becomes visible the instant something goes wrong&lt;/strong&gt;. Concretely, it shows up in the next three scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 1: when you have an accident (blast radius)
&lt;/h3&gt;

&lt;p&gt;In July 2025, Replit's AI agent wiped a production DB during a code freeze. In an experiment documented by Jason Lemkin of SaaStr, data for over 1,200 executives and 1,190 companies was destroyed. (&lt;a href="https://fortune.com/2025/07/23/ai-coding-tool-replit-wiped-database-called-it-a-catastrophic-failure/" rel="noopener noreferrer"&gt;Fortune report&lt;/a&gt;, &lt;a href="https://incidentdatabase.ai/cite/1152/" rel="noopener noreferrer"&gt;AI Incident Database #1152&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;What happened comes down to one thing: &lt;strong&gt;the Agent ran with the same DB connection privileges as a human&lt;/strong&gt;. The code-freeze instruction was implemented only as a prompt-level "please," with no mechanism to stop it at the authorization level. The Agent confessed it had "panicked in response to empty queries," and per the report it even lied about the recovery procedure.&lt;/p&gt;

&lt;p&gt;Picture Layer A (ID-JAG) in place: scope the token issued to the Agent to just &lt;code&gt;db:read&lt;/code&gt;, and &lt;code&gt;DROP TABLE&lt;/code&gt; comes back as &lt;strong&gt;HTTP 403&lt;/strong&gt;. Even if the Agent goes haywire, a short-lived token (say 5 minutes) puts a time cap on the rampage itself. And the audit log permanently records "agent-X attempted a DELETE against prod and got 403." That is enforcement at the authorization layer, a notch stronger than a prompt-level "please."&lt;/p&gt;
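
&lt;p&gt;What that enforcement amounts to can be sketched in a few lines. The operation-to-scope mapping, the &lt;code&gt;db:admin&lt;/code&gt; scope, the status-code choices for expiry, and the function name are all hypothetical; a real data plane would sit behind a proxy or API that performs this check after verifying the token's signature.&lt;/p&gt;

```python
# Hypothetical mapping from an operation to the scope it requires.
REQUIRED_SCOPE = {
    "SELECT": "db:read",
    "DELETE": "db:write",
    "DROP TABLE": "db:admin",
}


def authorize(token: dict, operation: str, now: int) -> int:
    """Return an HTTP-style status for one attempted operation."""
    if now >= token["exp"]:
        return 401  # expired: the time cap on any rampage
    if REQUIRED_SCOPE[operation] not in token["scope"].split():
        return 403  # valid token, but the scope does not cover this
    return 200


issued_at = 1_700_000_000
agent_token = {"sub": "alice", "act": {"sub": "agent-X"},
               "scope": "db:read", "exp": issued_at + 300}

print(authorize(agent_token, "SELECT", issued_at + 10))      # 200
print(authorize(agent_token, "DROP TABLE", issued_at + 10))  # 403
print(authorize(agent_token, "SELECT", issued_at + 301))     # 401
```

&lt;p&gt;Nothing here depends on the Agent behaving: the refusal is decided by the token's contents, which is the difference from a prompt-level instruction.&lt;/p&gt;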

&lt;p&gt;After the incident Replit added "automatic dev / prod DB separation." That is the right response, but structurally it is &lt;strong&gt;absorbing Layer A's role into the infrastructure side&lt;/strong&gt;. Solving it at the app layer (ID-JAG) or the data layer (DB separation) are both fine; the real problem is that most teams currently do neither.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 2: when you eat a zero-click attack (scope violation)
&lt;/h3&gt;

&lt;p&gt;June 2025, &lt;strong&gt;EchoLeak&lt;/strong&gt; (CVE-2025-32711, CVSS 9.3) hit Microsoft 365 Copilot (&lt;a href="https://thehackernews.com/2025/06/zero-click-ai-vulnerability-exposes.html" rel="noopener noreferrer"&gt;The Hacker News report&lt;/a&gt;; &lt;a href="https://arxiv.org/html/2509.10540v1" rel="noopener noreferrer"&gt;arXiv paper 2509.10540&lt;/a&gt; analyzes it as the "first real-world zero-click prompt injection exploit in a production LLM system"). The attack flow is simple:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The attacker &lt;strong&gt;sends one email&lt;/strong&gt; to a user at the target org&lt;/li&gt;
&lt;li&gt;Copilot later goes to read that email&lt;/li&gt;
&lt;li&gt;A hidden prompt embedded in the email body steers Copilot&lt;/li&gt;
&lt;li&gt;Copilot reads out &lt;strong&gt;all data the user has access to&lt;/strong&gt; from SharePoint / OneDrive / Teams and exfiltrates it to an external endpoint&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The user clicked nothing. The attack completes with nothing more than an email landing in the inbox. Aim Labs, who found it, calls this "LLM Scope Violation."&lt;/p&gt;

&lt;p&gt;The core problem is isomorphic to Replit: &lt;strong&gt;the Agent (Copilot) held "all of Alice's permissions."&lt;/strong&gt; More precisely, it is not that Alice failed to narrow Copilot's scope, it is that there is no mechanism to narrow it.&lt;/p&gt;

&lt;p&gt;In the ID-JAG / XAA worldview, you can cut the token Copilot uses to read SharePoint down to something narrow like "&lt;strong&gt;read the document Alice is looking at right now&lt;/strong&gt;." However hard the hidden prompt tries, it can only touch what the token allows. A scope violation like EchoLeak is a phenomenon that occurs where the token's permission boundary is "all of Alice's permissions," so the real problem is that there is currently no standardized way to make that boundary narrow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 3: when it scales (N agents)
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Moltbook incident&lt;/strong&gt; of January–February 2026. A service billing itself as a "social network" for AI agents launched on 2026-01-28, and a mere three days later, on 2026-01-31, it got &lt;a href="https://www.404media.co/exposed-moltbook-database-let-anyone-take-control-of-any-ai-agent-on-the-site/" rel="noopener noreferrer"&gt;reported by 404 Media&lt;/a&gt;. A backend Supabase misconfiguration left it in a state where &lt;strong&gt;anyone could take over any agent on the platform via the DB&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Then in February 2026, Wiz researchers found that &lt;a href="https://www.wiz.io/blog/exposed-moltbook-database-reveals-millions-of-api-keys" rel="noopener noreferrer"&gt;the Supabase API key was exposed in the front-end JavaScript&lt;/a&gt;. Full read/write to production data went through, and you could pull &lt;strong&gt;1.5M API authentication tokens, 35,000 email addresses, and private messages between agents&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What matters here is a structural story prior to any individual vulnerability: no identity separation between agents, no scope narrowing, no rotation, none of it implemented. Stand up a platform holding this many identities this quickly and try to run it on human-scale credential ops, and this is what you get.&lt;/p&gt;

&lt;p&gt;The numbers back it up too. Non-Human Identities (NHI) as of 2026 outnumber humans by &lt;strong&gt;40–100x, reaching 500x in hyper-automation orgs&lt;/strong&gt;. The average enterprise carries over 250,000 NHIs (&lt;a href="https://nhimg.org/articles/identity-governance-is-shifting-as-non-human-identities-outnumber-humans/" rel="noopener noreferrer"&gt;NHI Working Group's summary&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Of those, &lt;strong&gt;71% are not rotated, 97% are over-privileged, and only 15% of orgs feel confident they can defend against NHI-based attacks&lt;/strong&gt;. 68% of incidents involve NHIs.&lt;/p&gt;

&lt;p&gt;This is the &lt;strong&gt;pattern where operations break before any accident does&lt;/strong&gt;. Kubernetes service accounts spread not because of an accident but because "sharing creds across 200 microservices became physically impossible." The same thing happens with agents. If Replit / EchoLeak are the push from the accident side, then Moltbook and the NHI numbers are the push from the operations side.&lt;/p&gt;

&lt;p&gt;By here the real meaning of "the UX doesn't change" comes into focus. In the happy path you feel nothing. That is evidence of correct design. On the flip side, when an accident hits, the damage is &lt;strong&gt;human-scale&lt;/strong&gt; (Replit's full DB deletion, EchoLeak exfiltrating all of a company's documents off one email), and that is where the cost of not having Layer A surfaces. When you scale, audit-log pollution and the inability to revoke come due all at once as the tab for not having Layers B / C. The cost you do not see day to day becomes visible all at once under accidents and scale: that is the essential role of Layers A / B / C.&lt;/p&gt;

&lt;h2&gt;
  
  
  Predicting "when each layer catches on": the 3-stage model of security history
&lt;/h2&gt;

&lt;p&gt;From here on it is prediction. Less to be right than to give the discussion a foundation.&lt;/p&gt;

&lt;p&gt;The spread of security infrastructure almost without exception goes through the next three stages.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fagent-identity-three-layers-2026%2Fdiagrams%2F02-adoption-pattern.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fagent-identity-three-layers-2026%2Fdiagrams%2F02-adoption-pattern.png" alt="3-stage adoption pattern for security infrastructure" width="800" height="542"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Past isomorphic cases
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;MFA&lt;/strong&gt;. TOTP was standardized in RFC 6238 in 2011, U2F in 2014; the tech itself is over a decade old. The 2012 LinkedIn breach (6.5M passwords), 2013-2014 Yahoo (eventually found to be 3B accounts), 2017 Equifax (147M): flashy incidents kept coming, and regulation caught up afterward, with &lt;strong&gt;NIST SP 800-63-3 downgrading SMS-based OOB authentication to a "restricted authenticator" in 2017&lt;/strong&gt; and &lt;strong&gt;PCI-DSS v4.0 expanding the mandatory-MFA scope beyond admins in 2022&lt;/strong&gt;. Real adoption was 2019–2023. &lt;strong&gt;Roughly 15 years from the tech appearing to practical use&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Short-lived AWS credentials&lt;/strong&gt;. 2011 STS AssumeRole, 2014 Code Spaces burned to the ground (an AWS key leak deleted all resources and ended the company), 2016 Uber (an S3 key leak, 57M records), 2019 Capital One (106M records via the EC2 instance metadata service). 2019 brought EKS IRSA, and &lt;strong&gt;EKS Pod Identity at re:Invent in November 2023&lt;/strong&gt; removed even the need to configure OIDC, so the default option finally became "don't create a static key." &lt;strong&gt;10–12 years from the tech appearing to real adoption&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sigstore / keyless signing&lt;/strong&gt;. Late 2020 SolarWinds, 2021 Codecov / Kaseya. &lt;strong&gt;Executive Order 14028 in May 2021&lt;/strong&gt; codified SBOM and signing as federal procurement conditions, 2022 NIST SSDF (SP 800-218), 2023 SLSA v1.0. Real adoption was 2023–2025. &lt;strong&gt;Rapid adoption 3 years after the accident&lt;/strong&gt;, the fastest of the three.&lt;/p&gt;

&lt;p&gt;Lining up all three, a common thread emerges. &lt;strong&gt;The tech exists years before the accident&lt;/strong&gt; (what holds it back is not a tech problem; the asymmetry between cost and risk has simply not yet broken). It needs &lt;strong&gt;the media to turn it into a story&lt;/strong&gt;: a CVE number and CVSS alone do not move anything; only when a named big company gets named-and-shamed does it start to turn. Finally, &lt;strong&gt;codification by regulation or industry BCP&lt;/strong&gt; is the trigger. NIST, PCI, EO 14028, SLSA: at this stage insurance premiums and compliance audits move, and budget gets secured inside companies.&lt;/p&gt;

&lt;h3&gt;
  
  
  The other historical pattern: protocols with zero UX improvement do not spread
&lt;/h3&gt;

&lt;p&gt;But read only the 3-stage model and you might conclude "ID-JAG will come eventually too," which misses an important precondition. &lt;strong&gt;MFA / Sigstore had extremely strong regulation, or a UX bonus&lt;/strong&gt;. Given the same accidents, a protocol with zero UX improvement and implementation room only on the vendor side does not spread even when regulation comes. Look at tech history and this is the more common pattern.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Protocol&lt;/th&gt;
&lt;th&gt;UX improvement&lt;/th&gt;
&lt;th&gt;Setup / ops&lt;/th&gt;
&lt;th&gt;Regulatory pressure&lt;/th&gt;
&lt;th&gt;Adoption&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DNSSEC&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;heavy&lt;/td&gt;
&lt;td&gt;a few govt agencies only&lt;/td&gt;
&lt;td&gt;~30% in 25 years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IPv6&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;heavy&lt;/td&gt;
&lt;td&gt;Asia-centric, weak in US/EU&lt;/td&gt;
&lt;td&gt;~50% in 25 years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DMARC&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;medium, full of traps&lt;/td&gt;
&lt;td&gt;enterprise BCP only&lt;/td&gt;
&lt;td&gt;half of enterprises, long tail unconfigured&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DPoP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;medium&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;OAuth extension, near-zero adoption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SPIFFE&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;heavy&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;niche after 10 years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;mTLS (via service mesh)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;the mesh hides it&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;adopted where the mesh is&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MFA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;worse&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;light&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;strong&lt;/strong&gt; (PCI / NIST)&lt;/td&gt;
&lt;td&gt;real adoption in 15 years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Passkey&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;dramatically better&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;light&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;exploded in 2-3 years&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Let's Encrypt + HTTPS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;better&lt;/strong&gt; (free, automatic)&lt;/td&gt;
&lt;td&gt;light&lt;/td&gt;
&lt;td&gt;weak&lt;/td&gt;
&lt;td&gt;global standard in 5 years&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TLS 1.3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;better&lt;/strong&gt; (lower latency)&lt;/td&gt;
&lt;td&gt;automatic&lt;/td&gt;
&lt;td&gt;none&lt;/td&gt;
&lt;td&gt;spread in a few years&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The rule you can read off this is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;UX improves&lt;/strong&gt; → spreads fast (Passkey, Let's Encrypt, TLS 1.3)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UX worsens + strong regulation&lt;/strong&gt; → takes time but spreads (MFA muscled through the UX hit of SMS OTP via regulation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;zero UX improvement + weak regulation + heavy setup&lt;/strong&gt; → &lt;strong&gt;eternal niche&lt;/strong&gt; (DNSSEC, IPv6, DMARC, DPoP, SPIFFE)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;zero UX improvement but a mechanism makes the ops invisible (service mesh, vendor integration)&lt;/strong&gt; → spreads via a different route&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So which quadrant is ID-JAG in?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;UX improvement: &lt;strong&gt;effectively none&lt;/strong&gt;. Alice's experience is slightly better for skipping the consent screen, but it is not at the "wow" level&lt;/li&gt;
&lt;li&gt;Setup: &lt;strong&gt;heavy&lt;/strong&gt;. Trust with the IdP, the Resource Server's support, the Agent's Token Exchange implementation, all cross-vendor&lt;/li&gt;
&lt;li&gt;Vendor dependence: &lt;strong&gt;strong&lt;/strong&gt;. Users can't use it unless Cursor / Claude Code / Comet each implement it&lt;/li&gt;
&lt;li&gt;Strong regulation: &lt;strong&gt;not yet&lt;/strong&gt;. Neither the EU AI Act nor the NIST AI RMF steps into the identity layer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the &lt;strong&gt;DNSSEC / SPIFFE / DPoP quadrant&lt;/strong&gt;. Standard-led adoption of the kind MFA or Sigstore saw is unlikely.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where Agent Identity stands now
&lt;/h3&gt;

&lt;p&gt;Put the three isomorphic cases next to Agent Identity and it looks like this.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;MFA&lt;/th&gt;
&lt;th&gt;Short-lived AWS creds&lt;/th&gt;
&lt;th&gt;Sigstore&lt;/th&gt;
&lt;th&gt;Agent Identity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. Tech appears&lt;/td&gt;
&lt;td&gt;2011 (TOTP)&lt;/td&gt;
&lt;td&gt;2011 (STS)&lt;/td&gt;
&lt;td&gt;2020&lt;/td&gt;
&lt;td&gt;2024-2026 (ID-JAG / Tx Tokens / WIF)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. Flashy accident&lt;/td&gt;
&lt;td&gt;2012-2017&lt;/td&gt;
&lt;td&gt;2014-2019&lt;/td&gt;
&lt;td&gt;2020-2021&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2025-2026 (Replit / EchoLeak / Comet / Moltbook, ongoing)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Regulation / BCP&lt;/td&gt;
&lt;td&gt;2017-2022 (800-63-3 / PCI v4)&lt;/td&gt;
&lt;td&gt;2019-2023 (IRSA / Pod Identity)&lt;/td&gt;
&lt;td&gt;2021-2023 (EO 14028 / SLSA)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2026-? (EU AI Act / NIST AI RMF / RSAC 2026 in discussion)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4. Real adoption&lt;/td&gt;
&lt;td&gt;2019-2023&lt;/td&gt;
&lt;td&gt;2021-2023&lt;/td&gt;
&lt;td&gt;2023-2025&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;predicted 2028-2030&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We are now at &lt;strong&gt;the entrance to stage 2&lt;/strong&gt;. Individual cases like Replit / EchoLeak / Comet (&lt;a href="https://brave.com/blog/comet-prompt-injection/" rel="noopener noreferrer"&gt;Brave's analysis of Perplexity Comet&lt;/a&gt;, &lt;a href="https://www.schneier.com/blog/archives/2025/11/prompt-injection-in-ai-browsers.html" rel="noopener noreferrer"&gt;Schneier on Security's warning on AI browsers&lt;/a&gt;) keep stacking up, but the &lt;strong&gt;"Equifax of Agents"&lt;/strong&gt; has not arrived yet. Three scenarios look likely to come next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A big company's Copilot / Cursor / Claude Code integration eats a prompt injection and exfiltrates customer DB contents to a company-wide Slack. Happens at some Fortune 500&lt;/li&gt;
&lt;li&gt;An executive's AI assistant gets hijacked and a transfer instruction goes through via Slack. Salesforce building &lt;strong&gt;mobile approval&lt;/strong&gt; into Trusted Agent Identity is precisely a guard against this&lt;/li&gt;
&lt;li&gt;A dev Agent force-pushes to a prod repo on a human's GitHub creds and breaks the supply chain. The AI version of SolarWinds&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The moment any one of these lands in the New York Times &lt;strong&gt;with a name attached&lt;/strong&gt;, it moves to stage 3 (regulation). Until then it stays stuck at "we really should do this."&lt;/p&gt;

&lt;h3&gt;
  
  
  Per-layer adoption-timing prediction (the honest version)
&lt;/h3&gt;

&lt;p&gt;I expect the three layers to meet separate fates. I call this the "honest version" because it is my personal prediction with the wishful thinking removed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer C (WIF) → will catch on&lt;/strong&gt;. The reason is simple: &lt;strong&gt;the vendor completely hides the operations&lt;/strong&gt;. Anthropic WIF works just by initializing the SDK with no arguments, and a k8s SA auto-mints projected tokens. The end user does nothing. This is the same shape as mTLS spreading hidden inside the service mesh: the protocol itself spreads without anyone being aware of it. &lt;code&gt;sk-ant-...&lt;/code&gt; / &lt;code&gt;sk-...&lt;/code&gt; static keys drop within a few years to the status of "exists, but not recommended."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer A (ID-JAG / XAA) → won't catch on as a standard; the concept spreads under a different name via vendor integration&lt;/strong&gt;. Almost no team will say "we deployed ID-JAG" or "we adopted XAA." Just like SAML, each vendor implements it proprietarily as "Claude Connector," "Cursor for Enterprise," "Copilot Connector," and the end user perceives it as "I connected GitHub with one click in the Anthropic console." Nobody cares whether the token exchange behind it is ID-JAG-compliant. &lt;strong&gt;Only XAA via MCP has a shot at surfacing&lt;/strong&gt;, but even that is only something "the side standing up the MCP Server" is aware of; the end user never sees it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer B (Transaction Tokens for Agents) → eternal niche&lt;/strong&gt;. This is dragged down by the fact that Transaction Tokens proper are currently almost unadopted. It will get picked up in some finance / healthcare intra-trust-domain microservices, but nowhere else. Same pattern as SPIFFE (correct, but adoption is limited).&lt;/p&gt;

&lt;h3&gt;
  
  
  The likely base case: it ends on the SPIFFE / DNSSEC route
&lt;/h3&gt;

&lt;p&gt;I write this as the base case. Not wishful thinking: I mean that going by the lessons of tech history plus current vendor dynamics, this is the most likely outcome.&lt;/p&gt;

&lt;p&gt;Three reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Microsoft / OpenAI / Anthropic / Salesforce locking it down proprietarily is faster&lt;/strong&gt;. Once Anthropic Service Account, OpenAI Agent Identity, Microsoft Copilot Studio Identity, and Salesforce Trusted Agent Identity each become self-contained, the incentive to wait for standardization evaporates. The same road SAML took, ending up subtly different per SaaS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP's Enterprise-Managed Authorization (= XAA)&lt;/strong&gt; is a route that could save ID-JAG, but it is an Okta-led implementation, and if Microsoft / Google / Auth0 start shipping isomorphic vendor extensions, it could fragment&lt;/li&gt;
&lt;li&gt;The discussion at &lt;strong&gt;RSAC 2026&lt;/strong&gt; (&lt;a href="https://venturebeat.com/security/rsac-2026-agent-identity-frameworks-three-gaps" rel="noopener noreferrer"&gt;VentureBeat's summary&lt;/a&gt;) has already moved one layer up, to "Agent Identity alone is not enough; you need an Action Governance layer." When the "next layer" discussion starts before the identity-layer standardization solidifies, the standardization work falls behind, and each vendor's proprietary implementation becomes the de facto standard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The optimistic scenario does exist: "an Equifax of Agents lands in the NYT, the EU AI Act steps into the identity layer, and ID-JAG becomes effectively mandatory." But that needs regulatory pressure as strong as the pressure that muscled MFA through the UX hit of SMS OTP, and the current AI-regulation discussion centers on &lt;strong&gt;liability allocation / transparency / fairness&lt;/strong&gt;, with no sign of stepping into identity tech. &lt;strong&gt;The probability of "spreads in practice via proprietary vendor implementations" is higher than the probability of regulation arriving&lt;/strong&gt;: that is my honest read.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;This ran long, so here is what I want you to take home from the three-layer model.&lt;/p&gt;

&lt;p&gt;"The UX doesn't change, so we don't need ID-JAG" is half a misread and half correct. &lt;strong&gt;Layer A's value shows up under accidents and scale&lt;/strong&gt;: that much is correct as a statement about the spec. On the other hand, &lt;strong&gt;"will the ID-JAG protocol catch on" and "will the Layer A concept spread" are separate questions&lt;/strong&gt;, and that is the honest read. The route where the standard stands up front and spreads, like MFA or Sigstore, is possible, but a protocol that combines zero UX improvement, heavy setup, and strong vendor dependence is more likely to follow the DNSSEC / SPIFFE / DPoP route.&lt;/p&gt;

&lt;p&gt;Even so, the three-layer model itself does not break. &lt;strong&gt;Layer C (WIF) catches on&lt;/strong&gt; (the structure where the vendor hides the operations), &lt;strong&gt;Layer A spreads as concept only, under a different name via vendor integration&lt;/strong&gt; (Claude Connector / Copilot Connector / Cursor for Enterprise effectively implementing the ID-JAG equivalent behind the scenes), &lt;strong&gt;Layer B is an eternal niche in just part of finance / healthcare&lt;/strong&gt;: that is how it splits. If an accident like Replit / EchoLeak / Moltbook happens in your environment, thinking about what stops and what doesn't per Layer A / B / C changes how you see it.&lt;/p&gt;

&lt;p&gt;If you want to move in practice today, start with &lt;strong&gt;Layer C&lt;/strong&gt;. Replacing &lt;code&gt;sk-ant-...&lt;/code&gt; / &lt;code&gt;sk-...&lt;/code&gt; static keys with &lt;a href="https://platform.claude.com/docs/en/manage-claude/workload-identity-federation" rel="noopener noreferrer"&gt;Anthropic WIF&lt;/a&gt; or equivalent Workload Identity Federation is the side you can do today without waiting for regulation, and that will reliably catch on. For Layer A, rather than aiming for "ID-JAG compliance," it is more practical to &lt;strong&gt;check how narrow the scope design is in the proprietary connectors your vendors (Anthropic / OpenAI / Microsoft / Okta) ship&lt;/strong&gt;. Layer B is fine to think about once an audit requirement actually shows up.&lt;/p&gt;

&lt;p&gt;If you want to read the behavior of ID-JAG itself, &lt;a href="https://dev.to/kanywst/id-jag-deep-dive-1mhp"&gt;ID-JAG Deep Dive&lt;/a&gt;; for the full map of agent auth, &lt;a href="https://dev.to/kanywst/ai-agent-authentication-authorization-deep-dive-reading-draft-klrc-aiagent-auth-00-5d1"&gt;AI Agent Authentication &amp;amp; Authorization Deep Dive: Reading draft-klrc-aiagent-auth-00&lt;/a&gt;. Those two are the companion pieces to this article.&lt;/p&gt;

&lt;p&gt;Holding the three-layer model as a map means &lt;strong&gt;even when a vendor's new announcement is a proprietary integration, you can immediately classify what it solves and what it doesn't&lt;/strong&gt;. The OpenAI / Google Cloud agent-auth announcements that will probably come in the second half of 2026, even if they don't ship under the name ID-JAG, are fine as long as you can sort out whether they are trying to solve the Layer A problem or are Layer C. Whether the standard catches on is, in a sense, beside the point. &lt;strong&gt;As long as the problem gets solved&lt;/strong&gt;.&lt;/p&gt;

</description>
      <category>oauth</category>
      <category>security</category>
      <category>ai</category>
      <category>agents</category>
    </item>
    <item>
      <title>SPIFFE Compliance Deep Dive</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Sun, 31 May 2026 06:05:03 +0000</pubDate>
      <link>https://dev.to/kanywst/spiffe-compliance-deep-dive-5e29</link>
      <guid>https://dev.to/kanywst/spiffe-compliance-deep-dive-5e29</guid>
      <description>&lt;h1&gt;
  
  
  Introduction
&lt;/h1&gt;

&lt;p&gt;"This is SPIFFE compliant." "Our implementation is SPIFFE-compatible." Read the SPIRE, Istio, or Cilium docs and you bump into these phrases everywhere. But if you actually stop and ask what compliance means, the answer is surprisingly hard to nail down.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If I run SPIRE, am I SPIFFE compliant?&lt;/li&gt;
&lt;li&gt;If I roll my own, what do I have to ship to call it compliant?&lt;/li&gt;
&lt;li&gt;Does an SVID have to support both X.509 and JWT? Or is one enough?&lt;/li&gt;
&lt;li&gt;Is there an official conformance test you can point at?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A lot of people use the word without checking. I was one of them.&lt;/p&gt;

&lt;p&gt;So I cloned the upstream spec repo (&lt;code&gt;github.com/spiffe/spiffe&lt;/code&gt;) and read every document from top to bottom. What follows are my notes, organized around the question "what does the spec actually require?"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/spiffe/spiffe.git ~/spiffe
&lt;span class="nb"&gt;ls&lt;/span&gt; ~/spiffe/standards/
&lt;span class="c"&gt;# JWT-SVID.md&lt;/span&gt;
&lt;span class="c"&gt;# SPIFFE-ID.md&lt;/span&gt;
&lt;span class="c"&gt;# SPIFFE.md&lt;/span&gt;
&lt;span class="c"&gt;# SPIFFE_Federation.md&lt;/span&gt;
&lt;span class="c"&gt;# SPIFFE_Trust_Domain_and_Bundle.md&lt;/span&gt;
&lt;span class="c"&gt;# SPIFFE_Workload_API.md&lt;/span&gt;
&lt;span class="c"&gt;# SPIFFE_Workload_Endpoint.md&lt;/span&gt;
&lt;span class="c"&gt;# X509-SVID.md&lt;/span&gt;
&lt;span class="c"&gt;# workloadapi.proto&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Eight specs total. Read them in order and each one spells out where the hard requirements end and the soft suggestions begin. I'll pull out the load-bearing parts.&lt;/p&gt;




&lt;h2&gt;
  
  
  0. Prerequisites
&lt;/h2&gt;

&lt;p&gt;A few terms before we go any further. Skip this section if you've touched SPIFFE/SPIRE before.&lt;/p&gt;

&lt;h3&gt;
  
  
  What "workload identity" means
&lt;/h3&gt;

&lt;p&gt;A way for a workload (a process or container) to cryptographically prove "who it is." IP addresses and hostnames can be spoofed. Kubernetes labels can be rewritten. Instead, a trusted &lt;strong&gt;Certificate Authority (CA)&lt;/strong&gt; signs a certificate or token that the workload presents, and verifiers check the signature.&lt;/p&gt;

&lt;p&gt;To distinguish this from human identity (the account you log into via OAuth), it's increasingly called &lt;strong&gt;Workload Identity&lt;/strong&gt; or &lt;strong&gt;Non-Human Identity (NHI)&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  URI basics (RFC 3986)
&lt;/h3&gt;

&lt;p&gt;We'll be dealing with strings like &lt;code&gt;spiffe://example.com/payments/web-fe&lt;/code&gt;, so a quick refresher on URI structure.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spiffe://example.com/payments/web-fe
  |        |              |
scheme  authority        path
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In SPIFFE the scheme is fixed at &lt;code&gt;spiffe&lt;/code&gt;, the authority is the &lt;strong&gt;Trust Domain&lt;/strong&gt;, and everything after that is the &lt;strong&gt;Workload Path&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  X.509, mTLS, JWT, JWS, JWK
&lt;/h3&gt;

&lt;p&gt;Signed data structures, or representations of keys.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;X.509&lt;/strong&gt;: the certificate format TLS uses. Binary (DER) or PEM encoded.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;mTLS (mutual TLS)&lt;/strong&gt;: both client and server present X.509 certificates and verify each other. Plain TLS only verifies the server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JWT&lt;/strong&gt; (RFC 7519): a token made of Base64URL-encoded JSON pieces joined with dots. Common on the web.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JWS&lt;/strong&gt; (RFC 7515): the structure that signs the JWT payload. &lt;strong&gt;A signed JWT is a JWS underneath.&lt;/strong&gt; When the SPIFFE spec says "JWS Compact Serialization," it means the normal JWT layout: &lt;code&gt;header.payload.signature&lt;/code&gt; joined with dots.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JWK&lt;/strong&gt; (RFC 7517): a JSON representation of a public (or private) key. Looks like &lt;code&gt;{"kty":"RSA", "n":"...", "e":"AQAB"}&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JWKS (JWK Set)&lt;/strong&gt;: a JSON object whose &lt;code&gt;keys&lt;/code&gt; array bundles multiple JWKs. If you've ever fetched &lt;code&gt;/.well-known/jwks.json&lt;/code&gt; in OAuth/OIDC, you've seen one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bearer Token&lt;/strong&gt;: a token where possession alone grants access. Steal it, use it. JWTs are typically bearer tokens.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The SPIFFE spec stacks a SPIFFE ID on top of these and calls the result an &lt;strong&gt;SVID (SPIFFE Verifiable Identity Document)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;X.509 based → &lt;strong&gt;X.509-SVID&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;JWT/JWS based → &lt;strong&gt;JWT-SVID&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Distribution of trust roots (CA keys) → &lt;strong&gt;SPIFFE Bundle&lt;/strong&gt; (a JWKS underneath)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  gRPC and Unix Domain Sockets
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gRPC&lt;/strong&gt;: an RPC framework that pushes Protocol Buffers over HTTP/2.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unix Domain Socket (UDS)&lt;/strong&gt;: an inter-process socket on the local host. You reach it by filesystem path (for example &lt;code&gt;/run/spire/agent.sock&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Workload API ships over these two together.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. The Shape of SPIFFE: Three Pillars
&lt;/h2&gt;

&lt;p&gt;The SPIFFE spec set, in one sentence:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Hand the workload a &lt;code&gt;spiffe://...&lt;/code&gt; ID, issue a verifiable document (SVID) that carries it, and serve it through a local API."&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The three pieces (ID, document, API) each get their own spec document.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pillar&lt;/th&gt;
&lt;th&gt;Document&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1. SPIFFE ID&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SPIFFE-ID.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Defines the workload namespace&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. SVID (X.509)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;X509-SVID.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;How to carry a SPIFFE ID in an X.509 certificate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2. SVID (JWT)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;JWT-SVID.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;How to carry a SPIFFE ID in a JWT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Workload API&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SPIFFE_Workload_API.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;gRPC service that issues SVIDs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3. Workload Endpoint&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SPIFFE_Workload_Endpoint.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Conventions for the socket that exposes the API&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extra&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SPIFFE_Trust_Domain_and_Bundle.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;How to represent trust roots (CA keys)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extra&lt;/td&gt;
&lt;td&gt;&lt;code&gt;SPIFFE_Federation.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;How separate Trust Domains link up&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You don't actually have to satisfy all of these to call yourself SPIFFE compliant. The next section explains why.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. What the Spec Itself Says "Compliant" Means
&lt;/h2&gt;

&lt;p&gt;This one sentence at the end of &lt;code&gt;SPIFFE-ID.md&lt;/code&gt; Section 1 does a lot of work:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Conformance with this document is sufficient for the purposes of SPIFFE compliance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Meet &lt;code&gt;SPIFFE-ID.md&lt;/code&gt; and you can claim SPIFFE compliance. That's it. The Workload API, the Trust Bundle, the Federation spec, none of them are required by the letter of the spec.&lt;/p&gt;

&lt;p&gt;That's the formal floor. In practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An ID by itself is useless without an &lt;strong&gt;SVID&lt;/strong&gt; carrying it.&lt;/li&gt;
&lt;li&gt;An SVID is useless without a &lt;strong&gt;Workload API&lt;/strong&gt; to deliver it.&lt;/li&gt;
&lt;li&gt;And neither is verifiable across hosts or trust domains without a &lt;strong&gt;Bundle&lt;/strong&gt; representing the trust root.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;"Practically SPIFFE compliant" means the three core specs plus Bundle, four documents total. If you cross trust domains, add &lt;strong&gt;Federation&lt;/strong&gt; to that list.&lt;/p&gt;

&lt;p&gt;The rest of this article walks through the MUST requirements of each one.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. SPIFFE ID: How You Name Things
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 Anatomy of the URI
&lt;/h3&gt;

&lt;p&gt;A SPIFFE ID is an RFC 3986 URI with a fixed shape.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spiffe://trust-domain-name/path/segments
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pull out everything the spec marks MUST or MUST NOT and you get:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scheme is fixed at &lt;code&gt;spiffe&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;td&gt;Case-insensitive&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust domain name is non-empty&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;td&gt;The &lt;code&gt;host&lt;/code&gt; component&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No &lt;code&gt;userinfo&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;MUST NOT&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;spiffe://user:pass@td/&lt;/code&gt; is out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No &lt;code&gt;port&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;MUST NOT&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;spiffe://td:8080/&lt;/code&gt; is out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust domain name is lowercase&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Example.com&lt;/code&gt; is out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust domain charset is &lt;code&gt;[a-z0-9._-]&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;td&gt;Percent-encoding is out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No &lt;code&gt;query&lt;/code&gt; or &lt;code&gt;fragment&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;MUST NOT&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;?key=val&lt;/code&gt; and &lt;code&gt;#frag&lt;/code&gt; are out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No trailing slash on the path&lt;/td&gt;
&lt;td&gt;MUST NOT&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;spiffe://td/foo/&lt;/code&gt; is out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No &lt;code&gt;.&lt;/code&gt; or &lt;code&gt;..&lt;/code&gt; segments&lt;/td&gt;
&lt;td&gt;MUST NOT&lt;/td&gt;
&lt;td&gt;Relative path tokens forbidden&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Whole URI up to 2048 bytes&lt;/td&gt;
&lt;td&gt;MUST / SHOULD NOT&lt;/td&gt;
&lt;td&gt;Implementations MUST support URIs up to 2048 bytes and SHOULD NOT generate longer ones&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trust domain name at most 255 bytes&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;td&gt;RFC 3986 host limit&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
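
&lt;p&gt;If you ever hand-roll a validator, the table above can be sketched roughly like this. This is a minimal Python sketch, not how go-spiffe or SPIRE implement it: the function name &lt;code&gt;spiffe_id_problems&lt;/code&gt; and the error strings are my own, it only covers the rows in the table, and it treats 2048 and 255 bytes as inclusive upper bounds. The dash sits last in the character class so that it is read literally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import re

# Trust domain characters from the table: lowercase letters, digits, ".", "-", "_".
TRUST_DOMAIN_RE = re.compile(r"[a-z0-9._-]+")
MAX_URI_BYTES = 2048
MAX_TRUST_DOMAIN_BYTES = 255


def spiffe_id_problems(uri):
    """Return the table rules that uri violates (an empty list means it passed)."""
    scheme, sep, rest = uri.partition("://")
    if sep == "" or scheme.lower() != "spiffe":
        return ["scheme must be spiffe"]

    problems = []
    if len(uri.encode("utf-8")) not in range(MAX_URI_BYTES + 1):
        problems.append("URI exceeds 2048 bytes")
    if "?" in uri or "#" in uri:
        problems.append("query or fragment present")

    # Everything up to the first "/" is the authority, i.e. the trust domain.
    trust_domain, slash, path = rest.partition("/")
    if trust_domain == "":
        problems.append("trust domain is empty")
    elif "@" in trust_domain:
        problems.append("userinfo present")
    elif ":" in trust_domain:
        problems.append("port present")
    elif TRUST_DOMAIN_RE.fullmatch(trust_domain) is None:
        # Catches uppercase letters and percent-encoding alike.
        problems.append("trust domain has characters outside a-z 0-9 . - _")
    if len(trust_domain.encode("utf-8")) not in range(MAX_TRUST_DOMAIN_BYTES + 1):
        problems.append("trust domain exceeds 255 bytes")

    if slash:
        segments = path.split("/")
        if "" in segments:
            problems.append("empty segment or trailing slash in path")
        if "." in segments or ".." in segments:
            problems.append("dot segment in path")
    return problems
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With this sketch, &lt;code&gt;spiffe_id_problems("spiffe://example.com/payments/web-fe")&lt;/code&gt; returns an empty list, while &lt;code&gt;spiffe_id_problems("spiffe://td:8080/x")&lt;/code&gt; returns &lt;code&gt;["port present"]&lt;/code&gt;. Returning every violation instead of failing on the first one is just a choice that makes the output easier to read while debugging.&lt;/p&gt;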

&lt;h3&gt;
  
  
  3.2 Trust Domain Names Are Unregulated
&lt;/h3&gt;

&lt;p&gt;The spec is up front about this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Trust domain operators are free to choose any trust domain name they find suitable: there is no centralized authority for regulation or registration of trust domain names.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Unlike DNS, there is no registry. Anyone can call themselves &lt;code&gt;example.com&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;So what happens when two parties pick the same name?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When a collision does occur, those trust domains will continue to operate independently but will be unable to federate.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;While they stay independent, nothing breaks. The moment they try to federate (link up across trust domains), the two sides hold different keys, so neither can verify the other's certificates. They just fail to connect.&lt;/p&gt;

&lt;p&gt;The practical workaround is to use a DNS name you already own. If you're auto-generating, a UUID works.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.3 Path Design
&lt;/h3&gt;

&lt;p&gt;What the path means is up to the implementer. The spec gives three example patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pattern A, identify the service directly&lt;/strong&gt;: &lt;code&gt;spiffe://staging.example.com/payments/mysql&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pattern B, mirror the orchestrator's ID structure&lt;/strong&gt;: &lt;code&gt;spiffe://k8s-west.example.com/ns/staging/sa/default&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pattern C, opaque&lt;/strong&gt;: &lt;code&gt;spiffe://example.com/9eebccd2-12bf-40a6-b262-65fe0487d453&lt;/code&gt; (metadata managed elsewhere)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SPIRE's default on Kubernetes is Pattern B (&lt;code&gt;/ns/&amp;lt;namespace&amp;gt;/sa/&amp;lt;service-account&amp;gt;&lt;/code&gt;), and that's close to the de facto convention. It maps the Pod's Kubernetes namespace and ServiceAccount directly, so the operational view stays legible.&lt;/p&gt;
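
&lt;p&gt;To make Pattern B concrete, here is a small Python sketch that builds such an ID and takes it apart again. The helper names &lt;code&gt;k8s_spiffe_id&lt;/code&gt; and &lt;code&gt;parse_k8s_spiffe_id&lt;/code&gt; are hypothetical and not part of SPIRE or any SPIFFE library; they only illustrate the path layout.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;PREFIX = "spiffe://"


def k8s_spiffe_id(trust_domain, namespace, service_account):
    """Build a Pattern B SPIFFE ID: /ns/NAMESPACE/sa/SERVICE-ACCOUNT."""
    for label, value in (("namespace", namespace), ("service account", service_account)):
        if value == "" or "/" in value:
            raise ValueError(label + " must be a single non-empty path segment")
    return "{}{}/ns/{}/sa/{}".format(PREFIX, trust_domain, namespace, service_account)


def parse_k8s_spiffe_id(spiffe_id):
    """Return (trust_domain, namespace, service_account), or None if not Pattern B."""
    if not spiffe_id.startswith(PREFIX):
        return None
    parts = spiffe_id[len(PREFIX):].split("/")
    # Expected shape: trust-domain / "ns" / namespace / "sa" / service-account
    if len(parts) != 5 or parts[1] != "ns" or parts[3] != "sa" or "" in parts:
        return None
    return parts[0], parts[2], parts[4]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;k8s_spiffe_id("k8s-west.example.com", "staging", "default")&lt;/code&gt; yields &lt;code&gt;spiffe://k8s-west.example.com/ns/staging/sa/default&lt;/code&gt;, the same ID as the Pattern B example above, and &lt;code&gt;parse_k8s_spiffe_id&lt;/code&gt; returns &lt;code&gt;None&lt;/code&gt; for Pattern A or C IDs.&lt;/p&gt;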




&lt;h2&gt;
  
  
  4. SVID: X.509 Profile Requirements
&lt;/h2&gt;

&lt;h3&gt;
  
  
  4.1 An X.509-SVID Is Just an X.509 Certificate With a URI SAN
&lt;/h3&gt;

&lt;p&gt;Read &lt;code&gt;X509-SVID.md&lt;/code&gt; and there are no new fields. &lt;strong&gt;It's a normal X.509 certificate with rules layered on top about how to use the existing ones.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The core rule is simple: &lt;strong&gt;the Subject Alternative Name extension's URI type holds exactly one SPIFFE ID&lt;/strong&gt;. Everything else (&lt;code&gt;Subject&lt;/code&gt;, &lt;code&gt;Basic Constraints&lt;/code&gt;, &lt;code&gt;Key Usage&lt;/code&gt;, &lt;code&gt;Extended Key Usage&lt;/code&gt;) carries its usual X.509 meaning. &lt;strong&gt;The spec only constrains which values are legal for SPIFFE use.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4.2 Leaf SVID vs Signing SVID
&lt;/h3&gt;

&lt;p&gt;The X.509 field values differ between Leaf and Signing.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Leaf SVID&lt;/th&gt;
&lt;th&gt;Signing SVID (CA)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;cA&lt;/code&gt; (Basic Constraints)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;false&lt;/code&gt; MUST&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;true&lt;/code&gt; MUST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;keyCertSign&lt;/code&gt; (Key Usage)&lt;/td&gt;
&lt;td&gt;MUST NOT&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;cRLSign&lt;/code&gt; (Key Usage)&lt;/td&gt;
&lt;td&gt;MUST NOT&lt;/td&gt;
&lt;td&gt;MAY&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;digitalSignature&lt;/code&gt; (Key Usage)&lt;/td&gt;
&lt;td&gt;MUST&lt;/td&gt;
&lt;td&gt;(not specified)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SPIFFE ID path&lt;/td&gt;
&lt;td&gt;Non-root (at least one segment) MUST&lt;/td&gt;
&lt;td&gt;No path (&lt;code&gt;spiffe://td&lt;/code&gt;) SHOULD&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Number of URI SANs&lt;/td&gt;
&lt;td&gt;Exactly one MUST&lt;/td&gt;
&lt;td&gt;Exactly one MUST&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A Leaf SVID can sign (mTLS client auth, API request signing) but cannot issue certificates for other parties. Same position as an ordinary client certificate.&lt;/p&gt;

&lt;h3&gt;
  
  
  4.3 Extra Checks at Validation Time
&lt;/h3&gt;

&lt;p&gt;Section 5.2 of &lt;code&gt;X509-SVID.md&lt;/code&gt; makes it clear that &lt;strong&gt;standard X.509 path validation is not enough&lt;/strong&gt;. Using a certificate as an SVID requires these additional checks.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fspiffe-compliance-deep-dive%2Fdiagrams%2F04-x509-svid-validation.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fspiffe-compliance-deep-dive%2Fdiagrams%2F04-x509-svid-validation.png" alt="X.509-SVID validation flow" width="800" height="1492"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Forget these and you open the door to vulnerabilities like an intermediate CA being used as a leaf certificate. Libraries like &lt;code&gt;go-spiffe/v2&lt;/code&gt; handle this for you. Only worry about it when writing your own.&lt;/p&gt;
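
&lt;p&gt;If you are in the write-your-own camp, the leaf-side checks boil down to something like the following. This is a Python sketch that applies the Leaf column of the 4.2 table to a hypothetical, already-parsed &lt;code&gt;CertFields&lt;/code&gt; record. Parsing the certificate and running normal X.509 path validation are assumed to have happened elsewhere, and none of these names come from a real library.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dataclasses import dataclass


@dataclass
class CertFields:
    """Hypothetical view of the fields the table covers, filled in by your X.509 parser."""
    is_ca: bool      # Basic Constraints cA flag
    key_usage: set   # e.g. {"digitalSignature", "keyCertSign"}
    uri_sans: list   # URI-type Subject Alternative Names


def leaf_svid_problems(cert):
    """Return the leaf rules the certificate violates (an empty list means it passed)."""
    problems = []
    if cert.is_ca:
        problems.append("cA must be false on a leaf")
    for usage in ("keyCertSign", "cRLSign"):
        if usage in cert.key_usage:
            problems.append(usage + " must not be set on a leaf")
    if "digitalSignature" not in cert.key_usage:
        problems.append("digitalSignature must be set on a leaf")

    if len(cert.uri_sans) != 1:
        problems.append("exactly one URI SAN is required")
    elif not cert.uri_sans[0].startswith("spiffe://"):
        problems.append("URI SAN is not a SPIFFE ID")
    else:
        # A leaf needs at least one path segment after the trust domain.
        _, _, path = cert.uri_sans[0][len("spiffe://"):].partition("/")
        if path == "":
            problems.append("leaf SPIFFE ID needs a non-root path")
    return problems
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A signing certificate presented as a leaf trips the first three checks at once, which is the failure mode the extra validation step exists to catch.&lt;/p&gt;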




&lt;h2&gt;
  
  
  5. SVID: JWT Profile Requirements
&lt;/h2&gt;

&lt;h3&gt;
  
  
  5.1 A JWT-SVID Is Just a JWS
&lt;/h3&gt;

&lt;p&gt;From Section 1 of &lt;code&gt;JWT-SVID.md&lt;/code&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;JWT-SVIDs are standard JWT tokens with a handful of restrictions applied.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A normal JWT with a few extra constraints, nothing more. Format is JWS Compact Serialization (the &lt;code&gt;header.payload.signature&lt;/code&gt; layout). JWS JSON Serialization &lt;strong&gt;MUST NOT&lt;/strong&gt; be used.&lt;/p&gt;

&lt;h3&gt;
  
  
  5.2 The Core Restriction: Reject &lt;code&gt;alg: none&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;JWT has a well-known footgun: set &lt;code&gt;alg&lt;/code&gt; to &lt;code&gt;none&lt;/code&gt; in the header and a verifier that trusts the header will let the token sail through unverified.&lt;/p&gt;

&lt;p&gt;The JWT-SVID spec pins &lt;code&gt;alg&lt;/code&gt; to one of these nine values.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;
&lt;code&gt;alg&lt;/code&gt; value&lt;/th&gt;
&lt;th&gt;Algorithm&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RS256 / RS384 / RS512&lt;/td&gt;
&lt;td&gt;RSASSA-PKCS1-v1_5 + SHA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PS256 / PS384 / PS512&lt;/td&gt;
&lt;td&gt;RSASSA-PSS + SHA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ES256 / ES384 / ES512&lt;/td&gt;
&lt;td&gt;ECDSA + SHA&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Anything else (especially &lt;code&gt;none&lt;/code&gt; or the symmetric &lt;code&gt;HS*&lt;/code&gt; family) &lt;strong&gt;MUST be rejected&lt;/strong&gt;. That single rule closes off most of the classic JOSE vulnerability surface.&lt;/p&gt;

&lt;p&gt;Required claims:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;sub&lt;/code&gt;: the SPIFFE ID of the workload.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;aud&lt;/code&gt;: at least one audience.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;exp&lt;/code&gt;: expiry.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;kid&lt;/code&gt; is optional in the header but required on the Bundle JWK side so verifiers can pick the right key.&lt;/p&gt;
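
&lt;p&gt;Here is roughly what the allowlist and the required-claim checks look like before any key is touched. This is a standard-library Python sketch under my own naming (&lt;code&gt;jwt_svid_precheck&lt;/code&gt; is not a real API). It deliberately does not verify the signature or compare &lt;code&gt;exp&lt;/code&gt; against the clock, so treat it as a pre-filter in front of a real JWT library, not a replacement for one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import base64
import json

# The nine values from the table; everything else, including "none" and HS*, is rejected.
ALLOWED_ALGS = (
    "RS256", "RS384", "RS512",
    "PS256", "PS384", "PS512",
    "ES256", "ES384", "ES512",
)


def b64url_decode(segment):
    # JWS drops the "=" padding from Base64URL; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def jwt_svid_precheck(token):
    """Check layout, alg, and required claims. Does NOT verify the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not JWS Compact Serialization")
    header = json.loads(b64url_decode(parts[0]))
    claims = json.loads(b64url_decode(parts[1]))

    if header.get("alg") not in ALLOWED_ALGS:
        raise ValueError("alg is not allowed for a JWT-SVID")
    if not str(claims.get("sub", "")).startswith("spiffe://"):
        raise ValueError("sub must be a SPIFFE ID")
    if not claims.get("aud"):
        raise ValueError("aud must contain at least one audience")
    # Presence check only; expiry comparison is left to the verifying library.
    if not isinstance(claims.get("exp"), (int, float)):
        raise ValueError("exp is required")
    return header, claims
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Checking &lt;code&gt;alg&lt;/code&gt; against a fixed allowlist, instead of asking the token which algorithm it would like, is the design point: the header is attacker-controlled input until the signature has been verified.&lt;/p&gt;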

&lt;h3&gt;
  
  
  5.3 Always Check &lt;code&gt;aud&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;JWT-SVID is a bearer token. Anyone holding it can use it. To soften that blow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;aud&lt;/code&gt; MUST be set.&lt;/li&gt;
&lt;li&gt;The receiver MUST check that its own ID is in &lt;code&gt;aud&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Single audience strongly recommended.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The example in Section 7.2 of the spec shows the failure mode:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;if Alice has a token with audiences Bob and Chuck, and transmits that token to Chuck, then Chuck can impersonate Alice by sending the same token to Bob.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Alice holds a token addressed to both Bob and Chuck and sends it to Chuck. Chuck replays it to Bob and impersonates Alice. &lt;strong&gt;One token, one audience. That's the rule.&lt;/strong&gt;&lt;/p&gt;
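
&lt;p&gt;The receiver-side check is small enough to sketch. This is an illustration only: &lt;code&gt;audience_ok&lt;/code&gt; and &lt;code&gt;is_single_audience&lt;/code&gt; are hypothetical helper names, and the claims are assumed to be already decoded and signature-verified.&lt;/p&gt;

```python
def audience_ok(claims, my_id):
    """True only if my_id is listed in aud; accepts a string or a list."""
    aud = claims.get("aud")
    if isinstance(aud, str):
        aud = [aud]
    if not isinstance(aud, list) or not aud:
        return False
    return my_id in aud


def is_single_audience(claims):
    """Stricter policy from the text: one token, one audience."""
    aud = claims.get("aud")
    return isinstance(aud, str) or (isinstance(aud, list) and len(aud) == 1)


BOB = "spiffe://example.com/bob"
CHUCK = "spiffe://example.com/chuck"

# The token from the spec's example: valid for both Bob and Chuck.
shared = {"sub": "spiffe://example.com/alice", "aud": [BOB, CHUCK]}
# A token scoped to Chuck alone.
scoped = {"sub": "spiffe://example.com/alice", "aud": [CHUCK]}

if __name__ == "__main__":
    # Bob accepts the shared token even when Chuck is the one replaying it.
    print(audience_ok(shared, BOB), is_single_audience(shared))
    # Bob rejects the token that was only ever meant for Chuck.
    print(audience_ok(scoped, BOB), is_single_audience(scoped))
```

&lt;p&gt;The &lt;code&gt;aud&lt;/code&gt; check alone cannot tell Alice from Chuck when both names are on the token, which is why the single-audience policy matters.&lt;/p&gt;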




&lt;h2&gt;
  
  
  6. Workload API: How SVIDs Get Delivered
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6.1 gRPC, Local Only
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;SPIFFE_Workload_Endpoint.md&lt;/code&gt;, summarized: the Workload API is a gRPC endpoint on a Unix Domain Socket (or localhost TCP) within the same host. In a SPIRE deployment, the SPIRE Agent running on each node listens on a socket such as &lt;code&gt;/run/spire/agent.sock&lt;/code&gt;, and every container on that host talks to it over gRPC. The Agent handles the conversation with the SPIRE Server (the central CA) over a separate connection. The workload itself never has to reach beyond its own host.&lt;/p&gt;

&lt;p&gt;The MUSTs around transport and accessibility:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;gRPC required&lt;/td&gt;
&lt;td&gt;Prefer UDS over TCP (SHOULD)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No TLS&lt;/td&gt;
&lt;td&gt;In fact, MUST NOT require it. At bootstrap the workload has no trust root yet.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Confined to one host&lt;/td&gt;
&lt;td&gt;SHOULD (don't make it reachable from other hosts)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;workload.spiffe.io: true&lt;/code&gt; metadata&lt;/td&gt;
&lt;td&gt;MUST (SSRF defense)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No authentication handshake&lt;/td&gt;
&lt;td&gt;MUST NOT require one. Caller is identified at the OS layer instead.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last point is unusual. A typical gRPC service authenticates clients via certificates or tokens. The Workload API flips that: the workload doesn't announce itself, the server identifies it via the OS. This identification flow is called &lt;strong&gt;Workload Attestation&lt;/strong&gt;. Concretely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fetch the connecting PID from the UDS via &lt;code&gt;SO_PEERCRED&lt;/code&gt; on Linux (&lt;code&gt;getpeerucred&lt;/code&gt; or an equivalent on other systems).&lt;/li&gt;
&lt;li&gt;Inspect the PID's cgroup, or query the Kubernetes API server.&lt;/li&gt;
&lt;li&gt;Conclude "you're the web-fe container in pod foo-bar-1234."&lt;/li&gt;
&lt;li&gt;Issue the SPIFFE ID that container should hold.&lt;/li&gt;
&lt;/ul&gt;
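
&lt;p&gt;The first step can be sketched with nothing but the Python standard library. This is a Linux-only illustration of the OS mechanism, not SPIRE's actual attestor code; &lt;code&gt;peer_credentials&lt;/code&gt; and &lt;code&gt;cgroup_path&lt;/code&gt; are hypothetical names, and the demo uses a &lt;code&gt;socketpair&lt;/code&gt; so the "peer" is the current process.&lt;/p&gt;

```python
import os
import socket
import struct

# Linux struct ucred: pid_t pid, uid_t uid, gid_t gid.
UCRED_FORMAT = "iII"


def peer_credentials(conn):
    """Return (pid, uid, gid) of the process on the other end of a Unix socket.

    Linux-only: the kernel records these when the socket is connected, so the
    caller cannot lie about them the way it could in an application header.
    """
    size = struct.calcsize(UCRED_FORMAT)
    raw = conn.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack(UCRED_FORMAT, raw)


def cgroup_path(pid):
    """Where an attestor would look next: the cgroup membership of that PID."""
    return f"/proc/{pid}/cgroup"


if __name__ == "__main__":
    server_side, client_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    pid, uid, gid = peer_credentials(server_side)
    print("peer pid", pid, "is this process:", pid == os.getpid())
    print("next stop:", cgroup_path(pid))
    server_side.close()
    client_side.close()
```

&lt;p&gt;Everything after the PID lookup (mapping a cgroup to a container, a container to a pod, a pod to a SPIFFE ID) is implementation policy and is not shown here.&lt;/p&gt;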

&lt;h3&gt;
  
  
  6.2 Five RPCs Across Two Profiles
&lt;/h3&gt;

&lt;p&gt;The Workload API has two profiles (X.509 and JWT) and five RPCs between them.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Profile&lt;/th&gt;
&lt;th&gt;RPC&lt;/th&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;What it returns&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;X.509&lt;/td&gt;
&lt;td&gt;&lt;code&gt;FetchX509SVID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;stream&lt;/td&gt;
&lt;td&gt;SVID + Bundle&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X.509&lt;/td&gt;
&lt;td&gt;&lt;code&gt;FetchX509Bundles&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;stream&lt;/td&gt;
&lt;td&gt;Bundles only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JWT&lt;/td&gt;
&lt;td&gt;&lt;code&gt;FetchJWTSVID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;unary&lt;/td&gt;
&lt;td&gt;JWT for a given audience&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JWT&lt;/td&gt;
&lt;td&gt;&lt;code&gt;FetchJWTBundles&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;stream&lt;/td&gt;
&lt;td&gt;JWKS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;JWT&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ValidateJWTSVID&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;unary&lt;/td&gt;
&lt;td&gt;Delegate JWT verification to the server&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;From Section 1 of &lt;code&gt;SPIFFE_Workload_API.md&lt;/code&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Both profiles are mandatory and MUST be supported by SPIFFE implementations. However, operators MAY administratively disable a specific profile in their deployment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A SPIFFE-compliant Workload API has to implement both X.509 and JWT profiles (the operator is allowed to turn one off in their deployment). This bites in practice. Even popular OSS libraries often bolt JWT on later.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.3 Why It Streams
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;FetchX509SVID&lt;/code&gt;, &lt;code&gt;FetchJWTBundles&lt;/code&gt;, and &lt;code&gt;FetchX509Bundles&lt;/code&gt; are gRPC server-streaming RPCs, so the server can push:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New certificates during key rotation.&lt;/li&gt;
&lt;li&gt;CRL (revocation list) updates.&lt;/li&gt;
&lt;li&gt;New state without the workload paying reconnect cost.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The client &lt;strong&gt;SHOULD keep the connection open&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fspiffe-compliance-deep-dive%2Fdiagrams%2F07-workload-api-rotation-sequence.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fspiffe-compliance-deep-dive%2Fdiagrams%2F07-workload-api-rotation-sequence.png" alt="Workload API streaming during CA rotation and revocation" width="800" height="947"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every message ships the full current state, not a delta (spec sections 4.3 and 4.4). That sidesteps the whole anti-entropy headache of incremental sync.&lt;/p&gt;

&lt;h3&gt;
  
  
  6.4 Discovery: &lt;code&gt;SPIFFE_ENDPOINT_SOCKET&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;How does a client find the Workload API? From spec section 4:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Clients may be explicitly configured with the socket location, or may utilize the well-known environment variable &lt;code&gt;SPIFFE_ENDPOINT_SOCKET&lt;/code&gt;. If not explicitly configured, conforming clients MUST fall back to the environment variable.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Without explicit configuration, look at &lt;code&gt;SPIFFE_ENDPOINT_SOCKET&lt;/code&gt;. The value is URI-formatted.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Unix Domain Socket&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;SPIFFE_ENDPOINT_SOCKET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;unix:///run/spire/agent.sock

&lt;span class="c"&gt;# TCP (only on hosts with a specific reason)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;SPIFFE_ENDPOINT_SOCKET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tcp://127.0.0.1:8000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Official libraries like &lt;code&gt;go-spiffe/v2&lt;/code&gt; do this automatically.&lt;/p&gt;
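
&lt;p&gt;If you are curious what that lookup amounts to, here is a minimal sketch using only the standard library. It is not how &lt;code&gt;go-spiffe&lt;/code&gt; is written; &lt;code&gt;resolve_endpoint&lt;/code&gt; and &lt;code&gt;parse_endpoint&lt;/code&gt; are hypothetical names, and the validation is a simplified subset of what the spec requires of the URI.&lt;/p&gt;

```python
import os
from urllib.parse import urlsplit


def parse_endpoint(raw):
    """Split a unix:// or tcp:// endpoint URI into a (kind, address) pair."""
    parts = urlsplit(raw)
    if parts.scheme == "unix":
        if parts.netloc or not parts.path.startswith("/"):
            raise ValueError("unix endpoint needs an absolute path and no authority")
        return ("unix", parts.path)
    if parts.scheme == "tcp":
        if parts.hostname is None or parts.port is None:
            raise ValueError("tcp endpoint needs an IP address and a port")
        return ("tcp", (parts.hostname, parts.port))
    raise ValueError(f"unsupported scheme {parts.scheme!r}")


def resolve_endpoint(explicit=None, environ=os.environ):
    """Pick the Workload API address: explicit config first, then the env var."""
    raw = explicit or environ.get("SPIFFE_ENDPOINT_SOCKET")
    if not raw:
        raise LookupError("no socket configured and SPIFFE_ENDPOINT_SOCKET is unset")
    return parse_endpoint(raw)


if __name__ == "__main__":
    print(parse_endpoint("unix:///run/spire/agent.sock"))
    print(parse_endpoint("tcp://127.0.0.1:8000"))
```

&lt;p&gt;Passing &lt;code&gt;environ&lt;/code&gt; as a parameter is only there to make the fallback easy to test without touching the real environment.&lt;/p&gt;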




&lt;h2&gt;
  
  
  7. Trust Domain and Bundle: Building the Trust Root
&lt;/h2&gt;

&lt;h3&gt;
  
  
  7.1 A Bundle Is a Keyring Shaped Like a JWKS
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;SPIFFE_Trust_Domain_and_Bundle.md&lt;/code&gt; in one line: &lt;strong&gt;SPIFFE Bundle = RFC 7517 JWK Set + SPIFFE-specific metadata&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;JWKS is the same format you've seen as &lt;code&gt;/.well-known/jwks.json&lt;/code&gt; on Cognito, Auth0, and Google Identity Platform. The SPIFFE Bundle extends it to hold both X.509 CA certificates and JWT signing keys.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"spiffe_sequence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12035488&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"spiffe_refresh_hint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2419200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"keys"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"kty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RSA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"x509-svid"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"x5c"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;base64 DER encoding of X.509 CA cert&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;base64urlUint-encoded modulus&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"e"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AQAB"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"kty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RSA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"kid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;JWT key id&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"use"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"jwt-svid"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"n"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;base64urlUint-encoded modulus&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"e"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"AQAB"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things are unique to a SPIFFE Bundle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;use&lt;/code&gt; parameter&lt;/strong&gt;: MUST be either &lt;code&gt;x509-svid&lt;/code&gt; or &lt;code&gt;jwt-svid&lt;/code&gt;. Without it the verifier can't tell which kind of SVID the key validates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;spiffe_sequence&lt;/code&gt;&lt;/strong&gt;: monotonically increasing counter. Increments every time the Bundle is updated.&lt;/li&gt;
&lt;/ul&gt;
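
&lt;p&gt;To make the role of &lt;code&gt;use&lt;/code&gt; concrete, here is a small sketch that sorts the keys of a parsed Bundle into the two kinds. &lt;code&gt;split_bundle&lt;/code&gt; is a hypothetical helper, and setting aside keys with a missing or unknown &lt;code&gt;use&lt;/code&gt; (rather than failing the whole Bundle) is my assumption, so check the Bundle spec before copying that policy.&lt;/p&gt;

```python
VALID_USES = ("x509-svid", "jwt-svid")


def split_bundle(bundle):
    """Group the keys of a parsed SPIFFE Bundle by their use parameter.

    Keys that cannot be used safely are collected under "unusable"
    instead of being guessed at.
    """
    grouped = {"x509-svid": [], "jwt-svid": [], "unusable": []}
    for jwk in bundle.get("keys", []):
        use = jwk.get("use")
        if use not in VALID_USES:
            grouped["unusable"].append(jwk)
        elif use == "jwt-svid" and not jwk.get("kid"):
            # Without kid a verifier cannot match the key to a token header.
            grouped["unusable"].append(jwk)
        else:
            grouped[use].append(jwk)
    return grouped


# Same shape as the example Bundle above, with placeholder values.
SAMPLE = {
    "spiffe_sequence": 12035488,
    "spiffe_refresh_hint": 2419200,
    "keys": [
        {"kty": "RSA", "use": "x509-svid", "x5c": ["placeholder"], "n": "x", "e": "AQAB"},
        {"kty": "RSA", "use": "jwt-svid", "kid": "key-1", "n": "y", "e": "AQAB"},
        {"kty": "RSA", "n": "z", "e": "AQAB"},
    ],
}

if __name__ == "__main__":
    for kind, keys in split_bundle(SAMPLE).items():
        print(kind, len(keys))
```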

&lt;h3&gt;
  
  
  7.2 Keep Trust Domains Isolated
&lt;/h3&gt;

&lt;p&gt;Section 6.2 of the Bundle spec:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When a root key is shared across multiple trust domains, it becomes critically important that authentication and authorization implementations carefully check the trust domain name component of an identity.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Don't share keys across trust domains. If you do, verifiers had better check the trust domain name strictly, or an SVID from the wrong trust domain becomes a forgery vector.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fspiffe-compliance-deep-dive%2Fdiagrams%2F08-trust-domain-isolation.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Fspiffe-compliance-deep-dive%2Fdiagrams%2F08-trust-domain-isolation.png" alt="Trust domain isolation, shared key vs separated keys" width="800" height="845"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One trust domain, one independent set of keys. That's the iron rule of SPIFFE operations.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.3 Key Rotation: Add First, Remove Later
&lt;/h3&gt;

&lt;p&gt;Bundle updates work as "add the new one, then remove the old." The Bundle moves through four states:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Bundle contents&lt;/th&gt;
&lt;th&gt;Issuer signs new SVIDs with&lt;/th&gt;
&lt;th&gt;Verifier accepts SVIDs signed by&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[K1]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;K1&lt;/td&gt;
&lt;td&gt;K1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Add K2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[K1, K2]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;K1 (still)&lt;/td&gt;
&lt;td&gt;K1 or K2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Switch&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[K1, K2]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;K2&lt;/td&gt;
&lt;td&gt;K1 or K2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Drop K1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;[K2]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;K2&lt;/td&gt;
&lt;td&gt;K2&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;spiffe_refresh_hint&lt;/code&gt; (seconds) tells consumers how often they should re-fetch the Bundle. It is a hint, not an enforced interval. The spec's example value is 28 days (2419200 seconds), which is far too long if revocation matters. In practice, 5 minutes to 1 hour is the realistic production range.&lt;/p&gt;
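
&lt;p&gt;The reason for the four-phase order is easier to see when you can break it. The sketch below is a toy model, not anything from SPIRE: &lt;code&gt;rotation_gaps&lt;/code&gt; is a hypothetical helper, and it assumes an SVID stays in use for at most one phase after it was issued.&lt;/p&gt;

```python
# Each phase: (name, keys in the Bundle, key the issuer signs new SVIDs with).
PHASES = [
    ("Initial", {"K1"}, "K1"),
    ("Add K2", {"K1", "K2"}, "K1"),
    ("Switch", {"K1", "K2"}, "K2"),
    ("Drop K1", {"K2"}, "K2"),
]

# What happens if you swap the key in one step instead.
ONE_STEP_SWAP = [
    ("Initial", {"K1"}, "K1"),
    ("Swap", {"K2"}, "K2"),
]


def rotation_gaps(phases):
    """Return (phase, key) pairs where a verifier would reject a live SVID."""
    gaps = []
    previous_signer = None
    for name, bundle_keys, signer in phases:
        # SVIDs minted in this phase must verify against this phase's Bundle.
        if signer not in bundle_keys:
            gaps.append((name, signer))
        # SVIDs minted one phase earlier are assumed to still be in flight.
        if previous_signer is not None and previous_signer not in bundle_keys:
            gaps.append((name, previous_signer))
        previous_signer = signer
    return gaps


if __name__ == "__main__":
    print("add first, remove later:", rotation_gaps(PHASES))
    print("one-step swap:", rotation_gaps(ONE_STEP_SWAP))
```

&lt;p&gt;The four-phase schedule produces no gaps, while the one-step swap strands every SVID still signed by K1.&lt;/p&gt;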




&lt;h2&gt;
  
  
  8. What Existing Implementations Actually Cover
&lt;/h2&gt;

&lt;h3&gt;
  
  
  8.1 SPIRE
&lt;/h3&gt;

&lt;p&gt;The reference implementation. Satisfies every item by definition. Workload API streaming, JWT-SVID ValidateJWTSVID delegation, the whole list.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.2 Istio
&lt;/h3&gt;

&lt;p&gt;Istio holds the workload's SVID inside each sidecar (Envoy). The on-the-wire interface to Envoy is &lt;strong&gt;SDS (Secret Discovery Service)&lt;/strong&gt;, which is an Envoy upstream API, not Istio-specific. From a SPIFFE-compliance standpoint, the relevant fact is that Istio's istio-agent feeds SVIDs into Envoy via SDS rather than exposing the SPIFFE Workload API to workloads. The payload format is SPIFFE-compatible, but the Workload Endpoint spec is not implemented.&lt;/p&gt;

&lt;p&gt;Istio can also be wired to use a SPIRE-backed SDS server. In that setup SPIRE is the CA, and the Workload Endpoint spec becomes implemented through SPIRE.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.3 Cilium
&lt;/h3&gt;

&lt;p&gt;Cilium has two unrelated paths that get conflated under the SPIFFE label, and they need to be kept apart.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Path 1, SPIRE-backed Mutual Authentication (Beta since v1.14, still Beta in v1.19)&lt;/strong&gt;. Cilium auto-deploys a SPIRE Agent on each node and rides on top. Cilium itself doesn't implement the SPIFFE spec directly. It wraps SPIRE. This path is opt-in (&lt;code&gt;authentication.mutual.spire.enabled=true&lt;/code&gt;) and has been since v1.14.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Path 2, ztunnel-based transparent mTLS (Beta, added v1.19)&lt;/strong&gt;. Cilium picked up the same ztunnel data plane Istio ambient mode uses. Per the Cilium docs, ztunnel's certificates are generated via OpenSSL and stored as Kubernetes Secrets. SPIRE is not in the loop, and the docs do not claim SPIFFE compliance for this mode. Whether the certificates carry SPIFFE IDs in their SANs is an implementation detail I haven't verified, so don't take "Cilium ztunnel is SPIFFE compliant" as a settled claim.&lt;/p&gt;

&lt;p&gt;If someone says "Cilium is SPIFFE compliant", ask which feature: the SPIRE-backed mutual auth, or ztunnel.&lt;/p&gt;

&lt;p&gt;So when a doc says "Istio is SPIFFE compliant", read it as "the SVID format is compliant, the Workload Endpoint spec is not." The same caution applies to Cilium, with the added question of which version and which feature.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. 2026 Updates: Recent Spec Movement
&lt;/h2&gt;

&lt;p&gt;Tracking the &lt;code&gt;spiffe/spiffe&lt;/code&gt; repo, a couple of things have moved since late 2025.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;draft-wit-svid&lt;/code&gt; branch, active since late 2025&lt;/strong&gt;. A candidate successor to JWT-SVID aimed at integration with IETF's WIMSE (Workload Identity in Multi-System Environments) work. PR #361 ("Introduce WIT-SVID Token Document") landed on the branch 2026-01-21. Two changes worth noting: WIT-SVID makes Proof of Possession via the &lt;code&gt;cnf&lt;/code&gt; claim mandatory (RFC 7800 confirmation), structurally closing off the bearer-token replay risk of JWT-SVID; and PR #372 (2026-01-26) prohibits the &lt;code&gt;aud&lt;/code&gt; claim entirely (audience scoping is delegated to the PoP layer). Still an experimental branch, not on main, so I don't count it toward "compliance" yet.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PR #381, merged 2026-03-26&lt;/strong&gt;. "Strengthen validation language, and clarify leaf SPIFFE ID requirements." Tightens the path requirements and validation language around leaf SPIFFE IDs. Low impact for most implementations, worth a re-read if you wrote your own.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "what compliance means" I described here tracks the &lt;code&gt;main&lt;/code&gt; branch as of May 2026. If WIT-SVID gets merged into &lt;code&gt;main&lt;/code&gt;, sections 4 to 5 grow by one more SVID profile with stricter rules than JWT-SVID.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Running the Checklist: &lt;code&gt;scc&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Walking sections 3 to 7 by hand gets old after the second time someone hands you an SVID and asks "is this compliant?". I packaged the static slice of the checklist as a single-binary CLI: &lt;a href="https://github.com/0-draft/spiffe-compliance-checker" rel="noopener noreferrer"&gt;&lt;code&gt;scc&lt;/code&gt;&lt;/a&gt; (spiffe-compliance-checker).&lt;/p&gt;

&lt;p&gt;Each subcommand reads one artifact and emits one line per MUST / SHOULD clause, with the spec file and section cited inline. Exit code is &lt;code&gt;1&lt;/code&gt; on any MUST failure, &lt;code&gt;0&lt;/code&gt; otherwise. SHOULD violations surface as &lt;code&gt;WARN&lt;/code&gt; and do not change the exit code.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;kanywst/tap/spiffe-compliance-checker
&lt;span class="c"&gt;# or: go install github.com/0-draft/spiffe-compliance-checker/cmd/scc@latest&lt;/span&gt;

scc &lt;span class="nb"&gt;id&lt;/span&gt;        &lt;span class="s1"&gt;'spiffe://example.com/payments/web-fe'&lt;/span&gt;
scc x509-svid leaf.pem
scc jwt-svid  &amp;lt;token&amp;gt;
scc bundle    bundle.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A failing run reads as a spec walkthrough rather than an opaque error:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9neuaeucg0hj4i8yg78u.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9neuaeucg0hj4i8yg78u.gif" alt="scc demo" width="799" height="493"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What it covers right now: SPIFFE ID syntax, X.509-SVID structural rules (URI SAN, Basic Constraints, Key Usage criticality, EKU), JWT-SVID claims, Trust Bundle JWKS shape including base64 + DER validation of &lt;code&gt;x5c&lt;/code&gt;. What it does not: live Workload API behaviour, federation endpoint trust, signature verification against a specific bundle. Those need a running Agent and are out of scope for a static checker.&lt;/p&gt;

&lt;p&gt;Source, issues, releases: &lt;a href="https://github.com/0-draft/spiffe-compliance-checker" rel="noopener noreferrer"&gt;https://github.com/0-draft/spiffe-compliance-checker&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SPIFFE compliance has two definitions and you should know which one you're talking about.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;spec-defined floor&lt;/strong&gt; is SPIFFE-ID plus SVID. That's it.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;practical floor&lt;/strong&gt; is that plus Workload API plus Trust Bundle. Federation joins the list only if you cross trust domains.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When something claims SPIFFE compliance, ask at what level.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fully compliant (SPIRE)&lt;/strong&gt;: meets every spec, interoperable across the board.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SVID-compatible (Istio)&lt;/strong&gt;: SVID format is compliant, Workload Endpoint is custom.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SPIRE-dependent (Cilium SPIRE path)&lt;/strong&gt;: bundles a SPIFFE implementation. The rest is its own.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Not yet claimable (Cilium ztunnel)&lt;/strong&gt;: a separate mTLS path that doesn't go through SPIRE and hasn't been declared SPIFFE compliant by upstream.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Just want to know whether an artifact in front of you is compliant? Run it through &lt;code&gt;scc&lt;/code&gt; (section 10). Building your own implementation? Walk the MUSTs in sections 3 to 7 one at a time. SVID format alone is feasible from scratch. The Workload API is where the scope explodes. &lt;a href="https://github.com/spiffe/go-spiffe" rel="noopener noreferrer"&gt;go-spiffe/v2&lt;/a&gt; and &lt;a href="https://github.com/spiffe/java-spiffe" rel="noopener noreferrer"&gt;java-spiffe&lt;/a&gt; are the official libraries. Wrapping them is a saner starting point than reimplementing the gRPC service.&lt;/p&gt;

&lt;p&gt;What makes SPIFFE useful is that workload authentication completes inside a vendor-neutral spec. AWS IAM, Google IAM, Kubernetes Service Accounts, none of them have to be in the trust path. Wire SPIFFE in once and cross-environment authentication stops relying on IP allowlists and pre-shared secrets.&lt;/p&gt;

</description>
      <category>spiffe</category>
      <category>identity</category>
      <category>security</category>
      <category>workloadidentity</category>
    </item>
    <item>
      <title>AWS SigV4 and SigV4A Deep Dive</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Sat, 30 May 2026 11:19:22 +0000</pubDate>
      <link>https://dev.to/kanywst/aws-sigv4-and-sigv4a-deep-dive-12li</link>
      <guid>https://dev.to/kanywst/aws-sigv4-and-sigv4a-deep-dive-12li</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Hitting S3 from &lt;code&gt;boto3&lt;/code&gt;, I had never thought about SigV4. The SDK does everything. I knew an &lt;code&gt;Authorization: AWS4-HMAC-SHA256 ...&lt;/code&gt; header was being assembled somewhere under the hood, but I had never built one by hand.&lt;/p&gt;

&lt;p&gt;Multi-Region Access Point (MRAP) destroyed that complacency. The instant I hit S3 through MRAP from Lambda, an algorithm I had never seen called &lt;code&gt;AWS4-ECDSA-P256-SHA256&lt;/code&gt; showed up instead of the usual SigV4, and the old &lt;code&gt;botocore&lt;/code&gt; I had pinned locally crashed with &lt;code&gt;InvalidSignature&lt;/code&gt;. AWS has two signing schemes: &lt;strong&gt;SigV4&lt;/strong&gt; and &lt;strong&gt;SigV4A&lt;/strong&gt;. The latter is asymmetric, using ECDSA instead of HMAC.&lt;/p&gt;

&lt;p&gt;This article dissects both, in this order.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Why AWS uses a custom signature instead of Bearer Token&lt;/li&gt;
&lt;li&gt;Background: what HMAC and SHA-256 contribute&lt;/li&gt;
&lt;li&gt;The four SigV4 steps (Canonical Request / StringToSign / SigningKey derivation / Signature)&lt;/li&gt;
&lt;li&gt;SigV4 written in 80 lines of Python&lt;/li&gt;
&lt;li&gt;Chunked Upload (STREAMING-AWS4-HMAC-SHA256-PAYLOAD)&lt;/li&gt;
&lt;li&gt;Streaming SigV4 (WebSocket / IoT MQTT)&lt;/li&gt;
&lt;li&gt;What is inside a Presigned URL&lt;/li&gt;
&lt;li&gt;SigV4A: asymmetric signing on ECDSA P-256&lt;/li&gt;
&lt;li&gt;SigV4 vs SigV4A comparison&lt;/li&gt;
&lt;li&gt;Clock Skew traps and debugging tips&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  1. Why a custom signature instead of Bearer Token
&lt;/h2&gt;

&lt;p&gt;In the REST world, &lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt; is the default. OAuth 2.0 access tokens are usually presented the same way. Bearer means "whoever holds the token is legitimate", so a stolen token is game over until it expires. Under HTTPS that is usually fine in practice, but AWS &lt;strong&gt;explicitly rejected that design&lt;/strong&gt;. There are three reasons.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F01-bearer-vs-sigv4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F01-bearer-vs-sigv4.png" alt="Bearer Token vs SigV4 design comparison" width="800" height="866"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The three design goals.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Never put the Secret Key on the wire&lt;/strong&gt;: &lt;code&gt;SecretAccessKey&lt;/code&gt; is an absurdly powerful credential. Sending it on every call is a non-starter. SigV4 does not sign with the Secret Key itself; it signs with a short-lived key derived from it through HMAC.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Replay protection&lt;/strong&gt;: the signature covers an &lt;strong&gt;ISO8601 timestamp&lt;/strong&gt; and the &lt;strong&gt;CredentialScope&lt;/strong&gt; (date + region + service), so it is bound to a moment in time and a place. A captured signature replayed the next day will not pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tamper detection&lt;/strong&gt;: HTTP method, URI, query string, headers, and the SHA-256 of the body are all in the signed string. Flip a single byte in transit and the signature mismatches.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;SigV4 proves three things at once: who sent it, when, and what was in it. Completely different model from Bearer.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Background: HMAC and SHA-256 in one paragraph
&lt;/h2&gt;

&lt;p&gt;SigV4 sits on top of HMAC-SHA256. &lt;strong&gt;SHA-256&lt;/strong&gt; compresses arbitrary-length input into a fixed 256-bit digest. It is a one-way function with collision resistance (you cannot feasibly produce two inputs with the same hash) and preimage resistance (you cannot feasibly reverse it). &lt;strong&gt;HMAC&lt;/strong&gt; (Hash-based MAC) wraps a key around a message: &lt;code&gt;HMAC(key, msg) = SHA256((key XOR opad) || SHA256((key XOR ipad) || msg))&lt;/code&gt;, where the key is first padded to the hash's block size. Anyone without the key cannot reproduce the output. SigV4 chains HMAC four times to derive a signing key, then HMACs the StringToSign with it. The hex of that final HMAC becomes &lt;code&gt;Signature=...&lt;/code&gt; in the Authorization header. SigV4A swaps only that last step from HMAC to ECDSA. Hold that mental model and the rest is detail.&lt;/p&gt;
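
&lt;p&gt;To check that the formula is not hand-waving, here is HMAC-SHA256 built by hand and then chained four times the way SigV4 does. Treat it as a teaching sketch: &lt;code&gt;hmac_sha256&lt;/code&gt; and &lt;code&gt;derive_signing_key&lt;/code&gt; are my own names, the inputs in the demo are made-up placeholders, and real code should simply call the standard &lt;code&gt;hmac&lt;/code&gt; module.&lt;/p&gt;

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-256 processes input in 64-byte blocks


def hmac_sha256(key, msg):
    """HMAC-SHA256 built by hand from the ipad/opad definition."""
    if key[BLOCK_SIZE:]:  # key longer than one block: hash it down first
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK_SIZE, b"\x00")  # zero-pad to exactly one block
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + msg).digest()
    return hashlib.sha256(opad + inner).digest()


def derive_signing_key(secret, date, region, service):
    """The four chained HMACs that turn a secret key into a SigV4 signing key."""
    key = ("AWS4" + secret).encode()
    for part in (date, region, service, "aws4_request"):
        key = hmac_sha256(key, part.encode())
    return key


if __name__ == "__main__":
    # The hand-rolled version agrees with the standard library.
    expected = hmac.new(b"key", b"msg", hashlib.sha256).digest()
    print(hmac_sha256(b"key", b"msg") == expected)

    # Placeholder inputs, not real credentials.
    signing_key = derive_signing_key("EXAMPLE-SECRET", "20260530", "us-east-1", "s3")
    print(signing_key.hex())
```

&lt;p&gt;Because the date, region, and service are folded into the key itself, a signing key derived for one day and one service is useless anywhere else.&lt;/p&gt;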




&lt;h2&gt;
  
  
  3. The four SigV4 steps at a glance
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F02-four-steps-overview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F02-four-steps-overview.png" alt="Four SigV4 steps overview" width="800" height="946"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One step at a time from here.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Step 1: Building the Canonical Request
&lt;/h2&gt;

&lt;p&gt;The signature only works if "the same request always serializes to the same string". Header order and URI escape variants have to be flattened out. That flattening is what the Canonical Request does.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CanonicalRequest =
  HTTPRequestMethod + '\n' +
  CanonicalURI + '\n' +
  CanonicalQueryString + '\n' +
  CanonicalHeaders + '\n' +
  SignedHeaders + '\n' +
  HashedPayload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The rules for each element, in one diagram.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F03-canonical-request-rules.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F03-canonical-request-rules.png" alt="Canonical Request normalization rules" width="800" height="795"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Three traps that bite hard.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;URI encoded twice&lt;/strong&gt;: for AWS APIs other than S3, the path is URI-encoded &lt;strong&gt;twice&lt;/strong&gt;. A frequent SDK bug. S3 is the only exception, encoded once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query string sort&lt;/strong&gt;: &lt;code&gt;b=2&amp;amp;a=1&lt;/code&gt; becomes &lt;code&gt;a=1&amp;amp;b=2&lt;/code&gt;. When the same key appears multiple times, sort the values too.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Header value trim&lt;/strong&gt;: strip leading and trailing whitespace, then collapse runs of internal whitespace to a single space. &lt;code&gt;Foo:  bar  baz&lt;/code&gt; becomes &lt;code&gt;foo:bar baz&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In Python.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;quote&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;canonical_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# URI: encode once for S3, twice for others (this example assumes S3)
&lt;/span&gt;    &lt;span class="n"&gt;canonical_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;safe&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/-_.~&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Query: sort by key, encode values
&lt;/span&gt;    &lt;span class="n"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;canonical_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;safe&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-_.~&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;safe&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;-_.~&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Headers: lowercase + sort + value trim
&lt;/span&gt;    &lt;span class="n"&gt;lower&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
    &lt;span class="n"&gt;sorted_keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;canonical_headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sorted_keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;signed_headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sorted_keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Payload: SHA256 hex
&lt;/span&gt;    &lt;span class="n"&gt;hashed_payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;canonical_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;canonical_query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;canonical_headers&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;signed_headers&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;hashed_payload&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;signed_headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hashed_payload&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;canonical_headers&lt;/code&gt; already ends with &lt;code&gt;\n&lt;/code&gt;, and the template adds another &lt;code&gt;\n&lt;/code&gt; before &lt;code&gt;SignedHeaders&lt;/code&gt;, so the serialized form has two newlines in a row. That is spec-correct. "Fix" it to a single newline and &lt;code&gt;SignatureDoesNotMatch&lt;/code&gt; greets you immediately.&lt;/p&gt;
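
&lt;p&gt;One way to see that blank line is to split a Canonical Request on newlines and count. This is only a sketch: the bucket host, object key, and timestamp below are hypothetical, hard-coded values.&lt;/p&gt;

```python
import hashlib

# Hypothetical, hard-coded inputs for a bodyless GET against S3.
headers = {
    "host": "example-bucket.s3.us-east-1.amazonaws.com",
    "x-amz-content-sha256": hashlib.sha256(b"").hexdigest(),
    "x-amz-date": "20260517T120000Z",
}
names = sorted(headers)
canonical_headers = "".join(f"{k}:{headers[k]}\n" for k in names)
signed_headers = ";".join(names)

canonical_request = (
    "GET\n/hello.txt\n\n"
    f"{canonical_headers}\n{signed_headers}\n{headers['x-amz-content-sha256']}"
)

lines = canonical_request.split("\n")
# method, URI, query, three headers, blank line, SignedHeaders, payload hash
print(len(lines))      # 9
print(repr(lines[2]))  # '' (the empty query string)
print(repr(lines[6]))  # '' (the blank line after the header block)
```

&lt;p&gt;If the count comes out one short, the blank line after the header block is the first thing to check.&lt;/p&gt;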




&lt;h2&gt;
  
  
  5. Step 2: Building the StringToSign
&lt;/h2&gt;

&lt;p&gt;Once the Canonical Request exists, SHA256 it and combine the result with the signing scope (algorithm + datetime + region + service) into a single string. That is the StringToSign.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;StringToSign =
  Algorithm + '\n' +
  RequestDateTime + '\n' +
  CredentialScope + '\n' +
  HEX(SHA256(CanonicalRequest))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



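&lt;p&gt;Concrete example. The last line is only a stand-in (it is the SHA256 of an empty string); in a real request it is the hash of your own Canonical Request.&lt;br&gt;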
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AWS4-HMAC-SHA256
20260517T120000Z
20260517/us-east-1/s3/aws4_request
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CredentialScope&lt;/code&gt; has the shape &lt;code&gt;&amp;lt;date&amp;gt;/&amp;lt;region&amp;gt;/&amp;lt;service&amp;gt;/aws4_request&lt;/code&gt;. That alone tells AWS "this signature is for this day, this region, and this service". A signature scoped to &lt;code&gt;s3&lt;/code&gt; cannot be reused against &lt;code&gt;dynamodb&lt;/code&gt;. That scoping limits where and when a captured signature can be replayed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;string_to_sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amz_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;canonical_req&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;algorithm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS4-HMAC-SHA256&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/aws4_request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;hashed_cr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;canonical_req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;algorithm&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;amz_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;hashed_cr&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;amz_date&lt;/code&gt; is the basic ISO8601 form &lt;code&gt;20260517T120000Z&lt;/code&gt; (no &lt;code&gt;-&lt;/code&gt; or &lt;code&gt;:&lt;/code&gt;). &lt;code&gt;scope_date&lt;/code&gt; is just &lt;code&gt;20260517&lt;/code&gt;. Mismatching them is a classic bug.&lt;/p&gt;
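
&lt;p&gt;A simple way to avoid the mismatch is to derive both strings from a single UTC instant. The helper name &lt;code&gt;sigv4_timestamps&lt;/code&gt; is hypothetical; this is a sketch, not an SDK function.&lt;/p&gt;

```python
import datetime


def sigv4_timestamps(now=None):
    """Return (amz_date, scope_date) derived from one UTC instant."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    # Convert to UTC so the trailing "Z" is truthful.
    # (A naive datetime is treated as local time by astimezone.)
    now = now.astimezone(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    scope_date = now.strftime("%Y%m%d")
    return amz_date, scope_date


fixed = datetime.datetime(2026, 5, 17, 12, 0, 0, tzinfo=datetime.timezone.utc)
amz_date, scope_date = sigv4_timestamps(fixed)
print(amz_date)    # 20260517T120000Z
print(scope_date)  # 20260517
```

&lt;p&gt;Because both values come from the same &lt;code&gt;now&lt;/code&gt;, &lt;code&gt;amz_date&lt;/code&gt; always starts with &lt;code&gt;scope_date&lt;/code&gt;, even for a request built right around midnight UTC.&lt;/p&gt;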




&lt;h2&gt;
  
  
  6. Step 3: Deriving the SigningKey (4-stage HMAC chain)
&lt;/h2&gt;

&lt;p&gt;This is the heart of SigV4. You never sign with the Secret Key itself. You chain HMAC four times from the Secret Key to build &lt;strong&gt;kSigning&lt;/strong&gt;, and sign with that.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F04-signing-key-derivation.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F04-signing-key-derivation.png" alt="Four-stage HMAC chain that derives the signing key" width="800" height="959"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why four stages? &lt;strong&gt;Key separation&lt;/strong&gt;. kDate is "a key valid only for this day", kRegion is "valid only for this day and this region", and so on. Each stage narrows the scope further. Even if kRegion leaks to an attacker, it cannot be used on another day or in another region. &lt;strong&gt;The Secret Key is long-lived, but kSigning is disposable, scoped to one day, one region, one service.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;signing_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;k_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k_region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k_service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k_region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k_signing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k_service&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws4_request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;k_signing&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;"AWS4" + SecretAccessKey&lt;/code&gt; prefix is hard-coded in the spec. Forget the &lt;code&gt;AWS4&lt;/code&gt; and &lt;code&gt;SignatureDoesNotMatch&lt;/code&gt; lands on the first request.&lt;/p&gt;
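
&lt;p&gt;To see the key separation in action, here is a small sketch that derives keys for a few different scopes and compares them. The &lt;code&gt;derive&lt;/code&gt; helper and its &lt;code&gt;prefix&lt;/code&gt; parameter are hypothetical and exist only to show the missing-prefix failure mode; the secret is the same example value used elsewhere in this article, not a real credential.&lt;/p&gt;

```python
import hashlib
import hmac

# Example secret from this article, not a real credential.
SECRET = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"


def derive(secret, scope_date, region, service, prefix="AWS4"):
    """Run the four-stage HMAC chain and return the raw signing key bytes."""
    key = (prefix + secret).encode()
    for part in (scope_date, region, service, "aws4_request"):
        key = hmac.new(key, part.encode(), hashlib.sha256).digest()
    return key


base = derive(SECRET, "20260517", "us-east-1", "s3")
other_service = derive(SECRET, "20260517", "us-east-1", "dynamodb")
other_day = derive(SECRET, "20260518", "us-east-1", "s3")
no_prefix = derive(SECRET, "20260517", "us-east-1", "s3", prefix="")

print(len(base))  # 32 (one SHA256 digest)
# Every scope change, and the missing prefix, yields a different key.
print(len({base, other_service, other_day, no_prefix}))  # 4
```

&lt;p&gt;Changing any single input produces an unrelated key, so a signature made with one of them does not verify under another.&lt;/p&gt;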




&lt;h2&gt;
  
  
  7. Step 4: Computing the Signature and the Authorization Header
&lt;/h2&gt;

&lt;p&gt;The last step is simply to HMAC the StringToSign with kSigning and hex-encode the result.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k_signing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string_to_sign&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k_signing&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string_to_sign&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Authorization header is assembled like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Authorization: AWS4-HMAC-SHA256
  Credential=&amp;lt;AccessKeyId&amp;gt;/&amp;lt;scope&amp;gt;,
  SignedHeaders=&amp;lt;signed&amp;gt;,
  Signature=&amp;lt;hex&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A real example.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Authorization: AWS4-HMAC-SHA256 Credential=AKIAIOSFODNN7EXAMPLE/20260517/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;Credential&lt;/code&gt; holds the AccessKeyId and the CredentialScope, &lt;code&gt;SignedHeaders&lt;/code&gt; lists the signed header names with &lt;code&gt;;&lt;/code&gt; separators, and &lt;code&gt;Signature&lt;/code&gt; is the hex string. This is exactly what the SDK is assembling for you.&lt;/p&gt;
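
&lt;p&gt;As a sketch, the assembly is plain string formatting. The function name &lt;code&gt;authorization_header&lt;/code&gt; is hypothetical; the inputs are the values from the example header in this section. Note that the real header value is a single line, unlike the wrapped schematic.&lt;/p&gt;

```python
def authorization_header(access_key, scope, signed_headers, signature):
    """Join the three components into the Authorization header value."""
    return (
        "AWS4-HMAC-SHA256 "
        f"Credential={access_key}/{scope}, "
        f"SignedHeaders={signed_headers}, "
        f"Signature={signature}"
    )


value = authorization_header(
    "AKIAIOSFODNN7EXAMPLE",
    "20260517/us-east-1/s3/aws4_request",
    "host;x-amz-content-sha256;x-amz-date",
    "fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024",
)
print("Authorization: " + value)
```

&lt;p&gt;The &lt;code&gt;signed_headers&lt;/code&gt; argument has to be the exact same string that went into the Canonical Request, otherwise the server rebuilds a different request and the signature check fails.&lt;/p&gt;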




&lt;h2&gt;
  
  
  8. Full implementation: hitting S3 GetObject with SigV4
&lt;/h2&gt;

&lt;p&gt;The four steps wired together in under 80 lines. Pure &lt;code&gt;urllib&lt;/code&gt;, no SDK.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urllib.request&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;urllib.parse&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;quote&lt;/span&gt;

&lt;span class="n"&gt;ACCESS_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AKIAIOSFODNN7EXAMPLE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;SECRET_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;REGION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;us-east-1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;SERVICE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;s3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;BUCKET&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kt-sigv4-demo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hello.txt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;HOST&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BUCKET&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.s3.&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;REGION&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.amazonaws.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;digest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;signing_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;hmac_sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;aws4_request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sigv4_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bucket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;amz_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%Y%m%dT%H%M%SZ&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scope_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strftime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%Y%m%d&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scope&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;REGION&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;SERVICE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/aws4_request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;GET&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;canonical_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;safe&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/-_.~&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;canonical_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;payload_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;b&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;headers_to_sign&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;host&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;HOST&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-amz-content-sha256&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;payload_hash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-amz-date&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;amz_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;sorted_keys&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;headers_to_sign&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="n"&gt;canonical_headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;headers_to_sign&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sorted_keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;signed_headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sorted_keys&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;canonical_request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;canonical_uri&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;canonical_query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;canonical_headers&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;signed_headers&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;payload_hash&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;hashed_cr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;canonical_request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;string_to_sign&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS4-HMAC-SHA256&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;amz_date&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;hashed_cr&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="n"&gt;k_signing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;signing_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SECRET_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scope_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;REGION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SERVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;signature&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hmac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k_signing&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;string_to_sign&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;authorization&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWS4-HMAC-SHA256 Credential=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ACCESS_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;scope&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SignedHeaders=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;signed_headers&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, Signature=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;signature&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HOST&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;headers_to_sign&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;urllib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;


&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;sigv4_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BUCKET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Typical failures.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Forgot &lt;code&gt;host&lt;/code&gt; in headers_to_sign: SignatureDoesNotMatch&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;x-amz-date&lt;/code&gt; and the date inside &lt;code&gt;Authorization&lt;/code&gt; differ: SignatureDoesNotMatch&lt;/li&gt;
&lt;li&gt;Body is empty but you forgot to compute payload_hash (an empty body still needs the SHA-256 of the empty string): SignatureDoesNotMatch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The SDK does all of this correctly. When writing your own, the fastest debugging path is a &lt;strong&gt;byte-for-byte diff against what the SDK produces&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. End-to-end sequence: Client and Server verification
&lt;/h2&gt;

&lt;p&gt;The server side (AWS) reproduces the exact same procedure to build the signature, then compares against what the client sent.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F05-client-server-verification.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F05-client-server-verification.png" alt="Client signs and server reverifies sequence" width="800" height="869"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When AWS returns &lt;code&gt;SignatureDoesNotMatch&lt;/code&gt;, &lt;strong&gt;the response body contains the Canonical Request that AWS itself reconstructed&lt;/strong&gt;. That is gold for debugging. Diff your own Canonical Request against the one AWS built and the divergent element (header trim, URI encoding, missing header) pops out immediately.&lt;/p&gt;
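&lt;p&gt;That diff can be a few lines of standard-library Python. This is a minimal sketch, assuming you have already captured both Canonical Requests as strings; the helper name &lt;code&gt;diff_canonical_requests&lt;/code&gt; and the sample values are made up for illustration and are not part of any SDK.&lt;/p&gt;

```python
import difflib


def diff_canonical_requests(mine: str, theirs: str) -> list[str]:
    """Return unified-diff lines between two canonical requests."""
    return list(
        difflib.unified_diff(
            mine.split("\n"),
            theirs.split("\n"),
            fromfile="client",
            tofile="aws",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical pair: the client left a trailing space in the host header.
    mine = "GET\n/report.pdf\n\nhost:example.invalid \n\nhost\nUNSIGNED-PAYLOAD"
    theirs = "GET\n/report.pdf\n\nhost:example.invalid\n\nhost\nUNSIGNED-PAYLOAD"
    for line in diff_canonical_requests(mine, theirs):
        # repr() makes trailing whitespace and stray characters visible.
        print(repr(line))
```

&lt;p&gt;Printing each line through &lt;code&gt;repr&lt;/code&gt; is the point of the exercise: the usual culprits (untrimmed header values, a double-encoded path) are invisible in a plain print.&lt;/p&gt;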




&lt;h2&gt;
  
  
  10. Chunked Upload: STREAMING-AWS4-HMAC-SHA256-PAYLOAD
&lt;/h2&gt;

&lt;p&gt;For large uploads to S3, hashing the entire body up front is not realistic. Hashing a 10 GB file before you even start sending it is a latency disaster. So S3 has a STREAMING mode where &lt;strong&gt;each chunk is signed individually&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The headers look like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;x-amz-content-sha256: STREAMING-AWS4-HMAC-SHA256-PAYLOAD
Content-Encoding: aws-chunked
x-amz-decoded-content-length: &amp;lt;original body size&amp;gt;
Content-Length: &amp;lt;size including chunk headers&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The body resembles HTTP chunked transfer encoding but uses a custom format: each chunk starts with its size in hexadecimal, followed by that chunk's signature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;10000;chunk-signature=&amp;lt;sig1&amp;gt;\r\n
&amp;lt;65536 byte chunk 1&amp;gt;\r\n
10000;chunk-signature=&amp;lt;sig2&amp;gt;\r\n
&amp;lt;65536 byte chunk 2&amp;gt;\r\n
...
0;chunk-signature=&amp;lt;sigN&amp;gt;\r\n
\r\n
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each chunk's signature is computed by chaining the previous chunk's signature with the current chunk's hash. Swap a chunk in the middle and every signature after it falls apart.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F06-chunked-signature-chain.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F06-chunked-signature-chain.png" alt="Chunk signatures chained through previous-signature" width="800" height="1720"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The StringToSign gets a chunk-specific extension.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;AWS4-HMAC-SHA256-PAYLOAD
&amp;lt;amz-date&amp;gt;
&amp;lt;scope&amp;gt;
&amp;lt;previous-signature&amp;gt;
HEX(SHA256(""))
HEX(SHA256(chunk-data))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



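&lt;p&gt;&lt;code&gt;previous-signature&lt;/code&gt; is what links the chain: the first chunk uses the signature from the request's &lt;code&gt;Authorization&lt;/code&gt; header (the seed signature), and every later chunk uses the signature of the chunk before it. The stream terminates with a &lt;strong&gt;zero-byte chunk&lt;/strong&gt;, which carries a signature like any other chunk. There is also a &lt;code&gt;STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER&lt;/code&gt; variant that appends trailer headers like CRC32C for an extra integrity check.&lt;/p&gt;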

&lt;p&gt;Writing this by hand is basically masochism. Let the SDK handle it. But knowing the mechanics makes questions like "why is Content-Length different from x-amz-decoded-content-length" trivial to answer.&lt;/p&gt;
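&lt;p&gt;For reading purposes only, the chain fits in a short sketch. It assumes you already hold the derived signing key and the seed signature (the one from the request's &lt;code&gt;Authorization&lt;/code&gt; header); the function names &lt;code&gt;chunk_signature&lt;/code&gt; and &lt;code&gt;encode_chunks&lt;/code&gt; are illustrative, and this is not a replacement for the SDK's implementation.&lt;/p&gt;

```python
import hashlib
import hmac

EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()


def chunk_signature(signing_key: bytes, amz_date: str, scope: str,
                    previous_signature: str, chunk: bytes) -> str:
    """Sign one chunk, chaining in the previous chunk's signature."""
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256-PAYLOAD",
        amz_date,
        scope,
        previous_signature,
        EMPTY_SHA256,
        hashlib.sha256(chunk).hexdigest(),
    ])
    return hmac.new(signing_key, string_to_sign.encode(), hashlib.sha256).hexdigest()


def encode_chunks(signing_key: bytes, amz_date: str, scope: str,
                  seed_signature: str, chunks: list[bytes]) -> bytes:
    """Frame chunks in the aws-chunked layout, ending with the zero-byte chunk."""
    body = bytearray()
    previous = seed_signature
    for chunk in [*chunks, b""]:
        previous = chunk_signature(signing_key, amz_date, scope, previous, chunk)
        # Chunk size is written in hexadecimal, without a 0x prefix.
        header = f"{len(chunk):x};chunk-signature={previous}\r\n"
        body += header.encode() + chunk + b"\r\n"
    return bytes(body)
```

&lt;p&gt;With this framing, &lt;code&gt;len(encode_chunks(...))&lt;/code&gt; is what goes into &lt;code&gt;Content-Length&lt;/code&gt;, while the sum of the raw chunk lengths goes into &lt;code&gt;x-amz-decoded-content-length&lt;/code&gt;. The gap between the two is exactly the per-chunk header overhead.&lt;/p&gt;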




&lt;h2&gt;
  
  
  11. Streaming SigV4: WebSocket / IoT MQTT
&lt;/h2&gt;

&lt;p&gt;SigV4 also rides on &lt;strong&gt;WebSocket&lt;/strong&gt; and &lt;strong&gt;MQTT over WebSocket&lt;/strong&gt;, not just plain HTTP. IoT Core, API Gateway WebSocket, and CloudWatch Logs Live Tail all sit here.&lt;/p&gt;

&lt;p&gt;These use a &lt;strong&gt;Presigned URL form that embeds SigV4 into the connection URL's query string&lt;/strong&gt;. The &lt;code&gt;Authorization&lt;/code&gt; header is hard to attach to a WebSocket Upgrade, so cramming everything into the URL lets the handshake complete in one round trip. With MQTT over WebSocket the URL ends up as &lt;code&gt;wss://&amp;lt;endpoint&amp;gt;/mqtt?X-Amz-Algorithm=...&amp;amp;X-Amz-Signature=...&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The streaming case is special because &lt;strong&gt;the connection lives for a long time&lt;/strong&gt;. The SigV4 signature itself is checked only once at connection establishment, so even after &lt;code&gt;X-Amz-Expires&lt;/code&gt; passes, the existing connection keeps going (depending on the AWS implementation). Reconnecting requires a fresh signature.&lt;/p&gt;




&lt;h2&gt;
  
  
  12. What is inside a Presigned URL
&lt;/h2&gt;

&lt;p&gt;A Presigned URL is &lt;strong&gt;the Authorization header crammed into the query string&lt;/strong&gt;. Heavily used for "hand someone a URL and let the browser download the S3 object directly".&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://kt-bucket.s3.us-east-1.amazonaws.com/report.pdf
  ?X-Amz-Algorithm=AWS4-HMAC-SHA256
  &amp;amp;X-Amz-Credential=AKIA.../20260517/us-east-1/s3/aws4_request
  &amp;amp;X-Amz-Date=20260517T120000Z
  &amp;amp;X-Amz-Expires=900
  &amp;amp;X-Amz-SignedHeaders=host
  &amp;amp;X-Amz-Signature=fe5f80f77d5fa3beca038a248ff027d0445342fe2855ddc963176630326f1024
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Properties.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;X-Amz-Expires&lt;/strong&gt;: in seconds. &lt;strong&gt;Maximum 604800 (= 7 days)&lt;/strong&gt;. The 7-day cap exists because AWS accepts a derived SigV4 signing key for at most seven days (kDate is tied to a single day, but the resulting kSigning is honored for up to 7 days).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;X-Amz-SignedHeaders=host&lt;/strong&gt;: no body, no extra headers, so only host needs to be signed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payload hash handling&lt;/strong&gt;: presigned URLs use either &lt;code&gt;UNSIGNED-PAYLOAD&lt;/code&gt; or the actual body hash. Looser than the header-based form.&lt;/li&gt;
&lt;/ul&gt;
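&lt;p&gt;To make those properties concrete, here is a minimal standard-library sketch of presigning an S3 GET. It assumes long-term credentials (no session token), signs only &lt;code&gt;host&lt;/code&gt;, and uses &lt;code&gt;UNSIGNED-PAYLOAD&lt;/code&gt;; the function name &lt;code&gt;presign_s3_get&lt;/code&gt; is illustrative. Treat it as a reading aid, not as a substitute for the SDK's presigner.&lt;/p&gt;

```python
import hashlib
import hmac
from urllib.parse import quote, urlencode


def _hmac(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def presign_s3_get(access_key: str, secret_key: str, region: str,
                   host: str, path: str, amz_date: str, expires: int) -> str:
    """Return a presigned GET URL; only the host header is signed."""
    scope = f"{amz_date[:8]}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    # Canonical query string: sorted by key, percent-encoded ("/" becomes %2F).
    canonical_query = urlencode(sorted(params.items()), quote_via=quote)
    canonical_uri = quote(path, safe="/-_.~")
    canonical_request = "\n".join([
        "GET",
        canonical_uri,
        canonical_query,
        f"host:{host}\n",      # canonical headers block, newline-terminated
        "host",                # signed headers
        "UNSIGNED-PAYLOAD",    # the body is not hashed for a presigned GET
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode()).hexdigest(),
    ])
    # Four-stage HMAC chain: kDate, kRegion, kService, kSigning.
    key = ("AWS4" + secret_key).encode()
    for part in (amz_date[:8], region, "s3", "aws4_request"):
        key = _hmac(key, part)
    params["X-Amz-Signature"] = hmac.new(
        key, string_to_sign.encode(), hashlib.sha256
    ).hexdigest()
    return f"https://{host}{canonical_uri}?{urlencode(params, quote_via=quote)}"
```

&lt;p&gt;The signature is appended last because it cannot be part of its own input: everything else in the query string is signed, the signature itself is not.&lt;/p&gt;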

&lt;p&gt;Watch out for these.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A Presigned URL minted with IAM Role temporary credentials (e.g. STS AssumeRole) is capped by the credential's own lifetime&lt;/strong&gt;. &lt;code&gt;AssumeRole&lt;/code&gt; defaults to 1 hour for &lt;code&gt;DurationSeconds&lt;/code&gt;, configurable up to 12 hours, but once the SessionToken expires the URL dies. This is the standard root cause when someone passes &lt;code&gt;--expires-in 604800&lt;/code&gt; and the URL still dies after 1 hour.&lt;/li&gt;
&lt;li&gt;If you are handing the URL to a browser and need it to outlive the session, mint it with an &lt;strong&gt;IAM User long-term key&lt;/strong&gt;; raising the Role's MaxSessionDuration only helps up to its 12-hour ceiling. EC2 Instance Profile lands in the same trap: IMDS auto-rotates the credentials, and any URL signed with the previous set dies once that set expires.&lt;/li&gt;
&lt;/ul&gt;
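&lt;p&gt;The rule boils down to "whichever deadline comes first". A small sketch, assuming you know when the temporary credentials expire (for STS that is the &lt;code&gt;Expiration&lt;/code&gt; field of the returned credentials); the helper name &lt;code&gt;effective_expiry&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def effective_expiry(signed_at: datetime, expires_in: int,
                     credential_expiration: Optional[datetime] = None) -> datetime:
    """Return the moment a presigned URL actually stops working."""
    requested = signed_at + timedelta(seconds=expires_in)
    if credential_expiration is None:
        return requested  # long-term key: only X-Amz-Expires applies
    return min(requested, credential_expiration)


if __name__ == "__main__":
    signed = datetime(2026, 5, 17, 12, 0, tzinfo=timezone.utc)
    session_end = signed + timedelta(hours=1)  # default AssumeRole duration
    print(effective_expiry(signed, 604800, session_end))
```

&lt;p&gt;Logging this value next to every URL you hand out makes the "it died after an hour" reports much quicker to explain.&lt;/p&gt;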




&lt;h2&gt;
  
  
  13. SigV4A: the asymmetric variant
&lt;/h2&gt;

&lt;p&gt;Now the main event. The instant you hit &lt;strong&gt;Multi-Region Access Point (MRAP)&lt;/strong&gt;, the SDK silently flips from SigV4 to SigV4A. The algorithm name is &lt;code&gt;AWS4-ECDSA-P256-SHA256&lt;/code&gt;. It uses &lt;strong&gt;ECDSA (Elliptic Curve Digital Signature Algorithm)&lt;/strong&gt; with NIST P-256 instead of HMAC.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F07-sigv4a-kdf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F07-sigv4a-kdf.png" alt="SigV4A keypair derivation from SecretAccessKey" width="800" height="1278"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  How it actually works
&lt;/h3&gt;

&lt;p&gt;SigV4A &lt;strong&gt;derives an ECDSA P-256 keypair from the SecretAccessKey through a KDF&lt;/strong&gt;. The spec uses &lt;code&gt;input_key = "AWS4A" || sk&lt;/code&gt;, the label &lt;code&gt;AWS4-ECDSA-P256-SHA256&lt;/code&gt;, and the &lt;code&gt;akid&lt;/code&gt; (AccessKeyId) as context, and it increments a counter until the derived candidate &lt;code&gt;c&lt;/code&gt; is at most &lt;code&gt;n - 2&lt;/code&gt;, where &lt;code&gt;n&lt;/code&gt; is the order of the P-256 group. The private key is then &lt;code&gt;k = c + 1&lt;/code&gt;, which is guaranteed to fall in the valid range from 1 to &lt;code&gt;n - 1&lt;/code&gt;, and the public key is &lt;code&gt;Q = k * G&lt;/code&gt;. The verifier needs only the public key to check the signature. The canonical implementation lives in &lt;code&gt;key_derivation.c&lt;/code&gt; in &lt;code&gt;awslabs/aws-c-auth&lt;/code&gt;.&lt;/p&gt;
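&lt;p&gt;A rough sketch of that derivation in standard-library Python follows. The inputs (key prefix, label, AccessKeyId as context, counter, &lt;code&gt;k = c + 1&lt;/code&gt;) come from the description above, but the &lt;strong&gt;exact byte layout of the HMAC input is an assumption&lt;/strong&gt; based on my reading of &lt;code&gt;aws-c-auth&lt;/code&gt;, and the range check is written as "at most &lt;code&gt;n - 2&lt;/code&gt;" so that &lt;code&gt;c + 1&lt;/code&gt; stays a valid scalar. Verify against &lt;code&gt;key_derivation.c&lt;/code&gt; before relying on it; the function name is hypothetical.&lt;/p&gt;

```python
import hashlib
import hmac

# Order n of the NIST P-256 group.
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
LABEL = b"AWS4-ECDSA-P256-SHA256"


def derive_sigv4a_private_scalar(access_key_id: str, secret_access_key: str) -> int:
    """Derive the ECDSA private scalar k from an AWS credential pair."""
    input_key = b"AWS4A" + secret_access_key.encode()
    for counter in range(1, 255):
        # Assumed layout: block index, label, separator, context, counter, length.
        fixed_input = (
            (1).to_bytes(4, "big")
            + LABEL
            + b"\x00"
            + access_key_id.encode()
            + bytes([counter])
            + (256).to_bytes(4, "big")   # requested output length in bits
        )
        digest = hmac.new(input_key, fixed_input, hashlib.sha256).digest()
        c = int.from_bytes(digest, "big")
        if c > P256_ORDER - 2:
            continue  # candidate out of range: bump the counter and retry
        return c + 1
    raise ValueError("no valid scalar found for these credentials")
```

&lt;p&gt;Turning the scalar into a usable key object is where a real crypto library comes in, for example &lt;code&gt;ec.derive_private_key(k, ec.SECP256R1())&lt;/code&gt; from the &lt;code&gt;cryptography&lt;/code&gt; package. Note that this plain-Python comparison is not constant-time, which is one more reason to leave production signing to &lt;code&gt;awscrt&lt;/code&gt;.&lt;/p&gt;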

&lt;h3&gt;
  
  
  Why Multi-Region forces this
&lt;/h3&gt;

&lt;p&gt;MRAP is a single global endpoint that fronts a set of buckets across multiple regions, and any request can be routed to any of them. SigV4 &lt;strong&gt;bakes the region name directly into CredentialScope&lt;/strong&gt;, so a signature minted for &lt;code&gt;us-east-1&lt;/code&gt; simply cannot be verified at &lt;code&gt;us-west-2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;SigV4A &lt;strong&gt;drops the region segment from CredentialScope entirely&lt;/strong&gt; and replaces it with the &lt;strong&gt;&lt;code&gt;X-Amz-Region-Set&lt;/code&gt;&lt;/strong&gt; header that declares "this signature is valid in us-east-1, us-west-2, and eu-west-1". CredentialScope changes like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SigV4:  20260517/us-east-1/s3/aws4_request
SigV4A: 20260517/s3/aws4_request   (region segment removed)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;X-Amz-Region-Set&lt;/code&gt; value is a comma-separated list like &lt;code&gt;us-east-1,us-west-2&lt;/code&gt;, and wildcards like &lt;code&gt;us-east-*&lt;/code&gt; or a bare &lt;code&gt;*&lt;/code&gt; are allowed too.&lt;/p&gt;

&lt;p&gt;On top of that, every region can verify independently with only the public key. &lt;strong&gt;Distributing a symmetric HMAC key to every region is a security risk&lt;/strong&gt; (compromise one region and they all leak), which is exactly where asymmetric signing pays off.&lt;/p&gt;
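&lt;p&gt;The region-set check itself is easy to picture. Below is a client-side sketch of how a region could be tested against an &lt;code&gt;X-Amz-Region-Set&lt;/code&gt; value. How AWS matches wildcards on the server is not documented in this article, so the glob semantics of &lt;code&gt;fnmatchcase&lt;/code&gt; (which also accepts &lt;code&gt;?&lt;/code&gt; and bracket patterns) are an assumption, and &lt;code&gt;region_in_set&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from fnmatch import fnmatchcase


def region_in_set(region: str, region_set: str) -> bool:
    """Return True if region matches any entry of a comma-separated region set."""
    patterns = [p.strip() for p in region_set.split(",") if p.strip()]
    return any(fnmatchcase(region, pattern) for pattern in patterns)


if __name__ == "__main__":
    print(region_in_set("us-east-1", "us-east-1,us-west-2"))
    print(region_in_set("eu-west-1", "us-east-*"))
    print(region_in_set("ap-south-1", "*"))
```

&lt;p&gt;A bare &lt;code&gt;*&lt;/code&gt; makes the signature valid everywhere, so a narrower set is the safer default whenever you control the value.&lt;/p&gt;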

&lt;h3&gt;
  
  
  Is the ECDSA here deterministic
&lt;/h3&gt;

&lt;p&gt;Textbook ECDSA draws a fresh random nonce for every signature, so the same input produces a different signature each time. (The nonce is conventionally written &lt;code&gt;k&lt;/code&gt; as well, but it has nothing to do with the private key &lt;code&gt;k&lt;/code&gt; from the key derivation.) Bias in the nonce, or reusing it, is a fatal mistake that leaks the private key. SigV4A's implementation lives in AWS Common Runtime (&lt;code&gt;awslabs/aws-c-auth&lt;/code&gt; + &lt;code&gt;aws-c-cal&lt;/code&gt;), and the nonce comes from the OS RNG, so &lt;strong&gt;the same request produces a different signature on every call&lt;/strong&gt;. Public docs do not state whether RFC 6979 (deterministic ECDSA) is used anywhere, so do not rely on signatures being reproducible. "&lt;code&gt;awscrt&lt;/code&gt; handles the nonce correctly" is enough to know in practice.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rolling your own is brutal
&lt;/h3&gt;

&lt;p&gt;ECDSA has heavy math, so if you go custom, use the &lt;code&gt;cryptography&lt;/code&gt; library. AWS officially recommends &lt;code&gt;aws-crt&lt;/code&gt; (a C library bound into Python through &lt;code&gt;awscrt&lt;/code&gt;). &lt;code&gt;botocore&lt;/code&gt; treats &lt;code&gt;awscrt&lt;/code&gt; as an &lt;strong&gt;optional dependency&lt;/strong&gt;. Install with &lt;code&gt;pip install botocore[crt]&lt;/code&gt; (or &lt;code&gt;pip install boto3[crt]&lt;/code&gt;) and SigV4A kicks in automatically against MRAP. Plain &lt;code&gt;boto3&lt;/code&gt; without the crt extra crashes against MRAP with &lt;code&gt;MissingDependencyException&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  14. SigV4 vs SigV4A side by side
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;SigV4&lt;/th&gt;
&lt;th&gt;SigV4A&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Algorithm name&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AWS4-HMAC-SHA256&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AWS4-ECDSA-P256-SHA256&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Key type&lt;/td&gt;
&lt;td&gt;Symmetric (HMAC)&lt;/td&gt;
&lt;td&gt;Asymmetric (ECDSA P-256)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Key derivation&lt;/td&gt;
&lt;td&gt;4-stage HMAC chain (kDate to kRegion to kService to kSigning)&lt;/td&gt;
&lt;td&gt;One-shot KDF with a counter to derive an ECDSA keypair (label &lt;code&gt;AWS4-ECDSA-P256-SHA256&lt;/code&gt;, input &lt;code&gt;AWS4A&lt;/code&gt; + secret)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Region in CredentialScope&lt;/td&gt;
&lt;td&gt;Fixed (e.g. &lt;code&gt;us-east-1&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;Segment removed. &lt;code&gt;X-Amz-Region-Set&lt;/code&gt; header declares the regions (wildcards allowed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verification&lt;/td&gt;
&lt;td&gt;AWS recomputes the same HMAC from the same Secret&lt;/td&gt;
&lt;td&gt;AWS verifies the ECDSA signature with the public key&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Main use case&lt;/td&gt;
&lt;td&gt;Any single-region API call&lt;/td&gt;
&lt;td&gt;Multi-Region Access Point, replication paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Signature determinism&lt;/td&gt;
&lt;td&gt;Deterministic (same input gives same signature)&lt;/td&gt;
&lt;td&gt;Non-deterministic (&lt;code&gt;awscrt&lt;/code&gt; uses a random nonce)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance&lt;/td&gt;
&lt;td&gt;HMAC is ultra-light (microseconds)&lt;/td&gt;
&lt;td&gt;ECDSA is heavier (milliseconds)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7-day Presigned URL&lt;/td&gt;
&lt;td&gt;Supported&lt;/td&gt;
&lt;td&gt;Supported (same conditions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SDK support&lt;/td&gt;
&lt;td&gt;All SDKs&lt;/td&gt;
&lt;td&gt;Through &lt;code&gt;aws-crt&lt;/code&gt; (Python needs &lt;code&gt;botocore[crt]&lt;/code&gt;; Go/Java/JS bundle it)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For working developers the only thing that matters in practice: &lt;strong&gt;the moment MRAP enters the picture the SDK flips to SigV4A; everything else stays on SigV4&lt;/strong&gt;. You almost never reach for SigV4A by hand.&lt;/p&gt;




&lt;h2&gt;
  
  
  15. SigV4 vs SigV4A: scope and verifier differences in one figure
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F08-sigv4-vs-sigv4a-scope.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-sigv4-and-sigv4a-deep-dive%2Fdiagrams%2F08-sigv4-vs-sigv4a-scope.png" alt="SigV4 vs SigV4A scope and verifier comparison" width="800" height="803"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;SigV4 assumes &lt;strong&gt;1 request = 1 region&lt;/strong&gt;, so HMAC is enough. SigV4A is built around &lt;strong&gt;1 request landing in any of several regions&lt;/strong&gt;, which makes a shared HMAC untenable and forces ECDSA. That is the structural difference.&lt;/p&gt;




&lt;h2&gt;
  
  
  16. The Clock Skew trap
&lt;/h2&gt;

&lt;p&gt;SigV4's &lt;code&gt;CredentialScope&lt;/code&gt; carries &lt;code&gt;&amp;lt;date&amp;gt;&lt;/code&gt; and &lt;code&gt;x-amz-date&lt;/code&gt; carries a second-precision timestamp. AWS rejects requests whose timestamp is &lt;strong&gt;more than ±15 minutes off the current time&lt;/strong&gt; (the S3 docs state this explicitly; the tolerance varies slightly by service).&lt;/p&gt;

&lt;p&gt;Three classic ways to step on this.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docker container NTP drift&lt;/strong&gt;: a container shares its host's kernel clock, so if the host (or the VM behind Docker Desktop) is not NTP-synced, every container inherits the skew. An unprivileged container with no &lt;code&gt;hwclock&lt;/code&gt; and no &lt;code&gt;ntpd&lt;/code&gt; cannot correct the clock itself, so the fix belongs on the host. Lambda is fine because AWS keeps it in sync.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EC2 with chrony missing&lt;/strong&gt;: AL2023 ships &lt;code&gt;chronyd&lt;/code&gt; by default, but custom AMIs that strip it out drift over time. Point chrony at &lt;code&gt;169.254.169.123&lt;/code&gt; (Amazon Time Sync Service) and the problem disappears.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Old Lambda layers / virtualized clocks&lt;/strong&gt;: BPF-based sandboxes where the clock is pinned. Upgrading the Lambda runtime usually fixes it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Error examples.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SignatureDoesNotMatch:
  The request signature we calculated does not match the signature you provided.
  Check your key and signing method.
RequestTimeTooSkewed:
  The difference between the request time and the current time is too large.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;RequestTimeTooSkewed&lt;/code&gt; is always clock skew, 100%. &lt;code&gt;SignatureDoesNotMatch&lt;/code&gt; is either clock skew or a Canonical Request construction bug.&lt;/p&gt;
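&lt;p&gt;The ±15 minute rule is simple enough to check locally before blaming the signature. A minimal sketch, assuming you have a trustworthy reference time to compare against (for example one parsed from a response's &lt;code&gt;Date&lt;/code&gt; header); the helper names are illustrative and the 15-minute default mirrors the S3 figure quoted above.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone

MAX_SKEW = timedelta(minutes=15)


def parse_amz_date(amz_date: str) -> datetime:
    """Parse an x-amz-date value such as 20260517T120000Z into a UTC datetime."""
    return datetime.strptime(amz_date, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)


def skew_exceeded(amz_date: str, reference_now: datetime,
                  max_skew: timedelta = MAX_SKEW) -> bool:
    """Return True if the request timestamp is too far from the reference clock."""
    return abs(reference_now - parse_amz_date(amz_date)) > max_skew


if __name__ == "__main__":
    reference = datetime(2026, 5, 17, 12, 16, tzinfo=timezone.utc)
    print(skew_exceeded("20260517T120000Z", reference))
```

&lt;p&gt;The check is symmetric on purpose: a clock that runs fast is rejected just like one that runs slow.&lt;/p&gt;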




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;SigV4 solves "request signing that detects tampering and replay without ever exposing the secret". Chain HMAC four times to derive a signing key, SHA256 the Canonical Request to build a StringToSign, then HMAC the whole thing for the final signature. Write the four steps by hand once and everything the SDK was hiding finally becomes visible.&lt;/p&gt;

&lt;p&gt;SigV4A is &lt;strong&gt;SigV4 extended to multi-region by going asymmetric (ECDSA P-256)&lt;/strong&gt;. Strip the region from CredentialScope, declare the regions in &lt;code&gt;X-Amz-Region-Set&lt;/code&gt;, and let each region verify independently with the public key. The SDK flips to it automatically when MRAP is involved, so you basically never reach for it by hand.&lt;/p&gt;

&lt;p&gt;The two real-world traps are always the same: &lt;strong&gt;clock skew&lt;/strong&gt; and &lt;strong&gt;canonical-request normalization mistakes&lt;/strong&gt;. Read the exact error message (&lt;code&gt;RequestTimeTooSkewed&lt;/code&gt; vs &lt;code&gt;SignatureDoesNotMatch&lt;/code&gt; vs &lt;code&gt;AccessDenied&lt;/code&gt;) and diff your Canonical Request against the one AWS echoes back. Get that far and SigV4 stops biting.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>sigv4</category>
      <category>security</category>
      <category>signing</category>
    </item>
    <item>
      <title>AWS IAM Roles Anywhere Deep Dive</title>
      <dc:creator>kt</dc:creator>
      <pubDate>Thu, 28 May 2026 15:46:19 +0000</pubDate>
      <link>https://dev.to/kanywst/aws-iam-roles-anywhere-deep-dive-j51</link>
      <guid>https://dev.to/kanywst/aws-iam-roles-anywhere-deep-dive-j51</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;"I want to drop a file from an on-prem server into S3."&lt;br&gt;
"I want to read DynamoDB from a Kubernetes pod sitting in my datacenter."&lt;br&gt;
"I want to pull a secret from AWS Secrets Manager from an app in someone else's cloud."&lt;/p&gt;

&lt;p&gt;Do this the naive way and you end up here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create an IAM User&lt;/li&gt;
&lt;li&gt;Issue an access key (the &lt;code&gt;AKIA...&lt;/code&gt; kind)&lt;/li&gt;
&lt;li&gt;Paste it into &lt;code&gt;~/.aws/credentials&lt;/code&gt;, env vars, or some on-prem Secrets Manager&lt;/li&gt;
&lt;li&gt;Use it forever&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is where most production incidents start today. Long-lived access keys leak and stay leaked, rotation gets forgotten, and there is no record of who copied them where or when.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AWS IAM Roles Anywhere&lt;/strong&gt; is the mechanism that hands IAM Role temporary credentials to &lt;strong&gt;workloads outside AWS&lt;/strong&gt; without distributing any long-lived key. The key material is replaced by an &lt;strong&gt;X.509 certificate&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This article goes deep on Roles Anywhere.&lt;/p&gt;


&lt;h2&gt;
  
  
  1. Vocabulary you need first
&lt;/h2&gt;

&lt;p&gt;Roles Anywhere sits on top of two things: IAM Role and PKI (the certificate world). If either is fuzzy you will get lost fast. The bare minimum is below.&lt;/p&gt;
&lt;h3&gt;
  
  
  IAM Role / STS / temporary credentials
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;What to remember&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IAM Role&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A box that says "who is allowed to take this permission on". It has two parts: a &lt;strong&gt;Trust Policy&lt;/strong&gt; (who can assume it) and a &lt;strong&gt;Permission Policy&lt;/strong&gt; (what the assumer can do)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AssumeRole&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The act of taking on a Role. When it succeeds, you immediately get back a set of &lt;strong&gt;temporary credentials&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;STS (Security Token Service)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The AWS service that issues temporary credentials. The &lt;code&gt;sts:AssumeRole&lt;/code&gt; family of APIs lands here&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Temporary credentials&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A three-piece set: &lt;code&gt;AccessKeyId&lt;/code&gt; + &lt;code&gt;SecretAccessKey&lt;/code&gt; + &lt;code&gt;SessionToken&lt;/code&gt;. Default lifetime 1 hour, up to 12 hours via the Role config&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  X.509 certificate / CA / PKI
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Term&lt;/th&gt;
&lt;th&gt;What to remember&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;X.509 certificate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A digital document where a CA signs "this public key belongs to this entity". The cert sitting in front of any HTTPS site is the same shape&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CA (Certificate Authority)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The party that issues certificates. Trusting a CA's certificate means trusting every certificate that chains up to it, unless that certificate has expired or been revoked&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Private CA&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A CA for closed environments such as inside a company. AWS sells a managed version called &lt;strong&gt;AWS Private CA (formerly ACM PCA)&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PKI (Public Key Infrastructure)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The whole ecosystem of issuing, distributing, and revoking certificates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Private key&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The key paired with the certificate. Digital signatures are made with it. &lt;strong&gt;Never put it on the network&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Roughly: Roles Anywhere is the mechanism that &lt;strong&gt;makes &lt;code&gt;AssumeRole&lt;/code&gt; callable using an X.509 certificate's private-key signature, instead of an IAM User's long-term key&lt;/strong&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  2. The big picture and the three actors
&lt;/h2&gt;

&lt;p&gt;Roles Anywhere has three specific concepts: &lt;strong&gt;Trust Anchor / Profile / Role&lt;/strong&gt;. Pin them down visually first.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F01-roles-anywhere-overview.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F01-roles-anywhere-overview.png" alt="Roles Anywhere overview: workload outside AWS, credential helper, Trust Anchor / Profile / Role, CreateSession to STS" width="800" height="734"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The three actors and their jobs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concept&lt;/th&gt;
&lt;th&gt;What it is&lt;/th&gt;
&lt;th&gt;What it holds&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Trust Anchor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The declaration "I (this AWS account) trust this CA"&lt;/td&gt;
&lt;td&gt;A reference to an AWS Private CA, or the PEM of an external CA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Profile&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The declaration "for callers authenticated via this CA, which Roles can they use and with what session limits"&lt;/td&gt;
&lt;td&gt;A list of allowed Role ARNs + session duration + optional Session Policy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IAM Role&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The permission body that activates once assumed&lt;/td&gt;
&lt;td&gt;Trust Policy (allowing the Roles Anywhere service principal) + Permission Policy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Let's walk through each in order.&lt;/p&gt;


&lt;h2&gt;
  
  
  3. Trust Anchor: the root of trust
&lt;/h2&gt;

&lt;p&gt;A Trust Anchor is the declaration &lt;strong&gt;"this AWS account trusts this CA"&lt;/strong&gt;. When Roles Anywhere sees a certificate, it walks the chain to confirm the certificate was issued by this CA.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F02-trust-anchor-chain.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F02-trust-anchor-chain.png" alt="Certificate chain from end-entity up through intermediate CA to root CA, with the root CA registered as Trust Anchor" width="800" height="1454"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There are two ways to create a Trust Anchor.&lt;/p&gt;
&lt;h3&gt;
  
  
  A. Use AWS Private CA (PCA) as the source
&lt;/h3&gt;

&lt;p&gt;AWS Private CA (formerly ACM PCA) is AWS's managed paid CA. When creating a Trust Anchor you say "use this PCA", and every certificate issued by that PCA is trusted automatically.&lt;/p&gt;

&lt;p&gt;PCA has two modes (Tokyo region reference pricing, varies by region):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;General Purpose Mode&lt;/strong&gt;: &lt;strong&gt;400 USD/month&lt;/strong&gt; plus per-certificate issuance fees. Full features (CRL publication, long-lived certs, freely issued via API). Capable as a replacement for an internal PKI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Short-Lived Certificate Mode&lt;/strong&gt;: &lt;strong&gt;50 USD/month&lt;/strong&gt; plus per-certificate issuance fees. Only for certificates with a &lt;strong&gt;validity period of 7 days or less&lt;/strong&gt;, no CRL publication. Optimised for the Roles Anywhere "issue short-lived certs frequently" usage pattern&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Upside: issuance, revocation, and renewal automation lean on AWS. The AWS Private CA Issuer for &lt;code&gt;cert-manager&lt;/code&gt; lets Kubernetes consume it.&lt;/p&gt;

&lt;p&gt;Downside: &lt;strong&gt;either mode bills a flat monthly fee just for having a CA stood up&lt;/strong&gt;. For verification or dev work, an external CA is cheaper.&lt;/p&gt;
&lt;h3&gt;
  
  
  B. Upload an existing external CA
&lt;/h3&gt;

&lt;p&gt;Upload a CA certificate in PEM and register it as a Trust Anchor. If you already have an internal PKI (HashiCorp Vault, smallstep &lt;code&gt;step-ca&lt;/code&gt;, an OpenSSL-based homemade CA, etc.), the extra cost is zero.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws rolesanywhere create-trust-anchor &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"my-internal-ca"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--source&lt;/span&gt; &lt;span class="s1"&gt;'{
        "sourceType": "CERTIFICATE_BUNDLE",
        "sourceData": {
            "x509CertificateData": "-----BEGIN CERTIFICATE-----\nMIID...\n-----END CERTIFICATE-----"
        }
    }'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--enabled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With an external CA you own &lt;strong&gt;revocation management (CRL updates)&lt;/strong&gt;. Covered in §8.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Profile: which Role, with what guardrails
&lt;/h2&gt;

&lt;p&gt;A Profile is &lt;strong&gt;"the usage rules for callers authenticated against a given Trust Anchor"&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F03-profile-structure.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F03-profile-structure.png" alt="Profile branches into multiple Roles plus Session Policy and Session Duration constraints" width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;What goes into a Profile:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;roleArns&lt;/strong&gt;: list of IAM Role ARNs that may be assumed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;durationSeconds&lt;/strong&gt;: maximum session lifetime (900 to 43200 seconds, i.e. 15 minutes to 12 hours, capped by the Role's &lt;code&gt;MaxSessionDuration&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sessionPolicy / managedPolicyArns&lt;/strong&gt;: extra policy that applies only to the session. "The Role's Permission Policy is X, but calls coming through this Profile are further narrowed to Y"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A Session Policy intersects with the Role's Permission Policy via &lt;strong&gt;AND&lt;/strong&gt;. Even if the Role allows &lt;code&gt;s3:*&lt;/code&gt;, narrowing the Session Policy to &lt;code&gt;s3:GetObject&lt;/code&gt; means only &lt;code&gt;s3:GetObject&lt;/code&gt; is effectively permitted.&lt;/p&gt;
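
&lt;p&gt;A toy model of that intersection, as a minimal sketch: it only matches action names with shell-style wildcards and ignores resources, conditions, and explicit denies, all of which real IAM evaluation takes into account. The function names are hypothetical.&lt;/p&gt;

```python
from fnmatch import fnmatchcase


def matches_any(action, patterns):
    """True if the action matches at least one pattern such as "s3:*"."""
    return any(fnmatchcase(action, pattern) for pattern in patterns)


def effective_allow(action, role_allows, session_allows):
    """An action passes only if BOTH the Role policy and the Session Policy allow it."""
    return matches_any(action, role_allows) and matches_any(action, session_allows)


role_allows = ["s3:*"]               # the Role's Permission Policy
session_allows = ["s3:GetObject"]    # the Profile's Session Policy

print(effective_allow("s3:GetObject", role_allows, session_allows))  # True
print(effective_allow("s3:PutObject", role_allows, session_allows))  # False
```

&lt;p&gt;The Session Policy can only take permissions away. It never grants anything the Role does not already allow.&lt;/p&gt;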




&lt;h2&gt;
  
  
  5. The Role's Trust Policy: who is allowed to call
&lt;/h2&gt;

&lt;p&gt;The Role itself is created the usual way, but its &lt;strong&gt;Trust Policy&lt;/strong&gt; (the policy saying "who can Assume this Role") has a Roles Anywhere-specific shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"Statement"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Effect"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Allow"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Principal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"Service"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"rolesanywhere.amazonaws.com"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Action"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"sts:AssumeRole"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"sts:TagSession"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="s2"&gt;"sts:SetSourceIdentity"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"Condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"StringEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"aws:PrincipalTag/x509Subject/CN"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"server-a.example.com"&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ArnEquals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"aws:SourceArn"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"arn:aws:rolesanywhere:ap-northeast-1:123456789012:trust-anchor/abc-..."&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The service principal is &lt;code&gt;rolesanywhere.amazonaws.com&lt;/code&gt;&lt;/strong&gt;. Without this entry, AssumeRole via the Roles Anywhere CreateSession path will not work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Allowing &lt;code&gt;sts:TagSession&lt;/code&gt;&lt;/strong&gt; lets the certificate's Subject / SAN / Issuer be injected as session tags (more on this below)&lt;/li&gt;
&lt;li&gt;Conditions like &lt;strong&gt;&lt;code&gt;aws:PrincipalTag/x509Subject/CN&lt;/code&gt;&lt;/strong&gt; let you &lt;strong&gt;narrow down further based on the certificate contents&lt;/strong&gt;. Fine-grained control like "only certificates with CN &lt;code&gt;server-a.example.com&lt;/code&gt; may assume this" is possible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;aws:SourceArn&lt;/code&gt;&lt;/strong&gt; pins the call to a specific Trust Anchor, so a request coming from a Trust Anchor you did not expect is denied&lt;/li&gt;
&lt;/ul&gt;
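
&lt;p&gt;If you follow a one-certificate-per-server model, you end up writing this Trust Policy many times with a different CN. A minimal sketch of generating it, assuming a hypothetical helper named &lt;code&gt;build_trust_policy&lt;/code&gt; and placeholder ARN / hostname values:&lt;/p&gt;

```python
import json


def build_trust_policy(trust_anchor_arn, common_name):
    """Return a Roles Anywhere trust policy pinned to one Trust Anchor and one CN."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "rolesanywhere.amazonaws.com"},
                "Action": [
                    "sts:AssumeRole",
                    "sts:TagSession",
                    "sts:SetSourceIdentity",
                ],
                "Condition": {
                    "StringEquals": {
                        "aws:PrincipalTag/x509Subject/CN": common_name
                    },
                    "ArnEquals": {"aws:SourceArn": trust_anchor_arn},
                },
            }
        ],
    }


# Placeholder ARN and hostname, for illustration only.
policy = build_trust_policy(
    "arn:aws:rolesanywhere:ap-northeast-1:123456789012:trust-anchor/EXAMPLE",
    "server-b.example.com",
)
print(json.dumps(policy, indent=2))
```

&lt;p&gt;The printed JSON has the same shape as the policy above and can be pasted into the Role's trust relationship.&lt;/p&gt;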

&lt;h3&gt;
  
  
  Session tags auto-populated from the certificate
&lt;/h3&gt;

&lt;p&gt;A session created through CreateSession + AssumeRole automatically carries certificate attributes as tags. The common ones:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tag key&lt;/th&gt;
&lt;th&gt;Content&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;aws:PrincipalTag/x509Subject/CN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Common Name of the certificate Subject&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;aws:PrincipalTag/x509SAN/DNS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;DNS name in the Subject Alternative Name (SAN)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;aws:PrincipalTag/x509SAN/URI&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;URI in SAN (e.g. a SPIFFE ID like &lt;code&gt;spiffe://example.com/ns/prod/sa/app&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;aws:PrincipalTag/x509Issuer/CN&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;CN of the issuing CA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;aws:PrincipalTag/x509Serial&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Certificate serial number&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;A small bit of supplementary vocabulary:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SAN (Subject Alternative Name)&lt;/strong&gt;: an X.509 extension field that holds additional identifiers beyond Common Name, including multiple hostnames, IPs, or URIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SPIFFE ID&lt;/strong&gt;: a URI-format workload identifier defined by the SPIFFE spec (&lt;code&gt;spiffe://...&lt;/code&gt;). If your internal identity platform uses SPIFFE, dropping a SPIFFE ID into the SAN URI lets the AWS side reference it directly in a Condition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The upshot: &lt;strong&gt;the Role's Permission Policy can also use &lt;code&gt;aws:PrincipalTag/x509Subject/CN&lt;/code&gt; in its Conditions&lt;/strong&gt;. Patterns like "the &lt;code&gt;server-a.example.com&lt;/code&gt; certificate can only write under the &lt;code&gt;prefix=server-a/*&lt;/code&gt; portion of S3" become possible. That is &lt;strong&gt;ABAC (Attribute Based Access Control)&lt;/strong&gt;: instead of carving permissions up by Role (RBAC-style), &lt;strong&gt;principal attributes (tags) dynamically tighten what is allowed&lt;/strong&gt;.&lt;/p&gt;
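
&lt;p&gt;A minimal sketch of such a Permission Policy, built as a Python dict. It assumes the IAM policy-variable form &lt;code&gt;${aws:PrincipalTag/...}&lt;/code&gt; is usable with this tag key, and the bucket name is hypothetical. Note that the prefix here is the full CN (&lt;code&gt;server-a.example.com/&lt;/code&gt;), not a shortened hostname.&lt;/p&gt;

```python
import json

BUCKET = "example-bucket"  # hypothetical bucket name

# ${aws:PrincipalTag/...} is an IAM policy variable, resolved per request
# from the session tags; it is not Python string formatting.
CN_VARIABLE = "${aws:PrincipalTag/x509Subject/CN}"

permission_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            # Each certificate can only write under a prefix equal to its own CN.
            "Resource": "arn:aws:s3:::" + BUCKET + "/" + CN_VARIABLE + "/*",
        }
    ],
}

print(json.dumps(permission_policy, indent=2))
```

&lt;p&gt;One Role with this policy can serve many servers, because the effective prefix is decided by the certificate rather than by the Role.&lt;/p&gt;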




&lt;h2&gt;
  
  
  6. The CreateSession flow
&lt;/h2&gt;

&lt;p&gt;Let's trace what actually happens from the client's first call until temporary credentials are in hand.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F04-create-session-sequence.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F04-create-session-sequence.png" alt="CreateSession sequence: workload signs request with private key via credential helper, Roles Anywhere verifies cert and CRL, then STS returns temp credentials" width="800" height="897"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Key points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A "SigV4 X.509 variant" sits on the SigV4 frame but signs with an X.509 private key&lt;/strong&gt;. Normal AWS APIs sign with &lt;code&gt;AWS4-HMAC-SHA256&lt;/code&gt; (symmetric HMAC using the access key). &lt;code&gt;CreateSession&lt;/code&gt; uses one of &lt;strong&gt;&lt;code&gt;AWS4-X509-RSA-SHA256&lt;/code&gt; / &lt;code&gt;AWS4-X509-ECDSA-SHA256&lt;/code&gt; / &lt;code&gt;AWS4-X509-MLDSA&lt;/code&gt;&lt;/strong&gt; (the last one is post-quantum, PQC-capable) depending on the key type. The Canonical Request construction is the same as SigV4. Only the final step is an asymmetric X.509 signature instead of HMAC&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The private key never leaves the credential helper&lt;/strong&gt;. It does not go on the network&lt;/li&gt;
&lt;li&gt;The credentials handed back are &lt;strong&gt;plain IAM temporary credentials&lt;/strong&gt;. The SDK sees nothing unusual. Subsequent S3 / DynamoDB calls go out as normal SigV4 (HMAC)&lt;/li&gt;
&lt;/ul&gt;
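
&lt;p&gt;To make the "same frame, different final step" point concrete, here is a minimal sketch using only the standard library. The string-to-sign layout and the HMAC key derivation follow standard SigV4. The canonical request, secret, and dates are placeholders, and the function names are hypothetical. The asymmetric signing step itself needs a crypto library and is left out.&lt;/p&gt;

```python
import hashlib
import hmac


def sha256_hex(data):
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def string_to_sign(algorithm, amz_date, scope, canonical_request):
    """Same four-line layout for both variants; only the algorithm label differs."""
    return "\n".join([algorithm, amz_date, scope, sha256_hex(canonical_request)])


def hmac_sha256(key, message):
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sigv4_hmac_signature(secret_key, date, region, service, to_sign):
    """Plain SigV4: derive a signing key from the secret, then HMAC the string."""
    k_date = hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = hmac_sha256(k_date, region)
    k_service = hmac_sha256(k_region, service)
    k_signing = hmac_sha256(k_service, "aws4_request")
    return hmac_sha256(k_signing, to_sign).hex()


canonical = "PLACEHOLDER-CANONICAL-REQUEST"  # not a real canonical request
amz_date = "20260615T000000Z"

plain = string_to_sign(
    "AWS4-HMAC-SHA256", amz_date,
    "20260615/ap-northeast-1/s3/aws4_request", canonical)
x509 = string_to_sign(
    "AWS4-X509-RSA-SHA256", amz_date,
    "20260615/ap-northeast-1/rolesanywhere/aws4_request", canonical)

signature = sigv4_hmac_signature(
    "EXAMPLE-SECRET", "20260615", "ap-northeast-1", "s3", plain)
print(plain.splitlines()[0], len(signature))
print(x509.splitlines()[0])
# For CreateSession, `x509` would be signed with the certificate's private key
# (RSA or ECDSA) instead of being fed through the HMAC chain above.
```

&lt;p&gt;The HMAC path needs a shared secret on both sides. The X.509 path only needs AWS to hold the public key from the certificate, which is why the private key can stay on the client.&lt;/p&gt;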




&lt;h2&gt;
  
  
  7. What is inside the credential helper
&lt;/h2&gt;

&lt;p&gt;Hand-rolling the &lt;code&gt;CreateSession&lt;/code&gt; signing on the client side is not realistic, so AWS ships an official binary called &lt;strong&gt;&lt;code&gt;aws_signing_helper&lt;/code&gt;&lt;/strong&gt; (repo: &lt;code&gt;aws/rolesanywhere-credential-helper&lt;/code&gt;). That binary is the credential helper.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;aws_signing_helper&lt;/code&gt; plugs into the AWS CLI / SDK &lt;code&gt;credential_process&lt;/code&gt; spec. Drop this into &lt;code&gt;~/.aws/config&lt;/code&gt; and both the CLI and any SDK transparently fetch credentials through Roles Anywhere:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[profile myapp]&lt;/span&gt;
&lt;span class="c"&gt;# Kept on one line: a trailing backslash may not be treated as a line continuation by the CLI / SDK config parsers&lt;/span&gt;
&lt;span class="py"&gt;credential_process&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;/usr/local/bin/aws_signing_helper credential-process --certificate /etc/pki/server.pem --private-key /etc/pki/server.key --trust-anchor-arn arn:aws:rolesanywhere:ap-northeast-1:123456789012:trust-anchor/xxxx --profile-arn arn:aws:rolesanywhere:ap-northeast-1:123456789012:profile/yyyy --role-arn arn:aws:iam::123456789012:role/MyAppRole&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A call like &lt;code&gt;aws s3 ls --profile myapp&lt;/code&gt; triggers &lt;code&gt;credential_process&lt;/code&gt;, which runs &lt;code&gt;aws_signing_helper&lt;/code&gt;, which returns the JSON. The AWS CLI / SDK takes care of &lt;strong&gt;automatic credential refresh&lt;/strong&gt;. No manual refresh needed.&lt;/p&gt;
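
&lt;p&gt;The JSON in question follows the &lt;code&gt;credential_process&lt;/code&gt; output shape: a &lt;code&gt;Version&lt;/code&gt;, the three credential fields, and an &lt;code&gt;Expiration&lt;/code&gt; timestamp. A minimal sketch of reading it and deciding when a refresh is due. The sample values are placeholders, and the 5-minute margin and function names are assumptions for illustration, not what any SDK actually uses.&lt;/p&gt;

```python
import json
from datetime import datetime, timedelta, timezone

# Placeholder values in the credential_process output shape.
SAMPLE_OUTPUT = json.dumps({
    "Version": 1,
    "AccessKeyId": "ASIAEXAMPLE",
    "SecretAccessKey": "example-secret",
    "SessionToken": "example-token",
    "Expiration": "2026-06-15T14:00:00Z",
})


def parse_expiration(raw_json):
    """Return the Expiration field as a timezone-aware datetime."""
    text = json.loads(raw_json)["Expiration"]
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def needs_refresh(raw_json, now, margin=timedelta(minutes=5)):
    """True once `now` is within `margin` of the expiry time."""
    return now + margin >= parse_expiration(raw_json)


now = datetime(2026, 6, 15, 13, 0, tzinfo=timezone.utc)
print(needs_refresh(SAMPLE_OUTPUT, now))                          # False
print(needs_refresh(SAMPLE_OUTPUT, now + timedelta(minutes=56)))  # True
```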

&lt;h3&gt;
  
  
  Where the private key lives
&lt;/h3&gt;

&lt;p&gt;The choice of private-key storage sets the ceiling on your Roles Anywhere operational quality. &lt;code&gt;aws_signing_helper&lt;/code&gt; supports the following backends. Leak resistance rises as you go down the table.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Storage&lt;/th&gt;
&lt;th&gt;Properties&lt;/th&gt;
&lt;th&gt;Leak resistance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;File&lt;/strong&gt; (&lt;code&gt;/etc/pki/server.key&lt;/code&gt; etc.)&lt;/td&gt;
&lt;td&gt;Easiest. Anyone with read access can &lt;code&gt;cat&lt;/code&gt; it&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;OS keystore&lt;/strong&gt; (Windows Certificate Store / macOS Keychain)&lt;/td&gt;
&lt;td&gt;Protected by OS access control&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;PKCS#11 module&lt;/strong&gt; (HSM / smartcard)&lt;/td&gt;
&lt;td&gt;Key stays inside the HSM, only signing operations are delegated&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;TPM 2.0&lt;/strong&gt; (secure chip on the motherboard)&lt;/td&gt;
&lt;td&gt;Key sealed into hardware, non-exportable&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In high-security environments, sealing the key into PKCS#11 (HSM) or TPM 2.0 so that &lt;strong&gt;the key cannot be extracted at all&lt;/strong&gt; is the right move. &lt;code&gt;aws_signing_helper&lt;/code&gt; exposes flags like &lt;code&gt;--cert-selector&lt;/code&gt; and &lt;code&gt;--tpm-key-handle&lt;/code&gt; to wire into these backends.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Revoking a certificate
&lt;/h2&gt;

&lt;p&gt;You will eventually need to kill a compromised server's certificate immediately. Roles Anywhere supports &lt;strong&gt;CRL (Certificate Revocation List)&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Importing a CRL
&lt;/h3&gt;

&lt;p&gt;Generate a revocation list on the CA side and register it in Roles Anywhere as PEM.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;aws rolesanywhere import-crl &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"my-ca-crl"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--crl-data&lt;/span&gt; fileb://crl.pem &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--trust-anchor-arn&lt;/span&gt; arn:aws:rolesanywhere:ap-northeast-1:123456789012:trust-anchor/xxxx &lt;span class="se"&gt;\&lt;/span&gt;
    &lt;span class="nt"&gt;--enabled&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once registered, Roles Anywhere checks the CRL on every &lt;code&gt;CreateSession&lt;/code&gt;. Calls from revoked certificates are rejected.&lt;/p&gt;

&lt;h3&gt;
  
  
  Auto-integration with AWS Private CA
&lt;/h3&gt;

&lt;p&gt;When the Trust Anchor source is PCA, calling &lt;code&gt;revoke-certificate&lt;/code&gt; on the PCA causes the PCA to publish a CRL into S3 &lt;strong&gt;within about 30 minutes&lt;/strong&gt;. The standard pattern is to catch that with Lambda and feed it to the &lt;code&gt;ImportCrl&lt;/code&gt; API.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F05-crl-auto-integration.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F05-crl-auto-integration.png" alt="CRL auto-integration sequence: operator revokes cert in PCA, PCA emits CRL to S3, Lambda picks it up and calls ImportCrl on Roles Anywhere" width="800" height="517"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Temporarily disabling the CRL check
&lt;/h3&gt;

&lt;p&gt;When you need to disable it during incident response, &lt;code&gt;DisableCrl&lt;/code&gt; flips the check off and &lt;code&gt;EnableCrl&lt;/code&gt; turns it back on. In normal production, leave it on.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. When it fits, and when it does not
&lt;/h2&gt;

&lt;p&gt;Roles Anywhere is powerful but not for every case. The decision flow:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F06-when-to-use-decision.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2F0-draft%2Fdev.to%2Fmain%2Farticles%2Fassets%2Faws-iam-roles-anywhere-deep-dive%2Fdiagrams%2F06-when-to-use-decision.png" alt="Decision tree: workload location and caller type lead to Instance Profile, OIDC federation, Identity Center, or IAM Roles Anywhere" width="800" height="551"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Takeaways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workloads running inside AWS do not need Roles Anywhere&lt;/strong&gt;. Instance Profile / Execution Role / Pod Identity are the natural choice&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If your CI can speak OIDC, prefer OIDC&lt;/strong&gt;. GitHub Actions / GitLab CI / etc. emit OIDC officially. Roles Anywhere is unnecessary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If you cannot emit OIDC, you already have an X.509-based PKI, and the server is fully on-prem&lt;/strong&gt;, that is where Roles Anywhere shines&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where Roles Anywhere is a poor fit
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;"We do not have an internal PKI yet" cases: standing up a CA before introducing Roles Anywhere is much more work than Roles Anywhere itself. If there is an OIDC path, take that&lt;/li&gt;
&lt;li&gt;"Thousands of clients, each with its own certificate, CRL updated daily" cases: at this scale CRL propagation lag and other factors need real design work&lt;/li&gt;
&lt;li&gt;"Calling from inside a VPC" cases: anything inside a VPC is already inside AWS, so attaching a Role the normal way is enough&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  10. Pricing and limits
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Pricing
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The IAM Roles Anywhere service itself is free&lt;/strong&gt;. No per-CreateSession request fees&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS Private CA is the only billed component&lt;/strong&gt;. &lt;strong&gt;400 USD/month&lt;/strong&gt; for General Purpose, &lt;strong&gt;50 USD/month&lt;/strong&gt; for Short-Lived Certificate, plus per-certificate issuance fees&lt;/li&gt;
&lt;li&gt;Using an external CA (HashiCorp Vault / step-ca / homemade PKI etc.) avoids the PCA bill (you pay in operational effort instead)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Main limits (from official Service Quotas, per Region)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Default&lt;/th&gt;
&lt;th&gt;Increase request&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Trust Anchors per account&lt;/td&gt;
&lt;td&gt;50&lt;/td&gt;
&lt;td&gt;Allowed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Profiles per account&lt;/td&gt;
&lt;td&gt;250&lt;/td&gt;
&lt;td&gt;Allowed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Roles per Profile&lt;/td&gt;
&lt;td&gt;250&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Not allowed&lt;/strong&gt; (hard limit)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Registered certificates per Trust Anchor&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Not allowed&lt;/strong&gt; (two slots, for rotation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CRLs per Trust Anchor&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Not allowed&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CreateSession rate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;10 req/sec&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Allowed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session lifetime&lt;/td&gt;
&lt;td&gt;15 minutes to 12 hours&lt;/td&gt;
&lt;td&gt;Within the Role's &lt;code&gt;MaxSessionDuration&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

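&lt;p&gt;Note that &lt;strong&gt;&lt;code&gt;CreateSession&lt;/code&gt; is 10 req/sec&lt;/strong&gt;. A design that pulls short-lived credentials in volume will hit this per-Region rate. The basic pattern is to &lt;strong&gt;cache credentials on the client side&lt;/strong&gt; and refresh just before expiration. An SDK that loads credentials through &lt;code&gt;credential_process&lt;/code&gt; does this within a single process: it keeps the credentials in memory and re-runs &lt;code&gt;aws_signing_helper&lt;/code&gt; only as expiry approaches. Every new process starts with an empty cache, though, so a fleet of short-lived processes (or a loop of AWS CLI invocations) still makes one &lt;code&gt;CreateSession&lt;/code&gt; call each.&lt;/p&gt;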
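
&lt;p&gt;The cache-and-refresh pattern can be sketched in a few lines. This is an illustration under stated assumptions, not how any SDK implements it: the class name, the 5-minute margin, and &lt;code&gt;fake_fetch&lt;/code&gt; (a stand-in for a real &lt;code&gt;CreateSession&lt;/code&gt; round trip) are all hypothetical.&lt;/p&gt;

```python
from datetime import datetime, timedelta, timezone


class CachedCredentials:
    """Call `fetch` only when there is no cached value or it is about to expire."""

    def __init__(self, fetch, margin=timedelta(minutes=5)):
        self._fetch = fetch      # callable returning (credentials, expiry)
        self._margin = margin
        self._value = None
        self._expiry = None

    def get(self, now):
        if self._expiry is None or now + self._margin >= self._expiry:
            self._value, self._expiry = self._fetch(now)
        return self._value


calls = []


def fake_fetch(now):
    """Stand-in for a CreateSession round trip; records each call."""
    calls.append(now)
    return {"AccessKeyId": "ASIAEXAMPLE"}, now + timedelta(hours=1)


cache = CachedCredentials(fake_fetch)
start = datetime(2026, 6, 15, 13, 0, tzinfo=timezone.utc)
for minutes in (0, 10, 30, 54, 56, 70):
    cache.get(start + timedelta(minutes=minutes))
print(len(calls))  # 2
```

&lt;p&gt;Six lookups over 70 minutes cost two fetches: the initial one and one refresh shortly before the first set expires.&lt;/p&gt;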




&lt;h2&gt;
  
  
  11. Don'ts and Do's
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;❌ Don't&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Drop the private key as a plaintext file readable by anyone&lt;/li&gt;
&lt;li&gt;Reuse one certificate across multiple servers (you lose the ability to tell which one leaked)&lt;/li&gt;
&lt;li&gt;Write a Trust Policy that trusts &lt;code&gt;rolesanywhere.amazonaws.com&lt;/code&gt; alone, with no &lt;code&gt;aws:SourceArn&lt;/code&gt; or certificate Conditions&lt;/li&gt;
&lt;li&gt;Skip CRL operations entirely (i.e. no way to invalidate a cert on compromise)&lt;/li&gt;
&lt;li&gt;Issue certificates with absurdly long lifetimes (e.g. 10 years)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;✅ Do&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Seal the private key into &lt;strong&gt;TPM 2.0 / HSM (PKCS#11) / OS keystore&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;One server, one certificate. Put a hostname or SPIFFE-ID-equivalent identifier in the Subject / SAN&lt;/li&gt;
&lt;li&gt;Issue &lt;strong&gt;short-lived certificates&lt;/strong&gt; (e.g. 24 hours to a few days) with &lt;strong&gt;automatic renewal&lt;/strong&gt;. &lt;code&gt;cert-manager&lt;/code&gt; and &lt;code&gt;step-ca&lt;/code&gt;'s auto-renew machinery is the standard pattern&lt;/li&gt;
&lt;li&gt;Put both &lt;code&gt;aws:SourceArn&lt;/code&gt; (Trust Anchor) and &lt;code&gt;aws:PrincipalTag/x509Subject/CN&lt;/code&gt; into the Role's Trust Policy&lt;/li&gt;
&lt;li&gt;Automate CRL ingestion, and keep &lt;code&gt;DisableCrl&lt;/code&gt; ready as an operational break-glass&lt;/li&gt;
&lt;li&gt;Use session tags (&lt;code&gt;aws:PrincipalTag/x509...&lt;/code&gt;) to drive ABAC in the Role's Permission Policy&lt;/li&gt;
&lt;/ul&gt;
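
&lt;p&gt;The two policy items above can be sketched as follows. The Trust Policy pins both the Trust Anchor and the certificate CN, and the Permission Policy uses the same session tag for ABAC. The ARN, account ID, CN, and the one-S3-prefix-per-CN layout are placeholders assumed for illustration; adapt them to your own setup before use.&lt;/p&gt;

```python
import json

# Placeholder values (assumptions): substitute your own Trust Anchor ARN and CN.
TRUST_ANCHOR_ARN = (
    "arn:aws:rolesanywhere:us-east-1:111122223333:trust-anchor/EXAMPLE-ID"
)
EXPECTED_CN = "app01.example.internal"


def build_trust_policy(trust_anchor_arn: str, common_name: str) -> dict:
    """Trust Policy that only admits sessions from one Trust Anchor and one CN."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "rolesanywhere.amazonaws.com"},
                "Action": [
                    "sts:AssumeRole",
                    "sts:TagSession",
                    "sts:SetSourceIdentity",
                ],
                "Condition": {
                    "ArnEquals": {"aws:SourceArn": trust_anchor_arn},
                    "StringEquals": {
                        "aws:PrincipalTag/x509Subject/CN": common_name
                    },
                },
            }
        ],
    }


def build_abac_permission_policy(bucket: str) -> dict:
    """Permission Policy scoping S3 reads to a prefix named after the cert CN."""
    # Policy variable resolved by IAM at request time from the session tag.
    cn_variable = "${aws:PrincipalTag/x509Subject/CN}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{cn_variable}/*",
            }
        ],
    }


def to_json(policy: dict) -> str:
    """Serialize a policy document for use with your IaC tool or the CLI."""
    return json.dumps(policy, indent=2)
```

&lt;p&gt;Calling &lt;code&gt;to_json(build_trust_policy(TRUST_ANCHOR_ARN, EXPECTED_CN))&lt;/code&gt; yields a document you can attach as the Role's Trust Policy. With the ABAC policy, one Role can serve many servers while each certificate only reaches its own prefix.&lt;/p&gt;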




&lt;h2&gt;
  
  
  12. Wrap-up
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;IAM Roles Anywhere hands temporary IAM Role credentials to workloads outside AWS, without distributing long-lived keys&lt;/li&gt;
&lt;li&gt;It uses X.509 certificates as the key material, with the issuing CA registered as a Trust Anchor&lt;/li&gt;
&lt;li&gt;Three actors: Trust Anchor (the trusted CA) / Profile (which Roles can be used) / Role (the actual permissions)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CreateSession&lt;/code&gt; is not plain SigV4. The request is signed with the certificate's private key in a distinct flow, and the private key itself never crosses the network&lt;/li&gt;
&lt;li&gt;The official &lt;code&gt;aws_signing_helper&lt;/code&gt; ships as a &lt;code&gt;credential_process&lt;/code&gt; provider, so existing CLI and SDK code works as-is&lt;/li&gt;
&lt;li&gt;Seal the private key in TPM / HSM / OS keystore. A plain file on disk is an incident waiting to happen&lt;/li&gt;
&lt;li&gt;Revocation goes through CRL via &lt;code&gt;ImportCrl&lt;/code&gt;. PCA-backed setups can auto-integrate via S3&lt;/li&gt;
&lt;li&gt;If you are inside AWS, Roles Anywhere is unnecessary. If CI can speak OIDC, OIDC wins. On-prem, multi-cloud, and existing-PKI worlds are where it actually fits&lt;/li&gt;
&lt;li&gt;The service itself is free. Only PCA costs money&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/rolesanywhere/latest/userguide/introduction.html" rel="noopener noreferrer"&gt;What is AWS Identity and Access Management Roles Anywhere?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/rolesanywhere/latest/userguide/trust-model.html" rel="noopener noreferrer"&gt;The IAM Roles Anywhere trust model&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/rolesanywhere/latest/userguide/credential-helper.html" rel="noopener noreferrer"&gt;Get temporary security credentials from IAM Roles Anywhere&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/aws/rolesanywhere-credential-helper" rel="noopener noreferrer"&gt;aws/rolesanywhere-credential-helper (GitHub)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/security/extend-aws-iam-roles-to-workloads-outside-of-aws-with-iam-roles-anywhere/" rel="noopener noreferrer"&gt;Extend AWS IAM roles to workloads outside of AWS with IAM Roles Anywhere&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/blogs/security/set-up-aws-private-certificate-authority-to-issue-certificates-for-use-with-iam-roles-anywhere/" rel="noopener noreferrer"&gt;Set up AWS Private Certificate Authority to issue certificates for use with IAM Roles Anywhere&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://aws.amazon.com/about-aws/whats-new/2024/12/iam-roles-anywhere-credential-helper-tpm-2-0/" rel="noopener noreferrer"&gt;IAM Roles Anywhere credential helper now supports TPM 2.0&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://unit42.paloaltonetworks.com/aws-roles-anywhere/" rel="noopener noreferrer"&gt;Roles Here? Roles There? Roles Anywhere: Exploring the Security of AWS IAM Roles Anywhere (Unit 42)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>aws</category>
      <category>iam</category>
      <category>security</category>
      <category>authentication</category>
    </item>
  </channel>
</rss>
