Good question. From work I did designing a JWT-based authorisation system, we concluded that a store of invalidated but unexpired tokens would be several orders of magnitude smaller than a list of valid, in-use tokens, which significantly reduces the work, load, and networking required to provide token safety through a stateful revocation mechanism.
In our case we were happy to accept the risk of a transient session token being used at any point within its issued lifetime (on the order of minutes), but wanted to be able to revoke long-term tokens used as API keys (lifetime of months) within a similar order of minutes. We chose to publish revoked token identifiers (via their unique jti field) internally to consuming services at the same time as we published the JWT signing keys, by extending the OpenID Connect metadata documents that all services regularly collect (on the order of minutes) from our source of truth / authentication database. Auth0 blogged a similar approach a number of years ago that was our inspiration: auth0.com/blog/denylist-json-web-t...
Amazing, I will definitely look into it. Thank you!
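For readers who want something concrete, here is a minimal sketch of how a consuming service might apply a published list of revoked jti values. It is not the system described above: the `revoked_jtis` field name, the example URLs, the token IDs, and both function names are assumptions made up for illustration, and the sketch assumes the JWT signature has already been verified elsewhere.

```python
import json
import time
from typing import Any, Dict, Optional, Set

# Hypothetical metadata document. "issuer" and "jwks_uri" are standard
# OpenID Connect discovery fields; "revoked_jtis" is an assumed extension
# field, not a name taken from the original system.
EXAMPLE_METADATA = """
{
  "issuer": "https://auth.example.com",
  "jwks_uri": "https://auth.example.com/jwks.json",
  "revoked_jtis": ["api-key-0017", "api-key-0042"]
}
"""


def load_revoked_jtis(metadata_json: str) -> Set[str]:
    """Extract the set of revoked token IDs from the metadata document."""
    document = json.loads(metadata_json)
    return set(document.get("revoked_jtis", []))


def is_token_accepted(
    claims: Dict[str, Any],
    revoked_jtis: Set[str],
    now: Optional[float] = None,
) -> bool:
    """Decide whether an already signature-verified token should be accepted.

    Assumption: tokens without a numeric "exp" or without a "jti" are rejected.
    """
    now = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        return False  # expired, or no usable expiry claim
    jti = claims.get("jti")
    if jti is None:
        return False  # cannot be checked against the denylist
    return jti not in revoked_jtis
```

Because expired tokens are rejected by the `exp` check anyway, the published list only ever needs to contain revoked tokens that have not yet expired, which is what keeps it small.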
Hi there! So you were maintaining a "small" list of invalidated tokens that still hadn't expired? If so, did this approach include periodically scanning for expired tokens? Was this really advantageous compared to regular sessions with opaque tokens?
Yes, the list was tens of token IDs, owned and updated (hourly IIRC) by an authentication service and, critically, published as a static JSON file to a global CDN. The advantages are: zero coupling between global systems and the authentication service (unlike opaque tokens, which require a much more tightly coupled global store); and very little coupling between the many autonomous global development teams consuming this information in their own services (this was important for our 1000+ person, multi-company global org!).
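As an illustration of the consuming side, here is a rough sketch of a small in-process cache that re-reads such a static JSON file every few minutes. Everything named here is an assumption: the `DenylistCache` class, the `revoked_jtis` field, and the five-minute default are invented for the example. The fetch function is injected so the sketch does not depend on any particular HTTP client or CDN URL.

```python
import json
import time
from typing import Callable, FrozenSet, Optional


class DenylistCache:
    """Caches a small published list of revoked token IDs.

    `fetch` is any callable returning the JSON document as text, for example
    a thin wrapper around an HTTP GET against the CDN (not shown here).
    """

    def __init__(
        self,
        fetch: Callable[[], str],
        max_age_seconds: float = 300.0,  # assumed refresh interval
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._clock = clock
        self._revoked: FrozenSet[str] = frozenset()
        self._loaded_at: Optional[float] = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is not None and now - self._loaded_at < self._max_age:
            return  # cached copy is still fresh
        try:
            document = json.loads(self._fetch())
            self._revoked = frozenset(document["revoked_jtis"])
            self._loaded_at = now
        except (OSError, ValueError, KeyError, TypeError):
            # Fetch or parse failed: keep serving the last known list and
            # try again on the next call.
            pass

    def is_revoked(self, jti: str) -> bool:
        self._refresh_if_stale()
        return jti in self._revoked
```

Keeping the last known list on a failed refresh is a design choice in this sketch, not something stated in the thread; a stricter service could instead reject long-lived tokens once the list becomes too old.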
Interesting! What if these global development teams didn't check against this JSON file? Just thinking aloud about the practical perspective. Was there any mandate on this?
There was a mandate, as part of the global common user management function (which included a set of acceptance tests that had to pass).
Interesting case. So this was implemented for long-lived API tokens (on the order of months)? Designing such an architecture must have been a very detailed process, wasn't it?
I believe you had scalability challenges to tackle! Just curious: was standard OAuth with rotating refresh tokens not feasible?
Was the ratio between active long-lived API tokens (many) and invalidated ones (few) one of the deciding factors?
Yes, we looked at the risk of accepting tokens over different timescales, and concluded that only API keys were a material risk to us (use outside of contract, reputation loss). Most of the risk of shorter-term token misuse was carried by our customers, as it would be their account that got billed if they leaked a token. I should note that the majority of our customers (80%+) used our API integration, not the browser-based UI (for which we had standard OAuth with rotating session tokens and refresh tokens with lifetimes on the order of a few days).
At the time I retired, we were handling ~1 billion API calls a day globally.
Great use case! What a scale!