Juan Torchia

Posted on May 30 • Originally published at juanchi.dev

Digital identity backend architecture: the decisions tutorials skip

#english #backend #seguridad #jwt

When I was studying Computer Science at UBA, there were classes I'd walk into straight from work, still in my office clothes. One night I showed up late to an operating systems lecture and the professor was talking about permissions and users. I'd spent that same afternoon breaking a Linux hosting server with a chmod -R 777 that seemed harmless at the time. The professor was explaining the theoretical model. I already knew what it cost not to understand it.

I think about that every time I read an auth tutorial that ends with "your login system is up and running!" Sure, it works. Until someone changes roles, logs out from one device, or you need to invalidate a token you issued 40 minutes ago.

My thesis: auth tutorials show you the happy path. The real problems in a digital identity backend show up in three places that almost never get covered: credential revocation, state-change propagation, and the trust model between services. If you design without thinking about those three, you'll be redesigning later.

The design mistake that starts with "let's just use JWT for everything"

JWT (RFC 7519) is a clean spec. A signed, self-describing token, verifiable without calling any server. That's exactly what makes it dangerous if you don't clearly understand what it guarantees and what it doesn't.

What JWT guarantees per the spec: that the token wasn't tampered with (signature), that the claims are what the issuer put there, and that you can verify it locally if you have the public key. That's it.

What JWT does not guarantee: that the user is still valid right now. If someone gets deactivated, changes their password, or loses permissions, the token remains cryptographically valid until it expires. RFC 7519 doesn't define revocation because that's not its problem. The problem is ours.

The most common trap I see in identity system designs:

// Typical pattern in Spring Security — looks complete, it's not
@Bean
public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
    http
        .oauth2ResourceServer(oauth2 -> oauth2
            .jwt(jwt -> jwt
                .decoder(jwtDecoder()) // validates signature and expiration
            )
        );
    return http.build();
}

// The decoder validates that the token is properly signed and not expired.
// It does NOT check whether the user was deactivated in the last 55 minutes.
// That gap is your design problem, not a framework bug.

Signature verification is necessary but not sufficient. If the token lasts 60 minutes and the user was suspended at minute 5, you've got 55 minutes of unauthorized access that the code above will never stop.
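To make that gap concrete, here is a minimal sketch using a toy HMAC-signed token. It is not a real JWT and not what Spring's decoder does internally; `ToyTokens`, its token layout, and the demo secret are assumptions made up for illustration. The point is only that a check built from signature plus expiry has no input through which a suspension could reach it.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.time.Instant;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical toy token: "<userId>.<expiryEpochSeconds>.<signature>".
// userId must not contain '.'; this is an illustration, not a JWT library.
final class ToyTokens {
    private final byte[] secret;

    ToyTokens(String secret) {
        this.secret = secret.getBytes(StandardCharsets.UTF_8);
    }

    String issue(String userId, Instant expiresAt) {
        String payload = userId + "." + expiresAt.getEpochSecond();
        return payload + "." + sign(payload);
    }

    // Same inputs a stateless check has: the token and the clock. Nothing else.
    boolean isValid(String token, Instant now) {
        int lastDot = token.lastIndexOf('.');
        if (lastDot < 0) return false;
        String payload = token.substring(0, lastDot);
        String signature = token.substring(lastDot + 1);
        boolean signatureOk = MessageDigest.isEqual(
                sign(payload).getBytes(StandardCharsets.UTF_8),
                signature.getBytes(StandardCharsets.UTF_8));
        if (!signatureOk) return false;
        long expiry = Long.parseLong(payload.substring(payload.indexOf('.') + 1));
        return now.getEpochSecond() < expiry;
    }

    private String sign(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] raw = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class SuspensionGapDemo {
    public static void main(String[] args) {
        ToyTokens tokens = new ToyTokens("demo-secret");
        Instant issuedAt = Instant.parse("2024-01-01T10:00:00Z");
        String token = tokens.issue("user-42", issuedAt.plusSeconds(3600));
        // Minute 5: the user is suspended in the user store. The token never finds out.
        // Minute 10: the stateless check still passes.
        System.out.println(tokens.isValid(token, issuedAt.plusSeconds(600)));  // true
        // Only expiry ends the access.
        System.out.println(tokens.isValid(token, issuedAt.plusSeconds(3601))); // false
    }
}
```

Nothing in `isValid` can see the user store, which is exactly why the suspension has to be checked somewhere else.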

JWT vs stateful sessions: the real decision, not the Twitter debate

The "JWT vs sessions" debate usually boils down to "stateless vs stateful," as if that settles anything. It doesn't settle anything. The criterion that actually matters is how much control you need over the lifetime of a credential.

Criterion	Stateless JWT	Stateful session
Immediate revocation	❌ Not without a blocklist	✅ Yes, just delete the session
Horizontal scalability	✅ No coordination needed	⚠️ Needs shared session store (Redis, etc.)
Per-session auditing	❌ Limited	✅ Granular
Real-time permission changes	❌ Until next token	✅ Immediate
Operational complexity	Low initially, high once you add revocation	Medium, predictable

Note: this table summarizes design trade-offs, not benchmarks. How far each option scales depends on your concrete infrastructure.

If the system requires that blocking a user takes effect within N seconds, pure JWT isn't enough. You need some form of active verification: token introspection (RFC 7662), a cache-backed blocklist, or short-lived tokens with aggressive refresh.

OpenID Connect Core 1.0 adds the id_token to the access_token and refresh_token it inherits from OAuth 2.0. The separation isn't arbitrary: the id_token asserts identity, the access_token authorizes actions, and the refresh_token controls the session lifecycle. Conflating the three is another classic design mistake.

Modeling the credential lifecycle: what the spec says and what you have to implement yourself

OpenID Connect defines the authorization flow, the endpoints, and the standard claims. But the lifecycle of a credential — how it's born, how it changes, how it dies — is the responsibility of the backend you're building, not the spec.

A minimal model that actually works in practice:

import java.time.Instant;

// Possible states of a credential/session
public enum CredentialState {
    ACTIVE,        // issued and valid
    SUSPENDED,     // temporarily blocked (e.g. fraud suspicion)
    REVOKED,       // permanently invalidated
    EXPIRED        // timed out
}

// On issuance, you record the initial state
public record CredentialRecord(
    String jti,              // JWT ID, the standard claim from RFC 7519 §4.1.7
    String userId,
    CredentialState state,
    Instant issuedAt,
    Instant expiresAt,
    String deviceFingerprint  // issuance context
) {}
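The states only become a lifecycle once the allowed transitions are written down. A minimal sketch of that, assuming SUSPENDED can return to ACTIVE and that REVOKED and EXPIRED are terminal; those rules, and the `CredentialLifecycle` class, are my assumptions for illustration, and your domain may need different ones. The enum is repeated so the snippet compiles on its own.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum CredentialState { ACTIVE, SUSPENDED, REVOKED, EXPIRED }

// Hypothetical transition table: which state changes the backend accepts.
final class CredentialLifecycle {
    private static final Map<CredentialState, Set<CredentialState>> ALLOWED =
            new EnumMap<>(CredentialState.class);

    static {
        ALLOWED.put(CredentialState.ACTIVE, EnumSet.of(
                CredentialState.SUSPENDED, CredentialState.REVOKED, CredentialState.EXPIRED));
        ALLOWED.put(CredentialState.SUSPENDED, EnumSet.of(
                CredentialState.ACTIVE, CredentialState.REVOKED, CredentialState.EXPIRED));
        // Terminal states: nothing leaves them.
        ALLOWED.put(CredentialState.REVOKED, EnumSet.noneOf(CredentialState.class));
        ALLOWED.put(CredentialState.EXPIRED, EnumSet.noneOf(CredentialState.class));
    }

    static boolean canTransition(CredentialState from, CredentialState to) {
        return ALLOWED.get(from).contains(to);
    }

    // Returns the new state, or fails loudly on a transition the model forbids.
    static CredentialState transition(CredentialState from, CredentialState to) {
        if (!canTransition(from, to)) {
            throw new IllegalStateException("Illegal transition: " + from + " -> " + to);
        }
        return to;
    }
}
```

Writing the table first forces the uncomfortable questions early: for example, whether a revoked credential can ever come back (here it can't).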

The jti field (JWT ID) is defined in RFC 7519 §4.1.7. It's a unique identifier per token. If you persist it, you have the foundation for an efficient blocklist: when you want to revoke, you store the jti in Redis with a TTL equal to the token's remaining lifetime. Each request checks against that list. Cost: one cache lookup per request. Benefit: revocation that takes effect on the next request, as soon as the entry is visible in the cache.

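// Additional check on top of JWT signature validation.
// Assumes Spring Boot 3 (jakarta.* servlet API) and that this filter runs after
// BearerTokenAuthenticationFilter (for example via http.addFilterAfter(...)).
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import java.io.IOException;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.security.oauth2.server.resource.authentication.JwtAuthenticationToken;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;

@Component
public class RevocationFilter extends OncePerRequestFilter {

    private final RevocationCache revocationCache; // your own blocklist abstraction

    public RevocationFilter(RevocationCache revocationCache) {
        this.revocationCache = revocationCache;
    }

    @Override
    protected void doFilterInternal(HttpServletRequest request,
                                    HttpServletResponse response,
                                    FilterChain filterChain) throws IOException, ServletException {

        String jti = extractJti(); // read from the already-validated token

        // Check the blocklist before processing the request
        if (jti != null && revocationCache.isRevoked(jti)) {
            response.setStatus(HttpServletResponse.SC_UNAUTHORIZED);
            return; // stop here, don't continue the chain
        }

        filterChain.doFilter(request, response);
    }

    // Returns the jti claim, or null when the request carries no validated JWT
    private String extractJti() {
        Authentication auth = SecurityContextHolder.getContext().getAuthentication();
        return (auth instanceof JwtAuthenticationToken jwtAuth) ? jwtAuth.getToken().getId() : null;
    }
}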

This pattern doesn't eliminate state: it minimizes it. Instead of a full session, you store only what you need to invalidate. It's a conscious trade-off, not a magic solution.
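To show how little state that really is, here is an in-memory stand-in for the blocklist. It is a sketch under assumptions: `InMemoryRevocationCache` is a hypothetical name, the clock is injected as a `Supplier<Instant>` so the expiry behaviour can be exercised, and a real deployment would use a shared store such as Redis with a native TTL instead of a map that lives in one JVM.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical in-memory blocklist: jti -> moment the token would have expired anyway.
final class InMemoryRevocationCache {
    private final Map<String, Instant> revokedUntil = new ConcurrentHashMap<>();
    private final Supplier<Instant> now;

    InMemoryRevocationCache(Supplier<Instant> now) {
        this.now = now;
    }

    // Keep the entry only for the token's remaining lifetime.
    void revoke(String jti, Instant tokenExpiresAt) {
        if (tokenExpiresAt.isAfter(now.get())) {
            revokedUntil.put(jti, tokenExpiresAt);
        }
        // An already-expired token needs no entry: expiry validation rejects it.
    }

    boolean isRevoked(String jti) {
        Instant until = revokedUntil.get(jti);
        if (until == null) {
            return false;
        }
        if (!until.isAfter(now.get())) {
            revokedUntil.remove(jti, until); // the token has expired; the entry is dead weight
            return false;
        }
        return true;
    }

    int size() {
        return revokedUntil.size();
    }
}
```

The list never grows beyond the number of tokens revoked within one token lifetime, which is the whole argument for pairing it with short-lived access tokens.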

The design mistakes that only surface when the system grows

1. Long-lived tokens as a shortcut

An access_token with a 24-hour expiration is a session with a worse interface. You get all the cost of user state management without the benefit of granular control. The general recommendation in identity systems — backed by the OIDC model — is short access tokens (minutes, not hours) with controlled refresh tokens.

2. Not modeling the device as an entity

If a user has three active sessions across three devices and logs out from one, what happens to the other two? If the design doesn't model the device as an entity, that question has no answer. In digital identity systems where the credential has legal or economic value, this isn't optional.
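A minimal sketch of what "device as an entity" can mean in code, assuming each device holds one credential identified by its jti. `DeviceSessionRegistry` and its method names are hypothetical, and persistence is left out; the point is that logout returns the jti values to revoke, per device or for the whole user.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry: userId -> (deviceId -> jti issued to that device).
final class DeviceSessionRegistry {
    private final Map<String, Map<String, String>> sessions = new ConcurrentHashMap<>();

    void register(String userId, String deviceId, String jti) {
        sessions.computeIfAbsent(userId, id -> new ConcurrentHashMap<>()).put(deviceId, jti);
    }

    // Logout from one device: returns the jti to revoke, or null if the device is unknown.
    String logoutDevice(String userId, String deviceId) {
        Map<String, String> devices = sessions.get(userId);
        return devices == null ? null : devices.remove(deviceId);
    }

    // "Log out everywhere": returns every jti that must be revoked.
    Set<String> logoutAll(String userId) {
        Map<String, String> devices = sessions.remove(userId);
        return devices == null ? Set.of() : Set.copyOf(devices.values());
    }

    Set<String> activeDevices(String userId) {
        Map<String, String> devices = sessions.get(userId);
        return devices == null ? Set.of() : Set.copyOf(devices.keySet());
    }
}
```

With this shape, the three-devices question has an answer: logging out of the phone revokes one jti and leaves the other two sessions untouched.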

3. Propagating profile changes without propagating state changes

A common pattern: the user service updates the email, the auth backend doesn't find out until the token expires. If the email claim lives only in the JWT and there's no way to invalidate the previous token, the user operates on stale data for the remainder of the token's lifetime. The design has to define which claims are "live" (verified on every request) and which are "frozen" (trusted as of issuance).
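One way to make the live/frozen split explicit is to declare it in code instead of leaving it implicit in whoever reads the token. The sketch below is an assumption-heavy illustration: `ClaimResolver` is a hypothetical class, and the lookup function stands in for whatever call reaches your user service.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.BiFunction;

// Hypothetical resolver: frozen claims come from the token, live claims are re-read.
final class ClaimResolver {
    private final Set<String> liveClaims;
    // (userId, claimName) -> current value in the source of truth, or null if none
    private final BiFunction<String, String, Object> currentValue;

    ClaimResolver(Set<String> liveClaims, BiFunction<String, String, Object> currentValue) {
        this.liveClaims = Set.copyOf(liveClaims);
        this.currentValue = currentValue;
    }

    Map<String, Object> resolve(String userId, Map<String, Object> tokenClaims) {
        Map<String, Object> resolved = new HashMap<>(tokenClaims); // frozen by default
        for (String claim : liveClaims) {
            Object fresh = currentValue.apply(userId, claim);
            if (fresh == null) {
                resolved.remove(claim); // no current value: don't fall back to the stale one
            } else {
                resolved.put(claim, fresh);
            }
        }
        return resolved;
    }
}
```

Every claim you mark as live costs a lookup per request, so the list should stay short and deliberate.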

This problem is related to something I covered in the post on digital signatures: format, certificate, and validation policy — trust in a claim has a timestamp, and that timestamp matters.

4. Assuming the Authorization Server is the single source of truth

In distributed systems, a service can receive a valid token but need context the token doesn't carry. The design mistake is solving this with increasingly fat tokens (more claims, more embedded info). The more robust solution is separating authentication from authorization: the token proves identity, the service decides permissions with its own model. See also: system prompts for agents in production — the same "who trusts whom" problem shows up in a completely different domain.

Decision checklist: before committing to pure JWT, sessions, or full OIDC

Before you lock in an identity architecture, answer these questions. Not as an academic exercise, but as a design gate:

Do you need immediate revocation? If yes → pure JWT without an additional mechanism won't cut it.
Do you have more than one device per user? If yes → model sessions per device, not per user.
Can permissions change within the token's lifetime? If yes → you need active introspection or very short-lived tokens.
Who verifies the token? If it's multiple services → JWKS endpoint, planned key rotation.
Do you have domain-required auditing (legal, financial, etc.)? If yes → persisted jti, not optional.
Can the refresh token be used from any device? If yes → potentially insecure design. Consider rotation + binding.

This checklist doesn't replace a threat model, but it prevents the most common design mistakes before you write a single line of code.
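The "rotation + binding" in the last item can be sketched as follows. This is a simplified illustration, not a hardened implementation: `RefreshTokenRotator` is a hypothetical class, the device identifier is assumed to be trustworthy, tokens live in an in-memory map, and the get-then-put in `rotate` is not atomic (a real store would need a compare-and-set).

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical rotator: each refresh token is single-use and bound to one device.
final class RefreshTokenRotator {
    private record Entry(String userId, String deviceId, boolean used) {}

    private final Map<String, Entry> tokens = new ConcurrentHashMap<>();

    String issue(String userId, String deviceId) {
        String token = UUID.randomUUID().toString(); // opaque handle, for illustration only
        tokens.put(token, new Entry(userId, deviceId, false));
        return token;
    }

    // Returns the replacement refresh token, or empty if the request must be rejected.
    Optional<String> rotate(String presentedToken, String deviceId) {
        Entry entry = tokens.get(presentedToken);
        if (entry == null || !entry.deviceId().equals(deviceId)) {
            return Optional.empty(); // unknown token, or presented from another device
        }
        if (entry.used()) {
            // Reuse of an already-rotated token: assume theft and drop every
            // refresh token for this user on this device.
            tokens.values().removeIf(e ->
                    e.userId().equals(entry.userId()) && e.deviceId().equals(entry.deviceId()));
            return Optional.empty();
        }
        tokens.put(presentedToken, new Entry(entry.userId(), entry.deviceId(), true));
        return Optional.of(issue(entry.userId(), entry.deviceId()));
    }
}
```

Used tokens are kept around on purpose: that is what lets a replayed token be told apart from an unknown one.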

FAQ: common questions about digital identity architecture

Is JWT always better than server-side sessions?
No. JWT is better when you need stateless verification across multiple services without central coordination. Stateful sessions are better when you need immediate revocation, granular auditing, or device control. The right decision depends on system requirements, not on what's trending.

How do I implement JWT revocation without breaking scalability?
The most common pattern is a Redis blocklist with a TTL equal to the token's remaining lifetime. You only store the token's jti (claim defined in RFC 7519 §4.1.7), not the full token. The cost is one cache lookup per authenticated request. If Redis isn't in your stack, the same logic applies with any low-latency store.

What's the difference between access_token, id_token, and refresh_token in OIDC?
Per OpenID Connect Core 1.0: the id_token is an identity assertion (who you are), the access_token authorizes actions on resources (what you can do), and the refresh_token allows obtaining new access tokens without re-authentication. Mixing them up (for example, using the id_token to authorize API calls) is a design mistake: the id_token's audience is the client application, not the API.

What's the maximum size a JWT should be?
RFC 7519 doesn't define a limit. The practical limit comes from HTTP header size limits (around 8 KB by default on many servers), and every extra claim adds bytes to every request. Design rule: a JWT should only contain the claims the receiver needs to verify locally. Everything else, you fetch when you need it.

When does it make sense to implement full OIDC vs rolling your own auth with JWT?
Full OIDC makes sense when you have multiple client applications, SSO across systems, or you need interoperability with external providers. Custom auth with JWT can be sufficient for an internal system with a single client. The cost of full OIDC is real operational complexity: discovery endpoints, JWKS rotation, session management. Don't underestimate it. Related: the post on rate limiting before picking a library applies the same "do you actually need this right now?" criterion.

What happens if the identity server goes down but already-issued tokens are still valid?
That's exactly the stateless guarantee of JWT: verification without central coordination. If the auth server goes down, existing tokens keep working until they expire. That can be a feature (resilience) or a bug (inability to invalidate quickly in an emergency). Design knowing that guarantee cuts both ways.

The identity architecture problem isn't a library problem

The uncomfortable truth about this topic is that it doesn't get solved by picking the right Spring Security library or the most popular JWT middleware. It gets solved by making design decisions before writing code: what the token guarantees, what it doesn't, how user state changes, and who has authority to invalidate what.

My take: if you start with pure JWT because "it's stateless and scales well" without modeling revocation, state changes, and trust between services, you're not building an identity system. You're building basic authentication with a modern format. That's not the same thing.

What I'd do differently from the start: model the credential lifecycle first — states, transitions, who can trigger each one — before choosing the token mechanism. Then the JWT/OIDC/sessions choice becomes a consequence of the design, not its starting point.

If the system touches permissions that change, multiple devices, or legal auditing, persisted jti isn't premature optimization. It's the minimum floor.

The concrete next step: check whether your current system can answer "what happens if I need to invalidate all sessions for this user in the next 30 seconds?" If the answer is "wait for the tokens to expire," now you know exactly where the design hole is.
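One common way to give that question a better answer is a per-user cutoff: record "no tokens issued before T are accepted for this user" and compare it against the token's iat claim on each request. The sketch below assumes an in-memory map and a hypothetical `UserTokenCutoff` class; in practice the cutoff would live in the same shared store as the blocklist.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-user cutoff: one entry invalidates every older token at once.
final class UserTokenCutoff {
    private final Map<String, Instant> notBefore = new ConcurrentHashMap<>();

    // Reject every token for this user issued before the cutoff.
    void invalidateAllIssuedBefore(String userId, Instant cutoff) {
        // Never move the cutoff backwards if two invalidations race.
        notBefore.merge(userId, cutoff, (old, candidate) -> candidate.isAfter(old) ? candidate : old);
    }

    // tokenIssuedAt is the token's iat claim, read after signature validation.
    boolean isAccepted(String userId, Instant tokenIssuedAt) {
        Instant cutoff = notBefore.get(userId);
        return cutoff == null || !tokenIssuedAt.isBefore(cutoff);
    }
}
```

Unlike the jti blocklist, this needs no list of the user's tokens: a single write covers all of them, and tokens issued after the cutoff (a fresh login) keep working.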

Original sources: