Takayuki Kawazoe

One JWT, five services, and the python-jose audience list trap

audience must be a string or None.

That was the exception python-jose threw the moment our unified MCP server tried to talk to the second backend behind it. The token was valid. The signature checked out. The claims were correct. The library just refused to accept a list as the expected audience, and the JWT spec disagrees with the library on whether that should be a problem.

We run a single MCP server, codens-mcp on PyPI, that fronts five backends: Red (auto-fix), Blue (QA), Green (PRD), Purple (orchestration), and Auth. One MCP token, five destinations. When Claude calls a Red tool, the MCP server proxies an HTTP request to the Red backend carrying that same token. Same for Blue, Green, Purple, Auth. Each backend has its own primary audience for its own user-facing tokens, and we wanted all of them to also accept the MCP server's token without minting five service-specific JWTs per session.

This is the story of how that ran into a python-jose quirk, and the 12-line workaround we ended up shipping.

The architecture, briefly

Codens exposes 31 tools across the five product surfaces through one MCP server. From Claude's side it is a single connection. From the backends' side, each one sees a normal authenticated HTTP request with a bearer token in the header. The token is issued by the Auth service. Its aud claim is purple-codens-mcp, because the MCP server is the thing the user logged into when they connected their client.

Each backend already had its own audience for its first-party tokens. Green expects green-codens. Red expects red-codens. And so on. Those audiences were baked into the OAuth verifier and matched the audience claim on tokens minted by that service's own login flow.

We had two ways forward.

The first option: mint five tokens per MCP session. The MCP server logs into Red, Green, Blue, Purple, and Auth as the user, gets five JWTs, and selects the right one based on which tool the user invoked. This is conceptually clean. It also means five times the token issuance, five rotation surfaces, five sets of refresh flows to coordinate, and a routing layer in the MCP server that has to know which token belongs to which tool. None of that adds value.

The second option: mint one token, declare its audience as purple-codens-mcp, and teach every backend to accept that audience in addition to its own primary one. The MCP server holds one credential. Each backend keeps its primary audience for its own native flows and additionally trusts MCP-issued tokens. Rotation surface stays small. The routing logic in the MCP server disappears.

We picked option two. The plan was to add a per-service config that lists additional accepted audiences, expand the verifier to check against the union, and ship it.

Fix v1: pass a list to python-jose

The setting looked like this in every backend service:

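from pydantic_settings import BaseSettings  # assumes pydantic v2; v1 exposes it as pydantic.BaseSettings

class Settings(BaseSettings):
    OAUTH_AUDIENCE: str = "green-codens"  # this service's own primary audience
    OAUTH_ADDITIONAL_AUDIENCES: list[str] = ["purple-codens-mcp"]  # extra audiences to trust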

The verifier change looked equally innocuous. python-jose's jwt.decode accepts an audience keyword. The naive reading of every JWT tutorial on the internet says you give it the expected audience and it checks the token's aud against that. So we built a list of accepted audiences and handed it over:

audiences = [self.audience] if verify_audience and self.audience else []
if audiences and settings.OAUTH_ADDITIONAL_AUDIENCES:
    audiences.extend(settings.OAUTH_ADDITIONAL_AUDIENCES)

payload = jwt.decode(
    token,
    self.secret_key,
    algorithms=[self.algorithm],
    audience=audiences if audiences else None,
)

This is the version we wrote, ran a quick local smoke test against, and pushed to the dev environment thinking the work was done. The shape of the change matched the shape of the problem. A list of allowed audiences in, an aud claim checked against that list, request accepted. Done.

The dev environment, of course, immediately disagreed.

The trap

The MCP server made its first call into Green and the request came back as a 401. The Green logs had the actual exception underneath the generic auth failure:

TypeError: audience must be a string or None

python-jose's jwt.decode does not accept a list for its audience parameter. If you pass one, it raises before it compares anything against the token's aud claim. The library has only ever supported single-string audience verification. There is no flag, no overload, no helper that takes a list.

RFC 7519 takes the broader view. Section 4.1.3 defines aud as an array of case-sensitive strings in the general case, with a single case-sensitive string allowed as a special case when the token has one audience, and it requires each recipient to identify itself with a value in that claim. Those are set membership semantics. The token can name multiple audiences, and nothing in the spec stops a verifier from answering to more than one identifier. Whether either side is a list is a transport detail.

python-jose is one of the most-used Python JWT libraries. Most FastAPI tutorials reach for it without thinking. It is also old, and the maintainer activity is thin. There is a multi-year-old GitHub issue tracking exactly this limitation, with patches floating around in forks and pull requests that never merged. The library's behavior is what it is, and if you need list audience verification, you are on your own.

The honest read here is that the JWT spec describes capability and most libraries describe a comfortable subset of it. The subset is usually fine. The moment you do anything cross-service it stops being fine.

Fix v2: decode without audience verification, then verify manually

The fix that worked is to use python-jose for what it is good at, which is signature verification and claim decoding, and do the audience check ourselves. python-jose lets you disable individual claim checks through its options dict. verify_aud: False turns off the built-in audience verification entirely. The signature and expiry still get checked, along with any other claim checks you have enabled, such as the issuer when you pass one. We just take responsibility for aud.

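# Capture the caller's intent up front: verify only when asked to
# and when this service has a primary audience configured.
should_verify_aud = verify_audience and bool(self.audience)

# python-jose still verifies the signature and expiry; only its
# built-in aud check is switched off.
payload = jwt.decode(
    token,
    self.secret_key,
    algorithms=[self.algorithm],
    options={"verify_aud": False},
)

if should_verify_aud:
    allowed_audiences = {self.audience, *settings.OAUTH_ADDITIONAL_AUDIENCES}
    token_aud = payload.get("aud")
    # aud may be a string, a list of strings, or absent; normalize to a set.
    token_aud_set = (
        set(token_aud) if isinstance(token_aud, list)
        else {token_aud} if token_aud is not None
        else set()
    )
    # An empty intersection means the token was issued for someone else.
    if not (token_aud_set & allowed_audiences):
        raise InvalidTokenError(
            f"Invalid audience: token aud={token_aud!r}, expected one of {sorted(allowed_audiences)}"
        )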

The set intersection does the entire job. token_aud_set & allowed_audiences returns a set of values present in both, and if that set is empty the token is for someone else and we reject it. If the token's aud is a single string we wrap it in a one-element set. If it is a list we convert directly. If it is missing we get an empty set and the intersection is empty, which fails closed.
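
Those three cases are easy to pin down in isolation. What follows is a minimal sketch, not the shared OAuth module itself: audience_matches is a hypothetical helper name, and it takes the already-decoded aud value rather than a token, so it runs without python-jose.

```python
from typing import Iterable, List, Optional, Union

AudClaim = Optional[Union[str, List[str]]]


def audience_matches(token_aud: AudClaim, allowed: Iterable[str]) -> bool:
    """Return True if the token's aud shares at least one value with `allowed`."""
    if token_aud is None:
        token_aud_set = set()            # missing claim: nothing can match
    elif isinstance(token_aud, list):
        token_aud_set = set(token_aud)   # array form of aud
    else:
        token_aud_set = {token_aud}      # single-string form of aud
    return bool(token_aud_set & set(allowed))


# Hypothetical Green configuration: primary audience plus the MCP audience.
allowed = {"green-codens", "purple-codens-mcp"}

print(audience_matches("purple-codens-mcp", allowed))             # True
print(audience_matches(["red-codens", "green-codens"], allowed))  # True
print(audience_matches("red-codens", allowed))                    # False
print(audience_matches(None, allowed))                            # False
```

The last call is the one worth keeping as a regression test: a token with no aud at all must be rejected, not waved through.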

One subtle thing about the order. We compute should_verify_aud before calling jwt.decode, not after, because we want the variable to capture the caller's intent independent of what python-jose returns. If someone passes verify_audience=False, we skip the manual check entirely. If they pass verify_audience=True but the service has no configured audience, there is nothing to verify against, so we also skip. The manual block only runs when there is something real to check.

The error message includes both the token's actual aud value and the sorted list of audiences we accept. When you debug an inter-service auth failure at 2am, the last thing you want is a 401 that tells you nothing about the mismatch. The cost of formatting that message into the exception is zero and the time it saves is real.

The bonus pattern: decode and verify as separate steps

Once you have done this once, decoupling decoding from verification starts to feel like the right default for any JWT code that has to do anything non-trivial. The library is good at parsing the structure and confirming the signature. Your service is the one that knows which claims matter and what acceptance looks like.

The same pattern handles a bunch of adjacent problems. Token introspection for audit logs without re-running all the checks. Soft expiry where you log a warning at 90 percent of the lifetime instead of rejecting. Migration windows where you accept tokens signed with either the old or new key for a week. Custom claim validation that the library has never heard of. Whenever a future library bug lands in the issuer check or the expiry math, you have an escape hatch already in place because the verification logic is yours.
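
To make one of those concrete, soft expiry is a few lines once the decoded payload is in your hands. This is a sketch under stated assumptions, not code from our services: lifetime_used and warn_if_near_expiry are hypothetical helpers, the payload is assumed to carry numeric iat and exp claims, and the 0.9 default mirrors the 90 percent figure above.

```python
import logging
import time
from typing import Mapping, Optional

logger = logging.getLogger("auth")


def lifetime_used(payload: Mapping, now: Optional[float] = None) -> float:
    """Fraction of the token lifetime already elapsed (0.0 at issue, 1.0 at expiry)."""
    now = time.time() if now is None else now
    iat, exp = payload["iat"], payload["exp"]
    if exp <= iat:
        raise ValueError("exp must be later than iat")
    return (now - iat) / (exp - iat)


def warn_if_near_expiry(
    payload: Mapping, threshold: float = 0.9, now: Optional[float] = None
) -> bool:
    """Log a warning, without rejecting, once the token passes `threshold` of its lifetime."""
    used = lifetime_used(payload, now)
    near = used >= threshold
    if near:
        logger.warning(
            "token for sub=%r is at %.0f%% of its lifetime",
            payload.get("sub"),
            used * 100,
        )
    return near


# Hypothetical payload with a 1000-second lifetime.
payload = {"sub": "user-1", "iat": 1_000, "exp": 2_000}

print(warn_if_near_expiry(payload, now=1_500))  # False
print(warn_if_near_expiry(payload, now=1_950))  # True
```

This runs after jwt.decode has returned, so hard expiry is still the library's job. The helper only adds the earlier warning.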

This is also the answer even if python-jose ships list audience support tomorrow. You do not lose anything by owning the audience check. You gain a place to put the next requirement that does not fit cleanly into a kwarg.

Wrap

Multi-service authentication keeps running into the gap between what JWT can do and what the convenient libraries actually do. The spec is generous. The libraries are opinionated. When you stitch services together, the opinions usually have to give.

The unified-token path was worth the workaround. One JWT, one rotation, one issuer, five backends that each know how to accept it. The cost was a dozen lines of manual verification in a shared OAuth module. We would make the same trade again.

If you want to see how Codens uses this on the agent side, the English landing page is at https://www.codens.ai/en/. The MCP server is codens-mcp on PyPI and it is what the agent connects to when it needs to talk to any of the five product surfaces.
