Stop replay attacks on AI agent tokens

#ai #python #security #opensource

Scope tokens let you prove what an AI agent is allowed to do. I wrote about portable scope tokens a few days ago. They travel with requests so the receiving service can verify permissions without calling back to your org.

But there is a problem I did not cover. If someone intercepts a valid scope token, they can replay it. The token is signed, unexpired, and carries legitimate permissions. Nothing stops a second use.

The replay problem

Say your agent creates a scope token with data:read permission and a one-hour TTL. It sends a request to a partner API with the token attached. An attacker sitting on the network copies that token. They now have 59 minutes to make their own requests using the same token, with the same permissions.

The signature still verifies. The TTL has not expired. The actions list matches. From the receiver's perspective, the replayed request looks identical to the original.
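
To make that concrete, here is a minimal sketch of a stateless verifier. It is not asqav's token format: the HMAC signing scheme, the `SECRET` key, and the `sign` and `verify_stateless` helpers are all assumptions made up for illustration. It only shows that a receiver which checks signature, expiry, and actions, but remembers nothing, accepts the same token twice.

```python
import hashlib
import hmac
import json
import time

SECRET = b"demo-shared-secret"  # hypothetical key, for illustration only


def sign(claims: dict) -> str:
    """Sign the claims with HMAC-SHA256 over a canonical JSON encoding."""
    payload = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def verify_stateless(claims: dict, signature: str, now: float) -> bool:
    """Check signature, expiry, and actions, keeping no memory of past tokens."""
    if not hmac.compare_digest(sign(claims), signature):
        return False
    if now >= claims["expires_at"]:
        return False
    return "data:read" in claims["actions"]


issued = time.time()
claims = {"actions": ["data:read"], "expires_at": issued + 3600}
signature = sign(claims)

first_use = verify_stateless(claims, signature, now=issued + 1)
replayed = verify_stateless(claims, signature, now=issued + 60)  # attacker's copy
print(first_use, replayed)
```

Both calls pass, because nothing in the token or the verifier distinguishes the first use from the second.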

Short TTLs reduce the window but do not close it. Even a 60-second TTL gives an attacker 60 seconds.

Nonce-based replay protection

The fix in asqav v0.2.14 is straightforward. Every scope token now includes a unique nonce, a random string generated at creation time. The receiver tracks which nonces it has already seen. If a nonce shows up twice, the token gets rejected.

import asqav

agent = asqav.Agent.create("api-caller")
token = agent.create_scope_token(actions=["data:read"], ttl=3600)

# Each token has a unique nonce
print(token.nonce)  # "a1b2c3d4..."

# Receiver side
seen_nonces = set()
if asqav.is_replay(token.nonce, seen_nonces):
    raise ValueError("Replay detected")
seen_nonces.add(token.nonce)

The pattern is simple: generate, check, reject duplicates. The nonce set only needs to hold entries until the token's TTL expires. After that, the token is invalid anyway, so you can clean up old nonces.
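
The cleanup step can be sketched as a small in-memory cache that stores each nonce alongside its token's expiry time. This is an illustration, not part of asqav: the `NonceCache` class and its `check_and_store` method are hypothetical names, and the sketch assumes the receiver knows each token's absolute expiry timestamp.

```python
import time
from typing import Dict, Optional


class NonceCache:
    """In-memory nonce store that forgets entries once their token has expired."""

    def __init__(self) -> None:
        self._expiry_by_nonce: Dict[str, float] = {}

    def check_and_store(
        self, nonce: str, expires_at: float, now: Optional[float] = None
    ) -> bool:
        """Return True if the nonce is fresh (and record it), False on replay."""
        now = time.time() if now is None else now
        self._purge(now)
        if nonce in self._expiry_by_nonce:
            return False
        self._expiry_by_nonce[nonce] = expires_at
        return True

    def _purge(self, now: float) -> None:
        """Drop nonces whose tokens have already expired."""
        expired = [n for n, exp in self._expiry_by_nonce.items() if exp <= now]
        for n in expired:
            del self._expiry_by_nonce[n]

    def __len__(self) -> int:
        return len(self._expiry_by_nonce)


cache = NonceCache()
first = cache.check_and_store("a1b2c3d4", expires_at=1000.0, now=0.0)
second = cache.check_and_store("a1b2c3d4", expires_at=1000.0, now=10.0)
print(first, second)
```

The purge scans every entry on each call, which is fine for a sketch but not for high traffic. The cache also relies on the receiver rejecting expired tokens before it consults the nonce store, since a purged nonce would otherwise look fresh again.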

Why not just use shorter TTLs

Shorter TTLs help, but they create a different problem: your agent has to mint new tokens more often, which adds latency. And even a 1-second TTL leaves room for an automated replay within that window.

Nonces and TTLs work together. The TTL limits how long the nonce set needs to persist. The nonce ensures each token is truly single-use regardless of TTL length.
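
A rough way to see the trade-off: if every token is single-use and its nonce is dropped at expiry, the receiver never tracks more nonces than it accepts during one TTL window. The function name and the traffic figure of 50 tokens per second below are made-up assumptions for illustration.

```python
def max_tracked_nonces(tokens_per_second: float, ttl_seconds: float) -> int:
    """Upper bound on live nonces: tokens accepted during one TTL window."""
    return int(tokens_per_second * ttl_seconds)


# Hypothetical steady load of 50 tokens per second
for ttl in (60, 3600):
    print(ttl, max_tracked_nonces(50, ttl))
# prints:
# 60 3000
# 3600 180000
```

A longer TTL costs memory in the nonce store rather than security, since the nonce check still rejects every second use.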

Implementation notes

For a single receiver process, an in-memory set works fine. If you are running multiple receiver instances behind a load balancer, you need a shared store like Redis so all instances see the same nonce set.

The is_replay helper handles the check and cleanup for you. Pass in the nonce and your set. It returns True if the nonce was already seen.

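# Production setup with Redis
import redis

r = redis.Redis()

def check_nonce(token):
    key = f"nonce:{token.nonce}"
    # SET with NX and EX is atomic: the key is written only if it does not
    # already exist, and it expires after the token's TTL. A separate
    # exists() followed by setex() would let two concurrent replays both pass.
    if not r.set(key, 1, nx=True, ex=token.ttl):
        raise ValueError("Replay detected")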