"How a headless CLI logs in: implementing OAuth Device Code Flow for an MCP client"

#oauth #mcp #auth #redis

When you connect an MCP server to your own service, one unglamorous problem shows up fast: how does the CLI log in?

A web app with a browser can use the OAuth authorization code flow — redirect the user to a login page, exchange the returned code for a token. But MCP clients often run where there's no GUI browser: over SSH, in a CI container, on a headless box. The loopback trick (http://localhost:random_port as the redirect target) doesn't help either, because there's no browser to open.

OAuth has a proper answer for "authenticate a user where there's no browser": RFC 8628, the Device Authorization Grant, a.k.a. Device Code Flow. I implemented it in Codens' Auth service, so here's the design and the real code.

The idea: separate where you authenticate from where you approve

Device Code Flow splits the "device that shows a code" (the CLI) from the "device that approves" (your everyday browser). It's the same thing as logging into Netflix on a TV: a code appears on screen, you type it on your phone.

The flow:

1. The CLI calls /oauth/device/authorize and gets back a device_code (the machine's secret) and a user_code (a short code a human types).
2. The CLI shows the user "open this URL and enter ABCD-EFGH", then starts polling /oauth/device/token in the background.
3. The user opens the verification page in their normal browser, already logged in, enters the user_code, and approves.
4. The moment it's approved, the CLI's poll receives the token.

The CLI never opens a browser. The user approves from whatever browser they already have — phone, another laptop, anything.

Endpoint 1: device/authorize

The CLI calls this first. It takes a client_id and scope and issues the two codes.

@router.post("/device/authorize", response_model=DeviceAuthorizationResponse)
async def device_authorize(
    client_id: str = Form(...),
    scope: str = Form("openid profile email"),
    session: AsyncSession = Depends(get_session),
):
    # Is this client allowed to use the device_code grant?
    client = await OAuthClientRepository(session).get_by_client_id(client_id)
    if not client or not client.is_active:
        return JSONResponse(status_code=400,
            content={"error": "invalid_client", "error_description": "Unknown client"})

    allowed_grants = (client.grant_types or "").split()
    if _DEVICE_GRANT_TYPE not in allowed_grants:
        return JSONResponse(status_code=400,
            content={"error": "unauthorized_client",
                     "error_description": "Client not authorized for device_code grant"})

    store = get_device_code_store()
    result = await store.create(client_id=client_id, scope=scope)
    frontend_url = settings.FRONTEND_URL.rstrip("/")

    return DeviceAuthorizationResponse(
        device_code=result["device_code"],
        user_code=result["user_code"],
        verification_uri=f"{frontend_url}/device",
        expires_in=result["expires_in"],  # 900s
        interval=result["interval"],       # 5s poll interval
    )

_DEVICE_GRANT_TYPE is the RFC's canonical string urn:ietf:params:oauth:grant-type:device_code. If the client's grant_types doesn't include it, reject. Not everyone gets device flow — only clients that explicitly opt in.

Returning interval (5s) and expires_in (900s) matters: per RFC 8628, the server sets the expiry and the poll interval and reports them in the response, and a client falls back to 5 seconds only when interval is omitted. Don't let the client hardcode them.
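On the client side, honoring those server-provided values might look like the sketch below. The names `PollingParams` and `polling_params` are hypothetical and not part of Codens; the 5-second fallback is the default RFC 8628 gives for a missing `interval`.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PollingParams:
    interval: float  # seconds to wait between polls
    deadline: float  # clock value after which the client gives up


def polling_params(auth_response: dict, now: Optional[float] = None) -> PollingParams:
    """Derive polling behaviour from the device authorization response."""
    if now is None:
        now = time.monotonic()
    # RFC 8628: if the server omits "interval", the client uses 5 seconds.
    interval = float(auth_response.get("interval", 5))
    # "expires_in" is required in the response, so there is no default here.
    deadline = now + float(auth_response["expires_in"])
    return PollingParams(interval=interval, deadline=deadline)
```

Taking both numbers from the response means a server-side change to the TTL or interval needs no client release.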

Making the two codes

device_code and user_code play different roles, so build them differently.

# device_code: a secret the machine holds. Just needs to be unguessable.
device_code = secrets.token_urlsafe(32)

# user_code: typed by a human. Readability comes first.
_USER_CODE_CHARS = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"  # drop confusable 0/O/1/I/L

def _generate_user_code() -> str:
    left = "".join(secrets.choice(_USER_CODE_CHARS) for _ in range(4))
    right = "".join(secrets.choice(_USER_CODE_CHARS) for _ in range(4))
    return f"{left}-{right}"  # ABCD-EFGH

device_code gets token_urlsafe(32): 32 random bytes (256 bits), encoded as 43 URL-safe characters. Anyone who can guess or obtain it can poll for the token, so entropy wins here.

user_code is typed by hand, so drop the confusable characters (0/O, 1/I/L) from the alphabet. The ABCD-EFGH hyphenated shape makes typos easier to spot. It's a small security-for-UX trade, and it's fine: the user_code is only used by an already-logged-in user to approve — it's not the token.
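To put numbers on that trade, the arithmetic can be checked in a few lines. This is a back-of-the-envelope sketch that reuses the alphabet and lengths from the code above.

```python
import math
import secrets

USER_CODE_CHARS = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"

# user_code: 8 symbols drawn from a 31-character alphabet.
user_keyspace = len(USER_CODE_CHARS) ** 8
user_bits = math.log2(user_keyspace)

# device_code: token_urlsafe(32) encodes 32 random bytes.
device_bits = 32 * 8

print(len(USER_CODE_CHARS))            # 31
print(user_keyspace)                   # 852891037441
print(round(user_bits, 1))             # 39.6
print(device_bits)                     # 256
print(len(secrets.token_urlsafe(32)))  # 43
```

Roughly 40 bits would be far too little for a bearer secret, but it is workable for a code that expires in 15 minutes and can only be redeemed by a logged-in user. RFC 8628 also discusses rate-limiting user_code attempts as a defense against brute forcing.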

Storage: two Redis keys

State lives in Redis: a primary key from device_code to the state, and an index from user_code to device_code.

import json
import secrets
import time

_CODE_PREFIX = "device:code:"            # device:code:{device_code} -> state JSON (primary)
_USER_CODE_PREFIX = "device:user_code:"  # device:user_code:{user_code} -> device_code (index)
_DEVICE_CODE_TTL = 900                    # 15 min, matches the MCP client timeout
_POLL_INTERVAL = 5                        # seconds between polls, returned as "interval"

async def create(self, client_id: str, scope: str) -> dict:
    device_code = secrets.token_urlsafe(32)
    user_code = _generate_user_code()
    now = time.time()

    data = {
        "device_code": device_code, "user_code": user_code,
        "client_id": client_id, "scope": scope,
        "status": "pending", "user_id": None,
        "created_at": now, "expires_at": now + _DEVICE_CODE_TTL,
    }

    # Both keys, same TTL, one round trip via pipeline.
    # `client` is the async Redis client (its setup is not shown here).
    pipe = client.pipeline()
    pipe.set(f"{_CODE_PREFIX}{device_code}", json.dumps(data), ex=_DEVICE_CODE_TTL)
    pipe.set(f"{_USER_CODE_PREFIX}{user_code}", device_code, ex=_DEVICE_CODE_TTL)
    await pipe.execute()

    # The four fields the device/authorize endpoint reads from the result.
    return {
        "device_code": device_code, "user_code": user_code,
        "expires_in": _DEVICE_CODE_TTL, "interval": _POLL_INTERVAL,
    }

Why two keys? Polling arrives by device_code (that's what the CLI holds). Approval arrives by user_code (that's what the user types). You need to look up from both directions, so you keep a separate index. Put the same TTL on both and they expire together after 15 minutes — no cleanup job to write. That's the Redis TTL paying off.
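The same two-key layout can be mimicked with plain dictionaries, which is handy in unit tests. The sketch below is a hypothetical in-memory stand-in, not the Redis-backed store from this article: the class name `InMemoryDeviceCodeStore` and the injectable `clock` are assumptions, and expiry is checked lazily on read instead of being handled by Redis.

```python
import secrets
import time
from typing import Callable, Optional

TTL = 900  # seconds, same value as _DEVICE_CODE_TTL


class InMemoryDeviceCodeStore:
    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._by_device = {}  # primary: device_code -> state dict
        self._by_user = {}    # index:   user_code   -> device_code

    def create(self, client_id: str, user_code: str) -> dict:
        now = self._clock()
        device_code = secrets.token_urlsafe(32)
        state = {"device_code": device_code, "user_code": user_code,
                 "client_id": client_id, "status": "pending",
                 "expires_at": now + TTL}
        self._by_device[device_code] = state
        self._by_user[user_code] = device_code
        return state

    def get_by_device_code(self, device_code: str) -> Optional[dict]:
        state = self._by_device.get(device_code)
        if state is None:
            return None
        if self._clock() >= state["expires_at"]:
            # Drop both entries together, as the shared TTL would.
            self._by_device.pop(device_code, None)
            self._by_user.pop(state["user_code"], None)
            return None
        return state

    def get_by_user_code(self, user_code: str) -> Optional[dict]:
        # Follow the index, then reuse the primary lookup (and its expiry check).
        device_code = self._by_user.get(user_code)
        return self.get_by_device_code(device_code) if device_code else None
```

Polling code would call `get_by_device_code` and the approval page would call `get_by_user_code`; both end up at the same state record.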

Normalize when looking up by user_code, because it's human input — it'll arrive lowercase or without the hyphen.

async def get_by_user_code(self, user_code: str):
    user_code = user_code.upper().strip()
    if len(user_code) == 8 and "-" not in user_code:
        user_code = f"{user_code[:4]}-{user_code[4:]}"  # ABCDEFGH -> ABCD-EFGH
    device_code = await client.get(f"{_USER_CODE_PREFIX}{user_code}")
    ...

abcdefgh and ABCD-EFGH both work. Being strict here causes "it's correct but rejected" UX bugs, so be lenient on input.
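Pulled out as a pure function, the normalization is easy to unit-test. The sketch below is a slightly more lenient variant of the snippet above (it also tolerates spaces and a misplaced hyphen); the name `normalize_user_code` is hypothetical.

```python
def normalize_user_code(raw: str) -> str:
    """Canonicalize human input to the ABCD-EFGH shape when possible."""
    # Uppercase, then drop whitespace and hyphens wherever they appear.
    compact = "".join(ch for ch in raw.upper() if ch not in "- \t\r\n")
    if len(compact) == 8:
        return f"{compact[:4]}-{compact[4:]}"
    # Wrong length: return the compacted string so the lookup simply misses.
    return compact


print(normalize_user_code("abcdefgh"))     # ABCD-EFGH
print(normalize_user_code(" abcd-efgh "))  # ABCD-EFGH
print(normalize_user_code("ABCD EFGH"))    # ABCD-EFGH
```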

Endpoint 2: device/token (the poll target)

The CLI hits this every few seconds. It returns different answers for "not yet", "denied", "expired", and "here you go".

@router.post("/device/token")
async def device_token(
    grant_type: str = Form(...),
    device_code: str = Form(...),
    client_id: str = Form(...),
    session: AsyncSession = Depends(get_session),
):
    if grant_type != _DEVICE_GRANT_TYPE:
        return JSONResponse(status_code=400, content={"error": "unsupported_grant_type"})

    store = get_device_code_store()  # reused below for delete()
    data = await store.get_by_device_code(device_code)
    if data is None:
        return JSONResponse(status_code=400, content={"error": "expired_token"})
    if data["client_id"] != client_id:
        return JSONResponse(status_code=400, content={"error": "invalid_client"})

    if data["status"] == "pending":
        return JSONResponse(status_code=400, content={"error": "authorization_pending"})
    if data["status"] == "denied":
        await store.delete(device_code)
        return JSONResponse(status_code=400, content={"error": "access_denied"})

    if data["status"] == "authorized":
        # Issue tokens through the same path as the authorization_code flow
        ...
        await store.delete(device_code)  # one-time use
        return JSONResponse(status_code=200, content=token_response.to_dict(),
            headers={"Cache-Control": "no-store", "Pragma": "no-cache"})

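These error strings come from the RFCs, so don't invent your own: authorization_pending, access_denied, and expired_token are the polling errors RFC 8628 lists for this endpoint, while unsupported_grant_type and invalid_client come from RFC 6749. In particular, authorization_pending means "the user just hasn't approved yet, this isn't an error, keep polling at the same interval", and any decent client library will quietly wait on it. On access_denied, delete the device_code immediately, since there is no reason to keep a rejected code alive.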
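From the client's side, handling those responses might look roughly like this. It is a sketch, not the Codens CLI: `post_token` is a hypothetical callable that performs the HTTP POST to the token endpoint and returns `(status_code, json_body)`, and `sleep` and `clock` are injected so the loop can be exercised without a network or real waiting. The `slow_down` branch is not something the server code above emits; RFC 8628 defines it, so a general-purpose client should still handle it.

```python
import time
from typing import Callable, Tuple

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


class DeviceFlowError(Exception):
    """Raised when polling ends without a token."""


def poll_for_token(
    post_token: Callable[[dict], Tuple[int, dict]],
    device_code: str,
    client_id: str,
    interval: float,
    expires_in: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    deadline = clock() + expires_in
    form = {"grant_type": DEVICE_GRANT,
            "device_code": device_code,
            "client_id": client_id}
    while clock() < deadline:
        sleep(interval)  # wait first: the user needs time to approve
        status, body = post_token(form)
        if status == 200:
            return body  # the token response
        error = body.get("error")
        if error == "authorization_pending":
            continue       # not approved yet, keep the same interval
        if error == "slow_down":
            interval += 5  # RFC 8628: add 5 seconds for all later polls
            continue
        # access_denied, expired_token, or anything unexpected: stop polling.
        raise DeviceFlowError(error or f"HTTP {status}")
    raise DeviceFlowError("expired_token")
```

Treating every unrecognized error as fatal keeps the loop from polling forever against a server that returns something unexpected.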

When authorized, issue the token through the same TokenGenerator as the authorization_code flow. Device flow doesn't change what's in the token: store a hash of the refresh token in the DB and add an id_token if the openid scope is present, exactly as on the normal path. Then delete the device_code to guarantee one-time use. You can't redeem the same device_code twice.

Don't forget Cache-Control: no-store on the token response. A token cached by a proxy or browser is an incident waiting to happen.

Endpoint 3: device/verify (the human approval side)

Called from the verification page (/device). This is the one endpoint that assumes a logged-in user, so current_user is required.

@router.post("/device/verify")
async def device_verify(body: DeviceVerifyRequest, current_user: CurrentUser):
    store = get_device_code_store()
    data = await store.get_by_user_code(body.user_code)
    if data is None:
        raise HTTPException(404, "Invalid or expired code")
    if data["status"] != "pending":
        raise HTTPException(400, "This code has already been used")

    if body.action == "approve":
        # authorize() re-checks status and expiry; it returns False if the
        # code expired or was used between the lookup above and this call.
        if not await store.authorize(body.user_code, str(current_user.id)):
            raise HTTPException(400, "Invalid or expired code")
        return {"status": "authorized", "client_id": data["client_id"]}
    else:
        await store.deny(body.user_code)
        return {"status": "denied"}

This is the crucial split. The CLI (holding the device_code) receives the token, but who the token is issued for is decided by the user logged into this browser. store.authorize looks up the pending request by user_code and records current_user.id on it.

async def authorize(self, user_code: str, user_id: str) -> bool:
    data = await self.get_by_user_code(user_code)
    if data is None or data["status"] != "pending":
        return False  # block double-approval / expiry
    data["status"] = "authorized"
    data["user_id"] = user_id
    remaining = int(data["expires_at"] - time.time())
    if remaining <= 0:
        return False
    await client.set(f"{_CODE_PREFIX}{data['device_code']}", json.dumps(data), ex=remaining)
    return True

The status != "pending" check stops an already-approved or denied code from being approved again. The state machine only moves forward: pending → authorized or pending → denied. Recomputing the remaining TTL and re-setting with ex=remaining means approving doesn't extend the lifetime; the code still dies at the original 15-minute mark.
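The transition rule and the TTL arithmetic can be isolated from Redis and tested on their own. The helper below is a hypothetical sketch (`transition` is not a function from this article): it returns the updated state plus the number of seconds to pass as `ex`, or `None` when the move is not allowed.

```python
from typing import Optional, Tuple

# The only legal moves: pending -> authorized, pending -> denied.
ALLOWED = {("pending", "authorized"), ("pending", "denied")}


def transition(
    data: dict, new_status: str, now: float, user_id: Optional[str] = None
) -> Optional[Tuple[dict, int]]:
    """Return (updated_state, remaining_ttl), or None if the move is rejected."""
    if (data["status"], new_status) not in ALLOWED:
        return None  # already authorized or denied
    remaining = int(data["expires_at"] - now)
    if remaining <= 0:
        return None  # expired: never resurrect the code
    updated = {**data, "status": new_status}  # copy, leave the input untouched
    if new_status == "authorized":
        updated["user_id"] = user_id  # bind the approving user
    return updated, remaining
```

Both this sketch and the `authorize` method above read the state and write it back in two steps, so if two requests race the last write wins. Whether that matters depends on the deployment; making the update atomic (for example with a Redis Lua script or WATCH) would be a further hardening step.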

Register it in OIDC discovery

Finally, add device_authorization_endpoint to .well-known/openid-configuration so an RFC 8628-aware client library can discover the endpoint automatically.

# well_known.py
"device_authorization_endpoint": f"{base_url}/oauth/device/authorize",

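And add the device grant type (urn:ietf:params:oauth:grant-type:device_code, the string that device/authorize checks for) to the grant_types of the client, here Codens MCP. It only works once both server and client support it.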
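Put together, the device-flow slice of the discovery document might look like the sketch below. The helper name `device_flow_metadata` is hypothetical, and advertising the grant in `grant_types_supported` is an assumption about how a client might check for support; the article itself only shows the `device_authorization_endpoint` entry.

```python
DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def device_flow_metadata(base_url: str) -> dict:
    """Discovery entries related to the device flow (illustrative only)."""
    base_url = base_url.rstrip("/")
    return {
        # Shown in the article's well_known.py.
        "device_authorization_endpoint": f"{base_url}/oauth/device/authorize",
        # Assumption: also advertise the grant type alongside authorization_code.
        "grant_types_supported": ["authorization_code", DEVICE_GRANT],
    }
```

These entries would be merged into the existing openid-configuration response, not served on their own.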

Takeaways

Device Code Flow looks niche — "authenticate a user without a browser" — but it shows up a lot: MCP, CLI tools, IoT, TV apps. The implementation points that matter:

- Build the two codes by role: device_code is a machine secret (high entropy), user_code is human input (readable, confusable chars removed).
- Two Redis keys (primary + index) plus a TTL makes expiry cleanup structurally unnecessary.
- The state machine starts at pending and is one-directional; approval happens on a separate endpoint by a logged-in user; tokens are one-time use.
- Follow the RFC for error strings and let the server drive interval / expires_in.

Anyone building a tool that connects MCP to their own service will hit "how do I log in headless" eventually. Hope this is a useful starting point.

Codens builds all of this auth machinery into the product.

https://www.codens.ai/en/