When you connect an MCP server to your own service, one unglamorous problem shows up fast: how does the CLI log in?
A web app with a browser can use the OAuth authorization code flow — redirect the user to a login page, exchange the returned code for a token. But MCP clients often run where there's no GUI browser: over SSH, in a CI container, on a headless box. The loopback trick (http://localhost:random_port as the redirect target) doesn't help either, because there's no browser to open.
OAuth has a proper answer for "authenticate a user where there's no browser": RFC 8628, the Device Authorization Grant, a.k.a. Device Code Flow. I implemented it in Codens' Auth service, so here's the design and the real code.
The idea: separate where you authenticate from where you approve
Device Code Flow splits the "device that shows a code" (the CLI) from the "device that approves" (your everyday browser). It's the same thing as logging into Netflix on a TV: a code appears on screen, you type it on your phone.
The flow:
- The CLI calls /oauth/device/authorize and gets back a device_code (the machine's secret) and a user_code (a short code a human types).
- The CLI shows the user "open this URL and enter ABCD-EFGH", then starts polling /oauth/device/token in the background.
- The user opens the verification page in their normal browser, already logged in, enters the user_code, and approves.
- The moment it's approved, the CLI's poll receives the token.
The CLI never opens a browser. The user approves from whatever browser they already have — phone, another laptop, anything.
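Here's roughly what the CLI side of that flow can look like. This is a minimal sketch, not the Codens client: the `post` callable is a hypothetical transport hook (it takes a URL and form fields and returns the HTTP status plus the parsed JSON body), and `device_login`, `sleep`, and `clock` are names I made up so the loop can be exercised without a network. The endpoint paths and field names are the ones used in this post. The `slow_down` branch follows RFC 8628 even though the server code shown here never sends it.

```python
import time
from typing import Callable, Dict, Tuple

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"

# Hypothetical transport hook: post(url, form) -> (http_status, json_body).
Post = Callable[[str, Dict[str, str]], Tuple[int, dict]]


def device_login(
    base_url: str,
    client_id: str,
    post: Post,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Run the device flow and return the token response body."""
    status, grant = post(
        f"{base_url}/oauth/device/authorize",
        {"client_id": client_id, "scope": "openid profile email"},
    )
    if status != 200:
        raise RuntimeError(f"device authorization failed: {grant.get('error')}")

    print(f"Open {grant['verification_uri']} and enter {grant['user_code']}")

    interval = grant.get("interval", 5)        # RFC 8628 default when omitted
    deadline = clock() + grant["expires_in"]   # lifetime is dictated by the server

    while clock() < deadline:
        sleep(interval)
        status, body = post(
            f"{base_url}/oauth/device/token",
            {
                "grant_type": DEVICE_GRANT,
                "device_code": grant["device_code"],
                "client_id": client_id,
            },
        )
        if status == 200:
            return body
        error = body.get("error")
        if error == "authorization_pending":
            continue                           # not approved yet, same interval
        if error == "slow_down":
            interval += 5                      # RFC 8628: add 5 seconds and keep going
            continue
        raise RuntimeError(f"device login failed: {error}")

    raise TimeoutError("device code expired before the user approved")
```

Note that the client never hardcodes timing: both the sleep interval and the overall deadline come from the authorize response.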
Endpoint 1: device/authorize
The CLI calls this first. It takes a client_id and scope and issues the two codes.
@router.post("/device/authorize", response_model=DeviceAuthorizationResponse)
async def device_authorize(
    client_id: str = Form(...),
    scope: str = Form("openid profile email"),
    session: AsyncSession = Depends(get_session),
):
    # Is this client allowed to use the device_code grant?
    client = await OAuthClientRepository(session).get_by_client_id(client_id)
    if not client or not client.is_active:
        return JSONResponse(status_code=400,
            content={"error": "invalid_client", "error_description": "Unknown client"})

    allowed_grants = (client.grant_types or "").split()
    if _DEVICE_GRANT_TYPE not in allowed_grants:
        return JSONResponse(status_code=400,
            content={"error": "unauthorized_client",
                     "error_description": "Client not authorized for device_code grant"})

    store = get_device_code_store()
    result = await store.create(client_id=client_id, scope=scope)

    frontend_url = settings.FRONTEND_URL.rstrip("/")
    return DeviceAuthorizationResponse(
        device_code=result["device_code"],
        user_code=result["user_code"],
        verification_uri=f"{frontend_url}/device",
        expires_in=result["expires_in"],  # 900s
        interval=result["interval"],      # 5s poll interval
    )
_DEVICE_GRANT_TYPE is the RFC's canonical string urn:ietf:params:oauth:grant-type:device_code. If the client's grant_types doesn't include it, reject. Not everyone gets device flow — only clients that explicitly opt in.
Returning interval (5s) and expires_in (900s) matters: per RFC, the server dictates the poll interval and expiry and tells the client. Don't let the client hardcode them.
Making the two codes
device_code and user_code play different roles, so build them differently.
import secrets

# device_code: a secret the machine holds. Just needs to be unguessable.
device_code = secrets.token_urlsafe(32)

# user_code: typed by a human. Readability comes first.
_USER_CODE_CHARS = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"  # drop confusable 0/O/1/I/L

def _generate_user_code() -> str:
    left = "".join(secrets.choice(_USER_CODE_CHARS) for _ in range(4))
    right = "".join(secrets.choice(_USER_CODE_CHARS) for _ in range(4))
    return f"{left}-{right}"  # ABCD-EFGH
device_code gets token_urlsafe(32) — if it leaks, someone can grab the token, so entropy wins here.
user_code is typed by hand, so drop the confusable characters (0/O, 1/I/L) from the alphabet. The ABCD-EFGH hyphenated shape makes typos easier to spot. It's a small security-for-UX trade, and it's fine: the user_code is only used by an already-logged-in user to approve — it's not the token.
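To put a number on that trade, here's a quick back-of-the-envelope sketch. The alphabet and the 8-character length come from the snippet above; the helper names are mine, and the guessing estimate is a simple union-bound approximation that assumes an attacker is already logged in and guessing against however many codes happen to be pending.

```python
import math

USER_CODE_CHARS = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"  # same alphabet as above
USER_CODE_LENGTH = 8                                  # two groups of four


def user_code_space(alphabet: str = USER_CODE_CHARS,
                    length: int = USER_CODE_LENGTH) -> int:
    """Number of distinct user codes the generator can produce."""
    return len(alphabet) ** length


def entropy_bits(space: int) -> float:
    """Entropy of a uniformly random pick from `space` possibilities."""
    return math.log2(space)


def guess_success_probability(attempts: int, live_codes: int, space: int) -> float:
    """Rough upper bound on hitting any live code within `attempts` guesses."""
    return min(1.0, attempts * live_codes / space)


if __name__ == "__main__":
    space = user_code_space()
    print(f"user_code space: {space} codes, {entropy_bits(space):.1f} bits")
    print("device_code: 256 bits (32 random bytes from token_urlsafe)")
```

With 31 characters and 8 positions the user_code has a bit under 40 bits of entropy, against 256 for the device_code. That gap is acceptable because of the 15-minute lifetime and the login requirement on the approval endpoint; rate-limiting failed attempts there would tighten it further, though that isn't shown in this post.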
Storage: two Redis keys
State lives in Redis: a primary key from device_code to the state, and an index from user_code to device_code.
_CODE_PREFIX = "device:code:"            # device:code:{device_code} -> state JSON (primary)
_USER_CODE_PREFIX = "device:user_code:"  # device:user_code:{user_code} -> device_code (index)
_DEVICE_CODE_TTL = 900                   # 15 min, matches the MCP client timeout

async def create(self, client_id: str, scope: str) -> dict:
    device_code = secrets.token_urlsafe(32)
    user_code = _generate_user_code()
    now = time.time()  # epoch seconds; expires_at is compared against time.time() later
    data = {
        "device_code": device_code, "user_code": user_code,
        "client_id": client_id, "scope": scope,
        "status": "pending", "user_id": None,
        "created_at": now, "expires_at": now + _DEVICE_CODE_TTL,
    }
    # Both keys, same TTL, one round trip via pipeline.
    pipe = client.pipeline()
    pipe.set(f"{_CODE_PREFIX}{device_code}", json.dumps(data), ex=_DEVICE_CODE_TTL)
    pipe.set(f"{_USER_CODE_PREFIX}{user_code}", device_code, ex=_DEVICE_CODE_TTL)
    await pipe.execute()
Why two keys? Polling arrives by device_code (that's what the CLI holds). Approval arrives by user_code (that's what the user types). You need to look up from both directions, so you keep a separate index. Put the same TTL on both and they expire together after 15 minutes — no cleanup job to write. That's the Redis TTL paying off.
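If you want to convince yourself that the two-key layout behaves as described without a Redis instance, a toy model is enough. The sketch below is not the real store: `FakeTTLStore` is a hypothetical in-memory stand-in that only mimics `SET ... EX` and `GET`, and the three helper functions are simplified, synchronous versions of the lookups. Key prefixes and the 900-second TTL match the snippet above.

```python
import json
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

CODE_PREFIX = "device:code:"
USER_CODE_PREFIX = "device:user_code:"
TTL = 900


class FakeTTLStore:
    """Tiny in-memory stand-in for Redis SET ... EX / GET (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, ex: int) -> None:
        self._items[key] = (value, self._clock() + ex)

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None or entry[1] <= self._clock():
            self._items.pop(key, None)  # lazily drop expired entries
            return None
        return entry[0]


def create(store: FakeTTLStore, user_code: str, client_id: str) -> str:
    """Write the primary record and the user_code index with the same TTL."""
    device_code = secrets.token_urlsafe(32)
    state = {"device_code": device_code, "user_code": user_code,
             "client_id": client_id, "status": "pending"}
    store.set(f"{CODE_PREFIX}{device_code}", json.dumps(state), ex=TTL)
    store.set(f"{USER_CODE_PREFIX}{user_code}", device_code, ex=TTL)
    return device_code


def by_device_code(store: FakeTTLStore, device_code: str) -> Optional[dict]:
    """Polling path: one lookup on the primary key."""
    raw = store.get(f"{CODE_PREFIX}{device_code}")
    return json.loads(raw) if raw is not None else None


def by_user_code(store: FakeTTLStore, user_code: str) -> Optional[dict]:
    """Approval path: index lookup, then the primary key."""
    device_code = store.get(f"{USER_CODE_PREFIX}{user_code}")
    return by_device_code(store, device_code) if device_code is not None else None
```

Advancing the injected clock past the TTL makes both lookups return None at the same moment, which is the "no cleanup job" property in miniature.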
Normalize when looking up by user_code, because it's human input — it'll arrive lowercase or without the hyphen.
async def get_by_user_code(self, user_code: str):
    user_code = user_code.upper().strip()
    if len(user_code) == 8 and "-" not in user_code:
        user_code = f"{user_code[:4]}-{user_code[4:]}"  # ABCDEFGH -> ABCD-EFGH
    device_code = await client.get(f"{_USER_CODE_PREFIX}{user_code}")
    ...
abcdefgh and ABCD-EFGH both work. Being strict here causes "it's correct but rejected" UX bugs, so be lenient on input.
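The normalization is easy to pull out into a pure function and test on its own. The version below is a slightly more lenient variant than the method above, not a copy of it: `normalize_user_code` is a name I'm introducing, it also tolerates spaces between the groups, and it rejects anything outside the alphabet before touching Redis. Treat those extras as assumptions about what you'd want.

```python
import re
from typing import Optional

USER_CODE_CHARS = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"  # same alphabet as the generator
_SEPARATORS = re.compile(r"[\s\-]+")                 # whitespace and hyphens


def normalize_user_code(raw: str) -> Optional[str]:
    """Return the canonical XXXX-XXXX form, or None if the input can't be a code."""
    compact = _SEPARATORS.sub("", raw).upper()
    if len(compact) != 8:
        return None
    if any(ch not in USER_CODE_CHARS for ch in compact):
        return None
    return f"{compact[:4]}-{compact[4:]}"
```

Returning None for impossible input (wrong length, or a character like 0 or O that the generator never emits) lets the endpoint answer "invalid code" without a store round trip.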
Endpoint 2: device/token (the poll target)
The CLI hits this every few seconds. It returns different answers for "not yet", "denied", "expired", and "here you go".
@router.post("/device/token")
async def device_token(
    grant_type: str = Form(...),
    device_code: str = Form(...),
    client_id: str = Form(...),
    session: AsyncSession = Depends(get_session),
):
    if grant_type != _DEVICE_GRANT_TYPE:
        return JSONResponse(status_code=400, content={"error": "unsupported_grant_type"})

    # Bind the store once; the delete() calls below use the same instance.
    store = get_device_code_store()
    data = await store.get_by_device_code(device_code)
    if data is None:
        return JSONResponse(status_code=400, content={"error": "expired_token"})
    if data["client_id"] != client_id:
        return JSONResponse(status_code=400, content={"error": "invalid_client"})

    if data["status"] == "pending":
        return JSONResponse(status_code=400, content={"error": "authorization_pending"})
    if data["status"] == "denied":
        await store.delete(device_code)
        return JSONResponse(status_code=400, content={"error": "access_denied"})
    if data["status"] == "authorized":
        # Issue tokens through the same path as the authorization_code flow
        ...
        await store.delete(device_code)  # one-time use
        return JSONResponse(status_code=200, content=token_response.to_dict(),
            headers={"Cache-Control": "no-store", "Pragma": "no-cache"})
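These error strings come from the specs, so don't invent your own: authorization_pending, access_denied, and expired_token are defined by RFC 8628 for this grant, while invalid_client and unsupported_grant_type are the generic token-endpoint errors from RFC 6749. In particular, authorization_pending means "the user just hasn't approved yet, this isn't a failure, keep polling at the same interval", and any decent client library will quietly wait on it. On access_denied, delete the device_code immediately; there's no reason to keep a rejected code alive.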
When authorized, issue the token through the same TokenGenerator as the authorization_code flow. Device flow doesn't change what's in the token: hash the refresh token into the DB, add an id_token if the openid scope is present — the normal path. Then delete the device_code to guarantee one-time use. You can't redeem the same device_code twice.
Don't forget Cache-Control: no-store on the token response. A token cached by a proxy or browser is an incident waiting to happen.
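One thing the endpoint above doesn't do is push back on clients that poll faster than the advertised interval. RFC 8628 defines a slow_down error for that. The sketch below shows one way the decision could be layered on; it is not part of the Codens code. `poll_error` is a hypothetical pure function, and `last_poll_at` is an assumed extra field you would have to record per device_code on each poll.

```python
from typing import Optional


def poll_error(status: str, last_poll_at: Optional[float], now: float,
               interval: int = 5) -> Optional[str]:
    """Return the error code for a poll, or None when tokens may be issued."""
    # Too-frequent polls are rejected before the status is even considered.
    if last_poll_at is not None and now - last_poll_at < interval:
        return "slow_down"
    if status == "pending":
        return "authorization_pending"
    if status == "denied":
        return "access_denied"
    if status == "authorized":
        return None
    raise ValueError(f"unknown device code status: {status!r}")
```

A client that receives slow_down is expected to lengthen its interval by 5 seconds, so the server may want to apply the same increase to the interval it enforces for that device_code afterwards.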
Endpoint 3: device/verify (the human approval side)
Called from the verification page (/device). This is the one endpoint that assumes a logged-in user, so current_user is required.
@router.post("/device/verify")
async def device_verify(body: DeviceVerifyRequest, current_user: CurrentUser):
    store = get_device_code_store()
    data = await store.get_by_user_code(body.user_code)
    if data is None:
        raise HTTPException(404, "Invalid or expired code")
    if data["status"] != "pending":
        raise HTTPException(400, "This code has already been used")

    if body.action == "approve":
        # authorize() returns False if the code expired or changed state in the meantime
        if not await store.authorize(body.user_code, str(current_user.id)):
            raise HTTPException(400, "This code has expired or was already used")
        return {"status": "authorized", "client_id": data["client_id"]}
    else:
        await store.deny(body.user_code)
        return {"status": "denied"}
This is the crucial split. The CLI (holding the device_code) receives the token, but who the token is issued for is decided by the user logged into this browser. store.authorize binds current_user.id to the user_code.
async def authorize(self, user_code: str, user_id: str) -> bool:
    data = await self.get_by_user_code(user_code)
    if data is None or data["status"] != "pending":
        return False  # block double-approval / expiry
    data["status"] = "authorized"
    data["user_id"] = user_id
    remaining = int(data["expires_at"] - time.time())
    if remaining <= 0:
        return False
    await client.set(f"{_CODE_PREFIX}{data['device_code']}", json.dumps(data), ex=remaining)
    return True
The status != "pending" check stops an already-approved or denied code from being approved again. The state machine is one-directional only: pending → authorized / pending → denied. Recomputing the remaining TTL and re-setting with ex=remaining means approving doesn't extend the lifetime — the code still dies at the original 15-minute mark.
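The same rules can be written as a small pure function, which makes the state machine easy to unit-test apart from Redis. This is a sketch under my own naming: `apply_decision` and the `_TRANSITIONS` table are hypothetical, and the state dict only carries the fields the logic needs.

```python
from typing import Optional, Tuple

# The only legal moves: pending -> authorized, pending -> denied.
_TRANSITIONS = {
    ("pending", "approve"): "authorized",
    ("pending", "deny"): "denied",
}


def apply_decision(state: dict, action: str, user_id: str,
                   now: float) -> Optional[Tuple[dict, int]]:
    """Return (updated_state, remaining_ttl_seconds), or None if not allowed."""
    new_status = _TRANSITIONS.get((state["status"], action))
    remaining = int(state["expires_at"] - now)
    if new_status is None or remaining <= 0:
        return None  # already decided, unknown action, or past the deadline
    updated = dict(state, status=new_status)  # copy; the input is left untouched
    if new_status == "authorized":
        updated["user_id"] = user_id
    return updated, remaining
```

The returned TTL is what you would pass as `ex=` when writing the record back, so a decision made 10 minutes in leaves the record only 5 more minutes to live.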
Register it in OIDC discovery
Finally, add device_authorization_endpoint to .well-known/openid-configuration so an RFC 8628-aware client library can discover the endpoint automatically.
# well_known.py
"device_authorization_endpoint": f"{base_url}/oauth/device/authorize",
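And add the device grant type (the full urn:ietf:params:oauth:grant-type:device_code string that device/authorize checks for) to the grant_types of the client, here Codens MCP. The flow only works once both server and client support it.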
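On the client side, discovery boils down to reading one field out of the metadata document. A minimal sketch, assuming the JSON has already been fetched and parsed into a dict: `device_endpoint_from_metadata` is a made-up helper, and treating a `grant_types_supported` list that omits the device grant as "not supported" is my own conservative choice rather than something the RFC mandates.

```python
from typing import Optional

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def device_endpoint_from_metadata(metadata: dict) -> Optional[str]:
    """Return the device authorization endpoint, or None if device flow is unavailable."""
    endpoint = metadata.get("device_authorization_endpoint")
    if not endpoint:
        return None
    # Assumption: if the server lists its grants explicitly, require ours to be there.
    supported = metadata.get("grant_types_supported")
    if supported is not None and DEVICE_GRANT not in supported:
        return None
    return endpoint
```

A CLI can call this first and fall back to another login method, or fail with a clear message, when it returns None.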
Takeaways
Device Code Flow looks niche — "authenticate a user without a browser" — but it shows up a lot: MCP, CLI tools, IoT, TV apps. The implementation points that matter:
- Build the two codes by role: device_code is a machine secret (high entropy), user_code is human input (readable, confusable chars removed).
- Two Redis keys (primary + index) plus a TTL makes expiry cleanup structurally unnecessary.
- The state machine starts at pending and is one-directional; approval happens on a separate endpoint by a logged-in user; tokens are one-time use.
- Follow the RFC for error strings and let the server drive interval/expires_in.
Anyone building a tool that connects MCP to their own service will hit "how do I log in headless" eventually. Hope this is a useful starting point.
Codens builds all of this auth machinery into the product.