
FlareCanary

GitHub App installation tokens are getting longer in May 2026 — your VARCHAR(40) column is about to silently truncate them

GitHub announced on April 24, 2026 that installation access tokens for GitHub Apps are changing format. Starting with a brownout mid-May 2026 and full cutover by late June 2026, tokens will grow from a fixed 40 characters (ghs_ plus 36 chars) to a variable length of up to roughly 520 characters. The prefix stays ghs_. The character set stays the same. Only the length changes, and only upward, and only sometimes.

That last part is the trap. During and after the rollout, some tokens will still be 40 chars (issued before the change, cached, or returned by older Enterprise Server versions) and some will be 200, 380, or 520. The same App, same installation, same call, on different days, returns different lengths. There's no transition flag. There's no version header. The token still parses as bytes. It just doesn't fit anywhere you assumed it would.

The four shapes of the failure

# All four of these silently corrupt or reject valid tokens after May 2026.

# 1. Database column too narrow.
CREATE TABLE app_tokens (
    installation_id BIGINT,
    token VARCHAR(40),  -- silently truncates to 40 chars on insert
    expires_at TIMESTAMP
);

# 2. Regex validator pinned to old length.
import re
TOKEN_RE = re.compile(r'^ghs_[A-Za-z0-9]{36}$')  # rejects new tokens
if not TOKEN_RE.match(token):
    raise ValueError("invalid token")

# 3. Length assertion in middleware.
assert len(token) == 40, f"expected 40-char token, got {len(token)}"

# 4. Fixed-size buffer in C/Go/Rust FFI.
char token_buf[64];  // overflows when memcpy'd, or strncpy truncates

The first is the quietest: the truncated token gets stored, read back, and sent to GitHub, which rejects it with a bare 401 and no helpful message, and the error trail goes cold one frame above the auth call. The second and third at least fail loudly, but they fail against valid tokens, so your own code rejects what GitHub just issued. The fourth gets you a memory bug.

Why this is a quiet failure, not a loud one

Three reasons this rolls out as silent corruption rather than red-letter outage:

  1. Type system unchanged. The token is still a string. Static analysis, schema validation, OpenAPI spec, type guards — all still pass. No compile error, no schema drift alarm. Octokit, PyGithub, go-github all return string from their token endpoints today and tomorrow.

  2. Runtime unchanged at the issuance call. POST /app/installations/{installation_id}/access_tokens still returns 201 with a token field. Your code reads the field, uses it, gets a 401 on the next call — far from the issuance frame. A naive retry-on-401 hides it briefly, until the new token of the new size also fails to fit.

  3. Brownout is intermittent. Per the GitHub announcement, the change rolls out as a brownout starting mid-May before the full cutover in late June. During the brownout window, the same token endpoint can return a 40-char token at 9:00am and a 380-char token at 9:30am. Tests written against a recorded fixture pass. CI passes. Production hits the brownout at unpredictable times. (A fixture sketch covering the full length range follows this list.)
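
You can simulate that unpredictability in CI today instead of waiting for the brownout to find you. A minimal pytest sketch; fake_token and round_trip are stand-ins invented here, not GitHub tooling — swap round_trip for your real store-then-read path:

import random
import string

import pytest

ALPHABET = string.ascii_letters + string.digits

def fake_token(length: int) -> str:
    # Fabricate a ghs_-prefixed token of the given total length.
    return "ghs_" + "".join(random.choices(ALPHABET, k=length - 4))

def round_trip(token: str) -> str:
    # Stand-in: write to your DB/cache here, then read it back.
    return token

@pytest.mark.parametrize("length", [40, 200, 380, 520])
def test_token_survives_round_trip(length):
    token = fake_token(length)
    assert round_trip(token) == token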

Where to grep

Search every repo that touches a GitHub App for the four patterns:

# 1. DB column types
git grep -nE "token.*VARCHAR\(([0-9]+)\)" -- '*.sql' '*.ts' '*.py' '*.rb' '*.go'
git grep -nE "varchar\(40\)|VARCHAR\(40\)|String\(40\)" --

# 2. Regex validators
git grep -nE 'ghs_\[A-Za-z0-9\]\{[0-9]+\}|ghs_\\\w\{[0-9]+\}' --
git grep -n 'ghs_' -- '*.py' '*.ts' '*.go' '*.rb' '*.java' | grep -E 'match|regex|RE|Pattern'

# 3. Length assertions
git grep -nE 'len\(token\)\s*==\s*40|token\.length\s*==\s*40' --
git grep -nE 'fixed.*40|token\[0:40\]|substring\(0, 40\)' --

# 4. Fixed-size buffers
git grep -nE 'char token\[[0-9]+\]|\[40\]byte|token\[64\]' -- '*.c' '*.cpp' '*.go' '*.rs'

Common hiding spots beyond what grep catches:

  • Caching layers. Memcached item-size limits (the default 1MB is fine, but custom-tuned smaller installs are not), and any cache client that packs values into fixed-width fields.
  • Audit logs. A token-redaction filter that masks exactly the first 36 chars after ghs_ still leaks the last several hundred characters of a new token (sketch after this list).
  • Webhook signature verification. Code in a downstream service that derives an HMAC from the installation token but hashes only a fixed-length prefix of it.
  • Environment variable storage. Some platforms truncate or reject env vars over a length limit. Heroku, Vercel, and Cloud Run each have their own per-platform limits; check the one you're on.
  • JWT claims. Apps that embed an installation token in a custom claim and the verifier asserts a fixed claim size.
  • Test fixtures. Recorded VCR cassettes, json fixtures, mock objects with hardcoded 40-char strings.
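
The redaction bullet is worth spelling out, because the leak hides in a log nobody rereads. A sketch of the bug and the length-agnostic fix; the surrounding filter plumbing is assumed:

import re

# Bug: masks exactly ghs_ + 36 chars, leaving the tail of a longer token.
OLD_REDACT = re.compile(r'ghs_[A-Za-z0-9]{36}')

# Fix: mask the whole token body regardless of length.
NEW_REDACT = re.compile(r'ghs_[A-Za-z0-9]+')

def redact(line: str) -> str:
    return NEW_REDACT.sub('ghs_***', line)

long_token = 'ghs_' + 'x' * 516
leaked = OLD_REDACT.sub('ghs_***', f'token={long_token}')
assert 'x' * 480 in leaked                       # old filter leaks 480 chars
assert 'x' not in redact(f'token={long_token}')  # new filter masks everything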

The most-bitten codebase is the one that wrote a Probot/Octokit-style validateToken helper years ago, copied a regex from a Stack Overflow answer, and forgot it existed.

Why VARCHAR(40) bites the hardest

Postgres, and MySQL in strict mode, raise an error when you try to insert a string longer than the column's declared length. That's the good outcome: the insert fails loudly and the call site gets the exception.

The bad outcome is what most production systems actually do:

  • MySQL with non-strict mode (the historic default before 5.7, and still common in older Docker images): silently truncates and emits a warning.
  • SQL Server with ANSI_WARNINGS OFF: silently truncates.
  • Some ORMs with schema-driven string coercion: truncate client-side before the statement is even sent.

The token gets stored at 40 chars. Reads return 40 chars. The auth call sends 40 chars to GitHub. GitHub rejects with 401. Your stack trace shows a 401 from POST /repos/.../check-runs — five layers away from the truncating column.

The fix is a column type change: ALTER TABLE app_tokens ALTER COLUMN token TYPE TEXT. There's no business reason to constrain token length at the storage layer; tokens are opaque to your app.
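
A minimal migration sketch, assuming the app_tokens table from the top of the post; the MySQL form differs because MySQL's ALTER syntax does:

-- PostgreSQL
ALTER TABLE app_tokens ALTER COLUMN token TYPE TEXT;

-- MySQL: MODIFY restates the column definition. Drop any plain index
-- on token first, since TEXT columns need a prefix length to be indexed.
ALTER TABLE app_tokens MODIFY token TEXT;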

The migration that doesn't migrate cleanly

The obvious fix (widen all the columns, drop all the regexes, remove all the assertions) is the correct end state, but it opens a window in the middle where new tokens land in code paths that still carry stale references to the old format.

Three places this bites:

  1. Reverse proxies and middleware. A Cloudflare Worker or Express middleware that strips/validates tokens. If you update the App but not the Worker, the Worker rejects valid tokens.
  2. Cross-service token forwarding. Service A fetches a token, hands it to Service B, B logs and validates the format. Updating only A means B starts dropping tokens it gets from A.
  3. CI/CD secret scanning. Internal secret scanners that pattern-match on ghs_[A-Za-z0-9]{36} will stop catching leaked tokens after the format change, because the regex no longer matches the new format. Update the scanners or you have a leak detection regression at the same time as the migration.

The rollout order: scanner regexes first (they need the broadest pattern, ghs_[A-Za-z0-9]+), then the storage layer, then validators and assertions. The App itself needs no change; Octokit and friends will pick up the new tokens automatically once GitHub starts issuing them.

What "broadest pattern" means for the regex

Don't anchor on the new max length (~520) either; that's a current ceiling that GitHub may move again later. Anchor only on the prefix and character set:

# Wrong — pins to current new ceiling
TOKEN_RE = re.compile(r'^ghs_[A-Za-z0-9]{36,520}$')

# Right — accepts anything matching prefix and charset
TOKEN_RE = re.compile(r'^ghs_[A-Za-z0-9]+$')

If you actually need a length constraint for a defensive reason (e.g., a sanity check before storing), set a generous upper bound — 4096 is a reasonable belt-and-suspenders ceiling that won't bind on a future format change.
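
Put together, a defensive validator looks something like this; MAX_TOKEN_LEN and the function name are conventions invented for this sketch, not anything GitHub ships:

import re

TOKEN_RE = re.compile(r'^ghs_[A-Za-z0-9]+$')
MAX_TOKEN_LEN = 4096  # generous ceiling; a sanity check, not a format claim

def looks_like_installation_token(token: str) -> bool:
    # Prefix and charset only. The length bound guards against garbage
    # input, never against any particular token format.
    return len(token) <= MAX_TOKEN_LEN and TOKEN_RE.match(token) is not None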

What's not changing

  • Prefix. Still ghs_. (Personal access tokens use ghp_, OAuth tokens gho_, app-to-server ghu_. None of these are announced as changing in this rollout, but the same lessons apply if they do later — strip your fixed lengths now.)
  • Issuance API. POST /app/installations/{installation_id}/access_tokens returns the same JSON shape (example after this list).
  • Revocation API. DELETE /installation/token still works the same way.
  • Token TTL. Still 1 hour from issuance.
  • Permissions model. Unchanged.
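
For reference, the response shape that stays put. The field values here are invented, and the token is shown short for readability:

{
  "token": "ghs_EXAMPLEexampleEXAMPLEexampleEXAMPLE",
  "expires_at": "2026-05-15T10:00:00Z",
  "permissions": { "contents": "read", "checks": "write" },
  "repository_selection": "selected"
}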

Minimum-viable fix

  1. git grep the four patterns above (column type, regex, length assertion, fixed buffer). Inventory every match.
  2. Update DB columns to TEXT (or an equivalent unbounded string type). For MySQL specifically, drop any plain index on the token column first: MySQL requires a prefix length to index a TEXT column and will refuse the type change while the old index exists.
  3. Replace fixed-length regexes with prefix-only validation (^ghs_[A-Za-z0-9]+$) or remove the validation entirely (tokens are opaque, your code shouldn't be parsing them).
  4. Remove length assertions in middleware and FFI boundaries. Resize fixed-size buffers to a generous ceiling (1024 or 4096).
  5. Update internal secret scanners first — before any token-handling change — so leak detection doesn't regress mid-migration.
  6. Add a real integration test against POST /app/installations/{id}/access_tokens and assert that the returned token round-trips through your storage layer without modification (use a length-comparison check, not string equality, to keep the test stable across token issuances; a sketch follows this list).
  7. If you operate GitHub Enterprise Server, plan for the format change in your next GHES upgrade — the brownout schedule for GHES typically lags the GitHub.com schedule by one or two minor versions.
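
A sketch of the integration test from step 6. The issuance endpoint and headers are the real GitHub API; INSTALLATION_ID, APP_JWT, and the in-memory stand-ins for your storage layer are assumptions of this sketch:

import os

import requests

def fetch_installation_token() -> str:
    # Real endpoint; authenticate with an app JWT you already mint elsewhere.
    resp = requests.post(
        f"https://api.github.com/app/installations/{os.environ['INSTALLATION_ID']}/access_tokens",
        headers={
            "Authorization": f"Bearer {os.environ['APP_JWT']}",
            "Accept": "application/vnd.github+json",
        },
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["token"]

_store = {}

def store_token(token: str) -> None:
    _store["token"] = token   # replace with your real write path

def read_token() -> str:
    return _store["token"]    # replace with your real read path

def test_token_survives_storage():
    token = fetch_installation_token()
    store_token(token)
    # Length comparison, not equality, per step 6 above.
    assert len(read_token()) == len(token)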

The pattern this fits

Token format changes are the canonical silent-fail. The type system sees a string. The schema sees a string. The contract test sees a string. The thing that breaks is an assumption about the string's shape — and assumptions don't show up in any contract.

GitHub has done this before (the 2021 token format change introduced the ghp_, ghs_, etc. prefixes), and the same shape of breakage rolled out then: VARCHAR columns silently truncating, regex validators silently rejecting, fixed-size buffers overflowing. The fix five years ago is the same fix today: don't constrain the storage size, don't validate the format past the prefix, don't bake assumptions about token shape into anything except the issuance call itself.

If you're treating an opaque token as anything more structured than "an opaque blob from GitHub that is at most a few KB," you're carrying a latent bug that will trip on the next format change after this one.


FlareCanary monitors REST APIs and MCP servers for schema drift. Free tier covers 5 endpoints with daily checks.
