Most teams I have worked with have one auth test in their suite. It looks like this:
```javascript
test('valid token verifies', () => {
  const token = signSync({ sub: 'user-1', aud: 'api://backend' }, secret);
  const result = verify(token, options);
  expect(result.valid).toBe(true);
});
```
That test is fine. It is also a smoke test, not a regression suite. It catches the case where verification is completely broken. It does not catch the case where verification accepts tokens that should be rejected, which is where most of the auth bugs that ship to prod live.
A real auth regression suite asserts that invalid tokens fail with the right code. Each test pairs a token with the failure mode it should produce. If the policy quietly accepts a token that should fail, the suite fails the PR. Audience configuration drift becomes visible the moment it is introduced, not three quarters later when someone writes the post-incident review.
Here is the assertion catalog that has caught real bugs in real services.
Pattern 1: Wrong audience
A token issued for service A should not authenticate against service B, even when both trust the same issuer. This is the most common configuration drift in microservice auth.
```yaml
- token: <token issued for api://reporting>
  policy:
    issuer: https://login.example.com
    audiences: [api://billing]
    allowed_algs: [RS256]
  expected_failure_codes: [AUDIENCE_MISMATCH]
```
If your billing service accepts a reporting-audience token, a reporting-audience credential becomes a billing credential. The fix is one config line. The test ensures the config line stays correct after the next refactor.
Pattern 2: Expired token
A token whose exp claim is in the past must be rejected. Allow a small clock skew (60 seconds is typical) but no more.
```yaml
- token: <token with exp = now() - 5 minutes>
  policy:
    issuer: https://login.example.com
    audiences: [api://backend]
    allowed_algs: [RS256]
    clock_skew_seconds: 60
  expected_failure_codes: [TOKEN_EXPIRED]
```
The bug this catches: a verifier that compares exp against the wrong clock (server-side vs UTC vs local timezone), or that skips expiry checking entirely when exp is missing.
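A minimal sketch of the check, assuming epoch-seconds claims (the names are illustrative). The two details that matter are comparing in UTC epoch seconds and treating a missing `exp` as a failure, not a free pass:

```javascript
// Expiry check with bounded clock skew. Not a full verifier, just the exp logic.
function checkExpiry(claims, nowSeconds, clockSkewSeconds = 60) {
  // A missing exp is rejected, never silently skipped.
  if (typeof claims.exp !== 'number') return { valid: false, code: 'TOKEN_EXPIRED' };
  // All comparisons in UTC epoch seconds; never local wall-clock time.
  if (claims.exp + clockSkewSeconds < nowSeconds) {
    return { valid: false, code: 'TOKEN_EXPIRED' };
  }
  return { valid: true };
}

const now = Math.floor(Date.now() / 1000);
console.log(checkExpiry({ exp: now - 300 }, now).code); // TOKEN_EXPIRED (5 min past)
console.log(checkExpiry({ exp: now - 30 }, now).valid); // true (within 60s skew)
```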
Pattern 3: alg=none
An attacker sets "alg":"none" in the JWT header and ships an unsigned token. If your verifier accepts the token's own alg claim instead of using an explicit allowlist, the token will pass.
```yaml
- token: <token with header.alg = none, no signature>
  policy:
    issuer: https://login.example.com
    audiences: [api://backend]
    allowed_algs: [RS256]
  expected_failure_codes: [ALGORITHM_NOT_ALLOWED]
```
This bug is eleven years old and still in production. Every JWT library has a way to fall into it. The regression test is the only thing that proves your verifier has not.
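To see the defense in miniature: the check reads the header's `alg` but decides from the policy allowlist, never from the token itself. A sketch (names illustrative):

```javascript
const b64url = (s) => Buffer.from(s).toString('base64url');

// Allowlist the algorithm before touching the signature at all.
function checkAlgorithm(token, allowedAlgs) {
  const header = JSON.parse(Buffer.from(token.split('.')[0], 'base64url').toString('utf8'));
  // The policy decides; the token's own header never does.
  if (!allowedAlgs.includes(header.alg)) {
    return { valid: false, code: 'ALGORITHM_NOT_ALLOWED' };
  }
  return { valid: true };
}

// The attacker's token: alg=none, empty signature segment.
const noneToken = [
  b64url(JSON.stringify({ alg: 'none', typ: 'JWT' })),
  b64url(JSON.stringify({ sub: 'attacker' })),
  '', // unsigned
].join('.');

console.log(checkAlgorithm(noneToken, ['RS256']).code); // ALGORITHM_NOT_ALLOWED
```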
Pattern 4: Algorithm confusion (RS256 → HS256)
An attacker takes a token signed with RS256 (asymmetric) and re-signs it with HS256 (symmetric) using the public key as the shared secret. Some verifiers accept both algorithms with the same key material, and the public key happens to be valid as an HS256 secret.
```yaml
- token: <token re-signed with HS256 using RS256 public key as secret>
  policy:
    issuer: https://login.example.com
    audiences: [api://backend]
    allowed_algs: [RS256]
  expected_failure_codes: [ALGORITHM_NOT_ALLOWED]
```
This test catches the bug class explicitly: even if your verifier code looks correct, the test ensures HS256 is rejected when only RS256 should be allowed.
Pattern 5: Wrong issuer
A token from https://login.legitimate.com should not authenticate against https://login.attacker.com or vice versa. The check has to be an exact match: no prefix matching, no substring, no glob.
```yaml
- token: <token with iss = https://login.attacker.com>
  policy:
    issuer: https://login.legitimate.com
    audiences: [api://backend]
    allowed_algs: [RS256]
  expected_failure_codes: [ISSUER_MISMATCH]
```
The test catches verifiers that do iss.startsWith(expected) or fuzzy hostname comparison. Both have shipped in production. Both let attacker-issuer tokens through.
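A sketch of the correct check, with the prefix bug shown for contrast (names illustrative):

```javascript
// Exact-match issuer comparison: strict equality, nothing fuzzier.
function checkIssuer(claims, expectedIssuer) {
  return claims.iss === expectedIssuer
    ? { valid: true }
    : { valid: false, code: 'ISSUER_MISMATCH' };
}

const expected = 'https://login.legitimate.com';
console.log(checkIssuer({ iss: 'https://login.attacker.com' }, expected).code); // ISSUER_MISMATCH

// The prefix bug: an attacker-controlled domain that embeds the expected
// issuer as a prefix sails straight through a startsWith check.
console.log('https://login.legitimate.com.attacker.com'.startsWith(expected)); // true
```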
Pattern 6: Missing required claim
If your service requires a tenant_id claim, a token without it must be rejected.
```yaml
- token: <token without tenant_id claim>
  policy:
    issuer: https://login.example.com
    audiences: [api://backend]
    allowed_algs: [RS256]
    required_claims: [sub, tenant_id]
  expected_failure_codes: [REQUIRED_CLAIM_MISSING]
```
The test catches drift where a new claim is added to your service's contract but the verifier config wasn't updated to require it.
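The check itself is a few lines; the value is in keeping the `required_claims` list in the tested policy. A sketch (names illustrative, claim names from the pattern above):

```javascript
// Required-claims check: every listed claim must be present on the token.
function checkRequiredClaims(claims, required) {
  const missing = required.filter((c) => !(c in claims));
  return missing.length === 0
    ? { valid: true }
    : { valid: false, code: 'REQUIRED_CLAIM_MISSING', missing };
}

const policyClaims = ['sub', 'tenant_id'];
const r = checkRequiredClaims({ sub: 'u1' }, policyClaims);
console.log(r.code, r.missing); // REQUIRED_CLAIM_MISSING [ 'tenant_id' ]
```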
Pattern 7: Forged signature
A token whose signature does not match the public key — anything from a copy-paste truncation to an active forgery — must be rejected.
```yaml
- token: <token with signature segment replaced with garbage>
  policy:
    issuer: https://login.example.com
    audiences: [api://backend]
    allowed_algs: [RS256]
  expected_failure_codes: [SIGNATURE_INVALID]
```
This is the smoke test in disguise. Every verifier should pass it. Run it anyway — the day it fails is the day someone replaced your verifier with a stub.
Pattern 8: JWKS rotation drift
The verifier should pick up new keys after the issuer rotates. If a token signed with a new key returns KID_NOT_FOUND instead of passing, the cache is stale and your fleet is about to break.
```yaml
- token: <token signed with key issued AFTER last cache refresh>
  policy:
    issuer: https://login.example.com
    audiences: [api://backend]
    allowed_algs: [RS256]
  expected_failure_codes: [] # should pass: verifier should refetch JWKS
```
This is the test we run nightly in CI. If it ever fails, a JWKS rotation just happened and our cache is stale; the fix is to force a JWKS refetch in the verifier.
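The cache behavior the test exercises can be sketched as refetch-on-unknown-kid. In this sketch `fetchJwks` is a stand-in: a real verifier fetches the issuer's jwks_uri over HTTPS asynchronously and rate-limits refetches so an attacker spraying bogus kids cannot hammer the issuer. All names are illustrative.

```javascript
// On a kid the cache does not know, refetch once before failing:
// the issuer may have rotated keys since the last refresh.
function getKey(kid, cache, fetchJwks) {
  if (cache.keys.has(kid)) return cache.keys.get(kid);
  const fresh = fetchJwks();
  cache.keys = new Map(fresh.keys.map((k) => [k.kid, k]));
  return cache.keys.get(kid) ?? null; // null surfaces as KID_NOT_FOUND
}

// Simulate rotation: the cache only knows key-1, the issuer now serves key-2.
const cache = { keys: new Map([['key-1', { kid: 'key-1' }]]) };
const fetchJwks = () => ({ keys: [{ kid: 'key-2' }] });
console.log(getKey('key-2', cache, fetchJwks).kid); // key-2, found after refetch
```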
Wiring it into CI
The eight patterns above run as a single batch in jwtshield's /v1/test/auth-regression endpoint. The endpoint accepts a list of (token, policy, expected_failure_codes) tuples and returns a suite-level pass/fail plus per-check structured findings.
```yaml
- uses: redbullhorns/jwtshield-ci@v1
  with:
    issuer: https://login.example.com
    audience: api://backend
    fail-on-severity: high
    api-key: ${{ secrets.JWTSHIELD_API_KEY }}
```
The Action runs in roughly 800ms. It costs nothing on the free tier (200 verifies per month). It uses synthetic test tokens — never your production tokens. The audit trail lives at https://jwtshield.com/runs/<id> if you want compliance evidence.
What makes this different from a smoke test
A smoke test confirms verification works. A regression suite confirms verification rejects what it should, including the bug classes that have shipped to production for the last decade. Writing the eight patterns costs one afternoon. Skipping them costs one outage.
Get an API key, or browse the GitHub Action listing.