Narnaiezzsshaa Truong

Posted on May 12

The 20-Minute Compromise: CI/CD Audit Guide for the TanStack Supply Chain Attack

#github #security #devops #cicd

OIDC authentication worked correctly throughout the TanStack attack. The build cache is the substrate participant that wasn't governed. Here's the full audit checklist and the governance analysis that explains why OIDC alone can't prevent this attack class.

The TanStack npm supply chain attack compromised 84 package versions across 42 packages in approximately 20 minutes. The attack chain:

pull_request_target misconfiguration
  → build cache poisoning
    → OIDC token extraction from runner memory
      → authenticated publication of malicious packages

Before the audit checklist, one framing point that changes how you think about this attack class:

OIDC authentication worked correctly throughout. The token was valid. The workload was authenticated. The authorization was granted. Everything Workload Identity Federation was designed to protect was functioning as intended.

The attack lived in the layer below authentication—the build cache. The cache receives artifacts from trusted processes and feeds them back into subsequent builds without re-verification. When the cache is poisoned, every downstream build inherits the compromise invisibly. OIDC proves who is acting. It cannot prove that what is being acted upon has maintained lineage integrity from its authorized formation.

That distinction matters for everything that follows.

Part I—Audit Your CI/CD for This Exact Attack Chain

A. Audit Surface 1—`pull_request_target` Misconfiguration

Goal: Ensure untrusted PR code can never execute with repository-level authority.

What to verify:

Search all workflows for these triggers:

on:
  pull_request_target:   # ← high risk
  workflow_run:          # ← if triggered by PR workflows, also high risk

Confirm checkout behavior:

actions/checkout pinned to a full commit SHA (not a tag such as v4)
ref explicitly set to base repo, not PR head

Confirm permissions block:

permissions:
  contents: read    # ← not write
  packages: read    # ← not write
  id-token: none    # ← never write in PR workflows
  actions: read     # ← not write

🚨 Red flags:

Any workflow running tests/builds on PR code using pull_request_target
pull_request_target without a repo-owner approval gate
id-token: write inside PR workflows
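
The checks above can be roughed out as a text scan over workflow files. This is a minimal sketch, not a YAML-aware linter: the function names and the regular expressions are illustrative assumptions, comment stripping is naive, and a match only means "review this file by hand".

```python
import re
from pathlib import Path

RISKY_TRIGGERS = ("pull_request_target", "workflow_run")
ID_TOKEN_WRITE = re.compile(r"id-token\s*:\s*write\b")
# Checkout pointed at the PR head instead of the base repository.
PR_HEAD_REF = re.compile(
    r"ref\s*:\s*\$\{\{\s*github\.event\.pull_request\.head\.(?:sha|ref)"
)


def audit_workflow(text: str) -> list[str]:
    """Return coarse findings for the contents of one workflow file."""
    # Naive comment stripping: drop everything after the first '#'.
    body = "\n".join(line.split("#", 1)[0] for line in text.splitlines())
    triggers = [t for t in RISKY_TRIGGERS if re.search(rf"\b{t}\b", body)]
    findings = [f"uses high-risk trigger: {t}" for t in triggers]
    if triggers and ID_TOKEN_WRITE.search(body):
        findings.append("id-token: write granted alongside a high-risk trigger")
    if "pull_request_target" in triggers and PR_HEAD_REF.search(body):
        findings.append("checks out PR head code under pull_request_target")
    return findings


def audit_repo(root: str = ".github/workflows") -> dict[str, list[str]]:
    """Scan every .yml/.yaml file under root and keep files with findings."""
    results = {}
    for path in sorted(Path(root).glob("*.y*ml")):
        findings = audit_workflow(path.read_text())
        if findings:
            results[str(path)] = findings
    return results
```

Calling `audit_repo()` from the repository root returns a mapping of workflow path to findings. A mention of a trigger name inside an `if:` expression will also match, which is acceptable for a first pass but not for enforcement.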

B. Audit Surface 2—GitHub Actions Cache Poisoning

Goal: Ensure untrusted PR workflows cannot write to caches used by release pipelines.

What to verify:

PR workflows must use different cache keys than release workflows. No shared keys like:

node-modules-${{ hashFiles('**/package-lock.json') }}
build-cache-${{ runner.os }}

Cache restore/write separation:

✅ PR workflows may restore caches
❌ PR workflows must never save caches

# In PR workflow — WRONG
- uses: actions/cache@v3
  with:
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    # Problem: actions/cache also saves the cache in its post-job step, and it has no restore-only input

# In PR workflow — CORRECT
- uses: actions/cache/restore@v3
  with:
    path: ~/.npm
    key: pr-${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}

🚨 Red flags:

actions/cache or actions/cache/save steps inside PR workflows (both write to the cache)
Shared cache keys between main branch builds, PR builds, and release workflows
No explicit cache key namespace separation between trusted and untrusted contexts
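
A quick way to spot shared keys is to compare the literal `key:` templates in a PR workflow against those in a release workflow. The sketch below assumes you pass it the raw text of both files; the helper names are hypothetical, and because it compares templates as strings it will miss two different-looking expressions that evaluate to the same key.

```python
import re

# Any "key:" line; this may also catch unrelated inputs named "key".
KEY_LINE = re.compile(r"^\s*key\s*:\s*(.+?)\s*$", re.MULTILINE)
# actions/cache and actions/cache/save write; actions/cache/restore does not.
SAVING_ACTION = re.compile(r"uses\s*:\s*actions/cache(?:/save)?@")


def cache_keys(workflow_text: str) -> set[str]:
    """Collect the literal cache key templates declared in a workflow."""
    return {m.group(1).strip("'\"") for m in KEY_LINE.finditer(workflow_text)}


def shared_cache_keys(pr_text: str, release_text: str) -> set[str]:
    """Key templates that appear in both the PR and the release workflow."""
    return cache_keys(pr_text) & cache_keys(release_text)


def pr_workflow_can_save(pr_text: str) -> bool:
    """True if the PR workflow uses an action that writes to the cache."""
    return bool(SAVING_ACTION.search(pr_text))
```

A non-empty result from `shared_cache_keys`, or `True` from `pr_workflow_can_save`, corresponds to the red flags listed above.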

C. Audit Surface 3—OIDC Token Exposure

Goal: Ensure OIDC tokens cannot be harvested from runner memory and used to impersonate release workflows.

What to verify:

Only release workflows should have id-token: write.

Cloud provider trust policies must validate all of these claims, not just sub:

repository
workflow (name)
ref_type
environment
actor

Runner isolation:

No self-hosted runners for release pipelines
No long-lived runners shared across contexts
No shared workspaces between PR and release workflows

🚨 Red flags:

OIDC trust policies that only validate sub—insufficient claim scope
Runners reused across PR and release workflows
id-token: write granted by default to all workflows

Part II—Harden OIDC Trust Boundaries

A. Enforce Strict Claim Validation

Cloud providers must reject OIDC tokens unless all of these match:

Claim	Prevents
`repository`	Tokens from other repos being replayed
`workflow`	Tokens minted in PR workflows being accepted in place of release-workflow tokens
`ref_type`	Branch tokens used in tag-gated deployments
`environment`	Dev tokens used in production
`actor`	Tokens from fork-triggered runs being accepted as maintainer-triggered runs

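Validating only sub is insufficient. By default, sub encodes the repository plus either the ref or, when the job targets one, the environment. It does not include the workflow name unless you customize the subject claim.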

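AWS example of strict claim validation. The tag wildcard sits under StringLike because StringEquals would match the * literally. Confirm that your IAM OIDC provider exposes job_workflow_ref as a condition key; if it does not, fold that claim into sub with GitHub's subject-claim customization and match it there.

{
  "Condition": {
    "StringEquals": {
      "token.actions.githubusercontent.com:sub": "repo:org/repo:environment:production",
      "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
    },
    "StringLike": {
      "token.actions.githubusercontent.com:job_workflow_ref": "org/repo/.github/workflows/release.yml@refs/tags/*"
    }
  }
}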
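
Trust policies are easy to get subtly wrong, so it can help to lint the Condition block before deploying it. The following is a small sketch under stated assumptions: `lint_trust_conditions` is a hypothetical helper, it only knows two rules (wildcards under an exact-match operator, and missing sub or aud conditions), and it takes the already-parsed Condition object rather than a full policy document.

```python
REQUIRED_KEY_SUFFIXES = (":sub", ":aud")


def lint_trust_conditions(condition: dict) -> list[str]:
    """Return a list of problems found in an IAM-style Condition block."""
    problems = []
    seen_keys = set()
    for operator, pairs in condition.items():
        for key, value in pairs.items():
            seen_keys.add(key)
            values = value if isinstance(value, list) else [value]
            has_wildcard = any("*" in v or "?" in v for v in values)
            # Exact-match operators treat * and ? as ordinary characters.
            if "StringEquals" in operator and has_wildcard:
                problems.append(
                    f"{key}: wildcard under {operator} is matched literally; "
                    "use StringLike"
                )
    for suffix in REQUIRED_KEY_SUFFIXES:
        if not any(key.endswith(suffix) for key in seen_keys):
            problems.append(f"no condition on a key ending in '{suffix}'")
    return problems
```

An empty list means only that these two rules passed; it says nothing about whether the claim values themselves are scoped tightly enough.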

B. Enforce Environment-Scoped OIDC

Each environment needs separate:

OIDC trust relationships
IAM roles
Workload identities

A token valid in dev must be structurally incapable of assuming a prod role, regardless of how it was obtained.

C. Enforce Ephemeral, Task-Scoped Credentials

TTL ≤ 5 minutes
Bound to: workflow name + job name + environment + commit SHA

Token replay from memory extraction becomes nearly impossible when the token has expired and bound claims no longer match the current execution context.

Part III—Detect Whether Your Environment Pulled a Poisoned Cache

This is the section most teams never run. If TanStack packages are in your dependency tree, run these checks now.

A. Look for Cache-Restore Events in PR Workflows

Search GitHub Actions logs for:

Cache restored from key: ...
Cache hit occurred on the primary key

If a PR workflow restored (or saved) a cache key that a release workflow later restored, treat every release built from that key as exposed.
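
Once the logs are downloaded as text, the overlap can be computed mechanically. This sketch assumes the restore line looks like one of the two phrasings in the pattern below; check it against a real log from your runner version before relying on it. The function names are hypothetical.

```python
import re

# Accepts either phrasing of the restore line; adjust to match your logs.
RESTORE_LINE = re.compile(
    r"(?:Cache restored from key|Restored cache key):\s*(\S+)"
)


def restored_keys(log_text: str) -> set[str]:
    """Cache keys that one workflow run reported restoring."""
    return set(RESTORE_LINE.findall(log_text))


def exposed_keys(pr_logs: list[str], release_logs: list[str]) -> set[str]:
    """Keys restored by at least one PR run and at least one release run."""
    pr_keys = set().union(*(restored_keys(text) for text in pr_logs))
    release_keys = set().union(*(restored_keys(text) for text in release_logs))
    return pr_keys & release_keys
```

Any key returned by `exposed_keys` was visible to both trust contexts and deserves a closer look at which run first saved it.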

B. Look for Unexpected File Modifications in Build Artifacts

Check node_modules for:

Packages or files present on disk that package-lock.json does not account for
Modified .js files with obfuscated code or large base64 blobs

Search installed packages for network calls and payload-decoding primitives. Expect many benign hits; look for ones that are new or out of place:

grep -r "fetch(" node_modules --include="*.js" | grep -v node_modules/.bin
grep -r "Buffer.from(" node_modules --include="*.js"
grep -r "crypto.subtle" node_modules --include="*.js"
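
The grep patterns above are noisy, so a narrower scan for long base64-looking runs can help prioritize files. This is a heuristic sketch only: the 200-character threshold and both function names are assumptions, and legitimate packages (inlined source maps, embedded fonts) will also trigger it.

```python
import re
from pathlib import Path


def long_base64_runs(source: str, min_len: int = 200) -> list[str]:
    """Return unbroken base64-alphabet runs of at least min_len characters."""
    pattern = rf"[A-Za-z0-9+/]{{{min_len},}}={{0,2}}"
    return re.findall(pattern, source)


def scan_tree(root: str = "node_modules", min_len: int = 200) -> dict[str, int]:
    """Map each .js file under root to its count of long base64 runs."""
    hits = {}
    for path in Path(root).rglob("*.js"):
        # Some packages are directories whose names end in ".js".
        if not path.is_file():
            continue
        runs = long_base64_runs(path.read_text(errors="ignore"), min_len)
        if runs:
            hits[str(path)] = len(runs)
    return hits
```

Comparing the output of `scan_tree()` between a known-good install and the suspect one is more informative than reading either list on its own.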

C. Look for Anomalous OIDC Token Usage

Check cloud provider logs (CloudTrail, GCP Audit Logs) for:

OIDC tokens used outside expected workflow names
Tokens used from unexpected IP ranges
Tokens used outside expected CI/CD time windows
Tokens assuming roles not associated with the claiming workflow

The TanStack attack used valid tokens. The anomaly is in the context of use, not the token itself.
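
For AWS, these context checks can be sketched as a filter over CloudTrail records for AssumeRoleWithWebIdentity. The field names used here (eventName, eventTime, sourceIPAddress, requestParameters.roleArn, eventID) are standard CloudTrail fields; the allowed roles, networks, and release windows are inputs you would have to supply, and hosted-runner IP ranges in particular are broad and change over time. `anomalous_events` is a hypothetical helper.

```python
import ipaddress
from datetime import datetime, timezone


def parse_time(stamp: str) -> datetime:
    """Parse a CloudTrail eventTime such as 2025-05-12T03:00:00Z."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def anomalous_events(events, allowed_roles, allowed_networks, release_windows):
    """Return (eventID, reasons) for web-identity role assumptions that look off."""
    networks = [ipaddress.ip_network(n) for n in allowed_networks]
    flagged = []
    for event in events:
        if event.get("eventName") != "AssumeRoleWithWebIdentity":
            continue
        reasons = []
        role = (event.get("requestParameters") or {}).get("roleArn")
        if role not in allowed_roles:
            reasons.append("unexpected role")
        try:
            address = ipaddress.ip_address(event.get("sourceIPAddress", ""))
            if not any(address in net for net in networks):
                reasons.append("unexpected source IP")
        except ValueError:
            # The field can hold a service name instead of an address.
            reasons.append("non-IP source")
        when = parse_time(event["eventTime"])
        if not any(start <= when <= end for start, end in release_windows):
            reasons.append("outside release window")
        if reasons:
            flagged.append((event.get("eventID"), reasons))
    return flagged
```

Each flagged record is a lead to investigate, not proof of compromise; the value is in seeing valid tokens used in a context nobody planned for.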

D. Look for Runner-Level Persistence (Self-Hosted Only)

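# Paths assume the runner's home is /home/runner; adjust for your install and _work directories
ls -la /home/runner/work/_temp
ls -la /home/runner/.cache
ps aux | grep '[n]ode'   # look for processes running post-workflow; [n] keeps grep from matching itself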

Why OIDC Alone Cannot Prevent This Attack Class

The audit checklist above addresses immediate hardening. But the substrate governance gap explains why this attack class will recur.

The build cache is a substrate participant: it sits inside the authority chain of your build pipeline, receives artifacts from trusted processes, and feeds them back into subsequent builds. It is not a passive storage layer. It is an active participant in the lineage chain of every build that uses it.

Five governance invariants are missing from current CI/CD build cache architecture:

Authority: Who authorized the cache to contain these specific artifacts? The cache has no mechanism to verify that what it contains was authorized by the same authority chain that will consume it.

Lineage: Can every artifact in the cache be traced to its formation and authorization? Cache poisoning exploits the absence of artifact lineage—poisoned artifacts are indistinguishable from legitimate ones without a lineage chain.

Reversibility: Once poisoned artifacts are published via authenticated OIDC tokens, 84 malicious versions cannot be automatically unwound. The state change is irreversible without manual intervention across every downstream consumer.

Boundary Integrity: The cache should enforce boundary integrity between PR and release contexts. Cache key namespacing is a configuration control, not a structural governance invariant.

Drift Control: The build environment drifted from its authorized state when the cache was poisoned. No instrument detected that drift before malicious artifacts were published.

OIDC proves who is acting. It cannot prove that what is being acted upon has maintained lineage integrity from its authorized formation. Until build pipelines enforce artifact lineage as a first-class governance primitive, the next variation of this attack will find a path through whatever configuration controls are in place.
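
One way to picture artifact lineage is a signed manifest written when a trusted workflow saves the cache and verified before any release build consumes it. The sketch below is an illustration of that idea, not an existing GitHub Actions feature: the function names are hypothetical, the cache contents are modeled as an in-memory mapping of file name to bytes, and a shared HMAC key held only by trusted workflows stands in for a real signing or attestation system.

```python
import hashlib
import hmac
import json


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Record a SHA-256 digest for every file in the cache entry."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in sorted(files.items())}


def sign_manifest(manifest: dict[str, str], key: bytes) -> str:
    """Sign the canonical JSON form of the manifest with a trusted-only key."""
    payload = json.dumps(manifest, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_restored(files: dict[str, bytes], manifest: dict[str, str],
                    signature: str, key: bytes) -> bool:
    """Accept restored files only if the manifest is authentic and still matches."""
    if not hmac.compare_digest(sign_manifest(manifest, key), signature):
        return False  # manifest was not produced by a holder of the key
    return build_manifest(files) == manifest  # no added, removed, or altered files
```

The design point is that verification depends on a secret the untrusted context never sees, so a PR workflow that can write to the cache still cannot produce an entry the release pipeline will accept.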

Summary Checklist

Audit immediately:

[ ] Search for pull_request_target and workflow_run triggers
[ ] Verify no id-token: write in PR workflows
[ ] Confirm cache keys are namespaced between PR and release contexts
[ ] Confirm PR workflows cannot save to shared cache keys
[ ] Verify OIDC trust policies validate workflow, environment, actor, not just sub

Harden:

[ ] Enforce environment-scoped OIDC with separate IAM roles per environment
[ ] Set token TTL ≤ 5 minutes bound to workflow + job + environment + commit SHA
[ ] Switch release pipelines to ephemeral runners

Detect (run now if TanStack is in your dependency tree):

[ ] Search Actions logs for cache-restore events in PR workflows
[ ] Scan node_modules for unexpected files and network call patterns
[ ] Check cloud provider logs for anomalous OIDC token usage

Soft Armor Labs publishes governance research at the intersection of AI, security, and institutional risk. Technical note DOI: 10.5281/zenodo.20146739. ORCID: 0009-0000-1964-6440. CC BY-NC-ND 4.0.

Top comments (1)

VoltageGPU • May 13

Interesting analysis — it's reassuring to see OIDC holding up, but the build cache being the vector highlights how often we overlook caching layers in CI/CD security. In my work with GPU-based CI workflows, I've seen similar risks when cached dependencies or pre-built containers aren't versioned or signed.
