DEV Community

Olivier Buitelaar


5 Real GitHub Actions Bugs Caught by Static Analysis

You don't find out your CI is broken until it's too late. Here are five real GitHub Actions bugs — and how static analysis catches them before they ever run.


Static analysis for GitHub Actions workflows is still an underused idea. Most teams lint their application code, type-check their TypeScript, and run SAST on their Python. But the YAML files that orchestrate all of it? Those get copy-pasted from Stack Overflow and committed unchecked.

These are five categories of real bugs I've seen repeatedly — and how a workflow linter catches them before they cost you anything.


1. Secrets Accidentally Echoed in run: Steps

The bug:

- name: Deploy
  run: |
    echo "Deploying with token: ${{ secrets.DEPLOY_TOKEN }}"
    ./deploy.sh --token ${{ secrets.DEPLOY_TOKEN }}

That echo line sends your secret straight to the CI logs. GitHub redacts registered secret values from log output, but the redaction is an exact-match filter, not a guarantee. If the value is transformed before it is printed (base64-encoded, URL-encoded, split across lines) or is structured data such as a JSON blob, masking can fail silently.

What static analysis catches:

A linter flags any ${{ secrets.* }} reference that appears inside a string passed to echo, printf, or similar commands. The fix is simple:

- name: Deploy
  env:
    DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}
  run: ./deploy.sh --token "$DEPLOY_TOKEN"

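Passing secrets through environment variables instead of inline expressions keeps them out of the command string entirely: an inline ${{ }} expression is substituted into the script text before the shell runs, while "$DEPLOY_TOKEN" is resolved by the shell at run time.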
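To make the rule concrete, here is a minimal sketch of such a check in Python. It is an assumption-laden illustration, not the implementation of any particular linter: the function name find_echoed_secrets and the list of print commands are hypothetical, and it scans lines with regular expressions instead of parsing the YAML.

```python
import re

# A GitHub Actions expression that reads from the secrets context.
SECRET_EXPR = re.compile(r"\$\{\{\s*secrets\.[A-Za-z_][A-Za-z0-9_]*\s*\}\}")
# Commands that write their arguments to the log (assumed, non-exhaustive list).
PRINT_CMD = re.compile(r"\b(?:echo|printf)\b")


def find_echoed_secrets(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line number, line) for each line that prints a secret expression."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        if PRINT_CMD.search(line) and SECRET_EXPR.search(line):
            findings.append((number, line.strip()))
    return findings


if __name__ == "__main__":
    sample = '''\
- name: Deploy
  run: |
    echo "Deploying with token: ${{ secrets.DEPLOY_TOKEN }}"
    ./deploy.sh --token ${{ secrets.DEPLOY_TOKEN }}
'''
    for number, line in find_echoed_secrets(sample):
        print(f"line {number}: secret passed to a print command: {line}")
```

A check this simple only sees secrets that are inlined on the same line as the print command. It will not notice echo "$DEPLOY_TOKEN", so it complements the env-variable fix but does not replace code review.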


2. Unpinned Third-Party Actions (Supply Chain Risk)

The bug:

- uses: some-org/some-action@main
- uses: another-org/setup-tool@v2

Using a branch name (@main) or a mutable tag (@v2) means your workflow silently runs whatever that ref points to tomorrow. A compromised update to a popular action runs inside your job with everything the job can reach: the GITHUB_TOKEN and its permissions, plus any secrets the job exposes.

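This is a real supply chain risk. There have been high-profile incidents where popular GitHub Actions were compromised via tag hijacking; in the March 2025 tj-actions/changed-files compromise, for example, existing version tags were repointed to a malicious commit, and every workflow referencing those tags picked it up.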

What static analysis catches:

# Flagged: mutable reference
- uses: actions/checkout@v4

# Clean: pinned to commit SHA
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

Pinning to a full commit SHA guarantees you're running exactly the code you reviewed. A linter enforces this across your entire workflow directory automatically.
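The pinning rule is easy to sketch. The snippet below is a minimal illustration in Python, assuming a line-based scan is good enough; the function name find_unpinned_actions is hypothetical, and local actions (./path) and docker:// images are simply skipped as out of scope.

```python
import re

# Captures the target of a `uses:` key, with or without a leading list dash.
USES_LINE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s#]+)")
# A full-length Git commit SHA: exactly 40 lowercase hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def find_unpinned_actions(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line number, target) for each `uses:` not pinned to a full SHA."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_LINE.match(line)
        if not match:
            continue
        target = match.group(1).strip("'\"")
        # Local actions and container images have no Git ref; skipped in this sketch.
        if target.startswith(("./", "docker://")):
            continue
        _, _, ref = target.partition("@")
        if not FULL_SHA.fullmatch(ref):
            findings.append((number, target))
    return findings


if __name__ == "__main__":
    sample = '''\
- uses: some-org/some-action@main
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
'''
    for number, target in find_unpinned_actions(sample):
        print(f"line {number}: {target} is not pinned to a commit SHA")
```

Requiring all 40 characters is deliberate: a short SHA or a tag that merely looks like a hash is still a reference someone else can move or collide with.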


3. Missing timeout-minutes (Runaway Jobs)

The bug:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm install && npm run build

No timeout-minutes. GitHub's default timeout is 6 hours. A hung process — npm stuck on a network call, a test waiting for a port that never opens, a deploy step waiting for interactive confirmation — will silently consume your runner for 6 hours.

On GitHub-hosted runners that are billed by the minute (private repositories, larger runners) this costs real money. On self-hosted runners it can block your entire team's CI queue.

What static analysis catches:

jobs:
  build:
    runs-on: ubuntu-latest
    timeout-minutes: 15  # required
    steps:
      - uses: actions/checkout@v4
      - run: npm install && npm run build

A linter can require timeout-minutes on every job. This single rule can spare a team a painful CI bill spike.
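As a rough sketch of how such a rule might work without running anything, the Python below walks the workflow by indentation and reports jobs that have no job-level timeout-minutes. It assumes conventional block-style YAML (no flow mappings or anchors), and the function name jobs_missing_timeout is hypothetical; a production linter would use a real YAML parser.

```python
def jobs_missing_timeout(workflow_text: str) -> list[str]:
    """Return the ids of jobs that do not set a job-level timeout-minutes."""
    missing: list[str] = []
    in_jobs = False
    job_indent = key_indent = None  # learned from the first job id / job key seen
    current_job = None
    has_timeout = False

    def close_job() -> None:
        # Record the job we just finished reading if it never set a timeout.
        if current_job is not None and not has_timeout:
            missing.append(current_job)

    for raw in workflow_text.splitlines():
        stripped = raw.strip()
        if not stripped or stripped.startswith("#"):
            continue
        indent = len(raw) - len(raw.lstrip())
        if indent == 0:
            # Any top-level key ends the previous job and possibly the jobs block.
            close_job()
            current_job = None
            in_jobs = stripped.startswith("jobs:")
            job_indent = None
        elif in_jobs and (job_indent is None or indent == job_indent):
            # A line at job-id depth starts a new job.
            close_job()
            job_indent = indent
            current_job = stripped.split(":", 1)[0]
            has_timeout = False
            key_indent = None
        elif current_job is not None:
            if key_indent is None:
                key_indent = indent
            # Only a key directly under the job counts; step-level timeouts do not.
            if indent == key_indent and stripped.startswith("timeout-minutes:"):
                has_timeout = True
    close_job()
    return missing


if __name__ == "__main__":
    sample = '''\
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: npm install && npm run build
'''
    for job in jobs_missing_timeout(sample):
        print(f"job '{job}' has no timeout-minutes")
```

The check deliberately ignores timeout-minutes on individual steps, since a single step timeout still leaves the rest of the job running on the 6-hour default.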


4. continue-on-error: true Silencing Real Failures

The bug:

- name: Run security scan
  continue-on-error: true
  run: ./security-scanner.sh

continue-on-error: true means a failing step no longer fails the job: the step's outcome is recorded as a failure, but its conclusion is success and the job stays green. This is occasionally legitimate, but when applied to security scans, test suites, or linting steps, failures are silently swallowed and PRs merge anyway.

Worse, it tends to spread: once one engineer adds it to unblock a stuck PR, others copy the pattern until your quality gates have no teeth.

What static analysis catches:

A linter flags continue-on-error: true on steps whose names suggest quality gates (scan, test, lint, check) — or warns on any usage and asks for a justification comment. It catches the copy-paste propagation before it becomes policy.
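Here is one way the name-based heuristic could look, as a minimal Python sketch. The keyword list mirrors the four words above, the function name find_silenced_gates is hypothetical, and the indentation tracking assumes block-style YAML with one space after each list dash.

```python
import re

# Step names that suggest a quality gate (whole words, so "Checkout" is ignored).
GATE_WORDS = re.compile(r"\b(?:scan|test|lint|check)s?\b", re.IGNORECASE)
SILENCED = re.compile(r"continue-on-error:\s*true\s*(?:#.*)?$")


def find_silenced_gates(workflow_text: str) -> list[str]:
    """Return names of steps that look like quality gates but ignore failures."""
    findings: list[str] = []
    name, silenced = None, False
    item_indent = key_indent = None  # columns of the current "- " and of its keys

    def flush() -> None:
        if silenced and name and GATE_WORDS.search(name):
            findings.append(name)

    for line in workflow_text.splitlines():
        content = line.lstrip()
        if not content or content.startswith("#"):
            continue
        indent = len(line) - len(content)
        if content.startswith("- ") and (item_indent is None or indent <= item_indent):
            flush()  # a new list item closes the previous one
            name, silenced, item_indent = None, False, indent
            content = content[2:].lstrip()
            key_indent = len(line) - len(content)
        elif item_indent is None or indent <= item_indent:
            flush()  # dedent: we left the list
            name, silenced, item_indent = None, False, None
            continue
        elif indent != key_indent:
            continue  # nested value, e.g. a `with:` input or a `run: |` body
        if content.startswith("name:"):
            name = content[len("name:"):].strip().strip("'\"")
        elif SILENCED.match(content):
            silenced = True
    flush()
    return findings


if __name__ == "__main__":
    sample = '''\
- name: Run security scan
  continue-on-error: true
  run: ./security-scanner.sh
'''
    for step in find_silenced_gates(sample):
        print(f"step '{step}' ignores failures")
```

Name matching is a heuristic and will miss an unnamed step that runs a scanner, which is why the stricter variant (warn on every usage and ask for a justification comment) is the safer default.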


5. Deprecated and EOL Runtime Configurations

The bug:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/setup-node@v3
        with:
          node-version: '16'

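Node 16 reached end-of-life in September 2023 and no longer receives security fixes, and GitHub has since deprecated the node16 runtime for JavaScript actions (which setup-node@v3 itself runs on). Jobs may fail with confusing errors, run on a different runtime than you expect, or work today but break next month when a dependency or the runner drops support.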

Similar issues occur with the ubuntu-18.04 runner label (retired), old actions/cache versions with changed APIs, and the set-output workflow command (deprecated in favor of writing to $GITHUB_OUTPUT).

What static analysis catches:

A linter maintains a list of deprecated values and flags them with suggested replacements: node-version: '16' suggests '20' or '22'; ubuntu-18.04 suggests ubuntu-latest. These are deterministic checks that never need to run a single line of your code.
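A deprecation check is essentially a lookup table. The Python sketch below illustrates the idea with three entries taken from the examples in this post; the table, the advice strings, and the function name find_deprecations are all hypothetical, and a real linter would ship a maintained list.

```python
import re

# Hypothetical rule table: (pattern, advice). Entries mirror this post's examples.
DEPRECATIONS = [
    (re.compile(r"node-version:\s*['\"]?16(?![0-9])"),
     "Node 16 is end-of-life; use '20' or '22'"),
    (re.compile(r"runs-on:\s*ubuntu-18\.04\b"),
     "ubuntu-18.04 is deprecated; use ubuntu-latest"),
    (re.compile(r"::set-output\s"),
     "set-output is deprecated; write to $GITHUB_OUTPUT instead"),
]


def find_deprecations(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line number, advice) for each deprecated value found."""
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        for pattern, advice in DEPRECATIONS:
            if pattern.search(line):
                findings.append((number, advice))
    return findings


if __name__ == "__main__":
    sample = '''\
jobs:
  test:
    runs-on: ubuntu-18.04
    steps:
      - uses: actions/setup-node@v3
        with:
          node-version: '16'
      - run: echo "::set-output name=sha::$GITHUB_SHA"
'''
    for number, advice in find_deprecations(sample):
        print(f"line {number}: {advice}")
```

Keeping the rules as data means that adding the next end-of-life version is a one-line change, with no new logic to test.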


Catching All of This Automatically

Running these checks manually doesn't scale, especially across a monorepo with dozens of workflow files. The solution is making them automatic and zero-friction.

workflow-guardian is a free GitHub Action that runs all of these checks statically on every PR. Add it in 30 seconds:

name: Validate Workflows
on: [pull_request]

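jobs:
  validate:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
      - uses: ollieb89/workflow-guardian@v1 # pin to a commit SHA after reviewing the release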
        with:
          fail-on-warnings: true

It flags all five categories above as annotations directly on the changed workflow files in the PR diff — no separate dashboard, no configuration required to get started.


Static analysis for your application code is table stakes. Your CI workflows deserve the same treatment. The bugs are there — they just don't show up until something goes wrong at 2am on a Friday.

What's the worst CI workflow bug you've been bitten by? Drop it in the comments.
