Mike Anderson

Posted on Jun 7

Protecting GitHub from Supply-Chain Malware: Prevention, Cleanup, and Recovery

#githubmalware #security #supplychainattack #incidentresponse

GitHub is not just a place where code lives. In most engineering organizations, it is part of the software delivery control plane.

That means a compromised developer machine, OAuth app, GitHub App, personal access token, SSH key, service account, CI runner, or automation script can become a supply-chain problem very quickly.

The dangerous pattern looks like this:

A trusted identity pushes a small repo change
→ the change modifies developer tooling, CI, package scripts, Docker, or repo rules
→ developers pull it or CI executes it
→ credentials are stolen or branch protections are weakened
→ the same change propagates across many repositories

This post is a practical playbook for preventing this class of attack and recovering cleanly when it happens.

It is written for security engineers, platform engineers, SOC teams, and engineering managers who need something that works in production, not a theoretical checklist.

What usually goes wrong

The malware itself is rarely the only problem.

The real gaps are usually around control and visibility:

Developers can pull a branch that contains repo-controlled execution hooks.
Package scripts, Dockerfiles, devcontainers, GitHub workflows, and editor/AI-tool configs can change execution behavior.
Security cannot manually review every pull request.
Branch protection or rulesets can be weakened by a token or automation identity.
Audit logs exist, but there are no high-signal detections.
Endpoint process telemetry may not be centralized, especially for macOS fleets.
Cleanup is done file-by-file without first containing the identity or automation path that caused propagation.

A good response has to protect the full path:

Developer machine
→ local Git push
→ GitHub push ruleset
→ pull request scanner
→ CODEOWNERS review
→ protected branch merge
→ CI/runtime monitoring
→ SIEM alerts
→ drift scanning
→ incident response

Prevention: the control stack that actually works

1. Block high-risk repo execution paths at push time

Some paths are too risky to allow casually because they can influence local tooling or hidden setup behavior.

Use an organization-level GitHub push ruleset to restrict high-risk file paths.

Example restricted paths:

.github/setup.js
.github/setup.mjs
.github/setup.cjs
.github/**/setup.js
.claude/**
.gemini/**
.cursor/**
.cursor/rules/**

This is not a malware signature. It is a control around repo-controlled execution surfaces.

GitHub rulesets are designed to apply rules across repositories, and push rulesets can restrict file paths for targeted repositories and their fork networks. See GitHub’s ruleset documentation for implementation details.

Let’s not block every engineering file globally. Files such as Dockerfile, package.json, .github/workflows/**, and .devcontainer/** are legitimate. They should be scanned and routed to the right reviewer, not blindly blocked.
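To make the block-versus-review split concrete, here is a minimal Python sketch that sorts changed paths into "block", "review", or "allow". The function names are hypothetical, and Python's fnmatch only approximates how GitHub evaluates ruleset patterns, so treat it as an illustration of the policy, not a replica of GitHub's matcher.

```python
from fnmatch import fnmatchcase

# Patterns copied from the restricted-path list above.
BLOCKED = [
    ".github/setup.js", ".github/setup.mjs", ".github/setup.cjs",
    ".github/**/setup.js", ".claude/**", ".gemini/**", ".cursor/**",
]
# Legitimate but sensitive files: scan and route to a reviewer instead.
REVIEW = ["Dockerfile", "package.json", ".github/workflows/**", ".devcontainer/**"]


def _matches(path: str, patterns: list) -> bool:
    # Match the full path, and the basename for patterns without a slash,
    # so "package.json" also catches "services/api/package.json".
    name = path.rsplit("/", 1)[-1]
    return any(
        fnmatchcase(path, p) or ("/" not in p and fnmatchcase(name, p))
        for p in patterns
    )


def classify_path(path: str) -> str:
    """Return 'block', 'review', or 'allow' for one changed file path."""
    if _matches(path, BLOCKED):
        return "block"
    if _matches(path, REVIEW):
        return "review"
    return "allow"
```

Checking the blocked list first means a file such as .github/actions/build/setup.js is blocked even though other files under .github/ would only be routed for review.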

2. Protect default and release branches with rulesets

For main, master, develop, release/*, and hotfix/*, enforce:

Pull request required
CODEOWNER review required
Signed commits required where practical
Status checks required where CI exists
Stale approvals dismissed
Conversation resolution required
Force push blocked
Branch deletion blocked
Bypass limited to break-glass identities only

The goal is not bureaucracy. The goal is to prevent a token, automation script, or compromised user from quietly changing protected branches.

3. Use CODEOWNERS for sensitive paths only

Security should not review everything. That will fail.

Route only sensitive files to the right owners.

Example .github/CODEOWNERS:

# GitHub automation
.github/workflows/**       @org/platform-team
.github/actions/**         @org/platform-team
.github/CODEOWNERS         @org/security-team @org/platform-team

# AI/editor/agentic execution config
.claude/**                 @org/security-team
.gemini/**                 @org/security-team
.cursor/**                 @org/security-team

# Build and runtime execution surfaces
package.json               @org/platform-team
Dockerfile                 @org/platform-team
Dockerfile.*               @org/platform-team
docker-compose.yml         @org/platform-team
docker-compose.yaml        @org/platform-team
docker-compose*.yml        @org/platform-team
docker-compose*.yaml       @org/platform-team
.devcontainer/**           @org/platform-team
.vscode/tasks.json         @org/platform-team
.vscode/launch.json        @org/platform-team

This works only if branch protection or branch rulesets require Code Owner review. GitHub documents CODEOWNERS as a way to define responsible users or teams for files in a repository:

https://docs.github.com/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners

4. Treat Docker as legitimate but high-impact

Docker is not suspicious by itself. Blocking all Docker changes will break normal engineering work.

The practical approach is:

Dockerfile / Dockerfile.*      → Platform review + scanner scoring
docker-compose*.yml/yaml       → Platform review + scanner scoring
.devcontainer/**               → Platform review + scanner scoring
Docker socket mounts           → Critical unless explicitly approved
Host secret mounts             → Critical unless explicitly approved
Remote download-and-execute    → High or Critical depending on context

Examples that should alert:

RUN curl https://example.com/install.sh | bash

{
  "postCreateCommand": "node .github/setup.js"
}

volumes:
  - /var/run/docker.sock:/var/run/docker.sock
  - ~/.ssh:/root/.ssh

Examples that should not automatically alert:

FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
CMD ["npm", "start"]

The difference is behavior. We care about downloader execution, secret exposure, Docker socket access, hidden setup scripts, and lifecycle hooks that execute unreviewed commands.
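The alert examples above can be expressed as a handful of behavior rules. This is a rough Python sketch: the rule names and regular expressions are illustrative assumptions that would need tuning against real repositories, and a regex match is a reason to route for review, not proof of malice.

```python
import re

# Illustrative rules only; names and patterns are assumptions to tune.
BEHAVIOR_RULES = {
    # curl or wget piped straight into a shell
    "remote_download_execute": re.compile(
        r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b"
    ),
    # Docker socket referenced in a mount or command
    "docker_socket_mount": re.compile(r"/var/run/docker\.sock"),
    # Host SSH or AWS credential directory used as a mount source
    "host_secret_mount": re.compile(r"(~|\$HOME)/\.(ssh|aws)\b[^\s:]*:"),
    # Devcontainer lifecycle hook that runs something under .github/
    "lifecycle_runs_hidden_script": re.compile(
        r'"(postCreateCommand|postStartCommand|onCreateCommand)"\s*:\s*"[^"]*\.github/'
    ),
}


def find_behaviors(text: str) -> list:
    """Return the sorted names of all behavior rules that match the text."""
    return sorted(name for name, rx in BEHAVIOR_RULES.items() if rx.search(text))
```

Run against the plain Node Dockerfile shown above, none of these rules match, which is the point: the file type alone does not raise an alert.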

5. Add a central GitHub push and PR scanner

Local developer hooks are useful, but they can be bypassed. The central scanner is the scalable control.

The scanner should run as a small internal service:

GitHub organization webhook
→ HTTPS scanner endpoint
→ GitHub API read-only access
→ risk scoring
→ Datadog or SIEM logs
→ Slack/Jira/PagerDuty for high-risk findings

The scanner should:

Validate the GitHub webhook signature.
Read the repo, branch, actor, commit, and PR.
Pull changed file metadata from the GitHub API.
Fetch content only for changed files that matter.
Score risky paths and risky behavior.
Send high and critical findings to the SIEM.
Optionally ask AI to summarize high-risk changes for responders.

GitHub recommends validating webhook deliveries using X-Hub-Signature-256, which uses HMAC-SHA256:

https://docs.github.com/en/webhooks/using-webhooks/validating-webhook-deliveries
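A minimal Python sketch of that check, using only the standard library. The function name is hypothetical; the important details are that the HMAC is computed over the raw request body (before any JSON parsing) and that the comparison is constant-time.

```python
import hashlib
import hmac
from typing import Optional


def is_valid_signature(secret: bytes, body: bytes, signature_header: Optional[str]) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The scanner should reject the delivery and stop processing whenever this returns False, including when the header is missing.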

A practical risk model:

High-risk path:                  +5
Sensitive engineering path:       +3
Command execution:                +4
Dynamic execution:                +3
Crypto/decryption logic:          +4
Downloader behavior:              +2
Temp execution:                   +3
Credential reference:             +3
Docker socket / host secret mount: +5
Devcontainer lifecycle command:   +4

Severity:

0–4    log only
5–7    medium
8–11   high
12+    critical

The scanner should not execute repo code. It should not build the project. It should not approve pull requests. It should detect and route risk.
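The risk model above translates almost directly into code. In this Python sketch the weights and severity bands come from the tables above; the signal key names are my own labels, and counting each signal at most once per change is an assumption, not part of the model as stated.

```python
# Weights from the risk model above; key names are illustrative labels.
WEIGHTS = {
    "high_risk_path": 5,
    "sensitive_engineering_path": 3,
    "command_execution": 4,
    "dynamic_execution": 3,
    "crypto_decryption_logic": 4,
    "downloader_behavior": 2,
    "temp_execution": 3,
    "credential_reference": 3,
    "docker_socket_or_host_secret_mount": 5,
    "devcontainer_lifecycle_command": 4,
}


def score(findings) -> int:
    """Sum the weights of the distinct signals found in one change."""
    return sum(WEIGHTS[name] for name in set(findings))


def severity(total: int) -> str:
    """Map a total score onto the severity bands above."""
    if total >= 12:
        return "critical"
    if total >= 8:
        return "high"
    if total >= 5:
        return "medium"
    return "log_only"
```

For example, a high-risk path that also contains command execution and decryption logic scores 5 + 4 + 4 = 13, which lands in the critical band.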

6. Use Datadog or another SIEM for GitHub control-plane detections

If GitHub audit logs are streamed to Datadog, add high-signal rules.

Datadog Cloud SIEM supports custom detection rules over ingested logs.

Recommended rules:

Critical:
- Branch protection removed
- Repository ruleset deleted
- Mass repository modification by the same actor/token/app

High:
- Branch/ruleset protection weakened
- Programmatic token modified repository controls
- Suspicious automation client modified GitHub controls
- Workflow run or workflow logs deleted

Example detection logic:

source:github* (@action:protected_branch.destroy OR @github.action:protected_branch.destroy)

source:github* (@action:repository_ruleset.destroy OR @github.action:repository_ruleset.destroy)

source:github* (
  @programmatic_access_type:"OAuth access token" OR
  @programmatic_access_type:"Personal access token" OR
  @programmatic_access_type:"GitHub App token"
)
(
  @action:protected_branch.* OR
  @action:repository_ruleset.* OR
  @action:repo.* OR
  @action:workflows.*
)

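Field names vary by log pipeline, so open several real audit events first and confirm whether your Datadog fields are named @action, @github.action, @repo, @github.repository, @actor, @user_agent, @token_id, or something else. Check values the same way: the @programmatic_access_type strings in your events may be more specific than the three shown above.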

Let’s not assume field names blindly. Confirm them once, then build the rules.
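The "mass repository modification" rule is easier to reason about as code. Here is a small Python sketch of the logic, assuming audit events have already been normalized into dicts with hypothetical keys actor, repo, and timestamp (a datetime). The 10-minute window and 5-repo threshold are placeholder values to tune, not recommendations.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_mass_modifiers(events, window=timedelta(minutes=10), min_repos=5):
    """Return actors that touched at least `min_repos` distinct repos within `window`."""
    by_actor = defaultdict(list)
    for event in events:
        by_actor[event["actor"]].append((event["timestamp"], event["repo"]))

    flagged = set()
    for actor, items in by_actor.items():
        items.sort()  # oldest first
        start = 0
        for end in range(len(items)):
            # Shrink the window until it spans no more than `window`.
            while items[end][0] - items[start][0] > window:
                start += 1
            repos = {repo for _, repo in items[start:end + 1]}
            if len(repos) >= min_repos:
                flagged.add(actor)
                break
    return sorted(flagged)
```

The same grouping works for a token ID or app ID instead of an actor; in a SIEM this usually becomes a threshold rule grouped by that field with a distinct count over the repository field.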

7. Protect developer Macs without flooding the SIEM

For macOS developer fleets, full process telemetry from every laptop can become noisy and expensive.

A better first step is targeted endpoint safety:

Local safe-push guardrail
Periodic local repo indicator scan
Endpoint/EDR high-confidence alerts
Only high-signal events forwarded to the SIEM

High-signal Mac detections:

node .github/setup.js
node */.github/setup.js
AI/editor tool spawning shell/downloader/interpreter
Execution from /tmp or /var/tmp
curl/wget followed by shell execution
Unexpected access to local SSH, cloud, package, or GitHub credentials

If a developer machine executed suspicious repo-controlled code, treat it as a credential exposure event until proven otherwise.
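A periodic local repo indicator scan can be very small. The Python sketch below walks a directory of checkouts and reports the setup-script indicator files discussed earlier; the function name and the choice of indicators are assumptions, and a real rollout would add error handling for unreadable directories and ship results through the endpoint agent.

```python
from pathlib import Path

# Indicator files from the restricted-path list; extend as needed.
INDICATOR_FILES = [".github/setup.js", ".github/setup.mjs", ".github/setup.cjs"]


def scan_repos(root: Path) -> list:
    """Return indicator files found in Git checkouts under `root`, relative to it."""
    hits = []
    for git_entry in root.rglob(".git"):
        repo = git_entry.parent
        for rel in INDICATOR_FILES:
            candidate = repo / rel
            if candidate.is_file():
                hits.append(candidate.relative_to(root).as_posix())
    return sorted(hits)
```

Only the hits need to reach the SIEM, which keeps laptop telemetry volume low while still catching infected working copies.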

Cure: how to handle the incident cleanly from the moment it is detected

When this happens, speed matters. But order matters more.

Step 1: Stop the bleeding

Immediately:

Freeze merges on protected branches.
Disable or restrict suspicious OAuth apps, GitHub Apps, PATs, SSH keys, and service accounts.
Suspend or restrict the actor if the activity is unauthorized.
Block known high-risk paths with a push ruleset.
Export audit logs now, before retention windows expire the events you need.

The goal is to stop reinfection while evidence is still intact.

Step 2: Preserve evidence

Capture:

GitHub audit log export
Affected repos and branches
Commit SHAs
PR URLs
Actor, IP, user agent
Token type and token ID if present
OAuth application ID or GitHub App ID if present
Workflow runs and logs
Endpoint evidence if local execution happened

Let’s not clean first and investigate later. Cleanup without evidence makes root cause and scope much harder.

Step 3: Scope the blast radius

Use GitHub code search and API-based scanning for known paths and behavior.

Search for high-risk files:

path:.github/setup.js
path:.github/setup.mjs
path:.github/setup.cjs
path:.claude/settings.json
path:.gemini/settings.json
path:.cursor/rules/setup.mdc

Search for behavior, not just filenames:

"node .github/setup.js"
"child_process.exec"
"crypto.createDecipheriv"
"new Function("
"eval("
"postinstall"
"curl"
"wget"
"/tmp/"
"bun-v"
"GITHUB_TOKEN"
"AWS_ACCESS_KEY"
"VAULT_TOKEN"

Map:

Which repos are infected?
Which branches are infected?
Which protected branches were weakened?
Which actors/tokens/apps touched multiple repos?
Which developers pulled or executed the affected code?
Which CI jobs ran after infection?
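When the indicator list grows, it helps to generate the searches instead of typing them. This Python sketch builds organization-scoped code search queries from the path and string lists above; the function name is hypothetical and my-org in the usage note is a placeholder. It only builds query strings and does not call any API.

```python
def build_code_search_queries(org: str, paths: list, strings: list) -> list:
    """Build GitHub code search queries scoped to one organization."""

    def quote(text: str) -> str:
        # Exact-match strings are quoted; escape backslashes and quotes inside.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    queries = [f"org:{org} path:{p}" for p in paths]
    queries += [f"org:{org} {quote(s)}" for s in strings]
    return queries
```

Calling it with "my-org", the path .github/setup.js, and the string node .github/setup.js yields one path query and one quoted-string query, each prefixed with org:my-org.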

Step 4: Clean developer machines first where execution is confirmed

For a developer Mac that executed suspicious repo code:

Isolate or restrict network access if feasible.
Preserve process evidence if endpoint tooling exists.
Identify the repo path and command executed.
Revoke active GitHub sessions and tokens for the user.
Rotate reachable credentials:
- GitHub tokens
- GitHub SSH keys
- package registry tokens
- cloud CLI credentials
- Vault or secrets-manager tokens
- deployment credentials
Reclone affected repos from a clean protected branch after central cleanup.

If local work exists, preserve only safe work as a patch.

Example:

git status

git diff -- . \
  ':(exclude).github/setup.js' \
  ':(exclude).github/setup.mjs' \
  ':(exclude).github/setup.cjs' \
  ':(exclude).claude/**' \
  ':(exclude).gemini/**' \
  ':(exclude).cursor/**' \
  > ../safe-work.patch

git diff --cached -- . \
  ':(exclude).github/setup.js' \
  ':(exclude).github/setup.mjs' \
  ':(exclude).github/setup.cjs' \
  ':(exclude).claude/**' \
  ':(exclude).gemini/**' \
  ':(exclude).cursor/**' \
  > ../safe-staged-work.patch

Then apply the patch to a clean clone:

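git clone git@github.com:ORG/REPO.git clean-repo
cd clean-repo
git checkout -b recover-safe-work
# Apply staged changes first: the unstaged patch was taken relative to them.
# Skip a patch file that is empty.
git apply --check ../safe-staged-work.patch
git apply ../safe-staged-work.patch
git apply --check ../safe-work.patch
git apply ../safe-work.patch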

Review before pushing.

This protects ongoing work without carrying malicious files forward.

Step 5: Clean GitHub repositories

There are three cleanup options. The best one depends on what the malware did.

Option A: Cleanup PR that removes the malicious files

This is usually the best first recovery step for private enterprise repositories when the malicious file did not contain secrets and the priority is safe recovery.

Process:

Create cleanup branch
Remove malicious files and configs
Remove references from package scripts, workflows, Docker, devcontainer, editor configs
Open PR
Require security/platform review
Merge after scanner and CI pass
Keep audit trail intact

This is operationally clean because history remains available for forensics.

The downside: the malicious blob remains in Git history. That may be acceptable for internal containment if secrets were not committed and credentials are rotated where execution occurred.

Option B: Recreate infected branches from a clean base and cherry-pick safe commits

This is best for ongoing feature branches.

Process:

git fetch origin
git checkout origin/main
git checkout -b clean-feature-branch

# Cherry-pick only reviewed safe commits
git cherry-pick <safe_commit_sha_1>
git cherry-pick <safe_commit_sha_2>

If a commit mixes good work with malicious files, avoid cherry-picking it directly. Create a patch excluding risky paths or manually reapply the safe changes.

This is cleaner than trying to rebase a branch that already contains malicious commits.

Option C: Rewrite history

Use this only when necessary, for example:

Secrets were committed
Malware must be removed from history for legal/compliance reasons
Public repositories or forks make retained history unacceptable
Large-scale credential exposure requires complete removal

History rewrite is disruptive. It requires coordinated force-pushes, branch protection handling, developer reclones, fork handling, and communication. It also does not remove copies already cloned elsewhere. Rotate credentials regardless.

Tools commonly used for this type of cleanup include git filter-repo and BFG Repo-Cleaner. The exact choice depends on repo size, hosting constraints, and whether the team needs to remove files, strings, or large objects.

A safe rule:

If the goal is fast containment and no secrets were committed, use cleanup PRs and preserve history.
If secrets or regulated content were committed, rotate credentials and plan a controlled history rewrite.
If feature branches are infected, recreate them from a clean base and cherry-pick safe work.

Step 6: Restore and verify protections

After cleanup:

Restore branch protection and rulesets.
Confirm push rulesets are active.
Confirm CODEOWNERS review is enforced.
Confirm bypass actors are limited.
Run drift scanner.
Run SIEM audit queries for new suspicious activity.
Confirm no new infected files appear.

Let’s not reopen normal merging just because files were deleted. Reopen only after the propagation path is contained.
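One way to make "restore and verify" checkable is to compare each branch's current settings against a baseline. The Python sketch below assumes the settings have already been fetched and flattened into a dict; the key names are hypothetical labels for the controls listed in the prevention section, not GitHub API field names.

```python
# Baseline for protected branches; keys are illustrative, not API fields.
EXPECTED = {
    "pull_request_required": True,
    "code_owner_review_required": True,
    "force_push_blocked": True,
    "branch_deletion_blocked": True,
}


def find_drift(actual: dict, expected: dict = EXPECTED) -> dict:
    """Return {setting: (expected, actual)} for every setting that differs."""
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }
```

An empty result for every protected branch is a reasonable gate before merges are reopened; a missing setting shows up with an actual value of None rather than being silently ignored.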

How to protect ongoing work during cleanup

This is where many teams create unnecessary pain.

Developers may have legitimate work sitting on infected branches. Throwing everything away is safe, but expensive. Blind rebasing is risky.

The best approach is:

Freeze the infected branch.
Create a clean branch from a known-clean protected base.
Extract only safe application changes.
Exclude risky paths.
Apply patch to the clean branch.
Run tests and scanner.
Open a fresh PR.

Practical command flow:

# On the infected branch
git diff origin/main...HEAD -- . \
  ':(exclude).github/**' \
  ':(exclude).claude/**' \
  ':(exclude).gemini/**' \
  ':(exclude).cursor/**' \
  ':(exclude).vscode/tasks.json' \
  ':(exclude).vscode/launch.json' \
  > ../safe-feature-work.patch

# In a clean clone
git checkout origin/main
git checkout -b recover-feature-work

git apply --check ../safe-feature-work.patch
git apply ../safe-feature-work.patch

git status
git diff --stat

Then review the patch manually:

git diff

Run the normal test suite and the central scanner before opening the PR.

This keeps delivery moving without dragging the infection forward.
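Before applying a recovery patch, it is cheap to confirm that it really excludes the risky paths. This Python sketch reads the "diff --git" headers of a patch and lists any blocked paths it touches. The function name is hypothetical, and the header parsing is deliberately simple: it assumes unquoted paths, so filenames containing " b/" or unusual characters need a real patch parser.

```python
import re

# Same paths the diff command above excludes.
BLOCKED_PREFIXES = (".github/", ".claude/", ".gemini/", ".cursor/")
BLOCKED_FILES = {".vscode/tasks.json", ".vscode/launch.json"}
_DIFF_HEADER = re.compile(r"^diff --git a/(.+?) b/(.+)$")


def blocked_paths_in_patch(patch_text: str) -> list:
    """Return blocked paths that appear in the patch's file headers."""
    found = set()
    for line in patch_text.splitlines():
        match = _DIFF_HEADER.match(line)
        if not match:
            continue
        # Check both sides so renames into or out of a blocked path are caught.
        for path in match.groups():
            if path.startswith(BLOCKED_PREFIXES) or path in BLOCKED_FILES:
                found.add(path)
    return sorted(found)
```

An empty list means the patch headers are clean; anything else should stop the recovery until the patch is regenerated.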

How AI can help without becoming another risk

AI is useful here, but only if it is placed behind deterministic controls.

Good uses of AI:

Summarize suspicious diffs for responders
Explain developer-machine impact
Identify whether a change can execute in CI, Docker, devcontainer, or local tooling
Generate cleanup PR descriptions
Draft incident timelines from audit logs
Suggest safe cherry-pick candidates
Review patches for accidental inclusion of blocked paths

Poor uses of AI:

Auto-approving PRs
Auto-revoking users or tokens without human approval
Receiving full repositories or secrets
Being the only detector
Mass-editing repositories without review

A safe AI prompt pattern:

We are reviewing a GitHub change for supply-chain risk.

Inputs:
- Repository:
- Actor:
- Branch:
- Commit:
- Changed files:
- Matched scanner rules:
- Diff excerpt:

Tasks:
1. Explain what changed in plain English.
2. Identify whether this can execute on a developer machine, CI runner, Docker build, devcontainer, or production runtime.
3. Identify whether it can access credentials or tokens.
4. Rate the change as normal, suspicious, or likely malicious.
5. Use only the provided evidence.
6. Recommend one action: allow, request owner review, block merge, or incident response.

AI should receive small diff excerpts, matched rules, and metadata. It should not receive .env files, private keys, customer data, or full proprietary repositories.
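A small pre-processing step can enforce that boundary before anything reaches a model. The Python sketch below redacts a few well-known secret formats and truncates the diff before building the Inputs section of the prompt. The pattern list is intentionally short and the 4,000-character limit is an arbitrary assumption; a dedicated secret scanner should sit in front of this in practice.

```python
import re

# A few common secret shapes; not an exhaustive list.
SECRET_PATTERNS = [
    re.compile(r"gh[pousr]_[A-Za-z0-9]{36,}"),  # GitHub token prefixes
    re.compile(r"AKIA[0-9A-Z]{16}"),            # AWS access key ID
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----"
    ),
]
MAX_EXCERPT_CHARS = 4000  # assumption: tune to your model and policy


def redact(text: str) -> str:
    """Replace anything that looks like a known secret format."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def build_prompt_inputs(meta: dict, matched_rules: list, diff: str) -> str:
    """Build the Inputs section from metadata, rule names, and a safe excerpt."""
    # Redact before truncating so a secret is never cut in half and left behind.
    excerpt = redact(diff)[:MAX_EXCERPT_CHARS]
    lines = [f"- {key}: {value}" for key, value in meta.items()]
    lines.append("- Matched scanner rules: " + ", ".join(matched_rules))
    lines.append("- Diff excerpt:\n" + excerpt)
    return "Inputs:\n" + "\n".join(lines)
```

Keeping this step deterministic means the decision about what the model may see is made by code that can be reviewed and tested, not by the model itself.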

The value is speed and clarity. AI can remove the back-and-forth by turning raw diffs and audit logs into a concise triage note for SOC, platform, and engineering managers.

A clean incident timeline from detection to recovery

Here is the sequence I would use.

0–15 minutes: confirm and contain

Open incident channel.
Assign incident commander.
Freeze merges if protected branches or many repos are affected.
Disable suspicious token/app/user path.
Preserve audit logs.
Block high-risk paths with push ruleset if not already active.

15–60 minutes: scope

Export audit logs.
List affected repos and branches.
Identify actor/token/app/user agent/IP.
Search for follow-on pushes after protection changes.
Check CI workflow execution.
Check whether developer machines executed suspicious files.

1–4 hours: eradicate

Create cleanup PRs for affected repos.
Recreate feature branches from clean base where needed.
Rotate credentials if local or CI execution occurred.
Restore branch protection/rulesets.
Disable or re-scope risky OAuth apps, PATs, and GitHub Apps.

Same day: verify

Run drift scanner.
Run SIEM audit queries.
Confirm no new infected files.
Confirm no new branch protection changes.
Confirm no new mass repo modification.
Confirm endpoint findings are triaged.

Next 1–2 weeks: harden

Deploy central scanner.
Add CODEOWNERS for sensitive paths.
Tune SIEM rules.
Roll out local developer guardrails.
Review automation identities.
Document exception process.
Run tabletop or controlled simulation.

The prevention and cure in one view

Area	Prevention	Cure
Developer Macs	Local guardrails, targeted endpoint checks, short-lived credentials	Isolate if executed, preserve evidence, rotate credentials, reclone clean
GitHub pushes	Push rulesets for high-risk paths	Block reinfection and remove malicious files by cleanup PR
Branches	Branch rulesets and limited bypass	Restore protections and review pushes after weakening
PR review	CODEOWNERS for sensitive paths	Route cleanup and risky changes to Platform/Security
Docker/devcontainer	Scan and require Platform review	Remove risky lifecycle commands and host mounts
CI/CD	Workflow review and status checks	Disable malicious workflow paths and preserve logs
Tokens/apps	Least privilege and approval process	Revoke, rotate, re-scope, and audit usage
SIEM	Datadog or equivalent rules for GitHub audit logs	Alert, correlate, and drive response
AI	Summarize high-risk findings	Draft cleanup notes and reduce responder back-and-forth

Final thoughts

A GitHub supply-chain malware incident is not solved by deleting one file.

The clean answer is layered:

Developer machine safety
GitHub push rulesets
Branch rulesets
CODEOWNERS
Central push/PR scanner
SIEM detections
Targeted endpoint findings
Credential rotation
Careful branch recovery
AI-assisted triage

The most important mindset shift is this:

Treat GitHub as a production control plane.

Once the team sees it that way, the controls become obvious. Protect the developer machine. Protect the push. Protect the merge. Monitor the control plane. Keep cleanup evidence intact. Use AI to summarize and accelerate, not to blindly decide.

That is how a team can recover cleanly and make the next attack much harder.

DEV Community