Gerardo Castro Arica for AWS Heroes

Posted on Mar 27 • Edited on Apr 16

Mutable tags. 10,000 pipelines. One credential. — What the Trivy attack taught me about implicit trust

#ai #devops #security

A few days ago I was designing a GitHub Actions pipeline with security scanning tools: choosing what to integrate, how to structure it, and what permissions to give it, given the context of the project I was building it for. It's the kind of work that feels productive, because you're building something that will improve the team's security posture.

That's when I found out what had happened to Trivy.

On March 19, 2026, TeamPCP compromised the most widely used open-source vulnerability scanner in the cloud-native ecosystem. They didn't hack a business application. They didn't exploit a vulnerability in production code. They compromised the tool that thousands of organizations use to find vulnerabilities in their own applications. The security scanner became the weapon.

That made me stop. Not to throw the pipeline away, but to rethink the principles I was building it on.

This post is not a threat intelligence analysis. I'm a practitioner who is learning more and more about building security pipelines and who reflected on this case. What I found along the way changed how I think about trust in external dependencies.

What exactly happened?

The attack wasn't a single event — it was a multi-phase campaign that lasted several days and started earlier than most people think.

Three weeks before March 19, an automated bot called hackerbot-claw exploited a misconfigured workflow in the Trivy repository and stole a Personal Access Token. Aqua Security detected the incident and rotated credentials — but the rotation was incomplete. TeamPCP retained access to the credentials that survived. That's what made everything that followed possible. [Palo Alto Networks]

On March 19 at 17:43 UTC, using those retained credentials, TeamPCP compromised three components simultaneously:

The Trivy binary. A malicious version — v0.69.4 — was published across all official distribution channels: GitHub Releases, Docker Hub, GHCR, ECR Public, deb/rpm repositories, and get.trivy.dev. The scanning logic lives in this binary. The other two components are wrappers that invoke it. [Wiz Research]

trivy-action. The GitHub Action most teams use to integrate Trivy into their pipelines. TeamPCP force-pushed 76 of 77 version tags to point at malicious commits. Pipelines referencing these actions by tag began executing the attacker's code on their next run — with no visible change to the tag name or the releases page. [Microsoft Security Blog]

setup-trivy. The Action that installs the binary. All 7 existing tags were compromised in the same way.

The payload was an infostealer that harvested SSH keys, cloud tokens, Kubernetes credentials, and pipeline environment variables — all while Trivy ran normally and produced the expected output. To an engineer reviewing the logs, the step appeared successful. [SANS Institute]

Exposure windows ranged from 3 to 12 hours depending on the component. Three days later, on March 22, TeamPCP published additional malicious images on Docker Hub — v0.69.5 and v0.69.6 — using separately compromised Docker Hub credentials, extending exposure by another ~10 hours. [Legit Security]

According to SANS Institute, more than 10,000 CI/CD workflows are reported to have been affected.

They didn't exploit code — they exploited trust

When most teams think about an attack, they picture someone finding a buffer overflow, an SQL injection, an RCE. A technical vulnerability in the code that needs to be patched.

TeamPCP did none of that.

They didn't find a bug in Trivy. They didn't break any cryptographic algorithm. They exploited something much harder to patch: the implicit trust that organizations place in their dependencies. The premise that if a tool comes from the official vendor, through the official channel, with the official tag, it's safe.

That premise is what failed.

There's a brutal irony in how the impact was distributed. The most diligent teams — the ones who had integrated Trivy into every PR, every merge, every deploy — were the most exposed. The more disciplined the team was about running their security pipeline, the more times the infostealer ran. The good practice became the attack vector. [SANS Institute]

This is not an argument to stop scanning. It's an argument to understand that trusting an external tool without verifying its integrity is exactly the same as trusting a user without verifying their identity. In IAM, nobody accepts that. In dependencies, we accept it all the time.

The expansion of the attack confirms it. TeamPCP didn't stop at Trivy. On March 23 they compromised Checkmarx KICS — the infrastructure-as-code scanner. On March 24 they reached LiteLLM on PyPI, using credentials stolen from BerriAI's Trivy pipeline. According to SANS Institute, one stolen token propagated across five distinct ecosystems: CI/CD, npm, Docker, PyPI, and AI infrastructure. [Arctic Wolf]

TeamPCP isn't targeting business applications. It's targeting the security tooling ecosystem — exactly the tools that the most security-conscious teams have integrated into their pipelines.

Mutable tags: not the vector, but the multiplier

When the attack was reported, much of the media focus fell on mutable tags. That makes sense — it's the most striking technical detail. But it's worth understanding exactly what role they played, because they weren't the entry vector — they were what turned a single point of access into a massive problem.

To understand the difference, three concepts need to be separated.

The attack vector was the retained credential — the Personal Access Token that survived an incomplete rotation from a prior incident. That credential with write permissions over Aqua Security's repositories is what gave TeamPCP initial access. Without it, nothing that followed would have been possible. [Palo Alto Networks]

The attack surface was the CI/CD pipelines of thousands of organizations referencing Trivy and its GitHub Actions by tag. Every pipeline running uses: aquasecurity/trivy-action@v0.35.0 was an exposed surface — not because it had a vulnerability, but because its trust model depended on the immutability of a tag that wasn't immutable. According to SANS Institute, more than 10,000 CI/CD workflows were part of that surface.

Mutable tags were the multiplier. Once TeamPCP had access, tag mutability allowed them to silently redirect 76 of 77 tags in trivy-action and all 7 tags in setup-trivy to malicious commits — with no visible change in names, dates, or release pages. [Microsoft Security Blog]

Now, the fundamental difference between a tag and a digest.

A tag is a named pointer — v0.35.0, latest, v0.69.4. In Git and container registries, that name can be redirected to any commit or image without the name changing. No notification. No alert. No visible change on the releases page.

A digest is a cryptographic hash of the actual content: sha256:a3f8d2c1... for a container image, or the full commit SHA for a Git repository. It's immutable by definition. If the content changes, the digest changes. It can't be redirected, and forging one would mean breaking the hash function itself. Referencing a dependency by digest anchors it to a specific verified piece of content.

Any pipeline referencing those actions by tag began executing malicious code on its next run. Without any visible change. Without any alert. If those same pipelines had referenced the actions by digest, the force-push would have had no effect. The digest would still point to the original commit. The blast radius would have been dramatically smaller. [Legit Security]
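The same distinction applies to container images. As a minimal sketch, assuming a job that runs inside the Trivy image from GHCR, the two reference styles look like this. The workflow and job names are made up, the exact tag spelling is an assumption, and `<image-digest>` is a placeholder for the real 64-character digest, which this post does not provide.

```yaml
name: tag-vs-digest   # hypothetical workflow name
on: push

jobs:
  scan-by-tag:
    runs-on: ubuntu-latest
    container:
      # Mutable: whoever controls the registry account can repoint this tag.
      image: ghcr.io/aquasecurity/trivy:0.69.3
    steps:
      - run: trivy --version

  scan-by-digest:
    runs-on: ubuntu-latest
    container:
      # Immutable: resolves only to content with this exact hash.
      # <image-digest> is a placeholder, not a real value.
      image: ghcr.io/aquasecurity/trivy@sha256:<image-digest>
    steps:
      - run: trivy --version
```

Both jobs look identical in the logs on a normal day. The difference only shows when the tag is moved: the first job silently follows it, the second does not.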

Supply chain security = least privilege applied to dependencies

There's a principle any engineer working with AWS knows by heart: least privilege. You don't give a role more permissions than it needs. You don't create access keys when you can use temporary roles. You don't leave a * in a Resource when you can specify the exact ARN.

It's the most repeated principle in cloud security. And yet, when it comes to external dependencies, we systematically ignore it.

When a pipeline runs uses: aquasecurity/trivy-action@v0.35.0, it's placing full trust in that tag — without verifying its integrity, without anchoring to specific content, without questioning whether the pointer still points to what it pointed to yesterday. It's the equivalent of giving AdministratorAccess to a role because "it's easier." It works. Until it doesn't.

Supply chain security isn't a separate domain from the security you already practice. It's the same least privilege principle applied to a different level: your dependencies. The question isn't "do I trust this tool?" — it's "what verification do I have that what I'm executing is exactly what I think it is?"

The expansion of the attack illustrates why this matters at scale. TeamPCP didn't need to compromise each organization individually. It compromised a central tool in the ecosystem and let implicit trust do the rest. LiteLLM was compromised because its pipeline used Trivy: the infostealer stole the PyPI publishing token, and from there TeamPCP published malicious versions of a package with 3.6 million daily downloads. [SANS Institute]

One token. Five compromised ecosystems. That's the geometry of a well-executed supply chain attack.

What makes this case especially uncomfortable is that the victims weren't careless teams. They were teams that had invested in security, that scanned their pipelines, that used cloud-native tooling. Diligence didn't protect them — because the diligence was applied inside the trust perimeter, not at the perimeter itself.

The solution is not to stop using external tools

The wrong conclusion from this incident would be to stop using Trivy, to distrust all open-source tools, or to build everything in-house. That's neither viable nor the right lesson.

The lesson is that trust in external dependencies needs to be explicit and verifiable — not implicit and assumed.

Digest pinning

The most concrete and immediate change is to reference GitHub Actions by full commit SHA, the Git equivalent of a digest, instead of by tag.

Instead of this:

- uses: aquasecurity/trivy-action@v0.35.0

This:

- uses: aquasecurity/trivy-action@57a97c7... # v0.35.0 (use the full 40-character commit SHA)

The commit SHA anchors the pipeline to a specific verified piece of content. If someone force-pushes the tag, the pipeline keeps executing the original commit. Tag mutability stops being a problem because the pipeline doesn't depend on the tag. [Legit Security]
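Put together, a scan job pinned this way might look like the sketch below. It is illustrative, not a recommended configuration: the workflow and job names are made up, and `<checkout-commit-sha>` and `<trivy-action-commit-sha>` are placeholders for full 40-character commit SHAs that you would resolve and verify yourself against the upstream repositories.

```yaml
name: security-scan   # hypothetical workflow name
on:
  pull_request:

permissions:
  contents: read      # the scan only needs to read the repository

jobs:
  trivy:
    runs-on: ubuntu-latest
    steps:
      # Placeholder SHAs; the trailing comment records the human-readable version.
      - uses: actions/checkout@<checkout-commit-sha> # v4
      - uses: aquasecurity/trivy-action@<trivy-action-commit-sha> # v0.35.0
        with:
          scan-type: fs             # scan the checked-out filesystem
          scan-ref: .
          severity: CRITICAL,HIGH
          exit-code: "1"            # fail the job when findings match
```

Keeping the version in a trailing comment is a convention, not a requirement. It makes the pin readable in review and gives update tools something to rewrite alongside the SHA.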

The obvious objection is that this makes updates harder — and it's valid. A fixed digest doesn't update itself. That's where the second part comes in.

Dependabot and Renovate

Dependabot and Renovate can manage GitHub Actions updates automatically, including references pinned to a commit SHA. When a new version is released, they open a PR with the updated SHA. The team reviews, approves, and the pipeline updates in a controlled and auditable way.

The combination closes the loop: digest pinning eliminates exposure to mutable tags, and Dependabot/Renovate eliminates the friction of maintaining digests manually.
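For Dependabot, a minimal configuration for this could look like the following. The weekly interval is an arbitrary choice for the sketch, not a recommendation from any of the sources cited here.

```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "github-actions"
    directory: "/"          # Dependabot looks under .github/workflows from here
    schedule:
      interval: "weekly"    # assumption: pick a cadence your team can actually review
```

The review step is the part that matters. An update PR that gets merged without anyone looking at what changed brings back the implicit trust the pin was meant to remove.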

Confirmed safe versions

For those using Trivy at the time of the incident, confirmed safe versions are: Trivy binary v0.69.3 or earlier, trivy-action v0.35.0 at commit 57a97c7, setup-trivy v0.2.6 at commit 3fb12ec. Any reference to v0.70.0 in logs should be treated as suspicious — the attacker attempted to publish that version but was stopped before the tag was pushed. [Legit Security]

Secrets: less is more

One thing this incident makes clear is the problem of pipelines that inherit more secrets than they need. If a scanning job doesn't need production credentials, it shouldn't have them — regardless of whether they're temporary or static. Applying least privilege to what secrets get passed to each job reduces the blast radius when a tool is compromised.
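One way to sketch that separation in a workflow is to start from zero permissions and let only the job that needs a secret reference it. Everything named here is hypothetical: the workflow, the `production` environment, the `deploy.sh` script, and the `DEPLOY_TOKEN` secret. The commit SHAs are placeholders.

```yaml
name: build-and-deploy   # hypothetical workflow name
on:
  push:
    branches: [main]

permissions: {}          # start from nothing; each job opts in

jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read     # enough to check out and scan the code
    steps:
      - uses: actions/checkout@<checkout-commit-sha> # v4
      - uses: aquasecurity/trivy-action@<trivy-action-commit-sha> # v0.35.0
        with:
          scan-type: fs
          scan-ref: .
      # This job has no env block and references no secrets.

  deploy:
    needs: scan
    runs-on: ubuntu-latest
    environment: production          # hypothetical environment
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@<checkout-commit-sha> # v4
      - run: ./deploy.sh             # hypothetical script
        env:
          # Hypothetical secret, passed only to the step that needs it.
          DEPLOY_TOKEN: ${{ secrets.DEPLOY_TOKEN }}
```

If the scanner in the first job is compromised, the deploy token was never part of its environment.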

Immediately rotate any credential that was exposed to a pipeline that ran compromised versions of Trivy between March 19 and March 22. That includes GitHub tokens, cloud credentials, registry tokens, SSH keys, and database passwords. [Legit Security]

Closing — what happened in LATAM

All of this reflection could remain theoretical if it weren't for someone in our community who lived this in real time.

Alejandro Castañeda, AWS Community Builder and Cloud Engineer from Colombia, shared on LinkedIn what happened to his team. His pipeline was compromised. The infostealer ran on a self-hosted runner inside his EKS cluster and sent approximately 80KB to the attacker's server. To put that in perspective: 80KB is enough to carry away all the secrets from a complete pipeline.

And yet, they reviewed CloudTrail top to bottom and the result was zero lateral movement. No API calls from the attacker's IP. No IAM user creation, no policy changes, no resource access.

Why?

Because Alejandro's team had made an architecture decision ahead of time, when they were defining the foundations of their infrastructure: OIDC with IAM Roles instead of static access keys. Their pipelines assumed roles whose temporary credentials expired in minutes. By the time the attacker had the credentials in hand, they were already worthless. And the role's trust policy only allowed it to be assumed through GitHub's OIDC provider, so once those credentials expired the attacker had no way to obtain fresh ones from anywhere else.
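For readers who haven't set this up, the workflow side of that pattern can be sketched as below. This is not a reconstruction of Alejandro's pipeline, which ran on a self-hosted runner. The account ID, role name, and region are invented, and `<commit-sha>` is a placeholder. It also assumes an IAM role already exists whose trust policy accepts GitHub's OIDC provider.

```yaml
name: deploy            # hypothetical workflow name
on:
  push:
    branches: [main]

permissions:
  id-token: write       # lets the job request a GitHub OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@<commit-sha> # pin to a full SHA
        with:
          # Hypothetical role ARN and region.
          role-to-assume: arn:aws:iam::123456789012:role/example-github-deploy
          aws-region: us-east-1
          role-duration-seconds: 900   # 15 minutes, the shortest session STS allows
      - run: aws sts get-caller-identity   # runs with the temporary credentials
```

No long-lived AWS key is stored in GitHub. What an infostealer can take from this job is a set of credentials that stop working within minutes.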

If they had had static access keys stored as GitHub secrets, the story would be completely different. The attacker would have had access to ECR to inject malicious images into production containers, and to Secrets Manager to read database credentials and financial service tokens.

The difference between "our secrets were stolen and nothing happened" and a real disaster was an architecture decision that nobody made thinking about this specific attack. They made it because it was the right way to build.

That's what I take from this incident. Not the list of affected versions — that changes. Not the name of the threat actor — that changes too. What doesn't change is the principle: design so that when someone compromises a dependency, they don't find anything useful on the other side.

Security isn't about building a perfect wall. It's about making sure that when someone jumps it, there's nothing useful on the other side.

About the author

Gerardo Castro is an AWS Security Hero and Cloud Security Engineer focused on LATAM. Founder and Lead Organizer of the AWS Security Users Group LatAm. He believes the best way to learn cloud security is by building real things — not memorizing frameworks. He writes about what he builds, what he finds, and what he learns along the way.

🔗 GitHub: gerardokaztro
🔗 LinkedIn: gerardokaztro