We treat deployment pipelines like automation.
They are not.
They are identity systems.
Every time a pipeline runs, it answers a critical question: Who is allowed to change production? And increasingly, the answer is: the pipeline. Not a human. Not an admin. Not a ticket approval. The pipeline identity.
Not because we chose this architecture deliberately. Because we arrived here through a thousand small decisions that felt like operational improvements.
The Shift We Didn't Fully Acknowledge
Historically, humans logged into production. Engineers ran deployment scripts from jump boxes. Admins approved infrastructure changes through ticketing systems that everyone hated but at least understood. The trust model was explicit: this person, with these credentials, at this terminal, making this change.
Now: a commit merges. A workflow triggers. Automation deploys. Infrastructure updates itself.
Humans design the change. Pipelines execute it.
That seems like a productivity win—and it is. But it fundamentally relocates where authority lives. Pipelines don't just move code from one place to another. They act on behalf of delegated privilege. That is identity, whether we acknowledge it or not.
The thing is, we gutted the old identity model without building a replacement. We removed direct human access to production, celebrated the reduction in operational risk, and then... concentrated all that authority inside automation we barely monitor. The privilege didn't vanish. It migrated.
What Makes Something an Identity System?
An identity system does four things:
Authenticates. Proves who or what is making a request.
Authorizes. Determines what that actor is allowed to do.
Acts with delegated privilege. Exercises permissions beyond its own inherent capabilities.
Establishes trust boundaries. Creates a perimeter of assumed safety around certain operations.
Modern deployment pipelines do all four.
When a pipeline deploys, it authenticates to a cloud provider—usually by assuming a role through workload identity federation or presenting long-lived credentials someone uploaded six months ago and forgot about. It authorizes itself to apply infrastructure changes, publish artifacts to registries, modify runtime configuration, rotate secrets, update DNS records. It acts with delegated privilege that often exceeds what any individual engineer possesses. And it establishes trust boundaries: "If this workflow ran, the change is approved."
It is not "running a script." It is asserting identity and exercising authority across production systems.
The problem is we designed these systems like they were utilities—background processes that make deployments faster. We didn't design them like we were creating synthetic superusers with cross-environment reach and the ability to reconfigure foundational infrastructure.
The Delegated Authority Problem
Here's the architectural blind spot: pipelines often hold broader authority than any individual engineer.
An engineer cannot directly apply Terraform in production. Policy forbids it. But the pipeline can—because it needs to, and we trust that only approved code reaches the pipeline. An engineer cannot directly push images to the production container registry. But the pipeline can. An engineer cannot modify IAM policies or security groups or KMS key permissions. But the pipeline can, because infrastructure-as-code workflows require those capabilities.
We removed human privilege. Then concentrated it inside automation. That is not inherently wrong—in fact, it's probably necessary for operating at scale. But it must be explicitly modeled as an identity architecture decision, not treated as a deployment convenience.
I've seen production environments where the CI service account has broader permissions than the entire engineering team combined. Not because anyone intended that. Because permission grants accreted over time. A developer needed to add a CloudFront distribution, so the pipeline got cloudfront:*. Someone else needed to configure an RDS instance, so it got rds:*. Six months later, the pipeline can provision anything, and no one remembers why or questions whether it should.
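Accreted wildcards are easy to surface mechanically. Here is a minimal pure-Python sketch that scans an IAM policy document of the kind you could fetch with boto3; the policy contents and function name are illustrative, not a specific tool:

```python
def find_wildcard_actions(policy_doc: dict) -> list[str]:
    """Return every allowed action containing a wildcard in an IAM policy document."""
    flagged = []
    for stmt in policy_doc.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # IAM allows a bare string or a list here
            actions = [actions]
        flagged.extend(a for a in actions if "*" in a)
    return flagged

# Illustrative policy showing the accretion pattern described above.
ci_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["cloudfront:*", "rds:*"], "Resource": "*"},
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::build-artifacts/*"},
    ],
}

print(find_wildcard_actions(ci_policy))  # ['cloudfront:*', 'rds:*']
```

Run something like this against your CI role's attached policies and you get a concrete list of the grants no one remembers making.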
Pipelines as Synthetic Superusers
In many systems, the deployment pipeline becomes a synthetic superuser. It has cross-environment access—promoting artifacts from staging to production. It can rotate credentials in secret managers. It can provision infrastructure in multiple AWS accounts or GCP projects. It can roll back production to previous states. It can modify ingress controllers, update certificate authorities, reconfigure observability backends.
That means compromising the pipeline identity is equivalent to compromising a production admin account. Except the pipeline is easier to reach, because it executes code from repositories.
Think about the attack surface: a malicious npm dependency executes during a build step. It extracts the GITHUB_TOKEN, GitLab's CI_JOB_TOKEN, or whatever credential the CI runner uses. That token allows the attacker to trigger workflows, modify environment variables, or—depending on configuration—assume roles directly into production accounts. No password cracking required. No human compromise required. Just delegated authority inherited from automation.
The pipeline identity becomes the attack path. And we've architected it to be maximally reachable: it runs arbitrary code on every pull request, every commit, every merge. We designed a superuser that executes untrusted inputs.
Identity Without Visibility
We typically monitor human logins. SSO events go to a SIEM. Privilege escalations trigger alerts. MFA challenges get logged and audited. That's identity hygiene we learned over decades.
But do we monitor pipeline role assumptions? Deployment identity behavior changes? Workflow-level permission shifts? Unexpected cross-account actions from CI service principals?
Often, no. Pipeline identities operate with minimal scrutiny. They are considered trusted automation—part of the infrastructure, not part of the threat model. But trust without monitoring is assumption. And assumptions age poorly.
I've investigated incidents where pipelines were compromised for weeks before anyone noticed. Why? Because no one was watching. The SIEM had rules for unusual human logins, failed SSH attempts, privilege escalation via sudo. It had nothing for "CI runner assumed production role at 3 AM on a Sunday and modified fourteen security groups."
When you ask security teams what identities exist in their environment, they'll list users, service accounts, maybe API keys. They rarely list deployment workflows as first-class identities, even though those workflows hold more authority than most of the humans.
The "Just Automation" Fallacy
Calling pipelines "just automation" hides risk.
Automation does not eliminate identity. It concentrates it. Every deployment workflow has a defined trust boundary, a set of permissions, a scope of authority, a blast radius. That is identity architecture, whether you design it intentionally or inherit it accidentally.
The "just automation" framing also obscures accountability. If a human misconfigures infrastructure, we know who to talk to. If automation misconfigures infrastructure... who owns that? The developer who wrote the Terraform? The platform team that maintains the pipeline? The security team that approved the permissions? The answer is usually unclear, which means the risk goes unowned.
I've seen organizations where deployment pipelines have the highest privilege in the entire system, and no one has explicit responsibility for auditing or governing that privilege. It's just... there. A dependency. An assumption. Something that has to work, so it has broad permissions, and everyone hopes it's configured correctly.
When Pipelines Become Lateral Movement Vectors
Imagine this scenario—because I've seen variations of it multiple times:
A developer adds a new package to a project. It's a popular library, thousands of downloads, looks legitimate. But it was compromised two weeks ago. The malicious code executes during CI—perhaps in a postinstall script, perhaps in a build step. It extracts a deployment token from environment variables. That token allows role assumption into production.
The attacker doesn't immediately deploy malicious infrastructure. That would trigger alerts. Instead, they modify a single IAM policy, granting themselves persistent access through a separate backdoor. Then they wait. When they're ready, they deploy modified infrastructure—maybe a Lambda function that exfiltrates data, maybe a container that mines cryptocurrency, maybe just a persistence mechanism for later use.
No password cracking required. No phishing campaign. No human compromise. Just delegated authority inherited from automation, exploited through supply chain insertion.
The pipeline identity becomes the pivot point. And because pipelines have broad cross-environment access, a compromise in one workflow can cascade into multiple production systems.
Modeling Pipelines as First-Class Identities
If pipelines are identity systems, they require identity design principles.
1. Least Privilege Per Workflow
A test workflow should not deploy infrastructure. It should not publish production images. It should not modify IAM roles or security groups or DNS records.
Segment permissions by purpose. A workflow that runs unit tests needs to read code and write test results. That's it. A workflow that deploys to staging needs permissions scoped to the staging environment—not production, not development, certainly not cross-account access. A workflow that publishes container images needs write access to a specific registry namespace, not ecr:* across all regions.
This is harder than it sounds, because modern pipelines often reuse the same runner identity across multiple workflows. You end up with a single service account that has the union of all permissions any workflow might need. That's convenient. It's also an identity design failure.
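The union problem can be made concrete. A small sketch, with hypothetical workflow names and AWS-style action strings, showing how one shared runner identity accumulates the union of every workflow's needs:

```python
# Hypothetical per-workflow permission needs; action names are AWS-style.
WORKFLOW_NEEDS = {
    "unit-tests":     {"s3:GetObject"},
    "staging-deploy": {"ecs:UpdateService", "s3:PutObject"},
    "infra-apply":    {"iam:PassRole", "ec2:*", "rds:*"},
}

# A single shared runner identity ends up holding the union of everything:
# the test suite inherits infrastructure-apply rights it never uses.
shared_identity = set().union(*WORKFLOW_NEEDS.values())
print(sorted(shared_identity))

# Per-workflow identities keep each blast radius to its own row.
for workflow, needs in WORKFLOW_NEEDS.items():
    print(workflow, "->", sorted(needs))
```

The point of the exercise: compromising the unit-test workflow under the shared model yields ec2:* and rds:*; under the segmented model it yields s3:GetObject.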
The alternative: workload identity federation with dynamic permission grants based on workflow context. GitHub Actions supports OIDC-based role assumption where permissions can be scoped to specific repositories, branches, or even workflow files. GitLab CI can use JWT-based identity with claims that map to granular IAM policies. These aren't perfect—the configuration is finicky, and the documentation assumes you already understand federated identity—but they allow per-workflow privilege scoping.
I've seen teams reduce pipeline blast radius by 80% just by segmenting workflow permissions. The test suite no longer has production deploy rights. The dependency update bot can't modify infrastructure. The documentation build can't access secret managers.
It requires upfront design. But the alternative is a single compromise granting access to everything.
2. Short-Lived, Context-Aware Credentials
Pipeline identity should be issued per run, expire automatically, and be bound to repository context.
Permanent access keys are identity debt. They sit in CI environment variables, maybe encrypted, maybe not. They get rotated... eventually. Maybe every 90 days if you have a good compliance program. Maybe never if you're honest.
Every permanent credential is a persistent attack surface. If an attacker extracts it, they have access until someone manually revokes it. That could be hours. Could be months. I've seen access keys in CI systems that were created three years ago and have never been rotated because no one wants to risk breaking the deployment process.
Short-lived credentials issued through workload identity mean an attacker has to maintain access to the CI system itself, not just steal a static token. That's a higher bar. It also means credentials automatically expire—usually within an hour—which limits the window for abuse.
Context-aware credentials go further: they embed claims about the repository, branch, workflow file, even the specific commit SHA. You can write IAM policies that say "this role can only be assumed by workflows running in the main branch of org/repo" or "this role can only deploy infrastructure if triggered by a tag matching v*."
That limits both accidental misconfiguration and deliberate abuse. A developer can't just fork the repository and run the production deployment workflow from their fork. The identity system checks the context and denies it.
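As a sketch of how that context check behaves: Python's fnmatch approximates the StringLike matching an AWS trust policy applies to the GitHub OIDC token's sub claim. The org/repo names and patterns here are hypothetical:

```python
from fnmatch import fnmatch

# Hypothetical trust-policy conditions, mirroring a StringLike match on the
# GitHub OIDC token's "sub" claim. Repo and org names are illustrative.
ALLOWED_SUB_PATTERNS = [
    "repo:example-org/app:ref:refs/heads/main",  # only main-branch workflows
    "repo:example-org/app:ref:refs/tags/v*",     # or release tags
]

def may_assume_role(sub_claim: str) -> bool:
    """Approximate the condition check an IAM trust policy would apply."""
    return any(fnmatch(sub_claim, pattern) for pattern in ALLOWED_SUB_PATTERNS)

print(may_assume_role("repo:example-org/app:ref:refs/heads/main"))    # True
print(may_assume_role("repo:attacker-fork/app:ref:refs/heads/main"))  # False
print(may_assume_role("repo:example-org/app:ref:refs/tags/v1.4.2"))   # True
```

The fork in the second call fails not because of any secret it lacks, but because its identity claim does not match the trusted context.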
3. Environment-Level Isolation
Development, staging, and production pipelines should not share identity roles.
Cross-environment authority increases blast radius catastrophically. If a single pipeline identity can deploy to both staging and production, then compromising staging—often less protected, sometimes running outdated dependencies, occasionally accessible by contractors—grants production access.
I've investigated breaches where the entry point was a compromised staging environment, but the actual damage occurred in production because the deployment pipeline had cross-environment permissions. The attacker didn't need to pivot. The pipeline did the pivoting for them.
Isolate identity by environment. Production deployment workflows assume roles that can only operate in production. Staging workflows assume separate roles with zero production access. Development workflows—if they even have deployment capabilities—operate in entirely separate accounts or projects.
This creates operational friction. You can't easily promote an artifact from staging to production with a single workflow. You need separate workflows, separate identities, explicit promotion gates. That friction is the point. It forces intentionality.
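A promotion gate can enforce the boundary in code as well as in IAM. A minimal sketch, with made-up account IDs and role names, that refuses any cross-environment role assumption:

```python
# Illustrative mapping of environment to the only role its workflows may
# assume; account IDs and role names are made up.
ENV_ROLES = {
    "staging":    "arn:aws:iam::111111111111:role/staging-deploy",
    "production": "arn:aws:iam::222222222222:role/prod-deploy",
}

def role_for(environment: str, requested_role: str) -> str:
    """Refuse any role assumption that crosses an environment boundary."""
    allowed = ENV_ROLES.get(environment)
    if requested_role != allowed:
        raise PermissionError(
            f"{environment} workflows may only assume {allowed}, "
            f"not {requested_role}"
        )
    return requested_role

# A staging workflow asking for the production role fails loudly.
print(role_for("staging", "arn:aws:iam::111111111111:role/staging-deploy"))
```

This is belt-and-suspenders: the real enforcement lives in the trust policies, but a gate like this makes the intended boundary explicit and testable in the pipeline code itself.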
4. Observability for Automation Identity
Log and alert on unusual role assumptions, off-hours deployments, unexpected permission escalations, drift in workflow privilege definitions.
Automation must be observable like any other identity. That means:
- CloudTrail or equivalent audit logs that capture every API call made by pipeline identities, tagged with workflow context.
- Alerting on anomalies: if a deployment workflow assumes a role it's never assumed before, that's worth investigating. If deployments happen at 2 AM on a Saturday when no human is working, that's worth investigating.
- Drift detection: if the permissions granted to a pipeline identity change—someone adds s3:* when it only had s3:GetObject before—that should trigger review.
- Behavioral baselines: establish what "normal" looks like for pipeline activity, then alert on deviations. Most pipelines operate on predictable schedules. A sudden spike in deployment frequency or cross-region API calls deserves scrutiny.
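One of these checks, flagging off-hours role assumptions, can be sketched over CloudTrail-shaped events. The field names follow CloudTrail's record format; the working-hours policy itself is an illustrative assumption, not a recommendation:

```python
from datetime import datetime

def is_suspicious_assume(event: dict) -> bool:
    """Flag AssumeRole events outside assumed working hours
    (Mon-Fri, 07:00-19:00 UTC). The threshold is illustrative."""
    if event.get("eventName") != "AssumeRole":
        return False
    when = datetime.fromisoformat(event["eventTime"])
    weekend = when.weekday() >= 5          # Saturday=5, Sunday=6
    off_hours = weekend or not (7 <= when.hour < 19)
    return off_hours

evt = {
    "eventName": "AssumeRole",
    "eventTime": "2024-03-03T03:12:00+00:00",  # a Sunday, 3 AM UTC
    "userIdentity": {"arn": "arn:aws:iam::222222222222:role/ci-runner"},
}
print(is_suspicious_assume(evt))  # True
```

In practice you would feed this from your log pipeline and compare against each workflow's own baseline rather than a fixed clock, but the shape of the check is the same.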
The challenge: most logging systems aren't configured to treat pipeline identities as suspicious by default. They're infrastructure. They're trusted. They generate enormous log volume, so alerts get tuned to ignore them to reduce noise.
You have to intentionally model them as high-privilege actors whose behavior should be scrutinized, not assumed safe.
5. Explicit Blast Radius Mapping
Ask: if this deployment workflow is compromised, what can it change?
If the answer is "everything," your identity model is underdefined.
Map the blast radius explicitly. Document what each pipeline identity can access, what it can modify, what it can delete. Include indirect access—if the pipeline can modify IAM policies, it can grant itself additional permissions, which means its blast radius includes anything those policies could grant.
I've done this exercise with teams, and it's usually revelatory. They discover that a documentation deployment workflow has s3:* permissions because someone needed to upload to a specific bucket and just granted everything. Or that the infrastructure provisioning workflow can assume roles in accounts the team didn't even know existed.
Once you map the blast radius, you can start reducing it. Scope permissions to specific resources. Remove unnecessary cross-account access. Segment workflows so that each one has the minimum authority required for its specific purpose.
This takes time. It requires understanding the actual operations each workflow performs, not just guessing. But it converts implicit risk into explicit design decisions.
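The indirect-access point is worth encoding directly. A sketch, using a deliberately partial and illustrative list of IAM self-escalation actions, that treats any identity able to edit IAM as having an unbounded blast radius:

```python
# Actions that let an identity expand its own permissions. This list is
# partial and illustrative, not an authoritative catalogue.
ESCALATION_ACTIONS = {
    "iam:PutRolePolicy", "iam:AttachRolePolicy",
    "iam:CreatePolicyVersion", "iam:UpdateAssumeRolePolicy",
}

def effective_blast_radius(granted: set[str]) -> str:
    """If a pipeline can edit IAM, its real blast radius is whatever IAM can grant."""
    if granted & ESCALATION_ACTIONS or "iam:*" in granted or "*" in granted:
        return "unbounded (can grant itself more)"
    return f"{len(granted)} explicitly granted actions"

print(effective_blast_radius({"s3:PutObject", "iam:PutRolePolicy"}))
print(effective_blast_radius({"s3:PutObject", "ecs:UpdateService"}))
```

The first identity looks like it has two permissions; in effect it has all of them, because one of the two rewrites policy.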
GitOps Complicates the Picture
GitOps models push even more authority into pipelines. Merged code becomes declarative truth. Controllers reconcile desired state continuously. Infrastructure updates automatically based on repository contents.
This increases safety in some ways—every change is auditable, versioned, reviewable. But it also means the identity that reconciles state becomes deeply privileged. Whether that identity lives in a CI platform or a cluster controller like ArgoCD or Flux, it is a powerful delegated actor.
GitOps controllers typically need:
- Read access to Git repositories to fetch desired state.
- Broad permissions in the target environment to apply changes.
- The ability to create, modify, and delete resources continuously.
If you compromise the GitOps controller—or the repository it watches—you control the entire declarative infrastructure. The controller will helpfully reconcile whatever malicious configuration you commit, because that's its job.
I've seen organizations treat their GitOps repositories with less security than they treat production credentials, because "it's just configuration files." But those configuration files define production state, and the controller that applies them has the authority to reconfigure everything.
GitOps doesn't eliminate the pipeline identity problem. It relocates it into a continuously running reconciliation loop.
The Architectural Reframe
Security discussions often focus on users, services, APIs. Pipelines sit in the middle, serving multiple roles:
- Users of cloud APIs: they authenticate, assume roles, invoke operations.
- Issuers of deployments: they publish artifacts, trigger updates, propagate changes.
- Publishers of artifacts: they write to registries, storage buckets, CDN origins.
- Enforcers of infrastructure state: they apply Terraform, Helm charts, Kubernetes manifests.
They are not background tooling. They are privileged actors with multi-faceted authority.
If you don't model them as such, your system has an undocumented superuser. One that runs arbitrary code. One that often has broader permissions than your actual administrators. One that's reachable through supply chain attacks, insider threats, or misconfigured repository permissions.
The reframe: every deployment workflow is an identity. It needs an identity profile, a privilege scope, monitoring, incident response procedures. When you grant permissions to a pipeline, you're granting them to every developer who can modify the code that pipeline runs, every dependency that code imports, every CI plugin that workflow uses.
That's a much broader trust boundary than most teams acknowledge.
Closing Thought
In modern infrastructure, humans design change. Pipelines enact it.
That makes pipelines identity systems. And identity systems require explicit privilege design, continuous monitoring, scoped authority, architectural ownership.
If your deployment workflow can reconfigure production, it is not "just CI." It is a privileged identity. Treat it like one.
That means: audit its permissions like you'd audit a superuser account. Monitor its behavior like you'd monitor administrative access. Scope its authority like you'd scope a service principal. Respond to anomalies like you'd respond to suspicious login attempts.
And when something breaks—because in complex systems, things always eventually break—you'll understand exactly what authority was exercised, by which identity, under what context. You won't be searching through logs trying to figure out how your infrastructure got reconfigured without any human touching it.
You'll know. Because you modeled the pipeline as what it actually is: a synthetic superuser with delegated authority to change production systems.