DEV Community

Matt

Posted on • Originally published at fortem.dev

How Do You Set Up RBAC on ECS Fargate Without Breaking Prod?

ECS Fargate RBAC: Scope Developer Access Safely

Originally published at https://fortem.dev/blog/ecs-fargate-rbac
IAM has no concept of an ECS environment. Build per-environment RBAC with ABAC tags — the working policy, the four ways it silently breaks prod, and where AWS-native IAM hits its ceiling.


Use Case · June 30, 2026 · 10 min read

ecs-rbac · ecs-iam-developer-access · restrict-ecs-access-by-environment

You're the single human gate for ECS ops: developers ship through CI, but they can't restart staging or read a log without pinging you. You want to hand them scoped access — their environments, never prod — and every attempt is an IAM policy that grants too much or too little. This is the working RBAC model: an ABAC policy that scopes developers by environment tag, the four ways AWS-native IAM silently breaks prod, and where the ceiling is.

TL;DR

  • ECS has no "environment" concept in IAM. You build per-environment RBAC with tags (ABAC): a developer's principal tag must match the resource's Environment tag.
  • The working policy gates ecs:UpdateService / StopTask / DeleteService on aws:ResourceTag/Environment, and CreateService / RunTask on aws:RequestTag/Environment.
  • The trap: some ECS List actions ignore tag conditions entirely. Granted open to "*" they leak prod metadata — and PassRole lets a developer escalate past their environment.
  • On EC2 launch type, ECScape shows a low-privilege task can steal another task's credentials (instance-level isolation). Fargate's micro-VM isolation closes this — a real reason to be on Fargate.
  • AWS-native ABAC holds to about ten environments, then tag discipline and policy sprawl become the bottleneck. That's where a per-environment RBAC layer earns its keep.

ECS has no idea what an "environment" is

IAM has no native concept of an ECS environment. You simulate it: tag every resource with an Environment key, tag each developer's principal, and write policies that only allow an action when the two tags match.

IAM thinks in ARNs and tags, not environments. There is no ecs:Environment resource type or condition key through which you could grant a developer "staging" access. So you build the abstraction yourself with attribute-based access control (ABAC): give every cluster, service, and task an Environment tag, give each developer's IAM role a matching principal tag, and condition every policy on the two being equal.

The alternative, listing cluster ARNs in the Resource block, works for three clusters and collapses at thirty. The simplest version, a policy matched to a *-stg-* cluster name pattern that lets a developer restart staging, gets you started. ABAC is where you move once you have real environments and real developers, and it is what stays safe past the first handful.

Ready to use: the ABAC policy that scopes by environment

Gate ecs:UpdateService, StopTask, and DeleteService on aws:ResourceTag/Environment matching the developer's principal tag; gate RunTask and CreateService on aws:RequestTag/Environment. Add an explicit Deny on prod.

Tag the developer's IAM role with Environment=staging, tag every staging resource the same, and this policy lets them operate staging and nothing else. The ${aws:PrincipalTag/Environment} variable means one policy serves every environment — the match is dynamic.

Ready to use — copy this today

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "OperateOwnedEnvironment",
      "Effect": "Allow",
      "Action": [
        "ecs:UpdateService",
        "ecs:StopTask",
        "ecs:DeleteService"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:ResourceTag/Environment": "${aws:PrincipalTag/Environment}"
        }
      }
    },
    {
      "Sid": "CreateTaggedToOwnedEnvironment",
      "Effect": "Allow",
      "Action": [
        "ecs:CreateService",
        "ecs:RunTask"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:RequestTag/Environment": "${aws:PrincipalTag/Environment}"
        }
      }
    },
    {
      "Sid": "PassOnlyOwnedEnvironmentRoles",
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::*:role/ecs/${aws:PrincipalTag/Environment}-*"
    },
    {
      "Sid": "NeverProd",
      "Effect": "Deny",
      "Action": "ecs:*",
      "Resource": "*",
      "Condition": {
        "StringEquals": { "aws:ResourceTag/Environment": "prod" }
      }
    }
  ]
}

Why the explicit Deny

The NeverProd statement is a backstop, not the primary control. The Allow statements already scope to the developer's own environment, but an explicit Deny on the prod tag overrides any Allow in any policy attached to the same principal, including a future policy someone attaches by mistake. Defense in depth costs one short statement.
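To see why one policy can serve every environment, and why the Deny still wins when a broader Allow shows up later, the matching logic can be approximated locally. The sketch below is a toy evaluator, not IAM: it models only StringEquals, a trailing-* action wildcard, and explicit-Deny precedence. The function names and the BROAD_MISTAKE statement are hypothetical and exist only for illustration.

```python
# Toy model of the tag-matching logic only. Real IAM evaluation also involves
# SCPs, permission boundaries, resource policies, and more operators.

POLICY = [
    {   # mirrors OperateOwnedEnvironment
        "Effect": "Allow",
        "Action": ["ecs:UpdateService", "ecs:StopTask", "ecs:DeleteService"],
        "Condition": {"StringEquals": {
            "aws:ResourceTag/Environment": "${aws:PrincipalTag/Environment}"}},
    },
    {   # mirrors NeverProd
        "Effect": "Deny",
        "Action": ["ecs:*"],
        "Condition": {"StringEquals": {"aws:ResourceTag/Environment": "prod"}},
    },
]

# Hypothetical broad grant that someone attaches later by mistake.
BROAD_MISTAKE = {"Effect": "Allow", "Action": ["ecs:*"]}


def action_matches(patterns, action):
    """Exact match, or a prefix match for patterns ending in '*'."""
    return any(
        p == action or (p.endswith("*") and action.startswith(p[:-1]))
        for p in patterns
    )


def condition_holds(condition, context):
    """Evaluate StringEquals; a missing key on either side fails closed."""
    for key, expected in condition.get("StringEquals", {}).items():
        if expected.startswith("${") and expected.endswith("}"):
            expected = context.get(expected[2:-1])  # resolve the policy variable
        actual = context.get(key)
        if actual is None or expected is None or actual != expected:
            return False
    return True


def is_allowed(statements, action, context):
    """Any matching Deny wins; otherwise at least one Allow must match."""
    allowed = False
    for stmt in statements:
        if not action_matches(stmt["Action"], action):
            continue
        if not condition_holds(stmt.get("Condition", {}), context):
            continue
        if stmt["Effect"] == "Deny":
            return False
        allowed = True
    return allowed


def request(principal_env, resource_env):
    """Build a request context for a tagged principal and a tagged resource."""
    return {
        "aws:PrincipalTag/Environment": principal_env,
        "aws:ResourceTag/Environment": resource_env,
    }


print(is_allowed(POLICY, "ecs:UpdateService", request("staging", "staging")))
print(is_allowed(POLICY, "ecs:UpdateService", request("staging", "prod")))
print(is_allowed(POLICY + [BROAD_MISTAKE], "ecs:UpdateService", request("staging", "prod")))
```

In this model the first call is allowed and the other two are refused: the second because no Allow matches, the third because the Deny matches even though the broad Allow does too. The backstop only covers resources tagged prod. With BROAD_MISTAKE attached, the same staging developer would reach a resource tagged with any other environment.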

Which ECS actions actually respect your tags

Not every ECS action honors tag conditions. StopTask, DeleteService, and UpdateService respect aws:ResourceTag; CreateService and RunTask respect both. But List actions like ListClusters and ListServices ignore tags entirely — scope those another way or they leak prod.

This is the table that decides whether your policy is airtight or quietly open. If you gate an action on aws:ResourceTag and the action doesn't populate that key, the StringEquals condition evaluates to false and the statement simply doesn't grant: it fails closed. The leak runs in the other direction. To make the tag-blind calls (the List actions, plus any Describe action that lacks tag support) usable at all, you end up granting them in a separate statement with no condition, and that open grant is what reaches every environment:

| Action | Resource tag | Request tag | Scope it by |
|---|---|---|---|
| ecs:StopTask | yes | n/a | scope by resource tag |
| ecs:UpdateService | yes | n/a | scope by resource tag |
| ecs:DeleteService | yes | n/a | scope by resource tag |
| ecs:CreateService | yes | yes | tag at create time |
| ecs:RunTask | yes | yes | tag at create time |
| ecs:DescribeServices | yes | n/a | scope by resource tag |
| ecs:ListClusters | no | n/a | no tag conditions; leaks prod metadata |
| ecs:ListServices | no | n/a | no tag conditions; scope another way |

For the actions that ignore tags (the account-level List calls), your only levers are separate AWS accounts or not granting them. A read-only ecs:ListClusters open to * won't let a developer change prod, but it shows them every environment's name and shape, metadata a strict RBAC model is supposed to withhold.
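One way to find those open grants is to scan a policy document for Allow statements on ECS actions that have Resource "*" and no Condition. The following is a minimal sketch under that assumption. The function name and the example policy (including the BrowseEverything Sid) are hypothetical, and the check ignores NotAction, wildcard actions such as "*", and policies attached elsewhere.

```python
def unconditioned_ecs_grants(policy):
    """Return (Sid, action) pairs for ECS actions allowed on "*" with no Condition."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):      # a single statement may be a bare object
        statements = [statements]

    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow" or stmt.get("Condition"):
            continue
        resources = stmt.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        if "*" not in resources:
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            # IAM action names are case-insensitive, so compare in lowercase.
            if action.lower().startswith("ecs:"):
                findings.append((stmt.get("Sid", "<no Sid>"), action))
    return findings


# Hypothetical policy: one tag-scoped statement and one open read-only grant.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "OperateOwnedEnvironment",
            "Effect": "Allow",
            "Action": ["ecs:UpdateService"],
            "Resource": "*",
            "Condition": {"StringEquals": {
                "aws:ResourceTag/Environment": "${aws:PrincipalTag/Environment}"}},
        },
        {
            "Sid": "BrowseEverything",
            "Effect": "Allow",
            "Action": ["ecs:ListClusters", "ecs:ListServices"],
            "Resource": "*",
        },
    ],
}

for sid, action in unconditioned_ecs_grants(EXAMPLE_POLICY):
    print(f"{sid}: {action} is granted on * with no condition")
```

Each finding is a candidate to compare against the support table: either the action cannot be tag-scoped and the exposure is accepted, or the grant should be removed.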

The four ways this breaks prod

RBAC breaks prod four ways: actions that ignore tags, PassRole escalation, untagged resources defaulting open, and — on EC2 — ECScape credential theft. Each is a specific misconfiguration, not bad luck.

  1. *Actions that ignore tags.* List calls, and any Describe call without tag support, can't be scoped by an Environment tag. Your tag condition fails closed, so to make them usable you grant them with no condition, and that open statement reaches every environment's metadata. Audit which actions you've granted unconditioned against the support table above.
  2. *PassRole escalation.* A developer who can RegisterTaskDefinition and iam:PassRole a powerful role can launch a task that runs as that role and reach anything that role can. Scope iam:PassRole to only the task and execution roles for their environment, never a wildcard. This is the non-obvious one, and the most dangerous.
  3. *Untagged resources slip the Deny.* A condition on aws:ResourceTag/Environment only matches resources that have the tag, so an untagged cluster matches neither your scoped Allow nor your NeverProd Deny. It won't be reachable through the ABAC policy, but if any broader Allow exists, an untagged prod resource is no longer blocked by the Deny that was supposed to catch it. Enforce tagging at creation with a RequestTag condition or an SCP.
  4. *ECScape on EC2 launch type.* On EC2, a low-privilege task can steal the IAM credentials of a more privileged task on the same instance. Your task-level RBAC is real on paper and bypassed in practice. Fargate closes this gap, as the next section explains.

None of these throw an error when you deploy the policy. They're visible only when you test — or when the audit happens. The audit trail that proves who did what in CloudTrail is how you catch the escalation after the fact; the policy is how you prevent it.
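The untagged-resource gap is easy to miss because both conditions look protective. The sketch below models the two condition operators involved, StringEquals and Null, to show which one catches a resource with no Environment tag. The helper names and the DenyUntagged statement are assumptions added for illustration and are not part of the policy above.

```python
TAG_KEY = "aws:ResourceTag/Environment"


def string_equals(context, key, expected):
    """StringEquals is false when the key is absent from the request context."""
    return key in context and context[key] == expected


def null_check(context, key, must_be_absent):
    """Null "true" matches when the key is absent; Null "false" when it is present."""
    return (key not in context) == must_be_absent


# Hypothetical extra statement: deny mutating calls on resources that carry no
# Environment tag at all. The action list is kept narrow on purpose, because
# the key is also absent for tag-blind actions such as ecs:ListClusters.
DENY_UNTAGGED = {
    "Sid": "DenyUntagged",
    "Effect": "Deny",
    "Action": ["ecs:UpdateService", "ecs:StopTask", "ecs:DeleteService"],
    "Resource": "*",
    "Condition": {"Null": {TAG_KEY: "true"}},
}

tagged_prod = {TAG_KEY: "prod"}
untagged = {}  # a prod cluster somebody forgot to tag

print("NeverProd matches tagged prod:    ", string_equals(tagged_prod, TAG_KEY, "prod"))
print("NeverProd matches untagged:       ", string_equals(untagged, TAG_KEY, "prod"))
print("DenyUntagged matches untagged:    ", null_check(untagged, TAG_KEY, True))
print("DenyUntagged matches tagged prod: ", null_check(tagged_prod, TAG_KEY, True))
```

In this model NeverProd matches only the tagged resource and DenyUntagged matches only the untagged one, so the two statements are complementary. Test any such Deny in a non-prod account first, since a Null condition on a broad action list would also block calls that never carry resource tags.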

Why Fargate makes this safer than EC2

On EC2 launch type, ECScape showed a low-privilege task can steal another task's IAM credentials — isolation is instance-level, not task-level. Fargate runs each task in its own micro-VM, so task-level RBAC actually holds.

The ECScape research is worth understanding because it undercuts an assumption most RBAC models rest on: that a task's IAM role is isolated to that task. On EC2, it isn't. Multiple tasks share one host, ECS delivers their credentials over a channel on that host, and a compromised low-privilege container can impersonate the agent and intercept the credentials destined for every other task on the instance. Your carefully scoped per-task roles become a shared pool.

On Fargate, each task gets its own micro-VM with isolated credentials and its own metadata endpoint, so there is no co-tenant to steal from. The mitigations on EC2 (block container access to IMDS at 169.254.169.254, run privileged tasks on separate instances) are things you don't have to think about on Fargate. If your RBAC model assumes task-level isolation, Fargate is the launch type that makes the assumption true.

KEY INSIGHT: RBAC is only as strong as the isolation underneath it. A perfect ABAC policy on EC2 launch type can still be bypassed at the credential layer; the same policy on Fargate holds because the micro-VM boundary is real. The access model and the isolation model have to agree.

Where AWS-native RBAC hits its ceiling

ABAC works until tag discipline fails. Past about ten environments, every new env needs the tag applied everywhere, every policy re-audited, every untagged resource hunted down. IAM has no per-environment role concept to lean on.

The ABAC model is correct and it scales — until the thing it depends on, perfect tagging, stops being free. At ten environments, one missing Environment tag is a hole. At thirty, finding the missing tag is its own job. You end up writing SCPs to enforce tagging, AWS Config rules to flag untagged resources, and a runbook for onboarding each new environment into the policy set — maintaining the simulation of something IAM was never designed to model.
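The tag hunt itself can be partly automated. As a rough sketch, the function below takes a mapping of resource ARN to tags and reports resources whose Environment tag is missing or holds an unexpected value. The tag shape (lowercase "key" and "value") follows what the ECS API returns, for example from list_tags_for_resource in boto3. The ARNs, the account ID, and the set of allowed environment names are made up for the example.

```python
ALLOWED_ENVIRONMENTS = frozenset({"dev", "staging", "prod"})  # assumed naming scheme


def environment_tag_problems(resources, allowed=ALLOWED_ENVIRONMENTS):
    """Map each non-compliant ARN to a short description of what is wrong."""
    problems = {}
    for arn, tags in resources.items():
        values = [t["value"] for t in tags if t["key"] == "Environment"]
        if not values:
            problems[arn] = "missing Environment tag"
        elif values[0] not in allowed:
            problems[arn] = f"unexpected Environment value {values[0]!r}"
    return problems


# Hypothetical inventory; in practice this would be collected from the ECS API.
INVENTORY = {
    "arn:aws:ecs:us-east-1:111122223333:cluster/payments-stg": [
        {"key": "Environment", "value": "staging"},
    ],
    "arn:aws:ecs:us-east-1:111122223333:cluster/payments-prod": [
        {"key": "Team", "value": "payments"},
    ],
    "arn:aws:ecs:us-east-1:111122223333:cluster/search-stg": [
        {"key": "Environment", "value": "stagng"},
    ],
}

for arn, problem in environment_tag_problems(INVENTORY).items():
    print(f"{arn}: {problem}")
```

The misspelled value matters as much as the missing tag: a resource tagged "stagng" matches no developer's principal tag and no Deny written against a real environment name.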

That's the point where a per-environment RBAC layer earns its keep: instead of hand-maintaining ABAC tags and policies, you grant a developer a role on the environments they own — restart, redeploy, read logs, run a one-off task — and prod is off-limits by construction, not by a condition key you hope is correct. The AWS-native approach is the right place to start; it's not the right place to be at fleet scale.

If you read this, you might also want to know

Can I scope ECS access by cluster instead of tags?

Yes, with a Resource block listing cluster ARNs or a StringLike condition on the cluster name (e.g. *-stg-*). It's simpler to reason about for a few clusters, but you edit every policy each time you add an environment. ABAC tags avoid that — at the cost of needing every resource tagged.
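As a rough illustration of the name-pattern approach, the sketch below builds a statement whose Resource is a service ARN pattern and checks which ARNs it would cover. It uses Python's fnmatchcase as an approximation of IAM wildcard matching (the two differ in details such as bracket handling). The Sid, the pattern, the cluster names, and the account ID are hypothetical.

```python
from fnmatch import fnmatchcase

# Services live under arn:aws:ecs:<region>:<account>:service/<cluster>/<service>.
STAGING_SERVICES = "arn:aws:ecs:*:*:service/*-stg-*/*"


def name_scoped_statement(resource_pattern):
    """Allow restarts only on services whose ARN matches the pattern."""
    return {
        "Sid": "OperateStagingByName",
        "Effect": "Allow",
        "Action": ["ecs:UpdateService"],
        "Resource": resource_pattern,
    }


def arn_matches(pattern, arn):
    """Approximate IAM wildcard matching with case-sensitive fnmatch."""
    return fnmatchcase(arn, pattern)


statement = name_scoped_statement(STAGING_SERVICES)
staging_arn = "arn:aws:ecs:us-east-1:111122223333:service/payments-stg-1/api"
prod_arn = "arn:aws:ecs:us-east-1:111122223333:service/payments-prod-1/api"

print(arn_matches(statement["Resource"], staging_arn))
print(arn_matches(statement["Resource"], prod_arn))
```

The staging ARN matches and the prod ARN does not, but only because the cluster names follow the convention. The pattern encodes a naming rule, so a cluster created outside that rule is silently out of scope, or in scope when it should not be.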

Does ECS Exec respect the same RBAC?

ECS Exec (the shell-into-a-container feature) is gated by ecs:ExecuteCommand, which supports the same aws:ResourceTag/Environment condition — so you can scope exec to a developer's own environment the same way. Many teams forget to scope it and leave a path into prod containers wide open.

How do I stop developers from escalating via PassRole?

Scope iam:PassRole to a path or ARN pattern that only covers their environment's roles (e.g. role/ecs/staging-*), never a wildcard. With a wildcard, a developer who can register a task definition can attach any role that trusts ECS tasks and run a task as it, escalating straight past the environment boundary.
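A scoped PassRole statement can be generated per environment so the pattern is never typed by hand. The sketch below is one possible shape, with two tightenings that are assumptions beyond the policy shown earlier: the account ID is pinned instead of left as a wildcard, and the iam:PassedToService condition key restricts the pass to ECS tasks. The function name and the account ID are placeholders.

```python
import json


def pass_role_statement(account_id, environment):
    """Allow passing only roles under the /ecs/ path that start with '<environment>-'."""
    return {
        "Sid": "PassOnlyOwnedEnvironmentRoles",
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": f"arn:aws:iam::{account_id}:role/ecs/{environment}-*",
        "Condition": {
            # Assumed hardening: the role may only be handed to ECS tasks.
            "StringEquals": {"iam:PassedToService": "ecs-tasks.amazonaws.com"}
        },
    }


statement = pass_role_statement("111122223333", "staging")
print(json.dumps(statement, indent=2))
```

This relies on the roles actually being created under the /ecs/ path with an environment prefix, as the role/ecs/staging-* pattern implies. A task role created at the default path would not be passable under this statement, which is the intended failure mode.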

Do I need separate AWS accounts for hard RBAC boundaries?

For the strongest boundary, yes. An account boundary is the one thing a tag typo can't cross, and it's the only way to fully contain the tag-ignoring List actions. A common setup keeps prod in its own account and shares a non-prod account, using ABAC tags for per-environment scoping within each.


