
IAM Principal Cost Allocation on Amazon Bedrock (Update 2026)

A few months ago I started a series about governing AI usage on AWS. In Part 1 we covered how to give developers governed access to LLMs from Day 0: IAM Policies, Guardrails, Inference Profiles, and a budget cut mechanism per team. If you haven't read it yet, I'd recommend starting there.

But there was one thing that kept bugging me: we couldn't tell who inside a team was driving the consumption. We'd cut the entire team's access, then had to build a custom solution with Model Invocation Logging just to figure out who was responsible.

Well, AWS just solved the visibility part natively.

Blocking access in real time? That's still a different story, and I think doing it natively, without a proxy, is complex (for now).


What did AWS announce?

In April 2026, Amazon Bedrock launched support for cost allocation by IAM principal (in plain English: by IAM user or role), directly in Cost Explorer and CUR 2.0 (Cost and Usage Report).

Amazon Bedrock now supports cost allocation by IAM principal, such as IAM users and IAM roles, in AWS Cost and Usage Report 2.0 (CUR 2.0) and Cost Explorer.

Official announcement

In short: when your dev runs aws sso login and calls a Bedrock model, AWS now automatically records who made that call and how much it cost (once you activate the feature). No proxies, no custom Lambdas, no CloudTrail scraping.

One thing you need to understand right away: this feature is about billing, not enforcement. Data arrives in CUR 2.0 and Cost Explorer with 24-48 hours of latency. That means you can see who spent how much, but you cannot block access in real time with this data. For that, the Budget Cut Lambda from Part 1 is still necessary.

Sorry, AWS Budgets fans: you can't use it for this.


What changes from the previous post in this series?

Mechanism                           | Granularity         | What it solves
------------------------------------|---------------------|---------------------------------------------------------
Inference Profile + Resource Tags   | Per team / workload | "How much did the backend team spend on Haiku?"
IAM Principal Cost Allocation (NEW) | Per user / role     | "How much did user@company.com spend across all models?"

Combined, you get the full cost picture: who spent how much, on which model, for which team. But remember: you see this picture with a 24-48h delay. It's for analysis and chargeback, not real-time enforcement.

Prerequisites

Everything from Part 1, plus:

  • Tags on your IAM users/roles with business attributes (team, business-unit, project)
  • Access to the Billing and Cost Management console (if using Organizations, you do this from the management account)
  • CUR 2.0 enabled (legacy CUR doesn't support this)

Step by step: setting up IAM Principal Cost Allocation

Step 1: Tag your IAM Principals

If you use SSO, each Permission Set creates a role in the target account with format AWSReservedSSO_{PermissionSetName}_{hash}. These roles can be tagged.

# scripts/tag_iam_principals.py
"""
Tag SSO roles with business attributes.
These tags will show up in Cost Explorer and CUR 2.0.
"""
import boto3

iam = boto3.client("iam")

# Mapping: SSO role -> business tags
SSO_ROLES = {
    "AWSReservedSSO_BackendDev_a1b2c3d4": {
        "team": "backend",
        "business-unit": "BU-ENG-001",
        "department": "engineering",
        "environment": "development",
    },
    "AWSReservedSSO_FrontendDev_e5f6g7h8": {
        "team": "frontend",
        "business-unit": "BU-ENG-002",
        "department": "engineering",
        "environment": "development",
    },
    "AWSReservedSSO_DataTeam_i9j0k1l2": {
        "team": "data",
        "business-unit": "BU-DATA-001",
        "department": "data-science",
        "environment": "development",
    },
}

for role_name, tags_dict in SSO_ROLES.items():
    tags = [{"Key": k, "Value": v} for k, v in tags_dict.items()]
    iam.tag_role(RoleName=role_name, Tags=tags)
    print(f"{role_name} tagged with: {tags_dict}")
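Hardcoding role names gets stale as Permission Sets come and go (the hash suffix changes if a Permission Set is recreated). A sketch that discovers the SSO roles instead; it relies on the standard reserved path for SSO-created roles, and `PERMISSION_SET_TAGS` is a hypothetical mapping you would maintain yourself:

```python
# scripts/discover_sso_roles.py
"""
Discover AWSReservedSSO_* roles instead of hardcoding their names.
Sketch only: PERMISSION_SET_TAGS is a hypothetical mapping from
Permission Set name -> business tags that you maintain.
"""
import re

# Permission Set name -> tags to apply (example values)
PERMISSION_SET_TAGS = {
    "BackendDev": {"team": "backend", "business-unit": "BU-ENG-001"},
    "FrontendDev": {"team": "frontend", "business-unit": "BU-ENG-002"},
}

# AWSReservedSSO_{PermissionSetName}_{hash}
SSO_ROLE_RE = re.compile(r"^AWSReservedSSO_(?P<pset>.+)_[0-9a-z]+$")


def permission_set_of(role_name: str):
    """Extract the Permission Set name from an SSO role name, or None."""
    m = SSO_ROLE_RE.match(role_name)
    return m.group("pset") if m else None


def tag_discovered_roles() -> None:
    import boto3  # imported here so the pure helper above needs no AWS deps

    iam = boto3.client("iam")
    paginator = iam.get_paginator("list_roles")
    # SSO-created roles live under this reserved path
    for page in paginator.paginate(PathPrefix="/aws-reserved/sso.amazonaws.com/"):
        for role in page["Roles"]:
            tags = PERMISSION_SET_TAGS.get(permission_set_of(role["RoleName"]) or "")
            if tags:
                iam.tag_role(
                    RoleName=role["RoleName"],
                    Tags=[{"Key": k, "Value": v} for k, v in tags.items()],
                )
                print(f"{role['RoleName']} tagged with {tags}")


# tag_discovered_roles()  # run against your account once the mapping is filled in
```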

For legacy IAM users or service accounts, such as ECS task roles:

aws iam tag-user \
  --user-name "ml-pipeline-sa" \
  --tags Key=team,Value=data Key=business-unit,Value=BU-DATA-001 Key=project,Value=recommendation-engine

AWS note: tags become available for activation in the Billing console only after the principal has made at least one Bedrock API call. If you tagged a role but nobody has used it yet, you won't see it under Cost Allocation Tags.

Step 2: Activate Cost Allocation Tags

Billing and Cost Management Console
    → Cost Organization (left panel)
        → Cost Allocation Tags
            → Filter by "IAM principal type"
            → Select: team, business-unit, department
            → Click "Activate"

After activation, tags take up to 24 hours to become available in Cost Explorer and CUR.

# Verify which tags are active via CLI
aws ce list-cost-allocation-tags \
  --status Active \
  --tag-keys "team" "business-unit" "department" \
  --type "iamPrincipal"

Step 3: Enable IAM Principal in CUR 2.0

If there's one step that makes all of this work, it's this one.
Activate the line_item_iam_principal column in cost reports.

Billing and Cost Management Console
    → Data Exports
        → Create export → Standard data export (CUR 2.0)
            → Additional export content:
                ✅ Include caller identity (IAM principal) allocation data
            → Destination: S3 bucket + Athena integration
        → Save

What does this generate? Every line in CUR 2.0 now includes the exact ARN of the principal that called Bedrock. Principal tags appear with the iamPrincipal/ prefix so they don't collide with resource tags.
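To make that prefix concrete, here's a minimal sketch of splitting principal tags from resource tags in a parsed CUR row. The row layout is an assumption for illustration, not the exact export schema:

```python
# Split principal tags (iamPrincipal/ prefix) from resource tags in a
# CUR 2.0 row. The row shape below is illustrative, not the exact schema.

def split_tags(row_tags: dict):
    """Return (principal_tags, resource_tags) from a CUR tags map."""
    principal, resource = {}, {}
    for key, value in row_tags.items():
        if key.startswith("iamPrincipal/"):
            principal[key.removeprefix("iamPrincipal/")] = value
        else:
            resource[key] = value
    return principal, resource


row = {
    "line_item_iam_principal": "arn:aws:sts::123456:assumed-role/BackendDev/javier",
    "tags": {
        "iamPrincipal/team": "backend",          # comes from the IAM role
        "iamPrincipal/business-unit": "BU-ENG-001",
        "team": "backend-api",                   # comes from the inference profile
    },
}

principal_tags, resource_tags = split_tags(row["tags"])
print(principal_tags)  # {'team': 'backend', 'business-unit': 'BU-ENG-001'}
print(resource_tags)   # {'team': 'backend-api'}
```

Note how the prefix keeps the role's `team` tag and the inference profile's `team` tag as two distinct columns instead of silently overwriting each other.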


What does the data look like?

Cost Explorer: filter by team

Once tags are activated, Cost Explorer lets you group directly:

Cost Explorer
    → Filter: Service = "Amazon Bedrock"
    → Group by: Tag → "iamPrincipal/team"

You'll see something like:

iamPrincipal/team    | Cost (USD)
---------------------|----------
backend              | $142.30
data                 | $89.50
frontend             | $23.10
(no tag)             | $5.40

Want to drill down into the backend team? Change the Group by:

Cost Explorer
    → Filter: Service = "Amazon Bedrock"
    → Filter: Tag "iamPrincipal/team" = "backend"
    → Group by: Tag → "iamPrincipal/business-unit"

Bonus (optional): CUR 2.0 + Athena per-user analysis

If Cost Explorer doesn't give you enough granularity and you need to see per individual user, you can query CUR 2.0 directly with Amazon Athena.

About Athena costs: Athena charges $5 USD per TB scanned (on-demand mode). For small/medium org CUR files this is usually pennies per query. To reduce costs, enable CUR compression (Parquet format) and partition by month.

-- Top 10 Bedrock consumers this month
SELECT
    line_item_iam_principal AS iam_principal,
    tags['iamPrincipal/team'] AS team,
    tags['iamPrincipal/business-unit'] AS business_unit,
    SUM(line_item_unblended_cost) AS total_cost,
    SUM(line_item_usage_amount) AS total_usage
FROM cur_2_0.bedrock_usage
WHERE
    line_item_product_code = 'AmazonBedrock'
    AND EXTRACT(YEAR FROM billing_period) = EXTRACT(YEAR FROM CURRENT_DATE)
    AND EXTRACT(MONTH FROM billing_period) = EXTRACT(MONTH FROM CURRENT_DATE)
GROUP BY 1, 2, 3
ORDER BY total_cost DESC
LIMIT 10;
iam_principal                                          | team     | business_unit | total_cost
-------------------------------------------------------+----------+---------------+-----------
arn:aws:sts::123456:assumed-role/BackendDev/javier@...  | backend  | BU-ENG-001    | $67.30
arn:aws:sts::123456:assumed-role/DataTeam/maria@...     | data     | BU-DATA-001   | $52.10
arn:aws:sts::123456:assumed-role/BackendDev/luis@...    | backend  | BU-ENG-001    | $41.20

No proxies, no Lambdas, no CloudTrail scraping: you know exactly that Javier from the backend team spent $67.30 this month.
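With per-user rows in hand, chargeback is a small aggregation step. A sketch over rows shaped like the Athena output above (column names assumed, values illustrative):

```python
from collections import defaultdict

# Rows shaped like the Athena result above (values are illustrative)
rows = [
    {"iam_principal": ".../BackendDev/javier", "team": "backend",
     "business_unit": "BU-ENG-001", "total_cost": 67.30},
    {"iam_principal": ".../DataTeam/maria", "team": "data",
     "business_unit": "BU-DATA-001", "total_cost": 52.10},
    {"iam_principal": ".../BackendDev/luis", "team": "backend",
     "business_unit": "BU-ENG-001", "total_cost": 41.20},
]


def chargeback_by(rows: list, key: str) -> dict:
    """Roll per-principal cost up to any business dimension (team, BU...)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["total_cost"]
    return {k: round(v, 2) for k, v in totals.items()}


print(chargeback_by(rows, "team"))
# {'backend': 108.5, 'data': 52.1}
```

The same function answers the team question and the business-unit question, which is exactly the flexibility the principal tags buy you.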


How it complements Part 1

The scheme from the previous post (Inference Profiles + IAM Policies + Guardrails + Budget Cut Lambda) isn't replaced, it's complemented:

Component                     | Part 1                                 | Part 2 (new)
------------------------------|----------------------------------------|---------------------------------------------------
Who can use which model?      | IAM Policy per-team                    | No changes
Is PII protected?             | Bedrock Guardrails                     | No changes
How much did each team spend? | Inference Profile tags → Cost Explorer | IAM Principal tags → Cost Explorer (more granular)
How much did each user spend? | ❌ Not available                       | line_item_iam_principal in CUR 2.0
Cut access when over budget?  | Budget Cut Lambda (per-team, ~10 min)  | ⚠️ CUR has a 24-48h delay, NOT usable for blocking; the Budget Cut Lambda remains the only cut mechanism

Trade-offs

Original trade-off                           | Status with IAM Principal CA                  | Notes
---------------------------------------------|-----------------------------------------------|--------------------------------------------------------------------------------
Couldn't tell who was spending inside a team | SOLVED                                        | CUR 2.0 + line_item_iam_principal gives per-user visibility
Cut is per team, not per user                | 🟡 Partially improved                         | You can now see who was responsible (post-mortem), but auto-cut is still per team
Cost Explorer has 24-48h delay               | 🔴 Remains (also applies to IAM Principal CA) | For real-time cuts, the Budget Cut Lambda via CloudWatch (Part 1) is still the only path
No self-service dashboard                    | Improvable                                    | Cost Explorer with the iamPrincipal/team filter can be shared with teams

What still needs a real proxy

Let's be honest: some things still need a real proxy, and for mature AI governance that's ultimately where you want to end up.

  • Per-user real-time cut: CUR has 24h delay. To cut a specific user in minutes you need Model Invocation Logging + Lambda + iam:PutRolePolicy
  • Response caching: No native Bedrock caching. A proxy like LiteLLM can cache repeated responses
  • Multi-provider routing: If you want OpenAI and Anthropic directly (not via Bedrock), you need an abstraction layer
  • Semantic observability: To see an agent's reasoning tree, you need OTel + Langfuse. CloudTrail tells you "who called", not "why it reasoned that way"
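For the first point, the shape of that cut is an inline deny policy attached to the offending role. A sketch, assuming you've already identified the principal via Model Invocation Logging; the role and policy names are hypothetical examples:

```python
import json

# Build an inline deny policy that blocks Bedrock invocations.
# Attaching it via iam:PutRolePolicy cuts the whole role; cutting one
# user on a shared SSO role would need a Condition on the session
# identity, which is part of why a proxy is still the cleaner answer.

def build_deny_policy() -> dict:
    """Deny policy document blocking Bedrock model invocations."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "BudgetCut",
            "Effect": "Deny",
            "Action": [
                "bedrock:InvokeModel",
                "bedrock:InvokeModelWithResponseStream",
            ],
            "Resource": "*",
        }],
    }


policy = build_deny_policy()
print(json.dumps(policy, indent=2))

# Uncomment to apply (role/policy names are examples):
# import boto3
# iam = boto3.client("iam")
# iam.put_role_policy(
#     RoleName="AWSReservedSSO_BackendDev_a1b2c3d4",
#     PolicyName="budget-cut-bedrock-deny",
#     PolicyDocument=json.dumps(policy),
# )
```

Reverting the cut is the mirror image: delete the inline policy once the team's budget resets.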

Conclusions

With this announcement, the native governance model from Part 1 gains a missing piece: per-user cost visibility, without extra infrastructure, without CloudTrail scraping, without custom Lambdas for attribution. It's an analysis and chargeback tool, not enforcement. Real-time blocking still depends on the Budget Cut Lambda and CloudWatch.

Always think in stages:

  • Day 0: IAM Policies + Guardrails + Inference Profiles → governed access
  • Day 1: Budget Cut Lambda, protection against runaway spending
  • Now (Day 2): IAM Principal Cost Allocation, know exactly who spends what
  • Next? Leave your comment

Is there still a lot to build? Yes. A real proxy (LiteLLM), semantic observability (Langfuse), and caching are all desirable evolutions as the organization scales. But the foundation is in place, and it's 100% native.

The good thing about all of this: AWS keeps closing the gaps that we as architects identify. Today it's cost allocation by principal. Tomorrow maybe it'll be native per-user throttling or service-level guardrails that don't require guardrailConfig in the dev's code.

The cloud tools we already master are the foundation for governing the AI we're building. And there are fewer and fewer excuses not to do it from Day 0.


Want to stay in touch? 📩

Héctor Fernández
AWS Community Builder

https://podcast.hectorfernandez.dev
