A few months ago I started a series about governing AI usage on AWS. In Part 1 we covered how to give developers governed access to LLMs from Day 0: IAM Policies, Guardrails, Inference Profiles, and a budget cut mechanism per team. If you haven't read it yet, I'd recommend starting there.
But there was one thing that kept bugging me: We couldn't tell who inside a team was driving the consumption. We'd cut the entire team's access, and then had to build a custom solution with Model Invocation Logging just to figure out who was responsible.
Well, AWS just solved the visibility part natively.
Blocking all access? That's still a different story, and I think doing it natively without a proxy is complex (for now).
What did AWS announce?
In April 2026, Amazon Bedrock launched support for cost allocation by IAM principal; in plain English, per IAM user or role, directly in Cost Explorer and CUR 2.0 (Cost and Usage Report).
> Amazon Bedrock now supports cost allocation by IAM principal, such as IAM users and IAM roles, in AWS Cost and Usage Report 2.0 (CUR 2.0) and Cost Explorer.
In short: when your dev runs `aws sso login` and calls a Bedrock model, AWS now automatically records who made that call (after activation) and how much it cost. No proxies, no custom Lambdas, no CloudTrail scraping.
One thing you need to understand right away: this feature is about billing, not enforcement. Data arrives in CUR 2.0 and Cost Explorer with 24-48 hours of latency. That means you can see who spent how much, but you cannot block access in real time with this data. For that, the Budget Cut Lambda from Part 1 is still necessary.
Sorry, AWS Budgets fans: you can't use it for this either.
What changes from Part 1 of this series?
| Mechanism | Granularity | What it solves |
|---|---|---|
| Inference Profile + Resource Tags | Per team / workload | "How much did the backend team spend on Haiku?" |
| IAM Principal Cost Allocation (NEW) | Per user / role | "How much did user@company.com spend across all models?" |
Combined, you get the full cost picture: who spent how much, on which model, for which team. But remember: you see this picture with a 24-48h delay. It's for analysis and chargeback, not real-time enforcement.
Prerequisites
Everything from Part 1, plus:
- Tags on your IAM users/roles with business attributes (`team`, `business-unit`, `project`)
- Access to the Billing and Cost Management console (if using Organizations, you do this from the management account)
- CUR 2.0 enabled (legacy CUR doesn't support this)
Step by step: setting up IAM Principal Cost Allocation
Step 1: Tag your IAM Principals
If you use SSO, each Permission Set creates a role in the target account with format AWSReservedSSO_{PermissionSetName}_{hash}. These roles can be tagged.
```python
# scripts/tag_iam_principals.py
"""
Tag SSO roles with business attributes.
These tags will show up in Cost Explorer and CUR 2.0.
"""
import boto3

iam = boto3.client("iam")

# Mapping: SSO role -> business tags
SSO_ROLES = {
    "AWSReservedSSO_BackendDev_a1b2c3d4": {
        "team": "backend",
        "business-unit": "BU-ENG-001",
        "department": "engineering",
        "environment": "development",
    },
    "AWSReservedSSO_FrontendDev_e5f6g7h8": {
        "team": "frontend",
        "business-unit": "BU-ENG-002",
        "department": "engineering",
        "environment": "development",
    },
    "AWSReservedSSO_DataTeam_i9j0k1l2": {
        "team": "data",
        "business-unit": "BU-DATA-001",
        "department": "data-science",
        "environment": "development",
    },
}

for role_name, tags_dict in SSO_ROLES.items():
    tags = [{"Key": k, "Value": v} for k, v in tags_dict.items()]
    iam.tag_role(RoleName=role_name, Tags=tags)
    print(f"✅ {role_name} tagged with: {tags_dict}")
```
For legacy IAM Users or service accounts, like ECS Roles:
```shell
aws iam tag-user \
  --user-name "ml-pipeline-sa" \
  --tags Key=team,Value=data Key=business-unit,Value=BU-DATA-001 Key=project,Value=recommendation-engine
```
AWS note: Tags only appear for activation in the Billing console after the principal has made at least one Bedrock API call. If you tagged a role but nobody has used it yet, you won't see it in Cost Allocation Tags.
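If you want to double-check that the tags actually landed before you move on, here's a minimal sketch. The role name and the required tag keys are this article's examples; `missing_tags` is a hypothetical helper, not part of any AWS SDK:

```python
"""Sketch: verify that the business tags landed on a principal.
The role name and required keys are the examples from this article."""


def missing_tags(current: list, required: set) -> set:
    """Return the required tag keys absent from an IAM Tags list."""
    return required - {t["Key"] for t in current}


def check_role_tags(role_name: str, required: set) -> set:
    """Fetch a role's tags and report which required keys are missing."""
    import boto3  # imported lazily so missing_tags stays usable offline

    iam = boto3.client("iam")
    tags = iam.list_role_tags(RoleName=role_name)["Tags"]
    return missing_tags(tags, required)
```

With credentials configured, `check_role_tags("AWSReservedSSO_BackendDev_a1b2c3d4", {"team", "business-unit"})` returns an empty set when the role is fully tagged.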
Step 2: Activate Cost Allocation Tags
```text
Billing and Cost Management Console
→ Cost Organization (left panel)
→ Cost Allocation Tags
→ Filter by "IAM principal type"
→ Select: team, business-unit, department
→ Click "Activate"
```
After activation, tags take up to 24 hours to become available in Cost Explorer and CUR.
```shell
# Verify which tags are active via CLI
aws ce list-cost-allocation-tags \
  --status Active \
  --tag-keys "team" "business-unit" "department" \
  --type "iamPrincipal"
```
Step 3: Enable IAM Principal in CUR 2.0
If there's one step that makes all of this work, it's this one.
You need to activate the `line_item_iam_principal` column in your cost reports.
```text
Billing and Cost Management Console
→ Data Exports
→ Create export → Standard data export (CUR 2.0)
→ Additional export content:
   ✅ Include caller identity (IAM principal) allocation data
→ Destination: S3 bucket + Athena integration
→ Save
```
What does this generate? Every line in CUR 2.0 now includes the exact ARN of the principal that called Bedrock. Principal tags appear with the `iamPrincipal/` prefix so they don't collide with resource tags.
What does the data look like?
Cost Explorer: filter by team
Once tags are activated, Cost Explorer lets you group directly:
```text
Cost Explorer
→ Filter: Service = "Amazon Bedrock"
→ Group by: Tag → "iamPrincipal/team"
```
You'll see something like:
```text
iamPrincipal/team    | Cost (USD)
---------------------|----------
backend              | $142.30
data                 | $89.50
frontend             | $23.10
(no tag)             | $5.40
```
Want to drill down into the backend team? Change the Group by:
```text
Cost Explorer
→ Filter: Service = "Amazon Bedrock"
→ Filter: Tag "iamPrincipal/team" = "backend"
→ Group by: Tag → "iamPrincipal/business-unit"
```
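The same drill-down can be scripted against the Cost Explorer API. A minimal sketch with boto3, assuming the `iamPrincipal/team` tag from Step 2 is already active (the helper names are mine, not AWS's):

```python
"""Sketch: month-to-date Bedrock cost per team via the Cost Explorer API.
Assumes the iamPrincipal/team tag from this article is already active."""
import datetime


def month_to_date() -> tuple:
    """Return (first day of the current month, today) as ISO date strings."""
    today = datetime.date.today()
    return today.replace(day=1).isoformat(), today.isoformat()


def bedrock_cost_by_team_request(start: str, end: str) -> dict:
    """Build a GetCostAndUsage request: Bedrock only, grouped by the team tag."""
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "Filter": {"Dimensions": {"Key": "SERVICE", "Values": ["Amazon Bedrock"]}},
        "GroupBy": [{"Type": "TAG", "Key": "iamPrincipal/team"}],
    }


def print_bedrock_costs_by_team():
    """Query Cost Explorer and print month-to-date Bedrock cost per team."""
    import boto3  # imported here so the request builders above need no AWS SDK

    ce = boto3.client("ce")
    start, end = month_to_date()
    resp = ce.get_cost_and_usage(**bedrock_cost_by_team_request(start, end))
    for group in resp["ResultsByTime"][0]["Groups"]:
        team = group["Keys"][0]  # e.g. "iamPrincipal/team$backend"
        cost = float(group["Metrics"]["UnblendedCost"]["Amount"])
        print(f"{team}: ${cost:.2f}")
```

Remember the 24-48h latency: running this on the 1st of the month will show yesterday's picture at best.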
BONUS
(Optional) CUR 2.0 + Athena: per-user analysis
If Cost Explorer doesn't give you enough granularity and you need to see per individual user, you can query CUR 2.0 directly with Amazon Athena.
About Athena costs: Athena charges $5 USD per TB scanned (on-demand mode). For small/medium org CUR files this is usually pennies per query. To reduce costs, enable CUR compression (Parquet format) and partition by month.
```sql
-- Top 10 Bedrock consumers this month
SELECT
    line_item_iam_principal AS iam_principal,
    tags['iamPrincipal/team'] AS team,
    tags['iamPrincipal/business-unit'] AS business_unit,
    SUM(line_item_unblended_cost) AS total_cost,
    SUM(line_item_usage_amount) AS total_usage
FROM cur_2_0.bedrock_usage
WHERE
    line_item_product_code = 'AmazonBedrock'
    AND EXTRACT(MONTH FROM billing_period) = EXTRACT(MONTH FROM CURRENT_DATE)
    AND EXTRACT(YEAR FROM billing_period) = EXTRACT(YEAR FROM CURRENT_DATE)
GROUP BY 1, 2, 3
ORDER BY total_cost DESC
LIMIT 10;
```
```text
iam_principal                                          | team     | business_unit | total_cost
-------------------------------------------------------+----------+---------------+-----------
arn:aws:sts::123456:assumed-role/BackendDev/javier@... | backend  | BU-ENG-001    | $67.30
arn:aws:sts::123456:assumed-role/DataTeam/maria@...    | data     | BU-DATA-001   | $52.10
arn:aws:sts::123456:assumed-role/BackendDev/luis@...   | backend  | BU-ENG-001    | $41.20
```
No proxies, no Lambdas, no CloudTrail scraping: you can see that Javier from the backend team spent exactly $67.30 this month.
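If you want this in a scheduled report rather than the console, the query can be driven from the Athena API. A sketch, where the database, table, and output bucket are placeholders for your own setup:

```python
"""Sketch: run a per-user CUR query through the Athena API.
Database, table, and output bucket are placeholders for your own setup."""
import time


def top_consumers_query(database: str = "cur_2_0", table: str = "bedrock_usage") -> str:
    """Build a simplified version of the top-consumers query above."""
    return (
        "SELECT line_item_iam_principal, "
        "SUM(line_item_unblended_cost) AS total_cost "
        f"FROM {database}.{table} "
        "WHERE line_item_product_code = 'AmazonBedrock' "
        "GROUP BY 1 ORDER BY total_cost DESC LIMIT 10"
    )


def run_top_consumers(output_location: str) -> list:
    """Start the query, poll until it finishes, return the data rows."""
    import boto3  # lazy import: the query builder above has no AWS dependency

    athena = boto3.client("athena")
    qid = athena.start_query_execution(
        QueryString=top_consumers_query(),
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]
    while True:  # poll until Athena reaches a terminal state
        state = athena.get_query_execution(QueryExecutionId=qid)[
            "QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(2)
    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")
    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
    return rows[1:]  # row 0 is the header row
```

Wire `run_top_consumers("s3://your-athena-results/")` into a weekly EventBridge-triggered Lambda and you have a chargeback report with zero manual clicks.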
How it complements Part 1
The scheme from the previous post (Inference Profiles + IAM Policies + Guardrails + Budget Cut Lambda) isn't replaced, it's complemented:
| Component | Part 1 | Part 2 (new) |
|---|---|---|
| Who can use which model? | IAM Policy per-team | No changes |
| Is PII protected? | Bedrock Guardrails | No changes |
| How much did each team spend? | Inference Profile tags → Cost Explorer | IAM Principal tags → Cost Explorer (more granular) |
| How much did each user spend? | ❌ Not available | ✅ line_item_iam_principal in CUR 2.0 |
| Cut access when over budget? | Budget Cut Lambda (per-team, ~10 min) | ⚠️ CUR has 24h delay. NOT usable for blocking. Budget Cut Lambda remains the only cut mechanism |
Trade-offs
| Original trade-off | Status with IAM Principal CA | Notes |
|---|---|---|
| Couldn't tell who was spending inside a team | ✅ SOLVED | CUR 2.0 + line_item_iam_principal gives per-user visibility |
| Cut is per team, not per user | 🟡 Partially improved | You can now see who was responsible (post-mortem), but auto-cut is still per team |
| Cost Explorer has 24-48h delay | 🔴 Remains (also applies to IAM Principal CA) | For real-time cuts, Budget Cut Lambda via CloudWatch (Part 1) is still the only path |
| No self-service dashboard | ✅ Improvable | Cost Explorer with iamPrincipal/team filter can be shared with teams |
What still needs a real proxy
Let's be honest: some things still require a real proxy, and for full AI governance a proxy is ultimately what you should aim for.
- Per-user real-time cut: CUR has a 24h delay. To cut a specific user in minutes you need Model Invocation Logging + Lambda + `iam:PutRolePolicy`
- Response caching: No native Bedrock caching. A proxy like LiteLLM can cache repeated responses
- Multi-provider routing: If you want OpenAI and Anthropic directly (not via Bedrock), you need an abstraction layer
- Semantic observability: To see an agent's reasoning tree, you need OTel + Langfuse. CloudTrail tells you "who called", not "why it reasoned that way"
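To make the first point concrete, here's a sketch of what that `iam:PutRolePolicy` cut could look like. For assumed roles, `aws:userid` evaluates to `<role-id>:<session-name>`, and with IAM Identity Center the session name is typically the user's email. The role, policy name, and helper functions are illustrative, not a drop-in implementation:

```python
"""Sketch: cut a single user on a shared SSO role with an inline Deny.
For assumed roles aws:userid is "<role-id>:<session-name>"; with IAM
Identity Center the session name is usually the user's email."""
import json


def deny_bedrock_for_session(role_id: str, session_name: str) -> dict:
    """Build a policy denying Bedrock invocations for one role session."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "BudgetCutPerUser",
                "Effect": "Deny",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {"aws:userid": f"{role_id}:{session_name}"}
                },
            }
        ],
    }


def cut_user(role_name: str, session_name: str) -> None:
    """Attach the deny policy to the role, targeting only that session."""
    import boto3  # lazy import keeps the policy builder testable offline

    iam = boto3.client("iam")
    role_id = iam.get_role(RoleName=role_name)["Role"]["RoleId"]
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=f"budget-cut-{session_name.split('@')[0]}",
        PolicyDocument=json.dumps(deny_bedrock_for_session(role_id, session_name)),
    )
```

The trigger (who to cut, and when) still has to come from Model Invocation Logging plus a Lambda, as discussed in Part 1; this only shows the enforcement half.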
Conclusions
With this announcement, the native governance model from Part 1 gains a missing piece: per-user cost visibility, without extra infrastructure, without CloudTrail scraping, without custom Lambdas for attribution. It's an analysis and chargeback tool, not enforcement. Real-time blocking still depends on the Budget Cut Lambda and CloudWatch.
Always think in stages:
- Day 0: IAM Policies + Guardrails + Inference Profiles → governed access
- Day 1: Budget Cut Lambda, protection against runaway spending
- Now (Day 2): IAM Principal Cost Allocation, know exactly who spends what
- Next? Leave your comment
Is there still a lot to build? Yes. A real proxy (LiteLLM), semantic observability (Langfuse), and caching are all desirable evolutions as the organization scales. But the foundation is in place, and it's 100% native.
The good thing about all of this: AWS keeps closing the gaps that we as architects identify. Today it's cost allocation by principal. Tomorrow maybe it'll be native per-user throttling or service-level guardrails that don't require guardrailConfig in the dev's code.
The cloud tools we already master are the foundation for governing the AI we're building. And there are fewer and fewer excuses not to do it from Day 0.
Want to stay in touch? 📩
Héctor Fernández
AWS Community Builder