Lee Wynne

Posted on Mar 31 • Edited on Apr 25 • Originally published at leewynne.com

The Secret to Near 100% AWS Tagging Compliance? People Shouldn't Know You're Doing It.

#aws #cloud #devops

Every enterprise with more than a handful of AWS accounts eventually has the same reckoning. Someone in finance asks which team owns the spend in AWS account 846241037459. Someone in security wants to know whether the resources in a particular VPC are production or development. Someone in operations needs to route an incident to the right application owner at 2am. And in every case, the answer depends on tags. Tags, tags, tags, never-ending tags: tags that probably do not exist, may or may not be accurate, and almost certainly aren't consistent across business divisions.

This is the tagging problem, and most organisations try to solve it the wrong way. They write a tagging policy. They distribute it in a wiki that nobody bookmarks. They ask teams to please tag their resources correctly. And then they act surprised when compliance sits below 20% and the data is useless for any meaningful reporting or incident management.

The Provider > Consumer model solves this differently. Instead of asking consumers to tag their resources, the Provider delivers mandatory tags as part of the account vending pipeline (before the consumer ever touches the environment). The tags arrive with the account. The resources inherit them. The consumer can't remove them, can't modify them, and in most cases doesn't even realise they provided the data that populated them!

This isn't a tagging strategy. It's a tagging system. And the distinction matters more than most platform teams appreciate.

Why Voluntary Tagging Always Fails at Scale

Let's be honest about this. Voluntary tagging policies have a 100% failure rate in large enterprises. Not because people are negligent, but because the model is fundamentally broken.

A developer provisioning an EC2 instance is thinking about their application. They're thinking about instance types, security groups, and whether their deployment works. They are not thinking about which cost centre code maps to their division's financial reporting hierarchy. They don't know that finance needs a specific format for the Cost Code tag, and they certainly don't know that the value needs to match an entry in SAP that they've never seen and don't have access to.

So what happens? They leave the tag blank. Or they guess. Or a placeholder like "TBD" goes in. Or they put in something that looks right but doesn't match the source of truth. And now you have a resource in production with a cost code that doesn't resolve to anything in the financial system, which means it falls into an "unallocated" bucket, which means someone in FinOps spends three hours every month trying to reconcile it.

Multiply this across thousands of resources, dozens of teams, and many business divisions, and you have a tagging estate that's technically present but functionally useless. You've achieved the appearance of governance without any of the substance.

The root cause isn't laziness. It's that you've placed the burden of organisational metadata on the people least equipped to provide it.

A developer shouldn't need to understand the enterprise financial hierarchy to deploy a Lambda function. That's not their job. It's yours (as a Provider).

Mandatory Tags Are a Platform Concern, Not a Consumer Concern

This is the shift in thinking that makes the Provider > Consumer model work for tagging. Mandatory tags, the tags that drive financial reporting, security ownership, compliance classification, and business criticality, are platform-level concerns. They exist to serve the organisation, not the individual workload. And because they serve the organisation, the organisation (through the Provider) should own them end to end.

In the Provider > Consumer model, mandatory tags are opinionated.

The platform team decides which tags exist, what their keys are called, what format the values must take, and where the source of truth lives. These aren't suggestions. They're not configurable. They're the platform's DNA, and they flow through every account, every VPC, and every resource that gets provisioned. This works because workloads are segregated at the account level.

The Provider owns the tag taxonomy. The Provider owns the enforcement mechanism. The Provider owns the pipeline that applies them. The consumer benefits from the data they generate, through cost reports, ownership dashboards, and incident routing (without ever having to maintain the tags themselves).

This is what opinionated platform design looks like when applied to metadata. You don't ask consumers what tags they want. You tell them what tags they'll get. And you make those tags so deeply embedded in the provisioning process that removing them would require dismantling the pipeline itself.

Pre-Account Vending: Where the Data Actually Comes From

Here's where it gets elegant. The data that populates mandatory tags doesn't come from a tagging form. It comes from the account request process, typically a well-designed ServiceNow form that the consumer fills out when they request a new AWS account pattern.

Think about what a consumer provides when they request an account. They specify which business division they belong to. They identify the application or workload the account is for. They nominate a technical owner and a security contact. They provide (or their request is enriched with) a cost centre code from the enterprise financial system. They indicate whether this is a production or non-production workload.

Every single one of those data points maps directly to a mandatory tag. Business Division becomes the Business Division tag. The application name maps to the Application Name tag. The cost centre maps to Cost Code. The technical contact maps to Technical Owner. The consumer isn't filling out a tagging form. They're filling out an account request form. But the platform team has designed that form so that every field either directly populates a tag or feeds a lookup that resolves to a tag value.

The consumer doesn't know they're providing mandatory tagging data. They think they're answering reasonable questions about their workload. And they are, but those questions just happen to be the exact inputs the tagging pipeline needs.

This is deliberate. The best tagging strategies are invisible to the people being tagged. If your consumers are aware they're doing tagging, you've already introduced friction. And friction is where compliance goes to die.
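The field-to-tag mapping described above can be sketched in a few lines of Python. This is an illustration only: the form field names and the exact tag keys are assumptions, not the schema of any real ServiceNow form or the author's actual pipeline.

```python
# Hypothetical mapping from account-request form fields to mandatory tag keys.
FIELD_TO_TAG = {
    "business_division": "BusinessDivision",
    "application_name": "ApplicationName",
    "cost_centre": "CostCode",
    "technical_contact": "TechnicalOwner",
    "security_contact": "SecurityOwner",
    "environment": "Environment",
}


def tags_from_request(form: dict[str, str]) -> dict[str, str]:
    """Translate an approved account request into mandatory tags.

    Raises ValueError if any field that feeds a mandatory tag is missing or
    blank, so an incomplete request never reaches the vending pipeline.
    """
    missing = [field for field in FIELD_TO_TAG if not form.get(field, "").strip()]
    if missing:
        raise ValueError(f"request is missing fields: {sorted(missing)}")
    return {tag: form[field].strip() for field, tag in FIELD_TO_TAG.items()}
```

The consumer only ever sees the left-hand side of that mapping. The tag keys on the right stay inside the Provider's pipeline.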

The Vending Pipeline: Tags as Infrastructure

Once the account request is approved and the required data is captured, the Provider's account vending pipeline takes over. This is where tags stop being metadata and start being infrastructure.

The vending pipeline, whether it's built on AWS Control Tower Account Factory, Terraform, or a custom solution, provisions the account with mandatory tags already applied. The tags are written at the account level as part of the same automation that creates the account, configures the security baseline, deploys the VPC, and wires up connectivity.

There's no separate tagging step. There's no "apply tags after provisioning" task that someone might forget or defer. The tags are part of the account's identity from the moment it exists. By the time the consumer gets access to their new environment, the mandatory tags are already there, applied, validated, and from their perspective, immutable.

At the AWS account level, these tags propagate through AWS Organizations. Tag policies enforce the expected format and values. Service Control Policies can prevent the creation of resources that don't carry the required tags. And because the Provider controls the Terraform modules that consumers use to provision resources within their accounts, those modules can inject mandatory tags automatically: the consumer's resource inherits the account's mandatory tags without any action on their part.

This is the inheritance model, and it works because of a design decision the Provider made long before any tagging conversation happened: accounts are segregated by environment and by workload. One account per workload per environment. This means the mandatory tags that describe ownership, cost allocation, and classification are consistent for every resource in the account because every resource in the account belongs to the same workload in the same environment. There's no ambiguity. There's no "this EC2 instance belongs to Team A but that RDS instance belongs to Team B." The account boundary is the ownership boundary, and the tags reflect that.
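To make the Service Control Policy guardrail mentioned above concrete, here is a minimal sketch that builds one as a Python dictionary. It follows the commonly documented pattern of denying `ec2:RunInstances` when a tag key is absent from the request (the `Null` condition on `aws:RequestTag/<key>`). The list of tag keys is an assumption, and a real policy would need to cover more resource types than EC2 instances.

```python
import json

# Assumed mandatory tag keys; the real taxonomy is whatever the Provider defines.
MANDATORY_TAG_KEYS = ["BusinessDivision", "CostCode", "TechnicalOwner"]


def deny_untagged_statement(tag_key: str) -> dict:
    """Build one SCP statement that denies ec2:RunInstances when the
    request does not include the given tag key."""
    return {
        "Sid": f"DenyRunInstancesWithout{tag_key}",
        "Effect": "Deny",
        "Action": "ec2:RunInstances",
        "Resource": "arn:aws:ec2:*:*:instance/*",
        # "Null": "true" matches when the tag key is missing from the request.
        "Condition": {"Null": {f"aws:RequestTag/{tag_key}": "true"}},
    }


def build_scp(tag_keys: list[str]) -> dict:
    """Assemble a policy document with one deny statement per tag key."""
    return {
        "Version": "2012-10-17",
        "Statement": [deny_untagged_statement(key) for key in tag_keys],
    }


print(json.dumps(build_scp(MANDATORY_TAG_KEYS), indent=2))
```

One statement per key keeps the logic simple: a launch request missing any single mandatory tag matches at least one deny statement.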

Why Mandatory Tags Must Be Immutable to Consumers

This is where most tagging implementations go sideways.

If consumers can modify mandatory tags, they will. Not maliciously, just inevitably. Someone will "fix" a cost code they think is wrong. Someone will update a technical owner field when a team member leaves, using a personal email instead of a distribution list. Someone will delete a tag they don't recognise because it looks like clutter.

Every one of these well-intentioned changes breaks something downstream. The cost code "fix" means the resource no longer maps to a valid entry in the financial system. The email change means the incident routing automation sends alerts to a personal inbox that nobody checks after hours. The deleted tag means a compliance report now has a gap that triggers an audit finding.

Mandatory tags are not the consumer's concern to maintain because they are not the consumer's data. They are organisational data that happens to be attached to the consumer's resources. The Provider maintains them. The Provider updates them. The Provider ensures they stay consistent with their sources of truth.

In practice, this means the tags are applied and managed through the Provider's pipeline: through Terraform state, through AWS Organizations tag policies, and through account-level automation that runs on a schedule to detect and remediate drift. If a tag gets changed, the pipeline puts it back. If a tag gets deleted, the pipeline restores it. The consumer's Terraform code doesn't manage these tags. The consumer's IAM permissions don't grant write access to them at the organisational level.

This isn't about trust issues. It's about system design. You don't let application code modify the VPC's route table. You don't let developers reconfigure the transit gateway. And you don't let consumers edit the tags that drive your organisation's financial reporting, security ownership, and compliance posture. These are platform concerns, managed by the platform, enforced by the platform.
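The detect-and-remediate loop described above reduces to a small comparison between what the pipeline expects and what is actually on the account. The sketch below shows only that comparison; the function name and sample values are hypothetical, and fetching and writing tags is left out.

```python
def plan_remediation(expected: dict[str, str], actual: dict[str, str]) -> dict[str, str]:
    """Return the mandatory tags that must be rewritten to undo drift.

    A tag has drifted if it is missing from `actual` or carries a different
    value. Keys in `actual` that are not in `expected` are consumer tags and
    are deliberately left alone.
    """
    return {key: value for key, value in expected.items() if actual.get(key) != value}


# Example: CostCode was "fixed" by hand and TechnicalOwner was deleted.
expected = {"CostCode": "CC-1234", "TechnicalOwner": "checkout-team@example.com"}
actual = {"CostCode": "CC-9999", "DeployVersion": "1.4.2"}
to_restore = plan_remediation(expected, actual)
```

In a real pipeline, `expected` would come from the Provider's source of truth and the result would be written back through whatever mechanism the pipeline uses, for instance the AWS Organizations tagging API or a Terraform apply.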

What Happens When a Consumer Needs a Tag Changed?

Tags like CostCode or TechnicalOwner do change over time. Teams reorganise. Cost centres get restructured. People move to different roles. The fact that mandatory tags are immutable to consumers doesn't mean they're permanent; it means changes flow through the right channels.

When a consumer needs a mandatory tag updated, they raise a request. This might be a ServiceNow ticket, or a conversation with the Platform team. The Provider validates the new value against the source of truth (does this cost code actually exist in SAP? Is this email address a valid distribution list in the directory?), then updates the tag through the pipeline, and the change propagates to the account and its resources.

The process is intentionally deliberate (not slow, deliberate). Mandatory tags affect reporting, billing allocation, security ownership, and compliance classification across the organisation. A change to the 'BusinessDivision' tag on an account ripples through every cost report, every ownership dashboard, and every compliance audit that references that account. That change should be validated, approved, and applied through automation, not made ad hoc in the AWS console by someone who needed to update a spreadsheet.

In practice, mandatory tag change requests are rare. Cost codes don't change monthly. Business division structures don't shift weekly. Technical ownership transfers happen, but not frequently enough to create operational burden. The channel exists for when it's needed, but the design assumes stability, because mandatory tags represent organisational facts that are, by definition, slow-moving.
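The validation step in that change process might look like the sketch below. The `valid_values` dictionary is a stand-in for lookups against SAP or the directory, and the class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagChangeRequest:
    account_id: str
    tag_key: str
    new_value: str


def validate_change(req: TagChangeRequest, valid_values: dict[str, set[str]]) -> list[str]:
    """Return a list of problems; an empty list means the change may proceed.

    `valid_values` maps each mandatory tag key to the values currently
    present in its source of truth (a stand-in for a real lookup).
    """
    problems = []
    allowed = valid_values.get(req.tag_key)
    if allowed is None:
        problems.append(f"{req.tag_key} is not a mandatory tag managed by the Provider")
    elif req.new_value not in allowed:
        problems.append(f"{req.new_value!r} not found in the source of truth for {req.tag_key}")
    return problems
```

Only a request that comes back with no problems would be handed to the pipeline for application. Anything else goes back to the requester with the reason attached.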

Consumer Tags: Where the Provider Steps Back

Not all tags are mandatory. Consumers obviously have their own tagging needs: operational tags, automation tags, and application-specific metadata that the platform team has no opinion about. And that's exactly right. The Provider should have no opinion about them.

A consumer might want to tag resources with deployment version numbers, feature flags, team-specific identifiers, or automation triggers. These tags serve the consumer's operational needs, not the organisation's reporting needs. The Provider doesn't define them, doesn't enforce them, and doesn't manage them.

What the Provider can do, and should do, is provide guidance. A set of recommended operational tags, published as suggestions rather than mandates, helps consumers who aren't sure what to tag or how. Tags like CostCentre (for project-level cost tracking below the mandatory CostCode), ManagedBy (Terraform, CloudFormation, manual), or BackupPolicy (daily, weekly, none) are operationally useful and don't need to be mandated to be valuable.

The Provider can also provide tag validation tooling that consumers can optionally adopt. Linting rules in the CI/CD pipeline that warn (but don't block) when recommended tags are missing. Dashboard views that show tag coverage across a consumer's account. These are nudges, not guardrails, and the distinction really matters.

Mandatory tags get guardrails. Consumer tags get guidance. Conflating the two undermines both.
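A warn-but-don't-block lint for recommended tags could be as small as the following sketch. The recommended keys are the ones named above; the resource names and the shape of the input are assumptions.

```python
import sys

RECOMMENDED_TAGS = ("CostCentre", "ManagedBy", "BackupPolicy")


def lint_recommended_tags(resource_name: str, tags: dict[str, str]) -> list[str]:
    """Return one warning per recommended tag the resource lacks."""
    return [
        f"warning: {resource_name} is missing recommended tag {key}"
        for key in RECOMMENDED_TAGS
        if key not in tags
    ]


def report(resources: dict[str, dict[str, str]]) -> int:
    """Print warnings to stderr and return exit code 0 regardless."""
    for name, tags in resources.items():
        for warning in lint_recommended_tags(name, tags):
            print(warning, file=sys.stderr)
    return 0  # a nudge, not a guardrail: never fail the build
```

The unconditional zero exit code is the whole point. The same check wired to a non-zero exit would turn guidance into enforcement.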

The ServiceNow Form: Designed for Data, Not for Tagging

This is where the consumer experience either succeeds or fails.

A poorly designed form asks the consumer to provide tagging data. It has fields labelled "Business Division Tag," "Cost Centre Tag," and "Application Name Tag." The consumer sees these and thinks "I don't know what half of these mean" or "I'll come back to this later" (they won't). The form creates friction, friction creates gaps, and gaps create delays in environment vending.

A well-designed form asks plain questions about the workload instead. Behind the form, the platform team has mapped every field to a tag key and configured lookups against the sources of truth. The business division dropdown pulls from a source of truth. The cost centre field validates against SAP. The application name resolves against the enterprise architecture tool. The consumer selects from a dropdown and the pipeline writes the value to the account. The consumer types a cost centre and the pipeline validates it exists before writing.

This form is the interface between the consumer's knowledge and the Provider's tagging system. When it's designed well, the consumer knows what information is required well up front and fills it out in five minutes, the pipeline has everything it needs, and mandatory tags arrive fully populated and validated before the account is even created. The consumer never typed a tag key. They never formatted a tag value. They answered questions about their workload, and the platform did the rest.

Resource Inheritance: Why Account Segregation Makes This Work

The inheritance model (where resources automatically carry the mandatory tags of their parent account) only works if accounts are properly segregated. And this is a design decision the Provider must make early, because retrofitting it later is pain - real pain.

The principle is straightforward. One account per workload per environment. The production instance of Application X lives in one account. The development instance lives in another. The staging instance lives in a third. Each account has mandatory tags that describe that specific combination of workload and environment.

When a consumer provisions an ECS/EC2 instance, an RDS database, or an S3 bucket in that account, the resource inherits the account's mandatory tags. The Business Division tag is correct because the account belongs to that division. The Application Name tag is correct because the account hosts that application. The Environment tag is correct because the account is dedicated to that environment. The Cost Code is correct because the cost centre covers all resources in that workload's environment.

There's no need for the consumer to tag each resource individually with organisational metadata. The account is the tagging boundary. Everything inside it inherits. This eliminates the most common source of tagging inconsistency in large estates: resources within the same account tagged differently because different team members provisioned them at different times with different understandings of what the values should be.

Account segregation also eliminates the shared account problem. You know which one I mean: the scenario where a single account hosts resources from multiple applications owned by different teams with different cost centres. In a shared account, tagging has to happen at the resource level because the account-level tags can't represent multiple owners. This is fragile, error-prone, and impossible to enforce consistently. Proper account segregation makes it unnecessary.
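The inheritance rule itself is a merge in which the account's mandatory tags always win. A minimal sketch, assuming the Provider's provisioning module performs the merge and using hypothetical sample values:

```python
def effective_tags(account_tags: dict[str, str], consumer_tags: dict[str, str]) -> dict[str, str]:
    """Combine consumer tags with the account's mandatory tags.

    Mandatory tags are unpacked last, so they override any consumer-supplied
    value that uses the same key.
    """
    return {**consumer_tags, **account_tags}


account_tags = {"BusinessDivision": "Retail", "CostCode": "CC-1234", "Environment": "production"}
consumer_tags = {"DeployVersion": "1.4.2", "CostCode": "my-guess"}
resource_tags = effective_tags(account_tags, consumer_tags)
```

Here `resource_tags` keeps the consumer's `DeployVersion` but carries the account's `CostCode`, which is the behaviour the one-account-per-workload-per-environment design relies on.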

The Tags That Drive Everything

Let's be specific about what mandatory tags actually enable, because the payoff is what justifies the investment in the pipeline.

Financial reporting and cost allocation. When every account carries validated Business Division and Cost Code tags, cost allocation becomes automatic. AWS Cost Explorer can slice spend by any of these dimensions without manual intervention. FinOps teams can produce division-level cost reports that match the financial system because the tag values are sourced from the financial system. There's no reconciliation step. There's no "unallocated" bucket growing larger every month. The tags are the reporting layer, and because they're mandatory and immutable, the reports are trustworthy. This paves the way for integrating enterprise-level FinOps tooling on top of data it can rely on.

Security and compliance ownership. The Security Owner and Technical Owner tags establish clear accountability for every account and every resource within it. When a vulnerability is discovered, the security team doesn't need to dig through wikis or Slack/Teams channels to find who owns the affected resource. The tag tells them. When a compliance audit requires evidence of ownership for production workloads, the tag provides it. Automated incident routing can use these tags to page the right person at 2am without human triage.

Operational efficiency. Tags like Application Name, Application ID, and Environment enable automation that would be impossible without consistent metadata. Automated backup policies keyed to environment type. Monitoring thresholds that differ between production and development. Cost anomaly detection scoped to individual applications rather than entire accounts. Each of these automations depends on tags being present, accurate, and consistent (exactly what mandatory tagging through the Provider pipeline guarantees).

Discovery and migration planning. When an organisation needs to assess its cloud estate (for migration, for rationalisation, for M&A due diligence), mandatory tags provide the map. Every account can be categorised by business division, application, environment, and ownership without manual investigation. A query against AWS Organizations returns a complete inventory, tagged and classified, ready for analysis. Without mandatory tags, this same exercise takes weeks of interviews, spreadsheet reconciliation, and educated guessing.
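Once every account carries the same mandatory keys, an inventory like the one described above is a simple grouping exercise. The sketch below assumes the account list has already been fetched (for example from AWS Organizations) into plain dictionaries; the record shape and sample data are hypothetical.

```python
from collections import defaultdict


def inventory_by(accounts: list[dict], tag_key: str) -> dict[str, list[str]]:
    """Group account IDs by the value of one mandatory tag.

    Accounts lacking the tag land under "UNTAGGED", which should stay empty
    if the vending pipeline is doing its job.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for account in accounts:
        groups[account["tags"].get(tag_key, "UNTAGGED")].append(account["id"])
    return dict(groups)


accounts = [
    {"id": "111111111111", "tags": {"BusinessDivision": "Retail", "Environment": "production"}},
    {"id": "222222222222", "tags": {"BusinessDivision": "Retail", "Environment": "development"}},
    {"id": "333333333333", "tags": {"BusinessDivision": "Logistics", "Environment": "production"}},
]
by_division = inventory_by(accounts, "BusinessDivision")
```

The same function answers "which accounts are production?" by passing `"Environment"` instead.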

Governance That Doesn't Feel Like Governance

This is the part that matters most, and it's the part that most tagging strategies miss the mark on.

Good governance is invisible to those being governed. If your consumers feel the weight of your tagging policy, if they're spending time looking up tag values, formatting keys correctly, remembering which tags are mandatory, then your governance is creating friction, and friction erodes compliance over time. People start cutting corners. They copy tags from another resource without checking if they're correct. They leave optional-but-important fields blank because the form is already too long.

Mandatory tagging through the Provider > Consumer model removes all of this. The governance is in the pipeline, not in the consumer's workflow. The consumer's experience is simple: request an account, answer some questions about your workload, get a fully provisioned environment, start building. The tags are there. They're correct. They'll stay correct. And the consumer never had to think about them.

This is what it means to build for consumability in the context of tagging. You're not building a tagging policy. You're building a tagging system that operates as infrastructure, fully automated, validated, enforced, and invisible to the people it serves. The reporting works because the data is clean. The data is clean because the pipeline ensures it. The pipeline ensures it because the Provider owns it end to end.

Tags aren't a practice. They're plumbing. And like all good plumbing, the best measure of success is that nobody thinks about it until something goes wrong, which, if you've built it right, it won't (famous last words 😉)

Mandatory tags don't ask for compliance. They deliver it. Build the pipeline, own the taxonomy, and let your consumers focus on what they're actually here to do: build and operate!

Top comments (1)

vandana.platform • Apr 1

This nails a truth most teams learn the hard way tagging is not a policy problem, it’s a platform design problem. Turning tags into inherited, immutable infrastructure via the account vending pipeline is what finally makes compliance real instead of aspirational.