Gerardo Castro Arica for AWS Heroes

Posted on Mar 25

I automated an AWS Security Maturity Model recommendation across 40 accounts — design decisions included

#aws #python #cloud #security

The AWS Security Maturity Model has a recommendation in Phase 1 — Quick Wins that seems trivial: assign a security contact in each account of your AWS Organization.

It's not glamorous. It doesn't have a complex architecture diagram. It doesn't require enabling any new service. It's literally filling out a form with a name, an email, and a phone number.

And yet, in most organizations I work with in LATAM, it isn't done.

Not because nobody knows about it — but because in environments with dozens of accounts, "filling out a form" becomes a manual process that depends on someone remembering, having access, and doing it correctly in each account. And when Control Tower provisions a new account, that process starts from scratch all over again.

The question that kicked off this project was simple: why am I doing this by hand?

The problem

An active AWS Organization isn't static. New projects arrive, development environments get created, teams join. With Control Tower, provisioning a new account takes minutes — and that's exactly what you want. The problem is what happens after provisioning.

Every new account is born without a security contact. AWS uses that contact to send critical alerts — abuse notifications, credential compromises, active vulnerabilities in your resources. If it isn't configured, those alerts go to the root account email, which in most cases nobody actively monitors.

In an organization with 10 accounts, you can manage it by hand. With 20, you start missing things. With 40, it's systematically inconsistent.

The maturity model is clear about this: it's not an advanced recommendation. It's in Phase 1. It's baseline. It's what you should have solved before anything else. And "solved" doesn't mean doing it once for the accounts that exist today — it means any account created tomorrow also has it, automatically, without anyone having to remember.

That requires automation. Not a runbook, not a checklist — real automation that reacts to the right event and doesn't depend on human intervention.

The architecture

The most important design decision in this project isn't the Lambda — it's where it lives and how it gets triggered.

The obvious temptation is simple: create an EventBridge rule in the Management account that invokes the Lambda directly. It works. But it creates coupling you'll regret later: the Management account knows a Lambda exists, knows where it is, and needs permissions to invoke it cross-account. Every new consumer of that event requires touching the Management account.

The right solution is a Custom Event Bus in the Security account.

Control Tower (Management)
  → EventBridge Rule
  → Custom Event Bus (Security)
  → EventBridge Rule
  → Lambda
  → account API (all accounts)
  → Slack notification

The flow is: Management publishes the event to the Bus. Security has its own rules that decide what to do with it. Management knows nothing about the Lambda — it only knows there's a Bus to send events to.

What you gain from this is real extensibility. The Custom Event Bus in Security becomes the central point for organization events. When tomorrow you want to react to the same event in a different way — send to a SIEM, trigger another automation, notify a different channel — you add a rule to the Bus. The Management account isn't touched. That's the kind of decision that seems like overhead at first and that you appreciate when the system grows.
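As a rough sketch of the Management side of that flow, the function below creates a rule whose only target is the Security account's bus. The repository provisions this with Terraform, so everything here is illustrative: the rule name, the target Id, and both ARNs are placeholders I made up, and `events_client` is assumed to behave like `boto3.client("events")`.

```python
import json

# Same pattern as the EventBridge rule discussed in the next section.
PATTERN = {
    "source": ["aws.controltower"],
    "detail-type": ["AWS Service Event via CloudTrail"],
    "detail": {"eventName": ["CreateManagedAccount"]},
}


def forward_to_security_bus(events_client, security_bus_arn, role_arn,
                            rule_name="forward-create-managed-account"):
    """Create a rule on the default bus whose only target is another bus.

    rule_name and the target Id are hypothetical names chosen for this sketch.
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(PATTERN),
        State="ENABLED",
    )
    # The target is the Security bus itself; the rule knows nothing about the Lambda.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{
            "Id": "security-bus",
            "Arn": security_bus_arn,
            "RoleArn": role_arn,  # role allowed to call events:PutEvents on that bus
        }],
    )
```

For the forwarding to work, the receiving bus also needs a resource policy that allows the Management account to put events on it. Adding a new consumer later only means adding a rule on the Security side.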

CreateManagedAccount, not CreateAccount

When Control Tower provisions an account, AWS emits two distinct events at different moments in the process. Confusing them is one of the easiest mistakes to make — and one of the hardest to diagnose because both seem correct on paper.

CreateAccount is emitted when the creation process begins. At that point, the account exists in Organizations but Control Tower hasn't finished configuring it yet. If you trigger the Lambda with that event, you attempt to assign the security contact on an account that isn't fully accessible yet. The result is an error that looks like a permissions problem — and one that will have you spending time reviewing IAM policies that are perfectly fine.

CreateManagedAccount is emitted when Control Tower finishes provisioning. The account is ready, the roles are deployed, you can operate on it. This is the correct event.

But there's a second, subtler trap. The event's detail-type isn't AWS Control Tower via CloudTrail as you might assume — it's AWS Service Event via CloudTrail. This detail isn't well documented and there's only one way to discover it: opening CloudTrail, finding the actual event emitted when CT provisioned an account, and reading the full JSON.

The correct EventBridge rule looks like this:

{
  "source": ["aws.controltower"],
  "detail-type": ["AWS Service Event via CloudTrail"],
  "detail": {
    "eventName": ["CreateManagedAccount"]
  }
}

If you use CreateAccount or the wrong detail-type, the rule never fires — or fires at the wrong moment. In both cases, the security contact isn't assigned and there's no visible error telling you why.
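To see why a wrong value fails silently, here is a deliberately simplified matcher that mimics only exact-value matching. It is not the EventBridge engine (prefix, anything-but, and the other operators are ignored), and the sample event is a trimmed stand-in for the JSON you would copy from CloudTrail.

```python
def matches(pattern, event):
    """Simplified EventBridge-style matching: exact values only.

    Each pattern key must exist in the event; a list means "any of these
    values" and a dict means "match the nested object".
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


RULE = {
    "source": ["aws.controltower"],
    "detail-type": ["AWS Service Event via CloudTrail"],
    "detail": {"eventName": ["CreateManagedAccount"]},
}

# Trimmed-down stand-in for the real event.
sample = {
    "source": "aws.controltower",
    "detail-type": "AWS Service Event via CloudTrail",
    "detail": {"eventName": "CreateManagedAccount"},
}

wrong_detail_type = {**RULE, "detail-type": ["AWS Control Tower via CloudTrail"]}
wrong_event_name = {**RULE, "detail": {"eventName": ["CreateAccount"]}}

print(matches(RULE, sample))               # True
print(matches(wrong_detail_type, sample))  # False
print(matches(wrong_event_name, sample))   # False
```

A pattern that doesn't match produces no error, only silence. For an authoritative check, EventBridge's TestEventPattern API evaluates a pattern against a complete event JSON.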

The Lambda and idempotency

The most important behavior of this Lambda isn't what it does when it finds an account without a contact — it's what it does when it finds an account that already has the correct one.

It does nothing.

That seems obvious, but it has concrete design implications. The Lambda doesn't blindly call PutAlternateContact on every account. It first calls GetAlternateContact, reads the current value, compares it to the expected one, and only acts if there's a real difference. If the contact exists and is correct, the account is marked as "unchanged" and the process moves on.

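# Excerpt from the per-account loop
current = get_contact(account_id, is_management)
if current == expected:
    # Contact already matches: count it and skip the write
    results["unchanged"] += 1
    continue

# Contact is missing or differs: write the expected value
set_contact(account_id, is_management)
results["assigned"] += 1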

This makes the Lambda idempotent: you can run it a hundred times and get the same result. An account whose contact already matches the expected value is never rewritten; only a missing or different contact triggers a write.

Idempotency also solves the problem of existing accounts. A reactive trigger that only responds to CreateManagedAccount covers new accounts, but not the 40 that already existed before deploying the tool. The solution is to run the Lambda once across the entire organization on the first deploy — same code, no additional logic, because idempotency guarantees it won't break anything that's already correctly configured.
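A minimal sketch of that loop, runnable without AWS, shows the property in action. The `enforce` function and the in-memory store are my own illustration, not the repository's code; it assumes `get_contact` returns `None` when no contact is set, and it splits writes into "assigned" and "updated" to mirror the counters in the summary.

```python
def enforce(accounts, expected, get_contact, set_contact):
    """Bring every account to the expected contact, writing only on a difference.

    accounts: iterable of (account_id, is_management) pairs.
    get_contact: returns the current contact dict, or None when none is set.
    set_contact: writes the expected contact to the account.
    """
    results = {"assigned": 0, "updated": 0, "unchanged": 0}
    for account_id, is_management in accounts:
        current = get_contact(account_id, is_management)
        if current == expected:
            results["unchanged"] += 1
            continue
        set_contact(account_id, is_management)
        # No previous contact counts as assigned, a different one as updated.
        results["assigned" if current is None else "updated"] += 1
    return results


# In-memory stand-in for the Account API, for illustration only.
expected = {"Name": "Security Team", "EmailAddress": "security@example.com"}
store = {
    "111": None,
    "222": {"Name": "Old", "EmailAddress": "old@example.com"},
    "333": dict(expected),
}
accounts = [(account_id, False) for account_id in store]


def get(account_id, is_management):
    return store[account_id]


def put(account_id, is_management):
    store[account_id] = dict(expected)


first = enforce(accounts, expected, get, put)
second = enforce(accounts, expected, get, put)
print(first)   # {'assigned': 1, 'updated': 1, 'unchanged': 1}
print(second)  # {'assigned': 0, 'updated': 0, 'unchanged': 3}
```

The second run performs no writes at all, which is why the same code can serve both the initial sweep and the event-driven path.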

The summary of each execution arrives via Slack:

✅ Security Contact Enforcer
Accounts processed: 40
Assigned: 3
Updated: 1
Unchanged: 36

That's the message you want to see. Thirty-six accounts that were already fine — and four that needed attention and no longer do.
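For reference, a message like that can be built and posted with the standard library alone. This is a sketch under assumptions: the counter names match the summary above, "processed" is taken as the sum of the three counters (errors are not modeled), and the webhook URL is whatever Slack incoming webhook you configure.

```python
import json
import urllib.request


def format_summary(results):
    """Render the counters in the layout of the Slack message above."""
    processed = sum(results.values())
    return "\n".join([
        "✅ Security Contact Enforcer",
        f"Accounts processed: {processed}",
        f"Assigned: {results['assigned']}",
        f"Updated: {results['updated']}",
        f"Unchanged: {results['unchanged']}",
    ])


def notify_slack(webhook_url, results):
    """POST the summary to a Slack incoming webhook supplied by the caller."""
    body = json.dumps({"text": format_summary(results)}).encode("utf-8")
    request = urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


print(format_summary({"assigned": 3, "updated": 1, "unchanged": 36}))
```

Keeping the formatting separate from the HTTP call makes the message easy to check locally before anything is sent.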

The account API and its traps

The Lambda is simple. The AWS Account API is not.

There are three behaviors of the account API that aren't well documented, that don't generate descriptive errors when ignored, and that only appear when you deploy in a real environment.

The API is global — it only works in us-east-1

The AWS account service is a global API. In practice that means it isn't served from every region: it only answers in us-east-1. If your Lambda lives in us-east-2 or sa-east-1 and you create the boto3 client without specifying a region, the client picks up the Lambda's own region and the call fails with AccessDeniedException.

The error is confusing because it looks like an IAM permissions problem. You can spend hours reviewing policies that are perfectly fine before realizing the issue is the region. The solution is explicit:

account_client = boto3.client('account', region_name='us-east-1')

One line. But one you only know you need after you've needed it.

The Management account doesn't accept AccountId

For member accounts, the API accepts an AccountId parameter that indicates which account to operate on. For the Management account, that parameter can't be passed — the call must be made without it, in standalone mode.

If you pass the Management account's AccountId, the API returns an error. Without logic that branches on account type, the Lambda either fails on the Management account or, if that error is swallowed, skips it without warning.

The solution was adding an is_management parameter to the get_contact() and set_contact() functions:

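import boto3

# The account API only answers in us-east-1 (see above)
client = boto3.client('account', region_name='us-east-1')

def get_contact(account_id, is_management=False):
    # The Management account must be queried without AccountId
    if is_management:
        return client.get_alternate_contact(AlternateContactType='SECURITY')
    return client.get_alternate_contact(
        AccountId=account_id,
        AlternateContactType='SECURITY'
    )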
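One way to round this out, sketched under a few assumptions: the client is passed in as a parameter (a change from the signature above, made so the functions can be exercised with a fake), a missing contact is assumed to surface as `ResourceNotFoundException`, and the response is reduced to four fields so it can be compared with `==`. The `CONTACT_FIELDS` name is mine.

```python
CONTACT_FIELDS = ("Name", "Title", "EmailAddress", "PhoneNumber")


def get_contact(client, account_id, is_management=False):
    """Return the SECURITY contact as a plain dict, or None if none is set."""
    kwargs = {"AlternateContactType": "SECURITY"}
    if not is_management:
        kwargs["AccountId"] = account_id  # omitted for the Management account
    try:
        response = client.get_alternate_contact(**kwargs)
    except client.exceptions.ResourceNotFoundException:
        return None
    contact = response["AlternateContact"]
    # Keep only the comparable fields so the result can be checked with ==.
    return {field: contact.get(field) for field in CONTACT_FIELDS}


def set_contact(client, account_id, expected, is_management=False):
    """Write the expected contact; expected holds the four CONTACT_FIELDS."""
    kwargs = {"AlternateContactType": "SECURITY", **expected}
    if not is_management:
        kwargs["AccountId"] = account_id
    client.put_alternate_contact(**kwargs)
```

Both functions build their arguments the same way, so the Management-account special case lives in exactly one `if` per call.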

Trusted Access must be enabled once

Before you can call account:GetAlternateContact or account:PutAlternateContact from an account other than Management, you need to enable Trusted Access between AWS Organizations and the Account Management service. Without this step, the API returns AccessDeniedException even if the role has all the correct permissions.

aws organizations enable-aws-service-access \
  --service-principal account.amazonaws.com \
  --profile YOUR-MANAGEMENT-PROFILE

It's a one-time step that isn't in the Terraform flow — it must be done manually before the first deploy. If it isn't documented, it's the kind of prerequisite that costs hours to whoever tries to reproduce the project from scratch.
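Because the step is manual, it can help to check for it before deploying. The sketch below is an assumption-laden example rather than part of the project: the function name is mine, and `org_client` is expected to behave like `boto3.client("organizations")` running with Management account credentials.

```python
def trusted_access_enabled(org_client, principal="account.amazonaws.com"):
    """Return True if the principal appears in the organization's enabled services."""
    kwargs = {}
    while True:
        page = org_client.list_aws_service_access_for_organization(**kwargs)
        for service in page.get("EnabledServicePrincipals", []):
            if service["ServicePrincipal"] == principal:
                return True
        # Follow pagination until the service is found or the pages run out.
        token = page.get("NextToken")
        if not token:
            return False
        kwargs = {"NextToken": token}
```

A `False` here explains an AccessDeniedException much faster than a review of IAM policies does.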

The Makefile as the only path

Lambda with ZIP has a silent trap that's easy to ignore until it costs you real time.

When you update the Python code in src/security_contact_enforcer.py and build the ZIP manually, Lambda keeps running the old code. Without any error. Without any warning. It simply executes the previous version as if nothing changed.

What happens is that the code Lambda executes isn't the file you edited directly — it's what's inside src/package/, the directory that gets packaged into the ZIP. If you update the source but forget to copy it to package/ before repackaging, the "successful" Lambda deploy contains outdated code.

In a manual flow with multiple steps, that oversight is inevitable. The solution is to eliminate the manual flow.

The Makefile turns the update into a single command:

.PHONY: update
update:
	cp src/security_contact_enforcer.py src/package/
	rm -f function.zip
	cd src/package && zip -r ../../function.zip .
	aws lambda update-function-code \
		--function-name $(FUNCTION_NAME) \
		--zip-file fileb://function.zip \
		--profile $(SECURITY_PROFILE) \
		--region $(AWS_REGION)

make update

That's it. You edit src/security_contact_enforcer.py, run make update, and the full cycle — copy, package, deploy — happens in order without any steps that can be skipped.

The Makefile isn't an optimization — it's the only safe way to update the Lambda. When there's only one correct path, there's no room for human error.
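If you want a belt-and-braces check on top of the Makefile, a few lines of Python can confirm that the ZIP really contains the file you just edited. This helper is hypothetical and not part of the repository; it assumes the source file sits at the root of the archive under its own name, which is what zipping the contents of src/package/ produces.

```python
import hashlib
import zipfile
from pathlib import Path


def zip_is_current(source_path, zip_path):
    """Return True if the ZIP holds a byte-identical copy of the source file."""
    source = Path(source_path)
    expected = hashlib.sha256(source.read_bytes()).hexdigest()
    with zipfile.ZipFile(zip_path) as archive:
        if source.name not in archive.namelist():
            return False
        packaged = hashlib.sha256(archive.read(source.name)).hexdigest()
    return packaged == expected
```

Calling `zip_is_current("src/security_contact_enforcer.py", "function.zip")` before uploading would catch the stale-copy case described above.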

How to validate it without creating an account

Validating that the system works end-to-end has two independent parts. Confusing them leads to incomplete tests or waiting for Control Tower to provision a real account every time you want to verify something.

Validate the Lambda logic

For this, you don't need to create any account. Manually delete the security contact from an existing account:

aws account delete-alternate-contact \
  --alternate-contact-type SECURITY \
  --account-id YOUR-ACCOUNT-ID \
  --profile YOUR-MANAGEMENT-PROFILE \
  --region us-east-1

Then invoke the Lambda directly:

aws lambda invoke \
  --function-name security-contact-enforcer \
  --payload '{}' \
  --cli-binary-format raw-in-base64-out \
  response.json \
  --profile YOUR-SECURITY-PROFILE \
  --region YOUR-REGION

cat response.json

If the result shows Assigned: 1 and Unchanged: N-1, the complete logic is validated. The Lambda traversed all accounts, detected the one without a contact, fixed it, and left the rest untouched. All without touching EventBridge or Control Tower.

Validate the EventBridge trigger

This validation is independent and requires the real event. The only way to do it is to create an account from Control Tower and verify in CloudWatch Logs that the Lambda fired automatically upon receiving the CreateManagedAccount event.

These are two distinct tests that validate different things. The first confirms the business logic works. The second confirms the trigger responds to the correct event. You need both — but you don't need to do them together or in that order.

The result

Running the Lambda for the first time against an organization with 40 active accounts produced this:

✅ Security Contact Enforcer
Accounts processed: 40
Assigned: 31
Updated: 4
Unchanged: 5

Thirty-one accounts without a security contact. Not because nobody cared about security, but because nobody had built the mechanism to guarantee it systematically. The 4 updated accounts did have a contact, but with outdated information: an email address that no longer existed, or a person who had already left the team.

Only 5 accounts were correctly configured.

That number isn't unusual. It's what you find when you audit this control in organizations that have been on AWS for years without baseline automation. The problem isn't negligence — it's that without automation, consistency depends on someone remembering to do something manually at the right moment, every time a new account gets created.

Since the deploy, every account Control Tower provisions has a security contact assigned before the team that requested it finishes configuring their first resources. No tickets, no runbooks, no human intervention.

The operational cost of maintaining this control is now effectively zero.

Closing

The AWS Security Maturity Model has dozens of controls. Some are complex, costly, and require weeks of planning. This one isn't.

Assigning a security contact is Phase 1 — Quick Wins. It's the first thing you should have solved. And yet, in practice, it's one of the most frequently omitted controls in multi-account organizations because nobody built the mechanism to guarantee it continuously.

What this project demonstrates isn't that automation is hard — it's exactly the opposite. A Lambda, a Custom Event Bus, Terraform and a Makefile are enough to turn a manual, error-prone process into a control that works on its own, forever, without anyone having to remember.

The traps that appeared along the way — the wrong Control Tower event, the global API that only lives in us-east-1, the special behavior of the Management account, the Lambda ZIP that silently deployed old code — aren't well documented anywhere. They appear when you deploy in production with real accounts. That's why they're documented here.

If you're building AWS security in LATAM with the resources you have, I hope this saves you the hours it cost me to discover them.

The repository is on GitHub with all the code, the IaC in Terraform, and the complete README.

🔗 GitHub: gerardokaztro/security-contact-enforcer

About the author

Gerardo Castro is an AWS Security Hero and Cloud Security Engineer focused on LATAM. Founder and Lead Organizer of the AWS Security Users Group LatAm. He believes the best way to learn cloud security is by building real things — not memorizing frameworks. He writes about what he builds, what he finds, and what he learns along the way.

🔗 GitHub: gerardokaztro
🔗 LinkedIn: gerardokaztro
