DEV Community

Tarlan Huseynov

AWS Multi-Account IaC Sandwich: Terragrunt, Terraform, CloudFormation

Introduction

Hey folks! Today we have an "IaC sandwich" on the menu: a guide to centralized, secure AWS multi-account management using a layered strategy that combines AWS-native capabilities, infrastructure-as-code flexibility, and streamlined multi-account governance.

Back in the day, while diving deep into the Terraform/Terragrunt duo, I found myself asking: how can I further automate infrastructure provisioning across multiple AWS accounts while keeping it secure, scalable, and maintainable, and while keeping authorization short-lived, role-based, and tightly controlled? I wanted a solution that would streamline access management, enforce security best practices, and provide a centralized approach to managing infrastructure.

While the Terraform and Terragrunt duo already offers strong Infrastructure-as-Code (IaC) capabilities, one AWS-native service often gets overlooked: CloudFormation StackSets. Although CloudFormation is AWS-specific and lacks the flexibility of Terraform providers, it has a significant advantage: organization-wide deployments at scale. That made me realize StackSets could be a powerful addition to a Terraform/Terragrunt-based workflow as the first layer of the sandwich. In the pre-provisioning phase, they act as an easy "init hook" that puts all required IAM roles and permissions in place before the actual infrastructure deployment begins.

So, in this article, I’ll walk you through how I designed a fully automated, secure, and scalable infrastructure management approach using Terraform, Terragrunt, and AWS CloudFormation StackSets—leveraging the best of each tool to create an ironclad AWS setup.

The Challenge

In multi-account AWS environments, access management and security are paramount. When using Terraform and Terragrunt, you need IAM roles that allow infrastructure automation while maintaining strict security controls. However, these roles must be pre-provisioned before Terraform can even begin managing infrastructure. This raises an important question:

How can we centrally create IAM roles across all AWS accounts in a secure and automated way?

Here's what we needed:

  • A centralized approach for IAM role provisioning across all AWS accounts.
  • A secure way to assume roles with least-privilege access.
  • Seamless integration with Terraform and Terragrunt for managing infrastructure.
  • Short-lived credentials for increased security via AWS Identity Center (SSO).

The Solution Overview

AWS Multi-Account Access Management

All Terraform state will be stored in the Shared-Services account. This ensures centralized state management, improving security and consistency across environments.

The main Terraform execution role and the GitHub Actions OIDC role also reside in the Shared-Services account. These roles, along with Management account administrators, can assume the account-level Terraform execution roles, so any entity that can assume the main role can control multiple accounts through IaC.

Account-Based Terraform Execution Roles: Each AWS account (Development, Staging, Production) has its own Terraform execution role, which is assumed when provisioning infrastructure within that specific account. Each per-environment role should be tailored to the least-privilege principle; for demo purposes, our roles have the AdministratorAccess policy attached.

GitHub Actions OIDC Role: This role enables secure CI/CD automation by allowing GitHub Actions workflows to assume it for deployments.
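
The gha-oidc module itself is not listed in this article. As a rough sketch of what such a module might contain, assuming the tf_repo, gha_role_name, and tf_role_name inputs defined in common.hcl and GitHub's standard OIDC issuer, the trust setup could look like this. The resource names, the inline policy, and the thumbprint handling are illustrative assumptions, not the author's actual module:

```hcl
variable "tf_repo" { type = string }       # "owner/repo", from common.hcl inputs
variable "gha_role_name" { type = string } # from common.hcl inputs
variable "tf_role_name" { type = string }  # from common.hcl inputs

# GitHub's OIDC identity provider, registered once in the Shared-Services account.
resource "aws_iam_openid_connect_provider" "github" {
  url            = "https://token.actions.githubusercontent.com"
  client_id_list = ["sts.amazonaws.com"]
  # Assumption: a commonly published GitHub thumbprint; verify before use.
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

# Trust policy: only workflows from the configured repository may assume the role.
data "aws_iam_policy_document" "gha_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:${var.tf_repo}:*"]
    }
  }
}

resource "aws_iam_role" "gha" {
  name               = var.gha_role_name
  assume_role_policy = data.aws_iam_policy_document.gha_trust.json
}

# Lets the CI role hop into the account-level Terraform execution roles.
resource "aws_iam_role_policy" "assume_tf_roles" {
  name = "assume-terraform-execution-roles" # hypothetical policy name
  role = aws_iam_role.gha.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["sts:AssumeRole", "sts:TagSession"]
      Resource = "arn:aws:iam::*:role/${var.tf_role_name}"
    }]
  })
}
```

In practice the `sub` condition is usually narrowed further, for example to a single branch or GitHub environment, rather than the whole repository.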

To maintain strict security controls, we need to limit the identities that can assume the Shared-Services Terraform Execution Role. In our case:

  • The Management (master) Account can assume both the Shared-Services Terraform Execution Role and the individual account-based roles directly.
  • Only authorized users from the Management Account have permission to assume roles within the Shared-Services account, ensuring controlled access.

So, ultimately, the identities that should be able to assume Terraform execution roles should be very limited. In our case, these entities are:

  • AWS Management account Administrators
  • Shared-Services "terraform-execution-role"
  • GitHub Actions OIDC federated role

The only entities that apply anything on the infrastructure-org path, which stores the organization-level configurations that go through the management account, are AWS Management Account administrators. Therefore, we do not provide CI automation for this path and limit access to it to a very small group (Management Account admins).

So, getting back to the pre-provisioning phase to prepare these roles before the actual provisioning—how do we achieve that?
The answer? Leverage AWS CloudFormation StackSets to pre-provision IAM roles across all AWS accounts before using Terraform/Terragrunt against infrastructure-live. In simple terms, we will have specific management modules utilized under the infrastructure-org path before switching to infrastructure-live.

Two-Phase Deployment

  1. Phase 1: Use CloudFormation StackSets to pre-provision IAM roles across AWS accounts.
  2. Phase 2: Terraform/Terragrunt assumes these pre-created roles to deploy infrastructure on managed accounts.

GitOps Folder-Based Environment Approach

We also want to leverage a GitOps-style approach, structuring environments using a folder/directory per environment model. This will be combined with Terragrunt functions for seamless mapping of common variables (e.g., dynamically mapping account IDs based on environment folder names).

Closer look

Now that we have established the core concepts, let’s explore the structure of our Infrastructure-as-Code (IaC) project in detail.

The example codebase referenced throughout is stored on GitHub.

.
├── common.hcl
├── infrastructure-live
│   ├── development
│   ├── production
│   ├── shared-services
│   │   └── gha-oidc
│   ├── staging
│   └── terragrunt.hcl
├── infrastructure-org
│   ├── root
│   │   ├── cfstacksets
│   │   └── organization
│   └── terragrunt.hcl
└── modules
    ├── cfstacksets
    ├── gha-oidc
    └── organization

For simplicity, we have only the skeleton defined for infrastructure-live, but there is no actual provisioning per environment except for the gha-oidc module in shared-services.

Key Aspects of Our IaC Structure

  1. infrastructure-org (Organization-Level Management) - This path represents the root (management) account and delegated administrator accounts. It includes CloudFormation StackSets (cfstacksets) and the AWS Organizations setup: OUs, SCPs, etc. (organization). Since it manages organization-wide resources, access to this path must be strictly limited.

  2. infrastructure-live (Environment-Specific Infrastructure) - This path contains per-environment configurations (development, staging, production, etc.). It includes shared-services, which hosts the gha-oidc module for GitHub Actions OIDC setup.

  3. common.hcl (Shared Variables and Configuration) - This file contains shared variables used by both infrastructure-org and infrastructure-live. Common configurations such as account IDs, the main region, and global settings (inputs and locals) are defined here.

Terragrunt Pathing Pattern

As highlighted earlier, our Terragrunt pathing follows a structured pattern:

infrastructure-path/account|env/modules

The parent terragrunt.hcl lives under the infrastructure path, while child terragrunt.hcl files live at the module level and include the parent.
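
The child files are not shown in the article, so here is a minimal sketch of what one might look like. The file location is an assumption based on the directory tree above; the only thing the child needs to do is include its parent:

```hcl
# Hypothetical child file: infrastructure-live/shared-services/gha-oidc/terragrunt.hcl

# Pull in the nearest parent terragrunt.hcl (here: infrastructure-live/terragrunt.hcl).
include "root" {
  path = find_in_parent_folders()
}

# Module-specific inputs, if any, would go here and are merged with the parent's inputs.
inputs = {}
```

Because get_terragrunt_dir() resolves to the child's directory even when called from the included parent, a parent-level source such as "${get_repo_root()}/modules/${basename(get_terragrunt_dir())}" would point this child at modules/gha-oidc without the child declaring a source of its own.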

infrastructure-org

Here is a sample parent file for the org path, intended for organization-level management.

skip = true

terraform {
  source = "${get_repo_root()}/modules/${basename(get_terragrunt_dir())}"
}

locals {
  common_vars   = read_terragrunt_config(find_in_parent_folders("common.hcl"))
  global_prefix = local.common_vars.locals.global_prefix
  env           = "root"
  profile       = get_env("AWS_PROFILE_ROOT", "${local.global_prefix}-root-sso")
  region        = local.common_vars.inputs.region
}

inputs = merge(
  local.common_vars.inputs,
  {
    env        = local.env
    region     = local.region
    account_id = local.common_vars.inputs.org_account_ids[local.env]
  }
)

remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "${local.global_prefix}-terraform-state-root"
    key            = "${local.global_prefix}/${get_path_from_repo_root()}/terraform.tfstate"
    region         = local.region
    encrypt        = true
    dynamodb_table = "root-tfstate-lock-table"
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<-EOF
    provider "aws" {
      region = "${local.region}"
      allowed_account_ids = ["${local.common_vars.inputs.org_account_ids[local.env]}"]

      default_tags {
        tags = {
          Environment  = "${local.env}"
          ManagedBy    = "terraform"
        }
      }
    }
EOF
}

generate "versions" {
  path      = "versions.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
    terraform {
      required_providers {
        aws = {
          source  = "hashicorp/aws"
          version = "~> 5.74"
        }
      }
    }
EOF
}

Provisioning the Root Account Modules

To apply the root (management) account modules, cfstacksets and organization, we assume a management account administrator role and run the following command:

export AWS_PROFILE=your_management_admin_profile
terragrunt run-all apply --terragrunt-working-dir infrastructure-org

cfstacksets Module

The cfstacksets module provisions key IAM roles across different AWS Organizational Units (OUs). Let's examine the main components:

stacks-sdlc.tf (IAM Role Provisioning for SDLC Accounts)

resource "aws_cloudformation_stack_set" "terraform_role_sdlc" {
  permission_model = "SERVICE_MANAGED"
  name             = "${var.tf_role_name}-sdlc"

  auto_deployment {
    enabled = true
  }

  capabilities = ["CAPABILITY_NAMED_IAM"]

  template_body = jsonencode({
    AWSTemplateFormatVersion = "2010-09-09",
    Description              = "AWS CloudFormation Template to create an IAM Role named '${var.tf_role_name}' and attach the 'AdministratorAccess' AWS managed policy.",
    Resources = {
      OrgRole = {
        Type = "AWS::IAM::Role",
        Properties = {
          RoleName = "${var.tf_role_name}",
          AssumeRolePolicyDocument = {
            Version = "2012-10-17",
            Statement = [
              {
                Effect = "Allow",
                Principal = {
                  AWS = ["arn:aws:iam::${var.shared_services_id}:root"]
                },
                Action = ["sts:AssumeRole", "sts:TagSession"],
                Condition = {
                  StringLike = {
                    "aws:PrincipalArn" = [
                      "arn:aws:iam::${var.shared_services_id}:role/${var.tf_role_name}",
                      "arn:aws:iam::${var.shared_services_id}:role/${var.gha_role_name}"
                    ]
                  }
                }
              },
              {
                Effect = "Allow",
                Principal = {
                  AWS = ["arn:aws:iam::${var.root_account_id}:root"]
                },
                Action = ["sts:AssumeRole"],
              }
            ]
          },
          ManagedPolicyArns = [
            "arn:aws:iam::aws:policy/AdministratorAccess"
          ]
        }
      }
    }
  })

  lifecycle {
    ignore_changes = [administration_role_arn]
  }
}

resource "aws_cloudformation_stack_set_instance" "terraform_role_sdlc" {
  stack_set_name = aws_cloudformation_stack_set.terraform_role_sdlc.name
  deployment_targets {
    organizational_unit_ids = [var.org_ou_ids["sdlc"], var.org_ou_ids["production"], var.org_ou_ids["sandbox"]]
  }
}
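
The org_ou_ids variable used above is not part of common.hcl, so it has to reach the module some other way. One option, sketched here under the assumption that the organization module exposes an output named org_ou_ids (a hypothetical name, as are the mock values), is a Terragrunt dependency block in the cfstacksets child file:

```hcl
# Hypothetical child file: infrastructure-org/root/cfstacksets/terragrunt.hcl

include "root" {
  path = find_in_parent_folders()
}

# Read outputs from the sibling organization module, which creates the OUs.
dependency "organization" {
  config_path = "../organization"

  # Placeholder values so that plan/validate work before the OUs exist.
  mock_outputs = {
    org_ou_ids = {
      sdlc       = "ou-mock-sdlc"
      production = "ou-mock-production"
      sandbox    = "ou-mock-sandbox"
      core       = "ou-mock-core"
    }
  }
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}

inputs = {
  # Assumed output name; adjust to whatever the organization module actually exports.
  org_ou_ids = dependency.organization.outputs.org_ou_ids
}
```

A dependency block also tells run-all to apply organization before cfstacksets, which matches the order the StackSet instances need.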

stacks-shared-services.tf (IAM Role Provisioning for Shared Services)

This stack provisions an IAM role for shared services that can assume Terraform roles across multiple OUs.

resource "aws_cloudformation_stack_set" "terraform_role_shared" {
  permission_model = "SERVICE_MANAGED"
  name             = "${var.tf_role_name}-shared"

  auto_deployment {
    enabled = true
  }

  capabilities = ["CAPABILITY_NAMED_IAM"]

  template_body = jsonencode({
    AWSTemplateFormatVersion = "2010-09-09",
    Description              = <<EOT
AWS CloudFormation StackSet template to create an IAM Role named '${var.tf_role_name}' on Shared-Services
account and attach the 'AdministratorAccess' AWS managed policy. The role can be assumed by an external account with
a matching condition. Exclusively this role itself is able to assume '${var.tf_role_name}'s across the SDLC and
Production OUs. Note: Root Administrators are also able to assume target '${var.tf_role_name}'s across the SDLC
and Production OUs.
EOT
    Resources = {
      OrgRole = {
        Type = "AWS::IAM::Role",
        Properties = {
          RoleName = var.tf_role_name,
          AssumeRolePolicyDocument = {
            Version = "2012-10-17",
            Statement = [
              {
                Effect = "Allow",
                Principal = {
                  AWS = [
                    "arn:aws:iam::${var.shared_services_id}:root",
                    "arn:aws:iam::${var.root_account_id}:root"
                  ]
                },
                Action = ["sts:AssumeRole"]
              }
            ]
          },
          ManagedPolicyArns = [
            "arn:aws:iam::aws:policy/AdministratorAccess"
          ]
        }
      }
    }
  })

  lifecycle {
    ignore_changes = [administration_role_arn]
  }
}

resource "aws_cloudformation_stack_set_instance" "terraform_role_shared" {
  stack_set_name = aws_cloudformation_stack_set.terraform_role_shared.name
  deployment_targets {
    organizational_unit_ids = [var.org_ou_ids["core"]]
    account_filter_type     = "INTERSECTION"
    accounts                = [var.shared_services_id]
  }
}

This ensures that the Shared-Services execution role (the terraform_role_shared StackSet) is provisioned only in the Shared-Services account; from there it can securely assume the Terraform roles in the other OUs.
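
For completeness, the variables referenced by both StackSet files could be declared roughly as follows. The variable names come from the code above; the types, descriptions, and the validation rule are assumptions rather than the author's actual variables.tf:

```hcl
# Hypothetical modules/cfstacksets/variables.tf

variable "tf_role_name" {
  description = "Name of the Terraform execution role created in every target account."
  type        = string
}

variable "gha_role_name" {
  description = "Name of the GitHub Actions OIDC role in the Shared-Services account."
  type        = string
}

variable "shared_services_id" {
  description = "Account ID of the Shared-Services account."
  type        = string

  # AWS account IDs are 12 digits; catch typos before CloudFormation does.
  validation {
    condition     = can(regex("^[0-9]{12}$", var.shared_services_id))
    error_message = "shared_services_id must be a 12-digit AWS account ID."
  }
}

variable "root_account_id" {
  description = "Account ID of the management (root) account."
  type        = string
}

variable "org_ou_ids" {
  description = "Map of OU key (sdlc, production, sandbox, core) to OU ID."
  type        = map(string)
}
```

Terragrunt passes inputs to Terraform as TF_VAR_ environment variables, so the remaining entries in the shared inputs map that this module does not declare are simply ignored.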

infrastructure-live

Now that the organization-wide IAM roles and policies are in place, we move on to the environment-specific infrastructure provisioning under infrastructure-live. This is where Terraform/Terragrunt dynamically maps account-specific configurations, enabling seamless role assumption and execution.

Common Configuration (common.hcl in the root of the project)

The common.hcl file defines global configurations, including account mappings, environment inference, and shared settings.

locals {
  env_regex = "infrastructure-live/([a-zA-Z0-9-]+)/"
  env       = try(regex(local.env_regex, get_original_terragrunt_dir())[0], "shared-services")

  sdlc_account_ids = {
    development = "XXXXXXXXXXXXX"
    staging     = "XXXXXXXXXXXXX"
    production  = "XXXXXXXXXXXXX"
  }

  core_account_ids = {
    shared-services = "XXXXXXXXXXXXX"
    backups         = "XXXXXXXXXXXXX"
  }

  management_account_id = {
    root = "XXXXXXXXXXXXX"
  }

  sandbox_account_id = {
    sandbox = "XXXXXXXXXXXXX"
  }

  global_prefix = "XXXXXXXXXXXXX"
}

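inputs = {
  region             = "XXXXXXXXXXXXX" # main region; both parent terragrunt.hcl files read local.common_vars.inputs.region
  global_prefix      = local.global_prefix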
  sdlc_account_ids   = local.sdlc_account_ids
  core_account_ids   = local.core_account_ids
  org_account_ids    = merge(local.sdlc_account_ids, local.core_account_ids, local.management_account_id, local.sandbox_account_id)
  shared_services_id = local.core_account_ids["shared-services"]
  backups_id         = local.core_account_ids["backups"]
  root_account_id    = local.management_account_id["root"]
  org_units          = ["SDLC", "Production", "Core", "Sandbox"]
  tf_repo            = "XXXXXXXXXXXXX/terragrunt-infrastructure"
  tf_role_name       = "terraform-execution-role"
  gha_role_name      = "gha-role"
  gha_oidc_enabled   = true
  repo_root_path     = get_repo_root()
}

Terragrunt Configuration for infrastructure-live (terragrunt.hcl)

Each environment directory (e.g., development/, staging/, production/, etc.) will reference a common terragrunt.hcl file, ensuring consistent execution policies and automatic role assumption.

skip                          = true
terragrunt_version_constraint = ">= 0.66"
terraform_version_constraint  = ">= 1.9.0"
retryable_errors              = ["(?s).*failed calling webhook.*"]
retry_max_attempts            = 2
retry_sleep_interval_sec      = 30

dependencies {
  paths = ["${get_repo_root()}/infrastructure-org/root/cfstacksets"]
}

terraform {
  source = "${get_repo_root()}/modules/${basename(get_terragrunt_dir())}"
}

locals {
  common_vars   = read_terragrunt_config(find_in_parent_folders("common.hcl"))
  region        = local.common_vars.inputs.region
  env_regex     = local.common_vars.locals.env_regex
  env           = local.common_vars.locals.env
  global_prefix = local.common_vars.locals.global_prefix
}

inputs = merge(
  local.common_vars.inputs,
  {
    env        = local.env
    region     = local.region
    account_id = local.common_vars.inputs.org_account_ids[local.env]
  }
)

remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }

  config = {
    bucket         = "${local.global_prefix}-terraform-state-shared-services"
    key            = "${local.global_prefix}/${get_path_from_repo_root()}/terraform.tfstate"
    region         = local.region
    encrypt        = true
    dynamodb_table = "shared-services-tfstate-lock-table"

    assume_role = {
      role_arn = "arn:aws:iam::${local.common_vars.inputs.org_account_ids["shared-services"]}:role/${local.common_vars.inputs.tf_role_name}"
    }
  }
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<-EOF
    provider "aws" {
      region              = "${local.region}"
      allowed_account_ids = ["${local.common_vars.inputs.org_account_ids[local.env]}"]

      assume_role {
        role_arn = "arn:aws:iam::${local.common_vars.inputs.org_account_ids[local.env]}:role/${local.common_vars.inputs.tf_role_name}"
      }

      default_tags {
        tags = {
          Environment   = "${local.env}"
          ManagedBy     = "terraform"
          DeployedBy    = "terragrunt"
        }
      }
    }
EOF
}

generate "versions" {
  path      = "versions.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
    terraform {
      required_providers {
        aws = {
          source  = "hashicorp/aws"
          version = "~> 5.74"
        }
      }
    }
EOF
}

Dynamic Mapping of Environments and Role Assumption

One of the key benefits of Terragrunt's DRY (Don't Repeat Yourself) approach is that we dynamically infer account configurations based on folder structure.

  • Each environment folder (infrastructure-live/development, staging, production, etc.) automatically determines its AWS account ID and region.
  • Local profile or CI/CD execution will dynamically assume the appropriate Terraform execution role for the target environment.

How It Works

  • env_regex captures the environment name from the path.
  • The inputs block maps the environment name to its corresponding AWS account ID.
  • remote_state ensures each environment uses its own Terraform state stored in the Shared-Services account.
  • The provider configuration automatically assumes the appropriate IAM role for infrastructure provisioning.
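
To make the first two bullets concrete, here is a small sketch that applies the same expressions to hard-coded sample paths instead of get_original_terragrunt_dir(). The paths and the account ID are placeholders, not values from the real repository:

```hcl
locals {
  env_regex = "infrastructure-live/([a-zA-Z0-9-]+)/"

  # Placeholder paths standing in for get_original_terragrunt_dir().
  live_dir = "/repo/infrastructure-live/staging/vpc"
  org_dir  = "/repo/infrastructure-org/root/cfstacksets"

  # One capture group, so regex() returns a single-element list.
  live_env = try(regex(local.env_regex, local.live_dir)[0], "shared-services") # "staging"

  # No match: regex() raises an error and try() falls back to the default.
  org_env = try(regex(local.env_regex, local.org_dir)[0], "shared-services") # "shared-services"

  # The environment name is then used as the key into the account map.
  org_account_ids = { staging = "111111111111" } # placeholder ID
  account_id      = local.org_account_ids[local.live_env]
}
```

The fallback explains why the infrastructure-org parent sets env = "root" explicitly rather than relying on the value inferred in common.hcl.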

Executing Terraform/Terragrunt in infrastructure-live

With this setup, running Terraform/Terragrunt becomes straightforward. Whether executed locally or via GitHub Actions OIDC, the appropriate role is automatically assumed.

export AWS_PROFILE=your_profile # This can be Shared-Services Terraform Exec Role or Management Admin
terragrunt run-all apply --terragrunt-working-dir infrastructure-live

What Happens?

  • Terragrunt reads the environment directory structure (infrastructure-live/development, staging, production).
  • It dynamically assumes the correct IAM role for Terraform execution.
  • State files are stored centrally in the Shared-Services account.
  • The appropriate infrastructure is provisioned, following AWS best practices for multi-account security.
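
As an illustration of the second bullet, this is roughly what the generated provider.tf might contain for a module under infrastructure-live/development. The account ID and region are placeholders, since the article masks the real values:

```hcl
# provider.tf as rendered by the "provider" generate block (placeholder values)
provider "aws" {
  region              = "eu-west-1"
  allowed_account_ids = ["111111111111"]

  # The caller's credentials (SSO profile or GitHub Actions OIDC role) are used
  # only to assume the execution role in the target account.
  assume_role {
    role_arn = "arn:aws:iam::111111111111:role/terraform-execution-role"
  }

  default_tags {
    tags = {
      Environment = "development"
      ManagedBy   = "terraform"
      DeployedBy  = "terragrunt"
    }
  }
}
```

The allowed_account_ids guard means a mis-mapped folder fails at plan time instead of provisioning into the wrong account.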

Farewell 😊


We've navigated the intricacies of multi-account AWS infrastructure automation using Terraform, Terragrunt, and AWS CloudFormation StackSets. By layering pre-provisioned IAM roles and dynamic environment mapping, we've built up a secure, scalable, and streamlined approach to managing AWS at scale.

I hope this guide provides practical insights and a solid foundation. Keep refining, keep automating, and embrace the power of infrastructure as code! 🚀

Top comments (1)

Darren Horwitz

Such a good read ! I thought about using cfn for the role creation the other day , to be used for deploying some baseline infrastructure and will defs use this as a point of reference 😎
