Lucas de Camargo

AWS Governance with a Terragrunt Live Environment

Many tech enthusiasts want to start on, or move their stack to, AWS and benefit from its 200+ global services. It may feel intuitive to create an account on AWS and start creating resources in the AWS Console, but this is an absolute don't when working in the Cloud.

I have introduced some basic concepts of Governance on AWS in First steps to Governance on AWS using Terraform. To recap briefly: proper AWS governance is built on Three Governance Pillars: Organizations, Accounts, and Service Control Policies (SCPs). You may understand these concepts, but if you're still manually creating resources, you're missing the foundation that makes governance actually work: Infrastructure as Code.

Without IaC, you'll spend weeks trying to reverse-engineer your own infrastructure just to understand what you've built. Worse, when disaster strikes, like a misconfiguration, a security breach, or even accidental resource deletion, recovery becomes a manual, error-prone nightmare instead of a simple deployment operation.

If you're already familiar with Terraform, you know the pain: managing multiple AWS accounts leads to massive code duplication, enormous state files, and configurations you're afraid to touch because you've lost track of dependencies. This is where Terragrunt becomes essential. It eliminates code duplication through a hierarchical configuration system while maintaining clear separation of concerns across your multi-account environment.

In this guide, we'll build a Terragrunt live environment: a Git repository that versions your AWS infrastructure, organized by accounts and regions, with configurations that cascade intelligently through the hierarchy. We'll establish the foundation that makes deploying your governance resources (in the next post) both simple and maintainable.

Getting Started

The content of the Terragrunt Live Environment for this series is maintained in my GitHub repository. In this post, I explain the repository structure and deploy the first organization resource.

lucasdecamargo / terragrunt-live-example

Template for a Terragrunt stack using my AWS Terraform modules including Governance management.

Installation

We'll be using three essential tools to manage our AWS infrastructure:

  • AWS CLI v2 authenticates and interacts with AWS services from your terminal. It's required for Terraform to communicate with AWS APIs.
  • Terraform is the Infrastructure as Code (IaC) engine that provisions and manages AWS resources using declarative configuration files.
  • Terragrunt is a thin wrapper around Terraform that helps eliminate code duplication and manage multiple environments.

Tip: If you're using VS Code, install an extension that provides HCL syntax support for a better editing experience.

Named Profiles

Go and quickly read my post about AWS Named Profiles, if you haven't yet.

At this point, we assume our AWS account doesn't yet have SSO enabled. Therefore, we need an access key with the permissions required to deploy the resources in this post, namely:

  1. Backend Policy for creating and updating the remote state files in our AWS root account.
  2. Organization Policy for deploying the organization resources we're creating in this post.

Make sure to check out the permissions in my GitHub repository for the Terragrunt Live Environment example.

  • In your AWS Console, go to IAM -> Policies, and create a new policy for each of the JSON documents I have under the policies folder in my repository.
  • Next, go to IAM -> Users, create a new terragrunt-root user, and attach these policies to it.
  • Generate a new pair of access keys and add them to your AWS credentials file, as we discussed, under a profile name such as acme-root.
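Assuming the profile is named acme-root, the credentials entry would look like the sketch below (the key values are placeholders — substitute the pair you just generated):

```ini
# ~/.aws/credentials
[acme-root]
aws_access_key_id     = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = your-secret-access-key-here
```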

Live Environment

Our AWS infrastructure is going to live in a structured Git repository. Terragrunt allows us to define a folder structure based on our AWS accounts and to share variables and configurations across the stacks deployed to each account.

For live repositories, Gruntwork's recommendation typically follows this pattern:

account
 └ _global
 └ region
    └ _global
    └ environment
       └ category
          └ resource

As an example:

backend
 └ _global
    └ networking
       └ route53-public-api-domain
 └ us-east-1
    └ _global
       └ storage
          └ s3-assets
    └ dev
       └ compute
          └ ecs-public-api-cluster

The _global folders contain resources that aren't tied to a specific region (like Route53 hosted zones, IAM roles, or S3 buckets with global names) or environment-agnostic resources.

However, when onboarding to the Cloud for the first time, it can be challenging to predict exactly what environments and categories you'll need from the start. We can simplify this structure significantly by encoding the environment into the account name itself:

account
 └ _global
 └ region
    └ resource

backend-dev
 └ _global
    └ route53-public-api-domain
 └ us-east-1
    └ s3-assets
    └ ecs-public-api-cluster

This approach aligns perfectly with our multi-account governance strategy from the previous post. We create a dedicated backend-dev account within our Development OU for all development work. When you're ready for production, you simply create a backend-prod account under your Production OU — each account maintains its own isolated folder structure.

This simplified pattern is a better deal for startups: it's easier to get started with, since less folder nesting means less cognitive overhead while you're learning, and it's simple to evolve into a more granular structure later as your needs grow.

Writing the Terragrunt Stack

Now that we understand the basics of a Terragrunt live environment, let's create our first stack. We'll build a Git repository to version our infrastructure state and walk through each component step by step.

We start by creating a Git repository locally that will version the state of our infrastructure. Consider these naming options:

  • terragrunt-live-management or terragrunt-live-governance - if you want separate repositories for different groups of stacks
  • terragrunt-live - if you're targeting a monorepo for all of your IaC stacks

I'm using terragrunt-live-example for this post, and our fictional organization will be called acme — we'll use this name when creating AWS resources.

Directory Structure

As we discussed earlier, the repository structure begins with a directory for an AWS account, followed by a directory for the region, and finally a directory for each resource we're deploying.

The key to understanding Terragrunt is that each layer uses an HCL file to declare local variables that cascade down through the hierarchy:

terragrunt-live-example
├── root                        # Root Account
│   ├── _global                 # Global Region
│   │   ├── organization        # Organization Resource
│   │   │   └── terragrunt.hcl
│   │   └── region.hcl
│   └── account.hcl
└── root.hcl

When deploying the organization resource in terragrunt.hcl, Terragrunt loads all definitions in this order: root.hcl → account.hcl → region.hcl → terragrunt.hcl. This layered approach keeps our configuration DRY (Don't Repeat Yourself) and makes it easy to share common settings across resources.

Locals

Terragrunt locals are similar to Terraform locals. They provide a mechanism to define named expressions within a terragrunt.hcl configuration file. These expressions are evaluated and their results can be referenced throughout the same configuration file, promoting reusability and reducing repetition.

In this example, we define the local variables for each directory layer.

# root/account.hcl
locals {
  account_name      = "root"
  aws_named_profile = "acme-root"

  tags = {
    "Account" = "root"
  }
}
# root/_global/region.hcl
locals {
  aws_region = "us-east-1"

  tags = {
    "Region" = "global"
  }
}

These locals become available to the inner layers we'll be working in, so we don't have to repeat ourselves when declaring Terraform modules that depend on these layer values.

Note that even though the region is _global, we still need to specify a default region (us-east-1) because the AWS provider requires a region to be set for API calls.

The Root File

The root.hcl file is the foundation of our Terragrunt configuration. We use Terragrunt's built-in functions read_terragrunt_config and find_in_parent_folders to automatically load the local variables we defined at each layer:

# root.hcl
locals {
  # Automatically load account-level variables
  account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))
  # Automatically load region-level variables
  region_vars = read_terragrunt_config(find_in_parent_folders("region.hcl"))
}

It's important to understand that find_in_parent_folders executes from the perspective of the resource level. This means when Terragrunt processes the organization resource, it searches upward through parent directories to find these configuration files.

Now we can use the inherited parameters to build new local variables:

# root.hcl
locals {
  # Automatically load account-level variables
  account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))
  # Automatically load region-level variables
  region_vars = read_terragrunt_config(find_in_parent_folders("region.hcl"))

  # Extract locals from the loaded configurations
  account_name      = local.account_vars.locals.account_name
  aws_named_profile = local.account_vars.locals.aws_named_profile
  aws_region        = local.region_vars.locals.aws_region

  # Define common variables used across all resources
  org_name = "acme"
  tags = {
    "ManagedBy" = "Terragrunt"
  }

  # Merge all tags hierarchically
  default_tags = merge(
    local.tags,
    local.account_vars.locals.tags,
    local.region_vars.locals.tags,
  )
}
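With the layer locals defined earlier, the merged map evaluates to the following (on key collisions, later arguments to merge would override earlier ones):

```hcl
# Evaluated result of the merge() above
default_tags = {
  "ManagedBy" = "Terragrunt"
  "Account"   = "root"
  "Region"    = "global"
}
```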

We also declare an inputs block to pass variables to underlying Terraform modules, making them available for cross-stack references:

# root.hcl (continued)
inputs = merge(
  local.account_vars.locals,
  local.region_vars.locals,
)
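On the Terraform side, any unit under this root can pick up these values as ordinary input variables; a consuming module only needs matching declarations, for instance:

```hcl
# variables.tf in a consuming module (illustrative)
variable "account_name" {
  description = "Account folder name, injected by the root.hcl inputs block."
  type        = string
}

variable "aws_region" {
  description = "Deployment region, injected by the root.hcl inputs block."
  type        = string
}
```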

Generating Providers

With our variables configured, we can now declare the AWS Provider for the account we're deploying to. If you're not familiar with providers, they are plugins that serve as the interface between Terraform and external services or platforms. They enable Terraform to manage resources, interact with APIs, and perform operations on various cloud platforms.

We use Terragrunt's generate block to inject the provider configuration into our Terraform modules. This keeps our configuration DRY by defining this common setup once in root.hcl rather than repeating it in every resource:

# root.hcl (continued)
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  profile = "${local.aws_named_profile}"
  region  = "${local.aws_region}"

  default_tags {
    tags = ${jsonencode(local.default_tags)}
  }
}
EOF
}

When you run terragrunt init, this automatically generates a provider.tf file in the Terraform working directory with the appropriate configuration for your environment.
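For the root account locals defined earlier, the generated file would look roughly like this (jsonencode renders the merged tags as a JSON object, which is also valid HCL):

```hcl
# provider.tf (generated by Terragrunt; do not edit by hand)
provider "aws" {
  profile = "acme-root"
  region  = "us-east-1"

  default_tags {
    tags = {"Account":"root","ManagedBy":"Terragrunt","Region":"global"}
  }
}
```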

Remote State Files

In IaC, the state file is crucial — it stores information about your infrastructure's current status, acting as a bridge between your configuration files and the actual resources deployed in AWS. Terraform uses this state file as its single source of truth to understand what infrastructure it has created and is managing. Every change plan that Terraform calculates is based on this state file.

Here's the key insight: since we're working in AWS, we can use our AWS accounts to store their own state files. It's also a best practice to use a separate state file per region for each AWS account. This approach:

  • Minimizes the potential impact of state file corruption
  • Improves performance by keeping state files smaller
  • Reduces cross-regional dependencies

We use Terragrunt's remote_state block to configure S3 as our state backend, with DynamoDB for state locking to prevent multiple Terraform operations from modifying the state concurrently:

# root.hcl (continued)
remote_state {
  backend = "s3"
  config = {
    encrypt        = true
    bucket         = "${local.org_name}-tfstate-${local.account_name}-${local.aws_region}-TRGRUNT"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = local.aws_region
    dynamodb_table = "tfstate-lock"
    profile        = local.aws_named_profile
  }
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
}

This configuration will create an S3 bucket named acme-tfstate-root-us-east-1-TRGRUNT when deploying the organization resource. Remember that S3 bucket names must be globally unique — that's why the naming convention carries an extra token like TRGRUNT; if the name is still taken, add or change tokens until it is.
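Likewise, the generate attribute writes a backend.tf into each unit's working directory. For the organization unit, where path_relative_to_include() resolves to root/_global/organization, it would come out roughly as:

```hcl
# backend.tf (generated by Terragrunt)
terraform {
  backend "s3" {
    encrypt        = true
    bucket         = "acme-tfstate-root-us-east-1-TRGRUNT"
    key            = "root/_global/organization/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tfstate-lock"
    profile        = "acme-root"
  }
}
```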

Units (Resources)

A unit in Terragrunt is a directory containing a terragrunt.hcl file. This hermetic unit of infrastructure is the smallest deployable entity in Terragrunt. It’s also the most important feature Terragrunt has.

With our foundation complete, we can now declare AWS resources in our Terragrunt live stack. Let's use the organization module from the previous post in this series, which is maintained in my GitHub repository.

This module requires only the variable aws_allowed_regions which specifies what regions are allowed by our root organization:

# https://github.com/lucasdecamargo/terraform-aws-governance/tree/main/organization/variables.tf
variable "aws_allowed_regions" {
  description = "The AWS regions allowed by the organization."
  type        = list(string)
  default     = ["us-east-1", "us-east-2"]
}

To use this module, we create the simple Terragrunt configuration:

# root/_global/organization/terragrunt.hcl
terraform {
  source = "git::git@github.com:lucasdecamargo/terraform-aws-governance.git//organization"
}

include "root" {
  path = find_in_parent_folders("root.hcl")
}

inputs = {
  aws_allowed_regions = ["us-east-1", "us-east-2", "sa-east-1"]
}

Let's break down what each block does:

  • The terraform block specifies the source of the module we're deploying, pointing to the organization directory in my GitHub repository (by default, it uses the main branch)
  • The include block links this resource to our live repository configuration, starting with root.hcl
  • The inputs block provides the required variables to the Terraform module
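If you'd rather not track main, Terraform's Git source syntax accepts a ref query parameter to pin a branch, tag, or commit (v1.0.0 below is a hypothetical tag):

```hcl
terraform {
  source = "git::git@github.com:lucasdecamargo/terraform-aws-governance.git//organization?ref=v1.0.0"
}
```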

Deploying with Terragrunt

At this point, our Terragrunt live environment is configured and ready to deploy. Let's understand how Terragrunt's deployment workflow works and the different ways you can deploy your infrastructure.

Unlike plain Terraform, Terragrunt automatically calls terraform init before other commands when it detects that initialization is needed. After creating the remote state backend, the typical workflow is simply:

  1. Plan: Review what changes will be made
  2. Apply: Execute the changes

Once the backend is bootstrapped for an account, Terragrunt handles initialization automatically for subsequent commands; no manual init step is required unless you need to update providers or modules.

Deploying the Remote State Backend

Before deploying any infrastructure in an account, you need to create the remote state backend. Our root.hcl configuration specifies an S3 bucket and DynamoDB table for state management at the account level, but these don't exist yet.

You must explicitly bootstrap these backend resources using:

cd root/_global/organization
terragrunt init --backend-bootstrap

This creates the backend infrastructure for the entire root account:

  • An S3 bucket named acme-tfstate-root-us-east-1-TRGRUNT (with encryption enabled)
  • A DynamoDB table named tfstate-lock for state locking

Note: Automatic backend bootstrapping is deprecated functionality and will not be the default behavior in future Terragrunt versions. Always use the --backend-bootstrap flag explicitly.

Once the backend is created for an account, you don't need to bootstrap it again—all units within that account will use the same S3 bucket and DynamoDB table for their state files.

Deployment Scopes

You can execute run --all commands at any depth of the stack to run every unit in that stack and all of its children. Here are the different scopes:

Deploying a Single Unit

The most common approach — deploy one specific resource:

cd root/_global/organization
terragrunt plan
terragrunt apply

Deploying an Entire Region

Deploy all units within a region:

cd root/_global
terragrunt run --all plan
terragrunt run --all apply

Deploying an Entire Account

Deploy all resources across all regions in an account:

cd root
terragrunt run --all plan
terragrunt run --all apply

When using run --all, Terragrunt analyzes the dependencies in your stack and determines an order for runs so that outputs are ready to be used as inputs in dependent units.
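That ordering comes from dependency blocks declared in each unit. As a sketch, a hypothetical unit consuming a VPC unit's output could declare (the path and output name are illustrative, not part of this repository):

```hcl
# root/us-east-1/ecs-public-api-cluster/terragrunt.hcl (hypothetical)
dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}
```

With this in place, run --all guarantees the vpc unit is applied before the cluster unit that reads its output.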

Important: When using run --all with apply or destroy, Terragrunt automatically adds the -auto-approve flag, meaning you won't be prompted to confirm each unit. Always run plan first to review changes.

What's Next?

We've established our Terragrunt live environment with the complete repository structure, configuration hierarchy, and deployment workflow. Our infrastructure is now version-controlled, DRY, and ready to scale across multiple accounts and regions.

Follow me to get updates about my series IaC Startup on AWS. I'm writing new posts about configuring AWS named profiles, and deploying governance infrastructure with Terragrunt and Terraform. By the end, you'll have a production-ready, multi-account AWS environment managed entirely through Infrastructure as Code.

Also, make sure to check my GitHub profile for new projects and updates.

Buy me a coffee if you like this post! :)

