Many tech enthusiasts want to start their stack on AWS, or move it there, and benefit from its 200+ global services. It may feel intuitive to create an AWS account and start creating resources in the AWS Console, but this is an absolute don't when working in the Cloud.
I have introduced some basic concepts of Governance on AWS in First steps to Governance on AWS using Terraform. To recap briefly: proper AWS governance is built on Three Governance Pillars: Organizations, Accounts, and Service Control Policies (SCPs). You may understand these concepts, but if you're still manually creating resources, you're missing the foundation that makes governance actually work: Infrastructure as Code.
Without IaC, you'll spend weeks trying to reverse-engineer your own infrastructure just to understand what you've built. Worse, when disaster strikes, like a misconfiguration, a security breach, or even accidental resource deletion, recovery becomes a manual, error-prone nightmare instead of a simple deployment operation.
If you're already familiar with Terraform, you know the pain: managing multiple AWS accounts leads to massive code duplication, enormous state files, and configurations you're afraid to touch because you've lost track of dependencies. This is where Terragrunt becomes essential. It eliminates code duplication through a hierarchical configuration system while maintaining clear separation of concerns across your multi-account environment.
In this guide, we'll build a Terragrunt live environment: a Git repository that versions your AWS infrastructure, organized by accounts and regions, with configurations that cascade intelligently through the hierarchy. We'll establish the foundation that makes deploying your governance resources (in the next post) both simple and maintainable.
Getting Started
The content of the Terragrunt Live Environment for this series is maintained in my GitHub repository. In this post, I'll explain the repository structure and deploy the first organization resource.
lucasdecamargo/terragrunt-live-example: Template for a Terragrunt stack using my AWS Terraform modules including Governance management.
Installation
We'll be using three essential tools to manage our AWS infrastructure:
- AWS CLI v2 authenticates and interacts with AWS services from your terminal. We'll use it to configure the credentials Terraform needs to communicate with AWS APIs.
- Terraform is the Infrastructure as Code (IaC) engine that provisions and manages AWS resources using declarative configuration files.
- Terragrunt is a thin wrapper around Terraform that helps eliminate code duplication and manage multiple environments.
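On macOS, one way to get all three tools is with Homebrew (a sketch; check each tool's documentation for Linux and Windows installers):
# Install the AWS CLI and Terragrunt
brew install awscli terragrunt
# Terraform is distributed through HashiCorp's official tap
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
# Verify the installations
aws --version
terraform -version
terragrunt --version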
Tip: If you're using VS Code, install extensions such as HashiCorp's official Terraform extension for better HCL syntax support.
Named Profiles
Go and quickly read my post about AWS Named Profiles, if you haven't yet.
At this point, we assume our AWS account doesn't yet have the SSO service enabled. Therefore, we need an access key with the permissions required to deploy the resources we're creating in this post. These are the following:
- Backend Policy for creating and updating the remote state files in our AWS root account.
- Organization Policy for deploying the organization resources we're creating in this post.
Make sure to check out the permissions in my GitHub repository for the Terragrunt Live Environment example.
- In your AWS Console, go to IAM -> Policies, and create a new policy for each of the JSON documents I have under the `policies` folder in my repository.
- Next, go to IAM -> Users, create a new `terragrunt-root` user, and attach these policies to it.
- Generate a new pair of access keys and add it to your AWS credentials file, as we discussed, under a name like `acme-root` (see the example below).
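The quickest way to register the access keys locally is with the AWS CLI, which writes them to your credentials file for you:
aws configure --profile acme-root
# Prompts for the access key ID, secret access key,
# default region (e.g. us-east-1), and output format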
Live Environment
Our AWS infrastructure is going to live in a structured Git repository. Terragrunt allows us to define a folder structure based on our AWS accounts and to share variables and configurations across the stacks deployed to each account.
Gruntwork's recommended pattern for live repositories typically looks like this:
account
└ _global
└ region
└ _global
└ environment
└ category
└ resource
As an example:
backend
└ _global
└ networking
└ route53-public-api-domain
└ us-east-1
└ _global
└ storage
└ s3-assets
└ dev
└ compute
└ ecs-public-api-cluster
The _global folders contain resources that aren't tied to a specific region (like Route53 hosted zones, IAM roles, or S3 buckets with global names) or to a specific environment.
However, when onboarding to the Cloud for the first time, it can be challenging to predict exactly what environments and categories you'll need from the start. We can simplify this structure significantly by encoding the environment into the account name itself:
account
└ _global
└ region
└ resource
backend-dev
└ _global
└ route53-public-api-domain
└ us-east-1
└ s3-assets
└ ecs-public-api-cluster
This approach aligns perfectly with our multi-account governance strategy from the previous post. We create a dedicated backend-dev account within our Development OU for all development work. When you're ready for production, you simply create a backend-prod account under your Production OU — each account maintains its own isolated folder structure.
This simplified pattern is a better deal for startups: it's easier to get started with, since less folder nesting means less cognitive overhead while you're learning, and it's simple to evolve into a more granular structure later as your needs grow.
Writing the Terragrunt Stack
Now that we understand the basics of a Terragrunt live environment, let's create our first stack. We'll build a Git repository to version our infrastructure state and walk through each component step by step.
We start by creating a Git repository locally that will version the state of our infrastructure. Consider these naming options:
- `terragrunt-live-management` or `terragrunt-live-governance` if you want separate repositories for different groups of stacks
- `terragrunt-live` if you're targeting a monorepo for all of your IaC stacks
I'm using terragrunt-live-example for this post, and our fictional organization will be called acme — we'll use this name when creating AWS resources.
Directory Structure
As we discussed earlier, the repository structure begins with a directory for an AWS account, followed by a directory for the region, and finally a directory for each resource we're deploying.
The key to understanding Terragrunt is that each layer uses an HCL file to declare local variables that cascade down through the hierarchy:
terragrunt-live-example
├── root # Root Account
│ ├── _global # Global Region
│ │ ├── organization # Organization Resource
│ │ │ └── terragrunt.hcl
│ │ └── region.hcl
│ └── account.hcl
└── root.hcl
When deploying the organization resource in terragrunt.hcl, Terragrunt loads all definitions in this order: root.hcl → account.hcl → region.hcl → terragrunt.hcl. This layered approach keeps our configuration DRY (Don't Repeat Yourself) and makes it easy to share common settings across resources.
Locals
Terragrunt locals are similar to Terraform locals. They provide a mechanism to define named expressions within a terragrunt.hcl configuration file. These expressions are evaluated and their results can be referenced throughout the same configuration file, promoting reusability and reducing repetition.
In this example, we define the local variables for each directory layer.
# root/account.hcl
locals {
account_name = "root"
aws_named_profile = "acme-root"
tags = {
"Account" = "root"
}
}
# root/_global/region.hcl
locals {
aws_region = "us-east-1"
tags = {
"Region" = "global"
}
}
These locals are available to all the inner layers, so we don't have to repeat ourselves when declaring Terraform modules that depend on these layer values.
Note that even though the region is _global, we still need to specify a default region (us-east-1) because the AWS provider requires a region to be defined for API calls.
The Root File
The root.hcl file is the foundation of our Terragrunt configuration. We use Terragrunt's built-in functions read_terragrunt_config and find_in_parent_folders to automatically load the local variables we defined at each layer:
# root.hcl
locals {
# Automatically load account-level variables
account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))
# Automatically load region-level variables
region_vars = read_terragrunt_config(find_in_parent_folders("region.hcl"))
}
It's important to understand that find_in_parent_folders executes from the perspective of the resource level. This means when Terragrunt processes the organization resource, it searches upward through parent directories to find these configuration files.
Now we can use the inherited parameters to build new local variables:
# root.hcl
locals {
# Automatically load account-level variables
account_vars = read_terragrunt_config(find_in_parent_folders("account.hcl"))
# Automatically load region-level variables
region_vars = read_terragrunt_config(find_in_parent_folders("region.hcl"))
# Extract locals from the loaded configurations
account_name = local.account_vars.locals.account_name
aws_named_profile = local.account_vars.locals.aws_named_profile
aws_region = local.region_vars.locals.aws_region
# Define common variables used across all resources
org_name = "acme"
tags = {
"ManagedBy" = "Terragrunt"
}
# Merge all tags hierarchically
default_tags = merge(
local.tags,
local.account_vars.locals.tags,
local.region_vars.locals.tags,
)
}
We also declare an inputs block to pass variables to underlying Terraform modules, making them available for cross-stack references:
# root.hcl (continued)
inputs = merge(
local.account_vars.locals,
local.region_vars.locals,
)
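These inputs are matched by name against the variables a module declares. For example, a downstream module declaring a matching variable would receive the value automatically (an illustrative sketch, not a variable from my actual modules):
# variables.tf of a downstream module (illustrative)
variable "account_name" {
  description = "Name of the AWS account this stack is deployed to."
  type        = string
}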
Generating Providers
With our variables configured, we can now declare the AWS Provider for the account we're deploying to. If you're not familiar with providers, they are plugins that serve as the interface between Terraform and external services or platforms. They enable Terraform to manage resources, interact with APIs, and perform operations on various cloud platforms.
We use Terragrunt's generate block to inject the provider configuration into our Terraform modules. This keeps our configuration DRY by defining this common setup once in root.hcl rather than repeating it in every resource:
# root.hcl (continued)
generate "provider" {
path = "provider.tf"
if_exists = "overwrite_terragrunt"
contents = <<EOF
provider "aws" {
profile = "${local.aws_named_profile}"
region = "${local.aws_region}"
default_tags {
tags = ${jsonencode(local.default_tags)}
}
}
EOF
}
When you run terragrunt init, this automatically generates a provider.tf file in the Terraform working directory with the appropriate configuration for your environment.
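With the locals from our example, the generated file would look roughly like this:
# provider.tf (generated by Terragrunt)
provider "aws" {
  profile = "acme-root"
  region  = "us-east-1"
  default_tags {
    tags = {"Account":"root","ManagedBy":"Terragrunt","Region":"global"}
  }
}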
Remote State Files
In IaC, the state file is crucial — it stores information about your infrastructure's current status, acting as a bridge between your configuration files and the actual resources deployed in AWS. Terraform uses this state file as its single source of truth to understand what infrastructure it has created and is managing. Every change plan that Terraform calculates is based on this state file.
Here's the key insight: since we're working in AWS, we can use our AWS accounts to store their own state files. It's also a best practice to use a separate state file per region for each AWS account. This approach:
- Minimizes the potential impact of state file corruption
- Improves performance by keeping state files smaller
- Reduces cross-regional dependencies
We use Terragrunt's remote_state block to configure S3 as our state backend, with DynamoDB for state locking to prevent multiple Terraform operations from modifying the state concurrently:
# root.hcl (continued)
remote_state {
backend = "s3"
config = {
encrypt = true
bucket = "${local.org_name}-tfstate-${local.account_name}-${local.aws_region}-TRGRUNT"
key = "${path_relative_to_include()}/terraform.tfstate"
region = local.aws_region
dynamodb_table = "tfstate-lock"
profile = local.aws_named_profile
}
generate = {
path = "backend.tf"
if_exists = "overwrite_terragrunt"
}
}
This configuration will create an S3 bucket named acme-tfstate-root-us-east-1-TRGRUNT when deploying the organization resource. Remember that S3 bucket names must be globally unique; that's why the name includes an extra token like TRGRUNT. If the name is still taken, you'll need to adjust your naming convention further.
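For the organization unit, the backend configuration that Terragrunt generates would look roughly like this:
# backend.tf (generated by Terragrunt)
terraform {
  backend "s3" {
    encrypt        = true
    bucket         = "acme-tfstate-root-us-east-1-TRGRUNT"
    key            = "root/_global/organization/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "tfstate-lock"
    profile        = "acme-root"
  }
}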
Units (Resources)
A unit in Terragrunt is a directory containing a terragrunt.hcl file. This hermetic unit of infrastructure is the smallest deployable entity in Terragrunt, and it's the most important concept Terragrunt has.
With our foundation complete, we can now declare AWS resources in our Terragrunt live stack. Let's use the organization module from the previous post in this series, which is maintained in my GitHub repository.
This module requires only the variable aws_allowed_regions which specifies what regions are allowed by our root organization:
# https://github.com/lucasdecamargo/terraform-aws-governance/tree/main/organization/variables.tf
variable "aws_allowed_regions" {
description = "The AWS regions allowed by the organization."
type = list(string)
default = ["us-east-1", "us-east-2"]
}
To use this module, we create a simple Terragrunt configuration:
# root/_global/organization/terragrunt.hcl
terraform {
source = "git::git@github.com:lucasdecamargo/terraform-aws-governance.git//organization"
}
include "root" {
path = find_in_parent_folders("root.hcl")
}
inputs = {
aws_allowed_regions = ["us-east-1", "us-east-2", "sa-east-1"]
}
Let's break down what each block does:
- The `terraform` block specifies the source of the module we're deploying, pointing to the `organization` directory in my GitHub repository (by default, it uses the main branch; see below for pinning a version)
- The `include` block links this resource to our live repository configuration, starting with `root.hcl`
- The `inputs` block provides the required variables to the Terraform module
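If you'd rather pin the module to a fixed release instead of tracking the main branch, append a ref parameter to the source URL (the tag below is hypothetical):
# root/_global/organization/terragrunt.hcl (alternative source)
terraform {
  # v1.0.0 is a hypothetical tag; use an actual release of the module
  source = "git::git@github.com:lucasdecamargo/terraform-aws-governance.git//organization?ref=v1.0.0"
}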
Deploying with Terragrunt
At this point, our Terragrunt live environment is configured and ready to deploy. Let's understand how Terragrunt's deployment workflow works and the different ways you can deploy your infrastructure.
Unlike plain Terraform, Terragrunt automatically calls terraform init before other commands when it detects that initialization is needed. After creating the remote state backend, the typical workflow is simply:
- Plan: Review what changes will be made
- Apply: Execute the changes
Once the backend is bootstrapped for an account, Terragrunt handles initialization automatically for subsequent commands; no manual init step is required unless you need to update providers or modules.
Deploying the Remote State Backend
Before deploying any infrastructure in an account, you need to create the remote state backend. Our root.hcl configuration specifies an S3 bucket and DynamoDB table for state management at the account level, but these don't exist yet.
You must explicitly bootstrap these backend resources using:
cd root/_global/organization
terragrunt init --backend-bootstrap
This creates the backend infrastructure for the entire root account:
- An S3 bucket named `acme-tfstate-root-us-east-1-TRGRUNT` (with encryption enabled)
- A DynamoDB table named `tfstate-lock` for state locking
Note: Automatic backend bootstrapping is deprecated functionality and will not be the default behavior in future Terragrunt versions. Always use the --backend-bootstrap flag explicitly.
Once the backend is created for an account, you don't need to bootstrap it again—all units within that account will use the same S3 bucket and DynamoDB table for their state files.
Deployment Scopes
You can run `run --all` commands at any depth of the stack to run the units in that stack and all of its children. Here are the different scopes:
Deploying a Single Unit
The most common approach — deploy one specific resource:
cd root/_global/organization
terragrunt plan
terragrunt apply
Deploying an Entire Region
Deploy all units within a region:
cd root/_global
terragrunt run --all plan
terragrunt run --all apply
Deploying an Entire Account
Deploy all resources across all regions in an account:
cd root
terragrunt run --all plan
terragrunt run --all apply
When using run --all, Terragrunt analyzes the dependencies in your stack and determines an order for runs so that outputs are ready to be used as inputs in dependent units.
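Cross-unit references are declared with Terragrunt's dependency block. Here's a minimal sketch, assuming a hypothetical second unit that consumes an org_id output from the organization module:
# root/_global/some-unit/terragrunt.hcl (illustrative)
dependency "organization" {
  config_path = "../organization"
}

inputs = {
  org_id = dependency.organization.outputs.org_id  # hypothetical output
}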
Important: When using run --all with apply or destroy, Terragrunt automatically adds the -auto-approve flag, meaning you won't be prompted to confirm each unit. Always run plan first to review changes.
What's Next?
We've established our Terragrunt live environment with the complete repository structure, configuration hierarchy, and deployment workflow. Our infrastructure is now version-controlled, DRY, and ready to scale across multiple accounts and regions.
Follow me to get updates about my series IaC Startup on AWS. I'm writing new posts about configuring AWS named profiles, and deploying governance infrastructure with Terragrunt and Terraform. By the end, you'll have a production-ready, multi-account AWS environment managed entirely through Infrastructure as Code.
Also, make sure to check my GitHub profile for new projects and updates.
Buy me a coffee if you like this post! :)