S3 Cross region replication with Terraform stacks

There have been a few articles on Terraform Stacks and how its core concepts help you manage deployment and provisioning in multi-region/multi-account scenarios. I had one of those scenarios that I wanted to test to get a better understanding of Stacks.

Scenario

S3 Cross region replication

The name itself suggests the challenge with the configuration you are trying to stand up:

- An S3 bucket in one region
- A replication bucket in another region
- The replication rules and necessary IAM roles in the first region referencing the second bucket.

Considering the way Terraform workspaces are set up, managing the relationships and dependencies here can be a pain.

Add in bidirectional replication and you have a bigger challenge with all those dependencies you will need to manage and orchestrate.

Enter Stacks

Terraform Stacks are a configuration layer in HCP Terraform that simplifies how you manage your infrastructure modules. Some of the use cases:

  • Deploy an entire application with components like networking, storage, and compute as a single unit without worrying about dependencies
  • Deploy across multiple regions, availability zones, and cloud provider accounts without duplicating effort/code

How is this done, though? There are a few pieces to the puzzle.

Components, providers, and variables

Components are an alternative to module invocations. If you are familiar with using a module with a source and variable inputs, you will feel right at home here.

component "cluster" {
  source = "" # Can be local or from a source you can pull from
  inputs = {
    ......  # variable inputs you would want to provide to the undelying module
  }
  providers = {
    ...   # Not entirely new, but providers via components are a necessity as modules used for components cannot have provider definition in them. So this is your way to add provider details.
  }
}
  • Defined in a <name>.tfstack.hcl file.

Providers, at a high level, are your configuration mechanisms for interacting with a target platform (cloud provider, artifact registry, etc.).

  • They differ from your existing provider definitions in that they resemble the resource/data source blocks you may already have: named blocks with both a type and a name label.
provider "aws" "<name_you_can_remember>" {
  config {
    region = <region>

    assume_role_with_web_identity {
      role_arn           = .. # ARN of the role of the target platform you are deploying to
      web_identity_token = .. # Workload identity token generated by a run in HCP Terraform
    }
  }
}
  • Defined in a providers.tfstack.hcl file.

Variables are not a new concept; they mirror the root module variables you may have defined before.
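A minimal sketch of what this stack's variables.tfstack.hcl could declare, based on the inputs used later in the deployments file (the ephemeral flag keeps the token value out of logs and state):

variable "identity_token" {
  type      = string
  ephemeral = true # avoid persisting the token in logs or state
}

variable "role_arn" {
  type = string
}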

Deployments

Deployments are your programmatic way of defining what to include in a provisioning step, and they act as the entrypoint for your Stack.

deployment "environment_id_you_want_to_use" {
  inputs = {
    identity_token          = identity_token.aws.jwt
    role_arn                = store.varset.roles.role_dev

    # ... any other inputs the components need as part of the module invocation
  }
}

  • Defined under a <name>.tfdeploy.hcl file.
  • Reference: deployment

Orchestration rules

Orchestration rules are written in HCL and allow you to automate some of the repetitive actions, based on the context information your Stack's deployments provide. Terraform has been slowly easing us into this concept with checks, which use a condition and error_message to evaluate a rule. Orchestration rules look very similar, but they help automate the approvals of your Stack.


  • One of the orchestration rules available is auto_approve, which can be gated on a check in the format below.
orchestrate "auto_approve" “safe_plans” {
  check {
    #check that there are no resources being removed
    condition = context.plan.changes.remove == 0
    reason = "Plan has ${context,plan.changes. remove} resources to be removed."
  }
}


Additional blocks

store block

A store block can be used to reference a variable set in your HCP Terraform project, which can include credentials you want to read at runtime. Mark the variables you assign these values to as ephemeral, and you avoid having the values show up in your logs or state. I wish these could reference a name rather than an id, since they are scoped to the project anyway.

store "varset" "roles" {
  id       = "<variables-set-id>"
  category = "env"
}

# Access your variable set's value using your store and pass them into your 
# deployment's inputs.
deployment "dev" {
  inputs = {
    role_arn = store.varset.roles.role_dev  
  }
}

identity_token block

The identity_token block gives you a token that is used to authenticate the AWS provider in this case. It builds on the dynamic credentials mechanism HCP Terraform has with AWS, with a few updates to the subject claim compared to the existing workspace-based setup.
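The block itself is small; it only declares the audience for the generated token (you will see it again in the deployments file later):

identity_token "aws" {
  audience = ["aws.workload.identity"]
}

On the AWS side, the trust relationship on the role would look something like this: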

            "Condition": {
                "StringEquals": {
                    "app.terraform.io:aud": "aws.workload.identity"
                },
                "StringLike": {
                    "app.terraform.io:sub": "organization:<org_name>:project:<proj_name>:stack:<stack_name>:deployment:<deployment_name>:operation:<operation_type>"
                }
            }

Read more about the OIDC-based authentication here

S3 Cross region replication stack

Let's dive into this one. The project structure in my case is shown below; you could also reference your modules from an external source. All the code here is in the GitHub repo at s3 cross region replication

.
├── README.md
├── components.tfstack.hcl
├── deployments.tfdeploy.hcl
├── providers.tfstack.hcl
├── variables.tfstack.hcl
└── modules
    ├── replication
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    └── s3
        ├── main.tf
        ├── outputs.tf
        └── variables.tf

Modules

There are two modules:

  • An S3 bucket module which deploys an S3 bucket, enables versioning, and outputs the bucket ARN and ID.
  • A replication module with an IAM role, policy, and an S3 replication rule referencing a source and destination bucket. Both are sketched below.
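The real module code lives in the repo linked above; what follows is only a minimal sketch of what the two modules might contain, reconstructed from the inputs and outputs the components reference (the IAM policy granting the replication permissions is omitted for brevity):

# modules/s3 (sketch): a bucket with versioning enabled, exposing arn and id
resource "aws_s3_bucket" "this" {
  bucket = var.bucket_name
}

resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id
  versioning_configuration {
    status = "Enabled" # replication requires versioning on both buckets
  }
}

output "bucket_arn" {
  value = aws_s3_bucket.this.arn
}

output "bucket_id" {
  value = aws_s3_bucket.this.id
}

# modules/replication (sketch): a role S3 can assume, plus the replication rule
resource "aws_iam_role" "replication" {
  name = var.role_name
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "s3.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_s3_bucket_replication_configuration" "this" {
  bucket = var.source_bucket # id of the source bucket
  role   = aws_iam_role.replication.arn

  rule {
    id     = "replication"
    status = "Enabled"

    destination {
      bucket = var.destination_bucket_arn
    }
  }
}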

Providers

Providers are defined in the root of the stack configuration with the required inputs and region references. You could even add for_each to a provider block if you want some components deployed across multiple regions and referenced from one place (like in an IPAM hub-and-spoke model); a sketch of that follows the provider definitions below.

required_providers {
  aws = {
    source  = "hashicorp/aws"
    version = "5.72.1"
  }
}

provider "aws" "source" {
  config {
    region = "us-east-1"

    assume_role_with_web_identity {
      role_arn           = var.role_arn
      web_identity_token = var.identity_token
    }
  }
}

provider "aws" "dest" {
  config {
    region = "us-west-2"

    assume_role_with_web_identity {
      role_arn           = var.role_arn
      web_identity_token = var.identity_token
    }
  }
}

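The for_each variant mentioned above might look like this. This is a sketch only; var.regions is a hypothetical set-of-strings variable that is not part of this stack:

provider "aws" "by_region" {
  for_each = var.regions # hypothetical, e.g. ["us-east-1", "us-west-2"]

  config {
    region = each.value

    assume_role_with_web_identity {
      role_arn           = var.role_arn
      web_identity_token = var.identity_token
    }
  }
}

A component could then reference a specific configuration, e.g. provider.aws.by_region["us-west-2"].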

Components

We are dividing this entire operation into three components:

  • One to provision the source S3 bucket
  • Second to provision the destination S3 bucket
  • Third to create the IAM roles and replication rules.

Note that the provider references for source and destination are different.

component "source" {
  source = "./modules/s3"
  inputs = {
    bucket_name            = "${var.source_bucket_name}-${var.suffix}"
  }
  providers = {
    aws = provider.aws.source
  }
}

component "destination" {
  source = "./modules/s3"
  inputs = {
    bucket_name          = "${var.destination_bucket_name}-${var.suffix}"
  }
  providers = {
    aws = provider.aws.dest
  }
}


component "replication" {
  source = "./modules/replication"
  inputs = {
    role_name              = var.replication_role
    policy_name            = var.replication_policy
    source_bucket_arn      = component.source.bucket_arn
    destination_bucket_arn = component.destination.bucket_arn
    source_bucket          = component.source.bucket_id
  }
  providers = {
    aws = provider.aws.source
  }
}


Deployments

We have a deployment below which provisions this infrastructure stack into those two regions. If we wanted to provision it to another account, all we would need is another deployment block with a scoped role_arn (an example follows the code below).


identity_token "aws" {
  audience = ["aws.workload.identity"]
}

store "varset" "roles" {
  id       = "varset-6pcUK8q4FQVLBRJY"
  category = "env"
}

deployment "dev" {
  inputs = {
    identity_token          = identity_token.aws.jwt
    role_arn                = store.varset.roles.role_dev

    source_bucket_name      = "manu-2024-source"
    destination_bucket_name = "manu-2024-dest"
    suffix                  = "dev"

    replication_role        = "stacks-replication"
    replication_policy      = "stacks-replication"
  }
}

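For instance, a second deployment targeting another account could look like this (hypothetical: it assumes the variable set also contains a role_prod entry scoped to that account):

deployment "prod" {
  inputs = {
    identity_token          = identity_token.aws.jwt
    role_arn                = store.varset.roles.role_prod # hypothetical entry

    source_bucket_name      = "manu-2024-source"
    destination_bucket_name = "manu-2024-dest"
    suffix                  = "prod"

    replication_role        = "stacks-replication"
    replication_policy      = "stacks-replication"
  }
}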

tfstacks CLI

The terraform-stacks-cli is a command-line tool for validating, initializing, and testing Stack configurations. I will skip the installation steps here; you can find them in the documentation linked below.

More about the tfstacks CLI here

Additional files

  • .terraform-version: Stacks currently work with an alpha version of the Terraform binary, so you need to ensure it is installed and its version pinned in a file with this name.
  1.11.0-alpha20241106

  • .terraform.lock.hcl: Stacks currently need the lock file for the providers you are using, to ensure those exact versions are pulled down. You can generate it by running the tfstacks providers lock command from the root path of the stack configuration.

Provisioning

From your HCP Terraform project, create a Stack (instead of a workspace) and associate the repo which holds this code. You need to enable the Stacks beta for your organization before you can create a Stack.


  • I had a variable set in the project that held the role_arns for my target AWS accounts, read via the store block.
  • The role_arn was used by the Stack to authenticate with the AWS account.
  • Once authentication completed, provisioning proceeded across the target accounts the role_arns were provided for (mapped to deployments).

I am not diving into the details of setting this up in the Stacks UI, but it should be very intuitive for anyone familiar with workspaces and deployments in general.


Plan details:

When the plan is executed, it does follow the source->destination->replication flow. I wish the plan structure shown in the applied configuration showed that order.


Terraform state:

As with any Terraform configuration, you want to know that Terraform is managing the state of what you provisioned. Stacks manage state per deployment in HCP Terraform.

  • Open one of the deployments -> Select State history

The inspect state data option on that page lets you extract the state file as state.tfstackstate.json if needed. A snippet of the managed state is below:

{
    "format_version": 1,
    "terraform_version": "1.11.0-alpha20241106",
    "components": [
        {
            "address": "component.destination",
            "component_address": "component.destination",
            "instance_correlator": "qFdpMimf/Uj7lSI5gfo3VHp7WJub2kmHeHZn0VZNW80=",
            "component_correlator": "sXZ+Hojf6Pd/ueCO6AQ1uhDKtfyKCfLzyrQeppezopE="
        }
    ],
    "resource_instances": [
        {
            "component_instance_correlator": "qFdpMimf/Uj7lSI5gfo3VHp7WJub2kmHeHZn0VZNW80=",
            "component_instance_address": "component.destination",
            "address": "aws_s3_bucket.this",
            "mode": "managed",
            "type": "aws_s3_bucket"

....

Conclusion

The idea of being able to provision this across multiple accounts and regions easily, as a single infrastructure construct, is a thing of beauty. Before this, I have had to manage those dependencies in many ways to get cross-region replication configured for a source S3 bucket. We will dive into deferred changes and other Stack-specific details in a future article. Hopefully this gave you a better view of what Stacks offer from an infrastructure management standpoint.

You can in fact extend the replication component to make this a bidirectional replication configuration with very few changes.
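A hedged sketch of what that could look like: a second instance of the same replication module with source and destination swapped and the provider flipped to the destination region (the -reverse names are illustrative, not from the repo):

component "replication_reverse" {
  source = "./modules/replication"
  inputs = {
    role_name              = "${var.replication_role}-reverse" # illustrative name
    policy_name            = "${var.replication_policy}-reverse"
    source_bucket_arn      = component.destination.bucket_arn
    destination_bucket_arn = component.source.bucket_arn
    source_bucket          = component.destination.bucket_id
  }
  providers = {
    aws = provider.aws.dest
  }
}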
