S3 Cross region replication with Terraform stacks

There have been a few articles on Terraform Stacks and how its core concepts help you manage deployment and provisioning in multi-region/multi-account scenarios. I had one of those scenarios that I wanted to test to get a better understanding of Stacks.

Scenario

S3 Cross region replication

The name itself suggests the challenge with the configuration you are trying to stand up:

- An S3 bucket in one region
- A replication bucket in another region
- The replication rules and necessary IAM roles in the first region referencing the second bucket.

Considering the way Terraform workspaces are set up, managing the relationships and dependencies here can be a pain.

Add in bidirectional replication and you have a bigger challenge with all those dependencies you will need to manage and orchestrate.

Enter Stacks

Terraform Stacks are a configuration layer in HCP Terraform that simplifies how you manage your infrastructure modules. Some of the use cases:

  • Deploy an entire application with components like networking, storage, and compute as a single unit without worrying about dependencies
  • Deploy across multiple regions, availability zones, and cloud provider accounts without duplicating effort/code

How is this done, though? There are a few pieces to the puzzle.

Components, providers, and variables

Components are an alternative to module invocations. If you are familiar with using a module with a source and variable inputs, you will feel right at home here.

component "cluster" {
  source = "" # Can be local or from a source you can pull from
  inputs = {
    ......  # variable inputs you would want to provide to the undelying module
  }
  providers = {
    ...   # Not entirely new, but providers via components are a necessity as modules used for components cannot have provider definition in them. So this is your way to add provider details.
  }
}
  • Defined in a <name>.tfstack.hcl file.

Providers, at a high level, are your configuration mechanisms for interacting with a target platform (cloud provider, artifact registry, etc.).

  • They differ from your existing provider definitions in that they resemble the resource/data source blocks you may already have: named blocks with both a type and a name label.
provider "aws" "<name_you_can_remember>" {
  config {
    region = <region>

    assume_role_with_web_identity {
      role_arn           = .. # ARN of the role of the target platform you are deploying to
      web_identity_token = .. # Workload identity token generated by a run in HCP Terraform
    }
  }
}
  • Defined in a providers.tfstack.hcl file.

Variables are not a new concept; they mirror the root module variables you may have defined before.
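A minimal sketch of what this stack's variables.tfstack.hcl could declare, based on the inputs used later in the deployments file (the ephemeral flag keeps the token value out of logs and state):

variable "identity_token" {
  type      = string
  ephemeral = true # avoid persisting the token in logs or state
}

variable "role_arn" {
  type = string
}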

Deployments

Deployments are your programmatic way of defining what to include in a provisioning step, and they act as the entrypoint for your Stack.

deployment "environment_id_you_want_to_use" {
  inputs = {
    identity_token          = identity_token.aws.jwt
    role_arn                = store.varset.roles.role_dev

    # ... any other inputs the components need as part of the module invocation
  }
}

  • Defined under a <name>.tfdeploy.hcl file.
  • Reference: deployment

Orchestration rules

Orchestration rules are written in HCL and allow you to automate some of the repetitive actions, based on the context information your Stack's deployments provide. Terraform has been slowly easing us into this concept with checks, which use a condition and error_message to evaluate a rule. Orchestration rules look very similar, but they help automate the approvals of your Stack.


  • One of the orchestration rules available is auto_approve, which can be gated on a check in the format below.
orchestrate "auto_approve" “safe_plans” {
  check {
    #check that there are no resources being removed
    condition = context.plan.changes.remove == 0
    reason = "Plan has ${context,plan.changes. remove} resources to be removed."
  }
}


Additional blocks

store block

A store block can be used to reference a variable set in your HCP Terraform project, which can include credentials you want to read at runtime. Mark the variables you assign these values to as ephemeral, and you avoid having the values show up in your logs or state. I wish these could reference a name rather than an id, since they are scoped to the project anyway.

store "varset" "roles" {
  id       = "<variables-set-id>"
  category = "env"
}

# Access your variable set's value using your store and pass them into your 
# deployment's inputs.
deployment "dev" {
  inputs = {
    role_arn = store.varset.roles.role_dev  
  }
}

identity_token block

The identity_token block gives you a token that is used to authenticate the AWS provider in this case. It builds on the dynamic credentials mechanism HCP Terraform has with AWS, with a few updates to the subject claim compared to the existing workspace-based setup.
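The block itself is small; it only declares the audience for the generated token (you will see it again in the deployments file later):

identity_token "aws" {
  audience = ["aws.workload.identity"]
}

On the AWS side, the trust relationship on the role would look something like this: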

            "Condition": {
                "StringEquals": {
                    "app.terraform.io:aud": "aws.workload.identity"
                },
                "StringLike": {
                    "app.terraform.io:sub": "organization:<org_name>:project:<proj_name>:stack:<stack_name>:deployment:<deployment_name>:operation:<operation_type>"
                }
            }

Read more about the OIDC-based authentication here

S3 Cross region replication stack

Let's dive into this one. The project structure in my case is shown below; you could also reference your modules from an external source. All the code here is in the GitHub repo at s3 cross region replication

.
├── README.md
├── components.tfstack.hcl
├── deployments.tfdeploy.hcl
├── providers.tfstack.hcl
├── variables.tfstack.hcl
└── modules
    ├── replication
    │   ├── main.tf
    │   ├── outputs.tf
    │   └── variables.tf
    └── s3
        ├── main.tf
        ├── outputs.tf
        └── variables.tf

Modules

There are two modules:

  • An S3 bucket module which deploys an S3 bucket, enables versioning, and outputs the bucket ARN and ID.
  • A replication module with an IAM role, policy, and an S3 replication rule referencing a source and destination bucket. Both are sketched below.
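The real module code lives in the repo linked above; what follows is only a minimal sketch of what the two modules might contain, reconstructed from the inputs and outputs the components reference (the IAM policy granting the replication permissions is omitted for brevity):

# modules/s3 (sketch): a bucket with versioning enabled, exposing arn and id
resource "aws_s3_bucket" "this" {
  bucket = var.bucket_name
}

resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id
  versioning_configuration {
    status = "Enabled" # replication requires versioning on both buckets
  }
}

output "bucket_arn" {
  value = aws_s3_bucket.this.arn
}

output "bucket_id" {
  value = aws_s3_bucket.this.id
}

# modules/replication (sketch): a role S3 can assume, plus the replication rule
resource "aws_iam_role" "replication" {
  name = var.role_name
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "s3.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_s3_bucket_replication_configuration" "this" {
  bucket = var.source_bucket # id of the source bucket
  role   = aws_iam_role.replication.arn

  rule {
    id     = "replication"
    status = "Enabled"

    destination {
      bucket = var.destination_bucket_arn
    }
  }
}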

Providers

Providers are defined in the root of the stack configuration with the required inputs and region references. You could even add for_each to a provider block if you want some components deployed across multiple regions and referenced from one place (like in an IPAM hub-and-spoke model); a sketch of that follows the provider definitions below.

required_providers {
  aws = {
    source  = "hashicorp/aws"
    version = "5.72.1"
  }
}

provider "aws" "source" {
  config {
    region = "us-east-1"

    assume_role_with_web_identity {
      role_arn           = var.role_arn
      web_identity_token = var.identity_token
    }
  }
}

provider "aws" "dest" {
  config {
    region = "us-west-2"

    assume_role_with_web_identity {
      role_arn           = var.role_arn
      web_identity_token = var.identity_token
    }
  }
}

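The for_each variant mentioned above might look like this. This is a sketch only; var.regions is a hypothetical set-of-strings variable that is not part of this stack:

provider "aws" "by_region" {
  for_each = var.regions # hypothetical, e.g. ["us-east-1", "us-west-2"]

  config {
    region = each.value

    assume_role_with_web_identity {
      role_arn           = var.role_arn
      web_identity_token = var.identity_token
    }
  }
}

A component could then reference a specific configuration, e.g. provider.aws.by_region["us-west-2"].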

Components

We are dividing this entire operation into three components:

  • One to provision the source S3 bucket
  • Second to provision the destination S3 bucket
  • Third to create the IAM roles and replication rules.

Note that the provider references for source and destination are different.

component "source" {
  source = "./modules/s3"
  inputs = {
    bucket_name            = "${var.source_bucket_name}-${var.suffix}"
  }
  providers = {
    aws = provider.aws.source
  }
}

component "destination" {
  source = "./modules/s3"
  inputs = {
    bucket_name          = "${var.destination_bucket_name}-${var.suffix}"
  }
  providers = {
    aws = provider.aws.dest
  }
}


component "replication" {
  source = "./modules/replication"
  inputs = {
    role_name              = var.replication_role
    policy_name            = var.replication_policy
    source_bucket_arn      = component.source.bucket_arn
    destination_bucket_arn = component.destination.bucket_arn
    source_bucket          = component.source.bucket_id
  }
  providers = {
    aws = provider.aws.source
  }
}


Deployments

We have a deployment below which provisions this infrastructure stack into those two regions. If we wanted to provision it to another account, all we would need is another deployment block with a scoped role_arn (an example follows the code below).


identity_token "aws" {
  audience = ["aws.workload.identity"]
}

store "varset" "roles" {
  id       = "varset-6pcUK8q4FQVLBRJY"
  category = "env"
}

deployment "dev" {
  inputs = {
    identity_token          = identity_token.aws.jwt
    role_arn                = store.varset.roles.role_dev

    source_bucket_name      = "manu-2024-source"
    destination_bucket_name = "manu-2024-dest"
    suffix                  = "dev"

    replication_role        = "stacks-replication"
    replication_policy      = "stacks-replication"
  }
}

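For instance, a second deployment targeting another account could look like this (hypothetical: it assumes the variable set also contains a role_prod entry scoped to that account):

deployment "prod" {
  inputs = {
    identity_token          = identity_token.aws.jwt
    role_arn                = store.varset.roles.role_prod # hypothetical entry

    source_bucket_name      = "manu-2024-source"
    destination_bucket_name = "manu-2024-dest"
    suffix                  = "prod"

    replication_role        = "stacks-replication"
    replication_policy      = "stacks-replication"
  }
}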

tfstacks CLI

The terraform-stacks-cli is a command-line tool for validating, initializing, and testing Stack configurations. I will skip the installation steps here; you can find them in the documentation linked below.

More about the tfstacks CLI here

Additional files

  • .terraform-version: Stacks currently work with an alpha version of the Terraform binary, so you need to ensure it is installed and its version pinned in a file with this name.
  1.11.0-alpha20241106

  • .terraform.lock.hcl: Stacks currently need the lock file for the providers you are using, to ensure those exact versions are pulled down. You can generate it by running the tfstacks providers lock command from the root path of the stack configuration.

Provisioning

From your HCP Terraform project, create a Stack (instead of a workspace) and associate the repo which holds this code. You need to enable the Stacks beta for your organization before you can create a Stack.


  • I had a variable set in the project that held the role_arns for my target AWS accounts, read via the store block.
  • The role_arn was used by the Stack to authenticate with the AWS account.
  • Once authentication completed, provisioning proceeded across the target accounts the role_arns were provided for (mapped to deployments).

I am not diving into the details of setting this up in the Stacks UI, but it should be very intuitive for anyone familiar with workspaces and deployments in general.


Plan details:

When the plan is executed, it does follow the source->destination->replication flow. I wish the plan structure shown in the applied configuration showed that order.


Terraform state:

As with any Terraform configuration, you want to know that Terraform is managing the state of what you provisioned. Stacks manage state per deployment in HCP Terraform.

  • Open one of the deployments -> Select State history

The inspect state data option on that page lets you extract the state file as state.tfstackstate.json if needed. A snippet of the managed state is below:

{
    "format_version": 1,
    "terraform_version": "1.11.0-alpha20241106",
    "components": [
        {
            "address": "component.destination",
            "component_address": "component.destination",
            "instance_correlator": "qFdpMimf/Uj7lSI5gfo3VHp7WJub2kmHeHZn0VZNW80=",
            "component_correlator": "sXZ+Hojf6Pd/ueCO6AQ1uhDKtfyKCfLzyrQeppezopE="
        }
    ],
    "resource_instances": [
        {
            "component_instance_correlator": "qFdpMimf/Uj7lSI5gfo3VHp7WJub2kmHeHZn0VZNW80=",
            "component_instance_address": "component.destination",
            "address": "aws_s3_bucket.this",
            "mode": "managed",
            "type": "aws_s3_bucket"

....

Conclusion

The idea of being able to provision this across multiple accounts and regions easily, as a single infrastructure construct, is a thing of beauty. Before this, I have had to manage those dependencies in many ways to get cross-region replication configured for a source S3 bucket. We will dive into deferred changes and other Stack-specific details in a future article. Hopefully this gave you a better view of what Stacks offer from an infrastructure management standpoint.

You can in fact extend the replication component to make this a bidirectional replication configuration with very few changes.
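A hedged sketch of what that could look like: a second instance of the same replication module with source and destination swapped and the provider flipped to the destination region (the -reverse names are illustrative, not from the repo):

component "replication_reverse" {
  source = "./modules/replication"
  inputs = {
    role_name              = "${var.replication_role}-reverse" # illustrative name
    policy_name            = "${var.replication_policy}-reverse"
    source_bucket_arn      = component.destination.bucket_arn
    destination_bucket_arn = component.source.bucket_arn
    source_bucket          = component.destination.bucket_id
  }
  providers = {
    aws = provider.aws.dest
  }
}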
