For the past several days I've been Terraforming OpenSearch resources with the Elasticsearch/OpenSearch provider. Phill Baker did really great work, so take a minute of your time and star this repo!
The problem statement
Let's assume that our infrastructure has three environments (`Development`, `UAT` and `Prod`) and three clusters per environment: `A`, `B` and `C`.
How would you organise the project structure so that each cluster in each environment has a separate state file?
But why might one need such a setup? Isn't it enough to have a state file per environment? Of course, there are scenarios where that is absolutely fine; nonetheless, it's worth considering:
- 💥 The "blast radius". A potential misconfiguration will impact only a single cluster, while the others stay safe and sound because their states are managed independently.
- ⏱ `plan`/`apply` step time. In very large projects it can take even a few hours. On top of that, an update touching a large number of resources may lead to API request throttling by cloud providers. Not a nice situation to be in.
In this post I'd like to describe a solution that I came up with. Hopefully, it will be useful for anyone solving something similar.
First approximation
Usually a multi-cluster setup means that each cluster has its own purpose; structure-wise, however, the clusters have the same set of resources. We can encapsulate the logic of creating `index_templates`, `ism_policies`, etc. in the `opensearch` module by providing a number of `tftpl` template files with OpenSearch resource definitions.
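Inside the module, each template file can be rendered with `templatefile()` and fed to the corresponding provider resource. Here is a minimal sketch of that idea; the `index_templates` variable and the `opensearch_index_template` resource are assumptions, so check the exact input and resource names of the provider version you use:

```hcl
# opensearch/main.tf (a sketch; variable and resource names are assumptions)
variable "index_templates" {
  description = "Paths to index template *.tftpl files, relative to the root module"
  type        = set(string)
}

resource "opensearch_index_template" "this" {
  for_each = var.index_templates

  # derive the template name from the file name: "logs.tftpl" -> "logs"
  name = trimsuffix(basename(each.value), ".tftpl")

  # render the template; pass variables into it here if your templates need them
  body = templatefile("${path.root}/${each.value}", {})
}
```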
The content of the root-level `clusters.tf` file can then look like this:
module "cluster_a" {
source = "./opensearch"
index_templates = fileset(path.module, "/path/to/cluster_a/index_templates/*.tftpl")
ism_policy_templates = fileset(path.module, "/path/to/cluster_a/ism_policy_templates/*.tftpl")
...
}
module "cluster_b" {
source = "./opensearch"
index_templates = fileset(path.module, "/path/to/cluster_b/index_templates/*.tftpl")
ism_policy_templates = fileset(path.module, "/path/to/cluster_b/ism_policy_templates/*.tftpl")
...
}
module "cluster_c" {
source = "./opensearch"
index_templates = fileset(path.module, "/path/to/cluster_c/index_templates/*.tftpl")
ism_policy_templates = fileset(path.module, "/path/to/cluster_c/ism_policy_templates/*.tftpl") ...
}
An abstract CI/CD tool can execute the following commands (from the project root level) to initialize and deploy the project:
```bash
export ENVIRONMENT=dev

terraform init -backend-config=./environment/$ENVIRONMENT/config.s3.tfbackend
terraform plan -var-file=./environment/$ENVIRONMENT/variables.tfvars
terraform apply -var-file=./environment/$ENVIRONMENT/variables.tfvars
```
A couple of notes:
- The `config.s3.tfbackend` file reflects the fact that I use the S3 remote backend, but that's not a requirement; you are good to go with any other backend.
- The `variables.tfvars` file makes it easier to pass the values required for the `opensearch` module to create resources.
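For illustration, here is what these two files might contain (a sketch; the bucket, key, region and variable name are made up):

```hcl
# environment/dev/config.s3.tfbackend (hypothetical values)
bucket = "my-company-terraform-states"
key    = "opensearch/dev/terraform.tfstate"
region = "eu-west-1"
```

```hcl
# environment/dev/variables.tfvars (hypothetical variable)
opensearch_url = "https://dev.cluster.example.com:9200"
```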
Next step
The easiest way to convert the existing project to a multi-state-file one is to initialize each `opensearch` module (for `cluster_a`, `cluster_b` and `cluster_c`) separately. The structure below shows what the new layout might look like:
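A rough sketch of that layout (the exact file names inside the folders are assumptions):

```
.
├── cluster_a/
│   └── main.tf               # calls the shared ./../opensearch module
├── cluster_b/
│   └── main.tf
├── cluster_c/
│   └── main.tf
├── opensearch/               # the shared module
└── environment/
    ├── dev/
    │   ├── config.s3.tfbackend
    │   └── variables.tfvars
    ├── uat/
    └── prod/
```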
This solution is fit for purpose: it gives us a multi-state Terraform project. The `-chdir` option is used here so that commands can be launched from the project root level:
```bash
export ENVIRONMENT=dev
export CLUSTER=cluster_a

terraform -chdir=./$CLUSTER/ init -backend-config=./../environment/$ENVIRONMENT/config.s3.tfbackend
terraform -chdir=./$CLUSTER/ plan -var-file=./../environment/$ENVIRONMENT/variables.tfvars
terraform -chdir=./$CLUSTER/ apply -var-file=./../environment/$ENVIRONMENT/variables.tfvars

...
# a similar set of commands for 'cluster_b' and 'cluster_c'
```
The obvious downside of this solution: multiple `cluster_<id>` folders with almost identical content.
Can we do better?
It would be great if we could initialize the same `cluster` folder for different backend files without creating multiple copies of it. `TF_DATA_DIR` to the rescue. As per the documentation:

> `TF_DATA_DIR` changes the location where Terraform keeps its per-working-directory data, such as the current backend configuration.
This is exactly what we need: by using `TF_DATA_DIR` we've written the `cluster` initialization code only once but "reused" it with all our `tfbackend` and `tfvars` files, thus obtaining a dedicated state file per cluster and per environment. Here is how the final project structure looks:
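A rough sketch (the `cluster_<id>` folders now hold only Terraform's internal data, no configuration):

```
.
├── cluster/                  # the only folder with *.tf files
│   └── main.tf
├── opensearch/               # the shared module
├── environment/
│   ├── dev/
│   ├── uat/
│   └── prod/
├── cluster_a/                # TF_DATA_DIR for cluster_a
├── cluster_b/                # TF_DATA_DIR for cluster_b
└── cluster_c/                # TF_DATA_DIR for cluster_c
```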
And the Terraform commands:
```bash
mkdir {cluster_a,cluster_b,cluster_c} # create the per-cluster TF_DATA_DIR folders

export ENVIRONMENT=dev
export CLUSTER=cluster_a
export TF_DATA_DIR=./../$CLUSTER # resolved relative to ./cluster/ once -chdir takes effect

terraform -chdir=./cluster/ init -backend-config=./../environment/$ENVIRONMENT/config.s3.tfbackend
terraform -chdir=./cluster/ plan -var-file=./../environment/$ENVIRONMENT/variables.tfvars
terraform -chdir=./cluster/ apply -var-file=./../environment/$ENVIRONMENT/variables.tfvars

export CLUSTER=cluster_b
export TF_DATA_DIR=./../$CLUSTER
...
```
This set of commands could be refactored as a loop and wrapped in a utility script.
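A minimal sketch of such a script (the `deploy.sh` name and the hard-coded cluster list are my assumptions; `apply` is left interactive on purpose):

```bash
#!/usr/bin/env bash
# deploy.sh — run from the project root: ./deploy.sh dev
set -euo pipefail

ENVIRONMENT="${1:?usage: deploy.sh <environment>}"

for CLUSTER in cluster_a cluster_b cluster_c; do
  mkdir -p "$CLUSTER"               # data directory for this cluster
  export TF_DATA_DIR=./../$CLUSTER  # resolved relative to ./cluster/ once -chdir takes effect
  terraform -chdir=./cluster/ init -backend-config=./../environment/$ENVIRONMENT/config.s3.tfbackend
  terraform -chdir=./cluster/ plan -var-file=./../environment/$ENVIRONMENT/variables.tfvars
  terraform -chdir=./cluster/ apply -var-file=./../environment/$ENVIRONMENT/variables.tfvars
done
```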
Final thoughts
It is always nice to write as little code/configuration as possible without sacrificing functionality or readability.
The approach shown in this post works fine. However, if there are no limitations on the tools you can choose, I would suggest looking at Terragrunt for solving a similar issue.
Thank you for your time! If you have any suggestions or concerns, you are more than welcome to reach out in the comments or via Twitter.
P.S. Thanks to Eugene Rubanov for the review and feedback 🤝.