In Terraform, the template_file data source is a common way to inject variable data into a templated file like a script or configuration file (on current Terraform versions, the built-in templatefile() function supersedes it, but the pattern is the same). In my case, I make heavy use of template_file
as part of the spin-up routine for servers on Equinix Metal, which uses cloud-init to run a script on first boot to configure the host.
For the sake of ease and user customizability, I use a map
variable in Terraform to store all of the URLs for Kubernetes manifests that I'd like applied to a Kubernetes cluster once Terraform creates the host and cloud-init has bootstrapped the cluster, which looks like this:
variable "workloads" {
type = map
default = {
ceph_common = "https://raw.githubusercontent.com/rook/rook/release-1.0/cluster/examples/kubernetes/ceph/common.yaml"
ceph_operator = "https://raw.githubusercontent.com/rook/rook/release-1.0/cluster/examples/kubernetes/ceph/operator.yaml"
ceph_cluster_minimal = "https://raw.githubusercontent.com/rook/rook/release-1.0/cluster/examples/kubernetes/ceph/cluster-minimal.yaml"
ceph_cluster = "https://raw.githubusercontent.com/rook/rook/release-1.0/cluster/examples/kubernetes/ceph/cluster.yaml"
open_ebs_operator = "https://openebs.github.io/charts/openebs-operator-1.2.0.yaml"
tigera_operator = "https://docs.projectcalico.org/manifests/tigera-operator.yaml"
calico = "https://docs.projectcalico.org/manifests/custom-resources.yaml"
flannel = "https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml"
metallb_namespace = "https://raw.githubusercontent.com/google/metallb/v0.9.3/manifests/namespace.yaml"
metallb_release = "https://raw.githubusercontent.com/google/metallb/v0.9.3/manifests/metallb.yaml"
}
}
The benefit to this approach, rather than hardcoding these values in a provisioner
in Terraform (if you're not using the Kubernetes provider, which I am not) or in something like a cloud-init script (i.e. kubectl apply -f {whatever above url}
), is that changing a manifest version, or adding or removing workloads from the above, doesn't require modifying the Terraform module itself: these values can be changed, overridden, or omitted entirely from the terraform.tfvars
file on the user's machine. It also means that, if a workload changes, the resource it was embedded within can be updated rather than recycled (destroyed and re-applied), depending on how you wind up using that variable; that isn't the case for this example, though, because it is derived from a cluster provisioner rather than a production-workload Terraform plan.
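For example, a user can redefine the variable in terraform.tfvars to pin different versions or trim the list. One caveat worth knowing: a map assigned this way replaces the default wholesale rather than merging with it, so the file must list every workload that should still be applied (the URLs below are hypothetical placeholders):
# terraform.tfvars -- hypothetical override; only these two workloads would be applied
workloads = {
  ceph_common   = "https://example.com/pinned/ceph/common.yaml"
  ceph_operator = "https://example.com/pinned/ceph/operator.yaml"
}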
Typically, when you want to consume a variable, in this case workloads, if it were of type string, you could plug it into a template like so:
data "template_file" "controller" {
template = file("${path.module}/controller.tpl")
vars = {
workloads = var.workloads
}
}
However, because workloads is of type map, it must be converted using jsonencode:
...
  vars = {
    workloads = jsonencode(var.workloads)
  }
...
and then you can access it in your template file (in the above example, that was controller.tpl):
#!/bin/bash
echo ${workloads}
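To be explicit about the order of operations: Terraform interpolates ${workloads} when it renders the template, before cloud-init ever runs the script, so with a simplified three-entry map the script that actually ships to the host reads:
#!/bin/bash
echo {"key1":"value","key2":"value","key3":"value"}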
One issue I encountered, however, is that echoing this unquoted flattens the rendered JSON into something like:
key1:value key2:value key3:value
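That flattening is plain bash behavior, reproducible in any shell: because the echo is unquoted, brace expansion splits the object on its commas and quote removal strips the double quotes. Single-quoting the interpolation in the template (echo '${workloads}') would have kept the JSON intact, but the rest of this post works with the unquoted output instead:
echo {"key1":"value","key2":"value"}    # unquoted: key1:value key2:value
echo '{"key1":"value","key2":"value"}'  # quoted: {"key1":"value","key2":"value"}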
and since I really wanted a JSON object I could work with and validate, I added an additional step to convert it, first, into a table:
echo ${workloads} | sed 's| |\n|g' | awk '{sub(/:/," ")}1' | tee /root/workloads.data
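Broken down, with comments added, each stage of that pipeline does the following; note that awk's sub() replaces only the first colon on each line, which matters because the URL values contain colons of their own:
echo ${workloads} |
  sed 's| |\n|g' |           # one key:value pair per line
  awk '{sub(/:/," ")}1' |    # turn only the first ":" into a space
  tee /root/workloads.data   # print the table and save it to disk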
so the data now looks like:
key1 value
key2 value
key3 value
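For reference, a table in that shape loads into a Bash associative array in just a few lines, which is the simpler alternative I allude to further down:
declare -A workloads
while read -r key value; do
  workloads["$key"]="$value"
done < /root/workloads.data
echo "${workloads[key1]}"  # prints that workload's URL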
That would normally have been enough, but I liked the idea of using Python to do some additional validation at some point in the future, so I also have it create a copy of that table as a JSON file:
echo ${workloads} | sed 's| |\n|g' | awk '{sub(/:/," ")}1' | tee /root/workloads.data && \
cat << 'EOF' > workloads.py
import json

# Read the space-separated table written above
filename = "/root/workloads.data"
with open(filename) as f:
    content = f.readlines()

# Rebuild it as a dict, stripping the trailing newline from each value
workloads = {}
for w in content:
    key = w.split(" ")[0]
    value = w.split(" ")[1]
    workloads[key] = value.replace("\n", "")

# Write mode ("w") rather than append, so re-runs don't produce invalid JSON
f = open("/root/workloads.json", "w")
f.write(json.dumps(workloads))
f.close()
EOF
python3 workloads.py
since, potentially, I could have the script expand to interact with the Kubernetes API and apply the workloads directly, or add whatever other automation I might want in the future. The end result here is that I have a file I can parse:
{"key1": "value", "key2": "value", "key3":"value"}
in the rest of the script, using the map from Terraform above:
cat workloads.json | jq .key1
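That future automation could start as small as looping over the file and applying each manifest; a minimal sketch, assuming kubectl is already installed and configured on the host:
# Apply every manifest URL stored in workloads.json
jq -r '.[]' /root/workloads.json | while read -r url; do
  kubectl apply -f "$url"
done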
This might seem like overkill, and for our purposes here it absolutely is (as I noted, a few additional lines of bash could have created an associative array from this data that was iterable in the same way), but hopefully it becomes a bit clearer that there are many approaches to handing data managed by Terraform off to resources it has no visibility into. In this case, once a resource like a server:
data "template_file" "controller" {
template = file("${path.module}/controller.tpl")
vars = {
workloads = jsonencode(var.workloads)
}
}
resource "metal_device" "web1" {
project_id = var.project_id
hostname = "web1"
plan = "c3.medium.x86"
facilities = ["ny5"]
operating_system = "ubuntu_20_04"
billing_cycle = "hourly"
user_data = data.template_file.controller.rendered
}
is provisioned, to handle the variable data any way you'd like (CSVs, or maybe the table as-is was fine for you, etc.), which in my case was more JSON, from an object that met that schema.