Consuming and decoding JSON in Terraform

#terraform #devops

Intro

HCL, the language of Terraform, has come a long way since 0.11. Across the releases since then, consuming JSON and writing adaptive Terraform has become considerably easier thanks to additions like for loops, arrays, and maps.

Scenarios for consuming JSON in Terraform

Consuming JSON and then working with it is not always easy. In this post, I'll be showing how to consume JSON in 3 different ways:

Reading a file in the local directory
Reading in JSON from an AWS S3 bucket
Passing in JSON via environment variables

Why these 3 specific options? Well, all 3 are totally viable ways to consume JSON, and which one fits depends largely on your needs. If the JSON is static, reading from S3 might be a good option. If you are running Terraform in a containerized environment, passing it in as an environment variable may be a good way to differentiate between prod and testing. Or, you could even pass in a file location as an environment variable, and Terraform would then read that file from the local directory where it is being run.

How to consume the JSON in Terraform

To start, we are using a built-in function called jsondecode(). According to the documentation, it interprets a given string as JSON. The most important part is that jsondecode maps JSON values to Terraform language values, meaning you can now use loops, store specific values in maps and arrays, or do anything else you can think of using HCL (the language of Terraform).
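To make that type mapping concrete, here is a minimal sketch that decodes a small inline JSON string. The names sample, sample_name, sample_admin, first_port, and sample_summary are made up for illustration and are not used elsewhere in this post.

```hcl
locals {
  # A small inline JSON document, used only for illustration
  sample = jsondecode("{\"user_name\": \"user_1\", \"admin\": true, \"ports\": [22, 443]}")

  sample_name  = local.sample.user_name # JSON string  -> Terraform string
  sample_admin = local.sample.admin     # JSON boolean -> Terraform bool
  first_port   = local.sample.ports[0]  # JSON array   -> Terraform tuple, indexed from 0
}

output "sample_summary" {
  # Interpolation converts the bool and the number to strings
  value = "${local.sample_name} admin=${local.sample_admin} port=${local.first_port}"
}
```

Once decoded, the value behaves like any other Terraform object: attributes are reached with dot notation and array elements with an index.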

I am providing some sample JSON here that can be used for this post:

{
    "users": [
        {
            "user_name": "user_1",
            "role": "admin",
            "ssh_key": "ssh-rsa [shortened]"
        },
        {
            "user_name": "user_2",
            "role": "dev",
            "ssh_key": "ssh-rsa [shortened]"
        },
        {
            "user_name": "user_3",
            "role": "read_only",
            "ssh_key": "ssh-rsa [shortened]"
        },
        {
            "user_name": "user_4",
            "role": "dev",
            "ssh_key": "ssh-rsa [shortened]"
        }
    ]
}

Reading a file from your local directory

Now, create both a main.tf file and a users.json file that hosts the above JSON. Within the Terraform file, add the following block:

locals {
    # get json 
    user_data = jsondecode(file("${path.module}/users.json"))

    # get all users
    all_users = [for user in local.user_data.users : user.user_name]
}

output "users" {
    value = local.all_users
}

To break down the above, we are:

Decoding the JSON into a local value that can be queried with HCL
Creating an array by looping over each user and adding their user name to it
Outputting all user names from that local value
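The same for syntax can also filter and build maps. The sketch below assumes the locals block above is already in main.tf (it reuses local.user_data); dev_users and role_by_user are names I made up for this example.

```hcl
locals {
  # Only the user names whose role is "dev"
  dev_users = [
    for user in local.user_data.users : user.user_name
    if user.role == "dev"
  ]

  # A map of user name => role, handy for lookups by name
  role_by_user = {
    for user in local.user_data.users : user.user_name => user.role
  }
}

output "dev_users" {
  value = local.dev_users # ["user_2", "user_4"] with the sample JSON
}
```

Square brackets produce a list-like result and curly braces with => produce a map-like one, so the choice of brackets decides the shape of what you get back.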

To test it out, run a terraform init and then terraform plan! It should output something like this:

Changes to Outputs:
  + users = [
      + "user_1",
      + "user_2",
      + "user_3",
      + "user_4",
    ]
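As mentioned earlier, the file location itself can also be passed in rather than hard-coded. This is a rough sketch of that idea, assuming a hypothetical users_file variable and a users_from_path local that are not part of the example above.

```hcl
# Hypothetical variable: file name relative to this module.
# It could be set from the shell, for example:
#   TF_VAR_users_file=prod-users.json terraform plan
variable "users_file" {
  type        = string
  description = "Name of the JSON file holding the users array"
  default     = "users.json"
}

locals {
  # Read and decode whichever file the variable points at
  users_from_path = jsondecode(file("${path.module}/${var.users_file}")).users
}

output "users_from_path" {
  value = [for user in local.users_from_path : user.user_name]
}
```

With the default left in place this behaves like the hard-coded version; setting the variable swaps in a different file without touching main.tf.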

Passing in JSON via environment variables

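Within Terraform, there are many ways to pass in values from the environment. In our case, I am simply going to create a terraform.tfvars file, which Terraform loads automatically and which holds input variable values (strictly speaking these are Terraform input variables rather than shell environment variables, but they serve the same purpose of varying configuration per environment). Here is the content of the file that can be added: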

users = {
    "users": [
        {
            "user_name": "user_1",
            "role": "admin",
            "ssh_key": "ssh-rsa [shortened]"
        },
        {
            "user_name": "user_2",
            "role": "dev",
            "ssh_key": "ssh-rsa [shortened]"
        },
        {
            "user_name": "user_3",
            "role": "read_only",
            "ssh_key": "ssh-rsa [shortened]"
        },
        {
            "user_name": "user_4",
            "role": "dev",
            "ssh_key": "ssh-rsa [shortened]"
        }
    ]
}

It is essentially the exact same JSON, but wrapped in an HCL assignment (users = ...) so that Terraform recognizes it as the value of the users variable.

Now, main.tf can be replaced with the following:

locals {    
    uniq_dev_roles = distinct(
      [for user in var.users.users : user.role]
    )
}

variable "users" {}

output "roles" {
    value = local.uniq_dev_roles
}

Now, we can run this Terraform in 2 different ways:

terraform plan -var-file="terraform.tfvars"
terraform plan

Why? Because in the documentation for input vars, the documentation states: load variable values from the given file, in addition to the default files terraform.tfvars... So you'll only need to pass in the file name if it's different. A common pattern could be something like this: prod.terraform.tfvars, staging.terraform.tfvars, testing.terraform.tfvars.
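Separately from var files, a real shell environment variable also works, because Terraform reads any variable named TF_VAR_<name>. The following is a minimal sketch of that approach, assuming a hypothetical users_json string variable that carries the raw JSON and is decoded with jsondecode(). It is an alternative to the main.tf above, not an addition to it.

```hcl
# Hypothetical variant: the raw JSON arrives as a string, set in the shell with
#   export TF_VAR_users_json="$(cat users.json)"
variable "users_json" {
  type        = string
  description = "Raw JSON document containing a top-level users array"
}

locals {
  # Decode the string, then pick out the users array
  users_from_env = jsondecode(var.users_json).users

  # Same distinct() trick as before, applied to the decoded value
  roles_from_env = distinct([for user in local.users_from_env : user.role])
}

output "roles_from_env" {
  value = local.roles_from_env
}
```

Declaring the variable as a string keeps the value as plain JSON text, which is why jsondecode() is needed here even though it was not needed with terraform.tfvars.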

Things to note from the main.tf block above:

We are now using another Terraform built-in function, distinct(), to expand how we are manipulating the JSON. In this case, it's showing the unique (or distinct) roles, which here are admin, dev, and read_only.
The variable block for users is now where the user data comes from. As you can see in the for loop, we are now looping over var.users.users. This is because we are no longer decoding a file and are instead reading a variable whose value is already a Terraform object, so jsondecode() is not needed. In our case, we are not stating a type, so Terraform infers the type from the value. Because it still interprets users as an array, we are able to loop over it.
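If you would rather not rely on inference, the variable can declare its shape. This is a sketch of a type constraint matching the sample data, meant as a drop-in replacement for the empty variable "users" {} block; the description text is my own wording.

```hcl
variable "users" {
  description = "Users to manage, in the same shape as users.json"

  # Must match the structure assigned in terraform.tfvars
  type = object({
    users = list(object({
      user_name = string
      role      = string
      ssh_key   = string
    }))
  })
}
```

With a constraint like this, a missing or misspelled attribute in the tfvars file is reported when the variable is evaluated, rather than surfacing later inside a for expression.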

Reading in JSON from an AWS S3 bucket

Finally, our last example is reading in JSON through a data source from AWS. In this case, you'll need an AWS account, and the JSON from the examples above must already be uploaded to an S3 bucket so it can be referenced. However, the example is really not much different from the first.

Here is the complete main.tf file for a functional example (in my own local dev environment) of utilizing an S3 bucket:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0"
    }
  }
}

provider "aws" {
  region  = "us-west-2"
  profile = "shannon" # change this to your own aws credentials or profile
}

# use a data object instead of a resource
data "aws_s3_bucket_object" "user_access_data" {
  bucket = "shannon-terraform"
  key    = "user_access/users.json"
}

locals {    
    user_data = jsondecode(data.aws_s3_bucket_object.user_access_data.body)
    users = [for user in local.user_data.users: user.user_name]
}

# create an IAM user for each user found in the JSON
resource "aws_iam_user" "user" {
    for_each = toset(local.users)
    name = each.value
}

So, we're no longer just printing output; we're actually utilizing the JSON to loop over users and create an IAM user for each one. Things to point out:

We're now using a data source in Terraform, which reads information about something that already exists so it can be referenced in other resources. In our case, we're reading and decoding the JSON we've already uploaded to S3.
We are utilizing for_each to loop over the users from the JSON in order to create an IAM user for each one. for_each accepts a map or a set of strings, so because local.users is a list of strings, we need to wrap it in toset(). See the Terraform documentation on for_each for details.
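Since for_each also accepts a map, one possible variant is to key the users by name so the rest of each JSON entry stays available through each.value. This sketch would replace the aws_iam_user.user resource above rather than sit next to it (both would try to create the same user names); users_by_name, user_with_role, and the role tag are my own illustrative choices.

```hcl
locals {
  # Map of user name => full user object from the decoded JSON
  users_by_name = {
    for user in local.user_data.users : user.user_name => user
  }
}

resource "aws_iam_user" "user_with_role" {
  for_each = local.users_by_name

  name = each.key

  # Carry the role from the JSON onto the IAM user as a tag
  tags = {
    role = each.value.role
  }
}
```

No toset() is needed here because the for expression already produces a map, and the map keys become the resource instance keys.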

Now, run terraform plan and you should have a Terraform plan for 4 new IAM resources!
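One caveat if you are on a newer provider: to the best of my knowledge, version 4.x of the AWS provider deprecates aws_s3_bucket_object in favor of aws_s3_object. Assuming that applies to your setup, the data source and the local that decodes it might be rewritten roughly like this, as a replacement for the corresponding lines above (the provider version constraint would need to change as well).

```hcl
# Assumes AWS provider 4.x or later; replaces the aws_s3_bucket_object data source
data "aws_s3_object" "user_access_data" {
  bucket = "shannon-terraform"
  key    = "user_access/users.json"
}

locals {
  # body is only populated for text-like content types such as application/json,
  # so the object should be uploaded with a matching Content-Type
  user_data = jsondecode(data.aws_s3_object.user_access_data.body)
}
```

The rest of the configuration, including the for expression and the for_each resource, can stay as it is because local.user_data keeps the same shape.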

Wrap up

I hope this was helpful in seeing how to work with JSON in Terraform. Once you are able to read the data and turn it into HCL values, the possibilities are limitless!