Shannon

Posted on Jun 25, 2021 • Edited on Jun 28, 2021

Don't let your Terraform go rogue with Conftest and the Open Policy Agent

#terraform #devops

Overview

Terraform is an incredible tool for turning almost any API into declarative, configurable code written in HashiCorp's HCL. That also means the world is at your fingertips when you create cloud resources with it. An HCL codebase is frequently run through a CI/CD pipeline, and without proper checks or human involvement, a developer can destroy or change critical resources in production environments.

Enter Conftest! As its GitHub description states, Conftest lets you write tests against structured configuration data using the Open Policy Agent Rego query language. For Terraform, this means we write unit tests against sample JSON and run the real policy checks against the JSON form of a Terraform plan.

Writing some basic Terraform to test

Every time you plan, apply, or destroy with Terraform, the proposed changes can be saved and inspected locally. Note: the JSON plan output used here is only available in Terraform >= 0.12. To test our Terraform, we will generate a plan and write it to our current working directory.

Every Terraform plan output follows a similar pattern, and we can parse this in order to test whether specific resources are being modified, created, or deleted. First up, we're going to make some really simple Terraform resources. Create a new directory and add the following code to a main.tf file:

terraform {
    required_providers {}
}

resource "null_resource" "fake_instance" {
    count = 2
}

This code creates a null_resource, which implements the standard resource lifecycle but takes no further action. In other words, this Terraform doesn't do anything real. It only gives us some resources that will show up in the plan. Specifically, we're creating 2 resources because count = 2.

Creating the Terraform plan file

Once you've created main.tf, it only takes a few Terraform CLI commands to generate the plan file:

Initialize the Terraform: terraform init
Create a plan: terraform plan -out=tfplan_2_resources_planned
Generate a JSON file of the Terraform plan: terraform show -json ./tfplan_2_resources_planned | jq . > tfplan_2_resources_planned.json

Analyzing the Terraform plan JSON file with jq

You should now have a JSON file in your working directory. Parsing it reveals the patterns Terraform uses to describe any resource being created, changed, or destroyed. We can use jq to view a few of these, but I'd definitely suggest scrolling through the file yourself!

Resource changes: cat tfplan_2_resources_planned.json | jq .resource_changes
Type of resources being changed: cat tfplan_2_resources_planned.json | jq '.resource_changes[].type'
Resource actions (for example create, update, delete, or no-op): cat tfplan_2_resources_planned.json | jq '.resource_changes[].change.actions[]'

There are many, many possibilities for what can be tested in this file. For instance, I could confirm that the Terraform version is the one I expect by looking at the terraform_version key in the plan JSON, or check whether the AWS provider is using the official release. It's really worth digging into this file's structure.
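To make the structure of resource_changes a little more concrete, here is a minimal Python sketch that tallies planned actions per resource type. It is not part of Conftest or Terraform. SAMPLE_PLAN is a hand-written, heavily trimmed stand-in for the real plan JSON (the version string in it is made up), and summarize_actions is a hypothetical helper name.

```python
import json
from collections import Counter

# Hand-written, trimmed stand-in for `terraform show -json` output.
# A real plan file contains many more keys than these.
SAMPLE_PLAN = json.loads("""
{
  "terraform_version": "1.0.0",
  "resource_changes": [
    {"address": "null_resource.fake_instance[0]", "type": "null_resource",
     "change": {"actions": ["create"]}},
    {"address": "null_resource.fake_instance[1]", "type": "null_resource",
     "change": {"actions": ["create"]}}
  ]
}
""")


def summarize_actions(plan):
    """Count (resource type, action) pairs across resource_changes."""
    counts = Counter()
    for change in plan.get("resource_changes", []):
        for action in change["change"]["actions"]:
            counts[(change["type"], action)] += 1
    return counts


summary = summarize_actions(SAMPLE_PLAN)
for (resource_type, action), n in sorted(summary.items()):
    print(f"{resource_type}: {action} x{n}")  # prints "null_resource: create x2"
```

To try it against the real file, replace SAMPLE_PLAN with json.load(open("tfplan_2_resources_planned.json")).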

Writing Rego to parse the JSON file

First up, we'll want to make a directory: mkdir policy. It's idiomatic to call this directory policy for Conftest. We'll write some simple Rego that checks how many null_resource objects we are creating. Save it as main.rego inside that directory.

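package main

# Collect every planned resource in the root module whose type is null_resource.
planned_resources = [res |
  res := input.planned_values.root_module.resources[_]
  res.type == "null_resource"
]

num_planned_resources := count(planned_resources)

# Conftest reports a failure for each msg that deny produces.
# On OPA 1.0 and later this rule head is written: deny contains msg if { ... }
deny[msg] {
  not num_planned_resources == 2
  msg := "there should be 2 total null_resources"
}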

Some things to note about the code above:

You must declare a package. This is similar to Go, and you may import other packages.
planned_resources uses a comprehension (Rego calls it an array comprehension), much like a list comprehension in Python. It reads from input, which in our case is the Terraform plan JSON, and walks down to planned_values.root_module.resources[_]. The [_] iterates over every element of the resources array. Finally, it keeps only the objects whose type is null_resource. In our case, that's all we're creating, but typically your Terraform will have many, many more resources.
count() is a built-in function. There are many such functions, all listed in the built-in functions section of the OPA policy reference.
deny[msg] is the rule Conftest looks for to decide whether a test fails. If every statement inside the block evaluates to true, msg is added to the deny set and Conftest reports a failure with that message. That is why we write the check as not ... == 2: the rule should only fire when the count is something other than 2.
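If the Rego still feels opaque, it may help to see roughly the same logic in Python. This is only an analogy, not how OPA evaluates policies. The function names mirror the Rego rule names, make_plan is a hypothetical helper that builds a trimmed plan, and the expected parameter is my own addition.

```python
def planned_null_resources(plan):
    """Python analog of the planned_resources comprehension."""
    resources = plan["planned_values"]["root_module"]["resources"]
    return [res for res in resources if res["type"] == "null_resource"]


def deny(plan, expected=2):
    """Return failure messages; an empty list means the check passes."""
    messages = []
    if len(planned_null_resources(plan)) != expected:
        messages.append(f"there should be {expected} total null_resources")
    return messages


def make_plan(n):
    """Build a hypothetical, trimmed plan with n null_resource entries."""
    resources = [
        {"address": f"null_resource.fake_instance[{i}]", "type": "null_resource"}
        for i in range(n)
    ]
    return {"planned_values": {"root_module": {"resources": resources}}}


print(deny(make_plan(2)))  # prints []
print(deny(make_plan(1)))  # prints ['there should be 2 total null_resources']
```

The key parallel is that deny collects messages only when the condition for a violation holds, so an empty result means the policy passed.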

Rego is particularly confusing to me as it gets more complicated, so if it doesn't make total sense at first, don't be discouraged! It took quite a bit of time for me to parse what's really going on with OPA and Rego (and I'm often still confused).

Now, we can test the code with the Conftest CLI. If you don't have it installed, follow the installation guide in the Conftest documentation.

Running conftest test tfplan_2_resources_planned.json should print something like: 1 test, 1 passed, 0 warnings, 0 failures, 0 exceptions. As mentioned above, Conftest looks for Rego policies in the ./policy directory by default. If your policies live somewhere else, this also works: conftest test --policy [location_here] tfplan_2_resources_planned.json

Unit testing your Rego

It's relatively simple to create mock input for unit tests, so here is an example that tests how many null_resource objects are being created. Make a file called main_test.rego in the policy directory with the following content:

package main

# Mock plan input containing a single null_resource.
test_num_planned_resources {
    num_planned_resources == 1 with input as {
        "planned_values": {
            "root_module": {
                "resources": [
                    {
                        "address": "null_resource.fake_instance[0]",
                        "type": "null_resource"
                    }
                ]
            }
        }
    }
}

As you can see, I'm confirming that 1 null_resource object is being created with the mocked out JSON.

To run this, you'll also need the OPA binary installed. The OPA documentation has an installation guide.

And finally, run this command: opa test -v policy/*.rego

And that's it! You now have code that tests you're creating exactly 2 null resources. Run through the same process above with a different number of resources, and you will see the Conftest test fails. Similarly, you can apply these resources and then try deleting them to see what other areas of your Terraform code can be tested.

Real world example

Now that we have a general understanding of using Rego to test our Terraform, here's a pretty common workflow that you could expect to see with Conftest:

Introduce new Terraform code on a new Git branch
CI kicks off a new build based off the Git commit that outputs a Terraform plan binary
Conftest runs against the JSON form of this Terraform plan. If it fails, the entire build fails and the developer is notified. Build ends.
If Conftest passes successfully, code is merged into main/master branch
Terraform is applied and new infrastructure deployed
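The Conftest step in that workflow could be sketched as a small Python gate. This is an assumption about how a pipeline might be wired, not a prescribed setup: conftest_gate and its run parameter are hypothetical names, and the sketch relies only on the two CLI commands already used in this post plus the fact that conftest test exits non-zero when a policy fails.

```python
import subprocess


def conftest_gate(plan_file, json_file, run=subprocess.run):
    """Render a saved plan as JSON, then evaluate it with Conftest.

    Returns True when every policy passes. The `run` parameter exists only
    so the command runner can be swapped out (for example in tests).
    """
    # Step 1: turn the binary plan into JSON, as done by hand earlier.
    shown = run(
        ["terraform", "show", "-json", plan_file],
        capture_output=True,
        text=True,
        check=True,
    )
    with open(json_file, "w") as handle:
        handle.write(shown.stdout)

    # Step 2: Conftest picks up policies from ./policy by default.
    result = run(["conftest", "test", json_file])
    return result.returncode == 0
```

A CI job could call conftest_gate("tfplan_2_resources_planned", "tfplan_2_resources_planned.json") and exit with a non-zero status when it returns False, which is what fails the build before anything is merged or applied.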

Au revoir

I hope this was helpful and don't hesitate to ask questions!

Top comments (3)

Anders Eknert • Jun 27 '21

Great blog! Thanks for sharing :)

Shannon • Jun 28 '21

Thanks, Anders! Did not expect to see an OPA dev advocate comment on this! Thanks for the great work y'all do

Anders Eknert • Jun 28 '21

Thanks Shannon! If you haven't already, feel free to join the OPA Slack ( slack.openpolicyagent.org/ ). Great way to share content like this with the OPA community, ask Rego related questions, etc. Either way, I'll be following your blogging!