Mattias Fjellström

Posted on Nov 13, 2023 • Originally published at mattias.engineer

Testing Framework in Terraform 1.6: A deep-dive

#hashicorp #testing #terraform

In my previous blog post, A Comprehensive Guide to Testing in Terraform: Keep your tests, validations, checks, and policies in order, I went through all the options for testing and validation that are available to you when you write your Terraform configurations and modules. We saw check blocks, pre-conditions and post-conditions related to a resource's lifecycle, custom conditions for input variables and output values, and more. The latest and greatest topic I covered was the testing framework that arrived in Terraform 1.6.

In this post I want to focus more on the testing framework to uncover the possibilities that it brings.

Background on testing with Terraform

If this is the first time you hear about the new testing framework for Terraform I would like to give you a short introduction to what it is.

Terraform 1.6 brought a new testing framework to general availability after it had been available as an experimental feature for a period of time. The release notes for version 1.6 listed a single new feature¹:

terraform test: The terraform test command is now generally available. This comes with a significant change to how tests are written and executed, based on feedback from the experimental phase.

Terraform tests are written in .tftest.hcl files, containing a series of run blocks. Each run block executes a Terraform plan and optional apply against the Terraform configuration under test and can check conditions against the resulting plan and state.

What we learn from the release notes is that we can now write tests for our Terraform configurations by including one or more .tftest.hcl files that each contain a series of run blocks. We also learn that a run block executes a terraform plan or a terraform apply command. This means that these tests could create real infrastructure from your configuration. For me this is a good thing. I am a strong believer that, when it comes to testing infrastructure-as-code, there is no way to be sure it will work unless you actually deploy it for real. Why is that? There are just too many things that could go wrong. There might be hidden dependencies that you have no idea of before you actually try to create new infrastructure.

What is not clear from the release notes is who this testing framework is intended for. Is it meant for all Terraform users? Should you immediately jump on the TDD-with-Terraform train and start writing tests for all Terraform configuration? This is not the case. At least this is not the primary intended case. The testing framework is designed for module producers.

Are you in charge of creating infrastructure modules internally for your organization or publicly for the Terraform community? Then you are a module producer and this testing framework is for you.

Are you consuming modules in order to create the infrastructure for your application? Then this testing framework is not primarily intended for you. However, there is nothing stopping you from using it if you think it makes sense for your situation.

Module producers write code that other users will consume. Users of your modules depend on the contract that your module exposes. What is part of the module contract? Generally this includes the following items:

The input variables your module expects.
The output values your module produces.
Any externally visible resources your module creates. This could include configuration files, application gateways in Azure, network load balancers in AWS, a Cloud Bigtable table in GCP, or it could be pretty much anything else.

That last point might make you wonder if there are any resources created in a module that are not part of the contract. There definitely could be. Some resources could be internal implementation details that are required in order to construct the rest of the infrastructure. If a resource could be swapped out with a different resource without module consumers noticing, then it is an internal implementation detail and not part of the contract.

If you are a module producer and you make an update of your module where you unintentionally make a significant change to the contract your module exposes, then this mistake could end up causing trouble for your module consumers.

This is exactly the reasoning behind any other kind of test in software development. Testing Terraform modules is no different!
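To make the idea of testing a contract concrete, here is a minimal sketch of a test that pins down an output value. The variable app_name, the output app_url, and the file names are hypothetical and only serve as illustration. They are not taken from any real module.

```hcl
// variables.tf and outputs.tf in the module (hypothetical names)
variable "app_name" {
  type = string
}

output "app_url" {
  value = "https://${var.app_name}.example.com"
}

// tests/contract.tftest.hcl (hypothetical file name)
run "app_url_output_should_keep_its_format" {
  command = plan

  variables {
    app_name = "demo"
  }

  // The output is part of the module contract, so the test
  // fails if a refactoring changes the value consumers rely on.
  assert {
    condition     = output.app_url == "https://demo.example.com"
    error_message = "The app_url output no longer has the expected format"
  }
}
```

If someone later changes how app_url is built, this test should fail before the change reaches any module consumers.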

One last point to make about the new testing framework is that you write your tests in HashiCorp Configuration Language (HCL). This means there is no need to learn a new language in order to write tests for your Terraform code. There is no need to install an additional test tool that you need to keep track of and update and scan for vulnerabilities and so on. There is no need to mix your Terraform configuration with a bunch of test-related framework files. Run your tests and deploy your infrastructure using one and the same Terraform binary.

With all that background out of the way, let's move on to seeing all the nitty-gritty details of what this testing framework can do.

Nomenclature

Sometimes I need to remind myself of the nomenclature of the Terraform HCL code. To make sure we are all on the same page I introduce the nomenclature I use here:

A block is a container for other content. A block has a block type and zero or more labels. The generic block looks like this:
```
<BLOCK TYPE> "<BLOCK LABEL 1>" "<BLOCK LABEL 2>" ... "<BLOCK LABEL N>" {
  # block content
}
```
In Terraform the number of labels is zero, one, or two. A block can contain other blocks. Some common block types are terraform, provider, resource, data, and run.
An expression represents a literal value such as a string or a number, or it references other values such as the name of a resource. An expression could also be more complex, consisting of function calls, string interpolations, and references. Some examples of expressions:
```
"this is a string value"
12345
azurerm_storage_account.my_account.name
"rg-${var.name}-${var.location}"
```
An argument is the assignment of a value (from an expression) to a name. Arguments appear inside of blocks. Some examples of arguments:
```
resource_group_name = "rg-my-resource-group"
type = list(string)
count = length(var.number_of_regions)
```

Even if the details of HCL are already familiar to you, the nomenclature might not be.
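As a small sketch of how the three terms fit together, the following resource block is annotated with the nomenclature from above. The resource name example and the variables var.name and var.location are assumptions made for illustration.

```hcl
# A block of type "resource" with two labels:
# the resource type and a local name.
resource "azurerm_resource_group" "example" {
  # An argument: the name "name" is assigned the value of an
  # expression, here a string interpolation with two references.
  name = "rg-${var.name}-${var.location}"

  # An argument whose expression is a plain string literal.
  location = "swedencentral"

  # A nested block with zero labels, containing one argument.
  lifecycle {
    create_before_destroy = true
  }
}
```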

Test framework patterns

The run block is what executes a given test when we run terraform test. This block is central to the testing framework, so this is a block you need to become familiar with.

In the following subsections I will go through three testing patterns² that you might see in Terraform.

Before we look at the patterns let's briefly look at a typical directory structure for Terraform tests.

First of all, remember that your test files should use the .tftest.hcl file ending. If not, the terraform test command will not execute the run blocks for your tests.

When you execute terraform test you should be located in the module root directory. This is the directory where your primary main.tf file exists. The Terraform binary will look for your test files in the root directory or in a directory named tests. I recommend you place your tests in the tests directory, and do not place test files in the module root directory.

A typical directory structure for a simple module with tests is this:
$ tree .
.
├── main.tf
├── outputs.tf
├── providers.tf
├── tests
│   └── main.tftest.hcl
└── variables.tf
If you place your test files somewhere else you need to add -test-directory=path/to/tests to the terraform test command. But once again I recommend that you keep your test files in the tests directory to avoid confusing future contributors to your module.

How many test files should you have? The simple answer is it depends! If you are building a large module with many moving parts you will probably need several test files, each covering a coherent part of your module. If your module has a small interface (variables and outputs) a single test file might suffice. Use common sense here: if it feels like a file has too many tests then it probably does.

Pattern 1: Assertions

The first pattern is simple. It consists of a run block with a nested assert block:

run "file_contents_should_be_valid_json" {
  command = plan

  assert {
    condition     = try(jsondecode(local_file.config.content), null) != null
    error_message = "The file is not valid JSON"
  }
}

Let's break down this run block:

The run block has one label. This label is the name of the test. You should give your test a self-explanatory name that describes what it does. If the test fails you should immediately know why it failed. In this example the name is file_contents_should_be_valid_json. If this test fails I know that the contents of a file were not valid JSON.
This run block executes a terraform plan command. You specify what command you would like the test to execute in the command = <command> argument. If you leave this out it will default to executing an apply command. Personally I think it is a good idea to always add the command = <command> argument to be explicit about what the test does.
The run block can contain zero or more nested assert blocks. Each assert block has a condition = <expression> argument where <expression> should evaluate to true or false to indicate if the test passes (true) or fails (false). If <expression> evaluates to false then the expression in error_message = <expression> will be displayed in the terminal (or in Terraform Cloud). In this case the error message is The file is not valid JSON.

Although this example showed a single run block containing a single assert block, remember that you could include multiple run blocks, each containing multiple assert blocks.
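A sketch of what several run blocks with several assert blocks could look like is shown here. It reuses the local_file.config resource from the example above and assumes that its content and filename are known at plan time. The test names and the specific assertions are my own illustrations.

```hcl
run "file_should_be_json_with_expected_name" {
  command = plan

  // First assertion: the content must parse as JSON.
  assert {
    condition     = can(jsondecode(local_file.config.content))
    error_message = "The file is not valid JSON"
  }

  // Second assertion in the same run block.
  assert {
    condition     = endswith(local_file.config.filename, ".json")
    error_message = "The file name does not end with .json"
  }
}

run "file_should_be_created" {
  // This run block applies the configuration for real.
  command = apply

  assert {
    condition     = local_file.config.id != ""
    error_message = "The file was not created"
  }
}
```

All assert blocks in a run block are evaluated against the same plan or apply, so grouping related assertions in one run block avoids repeating the operation.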

Pattern 2: Expecting failures

The second pattern concerns tests where we expect the test to fail, and we want the test to report success if it does. This is a common testing strategy. The following run block has a nested variables block and an expect_failures = [ ... ] argument:

run "bad_input_url_should_stop_deployment" {
  command = plan

  variables {
    config_url = "http://example.com"
  }

  expect_failures = [
    var.config_url
  ]
}

There are some new things to look at in this run block so let's break it down:

The variables block allows you to provide input values for any variables that your module expects. In this case the variables block is provided as a nested block to the run block, but it could also be provided as a standalone block outside of any run blocks. In that case the values would apply to all run blocks in the entire file. If you include a standalone variables block you can still include nested variables blocks inside of run blocks to override the global values.
The expect_failures = [ ... ] argument specifies that we expect this test to fail, and we list the reasons for failure in the array expression of the argument. In this particular example I say that I expect this test to fail due to the variable named config_url. This basically means that I validate the value provided for the config_url variable in my Terraform module, and the value provided in this test (http://example.com) should result in a failing validation. If the plan can proceed as normal without any failures, then this test would fail.
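For the test above to pass, the module needs a validation rule on config_url that rejects the test value. The module code is not shown in this post, so the following is only a sketch of what such a rule could look like. The exact condition and error message are assumptions.

```hcl
// variables.tf in the module under test (hypothetical validation rule)
variable "config_url" {
  type = string

  validation {
    // Only HTTPS URLs are accepted, so http://example.com is rejected.
    condition     = startswith(var.config_url, "https://")
    error_message = "The config_url value must start with https://"
  }
}
```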

It is worth spending some time discussing expect_failures. The values in this array must be checkable objects with associated custom conditions. In my previous article I wrote a lot about custom conditions. Objects that can include custom conditions are variables, outputs, resources, data sources, and check blocks.

An important point about these custom conditions is that all of them except for the check block will actually cause Terraform to halt the execution of a plan or apply operation. What does this mean for your tests? It means that if you want to combine expect_failures with assert blocks you have to be careful in how you construct your module and your corresponding tests. If you include a variable in the expect_failures array of values and at the same time have an assert block that expects a plan to finish, then the assert block would never even be evaluated because the custom condition for the variable would halt the execution of the plan.
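Since a failing check block only produces a warning, it is the one checkable object that should combine safely with assert blocks. The following is a sketch under that assumption. The variable retention_days, the check block name, and the test name are hypothetical.

```hcl
// In the module under test (hypothetical variable and check block)
variable "retention_days" {
  type = number
}

check "retention_is_reasonable" {
  assert {
    condition     = var.retention_days >= 7
    error_message = "A retention below 7 days is not recommended"
  }
}

// In the test file
run "short_retention_should_trigger_check" {
  command = plan

  variables {
    retention_days = 3
  }

  // The check block fails with a warning, so the plan still completes.
  expect_failures = [
    check.retention_is_reasonable
  ]

  // Assumption: this assert is evaluated because the plan was not halted.
  assert {
    condition     = var.retention_days == 3
    error_message = "The retention_days value was not passed through"
  }
}
```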

For this reason I suggest that each test uses either one or more assert blocks or the expect_failures = [ ... ] argument, but not both unless you really know what you are doing.

Note that the array value to expect_failures could contain multiple values. But you most likely would not want to mix the types of checkable objects you include in this array because of the reason discussed above.
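A sketch of an expect_failures array with two values of the same type is shown here. It assumes the module validates both config_url and a second, hypothetical variable named environment, and that both validations fail for the values provided.

```hcl
run "bad_inputs_should_stop_deployment" {
  command = plan

  variables {
    config_url  = "http://example.com"
    environment = "sandbox" // hypothetical variable, assumed to be rejected
  }

  // Both entries are input variables, so the types are not mixed.
  expect_failures = [
    var.config_url,
    var.environment,
  ]
}
```

If only one of the two validations fails, I would expect this test to fail, because every object listed in expect_failures should report a failure.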

Pattern 3: Using helper modules

Sometimes it is necessary to create supporting infrastructure before you can test your module. This could be the case if your module creates resources in Azure and it expects that you use an existing resource group for all the resources. In order to test a module like that there must be an existing resource group you can use. A simple solution to this is to create a resource group up front and just let it sit there in your cloud environment for as long as required. A better solution is to create the resource group when you launch the terraform test command.

To illustrate what this looks like we have the following directory structure:

$ tree .
.
├── main.tf
├── outputs.tf
├── testing
│   └── setup
│       └── main.tf
├── tests
│   └── main.tftest.hcl
└── variables.tf

4 directories, 5 files

I have created a testing directory that contains a setup directory with a main.tf file. The contents of this file are:

// testing/setup/main.tf
variable "resource_group_name" {
  type = string
}

variable "location" {
  type = string
}

resource "azurerm_resource_group" "rg" {
  name     = var.resource_group_name
  location = var.location
}

It is a simple file that uses the azurerm provider to create a resource group. The module under test is:

// main.tf
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = ">= 3.0.0"
    }
  }
}

locals {
  resource_name_suffix = "${var.name_suffix}-${var.location}"
}

resource "azurerm_service_plan" "plan" {
  name                = "plan-${local.resource_name_suffix}"
  resource_group_name = var.resource_group_name
  location            = var.location
  os_type             = "Linux"
  sku_name            = var.appservice_plan_sku
}

resource "azurerm_linux_web_app" "app" {
  name                = "app-${local.resource_name_suffix}"
  service_plan_id     = azurerm_service_plan.plan.id
  resource_group_name = var.resource_group_name
  location            = var.location

  https_only = true

  site_config {
    always_on           = contains(["Free", "F1", "D1"], var.appservice_plan_sku) ? false : true
    http2_enabled       = true
    minimum_tls_version = "1.2"
  }
}

This module creates two resources: an App Service plan and a Linux Web App. The variables.tf file has the following content:

// variables.tf
variable "name_suffix" {
  type = string
}

variable "resource_group_name" {
  type = string
}

variable "location" {
  type = string
}

variable "appservice_plan_sku" {
  type = string

  validation {
    condition = contains([
      "B1", "B2", "B3", "D1", "F1", "S1", "S2", "S3"
    ], var.appservice_plan_sku)
    error_message = "Please provide a valid App Service Plan SKU"
  }
}

How do we create the resource group before we run our tests? The test file main.tftest.hcl looks like this:

// tests/main.tftest.hcl
provider "azurerm" {
  features {}
}

variables {
  resource_group_name = "rg-app-service-test"
  location            = "swedencentral"
  appservice_plan_sku = "F1"
  name_suffix         = "apptest"
}

run "setup" {
  module {
    source = "./testing/setup"
  }
}

run "http_should_not_be_allowed" {
  command = plan

  assert {
    condition     = azurerm_linux_web_app.app.https_only == true
    error_message = "Web App accepts HTTP traffic"
  }
}

run "confirm_always_on" {
  command = plan

  variables {
    name_suffix         = "testingalwayson"
    appservice_plan_sku = "S1"
  }

  assert {
    condition     = azurerm_linux_web_app.app.site_config[0].always_on == true
    error_message = "Always-On is off for S1 SKU"
  }
}

There are a few new things to look at in this test file. Let's break it down.

First of all we configure the azurerm provider:

provider "azurerm" {
  features {}
}

This allows you to configure the provider in any way that fits your tests. In this case I use default settings (an empty features block is required). Note that you could also configure the provider to use a separate test subscription instead of any other default subscription you have configured.
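As a sketch of the separate test subscription idea, the provider block in the test file could set the subscription_id argument of the azurerm provider. The ID below is a placeholder and not a real subscription.

```hcl
// tests/main.tftest.hcl
provider "azurerm" {
  features {}

  // Placeholder value: replace with the ID of your test subscription.
  subscription_id = "00000000-0000-0000-0000-000000000000"
}
```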

The next piece in the test file defines global variables:

variables {
  resource_group_name = "rg-app-service-test"
  location            = "swedencentral"
  appservice_plan_sku = "F1"
  name_suffix         = "apptest"
}

These variables will be used for the setup module and all the following tests, unless the tests override these values.

Next we have the setup module:

run "setup" {
  module {
    source = "./testing/setup"
  }
}

A setup (or helper) module is created in its own run block. I set the label of this block to setup, but you can set it to whatever fits your context. The run block only contains a nested module block that specifies the source of the module to be my module located in the testing/setup directory. This run block is the first run block in the test file, so it will be run first (they are run in order). If I place the setup run block somewhere else in the file then the tests defined above the setup block would fail because the resource group would not exist.
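A later run block can also read output values from the setup run through a run.<name>.<output> reference. The sketch below assumes that the setup module is extended with an output named resource_group_name, which is not part of the setup module shown earlier.

```hcl
// testing/setup/main.tf (hypothetical addition)
output "resource_group_name" {
  value = azurerm_resource_group.rg.name
}

// tests/main.tftest.hcl
run "setup" {
  module {
    source = "./testing/setup"
  }
}

run "http_should_not_be_allowed" {
  command = plan

  variables {
    // Use the name of the resource group the setup run created.
    resource_group_name = run.setup.resource_group_name
  }

  assert {
    condition     = azurerm_linux_web_app.app.https_only == true
    error_message = "Web App accepts HTTP traffic"
  }
}
```

This makes the dependency on the setup run explicit instead of relying on both modules receiving the same global variable value.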

The rest of the file contains two tests in two separate run blocks. The first block is similar to what we have seen before, but notice that we have a nested variables block in the other run block:

run "confirm_always_on" {
  command = plan

  variables {
    name_suffix         = "testingalwayson"
    appservice_plan_sku = "S1"
  }

  assert {
    condition     = azurerm_linux_web_app.app.site_config[0].always_on == true
    error_message = "Always-On is off for S1 SKU"
  }
}

This means that for this test we override the name_suffix and appservice_plan_sku variables.

I can run the tests with terraform test:

$ terraform test
tests/main.tftest.hcl... in progress
  run "setup"... pass
  run "http_should_not_be_allowed"... pass
  run "confirm_always_on"... pass
tests/main.tftest.hcl... tearing down
tests/main.tftest.hcl... pass

Success! 3 passed, 0 failed.

Notice that the output says 3 passed even though we only really had two tests. How come? This is because the setup module runs inside a run block, so it is considered to be a test from Terraform's point of view. I think this is a bit unfortunate, but for now we'll have to live with it.

My tests in this case used command = plan, so they are relatively fast to run. When you use command = apply you have to prepare for a potentially long test run, depending on what resources your module creates. If you have a module that creates multiple Kubernetes clusters and installs various components in these clusters, then an apply could take some time, especially if you run multiple independent tests.
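For comparison, here is a sketch of a test that uses command = apply against the App Service module from this section. It asserts on default_hostname, an attribute that Azure only assigns once the Web App has been created. The test name and the exact assertion are my own choices.

```hcl
run "web_app_should_get_a_hostname" {
  // Creates the App Service plan and the Web App for real.
  command = apply

  assert {
    // The hostname is computed by Azure, so it is unknown during a plan.
    condition     = azurerm_linux_web_app.app.default_hostname != ""
    error_message = "The Web App did not receive a default hostname"
  }
}
```

A test like this takes noticeably longer than the plan-based tests above because the resources are created and later destroyed.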

Terraform test state file and resource destruction

How does Terraform know what resources it should remove and in what order it should do it? If you are familiar with Terraform you know that there is usually a state file somewhere. When you run tests Terraform keeps the state files in memory, so you won't see any state files appearing in your module directory.

Terraform creates one state file for the main configuration under test, and one state file for each alternate module that you create through a run block. An example of an alternate module is the setup module we saw in an example above.

The state files are created in the order of the tests, and they are destroyed in the reverse order. An illustrative sample of what state files are created, updated, and destroyed and in what order, is shown below:

// first alternate module call
// creates state file for the modules/setup-1 module
run "setup-1" {
  module {
    source = "./modules/setup-1"
  }
}

// second alternate module call
// creates state file for the modules/setup-2 module
run "setup-2" {
  module {
    source = "./modules/setup-2"
  }
}

// test the main configuration
// creates the main state file for the module's main.tf
run "test-1" {
  assert {
    ...
  }
}

// third alternate module call, once again to the setup-2 module
// updates the state file for the modules/setup-2 module
run "setup-2-again" {
  module {
    source = "./modules/setup-2"
  }
}

// second test for the main configuration
// updates the main state file for the module's main.tf
run "test-2" {
  assert {
    ...
  }
}

// After everything is run clean-up starts:
// 1. The module's main.tf state file is destroyed
// 2. The alternate modules' state files are destroyed in
//    reverse order from how they were last used
//      - first the modules/setup-2 state file is destroyed
//      - then the modules/setup-1 state file is destroyed

A question that came to my mind when I first heard about the test framework was: what happens if Terraform fails to destroy the resources after a test? Let's see what happens!

I will run a test where an Azure App Service is created. I will use the setup module from before where I created a resource group. To make the test fail I will issue the following Azure CLI command in order to lock the resource group so that Terraform can't destroy it:

$ az lock create \
  --name failure \
  --resource-group rg-app-service-test \
  --lock-type ReadOnly

The test output is the following:

$ terraform test
tests/main.tftest.hcl... in progress
  run "setup"... pass
  .
  . (output truncated)
  .
Terraform left the following resources in state after executing
tests/main.tftest.hcl/http_should_not_be_allowed, and they need to
be cleaned up manually:
  - azurerm_linux_web_app.app
  - azurerm_service_plan.plan

There we go!

We are instructed that a number of resources need to be cleaned up manually. In Azure this is usually relatively simple if you have put all resources in the same resource group. However, if you are working with AWS you might be in for a tedious cleanup process if your module created a lot of resources!

I can see that this behavior could be an issue during development of your module and tests where you are not sure if everything works as intended. You will most likely end up with a few failing test cleanups.

Exploring the test command in the CLI

To cover everything we can about the test framework let's see what else we can do with the terraform test command:

$ terraform test -h
Usage: terraform [global options] test [options]

[ background description truncated for brevity ... ]

Options:

  -cloud-run=source     If specified, Terraform will execute this test run
                        remotely using Terraform Cloud. You must specify the
                        source of a module registered in a private module
                        registry as the argument to this flag. This allows
                        Terraform to associate the cloud run with the correct
                        Terraform Cloud module and organization.

  -filter=testfile      If specified, Terraform will only execute the test files
                        specified by this flag. You can use this option multiple
                        times to execute more than one test file.

  -json                 If specified, machine readable output will be printed in
                        JSON format

  -no-color             If specified, output won't contain any color.

  -test-directory=path  Set the Terraform test directory, defaults to "tests".

  -var 'foo=bar'        Set a value for one of the input variables in the root
                        module of the configuration. Use this option more than
                        once to set more than one variable.

  -var-file=filename    Load variable values from the given file, in addition
                        to the default files terraform.tfvars and *.auto.tfvars.
                        Use this option more than once to include more than one
                        variables file.

  -verbose              Print the plan or state for each test run block as it
                        executes.

There are a few interesting flags we can use. I want to highlight a few:

-cloud-run=source is useful if you have your module in a private registry in Terraform Cloud, and you want to trigger a test run in the cloud. I will cover testing in Terraform Cloud in a future post.
-filter is useful if you have a lot of test files and you would only want one or a few of the files to run. This is especially useful if you are testing a large module where your tests execute apply operations that take a long time.
-test-directory is useful if you place your test files somewhere other than in the tests directory. But as I mentioned earlier in this article you should probably stick to using the tests directory.
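The -var-file flag from the help text above takes a regular variables file. As a sketch, the values from the App Service example could live in a file like the one below and be passed with terraform test -var-file=testing.tfvars. The file name is hypothetical, and the sketch assumes that the test file does not set the same variables itself.

```hcl
// testing.tfvars (hypothetical file name)
resource_group_name = "rg-app-service-test"
location            = "swedencentral"
appservice_plan_sku = "F1"
name_suffix         = "apptest"
```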

Summary

In this post we have looked at most of what the new testing framework for Terraform 1.6 has to offer. That is in fact not true. There is more we can say about the testing framework when it comes to Terraform Cloud. In a future post I will cover how we run tests in Terraform Cloud and some of the unique features that are available there.

The example patterns in this post have intentionally been left relatively simple. In reality creating good tests for your modules will require a lot of work. My purpose with this post has been to illustrate what we can do, what syntax is available, and a few of the behaviors we can expect from this framework.

I expect that there will be additional features added as HashiCorp receives feedback from users of this framework. We live in exciting times!

¹ Apart from this single new feature there were enhancements and bug fixes included as well.
² I call them patterns here to use a familiar nomenclature. As with all patterns you will most likely not see them isolated in the real world. All patterns I present are most likely mixed and matched for real Terraform modules. The idea with patterns here is to introduce the testing framework piece by piece.
