Takashi Narikawa

Posted on Dec 28, 2021

Introducing Conftest and setting up CI with GitHub Actions to automate the review of Terraform code

#opa #rego #conftest #terraform

Introduction

Hi, I’m nari/wapper from the SRE team at eureka, Match Group.👋

In our SRE team, we plan to make "microservices-ready" one of our major keywords for next year, to prevent ourselves from becoming a bottleneck. (We are not going to rebuild our system as microservices, though.)

With that in mind, alongside the Production Readiness check, I've started trying Policy as Code with Conftest, which is an important piece for promoting transferability and self-service. This article is my memorandum on it.

In this article, I'll show you how to write a policy for resource tag rules (AWS resources) in Terraform with Rego, write tests for the policy itself, and set up CI that runs them automatically using GitHub Actions.
(A lot has been written about writing Conftest (Rego) policies, but not much about setting up CI, and I got stuck at a few points. My main motivation for writing this article is to leave notes so that people who are about to try it don't get stuck in the same places.)

Target readers

People who have been curious about Policy as Code but needed a reason to start.
People who want to see concrete examples of how to set up GitHub Actions CI for Conftest.

What is OPA/Rego/Conftest???

OPA (Open Policy Agent) is an open-source, general-purpose policy engine and is currently a CNCF Graduated project. Policies for OPA are written in a policy language called Rego. OPA evaluates structured data such as JSON (the base document) against the policies written in Rego and returns the result as structured data as well (the virtual document).
- It became a CNCF Graduated project this year (2021): https://www.infoq.com/news/2021/02/opa-cncf-graduation/
- Source code: https://github.com/open-policy-agent/opa
Conftest is a tool that tests files in various formats such as YAML, JSON, HCL, and Dockerfile against policies written in OPA's Rego, and it can be run easily from the command line.
- Conftest was originally a third-party tool built on OPA's Rego, and it now seems to be a widely used CLI tool officially under the umbrella of the OPA community.
- Source code: https://github.com/open-policy-agent/conftest
- We adopt it this time because we intend to use it in CI.

Write policy rules in Rego and set up CI with GitHub Actions

All sample code can be found here: https://github.com/fukubaka0825/terraform-conftest-with-acitions-ci-sample

In this article, I'd like to write policy rules that make sure a resource created by Terraform has the minimum set of tags (name/owner/description) and that, when it is a data store resource, the importance of its data (high/middle/low) is set in a tag named data. We will convert the result of terraform plan to JSON and use it as the input to the Rego policy rules we are about to write, and we will also test the policy rules themselves. (The sample assumes the use of AWS.)

0.Prerequisites

0.1.Set up Conftest (OPA/Rego)

If you are a macOS user, you can easily install it with brew. Users of other operating systems can install it by referring to the README.



brew install conftest

# also install OPA
brew install opa

As for editor plugins, OPA plugins are available for IntelliJ and VS Code.
- Other editors may have them too, so look for one for the editor you use.

0.2.Structure of Terraform Plan results

The JSON representation of a Terraform plan has a defined structure. We will write logic in Rego that takes this data as input and checks its content.
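For orientation, the part of the plan JSON that the policies in this article read can be sketched as follows. This is a hand-written, abbreviated illustration in Python rather than real Terraform output: the resource address and tag values are hypothetical, and an actual `terraform show -json` result carries many more fields.

```python
import json

# Hand-written, abbreviated stand-in for the JSON produced by
# `terraform show -json tfplan`. The resource and its tags are hypothetical.
plan = {
    "resource_changes": [
        {
            "address": "aws_s3_bucket.example",
            "type": "aws_s3_bucket",
            "name": "example",
            "change": {
                # Examples of values: ["create"], ["update"], ["no-op"]
                "actions": ["create"],
                "before": None,
                "after": {
                    "tags": {
                        "name": "example",
                        "owner": "sre",
                        "description": "sample bucket",
                        "data": "low",
                    }
                },
            },
        }
    ]
}

# The policies below only look at these paths of each entry.
for rc in plan["resource_changes"]:
    print(rc["address"], rc["type"], rc["change"]["actions"])
    print(json.dumps(rc["change"]["after"]["tags"], indent=2))
```

In Rego terms, these are the paths `input.resource_changes[_].change.actions` and `input.resource_changes[_].change.after.tags` used by the rules in the next section.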

1. Try to write Terraform resource tag rules in Conftest/Rego

First, we will write the entry point in main.rego.



# policy/main.rego


package main # ❶

import data.tags_validation # ❷

#####################################
# Policy as Code Rules
#####################################
deny_tags_contain_minimum_set[msg] {
    # Only target resources that have been changed/added.
    changeset := resources_not_no_op_action[_] # ❸

    not tags_validation.contain_proper_keys(changeset.change.after.tags)

    msg := sprintf("Invalid tags (missing minimum required tags: name,owner,description) for the following resource: %v", [changeset.address]) # ❹
}

deny_data_store_data_tag_is_proper[msg] {
    # Only target resources that have been changed/added.
    changeset := resources_not_no_op_action[_]

    # Only when resource_type is a data source/store type that can contain sensitive information
    tags_validation.is_data_tag_required_target_resource(changeset.type)

    tags_validation.not_has_proper_data_tag(changeset.change.after.tags.data)

    msg = sprintf("data tag needs to be set to one of [low,high,middle] resource: %v", [changeset.address])
}

#####################################
# Utils
#####################################

resources_not_no_op_action = {resource | resource := input.resource_changes[_]; resource.change.actions[_] != "no-op"} # ❺

resources_with_type(resources, type) = all {
    all := [item | item := resources[_]; item.type == type]
}

❶ A policy always requires a package clause. You can use any package name to divide the namespace.

❷ You can use the import clause to import a different namespace and use the resources in that namespace. This time, you can write your own logic in the tags_validation namespace and use it in main.

❸ Here, we narrow down the resources to those whose action is not "no-op" and iterate over them (expression: [_]) to look for resources that violate the policy. This ensures that only newly added or modified resources are targeted. Since this operation is used in basically every policy rule, we have extracted it as a common utility.

❹ Multiple expressions can be written in the same rule, and they are evaluated as AND. Only when all of them are true is the resource judged as deny/violation. The error message is then added to the set of messages for that policy rule and is output when the test runs.

❺ Rego supports comprehensions with a notation familiar from Python. The syntax here is { | }, which builds a set ([ | ] builds an array instead), and only the resources that match the condition are returned.
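Since ❺ compares the notation to Python, here is a rough Python equivalent of resources_not_no_op_action. It is only an analogy under some assumptions: the plan entries are hypothetical and abbreviated, and a Python list stands in for the unordered set that Rego returns. It also shows that `resource.change.actions[_] != "no-op"` means "at least one action differs from no-op".

```python
# Hypothetical, abbreviated resource_changes entries for illustration only.
plan = {
    "resource_changes": [
        {"address": "aws_s3_bucket.new", "change": {"actions": ["create"]}},
        {"address": "aws_s3_bucket.same", "change": {"actions": ["no-op"]}},
        {"address": "aws_s3_bucket.replaced", "change": {"actions": ["delete", "create"]}},
    ]
}


def resources_not_no_op_action(plan):
    """Keep entries that have at least one action other than "no-op"."""
    return [
        rc
        for rc in plan["resource_changes"]
        if any(action != "no-op" for action in rc["change"]["actions"])
    ]


changed = [rc["address"] for rc in resources_not_no_op_action(plan)]
print(changed)  # ['aws_s3_bucket.new', 'aws_s3_bucket.replaced']
```

The unchanged bucket is filtered out, so the deny rules never see it.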




# policy/tags.rego

package tags_validation

minimum_tags = {"name", "owner", "description"}

contain_proper_keys(tags) {
    # Subtract the key list of the given tags from the minimum tag list, and if there is no more left, you have the minimum tag.
    keys := {key | tags[key]}
    leftover := minimum_tags - keys
    leftover == set()
}

not_has_proper_data_tag(value) { # ❶
    value != "low"
    value != "middle"
    value != "high"
}

is_data_tag_required_target_resource(type) { # ❷
    type == "aws_dynamodb_table"
}

is_data_tag_required_target_resource(type) {
    type == "aws_s3_bucket"
}

❶ The evaluation expression within the same rule will be evaluated as AND, so it will return true if it is neither "low", "middle", nor "high".

❷ If you want to write OR logic, you can do so by defining multiple bodies with the same name, as with is_data_tag_required_target_resource. The result is true if any one of the bodies is true.
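To make the AND/OR behavior concrete, the three helpers in tags.rego can be mirrored in Python as below. This is a sketch for readers more used to Python, not part of the sample repository. One difference to keep in mind: Python returns False where the Rego function would simply be undefined.

```python
MINIMUM_TAGS = {"name", "owner", "description"}


def contain_proper_keys(tags):
    # Subtract the given tag keys from the minimum set; nothing may be left over.
    leftover = MINIMUM_TAGS - set(tags)
    return leftover == set()


def not_has_proper_data_tag(value):
    # Expressions in one Rego body are ANDed: true only if none of them match.
    return value != "low" and value != "middle" and value != "high"


def is_data_tag_required_target_resource(resource_type):
    # Several Rego bodies with the same name are ORed together.
    return resource_type == "aws_dynamodb_table" or resource_type == "aws_s3_bucket"


print(contain_proper_keys({"name": "a", "owner": "b", "description": "c"}))  # True
print(contain_proper_keys({"name": "a"}))                                    # False
print(not_has_proper_data_tag("secret"))                                     # True
print(is_data_tag_required_target_resource("aws_s3_bucket"))                 # True
```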

2. Write tests for the rule itself written in Rego with Conftest and try to run it.

First, we will write a test for main.rego. As in Go, we will name it ${test_target_file_name}_test.rego.



# policy/main_test.rego

package main

#####################################
# Tests of Policy as Code Rules
#####################################

...

test_tags_contain_minimum_set {
    plan := `                    # ❶
      resource_changes:
        - name: case normal
          address: module.one
          type: aws_security_group_rule
          change:
            actions:
              - create
            after:
              tags:
                name: hoge
                owner: piyo
                description: for test
    `

    input := yaml.unmarshal(plan)
    deny_tags_contain_minimum_set == set() with input as input # ❷
}

...

❶ The official documentation writes the test input in JSON, but JSON is hard to maintain, so it is better to write the input in YAML and convert it with yaml.unmarshal.

❷ The test for the rule can be written like this. If you don't need to check the message content of the array, you can just check whether the error message array is empty or not.



# policy/tags_test.rego

package tags_validation

test_contain_proper_keys {
    tags := {"name": "test", "owner": "hoge", "description": "normal test"}
    contain_proper_keys(tags) # ❶
}

...

❶ The functions used in the main rules are unit tested in this way. A test is considered passed if it returns true.

3.Set up CI to run Conftest with GitHub Actions

3.1.Set up CI for fmt/verify of the policy rule itself written in Conftest/Rego.



name: conftest-fmt-and-verify-all

on:
  pull_request:
    branches: [ main ]
    paths:
      - 'policy/**'
env:
  CONFTEST_VERSION: 0.28.3
jobs:
  terraform:
    name: fmt-all
    runs-on: ubuntu-latest
    defaults:
      run:
        shell: bash
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Install conftest
        run: |
          wget -O - 'https://github.com/open-policy-agent/conftest/releases/download/v${{ env.CONFTEST_VERSION }}/conftest_${{ env.CONFTEST_VERSION }}_Linux_x86_64.tar.gz' | tar zxvf -
          ./conftest --version

      - name: conftest fmt
        run: |
          git add . && ./conftest fmt ./ && git diff --exit-code ./.    # ❶

      - name: conftest verify
        run: |
          ./conftest verify ./   # ❷

❶ conftest fmt does not yet have a check-only flag like terraform fmt -check, so I wrote the step so that it fails if conftest fmt changes any file.

❷ Here, we run the test for the policy rule written in Rego that we wrote earlier.

3.2.Setting up CI to test Terraform plan results with Conftest test



name: tf-plan-apply

on:
  pull_request: 
    branches: [ main ]

env:
  TF_VERSION: 1.0.0
  CONFTEST_VERSION: 0.28.3
  WORKING_DIR: ./
jobs:
  terraform:
    name: aws-eureka-pairs-etc-s3
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Install conftest
        run: |
          wget -O - 'https://github.com/open-policy-agent/conftest/releases/download/v${{ env.CONFTEST_VERSION }}/conftest_${{ env.CONFTEST_VERSION }}_Linux_x86_64.tar.gz' | tar zxvf -
          ./conftest --version # ❶

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v1
        with:
          terraform_wrapper: false # ❷
          terraform_version: ${{ env.TF_VERSION }}
          cli_config_credentials_token: ${{ secrets.YOUR_CRED_NAME}}

      - name: Terraform Init ${{ env.WORKING_DIR }}
        working-directory: ${{ env.WORKING_DIR }}
        run: terraform init

      - name: Terraform Plan ${{ env.WORKING_DIR }}
        if: github.event_name == 'pull_request'
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        working-directory: ${{ env.WORKING_DIR }}
        id: plan
        run: terraform plan -out=tfplan -no-color -lock=false -parallelism=50

      - name: Convert terraform plan result to json format
        if: github.event_name == 'pull_request'
        id: convert
        working-directory: ${{ env.WORKING_DIR }}
        run: terraform show -json tfplan > tfplan.json

      - name: conftest test
        if: github.event_name == 'pull_request'
        id: conftest
        run: ./conftest test --no-color ${{ env.WORKING_DIR }}/tfplan.json # ❸

❶ I've been using wget to install conftest since I couldn't find any properly maintained Actions.

❷ If you use the official hashicorp/setup-terraform action, be aware that terraform_wrapper is set to true by default. With the wrapper enabled, the plan result cannot be output as proper JSON, so you need to set it to false.

❸ Now we will test the terraform plan results using the policy rule we just wrote. By default, it refers to policy/ when reading the policies to be used. If you want to refer to a different directory, you can set it by passing -p.

Conclusion

My honest impression after playing around with Conftest/OPA/Rego is that, as a type enthusiast, I would like to have a schema for the data definitions. The logic evaluation is also quite powerful and expressive, so unless we try to write policies in a simple and understandable way, they will soon become impossible for the team to maintain.

I also like that I can easily write tests for the Policy as Code itself, so I can change, add, and refactor Rego code with confidence.

Finally, I feel that each company is still searching for a strategy for the directory structure of OPA/Rego, so I'd like to experiment and find a good place to settle on.
