What are policies and how do you use them with CloudQuery?

#security #infrastructure #tutorial

Policy is a very broad term. In this post we’ll explain what policies are, how we implement them at CloudQuery, and how you can use them for the security and compliance of your applications.

Our documentation has a good definition of what a policy is:

Policy compliance is a broad term and can refer to any kind of policy, from internal standards to regulatory requirements. A CloudQuery Policy is a codified form of this that is written with HCL as the logic layer and SQL as the query layer.

Put simply, a policy is a set of rules to be followed. For example, a password policy might require that a password must:

Be alphanumeric
Have a special character
Have a number
Be changed every 4 weeks
Not be reused for 13 weeks
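To make the idea of "a policy is a set of rules" concrete, here is a minimal Python sketch of the password policy above. It is only an illustration: the function name, the rule messages, and the plain-text password history are assumptions made for this example (a real system would compare hashes), and "alphanumeric" is interpreted as "contains at least one letter".

```python
import string
from datetime import datetime, timedelta

MAX_AGE = timedelta(weeks=4)           # "changed every 4 weeks"
NO_REUSE_WINDOW = timedelta(weeks=13)  # "no repeat password for 13 weeks"


def password_violations(password, last_changed, history, now):
    """Return the rules the password breaks (an empty list means compliant).

    history is a list of (old_password, retired_at) pairs. This is a
    hypothetical structure chosen for the example.
    """
    violations = []
    if not any(c.isalpha() for c in password):
        violations.append("must contain a letter")
    if not any(c.isdigit() for c in password):
        violations.append("must contain a number")
    if not any(c in string.punctuation for c in password):
        violations.append("must contain a special character")
    if now - last_changed > MAX_AGE:
        violations.append("must be changed every 4 weeks")
    for old_password, retired_at in history:
        if old_password == password and now - retired_at < NO_REUSE_WINDOW:
            violations.append("must not repeat a password used in the last 13 weeks")
            break
    return violations


now = datetime(2022, 1, 31)
print(password_violations("hunter2!", datetime(2022, 1, 10), [], now))
```

Each rule is an independent check that either passes or reports a violation, which is the same shape the larger compliance policies take.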

This is a really basic policy that is standard for a lot of applications or organisations.

In the wild, vendors and standards bodies each publish their own policies, and your applications may have to be validated against them to be classified as safe for certain uses. Even huge providers like AWS have made themselves compliant with ISO 27001 and a whole gamut of others.

What are the common policies?

Some of the most common policies you may come across include HIPAA (US healthcare), which has a full summary of security rules to follow. But most of us in the software industry are more likely to come into contact with these two: CIS (Center for Internet Security) and PCI DSS (Payment Card Industry Data Security Standard).

How does CloudQuery work with policies?

CloudQuery is an open-source cloud asset inventory powered by SQL, so to the tool a policy is simply an HCL configuration file that references SQL queries. CloudQuery ingests the data from your cloud provider, such as AWS or Azure, and the policy then executes its SQL statements against that data as if it were a test suite. Our co-founder Yevgeny recently wrote an amazing piece on running the PCI DSS policy against AWS. If you are curious about the policies currently available, you can browse them at https://hub.cloudquery.io/, our central, searchable source for publicly available policies.
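The "SQL queries as a test suite" idea can be sketched in a few lines of Python. This is not CloudQuery's implementation. It uses an in-memory SQLite database instead of Postgres, the buckets table and its rows are hypothetical, and the convention that a check passes when its query returns no rows is an assumption made for this illustration.

```python
import sqlite3

# Hypothetical inventory table standing in for fetched cloud assets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE buckets (name TEXT, encrypted INTEGER)")
conn.executemany(
    "INSERT INTO buckets VALUES (?, ?)",
    [("logs", 1), ("backups", 1), ("scratch", 0)],
)

# Each check is a query that selects the offending resources.
CHECKS = {
    "buckets must be encrypted": "SELECT name FROM buckets WHERE encrypted = 0",
    "bucket names must not be empty": "SELECT name FROM buckets WHERE name = ''",
}


def run_checks(connection, checks):
    """Run every query; in this sketch a check passes when it returns no rows."""
    results = {}
    for title, query in checks.items():
        offenders = [row[0] for row in connection.execute(query)]
        results[title] = ("FAIL", offenders) if offenders else ("PASS", [])
    return results


for title, (status, offenders) in run_checks(conn, CHECKS).items():
    print(f"{status}  {title}  {offenders}")
```

Writing each check as "select the resources that break the rule" keeps the report useful: a failing check lists exactly which assets need attention.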

How to use policies?

As an example, we shall use AWS to check for compliance. First, you will need to install CloudQuery. To do this, follow the instructions for your operating system in the Getting Started section of our documentation.

Now that you have CloudQuery installed, you will need to create the configuration for the provider you wish to connect to, which in our case is AWS:

cloudquery init aws

This will create config.hcl in the directory where you ran the command. For a breakdown of how it is structured, see https://docs.cloudquery.io/docs/configuration/overview.

If you don’t already have a Postgres instance running for the data, run the following command (if you do, you will need to change the connection block of the config.hcl to point at it):

docker run -p 5432:5432 -e POSTGRES_PASSWORD=pass -d postgres

Once this is started you can run the following command to start the process of getting the data from AWS

cloudquery fetch

Now that you have the asset data available, you can run policies against it. If you’d like to download them in advance, you can execute:

cloudquery policy download aws

This downloads the policy configuration and SQL into the .cq directory for later use. It also returns a block that you can add to your config.hcl, which may look similar to:

policy "aws" {
  source = "aws"
}

If you would like to skip the download step, you could execute the following commands to review and execute the policies:

# execute the whole policy pack (cis, pci dss, etc)
cloudquery policy run aws

# execute specific policy pack
cloudquery policy run aws//pci_dss_v3.2.1

# execute specific section in PCI DSS
cloudquery policy run aws//pci_dss_v3.2.1/autoscaling/1

It’s that simple: you will get a PASS/FAIL result for each of these existing policy statements.
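The three commands above follow one pattern: a source, optionally followed by `//` and a policy pack, optionally followed by `/` and a section. If you script these runs, a small helper can assemble the argument. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it only builds the command, it does not run it.

```python
def policy_selector(source, pack=None, section=None):
    """Build the selector passed to `cloudquery policy run`."""
    if section and not pack:
        raise ValueError("a section needs a policy pack")
    selector = source
    if pack:
        selector += "//" + pack
    if section:
        selector += "/" + section
    return selector


def policy_run_command(source, pack=None, section=None):
    """Return the command as an argument list, ready for subprocess.run."""
    return ["cloudquery", "policy", "run", policy_selector(source, pack, section)]


print(" ".join(policy_run_command("aws", "pci_dss_v3.2.1", "autoscaling/1")))
# prints: cloudquery policy run aws//pci_dss_v3.2.1/autoscaling/1
```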

Making a custom policy?

If you want to make a policy to automate some of your checks rather than writing the SQL repeatedly or storing them as a stored procedure, then it’s relatively easy.

For a full tutorial on creating policies, please follow the tutorial in our documentation, but for a quick guide keep reading.

The first step to a custom policy is to allow CloudQuery to find it. The easiest way to do that is to add a policy block to the config.hcl, which can look like so:

policy "my-custom-policy" {
  source = "./path/to/policy/directory"
}

Inside your new policy directory you will need a policy.hcl to define what your policy is. A simple example looks like so:

policy "my-custom-policy" {
  title = "This is a test policy"
  doc = file("README.md")
  configuration {
    provider "theprovideritusesname" {
      version = ">= 1.0"
    }
  }
  …
}

After defining what the policy is and its prerequisites, such as the provider and documentation, you need to add the actual policy definitions:

policy "my-custom-policy" {
  title = "This is a custom test policy"
  doc = file("README.md")
  configuration {
    provider "theprovideritusesname" {
      version = ">= 1.0"
    }
  }

  policy "nested-policy-peter" {
    title = "A policy about Peter"
    source = file("peter/policy.hcl")
  }

  policy "nested-policy-james" {
    title = "A policy about James"
    source = file("james/policy.hcl")
  }
}

Now for the policy itself, or the sub-policy in this case. We have our my-custom-policy, and we are now creating the “peter” sub-policy. It should look like so:

policy "peter" {
  title = "Policy specifically for Peter"
  doc = file("peter/README.md")
  configuration {
    provider "theprovideritusesname" {
      version = "> 1.0"
    }
  }

  policy "meal" {
    source = file("peter/meal.hcl")
  }
}

You can see that this sub-policy defines its prerequisites, such as the provider that supplies its data, just as the initial policy did. As before, the policy block here references another HCL file, but this one contains the queries to execute and looks like so:

policy "meal" {
  title = "Section Meal: Information regarding Peter’s meals"
  doc = file("peter/docs/meals.md")
  check "breakfast" {
    title = "What does Peter have for breakfast"
    doc = file("peter/docs/breakfast.md")
    query = file("queries/meals/breakfast_eaten.sql")
  }

  check "lunch" {
    title = "What does Peter have for lunch"
    doc = file("peter/docs/lunch.md")
    query = file("queries/meals/lunch_eaten.sql")
  }

  check "dinner" {
    title = "What does Peter have for dinner"
    doc = file("peter/docs/dinner.md")
    query = file("queries/meals/dinner_eaten.sql")
  }
}

Now you can see the policy specifically for Peter’s meals. We aren’t going to another nested level; instead we have the check block. This block contains the title of the check, any doc concerning it, and a reference to the SQL query contained in a .sql file (NOTE: you can write the SQL directly into the query attribute, like query = "SELECT 1;", but we don’t advise that because it prevents query reuse).

Tip: all the file includes are resolved from the root directory that contains the first policy.hcl, leaving you with a directory structure a little like so:

policy.hcl
peter
- docs
  - meals.md
  - breakfast.md
  - lunch.md
  - dinner.md
- policy.hcl
- meal.hcl
queries
- meals
  - breakfast_eaten.sql
  - lunch_eaten.sql
  - dinner_eaten.sql
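Because every file include is resolved from the policy root, a mistyped path is easy to introduce and hard to spot. The following Python sketch scans the .hcl files under a root and reports includes that do not exist. It is a rough helper under stated assumptions: the function name is hypothetical, it uses a regular expression rather than a real HCL parser, and it only recognises file("...") calls written with straight double quotes.

```python
import re
import tempfile
from pathlib import Path

# Matches file("some/path") with straight double quotes only.
FILE_INCLUDE = re.compile(r'file\("([^"]+)"\)')


def missing_includes(root):
    """Return sorted (hcl_file, include) pairs whose include is absent under root."""
    root = Path(root)
    missing = []
    for hcl_file in root.rglob("*.hcl"):
        text = hcl_file.read_text(encoding="utf-8")
        for include in FILE_INCLUDE.findall(text):
            # Includes are resolved from the policy root, not from the
            # directory of the file that contains them.
            if not (root / include).is_file():
                missing.append((hcl_file.relative_to(root).as_posix(), include))
    return sorted(missing)


# Demo on a throwaway directory: peter/meal.hcl is referenced but never created.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    (demo_root / "peter").mkdir()
    (demo_root / "policy.hcl").write_text('source = file("peter/policy.hcl")\n')
    (demo_root / "peter" / "policy.hcl").write_text('source = file("peter/meal.hcl")\n')
    for hcl_file, include in missing_includes(demo_root):
        print(f"{hcl_file}: missing {include}")
```

Pointing missing_includes at your own policy root before running the policy gives a quick sanity check that the directory layout matches the references in the HCL.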

And with this, you can create policies that automate checks, including your internal organisation policies, against your assets in the cloud. If you’d like more documentation on policies, you can check out the policy documentation or the hub’s policy page.

If you are looking for an open-source cloud asset inventory powered by SQL, check out our GitHub.

Also, feel free to join our Discord if you run into any bugs or issues, or just want to chat.