SnykSec for Snyk

Posted on Nov 3, 2023 • Originally published at snyk.io

Rego for beginners: Introduction to Rego

#cloudsecurity #iacsecurity #rego #opa

This blog post series offers a gentle introduction to Rego, the policy language from the creators of the Open Policy Agent (OPA) engine. If you’re a beginner and want to get started with writing Rego policy as code, you’re in the right place.

In this three-part series, we’ll go over the following:

Part 1 (this part!): Basic Rego concepts and how to get started with OPA
Part 2: Intermediate Rego syntax
Part 3: Types of values and rules

What are Rego and OPA?

Rego is a declarative query language from the makers of the Open Policy Agent (OPA) framework. The Cloud Native Computing Foundation (CNCF) accepted OPA as an incubation-level hosted project in April 2019, and OPA graduated from incubating status in 2021.

Rego is used to write policy as code, which applies programming practices such as version control and modular design to the evaluation of cloud and infrastructure as code (IaC) resources.

OPA is the engine that evaluates policy as code written in Rego. It “decouples policy decision-making from policy enforcement,” which means it delivers a decision on whether a resource is compliant with policy so you don’t need to hard-code those checks into your application code.

By separating policy from your software and offloading policy checks to OPA, you add speed and flexibility to the development lifecycle. You can update policies at any time without having to recompile your application or redeploy your service. This enables you to build at scale, increase visibility into your compliance posture, and programmatically enforce policies. Introducing policy as code also has the benefit of shifting cloud and IaC development “left,” which means running policy checks earlier in the lifecycle.

Here at Snyk, out-of-the-box and custom rules for Snyk IaC+ are written in Rego, and behind the scenes, Snyk uses the OPA engine to evaluate Rego policies and return decisions. In fact, Snyk has run more than one billion security rule evaluations using OPA!

How does Rego work?

In a declarative query language like Rego, you describe the data you want to retrieve, and the program searches a data source — known as the input — for a match. This is different from traditional imperative languages, where you describe the steps that need to happen to produce a result. Perhaps you are already familiar with some declarative query languages — SQL is probably the most widely used one.

With Rego, you describe the conditions for passing or failing a policy, and OPA searches a JSON (or YAML) input document for data that matches those pass/fail conditions.

Sounds great in the abstract, but what does a rule look like? Let's say we need to enforce a corporate policy that only Alice, a network administrator, should have permission to create and delete virtual networks in the production environment. Here's an example rule we could write:

allow := true {
  input.user == "alice"
}

We'll return to this rule momentarily so we can explain what it does and how it works. For now, just marvel at its elegant simplicity!

Input

OPA can process any JSON or YAML document as input. Did you notice our use of input.user in the example rule above? input is treated as a special JSON document that can be accessed globally, meaning you can refer to it from anywhere within the Rego policy file.

Here's an example input document to go with our example rule from the previous section:

{
  "user": "alice"
}

We’ll say that this is a document representing the currently logged-in user. In the real world, this input document might be a Kubernetes manifest or the output of a Terraform plan. We'll show you an example in an upcoming blog post.

Rules

Now that we've shown you what input looks like, let's dive into the concept of rules. In the Rego language, a rule is a conditional assignment. Each rule has two parts:

a head {
  and a body
}

The head consists of a variable and a value that may be assigned to it.
The body consists of one or more queries that tell OPA what condition(s) must be fulfilled for the value to be assigned to the variable.

You can read a rule like this:

THIS VARIABLE    := HAS THIS VALUE {
    IF THESE CONDITIONS ARE MET
}

In sum, a rule queries the input to find a match for a condition, and if a match is found, a value is assigned to a variable.

Here's the example rule we used earlier:

allow := true {
  input.user == "alice"
}

In the example above, the head (variable and value) is allow := true and the body (query) is input.user == "alice". Put the head and body together, and you get a complete rule, which can be read as:

The variable allow has the value true IF user is equal to "alice".

Note that in Rego, := is the assignment operator, sometimes known as the walrus operator. It simply assigns a value to a variable, much like the equals sign does in other languages.
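To make the difference between assigning (:=) and comparing (==) concrete, here is a minimal sketch. The package name example and the rule name role are made up for illustration. The sketch uses the if keyword (enabled by import rego.v1), which newer OPA releases expect before a rule body; the rules in this post use the older form without it.

```rego
package example

import rego.v1

# `:=` assigns: the rule `role` always has the value "network-admin".
role := "network-admin"

# `==` compares: the body succeeds only when both sides are equal,
# and only then is `true` assigned to `allow`.
allow := true if {
  input.user == "alice"
}
```

With the input {"user": "alice"}, both role and allow are defined. With any other user, only role is.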

Queries

Let's dig into the concept of queries. As we stated earlier, a query represents a condition to check — it's essentially the first half of an IF statement.

For example, this is the query from the example rule in the previous section:

input.user == "alice"

This line tells OPA to query the input document to find out IF the user is equal to "alice".

There's some nuance here — Rego is declarative, so a query is technically just making the statement "This is how it is," and OPA finds all the values in the input that make that statement true. So, in this example, the query says, “the user property is set to alice.” It’s OPA’s job to examine the user property and find all the users in the input that make this statement true, if any (in this case, it's looking for the user alice).

Referring to input in a query

When you craft a Rego query, you use dot notation to drill down to the property you're looking for, meaning each nested level of the input document is separated by a period. First, you start with input, and then add a dot and the name of the property at the top level (in this case, input.user).

To refer to a nested property, you'll need to specify all the layers you pass through to get to it. Start with input, a period, and the top-level property (input.your-property-here), then keep adding dots and property names until you reach the nested property you want to query. If the property you want to query is an array, sit tight — we’ll address that momentarily.

Let's say the input document looks like this, instead:

{
  "admins": {
    "user": "alice"
  }
}

Since user is nested under admins, which is nested under input, you'd refer to it like so:

input.admins.user
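As a rough sketch, a rule that checks this nested property could look like the following. The package name example and the rule name allow_brackets are illustrative only, and the if keyword (via import rego.v1) is the form newer OPA releases expect. The second rule reaches the same property with bracket notation, which is the form to use when a key is not a plain identifier (for example, a key containing a hyphen).

```rego
package example

import rego.v1

# Dot notation walks down one level per dot: input -> admins -> user.
allow if {
  input.admins.user == "alice"
}

# Bracket notation refers to exactly the same property.
allow_brackets if {
  input["admins"]["user"] == "alice"
}
```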

If the input property you’re referring to is an array (list), on the other hand, you will use the wildcard operator (an underscore) to refer to its elements. For example, let’s say the input includes a users array:

{
  "users": [ "alice", "bob", "carlotta" ]
}

In this case, if you wanted to find out if alice is in the users array, you’d use this syntax:

allow := true {
  input.users[_] == "alice"
}

Above, the wildcard operator tells OPA to iterate through the array and see if any of the elements is equal to alice. We’ll dive into iteration in a future blog post.
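For comparison, the same check can be written with a named variable instead of the wildcard. This is only a sketch: the package name example and the rule name allow_membership are invented here, and the some ... in and in keywords are enabled by import rego.v1 (newer OPA releases have them built in).

```rego
package example

import rego.v1

# Same idea as input.users[_] == "alice", but the element gets a name.
allow if {
  some name in input.users
  name == "alice"
}

# A membership test expresses the check in a single line.
allow_membership if "alice" in input.users
```

Both rules are true for the users array above and undefined when alice is not in the array.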

Likewise, let's suppose you want to check whether the value of the admin property is true in any of the elements in the users array below (even though there's only one element shown):

{
  "users": [
    {
      "name": "alice",
      "admin": true
    }
  ]
}

You would use syntax like this:

allow := true {
  input.users[_].admin == true
}
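One thing worth knowing once an array holds several elements: every underscore is independent, so two separate lines that each use [_] may match two different elements. If two conditions have to hold for the same element, naming the element keeps them tied together. The sketch below assumes a hypothetical rule name, alice_is_admin, and uses the if and some ... in keywords through import rego.v1.

```rego
package example

import rego.v1

# Both conditions are checked against the same element, `user`.
alice_is_admin if {
  some user in input.users
  user.name == "alice"
  user.admin == true
}
```

With this rule, an input where alice is not an admin but someone else is does not make alice_is_admin true.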

Of course, IF conditions aren't very useful unless there's a conditional action to go with them. That's where it helps to understand how assignment, the second half of the IF statement, works in Rego.

Rule evaluation

The conditional action in a rule is assigning a variable. In our example, we are looking in the input to find out if the variable allow should be assigned the value true. You can read it like this:

allow := true {
  IF THIS CONDITION IS MET
}

What are variables?

A variable is a reference to a specific value. Here, the variable x is assigned the value 1:

x := 1

You can use the variable in place of the value after that, and it's all the same to Rego:

x := 1
y := 2
z := x + y
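Placed in a policy file, those three lines might look like the sketch below. The package name example, the rule total_ok, and the local variable limit are illustrative assumptions, and the if keyword comes from import rego.v1. Note that a variable assigned with := cannot be assigned a second time in the same scope; Rego reports an error instead.

```rego
package example

import rego.v1

# At the top level of a file, each assignment is a rule with no body.
x := 1
y := 2
z := x + y

# Inside a rule body, `:=` declares a local variable for that rule only.
total_ok if {
  limit := 5
  z <= limit
}
```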

To return to our example, the variable is allow, and the value to be assigned to it is true:

allow := true

Combined with the condition (query) we discussed earlier, you get the entire rule:

allow := true {
  input.user == "alice"
}

To recap, you'd read this rule like so:

The rule allow has the value true IF input.user is equal to "alice".

In practice, this kind of rule is very common, so Rego allows you to abbreviate it to the following:

allow {
  input.user == "alice"
}

Evaluating the rule with OPA

We’ve got a rule, and we’ve got an input document. The next step is to use OPA to evaluate the input against the rule. Let’s find out for sure whether our input is compliant with our policy — and whether the logged-in user is allowed to create and delete virtual networks in the production environment.

We will focus on two ways of interacting with OPA:

Using the Rego Playground
Using OPA’s command line tool

Using the Rego Playground

The easiest way to get started with writing rules is to use OPA's Rego Playground. It's an interactive tool that allows you to write, test, and share rules and input.

Here are the basics:

To edit a rule, use the rule text field on the left side of the page.
To edit the input, use the Input field on the top right of the page. (Note that this must be valid JSON.)
To evaluate a rule, select the Evaluate button above the input.
To see the evaluation results, check out the Output field on the bottom right.
To share a rule with others, select the Publish button on the top right. OPA generates a URL you can give to anyone so they can test or modify your rule and input.

Experiment to your heart's content, and don't be afraid to get messy! You won't break anything — the compiler will inform you if the Rego isn't valid. Reload the page if you want to reset the playground to its original state (or its published state if you're viewing a published playground).

We've shared a playground using the example allow rule and input document, which you can access at this URL: https://play.openpolicyagent.org/p/SH5ApmfodX

Or, just fire up a fresh playground and paste the rule in the text box on the left:

allow := true {
  input.user == "alice"
}

And paste the input in the top right text box:

{
  "user": "alice"
}

Recall that our corporate policy states that only Alice, a network administrator, should have permission to create and delete virtual networks in the production environment. The input document represents the currently logged-in user. Our rule checks whether the user in the input document is equal to alice, and if so, allow evaluates to true.

Go ahead and click the Evaluate button. You'll see this output:

{
  "allow": true
}

This means that allow indeed evaluates to true. OPA has determined that the input document is compliant with our corporate policy. The user is Alice, which means the user is allowed to create and delete virtual networks in the production environment.

Using OPA’s command line tool

Another way to evaluate rules with OPA is to use the opa command line tool. You can find instructions for installing opa in OPA’s documentation. Once you install it, you’ll need two things:

A .rego policy file containing your rule, along with a package declaration such as package rules.check_user at the very top. We’ve named our policy file check_user.rego.
A .json file containing the input; we’ve named our input file input.json.
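For reference, a complete check_user.rego might look like the sketch below. It uses the if keyword, which OPA 1.0 and later require before a rule body; the import rego.v1 line is there so the same file should also work on recent 0.x releases. On an older OPA, the allow := true { ... } form shown earlier in this post can be used instead.

```rego
# check_user.rego
package rules.check_user

import rego.v1

# allow is true only when the logged-in user is alice.
allow if {
  input.user == "alice"
}
```

The package name is what you query for: data.rules.check_user refers to everything this file defines.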

Once you have those two things, you can use the opa eval command to evaluate your policy as code:

opa eval -i input.json -d check_user.rego "data.rules.check_user" --format pretty

You can name your files whatever you want, of course — just make sure the command follows this structure:

opa eval -i <input-file> -d <policy-file> "data.<package-name>" --format pretty

If allow evaluates to true, as it does in our example, you’ll see the same output you saw in the Rego Playground:

{
  "allow": true
}

Again, OPA is indicating that the input is consistent with (that is, compliant with) our company policy, which means the currently logged-in user is allowed to create and delete virtual networks in the production environment.
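The same check can also be automated with OPA's built-in test runner, opa test, which runs every rule whose name starts with test_. The sketch below keeps the policy and its tests in one file so that it is self-contained; the file name check_user_test.rego and the test names are assumptions, and in a real project the tests would usually live in their own file. The with keyword swaps in a test input for the duration of a single expression.

```rego
# check_user_test.rego (hypothetical file name)
package rules.check_user

import rego.v1

allow if {
  input.user == "alice"
}

# Passes when allow is true for alice.
test_alice_is_allowed if {
  allow with input as {"user": "alice"}
}

# Passes when allow is not true for any other user.
test_bob_is_not_allowed if {
  not allow with input as {"user": "bob"}
}
```

Running opa test check_user_test.rego should report both tests as passing.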

Testing non-compliant input

What does it look like if an input document is non-compliant? In the playground or in your local input.json file, change the input to the following:

{
  "user": "bob"
}

Now, when you evaluate the rule (by clicking Evaluate in the Rego Playground, or running the aforementioned opa eval command), you see the following output:

{}

What does it mean? The result is an empty object because allow is "undefined" in this example: OPA does not find a value in the input that satisfies the condition input.user == "alice", so no value is assigned to allow and there is nothing to return for it. As a result, allow does not evaluate to true, and the input document is not compliant with company policy. Sorry, Bob!
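If you would rather get an explicit false than an empty result, Rego's default keyword can supply a fallback value for a rule. This is a minimal sketch of that variation, not part of the policy used above; it is written with the if keyword through import rego.v1.

```rego
package rules.check_user

import rego.v1

# Fallback value used whenever no other allow rule applies.
default allow := false

allow if {
  input.user == "alice"
}
```

Evaluated against the input {"user": "bob"}, this version returns {"allow": false} instead of {}.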

What’s next?

Be sure to return to our blog for the rest of our Rego for Beginners content series, where we’ll explore intermediate Rego rule syntax, including AND and OR structures, custom messages, special keywords, and more.

In the meantime, here are some useful resources:

If you’re interested in using Rego to write custom rules for Snyk IaC, check out our documentation. In addition to Snyk’s built-in security and compliance-mapped rulesets, IaC+ custom rules enable you to set customized security controls across your SDLC.

IaC+ gives you a single view and controls for your configuration issues from code to cloud with an issues UI, ruleset, and policy engine spanning IDE, SCM, CLI, CI/CD, Terraform Cloud, and deployed cloud environments such as AWS, Azure, and Google Cloud.
