Sebastiaan Brozius for AWS Community Builders

Posted on Dec 29, 2024 • Originally published at justtinkering.nl

Setting up AWS IoT Core using Terraform

#awsiotcore #terraform

The code that accompanies this blogpost can be found here

I've been tinkering with AWS IoT Core this year, and wanted to put at least some of what I've found and done into a blog, so here it is.

AWS IoT Core is the AWS Internet-of-Things service:

AWS IoT provides the cloud services that connect your IoT devices to other devices and AWS cloud services. AWS IoT provides device software that can help you integrate your IoT devices into AWS IoT-based solutions. If your devices can connect to AWS IoT, AWS IoT can connect them to the cloud services that AWS provides.

Setting up an IoT environment within AWS is pretty easy, but I want to put it in code, so I can easily reproduce the environment I set up, and also be able to easily remove the configuration. My tool of choice is (still) Terraform.

We'll need to create the following resources:

IAM role used for registering new things
IoT policy for devices
IoT thing group (optional)
IoT thing type (optional)
Pre-provisioning Lambda (optional)
IoT fleet provisioning template (with optional pre-provisioning hook)
IoT policy for provisioning
Certificate for claim-based provisioning
IoT event configurations (optional)

I've included Terraform code in the accompanying GitHub repository.

IAM role for registering new things

When provisioning, the role used by IoT requires permissions to register new things. A managed policy AWSIoTThingsRegistration exists for this purpose, which should be assigned to a (new) role.

# Create the IoT provisioning IAM role
resource "aws_iam_role" "iot_fleet_provisioning" {
  name = join("-", [var.name, "fleet-provisioning-role"])

  assume_role_policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Effect" : "Allow",
        "Principal" : {
          "Service" : "iot.amazonaws.com"
        },
        "Action" : "sts:AssumeRole"
      }
    ]
  })
}

# Attach the managed policy for registering things to the provisioning role
resource "aws_iam_role_policy_attachment" "iot_fleet_provisioning" {
  role       = aws_iam_role.iot_fleet_provisioning.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSIoTThingsRegistration"
}

# Ensure that these (managed) policies are the only ones attached to the provisioning role on every apply
resource "aws_iam_role_policy_attachments_exclusive" "iot_fleet_provisioning" {
  role_name = aws_iam_role.iot_fleet_provisioning.name
  policy_arns = [
    "arn:aws:iam::aws:policy/service-role/AWSIoTThingsRegistration",
  ]
}

IoT policy for devices

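The actions a thing is allowed to perform are defined in an IoT policy. A thing authenticates with IoT Core using a device-specific certificate. During provisioning, the policy will be assigned to the device-specific certificate, as defined in the fleet provisioning template.

The policy below allows a thing to publish to and receive from topics prefixed with its own thing name, to subscribe to its own shadow topics, and to connect only when the thing is attached to the certificate it connects with.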

# Create a device policy
resource "aws_iot_policy" "iot_device" {
  name = join("-", [var.name, "device-policy"])

  policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Effect" : "Allow",
        "Action" : [
          "iot:Publish",
          "iot:Receive"
        ],
        "Resource" : [
          "arn:aws:iot:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:topic/$${iot:Connection.Thing.ThingName}/*",
        ]
      },
      {
        "Effect" : "Allow",
        "Action" : "iot:Subscribe",
        "Resource" : [
          "arn:aws:iot:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:topicfilter/$aws/things/$${iot:Connection.Thing.ThingName}/shadow/*",
        ]
      },
      {
        "Condition" : {
          "Bool" : {
            "iot:Connection.Thing.IsAttached" : [
              "true"
            ]
          }
        },
        "Effect" : "Allow",
        "Action" : "iot:Connect",
        "Resource" : "*"
      }
    ]
  })
}
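To make the policy variable a bit more concrete, here is a small illustrative sketch of what the two topic resources resolve to once a thing connects. It only mimics the string substitution of iot:Connection.Thing.ThingName; the function name and the example account ID are made up for illustration, and this is not how IoT Core evaluates policies internally.

```python
def resolve_device_policy_resources(region, account_id, thing_name):
    """Show the topic ARNs from the device policy for one connected thing."""
    prefix = f"arn:aws:iot:{region}:{account_id}"
    return {
        # Statement 1: iot:Publish and iot:Receive
        "publish_receive": f"{prefix}:topic/{thing_name}/*",
        # Statement 2: iot:Subscribe
        "subscribe": f"{prefix}:topicfilter/$aws/things/{thing_name}/shadow/*",
    }


if __name__ == "__main__":
    # Hypothetical region, account ID and thing name.
    resources = resolve_device_policy_resources("eu-west-1", "123456789012", "iot-123")
    for statement, arn in resources.items():
        print(statement, arn)
```

A thing named iot-123 can therefore only publish under iot-123/ and cannot touch topics that belong to another thing.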

IoT thing group (optional)

Thing groups are optional, and can be used to group things together. There are static and dynamic thing groups. There's a limit of 100 dynamic groups per account, so if you've got a large environment with possibly a lot of groups, think ahead on whether or not you can use dynamic groups.

Dynamic thing groups are created from specific search queries in the registry. Search query parameters such as device connectivity, device shadow creation, and AWS IoT Device Defender violations data support this. Dynamic thing groups require fleet indexing enabled to index, search, and aggregate your devices' data.

Static thing groups allow you to manage several things at once by categorizing them into groups. Static thing groups contain a group of things that are managed by using the console, CLI, or the API.

In this example we're using a static group, which the new thing will be assigned to by the fleet provisioning template.

# Create a Thing group
resource "aws_iot_thing_group" "provisioning" {
  name = "Provisioning"
}

IoT thing type (optional)

Thing types allow you to store description and configuration information that is common to all things associated with the same thing type.

Although thing types are optional, their use makes it easier to discover things.

Things with a thing type can have up to 50 attributes.
Things without a thing type can have up to three attributes.
A thing can be associated with only one thing type.
There is no limit on the number of thing types you can create in your account.

In this example we're creating a single thing type, which is assigned to the thing by the fleet provisioning template.

# Create a Thing type
# The delete process of these is that they'll be deprecated first,
# and 5 minutes later they can be deleted.
resource "aws_iot_thing_type" "example" {
  name = "Example"

  properties {
    description = "Example"
    searchable_attributes = [
      # There's a maximum of 3 searchable attributes per Thing Type
      "environment",
      "license",
    ]
  }
}

Pre-provisioning Lambda (optional)

AWS recommends using pre-provisioning hook functions when creating provisioning templates to allow more control of which and how many devices your account onboards. Pre-provisioning hooks are Lambda functions that validate parameters passed from the device before allowing the device to be provisioned. This Lambda function must exist in your account before you provision a device because it's called every time a device sends a request through RegisterThing.

In this example we're deploying a simple Lambda function used for pre-provisioning. No logic is built into the Lambda, but it shows that you can have a gatekeeper present in your provisioning process. You could, for example, check whether the license number the thing provides for registration is valid, whether the IP address of the thing is as expected, or whether you've reached the maximum number of things you want to register.

import json


def pre_provisioning_hook(event, context):
    print(event)

    # You can put code here to check if a device trying to connect
    # should be allowed or not, like checking if any of the provided
    # attributes are valid.
    # This function has to be able to respond within 5 seconds,
    # otherwise the provisioning request fails.
    # Reference: https://docs.aws.amazon.com/iot/latest/developerguide/pre-provisioning-hook.html

    # If you want to allow the device to connect to IoT Core, return this:
    # 'allowProvisioning': True

    # If you want to disallow the device to connect to IoT Core, return this:
    # 'allowProvisioning': False
    return {
        'allowProvisioning': True
    }
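As a sketch of what such a gatekeeper could look like, the variant below rejects requests whose License parameter doesn't match an expected format. The License parameter name comes from the fleet provisioning template in this post; the format rule (1 to 32 letters, digits, or hyphens) and the helper is_valid_license are assumptions for illustration, and are not part of the accompanying repository.

```python
import re

# Hypothetical format rule: 1 to 32 letters, digits, or hyphens.
LICENSE_PATTERN = re.compile(r"[A-Za-z0-9-]{1,32}")


def is_valid_license(value):
    """Return True when value is a string matching the assumed license format."""
    return isinstance(value, str) and LICENSE_PATTERN.fullmatch(value) is not None


def pre_provisioning_hook(event, context):
    # The template parameters sent by the device arrive
    # under the "parameters" key of the event.
    parameters = event.get("parameters") or {}
    license_number = parameters.get("License")

    if not is_valid_license(license_number):
        # Reject: the RegisterThing request will fail for this device.
        return {"allowProvisioning": False}

    return {"allowProvisioning": True}
```

A real check would more likely look the license up in a database or call a licensing service, while still staying within the 5-second limit.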

IoT fleet provisioning template (with optional pre-provisioning hook)

The fleet provisioning template is what ties together all the resources we created previously.

This template defines the following:

The parameters it expects to receive from a thing that's submitting itself for registration, to be used in the template
The resources required for the thing to communicate with IoT: a thing, a certificate, and one or more policies.

This is where we assign the device policy to the thing, through the device-specific certificate which will be created for the thing. It will assign the thing to the initial thing group, and assign a thing type. By using multiple fleet provisioning templates, with different provisioning certificates, you can easily register different types of devices in your IoT environment, and assign device-specific attributes to them.

# Create the fleet provisioning template
resource "aws_iot_provisioning_template" "fleet" {
  name                  = join("-", [var.name, "fleet-provisioning-tpl"])
  description           = "Fleet provisioning template for ${var.name}"
  provisioning_role_arn = aws_iam_role.iot_fleet_provisioning.arn
  enabled               = true

  template_body = jsonencode({
    "DeviceConfiguration" : {},
    "Parameters" : {
      "License" : {
        "Type" : "String"
      },
      "AWS::IoT::Certificate::Id" : {
        "Type" : "String"
      }
    },
    "Resources" : {
      "policy" : {
        "Type" : "AWS::IoT::Policy",
        "Properties" : {
          "PolicyName" : aws_iot_policy.iot_device.name
        }
      },
      "certificate" : {
        "Type" : "AWS::IoT::Certificate",
        "Properties" : {
          "CertificateId" : {
            "Ref" : "AWS::IoT::Certificate::Id"
          },
          "Status" : "Active"
        }
      },
      "thing" : {
        "Type" : "AWS::IoT::Thing",
        "OverrideSettings" : {
          "AttributePayload" : "MERGE",
          "ThingGroups" : "REPLACE",
          "ThingTypeName" : "REPLACE"
        },
        "Properties" : {
          "AttributePayload" : {
            "license" : { "Ref" : "License" },
          },
          "ThingGroups" : [
            aws_iot_thing_group.provisioning.name
          ],
          "ThingTypeName" : aws_iot_thing_type.example.name,
          "ThingName" : {
            "Fn::Join" : [
              "-",
              [
                "iot",
                {
                  "Ref" : "License"
                }
              ]
            ]
          }
        }
      }
    }
  })

  pre_provisioning_hook {
    target_arn      = aws_lambda_function.iot_preprovisioning.arn
    payload_version = "2020-04-01"
  }
}
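From the device's point of view, this template is used by publishing a RegisterThing request. The sketch below shows, under the assumption that the JSON payload format is used, which topic and payload a device would send for this template and which thing name results from the Fn::Join above. The helper names and the example template name are hypothetical; the sample client in the repository uses the AWS IoT Device SDK rather than building these by hand.

```python
import json


def register_thing_topic(template_name):
    """MQTT topic for a JSON RegisterThing request against a template."""
    return f"$aws/provisioning-templates/{template_name}/provision/json"


def register_thing_payload(ownership_token, license_number):
    """JSON body for RegisterThing; License is the template's own parameter."""
    return json.dumps(
        {
            # Token returned by the earlier CreateKeysAndCertificate call.
            "certificateOwnershipToken": ownership_token,
            "parameters": {"License": license_number},
        }
    )


def expected_thing_name(license_number):
    # Mirrors the Fn::Join in the template: "iot" + "-" + License.
    return "-".join(["iot", license_number])


if __name__ == "__main__":
    # Hypothetical template name and token.
    print(register_thing_topic("demo-fleet-provisioning-tpl"))
    print(register_thing_payload("<CERTIFICATE_OWNERSHIP_TOKEN>", "123"))
    print(expected_thing_name("123"))
```

The device doesn't send AWS::IoT::Certificate::Id itself; that parameter is filled in during provisioning with the ID of the certificate that belongs to the ownership token.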

IoT policy for provisioning

The environment we're setting up uses the 'provisioning by claim' method. This means we don't have to create device certificates in advance; instead, a new device registers itself using a generic provisioning certificate.

Because this certificate will be 'out in the wild', we want to restrict the permissions it provides as much as possible. This means the certificate should only be allowed to be used to register a new thing, and create a device-specific certificate for that thing. This is also why we want to add the pre-provisioning hook as a gatekeeper.

The policy below allows the thing to connect to IoT, and to subscribe to, publish to, and receive from the MQTT topics related to certificate creation and provisioning, specific to a provisioning template.

# Create the claims provisioning certificate policy
resource "aws_iot_policy" "provisioning" {
  name = join("-", [var.name, "claim-certificate-policy"])

  policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Effect" : "Allow",
        "Action" : "iot:Connect",
        "Resource" : "*"
      },
      {
        "Effect" : "Allow",
        "Action" : [
          "iot:Publish",
          "iot:Receive"
        ],
        "Resource" : [
          "arn:aws:iot:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:topic/$aws/certificates/create/*",
          "arn:aws:iot:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:topic/$aws/provisioning-templates/${aws_iot_provisioning_template.fleet.name}/provision/*"
        ]
      },
      {
        "Effect" : "Allow",
        "Action" : "iot:Subscribe",
        "Resource" : [
          "arn:aws:iot:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:topicfilter/$aws/certificates/create/*",
          "arn:aws:iot:${data.aws_region.current.name}:${data.aws_caller_identity.current.account_id}:topicfilter/$aws/provisioning-templates/${aws_iot_provisioning_template.fleet.name}/provision/*"
        ]
      }
    ]
  })
}

Certificate for claim-based provisioning

As the provisioning certificate, we're going to use a self-signed certificate. Reasons for using a self-signed certificate:

The ability to set how long the certificate will be valid
Not having to set up a Certificate Authority
An AWS IoT-generated certificate doesn't have the proper allowed uses to connect to MQTT

For this example we're setting the lifetime of the certificate to 365 days. The lifetime that's right for your environment will depend on things like how often you'll update/deploy the application that includes the provisioning template, and how easy it is to update the certificate in that application.

The IoT policy for provisioning will be assigned to this certificate, to make sure it's not going to be used for any actions other than registering a new thing.

# Create a self-signed provisioning certificate
resource "tls_private_key" "provisioning" {
  algorithm = "RSA"
  rsa_bits  = 2048
}

resource "tls_self_signed_cert" "provisioning" {
  private_key_pem = tls_private_key.provisioning.private_key_pem

  subject {
    common_name = "IoT Provisioning"
  }

  validity_period_hours = 8760 # 365 days

  allowed_uses = [
    "key_encipherment",
    "digital_signature",
    "server_auth",
  ]
}

# Add the provisioning certificate and attach the provisioning policy
resource "aws_iot_certificate" "iot_fleet_provisioning" {
  certificate_pem = tls_self_signed_cert.provisioning.cert_pem
  active          = true
}

resource "aws_iot_policy_attachment" "iot_fleet_provisioning_certificate" {
  policy = aws_iot_policy.provisioning.name
  target = aws_iot_certificate.iot_fleet_provisioning.arn
}

IoT event configurations (optional)

If you want to be able to act on IoT events, those will need to be enabled.

Enabling them causes these events to be published to specific MQTT topics. The events can be used in IoT rules to trigger actions when specific events happen. They can also be used by things, as long as the device policy grants permission to subscribe to and receive from those topics.

This example enables events related to thing creation, updates, and deletion.

# Manage events that will publish messages to MQTT topics.
# Reference: https://docs.aws.amazon.com/iot/latest/developerguide/iot-events.html#iot-events-enable
resource "aws_iot_event_configurations" "this" {
  event_configurations = {
    "THING"                  = true,
    "THING_GROUP"            = false,
    "THING_TYPE"             = false,
    "THING_GROUP_MEMBERSHIP" = false,
    "THING_GROUP_HIERARCHY"  = false,
    "THING_TYPE_ASSOCIATION" = false,
    "JOB"                    = false,
    "JOB_EXECUTION"          = false,
    "POLICY"                 = false,
    "CERTIFICATE"            = false,
    "CA_CERTIFICATE"         = false,
  }
}
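With the THING events enabled, registry events for a thing are published on reserved topics under $aws/events/thing/. As a small sketch, the helper below (a hypothetical name, not part of the repository) lists the topics for a single thing.

```python
THING_EVENT_TYPES = ("created", "updated", "deleted")


def thing_event_topics(thing_name):
    """Reserved topics on which registry events for one thing are published."""
    return [f"$aws/events/thing/{thing_name}/{event}" for event in THING_EVENT_TYPES]


if __name__ == "__main__":
    for topic in thing_event_topics("iot-123"):
        print(topic)
```

In an IoT rule you would typically use the + wildcard in place of the thing name, so that one rule covers every thing.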

IoT logging to CloudWatch (not included)

By default, AWS IoT Core doesn't log to CloudWatch. You can enable this in the console under Settings, or using the Terraform resource aws_iot_logging_options.
This will incur extra costs, so do keep an eye on that.

IaC caveats

Lack of resource support

Not all resources can be created using Terraform. For example, support for jobs and job templates is missing, and those are resources that really help to create a workflow for provisioning and staging your things.

These resources can be created using the AWS SDK. A workaround for this shortcoming is to create a Lambda function that performs the desired action when your environment is dynamic, or to use the SDK in a script when your environment is more static.
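As a rough sketch of the SDK route, the function below creates a job template through the create_job_template call of the boto3 IoT client. The client is passed in as an argument (for example boto3.client("iot")), so the same function can be called from a Lambda handler or from a script. The wrapper name, the template ID, the description, and the job document are hypothetical values for illustration.

```python
import json


def create_iot_job_template(iot, template_id, description, job_document):
    """Create an AWS IoT job template from a Python dict job document."""
    response = iot.create_job_template(
        jobTemplateId=template_id,
        description=description,
        # The API expects the job document as a JSON string.
        document=json.dumps(job_document),
    )
    return response["jobTemplateArn"]


# Example usage (requires boto3 and AWS credentials):
#
#   import boto3
#   arn = create_iot_job_template(
#       boto3.client("iot"),
#       "demo-reboot",                      # hypothetical template ID
#       "Reboot the device",                # hypothetical description
#       {"operation": "reboot"},            # hypothetical job document
#   )
```

This sketch doesn't handle the case where the template already exists, which you'd want to cover if the function runs on every deployment.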

One way to leverage a Lambda function when deploying your environment with Terraform is the resource aws_lambda_invocation. If you set the triggers correctly, the Lambda function is invoked whenever any of the triggers changes.

Another solution is to use a step function to orchestrate the provisioning of a new thing, but that might be overkill, depending on the size of your environment. Starting out with one or more (simple) Lambda functions and refactoring them into a step function later on is always an option.

Destroying your environment

When destroying the environment using Terraform, any device certificate needs to be deactivated, and detached from any policy (and thing). Otherwise the policies cannot be removed.
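The cleanup described above could be scripted before running terraform destroy. Below is a minimal sketch, assuming a boto3 IoT client is passed in (for example boto3.client("iot")); the function name is hypothetical, and pagination of the list calls is left out for brevity.

```python
def release_thing_certificates(iot, thing_name):
    """Detach and deactivate every certificate attached to thing_name."""
    released = []
    principals = iot.list_thing_principals(thingName=thing_name)["principals"]

    for cert_arn in principals:
        # Certificate ARNs end in "cert/<certificateId>".
        cert_id = cert_arn.split("/")[-1]

        # Detach all IoT policies from the certificate.
        policies = iot.list_attached_policies(target=cert_arn)["policies"]
        for policy in policies:
            iot.detach_policy(policyName=policy["policyName"], target=cert_arn)

        # Detach the certificate from the thing, then deactivate it.
        iot.detach_thing_principal(thingName=thing_name, principal=cert_arn)
        iot.update_certificate(certificateId=cert_id, newStatus="INACTIVE")
        released.append(cert_id)

    return released


# Example usage (requires boto3 and AWS credentials):
#
#   import boto3
#   release_thing_certificates(boto3.client("iot"), "iot-123")
```

After this, the policies are no longer attached to any device certificate, so Terraform should be able to remove them.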

Sample client

Now that we've set up the IoT environment, it's time to test it. I've included a Python sample client in the accompanying GitHub repository, which registers itself with IoT Core, and writes the device-specific certificates to disk.

❯ python3 ./iotservice.py
Connecting to akgbiozgh01fa-ats.iot.eu-west-1.amazonaws.com with client ID 'iot-123'...
Lifecycle Connection Success
Connected!
Subscribing to CreateKeysAndCertificate Accepted topic...
Subscribing to CreateKeysAndCertificate Rejected topic...
Subscribing to RegisterThing Accepted topic...
Subscribing to RegisterThing Rejected topic...
Publishing to CreateKeysAndCertificate...
Waiting... CreateKeysAndCertificateResponse: null
Published CreateKeysAndCertificate request..
Received a new message awsiot.iotidentity.CreateKeysAndCertificateResponse(certificate_id='<CERTIFICATE_ID>', certificate_ownership_token='<CERTIFICATE_OWNERSHIP_TOKEN>', certificate_pem='<CERTIFICATE_PEM>', private_key='<PRIVATE_KEY>')
Publishing to RegisterThing topic...
Waiting... RegisterThingResponse: null
Published RegisterThing request..
Received a new message awsiot.iotidentity.RegisterThingResponse(device_configuration={}, thing_name='iot-123')
Exiting Sample: success
Stop the Client...
No Client to stop
Thing name: iot-123

This example is based on examples provided by AWS.

Device registration caveats

The device-specific certificates are written to disk in the function registerthing_execution_accepted in fleetprovisioning_mqtt5.py. IoT Core creates the device certificate before pre-provisioning has finished. If the certificates are written to disk and the device is then rejected by pre-provisioning, later attempts to connect can fail because device-specific certificates are already present on the device. Those certificates would then have to be removed from the device before a new attempt can be made.

Also, because the certificate is created before the device is actually accepted, there will be certificates listed in IoT Core with the status Pending activation.
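To keep an eye on these leftovers, you could list the certificates that have been pending for a while. The sketch below assumes a boto3 IoT client is passed in and only reports certificate IDs; the function name and the one-hour threshold are assumptions for illustration, and what to do with the result (for example deleting the certificates) is left to you.

```python
from datetime import datetime, timedelta, timezone


def stale_pending_certificates(iot, max_age=timedelta(hours=1), now=None):
    """Return IDs of certificates that have been pending activation for longer than max_age."""
    now = now or datetime.now(timezone.utc)
    stale = []
    kwargs = {}

    while True:
        page = iot.list_certificates(**kwargs)
        for cert in page["certificates"]:
            pending = cert["status"] == "PENDING_ACTIVATION"
            # creationDate is a timezone-aware datetime in boto3 responses.
            if pending and now - cert["creationDate"] > max_age:
                stale.append(cert["certificateId"])

        # Follow pagination until there is no next marker.
        marker = page.get("nextMarker")
        if not marker:
            return stale
        kwargs = {"marker": marker}


# Example usage (requires boto3 and AWS credentials):
#
#   import boto3
#   print(stale_pending_certificates(boto3.client("iot")))
```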

Conclusion

I've had fun this year figuring out things like this, and hope I've been able to provide you with enough information to set up your own IoT Core environment and play around with it.

Think ahead about the challenges you expect to face, and be agile. Start small, and prepare for expanding to a larger scale. And as always, variables and requirements can (and probably will) change. Knowing what your options are, and what their pros and cons are, will greatly help in picking the solutions you need in both the short and the long term.
