DEV Community: pmalmirae

Building a serverless CGM Monitor on AWS with Kiro

pmalmirae — Mon, 08 Jun 2026 04:55:39 +0000

There are roughly 10 million people with Type 1 diabetes worldwide, with half a million new cases diagnosed every year. Nightscout, the open-source continuous glucose monitoring (CGM) platform, has been forked more than 70,000 times on GitHub. It's a lifeline for patients and caregivers who need real-time access to glucose data on their phones, watches, and browsers.

Yet most deployments still run on container platforms with MongoDB, a stack that made sense when Heroku was free but feels today increasingly heavy and expensive. When platform after platform dropped their free tiers, the community scrambled to find alternatives. Railway, Northflank, Fly.io. But never AWS serverless. It was about time someone changed that.

The Starting Point

A close family member has Type 1 diabetes and uses a Freestyle Libre continuous glucose monitor. Like many in the CGM community, we relied on Nightscout, the open-source project that lets you view glucose data remotely, share it with caregivers, and connect apps like xDrip and Shuggah on Apple Watch.

The classic Nightscout setup used to run free on Heroku with MongoDB Atlas. When Heroku killed their free tier, I moved to Northflank, which worked for a while, until it didn't. The free container IP addresses got banned by Abbott's servers for some reason (probably too many Nightscout users on the same shared IPs), and I ended up paying $15/month for a dedicated container just to keep the data flowing. Fifteen dollars a month to relay one glucose reading per minute felt absurd.

I wanted something I fully controlled, something minimal, and something cheap.

So I built my own serverless Nightscout on AWS from scratch.

The Catalyst

I'd had the idea of building this on AWS serverless for a long time. The architecture was clear in my head: Lambda, DynamoDB, API Gateway, done. But between work and life, I never found the time to actually sit down and build it. Writing Terraform modules, implementing Lambda handlers, figuring out the LibreLink Up API, wiring it all together. It was a project that always stayed in the "someday" pile.

Then Kiro came along. With an AI development environment that could help me scaffold infrastructure, implement the Lambda functions, debug issues, and iterate fast, that "someday" project suddenly became a "this weekend" project.

The Idea

The concept was simple: replace the entire Nightscout stack with AWS serverless services. No servers to manage, no MongoDB to babysit, pay-per-use pricing. For a single patient system that processes one glucose reading per minute, the theoretical cost should be pocket change.

The architecture:

EventBridge Scheduler triggers a Lambda function every minute
The Lambda fetches data from Abbott's LibreLink Up API
Stores it in DynamoDB
Another Lambda serves a Nightscout-compatible REST API via API Gateway
A static web app on S3 + CloudFront shows the glucose graph
Cognito handles authentication for the web UI

All defined in Terraform. All serverless. All mine.

Building It

The build started with Kiro's spec-driven development. I wrote a couple of lines describing what I wanted: a serverless Nightscout on AWS that fetches data from LibreLink Up and serves it via a compatible API. I also pointed Kiro at two open-source projects as reference material: cgm-remote-monitor (the original Nightscout) to understand the API contract and data models, and nightscout-librelink-up to understand how the LibreLink Up data fetching works. Kiro studied both, then produced a full spec: user stories with acceptance criteria, a technical design document covering architecture decisions and data models, and a structured list of implementation tasks. Then it started executing them, one by one.

One of the user stories Kiro generated from my two-line prompt. Complete with acceptance criteria and technical context.

The high-level architecture diagram from Kiro's design document. This is what got built.

Phase 1: Infrastructure

I started with Terraform modules: DynamoDB tables, Lambda functions, API Gateway, Secrets Manager. Modular, reusable, properly parameterized. The kind of infrastructure code that makes you feel responsible and adult.

Phase 2: The CGM Fetcher

The core Lambda function authenticates with LibreLink Up, fetches glucose data, and writes it to DynamoDB. Go was the obvious choice: fast cold starts, tiny binaries, and it compiles to a single static binary that just works on Lambda.

Phase 3: The API Handler

A Nightscout-compatible API that serves data to existing clients. The beauty of Nightscout is its ecosystem. Dozens of apps already speak its protocol. I just needed to implement the endpoints and existing apps would work out of the box.

Phase 4: The Web App

Vanilla HTML/CSS/JavaScript. No React, no build step, no node_modules black hole. A glucose graph, current reading, trend arrow, and time-in-range stats. Served from S3 through CloudFront for pennies.

The finished web app. Real-time glucose graph, trend arrow, time-in-range stats. Vanilla JS, no frameworks.

Phase 5: Authentication

Cognito for the web UI with Lambda@Edge enforcing auth at the CDN level. API endpoints stay open with API key auth for backward compatibility with mobile apps.

Phase 6: Data Migration

Three years of glucose data lived in the old Nightscout's MongoDB. Kiro wrote all the scripts needed for the migration: downloading the data from MongoDB, transforming it from Nightscout's MongoDB format into the DynamoDB schema, validating each record, and uploading it in small chunks to avoid throttling the table. Roughly 1.5 million readings moved over without drama. This is where things got interesting cost-wise, though.

The $16 Problem

After everything was running, I checked the bill. About $10/month. For a system that processes one reading per minute. That seemed... wrong.

DynamoDB was the culprit, eating more than $8 of that. But was it storage or operations?

I had 3 years of data: 1.5 million items across the base table and two Global Secondary Indexes. At $0.25/GB, storage was only about $0.60/month. So the cost was overwhelmingly from read and write operations.

The write spike from the migration made sense: 7.9 million write request units in March. Migrating 1.5M items through two GSIs with conditional writes (many failing because items already existed, but DynamoDB charges for those too). One-time cost, acceptable.

But the ongoing costs were still ~$16/month projected. Time to dig deeper.

The 720 Writes Per Minute Problem

Here's where I discovered something beautifully stupid.

The CGM fetcher runs every minute. LibreLink Up's /graph endpoint returns not just the current reading, but the last ~12 hours of historical readings. About 720 data points. Every single time.

My fetcher was supposed to filter out duplicates. It had a QueryLastEntry function that found the most recent entry in DynamoDB, then only wrote entries newer than that.

The problem? QueryLastEntry used a table scan with Limit: 1000.

With 1.5 million items in the table, scanning 1000 random items and picking the highest timestamp is like searching for the newest book in a library by checking the first shelf you walk past. You'll almost certainly miss the actual newest one.

So the filter thought the "latest" entry was months old. Every minute, it dutifully wrote practically all 720 historical entries. With conditional writes (attribute_not_exists) to prevent duplicates, which meant DynamoDB accepted the request, checked the condition, rejected the write, and charged me anyway. Seven hundred and twenty times. Every sixty seconds.

That's ~2,160 write request units per minute (720 × 3 for the base table plus two GSIs). Over 3 million wasted WRUs per month.

The fix was one query:

input := &dynamodb.QueryInput{
    TableName:              aws.String(d.tableName),
    IndexName:              aws.String("DeviceTimestampIndex"),
    KeyConditionExpression: aws.String("#d = :device"),
    ScanIndexForward:       aws.Bool(false), // newest first
    Limit:                  aws.Int32(1),     // just one
}

Query the GSI in reverse order, take the first item. Correct answer, every time, for 0.5 RRUs instead of 500.

After deploying: writes dropped from ~175,000/day to ~3,000/day. A 98% reduction.

The Million Reads Mystery

Writes were fixed. But reads were still ~1 million RRUs per day. For a system with maybe 4 active clients.

I traced it to the DeviceTimestampIndex GSI: over 1 million reads per day, all from there. The API handler's QueryEntries function was the culprit.

Two Shuggah (xDrip) clients on mobile phones poll every ~16 seconds, requesting count=1. Just the latest glucose value for the watch face. Perfectly reasonable. About 10,000 requests per day.

But QueryEntries had no Limit on its DynamoDB query. It asked for the entire GSI partition (1.5 million items sorted by timestamp), DynamoDB started reading and returned up to 1 MB of data per page, and the code took the first item and threw the rest away.

Each count=1 request was reading thousands of items from DynamoDB. Ten thousand requests per day, each consuming ~100 RRUs instead of 0.5.

The fix: one line.

Limit: aws.Int32(int32(count)),

Tell DynamoDB to stop reading after you have what you need.

The Result

Metric	Before	After
Writes/day	175,000	3,000
Reads/day	1,000,000	~10,000
Projected DynamoDB cost	$16/month	$0.61/month

The entire AWS bill for a fully functional, Nightscout-compatible CGM monitoring system: about $1.50/month.

The CloudWatch Free Tier Trap

Even after fixing DynamoDB, the bill was still around $2/month. It ran fine at $0.05/day for the first three weeks of each month, then jumped to $0.09/day for the last week. Classic free tier exhaustion pattern.

The culprits:

Too many alarms. CloudWatch gives you 10 free alarms. I had 11 (7 in eu-central-1, 4 in us-east-1). Three of those were DynamoDB throttle alarms that would never fire with on-demand billing and my traffic level. Removed them.
API Gateway access logging. API Gateway logs are classified as "Vended Logs" in CloudWatch, which have a much smaller free tier (~0.21 GB) than standard log ingestion (5 GB). At ~10 MB/day, the free tier ran out around day 21. I already had Lambda-level logging that showed the same information, so I disabled the API Gateway access logs entirely.

After those two changes: 7 alarms (within the 10 free tier), zero vended log bytes. The CloudWatch line item dropped to essentially zero.

Lessons Learned

1. DynamoDB charges for work, not results. A conditional write that fails still costs a WRU. A query that returns 1 MB but you only use 1 item still costs for 1 MB of reads. DynamoDB is honest. It bills you for the I/O it performed, not the I/O you wanted.

2. Table scans are not queries. Scanning 1000 items and picking the max is O(n) and non-deterministic. Querying a GSI in reverse with Limit 1 is O(1) and always correct. Know your access patterns.

3. Always set Limit. If you need 1 item, tell DynamoDB you need 1 item. Don't ask for everything and filter in application code.

4. LibreLink Up returns history every time. The graph endpoint gives you ~12 hours of data on every call. If you don't filter properly, you'll rewrite the same data 1,440 times per day.

5. Cost Explorer is your friend. Break down costs by usage type. "DynamoDB costs $8" tells you nothing. "24 million read request units from the DeviceTimestampIndex GSI" tells you exactly where to look.

6. The free tier masks problems. If you have AWS credits or free tier, you might not notice you're burning resources until the credits run out. Check usage quantities, not just dollar amounts.

The Stack

For anyone wanting to build something similar:

Compute: Lambda (Go, ARM64) for fast cold starts and minimal cost
Storage: DynamoDB (on-demand) scales to zero with no fixed costs
API: API Gateway HTTP API, cheaper than REST API
CDN: CloudFront + S3 for a static web app at pennies
Auth: Cognito + Lambda@Edge at $0.07/month
IaC: Terraform for reproducible, version-controlled infrastructure
Scheduling: EventBridge Scheduler triggers the fetcher every minute

Total infrastructure: ~$1.50/month, dominated by the $0.80 fixed cost for Secrets Manager (2 secrets at $0.40 each). DynamoDB, Lambda, API Gateway, and CloudWatch all fit comfortably in the free tier.

Was It Worth It?

I replaced a $15/month Northflank deployment with a fully serverless system I own completely, for $1.50/month. The data never leaves my AWS account. The code is mine. The infrastructure is defined in Terraform and deploys in minutes.

But the money isn't the greatest catch here.

The real takeaway is that with Kiro, you can compress the path from idea to production from weeks to hours. A project that sat in my "someday" pile for over a year was running in production over a weekend. Terraform modules, Go Lambda functions, DynamoDB schemas, API compatibility, web frontend, authentication. All of it.

That said, Kiro also writes errors and false logic from time to time. The 720-writes-per-minute bug? That came from a QueryLastEntry implementation that looked reasonable at first glance but fell apart at scale. The missing Limit on the API queries? Same story. Code that works in your head but bleeds money in production.

You need to evaluate and monitor the results. You need the hunch for what seems fine versus what smells like a problem. The $10 bill was the smell. The 1000-item scan was the false logic. Kiro gets you to production fast, but you still need to be the one who looks at the bill, checks the CloudWatch metrics, and asks "why is this number so high?"

The combination works: Kiro handles the volume of code, you handle the judgment calls. And sometimes those judgment calls are "this DynamoDB cost doesn't add up" followed by a conversation with Kiro to debug it together.

Sometimes the journey is the destination. And sometimes the destination is a $1.50 monthly bill and a Lambda function that no longer desperately tries to write 720 duplicate glucose readings every minute.

Get the Code

My project is open source and available at github.com/pmalmirae/serverless-nightscout-on-aws. I'm happy to share it with anyone who wants to run their CGM data on their own AWS account. If this helps even one person get off an expensive hosted setup and onto something they fully control, it was worth open-sourcing. Consider it my contribution back to the Nightscout community that gave us so much in the first place.

What's Next

A couple of directions I'm considering for the project:

Modular Terraform structure. Reorganizing the infrastructure into independent modules: Backend (DynamoDB + API), Web UI (S3 + CloudFront + Cognito), and a separate Libre Data Fetcher module. This would make it possible for contributors to develop data fetchers for other CGM devices (Dexcom, Medtronic, ...) without touching the rest of the stack.
Richer statistics in the Web UI. The current dashboard shows up to 24-hour glucose graph and time-in-range. The next step is adding estimated HbA1c (long-term blood glucose balance), daily and weekly trend analysis, and pattern detection. The data is already in DynamoDB, it just needs the math and the visualization.

If any of this sounds interesting to you, contributions are welcome. Whether it's a data fetcher for another CGM device, improvements to the web dashboard, documentation, or just testing it on your own account and reporting what breaks. Open an issue, send a PR, or just star the repo if you want to follow along. Let's build something useful together.

Speeding Up Data on AWS: From Ingestion to Insights

pmalmirae — Wed, 07 Aug 2024 11:37:02 +0000

In a production-scale cloud environment, data is scattered across various storage formats and locations, such as RDS databases, DynamoDB tables, time series databases, S3 files, and external systems. While Amazon QuickSight can directly connect to many data sources, it is often not preferred due to design principles, costs, performance, and user experience. Instead, the best practice is to build a centralized data lake with tools to consolidate and transform data for business intelligence tools. But how can you optimize the data pipeline from ingestion to insights to ensure processed data is ready for analysis as quickly as possible?

In this article, we use real-world open data sets on Helsinki region public traffic, imported as DynamoDB tables. We showcase, how we can transform data from the source to a Data Lake in S3, combine the Data Sets in QuickSight to create interesting and actionable insights, and eventually, how we can speed up the Data Pipeline to ensure the insights are always as up-to-date as possible.

Anatomy of a typical Serverless Data Pipeline on AWS

NordHero has implemented data pipelines for various customers on AWS utilizing our Data to Insights Jump Start offering. The solution uses

AWS Glue Jobs to extract data from their sources, transform the data to be efficiently utilized with BI tools, and load the data in Parquet or ORC format to a data lake based on Amazon S3
AWS Glue Crawlers to determine the data lake schemas and to store the schemas in AWS Glue Data Catalog
Amazon Athena to provide a scalable and super-fast SQL interface to the data stored in the S3 data lake
AWS QuickSight to analyze the data, build actionable insights on the data, and deliver the insights to business users

AWS Glue is an AWS-managed service, meaning that AWS manages the needed compute instances, their software, and the scaling of the resources. You only pay for your data's processing time. You can create and run several AWS Glue jobs to extract, transform, and load (ETL) data from various data sources into the data lake and build different curated datasets in the data lake for various data consumption needs.

Amazon S3 is an ideal service to be used as the storage foundation for a data lake, providing several benefits:

Scalability and Elasticity: Amazon S3 can scale massively to store virtually unlimited amounts of data, without the need for provisioning or managing storage infrastructure.
Data Lake Architecture: S3 enables a decoupled storage and compute architecture, allowing you to store data in its raw form and use various analytics services and tools to process and analyze the data without being tied to a specific compute engine. S3 integrates seamlessly with various AWS analytics services like Amazon Athena, AWS Glue, Amazon EMR, Amazon QuickSight, and AWS Lake Formation, enabling you to build end-to-end data processing and analytics pipelines.
Cost-Effective: Amazon S3 offers a cost-effective storage solution, with pricing based on the amount of data stored and accessed. You can also leverage different storage classes (e.g., S3 Glacier) for cost optimization based on data access patterns.
Data Durability and Availability: Amazon S3 is designed for 99.999999999% durability and 99.99% availability, ensuring your data is safe and accessible when needed.
Data Lake Security and Compliance: Amazon S3 provides robust security features, including access control, encryption at rest and in transit, and integration with AWS Identity and Access Management (IAM) for granular permissions management.
Data Sharing and Collaboration: With Amazon S3, you can easily share data across teams, projects, or even with external parties, enabling collaboration and data monetization opportunities.
Centralized Data Repository: A data lake on Amazon S3 serves as a centralized repository for all your structured, semi-structured, and unstructured data, breaking down data silos and enabling data democratization within your organization.

Here's an example AWS Glue Job script, written in Python, that extracts passenger data from a DynamoDB table named hsl-passengers, transforms the column names from uppercase to lowercase, casts passenger_count field from String to Integer type, and lastly writes the transformed data in an S3 bucket in Parquet format.

import sys
from datetime import datetime, date, timedelta

from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.sql.dataframe import DataFrame
from pyspark.sql.types import IntegerType
from pyspark.sql.functions import col


SPARK_CONTEXT = SparkContext.getOrCreate()
GLUE_CONTEXT = GlueContext(SPARK_CONTEXT)
spark = GLUE_CONTEXT.spark_session
logger = GLUE_CONTEXT.get_logger()


def read_dynamo_db(tablename: str):
    """Reads DynamoDB table into a DynamicFrame"""

    dyf = GLUE_CONTEXT.create_dynamic_frame.from_options(
        connection_type="dynamodb",
        connection_options={
            "dynamodb.input.tableName": f"{tablename}",
            "dynamodb.throughput.read.percent": "0.5",
            "dynamodb.splits": "1",
        },
    )
    return dyf


def write_log(message: str):
    """Writes log to multiple outputs"""
    logger.warn(message)
    print(message)


def write_to_s3(s3_output_base_path: str, name: str, df: DataFrame):
    """Writes data to specific folder in S3"""

    path = f"{s3_output_base_path}/{name}"
    if not ("processed" in path):
        raise Exception(
            "Output folder must contain path element 'processed' to be valid"
        )
    write_log(f"Writing output to {path}")
    df.write.mode("overwrite").format("parquet").partitionBy("object_id").save(path)


def main():

    # @params: [JOB_NAME]
    args = getResolvedOptions(
        sys.argv,
        ["JOB_NAME", "s3_output_path"],
    )

    job = Job(GLUE_CONTEXT)
    job.init(args["JOB_NAME"], args)

    # Let's only overwrite partitions that have changed, even though we store all data
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    s3_output_path = args["s3_output_path"]

    write_log(f"Parameter 's3_output_path': {s3_output_path}")

    # Reading data from DynamoDb
    passengers_raw = read_dynamo_db("hsl-passengers").toDF()

    passengers = (
        passengers_raw.withColumnRenamed("OBJECTID", "object_id")
        .withColumnRenamed("SHORTID", "short_id")
        .withColumnRenamed("STOPNAME", "stop_name")
        .withColumn("passenger_count", col("PASSENGERCOUNT").cast(IntegerType()))
        .drop("PASSENGERCOUNT")
    )

    write_to_s3(s3_output_path, "passengers-data", passengers)

    job.commit()


if __name__ == "__main__":
    main()

When you trigger the AWS Glue Job, AWS Glue fires up the needed Apache Spark compute instances, manages the parallel job execution between cluster nodes, and ramps down the compute services after the job execution has finished.

Optimizing the data delivery to the data lake with AWS Glue

You need to consider several things when optimizing the ETL process with AWS Glue, but eventually, it comes down to two criteria. The primary criterion is Time. The data lake is never 100% up-to-date with the source data. So the key question is, how often should the data be updated? The secondary criterion is always Money. After setting the Time criterion, how can the costs of the ETL process be optimized?

To beat the criteria, you need to plan your data pipeline well. Here are a few spices to compete against the clock:

Scale cluster capacity: Adjust the number of Data Processing Units (DPUs) and worker types based on your workload requirements. AWS Glue allows you to scale resources up or down to match the demands of your ETL jobs.
Use the latest AWS Glue version: AWS regularly releases new versions of AWS Glue with performance improvements and new features. Upgrade to the latest version to take advantage of these enhancements.
Reduce data scan: Minimize the amount of data your jobs scan by using techniques like partitioning, caching, and filtering data early in the ETL process.
Parallelize tasks: Divide your ETL tasks into smaller parts and process them concurrently to improve throughput. AWS Glue supports parallelization through features like repartitioning and coalesce operations.
Minimize planning overhead: Reduce the time spent on planning by optimizing your AWS Glue Data Catalog, using the correct data types, and avoiding unnecessary schema changes.
Optimize shuffles: Minimize the amount of data shuffled between tasks, as shuffles can be resource-intensive. Use techniques like repartitioning and coalescing to reduce shuffles.
Optimize user-defined functions (UDFs): If you're using UDFs, ensure they are efficient and optimize their execution using vectorization and caching.
Use AWS Glue Auto Scaling: Enable AWS Glue Auto Scaling to adjust the number of workers based on your workload automatically, ensuring efficient resource utilization.
Monitor and tune: Use AWS Glue's monitoring capabilities, such as the Spark UI and CloudWatch metrics, to identify bottlenecks and tune your jobs accordingly.
Leverage AWS Glue Workflow: Use AWS Glue Workflow to orchestrate and manage your ETL pipelines, ensuring efficient execution and resource utilization.
Optimize data formats: Use columnar data formats like Parquet or ORC, which are optimized for analytical workloads and can improve query performance.
Leverage AWS Glue Data Catalog: Use the AWS Glue Data Catalog to store and manage your data schemas, which can improve planning and reduce overhead.
Optimize data compression: Use appropriate compression techniques to reduce the amount of data transferred and stored, improving performance and reducing costs.
Avoid processing the same data multiple times: Use AWS Glue Job bookmarks to track the data already processed by the ETL job, and update only the changed partitions when loading data to the data lake.

Here's an example of AWS Glue Workflow. The workflow processes Helsinki Region Transport (HSL) open data on passenger amounts and public transport stops and shifts. The workflow has a trigger named hsl-data-glue-workflow-trigger that is configured to start once per hour. The trigger will fire up five parallel AWS Glue Jobs to process data related to shifts, passengers, stoptypes, network and stops. When all these Jobs end up in the SUCCESS state, the hsl-data-glue-crawler-trigger is triggered to start an AWS Glue Crawler to update the data schemas in the AWS Glue Data Catalog.

AWS Glue Workflows support three types of start triggers:

Schedule: The workflow is started according to a defined schedule (e.g., daily, weekly, monthly, or a custom cron expression).
On-demand: The workflow is started manually from the AWS Glue console, API, or AWS CLI.
EventBridge event: The workflow starts with a single Amazon EventBridge event or a batch of Amazon EventBridge events.

Optimizing the data insights experience with Amazon QuickSight

From a data consumption perspective, the key criterion is that the data be up-to-date and instantly available. If there's lots of data in the data lake, updating an analysis or dashboard view in Amazon QuickSight might take even tens of seconds. That will make the business analytics experience very poor and generate lots of costs.

Amazon QuickSight has solved this issue with a lightning-fast in-memory caching solution called SPICE (Super-fast, Parallel, In-memory Calculation Engine). When configuring QuickSight DataSets, you have the option to either query the underlying data directly or utilize SPICE. QuickSight comes with a 10 GB SPICE allocation per QuickSight Author license, and additional SPICE capacity can be purchased with GB/month pricing.

When using SPICE, the underlying data from the data sources, such as a data lake, is loaded into SPICE. QuickSight Analyses and Dashboards utilize only the version of data available in SPICE. SPICE can be refreshed

manually
by a preconfigured schedule
through QuickSight API

The SPICE refresh timing becomes an issue when targeting to have as recent data available in QuickSight as possible. Consider a situation where the Glue Workflow, containing multiple ETL jobs, runs once per hour and updates several datasets in the S3 data lake. In our imaginary example, the workflow process typically lasts 20 minutes. Still, depending on the amount of changed data in the data sources since the last run and the current utilization level of the AWS-managed Glue hardware, the Workflow run can take between 14 and 40 minutes.

In addition, the QuickSight SPICE refresh process runs on AWS-managed computing resources, and in our case, refreshing one QuickSight DataSet might take 2-8 minutes.

And in a typical production-scale QuickSight environment, the order of DataSet refreshes matter. There are "simple" DataSets that are not dependent on any other DataSet and then there combined DataSets that utilize simple DataSets. Before starting to refresh one DataSet in QuickSight, we need to be sure that all underlying DataSets our DataSet is dependent on have first been refreshed.

Here's an example of a combined DataSet in QuickSight. All datasets available in data lake have first been brought to QuickSight, and now the passengers data is first joined with stops data while passenger amounts are counted per stop. Then stops data is joined with stoptypes data (whether the stop is a glass shelter, steel shelter, post,...), network data (is it a bus stop, subway stop, tram stop,...) and shifts data (number of public transport shifts between different stops).

Optimizing the whole data pipeline from ingestion to insights with event triggering and AWS Step Functions

So how can we manage this all automatically and with optimal timing? To solve the issues, we need to

use AWS Glue Workflow to automate and order the Jobs and Crawlers within the Glue ETL process
Refresh the QuickSight DataSets into SPICE immediately after Glue Workflow has finished its execution
Refresh the QuickSight DataSets in the correct order so that the "simple" DataSets get updated first and combined DataSets right after the simple ones

Luckily we can achieve the latter two by utilizing AWS StepFunctions and CloudWatch Event Triggering!

Triggering a Step Function when Glue Workflow has finished

AWS Glue Crawlers create CloudWatch Events on their lifecycle changes, and we can trigger an AWS Step Function State Machine execution when the last Crawler in our Glue Workflow sends a Succeeded event. Here's a CDK/TypeScript snippet on creating the event rule to watch for Glue Crawler state change events and to start the Step Function execution:

    // Event rule to trigger the Step Function
    new events.CfnRule(this, "CrawlerSucceededRule", {
      description: "Glue crawler succeeded",
      roleArn: smTriggerRole.roleArn,
      name: `statemachine-trigger-rule`,
      eventPattern: {
        source: ["aws.glue"],
        "detail-type": ["Glue Crawler State Change"],
        detail: {
          state: ["Succeeded"],
          crawlerName: [{ "equals-ignore-case": `hsl-data-glue-crawler` }],
        },
      },
      targets: [
        {
          arn: cfnStateMachine.attrArn,
          id: cfnStateMachine.attrName,
          roleArn: smTriggerRole.roleArn,
        },
      ],
    });

Refreshing QuickSight DataSets with AWS Step Functions State Machine

The AWS Step Functions have inbuilt integrations with loads of different AWS services, including AWS QuickSight. Therefore, it is straightforward to build a State Machine that starts refreshing the QuickSight DataSet SPICE - the process is called DataSet Ingestion. The following image shows an AWS Step Functions State Machine that processes QuickSight DataSet Ingestions in two phases - in the first phase, it parallelly ingests data on five QuickSight DataSets: network, stops, passengers, shifts and stoptypes. When all those ingestions have been successfully finished, the State Machine continues to ingest the second set of QuickSight DataSets, which in this example contains only one DataSet: passengers-and-stops.

For each QuickSight DataSet, the State Machine

Starts the Ingestion process with CreateIngestion call and saves the IngestionId value of the started Ingestion process
Checks the Ingestion status with DescribeIngestion call
If IngestionStatus is COMPLETED, CANCELLED or FAILED, it will pass the phase
Otherwise, it will wait for 20 seconds and check the Ingestion status again

Summing it up

As an end result, we now have a data pipeline that is triggered automatically by a predefined schedule, or with EventBridge event, and that starts ingesting QuickSight DataSets in correct order and as quickly as underlying data is updated. And now we can enjoy the actionable, up-to-date insights:

In this article, we reviewed the components of AWS's serverless data lake solution and explored ways to optimize its performance and user experience. Lastly, we learned how to automate the whole process from data ingestion to data insights with AWS Step Functions and AWS Glue Crawler Event Triggering.

We hope you enjoyed the journey. If you would like to set up a Serverless Data Pipeline and Data Lake on AWS, we are here to help. Just contact NordHero or book a meeting with me!

The examples in this article were build using data adapted from Helsinki Region Transport's (HSL) public data on transport stations, shifts and passengers. The original data is available on Helsinki Region Infoshare site.

What's new and noteworthy on AWS - Summer 2023 edition

pmalmirae — Wed, 23 Aug 2023 06:43:47 +0000

During my summer vacation, I occasionally glimpsed the AWS announcements feed but didn’t have time to dig into the list. I still got the feeling that there were some major releases, so I decided to go through all those hundreds of announcements and put together a comprehensive list of the most remarkable and noteworthy releases and new features around my favorite topics: Data & Analytics, Serverless Architecture & App Development, AWS Management & DevOps & IaC, and Security. I hope that also you find some gems from the list!

The summer of 2023 was indeed scorching hot for what comes to AWS and releasing new services and features. From the start of June until late August 2023, the list of recent announcements is impressive. There are over 600 announcements that I read through, and I hand-picked 45 top news on Data, Serverless, DevOps, and Security spaces. Several features have now been released as GA or in Preview that had already been announced at AWS re:Invent in November 2022 with flashing lights. But talk is cheap, so here are my top picks from the summer releases, ordered by solution area and release date!

Data & Analytics

Amazon QuickSight launches geospatial heatmap for points on maps

Released: Jun 5, 2023

It has been possible to create analyses and dashboards with geospatial map visuals also earlier, but now it is possible to have a geospatial heat map with your own data.
Please see following example image:

Geospatial heat map style uses color gradations to indicate areas of high and low data point concentration, allowing readers to zoom in and out, pan across the map, and explore the data in detail. When the map is zoomed in to a certain level, the heat layer automatically reverts back to the basic points, allowing readers to interact with the underlying points.

See here for more details: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-quicksight-geospatial-heatmap-points-maps/

AWS Glue for Ray is now generally available

Released: Jun 5, 2023

AWS Glue for Ray is now generally available. AWS Glue for Ray is based on open-source compute framework ray.io. AWS Glue for Ray combines Glue's serverless capability for data integration with possibility to develop ETL jobs with Python programming language.

AWS Glue for Ray facilitates the distributed processing of your Python code over multi-node clusters. You can create and run Ray jobs anywhere that you can run AWS Glue ETL jobs. This includes existing AWS Glue jobs, command line interfaces (CLIs), and APIs.

AWS Glue for Ray is generally available currently in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland).

More info: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-glue-ray-generally-available/

AWS Glue Data Quality is now generally available

Released: Jun 6, 2023

AWS announces general availability of AWS Glue Data Quality, a capability that automatically measures and monitors data lake and data pipeline quality. AWS Glue Data Quality helps reduce the need for manual data quality work by using open-source Deequ to evaluate rules and measure and monitor the data quality of petabyte-scale data lakes. It then recommends data quality rules to get started. You can update recommended rules or add new rules. If facing any issues with data quality, you can configure actions to alert users.

You can validate the data quality of Amazon Redshift, Apache Iceberg, Apache HUDI, and Delta Lake datasets that are cataloged in the AWS Glue Data Catalog. The quality results are published to Amazon EventBridge, simplifying how users are alerted and integrating data quality results with other applications.

AWS Glue Data Quality is generally available in all AWS Regions where AWS Glue is available. To learn more: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-glue-data-quality-generally-available/

Amazon Redshift Serverless now supports query scheduling and Single sign-on support

Released: Jun 7, 2023

Amazon Redshift Serverless now allows scheduling of SQL queries. With scheduling, you can automate time sensitive or long running queries. You can utilize the scheduled queries with Amazon Redshift Query Editor V2 or Amazon Redshift Data API.

Amazon Redshift Serverless supports now also Single sign-on with Identity Providers (IdP). How it works is that you can pass a list of database roles granted to a user based on his/her IdP group membership. Redshift administrator then configures the Identity Provider(IdP) to pass in database roles by adding specific principal tags as SAML attributes. The sign-on support can be used with Amazon Redshift Query Editor V2, JDBC/ODBC clients, and Data API.

The features are available in all regions that support Amazon Redshift Serverless. Read more here: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-redshift-query-scheduling-single-sign-on/

Amazon QuickSight now supports APIs to automate and accelerate assets deployment

Released: Jun 7, 2023

This is a long-awaited feature! QuickSight has earlier been it's own island inside an AWS Account, and it has been extremely difficult to automate QuickSight assets' deployment from one environment to another. It has been possible through API or CLI, but requiring heavy-duty coding in managing all dependencies between assets and the environments.

So now it is possible to export and import a QuickSight asset with all required dependencies. The feature supports all essential QuickSight assets such as dashboards, analysis, datasets including ingestion schedules, datasources, themes, and VPC configurations. You can even select whether to export the assets as plain JSON or as CloudFormation templates.

The new APIs are available with the Amazon QuickSight Enterprise edition in following AWS Regions where QuickSight is available: US East (N. Virginia and Ohio), US West (Oregon), Canada, Sao Paulo, Europe (Frankfurt, Ireland and London), Asia Pacific (Mumbai, Seoul, Singapore, Sydney and Tokyo).

Read the announcement here: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-quicksight-apis-automate-accelerate-assets-deployment/

Amazon Athena for Apache Spark now supports custom Java libraries

Released: Jun 8, 2023

Amazon Athena for Apache Spark was first released at re:Invent 2022 conference. Amazon Athena for Apache Spark is a feature of Amazon Athena lets you run interactive analytics on Apache Spark in under a second to analyze petabytes of data. So basically it is Athena but turbo-charged. With the new release, you can now include your own Java libraries and modules as JAR files in Spark workloads to connect to different data sources and run advance calculations using user defined functions to perform feature exploration.

The release includes also a set of reference connector packages for Amazon CloudWatch logs, CloudWatch metrics and Amazon DynamoDB so that you can use data from the services in your insights.

The new features are currently supported in 9 AWS regions where Amazon Athena for Apache Spark is available: US East (Ohio), US East (N. Virginia), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Mumbai). To learn more: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-athena-apache-spark-custom-java-libraries/

Amazon Athena for Apache Spark now supports Apache Hudi, Apache Iceberg, and Delta Lake

Released: Jun 8, 2023

Amazon Athena for Apache Spark now supports open-source data lake storage frameworks Apache Hudi 0.13, Apache Iceberg 1.2.1, and Linux Foundation Delta Lake 2.0.2. These frameworks simplify incremental data processing of large data sets using ACID (atomicity, consistency, isolation, durability) transactions and make it simpler to store and process large data sets in your data lakes.

Apache Iceberg, Apache Hudi and Delta Lake support is available in all AWS regions where Amazon Athena for Apache Spark is available. Read more here: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-athena-apache-spark-hudi-iceberg-delta-lake/

Amazon Kinesis Data Firehose adds support for data stream delivery to Amazon Redshift Serverless

Released: Jun 19, 2023

This release is part of the "Zero-ETL" initiative announced by AWS CEO Adam Selipsky at re:Invent 2022. He stated that AWS is putting their efforts to connect the various AWS services so that builders can concentrate on creating value instead of spending their time trying to get services integrated.

With the new release, Amazon Kinesis Data Firehose can now deliver streaming data directly to Amazon Redshift Serverless. With few clicks, you can more easily ingest, transform, and reliably deliver streaming data into Amazon Redshift Serverless without building and managing your own data ingestion and delivery infrastructure.

Amazon Kinesis Data Firehose with Amazon Redshift Serverless is generally available in the regions here under Redshift Serverless API section.

AWS Glue now can detect 250 sensitive entity types from over 50 countries

Released: Jun 23, 2023

Sensitive data detection feature in AWS Glue can now detect over 250 sensitive entity types from 50 countries out-of-the-box - all Nordic countries included!

Sensitive data detection feature in AWS Glue identifies a variety of sensitive data elements like social security numbers, credit card numbers, names, driver license numbers and other entities. Once detected, customers can take actions to redact the sensitive information before writing records into their data repositories. Customers can also create custom detectors to detect entities specific to their organizations.

This feature is available in the commercial Regions as AWS Glue. Check the supported countries here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-glue-250-entity-types-50-countries/

AWS announces Amazon Aurora MySQL zero-ETL integration with Amazon Redshift (Public Preview)

Released: Jun 28, 2023

Yes, another Zero-ETL announcement!

Amazon Aurora MySQL zero-ETL integration with Amazon Redshift is now available in public preview. The feature enables near real-time analytics and machine learning (ML) on petabytes of transactional data stored in Amazon Aurora MySQL-Compatible Edition. Data written into Aurora is available in Amazon Redshift within seconds, so you can quickly act on it without having to build and maintain complex data pipelines. Amazon Aurora MySQL zero-ETL integration with Amazon Redshift is available for Amazon Aurora Serverless v2 and Provisioned as well as Amazon Redshift Serverless and RA3 instance types.

Check the details: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-aurora-mysql-zero-etl-integration-redshift-public-preview/

Amazon Athena now supports querying restored data in S3 Glacier

Released: Jun 29, 2023

This is fun! You can now use Amazon Athena to query data stored in Amazon S3 Glacier (Glacier Flexible Retrieval & Deep Archive storage classes supported). With this launch, you can use Athena to directly query restored data in the S3 Glacier for use cases such as log analytics and long-term trend analysis, saving you time by removing the need to move and duplicate data.

This feature is available with Athena Engine V3 in all Amazon Athena supported regions. To learn more: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-athena-querying-restored-data-s3-glacier/

Amazon OpenSearch Service now supports OpenSearch version 2.7

Released: Jul 10, 2023

You can now run features from open source OpenSearch version 2.7 in Amazon OpenSearch Service. Key improvements include introduction of a unified schema for OpenSearch, ability to add map visualisations to Dashboard panels, ability to filter geospatial data. The new version also includes support for five new security log types.

Read the annoncement: https://aws.amazon.com/about-aws/whats-new/2023/07/amazon-opensearch-service-opensearch-version-2-7/

Amazon Redshift announces automatic mounting of AWS Glue Data Catalog

Released: Jul 25, 2023

Amazon Redshift released automatic mounting of AWS Glue Data Catalog, making it easier for customers to run queries in their data lakes. So no need to anymore create an external schema in Amazon Redshift to use the data lake tables cataloged in AWS Glue Data Catalog. Now, you can query data lake tables directly from Amazon Redshift Query Editor v2 or your favorite SQL editors. Again, a release that makes the life of data specialists much more fun!

AWS Glue Studio now supports Amazon Redshift Serverless

Released: Jul 25, 2023

AWS Glue Studio now supports Amazon Redshift Serverless as a data source or target out-of-the-box. Earlir, only Amazon Redshift clusters were supported out-of-the-box in AWS Glue Studio. As serverless edition of Redshift takes the space among customers, this update is for sure well anticipated.

To learn more, here's the announcement: https://aws.amazon.com/about-aws/whats-new/2023/07/aws-glue-studio-amazon-redshift-serverless/

Amazon EMR Serverless now supports retrieving secrets from AWS Secrets Manager

Released: Jul 27, 2023

A small but very important update. No more playing around with passwords and other secrets with your Amazon EMR Serverless. You can now get to the good side by utilizing AWS Secrets Manager for secrets like passwords, API keys an so forth.

Read here: https://aws.amazon.com/about-aws/whats-new/2023/07/amazon-emr-serverless-retrieving-secrets-aws-secrets-manager/

Amazon SageMaker announces a new direct integration with Salesforce Data Cloud

Released: Aug 4, 2023

August started with an announcement on Amazon SageMaker having a direct integration with Salesforce Data Cloud. What it means is that now you can without any extra hassle access Salesforce Data Cloud from SageMaker with OAuth-2.0-based authentication to build, train and deploy ML models on SageMager. So you can easily train ML models with SalesForce data, and turbo-charge Salesforce Einstein with ML-driven wisdom.

Salesforce Data Cloud direct integration is supported in all AWS regions where SageMaker is available. To learn more, read the announcement here: https://aws.amazon.com/about-aws/whats-new/2023/08/amazon-sagemaker-direct-integration-salesforce-data-cloud/

AWS IAM Identity Center integration is now generally available for Amazon QuickSight

Released: Aug 14, 2023

This is again one of those announcements that many have been waiting for! As said already earlier, QuickSight has been a quite isolated island inside AWS account, having its own user and group management. With this update, QuickSight administrators can now configure QuickSight to use IAM Identity Center to enable their users to login using their existing credentials. Administrators can select IAM Identity Center to configure QuickSight with their organization’s supported identity provider or with the IAM Identity Center identity store without requiring additional single sign-on configuration in QuickSight. Furthermore, they can use their identity provider groups to assign QuickSight roles (administrator, author and reader) to users.

This new feature is available in all AWS Regions where QuickSight and IAM Identity Center are available. Read more here: https://aws.amazon.com/about-aws/whats-new/2023/08/aws-iam-identity-center-integration-amazon-quicksight/

Serverless architecture & app development

AWS Lambda adds support for Ruby 3.2

Released: Jun 7, 2023

AWS Lambda now supports Ruby 3.2 as both a managed runtime and a container base image. The new Ruby version brings new features such as endless methods, a new Data class, improved pattern matching, and performance improvements. The Ruby 3.2 runtime is available in all regions where Lambda is available. The announcement can be found here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-lambda-support-ruby-3-2/

Amazon SQS announces support for dead-letter queue redrive via AWS SDK or CLI

Released: Jun 8, 2023

Amazon Simple Queue Service (SQS) announced support for dead-letter queue redrive via AWS SDK or Command Line Interface (CLI). The new feature is an enhanced capability to improve the dead-letter queue management by giving users a possibility to move messages from the dead-letter queue, and programmatically manage the lifecycle of the unconsumed messages at scale.

To programmatically automate dead-letter queue message redrive workflows, customers can now use the following actions:

StartMessageMoveTask, to start a new message movement task from the dead-letter queue;
CancelMessageMoveTask, to cancel the message movement task;
ListMessageMoveTasks, to get 10 most recent message movement tasks for a specified source queue.

See SQS Documentation for more information.

Dead-letter queue redrive via AWS SDK and CLI is available in all AWS Regions where Amazon SQS is available. Announcement can be found here: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-sqs-dead-letter-queue-redrive-aws-sdk-cli/

AWS Step Functions adds integration for 7 services including Amazon VPC Lattice

Released: Jun 15, 2023

The more integrations the better. AWS Step Functions really has a momentum, and now they released seven new integrations available through SDK. Overall, Step Functions have over 12,000 API actions from over 320 AWS services. That is really impressive and brings considerable advantage when building solutions that connect different AWS services. The new integrations include services such as Amazon VPC Lattice, Amazon CloudWatch Internet Monitor, AWS IoT TwinMaker, and Amazon OpenSearch Ingestion.

These enhancements are now generally available in all regions where AWS Step Functions is available. Please read more here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-step-functions-7-services-vpc-lattice/

AWS Step Functions launches Versions and Aliases

Released: Jun 22, 2023

Yet another major update for AWS Step Functions. AWS Step Functions announced in July 2023 the availability of Versions and Aliases, improving resiliency for deployments of serverless workflows. The new set of capabilities makes it easier to set up continuous deployment, to help you iterate faster and release safely into production. You can now maintain multiple versions of your workflows, track which version was used for each execution, and create aliases that route traffic between workflow versions. You can deploy your workflows gradually using industry standard techniques such as blue-green and canary style deployments with fast rollbacks to your Step Functions workflows, increasing deployment safety and reducing downtime and risk.

More info here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-step-functions-versions-aliases/

Announcing general availability for watchOS and tvOS support on AWS Amplify Library for Swift

Released: Jun 27, 2023

Normally I tend to bypass news about UI development but this one really caught my eye. You can now use AWS Amplify to build applications for Apple Watch and Apple TV! In late June 2023, they announced general availability of watchOS and tvOS support for AWS Amplify for Swift (>= v2.12.0). This launch enables developers to build cloud-connected apps for Apple Watch (watchOS) and Apple TV (tvOS) devices, in addition to iOS and macOS platforms.

Learn more here: https://aws.amazon.com/about-aws/whats-new/2023/06/watchos-tvos-aws-amplify-library-swift/

Amazon ECS now launches tasks faster alongside tasks with prolonged shutdown

Released: Jun 30, 2023

Amazon ECS is a platform that takes care of running containerized services called tasks. If a task becomes unhealthy, it is stopped and a new task is launched based on your configurations. Sometimes shutdown of a task takes a long time, and new task launches could get blocked on the instance. To overcome the situation, Amazon ECS now enables faster task launches on container instances that have tasks with prolonged shutdown. This enables customers to scale their workloads faster and improve infrastructure utilization.

Previously, to enable higher task provisioning throughput, ECS optimistically considered instance resources (e.g. cpu, memory, ports) as free for launching new tasks whenever a running task transitioned to the stopping state. In cases when a stopping task takes a long time to shutdown, new tasks launches could get blocked on the instance. This happened because ECS Agent waited for all stopping tasks to shutdown before starting new tasks. With the new release from the end of June 2023, ECS Agent can start new tasks on an instance if requisite resources are available even if there are tasks pending shutdown, enabling faster task launches and improving infrastructure utilization.

The new experience is available for customers using Amazon ECS on EC2 or ECS Anywhere in all AWS regions on Amazon ECS Optimized AMIs with ECS Agent version 1.73.0 or later. To learn more, read the announcement: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-ecs-tasks-faster-prolonged-shutdown/

AWS Lambda now detects and stops recursive loops in Lambda functions

Released: Jul 13, 2023

This is a small but yet very important update. I'm sure everyone who has by mistake developed a solution that recursively calls the same Lambda over and over again, has included this feature in their evening prayers.

AWS Lambda can now detect and stop recursive loops in Lambda functions. Lambdas are really popularly used to process events from sources like Amazon SQS and Amazon SNS. However, in certain scenarios, due to resource misconfiguration or code defect, a processed event may be sent back to the same service or resource that invoked the Lambda function. This can cause an unintended recursive loop, and result in unintended usage and costs for customers. With this launch, Lambda will stop recursive invocations between Amazon SQS, AWS Lambda, and Amazon SNS after 16 recursive calls.

When facing the recursive calls situation, Lambda will stop the 17th invocation and sends the event to a Dead-Letter Queue or on-failure destination, if configured. Customers will also receive an AWS Health Dashboard notification with troubleshooting steps.

Please see more details and available regions here: https://aws.amazon.com/about-aws/whats-new/2023/07/aws-lambda-detects-recursive-loops-lambda-functions/

AWS Fargate enables faster container startup using Seekable OCI

Released: Jul 17, 2023

Nice development again from ECS team! Customers running applications on Amazon ECS with AWS Fargate can now leverage Seekable OCI (SOCI), a technology open sourced by AWS that helps applications deploy and scale out faster by enabling the containers to start without waiting for the entire container image to be downloaded.

In many cases, waiting for the entire container image to download from container image repository is unnecessary as in many cases only a small portion of it is needed for startup. SOCI reduces this wait time by lazily loading the image data in parallel to application startup, enabling containers to start with only a fraction of the image.

To SOCI-enable your container images, start from this announcement: https://aws.amazon.com/about-aws/whats-new/2023/07/aws-fargate-container-startup-seekable-oci/

Amazon SNS now supports mobile push notifications in twelve new AWS regions

Released: Jul 20, 2023

Mobile client communications just got more comprehensive as Amazon SNS mobile push notifications are now available in twelve additional AWS regions, including Africa (Cape Town), Asia Pacific (Hong Kong), Asia Pacific (Jakarta), Asia Pacific (Osaka), Canada (Central), Europe (London), Europe (Milan), Europe (Paris), Europe (Stockholm), Middle East (Bahrain), Middle East (UAE), and US East (Ohio). With this expansion, Amazon SNS now supports the ability to send mobile push notifications from 24 regions.

Amazon SNS can send mobile push notifications on your behalf to mobile devices and desktops using one of the following supported push notification services: Amazon Device Messaging (ADM), Apple Push Notification Service (APNs) for iOS and Mac OS X, Baidu Cloud Push (Baidu), Firebase Cloud Messaging (FCM) to Android devices, Microsoft Push Notification Service for Windows Phone (MPNS), and Windows Push Notification Services (WNS).

Read the news: https://aws.amazon.com/about-aws/whats-new/2023/07/amazon-sns-mobile-notifications-twelve-regions/

AWS Lambda adds support for Python 3.11

Released: Jul 27, 2023

Yet another Lambda programming language update: AWS Lambda now supports creating serverless applications using Python 3.11. Developers can use Python 3.11 as both a managed runtime and a container base image, and AWS will automatically apply updates to the managed runtime and base image as they become available.

The Python 3.11 runtime is available in all Regions where Lambda is available, except for China and GovCloud Regions. Read more here: https://aws.amazon.com/about-aws/whats-new/2023/07/aws-lambda-python-3-11/

Announcing preview of JSON protocol support for Amazon SQS

Released: Jul 28, 2023

This is huge! The grand old SQS is turning from XML to JSON! At the end of July 2023, Amazon SQS announced a preview of JSON protocol support, enabling lower latency and improved performance for SQS customers. Based on AWS performance tests for a 5KB message payload, JSON protocol for Amazon SQS reduces end-to-end message processing latency by up to 23% and reduces application client side CPU and memory usage.

Amazon SQS customers can take advantage of lower latency when using the specified AWS SDK version. The specified SDK version achieves these latency gains by upgrading the default communication protocols to JSON wire protocol when they make SQS API requests. Customers can upgrade their AWS SDK to specified SDK version to use JSON protocol. Customers can also revert back to the AWS Query protocol by changing the SDK version.

For more information, here's the announcement: https://aws.amazon.com/about-aws/whats-new/2023/07/json-protocol-support-amazon-sqs/

Amazon EventBridge Scheduler adds schedule deletion after completion

Released: Aug 2, 2023

Amazon EventBridge Scheduler can invoke more than 270 AWS services and over 6,000 API operations, and scales out enabling scheduling of millions of tasks. No wonder it has become a de-facto scheduler for different time-based or recurring solutions on AWS platform. EventBridge Scheduler's new delete upon completion helps manage and clean-up schedules that have completed its last invocation. It removes the need for manual processes or custom code to delete completed schedules saving you time and making it easier to scale.

If interested, start with the announcement: https://aws.amazon.com/about-aws/whats-new/2023/08/amazon-eventbridge-scheduler-deletion-completion/

AWS management & DevOps & IaC

Single Region Terraform support now available for AWS Control Tower Account Factory

Released: Jun 8, 2023

AWS Control Tower is a great tool for managing AWS Organizations with multiple organization units, AWS accounts and related guardrails. With the new release, AWS Control Tower now offers a possibility to configure account templates with Terraform and utilize those templates when provisioning new or existing accounts from AWS Control Tower.

To get started, you can use the AWS-provided Terraform Reference Engine on GitHub that conﬁgures the code and infrastructure required for the Terraform open source engine. After the one-time setup, customers can define their account requirements using Terraform and deploy them to their accounts as part of the well-defined account factory workflow.

AWS Control Tower adds 10 new AWS Security Hub controls

Released: Jun 12, 2023

Another news related to Control Tower! AWS has added 10 new AWS Security Hub detective controls to the AWS Control Tower controls library. These new controls target services such as Amazon APIGateway, AWS CodeBuild, Amazon Elastic Compute Cloud, Amazon Elastic Load Balancer, Amazon Redshift, Amazon SageMaker, and AWS WAF. These new controls help you meet control objectives, such as establish logging and monitoring, limiting network access and encrypting data at rest, enhancing your governance posture.

With this addition, AWS Control Tower now supports over 170 detective controls from AWS Security Hub. Read more from the announcement: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-control-tower-new-aws-security-hub-controls/

Announcing general availability of AWS Control Tower's integration with Security Hub

Released: Jun 19, 2023

And one more to the AWS Control Tower! In July 2023 AWS announced the general availability of the integration between AWS Control Tower and AWS Security Hub. You can enable over 170 Security Hub detective controls that map to related control objectives from AWS Control Tower. With the new release, AWS Control Tower now detects when you disable a control from Security Hub which results in a ‘Drifted’ control state. With this drift detection capability, it is simpler for you to monitor the deployment state of your controls and take appropriate actions to manage the security posture of your AWS Control Tower environment.

The drift detection capability for Security Hub controls requires updating to the new version of the AWS Control Tower Landing Zone 3.2. The new Landing Zone verion also includes updates to the Region Deny control for multiple AWS services.

Read all about the announcement here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-control-tower-account-integration-security-hub/

AWS CloudFormation accelerates dev-test cycle with new ChangeSets parameter

Released: Jun 20, 2023

Sometimes a small update is actually a BIG one. AWS CloudFormation launched a new parameter OnStackFailure for the CreateChangeSet API that allows customers to control the rollback behavior of ChangeSets. Customers use ChangeSets to preview the impact of a stack operation on active resources. With this launch, customers can modify the actions that CloudFormation will take when ChangeSet execution is unsuccessful.

Customers can set OnStackFailure to ROLLBACK, DELETE, or DO_NOTHING, where ROLLBACK is the default option for OnStackFailure and it reverts the stack to its last stable state if ChangeSet execution fails. When setting OnStackFailure to DELETE, CloudFormation deletes the new stack if ChangeSet execution fails. This eliminates the need for manual clean-up of stacks and allows customers to retry stack creation with CI/CD actions. DO_NOTHING preserves the state of the stack if ChangeSet execution fails.

To learn more about OnStackFailure, click here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-cloudformation-accelerates-dev-test-cycle-changesets-parameter/

AWS CodeBuild now supports GitHub Actions

Released: Jul 7, 2023

AWS CodeBuild customers can now use GitHub Actions during the building and testing of software packages. AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces ready-to-deploy software packages. Customers’ CodeBuild projects are now able to leverage many of the pre-built actions available in GitHub’s marketplace. With CodeBuild’s integration with GitHub Actions, you can now extend your buildspec definition to invoke third-party solutions. There is no need to author and maintain custom integrations, or learn how integrate others’ solutions into your build process.

Amazon CodeCatalyst now supports workflows triggered by GitHub pull requests

Released: Jul 19, 2023

Another AWS DevOps tool integrating with GitHub! AWS announced in July 2023 their support for starting Amazon CodeCatalyst workflows based on pull request events in linked GitHub repositories. When a workflow is triggered by a GitHub-based pull request, users will also be able to see the name of the PR that triggered it in the CodeCatalyst workflows UI, and click a link that takes them directly to the pull request in GitHub.

To learn more, see the announcement: https://aws.amazon.com/about-aws/whats-new/2023/07/amazon-codecatalyst-workflows-triggered-github-pull-requests/

AWS Control Tower launches additional proactive controls

Released: Jul 24, 2023

Trust is good, control is better! In July 2024, AWS announced a launch of 28 new proactive controls in AWS Control Tower. This launch enhances AWS Control Tower’s governance capabilities with services such as Amazon CloudWatch, Amazon Neptune, Amazon ElastiCache, AWS Step Functions, and Amazon DocumentDB. Read more here: https://aws.amazon.com/about-aws/whats-new/2023/07/aws-control-tower-proactive-controls/

Accelerate your CloudFormation authoring experience with looping function

Released: Jul 26, 2023

This might be the biggest announcement for AWS CloudFormation fans for time being. AWS CloudFormation announced at the end of July 2023 looping capability with Fn::ForEach intrinsic function. With Fn::ForEach, you can replicate parts of your templates with minimal lines of code.

To use Fn::ForEach you have to declare AWS::LanguageExtensions transform. The language extensions transform expands the functionality of the base CloudFormation JSON/YAML template language. With this launch, you can use Fn::ForEach in your Resources, Resource properties, Conditions, and Outputs sections of your templates. Here's an example of CloudFormation YAML script that creates four different SNS Topics with different TopicNames:

AWSTemplateFormatVersion: 2010-09-09
Transform: 'AWS::LanguageExtensions'
Resources:
  'Fn::ForEach::Topics':
    - TopicName
    - - Success
      - Failure
      - Timeout
      - Unknown
    - 'SnsTopic${TopicName}':
        Type: 'AWS::SNS::Topic'
        Properties:
          TopicName: !Ref TopicName
          FifoTopic: true

Read the whole announcement here: https://aws.amazon.com/about-aws/whats-new/2023/07/accelerate-cloudformation-authoring-experience-looping-function/

AWS CodePipeline now supports GitLab

Released: Aug 14, 2023

Yet one more AWS DevOps service that integrate with popular 3rd party platform. You can now use your GitLab.com source repository to build, test, and deploy code changes using AWS CodePipeline. Connect your GitLab.com account using AWS CodeStar Connections, and use the connection in your pipeline to automatically start a pipeline execution on changes in your repository.

More here: https://aws.amazon.com/about-aws/whats-new/2023/08/aws-codepipeline-supports-gitlab/

Security

AWS WAF now supports Header Order match statement for request inspection

Released: Jun 5, 2023

AWS WAF now supports the Header Order match statement, enabling customers to specify the order in which HTTP headers appear in a request. With this feature, customers can further strengthen their access control measures by verifying additional dimensions of request metadata.

There is no additional cost for using this feature, however, standard AWS WAF charges still apply. It is available in all AWS Regions where AWS WAF is available and for each supported service, including Amazon CloudFront, Application Load Balancer, Amazon API Gateway, AWS AppSync, and Amazon Cognito. To learn more, see here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-waf-header-order-match-statement-request-inspection/

AWS KMS now supports importing asymmetric and HMAC keys

Released: Jun 5, 2023

You can now import asymmetric and HMAC keys into AWS Key Management Service (AWS KMS) and use them within supported KMS-integrated AWS services and your own applications. Importing your own key gives you direct control over the generation, lifecycle management, and durability of your keys. You can control the availability of your imported keys by setting an expiration period, or deleting and re-importing them at any time. These controls help you meet your specific compliance requirements if you must generate and store copies of keys outside of AWS.

Importing your own keys to AWS KMS can also be useful in situation where keys need to exist in multiple environments, including hybrid (on-premise) and multi-cloud workflows. This lets you safely migrate workloads to AWS while expanding options on how you authorize, audit, and protect keys through AWS KMS.

Check more details at: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-kms-importing-asymmetric-hmac-keys/

AWS introduces container image signing

Released: Jun 6, 2023

From early days of June 2023, AWS Signer and Amazon Elastic Container Registry (ECR) launched image signing, a new feature that enables you to sign and verify container images. You can now use AWS Signer to validate that only container images you have approved are deployed in your Amazon Elastic Kubernetes Service (EKS) clusters.

For more information: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-container-image-signing/

AWS announces AWS Payment Cryptography

Released: Jun 12, 2023

An interesting new service release touching eCommerce space! AWS announced in June 2023 a new service called AWS Payment Cryptography. This service simplifies your implementation of cryptography operations used to secure data in payment processing applications for debit, credit, and stored-value cards in accordance with various payment card industry (PCI), network, and ANSI standards and rules. Financial service providers and processors can replace their on-premises hardware security modules (HSMs) with this elastic service and move their payments-specific cryptography and key management functions to the cloud.

AWS Payment Cryptography is currently available only in the following US Regions: US East (N. Virginia) and US West (Oregon).

Read more about the service launch here: https://aws.amazon.com/about-aws/whats-new/2023/06/aws-payment-cryptography/

Amazon Verified Permissions is now generally available

Released: Jun 13, 2023

Announced originally back in AWS re:Invent 2022, AW has now released the general availability of Amazon Verified Permissions, service for fine-grained authorization and permissions management for applications that you build. Verified Permissions uses Cedar, an open-source language for access control, allowing you to define permissions as easy-to-understand policies. Use Verified Permissions to support role - and attribute-based access control in your applications.

Read more about Amazon Verified Permissions here: https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-verified-permissions-generally-available/

AWS IAM Identity Center now supports automated user provisioning from Google Workspace

Released: Jun 13, 2023

This is big news for all organization using Google Workspace! It has earlier been possible to integrate Google Workspace to AWS IAM Identity Center and single sign-on to AWS services with Google identities, but managing those identities between Google and AWS has required either manual administrative work or additional custom integration service to be developed.

The new integration features help administrators simplify AWS access management across multiple accounts while maintaining familiar Google Workspace experiences for end users as they sign in. IAM Identity Center and Google Workspace use now Google auto-provisioning to securely provision users into IAM Identity Center, saving administrative time.

Network Load Balancer now supports security groups

Released: Aug 10, 2023

Network Load Balancers (NLB) now supports security groups, enabling you to filter the traffic that your NLB accepts and forwards to your application. Using security groups, you can configure rules to help ensure that your NLB only accepts traffic from trusted IP addresses, and centrally enforce access control policies. This improves your application's security posture and simplifies operations.

To learn more, please read the announcement: https://aws.amazon.com/about-aws/whats-new/2023/08/network-load-balancer-supports-security-groups/

Mastering AWS deployments with Terragrunt

pmalmirae — Fri, 16 Sep 2022 07:22:36 +0000

Terraform offers a robust, declarative way to describe your cloud infrastructure as code. And unlike some other IaC tools, Terraform also does a decent job comparing differences between your current version of IaC code, the last deployment stored in Terraform state, and the current state of the deployed cloud resources. But when it comes to managing and deploying multiple copies of the same infrastructure, an additional tool is needed. And that tool has the name Terragrunt.

In this article, I assume that you understand the basics of Terraform. If you’d like to check the Terraform basics first, this article is for you (article in Finnish): https://www.nordhero.com/posts/google-cloud-terraformilla/.

Alright. So first, let’s get two acronyms right before moving forward.

IaC means Infrastructure as Code. The idea is that you don’t have to manually log in to the AWS Console and set up the infrastructure by selecting services in the console and clicking the systems up. The manual approach could be acceptable if you only had one environment for testing purposes. But if you would need to set it up again, or if you had more than one environment that should have the same resources and configurations (e.g., development, testing, and production environments), you shouldn’t try to manage those manually. Instead, you probably would like to use an IaC tool like Terraform to set up the infrastructure configurations as code to be deployed quickly and repeatably.

DRY means Don’t Repeat Yourself. The downside of a declarative language like Terraform is that it’s not that easy to manage variations of the same code if you need the deploy the same infrastructure with a few different flavors depending on the use case. You quickly end up making multiple copies of the infrastructure code to manage various similar kinds of deployments. And that’s what Terragrunt is here to solve.

How to keep Terraform code DRY?

The key idea of Terragrunt is to write needed infrastructure code only once utilizing Terraform (TF files) and to separate environment-specific values as variables to be defined in Terragrunt configuration files (HCL files). I prefer a folder structure where I have the Terraform infrastructure code in the project’s infrastructure folder and Terragrunt configurations in the deployments folder.

A typical Terragrunt configuration setup is to save environment-specific configuration files in a three-level folder structure that describes the AWS Accounts, AWS Regions under the accounts, and deployable environments under the regions. Each folder level can contain Terragrunt configuration files for account, region, or environment-specific variables.

The following figure illustrates an imaginary deployments folder setup where one production account has production environments both in the us-east-1 (N. Virginia) and eu-central-1 (Frankfurt) regions and an additional demo environment in the eu-central-1 region. There’s also a staging environment on its own account utilizing the eu-central-1 region and a dev account with a similar setup. As I’m working with both feature development and performance testing, I have two environments deployed on my sandbox account. The sandbox-pekka-perftest environment has production-kind infrastructure resources configured on it — for example, having larger Fargate clusters or heavier EC2 instances with provisioned IOPS SSD volumes. As the performance testing environment has more resources, it also generates more costs. Therefore it is easy for me to tear down the stack when the test session ends and re-deploy it again when needed without affecting the standard sandbox environment residing on the same account and region.

An example deployment folder structure for Terragrunt

Install Terraform and Terragrunt

First, install Terraform on your desktop. I have macOS, and I like to use the Homebrew package manager (https://brew.sh/) for the job, so the commands below use brew. There are multiple ways to install the Terraform on different operating systems, and you can find the right ingredients from https://www.terraform.io/.

Here are the magic commands for macOS/Homebrew installation of Terraform:

brew tap hashicorp/tap
brew install hashicorp/tap/terraform
terraform -version

If you get a Terraform version info, you have succeeded in the installation.

And next one goes for Terragrunt (more installation options at https://terragrunt.gruntwork.io/):

brew install terragrunt
terragrunt -version

If you get the Terragrunt version number, you are good to go to the next phase.

The infrastructure with Terraform IaC

Let’s first create our infrastructure code. In our example I have a small stack including only one S3 Bucket with data encryption utilizing a customer-managed KMS Key. I have included only a few lines of Terraform code here to point out how Terraform works with Terragrunt. If you wish to try out with my example code, you can find the whole Terraform code example in the GitHub repository.

First, create a new folder on your computer for our project (you can name it as you wish) and create a folder infrastructure in the project root. Under the newly created folder, add all needed Terraform files. For example here is the code for the S3 bucket configurations (in file infrastructure/s3.tf):

/******************************
S3 bucket with encryption and public access block configurations 
******************************/
# The S3 bucket
resource "aws_s3_bucket" "demo_bucket" {
  bucket_prefix = "${var.name}-${var.environment}-demo-bucket"
}
# Let's make the bucket private
resource "aws_s3_bucket_acl" "demo_bucket_acl" {
  bucket = aws_s3_bucket.demo_bucket.id
  acl    = "private"
}
/******************************
More configurations in the actual file, please check the Github repo.
******************************/

As you see, I have used Terraform variables name and environment in the bucket’s prefix. That will let us automatically change the bucket prefix per environment. And as you might already guess, those variables are managed with Terragrunt.

In addition to utilizing variables in Terraform code, we need to define the variables from Terraform perspective. Create a file named infrastructure/vars.tf:

/******************************
Variables to be used with the infrastructure code
******************************/
variable "name" {
  type        = string
  description = "Name of the company or the platform to build, etc." 
}
variable "environment" {
  type        = string
  description = "Name of the environment/stack"
}
/******************************
More variables in the actual file, please check the Github repo
******************************/

Next to the main course!

Set up Terragrunt configurations

Let’s start the Terragrunt part by creating a folder named deployments in the project root. As you remember from the DRY chapter, a typical Terragrunt configuration set has a three-level folder structure: account/region/environment. So, create the folder structure under the deployments folder for your first environment. In my example, I have created a folder structure deployments/sandbox-pekka/eu-central-1/sandbox-pekka.

The deployments/terragrunt.hcl file is a key configuration file for Terragrunt. Please go ahead and create the file with the following contents:

/******************************
TERRAGRUNT CONFIGURATION
******************************/
locals {
  # Load account, region and environment variables 
  account_vars      = read_terragrunt_config(find_in_parent_folders("account.hcl"))
  region_vars       = read_terragrunt_config(find_in_parent_folders("region.hcl"))
  environment_vars  = read_terragrunt_config(find_in_parent_folders("env.hcl"))
  # Extract the variables we need with the backend configuration
  aws_region      = local.region_vars.locals.aws_region
  environment     = local.environment_vars.locals.environment
  state_bucket    = local.environment_vars.locals.state_bucket
  dynamodb_table  = local.environment_vars.locals.dynamodb_table
}
/******************************
Configure the Terragrunt remote state to utilize a S3 bucket and state lock information in a DynamoDB table. 
And encrypt the state data.
******************************/
remote_state {
  backend   = "s3"
  generate  = {
    path      = "backend.tf"
    if_exists = "overwrite"
  }
  config    = {
    bucket         = "${local.state_bucket}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "${local.aws_region}"
    encrypt        = true
    dynamodb_table = "${local.dynamodb_table}"
  }
}
/******************************
Combine all account, region and environment variables as Terragrunt input parameters.
The input parameters can be used in Terraform configurations as Terraform variables.  
******************************/
inputs = merge(
  local.account_vars.locals,
  local.region_vars.locals,
  local.environment_vars.locals,
)

The deployments/terragrunt.hcl

loads account, region, and environment level variables (we will create those files next)
extracts the variables that are needed in Terragrunt backend configurations
configures the Terragrunt backend to utilize the state bucket and state lock table
merges all variables as input parameters to be fed to Terraform.

Next, let’s create the needed files for Terragrunt variables. Please create the following .hcl files and replace the folder names and variable values with your own account/region/environment information:

deployments/sandbox-pekka/account.hcl:

# Set AWS Account -wide variables
locals {
  account_name   = "sandbox-pekka"
  aws_account_id = "REPLACE_WITH_YOUR_ACCOUNT_ID"
}

deployments/sandbox-pekka/eu-central-1/region.hcl:

# Set common variables for the AWS Region
locals {
  aws_region = "eu-central-1"
}

deployments/sandbox-pekka/eu-central-1/sandbox-pekka/env.hcl:

# Set common variables for the environment
locals {
  name           = "nordhero"
  environment    = "sandbox-pekka"
  state_bucket   = "nordhero-terragrunt-demo-state-sandbox-pekka" # Replace with your preferred unique S3 bucket name 
  dynamodb_table = "nordhero-terragrunt-demo-locks-sandbox-pekka" # Replace with your preferred dynamodb table name
}

The name and environment variables will be utilized in the Terraform code created in the chapter The infrastructure with Terraform IaC. And the state_bucket and dynamodb_table values will be used to create the Terragrunt state bucket and state lock table for the environment.

And one last thing to the Terragrunt configurations. Create one more folder and file: deployments/sandbox-pekka/eu-central-1/sandbox-pekka/infra/terragrunt.hcl:

/******************************
TERRAGRUNT CONFIGURATIONS
******************************/
/******************************
Include the root terragrunt.hcl configurations gathering together
the needed variables and backend configurations
******************************/
include "root" {
  path = find_in_parent_folders()
}
locals {
  # Expose the base source path
  base_source = "${dirname(find_in_parent_folders())}/..//infrastructure"
}
# Set the location of Terraform configurations
terraform {
  source = local.base_source
}

This is the Terragrunt configuration file that we will execute later on. What it does is that it first takes in the common terragrunt.hcl configurations we created in the deployments folder and then configures the Terraform infrastructure folder to use with the deployment. As you see, it is possible to configure different versions of Terraform code, having different base_source to use with different environments (not that DRY approach, thou). Or, more importantly, you could split your Terraform code into multiple modules and select which modules to deploy to this particular environment. For example, if your full stack includes the OpenSearch service (formerly ElasticSearch), but you don’t need OpenSearch in your development sandbox, you could choose to deploy all other modules but not to deploy the module that contains OpenSearch configurations.

Now we are ready. You should now have the following kind of folder structure and following files in your deployments folder:

deployments/terragrunt.hcl
deployments/sandbox-pekka/account.hcl
deployments/sandbox-pekka/eu-central-1/region.hcl
deployments/sandbox-pekka/eu-central-1/sandbox-pekka/env.hcl
deployments/sandbox-pekka/eu-central-1/sandbox-pekka/infra/terragrunt.hcl

Deploying with Terragrunt

Deploying the infrastructure with Terragrunt is very similar to deploying with Terraform. You have the same commands in use. The main difference is that you need to run the Terragrunt commands from the respective deployments/your_account/your_region/your_env/infra folder you wish to deploy. So please cd to your infra folder:

cd deployments/sandbox-pekka/eu-central-1/sandbox-pekka/infra

Before running any Terragrunt commands, we must ensure we have successfully connected to the right AWS Account with AWS CLI. If you don’t already have the AWS CLI configured, please follow the instructions: http://docs.aws.amazon.com/cli/latest/userguide/. After configuring AWS CLI, test the connection by running the following command:

aws sts get-caller-identity

You should get a response containing your UserId, AWS Account Id, and an IAM Role ARN if successfully connected. Check once more that the Account is the one you desire to deploy infrastructure to and that you have the same information in your deployments/your_account/account.hcl file.

Next, we are ready to rock ‘n roll. Let’s first initialize the Terragrunt:

terragrunt init

When running the init for the first time, Terragrunt recognizes that the S3 state bucket does not yet exist and asks whether Terragrunt should create the bucket for you. Please allow Terragrunt to create the state bucket. Terragrunt will automatically create also the state lock DynamoDB table at the same time. When everything is ready, you should receive a green response stating that the backend has been successfully configured to use S3 and that Terraform has been successfully initialized to use the hashicorp/aws provider plugin.

Next, let’s plan our deployment and save the plan in the tfplan file:

terragrunt plan -out tfplan

Terragrunt will now describe the resources it plans to create. Please check that there are six resources to be added and that no errors or warnings have been raised. If everything looks ok, we can next deploy the infrastructure plan saved in the tfplan file:

terragrunt apply "tfplan"

You should now get a green message stating “Apply complete!” with the number of resources created, changed and destroyed.

Congratulations! You have now successfully set up Terragrunt and deployed your first infrastructure stack.

Deploying the infrastructure to another region

Now to the desserts. The benefits of Terragrunt start accumulating when setting up the next copy of the infrastructure. Let’s assume we would like to set up the same infrastructure for the same account but in a different region. The greatness is that we don’t need to touch the infrastructure/*.tf files at all.

What we need to do is:

Copy the current deployments/your_account/your_region folder and rename the copied folder with the new region name, e.g. eu-north-1
Edit the deployments/your_account/new_region/region.hcl file and replace aws_region value with the new region name
Edit deployments/your_account/new_region/your_environment/env.hcl file and replace the state_bucket, and dynamodb_table values with new bucket and table names to store the state of the new environment
Cd to the deployments/your_account/new_region/your_environment/infra folder and repeat the terragrunt init/plan/apply commands

Now try it yourself!

Last lines

That was a nice ride! We needed to create a bunch of configuration files. Still, in a typical development project where you have multiple sandbox environments and a deployment pipeline with development, staging, production, and demo environments, it starts to pay off quickly. And in real life, you would probably like to automate the deployment pipeline from your source code repository so that, depending on the repository branch, you would automatically select the deployment folder to use with Terragrunt deployment.

NordHero is there to help you set up the infrastructure, manage the multi-environment platforms and automate the deployments with your selected GitOPS platform. Give us a call/email/LinkedIn message if you would like to hear more!

P.S. You can download the whole demo project in the Github repository.