Pratik Ponde
Deploying Amazon MSK Serverless Across Multiple Environments with Terraform

👋 Hey there! This is Pratik, a Senior DevOps Consultant with a strong background in automating and optimizing cloud infrastructure, particularly on AWS. Over the years, I have designed and implemented scalable solutions for enterprises, focusing on infrastructure as code, CI/CD pipelines, cloud security, and resilience. My expertise lies in translating complex cloud requirements into efficient, reliable, and cost-effective architectures.
Through this article, I aim to share practical insights into building and managing AWS MSK Serverless using Terraform, helping fellow engineers and teams design scalable, secure, and resilient streaming architectures on AWS.


⚡ Amazon MSK: Overview and Types

AWS MSK is a fully managed service that makes it easy to run Apache Kafka on AWS without managing the infrastructure yourself. Kafka is a distributed streaming platform used for building real-time data pipelines and streaming apps.

AWS MSK offers two deployment types:

1. Amazon MSK Provisioned

You manage the cluster capacity by choosing instance types and the number of brokers. It offers more control over performance, scaling, and configuration, making it suitable for predictable workloads and production environments that need fine-tuning.

2. Amazon MSK Serverless

AWS automatically manages capacity, scaling, and broker infrastructure. You don't need to choose instance types or manage brokers. It is ideal for variable or unpredictable workloads and for teams that want minimal operational overhead.


โš™๏ธCore Components and Their Functionality

  • Broker Nodes: When you create an Amazon MSK cluster, you define the number of broker nodes per Availability Zone (minimum one per AZ). In MSK Provisioned, you can choose between Standard and Express broker types. In MSK Serverless, broker management is handled automatically, and you only configure cluster-level capacity.

  • ZooKeeper Nodes: For provisioned clusters running older Kafka versions, Amazon MSK automatically provisions Apache ZooKeeper nodes to support reliable cluster coordination.

  • KRaft Controllers: KRaft is Kafka's modern metadata management mode that replaces ZooKeeper. On supported Kafka versions, metadata is managed internally by Kafka controllers, with no additional setup or cost required.

  • Producers, Consumers, and Topics: You can use standard Kafka operations to create topics and publish or consume data.

  • Cluster Operations: You manage clusters using the AWS Console, AWS CLI, or SDKs to perform actions such as creating, updating, viewing, or deleting clusters.


🎯 Learning Objectives and Hands-On Walkthrough

This article demonstrates how to design and implement reusable Terraform modules for core infrastructure components such as the VPC and Amazon MSK, while supporting multiple environments (for example, Dev and UAT) through dedicated environment configurations using terraform.tfvars and variable definition files. It also covers configuring a remote Terraform backend using Amazon S3 to securely store and manage the Terraform state.
The solution provisions a complete Amazon MSK infrastructure, including the VPC, subnets, security groups, IAM roles, the MSK cluster, and a client EC2 instance. With a small set of Terraform commands, you can reliably create, update, and decommission the entire environment in a consistent and repeatable manner across environments.
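As a sketch of what that wiring can look like, a root configuration in an environment directory might call the two modules as shown below. The module paths, names, and variables here are illustrative assumptions, not the exact repository contents:

```hcl
# Environment/Dev/main.tf — hypothetical wiring of the reusable modules
module "vpc" {
  source             = "../../modules/vpc"
  vpc_cidr           = var.vpc_cidr
  availability_zones = var.availability_zones
}

module "msk" {
  source             = "../../modules/msk"
  cluster_name       = var.cluster_name
  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnet_ids
}
```

Each environment directory (Dev, UAT) keeps its own terraform.tfvars, so the same modules are reused with different values per environment.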

Resources Covered in This Guide:

  • A VPC with public and private subnets across three Availability Zones.
  • Full networking setup, including Internet Gateway and route tables.
  • A secure Amazon MSK Serverless cluster.
  • An EC2 instance configured with Kafka tools and authentication.
  • IAM roles and security groups to enable secure communication between the EC2 instance and the MSK cluster.

🚀 Let's begin!


📋 1. Prerequisites

  • AWS Account: Ensure you have an AWS account with programmatic access and that your AWS credentials are configured locally using the AWS CLI:

```
aws configure
```

  • S3 Bucket: An existing S3 bucket to use as the remote state backend.
  • Terraform Setup: Make sure Terraform is installed on your local environment.

Installing Terraform

Terraform is straightforward to set up. The following sections provide installation instructions for the most common operating systems.

On macOS (using Homebrew):

```
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
```

On Windows (using Chocolatey):

```
choco install terraform
```

On Linux (Debian/Ubuntu):

```
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
```

After installation, verify it's working by running:

```
terraform --version
```

💻 2. Deep Dive into the Terraform Code

The project is organized as reusable Terraform modules for provisioning the VPC and Amazon MSK Serverless, with per-environment configuration directories.

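The repository's exact layout may differ slightly, but a structure along these lines supports the module/environment split:

```
.
├── modules/
│   ├── vpc/        # VPC, subnets, gateways, route tables
│   └── msk/        # MSK Serverless cluster, security groups, IAM, EC2 client
└── Environment/
    ├── Dev/
    │   ├── backend.tf
    │   ├── main.tf
    │   └── terraform.tfvars
    └── UAT/
        ├── backend.tf
        ├── main.tf
        └── terraform.tfvars
```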
  • Core Networking Resources (VPC, Subnets, Gateways & Route Tables)

This section establishes the core network infrastructure for our environment. We begin by creating a new VPC, followed by provisioning three public and three private subnets, one in each of three AWS Availability Zones, to ensure high availability. An Internet Gateway, a NAT Gateway, and route tables are configured to provide secure internet connectivity for the EC2 instance.
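A minimal sketch of such a networking module might look like this (resource names, CIDR ranges, and variables are assumptions for illustration, not the repository's exact code):

```hcl
# Hypothetical VPC module sketch
resource "aws_vpc" "this" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true
}

# One private subnet per Availability Zone
resource "aws_subnet" "private" {
  count             = 3
  vpc_id            = aws_vpc.this.id
  cidr_block        = cidrsubnet(aws_vpc.this.cidr_block, 8, count.index)
  availability_zone = var.availability_zones[count.index]
}

resource "aws_internet_gateway" "this" {
  vpc_id = aws_vpc.this.id
}
```

Public subnets, the NAT Gateway, and the route-table associations follow the same pattern.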

  • Security Controls (Security Groups and SSH Access)

    • SSH Key: Terraform automatically generates an RSA key pair. The public key is uploaded to AWS using aws_key_pair, while the private key is stored locally as msk-client-key.pem, allowing secure SSH access to the EC2 instance.
    • Security Groups: Two security groups are created: one for the MSK cluster and one for the EC2 client. The rules allow unrestricted communication between the EC2 instance and the MSK cluster, while inbound internet access to the EC2 instance is limited to SSH traffic only.
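A sketch of that security-group pairing (names and rule granularity are assumptions):

```hcl
resource "aws_security_group" "msk" {
  name   = "msk-cluster-sg"
  vpc_id = var.vpc_id
}

resource "aws_security_group" "client" {
  name   = "msk-client-sg"
  vpc_id = var.vpc_id

  # SSH is the only inbound traffic allowed from the internet
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# Allow all traffic from the client SG into the MSK SG
resource "aws_security_group_rule" "msk_from_client" {
  type                     = "ingress"
  from_port                = 0
  to_port                  = 0
  protocol                 = "-1"
  security_group_id        = aws_security_group.msk.id
  source_security_group_id = aws_security_group.client.id
}
```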
  • IAM Permissions for EC2

Instead of embedding AWS access keys, we use an IAM role to grant permissions securely. This configuration creates an aws_iam_role that the EC2 instance can assume. An attached aws_iam_policy provides the required permissions to connect to the MSK cluster, describe resources, and read from or write to Kafka topics. This approach follows AWS security best practices and is the recommended way to manage service access.
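A hedged sketch of that role and policy follows; the actions use the AWS `kafka-cluster` IAM namespace, and `Resource = "*"` is used here only for brevity (in practice you would scope it to the cluster, topic, and group ARNs):

```hcl
resource "aws_iam_role" "msk_client" {
  name = "msk-client-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ec2.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "msk_access" {
  name = "msk-client-access"
  role = aws_iam_role.msk_client.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = [
        "kafka-cluster:Connect",
        "kafka-cluster:DescribeCluster",
        "kafka-cluster:*Topic*",
        "kafka-cluster:WriteData",
        "kafka-cluster:ReadData",
        "kafka-cluster:AlterGroup",
        "kafka-cluster:DescribeGroup"
      ]
      Resource = "*"
    }]
  })
}
```

An `aws_iam_instance_profile` then attaches the role to the EC2 instance.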

  • Amazon MSK Serverless Cluster

This resource creates an Amazon MSK Serverless cluster with a configurable name. It deploys the cluster within the specified subnets and associates it with the MSK security group for secure network access. IAM-based SASL authentication is enabled to allow secure client access using IAM roles. Resource tags are also applied for easier management and identification.
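The Terraform resource for this is `aws_msk_serverless_cluster`; a minimal configuration along the lines described above might look like this (variable names are assumptions):

```hcl
resource "aws_msk_serverless_cluster" "this" {
  cluster_name = var.cluster_name

  # Deploy into the private subnets, guarded by the MSK security group
  vpc_config {
    subnet_ids         = var.private_subnet_ids
    security_group_ids = [var.msk_security_group_id]
  }

  # IAM-based SASL authentication for client access
  client_authentication {
    sasl {
      iam {
        enabled = true
      }
    }
  }

  tags = var.tags
}
```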

  • Kafka Client EC2 Instance

This instance acts as the Kafka client for our setup. We provision a t2.micro EC2 instance and use a user_data script that runs automatically during the first boot to configure the environment.
The script performs the following tasks:

  • Installs Java

  • Downloads and extracts the required Kafka version

  • Installs the AWS MSK IAM Authentication library for secure access

  • Creates the client.properties file with the necessary configuration for IAM-based authentication.

This ensures the EC2 instance is fully prepared to connect to and interact with the MSK cluster.
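A sketch of that instance and its user_data follows. The Kafka version, download URLs, and referenced key pair/instance profile are assumptions based on the paths used later in this article; the client.properties content matches the standard MSK IAM-auth client configuration:

```hcl
resource "aws_instance" "kafka_client" {
  ami                  = var.ami_id
  instance_type        = "t2.micro"
  subnet_id            = var.public_subnet_id
  key_name             = aws_key_pair.msk_client.key_name            # defined in the SSH section
  iam_instance_profile = aws_iam_instance_profile.msk_client.name    # defined in the IAM section

  user_data = <<-EOF
    #!/bin/bash
    yum install -y java-11
    cd /home/ec2-user
    wget https://archive.apache.org/dist/kafka/3.6.0/kafka_2.13-3.6.0.tgz
    tar -xzf kafka_2.13-3.6.0.tgz
    # MSK IAM authentication library
    wget -P kafka_2.13-3.6.0/libs https://github.com/aws/aws-msk-iam-auth/releases/download/v1.1.1/aws-msk-iam-auth-1.1.1-all.jar
    # Client configuration for IAM-based SASL authentication
    cat > kafka_2.13-3.6.0/bin/client.properties <<'PROPS'
    security.protocol=SASL_SSL
    sasl.mechanism=AWS_MSK_IAM
    sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
    sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
    PROPS
  EOF
}
```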

โ˜๏ธ3. Provisioning the Infrastructure

Note: The full Terraform scripts are available on GitHub ⬇️: pratiksponde/AWS-MSK-Terraform-code — this repo contains the Terraform module code to provision the AWS VPC and MSK Serverless cluster.

Step 1: Configure the Remote Backend (Dev Environment)

The first step is to configure the remote backend for the Dev environment. This is done by updating the backend configuration under the path:
Environment → Dev → backend.tf
The following configuration uses Amazon S3 to store the Terraform state file securely and enables state locking to prevent concurrent modifications:

```
terraform {
  backend "s3" {
    bucket       = "your-bucket-name"
    key          = "msk/dev/terraform.tfstate"
    region       = "region-of-bucket"
    use_lockfile = true
  }
}
```

This setup ensures that the Terraform state is centrally managed, secure, and consistent when working across environments or teams.

Step 2: Run Terraform from the Correct Environment Directory

Before executing any Terraform commands, ensure that you are in the correct environment directory. For the Dev environment, navigate to:
Environment → Dev

  • Initialize Terraform

Run the following command to initialize the working directory. This step downloads the required AWS provider plugins and configures the backend.

```
terraform init
```
  • Plan the Deployment

This command performs a dry run and shows a detailed preview of the resources Terraform will create, modify, or delete without making any changes.

```
terraform plan
```
  • Apply the Configuration

This command executes the planned changes and provisions the resources in your AWS account. Confirm the operation by typing yes when prompted.

```
terraform apply
```

✅ 4. Testing and Verifying the Cluster Setup

After terraform apply completes successfully, follow the steps below to validate that the setup is working as expected.

Step 1: Get the EC2 Public IP

Retrieve the public IP address of the EC2 instance from the Terraform outputs.
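For reference, a hypothetical outputs.tf in the environment directory could expose the values used in this section. Note that the `aws_msk_bootstrap_brokers` data source requires a recent AWS provider (v5.x); this is an assumption about how the repository surfaces these values:

```hcl
data "aws_msk_bootstrap_brokers" "this" {
  cluster_arn = aws_msk_serverless_cluster.this.arn
}

output "ec2_public_ip" {
  value = aws_instance.kafka_client.public_ip
}

output "bootstrap_brokers" {
  value = data.aws_msk_bootstrap_brokers.this.bootstrap_brokers_sasl_iam
}
```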

Step 2: SSH into the EC2 Instance

The private key file msk-client-key.pem is saved in your project directory. Ensure it has the correct permissions:

```
chmod 400 msk-client-key.pem
```

Then connect to the instance using SSH:

```
ssh -i "msk-client-key.pem" ec2-user@<YOUR_EC2_PUBLIC_IP>
```

Step 3: Get the Bootstrap Brokers String

In your local terminal (not inside the SSH session), retrieve the MSK bootstrap brokers string:

```
terraform output bootstrap_brokers
```

Copy this value. You will use it in the Kafka commands.

Step 4: Create a Kafka Topic

Inside the EC2 SSH session, navigate to the Kafka bin directory and create a topic:

```
bin/kafka-topics.sh --create \
  --bootstrap-server <bootstrapServerString> \
  --command-config /home/ec2-user/kafka_2.13-3.6.0/bin/client.properties \
  --replication-factor 3 \
  --partitions 1 \
  --topic my-first-topic
```

Step 5: Start a Producer

In the same terminal, start the Kafka console producer:

```
bin/kafka-console-producer.sh \
  --bootstrap-server <bootstrapServerString> \
  --producer.config /home/ec2-user/kafka_2.13-3.6.0/bin/client.properties \
  --topic my-first-topic
```

You will see a > prompt. Type a message such as:

Hello from Terraform!

and press Enter.

Step 6: Start a Consumer (in a New Terminal)

Open another terminal window and SSH into the EC2 instance again. Then run the consumer:

```
bin/kafka-console-consumer.sh \
  --bootstrap-server <bootstrapServerString> \
  --consumer.config /home/ec2-user/kafka_2.13-3.6.0/bin/client.properties \
  --topic my-first-topic \
  --from-beginning
```

You should see the message "Hello from Terraform!" appear immediately.
This confirms that your MSK cluster, authentication, and connectivity are working correctly.


๐Ÿ—‘๏ธ5. Cleaning Up Resources

To avoid unnecessary AWS charges, remember to delete the infrastructure once you are done. One of the advantages of using Terraform is that cleanup can be performed with a single command.

```
terraform destroy
```

When prompted, type yes to confirm. Terraform will then safely and systematically remove all resources that were created.


💸 Cost Optimization Tips

  1. Use MSK Serverless for variable or unpredictable workloads.
  2. Delete unused topics regularly.
  3. Compress messages to minimize data transfer and storage.
  4. Avoid large message payloads where possible.

For detailed Amazon MSK pricing, see ⬇️ https://aws.amazon.com/msk/pricing/
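For tip 3, compression is enabled on the client side. With the console producer from Step 5, one way is a standard Kafka producer property (the property name is standard Kafka; choosing gzip here is just an example, and lz4 or zstd are common alternatives):

```properties
# Added to the producer configuration, e.g. in client.properties
# or via --producer-property compression.type=gzip
compression.type=gzip
```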

💡 Final Thoughts

In this article, we designed and implemented a complete Amazon MSK Serverless environment using a modular and reusable Terraform approach. By separating infrastructure into well-structured modules for VPC and MSK, and managing multiple environments such as Dev and UAT with environment-specific configurations and remote state stored in Amazon S3, we achieved a solution that is scalable, maintainable, and aligned with infrastructure-as-code best practices.

This approach not only simplifies provisioning and management but also improves consistency across environments and reduces operational risk. With automated deployment, secure authentication using IAM, and an EC2-based client for validation, you now have a solid foundation for building and operating real-time streaming solutions on AWS.

You can further extend this setup by integrating monitoring, enhancing security controls, and adding CI/CD pipelines to automate infrastructure changes.


✅ Wrapping Up

Thanks for taking the time to explore AWS MSK Serverless with Terraform! I hope this article has helped you understand how to build a scalable, secure, and maintainable streaming data infrastructure. Whether you are a seasoned engineer or just starting with AWS, applying these concepts can make a real difference in managing real-time workloads efficiently.


💬 Let's Keep the Conversation Going

Have thoughts, questions, or experience with MSK to share? I would love to hear from you! Feel free to leave a comment or connect with me on LinkedIn. Let's learn and grow together as a community of builders.

Keep exploring, keep automating, and see you in the next one!
