How I Built an AI Terraform Review Agent on Serverless AWS

🌟 Introduction

Welcome, Devs 👋

Today, we're stepping into the exciting intersection of AI, automation, and cloud infrastructure.

In this project, we'll explore how an AI-powered agent can actively participate in a real DevOps workflow, just like a senior reviewer on your team. This isn't a toy demo: it closely resembles how real-world infrastructure changes are reviewed, validated, and approved in production environments.

We'll use Terraform to provision cloud resources and GitHub Actions to automatically validate every pull request that modifies our HCL code. But here's the twist 👀

Instead of relying only on static checks, we introduce an AI agent into the pipeline.

Every infrastructure change is:

  • Scanned using Terrascan

  • Reviewed by an AI agent powered by Gemini

  • Automatically approved, approved with changes, or rejected based on risk severity

If a pull request introduces dangerous or insecure infrastructure changes, the AI agent blocks the PR, just like an automated infrastructure security reviewer.

Think of it as:

🧠 An AI-powered Infra Guardian that never gets tired of reviewing Terraform code.

So without further ado, let's dive in and see how we built an AI-driven, serverless DevOps workflow that brings intelligence directly into your CI/CD pipeline.


🌟 Pre-requisites

Before we dive deep into the implementation, let's make sure your environment is ready. This project touches multiple tools across cloud, IaC, security, and CI/CD, so having these set up beforehand will save you a lot of time.

Make sure you have the following in place:

  • AWS CLI installed and configured with an IAM user

    The IAM user should have permissions to create resources like ALB, ECS, Lambda, IAM, ACM, etc.

  • Terraform CLI installed on your system

  • GitHub account (pretty easy 😉)

  • Terrascan installed locally

    👉 Follow the official guide here

If you're completely new to AWS CLI or Terraform, don't worry. I've already written a beginner-friendly guide that walks you through everything step by step:

📘 Getting Started with Terraform (Beginner's Guide)

Once these prerequisites are fulfilled, you're all set 🚀


🌟 Why AI Agents in Modern DevOps?

The current DevOps landscape is heavily influenced by AI-driven automation. What we now call AIOps is quickly becoming a common way to deploy, monitor, and deliver software at scale.

AI agents are everywhere today, but let's address the elephant in the room: what exactly is one?

An AI agent is essentially a program that automates work which previously required human intervention. In many cases, it still follows a human-in-the-loop approach, but the heavy lifting (analysis, validation, and decision-making) is handled by the agent itself.

In this project, we'll bring that concept to life.

We'll deploy a Super Mario Bros game (containerized using Docker) on a serverless AWS architecture, leveraging services like:

  • Amazon ECS

  • AWS Lambda

  • Application Load Balancer (ALB)

  • ACM for HTTPS

  • GitHub Actions for CI/CD

This setup closely resembles a real-world production environment.

Now comes the interesting part 👀

Every time a Pull Request is raised against our Terraform codebase:

  • GitHub Actions kicks in

  • Terrascan scans our IaC for security and best-practice violations

  • The scan report is sent to an AI agent powered by Gemini

  • The AI analyzes the findings and decides whether to:

    • ✅ Approve
    • ⚠️ Approve with Changes
    • ❌ Reject the PR
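
To make the hand-off between Terrascan and the AI agent a bit more concrete, here is a minimal Python sketch of how a scan report could be summarized by severity before it is sent for review. It assumes the report follows Terrascan's JSON output shape (`results.violations`, each entry carrying a `severity` field). The function name `count_by_severity` and the sample data are hypothetical and not taken from the repository.

```python
import json
from collections import Counter


def count_by_severity(report: dict) -> Counter:
    """Tally violations per severity level, normalized to upper case."""
    # Assumed shape: {"results": {"violations": [{"severity": "MEDIUM", ...}]}}
    violations = (report.get("results") or {}).get("violations") or []
    return Counter(str(v.get("severity", "UNKNOWN")).upper() for v in violations)


# Hypothetical sample report, for illustration only.
SAMPLE_REPORT = json.loads("""
{
  "results": {
    "violations": [
      {"rule_name": "exampleRuleA", "severity": "MEDIUM", "resource_type": "aws_lb_listener"},
      {"rule_name": "exampleRuleB", "severity": "medium", "resource_type": "aws_lb"},
      {"rule_name": "exampleRuleC", "severity": "LOW", "resource_type": "aws_vpc"}
    ]
  }
}
""")

if __name__ == "__main__":
    print(dict(count_by_severity(SAMPLE_REPORT)))
```

A summary like this is handy in the PR comment even if the full report still goes to the model.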

In a real-world DevOps workflow, this kind of system can save hours of manual review, reduce human error, and provide actionable remediation suggestions along with architectural risk insights.

Think of it as an automated Infrastructure Reviewer, one that never gets tired and scales with your team.


🌟 Practical Demonstration: Building the AI-Powered DevOps Workflow

Enough theory. Let's get our hands dirty and see this system in action.

To get started, head over to the following GitHub repository, fork it under your own GitHub username, and then clone it locally:

👉 Repository: https://github.com/Pravesh-Sudha/ai-devops-agent

git clone https://github.com/<your-username>/ai-devops-agent.git
cd ai-devops-agent

Now navigate into the main project directory:

cd terraform-review-agent

Open the project in VS Code (or your favorite editor). You'll notice two main subdirectories:

terraform-review-agent/
├── lambda/
└── terraform/

  • lambda/ → Contains the AI review Lambda function

  • terraform/ → Contains all infrastructure provisioning code

Let's walk through the Terraform configuration piece by piece.

🧩 Terraform Code Breakdown

🔹 provider.tf

Defines AWS as the cloud provider:

  • AWS provider version: 6.26.0

  • Region: us-east-1

This ensures consistent provider behavior across environments.

🔹 backend.tf

We store Terraform state remotely in Amazon S3, a production best practice.

use_lockfile = true

This enables S3-native state locking without a DynamoDB table: Terraform writes a lock file next to the state object, which keeps concurrent runs from corrupting the state. The option requires Terraform 1.10 or newer.

🔹 variables.tf

Only two variables are required:

  • project_name → fixed as mario-game

  • gemini_api_key → passed dynamically (never hardcoded)

This ensures our API key remains secure and out of version control.
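
One way to pass the key dynamically without typing it on the command line (where it can end up in shell history) is Terraform's `TF_VAR_<name>` environment variable convention. The sketch below is a small Python wrapper, assuming the variable is called `gemini_api_key` as in variables.tf. The helper names and the `GEMINI_API_KEY` source variable are hypothetical.

```python
import os
import subprocess
from typing import List, Optional


def terraform_env(api_key: str, base_env: Optional[dict] = None) -> dict:
    """Return a copy of the environment with the key set as TF_VAR_gemini_api_key."""
    if not api_key:
        raise ValueError("gemini_api_key must not be empty")
    env = dict(os.environ if base_env is None else base_env)
    # Terraform maps TF_VAR_<name> onto the input variable <name>.
    env["TF_VAR_gemini_api_key"] = api_key
    return env


def run_terraform(args: List[str], api_key: str) -> int:
    """Run `terraform <args>` with the key supplied through the environment."""
    completed = subprocess.run(["terraform", *args], env=terraform_env(api_key))
    return completed.returncode


if __name__ == "__main__":
    # GEMINI_API_KEY is a hypothetical name for wherever you keep the key locally.
    raise SystemExit(run_terraform(["plan"], os.environ.get("GEMINI_API_KEY", "")))
```

The `-var` flag used later in this post works just as well; this is only an alternative if you prefer to keep the value out of the command itself.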

🔹 outputs.tf

Provides useful runtime information after provisioning:

  • ALB DNS name (where the game runs)

  • ACM certificate ARN (used later for HTTPS)

🔹 networking.tf

Instead of using the default VPC, we create our own VPC using the official AWS VPC module:

  • Two public subnets

  • Clean network isolation

  • Better control and scalability

🔹 security.tf

Security is handled via two separate security groups:

  • ALB Security Group

    • Allows inbound traffic from anywhere (port 80 initially)
  • ECS Task Security Group

    • Only allows traffic from the ALB

This follows the least privilege principle.

(We later extend this to support HTTPS on port 443.)

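🔹 secrets.tf

The Gemini API key is stored in AWS Secrets Manager.

No plaintext secrets in the code or the repository. Keep in mind that a value passed in through a Terraform variable is usually also recorded in the state file, so the S3 state bucket should stay private.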
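
At runtime, the Lambda has to read the key back from Secrets Manager. Here is a minimal sketch of what that lookup could look like, assuming a client that behaves like `boto3.client("secretsmanager")` and its `get_secret_value` call. The function name, the fake client, and the secret name `mario-game/gemini-api-key` are hypothetical; the repository's actual code may differ.

```python
def load_gemini_api_key(client, secret_id: str) -> str:
    """Fetch a plaintext secret through a Secrets Manager style client."""
    # `client` is expected to behave like boto3.client("secretsmanager").
    response = client.get_secret_value(SecretId=secret_id)
    secret = response.get("SecretString")
    if not secret:
        raise RuntimeError(f"Secret {secret_id!r} has no SecretString value")
    return secret


class FakeSecretsClient:
    """In-memory stand-in so the sketch runs without AWS credentials."""

    def __init__(self, secrets: dict):
        self._secrets = secrets

    def get_secret_value(self, SecretId: str) -> dict:
        return {"Name": SecretId, "SecretString": self._secrets.get(SecretId, "")}


if __name__ == "__main__":
    fake = FakeSecretsClient({"mario-game/gemini-api-key": "dummy-key"})
    print(load_gemini_api_key(fake, "mario-game/gemini-api-key"))
```

Inside a real Lambda you would pass `boto3.client("secretsmanager")` instead of the fake client. Injecting the client keeps the function easy to test locally.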

🧠 The AI Brain: Lambda Function

🔹 lambda.tf

This file defines a Python-based AWS Lambda function responsible for reviewing Terrascan findings and acting as a CI/CD security gate.

At the heart of this Lambda is a carefully crafted prompt:

import json  # needed for json.dumps in the prompt below


def build_prompt(findings: dict) -> str:
    return f"""
You are a senior DevOps and Terraform security reviewer acting as a CI/CD security gate.

Your task is to analyze Terrascan findings and decide whether the infrastructure
can be deployed based on **risk thresholds**, not perfection.

Decision Policy (STRICT)
- REJECT if:
  - Any HIGH or CRITICAL severity issue exists
  - OR MEDIUM severity issues ≥ 4
  - OR Application Load Balancer has **no HTTPS listener at all**
- APPROVE_WITH_CHANGES if:
  - MEDIUM severity issues are 1–3
- APPROVE if:
  - Only LOW or INFO issues exist

Output Format
Provide:
1. 🚨 Security issues ordered by severity (summary only)
2. 🛠 Required remediation (only actionable items)
3. ⚖️ Risk justification (1–2 lines)
4. 📌 Final verdict: APPROVE | APPROVE_WITH_CHANGES | REJECT

Rules:
- Be concise
- Use bullet points
- Focus on AWS (ALB, ECS, VPC, IAM)
- Ignore Terrascan scan_errors
- Do NOT repeat raw JSON
- Verdict must strictly follow the Decision Policy

Findings:
{json.dumps(findings, indent=2)}
"""

This prompt is designed to ensure:

  • Security is enforced pragmatically

  • No false rejections for minor issues

  • HTTPS is mandatory for approval

  • Clear, actionable feedback for developers
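
The Decision Policy in the prompt is written in natural language, and a model may not always follow it to the letter. As a cross-check, the same thresholds can be expressed deterministically. The sketch below is not part of the repository: `decide` is a hypothetical helper that takes severity counts (however you compute them) and a `has_https_listener` flag that is assumed to be determined elsewhere.

```python
from typing import Mapping

APPROVE = "APPROVE"
APPROVE_WITH_CHANGES = "APPROVE_WITH_CHANGES"
REJECT = "REJECT"


def decide(severity_counts: Mapping[str, int], has_https_listener: bool) -> str:
    """Apply the prompt's Decision Policy to pre-computed inputs."""
    high = severity_counts.get("HIGH", 0) + severity_counts.get("CRITICAL", 0)
    medium = severity_counts.get("MEDIUM", 0)

    # REJECT: any HIGH/CRITICAL, four or more MEDIUM, or no HTTPS listener.
    if high > 0 or medium >= 4 or not has_https_listener:
        return REJECT
    # APPROVE_WITH_CHANGES: one to three MEDIUM issues.
    if 1 <= medium <= 3:
        return APPROVE_WITH_CHANGES
    # APPROVE: only LOW or INFO issues remain.
    return APPROVE


if __name__ == "__main__":
    print(decide({"MEDIUM": 3}, has_https_listener=False))
    print(decide({"MEDIUM": 3}, has_https_listener=True))
```

Comparing this result with the verdict the model returns is a cheap way to spot cases where the AI drifted from the policy.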

🔹 iam.tf

IAM roles and policies are defined here:

  • Lambda is granted access to Secrets Manager

  • The ECS task execution role attaches:

    • AmazonECSTaskExecutionRolePolicy

This managed policy lets ECS pull container images and write logs on the task's behalf.

🔹 ecs.tf

This is where the Mario game comes to life:

  • ECS task definition using Fargate

  • Docker image for Super Mario Bros

  • ECS service to keep the task running

Fully serverless. No EC2 management required.

🔹 alb.tf

To expose the application publicly:

  • Application Load Balancer

  • Listener on port 80 (initially)

  • Target group pointing to ECS tasks

Later, we enhance this with HTTPS + ACM, making the setup production-ready.

🚀 Provisioning the Infrastructure

Before running Terraform, we need to create the S3 bucket for state storage:

aws s3 mb s3://pravesh-terraform-mario-state

⚠️ S3 bucket names are globally unique. If you see BucketAlreadyExists, simply:

  • Update the bucket name in backend.tf

  • Re-run the command with a unique name
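
If you want to avoid the name collision up front, a random suffix usually does the trick. Here is a small sketch that generates a candidate name and checks it against the basic S3 naming rules (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit). The helper `unique_bucket_name` and the default prefix are hypothetical.

```python
import re
import uuid

# Basic S3 bucket naming rules: 3-63 chars, lowercase letters, digits, dots,
# hyphens, and a letter or digit at both ends.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def unique_bucket_name(prefix: str = "terraform-mario-state") -> str:
    """Build a bucket name with a random 8-character suffix."""
    suffix = uuid.uuid4().hex[:8]
    name = f"{prefix}-{suffix}".lower()
    if not _BUCKET_RE.match(name):
        raise ValueError(f"{name!r} is not a valid S3 bucket name")
    return name


if __name__ == "__main__":
    print(unique_bucket_name())
```

Whatever name you end up with, remember to use the same one in backend.tf and in the cleanup commands at the end of this post.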

Now initialize Terraform:

cd terraform
terraform init

Gemini API Key Setup

Head over to Google AI Studio and generate a free Gemini API key.

Once you have it, keep it safe. We'll pass it dynamically to Terraform.

Plan & Apply

Preview the infrastructure:

terraform plan -var="gemini_api_key=<YOUR_GEMINI_API_KEY>"

Review the plan and then deploy:

terraform apply -var="gemini_api_key=<YOUR_GEMINI_API_KEY>" -auto-approve

⏱️ Provisioning takes around 5–7 minutes, mainly due to ALB setup.

🎮 Final Result

Once Terraform finishes:

  • Copy the ALB DNS name from the outputs

  • Open it in your browser

🎉 You should now see the Super Mario Bros game running on ECS, backed by a serverless AWS architecture and guarded by an AI-powered DevOps review system.


🌟 Terraform AI Review Agent in Action

Now comes the most exciting part: seeing the Terraform AI review agent in action.

Let's simulate a real-world scenario by making a small change to our infrastructure code and opening a Pull Request. As soon as we do this, our GitHub Actions workflow will automatically kick in and run the AI-based review.

Triggering the AI Review

Make a minor change in the Terraform code and raise a Pull Request. Once the pipeline runs, you'll notice that the workflow fails ❌.

Why did this happen?

If you check the violation report, you'll see that the AI agent rejected the changes. The reason is simple and important:

  • Three MEDIUM-severity issues are related to the Application Load Balancer

  • Our application is currently running only on HTTP

  • Running production workloads over HTTP is not secure

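Because our AI agent follows a strict policy (defined in the Lambda prompt), the absence of an HTTPS listener on the ALB results in a PR rejection. Three MEDIUM findings on their own would only have produced APPROVE_WITH_CHANGES; it is the missing HTTPS listener that triggers REJECT.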

This is exactly how a real-world AI-powered infrastructure gate should behave.
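
For the workflow to actually fail, something has to turn the model's free-text review into a pass or fail signal. Here is one way that step could look, as a sketch only: it assumes the review ends with a "Final verdict: ..." line as requested in the prompt, and that APPROVE_WITH_CHANGES should not block the PR. Both helper names are hypothetical, and the repository may handle this differently.

```python
import re
from typing import Optional

# Longest alternative first so APPROVE_WITH_CHANGES is not read as APPROVE.
_VERDICT_RE = re.compile(
    r"Final verdict\W*(APPROVE_WITH_CHANGES|APPROVE|REJECT)\b", re.IGNORECASE
)


def extract_verdict(review_text: str) -> Optional[str]:
    """Return the last verdict found in the review, or None if there is none."""
    matches = _VERDICT_RE.findall(review_text)
    return matches[-1].upper() if matches else None


def exit_code_for(verdict: Optional[str]) -> int:
    """Map a verdict to a process exit code for the CI step."""
    # Fail closed: a missing or unknown verdict blocks the PR, same as REJECT.
    return 0 if verdict in ("APPROVE", "APPROVE_WITH_CHANGES") else 1


if __name__ == "__main__":
    sample = "- ALB has no HTTPS listener\nFinal verdict: REJECT"
    verdict = extract_verdict(sample)
    print(verdict, exit_code_for(verdict))
```

Failing closed is a deliberate choice here: if the model's answer cannot be parsed, it is safer to block the PR than to let it through.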

Fixing the Issue: Enabling HTTPS 🔒

To resolve this, we'll enable HTTPS by creating an ACM certificate and updating our ALB configuration.

Step 1: Update Security Group Rules

Inside security.tf, uncomment the ingress rule for port 443 so that HTTPS traffic is allowed.

Step 2: Enable HTTPS Listener on ALB

Open alb.tf and do the following:

  • Uncomment the aws_lb_listener "https" block

  • Uncomment the ACM certificate resource

  • In the existing app_listener (HTTP listener), remove the forward action by deleting this block:

default_action {
  type             = "forward"
  target_group_arn = aws_lb_target_group.app_tg.arn
}

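This ensures HTTP is no longer used for forwarding traffic directly. An ALB listener always needs a default action, so the HTTP listener should be left with a non-forwarding one, typically a redirect to HTTPS.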

Step 3: Update Domain Name in ACM Certificate

Inside the ACM certificate resource:

  • Replace praveshsudha.com with your own domain name

  • This is required because you'll be adding CAA and CNAME records for certificate validation

Step 4: Add CAA Record (IMPORTANT ⚠️)

Before creating the ACM certificate, make sure to add the following CAA record in your DNS provider:

  • Type: CAA

  • Name: @

  • Flag: 0

  • Tag: issue

  • CA Domain: amazonaws.com

  • TTL: Default

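⚠️ Important: Add this CAA record before applying Terraform, otherwise ACM certificate issuance may fail. (Strictly speaking, this matters when your domain already has CAA records that do not list an Amazon CA; a domain with no CAA records at all does not restrict issuance.)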

Step 5: Enable ACM Output

In outputs.tf, uncomment the output block for acm_certificate_arn.

This will help us fetch validation details later.

Step 6: Apply the Changes

Run the following command:

terraform apply -var="gemini_api_key=<YOUR_GEMINI_API_KEY>" -auto-approve

This will:

  • Create the ACM certificate

  • Add an HTTPS listener to the ALB

Once completed, Terraform will output the ACM certificate ARN.

Step 7: Validate the ACM Certificate

Use the ARN and run:

aws acm describe-certificate \
  --certificate-arn arn:aws:acm:us-east-1:<ACCOUNT_ID>:certificate/<CERT_ID>

From the output:

  • Copy the CNAME name (only the part before your domain, i.e. up to mario, since most DNS providers append the domain automatically)

  • Copy the CNAME value

Add this CNAME record to your DNS provider.

Within a few minutes, the certificate status should change to ISSUED ✅.
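
If you would rather not pick the values out of the JSON by hand, the lookup can be scripted. The sketch below assumes the output of `aws acm describe-certificate` has been parsed into a dict and that it contains `Certificate.DomainValidationOptions[].ResourceRecord` entries. The function name and the sample values are hypothetical.

```python
from typing import Dict, List


def validation_records(describe_output: dict, zone: str) -> List[Dict[str, str]]:
    """Return the DNS records ACM asks for, with host names relative to `zone`."""
    options = describe_output.get("Certificate", {}).get("DomainValidationOptions", [])
    suffix = "." + zone.strip(".")
    records = []
    for option in options:
        rr = option.get("ResourceRecord")
        if not rr:  # not present until ACM has generated the record
            continue
        fqdn = rr["Name"].rstrip(".")
        # Strip the zone so the host part matches what DNS providers expect.
        host = fqdn[: -len(suffix)] if fqdn.endswith(suffix) else fqdn
        records.append({"type": rr["Type"], "host": host, "value": rr["Value"]})
    return records


# Hypothetical sample output, for illustration only.
SAMPLE_OUTPUT = {
    "Certificate": {
        "DomainValidationOptions": [
            {
                "DomainName": "mario.example.com",
                "ResourceRecord": {
                    "Name": "_abc123.mario.example.com.",
                    "Type": "CNAME",
                    "Value": "_def456.acm-validations.aws.",
                },
            }
        ]
    }
}

if __name__ == "__main__":
    for record in validation_records(SAMPLE_OUTPUT, "example.com"):
        print(record)
```

The `host` value is the shortened name described above (everything up to mario), ready to paste into your DNS provider.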

Step 8: Point Your Domain to the ALB

Now create a DNS record:

  • Type: CNAME

  • Name: mario

  • Target: <YOUR_ALB_DNS_NAME>

  • TTL: Default

After a few minutes, your application will be live at:

👉 https://mario.your-domain.com

Re-running the AI Review ✅

Now that HTTPS is enabled, let's test the AI agent again.

Run the following commands:

git checkout -b test
git add outputs.tf security.tf alb.tf
git commit -m "testing ai-agent-workflow"
git push origin test

Go to your GitHub repository and open a Pull Request.

This time:

  • GitHub Actions runs successfully

  • Terrascan reports are generated

  • Gemini analyzes the findings

  • ✅ AI agent APPROVES the PR


🌟 Cleaning Up Resources

Once you're done experimenting with the project, it's very important to clean up all the resources to avoid any unnecessary AWS charges.

Follow the steps below in order to safely delete everything we created.

Step 1: Destroy Terraform Resources

First, navigate to the terraform directory and run:

terraform destroy -auto-approve -var="gemini_api_key=<YOUR_GEMINI_API_KEY>"

This command will:

  • Terminate ECS services and tasks

  • Delete the Application Load Balancer

  • Remove Lambda functions and IAM roles

  • Clean up networking components like VPCs, subnets, and security groups

Step 2: Delete the Terraform State Files from S3

Once Terraform has destroyed all the resources, delete the remote state files stored in S3.

aws s3 rm s3://pravesh-terraform-mario-state --recursive

This removes all objects inside the bucket, including the Terraform state file.

Step 3: Remove the S3 Bucket

Finally, delete the empty S3 bucket:

aws s3 rb s3://pravesh-terraform-mario-state


🌟 Conclusion

This project goes far beyond deploying a Super Mario game on AWS: it represents how modern DevOps is evolving with AI and serverless architectures.

By integrating Terraform, GitHub Actions, Terrascan, and Gemini, we built an AI-powered Terraform review agent that acts as a real CI/CD security gate. Every infrastructure change is evaluated based on risk, not guesswork. The AI summarizes security findings, suggests concrete remediations, and makes approval decisions that closely resemble how a senior DevOps engineer would review production infrastructure.

On the infrastructure side, we embraced a serverless-first approach using Amazon ECS on Fargate, Lambda, ALB, and managed cloud services. This setup reflects real-world architectures used in production today: scalable, cost-efficient, and operationally simple, without managing servers manually.

The key takeaway from this project is clear:

AI in DevOps is not about replacing engineers; it's about empowering them.

By automating repetitive infrastructure reviews, we save valuable engineering hours, reduce human errors, and ship changes with higher confidence and security.

I highly encourage you to fork the repository, experiment with breaking changes, tune the AI decision thresholds, and extend this project further. This is just the beginning of what AI-assisted DevOps can achieve.

Happy building 🚀

🔗 Connect with me

If this project helped you learn something new, feel free to share it with your network. It truly helps a lot ❤️
