🚀 Introduction
Welcome, Devs 👋
Today, we're stepping into the exciting intersection of AI, automation, and cloud infrastructure.
In this project, we'll explore how an AI-powered agent can actively participate in a real DevOps workflow, just like a senior reviewer on your team. This isn't a toy demo; it closely resembles how real-world infrastructure changes are reviewed, validated, and approved in production environments.
We'll use Terraform to provision cloud resources and GitHub Actions to automatically validate every pull request that modifies our HCL code. But here's the twist 👇
Instead of relying only on static checks, we introduce an AI agent into the pipeline.
Every infrastructure change is:
- Scanned using Terrascan
- Reviewed by an AI agent powered by Gemini
- Automatically approved, approved with changes, or rejected based on risk severity
If a pull request introduces dangerous or insecure infrastructure changes, the AI agent blocks the PR, just like an automated infrastructure security reviewer.
Think of it as:
🧠 An AI-powered Infra Guardian that never gets tired of reviewing Terraform code.
So without further ado, let's dive in and see how we built an AI-driven, serverless DevOps workflow that brings intelligence directly into your CI/CD pipeline.
📋 Pre-requisites
Before we dive deep into the implementation, let's make sure your environment is ready. This project touches multiple tools across cloud, IaC, security, and CI/CD, so having these set up beforehand will save you a lot of time.
Make sure you have the following in place:
- AWS CLI installed and configured with an IAM user
  - The IAM user should have permissions to create resources like ALB, ECS, Lambda, IAM, ACM, etc.
- Terraform CLI installed on your system
- GitHub account (pretty easy 😄)
- Terrascan installed locally
  - 👉 Follow the official guide here
If you're completely new to AWS CLI or Terraform, don't worry. I've already written a beginner-friendly guide that walks you through everything step by step:
👉 Getting Started with Terraform (Beginner's Guide)
Once these prerequisites are fulfilled, you're all set 🎉
🤖 Why AI Agents in Modern DevOps?
The current DevOps landscape is heavily influenced by AI-driven automation. What we now call AIOps has quietly become the de-facto standard for deploying, monitoring, and delivering software at scale.
AI agents are everywhere today, but let's address the elephant in the room.
An AI agent is essentially a program that automates work which previously required human intervention. In many cases, it still follows a human-in-the-loop approach, but the heavy lifting (analysis, validation, and decision-making) is handled by the agent itself.
In this project, we'll bring that concept to life.
We'll deploy a Super Mario Bros game (containerized using Docker) on a serverless AWS architecture, leveraging services like:
- Amazon ECS
- AWS Lambda
- Application Load Balancer (ALB)
- ACM for HTTPS
- GitHub Actions for CI/CD
This setup closely resembles a real-world production environment.
Now comes the interesting part 👇
Every time a Pull Request is raised against our Terraform codebase:
- GitHub Actions kicks in
- Terrascan scans our IaC for security and best-practice violations
- The scan report is sent to an AI agent powered by Gemini
- The AI analyzes the findings and decides whether to:
  - ✅ Approve
  - ⚠️ Approve with Changes
  - ❌ Reject the PR
In a real-world DevOps workflow, this kind of system can save hours of manual review, reduce human error, and provide actionable remediation suggestions along with architectural risk insights.
Think of it as an automated Infrastructure Reviewer, one that never gets tired and scales with your team.
🚀 Practical Demonstration: Building the AI-Powered DevOps Workflow
Enough theory; let's get our hands dirty and see this system in action.
To get started, head over to the following GitHub repository, fork it under your own GitHub username, and then clone it locally:
🔗 Repository: https://github.com/Pravesh-Sudha/ai-devops-agent
```bash
git clone https://github.com/<your-username>/ai-devops-agent.git
cd ai-devops-agent
```
Now navigate into the main project directory:
```bash
cd terraform-review-agent
```
Open the project in VS Code (or your favorite editor). You'll notice two main subdirectories:
```
terraform-review-agent/
├── lambda/
└── terraform/
```
- `lambda/` – Contains the AI review Lambda function
- `terraform/` – Contains all infrastructure provisioning code
Let's walk through the Terraform configuration piece by piece.
🧩 Terraform Code Breakdown
🔹 provider.tf
Defines AWS as the cloud provider:
- AWS provider version: 6.26.0
- Region: us-east-1
This ensures consistent provider behavior across environments.
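For reference, here's a minimal sketch of what that file can look like (the exact layout in the repo may differ slightly):
```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "6.26.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}
```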
🔹 backend.tf
We store Terraform state remotely using Amazon S3, a production best practice.
The key setting here is `use_lockfile = true`, which enables state locking without DynamoDB, preventing concurrent state corruption through Terraform's native lockfile mechanism.
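A minimal sketch of such a backend block (the bucket name matches the one we create later; the state key is illustrative):
```hcl
terraform {
  backend "s3" {
    bucket       = "pravesh-terraform-mario-state"
    key          = "mario-game/terraform.tfstate"  # illustrative state key
    region       = "us-east-1"
    use_lockfile = true  # native S3 lockfile instead of a DynamoDB table
  }
}
```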
🔹 variables.tf
Only two variables are required:
- `project_name` – fixed as `mario-game`
- `gemini_api_key` – passed dynamically (never hardcoded)
This ensures our API key remains secure and out of version control.
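A sketch of the two variable definitions (the descriptions are illustrative):
```hcl
variable "project_name" {
  description = "Name prefix applied to all resources"
  type        = string
  default     = "mario-game"
}

variable "gemini_api_key" {
  description = "Gemini API key, passed in at plan/apply time"
  type        = string
  sensitive   = true  # keeps the value out of plan output
}
```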
🔹 outputs.tf
Provides useful runtime information after provisioning:
- ALB DNS name (where the game runs)
- ACM certificate ARN (used later for HTTPS)
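A sketch of these outputs; the resource addresses are illustrative and depend on how the ALB and certificate are named in the repo:
```hcl
output "alb_dns_name" {
  description = "Public DNS name of the Application Load Balancer"
  value       = aws_lb.app_alb.dns_name
}

# Uncommented later, once the ACM certificate resource exists
# output "acm_certificate_arn" {
#   value = aws_acm_certificate.cert.arn
# }
```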
🔹 networking.tf
Instead of using the default VPC, we create our own VPC using the official AWS VPC module:
- Two public subnets
- Clean network isolation
- Better control and scalability
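A minimal sketch using the official VPC module (CIDR ranges, AZs, and the module version are illustrative):
```hcl
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.project_name}-vpc"
  cidr = "10.0.0.0/16"

  azs            = ["us-east-1a", "us-east-1b"]
  public_subnets = ["10.0.1.0/24", "10.0.2.0/24"]

  map_public_ip_on_launch = true  # Fargate tasks in public subnets need public IPs
}
```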
🔹 security.tf
Security is handled via two separate security groups:
- **ALB Security Group**
  - Allows inbound traffic from anywhere (port 80 initially)
- **ECS Task Security Group**
  - Only allows traffic from the ALB
This follows the least privilege principle.
(We later extend this to support HTTPS on port 443.)
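A condensed sketch of the two groups; the names are illustrative, and the port on the ECS group should match the container's port:
```hcl
resource "aws_security_group" "alb_sg" {
  name   = "${var.project_name}-alb-sg"
  vpc_id = module.vpc.vpc_id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Uncommented later when HTTPS is enabled
  # ingress {
  #   description = "HTTPS from anywhere"
  #   from_port   = 443
  #   to_port     = 443
  #   protocol    = "tcp"
  #   cidr_blocks = ["0.0.0.0/0"]
  # }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_security_group" "ecs_sg" {
  name   = "${var.project_name}-ecs-sg"
  vpc_id = module.vpc.vpc_id

  ingress {
    description     = "Only traffic coming from the ALB"
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```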
🔹 secrets.tf
The Gemini API key is securely stored using AWS Secrets Manager.
No plaintext secrets. No leaks. Production-safe by default.
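A minimal sketch of how that typically looks (the secret name is illustrative):
```hcl
resource "aws_secretsmanager_secret" "gemini" {
  name = "${var.project_name}-gemini-api-key"
}

resource "aws_secretsmanager_secret_version" "gemini" {
  secret_id     = aws_secretsmanager_secret.gemini.id
  secret_string = var.gemini_api_key  # supplied via -var, never committed
}
```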
🧠 The AI Brain: Lambda Function
🔹 lambda.tf
This file defines a Python-based AWS Lambda function responsible for reviewing Terrascan findings and acting as a CI/CD security gate.
At the heart of this Lambda is a carefully crafted prompt:
```python
import json


def build_prompt(findings: dict) -> str:
    return f"""
You are a senior DevOps and Terraform security reviewer acting as a CI/CD security gate.
Your task is to analyze Terrascan findings and decide whether the infrastructure
can be deployed based on **risk thresholds**, not perfection.

Decision Policy (STRICT)
- REJECT if:
  - Any HIGH or CRITICAL severity issue exists
  - OR MEDIUM severity issues ≥ 4
  - OR Application Load Balancer has **no HTTPS listener at all**
- APPROVE_WITH_CHANGES if:
  - MEDIUM severity issues are 1–3
- APPROVE if:
  - Only LOW or INFO issues exist

Output Format
Provide:
1. 🚨 Security issues ordered by severity (summary only)
2. 📝 Required remediation (only actionable items)
3. ⚖️ Risk justification (1–2 lines)
4. 🏁 Final verdict: APPROVE | APPROVE_WITH_CHANGES | REJECT

Rules:
- Be concise
- Use bullet points
- Focus on AWS (ALB, ECS, VPC, IAM)
- Ignore Terrascan scan_errors
- Do NOT repeat raw JSON
- Verdict must strictly follow the Decision Policy

Findings:
{json.dumps(findings, indent=2)}
"""
```
This logic ensures:
- Security is enforced pragmatically
- No false rejections for minor issues
- HTTPS is mandatory for approval
- Clear, actionable feedback for developers
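For completeness, here's a rough sketch of how lambda.tf could package and deploy this function; the handler name, runtime, timeout, and environment variable are assumptions, so check the repo for the exact values:
```hcl
data "archive_file" "review_agent" {
  type        = "zip"
  source_dir  = "${path.module}/../lambda"
  output_path = "${path.module}/review_agent.zip"
}

resource "aws_lambda_function" "review_agent" {
  function_name    = "${var.project_name}-review-agent"
  role             = aws_iam_role.lambda_role.arn  # defined in iam.tf
  runtime          = "python3.12"                  # illustrative
  handler          = "handler.lambda_handler"      # illustrative
  filename         = data.archive_file.review_agent.output_path
  source_code_hash = data.archive_file.review_agent.output_base64sha256
  timeout          = 60

  environment {
    variables = {
      # The function fetches the Gemini key from Secrets Manager at runtime
      GEMINI_SECRET_ARN = aws_secretsmanager_secret.gemini.arn
    }
  }
}
```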
🔹 iam.tf
IAM roles and policies are defined here:
- Lambda is granted access to Secrets Manager
- The ECS task execution role attaches `AmazonECSTaskExecutionRolePolicy`
This allows ECS to pull images, write logs, and function correctly.
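A condensed sketch of those roles and attachments (role and policy names are illustrative):
```hcl
# Role assumed by the AI review Lambda
resource "aws_iam_role" "lambda_role" {
  name = "${var.project_name}-lambda-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

# Let the Lambda read the Gemini API key from Secrets Manager
resource "aws_iam_role_policy" "lambda_secrets" {
  name = "${var.project_name}-lambda-secrets"
  role = aws_iam_role.lambda_role.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["secretsmanager:GetSecretValue"]
      Resource = aws_secretsmanager_secret.gemini.arn
    }]
  })
}

# Execution role assumed by ECS tasks (pull images, write logs)
resource "aws_iam_role" "ecs_execution" {
  name = "${var.project_name}-ecs-execution"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "ecs-tasks.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "ecs_execution" {
  role       = aws_iam_role.ecs_execution.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
```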
🔹 ecs.tf
This is where the Mario game comes to life:
- ECS task definition using Fargate
- Docker image for Super Mario Bros
- ECS service to keep the task running
Fully serverless. No EC2 management required.
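A trimmed-down sketch of those resources; the cluster name, task sizes, container port, and image reference are illustrative, so use whatever the repo's ecs.tf actually defines:
```hcl
resource "aws_ecs_cluster" "main" {
  name = "${var.project_name}-cluster"
}

resource "aws_ecs_task_definition" "mario" {
  family                   = "${var.project_name}-task"
  requires_compatibilities = ["FARGATE"]
  network_mode             = "awsvpc"
  cpu                      = "256"
  memory                   = "512"
  execution_role_arn       = aws_iam_role.ecs_execution.arn

  container_definitions = jsonencode([{
    name      = "mario"
    image     = "docker.io/<super-mario-image>"  # replace with the image used in the repo
    essential = true
    portMappings = [{
      containerPort = 80  # match the port the image listens on
      protocol      = "tcp"
    }]
  }])
}

resource "aws_ecs_service" "mario" {
  name            = "${var.project_name}-service"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.mario.arn
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = module.vpc.public_subnets
    security_groups  = [aws_security_group.ecs_sg.id]
    assign_public_ip = true
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app_tg.arn
    container_name   = "mario"
    container_port   = 80
  }
}
```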
🔹 alb.tf
To expose the application publicly:
- Application Load Balancer
- Listener on port 80 (initially)
- Target group pointing to ECS tasks
Later, we enhance this with HTTPS + ACM, making the setup production-ready.
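A minimal sketch of those three pieces; the `app_tg` and `app_listener` names match the ones referenced later in this post, everything else is illustrative:
```hcl
resource "aws_lb" "app_alb" {
  name               = "${var.project_name}-alb"
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = module.vpc.public_subnets
}

resource "aws_lb_target_group" "app_tg" {
  name        = "${var.project_name}-tg"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip"  # required for Fargate (awsvpc) tasks
}

resource "aws_lb_listener" "app_listener" {
  load_balancer_arn = aws_lb.app_alb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app_tg.arn
  }
}
```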
🏗️ Provisioning the Infrastructure
Before running Terraform, we need to create the S3 bucket for state storage:
```bash
aws s3 mb s3://pravesh-terraform-mario-state
```
⚠️ If you see `BucketAlreadyExists`, simply:
- Update the bucket name in `backend.tf`
- Re-run the command with a unique name
Now initialize Terraform:
```bash
cd terraform
terraform init
```
Gemini API Key Setup
Head over to Google AI Studio and generate a free Gemini API key.
Once you have it, keep it safe โ weโll pass it dynamically to Terraform.
Plan & Apply
Preview the infrastructure:
```bash
terraform plan -var="gemini_api_key=<YOUR_GEMINI_API_KEY>"
```
Review the plan and then deploy:
```bash
terraform apply -var="gemini_api_key=<YOUR_GEMINI_API_KEY>" -auto-approve
```
⏱️ Provisioning takes around 5–7 minutes, mainly due to ALB setup.
🎮 Final Result
Once Terraform finishes:
- Copy the ALB DNS name from the outputs
- Open it in your browser
🎉 You should now see the Super Mario Bros game running on ECS, backed by a serverless AWS architecture and guarded by an AI-powered DevOps review system.
🤖 Terraform AI Review Agent in Action
Now comes the most exciting part: seeing the Terraform AI review agent in action.
Let's simulate a real-world scenario by making a small change to our infrastructure code and opening a Pull Request. As soon as we do this, our GitHub Actions workflow will automatically kick in and run the AI-based review.
Triggering the AI Review
Make a minor change in the Terraform code and raise a Pull Request. Once the pipeline runs, you'll notice that the workflow fails ❌.
Why did this happen?
If you check the Violation report, youโll see that the AI agent rejected the changes. The reason is simple and important:
- Three MEDIUM-severity issues are related to the Application Load Balancer
- Our application is currently running only on HTTP
- Running production workloads over HTTP is not secure
Because our AI agent follows a strict policy (defined in the Lambda prompt), the absence of an HTTPS listener on the ALB results in a PR rejection.
This is exactly how a real-world AI-powered infrastructure gate should behave.
Fixing the Issue: Enabling HTTPS 🔒
To resolve this, we'll enable HTTPS by creating an ACM certificate and updating our ALB configuration.
Step 1: Update Security Group Rules
Inside security.tf, uncomment the ingress rule for port 443 so that HTTPS traffic is allowed.
Step 2: Enable HTTPS Listener on ALB
Open alb.tf and do the following:
- Uncomment the `aws_lb_listener "https"` block
- Uncomment the ACM certificate resource
- In the existing `app_listener` (HTTP listener), remove the forward action by deleting this block:
```hcl
default_action {
  type             = "forward"
  target_group_arn = aws_lb_target_group.app_tg.arn
}
```
This ensures HTTP is no longer used for forwarding traffic directly.
Step 3: Update Domain Name in ACM Certificate
Inside the ACM certificate resource:
- Replace `praveshsudha.com` with your own domain name
- This is required because you'll be adding CAA and CNAME records for certificate validation
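Put together, the uncommented HTTPS pieces look roughly like the sketch below; the SSL policy and the redirect shown for the HTTP listener are assumptions, so follow the actual commented blocks in the repo:
```hcl
resource "aws_acm_certificate" "cert" {
  domain_name       = "mario.your-domain.com"  # replace with your own domain
  validation_method = "DNS"
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.app_alb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"  # illustrative policy
  certificate_arn   = aws_acm_certificate.cert.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app_tg.arn
  }
}

# On the existing HTTP listener, the forward action goes away; a common
# pattern (an assumption here) is to redirect port 80 to HTTPS instead:
#   default_action {
#     type = "redirect"
#     redirect {
#       port        = "443"
#       protocol    = "HTTPS"
#       status_code = "HTTP_301"
#     }
#   }
```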
Step 4: Add CAA Record (IMPORTANT ⚠️)
Before creating the ACM certificate, make sure to add the following CAA record in your DNS provider:
- Type: `CAA`
- Name: `@`
- Flag: `0`
- Tag: `issue`
- CA Domain: `amazonaws.com`
- TTL: Default
⚠️ Important: Add this CAA record before applying Terraform, otherwise ACM certificate creation may fail.
Step 5: Enable ACM Output
In outputs.tf, uncomment the output block for acm_certificate_arn.
This will help us fetch validation details later.
Step 6: Apply the Changes
Run the following command:
```bash
terraform apply --var="gemini_api_key=<YOUR_GEMINI_KEY>" --auto-approve
```
This will:
- Create the ACM certificate
- Add an HTTPS listener to the ALB
Once completed, Terraform will output the ACM certificate ARN.
Step 7: Validate the ACM Certificate
Use the ARN and run:
```bash
aws acm describe-certificate \
  --certificate-arn arn:aws:acm:us-east-1:<ACCOUNT_ID>:certificate/<CERT_ID>
```
From the output:
- Copy the CNAME name (only up to `mario`, not the full domain)
- Copy the CNAME value
Add this CNAME record to your DNS provider.
Within a few minutes, the certificate status will change to ISSUED ✅.
Step 8: Point Your Domain to the ALB
Now create a DNS record:
- Type: `CNAME`
- Name: `mario`
- Target: `<YOUR_ALB_DNS_NAME>`
- TTL: Default
After a few minutes, your application will be live at:
🌐 https://mario.your-domain.com
Re-running the AI Review ✅
Now that HTTPS is enabled, let's test the AI agent again.
Run the following commands:
```bash
git checkout -b test
git add outputs.tf security.tf alb.tf
git commit -m "testing ai-agent-workflow"
git push origin test
```
Go to your GitHub repository and open a Pull Request.
This time:
- GitHub Actions runs successfully
- Terrascan reports are generated
- Gemini analyzes the findings
- ✅ AI agent APPROVES the PR
🗑️ Cleaning Up Resources
Once you're done experimenting with the project, it's very important to clean up all the resources to avoid any unnecessary AWS charges.
Follow the steps below in order to safely delete everything we created.
Step 1: Destroy Terraform Resources
First, navigate to the terraform directory and run:
```bash
terraform destroy --auto-approve --var="gemini_api_key=<YOUR_GEMINI_KEY>"
```
This command will:
- Terminate ECS services and tasks
- Delete the Application Load Balancer
- Remove Lambda functions and IAM roles
- Clean up networking components like VPCs, subnets, and security groups
Step 2: Delete the Terraform State Files from S3
Once Terraform has destroyed all the resources, delete the remote state files stored in S3.
```bash
aws s3 rm s3://pravesh-terraform-mario-state --recursive
```
This removes all objects inside the bucket, including the Terraform state file.
Step 3: Remove the S3 Bucket
Finally, delete the empty S3 bucket:
```bash
aws s3 rb s3://pravesh-terraform-mario-state
```
🏁 Conclusion
This project goes far beyond deploying a Super Mario game on AWS; it represents how modern DevOps is evolving with AI and serverless architectures.
By integrating Terraform, GitHub Actions, Terrascan, and Gemini, we built an AI-powered Terraform review agent that acts as a real CI/CD security gate. Every infrastructure change is evaluated based on risk, not guesswork. The AI summarizes security findings, suggests concrete remediations, and makes approval decisions that closely resemble how a senior DevOps engineer would review production infrastructure.
On the infrastructure side, we embraced a serverless-first approach using AWS ECS Fargate, Lambda, ALB, and managed cloud services. This setup reflects real-world architectures used in production today: scalable, cost-efficient, and operationally simple, without managing servers manually.
The key takeaway from this project is clear:
AI in DevOps is not about replacing engineers; it's about empowering them.
By automating repetitive infrastructure reviews, we save valuable engineering hours, reduce human errors, and ship changes with higher confidence and security.
I highly encourage you to fork the repository, experiment with breaking changes, tune the AI decision thresholds, and extend this project further. This is just the beginning of what AI-assisted DevOps can achieve.
Happy building 🚀
🔗 Connect with me
Twitter / X: https://x.com/praveshstwt
If this project helped you learn something new, feel free to share it with your network; it truly helps a lot ❤️