End-to-End DevOps: Deploying and Monitoring a Full-Stack App with Docker, Terraform, and AWS

In this article, you’ll learn how to build and deploy a production-ready full-stack application on AWS using Docker, Terraform, Prometheus, and Grafana for containerization, infrastructure provisioning, monitoring, and visualizing application metrics, respectively.

This project reflects how full-stack applications are deployed in real-world environments, from startups to large organizations. It’s ideal for full-stack developers looking to deploy their apps to the cloud, or aspiring DevOps engineers who want a hands-on understanding of how infrastructure, deployment, and monitoring all come together.

By the end of this article, you’ll walk away with both the practical skills and theoretical knowledge needed to:

  • Containerize a full-stack app
  • Provision and manage cloud infrastructure using Terraform
  • Deploy services to AWS
  • Set up full observability using Prometheus and Grafana

Prerequisites and Setup

Before you begin, ensure you have the following tools and resources configured on your local machine:

  • Docker & Docker Compose These are required to containerize and run the application services locally before deployment. Follow the official installation guide based on your operating system here.
  • Terraform Terraform will be used to provision all necessary infrastructure on AWS. You can install it by following the instructions on the official HashiCorp website.
  • AWS CLI Required to configure and authenticate your AWS credentials locally. Install it using the official AWS CLI installation guide.
  • AWS Account + Access Keys You’ll need an AWS account with programmatic access enabled. If you don’t have one, sign up here. Then generate your Access Key ID and Secret Access Key by following this step-by-step guide.
  • Git and Project Repository Clone the GitHub repository containing all the code and infrastructure configuration for this project:
  git clone https://github.com/EphraimX/aws-devops-fullstack-pipeline.git
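
With the repository cloned, make sure the AWS CLI can authenticate against your account. If you haven't configured credentials yet, aws configure stores them for you; a minimal example, where the key values are placeholders and the region matches the default used later in this project:

aws configure
# AWS Access Key ID [None]: <your-access-key-id>
# AWS Secret Access Key [None]: <your-secret-access-key>
# Default region name [None]: us-east-2
# Default output format [None]: json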

Once all tools are installed and your credentials are set, you're ready to begin deploying and monitoring the application.

Understanding the Application

The application you're deploying is a classic three-tier architecture consisting of:

  1. Frontend (Client-side) – A simple interface that collects user input.
  2. Backend (Server-side) – A FastAPI application that handles computation and database operations.
  3. Database – A PostgreSQL instance hosted on AWS RDS.

The application functions as an ROI (Return on Investment) calculator. It accepts:

  • The total cost of an investment,
  • The expected monthly profit, and
  • The number of months the profit is expected to come in.

Once the user submits these values, the frontend sends a request to the backend via the /api/calculate-roi endpoint. The backend calculates the ROI percentage and the time to break even, then sends the result back to the frontend.

Here’s the logic behind the ROI calculation:

class RoiCalculationRequest(BaseModel):
    cost: float
    revenue: float
    timeHorizon: int

@app.post("/api/calculate-roi")
async def calculate_roi(request: RoiCalculationRequest):
    """
    Calculates Return on Investment (ROI) and break-even time.
    """
    current_cost = request.cost
    current_revenue = request.revenue
    current_time_horizon = request.timeHorizon

    if any(val is None for val in [current_cost, current_revenue, current_time_horizon]) or current_time_horizon <= 0:
        raise HTTPException(status_code=400, detail="Please enter valid numbers for all fields.")

    total_revenue = current_revenue * current_time_horizon
    net_profit = total_revenue - current_cost

    calculated_roi_percent = 0.0
    if current_cost > 0:
        calculated_roi_percent = (net_profit / current_cost) * 100
    elif net_profit > 0:
        calculated_roi_percent = float('inf')

    calculated_break_even_months: Union[str, float] = "N/A"
    if current_revenue > 0:
        calculated_break_even_months = round(current_cost / current_revenue, 2)
    elif current_cost > 0:
        calculated_break_even_months = "Never"
    else:
        calculated_break_even_months = "0"

    roi_percent = "Infinite" if calculated_roi_percent == float('inf') else f"{calculated_roi_percent:.2f}%"
    break_even_months = str(calculated_break_even_months)

    return {"roiPercent": roi_percent, "breakEvenMonths": break_even_months}

Once the frontend receives the ROI and break-even results, it makes another API request to /api/recordEntry. This second endpoint records both the user inputs and the computed results into the PostgreSQL database for tracking and analysis.

Here’s the relevant backend logic:

class PaymentRecordRequest(BaseModel):
    cost: float
    revenue: float
    time_horizon: int = Field(..., alias="timeHorizon")
    roi_percent: str = Field(..., alias="roiPercent")
    break_even_months: str = Field(..., alias="breakEvenMonths")
    date: str

    class Config:
        validate_by_name = True 

@app.post("/api/recordEntry")
def record_payment(request: PaymentRecordRequest, db: Session = Depends(get_db)):
    try:
        new_record = PaymentRecord(
            cost=request.cost,
            revenue=request.revenue,
            time_horizon=request.time_horizon,
            roi_percent=request.roi_percent,
            break_even_months=request.break_even_months,
            date=request.date
        )
        db.add(new_record)
        db.commit()
        db.refresh(new_record)

        return {"success": True, "message": "Payment record saved to RDS."}
    except Exception as e:
        logging.exception(e)
        raise HTTPException(status_code=500, detail="Failed to save payment record.")
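
To sanity-check these two endpoints before containerizing anything, you could run the FastAPI app locally and hit them with curl; a rough sketch, assuming the server is listening on localhost:8000 and the input values are just examples:

# Calculate ROI
curl -X POST http://localhost:8000/api/calculate-roi \
  -H "Content-Type: application/json" \
  -d '{"cost": 5000, "revenue": 1000, "timeHorizon": 12}'
# expected response: {"roiPercent":"140.00%","breakEvenMonths":"5.0"}

# Record the entry (requires the database connection to be configured)
curl -X POST http://localhost:8000/api/recordEntry \
  -H "Content-Type: application/json" \
  -d '{"cost": 5000, "revenue": 1000, "timeHorizon": 12, "roiPercent": "140.00%", "breakEvenMonths": "5.0", "date": "2024-01-01"}'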

Below is a short demo that shows the flow of the application in action:

Application Containerization

Containerization allows developers to build and ship their applications on any machine or server without running into a myriad of issues like installation errors or dependency problems. A container packages the application together with everything it needs to run in an isolated environment, giving it a consistent, OS-like runtime regardless of the host.

The process of containerizing an application depends mainly on how the application runs, the packages it needs, and the overall architecture of your system. For this application, the frontend tier runs on Next.js, and the backend runs on FastAPI.

Here's the Dockerfile for the frontend tier:

# ---------- Builder Stage ----------
FROM node:21.5-bullseye-slim AS builder

WORKDIR /usr/src/app

# Copy package files and install dependencies
COPY package*.json ./
RUN npm install --force

# Copy application code
COPY . .

# Build-time environment variable
ARG NEXT_PUBLIC_APIURL
ENV NEXT_PUBLIC_APIURL=$NEXT_PUBLIC_APIURL

# Build the app
RUN npm run build

# ---------- Production Stage ----------
FROM node:21.5-bullseye-slim AS runner

WORKDIR /usr/src/app

# Install only production dependencies
COPY package*.json ./
RUN npm install --omit=dev --force

# Install PM2 globally
RUN npm install -g pm2

# Copy built app from builder stage
COPY --from=builder /usr/src/app/.next ./.next
COPY --from=builder /usr/src/app/public ./public
COPY --from=builder /usr/src/app/node_modules ./node_modules
COPY --from=builder /usr/src/app/package.json ./package.json
COPY --from=builder /usr/src/app/next.config.mjs ./next.config.mjs

# Expose app port
EXPOSE 3000

# Set runtime env (can override with --env at runtime)
ENV NEXT_PUBLIC_APIURL=$NEXT_PUBLIC_APIURL

# Start Next.js using PM2
CMD ["pm2-runtime", "start", "npm", "--", "start"]

From the Dockerfile above, you can see that there are two stages: the builder stage and the production stage. Dockerfiles like this are referred to as multi-stage Dockerfiles, and they're important because they reduce the size of the final image and make it more secure by shipping only the components the application actually needs, thereby reducing the attack surface.

In the builder stage of the frontend Dockerfile, you start by defining the base image, which in this case is an operating system with a version of Node.js installed. Next, you define the working directory and copy all variations of the package.json file, before running npm install --force to force install all the packages. You then set the API URL as a build-time argument and set it as an environment variable that the application can reference during build. Finally, you run npm run build to build the application.

Once that's done, you move to the production stage, where you choose a similar base image and working directory. You then copy all variations of the package.json file again and run npm install --omit=dev --force to install only production dependencies. Next, you install pm2, which is what you'll use to serve the application. You then copy all the relevant folders from the previous stage into your working directory, expose port 3000, set your runtime environment variable, and pass in the entrypoint command to start the application with PM2.
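
If you want to verify the frontend image locally before moving on, you could build and run it from the client-side directory along these lines; a sketch, where the image tag and API URL are placeholder values:

docker build \
  --build-arg NEXT_PUBLIC_APIURL=http://localhost:8000/api \
  -t roi-calculator-frontend .

docker run -d -p 3000:3000 roi-calculator-frontend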

Once this is done, you can move on to your backend Dockerfile. Here, you follow the same principle of a multi-stage Dockerfile, for the added reasons of security and image size efficiency.

Here's the Dockerfile for the backend system:

# =================================
# Stage 1: Base Dependencies
# =================================
FROM python:3.11-slim AS base

# Install system dependencies
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# =================================
# Stage 2: Dependencies Builder
# =================================
FROM base AS deps-builder

# Install build dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev \
    && rm -rf /var/lib/apt/lists/*

# Create virtual environment
RUN python -m venv /venv
ENV PATH="/venv/bin:$PATH"

# Install Python dependencies
COPY requirements.txt .
RUN pip install --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

# =================================
# Stage 3: Production
# =================================
FROM base AS production

# Install runtime dependencies only
RUN apt-get update && apt-get install -y \
    libpq5 \
    && rm -rf /var/lib/apt/lists/* \
    && groupadd -r fastapi && useradd -r -g fastapi fastapi

# Copy virtual environment
COPY --from=deps-builder /venv /venv
ENV PATH="/venv/bin:$PATH"

# Copy application
COPY . /app/

# Make scripts executable
RUN chmod +x /app/scripts/entrypoint.sh
RUN chmod +x /app/scripts/healthcheck.sh

# Set ownership
RUN chown -R fastapi:fastapi /app

USER fastapi

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD /app/scripts/healthcheck.sh

EXPOSE 8000

ENTRYPOINT ["/app/scripts/entrypoint.sh"]

Your backend Dockerfile has three stages. The first stage installs the base dependencies, the fundamental OS packages the image needs to run properly. You also set the working directory in this stage.

Next is the dependencies builder stage, where you use the base stage as your image, create a virtual environment, and install the Python dependencies into it. The last stage is the production stage, where you copy the virtual environment from the deps-builder stage, prepend /venv/bin to the PATH, and copy the entire application into the /app directory.

Next, you make both the entrypoint and healthcheck scripts executable. The healthcheck script is what your Dockerfile runs to ensure the backend system is working as expected.

Here's the content of the file:

#!/bin/bash

set -e
set -x

# Check application viability
curl --fail http://localhost:8000/api/healthcheck || exit 1

# Check database viability
curl --fail http://localhost:8000/api/dbHealth || exit 1

Here you set -e so the script exits if any command fails, and -x to print each command as it runs. Then you make requests to both the application healthcheck and the database healthcheck endpoints to ensure the main components of the backend infrastructure are running as required.

Once that’s done, you make the fastapi user the owner of the /app directory, switch to that user, and configure the container healthcheck to run the script described above. Finally, you expose port 8000 and run the entrypoint script.

Here’s the entrypoint script:

#!/bin/bash

set -e
set -x

alembic -c /app/alembic.ini upgrade head

exec uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4

Similar to the healthcheck script, you set it to fail on error and print logs as the application runs. Then you run migrations using Alembic to essentially create the tables in your PostgreSQL database, before starting the FastAPI application on host 0.0.0.0 and port 8000, with 4 workers to run concurrent processes.
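
If you'd like to test the backend image on its own, you could build it from the server-side directory and run it with the database settings it expects as environment variables; a minimal sketch with placeholder values (the variable names mirror the ones used later in the deployment scripts):

docker build -t roi-calculator-backend .

docker run -d -p 8000:8000 \
  -e DB_HOST=<a-reachable-postgres-host> \
  -e DB_PORT=5432 \
  -e DB_NAME=roi_calculator \
  -e DB_USER=postgres \
  -e DB_PASSWORD=postgres \
  -e DB_TYPE=postgresql \
  -e CLIENT_URL=http://localhost:3000 \
  roi-calculator-backend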

Monitoring Setup With Prometheus and Grafana

Monitoring is an important aspect of every application stack. It is often said that you can’t improve what you don’t measure, which is why monitoring is a core part of every application that makes it to production.

One thing to note is that there are two major components of monitoring: logs and metrics. Logs are timestamped records emitted by the different services as they run, while metrics are numerical measurements of system behavior, such as CPU usage, memory consumption, or request counts, collected over time.

There are different tools used within the industry to monitor systems, but two of the most popular tools used in tandem are Prometheus and Grafana. Prometheus is a monitoring system that collects metrics and stores them in its own time-series database. It gets this information by scraping metrics endpoints exposed by the different targets and servers it is configured to watch.

Grafana, on the other hand, is a visualization platform that uses PromQL (Prometheus Query Language) to query this information from Prometheus. This queried data can be visualized via dashboards, either created by the engineer or sourced from the community. It's generally advisable to first search if the dashboard you need already exists on the Grafana Community Dashboards before going ahead to build a custom solution.

For this article, you will monitor two important sets of metrics. First, you will monitor the host machine metrics using node_exporter, and also monitor container resources using cadvisor. Monitoring these two gives you a perspective into how your system is performing. With node_exporter, you get to see how your host machine is behaving, how your CPU and RAM are responding to the system’s workload. With cadvisor, you monitor your container resources to keep an eye on how they’re performing with the applications running in them.

To get started with monitoring, navigate to the monitoring folder and take a look at the monitoring-docker-compose.yml file, as shown below:

volumes:
  prometheus-data:
    driver: local
  grafana-data:
    driver: local

networks:
  monitoring:
    driver: bridge

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - "./prometheus.yaml:/etc/prometheus/prometheus.yml"
      - "prometheus-data:/prometheus"
    restart: unless-stopped
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
    networks:
      - monitoring

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - "grafana-data:/var/lib/grafana"
    restart: unless-stopped
    networks:
      - monitoring

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.52.1
    container_name: cadvisor
    ports:
      - "8085:8080"
    volumes:
      - /:/rootfs:ro
      - /run:/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    devices:
      - /dev/kmsg
    privileged: true
    restart: unless-stopped
    networks:
      - monitoring

  node_exporter:
    image: quay.io/prometheus/node-exporter:v1.9.1
    container_name: node_exporter
    command: "--path.rootfs=/host"
    pid: host
    restart: unless-stopped
    volumes:
      - /:/host:ro,rslave
    networks:
      - monitoring

For your monitoring file, you start out by defining the volumes for both Prometheus and Grafana, as these services require persistent volumes to store configurations and data. Next, you define the different services, including the image, container name, host and container ports, volumes, and startup commands where necessary.

For Prometheus to know which services to scrape, each one needs to be listed as a target in the prometheus.yaml file (mounted into the container as /etc/prometheus/prometheus.yml), which you can see below:

---
global:
  scrape_interval: 15s  

scrape_configs:
  # Monitor Prometheus
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']

  # Monitor EC2 Instances
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['node_exporter:9100']

  # Monitor All Docker Containers
  # cAdvisor listens on 8080 inside the container; 8085 is only the host-mapped port
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

Here, you set Prometheus up to scrape its own metrics on port 9090, node_exporter on port 9100, and cAdvisor on port 8080, the port it listens on inside the Docker network (8085 is only the host-mapped port).

Once this is done, you can now build and start your monitoring stack by running:

sudo docker compose -f monitoring-docker-compose.yml up -d prometheus cadvisor node_exporter grafana

Once successful, you should find a similar printout in your terminal:

[+] Running 30/30
 ✔ node_exporter Pulled                                                  132.6s 
 ✔ prometheus Pulled                                                     129.6s 
 ✔ cadvisor Pulled                                                        20.9s 
 ✔ grafana Pulled                                                        129.4s 


[+] Running 7/7
 ✔ Network monitoring_default           Cr...                              0.1s 
 ✔ Volume "monitoring_prometheus-data"  Created                            0.0s 
 ✔ Volume "monitoring_grafana-data"     Created                            0.0s 
 ✔ Container cadvisor                   Started                           13.6s 
 ✔ Container grafana                    Started                           13.6s 
 ✔ Container node_exporter              Start...                          13.3s 
 ✔ Container prometheus                 Started                           13.4s 
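
Before moving on to Grafana, you can confirm that Prometheus actually sees its targets, either by opening http://localhost:9090/targets in a browser or by querying its HTTP API:

curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[^"]*"'
# expect an "up" entry for each of the three jobs: prometheus, node_exporter, and cadvisor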

Once your monitoring services are running, the next step is to set up your Grafana dashboard. You can do this by heading to localhost:3000, where you’ll be met with the Grafana login page. Proceed to log in with the default username and password: admin. For security purposes, it is advisable to change this on first login.

Once logged in, the first step would be to set Prometheus as your data source:

![Select Prometheus as your Data Source]

Next, input the Prometheus URL. Because Grafana runs in a container on the same Docker network, use http://prometheus:9090 (from inside the Grafana container, http://localhost:9090 refers to Grafana itself and won't reach Prometheus).

Once successful, you should see the success message at the bottom of the page.

All that’s left to do is create the dashboards to visualize the scraped data. To do that, navigate to the /grafana-dashboards folder in monitoring and copy the contents of the node-exporter.json file in the repository. Once done, click on the “Dashboards” option on the left-hand side of the page, and click on the “New” button at the right-hand side of the screen. You will find three ways to create a dashboard — select the Import method.

![Choose the Import method to create the dashboard]

Proceed to paste the content of the node-exporter.json into the "Import via dashboard JSON model" section and click on the “Load” button.

Paste Copied Node Exporter Json

On the next page, select “Prometheus” as the data source for your dashboard. Once done, you should be met with this dashboard displaying your host machine metrics.

Node Exporter Metrics

From the image above, you can see metrics like “System Load” based on time averages, the “Root FS” usage, and a host of other insights.

For cadvisor, follow the same process to import the dashboard using the cadvisor.json, and you should get something like this:

Cadvisor Metrics

The result above shows the CPU, Memory, Network, and Misc metrics of the different running containers.

Understanding the Architecture

Before deploying any application to the cloud, it's important to have a laid-out architecture or system design for how the application will be deployed. This matters because it helps you understand what works and what doesn't, along with the trade-offs between different architectural setups. Doing this allows you to optimize your deployment based on your specific use case and take into account cost, reliability, and scalability.

Architectural Diagram

Represented below is the application's architectural diagram to be deployed on AWS:

Project Architectural Diagram

From the diagram above, you can see the different AWS services that come together to make the application production-ready. The diagram boils down to two sets of services: network and compute.

Under the network layer, you have:

  • VPC – This isolates your application from the world.
  • Internet Gateway – Attached to the VPC so it can communicate with the outside world.
  • Public Subnets – Connected to the Internet Gateway and used to house public-facing resources like your Application Load Balancer (ALB), Public EC2 instance, and NAT Gateways.
  • NAT Gateways – These allow resources in your private subnet to send traffic to the internet, while blocking incoming traffic.
  • Private Subnets – These house private services that don’t need to communicate directly with the outside world.

For your compute resources, you have:

  • Application Load Balancer (ALB) – Lives in the public subnet, provides an entry point for external users, and routes requests to the appropriate targets.
  • Bastion Host – Also in the public subnet. This allows SSH access to the private EC2 instance and also doubles as the server running the Grafana container.
  • Private EC2 Instance – Hosts both the frontend and backend containers, and also runs monitoring tools like Prometheus, Node Exporter, and cAdvisor.
  • RDS (PostgreSQL) – This is your managed database instance that holds all the application data.

Implementing the Architecture with Terraform

Terraform is an Infrastructure as Code (IaC) tool used to provision cloud resources automatically. The need for it stems from developers not wanting to "point and click" their way through resource creation, but rather have resources provisioned automatically, with a record of what changed and what didn't for effective management.

To provision your resources with Terraform, you begin by defining your providers.tf file, where you specify the providers, or "packages", that Terraform should install for your project. Here's the providers.tf file for this project:

terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      version = "6.5.0"
    }
  }
}


provider "aws" {
  region = var.aws_region
}

In your providers.tf file, you specify AWS as a required provider and set the region from the variables file.

The variables Terraform file is used to prevent repetition of variables across different files in your Terraform configuration.

In the variables.tf file for this project, as seen below, you will specify a couple of variables that will be used throughout the project.

variable "aws_region" {
  type = string
  default = "us-east-2"
}


variable "tags" {
  type = object({
    name = string
    environment = string
    developer = string
    team = string
  })

  default = {
    name = "EphraimX"
    developer = "TheJackalX"
    environment = "Production"
    team = "Boogey Team"
  }
}


variable "DB_PORT" {
  type = number
  sensitive = true
}


variable "DB_NAME" {
  type = string
  default = "roi_calculator"
}


variable "DB_USER" {
  type = string
  sensitive = true
}


variable "DB_PASSWORD" {
  type = string
  sensitive = true
}


variable "DB_IDENTIFIER" {
  type = string
  default = "roi-calculator"
}


variable "DB_INSTANCE_CLASS" {
  type = string
  default = "db.t3.micro"
}


variable "DB_ENGINE" {
  type = string
  default = "postgres"
}


variable "DB_TYPE" {
  type = string
  default = "postgresql"
}


# variable "NEXT_PUBLIC_APIURL" {
#   type = string
#   sensitive = true
# }


variable "CLIENT_URL" {
  type = string
  default = "*"
}

Specified in variables.tf are different variables such as the AWS region you'll be deploying your resources to, the tags for your resources, the database port, and other database values such as the name, user, password, instance identifier, instance class, database engine, and database type. Also included is the client URL, which is added to the list of allowed origins in the backend code to prevent CORS errors, as seen in the snippet below:

# CORS Settings
CLIENT_URL = os.getenv("CLIENT_URL", "http://localhost:3000")
ALLOWED_ORIGINS = [
    "http://localhost",
    "http://localhost:3000",  # or the port your frontend uses
    CLIENT_URL
]
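
One note before moving on: DB_USER, DB_PASSWORD, and DB_PORT are marked sensitive and have no defaults, so Terraform expects their values at plan and apply time. A common way to supply them locally without committing them anywhere is through TF_VAR_ environment variables; a sketch with placeholder values:

export TF_VAR_DB_USER="roi_admin"
export TF_VAR_DB_PASSWORD="a-strong-password"
export TF_VAR_DB_PORT=5432

Later in the article, the same values are supplied in CI through GitHub Actions secrets.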

With this in place, you can go on to build your main.tf, the file that will handle all your provisioning and the configurations.

Starting with the VPC and Subnets:

#################################################
## AWS VPC and Subnets
#################################################

resource "aws_vpc" "roi_calculator_vpc" {
  cidr_block = "10.10.0.0/16"
  tags = var.tags
}


resource "aws_subnet" "roi_calculator_public_subnet_one" {
  vpc_id = aws_vpc.roi_calculator_vpc.id
  cidr_block = "10.10.10.0/24"
  availability_zone = "us-east-2a"
  tags = var.tags
}


resource "aws_subnet" "roi_calculator_public_subnet_two" {
  vpc_id = aws_vpc.roi_calculator_vpc.id
  cidr_block = "10.10.20.0/24"
  availability_zone = "us-east-2b"
  tags = var.tags
}


resource "aws_subnet" "roi_calculator_private_subnet_one" {
  vpc_id = aws_vpc.roi_calculator_vpc.id
  cidr_block = "10.10.30.0/24"
  availability_zone = "us-east-2a"
  tags = var.tags
}


resource "aws_subnet" "roi_calculator_private_subnet_two" {
  vpc_id = aws_vpc.roi_calculator_vpc.id
  cidr_block = "10.10.40.0/24"
  availability_zone = "us-east-2b"
  tags = var.tags
}

Here, you define your VPC and the CIDR block you want it to operate on. CIDR blocks are how you define IP address ranges in your VPC; they tell AWS which IPs your network will own. Instead of assigning individual IPs, you just define a range like 10.10.0.0/16, and AWS handles the rest. With this in place, you define your subnets, both public and private, deploying them in different availability zones in order to ensure high availability.

Next, you define your internet gateway and route tables for your public subnets.

#################################################
## Internet Gateway and Route Tables - Public 
#################################################


resource "aws_internet_gateway" "roi_calculator_igw" {
  vpc_id = aws_vpc.roi_calculator_vpc.id
  tags = var.tags
}


resource "aws_route_table" "roi_calculator_route_table" {
  vpc_id = aws_vpc.roi_calculator_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.roi_calculator_igw.id
  }

  tags = var.tags 
}


resource "aws_route_table_association" "roi_calculator_route_table_association_public_subnet_one" {
  route_table_id = aws_route_table.roi_calculator_route_table.id
  subnet_id = aws_subnet.roi_calculator_public_subnet_one.id
}


resource "aws_route_table_association" "roi_calculator_route_table_association_public_subnet_two" {
  route_table_id = aws_route_table.roi_calculator_route_table.id
  subnet_id = aws_subnet.roi_calculator_public_subnet_two.id
}

An internet gateway is important when working in a VPC, as it allows resources within the VPC to communicate with the internet. Without it, your VPC is completely cut off from the internet. You also define a route table here, which sends all traffic destined for the internet (0.0.0.0/0) through your internet gateway. And lastly, for your public subnets to be truly public, you have to associate them with the route table connected to the internet gateway.

Next, you define your Elastic IP address, NAT gateway, and corresponding route tables for your private subnets.

#############################################################
## EIP, NAT Gateway, Route Tables - Private Subnet One
#########################################################


resource "aws_eip" "roi_calculator_ngw_eip_private_subnet_one" {
  domain = "vpc"
  tags = var.tags
}


resource "aws_nat_gateway" "roi_calculator_ngw_private_subnet_one" {
  subnet_id = aws_subnet.roi_calculator_public_subnet_one.id
  allocation_id = aws_eip.roi_calculator_ngw_eip_private_subnet_one.id
  tags = var.tags
  depends_on = [ aws_internet_gateway.roi_calculator_igw ]
}


resource "aws_route_table" "roi_calculator_route_table_private_subnet_one" {
  vpc_id = aws_vpc.roi_calculator_vpc.id
  route {
    cidr_block = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.roi_calculator_ngw_private_subnet_one.id
  }
  tags = var.tags
}


resource "aws_route_table_association" "roi_calculator_route_table_association_private_subnet_one" {
  subnet_id = aws_subnet.roi_calculator_private_subnet_one.id
  route_table_id = aws_route_table.roi_calculator_route_table_private_subnet_one.id
}


#############################################################
## EIP, NAT Gateway, Route Tables - Private Subnet Two
#########################################################


resource "aws_eip" "roi_calculator_ngw_eip_private_subnet_two" {
  domain = "vpc"
  tags = var.tags
}


resource "aws_nat_gateway" "roi_calculator_ngw_private_subnet_two" {
  subnet_id = aws_subnet.roi_calculator_public_subnet_two.id
  allocation_id = aws_eip.roi_calculator_ngw_eip_private_subnet_two.id
  tags = var.tags
  depends_on = [ aws_internet_gateway.roi_calculator_igw ]
}


resource "aws_route_table" "roi_calculator_route_table_private_subnet_two" {
  vpc_id = aws_vpc.roi_calculator_vpc.id
  route {
    cidr_block = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.roi_calculator_ngw_private_subnet_two.id
  }
  tags = var.tags
}


resource "aws_route_table_association" "roi_calculator_route_table_association_private_subnet_two" {
  subnet_id = aws_subnet.roi_calculator_private_subnet_two.id
  route_table_id = aws_route_table.roi_calculator_route_table_private_subnet_two.id
}

It is generally expected that resources in your private subnets have no direct exposure to the internet, as they host sensitive applications. In production, however, some of these resources still need outbound internet access for things like patches and updates.

To solve this, a NAT gateway is placed in a public subnet and attached to the private subnet through a route table. And for the NAT gateway to be able to communicate with the internet, an Elastic IP address (EIP) is attached to provide it with a static public IPv4 address for this purpose.

All these define your network resources for the project. For the compute resources, the first set of resources to define is the security groups for the EC2 servers.

#################################################
## Bastion Host Security Group
#################################################


resource "aws_security_group" "roi_calculator_bastion_host_sg" {
  name = "roi-calculator-bastion-host-sg"
  vpc_id = aws_vpc.roi_calculator_vpc.id
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_grafana_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_bastion_host_sg.id
  description = "grafana"
  cidr_ipv4 = "0.0.0.0/0"
  from_port = 3000
  to_port = 3000
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_bastion_ssh_ingress" {
  security_group_id = aws_security_group.roi_calculator_bastion_host_sg.id
  description = "ssh"
  cidr_ipv4         = "0.0.0.0/0"   # Open to the world – for production, restrict this!
  from_port         = 22
  to_port           = 22
  ip_protocol       = "tcp"
}


resource "aws_vpc_security_group_egress_rule" "bastion_host_allow_all_traffic_ipv4" {
  security_group_id = aws_security_group.roi_calculator_bastion_host_sg.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "-1"
}


#################################################
## Production Host Security Group
#################################################


resource "aws_security_group" "roi_calculator_production_host_sg" {
  name = "roi-calculator-production-host-sg"
  vpc_id = aws_vpc.roi_calculator_vpc.id
  tags = var.tags
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_production_ssh_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "ssh"
  cidr_ipv4         = aws_vpc.roi_calculator_vpc.cidr_block   # Open to the world – for production, restrict this!
  from_port         = 22
  to_port           = 22
  ip_protocol       = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_http_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "http-and-also-for-next-js"
  cidr_ipv4 = aws_vpc.roi_calculator_vpc.cidr_block
  from_port = 80
  to_port = 80
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_https_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "https"
  cidr_ipv4 = aws_vpc.roi_calculator_vpc.cidr_block
  from_port = 443
  to_port = 443
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_fastapi_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "fastapi"
  cidr_ipv4 = aws_vpc.roi_calculator_vpc.cidr_block
  from_port = 8000
  to_port = 8000
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_next_js_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "next_js"
  cidr_ipv4 = aws_vpc.roi_calculator_vpc.cidr_block
  from_port = 8081
  to_port = 8081
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_prometheus_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "prometheus"
  cidr_ipv4 = aws_vpc.roi_calculator_vpc.cidr_block
  from_port = 9090
  to_port = 9090
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_cadvisor_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "cadvisor"
  cidr_ipv4 = aws_vpc.roi_calculator_vpc.cidr_block
  from_port = 8085
  to_port = 8085
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_node_exporter_sg_ingress" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  description = "node_exporter"
  cidr_ipv4 = aws_vpc.roi_calculator_vpc.cidr_block
  from_port = 9100
  to_port = 9100
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_egress_rule" "production_host_allow_all_traffic_ipv4" {
  security_group_id = aws_security_group.roi_calculator_production_host_sg.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "-1"
}


#################################################
## Application Load Balancer Security Group
#################################################


resource "aws_security_group" "roi_calculator_alb_sg" {
  name = "roi-calculator-alb-sg"
  vpc_id = aws_vpc.roi_calculator_vpc.id
  tags = var.tags
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_alb_sg_http" {
  security_group_id = aws_security_group.roi_calculator_alb_sg.id
  description = "alb_http"
  cidr_ipv4 = "0.0.0.0/0"
  from_port = 80
  to_port = 80
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_ingress_rule" "roi_calculator_alb_sg_https" {
  security_group_id = aws_security_group.roi_calculator_alb_sg.id
  description = "alb_https"
  cidr_ipv4 = "0.0.0.0/0"
  from_port = 443
  to_port = 443
  ip_protocol = "tcp"
}


resource "aws_vpc_security_group_egress_rule" "alb_allow_all_traffic_ipv4" {
  security_group_id = aws_security_group.roi_calculator_alb_sg.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "-1"
}

A VPC security group determines what traffic is allowed in and out of an instance, from which sources, and on which ports. For the bastion host in this project, you define two ingress rules (ingress rules describe what traffic is allowed into the instance).

These rules allow traffic from any IP address to port 3000 for accessing Grafana, and port 22 for accessing the system via SSH. In the example above, access is granted from all IP addresses, but in production, limit SSH access to just your IP address. Lastly, on the bastion host, you allow traffic out of the instance to all IP addresses.

For the production host, you have a similar setup, just with more ports. The ingress rules allow 22 for SSH; 80 and 8081 for the frontend application, depending on the configuration; 8000 for the backend application; 443 for HTTPS; 9090 for Prometheus; 8085 for cAdvisor; and 9100 for Node Exporter. Lastly, the egress rule allows traffic out to all IP addresses.

Next, you define the EC2 instance for both the bastion host and production host.

#################################################
## Bastion Host 
#################################################


resource "aws_instance" "roi_calculator_bastion_host_ec2_public_subnet_one" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.xlarge"
  vpc_security_group_ids = [aws_security_group.roi_calculator_bastion_host_sg.id]
  subnet_id = aws_subnet.roi_calculator_public_subnet_one.id
  key_name = "rayda-application"
  associate_public_ip_address = true
  user_data = file("scripts/bastion-host.sh")
  tags = var.tags
}


resource "aws_instance" "roi_calculator_bastion_host_ec2_public_subnet_two" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.xlarge"
  vpc_security_group_ids = [aws_security_group.roi_calculator_bastion_host_sg.id]
  subnet_id = aws_subnet.roi_calculator_public_subnet_two.id
  key_name = "rayda-application"
  associate_public_ip_address = true
  user_data = file("scripts/bastion-host.sh")
  tags = var.tags
}


#################################################
## Production Host
#################################################


resource "aws_instance" "roi_calculator_production_host_ec2_private_subnet_one" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.xlarge"
  vpc_security_group_ids = [aws_security_group.roi_calculator_production_host_sg.id]
  subnet_id = aws_subnet.roi_calculator_private_subnet_one.id
  key_name = "rayda-application"
  associate_public_ip_address = false
  user_data = templatefile("${path.module}/scripts/production-host.sh", {
    db_host            = aws_db_instance.roi_calculator.address
    db_port            = var.DB_PORT
    db_name            = var.DB_NAME
    db_user            = var.DB_USER
    db_password        = var.DB_PASSWORD
    db_type            = var.DB_TYPE
    client_url         = var.CLIENT_URL
    MY_IP              = "127.0.0.1"
  })
  tags = var.tags
}


resource "aws_instance" "roi_calculator_production_host_ec2_private_subnet_two" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.xlarge"
  vpc_security_group_ids = [aws_security_group.roi_calculator_production_host_sg.id]
  subnet_id = aws_subnet.roi_calculator_private_subnet_two.id
  key_name = "rayda-application"
  associate_public_ip_address = false
  user_data = templatefile("${path.module}/scripts/production-host.sh", {
    db_host            = aws_db_instance.roi_calculator.address
    db_port            = var.DB_PORT
    db_name            = var.DB_NAME
    db_user            = var.DB_USER
    db_password        = var.DB_PASSWORD
    db_type            = var.DB_TYPE
    client_url         = var.CLIENT_URL
    MY_IP              = "127.0.0.1"
  })
  tags = var.tags
}

For both of these hosts, you define the properties of the EC2 instances, such as the instance type, which is set to "t3.xlarge", and the VPC security groups, which are set to the security groups you defined earlier. You create two instances of each and place them in different subnets across different availability zones. You set the key name for SSH access (this must match an EC2 key pair that already exists in your account and region), and you associate a public IP for the bastion hosts, but not for the production hosts. There's also the AMI, or Amazon Machine Image, which provides the software required to launch and boot an EC2 instance. For the AMI, you get the ID of the image you want by retrieving the image properties from AWS in data.tf, as shown below:

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  owners = ["099720109477"] # Canonical
}

As seen in the config above, you tell AWS to fetch the most recent image running Ubuntu 22.04 (Jammy). You set the virtualization type to Hardware Virtual Machine (HVM) and restrict the owner to Canonical's account ID.

Another important configuration for both the production and bastion hosts is user_data. The user_data is a set of commands you give to AWS to run at boot time. For an EC2 instance, this allows you to predefine installation steps without having to access the servers and run the installations manually.

For the bastion host, the user_data uses the file function to load the bastion-host.sh file from the scripts folder. Here are its contents:

#!/bin/bash

set -e
set -x


sudo apt update
sudo apt install -y unzip curl

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Update and install Docker prerequisites
sudo apt update -y
sudo apt install -y apt-transport-https ca-certificates curl software-properties-common git

# Add Docker GPG key and repo
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository -y "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"

# Install Docker
sudo apt update -y
sudo apt install -y docker-ce

# Verify Docker is running
sudo systemctl is-active --quiet docker && echo "Docker is running" || echo "Docker is not running"

# Allow current user to run Docker without sudo
sudo usermod -aG docker ubuntu

# Clone your repo
git clone https://github.com/EphraimX/roi-calculator.git

# Change directory
cd roi-calculator/monitoring

# Build and start the Grafana service from your Docker Compose file
sudo docker compose -f monitoring-docker-compose.yml up -d grafana

In the user_data for the bastion host, you set your initial parameters, such as set -e for exit on failure and set -x to log your process as it runs. Next, you update your packages and install much-needed packages such as the AWS CLI and Docker. Then you verify Docker is running and allow the current user to run Docker without sudo. Lastly, you clone the repository, navigate to the monitoring folder, and start the Grafana service.

The process is also very similar to the user_data of the production host, as seen below.

#!/bin/bash

set -e
set -x

sudo apt update
sudo apt install -y unzip curl

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Step 1: Request a metadata session token (valid for 6 hours)
TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Step 2: Use that token to access instance metadata securely
MY_IP=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/local-ipv4)

# Set environment variables
# export NEXT_PUBLIC_APIURL="__next_public_apiurl__"
export NEXT_PUBLIC_APIURL="http://$(curl -H "X-aws-ec2-metadata-token: $(curl -X PUT http://169.254.169.254/latest/api/token \
                                                                              -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")" \
                                                                              http://169.254.169.254/latest/meta-data/local-ipv4):8000/api"
export DB_HOST="${db_host}"
export DB_PORT="${db_port}"
export DB_NAME="${db_name}"
export DB_USER="${db_user}"
export DB_PASSWORD="${db_password}"
export DB_TYPE="${db_type}"
export CLIENT_URL="${client_url}"

# echo "NEXT_PUBLIC_APIURL=http://${MY_IP}:8000/api" >> /etc/environment
echo "DB_HOST=$DB_HOST"
echo "DB_PORT=$DB_PORT"
echo "DB_NAME=$DB_NAME"
echo "DB_USER=$DB_USER"
echo "DB_PASSWORD=$DB_PASSWORD"
echo "DB_TYPE=$DB_TYPE"
echo "CLIENT_URL=$CLIENT_URL"

# Install Docker
sudo apt update -y
sudo apt install -y apt-transport-https ca-certificates curl software-properties-common git
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository -y "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
sudo apt update -y
sudo apt install -y docker-ce
sudo systemctl is-active --quiet docker && echo "Docker is running" || echo "Docker is not running"
sudo usermod -aG docker ubuntu

# Clone repo
git clone https://github.com/EphraimX/roi-calculator.git
cd roi-calculator


# Create Docker Network
sudo docker network create roi-calculator-network

# Frontend
cd client-side
sudo docker build \
  -f Dockerfile.dev \
  --build-arg NEXT_PUBLIC_APIURL=$NEXT_PUBLIC_APIURL \
  -t roi-calculator-frontend .
sudo docker run -d --name roi-calculator-frontend --network roi-calculator-network -p 80:3000 roi-calculator-frontend

# Backend
cd ../server-side
sudo docker run -d \
  --name roi-calculator-backend \
  --network roi-calculator-network \
  -e DB_HOST=$DB_HOST \
  -e DB_PORT=$DB_PORT \
  -e DB_NAME=$DB_NAME \
  -e DB_USER=$DB_USER \
  -e DB_PASSWORD=$DB_PASSWORD \
  -e DB_TYPE=$DB_TYPE \
  -e CLIENT_URL=$CLIENT_URL \
  -p 8000:8000 \
  ephraimx57/roi-calculator-backend

# Monitoring
cd ../monitoring
sudo docker compose -f monitoring-docker-compose.yml up -d prometheus cadvisor node_exporter

Here, you perform the same initial steps as in bastion-host.sh. Beyond that, you generate NEXT_PUBLIC_APIURL, the backend API URL the frontend will communicate with. Then you export the environment variables, perform a couple of installations including Docker, and launch the frontend and backend of the application.

To launch the frontend application, you clone the repository, navigate to the "client-side" directory, and build and run its Dockerfile with the appropriate parameters. For the backend, you run the already published image "ephraimx57/roi-calculator-backend" with the environment variables that were initially passed into the script. Lastly, you navigate to the monitoring directory to start Prometheus, cAdvisor, and Node Exporter.

After the production host, the next resource for you to implement is the RDS Postgres database, which is displayed below:

#################################################
## AWS RDS
#################################################


resource "aws_db_subnet_group" "roi_calculator_rds_db_subnet_group" {
  name       = "roi-calculator-rds-db-subnet-group"
  subnet_ids = [aws_subnet.roi_calculator_private_subnet_one.id, aws_subnet.roi_calculator_private_subnet_two.id]
  tags = var.tags
}


resource "aws_security_group" "rds_sg" {
  name        = "rds-sg"
  description = "Allow Postgres inbound traffic"
  vpc_id      = aws_vpc.roi_calculator_vpc.id
  ingress {
    description = "Allow Postgres from my IP"
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [aws_subnet.roi_calculator_private_subnet_one.cidr_block, aws_subnet.roi_calculator_private_subnet_two.cidr_block]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [aws_subnet.roi_calculator_private_subnet_one.cidr_block, aws_subnet.roi_calculator_private_subnet_two.cidr_block]
  }
}


resource "aws_db_instance" "roi_calculator" {
  identifier = var.DB_IDENTIFIER
  allocated_storage = 5
  db_name = var.DB_NAME
  engine = var.DB_ENGINE
  engine_version = "17.5"
  instance_class = var.DB_INSTANCE_CLASS
  username = var.DB_USER
  password = var.DB_PASSWORD
  skip_final_snapshot = true
  port = var.DB_PORT
  publicly_accessible = false
  db_subnet_group_name = aws_db_subnet_group.roi_calculator_rds_db_subnet_group.name
  vpc_security_group_ids = [aws_security_group.rds_sg.id]
}

The RDS instance is pretty straightforward to set up. You define a "db_subnet_group", which allows for the high availability of the database. Next, you have the security group, which allows the database to be reached on port 5432 from resources within the VPC. And finally, you create the instance itself with some parameters already specified in your variables.tf file.
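
Once the database is up, you can verify connectivity from the production host (reached through the bastion), assuming the DB_* values used in the user_data script are available in your shell; a quick sketch:

sudo apt install -y postgresql-client
psql "postgresql://$DB_USER:$DB_PASSWORD@$DB_HOST:$DB_PORT/$DB_NAME" -c "\dt"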

Your last resource to create is the Application Load Balancer (ALB).

#################################################
## Application Load Balancer
#################################################


resource "aws_lb" "roi_calculator_aws_lb" {
  name               = "roi-calculator-aws-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.roi_calculator_alb_sg.id]
  subnets            = [aws_subnet.roi_calculator_public_subnet_one.id, aws_subnet.roi_calculator_public_subnet_two.id]
  tags = var.tags
}


resource "aws_lb_target_group" "roi_calculator_aws_lb_target_group" {
  name     = "roi-calc-aws-lb-tg-prisub-one"
  port     = 80
  protocol = "HTTP"
  target_type = "instance"
  health_check {
    path                = "/"
    protocol            = "HTTP"
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 5
    interval            = 30
  }
  vpc_id   = aws_vpc.roi_calculator_vpc.id
}


resource "aws_lb_listener" "roi_calculator_alb_sg_listener_private_subnet_one" {

  load_balancer_arn = aws_lb.roi_calculator_aws_lb.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.roi_calculator_aws_lb_target_group.arn
  }
}


resource "aws_lb_target_group_attachment" "roi_calculator_aws_lb_target_group_attachment_private_subnet_one" {
  target_group_arn = aws_lb_target_group.roi_calculator_aws_lb_target_group.arn
  target_id        = aws_instance.roi_calculator_production_host_ec2_private_subnet_one.id
  port             = 80
}


resource "aws_lb_target_group_attachment" "roi_calculator_aws_lb_target_group_attachment_private_subnet_two" {
  target_group_arn = aws_lb_target_group.roi_calculator_aws_lb_target_group.arn
  target_id        = aws_instance.roi_calculator_production_host_ec2_private_subnet_two.id
  port             = 80
}

The main purpose of the ALB is to balance load across multiple targets. For this application, the ALB is also useful because it provides a public DNS name for reaching an application that lives in private subnets. To define your ALB, you need a target group, which describes where traffic should be sent and how the targets should be health-checked. You also need a listener, which accepts incoming requests and forwards them to the right target group. Finally, you attach the production EC2 instances in the private subnets to the target group, which lets the ALB reach the resources hidden behind those subnets.
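
With every resource defined, you could provision the whole stack from your own machine before wiring up CI/CD; the standard workflow looks like this, run from the terraform directory with the sensitive variables supplied as shown earlier:

terraform init
terraform plan -out=tfplan
terraform apply tfplan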

CI/CD With GitHub Actions

Once your Terraform infrastructure is done, you are now ready to automate the deployment using GitHub Actions. GitHub Actions is a Continuous Integration and Continuous Deployment tool that is used by companies to automate the testing, integration, and deployment of their code. It helps streamline development and prevents the deployment of code that does not pass certain criteria or tests.

Using CI/CD in this project allows you to build, test, and automatically effect a change once you push to the Version Control System, which is GitHub in this case.

To set up GitHub Actions, create a .github/workflows folder, add a deploy.yml file to it, and paste in the following code:

name: Terraform Deploy

on:
  push:
    branches:
      - main  # or your deployment branch

jobs:
  terraform:
    name: Terraform Apply
    runs-on: ubuntu-latest

    defaults:
      run:
        working-directory: terraform

    steps:
    - name: Checkout code
      uses: actions/checkout@v3

    - name: Setup Terraform
      uses: hashicorp/setup-terraform@v3
      with:
        terraform_version: 1.8.0  # or your preferred version

    - name: Configure AWS credentials
      uses: aws-actions/configure-aws-credentials@v2
      with:
        aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
        aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        aws-region: us-east-2  # required input; match var.aws_region

    - name: Terraform Init
      run: terraform init

    - name: Terraform Validate
      run: terraform validate

    - name: Terraform Plan
      run: |
        # -var names are case-sensitive and must match the declarations in variables.tf
        terraform plan \
          -var="DB_USER=${{ secrets.DB_USER }}" \
          -var="DB_PASSWORD=${{ secrets.DB_PASSWORD }}" \
          -var="DB_PORT=${{ secrets.DB_PORT }}"

    - name: Terraform Apply
      if: github.ref == 'refs/heads/main'  # protect applies to main branch
      run: |
        terraform apply -auto-approve \
          -var="DB_USER=${{ secrets.DB_USER }}" \
          -var="DB_PASSWORD=${{ secrets.DB_PASSWORD }}" \
          -var="DB_PORT=${{ secrets.DB_PORT }}"

The GitHub Actions code above performs a set of actions once a push is made to the main branch. It first sets the runner of the GitHub Actions, which is ubuntu-latest. Next, it sets the folder "terraform" as the default working directory, checks out the code, and sets up (installs) Terraform on the runner.

To be able to deploy to AWS, you configure the runner with your AWS access key ID and secret access key, which you can generate from the "Security Credentials" page of your AWS account. Next, the workflow runs terraform init, which initializes the Terraform working directory. Then it validates the code with terraform validate, ensuring the syntax is correct. Next, it runs terraform plan, which creates a plan of the resources to be created, using variables pulled from GitHub Secrets. And finally, it applies the configuration with the same variables from GitHub Secrets.

For this to work, you need to have your secrets set up. To do that, click on the "Settings" tab of your repository:

Actions Tab

Under "Security," click on "Secrets and variables" and select "Actions"

Secrets and Variables

Lastly, create repository secrets for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, using the values from your AWS account, and for DB_USER, DB_PASSWORD, and DB_PORT, using values of your choice. For DB_PORT, it is advisable to use "5432", as that is the default PostgreSQL port.

Secrets and Variables
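
If you prefer the command line, the GitHub CLI can create the same secrets; a sketch, assuming gh is installed and authenticated against your fork of the repository, with placeholder values:

gh secret set AWS_ACCESS_KEY_ID
gh secret set AWS_SECRET_ACCESS_KEY
gh secret set DB_USER --body "roi_admin"
gh secret set DB_PASSWORD
gh secret set DB_PORT --body "5432"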

Verifying the Deployment

After completing the deployment, you need to verify that everything is running as expected.

Start by connecting to the Bastion Host using SSH:

eval "$(ssh-agent -s)"
ssh-add ~/path/to/key-pair
ssh -A ubuntu@<bastion-ip>

Once inside the Bastion Host, check that the monitoring containers are running:

docker ps

You should see the container for Grafana running.

Next, open your browser and navigate to:

http://<bastion-public-ip>:3000

This will take you to the Grafana dashboard. Log in using the default credentials (admin / admin), then go to Settings → Data Sources, and add a new Prometheus data source with the following URL:

http://<private-ec2-ip>:9090

This links Grafana to the Prometheus instance running on the private EC2 instance.

Once that's done, return to your terminal and SSH into the private EC2 instance:

ssh ubuntu@<private_ip>

Check that the frontend and backend services are running:

docker ps
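
While you're on the private instance, you can also hit the backend's health endpoints directly; these are the same endpoints the container healthcheck script calls:

curl http://localhost:8000/api/healthcheck
curl http://localhost:8000/api/dbHealth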

Lastly, open the application in your browser by visiting the ALB DNS URL. This is the public-facing entry point to your full-stack application. You should be able to load the frontend and confirm that it connects properly to the backend.

Everything should now be live and fully functional.

Conclusion

This guide walked through how to deploy a full-stack Dockerized application to AWS using Terraform. You set up the infrastructure from scratch, including the VPC, subnets, security groups, and EC2 instances, and made sure your application could run securely inside a private subnet. You also added a bastion host to help you reach the private instances when needed.

On top of that, you configured a load balancer for public access and set up monitoring using Prometheus and Grafana. With this in place, you’re able to track system metrics and confirm that everything’s running as expected.

There are several ways in which this setup can be improved, one of which is including extensive logging within the system and application, but for now, this is a good start. If you enjoyed this article, you can head on to my page to read more DevOps-related articles.
