Hey there! In this blog post, I’m going to walk you through how I deployed the DeepSeek Model R1 8B on AWS using Terraform and GitHub Actions. If you’ve ever tried deploying a machine learning model, you know it can get pretty complicated—especially when you’re juggling multiple AWS services. To make things easier, I decided to automate the whole process using Terraform for infrastructure as code and GitHub Actions for CI/CD. Spoiler alert: it worked like a charm!
This project involved setting up an EC2 instance, an Application Load Balancer (ALB), security groups, IAM roles, and even a custom domain using Route 53. The best part? Everything was automated, so I didn’t have to manually configure resources every time I made a change. Whether you’re a seasoned DevOps pro or just getting started with cloud deployments, I hope this walkthrough gives you some useful insights (and maybe saves you a few headaches along the way).
Table of Contents
- Introduction
- Project Overview
- Terraform Configuration
  - Provider Configuration
  - Security Groups
  - Load Balancer (ALB)
  - EC2 Instance
  - IAM Roles and Instance Profile
  - Route 53 DNS Record
  - Terraform Backend (S3)
- GitHub Actions Workflow
  - Workflow Triggers
  - Setup Job
  - Apply Job
  - Post-Apply Job
  - Destroy Job
- The Application in Action (Result)
- Challenges Faced & Lessons Learned
- Future Improvements
- Conclusion
1. Introduction
Deploying machine learning models in production can be a complex task, especially when it involves multiple AWS services. To streamline this process, I used Terraform to define the infrastructure as code and GitHub Actions to automate the deployment pipeline. This approach ensures consistency, scalability, and repeatability.
2. Project Overview
The goal of this project was to deploy the DeepSeek Model R1 on AWS, making it accessible via a web interface (OpenWebUI) and an API (Ollama). The infrastructure includes:
- EC2 Instance: Hosts the DeepSeek model in a Docker container and associated services.
- Application Load Balancer (ALB): Distributes traffic to the EC2 instance and handles SSL termination.
- Security Groups: Control inbound and outbound traffic to the ALB and EC2 instance.
- IAM Roles: Provide the necessary permissions for the EC2 instance.
- Route 53: Manages DNS records for the ALB. I used an ACM certificate I already had in us-east-1 and an existing public hosted zone in the same region.
- Terraform Backend: Stores the Terraform state file in an S3 bucket for team collaboration.
3. Terraform Configuration
The Terraform configuration is the backbone of this project. It defines all the AWS resources required for the deployment. Below is a breakdown of the key components:
Provider Configuration
The first step in the Terraform configuration is to define the AWS provider and specify the region:
provider "aws" {
region = "us-east-1"
}
Security Groups
Two security groups were created: one for the ALB and one for the EC2 instance. The ALB security group allows HTTPS traffic (port 443) and HTTP traffic (port 80) for testing purposes. It also allows TCP traffic on port 11434 for the Ollama API. The EC2 security group restricts traffic to only allow communication from the ALB on ports 8080 (OpenWebUI) and 11434 (Ollama API).
resource "aws_security_group" "alb_sg" {
name = "deepseek_alb_sg"
description = "Security group for ALB"
vpc_id = var.vpc_id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 80
to_port = 80
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
ingress {
from_port = 11434
to_port = 11434
protocol = "tcp"
cidr_blocks = ["0.0.0.0/0"]
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
}
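The EC2 instance's security group isn't shown above, so here is a minimal sketch of what it could look like, assuming the deepseek_ec2_sg name referenced elsewhere in this post, with ports 8080 and 11434 restricted to the ALB security group and SSH limited to the my_ip variable:

resource "aws_security_group" "deepseek_ec2_sg" {
  name        = "deepseek_ec2_sg"
  description = "Security group for the DeepSeek EC2 instance"
  vpc_id      = var.vpc_id

  # OpenWebUI, only reachable from the ALB
  ingress {
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  # Ollama API, only reachable from the ALB
  ingress {
    from_port       = 11434
    to_port         = 11434
    protocol        = "tcp"
    security_groups = [aws_security_group.alb_sg.id]
  }

  # SSH restricted to my IP (the workflow later adds the runner IP temporarily)
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.my_ip]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

Referencing the ALB security group (rather than a CIDR block) is what keeps the application ports reachable only through the load balancer.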
Load Balancer (ALB)
The ALB is configured to listen on ports 443 (HTTPS) and 80 (HTTP). The HTTP listener redirects traffic to HTTPS for secure communication. Additionally, I set up a separate listener for the Ollama API on port 11434. This isn't strictly needed, since OpenWebUI already exposes the Ollama API, but I kept it as a standby for direct programmatic API access.
resource "aws_lb" "deepseek_lb" {
name = "deepseek-alb"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.alb_sg.id]
subnets = var.subnet_ids
enable_deletion_protection = false
}
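The listener and target group resources aren't reproduced in the post, but a rough sketch of the HTTPS/HTTP listeners and the OpenWebUI target group described above could look like the following (resource names, the target group name, and the SSL policy are assumptions; an analogous listener/target group pair on port 11434 would cover the Ollama API):

resource "aws_lb_target_group" "openwebui_tg" {
  name     = "deepseek-openwebui-tg"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id
}

resource "aws_lb_target_group_attachment" "openwebui_attach" {
  target_group_arn = aws_lb_target_group.openwebui_tg.arn
  target_id        = aws_instance.deepseek_ec2.id
  port             = 8080
}

# HTTPS listener forwarding to OpenWebUI, using the existing ACM certificate
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.deepseek_lb.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-2016-08"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.openwebui_tg.arn
  }
}

# HTTP listener that redirects all traffic to HTTPS
resource "aws_lb_listener" "http_redirect" {
  load_balancer_arn = aws_lb.deepseek_lb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"
    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}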
EC2 Instance
The EC2 instance is configured with a 48 GB gp3 EBS volume and an IAM role for the necessary permissions. The instance is placed in a public subnet and associated with the EC2 security group. Note: a GPU-backed instance such as p3.2xlarge or g4dn.xlarge would handle larger models and generate responses faster, but AWS hadn't approved my quota request for one at the time, so I used a c4.4xlarge.
resource "aws_instance" "deepseek_ec2" {
ami = var.ami_id
instance_type = var.instance_type
key_name = data.aws_key_pair.existing_key.key_name
subnet_id = var.public_subnet_id
security_groups = [aws_security_group.deepseek_ec2_sg.id]
#iam_instance_profile = aws_iam_instance_profile.deepseek_ec2_profile.name
root_block_device {
volume_size = 48
volume_type = "gp3"
delete_on_termination = true
}
tags = {
Name = "DeepSeekModelInstance"
}
}
IAM Roles and Instance Profile
An IAM role and instance profile are created for the EC2 instance, allowing it to assume the role and access the AWS resources it needs. I originally planned to copy the model from an S3 bucket, which is why I created this role. It isn't strictly needed now that I pull the model via a Docker image, but I'm keeping it for future modifications.
resource "aws_iam_role" "deepseek_ec2_role" {
name = "deepseek_ec2_role"
assume_role_policy = jsonencode({
Version = "2012-10-17"
Statement = [
{
Action = "sts:AssumeRole"
Effect = "Allow"
Principal = {
Service = "ec2.amazonaws.com"
}
}
]
})
}
resource "aws_iam_instance_profile" "deepseek_ec2_profile" {
name = "deepseek_ec2_profile"
role = aws_iam_role.deepseek_ec2_role.name
}
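Since the original plan was to copy the model from S3, a read-only S3 policy could be attached to this role when that's needed. A minimal sketch, with a hypothetical bucket name:

resource "aws_iam_role_policy" "deepseek_s3_read" {
  name = "deepseek_s3_read"
  role = aws_iam_role.deepseek_ec2_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Action = ["s3:GetObject", "s3:ListBucket"]
        Resource = [
          "arn:aws:s3:::my-model-bucket",   # hypothetical model bucket
          "arn:aws:s3:::my-model-bucket/*"
        ]
      }
    ]
  })
}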
Route 53 DNS Record
A Route 53 DNS record is created to map the ALB's DNS name to a custom domain. As mentioned above, I used an existing ACM certificate for SSL and an existing public hosted zone in us-east-1.
resource "aws_route53_record" "deepseek_dns" {
zone_id = var.hosted_zone_id
name = "deepseek.fozdigitalz.com"
type = "A"
alias {
name = aws_lb.deepseek_lb.dns_name
zone_id = aws_lb.deepseek_lb.zone_id
evaluate_target_health = true
}
}
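In this project the certificate ARN and hosted zone ID are supplied as variables. As an alternative sketch, the existing certificate and hosted zone could be discovered with data sources instead (assuming the fozdigitalz.com domain):

data "aws_route53_zone" "primary" {
  name         = "fozdigitalz.com."
  private_zone = false
}

data "aws_acm_certificate" "deepseek" {
  domain   = "fozdigitalz.com"
  statuses = ["ISSUED"]
}

The looked-up values would then replace var.hosted_zone_id and var.certificate_arn in the resources above.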
Terraform Backend (S3)
The Terraform state file is stored in an S3 bucket to enable team collaboration and state management.
terraform {
  backend "s3" {
    bucket  = "foz-terraform-state-bucket"
    key     = "infra.tfstate"
    region  = "us-east-1"
    encrypt = true
  }
}
Variables Configuration
The variables.tf file defines all the input variables required for the Terraform configuration. These variables make the configuration reusable and customizable.
variable "aws_region" {
description = "AWS region to deploy resources"
type = string
default = "us-east-1"
}
variable "vpc_id" {
description = "Existing VPC ID where resources will be deployed"
type = string
}
variable "subnet_ids" {
description = "Subnet ID for the ALB"
type = list(string)
}
variable "public_subnet_id" {
description = "Public subnet ID for the EC2 instance"
type = string
}
variable "key_name" {
description = "Key ID for EC2 instance"
type = string
}
variable "key_id" {
description = "The ID of the key pair to use for the EC2 instance"
type = string
}
variable "ami_id" {
description = "Amazon Machine Image (AMI) ID"
type = string
}
variable "certificate_arn" {
description = "ARN of the SSL certificate for HTTPS"
type = string
}
variable "hosted_zone_id" {
description = "ID of the existing Route 53 hosted zone for fozdigitalz.com in us-east-1"
type = string
}
variable "terraform_state_bucket" {
description = "The name of the S3 bucket for Terraform state"
type = string
}
variable "instance_type" {
description = "Instance type for the EC2 instance"
type = string
}
variable "my_ip" {
description = "IP address allowed to SSH"
type = string
}
Terraform.tfvars
The terraform.tfvars file is used to assign values to the variables defined in variables.tf. This file is typically not committed to version control (e.g., Git) for security reasons, as it may contain sensitive information like AWS credentials. I added terraform.tfvars to .gitignore so it won't be tracked, and instead provided the values through GitHub environment variables and secrets. The values below are made up and not real.
aws_region = "us-east-1"
vpc_id = "vpc-1234567890abcdef0"
subnet_ids = ["subnet-1234567890abcdef0", "subnet-0987654321abcdef0"]
public_subnet_id = "subnet-1234567890abcdef0"
key_name = "my-key-pair"
key_id = "key-1234567890abcdef0"
ami_id = "ami-0abcdef1234567890"
certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/12345678-1234-1234-1234-123456789012"
hosted_zone_id = "Z1234567890ABCDEF"
terraform_state_bucket = "foz-terraform-state-bucket"
instance_type = "c4.4xlarge"
my_ip = "192.168.1.1/32"
Output Configuration
The output.tf file defines the outputs that Terraform displays after applying the configuration. These outputs provide information such as the EC2 instance's public IP and the ALB's DNS name, which my GitHub Actions workflow needs in order to configure the EC2 instance.
output "ec2_public_ip" {
description = "Public IP address of the EC2 instance"
value = aws_instance.deepseek_ec2.public_ip
}
output "lb_url" {
description = "DNS name of the Application Load Balancer"
value = aws_lb.deepseek_lb.dns_name
}
output "deepseek_ec2_sg_id" {
description = "Security Group ID of the EC2 instance"
value = aws_security_group.deepseek_ec2_sg.id
}
4. GitHub Actions Workflow
The GitHub Actions workflow automates the deployment process. It consists of four main jobs: setup, apply, post-apply, and destroy. My workflow will be triggered by pushing to the main branch or manually through the GitHub UI using workflow dispatch with an input to choose the action (apply/destroy).
Workflow Triggers
The workflow is triggered on a push to the main branch or manually via the workflow_dispatch event. The manual trigger lets you choose between apply (to create resources) and destroy (to tear down resources).
on:
  push:
    branches:
      - main
  workflow_dispatch:
    inputs:
      action:
        description: "Choose action (apply/destroy)"
        required: true
        default: "apply"
        type: choice
        options:
          - apply
          - destroy
Setup Job
The setup job initializes the environment by checking out the code, setting up Terraform, and configuring AWS credentials.
jobs:
  setup:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Set up Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_DEFAULT_REGION }}
Apply Job
The apply job runs Terraform to create the infrastructure. It generates a terraform.tfvars file using GitHub Secrets, initializes Terraform, and applies the configuration. Once this completes, the post-apply job takes over: that's where my GitHub Actions runner SSHes into the EC2 instance to install Docker, pull the images, and deploy Ollama and OpenWebUI.
  apply:
    runs-on: ubuntu-latest
    needs: setup
    if: |
      (github.event_name == 'workflow_dispatch' && github.event.inputs.action == 'apply') ||
      (github.event_name == 'push' && !contains(github.event.head_commit.message, 'destroy'))
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Set up Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_DEFAULT_REGION }}

      - name: Create terraform.tfvars
        run: |
          cat <<EOF > terraform.tfvars
          ami_id = "${{ secrets.AMI_ID }}"
          certificate_arn = "${{ secrets.CERTIFICATE_ARN }}"
          vpc_id = "${{ secrets.VPC_ID }}"
          subnet_ids = [${{ secrets.SUBNET_IDS }}]
          key_name = "${{ secrets.KEY_NAME }}"
          key_id = "${{ secrets.KEY_ID }}"
          hosted_zone_id = "${{ secrets.HOSTED_ZONE_ID }}"
          instance_type = "${{ secrets.INSTANCE_TYPE }}"
          my_ip = "${{ secrets.MY_IP }}"
          aws_access_key_id = "${{ secrets.AWS_ACCESS_KEY_ID }}"
          aws_secret_access_key = "${{ secrets.AWS_SECRET_ACCESS_KEY }}"
          aws_region = "${{ secrets.AWS_DEFAULT_REGION }}"
          public_subnet_id = "${{ secrets.PUBLIC_SUBNET_ID }}"
          terraform_state_bucket = "${{ secrets.TERRAFORM_STATE_BUCKET }}"
          EOF

      - name: Terraform Init
        run: |
          terraform init \
            -backend-config="bucket=${{ secrets.TERRAFORM_STATE_BUCKET }}" \
            -backend-config="key=infra.tfstate" \
            -backend-config="region=${{ secrets.AWS_DEFAULT_REGION }}"

      - name: Terraform Plan
        run: terraform plan -out=tfplan -var-file=terraform.tfvars

      - name: Terraform Apply
        run: terraform apply -auto-approve -var-file=terraform.tfvars
Post-Apply Job
The post-apply job retrieves Terraform outputs, updates the EC2 security group to allow SSH access from the GitHub runner, installs Docker on the EC2 instance, and deploys the DeepSeek Model and OpenWebUI using Docker.
  post_apply:
    runs-on: ubuntu-latest
    needs: apply
    if: success()
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Set up Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_DEFAULT_REGION }}

      - name: Retrieve Terraform Outputs
        id: tf_outputs
        run: |
          echo "Retrieving Terraform Outputs..."
          echo "EC2_PUBLIC_IP=$(terraform output -raw ec2_public_ip)" >> $GITHUB_ENV
          echo "LB_DNS=$(terraform output -raw lb_url)" >> $GITHUB_ENV

      - name: Retrieve EC2 Security Group ID
        id: get_sg_id
        run: |
          SECURITY_GROUP_ID=$(terraform output -raw deepseek_ec2_sg_id)
          echo "SECURITY_GROUP_ID=$SECURITY_GROUP_ID" >> $GITHUB_ENV

      - name: Add GitHub Runner IP to Security Group
        run: |
          RUNNER_IP=$(curl -s https://checkip.amazonaws.com)
          echo "Runner IP: $RUNNER_IP"
          aws ec2 authorize-security-group-ingress \
            --group-id "${{ env.SECURITY_GROUP_ID }}" \
            --protocol tcp \
            --port 22 \
            --cidr "$RUNNER_IP/32" || echo "Failed to add rule to EC2 Security Group"

      - name: Wait for Security Group to Update
        run: sleep 12

      - name: Save SSH Private Key
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.SSH_PRIVATE_KEY }}" | base64 --decode > ~/.ssh/my-key.pem
          chmod 600 ~/.ssh/my-key.pem

      - name: Verify SSH Connection
        run: |
          ssh -o StrictHostKeyChecking=no -i ~/.ssh/my-key.pem ubuntu@${{ env.EC2_PUBLIC_IP }}
          echo "SSH Connection Successful"

      - name: Install Docker on EC2
        run: |
          ssh -o StrictHostKeyChecking=no -i ~/.ssh/my-key.pem ubuntu@${{ env.EC2_PUBLIC_IP }} <<EOF
          sudo apt-get update
          sudo apt-get install -y docker.io docker-compose
          sudo systemctl enable docker
          sudo systemctl start docker
          sudo usermod -aG docker ubuntu
          sudo sed -i 's/^ENABLED=1/ENABLED=0/' /etc/apt/apt.conf.d/20auto-upgrades
          sudo reboot
          EOF

      - name: Wait for EC2 Instance to Reboot
        run: sleep 60

      - name: Run DeepSeek Model and WebUI via Docker
        run: |
          ssh -o StrictHostKeyChecking=no -i ~/.ssh/my-key.pem ubuntu@${{ env.EC2_PUBLIC_IP }} <<EOF
          docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
          sleep 20
          docker exec ollama ollama pull deepseek-r1:8b
          sleep 15
          docker exec -d ollama ollama serve
          sleep 15
          docker run -d -p 8080:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
          sleep 15
          EOF

      - name: Confirm WebUI is Running & Accessible via Custom Domain
        run: |
          ssh -o StrictHostKeyChecking=no -i ~/.ssh/my-key.pem ubuntu@${{ env.EC2_PUBLIC_IP }} <<EOF
          curl -I https://deepseek.fozdigitalz.com
          EOF

      - name: Remove GitHub Runner IP from Security Group
        if: always()
        run: |
          RUNNER_IP=$(curl -s https://checkip.amazonaws.com)
          echo "Removing Runner IP: $RUNNER_IP"
          aws ec2 revoke-security-group-ingress \
            --group-id "${{ env.SECURITY_GROUP_ID }}" \
            --protocol tcp \
            --port 22 \
            --cidr "$RUNNER_IP/32"
Destroy Job
The destroy job tears down the infrastructure when triggered manually or via a commit message containing "destroy".
  destroy:
    runs-on: ubuntu-latest
    needs: setup
    if: |
      (github.event_name == 'workflow_dispatch' && github.event.inputs.action == 'destroy') ||
      (github.event_name == 'push' && contains(github.event.head_commit.message, 'destroy'))
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Set up Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ secrets.AWS_DEFAULT_REGION }}

      - name: Create terraform.tfvars
        run: |
          cat <<EOF > terraform.tfvars
          ami_id = "${{ secrets.AMI_ID }}"
          certificate_arn = "${{ secrets.CERTIFICATE_ARN }}"
          vpc_id = "${{ secrets.VPC_ID }}"
          subnet_ids = [${{ secrets.SUBNET_IDS }}]
          key_name = "${{ secrets.KEY_NAME }}"
          key_id = "${{ secrets.KEY_ID }}"
          hosted_zone_id = "${{ secrets.HOSTED_ZONE_ID }}"
          instance_type = "${{ secrets.INSTANCE_TYPE }}"
          my_ip = "${{ secrets.MY_IP }}"
          aws_access_key_id = "${{ secrets.AWS_ACCESS_KEY_ID }}"
          aws_secret_access_key = "${{ secrets.AWS_SECRET_ACCESS_KEY }}"
          aws_region = "${{ secrets.AWS_DEFAULT_REGION }}"
          public_subnet_id = "${{ secrets.PUBLIC_SUBNET_ID }}"
          terraform_state_bucket = "${{ secrets.TERRAFORM_STATE_BUCKET }}"
          EOF

      - name: Terraform Init & Destroy
        run: |
          terraform init -reconfigure \
            -backend-config="bucket=${{ secrets.TERRAFORM_STATE_BUCKET }}" \
            -backend-config="key=infra.tfstate" \
            -backend-config="region=${{ secrets.AWS_DEFAULT_REGION }}"
          terraform destroy -auto-approve -var-file=terraform.tfvars
Completed workflow run
5. The Application in Action (Result)
After successfully deploying the DeepSeek Model R1 on AWS, I was able to access the OpenWebUI and interact with the model. Below are some screenshots demonstrating the setup and functionality:
I. OpenWebUI Interface
The OpenWebUI provides a user-friendly interface for interacting with the DeepSeek Model R1. Here’s a screenshot of the dashboard:
The OpenWebUI dashboard, accessible via the custom domain deepseek.fozdigitalz.com.
II. Model Interaction
I tested the model by asking it a few questions. Here’s an example of the model’s response:
The DeepSeek Model R1 generating a response to a sample query.
Sample model responses (screenshots).
III. Infrastructure Clean Up
I can destroy the infrastructure by triggering the workflow in a few different ways:
- By manually triggering the workflow from the GitHub UI and selecting the destroy action.
- By pushing to the main branch with a commit message that contains the word destroy.
- By running gh workflow run "workflow name" --field action=destroy locally (the GitHub CLI must be installed for this to work).
Infrastructure cleanup using Terraform
IV. Model Performance Metrics & Instance Metrics
I measured the model’s response time and resource utilization. Here’s a summary of the performance:
- Average Response Time: 3.06 minutes
- CPU Utilization: 26.8%
- Memory Usage: 13.65 GB
6. Challenges Faced & Lessons Learned
Deploying the DeepSeek Model R1 on AWS using Terraform and GitHub Actions presented several challenges, each of which taught valuable lessons for improving the deployment process and infrastructure design.
1. Security Configuration
- Challenge: Configuring security groups and IAM roles to balance accessibility and security was complex. Misconfigurations could expose the EC2 instance or over-privilege IAM roles.
- Lesson Learned: Follow the principle of least privilege. Restrict SSH access to specific IPs, limit IAM permissions, and ensure only necessary ports are open.
2. Workflow Reliability
- Challenge: My workflow failed countless times due to different issues, from the Terraform configuration to secrets and variables, and I had to debug it on several occasions. The EC2 instance also required a reboot after Docker installation, which caused delays in the GitHub Actions workflow; without proper handling, subsequent steps could fail.
- Lesson Learned: Automate delays (e.g., sleep) and post-reboot tasks to ensure smooth workflow execution. Modularize the workflow for better reliability.
3. State and Code Management
- Challenge: Managing the Terraform state file in S3 and maintaining a monolithic configuration led to potential security risks and reduced reusability.
- Lesson Learned: Use versioning, encryption, and access controls for the Terraform state. Break configurations into reusable modules for better organization and maintainability.
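As one concrete example of hardening the state setup, the existing S3 backend could be extended with state locking via a DynamoDB table (the table name here is hypothetical, and bucket versioning would be enabled separately on the S3 bucket itself):

terraform {
  backend "s3" {
    bucket         = "foz-terraform-state-bucket"
    key            = "infra.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks" # hypothetical lock table; prevents concurrent state writes
  }
}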
7. Future Improvements
While the deployment process is now functional, there are opportunities I may consider to enhance scalability, security, and cost-efficiency.
1. Scalability
- Implement auto-scaling for the EC2 instance to handle varying loads and ensure optimal performance during peak usage.
2. Monitoring and Security
- Add CloudWatch alarms and logs for real-time monitoring and troubleshooting. Enhance security by using AWS Systems Manager for instance management instead of SSH.
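As a starting point for the CloudWatch idea, a simple CPU alarm on the DeepSeek instance could be added to the Terraform configuration (a sketch; the threshold and the commented-out notification target are assumptions):

resource "aws_cloudwatch_metric_alarm" "deepseek_cpu_high" {
  alarm_name          = "deepseek-ec2-cpu-high"
  alarm_description   = "CPU on the DeepSeek instance is sustained above 80%"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  period              = 300
  evaluation_periods  = 2
  threshold           = 80
  comparison_operator = "GreaterThanThreshold"

  dimensions = {
    InstanceId = aws_instance.deepseek_ec2.id
  }

  # alarm_actions = [aws_sns_topic.alerts.arn]  # hypothetical SNS topic for notifications
}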
3. Cost Optimization
- Explore spot instances or reserved instances to reduce costs. Additionally, consider upgrading to GPU-powered instances for better performance if the model is computationally intensive.
8. Conclusion
Deploying the DeepSeek Model R1 on AWS using Terraform and GitHub Actions was a rewarding experience. It not only streamlined the deployment process but also provided a scalable and secure infrastructure. By automating the deployment pipeline, I ensured consistency and repeatability, making it easier to manage and update the infrastructure in the future.