Oluwademilade Oyekanmi

From Flask App to Production on AWS EKS: A Complete CI/CD Walkthrough

I recently deployed a Flask REST API to Amazon EKS with a full CI/CD pipeline, PostgreSQL on Kubernetes, and Prometheus + Grafana monitoring — all wired together with GitHub Actions. This post is a step-by-step walkthrough of exactly how I did it.

The app itself is a Titanic passenger API (a classic dataset) but the architecture is production-grade: containerized app, ECR image registry, EKS cluster, persistent storage, and alerting to Slack. Let me walk you through it.


What We're Building

Here's the full stack:

  • App: Python Flask REST API with PostgreSQL
  • Containerization: Docker + Amazon ECR
  • Orchestration: Amazon EKS (Kubernetes)
  • CI/CD: GitHub Actions
  • Monitoring: Prometheus + Grafana via Helm (kube-prometheus-stack)
  • Alerting: Slack via Alertmanager

The source code is at: github.com/MsOluwademilade/titanic-test-app


The Application

The app is a CRUD REST API built with Flask and SQLAlchemy. It manages a people table seeded with Titanic passenger data.

Endpoints:

  • GET /people — list all passengers
  • GET /people/<uuid> — get one passenger
  • POST /people — add a passenger
  • PUT /people/<uuid> — update a passenger
  • DELETE /people/<uuid> — remove a passenger

The data model (src/models/person.py) maps to a PostgreSQL table:

class Person(db.Model):
    __tablename__ = 'people'
    uuid = db.Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    survived = db.Column(db.Integer)
    passengerClass = db.Column(db.Integer)
    name = db.Column(db.String(255))
    sex = db.Column(db.String(6))
    age = db.Column(db.Float)
    siblingsOrSpousesAboard = db.Column(db.Integer)
    parentsOrChildrenAboard = db.Column(db.Integer)
    fare = db.Column(db.Float)

The app factory pattern in src/app.py supports both development and production configs, pulling DATABASE_URL from environment variables — which makes it Kubernetes-friendly from day one.
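To make that concrete, here's a minimal sketch of the config-selection logic, written without Flask for brevity — the helper name and the SQLite fallback are illustrative, not taken from the repo:

```python
import os

def select_database_url(env=None):
    """Return the database URL the app factory would hand to SQLAlchemy.

    Reads DATABASE_URL from the environment; the local SQLite fallback
    here is illustrative, not the project's actual default.
    """
    env = os.environ if env is None else env
    return env.get("DATABASE_URL", "sqlite:///dev.db")

# In Kubernetes, DATABASE_URL is injected via the Deployment's env block,
# so the same image runs unchanged in dev and production.
```

Because the URL comes from the environment rather than from code, switching between Docker Compose and EKS is purely a configuration change.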


Step 1: Dockerizing the App

The Dockerfile uses a python:3.12-slim base image to keep things lean.

FROM python:3.12-slim

WORKDIR /app

RUN apt-get update && apt-get install -y \
    build-essential \
    libpq-dev \
    python3-dev \
    && rm -rf /var/lib/apt/lists/*

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt && \
    pip install --no-cache-dir Flask==2.2.5 SQLAlchemy==1.4.49 setuptools==70.0.0 && \
    pip uninstall -y psycopg2 || true && \
    pip install --no-cache-dir psycopg2-binary==2.9.9

COPY . .

EXPOSE 5000

ENV DATABASE_URL=postgresql+psycopg2://user:password@db:5432/postgres

CMD ["python", "run.py"]

A few things worth noting here:

  1. We explicitly uninstall psycopg2 and reinstall psycopg2-binary — this avoids compilation errors in slim images that don't have full build toolchains.
  2. The DATABASE_URL env var set in the Dockerfile is just a default. In Kubernetes, we'll override it per-deployment.

Testing locally with Docker Compose:

Before pushing to EKS, test the full stack locally with compose.yml:

services:
  db:
    image: postgres:latest
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: postgres
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d postgres"]
      interval: 5s
      timeout: 5s
      retries: 5

  app:
    build: .
    ports:
      - "5000:5000"
    environment:
      DATABASE_URL: postgresql+psycopg2://user:password@db:5432/postgres
    depends_on:
      db:
        condition: service_healthy

The condition: service_healthy on the db dependency is important — it ensures Postgres is actually ready to accept connections before Flask tries to connect.

docker compose up --build

Hit http://localhost:5000/people to confirm it's working.


Step 2: Push to Amazon ECR

Before setting up the full pipeline, create your ECR repository manually (you only do this once):

aws ecr create-repository \
  --repository-name titanic-app \
  --region <your-region>

Note the repository name from the output; you'll store it later as the ECR_REPOSITORY GitHub secret.


Step 3: Provision the EKS Cluster

Create your cluster with eksctl (the easiest way to get started):

eksctl create cluster \
  --name titanic-cluster \
  --region <your-region> \
  --nodegroup-name titanic-nodes \
  --node-type t3.medium \
  --nodes 2

This provisions the control plane, worker nodes, and configures your local kubeconfig automatically. The CI/CD pipeline will later call aws eks update-kubeconfig to do the same in GitHub Actions.

Why t3.medium? The kube-prometheus-stack (Prometheus + Grafana + Alertmanager) is resource-hungry. t3.small nodes will struggle. Budget for at least t3.medium if you're running monitoring.


Step 4: The Kubernetes Manifests

All Kubernetes configs live in the k8s/ folder. Here's what each file does:

PostgreSQL Setup

k8s/postgres-configmap.yaml — Holds the SQL init script that creates the people table and seeds initial data. Kubernetes mounts this as a volume into the Postgres container at /docker-entrypoint-initdb.d/, which Postgres automatically runs on first start.

apiVersion: v1
kind: ConfigMap
metadata:
  name: titanic-sql
data:
  init.sql: |
    CREATE TABLE IF NOT EXISTS people (
      uuid VARCHAR(255) PRIMARY KEY DEFAULT gen_random_uuid()::text,
      survived INTEGER NOT NULL,
      ...
    );
    INSERT INTO people (...) VALUES (...);

k8s/postgres-deployment.yaml — Deploys Postgres with a PersistentVolumeClaim for durable storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: gp2

The storageClassName: gp2 tells EKS to provision an AWS EBS volume. This is what survives pod restarts and gives you actual persistence — without it, every time Postgres restarts you'd lose your data.

The deployment mounts both the PVC (for data) and the ConfigMap (for init scripts):

volumeMounts:
  - name: postgres-data
    mountPath: /pgdata
  - name: init-script
    mountPath: /docker-entrypoint-initdb.d

k8s/postgres-service.yaml — Exposes Postgres internally as a ClusterIP service on port 5432. The Flask app reaches it via the service name postgres.
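The service itself is short. Here's a sketch of what that manifest likely contains — the selector label is an assumption; only the name and port come from the text:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres          # the DNS name the Flask app connects to
spec:
  type: ClusterIP
  selector:
    app: postgres         # assumed label on the Postgres pods
  ports:
    - port: 5432
      targetPort: 5432
```

Inside the cluster, the app's connection string can simply use the hostname postgres, and kube-dns resolves it to the service.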

Flask App

k8s/app-deployment.yaml — Deploys the Flask app. Notice the IMAGE_URI_PLACEHOLDER:

containers:
  - name: titanic-app
    image: IMAGE_URI_PLACEHOLDER
    ports:
      - containerPort: 5000
    env:
      - name: DATABASE_URL
        value: postgresql+psycopg2://user:password@postgres:5432/postgres

The CI/CD pipeline replaces IMAGE_URI_PLACEHOLDER with the actual ECR image URI using sed before applying the manifest. This keeps the manifest clean in version control while allowing dynamic image URIs in the pipeline.
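If you want to sanity-check that substitution outside the pipeline, the sed call is equivalent to a plain string replace. A Python sketch (the function name and sample account ID are illustrative):

```python
def render_manifest(template: str, image_uri: str) -> str:
    """Mirror the pipeline's sed step: swap the placeholder for the
    real ECR image URI before the manifest is applied."""
    return template.replace("IMAGE_URI_PLACEHOLDER", image_uri)

manifest = "image: IMAGE_URI_PLACEHOLDER"
rendered = render_manifest(
    manifest, "123456789012.dkr.ecr.us-east-1.amazonaws.com/titanic-app:latest"
)
```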

k8s/app-service.yaml — Exposes the Flask app as a LoadBalancer service. EKS provisions an AWS ELB automatically:

spec:
  type: LoadBalancer
  ports:
    - protocol: TCP
      port: 80
      targetPort: 5000

Traffic hits port 80 on the load balancer and gets forwarded to port 5000 on the Flask container.


Step 5: Monitoring with Prometheus & Grafana

The monitoring stack lives in monitoring/. Rather than writing raw Kubernetes manifests for Prometheus, we use the kube-prometheus-stack Helm chart — it bundles Prometheus, Grafana, and Alertmanager and pre-configures them to scrape Kubernetes metrics out of the box.

monitoring/prometheus-pvcs.yaml — Creates PVCs for Prometheus and Alertmanager data persistence:

# Prometheus: 8Gi on gp2 EBS
# Alertmanager: 2Gi on gp2 EBS
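Spelled out, the Prometheus claim looks roughly like this — the metadata name is an assumption; only the size and storage class come from the repo:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-data
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: gp2   # EBS-backed, same as the app's PVC
  resources:
    requests:
      storage: 8Gi
```

The Alertmanager PVC follows the same shape with a 2Gi request.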

monitoring/prometheus-values.yaml — This is where the interesting configuration lives.

Grafana gets a LoadBalancer service so it's accessible externally, and persistence so dashboards survive pod restarts:

grafana:
  enabled: true
  adminPassword: "your-password"
  service:
    type: LoadBalancer
  persistence:
    enabled: true
    storageClassName: gp2
    size: 5Gi

Alertmanager is configured to route all alerts to Slack:

alertmanager:
  config:
    route:
      receiver: 'slack-notifications'
    receivers:
      - name: 'slack-notifications'
        slack_configs:
          - api_url_file: /etc/alertmanager/secrets/alertmanager-slack-webhook/url
            channel: '#all-titanic-alert'
            title: '🚨 Kubernetes Alert - {{ .CommonLabels.alertname }}'
            text: |
              {{ range .Alerts }}
              *Alert:* {{ .Labels.alertname }}
              *Severity:* {{ .Labels.severity }}
              *Summary:* {{ .Annotations.summary }}
              {{ end }}
            send_resolved: true

The Slack webhook URL is mounted from a Kubernetes secret (not hardcoded) — the api_url_file path points to a mounted secret volume.
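For reference, the secret that path implies would look something like this — the kube-prometheus-stack chart mounts secrets listed under alertmanager.alertmanagerSpec.secrets at /etc/alertmanager/secrets/<name>/, so the key name url lines up with the api_url_file path above. The webhook value here is obviously a placeholder:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-slack-webhook
  namespace: monitoring
type: Opaque
stringData:
  url: https://hooks.slack.com/services/XXX/YYY/ZZZ  # placeholder webhook
```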

The defaultRules section enables pre-built alert rules covering etcd, Kubernetes API server, node resources, pod health, storage, and more — all without writing a single PromQL rule yourself.


Step 6: The CI/CD Pipeline

This is where everything ties together. The GitHub Actions workflow in .github/workflows/deploy.yml triggers on every push to main and handles the full deploy sequence.

1. AWS Authentication

- name: Configure AWS credentials
  uses: aws-actions/configure-aws-credentials@v4
  with:
    aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
    aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    aws-region: ${{ secrets.AWS_REGION }}

AWS credentials are stored as GitHub repository secrets — never hardcoded.

2. Build & Push to ECR

- name: Login to Amazon ECR
  id: login-ecr
  uses: aws-actions/amazon-ecr-login@v2

- name: Build, tag, and push image to ECR
  env:
    ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
    ECR_REPOSITORY: ${{ secrets.ECR_REPOSITORY }}
  run: |
    IMAGE_URI=$ECR_REGISTRY/$ECR_REPOSITORY:latest
    docker build -t $IMAGE_URI .
    docker push $IMAGE_URI

The login-ecr step outputs the registry URL, which we capture and use to construct the full image URI.

3. Configure kubectl for EKS

- name: Update kubeconfig for EKS
  run: |
    aws eks update-kubeconfig \
      --name ${{ secrets.EKS_CLUSTER_NAME }} \
      --region ${{ secrets.AWS_REGION }}

This authenticates the GitHub Actions runner to your EKS cluster.

4. Deploy the Monitoring Stack

- name: Create monitoring namespace
  run: |
    kubectl get namespace monitoring >/dev/null 2>&1 || \
    kubectl create namespace monitoring

- name: Apply Prometheus PVCs
  run: kubectl apply -f monitoring/prometheus-pvcs.yaml

- name: Deploy Prometheus Stack with Helm
  run: |
    helm repo add prometheus-community \
      https://prometheus-community.github.io/helm-charts
    helm repo update
    helm upgrade --install prometheus \
      prometheus-community/kube-prometheus-stack \
      --namespace monitoring \
      --values monitoring/prometheus-values.yaml \
      --wait \
      --timeout 10m

helm upgrade --install is idempotent — it installs on first run and upgrades on subsequent runs. The --wait --timeout 10m flags make the pipeline block until the Helm release is healthy before proceeding.

5. Grafana with IP Restriction

- name: Create Grafana LoadBalancer Service with source restriction
  run: |
    cat <<EOF | kubectl apply -f -
    apiVersion: v1
    kind: Service
    metadata:
      name: prometheus-grafana
      namespace: monitoring
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-source-ranges: "${{ secrets.SOURCE_IP }}/32"
    spec:
      type: LoadBalancer
      selector:
        app.kubernetes.io/name: grafana
    EOF

The aws-load-balancer-source-ranges annotation tells AWS to restrict inbound traffic to a specific IP — so Grafana isn't publicly accessible to the entire internet. The allowed IP is stored as a GitHub secret.

6. Deploy the App

- name: Replace image placeholder with actual ECR URI
  run: |
    IMAGE_URI="$ECR_REGISTRY/$ECR_REPOSITORY:latest"
    sed -i "s|IMAGE_URI_PLACEHOLDER|$IMAGE_URI|g" k8s/app-deployment.yaml

- name: Delete existing postgres deployment
  run: |
    kubectl delete deployment postgres --ignore-not-found=true
    kubectl wait --for=delete pod -l app=postgres --timeout=60s || true

- name: Deploy application to EKS
  run: kubectl apply -f k8s/

The sed command swaps the placeholder with the real ECR image URI before applying. The Postgres deletion step forces a clean redeployment so the init script runs fresh on every deploy.


Step 7: Required GitHub Secrets

Before the pipeline runs, configure these secrets in your GitHub repo settings under Settings → Secrets and variables → Actions:

  • AWS_ACCESS_KEY_ID: IAM user access key
  • AWS_SECRET_ACCESS_KEY: IAM user secret key
  • AWS_REGION: e.g. us-east-1
  • ECR_REPOSITORY: ECR repo name (e.g. titanic-app)
  • EKS_CLUSTER_NAME: your EKS cluster name
  • SOURCE_IP: your IP for Grafana access restriction

The IAM user needs permissions for ECR (push images) and EKS (describe cluster + apply manifests).
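A minimal policy sketch for those permissions — the exact action list is an approximation, and note that kubectl access also requires mapping the IAM user into the cluster (via the aws-auth ConfigMap or EKS access entries), which no IAM policy alone can grant:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PushToEcr",
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:CompleteLayerUpload",
        "ecr:InitiateLayerUpload",
        "ecr:PutImage",
        "ecr:UploadLayerPart"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DescribeEks",
      "Effect": "Allow",
      "Action": "eks:DescribeCluster",
      "Resource": "*"
    }
  ]
}
```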


Step 8: Verify the Deployment

After the pipeline completes:

Check pods are running:

kubectl get pods
kubectl get pods -n monitoring

Get the app's load balancer URL:

kubectl get svc titanic-app

Hit the EXTERNAL-IP in your browser — you should see the welcome message. Then test the API:

# List all passengers
curl http://<EXTERNAL-IP>/people

# Add a passenger
curl -H "Content-Type: application/json" \
  -X POST http://<EXTERNAL-IP>/people \
  -d '{
    "survived": 1,
    "passengerClass": 1,
    "name": "Miss. Test User",
    "sex": "female",
    "age": 28.0,
    "siblingsOrSpousesAboard": 0,
    "parentsOrChildrenAboard": 0,
    "fare": 71.28
  }'

Access Grafana:

kubectl get svc prometheus-grafana -n monitoring

Navigate to http://<EXTERNAL-IP> — Grafana will be pre-loaded with Kubernetes dashboards showing cluster health, pod resource usage, and any firing alerts.


Architecture Summary

GitHub Push → GitHub Actions
                  ├── Build Docker image
                  ├── Push to Amazon ECR
                  ├── Deploy monitoring stack (Helm → EKS)
                  │       ├── Prometheus (8Gi EBS)
                  │       ├── Grafana (5Gi EBS, LoadBalancer, IP-restricted)
                  │       └── Alertmanager (2Gi EBS) → Slack
                  └── Deploy app manifests → EKS
                            ├── Flask app (LoadBalancer, port 80 → 5000)
                            └── PostgreSQL (ClusterIP, 1Gi EBS)

Key Takeaways

gp2 storage class matters. Every PVC in this setup uses storageClassName: gp2 — the default EKS storage class backed by AWS EBS. Without it, PVCs stay in Pending state indefinitely and your pods never start.

Health checks prevent race conditions. Without condition: service_healthy in Docker Compose (and equivalent readiness probes in Kubernetes), your app will try to connect before Postgres is ready and crash on startup.
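On the Kubernetes side, a readiness probe on the Postgres container might look like this (illustrative — the repo's manifests may differ; pg_isready ships in the official postgres image):

```yaml
readinessProbe:
  exec:
    command: ["pg_isready", "-U", "user", "-d", "postgres"]
  initialDelaySeconds: 5
  periodSeconds: 5
```

Until the probe passes, the pod is withheld from the postgres service's endpoints, so the app never gets routed to a database that isn't accepting connections.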

helm upgrade --install is pipeline-friendly. It handles first-time install and subsequent upgrades in one idempotent command — no need to check whether a release already exists.

IP-restrict your admin interfaces. Using aws-load-balancer-source-ranges to lock down Grafana is a simple but effective security measure. Never expose monitoring dashboards to the public internet.

Keep secrets out of manifests. The IMAGE_URI_PLACEHOLDER pattern keeps image URIs out of version control. Combined with GitHub secrets for AWS credentials, nothing sensitive lives in the repo.


If this walkthrough helped you, drop a reaction or a comment; I'd love to hear how you're approaching Kubernetes deployments on AWS. And if you spot something that could be improved in the architecture, let's discuss it below!
