Building a Production-Grade E-Commerce Platform on GCP: A Complete DevOps Journey

Hey there, fellow developers!

Have you ever wondered what it takes to build and deploy a real production-grade application on the cloud? Not just a simple "Hello World" app, but a full-fledged microservices platform with proper CI/CD, monitoring, security, and all the bells and whistles that make it truly production-ready?

Well, you're in the right place! In this comprehensive guide, I'll walk you through my journey of building and deploying a complete e-commerce platform on Google Cloud Platform (GCP) using modern DevOps practices. Whether you're a beginner taking your first steps into cloud-native development or an experienced engineer looking to level up your skills, this guide has something for everyone.

What You'll Learn

By the end of this guide, you'll understand:

  • How to architect a microservices-based application from scratch
  • Setting up Google Kubernetes Engine (GKE) for production workloads
  • Implementing GitOps with ArgoCD for automated deployments
  • Building a complete CI/CD pipeline with GitHub Actions
  • Adding full observability with Datadog and SonarQube
  • Managing infrastructure as code with Terraform
  • Implementing security best practices (SSL/TLS, secrets management, vulnerability scanning)
  • Setting up automated monitoring and alerting

Why This Project?

I created this project to bridge the gap between simple tutorials and real-world production systems. Most tutorials show you how to deploy a single container, but they don't show you:

  • How to manage multiple microservices
  • How to set up proper CI/CD pipelines
  • How to monitor and debug issues in production
  • How to handle secrets and security
  • How to make your infrastructure reproducible

This project addresses all of these challenges and more!

Application Source Code: Link

Project Overview

This is a cloud-native e-commerce platform built with:

  • 5 microservices written in Java, Go, and Node.js
  • Kubernetes (GKE) for container orchestration
  • ArgoCD for GitOps-based deployments
  • GitHub Actions for CI/CD automation
  • Datadog for monitoring, logging, and APM
  • SonarQube for code quality analysis
  • Terraform for infrastructure provisioning
  • Helm for Kubernetes package management

The Architecture

Let me show you the high-level architecture of what we're building:

┌─────────────────────────────────────────────────────────────┐
│                    Internet Traffic (HTTPS)                  │
└────────────────────┬────────────────────────────────────────┘
                     │
            ┌────────▼────────┐
            │  GCP Load       │
            │  Balancer       │
            └────────┬────────┘
                     │
            ┌────────▼────────┐
            │  NGINX Ingress  │
            │  Controller     │
            │  (SSL/TLS)      │
            └────────┬────────┘
                     │
        ┌────────────┼────────────┐
        │            │            │
   ┌────▼───┐  ┌────▼───┐  ┌────▼───┐
   │   UI   │  │  Cart  │  │Checkout│
   │ Service│  │ Service│  │ Service│
   └────┬───┘  └────┬───┘  └────┬───┘
        │           │            │
        └───────┬───┴────────┬───┘
                │            │
           ┌────▼───┐   ┌───▼────┐
           │Catalog │   │ Orders │
           │ Service│   │ Service│
           └────┬───┘   └───┬────┘
                │           │
                └─────┬─────┘
                      │
                ┌─────▼─────┐
                │   MySQL   │
                │  Database │
                └───────────┘

Key Components:

  1. Frontend (UI Service): Java-based web interface for customers
  2. Cart Service: Manages shopping cart operations (Java)
  3. Catalog Service: Product catalog and inventory (Go)
  4. Checkout Service: Handles order processing (Node.js)
  5. Orders Service: Order management and history (Java)
  6. MySQL Database: Persistent data storage

All of this runs on Google Kubernetes Engine, managed by ArgoCD, monitored by Datadog, and deployed automatically via GitHub Actions.
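
To give you a feel for how these services find each other inside the cluster, here's a minimal, purely illustrative Kubernetes Service for the catalog component. Other services reach it by its DNS name (http://catalog) instead of a hard-coded IP; the labels and ports below are assumptions, not the exact values from the repo's Helm charts.

apiVersion: v1
kind: Service
metadata:
  name: catalog
spec:
  selector:
    app: catalog        # matches the labels on the catalog pods
  ports:
    - port: 80          # port that callers (e.g. the UI service) use
      targetPort: 8080  # port the Go process listens on (assumed)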

Key Features

DevOps Excellence

  • **GitOps Workflow**: Every deployment is version-controlled and declarative
  • **Automated CI/CD**: Push code → Build → Test → Scan → Deploy (all automatic!)
  • **Full Observability**: See everything happening in your cluster in real-time
  • **Security First**: Automated vulnerability scanning, secrets management, SSL/TLS
  • **Infrastructure as Code**: Entire infrastructure defined in Terraform
  • **Self-Healing**: Kubernetes automatically restarts failed containers

Production-Ready Features

  • Auto-scaling: Automatically scales based on traffic
  • Zero-downtime deployments: Update without affecting users (see the sketch right after this list)
  • Multi-zone deployment: High availability across multiple zones
  • Automated SSL certificates: Let's Encrypt integration
  • Centralized logging: All logs in one place
  • Distributed tracing: Track requests across microservices
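
Here's the sketch mentioned above for zero-downtime deployments: a rolling update strategy combined with a readiness probe means Kubernetes only shifts traffic to a new pod once it reports healthy. The image, probe path, and port are illustrative assumptions rather than values from the repo.

# Fragment of a Deployment spec (illustrative)
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below the desired replica count
      maxSurge: 1         # bring up one extra pod at a time
  template:
    spec:
      containers:
        - name: ui
          image: us-central1-docker.pkg.dev/YOUR_PROJECT_ID/retail-store/ui:1.0.0
          readinessProbe:
            httpGet:
              path: /actuator/health   # assumed health endpoint for the Java UI
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5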

Technology Stack

Let me break down the technologies we'll be using:

Cloud & Infrastructure

| Technology | Purpose | Why We Use It |
|---|---|---|
| Google Cloud Platform | Cloud provider | Excellent Kubernetes support, competitive pricing |
| GKE (Google Kubernetes Engine) | Container orchestration | Managed Kubernetes, auto-updates, built-in monitoring |
| Terraform | Infrastructure as Code | Reproducible infrastructure, version control |
| Helm | Kubernetes package manager | Simplifies complex deployments |

CI/CD & GitOps

| Technology | Purpose | Why We Use It |
|---|---|---|
| GitHub Actions | CI/CD pipeline | Free for public repos, easy to configure |
| ArgoCD | GitOps controller | Automated deployments, easy rollbacks |
| GCP Artifact Registry | Container registry | Native GCP integration, secure |

Monitoring & Quality

| Technology | Purpose | Why We Use It |
|---|---|---|
| Datadog | Monitoring & APM | Best-in-class observability platform |
| SonarQube | Code quality | Catches bugs and security issues early |
| Trivy | Security scanning | Finds vulnerabilities in container images |

Networking & Security

| Technology | Purpose | Why We Use It |
|---|---|---|
| NGINX Ingress | Traffic routing | Industry standard, highly configurable |
| Cert-Manager | SSL/TLS automation | Free SSL certificates from Let's Encrypt |
| Kubernetes Secrets | Secrets management | Secure credential storage |

Prerequisites

Before we dive in, make sure you have:

Required Tools

  1. Google Cloud Account (with billing enabled)
  2. gcloud CLI - Google Cloud command-line tool
   # Install on Linux/macOS
   curl https://sdk.cloud.google.com | bash

   # Or use package manager
   # macOS: brew install google-cloud-sdk
   # Windows: Download installer from cloud.google.com
  3. kubectl - Kubernetes command-line tool
   # Install via gcloud
   gcloud components install kubectl

   # Or standalone
   # macOS: brew install kubectl
   # Linux: snap install kubectl --classic
  4. Terraform - Infrastructure as Code tool
   # macOS
   brew install terraform

   # Linux
   wget https://releases.hashicorp.com/terraform/1.6.0/terraform_1.6.0_linux_amd64.zip
   unzip terraform_1.6.0_linux_amd64.zip
   sudo mv terraform /usr/local/bin/
  5. Docker - For building container images
   # Install from docker.com/get-started
  6. Helm - Kubernetes package manager
   # macOS
   brew install helm

   # Linux
   curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
  7. Git - Version control
   # Most systems have this pre-installed
   git --version

Required Accounts

  • GitHub Account (for code hosting and CI/CD)
  • Datadog Account (free trial available at datadoghq.com)
  • Domain Name (optional, but recommended for production)

Knowledge Prerequisites

Beginner-Friendly! You should have:

  • Basic understanding of Docker containers
  • Familiarity with command-line interfaces
  • Basic Git knowledge (clone, commit, push)
  • Willingness to learn!

Nice to Have (but not required):

  • Understanding of Kubernetes concepts
  • Experience with cloud platforms
  • Knowledge of CI/CD pipelines

Don't worry if you're missing some of these - I'll explain everything as we go!


Part 1: Getting Started & Infrastructure Setup

Step 1: Initial Setup & Authentication

First, let's set up your local environment and authenticate with Google Cloud.

1.1 Clone the Repository

# Clone the repository
git clone https://github.com/YOUR_USERNAME/Ecommerce-K8s.git
cd Ecommerce-K8s

1.2 Authenticate with Google Cloud

# Login to Google Cloud
gcloud auth login

# This will open a browser window - sign in with your Google account

1.3 Create a GCP Project

# Create a new project (replace YOUR_PROJECT_ID with your desired ID)
gcloud projects create YOUR_PROJECT_ID --name="E-Commerce Platform"

# Set as default project
gcloud config set project YOUR_PROJECT_ID

# Enable billing (required for GKE)
# You'll need to do this in the GCP Console: console.cloud.google.com

1.4 Enable Required APIs

# Enable all necessary GCP APIs
gcloud services enable container.googleapis.com
gcloud services enable compute.googleapis.com
gcloud services enable artifactregistry.googleapis.com
gcloud services enable cloudresourcemanager.googleapis.com
gcloud services enable iam.googleapis.com
gcloud services enable dns.googleapis.com

**Pro Tip**: This step might take 2-3 minutes. Grab a coffee! ☕

1.5 Set Your Region and Zone

# Choose a region close to your users
# Popular choices: us-central1, europe-west1, asia-southeast1
gcloud config set compute/region us-central1
gcloud config set compute/zone us-central1-a

Step 2: Infrastructure Provisioning with Terraform

Now we'll use Terraform to create all the infrastructure we need.

2.1 Configure Terraform Variables

cd terraform

# Create a terraform.tfvars file
cat > terraform.tfvars <<EOF
project_id = "YOUR_PROJECT_ID"
region     = "us-central1"
zone       = "us-central1-a"
cluster_name = "ecommerce-gke-cluster"
EOF

2.2 Initialize Terraform

# Initialize Terraform (downloads required providers)
terraform init

You should see output like:

Initializing the backend...
Initializing provider plugins...
- Finding latest version of hashicorp/google...
- Installing hashicorp/google v5.x.x...

Terraform has been successfully initialized!

2.3 Review the Infrastructure Plan

# See what Terraform will create
terraform plan

This shows you:

  • VPC network and subnets
  • GKE cluster with node pools
  • Artifact Registry for container images
  • IAM roles and service accounts
  • Firewall rules

**What's Being Created?**

  • VPC Network: Isolated network for your resources
  • GKE Cluster: Kubernetes cluster with 3 nodes (auto-scaling enabled)
  • Artifact Registry: Private container registry
  • Service Account: For GitHub Actions to deploy
  • Firewall Rules: Security rules for your cluster

2.4 Apply the Configuration

# Create the infrastructure (this takes 10-15 minutes)
terraform apply

# Type 'yes' when prompted

**Time for a break!** Creating a GKE cluster takes about 10-15 minutes. This is a good time to:

  • Read through the ArgoCD documentation
  • Set up your Datadog account
  • Grab another coffee ☕

2.5 Verify the Cluster

# Configure kubectl to use your new cluster
gcloud container clusters get-credentials ecommerce-gke-cluster \
    --region us-central1

# Verify connection
kubectl get nodes

You should see something like:

NAME                                          STATUS   ROLES    AGE   VERSION
gke-ecommerce-default-pool-xxxxx-xxxx        Ready    <none>   5m    v1.28.x
gke-ecommerce-default-pool-xxxxx-yyyy        Ready    <none>   5m    v1.28.x
gke-ecommerce-default-pool-xxxxx-zzzz        Ready    <none>   5m    v1.28.x

Congratulations! Your Kubernetes cluster is ready!

The GKE cluster running in Google Cloud Console, showing nodes and cluster configuration


Step 3: Installing Core Components

Now let's install the essential components: NGINX Ingress, Cert-Manager, and ArgoCD.

3.1 Install NGINX Ingress Controller

# Add Helm repository
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update

# Install NGINX Ingress
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx \
  --create-namespace \
  --set controller.service.type=LoadBalancer

Wait for the Load Balancer to be ready:

# This might take 2-3 minutes
kubectl get svc -n ingress-nginx -w

Press Ctrl+C when you see an EXTERNAL-IP assigned.

Load balancers created for external traffic routing

3.2 Install Cert-Manager (for SSL/TLS)

# Add Helm repository
helm repo add jetstack https://charts.jetstack.io
helm repo update

# Install Cert-Manager
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set installCRDs=true

Verify installation:

kubectl get pods -n cert-manager

All pods should be in Running state.

3.3 Configure Let's Encrypt

# Create ClusterIssuer for SSL certificates
kubectl apply -f - <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: your-email@example.com  # Change this!
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
EOF

**Important**: Replace your-email@example.com with your actual email!

3.4 Install ArgoCD

# Create namespace
kubectl create namespace argocd

# Install ArgoCD
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

# Wait for pods to be ready
kubectl wait --for=condition=Ready pods --all -n argocd --timeout=300s

Get ArgoCD admin password:

# Retrieve the initial admin password
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d && echo

**Save this password!** You'll need it to login to ArgoCD.

Access ArgoCD UI:

# Port-forward to access locally
kubectl port-forward svc/argocd-server -n argocd 8080:443

Now open your browser to: https://localhost:8080

  • Username: admin
  • Password: (the password you just retrieved)

You now have ArgoCD running!

ArgoCD dashboard showing all microservices managed via GitOps


Step 4: Setting Up the Database

Before deploying microservices, we need a database.

4.1 Create Namespace

# Create namespace for our application
kubectl create namespace retail-store

4.2 Create MySQL Secret

# Create secret for MySQL password
kubectl create secret generic mysql-secret \
  --from-literal=mysql-root-password=YOUR_SECURE_PASSWORD \
  --from-literal=mysql-password=YOUR_SECURE_PASSWORD \
  -n retail-store

4.3 Deploy MySQL

# Deploy MySQL
kubectl apply -f k8s/catalog-mysql.yaml -n retail-store
kubectl apply -f k8s/catalog-mysql-service.yaml -n retail-store
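
The two manifests above live in the repo's k8s/ folder, so their exact contents may differ, but a minimal sketch of what catalog-mysql.yaml might contain shows the important part: the root password comes from the mysql-secret created in step 4.2 instead of being hard-coded.

# Minimal, illustrative sketch of a MySQL Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: catalog-mysql
  namespace: retail-store
spec:
  replicas: 1
  selector:
    matchLabels:
      app: catalog-mysql
  template:
    metadata:
      labels:
        app: catalog-mysql
    spec:
      containers:
        - name: mysql
          image: mysql:8.0
          ports:
            - containerPort: 3306
          env:
            - name: MYSQL_ROOT_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: mysql-secret
                  key: mysql-root-password   # created in step 4.2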

Verify MySQL is running:

kubectl get pods -n retail-store

Wait until the MySQL pod shows Running status.

Initialize the database:

# Run database initialization job
kubectl apply -f k8s/mysql-setup-job.yaml -n retail-store

Step 5: Setting Up Monitoring with Datadog

5.1 Get Your Datadog API Key

  1. Sign up at datadoghq.com (free trial available)
  2. Go to: Organization Settings → API Keys
  3. Copy your API key

5.2 Create Datadog Secret

# Create secret with your Datadog API key
kubectl create secret generic datadog-secret \
  --from-literal=api-key=YOUR_DATADOG_API_KEY \
  -n default

5.3 Deploy Datadog Agent

# Deploy the Datadog agent
kubectl apply -f k8s/datadog-agent.yaml

Verify Datadog is collecting data:

kubectl get pods -l app=datadog-agent

Within 5 minutes, you should see data in your Datadog dashboard!

Real-time monitoring of Kubernetes pod restarts and alerts in Datadog

Detailed view of all pods with resource usage and status


Step 6: Setting Up SonarQube for Code Quality

6.1 Deploy SonarQube

# Deploy SonarQube
kubectl apply -f terraform/sonarqube-production.yaml

6.2 Expose SonarQube

# Change service type to LoadBalancer
kubectl patch svc sonarqube -n default -p '{"spec": {"type": "LoadBalancer"}}'

# Get the external IP
kubectl get svc sonarqube -n default -w

6.3 Access SonarQube

Wait for the EXTERNAL-IP to be assigned, then:

  1. Open browser to: http://EXTERNAL-IP:9000
  2. Default login: admin / admin
  3. Change password when prompted

6.4 Create SonarQube Projects

In the SonarQube UI, create 5 projects:

  1. retail-store-cart
  2. retail-store-catalog
  3. retail-store-checkout
  4. retail-store-orders
  5. retail-store-ui

For each project:

  • Click "Create Project" → "Manually"
  • Enter project key and name
  • Click "Set Up"
  • Choose "With GitHub Actions"
  • Copy the token generated

SonarQube code quality metrics showing all microservices passing quality gates

6.5 Configure GitHub Secrets

Go to your GitHub repository → Settings → Secrets and variables → Actions

Add these secrets:

  • SONAR_TOKEN: Your SonarQube token
  • SONAR_HOST_URL: http://YOUR_SONARQUBE_IP:9000
  • GCP_PROJECT_ID: Your GCP project ID
  • GCP_REGION: us-central1
  • ARTIFACT_REGISTRY: us-central1-docker.pkg.dev
  • GCP_SA_KEY: Service account JSON key (from Terraform output)
  • DATADOG_API_KEY: Your Datadog API key

Part 2: Deploying Microservices & CI/CD

Step 7: Deploying Microservices with ArgoCD

7.1 Configure ArgoCD Applications

# Apply ArgoCD project definition
kubectl apply -f argocd/projects/retail-store-project.yaml

# Apply all ArgoCD application definitions
kubectl apply -f argocd/applications/retail-store-cart.yaml
kubectl apply -f argocd/applications/retail-store-catalog.yaml
kubectl apply -f argocd/applications/retail-store-checkout.yaml
kubectl apply -f argocd/applications/retail-store-orders.yaml
kubectl apply -f argocd/applications/retail-store-ui.yaml
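
Each of the files you just applied is a standard ArgoCD Application. A trimmed-down sketch for the cart service might look like this; the repo URL, chart path, and sync options are assumptions, so check argocd/applications/ for the real values.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: retail-store-cart
  namespace: argocd
spec:
  project: retail-store            # the project applied above
  source:
    repoURL: https://github.com/YOUR_USERNAME/Ecommerce-K8s.git  # assumed
    targetRevision: main
    path: helm/cart                # assumed chart location
  destination:
    server: https://kubernetes.default.svc
    namespace: retail-store
  syncPolicy:
    automated:
      prune: true                  # delete resources removed from Git
      selfHeal: true               # revert manual changes in the cluster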

7.2 Sync Applications

In the ArgoCD UI:

  1. Click on each application
  2. Click "Sync" → "Synchronize"

Or use CLI:

# Install ArgoCD CLI
brew install argocd  # macOS
# or download from: https://argo-cd.readthedocs.io/en/stable/cli_installation/

# Login
argocd login localhost:8080

# Sync all applications
argocd app sync retail-store-cart
argocd app sync retail-store-catalog
argocd app sync retail-store-checkout
argocd app sync retail-store-orders
argocd app sync retail-store-ui

7.3 Monitor Deployment

# Watch pods being created
kubectl get pods -n retail-store -w

You should see all services starting up:

  • cart-xxxxx
  • catalog-xxxxx
  • checkout-xxxxx
  • orders-xxxxx
  • ui-xxxxx

Visualized Kubernetes resource relationships for the deployed application

All running pods across namespaces including ArgoCD, cert-manager, and microservices


Step 8: Accessing Your Application

8.1 Create Ingress

# Apply ingress configuration
kubectl apply -f terraform/retail-store-ingress.yaml -n retail-store
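
terraform/retail-store-ingress.yaml is where the NGINX Ingress Controller and cert-manager come together. A simplified sketch of what such a file typically contains (the host and backend service name are placeholders, not the repo's exact values):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: retail-store-ingress
  namespace: retail-store
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # the ClusterIssuer from step 3.3
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - store.example.com          # replace with your domain
      secretName: retail-store-tls   # cert-manager stores the certificate here
  rules:
    - host: store.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ui             # assumed UI service name
                port:
                  number: 80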

8.2 Get the Ingress IP

# Get the external IP
kubectl get ingress -n retail-store

8.3 Configure DNS (Optional)

If you have a domain:

  1. Create an A record pointing to the Ingress IP
  2. Wait for DNS propagation (5-30 minutes)

Example DNS configuration:

store.example.com -> <INGRESS_IP>

8.4 Access the Application

Open your browser to:

  • With domain: https://store.example.com
  • Without domain: http://<INGRESS_IP>

Your e-commerce platform is live!

Live e-commerce application accessible via the configured domain


Understanding the CI/CD Pipeline

Let's dive deep into how the automated CI/CD pipeline works!

Pipeline Overview

Every time you push code to the main branch, GitHub Actions automatically:

  1. Detects Changes - Identifies which microservices were modified
  2. Runs SonarQube Analysis - Checks code quality and security
  3. Builds Docker Images - Creates container images
  4. Scans for Vulnerabilities - Uses Trivy to find security issues
  5. Pushes to Registry - Uploads images to GCP Artifact Registry
  6. Updates Helm Values - Modifies deployment configurations
  7. ArgoCD Deploys - Automatically deploys to Kubernetes

Pipeline Workflow Diagram

┌─────────────┐
│   Git Push  │
└──────┬──────┘
       │
       ▼
┌─────────────────┐
│ Detect Changes  │ (Which services changed?)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│ SonarQube Scan  │ (Code quality check)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Build Image    │ (Docker build)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Security Scan  │ (Trivy)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│ Push to Registry│ (Artifact Registry)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│ Update Helm Vals│ (Git commit)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│ ArgoCD Deploys  │ (Automatic)
└─────────────────┘

Stage 1: Detect Changes

The pipeline only builds services that have changed:

detect-changes:
  runs-on: ubuntu-latest
  outputs:
    cart: ${{ steps.filter.outputs.cart }}
    catalog: ${{ steps.filter.outputs.catalog }}
    checkout: ${{ steps.filter.outputs.checkout }}
    orders: ${{ steps.filter.outputs.orders }}
    ui: ${{ steps.filter.outputs.ui }}
  steps:
    - uses: actions/checkout@v4
    - uses: dorny/paths-filter@v2
      id: filter
      with:
        filters: |
          cart: 'src/cart/**'
          catalog: 'src/catalog/**'
          checkout: 'src/checkout/**'
          orders: 'src/orders/**'
          ui: 'src/ui/**'

What this does: Only builds services that actually changed. Saves time and resources!

Stage 2: Code Quality Analysis

sonarqube:
  needs: detect-changes
  runs-on: ubuntu-latest
  strategy:
    matrix:
      service: [cart, catalog, checkout, orders, ui]
  steps:
    - name: Checkout code
      uses: actions/checkout@v4

    - name: Run SonarQube Scan
      # Different scan methods for Java, Go, and Node.js
      # Checks for bugs, vulnerabilities, code smells

What this does:

  • Analyzes code for bugs and security issues
  • Checks code coverage
  • Enforces quality gates
  • Fails the build if quality standards aren't met
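
The "Run SonarQube Scan" step is abbreviated in the excerpt above. As a rough idea of what it expands to, a generic scan step using SonarSource's official action could look like the following; the real pipeline likely uses language-specific tooling (for example the Maven Sonar plugin for the Java services), so treat this as a sketch:

    - name: Run SonarQube Scan
      uses: SonarSource/sonarqube-scan-action@master
      env:
        SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}        # added in step 6.5
        SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
      with:
        projectBaseDir: src/${{ matrix.service }}
        args: >
          -Dsonar.projectKey=retail-store-${{ matrix.service }}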

Stage 3: Build and Deploy

build-deploy:
  needs: [detect-changes, sonarqube]
  runs-on: ubuntu-latest
  strategy:
    matrix:
      service: [cart, catalog, checkout, orders, ui]
  steps:
    - name: Build Docker Image
      run: |
        docker build -t $IMAGE_NAME:$TAG .

    - name: Scan with Trivy
      uses: aquasecurity/trivy-action@master
      with:
        image-ref: $IMAGE_NAME:$TAG
        severity: "CRITICAL,HIGH"

    - name: Push to Artifact Registry
      run: docker push $IMAGE_NAME:$TAG

    - name: Update Helm Values
      run: |
        # Updates image tag in Helm chart
        # ArgoCD detects this change and deploys

What this does:

  • Builds optimized Docker images
  • Scans for security vulnerabilities
  • Pushes to private registry
  • Triggers GitOps deployment
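
The "Update Helm Values" step is the glue between CI and GitOps: the pipeline commits the new image tag back to Git, and ArgoCD notices the change and rolls it out. A simplified version of that step might look like this (the values file path and the use of the commit SHA as the tag are assumptions):

    - name: Update Helm Values
      run: |
        # Bump the image tag in the service's values file (path is assumed)
        sed -i "s|^  tag: .*|  tag: \"${{ github.sha }}\"|" helm/${{ matrix.service }}/values.yaml

        # Commit the change so ArgoCD can detect and deploy it
        git config user.name "github-actions[bot]"
        git config user.email "github-actions[bot]@users.noreply.github.com"
        git add helm/${{ matrix.service }}/values.yaml
        git commit -m "ci: bump ${{ matrix.service }} image to ${{ github.sha }}"
        git push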

How to Trigger the Pipeline

Method 1: Push Code Changes

# Make changes to a service
cd src/cart
# Edit some files...

# Commit and push
git add .
git commit -m "feat: add new cart feature"
git push origin main

The pipeline automatically runs!

Method 2: Manual Trigger

In GitHub:

  1. Go to Actions tab
  2. Select "Build and Deploy"
  3. Click "Run workflow"
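
Both trigger methods are controlled by the workflow's on: block, which probably looks something like this (branch and path filters are assumptions):

on:
  push:
    branches: [main]
    paths:
      - "src/**"         # only run when application code changes (assumed filter)
  workflow_dispatch:      # enables the manual "Run workflow" button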

Part 3: Monitoring, Security & Operations

Monitoring and Observability

Datadog Dashboard Setup

1. Kubernetes Overview Dashboard

In Datadog, go to Dashboards → New Dashboard

Key Metrics to Monitor:

  • Pod CPU Usage
  • Pod Memory Usage
  • Pod Restart Count
  • Container Status
  • Network Traffic
  • Disk I/O

Overview of pod health, resource usage, and container states in the GKE cluster

GCP Console showing resource usage, container restarts, and namespace metrics

2. Application Performance Monitoring (APM)

Enable APM in your services:

Each microservice is already configured with Datadog APM environment variables:

env:
  - name: DD_AGENT_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: DD_SERVICE
    value: "cart-service"
  - name: DD_ENV
    value: "production"
  - name: DD_VERSION
    value: "1.0.0"
  - name: DD_TRACE_ENABLED
    value: "true"

View traces in Datadog:

  1. Go to APM → Traces
  2. See request flow across microservices
  3. Identify slow queries and bottlenecks

3. Log Management

View logs:

  1. Go to Logs → Explorer
  2. Filter by service: service:cart-service
  3. Search for errors: status:error

Create log alerts:

Alert when: Error count > 10 in 5 minutes
Notify: Your email or Slack

Security Best Practices

1. Container Security

Image Scanning with Trivy

Every image is scanned for vulnerabilities before deployment:

- name: Run Trivy vulnerability scanner
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: ${{ env.IMAGE_NAME }}:${{ env.IMAGE_TAG }}
    format: "sarif"
    severity: "CRITICAL,HIGH"
    exit-code: "1" # Fail build if vulnerabilities found

Best Practices:

  • Use minimal base images (Alpine, Distroless)
  • Regularly update dependencies
  • Don't run containers as root
  • Scan images in CI/CD pipeline

Dockerfile Security

Bad Example:

FROM ubuntu:latest
RUN apt-get update && apt-get install -y python3
COPY . /app
CMD ["python3", "app.py"]

Good Example:

FROM python:3.11-alpine
RUN addgroup -g 1000 appuser && \
    adduser -D -u 1000 -G appuser appuser
WORKDIR /app
COPY --chown=appuser:appuser requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY --chown=appuser:appuser . .
USER appuser
CMD ["python", "app.py"]

2. Secrets Management

Never Commit Secrets!

Bad:

env:
  - name: DB_PASSWORD
    value: "mypassword123" # Never do this!

Good:

env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: mysql-secret
        key: password

Creating Secrets

# From literal values
kubectl create secret generic mysql-secret \
  --from-literal=password=YOUR_PASSWORD \
  -n retail-store

# From files
kubectl create secret generic tls-secret \
  --from-file=tls.crt=cert.pem \
  --from-file=tls.key=key.pem \
  -n retail-store

3. Network Security

Network Policies

Restrict traffic between pods:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cart-network-policy
  namespace: retail-store
spec:
  podSelector:
    matchLabels:
      app: cart
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: ui
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: mysql
      ports:
        - protocol: TCP
          port: 3306

Troubleshooting Guide

Common Issues and Solutions

Issue 1: Pods Not Starting

Symptom:

kubectl get pods -n retail-store
NAME                    READY   STATUS             RESTARTS   AGE
cart-xxxxx              0/1     ImagePullBackOff   0          2m

Diagnosis:

kubectl describe pod cart-xxxxx -n retail-store
kubectl logs cart-xxxxx -n retail-store

Common Causes:

  1. Image doesn't exist - Check Artifact Registry
  2. Authentication issues - Verify image pull secrets (see the sketch below)
  3. Wrong image tag - Check Helm values
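
For cause 2: on GKE the node service account normally handles Artifact Registry authentication, but if you ever need an explicit pull secret, the pod spec references it like this (the secret name is an example, created with kubectl create secret docker-registry):

spec:
  imagePullSecrets:
    - name: artifact-registry-creds   # example secret name
  containers:
    - name: cart
      image: us-central1-docker.pkg.dev/YOUR_PROJECT_ID/retail-store/cart:latest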

Issue 2: Service Not Accessible

Symptom:

curl http://INGRESS_IP
# Connection refused or 404

Diagnosis:

# Check ingress
kubectl get ingress -n retail-store

# Check service
kubectl get svc -n retail-store

# Check endpoints
kubectl get endpoints -n retail-store

Solutions:

  1. Verify ingress configuration
  2. Check service selector matches pod labels (see the example below)
  3. Ensure pods are ready
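
Point 2 is the most common culprit: the Service's spec.selector must match the pod template's labels exactly, otherwise the Service has no endpoints. For example (labels, ports, and image are illustrative):

# Service side: the selector must match the pod labels
apiVersion: v1
kind: Service
metadata:
  name: cart
  namespace: retail-store
spec:
  selector:
    app: cart          # <- must match the labels below
  ports:
    - port: 80
      targetPort: 8080
---
# Deployment side: the pod template carries the labels the Service selects on
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cart
  namespace: retail-store
spec:
  selector:
    matchLabels:
      app: cart
  template:
    metadata:
      labels:
        app: cart      # <- selected by the Service above
    spec:
      containers:
        - name: cart
          image: us-central1-docker.pkg.dev/YOUR_PROJECT_ID/retail-store/cart:latest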

Issue 3: Database Connection Errors

Symptom:

Error: Unable to connect to MySQL

Diagnosis:

# Check MySQL pod
kubectl get pods -n retail-store | grep mysql

# Check MySQL logs
kubectl logs mysql-xxxxx -n retail-store

# Test connection
kubectl run -it --rm debug --image=mysql:8.0 --restart=Never -n retail-store -- \
  mysql -h mysql -u root -p

Debugging Commands Cheat Sheet

# View all resources
kubectl get all -n retail-store

# Describe a resource
kubectl describe pod POD_NAME -n retail-store

# View logs
kubectl logs POD_NAME -n retail-store
kubectl logs POD_NAME -n retail-store --previous

# Follow logs in real-time
kubectl logs -f POD_NAME -n retail-store

# Execute commands in a pod
kubectl exec -it POD_NAME -n retail-store -- /bin/bash

# Port forward to a service
kubectl port-forward svc/cart 8080:80 -n retail-store

# View events
kubectl get events -n retail-store --sort-by='.lastTimestamp'

# Check resource usage
kubectl top nodes
kubectl top pods -n retail-store

Cost Optimization

1. Right-Size Your Resources

Monitor Resource Usage

# Check actual resource usage
kubectl top pods -n retail-store

# Compare with requested resources
kubectl describe deployment cart -n retail-store | grep -A 5 Requests

Adjust based on actual usage:

resources:
  requests:
    cpu: "100m" # Start small
    memory: "256Mi"
  limits:
    cpu: "500m" # Set reasonable limits
    memory: "512Mi"

2. Use Preemptible/Spot Instances

Create a preemptible node pool:

gcloud container node-pools create preemptible-pool \
  --cluster=ecommerce-gke-cluster \
  --preemptible \
  --num-nodes=2 \
  --machine-type=e2-medium \
  --region=us-central1

3. Implement Autoscaling

Horizontal Pod Autoscaler (HPA)

# Create HPA
kubectl autoscale deployment cart \
  --cpu-percent=70 \
  --min=2 \
  --max=10 \
  -n retail-store
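
The kubectl autoscale command above is the quick imperative way; the declarative equivalent, which you could commit to Git so ArgoCD manages it as well, would look something like this:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: cart
  namespace: retail-store
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cart
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70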

Cluster Autoscaler

Already enabled in Terraform configuration:

autoscaling {
  min_node_count = 1
  max_node_count = 10
}

4. Estimated Monthly Costs

| Component | Estimated Cost | Optimization Tips |
|---|---|---|
| GKE Cluster | $150-300 | Use Preemptible nodes for dev/test |
| Compute Instances | $200-500 | Right-size node pools, enable autoscaling |
| Artifact Registry | $10-30 | Implement image retention policies |
| Load Balancers | $20-40 | Consolidate ingresses where possible |
| Cloud Storage | $5-15 | Lifecycle policies for old backups |
| Datadog | $0-100+ | Depends on plan/usage |
| **Total** | **$385-985+** | |

Learning Resources

Official Documentation

Recommended Books

  • "Kubernetes in Action" by Marko Lukša
  • "The DevOps Handbook" by Gene Kim
  • "Site Reliability Engineering" by Google
  • "Terraform: Up & Running" by Yevgeniy Brikman

Online Courses


Conclusion

Congratulations! 🎊 You've just learned how to build and deploy a production-grade microservices platform on Google Cloud Platform!

What We've Accomplished

  • **Infrastructure**: Set up a complete GKE cluster with Terraform
  • **Microservices**: Deployed 5 microservices in multiple languages
  • **GitOps**: Implemented automated deployments with ArgoCD
  • **CI/CD**: Built a complete pipeline with GitHub Actions
  • **Monitoring**: Added full observability with Datadog
  • **Security**: Implemented security best practices
  • **Quality**: Integrated code quality checks with SonarQube

Next Steps

Now that you have a solid foundation, consider:

  1. Add More Features
    • Implement caching with Redis
    • Add message queuing with Pub/Sub
    • Integrate with Cloud CDN
  2. Enhance Security
    • Implement service mesh (Istio)
    • Add OAuth2/OIDC authentication
    • Enable Cloud Armor WAF
  3. Improve Observability
    • Add custom business metrics
    • Implement distributed tracing
    • Create SLO/SLI dashboards
  4. Scale Further
    • Multi-region deployment
    • Global load balancing
    • Database sharding

Final Thoughts

Building production systems is a journey, not a destination. This project gives you a solid foundation, but there's always more to learn and improve. Keep experimenting, keep learning, and most importantly, keep building!

If you found this guide helpful, please:

  • ⭐ Star the repository
  • 🔄 Share with your network
  • 💬 Leave a comment with your experience
  • 🐛 Report issues or suggest improvements

Connect With Me

Happy coding! 🚀


License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Acknowledgments

  • Original AWS Retail Store Sample App team
  • Google Cloud Platform documentation
  • Kubernetes community
  • ArgoCD project
  • All open-source contributors

Did you deploy this successfully? I'd love to hear about your experience! Drop a comment below! 👇

#kubernetes #gcp #devops #microservices #gitops #cloudnative #terraform #argocd #cicd #production
