Shaikh Al Amin
Build Production-Ready GCP Infrastructure from Scratch Part 03

Build Production-Ready GCP Infrastructure from Scratch: A Complete Console Guide

A 4-Part Series for Complete Beginners


Part 3: Database & Compute Resources

Overview

In this part, you'll build the data and compute layers of your infrastructure. We'll create a Cloud SQL PostgreSQL instance with high availability, set up a Managed Instance Group (MIG) for backend VMs, and deploy a cache VM with Redis and PgBouncer.

What you'll build:

  • Private Service Connection for Cloud SQL private IP
  • Cloud SQL PostgreSQL with regional HA
  • Managed Instance Group (MIG) with 2-4 backend VMs
  • Cache VM with Redis and PgBouncer

Estimated time: 60-90 minutes (includes Cloud SQL provisioning time)

Estimated cost: ~$184/month

Cumulative cost: ~$237/month


Prerequisites

Before continuing, ensure you've completed Parts 1 & 2:

  • [ ] VPC and 5 subnets exist (including private-data subnet)
  • [ ] Cloud NAT gateway is running
  • [ ] Service accounts created (including backend-dev-sa, cache-dev-sa)
  • [ ] Firewall rules created (health check, backend-to-cache)
  • [ ] Bastion host is running
  • [ ] Secrets created in Secret Manager

If you missed Parts 1-2: Start with Part 1: Foundation →


Step 1: Setup Private Service Connection

What is Private Service Connection?

Cloud SQL requires a private IP address for secure access. A private service connection (Google's docs call this "private services access", or PSA; it's distinct from the similarly named Private Service Connect) allocates a CIDR range (10.100.0.0/16) that Google-managed services like Cloud SQL use for their private IPs.

How it works:

  1. You allocate an IP range (10.100.0.0/16)
  2. Google creates a VPC peering connection
  3. Cloud SQL gets a private IP from this range

Why private IP: It's more secure than a public IP (no exposure to the internet) and offers lower latency.

Navigation Path

  1. Navigate to VPC networks → Private service connection
  2. Click "Set up private service connection"

Screenshot: Private Service Connection

Allocation Settings

Allocate an IP range:

| Field | Value | Notes |
| --- | --- | --- |
| Name | google-managed-services-dev-network | Descriptive |
| IP range | 10.100.0.0/16 | Standard range for PSA |
| Purpose | VPC peering | Required for Cloud SQL |

Screenshot: PSA IP Range

Why 10.100.0.0/16: This range doesn't overlap with our VPC (10.0.0.0/16). Google-managed services will allocate IPs from this range.

Create Connection

Click "Connect" and wait 1-2 minutes.

Verify Connection

You should see:

  • Status: Connected
  • IP range: 10.100.0.0/16
  • Peering: servicenetworking-googleapis-com
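If you prefer the CLI, the console steps above map onto gcloud roughly as follows (a sketch using the names from this guide; double-check flags against your gcloud version):

# Reserve the 10.100.0.0/16 range for Google-managed services
gcloud compute addresses create google-managed-services-dev-network \
  --global \
  --purpose=VPC_PEERING \
  --addresses=10.100.0.0 \
  --prefix-length=16 \
  --network=dev-network

# Create the peering with Google's service networking
gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-dev-network \
  --network=dev-network

# Verify: the peering servicenetworking-googleapis-com should be ACTIVE
gcloud compute networks peerings list --network=dev-network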

Step 2: Create Cloud SQL PostgreSQL Instance

What is Cloud SQL?

Cloud SQL is Google's managed PostgreSQL service:

  • Automatic backups
  • High availability (regional)
  • Point-in-time recovery
  • Automatic patching
  • 99.99% SLA (regional)

Cost Alert: Cloud SQL is the most expensive component (~$115/month for regional HA). Consider db-g1-small (~$45/month) for non-critical workloads.

Navigation Path

  1. Navigate to SQL
  2. Click "Create Instance"
  3. Click "Choose PostgreSQL"

Screenshot: Create Cloud SQL

Instance Settings

Basic Information

| Field | Value | Notes |
| --- | --- | --- |
| Instance ID | app-dev-db | Unique within the project |
| Password | (Generate strong password) | Save this password! |
| PostgreSQL version | PostgreSQL 16 | Latest stable |
| Region | europe-west1 | Same as VPC |

Password Generation: Use a password manager or generate with: openssl rand -base64 32

Zone selection:

| Field | Value | Notes |
| --- | --- | --- |
| Zone | Automatically select high availability zone | Let GCP choose for HA |

Why automatic zone: GCP selects the zone with best availability. Regional HA will create a standby in another zone.

Database Configuration

Machine Type

Click "Change" to customize:

| Field | Value | Notes |
| --- | --- | --- |
| Instance type | db-n1-standard-2 | 2 vCPU, 8GB RAM |
| Availability | Regional (High availability) | Recommended for production |

Screenshot: Cloud SQL Machine Type

Cost Impact: Regional HA adds ~$70/month over single zone. Critical for production (99.99% vs 99.95% SLA).

Storage

| Field | Value | Notes |
| --- | --- | --- |
| Storage type | SSD | Fast I/O |
| Storage capacity | 100 GB | Sufficient for most apps |
| Automatic storage increases | Enable | Prevents disk-full issues |
| Storage auto-increase limit | 1000 GB | Maximum auto-increase |

Connectivity

Private IP

| Field | Value | Notes |
| --- | --- | --- |
| Private IP | Enable | Critical: no public IP |
| Network | dev-network | Our VPC |
| IP allocation | Use automatically allocated IP | Easier than manual |

Screenshot: Cloud SQL Private IP

Associated allocation:

| Field | Value |
| --- | --- |
| Allocation | google-managed-services-dev-network |

CRITICAL - No Public IP: Do NOT enable public IP. Private IP with PSA is more secure.

Ensure: Public IP checkbox is unchecked.

Security

Automatic Backups

| Field | Value | Notes |
| --- | --- | --- |
| Automated backups | Enable | Required for production |
| Time | 3:00 AM | Low-traffic window |
| Retention | 7 days | Standard retention |

Point-in-Time Recovery

| Field | Value | Notes |
| --- | --- | --- |
| Point-in-time recovery | Enable | Allows restore to any point in time |

Why PITR: If you accidentally delete data, you can restore to any second within the retention period.
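For example, if bad data were written at 14:32 UTC, you could restore by cloning the instance to the moment before (a hedged sketch; the timestamp and target name are illustrative):

gcloud sql instances clone app-dev-db app-dev-db-restored \
  --point-in-time=2024-05-01T14:31:00Z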

Deletion Protection

| Field | Value | Notes |
| --- | --- | --- |
| Deletion protection | Enable | CRITICAL: prevents accidental deletion |

Why Deletion Protection: Prevents accidental data loss. You must disable this before deleting the instance.

IAM Authentication

| Field | Value | Notes |
| --- | --- | --- |
| IAM authentication | Enable | Passwordless auth via IAM |

Why IAM Auth: Backend VMs can connect using service account (no passwords in secrets).

Force SSL

| Field | Value | Notes |
| --- | --- | --- |
| Force SSL | Enable | Encrypt database connections |

Note: SSL is optional for private IP (traffic stays within Google's network), but recommended for defense in depth.

Customize Your Instance

Database Flags

Click "Add flag" to add performance tuning flags:

| Flag name | Value |
| --- | --- |
| max_connections | 100 |
| shared_buffers | 256MB |
| effective_cache_size | 1GB |
| cloudsql.iam_authentication | on |

Screenshot: Cloud SQL Flags

Maintenance

| Field | Value | Notes |
| --- | --- | --- |
| Preferred window | Sunday 3:00 AM | Low-traffic window |
| Maintenance channel | Preview channel | Get new features early (optional) |

Create the Instance

Review all settings and click "Create instance".

Provisioning Time: 10-15 minutes

You can monitor progress in the SQL instances list.
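If you'd rather script the instance, here's a rough gcloud equivalent of the settings above (a sketch: db-custom-2-8192 is the custom-tier spelling of 2 vCPU / 8 GB for PostgreSQL; add the remaining database flags and the maintenance window the same way, and replace PROJECT_ID):

gcloud sql instances create app-dev-db \
  --database-version=POSTGRES_16 \
  --region=europe-west1 \
  --tier=db-custom-2-8192 \
  --availability-type=REGIONAL \
  --storage-type=SSD \
  --storage-size=100GB \
  --storage-auto-increase \
  --network=projects/PROJECT_ID/global/networks/dev-network \
  --no-assign-ip \
  --backup-start-time=03:00 \
  --enable-point-in-time-recovery \
  --deletion-protection \
  --database-flags=max_connections=100,cloudsql.iam_authentication=on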

Verify Cloud SQL Creation

Once ready, you should see:

  • Instance ID: app-dev-db
  • Status: Runnable (green checkmark)
  • Region: europe-west1
  • IP addresses: Private IP assigned (e.g., 10.100.0.2)

Screenshot: Cloud SQL Ready

Copy the private IP - we'll need it for the secret update in Step 5.


Step 3: Create Database

What is a Database?

A Cloud SQL instance can host multiple databases. We'll create one for our application.

Navigation Path

  1. Click on app-dev-db instance
  2. Click the "Databases" tab
  3. Click "Create database"

Database Configuration

| Field | Value | Notes |
| --- | --- | --- |
| Database name | appdb | Our application database |
| Charset | (Default) | UTF-8 |
| Collation | (Default) | Default collation |

Screenshot: Create Database

Click "Create".

Verify Database

You should see appdb in the databases list:

  • Database name: appdb
  • Size: ~8MB (empty database)

Step 4: Create IAM Database User

What is IAM Authentication?

IAM authentication allows VMs to connect to Cloud SQL using their service account (no passwords needed).

Navigation Path

  1. Still on app-dev-db instance page
  2. Click the "Users" tab
  3. Click "Add user account"

User Configuration

| Field | Value | Notes |
| --- | --- | --- |
| User type | IAM service account | Use IAM, not a password |
| Service account | backend-dev-sa@PROJECT_ID.iam.gserviceaccount.com | Our backend SA |

Screenshot: IAM User

Click "Add".

Verify IAM User

You should see:

  • Username: backend-dev-sa@PROJECT_ID.iam (the service account email without the .gserviceaccount.com suffix)
  • Type: IAM service account

Why IAM User: Backend VMs can now connect without storing passwords in secrets. They use their service account identity.
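A CLI sketch of the same step, plus a login test you can run later from a VM running as backend-dev-sa (assumes the service account holds the Cloud SQL Instance User and Client roles; the IAM username is the service-account email with the .gserviceaccount.com suffix removed, and the OAuth access token acts as the password):

# Create the IAM database user
gcloud sql users create backend-dev-sa@PROJECT_ID.iam \
  --instance=app-dev-db \
  --type=cloud_iam_service_account

# Later, from a backend VM: token-as-password login
PGPASSWORD=$(gcloud auth print-access-token) \
  psql "host=10.100.0.2 user=backend-dev-sa@PROJECT_ID.iam dbname=appdb sslmode=require"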


Step 5: Update Secret with Cloud SQL Connection Details

Why Update the Secret?

Now that Cloud SQL has a private IP, we need to update the db-credentials-dev secret with the correct connection details.

Navigation Path

  1. Navigate to Security → Secret Manager
  2. Click on db-credentials-dev
  3. Click "Create new version" at the top

Update Secret Value

Replace the secret value with the updated JSON:

{
  "username": "backend-dev-sa",
  "password": "use-iam-authentication",
  "host": "10.100.0.2",
  "database": "appdb",
  "port": "5432"
}

Replace host value: Use the private IP you copied from Cloud SQL (Step 2).

Note on credentials:

  • username: Uses IAM service account name
  • password: Not needed with IAM auth (placeholder)
  • host: Cloud SQL private IP from PSA range

Click "Create new version".

Verify Secret Update

You should see 2 versions:

  • Version 1: Original (with placeholder host)
  • Version 2: Updated (with Cloud SQL private IP)
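CLI sketch (pipes the JSON straight into a new secret version):

cat <<EOF | gcloud secrets versions add db-credentials-dev --data-file=-
{
  "username": "backend-dev-sa",
  "password": "use-iam-authentication",
  "host": "10.100.0.2",
  "database": "appdb",
  "port": "5432"
}
EOF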

Step 6: Create Backend Instance Template

What is an Instance Template?

An instance template defines VM configuration for a Managed Instance Group (MIG). It includes:

  • Machine type
  • Boot disk image
  • Startup script
  • Service account
  • Network settings

Why Templates: Ensure all VMs in the MIG have identical configuration.

Navigation Path

  1. Navigate to Compute Engine → Instance templates
  2. Click "Create instance template"

Template Configuration

Basic Settings

| Field | Value | Notes |
| --- | --- | --- |
| Name | dev-backend-template | Descriptive |
| Machine type | e2-medium | 2 vCPU, 4GB RAM |
| Boot disk type | pd-balanced | Balances cost/performance |
| Boot disk size | 20 GB | Sufficient for the app |

Screenshot: Instance Template

Boot Disk

Click "Change":

| Field | Value |
| --- | --- |
| OS | Ubuntu |
| Version | Ubuntu 22.04 LTS Minimal |
| Disk type | pd-balanced |
| Size | 20 GB |

Network Interface

| Field | Value | Notes |
| --- | --- | --- |
| Network | dev-network | Our VPC |
| Subnetwork | private-backend | Private subnet |
| Network interface type | IPv4 | IPv4 only |
| External IPv4 address | None | No public IP! |

CRITICAL - No Public IP: Backend VMs must NOT have public IP. They access internet via Cloud NAT.

Screenshot: Template Network

Service Account

| Field | Value |
| --- | --- |
| Service account | backend-dev-sa |

Metadata - Startup Script

This script runs automatically when the VM starts. It:

  1. Installs dependencies (nginx, Node.js)
  2. Clones your application
  3. Fetches secrets from Secret Manager
  4. Starts the application with PM2

Add Metadata Item

Expand "Metadata" section:

Click "Add item":

| Key | Value |
| --- | --- |
| startup-script | (See script below) |

Startup Script:

#!/bin/bash
# Backend VM Startup Script

set -e  # Exit on error

# Logging
exec > >(tee /var/log/startup-script.log)
exec 2>&1

echo "=== Backend Startup Script Begin $(date) ==="

# Install dependencies
apt-get update
apt-get install -y nginx git curl wget jq  # jq compacts the DB credentials JSON later in this script

# Install Node Exporter for metrics
NODE_EXPORTER_VERSION="1.6.1"
echo "Installing Node Exporter ${NODE_EXPORTER_VERSION}..."

useradd --no-create-home --shell /bin/false node_exporter || true

wget -q "https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXPORTER_VERSION}/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz" -O /tmp/node_exporter.tar.gz
tar xzf /tmp/node_exporter.tar.gz -C /tmp
cp /tmp/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64/node_exporter /usr/local/bin/
chown node_exporter:node_exporter /usr/local/bin/node_exporter
chmod +x /usr/local/bin/node_exporter
rm -rf /tmp/node_exporter*

# Create systemd service for Node Exporter
cat > /etc/systemd/system/node_exporter.service <<'EOF'
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
Type=simple
User=node_exporter
ExecStart=/usr/local/bin/node_exporter --web.listen-address=:9100
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable node_exporter
systemctl start node_exporter

echo "Node Exporter running on port 9100"

# Install nvm and Node.js
if [ ! -d "$HOME/.nvm" ]; then
  curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
fi

export NVM_DIR="$HOME/.nvm"
[ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"

nvm install 20
nvm use 20
nvm alias default 20

# Clone application (replace with your repo)
if [ ! -d "/opt/app" ]; then
  echo "Cloning application..."
  git clone https://github.com/your-org/your-repo.git /opt/app
  cd /opt/app
  npm install
else
  cd /opt/app
  git pull
  npm install
fi

# Fetch secrets from Secret Manager (jq -c compacts the JSON credentials to one line)
echo "Fetching secrets..."
DB_CREDENTIALS=$(gcloud secrets versions access latest --secret="db-credentials-dev" | jq -c .)
API_KEY=$(gcloud secrets versions access latest --secret="api-key-dev")

# Create .env file. Note the unquoted EOF: with <<'EOF' the shell would write
# ${DB_CREDENTIALS} and ${API_KEY} literally instead of expanding them.
# DATABASE_URL holds the JSON document from the secret; parse it in your app.
cat > /opt/app/.env <<EOF
DATABASE_URL='${DB_CREDENTIALS}'
API_KEY=${API_KEY}
PORT=3000
NODE_ENV=production
EOF

# Configure nginx as reverse proxy
cat > /etc/nginx/sites-available/default <<'EOF'
server {
  listen 80;
  server_name _;

  location / {
    proxy_pass http://localhost:3000;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_cache_bypass $http_upgrade;
  }
}
EOF

systemctl restart nginx

# Install and start PM2
npm install -g pm2
cd /opt/app
pm2 delete nestjs-app 2>/dev/null || true
pm2 start npm --name "nestjs-app" -- start
pm2 save
pm2 startup systemd

echo "=== Startup Script Complete $(date) ==="
echo "Application running on port 3000"

Screenshot: Startup Script

Modify the script: Replace git clone URL with your actual repository.

Management - Automation

| Field | Value | Notes |
| --- | --- | --- |
| Enable autohealing | (Configure in MIG) | Health check based |

Create the Template

Click "Create".

Creation Time: 1-2 minutes
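For repeatable builds, the same template can be created from the CLI (a sketch; assumes the startup script above is saved locally as startup.sh, and PROJECT_ID is your project):

gcloud compute instance-templates create dev-backend-template \
  --machine-type=e2-medium \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud \
  --boot-disk-type=pd-balanced \
  --boot-disk-size=20GB \
  --network=dev-network \
  --subnet=private-backend \
  --region=europe-west1 \
  --no-address \
  --service-account=backend-dev-sa@PROJECT_ID.iam.gserviceaccount.com \
  --scopes=cloud-platform \
  --metadata-from-file=startup-script=startup.sh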


Step 7: Create Managed Instance Group (MIG)

What is a MIG?

A Managed Instance Group (MIG):

  • Ensures a specified number of VMs are running
  • Auto-heals unhealthy VMs
  • Auto-scales based on CPU/load
  • Performs rolling updates

Navigation Path

  1. Navigate to Compute Engine → Instance groups
  2. Click "Create instance group"

Group Configuration

Group Type

| Field | Value | Notes |
| --- | --- | --- |
| Group type | Regional MIG | High availability across zones |
| Name | dev-backend-mig | Descriptive |
| Region | europe-west1 | Same as VPC |

Why Regional MIG: Spreads VMs across multiple zones for HA. If one zone fails, VMs in other zones continue serving traffic.

Instance Template

| Field | Value |
| --- | --- |
| Instance template | dev-backend-template |

Group Size

| Field | Value | Notes |
| --- | --- | --- |
| Group size | 2 | Minimum for HA |

Screenshot: MIG Creation

Autoscaling

Expand "Autoscaling policy":

| Field | Value | Notes |
| --- | --- | --- |
| Autoscaling mode | On | Enable autoscaling |
| Minimum instances | 2 | Always keep 2 VMs |
| Maximum instances | 4 | Scale up to 4 VMs |
| Autoscaling metric | CPU utilization | Scale based on CPU |
| Target CPU utilization | 70% | Scale out when CPU > 70% |
| Cooldown period | 300 seconds | Wait 5 min between scaling events |

Screenshot: MIG Autoscaling

Autohealing

Expand "Autohealing policy":

| Field | Value | Notes |
| --- | --- | --- |
| Health check | Create health check | Click the link to create |

Health Check Configuration

| Field | Value | Notes |
| --- | --- | --- |
| Name | dev-backend-http-health-check | Descriptive |
| Protocol | HTTP | HTTP health check |
| Port | 3000 | NestJS app port |
| Request path | /health | Your app must implement this |
| Check interval | 30 seconds | How often to check |
| Timeout | 10 seconds | Response timeout |
| Healthy threshold | 1 consecutive success | Mark healthy after 1 success |
| Unhealthy threshold | 3 consecutive failures | Mark unhealthy after 3 failures |

Screenshot: Health Check

CRITICAL - /health endpoint: Your NestJS application MUST implement a /health endpoint that returns HTTP 200. Without this, health checks will fail and VMs will be constantly recreated.
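You can sanity-check the endpoint yourself before relying on autohealing (a sketch; assumes your app serves /health on port 3000):

# On a backend VM (reached via the bastion): expect "200"
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/health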

Back to MIG Autohealing:

| Field | Value |
| --- | --- |
| Initial delay | 300 seconds |

Why 5 min delay: The startup script takes time (nvm install, npm install can take 3-5 minutes). If health checks start too early, VMs will be marked unhealthy and recreated (infinite loop).

Create the MIG

Click "Create".

VM Creation Time: 5-10 minutes per VM
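The whole step as a CLI sketch (health check, MIG with autohealing, then autoscaling; values mirror the tables above):

gcloud compute health-checks create http dev-backend-http-health-check \
  --port=3000 --request-path=/health \
  --check-interval=30s --timeout=10s \
  --healthy-threshold=1 --unhealthy-threshold=3

gcloud compute instance-groups managed create dev-backend-mig \
  --region=europe-west1 \
  --template=dev-backend-template \
  --size=2 \
  --health-check=dev-backend-http-health-check \
  --initial-delay=300s

gcloud compute instance-groups managed set-autoscaling dev-backend-mig \
  --region=europe-west1 \
  --min-num-replicas=2 --max-num-replicas=4 \
  --target-cpu-utilization=0.70 \
  --cool-down-period=300s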

Verify MIG Creation

You should see:

  • Group name: dev-backend-mig
  • Status: Running
  • Instances: 2/2 (healthy)
  • Autoscaling: Enabled (2-4 VMs)

Click on the MIG to see individual instances:

  • dev-backend-xxxxx (europe-west1-b)
  • dev-backend-yyyyy (europe-west1-c)

Zones: Regional MIG spreads VMs across zones for HA.


Step 8: Create Cache VM (Redis + PgBouncer)

What is the Cache VM?

A single VM running:

  • Redis: In-memory cache for session data, query results
  • PgBouncer: Connection pooler for Cloud SQL (reduces connections)

Why combine: Cost optimization. A single VM handles both services (~$23/month vs ~$46/month for separate VMs).

Navigation Path

  1. Navigate to Compute Engine → VM instances
  2. Click "Create instance"

VM Configuration

Basic Settings

| Field | Value | Notes |
| --- | --- | --- |
| Name | dev-cache-vm | Descriptive |
| Region | europe-west1 | Same as VPC |
| Zone | europe-west1-b | Zone b |

Machine Type

| Field | Value | Notes |
| --- | --- | --- |
| Machine type | e2-medium | 2 vCPU, 4GB RAM |

Boot Disk

| Field | Value |
| --- | --- |
| OS | Ubuntu 22.04 LTS Minimal |
| Disk type | pd-balanced |
| Size | 20 GB |

Network Interface

| Field | Value | Notes |
| --- | --- | --- |
| Network | dev-network | Our VPC |
| Subnetwork | private-cache | Cache subnet |
| External IPv4 address | None | No public IP |

Service Account

| Field | Value |
| --- | --- |
| Service account | cache-dev-sa |

Metadata - Startup Scripts

We'll store three scripts in custom metadata keys (Redis, PgBouncer, and observability agents). One caveat: Compute Engine only executes the startup-script (or startup-script-url) key automatically, so these custom keys won't run by themselves. A small startup-script wrapper that fetches and runs all three is shown after the third item.

Metadata Item 1: Redis Startup Script

Key: redis-startup-script

Value:

#!/bin/bash
# Redis Startup Script

set -e

echo "Configuring Redis..."

# Install Redis
apt-get update
apt-get install -y redis-server

# Configure Redis to listen on all interfaces (Ubuntu ships "bind 127.0.0.1 ::1")
sed -i 's/^bind 127.0.0.1.*/bind 0.0.0.0/' /etc/redis/redis.conf

# Append overrides (for these directives, the last occurrence wins):
# memory cap, LRU eviction, and protected-mode off so non-loopback clients
# are accepted without a password. Access is still restricted by the VPC
# firewall rules created in Part 2.
cat >> /etc/redis/redis.conf <<'EOF'
maxmemory 512mb
maxmemory-policy allkeys-lru
protected-mode no
EOF

# Restart Redis
systemctl restart redis-server

# Enable Redis on boot
systemctl enable redis-server

echo "Redis configured and running on port 6379"

Metadata Item 2: PgBouncer Startup Script

Key: pgbouncer-startup-script

Value:

#!/bin/bash
# PgBouncer Startup Script

set -e

# Replace with your Cloud SQL private IP
CLOUD_SQL_IP="10.100.0.2"  # From Step 2

echo "Configuring PgBouncer for Cloud SQL at ${CLOUD_SQL_IP}..."

# Install PgBouncer
apt-get update
apt-get install -y pgbouncer

# Configure PgBouncer
cat > /etc/pgbouncer/pgbouncer.ini << EOF
[databases]
appdb = host=${CLOUD_SQL_IP} port=5432 dbname=appdb

[pgbouncer]
pool_mode = transaction
max_client_conn = 100
default_pool_size = 50
reserve_pool_size = 5
reserve_pool_timeout = 3
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = any
EOF

# auth_type=any ignores client credentials entirely, so this userlist is only
# a placeholder (switching to md5/scram later is then a one-line change).
# Use real authentication in production.
echo "\"app_admin\" \"any_password\"" > /etc/pgbouncer/userlist.txt

# Restart PgBouncer
systemctl restart pgbouncer

# Enable PgBouncer on boot
systemctl enable pgbouncer

echo "PgBouncer configured and running on port 6432"

Replace CLOUD_SQL_IP: Use the private IP from Step 2.

Metadata Item 3: Observability Agents

Key: observability-agents-script

Value:

#!/bin/bash
# Observability Agents (Node Exporter + Promtail)

set -e

echo "Installing observability agents..."

# Install Node Exporter
NODE_EXPORTER_VERSION="1.6.1"

useradd --no-create-home --shell /bin/false node_exporter || true

wget -q "https://github.com/prometheus/node_exporter/releases/download/v${NODE_EXPORTER_VERSION}/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64.tar.gz" -O /tmp/node_exporter.tar.gz
tar xzf /tmp/node_exporter.tar.gz -C /tmp
cp /tmp/node_exporter-${NODE_EXPORTER_VERSION}.linux-amd64/node_exporter /usr/local/bin/
chown node_exporter:node_exporter /usr/local/bin/node_exporter
chmod +x /usr/local/bin/node_exporter

cat > /etc/systemd/system/node_exporter.service <<'EOF'
[Unit]
Description=Prometheus Node Exporter
After=network.target

[Service]
Type=simple
User=node_exporter
ExecStart=/usr/local/bin/node_exporter --web.listen-address=:9100
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable node_exporter
systemctl start node_exporter

echo "Node Exporter installed on port 9100"

# Install Promtail (Loki URL will be updated in Part 4)
PROMTAIL_VERSION="2.9.0"
LOKI_URL="http://10.0.5.11:3100"  # Loki VM IP

useradd --no-create-home --shell /bin/false promtail || true

wget -q "https://github.com/grafana/loki/releases/download/v${PROMTAIL_VERSION}/promtail-linux-amd64.zip" -O /tmp/promtail.zip
unzip -o /tmp/promtail.zip -d /tmp
mv /tmp/promtail-linux-amd64/promtail /usr/local/bin/
chown promtail:promtail /usr/local/bin/promtail
chmod +x /usr/local/bin/promtail

mkdir -p /etc/promtail
cat > /etc/promtail/config.yml << EOF
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: ${LOKI_URL}/loki/api/v1/push

scrape_configs:
  - job_name: cache
    static_configs:
      - targets:
          - localhost
        labels:
          job: cache
          env: dev
          host: $(hostname)
          # Without __path__, Promtail has no files to tail and ships nothing
          __path__: /var/log/*.log
EOF

chown promtail:promtail /etc/promtail/config.yml

cat > /etc/systemd/system/promtail.service <<'EOF'
[Unit]
Description=Promtail Log Agent
After=network.target

[Service]
Type=simple
User=promtail
ExecStart=/usr/local/bin/promtail -config.file /etc/promtail/config.yml
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable promtail
systemctl start promtail

echo "Promtail installed and shipping logs to ${LOKI_URL}"
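As noted at the start of this section, custom metadata keys aren't executed automatically. A minimal wrapper to add under the standard startup-script key (a sketch; it reads each custom key from the metadata server and pipes it to bash):

#!/bin/bash
# startup-script: run the three component scripts stored in custom metadata
set -e
MD="http://metadata.google.internal/computeMetadata/v1/instance/attributes"
for key in redis-startup-script pgbouncer-startup-script observability-agents-script; do
  echo "Running ${key}..."
  curl -sf -H "Metadata-Flavor: Google" "${MD}/${key}" | bash
done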

Create the VM

Click "Create".

VM Creation Time: 2-3 minutes
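If you prefer the CLI, a sketch of the same VM (assumes the wrapper and the three scripts above are saved locally as wrapper.sh, redis.sh, pgbouncer.sh, and observability.sh; replace PROJECT_ID):

gcloud compute instances create dev-cache-vm \
  --zone=europe-west1-b \
  --machine-type=e2-medium \
  --image-family=ubuntu-2204-lts \
  --image-project=ubuntu-os-cloud \
  --boot-disk-type=pd-balanced \
  --boot-disk-size=20GB \
  --network=dev-network \
  --subnet=private-cache \
  --no-address \
  --service-account=cache-dev-sa@PROJECT_ID.iam.gserviceaccount.com \
  --scopes=cloud-platform \
  --metadata-from-file=startup-script=wrapper.sh,redis-startup-script=redis.sh,pgbouncer-startup-script=pgbouncer.sh,observability-agents-script=observability.sh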

Verify Cache VM

You should see:

  • Name: dev-cache-vm
  • Status: Running
  • Internal IP: 10.0.4.x
  • Zone: europe-west1-b

Screenshot: Cache VM Ready

Test Redis Connectivity

From the bastion host:

# SSH to bastion
gcloud compute ssh dev-bastion --tunnel-through-iap

# From bastion, SSH to the cache VM via its internal IP
# (replace 10.0.4.2 with the IP shown in the console)
ssh 10.0.4.2

# Test Redis
redis-cli PING
# Expected: PONG

# Test PgBouncer (install the client first if needed:
# sudo apt-get install -y postgresql-client)
psql -h 127.0.0.1 -p 6432 -U app_admin -d appdb
# Expected: a psql prompt connected to appdb

Part 3 Verification Checklist

Before moving to Part 4, verify:

  • [ ] Private Service Connection is active (10.100.0.0/16)
  • [ ] Cloud SQL instance is running with private IP
  • [ ] Database "appdb" exists
  • [ ] IAM user created for backend-dev-sa
  • [ ] Secret updated with Cloud SQL private IP
  • [ ] Backend MIG has 2 running VMs (healthy)
  • [ ] Autohealing health check is configured
  • [ ] Cache VM is running
  • [ ] Redis is listening on port 6379
  • [ ] PgBouncer is listening on port 6432
  • [ ] Firewall rule allows backend to cache (ports 6379, 6432)

Screenshot: Completed Part 3


Cost Summary - Part 3

| Component | Monthly Cost | Notes |
| --- | --- | --- |
| Cloud SQL (regional HA) | ~$115 | db-n1-standard-2 |
| Backend MIG (2-4 VMs) | ~$46-92 | e2-medium, autoscaling 2-4 |
| Cache VM | ~$23 | e2-medium |
| Total Part 3 | ~$184-230 | ~$184/month at the 2-VM baseline |

Cumulative Cost (Part 1 + 2 + 3):

| Component | Monthly Cost |
| --- | --- |
| Part 1 (VPC, NAT, etc.) | ~$42 |
| Part 2 (Bastion) | ~$11 |
| Part 3 (Cloud SQL, MIG, Cache) | ~$184 |
| Total | ~$237/month |

Troubleshooting - Part 3

Issue: Private Service Connection Fails

Symptom: "Allocation failed" error

Solution:

  1. Verify 10.100.0.0/16 doesn't overlap with VPC
  2. Check VPC peering status
  3. Ensure you're in the correct project
# Check PSA status
gcloud compute networks peerings list \
  --network=dev-network

Issue: Cloud SQL Creation Fails

Symptom: "Instance creation failed"

Solution:

  1. Verify Private Service Connection is active
  2. Check subnet has Private Google Access enabled
  3. Ensure sufficient quota (SQL instances)

Issue: Backend MIG VMs Unhealthy

Symptom: 0/2 VMs healthy

Solution:

  1. Check health check configuration (/health endpoint)
  2. Increase initial delay to 300+ seconds
  3. Increase unhealthy threshold to 5
  4. Check VM serial port logs for startup script errors
# View VM logs
gcloud compute instances get-serial-port-output INSTANCE_NAME \
  --zone=europe-west1-b --port=1

Issue: Cache VM Cannot Connect to Cloud SQL

Symptom: PgBouncer connection timeout

Solution:

  1. Verify Cloud SQL private IP is correct
  2. Check firewall allows cache VM to Cloud SQL
  3. Ensure Private Service Connection is active
  4. Verify PgBouncer configuration
# From the cache VM: Cloud SQL doesn't answer ICMP ping, so test the TCP
# port instead of pinging
timeout 3 bash -c '</dev/tcp/10.100.0.2/5432' && echo "port 5432 open"

What's Next - Part 4?

In Part 4: Observability & Load Balancer, you'll build:

  • Prometheus VM for metrics collection
  • Loki VM for log aggregation
  • Grafana dashboards for visualization
  • External Application Load Balancer
  • End-to-end testing

Continue to Part 4: Observability & Load Balancer →


Data and Compute Layers Complete! Your Cloud SQL database, backend MIG, and cache VM are ready. Next, we'll add observability (Prometheus, Loki, Grafana) and the load balancer for external access.
