Akhilesh Mishra
Build Production-Ready Google Cloud Infrastructure with Terraform in 2025

Complete step-by-step guide to creating VPC networks, subnets, and storage buckets using Infrastructure as Code

Part 1 of our comprehensive 6-part Terraform on Google Cloud series - from beginner setup to advanced DevOps automation

Ready to transform your Google Cloud infrastructure management? This comprehensive guide kicks off our Terraform on Google Cloud series, where you'll master professional-level cloud automation, GitHub Actions CI/CD, advanced security patterns, and production-ready DevOps practices.

By the end of this tutorial series, you'll have hands-on experience with:

  • Infrastructure as Code fundamentals and best practices
  • Automated CI/CD pipelines with GitHub Actions and Workload Identity Federation
  • Production security with Secret Manager and key rotation
  • Serverless computing with Cloud Functions and Cloud Run
  • Big data processing with Cloud SQL and Dataproc

πŸš€ Perfect for cloud engineers, DevOps practitioners, and developers ready to level up their infrastructure automation game.

What You'll Build in This Series

Part 1 (This Post): Foundation - VPC, Subnets, and Storage

Part 2: Compute Engine VMs with GitHub Actions CI/CD

Part 3: Secure PostgreSQL with Workload Identity Federation

Part 4: Secret Management and Automated Key Rotation

Part 5: Serverless FastAPI with Cloud Run

Part 6: Big Data Processing with Dataproc (Advanced)

Understanding Google Cloud Networking Fundamentals

What is a VPC in Google Cloud?

A Virtual Private Cloud (VPC) serves as your isolated network environment within Google Cloud:

  • πŸ”’ Network Isolation: Complete separation from other projects and tenants
  • πŸ›‘οΈ Security Control: Fine-grained access control and firewall rules
  • 🌍 Global Resource: Automatically spans multiple regions worldwide
  • πŸ“ IP Management: Custom IP addressing, subnets, and routing control
  • πŸ”— Connectivity: VPN, interconnect, and peering capabilities

What is a Subnet in Google Cloud?

Subnets are regional network segments within your VPC that:

  • Define IP address ranges (CIDR blocks) for your resources
  • Enable regional resource deployment and isolation
  • Provide traffic segmentation and security boundaries
  • Support private Google API access for enhanced security
  • Allow custom routing and firewall rule application

Pro Tip: Unlike AWS, Google Cloud subnets are regional (not zonal), giving you automatic high availability across zones within a region.


Prerequisites: Setting Up Your Environment

Before diving into Terraform automation, ensure you have:

βœ… Google Cloud Platform account (Sign up here)

βœ… Active GCP project with billing enabled

βœ… Terraform installed (Installation guide) - Version 1.0+ recommended

βœ… Google Cloud SDK (Download here)

βœ… Basic understanding of cloud networking concepts

Verify Your Installation

# Check Google Cloud SDK
gcloud --version
gcloud init 

# Verify Terraform
terraform --version

# Check your active project
gcloud config get-value project

Google Cloud Authentication Setup

Creating a Service Account (Development Setup)

For this tutorial series, we'll create a service account with owner permissions. Important: In production environments, always follow the principle of least privilege and use keyless authentication with Workload Identity Federation (covered in Part 3).

# Set your project ID (replace with your actual project ID)
export PROJECT_ID="your-gcp-project-id"
gcloud config set project $PROJECT_ID

# Create service account for Terraform
gcloud iam service-accounts create terraform-automation \
    --description="Service account for Terraform infrastructure automation" \
    --display-name="Terraform Automation SA"

# Grant owner role (development only - we'll improve this in later parts)
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:terraform-automation@${PROJECT_ID}.iam.gserviceaccount.com" \
    --role="roles/owner"

# Generate service account key
gcloud iam service-accounts keys create terraform-sa-key.json \
    --iam-account terraform-automation@${PROJECT_ID}.iam.gserviceaccount.com

Set Environment Variables

# Set credentials for Terraform
export GOOGLE_APPLICATION_CREDENTIALS="terraform-sa-key.json"

# Verify authentication
gcloud auth application-default print-access-token

Project Structure: Organizing Your Terraform Code

Create a clean, maintainable project structure that will scale throughout our series:

mkdir terraform-gcp-foundation && cd terraform-gcp-foundation

# Create core Terraform files
touch main.tf variables.tf outputs.tf providers.tf terraform.tfvars

# Create resource-specific files
touch networking.tf storage.tf

# Create backend configuration
touch backend.tf

# Initialize git for version control
git init
echo "*.tfvars" >> .gitignore
echo "terraform-sa-key.json" >> .gitignore
echo ".terraform/" >> .gitignore
echo "*.tfstate*" >> .gitignore

Why this structure?

  • Separation of concerns: Each file has a specific purpose
  • Scalability: Easy to add new resources in dedicated files
  • Team collaboration: Clear organization for multiple developers
  • Version control ready: Proper .gitignore for sensitive files

Terraform Configuration Files Explained

1. Provider Configuration (providers.tf)

terraform {
  required_version = ">= 1.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta" 
      version = "~> 5.0"
    }
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
}

provider "google-beta" {
  project = var.project_id
  region  = var.region
}

2. Input Variables (variables.tf)

variable "project_id" {
  type        = string
  description = "Google Cloud Project ID"
  validation {
    condition     = length(var.project_id) >= 6 && length(var.project_id) <= 30
    error_message = "Project ID must be between 6 and 30 characters."
  }
}

variable "region" {
  type        = string
  description = "GCP region for resources"
  default     = "us-central1"
  validation {
    condition = contains([
      "us-central1", "us-east1", "us-west1", "us-west2",
      "europe-west1", "europe-west2", "asia-east1"
    ], var.region)
    error_message = "Region must be a valid GCP region."
  }
}

variable "environment" {
  type        = string
  description = "Environment name (dev, staging, prod)"
  default     = "dev"
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

variable "vpc_name" {
  type        = string
  description = "Name for the VPC network"
  default     = "main-vpc"
}

variable "subnet_name" {
  type        = string
  description = "Name for the subnet"
  default     = "main-subnet"
}

variable "subnet_cidr" {
  type        = string
  description = "CIDR range for the subnet"
  default     = "10.0.1.0/24"
  validation {
    condition     = can(cidrhost(var.subnet_cidr, 0))
    error_message = "Subnet CIDR must be a valid IPv4 CIDR block."
  }
}
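The CIDR defaults above (10.0.1.0/24 plus the 10.1.0.0/16 and 10.2.0.0/16 secondary ranges used later in networking.tf) are hard-coded. If you prefer to derive non-overlapping ranges from a single base block, Terraform's built-in `cidrsubnet()` function can compute them. This is an optional sketch, not part of the files above:

```hcl
# Sketch: deriving the ranges used in this post from one base block.
# cidrsubnet(prefix, newbits, netnum) extends the prefix length by `newbits`
# and selects the `netnum`-th network of that size.
locals {
  base_cidr = "10.0.0.0/8"

  subnet_cidr   = cidrsubnet(local.base_cidr, 16, 1) # 10.0.1.0/24
  pods_cidr     = cidrsubnet(local.base_cidr, 8, 1)  # 10.1.0.0/16
  services_cidr = cidrsubnet(local.base_cidr, 8, 2)  # 10.2.0.0/16
}
```

Computing ranges this way makes it harder to introduce overlapping subnets as the series grows.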

3. Networking Resources (networking.tf)

# VPC Network
resource "google_compute_network" "main_vpc" {
  name                    = "${var.environment}-${var.vpc_name}"
  project                 = var.project_id
  auto_create_subnetworks = false
  routing_mode            = "REGIONAL"
  mtu                     = 1460

  description = "Main VPC network for ${var.environment} environment"

  # Keep the default internet route that is created with the network
  delete_default_routes_on_create = false
}

# Subnet
resource "google_compute_subnetwork" "main_subnet" {
  name                     = "${var.environment}-${var.subnet_name}"
  project                  = var.project_id
  region                   = var.region
  network                  = google_compute_network.main_vpc.name
  ip_cidr_range            = var.subnet_cidr
  private_ip_google_access = true

  description = "Primary subnet in ${var.region} for ${var.environment}"

  # Enable flow logs for security monitoring
  log_config {
    aggregation_interval = "INTERVAL_10_MIN"
    flow_sampling        = 0.5
    metadata             = "INCLUDE_ALL_METADATA"
  }

  # Secondary IP ranges for future use (GKE, etc.)
  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.1.0.0/16"
  }

  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.2.0.0/16"
  }
}

# Cloud Router for Cloud NAT (we'll use this in Part 2)
resource "google_compute_router" "main_router" {
  name    = "${var.environment}-cloud-router"
  project = var.project_id
  region  = var.region
  network = google_compute_network.main_vpc.id

  description = "Cloud Router for NAT gateway"
}
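Note that this VPC is created without any firewall rules, so nothing can reach instances inside it yet; Part 2 adds the rules properly. As a preview, a minimal rule allowing internal traffic between the ranges defined above might look like this (a sketch only, not applied in this part):

```hcl
# Sketch for Part 2: allow traffic between resources inside our
# primary and secondary ranges. Not part of this part's deployment.
resource "google_compute_firewall" "allow_internal" {
  name    = "${var.environment}-allow-internal"
  project = var.project_id
  network = google_compute_network.main_vpc.name

  allow {
    protocol = "tcp"
    ports    = ["0-65535"]
  }
  allow {
    protocol = "udp"
    ports    = ["0-65535"]
  }
  allow {
    protocol = "icmp"
  }

  # Only traffic originating from our own ranges
  source_ranges = [var.subnet_cidr, "10.1.0.0/16", "10.2.0.0/16"]
}
```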

4. Storage Resources (storage.tf)

# Storage bucket for Terraform state (remote backend)
resource "google_storage_bucket" "terraform_state" {
  name     = "${var.project_id}-${var.environment}-terraform-state"
  project  = var.project_id
  location = var.region

  # Force destroy for development (disable in production)
  force_destroy = var.environment == "dev"

  # Enable versioning for state file safety
  versioning {
    enabled = true
  }

  # Uniform bucket-level access for better security
  uniform_bucket_level_access = true

  # Encryption at rest: Google-managed keys apply by default;
  # we'll switch to a customer-managed KMS key in Part 4

  # Lifecycle management to control costs: clean up old noncurrent
  # versions, but never delete the live state object itself
  lifecycle_rule {
    condition {
      age        = 30
      with_state = "ARCHIVED"
    }
    action {
      type = "Delete"
    }
  }

  # Labels for resource management
  labels = {
    environment = var.environment
    purpose     = "terraform-state"
    managed-by  = "terraform"
  }
}

# General purpose storage bucket
resource "google_storage_bucket" "app_storage" {
  name     = "${var.project_id}-${var.environment}-app-storage"
  project  = var.project_id
  location = var.region

  # Storage class optimization
  storage_class = "STANDARD"

  uniform_bucket_level_access = true

  # CORS configuration for web applications
  # (wildcards are convenient for dev; lock down origins in production)
  cors {
    origin          = ["*"]
    method          = ["GET", "HEAD", "PUT", "POST", "DELETE"]
    response_header = ["*"]
    max_age_seconds = 3600
  }

  labels = {
    environment = var.environment
    purpose     = "application-storage"
    managed-by  = "terraform"
  }
}

# IAM binding for service account access to state bucket
resource "google_storage_bucket_iam_member" "terraform_state_access" {
  bucket = google_storage_bucket.terraform_state.name
  role   = "roles/storage.admin"
  member = "serviceAccount:terraform-automation@${var.project_id}.iam.gserviceaccount.com"
}

5. Outputs (outputs.tf)

output "vpc_name" {
  value       = google_compute_network.main_vpc.name
  description = "Name of the created VPC"
}

output "vpc_id" {
  value       = google_compute_network.main_vpc.id
  description = "ID of the created VPC"
}

output "vpc_self_link" {
  value       = google_compute_network.main_vpc.self_link
  description = "Self-link of the VPC (useful for other resources)"
}

output "subnet_name" {
  value       = google_compute_subnetwork.main_subnet.name
  description = "Name of the created subnet"
}

output "subnet_cidr" {
  value       = google_compute_subnetwork.main_subnet.ip_cidr_range
  description = "CIDR range of the subnet"
}

output "subnet_gateway_address" {
  value       = google_compute_subnetwork.main_subnet.gateway_address
  description = "Gateway IP address of the subnet"
}

output "terraform_state_bucket" {
  value       = google_storage_bucket.terraform_state.name
  description = "Name of the Terraform state storage bucket"
}

output "app_storage_bucket" {
  value       = google_storage_bucket.app_storage.name
  description = "Name of the application storage bucket"
}

output "cloud_router_name" {
  value       = google_compute_router.main_router.name
  description = "Name of the Cloud Router (for NAT in Part 2)"
}

# Network details for use in subsequent parts
output "network_details" {
  value = {
    vpc_name    = google_compute_network.main_vpc.name
    subnet_name = google_compute_subnetwork.main_subnet.name
    region      = var.region
    project_id  = var.project_id
  }
  description = "Network configuration details for other Terraform configurations"
}
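The `network_details` output exists so later parts can consume this stack's state. Once the state has been migrated to the GCS backend, a separate Terraform configuration (such as Part 2's) could read it with a `terraform_remote_state` data source. This sketch assumes the bucket and prefix shown in backend.tf:

```hcl
# Sketch: how a later configuration could read this stack's outputs
# once the state lives in the GCS backend.
data "terraform_remote_state" "foundation" {
  backend = "gcs"
  config = {
    bucket = "your-project-id-dev-terraform-state" # assumed bucket name
    prefix = "foundation/state"
  }
}

# Then reference the shared network instead of hard-coding names, e.g.:
# data.terraform_remote_state.foundation.outputs.network_details.vpc_name
```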

6. Variable Values (terraform.tfvars)

# Project Configuration
project_id  = "your-gcp-project-id"  # Replace with your actual project ID
environment = "dev"

# Network Configuration
region      = "us-central1"
vpc_name    = "main-vpc"
subnet_name = "primary-subnet"
subnet_cidr = "10.0.1.0/24"

7. Backend Configuration (backend.tf)

# Local backend for initial setup
terraform {
  backend "local" {
    path = "terraform.tfstate"
  }
}

# After creating the storage bucket, uncomment below and migrate:
# terraform {
#   backend "gcs" {
#     bucket = "your-project-id-dev-terraform-state"
#     prefix = "foundation/state"
#   }
# }

Understanding Terraform State Management

Terraform state is the crucial component that tracks your infrastructure:

  • πŸ“Š Resource Metadata: Current state, IDs, and configurations
  • πŸ”— Dependencies: Relationships and creation order between resources
  • πŸ“ Performance: Caches resource attributes for faster operations
  • 🎯 Drift Detection: Identifies manual changes made outside Terraform

State Storage Options

Local State (Development):

  • Stored on your local machine
  • Simple for learning and testing
  • Not suitable for team collaboration

Remote State (Production):

  • Google Cloud Storage (recommended for GCP)
  • Terraform Cloud (HashiCorp's managed solution)
  • Amazon S3 (for multi-cloud scenarios)

Migrating to Remote State

After your first deployment, migrate to remote state:

# 1. Update backend.tf with your bucket name
# 2. Run migration command
terraform init -migrate-state

Deploying Your Infrastructure

Step 1: Initialize Terraform

terraform init

Downloads provider plugins and initializes the working directory

Step 2: Format and Validate

# Format code for consistency
terraform fmt -recursive

# Validate configuration
terraform validate

Step 3: Plan Your Deployment

terraform plan -out=tfplan

Creates an execution plan and saves it to a file

Understanding the Plan Output:

  • + Resources to be created
  • - Resources to be destroyed
  • ~ Resources to be modified
  • <= Resources to be read (data sources)

Step 4: Apply Changes

terraform apply tfplan

Executes the saved plan; Terraform skips the interactive confirmation because you are applying a plan file you already reviewed

Step 5: Verify Deployment

# Check outputs
terraform output

# Verify in GCP Console
gcloud compute networks list
gcloud compute networks subnets list --network=dev-main-vpc
gsutil ls gs://your-project-id-dev-terraform-state

Step 6: Clean Up (When Needed)

terraform destroy

⚠️ Removes all managed infrastructure - be extremely careful in production!
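For resources you can never afford to lose, such as the state bucket, Terraform's `lifecycle` meta-argument can refuse destroys outright. This is a sketch of the guard, not something enabled in this part's code:

```hcl
# Sketch: guard a critical resource against `terraform destroy`.
# With prevent_destroy set, any plan that would destroy the bucket
# fails with an error instead of proceeding.
resource "google_storage_bucket" "terraform_state" {
  # ... existing arguments from storage.tf ...

  lifecycle {
    prevent_destroy = true
  }
}
```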

Production-Ready Best Practices

πŸ”§ Code Organization & Standards

  • Modularize configurations for reusability across environments
  • Use consistent naming conventions with environment prefixes
  • Implement proper file structure with clear separation of concerns
  • Version pin providers and modules to avoid breaking changes

πŸ”’ Security & Access Management

  • Never commit sensitive data or credentials to version control
  • Use remote state backend with proper access controls and encryption
  • Implement state locking to prevent concurrent modifications
  • Follow least privilege principle for service account permissions

πŸ“š Documentation & Team Collaboration

  • Add meaningful descriptions to all resources and variables
  • Use version control (Git) with proper branching strategies
  • Implement peer reviews for all infrastructure changes
  • Document runbooks for common operations and troubleshooting

πŸš€ Automation & CI/CD (Coming in Part 2)

  • Integrate with GitHub Actions for automated deployments
  • Use environment-specific variable files and workspaces
  • Implement automated testing for infrastructure code
  • Set up monitoring and alerting for infrastructure changes

πŸ’° Cost Optimization

  • Use labels and tags for resource tracking and cost allocation
  • Implement lifecycle policies for storage resources
  • Choose appropriate machine types and disk sizes
  • Monitor and set up budget alerts for cost control

What's Coming Next in This Series

πŸ”œ Part 2: Compute Engine VMs with GitHub Actions

  • Deploy secure VM instances with proper networking
  • Set up automated CI/CD pipelines with GitHub Actions
  • Implement firewall rules and Cloud NAT for secure internet access
  • Master service account management and authentication

πŸ”œ Part 3: PostgreSQL with Workload Identity Federation

  • Deploy Cloud SQL PostgreSQL in private networks
  • Implement keyless authentication with Workload Identity Federation
  • Set up VPC peering and private service connections
  • Database security and backup configuration

πŸ”œ Part 4: Secret Management and Key Rotation

  • Secure credential storage with Secret Manager
  • Automated service account key rotation with Cloud Functions
  • Production-grade security patterns
  • Integration with existing infrastructure

πŸ”œ Part 5: Serverless FastAPI with Cloud Run

  • Containerize and deploy Python applications
  • Implement proper artifact registry workflows
  • Configure custom domains and SSL certificates
  • Performance optimization and scaling strategies

πŸ”œ Part 6: Big Data with Dataproc (Advanced)

  • Set up managed Hadoop and Spark clusters
  • Configure preemptible instances for cost optimization
  • Data processing pipelines and job orchestration
  • Integration with Cloud Storage and BigQuery

Key Takeaways

βœ… Infrastructure as Code provides consistency, repeatability, and version control

βœ… Proper project structure makes maintenance and collaboration easier

βœ… Variable validation catches errors early in the development cycle

βœ… Remote state management is essential for team collaboration and production use

βœ… Security best practices should be implemented from day one, not as an afterthought

Ready to continue? This foundation sets you up for success in the remaining parts of our series. Each subsequent tutorial builds upon this infrastructure, adding more sophisticated patterns and production-ready features.

Troubleshooting Common Issues

Authentication Problems

# Re-authenticate if needed
gcloud auth application-default login
gcloud config set project YOUR_PROJECT_ID

API Enablement

# Enable required APIs
gcloud services enable compute.googleapis.com
gcloud services enable storage-api.googleapis.com
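API enablement can also be managed by Terraform itself, so a fresh project bootstraps without manual gcloud steps. A sketch using the `google_project_service` resource:

```hcl
# Sketch: enable the required APIs from Terraform instead of gcloud.
resource "google_project_service" "required" {
  for_each = toset([
    "compute.googleapis.com",
    "storage-api.googleapis.com",
  ])

  project = var.project_id
  service = each.value

  # Leave the API enabled even if this resource is later destroyed
  disable_on_destroy = false
}
```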

Permission Issues

# Verify service account permissions
gcloud projects get-iam-policy YOUR_PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:terraform-automation@YOUR_PROJECT_ID.iam.gserviceaccount.com"


Tags: #Terraform #GoogleCloud #GCP #DevOps #InfrastructureAsCode #CloudAutomation #NetworkingSeries
