Cloud Data Platform Blueprint
A complete, layered architecture for building a modern cloud data platform, from ingestion to serving. This blueprint provides Terraform modules, pipeline configurations, and governance policies for every layer of the data stack: batch and streaming ingestion, lakehouse storage with a medallion architecture, processing with Spark and serverless services, serving through APIs and BI tools, and governance via catalog, lineage, and access controls. Designed for data teams building a platform from scratch or modernizing a legacy warehouse.
Key Features
- Medallion Architecture — Bronze, Silver, Gold layers with clear data quality contracts at each stage
- Multi-Source Ingestion — Templates for database CDC, API polling, file drops, and streaming event capture
- Lakehouse Storage — Delta Lake / Iceberg table format configurations with partitioning and compaction strategies
- Processing Patterns — Spark jobs, serverless ETL (Glue/Data Factory), and streaming (Kinesis/Event Hubs) blueprints
- Serving Layer — Pre-built configurations for BI tool connections, REST APIs, and materialized views
- Data Governance — Data catalog setup, column-level access controls, PII tagging, and lineage tracking
- Infrastructure as Code — Full Terraform modules for provisioning storage, compute, and networking
- Cost Controls — Auto-scaling policies, lifecycle rules, and storage tiering to manage platform costs
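The medallion layers above map naturally onto per-layer object-store locations partitioned by ingestion date. A minimal sketch, assuming the bucket names from the sample configuration later in this page (`layer_path` and `LAYER_BUCKETS` are hypothetical helpers, not part of the blueprint's code):

```python
from datetime import date

# Hypothetical helper: build object-store paths per medallion layer,
# partitioned by ingestion date rather than business date.
LAYER_BUCKETS = {
    "bronze": "acme-bronze-production",
    "silver": "acme-silver-production",
    "gold": "acme-gold-production",
}

def layer_path(layer: str, dataset: str, ingested: date) -> str:
    """Return an s3:// path like .../orders/_ingested_date=2024-01-15/."""
    if layer not in LAYER_BUCKETS:
        raise ValueError(f"unknown layer: {layer}")
    return (
        f"s3://{LAYER_BUCKETS[layer]}/{dataset}/"
        f"_ingested_date={ingested.isoformat()}/"
    )

print(layer_path("bronze", "orders", date(2024, 1, 15)))
# s3://acme-bronze-production/orders/_ingested_date=2024-01-15/
```

Keeping the partition key as ingestion date means late-arriving records land in today's partition instead of rewriting a historical one.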
Quick Start
```bash
# Deploy the storage foundation (S3/ADLS + Delta Lake)
cd src/terraform/storage
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your account details
terraform init
terraform plan -out=plan.out
terraform apply plan.out

# Deploy the ingestion layer (assumes the storage module writes its outputs
# to outputs.json, e.g. via a local_file resource)
cd ../ingestion
terraform init
terraform apply -var-file="../storage/outputs.json"
```
Architecture
┌──────────────────────────────────────────────────────────────┐
│ Cloud Data Platform │
│ │
│ Sources Ingestion Storage (Lakehouse) │
│ ┌────────┐ ┌─────────┐ ┌──────────────────┐ │
│ │Database│──CDC────►│ │ │ Bronze (Raw) │ │
│ │ APIs │──Poll───►│ Ingest │─────►│ Silver (Clean) │ │
│ │ Files │──Drop───►│ Layer │ │ Gold (Business) │ │
│ │Streams │──Push───►│ │ └────────┬─────────┘ │
│ └────────┘ └─────────┘ │ │
│ │ │
│ Processing Serving │ │
│ ┌──────────────────┐ ┌────────────┐ │ │
│   │ Spark / Glue     │────────►│ BI Tools   │◄──┘          │
│ │ Stream Processing│ │ REST APIs │ │
│ │ dbt Models │────────►│ ML Feature │ │
│ └──────────────────┘ │ Store │ │
│ └────────────┘ │
│ Governance │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ Data Catalog │ Lineage │ Access Control │ Quality │ │
│ └──────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
Usage Examples
Bronze Layer — S3 Bucket with Lifecycle Rules
```hcl
# src/terraform/storage/bronze.tf
resource "aws_s3_bucket" "bronze" {
  bucket = "${var.project_prefix}-bronze-${var.environment}"

  tags = {
    Layer       = "bronze"
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

resource "aws_s3_bucket_lifecycle_configuration" "bronze_lifecycle" {
  bucket = aws_s3_bucket.bronze.id

  rule {
    id     = "transition-to-ia"
    status = "Enabled"

    transition {
      days          = 30
      storage_class = "STANDARD_IA"
    }

    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    expiration {
      days = 365 # Raw data retained for 1 year
    }
  }
}
```
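For reasoning about what these lifecycle rules cost over time, the tier transitions can be modeled in plain Python. This is a sketch mirroring the rules above (30 days to STANDARD_IA, 90 to GLACIER, expiry at 365), not an AWS API call; `storage_class_at` is a hypothetical helper:

```python
# Mirror of the bronze lifecycle rules: returns the storage class an object
# of a given age would occupy, or "EXPIRED" once past the retention window.
def storage_class_at(age_days: int) -> str:
    if age_days >= 365:
        return "EXPIRED"
    if age_days >= 90:
        return "GLACIER"
    if age_days >= 30:
        return "STANDARD_IA"
    return "STANDARD"

print(storage_class_at(10))   # STANDARD
print(storage_class_at(45))   # STANDARD_IA
print(storage_class_at(200))  # GLACIER
```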
Silver Layer — Data Quality Checks
```python
# src/processing/silver_transform.py
from dataclasses import dataclass
from enum import Enum


class QualityLevel(Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"


@dataclass
class QualityResult:
    check_name: str
    level: QualityLevel
    metric_value: float
    threshold: float


def check_completeness(
    records: list[dict], required_fields: list[str]
) -> QualityResult:
    """Verify all required fields are present and non-null."""
    total = len(records)
    if total == 0:
        return QualityResult("completeness", QualityLevel.FAIL, 0.0, 0.95)
    complete = sum(
        1 for r in records
        if all(r.get(f) is not None for f in required_fields)
    )
    ratio = complete / total
    level = (
        QualityLevel.PASS if ratio >= 0.95
        else QualityLevel.WARN if ratio >= 0.80
        else QualityLevel.FAIL
    )
    return QualityResult("completeness", level, ratio, 0.95)
```
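A quick usage sketch of the completeness check on a small batch. The definitions from `silver_transform.py` are repeated here so the snippet runs standalone; the sample records are invented for illustration:

```python
from dataclasses import dataclass
from enum import Enum

class QualityLevel(Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"

@dataclass
class QualityResult:
    check_name: str
    level: QualityLevel
    metric_value: float
    threshold: float

def check_completeness(records, required_fields):
    total = len(records)
    if total == 0:
        return QualityResult("completeness", QualityLevel.FAIL, 0.0, 0.95)
    complete = sum(
        1 for r in records if all(r.get(f) is not None for f in required_fields)
    )
    ratio = complete / total
    level = (
        QualityLevel.PASS if ratio >= 0.95
        else QualityLevel.WARN if ratio >= 0.80
        else QualityLevel.FAIL
    )
    return QualityResult("completeness", level, ratio, 0.95)

# 4 of 5 records have both required fields: ratio 0.8, which lands on WARN.
records = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": 5.5},
    {"order_id": 3, "amount": None},  # incomplete record
    {"order_id": 4, "amount": 2.0},
    {"order_id": 5, "amount": 7.25},
]
result = check_completeness(records, ["order_id", "amount"])
print(result.level.value, result.metric_value)  # warn 0.8
```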
Ingestion — CDC Configuration
```yaml
# configs/ingestion/cdc-source.yaml
source:
  type: postgresql
  host: db.internal.example.com
  port: 5432
  database: orders_db
  tables:
    - schema: public
      table: orders
      primary_key: order_id
      mode: cdc            # Change Data Capture via logical replication
    - schema: public
      table: customers
      primary_key: customer_id
      mode: full_refresh   # Small dimension table — full load daily

target:
  type: s3
  bucket: acme-bronze-production
  prefix: cdc/orders_db/
  format: parquet
  partition_by: [_ingested_date]
```
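Misconfigured table entries (an unknown mode, a missing primary key) are cheap to catch before a connector starts. A minimal validation sketch over the parsed config; the dict literal stands in for what a YAML loader would return, and `validate_tables` is a hypothetical helper, not part of the blueprint:

```python
# Validate the `tables` section of a parsed CDC source config.
VALID_MODES = {"cdc", "full_refresh"}

def validate_tables(tables: list[dict]) -> list[str]:
    """Return a list of human-readable config errors (empty if valid)."""
    errors = []
    for t in tables:
        if t.get("mode") not in VALID_MODES:
            errors.append(f"{t.get('table')}: unknown mode {t.get('mode')!r}")
        if not t.get("primary_key"):
            errors.append(f"{t.get('table')}: missing primary_key")
    return errors

tables = [
    {"schema": "public", "table": "orders",
     "primary_key": "order_id", "mode": "cdc"},
    {"schema": "public", "table": "customers",
     "primary_key": "customer_id", "mode": "full_refresh"},
]
print(validate_tables(tables))  # []
```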
Configuration
```yaml
# configs/platform-config.yaml
project_prefix: acme-data
environment: production
region: us-east-1

storage:
  bronze_bucket: acme-bronze-production
  silver_bucket: acme-silver-production
  gold_bucket: acme-gold-production
  format: delta            # delta or iceberg
  encryption: aws:kms

processing:
  engine: spark            # spark, glue, or databricks
  cluster_size: medium     # small=2 workers, medium=5, large=10
  autoscale: true
  max_workers: 20

governance:
  catalog: glue_catalog    # glue_catalog or unity_catalog
  pii_detection: true      # Automated PII column tagging
  retention_days:
    bronze: 365
    silver: 730
    gold: 1825
```
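How `cluster_size`, `autoscale`, and `max_workers` combine into actual worker bounds is worth pinning down. A sketch under the sizing noted in the comment above (small=2, medium=5, large=10); `worker_bounds` is an illustrative helper, not blueprint code:

```python
# Resolve (min_workers, max_workers) from the processing config:
# the named size sets the baseline; autoscale unlocks max_workers.
SIZE_TO_WORKERS = {"small": 2, "medium": 5, "large": 10}

def worker_bounds(cluster_size: str, autoscale: bool,
                  max_workers: int) -> tuple[int, int]:
    base = SIZE_TO_WORKERS[cluster_size]
    return (base, max_workers if autoscale else base)

print(worker_bounds("medium", True, 20))   # (5, 20)
print(worker_bounds("medium", False, 20))  # (5, 5)
```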
Best Practices
- Treat bronze as immutable — Never modify raw data; reprocess from bronze into silver if logic changes
- Schema-on-read for bronze, schema-on-write for silver — Accept any shape into bronze, enforce contracts at silver
- Partition by ingestion date, not business date — Avoids late-arriving data overwriting partitions
- Separate compute from storage — Use object storage so clusters can scale independently
- Implement data quality gates — Failed quality checks should halt downstream processing, not silently propagate
- Version your transformations — Use dbt or similar tools for version-controlled, testable SQL transformations
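The quality-gate practice above can be sketched as a small enforcement step: a FAIL result raises and halts the pipeline, while WARNs are surfaced for logging. This reuses the `QualityLevel` enum from the silver example (redefined here so the sketch is self-contained); `enforce_gate` and `QualityGateError` are illustrative names:

```python
from enum import Enum

class QualityLevel(Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"

class QualityGateError(RuntimeError):
    """Raised when any quality check fails; stops downstream processing."""

def enforce_gate(results: list[tuple[str, QualityLevel]]) -> list[tuple[str, QualityLevel]]:
    """Raise on any FAIL; return the WARN results so they can be logged."""
    failures = [name for name, level in results if level is QualityLevel.FAIL]
    if failures:
        raise QualityGateError(f"quality gate failed: {failures}")
    return [(name, level) for name, level in results if level is QualityLevel.WARN]

warnings = enforce_gate([
    ("completeness", QualityLevel.WARN),
    ("schema_match", QualityLevel.PASS),
])
print(warnings)  # [('completeness', <QualityLevel.WARN: 'warn'>)]
```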
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Spark job OOMs on silver transform | Skewed join keys creating large partitions | Add repartition() before joins or enable AQE |
| Delta table query is slow | Too many small files from frequent writes | Run OPTIMIZE (compaction) and VACUUM on affected tables |
| CDC slot growing indefinitely | Consumer not acknowledging WAL positions | Check CDC connector health; drop and recreate slot if abandoned |
| S3 access denied on cross-account | Bucket policy missing cross-account principal | Add consuming account's role ARN to the bucket policy |
This is 1 of 11 resources in the Cloud Architecture Pro toolkit. Get the complete Cloud Data Platform Blueprint with all files, templates, and documentation for $49.
Or grab the entire Cloud Architecture Pro bundle (11 products) for $149 — save 30%.