Microservices Architecture Guide
A practitioner's guide to designing, deploying, and operating microservices at production scale. This kit goes beyond theory to deliver working configurations for API gateways, service mesh, distributed tracing, and circuit breakers. Includes service decomposition decision frameworks, Kubernetes deployment manifests, Istio service mesh configs, and Grafana dashboards for the four golden signals. Built for teams transitioning from monoliths or scaling existing microservices architectures.
Key Features
- Service Decomposition Framework — Decision matrix for identifying service boundaries using domain-driven design bounded contexts
- API Gateway Configurations — Ready-to-deploy configs for Kong, AWS API Gateway, and Azure APIM with rate limiting, auth, and routing
- Service Mesh Setup — Istio and Linkerd configurations with mTLS, traffic splitting, circuit breaking, and retry policies
- Observability Stack — OpenTelemetry instrumentation, Grafana dashboards, and alerting rules for the four golden signals
- Kubernetes Manifests — Production-grade Deployments, Services, HPAs, PDBs, and NetworkPolicies for each service pattern
- Circuit Breaker Patterns — Implementation patterns with fallback strategies, bulkhead isolation, and timeout configurations
- Database Per Service — Data ownership patterns including event-driven synchronization and API composition
- CI/CD Templates — GitHub Actions and GitLab CI pipelines for independent service deployments
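Of the patterns listed above, the circuit breaker is the easiest to get subtly wrong. As a minimal sketch (class name and thresholds are illustrative, not the toolkit's implementation), the state machine looks like this in Python:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: trips OPEN after `threshold`
    consecutive failures, probes again after `reset_timeout`."""

    def __init__(self, threshold=5, reset_timeout=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means CLOSED

    @property
    def state(self):
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "HALF_OPEN"  # allow one probe request through
        return "OPEN"

    def call(self, fn, fallback):
        if self.state == "OPEN":
            return fallback()          # fail fast, shed load
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # trip the breaker
            return fallback()
        else:
            self.failures = 0          # success closes the breaker
            self.opened_at = None
            return result
```

The key property: while OPEN, the dependency is never called, so a failing downstream cannot saturate the caller's threads or connections.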
Quick Start
```bash
# Deploy the sample microservices stack to Kubernetes
kubectl create namespace acme-services
kubectl apply -f src/kubernetes/namespace-config.yaml

# Deploy API Gateway (Kong)
kubectl apply -f src/gateway/kong-deployment.yaml -n acme-services

# Deploy a sample service with observability
kubectl apply -f examples/order-service/ -n acme-services

# Verify health
kubectl get pods -n acme-services
kubectl port-forward svc/kong-gateway 8080:80 -n acme-services
curl http://localhost:8080/api/orders/health
```
Architecture
┌──────────────────────────────────────────────────────────┐
│ Microservices Architecture │
│ │
│ ┌──────────┐ ┌──────────────────────────────────┐ │
│ │ Client │────►│ API Gateway │ │
│ │ (Web/App) │ │ Auth │ Rate Limit │ Routing │ │
│ └──────────┘ └──────────┬───────────────────────┘ │
│ │ │
│ ┌───────────────────────────▼─────────────────────────┐ │
│ │ Service Mesh (Istio / Linkerd) │ │
│ │ mTLS │ Traffic Mgmt │ Retries │ Circuit Breaker │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Order │ │ User │ │ Payment │ │ │
│ │ │ Service │ │ Service │ │ Service │ │ │
│ │ │ ┌──────┐ │ │ ┌──────┐ │ │ ┌──────┐ │ │ │
│ │ │ │ DB │ │ │ │ DB │ │ │ │ DB │ │ │ │
│ │ │ └──────┘ │ │ └──────┘ │ │ └──────┘ │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Observability Platform │ │
│ │ Traces (Jaeger) │ Metrics (Prometheus) │ Logs │ │
│ └────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
Usage Examples
Kubernetes Deployment with Health Checks
```yaml
# src/kubernetes/order-service/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels: { app: order-service }
  template:
    metadata:
      labels:
        app: order-service
        version: v1
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: order-service
          image: acme/order-service:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet: { path: /health/ready, port: 8080 }
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet: { path: /health/live, port: 8080 }
            initialDelaySeconds: 30
            periodSeconds: 10
          resources:
            requests: { cpu: 250m, memory: 256Mi }
            limits: { cpu: 500m, memory: 512Mi }
          env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef: { name: order-db-creds, key: host }
```
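The probes above assume the service exposes separate `/health/ready` and `/health/live` endpoints. A minimal sketch of that split using only Python's standard library (handler and flag names are illustrative):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_CONNECTED = True  # flip to False to simulate a lost dependency

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health/live":
            # Liveness: the process is up; a failure here triggers a restart
            self.respond(200, b"live")
        elif self.path == "/health/ready":
            # Readiness: gate traffic on dependencies being reachable
            ok = DB_CONNECTED
            self.respond(200 if ok else 503, b"ready" if ok else b"not ready")
        else:
            self.respond(404, b"not found")

    def respond(self, code, body):
        self.send_response(code)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet

if __name__ == "__main__":
    HTTPServer(("", 8080), HealthHandler).serve_forever()
```

Liveness stays dependency-free on purpose: if it checked the database, a DB outage would restart every pod instead of merely pulling them out of rotation.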
Istio Circuit Breaker + Retry Policy
```yaml
# src/service-mesh/istio/destination-rule.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service-dr
spec:
  host: order-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100        # Bulkhead: limit connections
      http:
        h2UpgradePolicy: DEFAULT
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 5      # Trip after 5 consecutive 5xx
      interval: 30s                # Evaluation window
      baseEjectionTime: 60s        # Eject failing pod for 60s
      maxEjectionPercent: 50       # Never eject more than 50% of pods
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service-vs
spec:
  hosts: [order-service]
  http:
    - route:
        - destination: { host: order-service }
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: 5xx,reset,connect-failure
```
API Gateway Rate Limiting (Kong)
```yaml
# src/gateway/kong-rate-limit.yaml
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: rate-limit-orders
plugin: rate-limiting
config:
  minute: 100                 # 100 requests per minute per consumer
  hour: 5000
  policy: redis
  redis_host: redis.acme-services.svc
  redis_port: 6379
  limit_by: consumer          # Per authenticated consumer
  hide_client_headers: false  # Return X-RateLimit-* headers
```
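Conceptually, `minute: 100` with `limit_by: consumer` is a counter keyed by consumer and time window; Kong's `redis` policy keeps those counters in Redis so every gateway replica shares them. A minimal in-process fixed-window sketch of the same idea (class name is illustrative, not Kong's algorithm verbatim):

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow `limit` requests per consumer per `window` seconds."""

    def __init__(self, limit=100, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.counters = defaultdict(int)  # (consumer, window_id) -> count

    def allow(self, consumer: str) -> bool:
        window_id = int(self.clock() // self.window)
        key = (consumer, window_id)
        if self.counters[key] >= self.limit:
            return False              # gateway would answer HTTP 429
        self.counters[key] += 1
        return True
```

Fixed windows admit up to 2x the limit across a window boundary, which is why production gateways often prefer sliding-window or token-bucket variants.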
Configuration
```yaml
# configs/platform-config.yaml
cluster:
  provider: aws                # aws, azure, gcp
  kubernetes_version: "1.28"
  node_pools:
    - name: services
      instance_type: m5.large
      min_nodes: 3
      max_nodes: 10
gateway:
  type: kong                   # kong, aws-apigw, azure-apim
  rate_limit_per_minute: 100
  auth_method: jwt             # jwt, oauth2, api-key
service_mesh:
  type: istio                  # istio, linkerd, none
  mtls_mode: STRICT            # STRICT or PERMISSIVE
  tracing_sample_rate: 0.1     # 10% trace sampling in production
observability:
  metrics: prometheus
  tracing: jaeger
  logging: fluentbit
  dashboard: grafana
  alert_channels:
    - type: slack
      webhook: YOUR_SLACK_WEBHOOK_HERE
```
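`tracing_sample_rate: 0.1` is head-based sampling: the decision is made once at the trace root and propagated, so a trace is either fully recorded or not at all. Deriving the decision deterministically from the trace ID (the approach of OpenTelemetry's ratio-based samplers) guarantees every service agrees; a sketch, with the bit layout chosen here for illustration:

```python
def should_sample(trace_id: int, rate: float) -> bool:
    """Deterministic head-based sampling: the same trace ID yields the
    same decision in every service, so traces are never half-recorded."""
    # Compare the low 63 bits of the trace ID against a rate threshold.
    bound = int(rate * (1 << 63))
    return (trace_id & ((1 << 63) - 1)) < bound
```

Because trace IDs are uniformly random, roughly `rate` of all traces fall under the bound, giving the configured 10% without any coordination between services.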
Best Practices
- One service, one database — Shared databases create tight coupling; use events to synchronize data across services
- Set resource requests AND limits — Requests guarantee scheduling; limits prevent noisy neighbors
- Use readiness probes, not just liveness — Readiness controls traffic routing; liveness controls restarts — they serve different purposes
- Start with 3 replicas minimum — Ensures availability during rolling updates and node failures
- Circuit breakers on every outbound call — Without them, one slow dependency takes down your entire service graph
- Trace across service boundaries — Propagate trace context headers (W3C Trace Context) through all service calls
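The W3C Trace Context mentioned above travels as a single `traceparent` header of the form `version-traceid-spanid-flags`. A minimal propagation sketch (field layout per the W3C spec; helper names are illustrative):

```python
import secrets

def new_traceparent(sampled: bool = True) -> str:
    """Create a root traceparent: 16-byte trace ID, 8-byte span ID."""
    trace_id = secrets.token_hex(16)   # 32 hex chars
    span_id = secrets.token_hex(8)     # 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"

def child_traceparent(parent: str) -> str:
    """Keep the trace ID, mint a new span ID for the outbound call."""
    version, trace_id, _, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

Every outbound HTTP call forwards a `child_traceparent` of the header it received; the shared trace ID is what lets Jaeger stitch spans from all services into one trace.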
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| 503 errors after deploying new version | Readiness probe failing on new pods | Check probe endpoint; ensure app is ready before probe starts |
| Cascading failures across services | No circuit breaker; failing service saturates callers | Apply DestinationRule with outlierDetection (circuit breaker) |
| High p99 latency on service calls | Retries amplifying tail latency | Reduce retry attempts or add retry budget; check outlier detection |
| Istio sidecar injection not working | Namespace missing `istio-injection: enabled` label | `kubectl label namespace acme-services istio-injection=enabled` |
This is 1 of 11 resources in the Cloud Architecture Pro toolkit. Get the complete Microservices Architecture Guide with all files, templates, and documentation for $39.
Or grab the entire Cloud Architecture Pro bundle (11 products) for $149 — save 30%.