Microservices Architecture Guide
A practitioner's guide to designing, deploying, and operating microservices at production scale. This kit goes beyond theory to deliver working configurations for API gateways, service mesh, distributed tracing, and circuit breakers. Includes service decomposition decision frameworks, Kubernetes deployment manifests, Istio service mesh configs, and Grafana dashboards for the four golden signals. Built for teams transitioning from monoliths or scaling existing microservices architectures.
Key Features
- Service Decomposition Framework — Decision matrix for identifying service boundaries using domain-driven design bounded contexts
- API Gateway Configurations — Ready-to-deploy configs for Kong, AWS API Gateway, and Azure APIM with rate limiting, auth, and routing
- Service Mesh Setup — Istio and Linkerd configurations with mTLS, traffic splitting, circuit breaking, and retry policies
- Observability Stack — OpenTelemetry instrumentation, Grafana dashboards, and alerting rules for the four golden signals
- Kubernetes Manifests — Production-grade Deployments, Services, HPAs, PDBs, and NetworkPolicies for each service pattern
- Circuit Breaker Patterns — Implementation patterns with fallback strategies, bulkhead isolation, and timeout configurations
- Database Per Service — Data ownership patterns including event-driven synchronization and API composition
- CI/CD Templates — GitHub Actions and GitLab CI pipelines for independent service deployments
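Of the patterns listed above, the circuit breaker is the easiest to get subtly wrong. As a minimal sketch (class name and thresholds are illustrative, not the toolkit's implementation), the state machine looks like this in Python:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: trips OPEN after `threshold`
    consecutive failures, probes again after `reset_timeout`."""

    def __init__(self, threshold=5, reset_timeout=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means CLOSED

    @property
    def state(self):
        if self.opened_at is None:
            return "CLOSED"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "HALF_OPEN"  # allow one probe request through
        return "OPEN"

    def call(self, fn, fallback):
        if self.state == "OPEN":
            return fallback()          # fail fast, shed load
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()  # trip the breaker
            return fallback()
        else:
            self.failures = 0          # success closes the breaker
            self.opened_at = None
            return result
```

The key property: while OPEN, the dependency is never called, so a failing downstream cannot saturate the caller's threads or connections.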
Quick Start
```bash
# Deploy the sample microservices stack to Kubernetes
kubectl create namespace acme-services
kubectl apply -f src/kubernetes/namespace-config.yaml

# Deploy API Gateway (Kong)
kubectl apply -f src/gateway/kong-deployment.yaml -n acme-services

# Deploy a sample service with observability
kubectl apply -f examples/order-service/ -n acme-services

# Verify health
kubectl get pods -n acme-services
kubectl port-forward svc/kong-gateway 8080:80 -n acme-services
curl http://localhost:8080/api/orders/health
```
Architecture
┌──────────────────────────────────────────────────────────┐
│ Microservices Architecture │
│ │
│ ┌──────────┐ ┌──────────────────────────────────┐ │
│ │ Client │────►│ API Gateway │ │
│ │ (Web/App) │ │ Auth │ Rate Limit │ Routing │ │
│ └──────────┘ └──────────┬───────────────────────┘ │
│ │ │
│ ┌───────────────────────────▼─────────────────────────┐ │
│ │ Service Mesh (Istio / Linkerd) │ │
│ │ mTLS │ Traffic Mgmt │ Retries │ Circuit Breaker │ │
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
│ │ │ Order │ │ User │ │ Payment │ │ │
│ │ │ Service │ │ Service │ │ Service │ │ │
│ │ │ ┌──────┐ │ │ ┌──────┐ │ │ ┌──────┐ │ │ │
│ │ │ │ DB │ │ │ │ DB │ │ │ │ DB │ │ │ │
│ │ │ └──────┘ │ │ └──────┘ │ │ └──────┘ │ │ │
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Observability Platform │ │
│ │ Traces (Jaeger) │ Metrics (Prometheus) │ Logs │ │
│ └────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────┘
Usage Examples
Kubernetes Deployment with Health Checks
```yaml
# src/kubernetes/order-service/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  labels:
    app: order-service
    version: v1
spec:
  replicas: 3
  selector:
    matchLabels: { app: order-service }
  template:
    metadata:
      labels:
        app: order-service
        version: v1
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8080"
        prometheus.io/path: "/metrics"
    spec:
      containers:
        - name: order-service
          image: acme/order-service:1.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet: { path: /health/ready, port: 8080 }
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet: { path: /health/live, port: 8080 }
            initialDelaySeconds: 30
            periodSeconds: 10
          resources:
            requests: { cpu: 250m, memory: 256Mi }
            limits: { cpu: 500m, memory: 512Mi }
          env:
            - name: DB_HOST
              valueFrom:
                secretKeyRef: { name: order-db-creds, key: host }
```
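The probes above assume the service exposes separate `/health/ready` and `/health/live` endpoints. A minimal sketch of that split using only Python's standard library (handler and flag names are illustrative):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

DB_CONNECTED = True  # flip to False to simulate a lost dependency

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health/live":
            # Liveness: the process is up; a failure here triggers a restart
            self.respond(200, b"live")
        elif self.path == "/health/ready":
            # Readiness: gate traffic on dependencies being reachable
            ok = DB_CONNECTED
            self.respond(200 if ok else 503, b"ready" if ok else b"not ready")
        else:
            self.respond(404, b"not found")

    def respond(self, code, body):
        self.send_response(code)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet

if __name__ == "__main__":
    HTTPServer(("", 8080), HealthHandler).serve_forever()
```

Liveness stays dependency-free on purpose: if it checked the database, a DB outage would restart every pod instead of merely pulling them out of rotation.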
Istio Circuit Breaker + Retry Policy
```yaml
# src/service-mesh/istio/destination-rule.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: order-service-dr
spec:
  host: order-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100        # Bulkhead: limit connections
      http:
        h2UpgradePolicy: DEFAULT
        maxRequestsPerConnection: 10
    outlierDetection:
      consecutive5xxErrors: 5      # Trip after 5 consecutive 5xx
      interval: 30s                # Evaluation window
      baseEjectionTime: 60s        # Eject failing pod for 60s
      maxEjectionPercent: 50       # Never eject more than 50% of pods
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: order-service-vs
spec:
  hosts: [order-service]
  http:
    - route:
        - destination: { host: order-service }
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: 5xx,reset,connect-failure
```
API Gateway Rate Limiting (Kong)
```yaml
# src/gateway/kong-rate-limit.yaml
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: rate-limit-orders
plugin: rate-limiting
config:
  minute: 100                 # 100 requests per minute per consumer
  hour: 5000
  policy: redis
  redis_host: redis.acme-services.svc
  redis_port: 6379
  limit_by: consumer          # Per authenticated consumer
  hide_client_headers: false  # Return X-RateLimit-* headers
```
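Conceptually, `minute: 100` with `limit_by: consumer` is a counter keyed by consumer and time window; Kong's `redis` policy keeps those counters in Redis so every gateway replica shares them. A minimal in-process fixed-window sketch of the same idea (class name is illustrative, not Kong's algorithm verbatim):

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Allow `limit` requests per consumer per `window` seconds."""

    def __init__(self, limit=100, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.counters = defaultdict(int)  # (consumer, window_id) -> count

    def allow(self, consumer: str) -> bool:
        window_id = int(self.clock() // self.window)
        key = (consumer, window_id)
        if self.counters[key] >= self.limit:
            return False              # gateway would answer HTTP 429
        self.counters[key] += 1
        return True
```

Fixed windows admit up to 2x the limit across a window boundary, which is why production gateways often prefer sliding-window or token-bucket variants.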
Configuration
```yaml
# configs/platform-config.yaml
cluster:
  provider: aws                # aws, azure, gcp
  kubernetes_version: "1.28"
  node_pools:
    - name: services
      instance_type: m5.large
      min_nodes: 3
      max_nodes: 10
gateway:
  type: kong                   # kong, aws-apigw, azure-apim
  rate_limit_per_minute: 100
  auth_method: jwt             # jwt, oauth2, api-key
service_mesh:
  type: istio                  # istio, linkerd, none
  mtls_mode: STRICT            # STRICT or PERMISSIVE
  tracing_sample_rate: 0.1     # 10% trace sampling in production
observability:
  metrics: prometheus
  tracing: jaeger
  logging: fluentbit
  dashboard: grafana
  alert_channels:
    - type: slack
      webhook: YOUR_SLACK_WEBHOOK_HERE
```
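`tracing_sample_rate: 0.1` is head-based sampling: the decision is made once at the trace root and propagated, so a trace is either fully recorded or not at all. Deriving the decision deterministically from the trace ID (the approach of OpenTelemetry's ratio-based samplers) guarantees every service agrees; a sketch, with the bit layout chosen here for illustration:

```python
def should_sample(trace_id: int, rate: float) -> bool:
    """Deterministic head-based sampling: the same trace ID yields the
    same decision in every service, so traces are never half-recorded."""
    # Compare the low 63 bits of the trace ID against a rate threshold.
    bound = int(rate * (1 << 63))
    return (trace_id & ((1 << 63) - 1)) < bound
```

Because trace IDs are uniformly random, roughly `rate` of all traces fall under the bound, giving the configured 10% without any coordination between services.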
Best Practices
- One service, one database — Shared databases create tight coupling; use events to synchronize data across services
- Set resource requests AND limits — Requests guarantee scheduling; limits prevent noisy neighbors
- Use readiness probes, not just liveness — Readiness controls traffic routing; liveness controls restarts — they serve different purposes
- Start with 3 replicas minimum — Ensures availability during rolling updates and node failures
- Circuit breakers on every outbound call — Without them, one slow dependency takes down your entire service graph
- Trace across service boundaries — Propagate trace context headers (W3C Trace Context) through all service calls
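The W3C Trace Context mentioned above travels as a single `traceparent` header of the form `version-traceid-spanid-flags`. A minimal propagation sketch (field layout per the W3C spec; helper names are illustrative):

```python
import secrets

def new_traceparent(sampled: bool = True) -> str:
    """Create a root traceparent: 16-byte trace ID, 8-byte span ID."""
    trace_id = secrets.token_hex(16)   # 32 hex chars
    span_id = secrets.token_hex(8)     # 16 hex chars
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"

def child_traceparent(parent: str) -> str:
    """Keep the trace ID, mint a new span ID for the outbound call."""
    version, trace_id, _, flags = parent.split("-")
    return f"{version}-{trace_id}-{secrets.token_hex(8)}-{flags}"
```

Every outbound HTTP call forwards a `child_traceparent` of the header it received; the shared trace ID is what lets Jaeger stitch spans from all services into one trace.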
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| 503 errors after deploying new version | Readiness probe failing on new pods | Check probe endpoint; ensure app is ready before probe starts |
| Cascading failures across services | No circuit breaker; failing service saturates callers | Apply DestinationRule with outlierDetection (circuit breaker) |
| High p99 latency on service calls | Retries amplifying tail latency | Reduce retry attempts or add retry budget; check outlier detection |
| Istio sidecar injection not working | Namespace missing `istio-injection: enabled` label | `kubectl label namespace acme-services istio-injection=enabled` |
This is 1 of 11 resources in the Cloud Architecture Pro toolkit. Get the complete Microservices Architecture Guide with all files, templates, and documentation for $39.
Or grab the entire Cloud Architecture Pro bundle (11 products) for $149 — save 30%.