Aisalkyn Aidarova

Posted on Mar 2

Final work for batch #2: ENTERPRISE DEVOPS GOVERNANCE LAB

#architecture #cicd #devops #microservices

This lab is about enterprise-grade DevOps strategy, not tools.
Our focus is:

Shared pipelines
Version control strategy
Safe rollouts
Image security governance
Multi-service blue-green
Centralized base images
Distributed tracing
Rollback strategy
Enterprise routing design
Upgrade enforcement

So this is a FULL ENTERPRISE DEVOPS LAB

This lab simulates a real enterprise with:

10 microservices
Shared CI/CD templates
Centralized base image repo
Blue-Green deployment
Image vulnerability enforcement
Versioned Terraform modules
ArgoCD GitOps
Distributed tracing
Rollback strategy

🏗️ Architecture Overview

Request path:

Client
→ Load Balancer
→ Ingress
→ 10 Microservices (on the EKS Cluster)

Delivery path:

Shared Base Image
→ Centralized CI Templates
→ ArgoCD
→ EKS Cluster

Monitoring:

Prometheus
Grafana
Jaeger (Tracing)
ELK (Logging)

PART 1 — CENTRALIZED BASE IMAGE STRATEGY

Step 1: Create Base Image Repository

Repo: enterprise-base-images

Dockerfile:

FROM amazonlinux:2023

RUN yum update -y && \
    yum install -y python3

LABEL maintainer="platform-team"

Push to ECR:

# Tag with the full registry path so the pushed name matches the local image
docker build -t <ecr>/enterprise/python-base:1.0 .
docker push <ecr>/enterprise/python-base:1.0

Step 2: Application Repo Uses Approved Base

Each microservice repo:

FROM <ecr>/enterprise/python-base:1.0

COPY app.py /app/
CMD ["python3", "/app/app.py"]

The base image is now centralized: the platform team controls what goes into it, and every service inherits it.

PART 2 — SHARED CI/CD TEMPLATE

Create repo: ci-templates

ci-template.yml

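stages:
  - build
  - scan
  - deploy

build:
  stage: build
  script:
    - docker build -t $IMAGE_TAG .
    - docker push $IMAGE_TAG

scan:
  stage: scan
  script:
    # Fail the job on HIGH or CRITICAL findings so a vulnerable image never reaches deploy
    - trivy image --exit-code 1 --severity HIGH,CRITICAL $IMAGE_TAG

deploy:
  stage: deploy
  script:
    # Assumes the image tag in the GitOps manifests was updated earlier in this job
    - git commit -am "update image"
    - git push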

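Each microservice repo references it instead of defining its own pipeline (for example, through an include entry if this is GitLab CI).

Shared pipeline logic now lives in one place.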

PART 3 — VERSIONED MODULE STRATEGY

Terraform modules repo:

modules/
  vpc/
  eks/
  rds/

Applications reference:

module "eks" {
  source  = "git::https://repo//modules/eks?ref=v1.2.0"
}

Upgrade safely by bumping the ref to a new tag. Each consumer stays on its pinned version until it opts in.
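To see which consumers are still on an old module version, the pinned refs can be read straight out of the Terraform files. The following is a minimal sketch, assuming module sources follow the `git::...//modules/<name>?ref=<tag>` shape shown above. The function names and the `v1.3.0` target are hypothetical, chosen only for illustration.

```python
import re

# Matches a git module source such as "git::https://repo//modules/eks?ref=v1.2.0"
REF_PATTERN = re.compile(
    r'source\s*=\s*"git::[^"]*//modules/(?P<module>[\w-]+)\?ref=(?P<ref>[^"]+)"'
)


def pinned_versions(terraform_text: str) -> dict[str, str]:
    """Return {module_name: pinned_ref} for every git-sourced module in the text."""
    return {m.group("module"): m.group("ref") for m in REF_PATTERN.finditer(terraform_text)}


def outdated(pins: dict[str, str], required: dict[str, str]) -> list[str]:
    """List modules whose pinned ref differs from the required one."""
    return sorted(
        name for name, ref in pins.items() if name in required and ref != required[name]
    )


if __name__ == "__main__":
    text = 'module "eks" {\n  source  = "git::https://repo//modules/eks?ref=v1.2.0"\n}'
    pins = pinned_versions(text)
    # v1.3.0 is a hypothetical newer tag used only for this example
    print(pins, outdated(pins, {"eks": "v1.3.0"}))
```

Running a check like this across all application repos gives the platform team a list of who still needs to upgrade before an old tag is retired.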

PART 4 — BLUE GREEN DEPLOYMENT PER MICROSERVICE

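For each service, run two Deployments. The labels below must be set on the pod template (spec.template.metadata.labels), because a Service selects pods, not Deployments: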

Blue Deployment:

metadata:
  labels:
    app: users
    version: blue

Green Deployment:

metadata:
  labels:
    app: users
    version: green

Service (initially routes to blue):

selector:
  app: users
  version: blue

Switch traffic by changing the Service selector:

selector:
  app: users
  version: green
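The switch is a one-field change, which also makes rollback a one-field change. As a rough sketch of that idea, the helper below flips the selector on a Service manifest held as a plain dictionary and builds a patch body that could be passed to `kubectl patch service users -p '<body>'`. The function names are hypothetical, and the manifest shape is assumed to match the snippets above.

```python
import copy
import json


def switch_selector(service: dict, target_version: str) -> dict:
    """Return a copy of the Service manifest whose selector points at target_version."""
    if target_version not in ("blue", "green"):
        raise ValueError(f"unknown version: {target_version}")
    updated = copy.deepcopy(service)  # leave the caller's manifest untouched
    updated["spec"]["selector"]["version"] = target_version
    return updated


def patch_body(target_version: str) -> str:
    """Build a JSON patch body that changes only the version selector."""
    return json.dumps({"spec": {"selector": {"version": target_version}}})


if __name__ == "__main__":
    service = {"spec": {"selector": {"app": "users", "version": "blue"}}}
    print(switch_selector(service, "green"))
    print(patch_body("blue"))  # the same call with "blue" is the rollback
```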

PART 5 — 100 SERVICES SCENARIO

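Suppose the platform runs 100 services and a release modifies only 10 of them.

Only those 10 get green deployments.

Routing stays unchanged for the other 90, so the blast radius of the release is limited to the services that actually changed.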
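One way to decide which services need a green deployment is to compare the image tag that is live with the tag the release wants. This is a sketch under that assumption; the function name and the `svc-N` service names are made up for the example.

```python
def services_needing_green(live: dict[str, str], desired: dict[str, str]) -> list[str]:
    """Return the services whose desired image tag differs from the live one."""
    return sorted(name for name, tag in desired.items() if live.get(name) != tag)


if __name__ == "__main__":
    # 100 hypothetical services, all currently on tag 1.0
    live = {f"svc-{i}": "1.0" for i in range(100)}
    desired = dict(live)
    for i in range(10):
        desired[f"svc-{i}"] = "1.1"  # only these 10 changed in this release
    print(services_needing_green(live, desired))
```

Services that are absent from the result keep their current Deployment and Service selector, so nothing about their routing is touched.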

PART 6 — IMAGE GOVERNANCE SCENARIO

Interview question:
“How do you know which services use a bad base image?”

Answer, in practice: search every service repo for the tag:

grep -r "python-base:1.0" .

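Cluster check (this lists the application images that are running; the base image does not appear in the pod spec, so map these back to the repos found above):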

kubectl get pods -A -o jsonpath="{..image}"
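The kubectl command prints every image reference separated by whitespace, with repeats. A small sketch of how that output could be cross-checked against the services found by the grep step is shown below. The function names and the sample image references are hypothetical, and the image name is assumed to be the last path component of the reference.

```python
def image_name(ref: str) -> str:
    """Extract the bare image name: drop the digest, registry path, and tag."""
    without_digest = ref.split("@")[0]
    last_component = without_digest.rsplit("/", 1)[-1]
    return last_component.split(":")[0]


def running_affected(kubectl_output: str, affected_services: set[str]) -> list[str]:
    """Return the distinct running image refs that belong to affected services."""
    refs = set(kubectl_output.split())  # split() handles spaces and newlines
    return sorted(ref for ref in refs if image_name(ref) in affected_services)


if __name__ == "__main__":
    # Hypothetical output of: kubectl get pods -A -o jsonpath="{..image}"
    output = "reg/enterprise/users:2.3 reg/enterprise/orders:1.4 reg/enterprise/users:2.3 nginx:1.25"
    print(running_affected(output, {"users"}))
```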

Enforce via policy:

Use OPA Gatekeeper or Kyverno as an admission policy. The rule below is pseudocode, not a complete policy:

deny:
  message: "Unapproved base image"

PART 7 — DISTRIBUTED TRACING

Install Jaeger.

Each service reads the incoming trace ID:

trace_id = request.headers.get("X-Trace-ID")

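Every log line includes the trace ID.

Search in ELK:

trace_id: 12345

Jaeger UI shows the full request graph, provided each service is also instrumented to emit spans (for example with OpenTelemetry).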
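For the log search to work across services, the ID has to be created once at the entry point, forwarded on every downstream call, and written into every log line. The following is a minimal sketch of those three pieces using only the standard library. The helper names are hypothetical, and headers are treated as a plain dictionary, whereas a real web framework usually offers case-insensitive header lookup.

```python
import logging
import uuid

TRACE_HEADER = "X-Trace-ID"  # custom header name taken from the snippet above


def get_or_create_trace_id(headers: dict) -> str:
    """Reuse the caller's trace ID, or mint one if this service is the entry point."""
    return headers.get(TRACE_HEADER) or uuid.uuid4().hex


def outgoing_headers(trace_id: str) -> dict:
    """Headers to attach to downstream calls so the same ID follows the request."""
    return {TRACE_HEADER: trace_id}


class TraceAdapter(logging.LoggerAdapter):
    """Prefixes every log message with the trace ID so it is searchable in ELK."""

    def process(self, msg, kwargs):
        return f"trace_id={self.extra['trace_id']} {msg}", kwargs


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    trace_id = get_or_create_trace_id({"X-Trace-ID": "12345"})
    log = TraceAdapter(logging.getLogger("users"), {"trace_id": trace_id})
    log.info("fetching orders")
    print(outgoing_headers(trace_id))
```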

PART 8 — PATCHING + ROLLBACK LAB

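Before a Linux upgrade, snapshot the instance's EBS volume:

aws ec2 create-snapshot --volume-id vol-xxxx --description "pre-upgrade"

If the upgrade fails, restore a volume from the snapshot:

aws ec2 create-volume --snapshot-id snap-xxxx --availability-zone <az>

Or roll back the individual package: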

yum downgrade package-name

PART 9 — SAFE SHARED TEMPLATE UPDATE

Create new template version v2
Test against sample apps (3 representative apps)
Validate in staging
Gradual rollout
Monitor
Rollback if needed
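The gradual rollout step can be planned as waves: a small first wave (such as the 3 representative apps), a larger second wave, then everyone else. This is only a sketch of one possible plan. The wave sizes, the function name, and the `svc-N` repo names are assumptions, not part of the lab.

```python
def rollout_waves(repos: list[str], wave_sizes: tuple[int, ...] = (3, 10)) -> list[list[str]]:
    """Split repos into waves of the given sizes, with the remainder as a final wave."""
    waves = []
    start = 0
    for size in wave_sizes:
        if start >= len(repos):
            break  # fewer repos than the planned waves can hold
        waves.append(repos[start:start + size])
        start += size
    if start < len(repos):
        waves.append(repos[start:])  # everything left goes out last
    return waves


if __name__ == "__main__":
    repos = [f"svc-{i}" for i in range(20)]
    for number, wave in enumerate(rollout_waves(repos), start=1):
        print(f"wave {number}: {len(wave)} repos")
```

Each wave moves to template v2 only after the previous wave has been monitored, and a rollback means pointing the affected repos back at the v1 template.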

PART 10 — FORCED SECURITY UPGRADE SCENARIO

If the base image is vulnerable:

Build new base image 1.1
Update CI policy to block old tag
Auto-create PRs to update Dockerfiles
ArgoCD syncs the updated manifests to the cluster
Monitor
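Two of these steps lend themselves to automation: the CI check that blocks the old tag, and the Dockerfile rewrite that an auto-created PR would contain. The sketch below shows both on Dockerfile text, assuming the image names used earlier in this lab. The function names are hypothetical, and opening the PR itself is left out.

```python
import re

# Captures the image reference on a FROM line, skipping an optional --platform flag
FROM_RE = re.compile(r"\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)


def _uses(ref: str, image: str) -> bool:
    """True if ref is the given image, with or without a registry prefix."""
    return ref == image or ref.endswith("/" + image)


def violates_policy(dockerfile_text: str, blocked: set[str]) -> bool:
    """CI check: does any FROM line use a blocked base image tag?"""
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if match and any(_uses(match.group(1), image) for image in blocked):
            return True
    return False


def bump_base(dockerfile_text: str, old: str, new: str) -> str:
    """Rewrite FROM lines that use old so they use new; other lines are untouched."""
    out = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if match and _uses(match.group(1), old):
            ref = match.group(1)
            new_ref = ref[: len(ref) - len(old)] + new  # keep the registry prefix
            line = line[: match.start(1)] + new_ref + line[match.end(1):]
        out.append(line)
    result = "\n".join(out)
    return result + "\n" if dockerfile_text.endswith("\n") else result


if __name__ == "__main__":
    dockerfile = 'FROM <ecr>/enterprise/python-base:1.0\n\nCOPY app.py /app/\nCMD ["python3", "/app/app.py"]\n'
    print(violates_policy(dockerfile, {"python-base:1.0"}))
    print(bump_base(dockerfile, "python-base:1.0", "python-base:1.1"))
```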

PART 11 — ENTERPRISE ROUTING

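Single Ingress (simplified; a real Ingress nests each path under http.paths with a pathType and a backend service name and port):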

rules:
  - path: /users
    backend: users-service
  - path: /orders
    backend: orders-service

Blue-green switching happens at Service level, not Ingress.

WHAT THE INTERVIEWER IS REALLY ASKING

Do you understand governance?
Do you understand scale?
Do you understand versioning?
Do you understand impact isolation?
Do you understand rollout safety?
Do you understand the enterprise control model?

They are not just testing Kubernetes commands.

You now answer with:

Versioning strategy
Centralized control
Gradual rollout
Governance enforcement
Observability validation
Rollback plan
