CI/CD Patterns & Best Practices Guide
Everything you need to know to build reliable, maintainable CI/CD pipelines.
Table of Contents
- Pipeline Design Principles
- Branching Strategies
- Environment Promotion
- Docker Best Practices
- Secrets Management
- Caching Strategies
- Testing in CI
- Rollback Strategies
- Pipeline Security
- Monitoring Your Pipeline
Pipeline Design Principles
1. Fail fast
Put the cheapest, fastest checks first. A linting error that takes 2 seconds to detect shouldn't wait behind a 10-minute integration test suite.
Lint (10s) → Unit Tests (1m) → Build (2m) → Integration Tests (5m) → Deploy
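As a rough illustration of the ordering above, the sketch below runs stages cheapest-first and stops at the first failure. The stage names and the lambdas standing in for real commands are hypothetical; a real CI system expresses this through job dependencies rather than a Python loop.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]

def run_fail_fast(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that actually ran.
    """
    executed = []
    for name, check in stages:
        executed.append(name)
        if not check():
            break  # skip every later (more expensive) stage
    return executed

# Hypothetical stages, ordered cheapest first.
stages = [
    ("lint", lambda: False),        # fails within seconds
    ("unit", lambda: True),
    ("integration", lambda: True),  # never reached in this run
]
ran = run_fail_fast(stages)
print(ran)
```

Because lint fails here, the slower stages never start, which is the whole point of the ordering.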
2. Keep pipelines idempotent
Running the same pipeline twice on the same commit should produce the same result. No side effects, no partial states.
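One way to picture idempotency is a deploy step that compares desired state with current state and does nothing when they already match. This is only a sketch: the `state` dictionary is a hypothetical stand-in for whatever records the deployed version in a real system.

```python
def deploy(state: dict, env: str, commit_sha: str) -> bool:
    """Deploy commit_sha to env; return True only if something changed."""
    if state.get(env) == commit_sha:
        return False  # already at the desired state, so re-running is a no-op
    state[env] = commit_sha  # stand-in for the real rollout
    return True

state = {}
first = deploy(state, "staging", "abc123")
second = deploy(state, "staging", "abc123")  # same commit, second run
print(first, second, state)
```

The second run changes nothing, so retrying a pipeline on the same commit is safe.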
3. Make deployments boring
If your deployment process gives you anxiety, it's too complex. Automate until deploying feels like pushing a button (because it should be pushing a button).
4. Parallelize where possible
       ┌── Lint ──┐
push → ├── Test ──┼── Build → Deploy
       └── Scan ──┘
Independent stages should run in parallel, not sequentially.
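The fan-out in the diagram can be sketched with a thread pool: lint, test, and scan run concurrently, and the build only proceeds if all three pass. The three check functions are hypothetical placeholders that sleep instead of doing real work.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

def lint() -> bool:
    time.sleep(0.05)  # placeholder for a real linter run
    return True

def test() -> bool:
    time.sleep(0.05)  # placeholder for a real test run
    return True

def scan() -> bool:
    time.sleep(0.05)  # placeholder for a real security scan
    return True

def run_parallel(checks: Dict[str, Callable[[], bool]]) -> Dict[str, bool]:
    """Run independent checks concurrently and collect their results."""
    with ThreadPoolExecutor(max_workers=len(checks) or 1) as pool:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        return {name: future.result() for name, future in futures.items()}

results = run_parallel({"lint": lint, "test": test, "scan": scan})
can_build = all(results.values())  # Build waits for every parallel check
print(results, can_build)
```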
5. One artifact, many environments
Build once, deploy the same artifact to dev → staging → production. Never rebuild for each environment. Use environment variables for configuration differences.
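A small sketch of the idea, assuming the artifact is a container image tagged with the commit SHA: every environment receives the same image reference and only the configuration differs. The repository name, the 12-character tag length, and the per-environment values are made up for illustration.

```python
from typing import Dict, List

# Hypothetical per-environment configuration; only these values differ.
ENV_CONFIG: Dict[str, Dict[str, str]] = {
    "dev": {"LOG_LEVEL": "debug"},
    "staging": {"LOG_LEVEL": "info"},
    "production": {"LOG_LEVEL": "warning"},
}

def image_ref(repo: str, commit_sha: str) -> str:
    """Build one immutable image reference from the commit SHA."""
    return f"{repo}:{commit_sha[:12]}"

def deployment_plan(repo: str, commit_sha: str) -> List[dict]:
    """Pair the single artifact with each environment's configuration."""
    ref = image_ref(repo, commit_sha)  # computed once, reused everywhere
    return [{"env": env, "image": ref, "config": cfg}
            for env, cfg in ENV_CONFIG.items()]

plan = deployment_plan("registry.example.com/myapp", "0123456789abcdef0123")
for step in plan:
    print(step)
```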
Branching Strategies
GitHub Flow (recommended for most teams)
main ─────●──────●──────●──────●─────
           \    /        \    /
            └──┘          └──┘
         feature/x     feature/y
- main is always deployable
- Features branch from main, merge back via PR
- Deploy from main (or tags for versioned releases)
GitFlow (for versioned software releases)
main ─────●────────────────●─────
           \              /
develop ────●──●──●──●──●──────
             \      /
             feature/x
- develop for integration
- main for releases
- Feature branches merge into develop
Trunk-Based Development (for experienced teams)
main ─●─●─●─●─●─●─●─●─●─
- Everyone commits to main (or very short-lived branches)
- Requires: feature flags, good test coverage, CI that runs in < 10 minutes
Environment Promotion
The promotion pipeline
Build → Dev (auto) → Staging (auto) → Production (manual approval)
| Environment | Deploy Trigger | Purpose |
|---|---|---|
| Dev | Push to develop | Developer testing, integration |
| Staging | Push to main | Pre-production validation, QA |
| Production | Git tag v* + manual approval | Live users |
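The trigger rules in the table can be sketched as a small routing function. The `refs/heads/...` and `refs/tags/...` strings follow Git's ref naming; the function name and the `approved` flag are illustrative assumptions, not part of any CI platform's API.

```python
import fnmatch
from typing import Optional

def target_environment(ref: str, approved: bool = False) -> Optional[str]:
    """Map a Git ref to the environment it should deploy to, if any."""
    if ref == "refs/heads/develop":
        return "dev"
    if ref == "refs/heads/main":
        return "staging"
    if fnmatch.fnmatch(ref, "refs/tags/v*"):
        # Production needs both a version tag and a manual approval.
        return "production" if approved else None
    return None  # feature branches and other refs do not deploy

print(target_environment("refs/heads/develop"))
print(target_environment("refs/tags/v1.3.0"))
print(target_environment("refs/tags/v1.3.0", approved=True))
```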
Environment configuration
Never hardcode environment-specific values. Use environment variables:
# docker-compose.yml
services:
  app:
    environment:
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_URL=${REDIS_URL}
      - LOG_LEVEL=${LOG_LEVEL:-info}
Each environment sets these via:
- GitHub: Environment secrets/variables
- GitLab: CI/CD variables with environment scope
- Jenkins: Credentials + environment blocks
- Azure DevOps: Variable groups linked to environments
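On the application side, the same variables might be read once at startup. The sketch below mirrors the compose file: `DATABASE_URL` and `REDIS_URL` are required, and `LOG_LEVEL` falls back to `info` when unset or empty. The `Settings` class and `load_settings` function are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

@dataclass(frozen=True)
class Settings:
    database_url: str
    redis_url: str
    log_level: str

def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read configuration from the environment; fail loudly if incomplete."""
    env = os.environ if env is None else env
    missing = [key for key in ("DATABASE_URL", "REDIS_URL") if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing required variables: {', '.join(missing)}")
    return Settings(
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        log_level=env.get("LOG_LEVEL") or "info",  # same default as the compose file
    )

# Example with an explicit mapping instead of the real process environment.
settings = load_settings({"DATABASE_URL": "postgresql://db/app",
                          "REDIS_URL": "redis://cache:6379"})
print(settings.log_level)
```

Failing at startup when a variable is missing tends to be easier to debug than a connection error deep inside a request.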
Docker Best Practices
Multi-stage builds
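# Stage 1: Build (has compilers, dev tools; ~1GB)
FROM python:3.12 AS builder
# Install dependencies into a virtual environment so they can be copied as one directory
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Stage 2: Runtime (minimal image; ~120MB)
FROM python:3.12-slim
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"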
Image size reduction
| Technique | Impact |
|---|---|
| Multi-stage builds | -80% image size |
| Alpine/slim base images | -60% base size |
| .dockerignore | -50% build context |
| --no-cache-dir (pip) | -10% layer size |
| Combine RUN commands | -5% (fewer layers) |
Security
- Non-root user: Always run as non-root in production
- Scan images: Use Trivy, Snyk, or Docker Scout
- Pin versions: Use specific tags, not latest
- Distroless: Consider Google's distroless images for maximum security
Secrets Management
What NOT to do
# NEVER commit secrets to your repository
env:
  DATABASE_URL: "postgresql://admin:password123@prod-db:5432/app"
Platform-specific secret storage
GitHub Actions:
env:
  DB_URL: ${{ secrets.DATABASE_URL }}
GitLab CI:
variables:
  DB_URL: $DATABASE_URL  # Masked, protected variable defined in the project's CI/CD settings
Jenkins:
environment {
    DB_URL = credentials('database-url')
}
Azure DevOps:
variables:
  - group: production-secrets  # Variable group
Secret rotation
- Rotate secrets every 90 days (or on any suspected compromise)
- Use short-lived tokens where possible (OIDC, workload identity)
- Audit secret access regularly
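The 90-day rule could be checked by a scheduled job along these lines. This is a sketch under the assumption that the last rotation date of each secret is recorded somewhere; the secret names and dates below are invented.

```python
from datetime import date, timedelta
from typing import Dict, List

MAX_AGE = timedelta(days=90)

def secrets_due_for_rotation(last_rotated: Dict[str, date], today: date) -> List[str]:
    """Return the names of secrets rotated 90 or more days ago."""
    return sorted(name for name, rotated in last_rotated.items()
                  if today - rotated >= MAX_AGE)

# Hypothetical inventory of secrets and their last rotation dates.
inventory = {
    "DATABASE_URL": date(2024, 1, 1),
    "REGISTRY_TOKEN": date(2024, 3, 1),
}
due = secrets_due_for_rotation(inventory, today=date(2024, 4, 1))
print(due)
```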
Caching Strategies
Dependency caching
GitHub Actions:
- uses: actions/cache@v4
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
GitLab CI:
cache:
  key: "$CI_COMMIT_REF_SLUG"
  paths:
    - .cache/pip/
    - node_modules/
Docker layer caching
# GitHub Actions — GHA cache backend
- uses: docker/build-push-action@v5
  with:
    cache-from: type=gha
    cache-to: type=gha,mode=max
What to cache
| Cache Target | Key | Expected Speedup |
|---|---|---|
| pip/npm packages | Hash of lockfile | 30-60s |
| Docker layers | Branch + Dockerfile hash | 2-5min |
| Build artifacts | Commit SHA | 1-3min |
| Test results | Not recommended (always re-run) | — |
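To show why a lockfile hash makes a good cache key, here is a minimal sketch: the key changes exactly when the dependency list changes, so unchanged dependencies hit the cache. The key layout loosely imitates the GitHub Actions example above; the function name and sample contents are hypothetical.

```python
import hashlib

def cache_key(os_name: str, tool: str, lockfile_bytes: bytes) -> str:
    """Derive a cache key that changes only when the lockfile changes."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()
    return f"{os_name}-{tool}-{digest}"

before = cache_key("Linux", "pip", b"requests==2.31.0\n")
same = cache_key("Linux", "pip", b"requests==2.31.0\n")
after = cache_key("Linux", "pip", b"requests==2.32.0\n")
print(before == same, before == after)
```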
Testing in CI
Test pyramid in CI
   /  E2E Tests   \     ← Slow, flaky (run nightly or on main only)
  /  Integration   \    ← Medium (run on every PR)
 /    Unit Tests    \   ← Fast, reliable (run on every push)
/____________________\
Recommended CI test strategy
| Test Type | When to Run | Timeout | Retry |
|---|---|---|---|
| Lint | Every push | 2min | No |
| Unit tests | Every push | 5min | No |
| Integration tests | Every PR | 10min | 1x |
| E2E tests | Main branch, nightly | 20min | 2x |
| Security scan | Every PR | 5min | No |
Dealing with flaky tests
- Quarantine: Move flaky tests to a separate job that's allowed to fail
- Retry once: Retry failed tests once before marking as failed
- Fix or delete: Flaky tests that aren't fixed within 2 weeks should be deleted
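The "retry once" policy can be sketched as a small wrapper that reports how many attempts were needed, which also helps spot tests that only pass on retry. Most test runners offer this through plugins or settings; the helper below is a hypothetical stand-alone version.

```python
from typing import Callable, Tuple

def run_with_retry(test: Callable[[], None], retries: int = 1) -> Tuple[bool, int]:
    """Run a test, retrying on AssertionError; return (passed, attempts)."""
    attempts = 0
    for _ in range(retries + 1):
        attempts += 1
        try:
            test()
            return True, attempts
        except AssertionError:
            continue  # try again if any retries remain
    return False, attempts

calls = []

def flaky() -> None:
    """Hypothetical flaky test: fails on the first call only."""
    calls.append(1)
    if len(calls) == 1:
        raise AssertionError("transient failure")

passed, attempts = run_with_retry(flaky)
print(passed, attempts)
```

A test that passes with `attempts > 1` is a candidate for quarantine rather than a clean pass.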
Rollback Strategies
1. Redeploy previous version (recommended)
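# Docker: pull the previous tag and restart only the app service
# (assumes the compose file sets image: myapp:${APP_VERSION})
docker pull myapp:v1.2.3 # previous known-good version
APP_VERSION=v1.2.3 docker compose up -d --no-deps app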
2. Git revert
git revert HEAD # Creates a new commit that undoes the last change
git push # Triggers normal CI/CD pipeline
3. Feature flags
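def checkout(request, user):
    # feature_flags is the application's flag client; disabling the flag rolls back instantly
    if feature_flags.is_enabled("new-checkout-flow", user):
        return new_checkout(request)
    else:
        return old_checkout(request)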
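The `is_enabled` check above depends on some flag service. As a rough sketch of how such a check might work, the class below hashes the flag name and user ID into a stable bucket from 0 to 99 and compares it with a rollout percentage. The class and its rollout table are hypothetical, not the API of any particular flag library.

```python
import hashlib
from typing import Dict

class FeatureFlags:
    """Minimal percentage-rollout flags with stable per-user bucketing."""

    def __init__(self, rollout: Dict[str, int]):
        self.rollout = rollout  # flag name -> percentage of users (0 to 100)

    def is_enabled(self, flag: str, user_id: str) -> bool:
        percent = self.rollout.get(flag, 0)  # unknown flags are off
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100  # same user always lands in the same bucket
        return bucket < percent

flags = FeatureFlags({"new-checkout-flow": 100, "dark-mode": 0})
print(flags.is_enabled("new-checkout-flow", "user-42"))
print(flags.is_enabled("dark-mode", "user-42"))
```

Setting a flag's percentage back to 0 turns the feature off for everyone without a redeploy.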
4. Blue-green / canary deployments
Blue (current): 100% traffic → v1.2.3
Green (new): 0% traffic → v1.3.0
# Gradually shift traffic
Blue: 90% → Green: 10% (canary)
Blue: 50% → Green: 50% (50/50)
Blue: 0% → Green: 100% (full cutover)
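The gradual shift can be sketched as a loop that moves traffic step by step and rolls back to 0% as soon as a health check fails. The step list and the `healthy` callback are assumptions; in practice the shift is done by a load balancer or service mesh, and health comes from real metrics.

```python
from typing import Callable, Sequence

def shift_traffic(steps: Sequence[int], healthy: Callable[[int], bool]) -> int:
    """Walk green's traffic share through steps; return the final share.

    Returns 0 if any step is unhealthy, meaning traffic went back to blue.
    """
    green = 0
    for share in steps:
        green = share
        if not healthy(green):
            return 0  # roll back: blue serves 100% again
    return green

full_cutover = shift_traffic([10, 50, 100], healthy=lambda share: True)
rolled_back = shift_traffic([10, 50, 100], healthy=lambda share: share < 50)
print(full_cutover, rolled_back)
```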
Pipeline Security
Supply chain security
- Pin action versions: Use a commit SHA, not a tag (uses: actions/checkout@abc123, written out as the full-length SHA in practice)
- Lock dependencies: Always commit lockfiles
- Scan dependencies: Use Dependabot, Renovate, or Snyk
- Sign artifacts: Use Sigstore/cosign for container images
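A pinning policy is easier to keep when it is checked automatically. The sketch below scans workflow text for `uses:` lines whose version is not a 40-character hexadecimal commit SHA. It is a simple line-based check, not a YAML parser, and the sample workflow and SHA are placeholders.

```python
import re
from typing import List

USES = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)")
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")

def unpinned_actions(workflow_text: str) -> List[str]:
    """Return action references that are not pinned to a full commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = USES.match(line)
        if not match:
            continue
        ref = match.group(1)
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and Docker images are out of scope here
        _, _, version = ref.partition("@")
        if not FULL_SHA.match(version):
            found.append(ref)
    return found

# Placeholder workflow; the 40-character SHA is made up for the example.
workflow = """
steps:
  - uses: actions/checkout@v4
  - uses: actions/setup-python@0123456789abcdef0123456789abcdef01234567
"""
problems = unpinned_actions(workflow)
print(problems)
```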
Least privilege
- CI service accounts should have minimal permissions
- Use OIDC/workload identity instead of long-lived secrets
- Scope secrets to specific environments/branches
Branch protection
- Require PR reviews before merging to main
- Require CI to pass before merging
- Prevent force-pushing to main
- Require signed commits (optional but recommended)
Monitoring Your Pipeline
Key metrics to track
| Metric | Good | Needs Work | Critical |
|---|---|---|---|
| Pipeline duration | < 10 min | 10-20 min | > 20 min |
| Success rate | > 95% | 90-95% | < 90% |
| Time to deploy | < 30 min | 30-60 min | > 60 min |
| Mean time to recovery | < 1 hour | 1-4 hours | > 4 hours |
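The thresholds in the table translate directly into checks that a dashboard or alert might run. The sketch covers the first two rows; the function names are hypothetical, and the boundary values (exactly 10 or 20 minutes, exactly 95%) are assigned to "needs work" as one reasonable reading of the table.

```python
def rate_pipeline_duration(minutes: float) -> str:
    """Classify pipeline duration using the thresholds from the table."""
    if minutes < 10:
        return "good"
    if minutes <= 20:
        return "needs work"
    return "critical"

def rate_success_rate(percent: float) -> str:
    """Classify pipeline success rate using the thresholds from the table."""
    if percent > 95:
        return "good"
    if percent >= 90:
        return "needs work"
    return "critical"

print(rate_pipeline_duration(8), rate_pipeline_duration(25))
print(rate_success_rate(97.5), rate_success_rate(92))
```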
DORA metrics
The Four Keys of DevOps performance:
- Deployment Frequency: How often you deploy to production
- Lead Time for Changes: Time from commit to production
- Change Failure Rate: % of deployments that cause failures
- Mean Time to Recovery: How quickly you recover from failures
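Given a log of deployments, the four metrics can be computed in a few lines. The record layout below is an assumption for illustration; the function expects at least one deployment, and real data would usually come from the CI system and incident tracker.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    recovered_at: Optional[datetime] = None

def dora_metrics(deploys: List[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics for a non-empty list of deployments."""
    failures = [d for d in deploys if d.failed]
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    recoveries = [d.recovered_at - d.deployed_at for d in failures if d.recovered_at]
    return {
        "deployment_frequency_per_day": len(deploys) / period_days,
        "lead_time": sum(lead_times, timedelta()) / len(deploys),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_recovery": (sum(recoveries, timedelta()) / len(recoveries)
                                  if recoveries else None),
    }

t0 = datetime(2024, 1, 1, 9, 0)
history = [
    Deployment(committed_at=t0, deployed_at=t0 + timedelta(hours=1)),
    Deployment(committed_at=t0, deployed_at=t0 + timedelta(hours=3), failed=True,
               recovered_at=t0 + timedelta(hours=3, minutes=30)),
]
metrics = dora_metrics(history, period_days=2)
print(metrics)
```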
Datanest Digital | datanest.dev | MIT License
Questions? Email support@datanest.dev