DevOps for Startups: Setting Up CI/CD Without a Full DevOps Team

#docker #github #cicd #devops

Most early-stage US startups don't have a dedicated DevOps engineer. They have developers who also manage infrastructure, which means CI/CD setups that don't exist, break regularly, or were copy-pasted from a tutorial that nobody on the team fully understands.

Here's a production-grade CI/CD setup that a small team can actually own and maintain, without needing a DevOps specialist on staff.

The Stack

For a US startup with a Node.js/React application deployed to AWS, this setup covers 95% of what you need:

Source control:      GitHub
CI/CD:               GitHub Actions
Container registry:  Amazon ECR
Orchestration:       Amazon ECS (Fargate)
Infrastructure:      Terraform
Secrets:             AWS Secrets Manager

All of these have generous free tiers or are inexpensive at startup scale.

The GitHub Actions Pipeline

Three workflows cover the full lifecycle:

# .github/workflows/ci.yml, runs on every PR
name: CI

on:
  pull_request:
    branches: [main, develop]

jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: test
          POSTGRES_DB: app_test
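        # Map the port so steps running on the runner can reach localhost:5432
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5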

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run type check
        run: npm run type-check

      - name: Run tests
        run: npm test
        env:
          DATABASE_URL: postgresql://postgres:test@localhost:5432/app_test

      - name: Build
        run: npm run build

# .github/workflows/deploy-staging.yml, deploys on merge to develop
name: Deploy to Staging

on:
  push:
    branches: [develop]

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: staging

    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

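      - name: Build and push Docker image
        id: build-image
        env:
          ECR_REGISTRY: ${{ steps.login-ecr.outputs.registry }}
          IMAGE_TAG: ${{ github.sha }}
        run: |
          docker build -t $ECR_REGISTRY/app:$IMAGE_TAG .
          docker push $ECR_REGISTRY/app:$IMAGE_TAG
          echo "image=$ECR_REGISTRY/app:$IMAGE_TAG" >> "$GITHUB_OUTPUT"

      # Point the task definition at the image that was just pushed.
      # The container name "app" is an assumption; use the name from your task definition.
      - name: Render task definition
        id: render-task-def
        uses: aws-actions/amazon-ecs-render-task-definition@v1
        with:
          task-definition: .aws/task-definition-staging.json
          container-name: app
          image: ${{ steps.build-image.outputs.image }}

      - name: Deploy to ECS
        uses: aws-actions/amazon-ecs-deploy-task-definition@v1
        with:
          task-definition: ${{ steps.render-task-def.outputs.task-definition }}
          service: app-staging
          cluster: app-staging
          wait-for-service-stability: true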

The production deploy workflow has the same steps, but it triggers on merge to main, targets the production environment, service, and cluster, and requires a manual approval gate in GitHub Environments.

Dockerfile for Production

# Multi-stage build: keeps the final image lean
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
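# Install all dependencies: the build step needs dev dependencies (compiler, bundler)
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies so only production modules reach the final image
RUN npm prune --omit=dev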

FROM node:20-alpine AS runner
WORKDIR /app

# Run as non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=builder --chown=appuser:appgroup /app/package.json ./

EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD wget -qO- http://localhost:3000/health || exit 1

CMD ["node", "dist/app.js"]

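The multi-stage build keeps your final Docker image small (no dev dependencies, no source files). The health check tells Docker when a container is healthy. ECS does not read a HEALTHCHECK baked into the image, so declare the same command in the healthCheck block of the container definition in your task definition; that is what lets ECS know when a container is ready, and when to replace it.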
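
The health check assumes the app answers GET /health on port 3000. A minimal sketch of such an endpoint, using only Node's built-in http module; the names `state`, `healthStatus`, `handler`, and `startServer` are hypothetical, and in a real app this would be one route among many rather than the whole server:

```typescript
import { createServer, IncomingMessage, ServerResponse } from 'node:http';

// Hypothetical readiness flag: set dbReady to true once the database connection is up.
const state = { dbReady: false };

// Pure helper so the status logic can be tested without opening a socket.
function healthStatus(method: string | undefined, url: string | undefined, dbReady: boolean): number {
  if (method !== 'GET' || url !== '/health') return 404;
  // A 503 makes wget exit non-zero, so the container health check fails.
  return dbReady ? 200 : 503;
}

function handler(req: IncomingMessage, res: ServerResponse): void {
  const status = healthStatus(req.method, req.url, state.dbReady);
  res.writeHead(status, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ ok: status === 200 }));
}

// Call from your entry point, e.g. startServer(3000), to match EXPOSE 3000.
function startServer(port: number) {
  return createServer(handler).listen(port);
}
```

Returning 503 until dependencies are ready keeps traffic away from a container that has started but cannot serve requests yet.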

Database Migrations in CI/CD

Database migrations are the trickiest part of a CI/CD pipeline. The safest approach:

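# In your deploy workflow, before the ECS task update:
- name: Run database migrations
  run: |
    # Run migrations as a one-off ECS task, not in the web container startup
    TASK_ARN=$(aws ecs run-task \
      --cluster app-${{ env.ENVIRONMENT }} \
      --task-definition app-migrate-${{ env.ENVIRONMENT }} \
      --overrides '{"containerOverrides":[{"name":"migrate","command":["npm","run","migrate"]}]}' \
      --launch-type FARGATE \
      --network-configuration "..." \
      --query 'tasks[0].taskArn' --output text)

    # run-task returns as soon as the task is scheduled, so wait for it to stop
    aws ecs wait tasks-stopped --cluster app-${{ env.ENVIRONMENT }} --tasks "$TASK_ARN"

    # Fail the job unless the migrate container exited with code 0
    EXIT_CODE=$(aws ecs describe-tasks --cluster app-${{ env.ENVIRONMENT }} --tasks "$TASK_ARN" \
      --query 'tasks[0].containers[0].exitCode' --output text)
    test "$EXIT_CODE" = "0"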

Never run migrations automatically on container startup. If a migration fails, you want it to fail loudly in CI before ECS tries to boot any web containers.
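
For a failed migration to surface in CI, the script behind `npm run migrate` has to exit non-zero when something goes wrong. A minimal sketch of that behavior, assuming a hypothetical `Migration` shape and runner; a real project would normally lean on its migration tool rather than hand-rolling this:

```typescript
// Hypothetical migration shape: a name plus an async function that applies it.
interface Migration {
  name: string;
  up: () => Promise<void>;
}

interface MigrationResult {
  applied: string[];     // migrations that completed, in order
  failed: string | null; // name of the first migration that threw, if any
}

// Apply migrations in order and stop at the first failure, so later
// migrations never run against a half-migrated schema.
async function runMigrations(migrations: Migration[]): Promise<MigrationResult> {
  const applied: string[] = [];
  for (const migration of migrations) {
    try {
      await migration.up();
      applied.push(migration.name);
    } catch (err) {
      console.error(`Migration ${migration.name} failed:`, err);
      return { applied, failed: migration.name };
    }
  }
  return { applied, failed: null };
}

// Map the result to a process exit code: 0 on success, 1 on any failure.
function exitCodeFor(result: MigrationResult): number {
  return result.failed === null ? 0 : 1;
}
```

In the script's entry point, setting `process.exitCode` to `exitCodeFor(result)` makes the ECS task's exit code reflect the outcome, which is what the deploy job can then check.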

Environment Secrets

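Application secrets live in AWS Secrets Manager, not in GitHub secrets or environment files. (The AWS credentials used by the deploy workflow are the one exception; GitHub's OIDC support with an IAM role can replace those long-lived keys.) The app loads its secrets at startup: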

// Load secrets from AWS Secrets Manager at app startup
import { SecretsManagerClient, GetSecretValueCommand } from '@aws-sdk/client-secrets-manager';

async function loadSecrets() {
  const client = new SecretsManagerClient({ region: 'us-east-1' });
  const response = await client.send(new GetSecretValueCommand({
    SecretId: `app/${process.env.ENVIRONMENT}/secrets`
  }));
  const secrets = JSON.parse(response.SecretString!);
  Object.assign(process.env, secrets);
}

// Call before app initializes
await loadSecrets();

The ECS task role grants permission to read these secrets, so no credentials are needed in the container itself.
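
Because the loader copies whatever keys the secret contains into process.env, a missing key only shows up later as a runtime error. One possible guard is sketched here; `parseSecrets` is a hypothetical helper, and the key names in the usage note are examples, not part of the original setup:

```typescript
// Parse the SecretString payload and verify every required key is present.
// Throws early so the container fails at startup instead of mid-request.
function parseSecrets(secretString: string | undefined, requiredKeys: string[]): Record<string, string> {
  if (!secretString) {
    throw new Error('Secret has no SecretString (binary secrets are not handled here)');
  }
  const parsed: unknown = JSON.parse(secretString);
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('Secret payload must be a JSON object of key/value pairs');
  }
  const secrets: Record<string, string> = {};
  for (const [key, value] of Object.entries(parsed)) {
    secrets[key] = String(value); // process.env values are always strings
  }
  const missing = requiredKeys.filter((key) => !(key in secrets));
  if (missing.length > 0) {
    throw new Error(`Missing required secrets: ${missing.join(', ')}`);
  }
  return secrets;
}
```

Inside `loadSecrets`, the `JSON.parse` line could then become something like `parseSecrets(response.SecretString, ['DATABASE_URL'])`, with the list of required keys adjusted to whatever your app actually reads.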

Monitoring: The Minimum You Need

A startup CI/CD setup is incomplete without alerting:

// CloudWatch metric filter → alarm → SNS → Slack/PagerDuty
// Add to your Terraform:

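resource "aws_cloudwatch_metric_alarm" "error_rate" {
  alarm_name  = "${var.app_name}-${var.env}-5xx-rate"
  metric_name = "HTTPCode_Target_5XX_Count"
  namespace   = "AWS/ApplicationELB"

  // ALB metrics are published per load balancer, so the alarm needs this dimension.
  // aws_lb.app is an assumed resource name; point it at your own load balancer.
  dimensions = {
    LoadBalancer = aws_lb.app.arn_suffix
  }

  statistic           = "Sum"
  period              = 60
  evaluation_periods  = 3
  threshold           = 10
  comparison_operator = "GreaterThanThreshold"

  // The 5XX count is only reported when errors occur; treat gaps as healthy.
  treat_missing_data = "notBreaching"

  alarm_actions = [aws_sns_topic.alerts.arn]
}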

Three alarms are the minimum viable monitoring setup: 5xx error rate, response time (P99), and ECS task failure count.
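
To make the alarm settings concrete: with a 60-second period, 3 evaluation periods, and a threshold of 10, the alarm fires only when the per-minute 5xx sum exceeds 10 for three consecutive minutes. A simplified model of that rule follows; `shouldAlarm` is a hypothetical name, and the sketch ignores missing-data handling and the separate "datapoints to alarm" setting:

```typescript
// Each entry is the Sum of 5xx responses in one 60-second period, oldest first.
function shouldAlarm(periodSums: number[], threshold: number, evaluationPeriods: number): boolean {
  if (periodSums.length < evaluationPeriods) return false; // not enough data yet
  const recent = periodSums.slice(-evaluationPeriods);
  // GreaterThanThreshold: every evaluated period must be strictly above the threshold.
  return recent.every((sum) => sum > threshold);
}

shouldAlarm([2, 14, 25, 11], 10, 3); // true: the last three minutes all exceed 10
shouldAlarm([14, 25, 10], 10, 3);    // false: 10 is not strictly greater than 10
shouldAlarm([50, 0, 0], 10, 3);      // false: a single spike does not fire the alarm
```

Requiring several consecutive breaching periods trades a few minutes of detection delay for fewer pages caused by one-off error bursts.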

This setup takes about a day to configure from scratch. After that, every PR gets tested automatically, staging deploys on every merge to develop, and production deploys are one approved merge away.

If you need DevOps infrastructure set up properly for a US startup, without hiring a full DevOps team, that's work I do. See my DevOps services at waqarhabib.com/services/devops-services.

Originally published at waqarhabib.com