DEV Community

Samson Tanimawo


GitOps for Infrastructure: How We Deploy With Zero SSH

The Last Time I Used SSH

I haven't SSH'd into a production server in 14 months. Not because I'm lazy, but because our infrastructure doesn't require it.

GitOps changed everything.

What GitOps Actually Means

GitOps = Git is the single source of truth for your infrastructure. Every change goes through a PR. No manual kubectl, no SSH, no ClickOps.

Traditional: Developer → kubectl apply → Cluster
GitOps: Developer → Git PR → CI Review → ArgoCD → Cluster

Our Setup

Repositories:
├── app-service-a/          # Application code + Dockerfile
├── app-service-b/          # Application code + Dockerfile
└── infrastructure/         # All K8s manifests
    ├── base/               # Shared configurations
    │   ├── namespaces/
    │   ├── network-policies/
    │   └── rbac/
    ├── services/
    │   ├── api-service/
    │   │   ├── deployment.yaml
    │   │   ├── service.yaml
    │   │   ├── hpa.yaml
    │   │   └── kustomization.yaml
    │   └── payment-service/
    └── environments/
        ├── staging/
        │   └── kustomization.yaml
        └── production/
            └── kustomization.yaml

ArgoCD Configuration

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: production-services
spec:
  project: default
  source:
    repoURL: https://github.com/org/infrastructure
    targetRevision: main
    path: environments/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true      # Delete resources removed from Git
      selfHeal: true   # Revert manual changes
    syncOptions:
      - CreateNamespace=true

The key: selfHeal: true. If someone manually changes something, ArgoCD reverts it within 3 minutes. Git is the truth.
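To make the prune and selfHeal behavior concrete, here is a toy model of one sync pass. It is only a sketch: resources are plain dicts keyed by name, and the `reconcile` function and the `debug-pod` resource are made up for illustration. This is not how ArgoCD is implemented.

```python
# Toy model of one sync pass with prune and selfHeal both enabled.
# Hypothetical names throughout; resources are plain dicts keyed by name.

def reconcile(desired: dict, live: dict) -> list[str]:
    """Mutate `live` until it matches `desired`; return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in live:
            live[name] = dict(spec)
            actions.append(f"create {name}")
        elif live[name] != spec:
            # selfHeal: a manual change is overwritten with the Git version
            live[name] = dict(spec)
            actions.append(f"revert {name}")
    # prune: anything in the cluster that Git does not declare is deleted
    for name in sorted(set(live) - set(desired)):
        del live[name]
        actions.append(f"prune {name}")
    return actions


desired = {"api-service": {"replicas": 3, "image": "registry/api-service:sha-abc123"}}
live = {
    "api-service": {"replicas": 5, "image": "registry/api-service:sha-abc123"},  # scaled by hand
    "debug-pod": {"image": "busybox"},  # created by hand, not in Git
}

print(reconcile(desired, live))  # ['revert api-service', 'prune debug-pod']
print(live == desired)           # True
```

Running the same pass again returns an empty list, which is the whole point: once the cluster matches Git, there is nothing left to do.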

The Deployment Flow

# 1. Developer pushes code to app repo
# 2. CI builds image: api-service:sha-abc123
# 3. CI updates infrastructure repo:

# Automated PR to infrastructure repo
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  template:
    spec:
      containers:
        - name: api
          image: registry/api-service:sha-abc123  # Updated by CI

# GitHub Actions: Update image tag
- name: Update manifest
  run: |
    # Run where the service's kustomization.yaml lives
    cd infrastructure/services/api-service
    # The name left of "=" must match the image name in deployment.yaml
    kustomize edit set image registry/api-service=registry/api-service:sha-${{ github.sha }}
    git add .
    git commit -m "deploy: api-service sha-${{ github.sha }}"
    git push
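If you're wondering what the image update step actually changes, the sketch below models the tag-override part on a parsed kustomization. It assumes the file has already been loaded into a dict; `set_image` is a hypothetical helper, and the `resources` list is a guess at what the service's kustomization.yaml contains. The real `kustomize edit set image` command does more than this (it can also rewrite the image name or pin a digest).

```python
# Sketch of the tag override that ends up in the `images:` list of a kustomization.
# `set_image` is a hypothetical helper, not part of kustomize.

def set_image(kustomization: dict, name: str, new_tag: str) -> dict:
    """Return a copy of `kustomization` with the tag override for `name` set."""
    images = [dict(entry) for entry in kustomization.get("images", [])]
    for entry in images:
        if entry["name"] == name:
            entry["newTag"] = new_tag  # update the existing override
            break
    else:
        images.append({"name": name, "newTag": new_tag})  # first deploy of this image
    return {**kustomization, "images": images}


# Assumed contents of services/api-service/kustomization.yaml
before = {"resources": ["deployment.yaml", "service.yaml", "hpa.yaml"]}
after = set_image(before, "registry/api-service", "sha-abc123")

print(after["images"])  # [{'name': 'registry/api-service', 'newTag': 'sha-abc123'}]
```

The override only takes effect when `name` matches the image name used in the Deployment, so the two have to be kept in sync.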

Benefits We've Seen

1. Complete Audit Trail

# Who changed what and when?
git log --date=short --pretty='format:%h %s (%ad)' -- environments/production/
# a1b2c3d deploy: api-service sha-abc123 (2024-03-15)
# d4e5f6a feat: add rate limiting to api (2024-03-14)
# a7b8c9d fix: increase memory limit for payment (2024-03-13)

2. Easy Rollbacks

# Rollback = revert a commit
git revert HEAD
git push
# ArgoCD detects change, reverts cluster to previous state
# Total time: ~90 seconds
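One detail worth spelling out: a revert does not erase the bad deploy, it adds a new commit that restores the previous state. Here is a minimal sketch of that idea, with commits modeled as dicts and a hypothetical `revert_head` helper. The image tags are made up for illustration.

```python
# Sketch: a rollback is a new commit on top of history, not a rewrite of it.
# Commits are modeled as dicts; `revert_head` is a hypothetical helper.

def revert_head(history: list[dict]) -> list[dict]:
    """Return history plus one commit that restores the state before HEAD."""
    if len(history) < 2:
        raise ValueError("need at least two commits to revert HEAD")
    head, parent = history[-1], history[-2]
    revert = {"message": f'Revert "{head["message"]}"', "image": parent["image"]}
    return history + [revert]


history = [
    {"message": "deploy: api-service sha-aaa111", "image": "registry/api-service:sha-aaa111"},
    {"message": "deploy: api-service sha-bbb222", "image": "registry/api-service:sha-bbb222"},
]
rolled_back = revert_head(history)

print(len(rolled_back))          # 3
print(rolled_back[-1]["image"])  # registry/api-service:sha-aaa111
```

Because the bad deploy and its revert both stay in the log, the rollback itself becomes part of the audit trail.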

3. Environment Parity

# environments/staging/kustomization.yaml
bases:
  - ../../services/api-service
patchesStrategicMerge:
  - replicas-patch.yaml    # 1 replica instead of 3

# environments/production/kustomization.yaml
bases:
  - ../../services/api-service
patchesStrategicMerge:
  - replicas-patch.yaml    # 3 replicas
  - resources-patch.yaml   # More CPU/memory

Same base, different overlays. The only differences between environments are the ones the overlays declare explicitly, so accidental drift has nowhere to hide.
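The base-plus-overlay idea can be sketched in a few lines. This is a simplified model, assuming manifests are nested dicts: `apply_patch` is a hypothetical helper, the base replica count is invented, and real strategic merge patches also handle lists (such as containers), which this sketch ignores.

```python
# Simplified model of overlaying an environment patch on a shared base.
# `apply_patch` is hypothetical and only merges nested dicts.

def apply_patch(base: dict, patch: dict) -> dict:
    """Recursively overlay `patch` on `base`; other values are replaced."""
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_patch(merged[key], value)
        else:
            merged[key] = value
    return merged


# Assumed base manifest; the replica count here is illustrative
base = {"kind": "Deployment", "metadata": {"name": "api-service"}, "spec": {"replicas": 2}}
staging = apply_patch(base, {"spec": {"replicas": 1}})
production = apply_patch(base, {"spec": {"replicas": 3}})

print(staging["spec"], production["spec"])            # {'replicas': 1} {'replicas': 3}
print(staging["metadata"] == production["metadata"])  # True
```

Everything the patches don't mention comes straight from the base, which is why the two environments can't quietly diverge.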

4. Disaster Recovery

# Cluster dies? No problem.
# 1. Provision new cluster
# 2. Install ArgoCD
# 3. Point ArgoCD at Git repo
# 4. Everything reconverges automatically

# Recovery time: ~15 minutes (cluster provisioning)
# Config loss: zero (Git has every manifest); persistent data still needs its own backups

The Cultural Shift

The hardest part wasn't technical. It was convincing engineers to stop using kubectl directly.

Old way: "I'll just quickly fix this in production" (kubectl edit)
New way: "I'll open a PR to fix this" (5 minutes longer, 100% safer)

After 3 months, nobody missed kubectl. The safety and audit trail are worth the extra 5 minutes.

If you want GitOps with AI-powered drift detection and automated remediation, check out what we're building at Nova AI Ops.


Written by Dr. Samson Tanimawo
BSc · MSc · MBA · PhD
Founder & CEO, Nova AI Ops. https://novaaiops.com
