Docker and Ansible: Setting Up a Reproducible On-Prem Stack in a Weekend
You have decided to move off the cloud. The spreadsheets convinced your CTO, the timeline is approved, and now someone — probably you — has to actually build the infrastructure. The question is not whether Docker and Ansible are the right tools. They are, for 90% of Nordic SMBs running steady-state workloads. The question is how to set them up so that your on-prem stack is reproducible, maintainable, and not a snowflake that only one person understands.
I recently stood up exactly this kind of Docker-and-Ansible on-prem stack for a client migrating off Azure — a .NET backend API, Angular frontends, PostgreSQL, and a full monitoring stack. Four VMs, zero cloud dependencies, and the whole thing was reproducible from a single ansible-playbook command within a weekend. Here is how.
Reference Architecture: The 4-VM Layout
Before writing a single line of YAML, you need a target architecture. Here is the layout I use for small-to-mid workloads — and it is the same architecture in my Cloud Exit Starter Kit:
| VM | Role | What Runs Here |
|---|---|---|
| DEV | Development environment | App containers (dev config), dev database |
| TEST | Staging / QA | App containers (staging config), test database, automated test runners |
| PROD | Production | App containers (prod config), production database, Nginx reverse proxy with SSL |
| TOOLS | Shared tooling | Harbor (container registry), SonarQube, Grafana + Loki + Promtail, CI/CD agents, Vaultwarden |
This separation is deliberate. DEV and TEST can break without touching production. TOOLS is isolated so that a misbehaving SonarQube scan does not eat your production server's RAM. And every VM is configured identically at the OS level — same packages, same users, same SSH hardening — because Ansible makes that trivial.
Why not one big server? Because isolation is cheap and debugging resource contention on a shared host is not. Four modest machines (or VMs on a hypervisor) give you clear boundaries and simpler troubleshooting.
Hardware and OS Selection
OS: Rocky Linux 9. It is a community rebuild of RHEL — the de facto CentOS successor — with roughly a 10-year support lifecycle per major release. Ubuntu 22.04 LTS is also fine — pick whichever your team already knows. I chose Rocky because the client's sysadmin had RHEL experience, and the ecosystem (SELinux policies, RPM packaging) matched their existing tooling.
Hardware: For a typical Nordic SMB, each VM needs 4–8 CPU cores, 16–32 GB RAM, and 500 GB SSD. Budget around €5,000 per server if buying physical hardware, or use an existing hypervisor. Four Dell PowerEdge T350s or equivalent run about €20,000 total and will last 5+ years.
Networking: A 1 Gbps internal switch at minimum. All four VMs should be on the same subnet with a dedicated VLAN for inter-service traffic. Nginx on the PROD VM handles external traffic with SSL termination.
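As a sketch of what that SSL termination can look like — the hostname, certificate paths, and upstream ports below are placeholders, not taken from the actual deployment:

```nginx
# nginx/conf.d/app.conf -- illustrative; hostname, cert paths, and
# upstream ports are assumptions, not values from the real setup
server {
    listen 80;
    server_name app.example.com;
    return 301 https://$host$request_uri;  # force HTTPS
}

server {
    listen 443 ssl;
    server_name app.example.com;

    ssl_certificate     /etc/letsencrypt/live/app.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/app.example.com/privkey.pem;

    location /api/ {
        # "api" resolves to the Compose service on the shared network
        proxy_pass http://api:8080/;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }

    location / {
        proxy_pass http://frontend:80/;
    }
}
```

Because Nginx runs as a container on the same Compose network as the app, the upstreams can be plain service names — no hard-coded IPs.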
Step-by-Step: Provisioning with Ansible
Directory Structure
A clean Ansible layout prevents the "where did I put that playbook" problem at 2 AM:
```
infrastructure/
├── ansible.cfg
├── inventory/
│   ├── hosts.yml
│   └── group_vars/
│       ├── all.yml
│       ├── dev.yml
│       ├── test.yml
│       ├── prod.yml
│       └── tools.yml
├── playbooks/
│   ├── site.yml          # runs everything
│   ├── common.yml        # base OS config
│   ├── docker.yml        # Docker + Compose install
│   ├── monitoring.yml    # Grafana/Loki/Promtail
│   ├── registry.yml      # Harbor setup
│   └── app-deploy.yml    # application deployment
└── roles/
    ├── base/             # SSH hardening, firewall, packages
    ├── docker/           # Docker CE + Compose plugin
    ├── nginx/            # reverse proxy + SSL
    ├── monitoring/       # Grafana stack
    └── harbor/           # container registry
```
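The tree starts with an ansible.cfg that never appears in the article; a minimal version might look like this — every value here is an assumption, adjust to your environment:

```ini
# ansible.cfg -- illustrative defaults, not from the actual project
[defaults]
inventory = inventory/hosts.yml
roles_path = roles
remote_user = deploy
# Convenient on a trusted LAN with freshly imaged VMs;
# leave at True if you manage known_hosts deliberately.
host_key_checking = False

[privilege_escalation]
become = true
become_method = sudo
```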
Inventory
```yaml
# inventory/hosts.yml
all:
  children:
    dev:
      hosts:
        dev-01:
          ansible_host: 10.0.1.10
    test:
      hosts:
        test-01:
          ansible_host: 10.0.1.20
    prod:
      hosts:
        prod-01:
          ansible_host: 10.0.1.30
    tools:
      hosts:
        tools-01:
          ansible_host: 10.0.1.40
```
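The roles later reference variables such as `{{ deploy_user }}`; inventory/group_vars/all.yml is the natural place to define anything shared across all four VMs. A sketch with illustrative values — the variable names beyond deploy_user are assumptions:

```yaml
# inventory/group_vars/all.yml -- illustrative; values are assumptions
deploy_user: deploy
app_dir: /opt/myapp
ansible_user: "{{ deploy_user }}"
ansible_python_interpreter: /usr/bin/python3
```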
Base Role: Making Every Server Identical
The base role handles everything that should be the same on every VM:
```yaml
# roles/base/tasks/main.yml
- name: Set timezone
  community.general.timezone:
    name: Europe/Copenhagen

- name: Enable EPEL (fail2ban lives there on Rocky)
  ansible.builtin.dnf:
    name: epel-release
    state: present

- name: Install base packages
  ansible.builtin.dnf:
    name:
      - vim
      - curl
      - wget
      - htop
      - git
      - firewalld
      - fail2ban
    state: present

- name: Harden SSH - disable password auth
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^#?PasswordAuthentication"
    line: "PasswordAuthentication no"
  notify: restart sshd

- name: Harden SSH - disable root login
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^#?PermitRootLogin"
    line: "PermitRootLogin no"
  notify: restart sshd

- name: Enable and start firewalld
  ansible.builtin.systemd:
    name: firewalld
    enabled: true
    state: started
```
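The `notify: restart sshd` lines only do something if the role also defines a matching handler. A minimal handlers file:

```yaml
# roles/base/handlers/main.yml
- name: restart sshd
  ansible.builtin.systemd:
    name: sshd
    state: restarted
```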
Nothing clever here. That is the point — it should be boring and obvious.
Docker Role: Install and Configure
```yaml
# roles/docker/tasks/main.yml
- name: Add Docker CE repository
  ansible.builtin.command:
    cmd: dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    creates: /etc/yum.repos.d/docker-ce.repo

- name: Install Docker CE and Compose plugin
  ansible.builtin.dnf:
    name:
      - docker-ce
      - docker-ce-cli
      - containerd.io
      - docker-compose-plugin
    state: present

- name: Add deploy user to docker group
  ansible.builtin.user:
    name: "{{ deploy_user }}"
    groups: docker
    append: true

- name: Enable and start Docker
  ansible.builtin.systemd:
    name: docker
    enabled: true
    state: started
```
After running this across all four VMs, every server has Docker and the Compose plugin. No manual SSH-ing, no "I forgot to install Compose on the test server" at 11 PM.
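The site.yml from the directory tree can simply chain the individual playbooks. One possible composition — the playbook names come from the tree above, but the ordering and comments are my assumptions:

```yaml
# playbooks/site.yml -- one possible composition of the playbooks above
- import_playbook: common.yml      # base role on every host
- import_playbook: docker.yml      # Docker CE + Compose plugin everywhere
- import_playbook: registry.yml    # Harbor, targeted at the tools group
- import_playbook: monitoring.yml  # Grafana/Loki/Promtail on tools
```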
Deploying with Docker Compose
Each environment gets its own docker-compose.yml, but they share a common structure. Here is a simplified PROD example:
```yaml
# docker-compose.prod.yml
services:
  api:
    image: harbor.internal/myapp/api:${APP_VERSION}
    restart: unless-stopped
    environment:
      - ASPNETCORE_ENVIRONMENT=Production
      - ConnectionStrings__Default=${DB_CONNECTION_STRING}
    networks:
      - app-network
    deploy:
      resources:
        limits:
          memory: 2G

  frontend:
    image: harbor.internal/myapp/frontend:${APP_VERSION}
    restart: unless-stopped
    networks:
      - app-network

  db:
    image: postgres:16
    restart: unless-stopped
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=${DB_NAME}
      - POSTGRES_USER=${DB_USER}
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    networks:
      - app-network

  nginx:
    image: harbor.internal/myapp/nginx:${APP_VERSION}
    restart: unless-stopped
    ports:
      - "443:443"
      - "80:80"
    volumes:
      - ./nginx/conf.d:/etc/nginx/conf.d:ro
      - /etc/letsencrypt:/etc/letsencrypt:ro
    networks:
      - app-network

volumes:
  pgdata:

networks:
  app-network:
    driver: bridge
```
The DEV and TEST variants differ only in environment variables and resource limits. The Cloud Exit Starter Kit includes production-ready versions of these Compose files with health checks, logging drivers configured for the Loki stack, and proper volume backup hooks — the kind of details you only remember after losing data once.
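One such detail is the logging driver. Assuming the grafana/loki-docker-driver plugin is installed on the host (it is not part of stock Docker — `docker plugin install` adds it), a service can ship its logs straight to Loki. The Loki URL below is a placeholder:

```yaml
# Per-service logging override -- assumes the grafana/loki-docker-driver
# plugin is installed; the Loki URL is a placeholder for the TOOLS VM
  api:
    logging:
      driver: loki
      options:
        loki-url: "http://10.0.1.40:3100/loki/api/v1/push"
        max-size: "10m"   # also cap the local log copy
        max-file: "3"
```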
Ansible deploys the Compose stack with a straightforward playbook:
```yaml
# playbooks/app-deploy.yml
- name: Deploy application
  hosts: "{{ target_env }}"
  vars_files:
    - "../inventory/group_vars/{{ target_env }}.yml"
  tasks:
    - name: Copy docker-compose file
      ansible.builtin.template:
        src: "docker-compose.{{ target_env }}.yml.j2"
        dest: "/opt/myapp/docker-compose.yml"

    - name: Pull latest images
      ansible.builtin.command:
        cmd: docker compose pull
        chdir: /opt/myapp

    - name: Bring up the stack (recreates only changed containers)
      ansible.builtin.command:
        cmd: docker compose up -d --remove-orphans
        chdir: /opt/myapp
```
Deploy to any environment with:
```bash
ansible-playbook playbooks/app-deploy.yml -e target_env=prod
```
Secrets Management Without Cloud KMS
This is where people overcomplicate things. You do not need HashiCorp Vault for a 4-VM setup. Here is what works:
Ansible Vault for infrastructure secrets. Encrypt your group_vars files that contain passwords and API keys:
```bash
ansible-vault encrypt inventory/group_vars/prod.yml
```
Now your database passwords, registry credentials, and API keys are encrypted at rest. Decrypt at deploy time with a password file or prompt.
.env files on each host, deployed by Ansible and readable only by the deploy user:
```yaml
- name: Deploy environment file
  ansible.builtin.template:
    src: env.j2
    dest: /opt/myapp/.env
    owner: "{{ deploy_user }}"
    mode: "0600"
```
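The env.j2 behind that task is plain variable interpolation. A sketch — the variable names are illustrative, and their values would live in the vault-encrypted group_vars:

```jinja
# templates/env.j2 -- illustrative; variable names are assumptions
APP_VERSION={{ app_version }}
DB_NAME={{ db_name }}
DB_USER={{ db_user }}
DB_PASSWORD={{ db_password }}
DB_CONNECTION_STRING=Host=db;Database={{ db_name }};Username={{ db_user }};Password={{ db_password }}
```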
Vaultwarden on the TOOLS VM for shared team credentials (not application secrets). This gives your team a self-hosted Bitwarden-compatible password manager.
When NOT to do this: if you have compliance requirements for secret rotation and audit trails, invest in HashiCorp Vault. For most Nordic SMBs running internal applications, Ansible Vault plus locked-down .env files is sufficient and dramatically simpler.
CI/CD Pipeline Changes When Your Infra Is Local
Your CI/CD pipeline needs three changes when moving from cloud to on-prem:
1. Self-Hosted Build Agents
Cloud CI/CD (Azure DevOps hosted agents, GitHub Actions runners) cannot reach your internal servers. Install self-hosted agents on the TOOLS VM:
```yaml
# roles/ci-agents/tasks/main.yml
- name: Create agent directory
  ansible.builtin.file:
    path: /opt/azdevops-agent
    state: directory
    owner: "{{ deploy_user }}"

- name: Download and configure Azure DevOps agent
  ansible.builtin.shell: |
    curl -fsSL https://vstsagentpackage.azureedge.net/agent/{{ agent_version }}/vsts-agent-linux-x64-{{ agent_version }}.tar.gz | tar xz
    ./config.sh --unattended \
      --url https://dev.azure.com/{{ azdo_org }} \
      --auth pat --token {{ azdo_pat }} \
      --pool {{ agent_pool }} \
      --agent tools-01
  args:
    chdir: /opt/azdevops-agent
    creates: /opt/azdevops-agent/.agent
  no_log: true  # keep the PAT out of Ansible's output
```
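config.sh registers the agent but does not keep it running across reboots; the agent tarball ships a svc.sh helper that installs it as a systemd service. A follow-up task might look like this — there is no reliable `creates:` guard because the generated unit name depends on the org, pool, and agent name:

```yaml
# Illustrative follow-up task; svc.sh ships with the agent tarball
- name: Install and start the agent as a systemd service
  ansible.builtin.shell: |
    ./svc.sh install {{ deploy_user }}
    ./svc.sh start
  args:
    chdir: /opt/azdevops-agent
  become: true
```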
2. Push to Your Private Registry
Replace docker push myregistry.azurecr.io/... with:
```bash
docker tag myapp/api:latest harbor.internal/myapp/api:${BUILD_NUMBER}
docker push harbor.internal/myapp/api:${BUILD_NUMBER}
```
Harbor on the TOOLS VM gives you vulnerability scanning, access control, and image replication — features that Azure Container Registry charges extra for.
3. Deploy via Ansible from the Pipeline
Your pipeline's deploy step becomes an Ansible call instead of az webapp deploy:
```yaml
# azure-pipelines.yml (deploy stage)
- stage: Deploy
  jobs:
    - job: DeployProd
      pool: 'self-hosted-pool'
      steps:
        - script: |
            ansible-playbook playbooks/app-deploy.yml \
              -e target_env=prod \
              -e app_version=$(Build.BuildNumber) \
              --vault-password-file /opt/secrets/vault-pass
          displayName: 'Deploy to production'
```
The pipeline still runs in Azure DevOps — only the agents and targets are local. You keep the familiar interface, PR triggers, and approval gates while deploying to your own hardware.
The TOOLS VM: Your On-Prem Control Plane
The TOOLS VM deserves special attention because it runs everything that supports your development workflow:
- Harbor — private container registry with vulnerability scanning
- SonarQube — code quality and security analysis
- Grafana + Loki + Promtail — monitoring, log aggregation, and dashboards
- Vaultwarden — team password management
- Self-hosted CI/CD agents — Azure DevOps or Gitea runners
All of these run as Docker Compose services on a single VM with 32 GB RAM and 8 cores. The Ansible playbook for TOOLS is the most complex one in the stack, but it is also the one you run least often — set it up once and it hums along.
The monitoring stack in particular is worth getting right on day one. Grafana dashboards showing container health, Loki ingesting logs from every service via Promtail — this is your replacement for Azure Application Insights, and it costs exactly nothing beyond the hardware it runs on.
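A minimal Promtail config that tails the Docker JSON log files and pushes them to Loki might look like this — the Loki address, paths, and labels are placeholders, not values from the real deployment:

```yaml
# promtail-config.yml -- minimal sketch; address and labels are assumptions
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml   # remembers how far each file was read

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: docker
    static_configs:
      - targets: [localhost]
        labels:
          job: docker
          __path__: /var/lib/docker/containers/*/*-json.log
```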
When This Approach Falls Short
Docker Compose and Ansible are not the answer to everything.
If you need auto-scaling, this setup does not scale horizontally. You would need Kubernetes or Docker Swarm — and at that point, you are adding significant operational complexity. For most Nordic SMBs with predictable load, fixed capacity with headroom is simpler and cheaper than auto-scaling.
If you have 50+ microservices, managing individual Compose files and Ansible playbooks for each one becomes painful. At that scale, Kubernetes earns its complexity tax. But if you are reading this article, you probably have 5–15 services, and Compose handles that without breaking a sweat.
If your team has zero Linux experience, the learning curve for Ansible, Docker, SSH key management, and firewall configuration is real. Budget 2–4 weeks for the team to get comfortable, or bring in someone who has done it before.
From Zero to Running in a Weekend
Here is the realistic timeline — assuming you have the hardware racked and Rocky Linux installed on all four VMs:
Saturday morning: Run the base Ansible playbook across all VMs. SSH hardening, packages, firewall rules, Docker installation. Two hours if your playbooks are solid. The Ansible playbooks in the Cloud Exit Starter Kit are tested against Rocky Linux 9 and cover this entire base layer, so you are not starting from a blank file.
Saturday afternoon: Stand up the TOOLS VM — Harbor, Grafana stack, CI/CD agents. Push your first container image to Harbor. Three hours.
Sunday morning: Deploy the application stack to DEV and TEST. Verify everything works, fix the inevitable environment variable typo. Two hours.
Sunday afternoon: Deploy to PROD. Configure Nginx with SSL. Point DNS. Run your smoke tests. Two hours, plus whatever time you spend staring at Grafana dashboards making sure the metrics look right.
Is it actually a weekend? For someone who has done it before, yes. For a first-timer with well-structured playbooks, add another day or two for troubleshooting and learning. The point is that this is not a multi-month infrastructure project — it is a focused build with clear milestones.
Ready to migrate off the cloud?
I put together a Cloud Exit Starter Kit ($49) — Ansible playbooks, Docker Compose production templates, and the migration checklist I use on real projects. Everything you need to go from Azure/AWS to your own hardware.
Or if you just want to talk it through: book a free 30-minute cloud exit assessment. No sales pitch — just an honest look at whether on-prem makes sense for your situation.