DEV Community

Kubilay İşen
Building a High-Availability Vault Cluster with Docker and Raft Storage

Vault Cluster Architecture

Introduction

HashiCorp Vault is one of the most powerful secrets management solutions in the industry. However, setting up a production-grade, highly available Vault cluster can be intimidating. In this article, I'll walk you through building a 3-node Vault cluster using Docker with scripted unsealing, Raft-based integrated storage, and Taskfile automation.

By the end of this guide, you'll have a secrets management cluster that survives the loss of one node and can be extended with additional nodes.


Why Vault? Why High-Availability?

The Problem

In modern infrastructure:

  • Secrets are scattered across multiple systems (databases, APIs, certificates)
  • No single source of truth for credential rotation
  • Compliance requirements demand audit trails
  • Manual secret management is error-prone

The Solution

Vault provides:

  • Centralized secret management - Single source of truth
  • Encryption as a service - Encrypt/decrypt without exposing keys
  • Dynamic credentials - Automatically generate short-lived credentials
  • Audit logging - Complete trail of who accessed what and when
  • High availability - Keep serving secrets when a node fails

Architecture Overview

┌─────────────────────────────────────────────────────────┐
│                 Docker Compose Network                  │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐   │
│  │  Vault Node  │  │  Vault Node  │  │  Vault Node  │   │
│  │     (1)      │  │     (2)      │  │     (3)      │   │
│  └──────────────┘  └──────────────┘  └──────────────┘   │
│         │                 │                  │          │
│  ┌──────┴─────────────────┼──────────────────┴────┐     │
│  │                  Raft Consensus                │     │
│  └────────────────────────────────────────────────┘     │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │   Nginx Load Balancer (Optional)                 │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
└─────────────────────────────────────────────────────────┘

Our setup features:

  • 3 Vault nodes - The cluster keeps quorum (2 of 3) if one node fails
  • Raft storage backend - Integrated storage with no external dependencies (unlike Consul)
  • Auto-unsealing - A helper script unseals restarted nodes without manual intervention
  • Docker Compose orchestration - Easy to deploy and manage
  • Health checks - Automatic failure detection
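Why three nodes? Raft needs a majority of voting nodes (a quorum) to elect a leader and commit writes. Here is a small sketch of that arithmetic; the function names are my own and not part of Vault:

```shell
#!/bin/sh
# Quorum and failure tolerance for a Raft cluster of N voting nodes.
# quorum: smallest majority of N nodes.
quorum() { echo $(( $1 / 2 + 1 )); }

# fault_tolerance: how many nodes can fail while a quorum remains.
fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

for n in 1 3 4 5; do
  echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Note that four nodes tolerate no more failures than three, which is why odd cluster sizes are the usual choice.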

Prerequisites

Before we begin, make sure you have:

- Docker and Docker Compose (with support for Compose file format 3.8)
- [Task](https://taskfile.dev/#/installation) (go-task), the CLI that runs Taskfile.yml
- jq (used by the initialization script to parse JSON)
- curl (for API testing)

Install the requirements:

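# macOS (the Docker runtime itself, e.g. Docker Desktop, is installed separately)
brew install docker docker-compose go-task jq curl

# Linux (Debian/Ubuntu); Task is not in the default apt repositories
sudo apt-get install docker.io docker-compose jq curl
sudo snap install task --classic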

Project Structure

vault-docker-cluster/
├── docker-compose.yaml           # Services definition
├── Dockerfile.vault              # Custom Vault image
├── Taskfile.yml                  # Task automation
├── init-and-generate-unseal.sh   # Cluster initialization
├── auto-unseal-monitor.sh        # Monitoring script
├── vault-1/
│   ├── config/
│   │   ├── vault.hcl            # Vault configuration
│   │   └── unseal.sh            # Auto-generated unseal script
│   └── data/                     # Raft data storage
├── vault-2/                      # (same structure)
└── vault-3/                      # (same structure)

Step 1: Docker Compose Configuration

Let's start with the docker-compose.yaml. This file orchestrates three Vault nodes:

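version: '3.8'

networks:
  vault_net:
    driver: bridge

services:
  vault-1:
    image: vaultdockercluster:1.20  # Built from Dockerfile.vault (see Step 5)
    restart: unless-stopped
    ports:
      - "8200:8200"                 # Publish the API and UI on the host
    environment:
      # The CLI defaults to https://127.0.0.1:8200, but TLS is disabled here
      VAULT_ADDR: http://127.0.0.1:8200
    volumes:
      - ./vault-1/config:/vault/config
      - ./vault-1/data:/vault/data
    cap_add:
      - IPC_LOCK                    # Only needed for mlock (vault.hcl disables it)
    entrypoint:
      - vault
      - server
      - -config=/vault/config/vault.hcl
    ulimits:
      memlock: -1                   # Unlimited memory lock
      nofile:
        soft: 65535
        hard: 65535
    networks:
      - vault_net
    healthcheck:
      # vault status exits 0 when unsealed, 2 when sealed, and 1 on error,
      # so a sealed node is reported as unhealthy
      test: ["CMD", "vault", "status"]
      interval: 10s
      timeout: 5s
      retries: 5
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1G

  # vault-2 and vault-3 use the same configuration with their own
  # directories (and a different host port, if you publish one)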

Key Configuration Points:

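| Setting | Purpose |
|---|---|
| IPC_LOCK | Allows Vault to lock memory pages (prevents swapping secrets to disk); only takes effect when mlock is enabled |
| memlock: -1 | Removes the locked-memory limit for processes in the container |
| healthcheck | Detects unreachable or sealed nodes automatically |
| ulimits (nofile) | Raises the open file limit so Vault can handle many concurrent connections |
| networks | Isolated network for inter-node communication |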

Step 2: Vault Configuration (HCL)

Each node has its own vault.hcl configuration:

vault-1:

storage "raft" {
  path    = "/vault/data"
  node_id = "vault-1"
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_disable   = true
}

api_addr     = "http://vault-1:8200"
cluster_addr = "http://vault-1:8201"

ui = true
disable_mlock = true

vault-2:

storage "raft" {
  path    = "/vault/data"
  node_id = "vault-2"

  retry_join {
    leader_api_addr = "http://vault-1:8200"
  }
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_disable   = true
}

api_addr     = "http://vault-2:8200"
cluster_addr = "http://vault-2:8201"

ui = true
disable_mlock = true

vault-3:

storage "raft" {
  path    = "/vault/data"
  node_id = "vault-3"

  retry_join {
    leader_api_addr = "http://vault-1:8200"
  }
}

listener "tcp" {
  address       = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_disable   = true
}

api_addr     = "http://vault-3:8200"
cluster_addr = "http://vault-3:8201"

ui = true
disable_mlock = true
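The three files differ only in the node name and in whether a retry_join block is present. If you'd rather generate them than maintain three copies, here is a minimal sketch. The function render_vault_hcl is a hypothetical helper of mine, not part of the repository, and it assumes the same ports and paths as the configs above:

```shell
#!/bin/sh
# render_vault_hcl NODE_ID [LEADER_ID]
# Prints a vault.hcl for NODE_ID. When LEADER_ID is given and differs from
# NODE_ID, a retry_join block pointing at the leader is included.
render_vault_hcl() {
  node_id=$1
  leader_id=${2:-}

  printf 'storage "raft" {\n'
  printf '  path    = "/vault/data"\n'
  printf '  node_id = "%s"\n' "$node_id"
  if [ -n "$leader_id" ] && [ "$leader_id" != "$node_id" ]; then
    printf '\n  retry_join {\n'
    printf '    leader_api_addr = "http://%s:8200"\n' "$leader_id"
    printf '  }\n'
  fi
  printf '}\n\n'

  cat <<EOF
listener "tcp" {
  address         = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_disable     = true
}

api_addr     = "http://$node_id:8200"
cluster_addr = "http://$node_id:8201"

ui = true
disable_mlock = true
EOF
}

# Example: print the configuration for vault-2, joining via vault-1
render_vault_hcl vault-2 vault-1
```

To write a file instead of printing, redirect the output, for example to vault-2/config/vault.hcl.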

Raft vs Other Backends:

| Aspect | Raft | Consul | S3 |
|---|---|---|---|
| Setup Complexity | Simple | Complex | Simple |
| Dependencies | None | Consul cluster needed | AWS account |
| Cost | Free | Free (self-hosted) | $ per API call |
| Performance | Excellent | Good | Slower |
| Best For | Self-hosted HA | Enterprise | AWS-native |

Step 3: Automated Setup with Taskfile

The Taskfile.yml automates common operations:

version: '3'

vars:
  VAULT_ADDR: 'http://127.0.0.1:8200'
  KEY_SHARES: 5
  KEY_THRESHOLD: 3

tasks:
  bootstrap:
    desc: Full bootstrap - start cluster, initialize, and setup
    cmds:
      - task: up
      - echo "⏳ Waiting for containers to be ready (15 seconds)..."
      - sleep 15
      - task: init
      - echo ""
      - echo "📋 Credentials saved! Review them above."
      - echo "Press Enter to continue with cluster setup..."
      - read _
      - task: setup-cluster
      - echo ""
      - echo "✅ Bootstrap complete!"

  up:
    desc: Start the Vault cluster
    cmds:
      - docker-compose up -d

  init:
    desc: Initialize vault-1 and auto-generate unseal.sh scripts
    cmds:
      - ./init-and-generate-unseal.sh
    preconditions:
      - sh: docker-compose ps vault-1 | grep -q "Up"
        msg: "vault-1 is not running. Run 'task up' first."
      - sh: command -v jq >/dev/null 2>&1
        msg: "jq is required but not installed. Install with: brew install jq"

  setup-cluster:
    desc: Complete cluster setup - unseal vault-1, join and unseal vault-2 and vault-3
    cmds:
      - task: unseal-vault-1
      - task: join-vault-2
      - task: unseal-vault-2
      - task: join-vault-3
      - task: unseal-vault-3

Usage is straightforward:

# Start everything
task bootstrap

# Or step by step
task up
task init
task setup-cluster
# Check status
docker-compose ps
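The bootstrap task waits a fixed 15 seconds before initializing. On a slow machine that may be too short, and on a fast one it's wasted time. One alternative is to poll until a command succeeds. The helper below is a generic sketch; wait_for is my own name and is not defined in the Taskfile:

```shell
#!/bin/sh
# wait_for ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between attempts. Returns 0 on success, 1 if every attempt failed.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Skip the sleep after the final failed attempt
    [ "$i" -lt "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example (assumes the vault-1 service from docker-compose.yaml):
#   wait_for 30 2 sh -c 'docker-compose ps vault-1 | grep -q "Up"'
```

The commented example reuses the same check that the init task already has as a precondition.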

Step 4: Initialization and Unsealing

The initialization script (init-and-generate-unseal.sh) handles the critical setup:

  • Calls vault operator init against vault-1.
  • Prints the unseal keys + root token (and writes them to vault-credentials-<timestamp>.md and vault-init-keys.json).
  • Generates vault-*/config/unseal.sh helper scripts.

#!/bin/bash
set -e

echo "🔐 Initializing Vault Cluster..."
echo ""

# Wait for vault-1 to be ready (responding, but not yet initialized)
echo "⏳ Waiting for vault-1 to be ready..."
max_attempts=30
attempt=0

while [ $attempt -lt $max_attempts ]; do
    # Check if we can connect to Vault and get a status response
    # vault status exits non-zero when Vault is sealed or uninitialized, so ignore the exit code.
    # stderr is discarded so that warnings cannot corrupt the JSON on stdout.
    status=$(docker-compose exec -T vault-1 sh -c "export VAULT_ADDR='http://127.0.0.1:8200' && vault status -format=json 2>/dev/null" || true)

    # Check if we got valid JSON output (meaning Vault is responding)
    if echo "$status" | jq -e . >/dev/null 2>&1; then
        initialized=$(echo "$status" | jq -r '.initialized // false')

        if [ "$initialized" = "false" ]; then
            echo "✅ Vault is ready for initialization"
            break
        elif [ "$initialized" = "true" ]; then
            echo "❌ Error: Vault is already initialized!"
            echo ""
            echo "If you want to re-initialize:"
            echo "  1. Run: task reset"
            echo "  2. Run: task bootstrap"
            exit 1
        fi
    else
        # Vault is not responding yet, keep waiting
        echo "⏳ Waiting for Vault to start (attempt $((attempt + 1))/$max_attempts)..."
    fi

    attempt=$((attempt + 1))
    sleep 2
done

if [ $attempt -eq $max_attempts ]; then
    echo "❌ Timeout waiting for vault-1 to be ready"
    echo ""
    echo "Check logs with: docker-compose logs vault-1"
    exit 1
fi

echo ""

# Initialize vault-1 and capture output
echo "🔑 Initializing Vault..."
INIT_OUTPUT=$(docker-compose exec -T vault-1 sh -c "export VAULT_ADDR='http://127.0.0.1:8200' && vault operator init -key-shares=5 -key-threshold=3 -format=json")

# Parse the JSON output
UNSEAL_KEY_1=$(echo "$INIT_OUTPUT" | jq -r '.unseal_keys_b64[0]')
UNSEAL_KEY_2=$(echo "$INIT_OUTPUT" | jq -r '.unseal_keys_b64[1]')
UNSEAL_KEY_3=$(echo "$INIT_OUTPUT" | jq -r '.unseal_keys_b64[2]')
UNSEAL_KEY_4=$(echo "$INIT_OUTPUT" | jq -r '.unseal_keys_b64[3]')
UNSEAL_KEY_5=$(echo "$INIT_OUTPUT" | jq -r '.unseal_keys_b64[4]')
ROOT_TOKEN=$(echo "$INIT_OUTPUT" | jq -r '.root_token')

echo "════════════════════════════════════════════════════════════════"
echo "⚠️  SAVE THESE CREDENTIALS SECURELY - THEY CANNOT BE RECOVERED!"
echo "════════════════════════════════════════════════════════════════"
echo ""
echo "Unseal Key 1: $UNSEAL_KEY_1"
echo "Unseal Key 2: $UNSEAL_KEY_2"
echo "Unseal Key 3: $UNSEAL_KEY_3"
echo "Unseal Key 4: $UNSEAL_KEY_4"
echo "Unseal Key 5: $UNSEAL_KEY_5"
echo ""
echo "Root Token: $ROOT_TOKEN"
echo ""
echo "════════════════════════════════════════════════════════════════"
echo ""

# Save to a backup file
BACKUP_FILE="vault-credentials-$(date +%Y%m%d-%H%M%S).md"
cat > "$BACKUP_FILE" << EOF
Vault Cluster Initialization - $(date)
════════════════════════════════════════════════════════════════

Unseal Key 1: $UNSEAL_KEY_1
Unseal Key 2: $UNSEAL_KEY_2
Unseal Key 3: $UNSEAL_KEY_3
Unseal Key 4: $UNSEAL_KEY_4
Unseal Key 5: $UNSEAL_KEY_5

Root Token: $ROOT_TOKEN

════════════════════════════════════════════════════════════════
⚠️  Store this file securely and delete it from this location!
EOF

chmod 600 "$BACKUP_FILE"   # Readable by the owner only
echo "✅ Credentials saved to: $BACKUP_FILE"
echo ""

# Generate unseal.sh for vault-1
echo "📝 Generating unseal.sh scripts..."

cat > vault-1/config/unseal.sh << EOF
#!/bin/sh
set -e
export VAULT_ADDR='http://127.0.0.1:8200'
vault operator unseal $UNSEAL_KEY_1
vault operator unseal $UNSEAL_KEY_2
vault operator unseal $UNSEAL_KEY_3
echo "✅ Vault unsealed successfully"
EOF

chmod +x vault-1/config/unseal.sh

# Copy to vault-2
cp vault-1/config/unseal.sh vault-2/config/unseal.sh

# Copy to vault-3
cp vault-1/config/unseal.sh vault-3/config/unseal.sh

echo "✅ Created unseal.sh in vault-1/config/"
echo "✅ Created unseal.sh in vault-2/config/"
echo "✅ Created unseal.sh in vault-3/config/"
echo ""

# Also save JSON format for automation (readable by the owner only)
echo "$INIT_OUTPUT" | jq '.' > vault-init-keys.json
chmod 600 vault-init-keys.json
echo "✅ Saved JSON format to: vault-init-keys.json"
echo ""

echo "════════════════════════════════════════════════════════════════"
echo "🎉 Initialization Complete!"
echo "════════════════════════════════════════════════════════════════"
echo ""
echo "Next steps:"
echo "  1. Secure the credentials file: $BACKUP_FILE"
echo "  2. Run: task setup-cluster"
echo "  3. Or manually:"
echo "     - task unseal-vault-1"
echo "     - task join-vault-2 && task unseal-vault-2"
echo "     - task join-vault-3 && task unseal-vault-3"
echo ""

Important Security Notes:

⚠️ Critical: Store the root token and unseal keys safely:

  • Save them in a password manager
  • Never commit them to version control
  • For production, consider Vault's auto-unseal with a cloud KMS (AWS, GCP, Azure) instead of keeping unseal keys on disk
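To make the "never commit" rule harder to break by accident, the generated files can be added to .gitignore. This is a small sketch; ensure_ignored is a hypothetical helper, and the patterns match the file names the initialization script writes:

```shell
#!/bin/sh
# ensure_ignored FILE PATTERN...
# Appends each PATTERN to FILE unless an identical line is already present.
ensure_ignored() {
  file=$1
  shift
  touch "$file"
  for pattern in "$@"; do
    grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
  done
}

# Example, run from the project root:
#   ensure_ignored .gitignore \
#     'vault-credentials-*.md' \
#     'vault-init-keys.json' \
#     'vault-*/config/unseal.sh'
```

Running it twice leaves the file unchanged, so it is safe to call from the init task.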

Step 5: Building the Docker Image

A minimal Dockerfile.vault:

FROM hashicorp/vault:1.20
ENV TZ=Europe/Istanbul

Build it before running task up, because docker-compose.yaml references this image tag:

docker build -f Dockerfile.vault -t vaultdockercluster:1.20 .

Step 6: Raft Cluster Formation

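After initializing and unsealing vault-1, join the other nodes. Because vault-2 and vault-3 list vault-1 under retry_join, they also try to join on their own; the explicit commands make the step visible. The Vault CLI defaults to https://127.0.0.1:8200, so VAULT_ADDR must point at the plain-HTTP listener. Each joined node stays sealed until it is unsealed with the same unseal keys.

# Join vault-2 to the cluster
docker-compose exec -e VAULT_ADDR=http://127.0.0.1:8200 vault-2 \
  vault operator raft join http://vault-1:8200

# Join vault-3 to the cluster
docker-compose exec -e VAULT_ADDR=http://127.0.0.1:8200 vault-3 \
  vault operator raft join http://vault-1:8200

# Verify cluster status (VAULT_TOKEN holds a valid token, e.g. the root token)
docker-compose exec -e VAULT_ADDR=http://127.0.0.1:8200 -e VAULT_TOKEN="$VAULT_TOKEN" \
  vault-1 vault operator raft list-peers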

Expected output:

Node ID    Address            State       Voter
------     -------            -----       -----
vault-1    vault-1:8201       leader      true
vault-2    vault-2:8201       follower    true
vault-3    vault-3:8201       follower    true
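If you want to check this output in a script rather than by eye, a short awk filter is enough. The sketch below assumes the table layout shown above (two header lines, then one row per peer); summarize_peers is my own helper name:

```shell
#!/bin/sh
# summarize_peers: reads the table printed by `vault operator raft list-peers`
# on stdin and prints the leader's node ID and the number of voters.
summarize_peers() {
  awk 'NR > 2 && NF >= 4 {
         if ($3 == "leader") leader = $1
         if ($4 == "true") voters++
       }
       END { printf "leader=%s voters=%d\n", leader, voters }'
}

# Example with the expected output from this article
summarize_peers <<'EOF'
Node ID    Address            State       Voter
------     -------            -----       -----
vault-1    vault-1:8201       leader      true
vault-2    vault-2:8201       follower    true
vault-3    vault-3:8201       follower    true
EOF
```

In practice you would pipe the real list-peers output into summarize_peers and compare the voter count against the number of nodes you expect.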

Complete Workflow: From Zero to Hero

Here's the complete startup sequence:

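# 1. Build the image (first run only)
docker build -f Dockerfile.vault -t vaultdockercluster:1.20 .

# 2. Start the containers
task up

# 3. Watch the logs (Ctrl+C stops following)
docker-compose logs -f vault-1

# 4. Initialize the cluster
task init

# Save the credentials somewhere safe!

# 5. Unseal vault-1, then join and unseal vault-2 and vault-3
#    (a node can only be unsealed after it has joined the initialized cluster)
task setup-cluster

# 6. Verify the cluster (VAULT_TOKEN holds the root token from step 4)
docker-compose exec -e VAULT_ADDR=http://127.0.0.1:8200 -e VAULT_TOKEN="$VAULT_TOKEN" \
  vault-1 vault operator raft list-peers

# 7. Access the UI (open is macOS-specific; on Linux use xdg-open or a browser)
open http://localhost:8200/ui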



Testing Your Cluster

Test 1: High Availability

Kill the leader node and watch the cluster recover:

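# Kill vault-1 (the leader)
docker-compose kill vault-1

# Check who's the new leader (give the remaining nodes a few seconds to elect one)
docker-compose exec -e VAULT_ADDR=http://127.0.0.1:8200 -e VAULT_TOKEN="$VAULT_TOKEN" \
  vault-2 vault operator raft list-peers

# Bring it back (it starts sealed and serves requests again once unsealed)
docker-compose restart vault-1

# Verify recovery
docker-compose ps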

Auto-Unseal Monitor

When a Vault container restarts, it comes back sealed. The auto-unseal monitor detects this on its next poll and unseals the node using the stored unseal keys. This is a scripted use of the Shamir unseal keys, not Vault's built-in auto-unseal (seal) feature.

The auto-unseal-monitor.sh script runs in its own container (vault-unsealer). It:

  • Polls each Vault node every 30 seconds via vault status.
  • If a node transitions to sealed, reads the unseal keys from the mounted unseal.sh file and runs the three vault operator unseal commands.
  • Retries up to three times per incident and logs timestamps and outcomes.

Because the helper executes the same unseal script stored on disk, protect the vault-*/config/unseal.sh files and remove them when no longer needed.
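The monitor script itself isn't listed in this article, so here is a minimal sketch of the loop described above, not the repository's actual code. Assumptions on my part: the unseal.sh files are mounted under /unseal/<node>/ (UNSEAL_DIR is a hypothetical path), nodes are reachable as http://<node>:8200, and vault status exits with code 2 when a node is sealed.

```shell
#!/bin/sh
# Sketch of an unseal monitor. Paths and helper names are assumptions.
NODES=${NODES:-"vault-1 vault-2 vault-3"}
POLL_INTERVAL=${POLL_INTERVAL:-30}
MAX_RETRIES=${MAX_RETRIES:-3}
UNSEAL_DIR=${UNSEAL_DIR:-/unseal}   # hypothetical mount point

log() { echo "$(date '+%Y-%m-%d %H:%M:%S') $*"; }

# is_sealed NODE: succeeds when vault status reports the node as sealed (exit 2)
is_sealed() {
  VAULT_ADDR="http://$1:8200" vault status >/dev/null 2>&1
  [ $? -eq 2 ]
}

# run_unseal NODE: reads the keys out of the node's unseal.sh and submits them
run_unseal() {
  keys=$(awk '$1 == "vault" && $3 == "unseal" { print $4 }' \
    "$UNSEAL_DIR/$1/unseal.sh") || return 1
  for key in $keys; do
    VAULT_ADDR="http://$1:8200" vault operator unseal "$key" >/dev/null || return 1
  done
}

# check_node NODE: unseals NODE if sealed, retrying up to MAX_RETRIES times
check_node() {
  node=$1
  is_sealed "$node" || return 0
  attempt=1
  while [ "$attempt" -le "$MAX_RETRIES" ]; do
    log "$node is sealed, unseal attempt $attempt/$MAX_RETRIES"
    if run_unseal "$node" && ! is_sealed "$node"; then
      log "$node unsealed"
      return 0
    fi
    attempt=$((attempt + 1))
  done
  log "$node is still sealed after $MAX_RETRIES attempts"
  return 1
}

# monitor_loop: polls every node forever
monitor_loop() {
  while true; do
    for n in $NODES; do check_node "$n"; done
    sleep "$POLL_INTERVAL"
  done
}

# Start with: monitor_loop
```

Keeping is_sealed and run_unseal as separate functions makes the retry logic easy to test with stubs before pointing it at a real cluster.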

Test 2: Secret Storage

Store and retrieve a secret:


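# Open a shell in the vault-1 container
docker-compose exec vault-1 sh

# Login
export VAULT_TOKEN="your-root-token"
export VAULT_ADDR="http://127.0.0.1:8200"

# Enable a KV v2 engine at secret/ (only dev-mode servers mount it by default)
vault secrets enable -path=secret kv-v2

# Create a secret
vault kv put secret/my-app/database \
  username=admin \
  password=super-secret-password

# Retrieve it
vault kv get secret/my-app/database

# Read it via API (KV v2 paths include data/; run this from the host
# if curl is not available inside the container)
curl -H "X-Vault-Token: $VAULT_TOKEN" \
  http://localhost:8200/v1/secret/data/my-app/database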

Test 3: Encryption as a Service

# Enable transit engine
vault secrets enable transit

# Create an encryption key
vault write -f transit/keys/my-key

# Encrypt data (transit requires base64-encoded plaintext)
vault write transit/encrypt/my-key plaintext="$(base64 < data.txt | tr -d '\n')"

# Decrypt it (the returned plaintext is base64-encoded, so decode it)
vault write -field=plaintext transit/decrypt/my-key ciphertext=vault:v1:... | base64 -d
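The transit engine only accepts base64-encoded plaintext and returns base64 on decrypt, which is easy to forget. The two helpers below show the round trip without needing a running Vault; their names are my own, and base64 -d assumes GNU coreutils or BusyBox (older macOS versions use -D):

```shell
#!/bin/sh
# to_transit_plaintext STRING: base64-encode STRING for the plaintext= parameter
to_transit_plaintext() { printf '%s' "$1" | base64 | tr -d '\n'; }

# from_transit_plaintext B64: decode the plaintext field returned by transit/decrypt
from_transit_plaintext() { printf '%s' "$1" | base64 -d; }

encoded=$(to_transit_plaintext "my secret data")
echo "encoded: $encoded"
echo "decoded: $(from_transit_plaintext "$encoded")"
```

With these in place, an encrypt call could pass plaintext="$(to_transit_plaintext "my secret data")".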

Monitoring and Maintenance

Check Cluster Health

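# Status of all nodes (exit code 2 means the node is sealed)
for node in vault-1 vault-2 vault-3; do
  docker-compose exec -T -e VAULT_ADDR=http://127.0.0.1:8200 "$node" vault status
done

# Raft peer status (requires a valid VAULT_TOKEN)
docker-compose exec -e VAULT_ADDR=http://127.0.0.1:8200 -e VAULT_TOKEN="$VAULT_TOKEN" \
  vault-1 vault operator raft list-peers

# Server error logs (audit logs are written separately by an audit device)
docker-compose logs vault-1 | grep -i error

# Container resource usage
docker stats --no-stream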

Common Issues and Solutions

| Issue | Cause | Solution |
|---|---|---|
| Node won't join | Already part of a cluster | Run task reset and task bootstrap |
| Connection refused | Node not running | Check with docker-compose ps |
| Memory locked | mlock issues | Check ulimits and IPC_LOCK capability |

Production Considerations

1. Enable TLS/HTTPS

listener "tcp" {
  address            = "0.0.0.0:8200"
  tls_cert_file      = "/vault/config/cert.pem"
  tls_key_file       = "/vault/config/key.pem"
}

2. Enable Audit Logging

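Audit devices are enabled at runtime through the CLI or API rather than in vault.hcl. The log directory must exist and be writable by the Vault process:

vault audit enable file file_path=/vault/logs/audit.log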

3. Configure Storage Snapshots

# Backup Raft data
vault operator raft snapshot save vault-backup.snap

# Restore from snapshot
vault operator raft snapshot restore -force vault-backup.snap
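A single vault-backup.snap gets overwritten on every run. For scheduled backups you'll probably want timestamped files and a retention limit. Here is one possible sketch; the helper names and the vault-backup-<timestamp>.snap naming scheme are my own assumptions:

```shell
#!/bin/sh
# snapshot_name: prints a timestamped snapshot file name
snapshot_name() { echo "vault-backup-$(date +%Y%m%d-%H%M%S).snap"; }

# prune_snapshots DIR KEEP: deletes all but the KEEP newest snapshots in DIR.
# Relies on the timestamp in the name, so lexical order equals age order.
prune_snapshots() {
  dir=$1
  keep=$2
  ls -1 "$dir"/vault-backup-*.snap 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r old; do
      rm -f -- "$old"
    done
}

# Example (needs VAULT_ADDR and VAULT_TOKEN in the environment):
#   mkdir -p backups
#   vault operator raft snapshot save "backups/$(snapshot_name)"
#   prune_snapshots backups 7
```

Snapshots contain the encrypted Vault data, so store them somewhere other than the Docker host.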

4. Set Resource Limits

deploy:
  resources:
    limits:
      cpus: '2'
      memory: 4G
    reservations:
      cpus: '1'
      memory: 2G

Scaling Considerations

Adding More Nodes

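# Raft tolerates (n-1)/2 failed nodes, so prefer odd cluster sizes (5 rather than 4)

# Create empty directories (never copy another node's data directory)
mkdir -p vault-4/config vault-4/data

# Start from vault-2's config, which already contains the retry_join block
sed 's/vault-2/vault-4/g' vault-2/config/vault.hcl > vault-4/config/vault.hcl
cp vault-2/config/unseal.sh vault-4/config/unseal.sh

# Add a vault-4 service to docker-compose.yaml and start it
docker-compose up -d vault-4

# Join and unseal the new node
docker-compose exec -e VAULT_ADDR=http://127.0.0.1:8200 vault-4 \
  vault operator raft join http://vault-1:8200
docker-compose exec vault-4 sh /vault/config/unseal.sh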

Load Balancing

upstream vault {
    server vault-1:8200;
    server vault-2:8200;
    server vault-3:8200;
}

server {
    listen 80;
    location / {
        proxy_pass http://vault;
    }
}
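The upstream above sends requests to every node regardless of its state. Vault exposes /v1/sys/health, whose HTTP status code tells you whether a node is active, standby, sealed, or uninitialized, so a load balancer or script can use it to decide where to route. A minimal sketch of the mapping, using Vault's default status codes (the function names are mine):

```shell
#!/bin/sh
# health_state CODE: maps the default /v1/sys/health status codes to a state name
health_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "dr-secondary" ;;
    473) echo "performance-standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

# probe BASE_URL: prints the HTTP status code of the node's health endpoint
probe() { curl -s -o /dev/null -w '%{http_code}' "$1/v1/sys/health"; }

# Example (assumes port 8200 is published on the host):
#   health_state "$(probe http://localhost:8200)"
```

Standby nodes forward requests to the active node, so routing to them works; sealed and uninitialized nodes are the ones to take out of rotation.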

Advanced: Disaster Recovery

Scenario: Complete Cluster Failure

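# 1. Reset everything
task reset

# 2. Bring up a fresh, initialized, and unsealed cluster
#    (snapshot restore needs a running Vault to talk to)
task bootstrap

# 3. Restore from backup; -force is needed because the snapshot
#    comes from a cluster with different unseal keys
vault operator raft snapshot restore -force vault-backup.snap

# 4. From here on, use the ORIGINAL cluster's unseal keys and root token,
#    since they belong to the restored data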

Scenario: Corrupted Raft State

# 1. Stop all nodes
task down

# 2. Clean data directories (vault.db and the raft/ directory)
rm -rf vault-*/data/*

# 3. Start nodes with empty storage
task up

# 4. Re-initialize the cluster, then restore a known-good snapshot
#    as in the previous scenario
task init && task setup-cluster
vault operator raft snapshot restore -force vault-backup.snap

Conclusion

You now have a highly available Vault cluster that, once the hardening steps from Production Considerations are applied, is a solid base for production:

  • Three-node Raft cluster for high availability
  • Automated deployment via Docker Compose
  • Task automation for common operations
  • Auto-unsealing capability
  • Health monitoring and failure detection
  • Scalable architecture ready for growth


About the Author

This article demonstrates a practical approach to secrets management using open-source tools. For production deployments, consider consulting with security specialists to ensure compliance with your organization's security requirements.

Have questions? Share them in the comments below!


GitHub Repository: vault-docker-cluster
