Building a Production-Grade Private EKS Cluster with OpenVPN, Prometheus & Grafana

Step-by-step guide to deploying a private Amazon EKS cluster with zero public API exposure, self-hosted OpenVPN access, kube-prometheus-stack monitoring, and Route 53 private DNS — all automated with Terraform.

The Problem with Public Kubernetes Clusters

Every time I see an EKS cluster with a public API endpoint, I cringe.

Sure, it's convenient. But it means your Kubernetes API server — the brain of your entire cluster — is reachable from anywhere on the internet. One misconfigured IAM policy, one leaked credential, and you have a very bad day.

In this guide, I'll walk you through building a fully private EKS cluster where:

  • The Kubernetes API has zero internet exposure
  • Access is gated through a self-hosted OpenVPN server
  • Prometheus + Grafana monitor everything, exposed via an internal load balancer
  • Route 53 private DNS gives us grafana.devops.private
  • Everything is Terraform — one terraform apply to rule them all

Let's build this.


Architecture

What's in the box:

| Component | Role |
| --- | --- |
| VPC (10.0.0.0/16) | Isolated network — 2 public subnets, 2 private subnets across 2 AZs |
| EKS (Private API) | Kubernetes control plane — API endpoint only accessible within VPC |
| OpenVPN EC2 | Self-hosted VPN in public subnet — the single entry point |
| NAT Gateway | Outbound internet for private subnets (pulling container images) |
| kube-prometheus-stack | Prometheus (metrics) + Grafana (dashboards) + Alertmanager |
| Internal Load Balancer | Exposes Grafana inside VPC only |
| Route 53 Private Zone | grafana.devops.private → Internal LB CNAME |

The traffic flow

Your Laptop
    │
    │ OpenVPN (UDP:1194)
    ▼
┌──────────────┐        ┌───────────────────────────────┐
│  OpenVPN EC2 │──────▶ │  EKS Private API (HTTPS:443)  │
│  Public      │  NAT   │  Worker Node 1                │
│  Subnet      │ Masq.  │  Worker Node 2                │
│              │        │  ┌─────────────────────────┐  │
│              │───────▶│  │ Grafana (Internal LB)   │  │
│              │        │  │ grafana.devops.private  │  │
│              │        │  └─────────────────────────┘  │
└──────────────┘        └───────────────────────────────┘
  Public Subnet               Private Subnets

The OpenVPN server uses iptables NAT masquerade to rewrite VPN client IPs (10.8.0.0/24) to its own VPC address. This means all VPC services see traffic from a legitimate VPC IP — not an unknown external range.
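To make that concrete, here is a throwaway bash helper (hypothetical, not part of the Terraform code) showing that a raw VPN client IP falls outside the VPC CIDR that security group rules reference, while the masqueraded source address falls inside it:

```shell
#!/usr/bin/env bash
# Hypothetical helper: test whether an IPv4 address falls inside a CIDR block
in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  ip_to_int() { local a b c d; IFS=. read -r a b c d <<<"$1"; echo $(( (a<<24)|(b<<16)|(c<<8)|d )); }
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# A raw VPN client source IP would not match VPC-scoped security group rules...
in_cidr 10.8.0.5 10.0.0.0/16 && echo inside || echo outside     # prints: outside
# ...but after MASQUERADE the source is the server's own VPC address:
in_cidr 10.0.101.10 10.0.0.0/16 && echo inside || echo outside  # prints: inside
```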


Project Structure

eks-private-vpn/
├── main.tf                 # VPC, EKS, OpenVPN EC2, Security Groups
├── monitoring.tf           # kube-prometheus-stack (Prometheus + Grafana)
├── dns.tf                  # Route 53 private zone + Grafana CNAME
├── providers.tf            # AWS, Helm, Kubernetes providers
├── variables.tf            # Input variables
├── outputs.tf              # Useful outputs
├── openvpn_userdata.sh     # OpenVPN server bootstrap script
└── terraform.tfvars        # Your configuration values

Step 1 — Provider Configuration

We need four providers. The Helm and Kubernetes providers use exec-based authentication to talk to the private EKS cluster:

# providers.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws        = { source = "hashicorp/aws",        version = "~> 5.0" }
    tls        = { source = "hashicorp/tls",        version = "~> 4.0" }
    helm       = { source = "hashicorp/helm",       version = "~> 2.0" }
    kubernetes = { source = "hashicorp/kubernetes",  version = "~> 2.0" }
  }
}

provider "aws" {
  region = var.aws_region
}

provider "helm" {
  kubernetes {
    host                   = module.eks.cluster_endpoint
    cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name",
                     module.eks.cluster_name, "--region", var.aws_region]
    }
  }
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name",
                   module.eks.cluster_name, "--region", var.aws_region]
  }
}

Why exec-based auth? The aws eks get-token command generates short-lived tokens via IAM. This is more secure than static kubeconfig tokens and works seamlessly with the private endpoint.


Step 2 — VPC & Networking

# main.tf
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.project_name}-vpc"
  cidr = var.vpc_cidr                          # 10.0.0.0/16

  azs             = var.availability_zones      # ["us-east-1a", "us-east-1b"]
  private_subnets = var.private_subnet_cidrs    # ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = var.public_subnet_cidrs     # ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway   = true
  single_nat_gateway   = true
  enable_dns_hostnames = true
  enable_dns_support   = true

  # Required tags for EKS load balancer discovery
  public_subnet_tags = {
    "kubernetes.io/role/elb" = "1"
  }

  private_subnet_tags = {
    "kubernetes.io/role/internal-elb" = "1"
  }
}

The kubernetes.io/role/internal-elb tag on private subnets is what tells the AWS Load Balancer Controller where to place internal load balancers — this is how our Grafana LB ends up in the right subnet.


Step 3 — Private EKS Cluster

This is where the magic happens. Two settings change everything:

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name    = "${var.project_name}-cluster"
  cluster_version = var.kubernetes_version          # "1.31"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  # THE TWO SETTINGS THAT MAKE THIS PRIVATE
  cluster_endpoint_public_access  = false
  cluster_endpoint_private_access = true
  # ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  # Required in EKS module v20+ — without this, kubectl fails
  enable_cluster_creator_admin_permissions = true

  eks_managed_node_groups = {
    default = {
      instance_types = var.node_instance_types      # ["t3.medium"]
      min_size       = var.node_min_size            # 1
      max_size       = var.node_max_size            # 3
      desired_size   = var.node_desired_size        # 2
    }
  }
}

We also need to let the VPN server talk to the EKS API on port 443:

resource "aws_security_group_rule" "vpn_to_eks_api" {
  description              = "Allow VPN server to access EKS API"
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = module.eks.cluster_security_group_id
  source_security_group_id = aws_security_group.vpn.id
}

EKS Module v20 Breaking Change: The enable_cluster_creator_admin_permissions flag is new in module v20. In older versions, the cluster creator automatically got admin access via aws-auth ConfigMap. In v20+, this was replaced with EKS Access Entries — a more secure, IAM-native RBAC mechanism. Without this flag, you'll get the cryptic error: "You must be logged in to the server".
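If teammates or a CI pipeline also need cluster access, the same module exposes an access_entries map. A sketch (the role ARN below is a placeholder, not from this project):

```hcl
# Sketch: grant a hypothetical CI role cluster-admin via EKS Access Entries.
# Goes inside the module "eks" block alongside the settings above.
access_entries = {
  ci_deployer = {
    principal_arn = "arn:aws:iam::123456789012:role/ci-deployer"  # placeholder ARN

    policy_associations = {
      admin = {
        policy_arn = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"
        access_scope = {
          type = "cluster"
        }
      }
    }
  }
}
```

Unlike aws-auth ConfigMap edits, these grants show up in CloudTrail and can be reviewed like any other IAM change.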


Step 4 — OpenVPN Server

The VPN server lives in the public subnet — it's the bridge between the internet and your private infrastructure:

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}

resource "aws_security_group" "vpn" {
  name_prefix = "${var.project_name}-vpn-"
  vpc_id      = module.vpc.vpc_id
  description = "OpenVPN server security group"

  # OpenVPN — open to the world (authentication via certificates)
  ingress {
    description = "OpenVPN"
    from_port   = 1194
    to_port     = 1194
    protocol    = "udp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # SSH — restricted to your IP only
  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.admin_ingress_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle { create_before_destroy = true }
}

resource "aws_instance" "vpn" {
  ami                         = data.aws_ami.ubuntu.id
  instance_type               = var.openvpn_instance_type   # t3.small
  key_name                    = var.ssh_key_name
  subnet_id                   = module.vpc.public_subnets[0]
  vpc_security_group_ids      = [aws_security_group.vpn.id]
  associate_public_ip_address = true
  source_dest_check           = false   # ← Critical for VPN routing!

  root_block_device {
    volume_size = 20
    volume_type = "gp3"
  }

  user_data = templatefile("${path.module}/openvpn_userdata.sh", {
    vpc_cidr        = var.vpc_cidr
    vpn_client_cidr = var.vpn_client_cidr
  })

  tags = { Name = "${var.project_name}-openvpn" }
}

resource "aws_eip" "vpn" {
  instance = aws_instance.vpn.id
  domain   = "vpc"
  tags     = { Name = "${var.project_name}-vpn-eip" }
}

Why source_dest_check = false? By default, AWS drops traffic where the EC2 instance isn't the source or destination. Since the VPN server forwards traffic between VPN clients (10.8.0.0/24) and VPC resources (10.0.0.0/16), we must disable this check. Without it, all forwarded packets get silently dropped — your VPN connects but nothing works.
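Since this is the only internet-facing instance, it is also worth locking down its metadata service. A possible addition (my suggestion, not in the original config; note that the bootstrap script's public-IP lookup must then send an IMDSv2 token header, or rely on its checkip.amazonaws.com fallback):

```hcl
# Optional hardening: require IMDSv2 session tokens on the VPN instance.
# Goes inside resource "aws_instance" "vpn".
metadata_options {
  http_endpoint               = "enabled"
  http_tokens                 = "required"  # reject legacy IMDSv1 requests
  http_put_response_hop_limit = 1           # tokens can't be relayed through the tunnel
}
```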


Step 5 — OpenVPN Bootstrap Script

This user data script does everything automatically on first boot: installs OpenVPN, generates a full PKI, configures the server, sets up NAT rules, and generates a ready-to-use .ovpn client profile:

#!/bin/bash
set -euo pipefail
exec > /var/log/openvpn-setup.log 2>&1

export DEBIAN_FRONTEND=noninteractive
apt-get update -y && apt-get upgrade -y

# Pre-seed iptables-persistent to avoid interactive prompts
echo iptables-persistent iptables-persistent/autosave_v4 boolean true | debconf-set-selections
echo iptables-persistent iptables-persistent/autosave_v6 boolean true | debconf-set-selections
apt-get install -y -o Dpkg::Options::='--force-confdef' \
  -o Dpkg::Options::='--force-confold' openvpn easy-rsa iptables-persistent

# Enable IP forwarding
echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
sysctl -p

# ── PKI setup ──────────────────────────────────────────────────────
EASY_RSA="/etc/openvpn/easy-rsa"
mkdir -p "$EASY_RSA"
cp -r /usr/share/easy-rsa/* "$EASY_RSA/"
cd "$EASY_RSA"

./easyrsa init-pki
EASYRSA_BATCH=1 ./easyrsa build-ca nopass
EASYRSA_BATCH=1 ./easyrsa build-server-full server nopass
EASYRSA_BATCH=1 ./easyrsa build-client-full client1 nopass
./easyrsa gen-dh
openvpn --genkey secret /etc/openvpn/ta.key

cp pki/ca.crt pki/issued/server.crt pki/private/server.key pki/dh.pem /etc/openvpn/

# ── Server config ─────────────────────────────────────────────────
# templatefile() injects the CIDRs with their prefix length (e.g. 10.8.0.0/24),
# but OpenVPN's "server" and "route" directives expect network + netmask,
# so strip the prefix first.
VPN_NET=$(echo "${vpn_client_cidr}" | cut -d/ -f1)   # e.g. 10.8.0.0
VPC_NET=$(echo "${vpc_cidr}" | cut -d/ -f1)          # e.g. 10.0.0.0

cat > /etc/openvpn/server.conf <<EOF
port 1194
proto udp
dev tun
ca   /etc/openvpn/ca.crt
cert /etc/openvpn/server.crt
key  /etc/openvpn/server.key
dh   /etc/openvpn/dh.pem
tls-auth /etc/openvpn/ta.key 0

server $VPN_NET 255.255.255.0
topology subnet

push "route $VPC_NET 255.255.0.0"
push "dhcp-option DNS 10.0.0.2"

keepalive 10 120
cipher AES-256-GCM
auth SHA256
user nobody
group nogroup
persist-key
persist-tun
status /var/log/openvpn-status.log
log-append /var/log/openvpn.log
verb 3
EOF

# ── NAT / forwarding rules ────────────────────────────────────────
PRIMARY_IF=$(ip route | grep default | awk '{print $5}')
iptables -t nat -A POSTROUTING -s ${vpn_client_cidr} \
  -o "$PRIMARY_IF" -j MASQUERADE   # var already includes the /24 prefix
iptables -A FORWARD -i tun0 -o "$PRIMARY_IF" -j ACCEPT
iptables -A FORWARD -i "$PRIMARY_IF" -o tun0 \
  -m state --state RELATED,ESTABLISHED -j ACCEPT
netfilter-persistent save

systemctl enable openvpn@server
systemctl start openvpn@server

# ── Generate client .ovpn profile ─────────────────────────────────
TOKEN=$(curl -s -X PUT http://169.254.169.254/latest/api/token \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
PUBLIC_IP=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/public-ipv4 \
  || curl -s https://checkip.amazonaws.com)
mkdir -p /home/ubuntu/client-configs

cat > /home/ubuntu/client-configs/client1.ovpn <<CLIENTCONF
client
dev tun
proto udp
remote $PUBLIC_IP 1194
resolv-retry infinite
nobind
persist-key
persist-tun
remote-cert-tls server
cipher AES-256-GCM
auth SHA256
key-direction 1
verb 3

<ca>
$(cat /etc/openvpn/ca.crt)
</ca>
<cert>
$(openssl x509 -in "$EASY_RSA/pki/issued/client1.crt")
</cert>
<key>
$(cat "$EASY_RSA/pki/private/client1.key")
</key>
<tls-auth>
$(cat /etc/openvpn/ta.key)
</tls-auth>
CLIENTCONF

chown -R ubuntu:ubuntu /home/ubuntu/client-configs
chmod 600 /home/ubuntu/client-configs/client1.ovpn

echo "=== OpenVPN setup complete ==="

Key details to highlight:

  • push "route ${vpc_cidr} 255.255.0.0" — tells VPN clients to route all VPC traffic (10.0.0.0/16) through the tunnel
  • push "dhcp-option DNS 10.0.0.2" — pushes the VPC DNS resolver to clients. 10.0.0.2 is the Amazon-provided DNS (always VPC CIDR base + 2). This is how grafana.devops.private resolves!
  • MASQUERADE — rewrites VPN client source IPs to the server's VPC IP, so EKS and internal services accept the traffic
  • DEBIAN_FRONTEND=noninteractive + debconf-set-selections — prevents iptables-persistent from hanging on interactive prompts in user data
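The "base + 2" rule is easy to compute for any VPC. A throwaway helper (hypothetical, purely illustrative, not part of the Terraform code):

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive the Amazon-provided DNS resolver (VPC base + 2)
vpc_dns() {
  local base=${1%/*} o1 o2 o3 o4      # strip the /prefix, split the octets
  IFS=. read -r o1 o2 o3 o4 <<<"$base"
  echo "$o1.$o2.$o3.$((o4 + 2))"
}

vpc_dns 10.0.0.0/16     # prints 10.0.0.2
vpc_dns 172.31.0.0/16   # prints 172.31.0.2
```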

Step 6 — Monitoring Stack (Prometheus + Grafana)

We deploy the kube-prometheus-stack Helm chart — the industry-standard monitoring bundle:

# monitoring.tf
resource "kubernetes_namespace" "monitoring" {
  metadata {
    name = "monitoring"
  }
  depends_on = [module.eks]
}

resource "helm_release" "kube_prometheus_stack" {
  name       = "kube-prometheus-stack"
  namespace  = kubernetes_namespace.monitoring.metadata[0].name
  repository = "https://prometheus-community.github.io/helm-charts"
  chart      = "kube-prometheus-stack"
  version    = "65.1.0"

  # ── Grafana ─────────────────────────────────────────────────────
  set {
    name  = "grafana.enabled"
    value = "true"
  }

  set {
    name  = "grafana.adminPassword"
    value = var.grafana_admin_password
  }

  set {
    name  = "grafana.service.type"
    value = "LoadBalancer"
  }

  # Internal LB annotations — type = "string" is critical!
  set {
    name  = "grafana.service.annotations.service\\.beta\\.kubernetes\\.io/aws-load-balancer-internal"
    value = "true"
    type  = "string"
  }

  set {
    name  = "grafana.service.annotations.service\\.beta\\.kubernetes\\.io/aws-load-balancer-scheme"
    value = "internal"
    type  = "string"
  }

  set {
    name  = "grafana.service.port"
    value = "80"
  }

  # ── Prometheus ──────────────────────────────────────────────────
  set {
    name  = "prometheus.prometheusSpec.retention"
    value = "7d"
  }

  set {
    name  = "prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.accessModes[0]"
    value = "ReadWriteOnce"
  }

  set {
    name  = "prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage"
    value = "20Gi"
  }

  depends_on = [module.eks]
}

Gotcha: type = "string" is mandatory on annotation set blocks. Without it, Terraform passes "true" as a boolean. Kubernetes annotations are map[string]string — you'll get json: cannot unmarshal bool into Go struct field ObjectMeta.metadata.annotations of type string. This one cost me 30 minutes of debugging.


Step 7 — Private DNS (Route 53)

The final piece — a private hosted zone that maps a clean domain to Grafana's internal load balancer:

# dns.tf
resource "aws_route53_zone" "private" {
  name = "devops.private"

  vpc {
    vpc_id = module.vpc.vpc_id
  }

  tags = { Name = "${var.project_name}-private-zone" }
}

# Read the LB hostname after Helm deploys Grafana
data "kubernetes_service" "grafana" {
  metadata {
    name      = "kube-prometheus-stack-grafana"
    namespace = "monitoring"
  }

  depends_on = [helm_release.kube_prometheus_stack]
}

# CNAME: grafana.devops.private → internal-xxx.elb.amazonaws.com
resource "aws_route53_record" "grafana" {
  zone_id = aws_route53_zone.private.zone_id
  name    = "grafana.devops.private"
  type    = "CNAME"
  ttl     = 300
  records = [
    data.kubernetes_service.grafana.status[0].load_balancer[0].ingress[0].hostname
  ]
}

The data "kubernetes_service" block reads the hostname that AWS assigns to Grafana's internal load balancer after the Helm chart deploys. This hostname becomes the CNAME target.


Step 8 — Variables & Outputs

# variables.tf
variable "aws_region"            { default = "us-east-1" }
variable "project_name"          { default = "eks-private" }
variable "vpc_cidr"              { default = "10.0.0.0/16" }
variable "availability_zones"    { default = ["us-east-1a", "us-east-1b"] }
variable "private_subnet_cidrs"  { default = ["10.0.1.0/24", "10.0.2.0/24"] }
variable "public_subnet_cidrs"   { default = ["10.0.101.0/24", "10.0.102.0/24"] }
variable "kubernetes_version"    { default = "1.31" }
variable "node_instance_types"   { default = ["t3.medium"] }
variable "node_desired_size"     { default = 2 }
variable "node_min_size"         { default = 1 }
variable "node_max_size"         { default = 3 }
variable "openvpn_instance_type" { default = "t3.small" }
variable "vpn_client_cidr"       { default = "10.8.0.0/24" }

variable "ssh_key_name" {
  description = "Name of an existing EC2 key pair"
  type        = string
}

variable "admin_ingress_cidr" {
  description = "Your public IP/32 for SSH access"
  type        = string
}

variable "grafana_admin_password" {
  description = "Admin password for Grafana"
  type        = string
  sensitive   = true
  default     = "admin"
}
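Terraform can also catch bad inputs at plan time. A sketch of how the admin_ingress_cidr variable could guard against a malformed value (optional addition, not required for the build):

```hcl
# Optional guardrail: fail fast at plan time if the CIDR is malformed
variable "admin_ingress_cidr" {
  description = "Your public IP/32 for SSH access"
  type        = string

  validation {
    condition     = can(cidrhost(var.admin_ingress_cidr, 0))
    error_message = "admin_ingress_cidr must be a valid CIDR, e.g. 203.0.113.10/32."
  }
}
```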
# outputs.tf
output "openvpn_public_ip" {
  value = aws_eip.vpn.public_ip
}

output "ssh_to_vpn" {
  value = "ssh -i <your-key.pem> ubuntu@${aws_eip.vpn.public_ip}"
}

output "configure_kubectl" {
  description = "Run after connecting to VPN"
  value       = "aws eks update-kubeconfig --region ${var.aws_region} --name ${module.eks.cluster_name}"
}

output "grafana_url" {
  description = "Grafana URL (accessible only through VPN)"
  value       = "http://grafana.devops.private"
}

output "grafana_lb_hostname" {
  value = data.kubernetes_service.grafana.status[0].load_balancer[0].ingress[0].hostname
}

Deploying & Connecting

1. Initialize and Apply

# Create terraform.tfvars
cat > terraform.tfvars <<EOF
aws_region             = "us-east-1"
project_name           = "eks-private"
ssh_key_name           = "your-keypair-name"
admin_ingress_cidr     = "YOUR_PUBLIC_IP/32"
grafana_admin_password = "YourSecurePassword"
EOF

terraform init
terraform apply

This creates approximately 60 resources — VPC, subnets, NAT gateway, EKS cluster, managed node group, OpenVPN EC2, security groups, Elastic IP, Helm releases, Route 53 zone, and DNS records.

2. Download VPN Profile & Connect

# Get the VPN public IP from outputs
VPN_IP=$(terraform output -raw openvpn_public_ip)

# Download the auto-generated client profile
scp -i your-key.pem ubuntu@${VPN_IP}:/home/ubuntu/client-configs/client1.ovpn .

# Connect to VPN
sudo openvpn --config client1.ovpn

Wait for Initialization Sequence Completed. You now have a tunnel into the VPC. One caveat: the profile bakes in whatever public IP the instance saw at first boot, and the Elastic IP is attached afterwards; check that the remote line in client1.ovpn matches $VPN_IP and edit it if it doesn't.

3. Configure DNS Resolution

Your local DNS resolver doesn't know about Route 53 private zones. Fix that:

# Route DNS through the VPN tunnel to VPC DNS
sudo resolvectl dns tun0 10.0.0.2
sudo resolvectl domain tun0 "~."
# Or scope it so only the private zone resolves over the VPN:
# sudo resolvectl domain tun0 "~devops.private"

4. Access Your Cluster

# Configure kubectl
aws eks update-kubeconfig --region us-east-1 --name eks-private-cluster

# Verify
kubectl get nodes
NAME                             STATUS   ROLES    AGE   VERSION
ip-10-0-1-xxx.ec2.internal      Ready    <none>   15m   v1.31.x
ip-10-0-2-xxx.ec2.internal      Ready    <none>   15m   v1.31.x

5. Open Grafana

Navigate to http://grafana.devops.private in your browser.

Login: admin / your configured password.

You get pre-built dashboards for:

  • Kubernetes cluster health and resource utilization
  • Node CPU, memory, disk, and network metrics
  • Pod-level resource consumption
  • Prometheus self-monitoring

Gotchas I Hit (So You Don't Have To)

1. "You must be logged in to the server"

Cause: EKS module v20+ no longer auto-grants admin access to the cluster creator.
Fix: Add enable_cluster_creator_admin_permissions = true to the EKS module.

2. Helm annotation boolean marshaling error

json: cannot unmarshal bool into Go struct field
ObjectMeta.metadata.annotations of type string

Cause: Terraform passes "true" as a boolean, but K8s annotations must be strings.
Fix: Add type = "string" to annotation set blocks.

3. iptables-persistent hangs during user data

Cause: The package prompts interactively — even in automated scripts.
Fix: Pre-seed with debconf-set-selections and use DEBIAN_FRONTEND=noninteractive.

4. Private DNS doesn't resolve on your machine

Cause: Your local DNS resolver (127.0.0.53) doesn't know about VPC private zones.
Fix: sudo resolvectl dns tun0 10.0.0.2 && sudo resolvectl domain tun0 "~." — routes DNS through VPN to the VPC DNS server.

5. EKS version jumps fail

Cause: AWS only allows upgrading one minor version at a time (e.g., 1.29 → 1.30, not 1.29 → 1.33).
Fix: Increment cluster_version one step at a time, running terraform apply for each.


Security Posture

| Attack Surface | Status |
| --- | --- |
| Kubernetes API | Not internet-facing — private endpoint only |
| Grafana / Prometheus | Not internet-facing — internal LB only |
| OpenVPN | UDP:1194 open, but certificate-authenticated (PKI + TLS-auth HMAC) |
| SSH to VPN server | Restricted to admin IP (admin_ingress_cidr) |
| DNS records | Private hosted zone — not resolvable outside VPC |
| IAM / RBAC | EKS Access Entries — IAM-native, auditable |

The only internet-facing resource is the OpenVPN server on UDP:1194, protected by mutual TLS authentication with a pre-shared HMAC key.


Cost Breakdown

| Resource | ~Monthly Cost |
| --- | --- |
| EKS Control Plane | $73 |
| 2x t3.medium Workers | $60 |
| NAT Gateway + Data | $32+ |
| t3.small OpenVPN | $15 |
| Internal Load Balancer | $16 |
| Elastic IP | $3.65 |
| EBS (20Gi Prometheus + roots) | ~$5 |
| Total | ~$205/mo |

Cost optimization ideas: Spot instances for workers, Graviton (t4g) for ~20% savings, scheduled scaling for dev/staging environments.
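The spot idea is a small change to the node group definition. A sketch (same module as Step 3; capacity_type is a real option of the EKS module's managed node groups):

```hcl
# Sketch: spot-backed workers — a drop-in for the eks_managed_node_groups
# block from Step 3. Listing several instance types improves spot availability.
eks_managed_node_groups = {
  default = {
    capacity_type  = "SPOT"
    instance_types = ["t3.medium", "t3a.medium"]
    min_size       = 1
    max_size       = 3
    desired_size   = 2
  }
}
```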


Wrapping Up

We built a production-grade, fully private EKS cluster with:

  • Zero public Kubernetes API exposure
  • Self-hosted OpenVPN with automated PKI and client profile generation
  • Prometheus + Grafana monitoring via internal load balancer
  • Route 53 private DNS (grafana.devops.private)
  • 100% Terraform — reproducible, version-controlled, auditable

This architecture eliminates an entire class of attack vectors by making the Kubernetes API server unreachable from the internet. Combined with certificate-based VPN authentication and private DNS, it provides a secure, practical setup for teams that take infrastructure security seriously.

The complete source code is on GitHub

Top comments (1)

klement Gunndu

The iptables NAT masquerade for VPN clients is a nice touch — ran into issues without it where VPC security groups couldn't match the 10.8.0.0/24 range. One gotcha: OpenVPN UDP on some corporate networks gets blocked, worth having a TCP fallback.