Ahmed Rakan

Building a Production Kubernetes Cluster for $15/Month In Four Days

Running lean is essential for sustaining a software business long-term. Excessive infrastructure costs can sink a startup before it has a chance to succeed. This guide will show you how to build a production-ready Kubernetes cluster for approximately $15 per month while maintaining security, scalability, and reliability.

Why Kubernetes?

Kubernetes provides container orchestration that enables you to manage and horizontally scale services across multiple nodes securely and efficiently. A solid understanding of K8s gives you the ability to run services with enterprise-grade scalability and robustness without paying premium managed service prices.

The Maintenance Myth

Many developers assume that maintaining Kubernetes requires a dedicated DevOps team. While K8s does have a learning curve, once you understand the fundamentals, daily operations become manageable. As a solo founder or small team, you can handle security audits, monitoring, and operations in roughly 30 minutes per day.

However, setting up bare-metal Kubernetes from scratch is complex and time-consuming. That's why we'll use MicroK8s instead.

Why MicroK8s?

MicroK8s is a production-ready, ultra-lightweight Kubernetes distribution created by Canonical, the company behind Ubuntu. It provides:

  • Simplified installation and configuration
  • Full Kubernetes functionality with minimal overhead
  • Built-in high availability support
  • Easy addon management
  • Automatic updates

This makes it perfect for small teams and solo founders who want production-grade Kubernetes without the operational complexity of bare-metal setup.

Architecture Overview

Our architecture consists of three main components:

  1. Infrastructure Layer: Three Contabo VPS nodes running MicroK8s
  2. Security Layer: Cloudflare for DNS, load balancing, and DDoS protection
  3. Observability Layer: Monitoring and logging stack

Here's the high-level design:

[Figure: architecture diagram]

Infrastructure Setup

Choosing Your Provider

We'll use Contabo for our infrastructure. They offer:

  • Affordable VPS instances starting at €5/month
  • Global network presence
  • Reliable performance for the price point

Recommended configuration: Cloud VPS 20 (€5/month each)

  • 3 nodes total (≈$15-16/month)
  • 1 tainted master node (dedicated to control plane tasks)
  • 2 worker nodes for application workloads

Tainting the master node keeps ordinary application pods off it: only pods that explicitly tolerate the taint are scheduled there. This prevents application workloads from competing with cluster management for CPU and memory.
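The tainting step can be sketched as follows. This is a minimal example, not the only way to do it: the `taint_control_plane` helper and the node name are hypothetical, and the `node-role.kubernetes.io/control-plane` key is simply the conventional choice rather than something MicroK8s requires.

```shell
# Hypothetical helper: block scheduling of ordinary pods on one node.
# NoSchedule only affects new scheduling; pods already running there stay
# until they are rescheduled.
taint_control_plane() {
  node="$1"
  microk8s kubectl taint nodes "$node" \
    node-role.kubernetes.io/control-plane:NoSchedule --overwrite
}

# Example (node name is a placeholder; list real names with
# "microk8s kubectl get nodes"):
#   taint_control_plane k8s-master
```

Any add-on that must run on every node needs a matching toleration, so check that system pods are still healthy after applying the taint.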

High Availability Configuration

MicroK8s replicates the control plane datastore across nodes. Once three nodes have joined, high availability is enabled automatically, and the cluster can keep operating through the loss of any one node (two of the three must stay reachable to hold quorum).
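Forming the cluster and confirming that HA is active might look like the sketch below. The join steps are shown as comments because the token is generated at run time; `ha_enabled` is a hypothetical helper that relies on the `high-availability: yes` line printed by `microk8s status`.

```shell
# On the first node: print a one-time join command (repeat once per joining node).
#   microk8s add-node
# On each additional node: paste the command that add-node printed, of the form
#   microk8s join <first-node-ip>:25000/<token>

# Hypothetical helper: succeeds only when MicroK8s reports that HA is active.
ha_enabled() {
  microk8s status --wait-ready | grep -q 'high-availability: yes'
}

# Example:
#   ha_enabled && echo "HA is active" || echo "HA is NOT active yet"
```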

Security Architecture

We'll implement security in three layers: infrastructure security, runtime security, and disaster recovery.

Layer 1: Infrastructure Security

Cloudflare Integration

Routing traffic through Cloudflare provides multiple security benefits:

  1. DDoS Protection: Built-in mitigation for distributed denial-of-service attacks
  2. WAF (Web Application Firewall): Protection against OWASP Top 10 vulnerabilities
  3. IP Masking: Your server IPs remain hidden from potential attackers
  4. Global Load Balancing: Distributes traffic across nodes with automatic failover

Cloudflare's load balancer offers:

  • Active health monitoring
  • Intelligent routing based on latency and geography
  • Custom rules for traffic management
  • Detailed analytics

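Note on single point of failure: Cloudflare sits in front of everything in this design, so it is a dependency to be aware of. It advertises a 100% uptime SLA on its Business and Enterprise plans (an SLA entitles you to service credits, not a guarantee of zero outages, and the Free and Pro plans carry no SLA). Its network is nonetheless among the most resilient available.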

Network Security

Implement defense-in-depth with these measures:

Firewall Rules (UFW):

# Allow only necessary traffic
- Node-to-node communication (K8s internal)
- Load balancer to node traffic
- SSH access from specific IPs only
- Block all other inbound traffic
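The rules above can be sketched as a small script. This is an illustration under assumptions: `configure_ufw` is a hypothetical helper meant to be run as root on each node, the IP addresses are placeholders, and `default allow routed` assumes pod traffic is forwarded by the CNI, which is the usual case on MicroK8s.

```shell
# Hypothetical helper: $1 = admin IP allowed to SSH, remaining args = IPs of
# the other cluster nodes.
configure_ufw() {
  admin_ip="$1"; shift
  ufw default deny incoming
  ufw default allow outgoing
  # Pod-to-pod traffic is forwarded between interfaces, so routed traffic
  # must not be dropped.
  ufw default allow routed
  # Node-to-node traffic (API server, kubelet, datastore, CNI overlay).
  for peer in "$@"; do
    ufw allow from "$peer"
  done
  # SSH only from the admin address (change 22 if you moved sshd).
  ufw allow from "$admin_ip" to any port 22 proto tcp
  # HTTP/HTTPS from the load balancer; this can be tightened further to
  # Cloudflare's published IP ranges.
  ufw allow 80/tcp
  ufw allow 443/tcp
  ufw --force enable
}

# Example (placeholder addresses):
#   configure_ufw 203.0.113.10 10.0.0.2 10.0.0.3
```

Keep an existing SSH session open while testing, so a mistake in the admin IP does not lock you out.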

SSH Hardening:

  • Disable password authentication
  • Use SSH keys only
  • Change default SSH port
  • Implement fail2ban to block brute-force attempts
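A minimal sketch of the SSH settings above, written as an sshd drop-in file. It assumes your image supports the `/etc/ssh/sshd_config.d/` directory (recent Ubuntu releases do); the file name and port 2222 are arbitrary examples.

```shell
# Write the drop-in to the current directory so it can be reviewed first.
# sshd keeps the first value it reads for most settings, so a low-numbered
# file name takes precedence over later drop-ins.
cat > 00-hardening.conf <<'EOF'
# Keys only, no passwords
PasswordAuthentication no
PubkeyAuthentication yes
# Root may log in with a key but never with a password
PermitRootLogin prohibit-password
# Non-default port (2222 is an arbitrary example)
Port 2222
EOF

# Install, validate, then restart (run as root; the service is "ssh" on Ubuntu):
#   install -m 644 00-hardening.conf /etc/ssh/sshd_config.d/00-hardening.conf
#   sshd -t && systemctl restart ssh
```

Open the new port in the firewall and confirm that a key-based login works in a second session before closing the first one.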

OS Security:

  • Regular OS and kernel patching
  • Minimal package installation
  • Hardened user permissions
  • Security auditing with tools like Lynis

Layer 2: Runtime Security

Falco - Runtime Threat Detection

Falco is an open-source cloud-native runtime security tool that provides:

  • Real-time threat detection for containers
  • Configurable rules for suspicious behavior
  • Integration with Kubernetes audit logs
  • Alerts for policy violations

Example use cases:

  • Detecting unexpected process execution in containers
  • Monitoring privileged container operations
  • Tracking sensitive file access
  • Identifying suspicious network activity
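As an illustration of the first use case, a custom Falco rule might look like the sketch below. The rule name and file name are hypothetical, and the condition assumes the `spawned_process` and `container` macros from Falco's default ruleset are loaded alongside it.

```shell
# Write an example custom rule to the current directory.
cat > falco_rules.local.yaml <<'EOF'
- rule: Example shell in container
  desc: A shell process was started inside a container
  condition: spawned_process and container and proc.name in (bash, sh, zsh)
  output: "Shell in container (user=%user.name container=%container.name cmd=%proc.cmdline)"
  priority: WARNING
  tags: [container, shell]
EOF
```

How the file gets loaded depends on how Falco is installed, so treat the file name as a placeholder and expect to tune the condition to cut down on false positives.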

Istio Service Mesh

Istio extends Kubernetes networking with:

  • mTLS encryption: Automatic service-to-service encryption
  • Traffic management: Fine-grained routing and load balancing
  • Security policies: Authorization and authentication at the service level
  • Observability: Distributed tracing and metrics

Key Istio features for our setup:

  • Virtual Services for traffic routing
  • Destination Rules for load balancing
  • PeerAuthentication for mTLS enforcement
  • AuthorizationPolicies for access control
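For example, mesh-wide mTLS enforcement can be sketched with a single PeerAuthentication resource. This assumes Istio is installed with `istio-system` as its root namespace; the output file name is arbitrary.

```shell
# Write the manifest to the current directory.
cat > mtls-strict.yaml <<'EOF'
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system   # root namespace, so the policy applies mesh-wide
spec:
  mtls:
    mode: STRICT            # reject plaintext traffic between sidecars
EOF

# Apply it with:
#   microk8s kubectl apply -f mtls-strict.yaml
```

A common approach is to start in `PERMISSIVE` mode and switch to `STRICT` only once every workload has a sidecar, since strict mode drops traffic from pods outside the mesh.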

Kiali - Service Mesh Visualization

Kiali provides a visual dashboard for Istio, offering:

  • Service topology visualization
  • Real-time traffic flow monitoring
  • Configuration validation
  • Health and performance metrics

Monitoring and Observability

A production cluster requires comprehensive monitoring. Here's our observability stack:

Prometheus and Grafana

Prometheus (metrics collection):

  • Cluster resource utilization
  • Node health metrics
  • Application performance metrics
  • Custom business metrics

Grafana (visualization):

  • Pre-built Kubernetes dashboards
  • Custom dashboard creation
  • Alert visualization
  • Historical data analysis

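MicroK8s makes this easy with a built-in addon. In current releases, Prometheus, Grafana, and Alertmanager ship together in the observability addon, which replaces the older prometheus addon (there is no separate grafana addon):

microk8s enable observability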

Logging Stack

Implement centralized logging with the EFK/ELK stack:

  • Elasticsearch: Log storage and indexing
  • Fluentd/Fluent Bit: Log collection and forwarding
  • Kibana: Log visualization and search

Benefits:

  • Centralized log aggregation from all nodes
  • Full-text search across all logs
  • Historical log retention
  • Custom dashboards for log patterns

Alerting Strategy

Configure alerts for critical events:

Infrastructure alerts:

  • Node down or unreachable
  • High CPU/memory utilization (>85%)
  • Disk space warnings (>80% used)
  • Network connectivity issues
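The CPU, memory, and disk thresholds above could be expressed as Prometheus alert rules along these lines. This is a sketch under assumptions: it presumes the Prometheus Operator CRDs and node-exporter metrics are present (as with kube-prometheus-stack), and the `observability` namespace and rule names are illustrative.

```shell
# Write the rule manifest to the current directory.
cat > node-alerts.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: node-alerts
  namespace: observability   # assumed monitoring namespace
  # Depending on the install, a label such as "release: <helm-release>" may
  # be needed here so that Prometheus selects this rule.
spec:
  groups:
    - name: node-alerts
      rules:
        - alert: NodeCPUHigh
          expr: '100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85'
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: 'CPU above 85% on {{ $labels.instance }}'
        - alert: NodeMemoryHigh
          expr: '(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 85'
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: 'Memory above 85% on {{ $labels.instance }}'
        - alert: NodeDiskHigh
          expr: '(1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"}) * 100 > 80'
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: 'Root filesystem above 80% on {{ $labels.instance }}'
EOF

# Apply it with:
#   microk8s kubectl apply -f node-alerts.yaml
```

The `for: 10m` clause keeps short spikes from paging you; adjust it to how noisy your workloads are.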

Application alerts:

  • Pod crash loops
  • Failed deployments
  • High error rates
  • Response time degradation

Security alerts:

  • Falco threat detections
  • Failed authentication attempts
  • Unauthorized API access
  • Certificate expiration warnings

Deliver alerts via:

  • Email for non-critical issues
  • Slack/Discord for team notifications
  • PagerDuty for critical incidents (optional)

Disaster Recovery and Backup

Backup Strategy

Kasten K10 provides enterprise-grade Kubernetes backup and disaster recovery:

  • Automated daily backups of cluster state
  • Application-centric backup and restore
  • Volume snapshots with point-in-time recovery
  • Cross-region backup storage
  • Policy-driven automation
  • Ransomware protection with immutable backups

What to back up:

  • Kubernetes resources (deployments, services, configs)
  • Persistent volume data
  • Secrets and ConfigMaps
  • Custom resource definitions

Backup schedule:

  • Daily incremental backups
  • Weekly full backups
  • 30-day retention policy

Disaster Recovery Plan

Document and test your recovery procedures:

  1. Node failure: Automatic failover via Cloudflare load balancer
  2. Cluster failure: Restore from Kasten K10 backup to new nodes
  3. Data corruption: Point-in-time restore from snapshots
  4. Region failure: Restore cluster in alternate region (if using geo-distributed setup)

Recovery Time Objective (RTO): < 1 hour
Recovery Point Objective (RPO): < 24 hours

Test your disaster recovery plan quarterly to ensure procedures work as expected.

Cost Breakdown

Let's review the actual costs:

Infrastructure:

  • 3x Contabo VPS (€5 each): €15/month (~$16/month)

Services:

  • Cloudflare DNS: Free
  • Cloudflare Load Balancer: $5/month (entry tier)
  • Kasten K10: Free tier (node-limited; the cap has changed over time, so confirm that it still covers your 3 nodes)
  • All other open-source software: Free

Total: Approximately $21/month
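The arithmetic behind that total can be checked with a short helper. This is only a sketch: `monthly_cost_usd` is a hypothetical function, and the EUR to USD rate of 1.07 is an assumed value that you should replace with the current one.

```shell
# Hypothetical helper.
# $1 = node count, $2 = EUR per node, $3 = EUR->USD rate, $4 = USD add-ons
monthly_cost_usd() {
  LC_ALL=C awk -v n="$1" -v eur="$2" -v rate="$3" -v extra="$4" \
    'BEGIN { printf "%.2f\n", n * eur * rate + extra }'
}

monthly_cost_usd 3 5 1.07 5   # with the paid load balancer: prints 21.05
monthly_cost_usd 3 5 1.07 0   # without it: prints 16.05
```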

You can reduce this to $15-16/month by:

  • Using Cloudflare DNS without paid load balancing (implement DNS round-robin)
  • Starting with 2 nodes for development environments
  • Using geographic DNS routing as an alternative

Implementation Checklist

Follow this sequence for implementation:

Phase 1: Infrastructure (1 day)

  • [ ] Provision three Contabo VPS instances
  • [ ] Install and configure MicroK8s on all nodes
  • [ ] Join nodes into a cluster
  • [ ] Taint master node
  • [ ] Configure Cloudflare DNS and load balancing
  • [ ] Implement UFW firewall rules

Phase 2: Security (1 day)

  • [ ] Deploy Istio service mesh
  • [ ] Configure mTLS policies
  • [ ] Install and configure Falco
  • [ ] Set up custom Falco rules
  • [ ] Deploy Kiali for visualization
  • [ ] Harden SSH access

Phase 3: Observability (1 day)

  • [ ] Enable Prometheus and Grafana addons
  • [ ] Deploy logging stack (EFK/ELK)
  • [ ] Configure alerting rules
  • [ ] Create custom dashboards
  • [ ] Set up alert notifications

Phase 4: Backup and DR (1 day)

  • [ ] Install Kasten K10
  • [ ] Configure backup storage location
  • [ ] Create backup policies
  • [ ] Set up automated backup schedules
  • [ ] Document recovery procedures
  • [ ] Perform DR test

Best Practices

Security Maintenance

Daily (15 minutes):

  • Review Falco security alerts
  • Check cluster health in Grafana
  • Verify backup completion
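Part of the daily check can be scripted. The helpers below are a hypothetical sketch that parses the default column layout of `kubectl get`; they complement the dashboards rather than replace them.

```shell
# Print how many nodes are not plainly "Ready". A cordoned node shows as
# "Ready,SchedulingDisabled" and is counted too.
not_ready_nodes() {
  microk8s kubectl get nodes --no-headers |
    awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Print namespace/name for every pod stuck in CrashLoopBackOff.
crashlooping_pods() {
  microk8s kubectl get pods --all-namespaces --no-headers |
    awk '$4 == "CrashLoopBackOff" { print $1 "/" $2 }'
}

# Example:
#   [ "$(not_ready_nodes)" -eq 0 ] || echo "WARNING: some nodes are not Ready"
#   crashlooping_pods
```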

Weekly (30 minutes):

  • Review access logs
  • Update security rules as needed
  • Check for available patches
  • Review resource utilization trends

Monthly (2 hours):

  • Apply OS and security updates
  • Rotate credentials and certificates
  • Review and update firewall rules
  • Analyze security audit logs
  • Test disaster recovery procedures

Performance Optimization

  • Use resource requests and limits for all pods
  • Implement horizontal pod autoscaling
  • Use node affinity to optimize placement
  • Regularly review and optimize container images
  • Monitor and tune Istio performance
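The first two items can be sketched together, since CPU-based autoscaling depends on pods declaring CPU requests. The `web` app, its image, and the numbers are all hypothetical; the HPA also assumes a metrics source such as the MicroK8s `metrics-server` addon is enabled.

```shell
# Write a Deployment with requests/limits and a matching autoscaler.
cat > web-autoscale.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:stable        # placeholder image
          resources:
            requests:                # what the scheduler reserves
              cpu: 100m
              memory: 128Mi
            limits:                  # hard ceiling per container
              cpu: 500m
              memory: 256Mi
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70     # percent of the CPU request
EOF

# Apply it with:
#   microk8s kubectl apply -f web-autoscale.yaml
```

On small nodes, keep `maxReplicas` times the CPU request below what the workers can actually provide, or the extra replicas will sit in Pending.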

Cost Optimization

  • Right-size your workloads based on actual usage
  • Use resource quotas to prevent overcommitment
  • Implement pod disruption budgets
  • Schedule non-critical workloads during off-peak
  • Monitor and optimize storage usage
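A resource quota and a pod disruption budget might look like the sketch below. The `apps` namespace, the `app: web` label, and the limits are hypothetical values chosen for a small cluster.

```shell
# Write both manifests to one file in the current directory.
cat > quota-and-pdb.yaml <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: apps-quota
  namespace: apps
spec:
  hard:                      # totals across all pods in the namespace
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web
  namespace: apps
spec:
  minAvailable: 1            # keep at least one pod up during node drains
  selector:
    matchLabels:
      app: web
EOF

# Apply it with:
#   microk8s kubectl apply -f quota-and-pdb.yaml
```

Once a quota on requests and limits is active, pods in that namespace that omit them are rejected, which also enforces the habit of setting them.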

Common Pitfalls to Avoid

  1. Skipping monitoring: You can't manage what you can't measure
  2. Neglecting backups: Test your backups regularly
  3. Ignoring security updates: Automate patching where possible
  4. Over-provisioning: Start small and scale as needed
  5. No documentation: Document your setup and procedures
  6. Skipping DR tests: Quarterly testing is essential

Conclusion

Building a production-ready Kubernetes cluster for $15-20/month is achievable with the right tools and approach. MicroK8s provides enterprise-grade Kubernetes functionality without the operational overhead of bare-metal setup. Combined with Cloudflare's security features and open-source monitoring tools, you can create a robust, scalable infrastructure on a bootstrap budget.

The key is taking time to learn the fundamentals. While Kubernetes has a learning curve, the investment pays dividends in operational efficiency, scalability, and cost savings. As your business grows, this foundation scales with you.

Remember: security is an iterative process. Start with these fundamentals, monitor continuously, and improve based on real-world observations. This approach provides an excellent foundation for production workloads while maintaining flexibility for future growth.

Extra Notes

  • One limitation of keeping the cluster this cheap is the 300 Mbit/s port limit on Contabo VPS plans. If you outgrow it, you can migrate the nodes to larger instances, or to the bare-metal servers Contabo also offers.
  • For dev teams, we recommend Argo CD plus GitHub Actions for CI/CD pipelines that are complete, secure, and easy to set up and maintain.
  • One thing we recommend against hosting in the cluster is databases. It is doable, but horizontally distributed databases such as MongoDB and NewSQL systems can get tricky. One tool we found helpful when hosting databases like Redis and MongoDB is Longhorn, a distributed block storage system for Kubernetes.

Have questions or suggestions? Feel free to reach out in the comments below.