Ahmed Rakan

Posted on Oct 28 • Edited on Oct 30

Building a Production Kubernetes Cluster for $15/Month In Four Days

#devops #backend #webdev #programming

Running lean is essential for sustaining a software business long-term. Excessive infrastructure costs can sink a startup before it has a chance to succeed. This guide will show you how to build a production-ready Kubernetes cluster for approximately $15 per month while maintaining security, scalability, and reliability.

Why Kubernetes?

Kubernetes provides container orchestration that enables you to manage and horizontally scale services across multiple nodes securely and efficiently. A solid understanding of K8s gives you the ability to run services with enterprise-grade scalability and robustness without paying premium managed service prices.

The Maintenance Myth

Many developers assume that maintaining Kubernetes requires a dedicated DevOps team. While K8s does have a learning curve, once you understand the fundamentals, daily operations become manageable. As a solo founder or small team, you can handle security audits, monitoring, and operations in roughly 30 minutes per day.

However, setting up bare-metal Kubernetes from scratch is complex and time-consuming. That's why we'll use MicroK8s instead.

Why MicroK8s?

MicroK8s is a production-ready, ultra-lightweight Kubernetes distribution created by Canonical, the company behind Ubuntu. It provides:

Simplified installation and configuration
Full Kubernetes functionality with minimal overhead
Built-in high availability support
Easy addon management
Automatic updates

This makes it perfect for small teams and solo founders who want production-grade Kubernetes without the operational complexity of bare-metal setup.

Architecture Overview

Our architecture consists of three main components:

Infrastructure Layer: Three Contabo VPS nodes running MicroK8s
Security Layer: Cloudflare for DNS, load balancing, and DDoS protection
Observability Layer: Monitoring and logging stack

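At a high level, Cloudflare sits in front of the three MicroK8s nodes and handles DNS, load balancing, and DDoS protection, while the monitoring and logging stack collects metrics and logs from every node.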

Infrastructure Setup

Choosing Your Provider

We'll use Contabo for our infrastructure. They offer:

Affordable VPS instances starting at €5/month
Global network presence
Reliable performance for the price point

Recommended configuration: Cloud VPS 20 (€5/month each)

3 nodes total (≈$15-16/month)
1 tainted master node (dedicated to control plane tasks)
2 worker nodes for application workloads

By tainting the master node, we ensure that only Kubernetes control plane components run on it, preventing application workloads from interfering with cluster management.
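
As a minimal sketch of the tainting step, assuming the cluster is already formed and the node is registered under the hypothetical name `node-1`, a `NoSchedule` taint keeps ordinary pods off that node unless they explicitly tolerate it. The helper name `taint_control_plane` is illustrative and not part of MicroK8s.

```shell
# Hypothetical helper: keep application pods off the given node.
# The key used here is the conventional control-plane taint key; pods
# without a matching toleration will not be scheduled on the node.
taint_control_plane() {
  microk8s kubectl taint nodes "$1" \
    node-role.kubernetes.io/control-plane:NoSchedule --overwrite
}
```

Calling `taint_control_plane node-1` applies the taint, and `microk8s kubectl describe node node-1` should then list it under `Taints`.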

High Availability Configuration

MicroK8s replicates the control plane across multiple nodes. Once three nodes have joined the cluster as full (non-worker) nodes, high availability is enabled automatically, and the cluster can survive the loss of any single node.
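
Forming the cluster might look roughly like the sketch below. It assumes Ubuntu hosts with `snap` available; the function names are hypothetical, and the join address and token are placeholders that `microk8s add-node` prints for you.

```shell
# Run on every node: install MicroK8s and wait until it reports ready.
install_microk8s() {
  sudo snap install microk8s --classic
  sudo microk8s status --wait-ready
}

# Run on the first node: prints a one-time "microk8s join ..." command.
create_join_token() {
  sudo microk8s add-node
}

# Run on each additional node with the address/token printed above.
# Joining without --worker keeps the node in the control plane, which is
# what allows MicroK8s to enable HA once three nodes are present.
join_cluster() {
  sudo microk8s join "$1"
}
```

For example, `join_cluster 10.0.0.1:25000/<token>` on the second and third nodes. Afterwards, `microk8s status` on any node should report `high-availability: yes`.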

Security Architecture

We'll implement security in three layers: infrastructure security, runtime security, and disaster recovery.

Layer 1: Infrastructure Security

Cloudflare Integration

Routing traffic through Cloudflare provides multiple security benefits:

DDoS Protection: Built-in mitigation for distributed denial-of-service attacks
WAF (Web Application Firewall): Protection against OWASP Top 10 vulnerabilities
IP Masking: Your server IPs remain hidden from potential attackers
Global Load Balancing: Distributes traffic across nodes with automatic failover

Cloudflare's load balancer offers:

Active health monitoring
Intelligent routing based on latency and geography
Custom rules for traffic management
Detailed analytics

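Note on single point of failure: Cloudflare becomes a dependency for all inbound traffic. It powers a significant portion of the internet and its infrastructure is among the most resilient globally, but its 100% uptime SLA applies to the Business and Enterprise plans, not to the free plan.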

Network Security

Implement defense-in-depth with these measures:

Firewall Rules (UFW):

# Allow only necessary traffic
- Node-to-node communication (K8s internal)
- Load balancer to node traffic
- SSH access from specific IPs only
- Block all other inbound traffic
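
One way to express those rules with UFW is sketched below. The IP addresses are documentation placeholders (assumptions, not values from this setup), and the function name is hypothetical. Node-to-node traffic is allowed per peer address rather than per port, which avoids having to enumerate every Kubernetes port.

```shell
# Placeholder addresses (assumptions): replace with your own values.
ADMIN_IP="203.0.113.10"                  # workstation allowed to SSH in
PEER_IPS="198.51.100.11 198.51.100.12"   # the other two cluster nodes

apply_firewall() {
  sudo ufw default deny incoming
  sudo ufw default allow outgoing
  # SSH from the admin address only (adjust the port if you change it)
  sudo ufw allow from "$ADMIN_IP" to any port 22 proto tcp
  # Unrestricted node-to-node traffic for Kubernetes internals
  for ip in $PEER_IPS; do
    sudo ufw allow from "$ip"
  done
  # HTTP/HTTPS for traffic arriving through the load balancer
  sudo ufw allow 80/tcp
  sudo ufw allow 443/tcp
  sudo ufw --force enable
}
```

Run `apply_firewall` on each node with that node's own peer list. To tighten this further, the two web rules can be replaced with one `ufw allow from <range> to any port 443 proto tcp` rule per published Cloudflare IP range.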

SSH Hardening:

Disable password authentication
Use SSH keys only
Change default SSH port
Implement fail2ban to block brute-force attempts
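
The first three items can be captured in a small sshd drop-in file. This is a sketch under assumptions: port 2222 is an arbitrary example, the helper name is hypothetical, and the target directory `/etc/ssh/sshd_config.d/` is the one recent Ubuntu releases include by default.

```shell
# Write a hardening drop-in for sshd to the path given as $1.
# Port 2222 is an arbitrary example value, not a recommendation.
write_sshd_hardening() {
  cat > "$1" <<'EOF'
Port 2222
PasswordAuthentication no
PubkeyAuthentication yes
PermitRootLogin prohibit-password
EOF
}
```

A possible workflow: `write_sshd_hardening /tmp/00-hardening.conf`, copy the file to `/etc/ssh/sshd_config.d/00-hardening.conf`, validate with `sudo sshd -t`, then restart the SSH service. The `00-` prefix matters because sshd uses the first value it reads for each option, so an earlier-sorting drop-in would otherwise win. On Ubuntu releases that start sshd through socket activation, the listening port may be governed by the `ssh.socket` unit instead, so confirm the new port works before closing your current session.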

OS Security:

Regular OS and kernel patching
Minimal package installation
Hardened user permissions
Security auditing with tools like Lynis

Layer 2: Runtime Security

Falco - Runtime Threat Detection

Falco is an open-source cloud-native runtime security tool that provides:

Real-time threat detection for containers
Configurable rules for suspicious behavior
Integration with Kubernetes audit logs
Alerts for policy violations

Example use cases:

Detecting unexpected process execution in containers
Monitoring privileged container operations
Tracking sensitive file access
Identifying suspicious network activity
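
To make the first use case concrete, here is a sketch of a custom Falco rule. It assumes the `spawned_process`, `container`, and `shell_procs` macros from Falco's default ruleset are loaded; the rule name, priority, and helper function are illustrative choices.

```shell
# Print a custom Falco rule that flags shells started inside containers.
# Relies on macros from Falco's default rules file being loaded first.
print_falco_rule() {
  cat <<'EOF'
- rule: Shell spawned in container (custom)
  desc: A shell process was started inside a running container
  condition: spawned_process and container and shell_procs
  output: "Shell in container (user=%user.name container=%container.name cmdline=%proc.cmdline)"
  priority: WARNING
  tags: [container, shell]
EOF
}
```

On a host install, `print_falco_rule | sudo tee /etc/falco/rules.d/custom_rules.yaml` would place the rule where Falco looks for extra rules by default. With a Helm-based install, the same YAML would go under the chart's custom rules value instead.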

Istio Service Mesh

Istio extends Kubernetes networking with:

mTLS encryption: Automatic service-to-service encryption
Traffic management: Fine-grained routing and load balancing
Security policies: Authorization and authentication at the service level
Observability: Distributed tracing and metrics

Key Istio features for our setup:

Virtual Services for traffic routing
Destination Rules for load balancing
PeerAuthentication for mTLS enforcement
AuthorizationPolicies for access control
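
As an illustration of mTLS enforcement, the sketch below prints a mesh-wide `PeerAuthentication` in strict mode. It assumes Istio is installed with its default root namespace, `istio-system`; the helper name is hypothetical.

```shell
# Print a mesh-wide PeerAuthentication that rejects plaintext traffic.
# A policy named "default" in the Istio root namespace applies to every
# workload in the mesh.
print_strict_mtls() {
  cat <<'EOF'
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
EOF
}
```

Apply it with `print_strict_mtls | microk8s kubectl apply -f -`. If some workloads do not have sidecars yet, starting with `mode: PERMISSIVE` and switching to `STRICT` later avoids cutting them off.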

Kiali - Service Mesh Visualization

Kiali provides a visual dashboard for Istio, offering:

Service topology visualization
Real-time traffic flow monitoring
Configuration validation
Health and performance metrics

Monitoring and Observability

A production cluster requires comprehensive monitoring. Here's our observability stack:

Prometheus and Grafana

Prometheus (metrics collection):

Cluster resource utilization
Node health metrics
Application performance metrics
Custom business metrics

Grafana (visualization):

Pre-built Kubernetes dashboards
Custom dashboard creation
Alert visualization
Historical data analysis

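MicroK8s makes this easy with a built-in addon. Recent releases bundle Prometheus and Grafana together in the observability addon, which replaces the older prometheus addon:

microk8s enable observability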

Logging Stack

Implement centralized logging with the EFK/ELK stack:

Elasticsearch: Log storage and indexing
Fluentd/Fluent Bit: Log collection and forwarding
Kibana: Log visualization and search

Benefits:

Centralized log aggregation from all nodes
Full-text search across all logs
Historical log retention
Custom dashboards for log patterns

Alerting Strategy

Configure alerts for critical events:

Infrastructure alerts:

Node down or unreachable
High CPU/memory utilization (>85%)
Disk space warnings (>80% used)
Network connectivity issues

Application alerts:

Pod crash loops
Failed deployments
High error rates
Response time degradation

Security alerts:

Falco threat detections
Failed authentication attempts
Unauthorized API access
Certificate expiration warnings

Deliver alerts via:

Email for non-critical issues
Slack/Discord for team notifications
PagerDuty for critical incidents (optional)
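
Two of the infrastructure thresholds above could be written as Prometheus alerting rules along these lines. This is a sketch that assumes the Prometheus Operator CRDs and node-exporter metrics are available in the cluster; the rule names, the `observability` namespace, and the 10-minute hold time are assumptions.

```shell
# Print a PrometheusRule for the CPU (>85%) and disk (>80%) thresholds.
# Metadata names and the namespace are hypothetical placeholders.
print_infra_alerts() {
  cat <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: infra-thresholds
  namespace: observability
spec:
  groups:
    - name: infrastructure
      rules:
        - alert: HighCpuUtilization
          expr: 1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) > 0.85
          for: 10m
          labels:
            severity: warning
        - alert: DiskSpaceLow
          expr: 1 - node_filesystem_avail_bytes{mountpoint="/"} / node_filesystem_size_bytes{mountpoint="/"} > 0.80
          for: 10m
          labels:
            severity: warning
EOF
}
```

Apply it with `print_infra_alerts | microk8s kubectl apply -f -`. Depending on how Prometheus was installed, the rule may also need labels that match the operator's rule selector before it is picked up.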

Disaster Recovery and Backup

Backup Strategy

Kasten K10 provides enterprise-grade Kubernetes backup and disaster recovery:

Automated daily backups of cluster state
Application-centric backup and restore
Volume snapshots with point-in-time recovery
Cross-region backup storage
Policy-driven automation
Ransomware protection with immutable backups

What to backup:

Kubernetes resources (deployments, services, configs)
Persistent volume data
Secrets and ConfigMaps
Custom resource definitions

Backup schedule:

Daily incremental backups
Weekly full backups
30-day retention policy

Disaster Recovery Plan

Document and test your recovery procedures:

Node failure: Automatic failover via Cloudflare load balancer
Cluster failure: Restore from Kasten K10 backup to new nodes
Data corruption: Point-in-time restore from snapshots
Region failure: Restore cluster in alternate region (if using geo-distributed setup)

Recovery Time Objective (RTO): < 1 hour
Recovery Point Objective (RPO): < 24 hours

Test your disaster recovery plan quarterly to ensure procedures work as expected.

Cost Breakdown

Let's review the actual costs:

Infrastructure:

3x Contabo VPS (€5 each): €15/month (~$16/month)

Services:

Cloudflare DNS: Free
Cloudflare Load Balancer: $5/month (entry tier)
Kasten K10: Free tier (up to 10 nodes)
All other open-source software: Free

Total: Approximately $21/month

You can reduce this to $15-16/month by:

Using Cloudflare DNS without paid load balancing (implement DNS round-robin)
Starting with 2 nodes for development environments
Using geographic DNS routing as an alternative
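
The arithmetic behind both totals can be checked with a small helper. The exchange rate of 1.07 USD per EUR is an assumption inferred from the "€15 is about $16" figure above, and the function name is hypothetical.

```shell
# Monthly total in USD: nodes x EUR price x exchange rate + USD extras.
# Arguments: node count, EUR per node, USD-per-EUR rate, extra USD costs.
monthly_cost_usd() {
  awk -v n="$1" -v eur="$2" -v rate="$3" -v extras="$4" \
    'BEGIN { printf "%.2f\n", n * eur * rate + extras }'
}
```

`monthly_cost_usd 3 5 1.07 5` prints `21.05` (with the paid load balancer), and `monthly_cost_usd 3 5 1.07 0` prints `16.05` (DNS only).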

Implementation Checklist

Follow this sequence for implementation:

Phase 1: Infrastructure (1 day)

[ ] Provision three Contabo VPS instances
[ ] Install and configure MicroK8s on all nodes
[ ] Join nodes into a cluster
[ ] Taint master node
[ ] Configure Cloudflare DNS and load balancing
[ ] Implement UFW firewall rules

Phase 2: Security (1 day)

[ ] Deploy Istio service mesh
[ ] Configure mTLS policies
[ ] Install and configure Falco
[ ] Set up custom Falco rules
[ ] Deploy Kiali for visualization
[ ] Harden SSH access

Phase 3: Observability (1 day)

[ ] Enable Prometheus and Grafana addons
[ ] Deploy logging stack (EFK/ELK)
[ ] Configure alerting rules
[ ] Create custom dashboards
[ ] Set up alert notifications

Phase 4: Backup and DR (1 day)

[ ] Install Kasten K10
[ ] Configure backup storage location
[ ] Create backup policies
[ ] Set up automated backup schedules
[ ] Document recovery procedures
[ ] Perform DR test

Best Practices

Security Maintenance

Daily (15 minutes):

Review Falco security alerts
Check cluster health in Grafana
Verify backup completion

Weekly (30 minutes):

Review access logs
Update security rules as needed
Check for available patches
Review resource utilization trends

Monthly (2 hours):

Apply OS and security updates
Rotate credentials and certificates
Review and update firewall rules
Analyze security audit logs
Test disaster recovery procedures

Performance Optimization

Use resource requests and limits for all pods
Implement horizontal pod autoscaling
Use node affinity to optimize placement
Regularly review and optimize container images
Monitor and tune Istio performance
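
The first two items map onto standard `kubectl` commands. The sketch below uses a hypothetical helper and placeholder values (100m/128Mi requests, 500m/256Mi limits, 70% CPU target, 2 to 5 replicas) that should be tuned per workload.

```shell
# Hypothetical helper: set requests/limits on a Deployment, then
# autoscale it on CPU. All numeric values are placeholders.
tune_deployment() {
  microk8s kubectl set resources deployment "$1" \
    --requests=cpu=100m,memory=128Mi --limits=cpu=500m,memory=256Mi
  microk8s kubectl autoscale deployment "$1" --cpu-percent=70 --min=2 --max=5
}
```

For a Deployment named `web`, run `tune_deployment web`. CPU-based autoscaling needs a metrics source, which on MicroK8s can be provided with `microk8s enable metrics-server`.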

Cost Optimization

Right-size your workloads based on actual usage
Use resource quotas to prevent overcommitment
Implement pod disruption budgets
Schedule non-critical workloads during off-peak hours
Monitor and optimize storage usage

Common Pitfalls to Avoid

Skipping monitoring: You can't manage what you can't measure
Neglecting backups: Test your backups regularly
Ignoring security updates: Automate patching where possible
Over-provisioning: Start small and scale as needed
No documentation: Document your setup and procedures
Skipping DR tests: Quarterly testing is essential

Conclusion

Building a production-ready Kubernetes cluster for $15-20/month is achievable with the right tools and approach. MicroK8s provides enterprise-grade Kubernetes functionality without the operational overhead of bare-metal setup. Combined with Cloudflare's security features and open-source monitoring tools, you can create a robust, scalable infrastructure on a bootstrap budget.

The key is taking time to learn the fundamentals. While Kubernetes has a learning curve, the investment pays dividends in operational efficiency, scalability, and cost savings. As your business grows, this foundation scales with you.

Remember: security is an iterative process. Start with these fundamentals, monitor continuously, and improve based on real-world observations. This approach provides an excellent foundation for production workloads while maintaining flexibility for future growth.

Extra Notes

One limitation of keeping the cluster this cheap is the 300 Mbit/s port limit on Contabo VPS instances. When that becomes a bottleneck, the nodes can be migrated to larger plans or even to Contabo's bare-metal servers.
For dev teams, we recommend Argo CD with GitHub Actions for complete, secure CI/CD pipelines that are easy to set up and maintain.
One thing we recommend against hosting in the cluster is databases. It is doable, but with horizontally distributed databases such as MongoDB and NewSQL systems, things can get tricky. When hosting databases like Redis and MongoDB, we found Longhorn helpful. Longhorn is a distributed block storage system for Kubernetes.

Have questions or suggestions? Feel free to reach out in the comments below.