Building an Air-gapped, Hardened Kubernetes Cluster with Kubespray
Most Kubernetes examples assume you have full Internet access and can just `curl | bash` your way into a cluster.
In many real environments, that's not the case.
In this project I built and documented a production-style, air-gapped, hardened, highly available Kubernetes cluster using Kubespray. This post is a high-level walkthrough of the approach, the constraints, and how the pieces fit together.
👉 Full runbook and supporting material are available on GitHub:
https://a-soltani255.github.io/Kubespray/
Why air-gapped + hardened?
The scenario behind this project:
- No direct Internet access from the Kubernetes nodes
- Strict security requirements
- Need for a repeatable, documented process to create new clusters
- Desire for high availability (multi-control-plane) rather than a single-master lab
That combination is common in regulated environments, but it's rarely covered in simple tutorials.
High-level architecture
At a very high level, the setup looks like this:
- Several Linux nodes (for control plane and workers)
- A bastion / deploy node where Kubespray runs Ansible
- Internal infrastructure:
  - Local OS package repositories (mirrors of upstream repos)
  - A private container registry for all required images
- Kubernetes deployed by Kubespray, with:
  - A multi-node control plane (HA)
  - A custom CNI
  - Opinionated hardening
The GitHub repo contains:
- A full Markdown runbook: `Installing-Airgapped-Hardened-Kubernetes-Cluster-Using-Kubespray.md`
- A `Scripts, appendices and Configurations/` folder with:
  - Example inventories and group vars
  - Helper scripts and one-liners
  - Diagrams and troubleshooting appendices
Step 1 – Preparing the environment
The runbook starts with the basics:
- Choosing and preparing the OS for all nodes
- Setting up hostnames, IPs, and SSH access
- Ensuring consistent time sync and basic hardening at the OS level
- Making sure the deploy node can reach all cluster nodes via SSH
The goal is to have a clean, predictable base before Kubespray comes into play.
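As a rough illustration, node preparation on a RHEL-family system might look like the sketch below. The hostnames, IPs, and SSH user are placeholders, not the values from the runbook:

```bash
# Run on each node (RHEL-family assumed; adjust for your distribution)
hostnamectl set-hostname k8s-cp-01        # each node gets its own name
timedatectl set-timezone UTC
systemctl enable --now chronyd            # consistent time sync across all nodes
chronyc sources                           # verify the configured time sources are reachable

# Run on the deploy node: create and distribute the SSH key Ansible will use
ssh-keygen -t ed25519 -f ~/.ssh/kubespray -N ""
for host in 10.0.0.11 10.0.0.12 10.0.0.21; do   # example node IPs
  ssh-copy-id -i ~/.ssh/kubespray.pub ansible@"$host"
done
```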
Step 2 – Building offline OS repositories and registry
Because the cluster is air-gapped, public repos and registries are not available directly.
The project walks through:
- Mirroring OS repositories (e.g. BaseOS/AppStream/EPEL equivalents) into an internal repo
- Setting up a private container registry and pushing required images into it
- Wiring Kubespray and the nodes to use these internal sources instead of the Internet
This is the foundation that allows you to fully install and operate Kubernetes without external network dependency.
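To give a flavour of what that wiring looks like, here is a hedged sketch: a repo mirror built with `dnf reposync` on a connected staging host, a `registry:2` container inside the air gap, and Kubespray's offline variables pointed at the internal endpoints. The hostnames, ports, and paths are placeholders, and the variable names should be checked against the `offline.yml` sample shipped with your Kubespray version:

```bash
# On an Internet-connected staging host: mirror the OS repos for transfer
dnf reposync --repoid=baseos --repoid=appstream --download-metadata -p /srv/mirror/

# Inside the air gap: run a private registry
# (the registry:2 image itself must already be imported from transfer media)
docker run -d --restart=always -p 5000:5000 --name registry registry:2

# Point Kubespray at the internal sources instead of the Internet
cat > inventory/mycluster/group_vars/all/offline.yml <<'EOF'
registry_host: "registry.internal.example:5000"
files_repo: "http://repo.internal.example:8080/files"
kube_image_repo: "{{ registry_host }}"
gcr_image_repo: "{{ registry_host }}"
docker_image_repo: "{{ registry_host }}"
quay_image_repo: "{{ registry_host }}"
EOF
```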
Step 3 – Designing the Kubespray inventory
Kubespray uses an Ansible inventory plus group vars to describe the cluster.
In the repo you’ll find examples showing:
- Separation of control plane and worker nodes
- How to describe the HA layout in the inventory
- How group vars are tuned for:
  - The chosen OS
  - The air-gapped environment
  - The selected CNI and other components
The idea is to move away from one-off kubeadm experiments and into a declarative, repeatable cluster definition.
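As a minimal illustration (not the repo's actual inventory), an HA layout with three combined control-plane/etcd nodes and two workers could be declared roughly like this. Group names assume a recent Kubespray release, and all names and addresses are placeholders:

```bash
cat > inventory/mycluster/hosts.yaml <<'EOF'
all:
  hosts:
    cp-01: { ansible_host: 10.0.0.11, ip: 10.0.0.11 }
    cp-02: { ansible_host: 10.0.0.12, ip: 10.0.0.12 }
    cp-03: { ansible_host: 10.0.0.13, ip: 10.0.0.13 }
    worker-01: { ansible_host: 10.0.0.21, ip: 10.0.0.21 }
    worker-02: { ansible_host: 10.0.0.22, ip: 10.0.0.22 }
  children:
    kube_control_plane:
      hosts:
        cp-01:
        cp-02:
        cp-03:
    etcd:
      hosts:
        cp-01:
        cp-02:
        cp-03:
    kube_node:
      hosts:
        worker-01:
        worker-02:
    k8s_cluster:
      children:
        kube_control_plane:
        kube_node:
EOF
```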
Step 4 – Running Kubespray and bringing the cluster up
With the inventory designed and the registry and repos in place, Kubespray is used to:
- Bootstrap etcd and the control plane
- Configure the worker nodes
- Install networking, DNS, and core add-ons
The runbook includes:
- The exact commands to run from the deploy node
- Common pitfalls and validation checks (e.g. verifying nodes, pods, and core services)
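For orientation, the invocation from the deploy node looks roughly like this; the inventory path and SSH key are placeholders, and the runbook has the exact commands for this setup:

```bash
cd kubespray

# Sanity check: can Ansible reach every node over SSH?
ansible -i inventory/mycluster/hosts.yaml all -m ping --private-key ~/.ssh/kubespray

# Deploy the cluster
ansible-playbook -i inventory/mycluster/hosts.yaml \
  --become --become-user=root \
  --private-key ~/.ssh/kubespray \
  cluster.yml
```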
Step 5 – Hardening and validation
After the cluster is up, the project focuses on:
- Hardening steps that make sense in this environment
- Verifying:
  - Node status and core components
  - DNS and basic networking
  - Access to the private registry from inside the cluster
There is also a troubleshooting section with diagrams to help understand where things can break and how to debug them.
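As an example of what that validation can look like in practice (the registry hostname and image path below are placeholders for whatever lives in your private registry):

```bash
# Nodes and core components
kubectl get nodes -o wide
kubectl get pods -A | grep -vE 'Running|Completed'    # anything listed here needs a look

# Cluster DNS from inside a pod
kubectl run dns-test --rm -it --restart=Never \
  --image=registry.internal.example:5000/library/busybox:1.36 \
  -- nslookup kubernetes.default.svc.cluster.local

# Image pulls from the private registry
kubectl run pull-test --rm -it --restart=Never \
  --image=registry.internal.example:5000/library/busybox:1.36 -- true
```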
Why I built this and what it demonstrates
From a skills point of view, this project showcases:
- Kubernetes cluster provisioning with Kubespray
- Designing for high availability
- Working in air-gapped / restricted environments
- Building internal package repositories and container registries
- Applying basic hardening and doing structured validation
- Writing a runbook that other engineers can follow
If you’re interested in the full details, you can explore the GitHub repo here:
👉 https://github.com/A-Soltani255/Kubespray
Closing thoughts
My goal with this work was to go beyond “local single-node clusters” and document something closer to a realistic production-style setup, including all the operational pieces around it.
If you work with Kubernetes in constrained environments, or you’re moving towards platform engineering / SRE roles, projects like this are a great way to practise end-to-end thinking: from OS and networking, to provisioning, to security and operations.