DEV Community

Python-T Point

Posted on • Originally published at pythontpoint.in

☁️ GKE private cluster setup — common mistakes and how to avoid them

Private clusters are not inherently valuable; they are effective only when used to reduce attack surface. For teams running production workloads in Google Kubernetes Engine (GKE), leaving worker nodes exposed to the public internet increases blast radius during incidents. A GKE private cluster setup is not a compliance checkbox; it is a structural control that isolates nodes, restricts control plane access, and limits lateral movement. This guide covers how to deploy a GKE cluster with private nodes and master authorized networks, including the underlying networking model, required configurations, and key failure modes.

📑 Table of Contents

  • 🔐 VPC & Subnet — Build the Foundation
  • 🧱 GKE Cluster — Configure Private Nodes
  • 🔧 Node Boot Process — What Happens Under the Hood
  • ⚠️ Common Pitfall — No Internet Egress
  • 🔐 Master Authorization — Control Access
  • 🔍 Access Flow — How kubectl Reaches the Master
  • 🚨 Emergency Access — Don’t Lock Yourself Out
  • ✅ Verification — Confirm the Setup
  • 🔍 Network Flow — Packet-Level View
  • 🟩 Final Thoughts
  • ❓ Frequently Asked Questions
  • Can I enable private nodes on an existing cluster?
  • What happens if I lose access to all authorized networks?
  • Do private clusters cost more?
  • 📚 References & Further Reading

🔐 VPC & Subnet — Build the Foundation

A GKE private cluster depends on correct VPC (Virtual Private Cloud) and subnet configuration; errors here prevent nodes from booting or from reaching the control plane. The subnet must have Private Google Access enabled and enough IP space for node pools, plus secondary ranges for the pod and service CIDRs. GKE uses alias IP ranges to assign pod IPs directly from the subnet’s secondary ranges, avoiding NAT and preserving the source IP inside the VPC. Create a VPC and subnet with the required settings:

$ gcloud compute networks create gke-vpc \
    --subnet-mode=custom \
    --bgp-routing-mode=regional

$ gcloud compute networks subnets create gke-subnet \
    --network=gke-vpc \
    --region=us-central1 \
    --range=10.100.0.0/22 \
    --enable-private-ip-google-access \
    --secondary-range=pod-cidr=10.101.0.0/16,svc-cidr=10.102.0.0/20

Expected output:

Created [https://www.googleapis.com/compute/v1/projects/my-project/global/networks/gke-vpc].
Created [https://www.googleapis.com/compute/v1/projects/my-project/regions/us-central1/subnetworks/gke-subnet].

The --enable-private-ip-google-access flag allows VMs with internal IPs to reach Google APIs (e.g., gcr.io, Cloud Logging) without NAT. Omitting it blocks image pulls from Google-hosted registries unless another egress path such as Cloud NAT exists.
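
The address plan above can be sanity-checked before any resource is created. The following is a minimal Python sketch using only the standard library's ipaddress module. The range labels and helper names (overlapping_pairs, max_nodes_for_pod_range) are illustrative, and the /24-per-node figure is an assumption based on GKE's documented default of 110 pods per node, so adjust it if your cluster uses a different maximum.

```python
import ipaddress

# Ranges copied from the subnet command in this guide.
RANGES = {
    "nodes (primary)": "10.100.0.0/22",
    "pods (pod-cidr)": "10.101.0.0/16",
    "services (svc-cidr)": "10.102.0.0/20",
}


def overlapping_pairs(ranges):
    """Return the pairs of range names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    names = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]


def max_nodes_for_pod_range(pod_cidr, per_node_prefix=24):
    """Count how many per-node blocks fit into the pod range.

    Assumption: every node reserves one /24 out of the pod range.
    """
    pod_net = ipaddress.ip_network(pod_cidr)
    if per_node_prefix < pod_net.prefixlen:
        return 0
    return 2 ** (per_node_prefix - pod_net.prefixlen)


if __name__ == "__main__":
    for name, cidr in RANGES.items():
        print(f"{name}: {ipaddress.ip_network(cidr).num_addresses} addresses")
    print("overlapping pairs:", overlapping_pairs(RANGES))
    print("max nodes by pod range:", max_nodes_for_pod_range(RANGES["pods (pod-cidr)"]))
```

Under that assumption the /16 pod range supports 256 nodes, while the /22 primary range holds 1,024 addresses, so the pod range is the tighter bound in this plan.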


🧱 GKE Cluster — Configure Private Nodes

A private node has no external IP and communicates only via internal VPC routes. Without outbound egress configured, nodes cannot reach the internet, and they reach Google APIs only through Private Google Access. Use --enable-private-nodes to assign only internal IPs to nodes. This requires VPC-native networking (--enable-ip-alias) and mapping of secondary ranges for pods and services. Deploy the cluster:

$ gcloud container clusters create private-cluster \
    --zone=us-central1-a \
    --network=gke-vpc \
    --subnetwork=gke-subnet \
    --enable-private-nodes \
    --master-ipv4-cidr=172.16.0.0/28 \
    --enable-ip-alias \
    --enable-private-endpoint \
    --services-secondary-range-name=svc-cidr \
    --cluster-secondary-range-name=pod-cidr \
    --enable-master-authorized-networks \
    --release-channel=regular

Output:

Creating cluster private-cluster...done.
Created [https://container.googleapis.com/v1/projects/my-project/zones/us-central1-a/clusters/private-cluster].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-a/private-cluster?project=my-project

Key flags:

  • --enable-private-nodes: Worker nodes receive only internal IPs.
  • --enable-ip-alias: Enables VPC-native networking using alias IPs.
  • --services-secondary-range-name, --cluster-secondary-range-name: Bind secondary ranges to services and pods.
  • --master-ipv4-cidr: Reserves a /28 block (172.16.0.0/28) for the internal control plane endpoint.
  • --enable-private-endpoint: Disables the public control plane endpoint.
  • --enable-master-authorized-networks: Restricts API access to defined CIDR blocks. Because --enable-private-endpoint removes the public endpoint and the command above lists no extra CIDRs, clients outside the cluster’s own subnet cannot reach the API server until their ranges are added with --master-authorized-networks.
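
The dependencies between these flags can be checked before running the command. The sketch below encodes only the rules stated in this guide (private nodes need IP aliasing, a private endpoint needs authorized networks, the control plane block is a /28). It is not a copy of gcloud's own validation, and the function name and the dictionary shape are hypothetical.

```python
import ipaddress


def check_private_cluster_flags(flags):
    """Return a list of problems; an empty list means no known conflict.

    `flags` maps flag names without leading dashes to their values.
    Boolean flags map to True. Only rules from this guide are encoded.
    """
    problems = []
    if flags.get("enable-private-nodes") and not flags.get("enable-ip-alias"):
        problems.append("--enable-private-nodes needs --enable-ip-alias")
    if flags.get("enable-private-endpoint") and not flags.get(
        "enable-master-authorized-networks"
    ):
        problems.append(
            "--enable-private-endpoint needs --enable-master-authorized-networks"
        )
    master = flags.get("master-ipv4-cidr")
    if master is not None:
        try:
            network = ipaddress.ip_network(master)
        except ValueError:
            problems.append(f"--master-ipv4-cidr {master!r} is not a valid network")
        else:
            if network.prefixlen != 28:
                problems.append("--master-ipv4-cidr must be a /28")
    return problems


# The flag set used by the create command in this guide.
GUIDE_FLAGS = {
    "enable-private-nodes": True,
    "enable-ip-alias": True,
    "enable-private-endpoint": True,
    "enable-master-authorized-networks": True,
    "master-ipv4-cidr": "172.16.0.0/28",
}

if __name__ == "__main__":
    print(check_private_cluster_flags(GUIDE_FLAGS) or "no conflicts found")
```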

Private clusters don’t just hide nodes — they enforce zero-trust access at the network layer.

🔧 Node Boot Process — What Happens Under the Hood

During boot, a private node:

1. Acquires an internal IP from the primary subnet (10.100.0.0/22).

2. Resolves internal GKE endpoints via metadata-provided DNS (169.254.169.254).

3. Authenticates using the attached IAM service account.

4. Fetches configuration and connects to the master via the private endpoint.

No public IP, no inbound SSH, and no egress unless explicitly configured.

⚠️ Common Pitfall — No Internet Egress

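Private nodes have no route to the public internet. Google-hosted registries such as gcr.io and us-docker.pkg.dev remain reachable through Private Google Access, but images on third-party registries (Docker Hub, for example) and any other internet egress require Cloud NAT. Provision Cloud NAT: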

$ gcloud compute routers create nat-router \
    --network=gke-vpc \
    --region=us-central1

$ gcloud compute routers nats create nat-config \
    --router=nat-router \
    --auto-allocate-nat-external-ips \
    --nat-custom-subnet-ip-ranges=gke-subnet \
    --region=us-central1

After creation, nodes can reach Google APIs and public registries via NAT.
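
To judge whether a workload depends on Cloud NAT, it helps to look at where its images live. The sketch below classifies an image reference by registry host. The suffix list is an illustrative assumption (Google-hosted registries reachable through Private Google Access), not an exhaustive list, and the function names are hypothetical.

```python
# Assumption: registries under these suffixes are Google-hosted and
# reachable through Private Google Access. Extend to match your setup.
GOOGLE_HOSTED_SUFFIXES = ("gcr.io", "pkg.dev")


def registry_host(image):
    """Extract the registry host from an image reference.

    Docker convention: the first path component names a registry only if
    it contains a dot or a colon, or equals "localhost". Otherwise the
    image is assumed to come from Docker Hub.
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first.split(":")[0]
    return "docker.io"


def egress_path(image):
    """Return which egress path a pull of `image` would rely on."""
    host = registry_host(image)
    for suffix in GOOGLE_HOSTED_SUFFIXES:
        if host == suffix or host.endswith("." + suffix):
            return "private-google-access"
    return "cloud-nat"


if __name__ == "__main__":
    for image in (
        "us-docker.pkg.dev/my-project/repo/app:1.0",
        "gcr.io/my-project/app",
        "nginx:1.25",
        "quay.io/org/tool",
    ):
        print(f"{image} -> {egress_path(image)}")
```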


🔐 Master Authorization — Control Access

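A private endpoint alone isn’t sufficient: it removes the public address but does not by itself decide which internal hosts with a route to the endpoint may connect. Use --enable-master-authorized-networks to restrict access to specific networks. The feature enforces IP-based allowlists for control plane connectivity. CIDRs can be public or private, but only listed ranges are permitted, and with --enable-private-endpoint set a client must still arrive over an internal path (the VPC, Cloud VPN, or Interconnect). Allow the office IP and the bastion host: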

$ gcloud container clusters update private-cluster \
    --zone=us-central1-a \
    --enable-master-authorized-networks \
    --master-authorized-networks=203.0.113.10/32,10.1.0.5/32

Output:

Updating cluster private-cluster...done.
Updated [https://container.googleapis.com/v1/projects/my-project/zones/us-central1-a/clusters/private-cluster].

Only systems at 203.0.113.10 or 10.1.0.5 may connect to the control plane.

🔍 Access Flow — How kubectl Reaches the Master

When kubectl runs:

1. gcloud container clusters get-credentials retrieves the private endpoint IP, an address inside the --master-ipv4-cidr range (172.16.0.0/28).

2. The client routes to that address directly if it is on the VPC, or through Cloud VPN / Interconnect otherwise.

3. The request reaches the control plane only if the source IP matches a CIDR in master-authorized-networks.

4. Authentication proceeds via OAuth token from gcloud auth.

No public load balancer, no DNS exposure: the API server is unreachable from unapproved networks.

🚨 Emergency Access — Don’t Lock Yourself Out

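It’s possible to exclude every valid source IP. Always keep a fallback path such as a bastion host inside the VPC. Cloud Shell can serve as a fallback only while the public endpoint is enabled, because it connects from an external address. Note that --master-authorized-networks replaces the whole list, so repeat every CIDR you want to keep. To temporarily allow an extra range:

$ gcloud container clusters update private-cluster \
    --zone=us-central1-a \
    --enable-master-authorized-networks \
    --master-authorized-networks=203.0.113.10/32,10.1.0.5/32,35.235.240.0/20

This guide uses 35.235.240.0/20 as the Cloud Shell range, but verify it against your session’s actual external IP: Cloud Shell addresses are ephemeral, and Google documents 35.235.240.0/20 as the Identity-Aware Proxy TCP forwarding range. Remove the temporary range after recovery.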
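
Because gcloud container clusters update sets the authorized list to exactly the CIDRs passed in --master-authorized-networks, a temporary entry is easiest to manage by rebuilding the full value each time. The helper below is a small sketch of that bookkeeping. Its name and the sample address 198.51.100.7 are hypothetical, and it only builds the string, it does not call gcloud.

```python
import ipaddress


def authorized_networks_arg(current, add=(), remove=()):
    """Build the comma-separated value for --master-authorized-networks.

    Entries are normalised to CIDR notation, duplicates are dropped, and
    the original order is preserved.
    """
    def normalize(cidr):
        # strict=False lets a bare address such as "198.51.100.7" become a /32.
        return str(ipaddress.ip_network(cidr, strict=False))

    dropped = {normalize(c) for c in remove}
    result = []
    for cidr in list(current) + list(add):
        network = normalize(cidr)
        if network not in dropped and network not in result:
            result.append(network)
    return ",".join(result)


if __name__ == "__main__":
    current = ["203.0.113.10/32", "10.1.0.5/32"]
    # Temporarily add a recovery address (hypothetical).
    during = authorized_networks_arg(current, add=["198.51.100.7"])
    print("during recovery:", during)
    # Remove it again once access is restored.
    after = authorized_networks_arg(during.split(","), remove=["198.51.100.7/32"])
    print("after recovery: ", after)
```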


✅ Verification — Confirm the Setup

Validate every component after deployment. Check cluster configuration:

$ gcloud container clusters describe private-cluster --zone=us-central1-a

Relevant output:

privateClusterConfig:
  enablePrivateEndpoint: true
  enablePrivateNodes: true
  masterIpv4CidrBlock: 172.16.0.0/28
masterAuthorizedNetworksConfig:
  cidrBlocks:
  - cidrBlock: 203.0.113.10/32
    displayName: office
  - cidrBlock: 10.1.0.5/32
    displayName: bastion
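
The same fields can be checked by a script instead of by eye. The sketch below assumes the describe command was run with gcloud's global --format=json flag and that the result has been loaded into a dictionary. The embedded sample mirrors the output shown above, and the function name is hypothetical.

```python
import json

# Sample mirroring the relevant fields from the describe output above.
SAMPLE_DESCRIBE_JSON = """
{
  "privateClusterConfig": {
    "enablePrivateEndpoint": true,
    "enablePrivateNodes": true,
    "masterIpv4CidrBlock": "172.16.0.0/28"
  },
  "masterAuthorizedNetworksConfig": {
    "cidrBlocks": [
      {"cidrBlock": "203.0.113.10/32", "displayName": "office"},
      {"cidrBlock": "10.1.0.5/32", "displayName": "bastion"}
    ]
  }
}
"""


def private_config_problems(cluster):
    """Return a list of deviations from the setup described in this guide."""
    private = cluster.get("privateClusterConfig", {})
    authorized = cluster.get("masterAuthorizedNetworksConfig", {})
    problems = []
    if not private.get("enablePrivateNodes"):
        problems.append("nodes are not private")
    if not private.get("enablePrivateEndpoint"):
        problems.append("public endpoint is still enabled")
    if not authorized.get("cidrBlocks"):
        problems.append("no authorized networks listed")
    return problems


if __name__ == "__main__":
    cluster = json.loads(SAMPLE_DESCRIBE_JSON)
    print(private_config_problems(cluster) or "configuration matches the guide")
```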

Verify node IPs:

$ gcloud compute instances list --filter="name~gke-private-cluster"

Output:

NAME                                     ZONE           MACHINE_TYPE  PREEMPTIBLE  INTERNAL_IP  EXTERNAL_IP  STATUS
gke-private-cluster-default-pool-abc123  us-central1-a  e2-medium                  10.100.0.2                RUNNING

No EXTERNAL_IP confirms private node configuration. Test control plane access:

$ kubectl get nodes

Expected:

NAME STATUS ROLES AGE VERSION
gke-private-cluster-default-pool-abc123 Ready <none> 5m v1.27.3-gke.100

On failure, verify:

  • Your IP is in master-authorized-networks
  • VPC routes allow return traffic
  • Firewall rules permit port 443 to 172.16.0.0/28
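
The routing and firewall items in this checklist can be narrowed down with a plain TCP probe from the client machine. This is a minimal sketch using the standard socket module. The function name is hypothetical, and the host you pass should be the private endpoint address reported for your cluster. A failed connection points at routing, firewall, or allowlist problems, while a successful one says nothing about whether authentication will pass.

```python
import socket


def tcp_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    import sys

    # Usage: python probe.py <private-endpoint-ip>
    if len(sys.argv) == 2:
        target = sys.argv[1]
        state = "reachable" if tcp_reachable(target) else "NOT reachable"
        print(f"{target}:443 is {state}")
```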

🔍 Network Flow — Packet-Level View

A kubectl request traverses:

1. From client to gateway.

2. Into Google’s network via Cloud VPN tunnel (if applicable).

3. Routed to control plane at 172.16.0.0/28.

4. Evaluated by master:

  • Source IP in masterAuthorizedNetworksConfig? → Yes → Proceed.
  • Bearer token valid? → Yes → Return response.

No public internet involvement. No DNS leakage. All traffic is contained.
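
The two checks in step 4 can be modelled as an ordered pair of gates, which makes it easier to reason about which one a failing request hit. This is a simplified sketch for troubleshooting, not a description of how the control plane is implemented. The function name and the returned strings are made up for illustration.

```python
import ipaddress


def evaluate_request(source_ip, token_valid, authorized_cidrs):
    """Model the network gate first, then the authentication gate."""
    address = ipaddress.ip_address(source_ip)
    in_allowlist = any(
        address in ipaddress.ip_network(cidr) for cidr in authorized_cidrs
    )
    if not in_allowlist:
        return "blocked: source IP not in authorized networks"
    if not token_valid:
        return "rejected: bearer token invalid"
    return "allowed"


if __name__ == "__main__":
    allowlist = ["203.0.113.10/32", "10.1.0.5/32"]  # values from this guide
    print(evaluate_request("10.1.0.5", True, allowlist))
    print(evaluate_request("10.1.0.5", False, allowlist))
    print(evaluate_request("10.1.0.6", True, allowlist))
```

In this model a valid token never helps a client outside the allowlist, which matches the order of the checks listed above.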


🟩 Final Thoughts

A GKE private cluster setup is not optional for production: it removes public attack vectors from nodes, limits control plane exposure, and enforces network-layer access control. The operational overhead is low, but the reduction in exposure is significant. This configuration prevents direct node access from the internet and blocks API calls from networks outside the allowlist. A compromised pod still sits inside the cluster network, so the control narrows rather than eliminates what an attacker can reach. It integrates with CI/CD, policy engines, and observability stacks, provided those systems connect from an authorized network. For production workloads, private clusters should be the default, not the exception.

❓ Frequently Asked Questions

Can I enable private nodes on an existing cluster?

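Not at the time this guide was written: a public-node cluster could not be converted to private nodes after creation, so the cluster had to be recreated with --enable-private-nodes. Network isolation options for existing clusters have changed across GKE releases, so confirm against the current documentation before rebuilding. You can, however, enable master authorized networks on an existing cluster using gcloud container clusters update.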

What happens if I lose access to all authorized networks?

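You’ll be locked out of the Kubernetes API, so kubectl stops working. The GKE management API is separate: gcloud container clusters update and the Cloud Console go through container.googleapis.com, so a principal with sufficient IAM permissions can still edit the authorized networks list or temporarily re-enable public access. Always maintain at least one fallback access path, like a bastion host inside the VPC.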

Do private clusters cost more?

Not directly. GKE pricing is based on node count and usage. However, you may incur additional costs from Cloud NAT or Cloud Interconnect if you need egress or on-prem connectivity.

📚 References & Further Reading

  • Official GKE private cluster guide — complete reference for IP ranges, flags, and networking: cloud.google.com
  • VPC networking for GKE — deep dive into alias IPs and secondary ranges: cloud.google.com
  • Master authorized networks configuration — how to manage CIDR whitelists: cloud.google.com
