<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Michael Mekuleyi</title>
    <description>The latest articles on DEV Community by Michael Mekuleyi (@monarene).</description>
    <link>https://dev.to/monarene</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F317218%2Fa3b7386b-ce1d-4083-b101-9064ec52552a.jpg</url>
      <title>DEV Community: Michael Mekuleyi</title>
      <link>https://dev.to/monarene</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/monarene"/>
    <language>en</language>
    <item>
      <title>Designing Alerts That Matters using Amazon CloudWatch</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Wed, 15 Apr 2026 16:56:40 +0000</pubDate>
      <link>https://dev.to/monarene/designing-alerts-that-matters-using-amazon-cloudwatch-48fh</link>
      <guid>https://dev.to/monarene/designing-alerts-that-matters-using-amazon-cloudwatch-48fh</guid>
      <description>&lt;h2&gt;
  
  
  The Alert Fatigue Problem
&lt;/h2&gt;

&lt;p&gt;Cloud systems today generate a huge amount of data. Every time a Lambda function runs, an RDS query happens, or an API Gateway is called, it creates information. When teams are just starting out with the cloud, it’s easy to want to set up alerts for everything. If a metric is available, someone usually wants to be notified about it. But this often leads to so many alerts that it ends up overwhelming the team, making it hard to focus on what really matters. Instead of helping, it steals the engineers’ attention.&lt;/p&gt;

&lt;p&gt;Alert fatigue is not a soft problem. It is a direct cause of production incidents being missed or escalated too slowly. When an on-call engineer receives 200 notifications on a quiet night, and 50 of them fire routinely without action, the signal-to-noise ratio collapses. The 201st notification — the one that actually matters — gets lost.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wph68ys8l2iewagnsi5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wph68ys8l2iewagnsi5.png" alt="Industry Average vs On-call average" width="800" height="184"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The goal of this guide is to reframe how you think about alerting. Rather than asking "what should we alarm on?", start with "what conditions require immediate human intervention?" Everything else can wait for a dashboard review, a weekly metric review, or be surfaced as a log insight.&lt;/p&gt;

&lt;h2&gt;
  
  
  CloudWatch Alarms — Fundamentals
&lt;/h2&gt;

&lt;p&gt;Every CloudWatch Alarm is made up of three parts: a Metric, which is the data you’re keeping an eye on; a Condition, which is the rule or threshold that triggers the alarm (this can be a fixed number or based on unusual behavior); and an Action, which is what happens when the alarm changes state—like sending a notification or starting an auto-scaling event. Think of alarms like state machines—you can set them to respond whenever their status changes, not just when something crosses a set limit.&lt;/p&gt;
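&lt;p&gt;As a rough sketch, the three parts map directly onto the parameters of a boto3 put_metric_alarm call. The alarm name, SNS topic ARN, and threshold values below are invented for illustration:&lt;/p&gt;

```python
# The three parts of a CloudWatch Alarm, expressed as put_metric_alarm
# parameters. Names, ARNs, and numbers are hypothetical examples.
alarm = {
    # Metric: the data you are watching
    "Namespace": "AWS/Lambda",
    "MetricName": "Errors",
    "Statistic": "Sum",
    "Period": 60,
    # Condition: the rule that moves the alarm between states
    "Threshold": 5,
    "ComparisonOperator": "GreaterThanThreshold",
    "EvaluationPeriods": 5,
    "DatapointsToAlarm": 3,              # M-of-N evaluation
    "TreatMissingData": "notBreaching",
    # Action: what happens when the alarm changes state
    "AlarmName": "lambda-errors-high",                              # hypothetical
    "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:oncall"],  # hypothetical
    "OKActions": ["arn:aws:sns:us-east-1:123456789012:oncall"],
}
# With boto3 installed and credentials configured, this would become:
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```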

&lt;h2&gt;
  
  
  Alarm States Explained
&lt;/h2&gt;

&lt;p&gt;It is important to understand the three alarm states (OK, ALARM, and INSUFFICIENT_DATA), because alarms whose state machine is not understood behave in unexpected ways, especially during deployments or when data goes missing. Knowing how an alarm transitions between these states avoids surprises and keeps monitoring predictable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1o7wbq7f0rn7om857p77.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1o7wbq7f0rn7om857p77.png" alt="Alarm States" width="800" height="147"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the Right Metrics
&lt;/h2&gt;

&lt;p&gt;The Four Golden Signals (latency, traffic, errors, and saturation):&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fcf1ac8l5myow1nkvi6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4fcf1ac8l5myow1nkvi6.png" alt="Golden Signals" width="800" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Avoid Pure Resource Utilization Alarms&lt;/strong&gt;: Having your CPU at 90% isn’t a problem on its own. It only becomes an issue if it’s happening alongside something users notice, like slow response times. To handle this better, you can use Composite Alarms to make sure alerts only go off when multiple signals show there’s a real problem.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Custom Metrics via EMF&lt;/strong&gt;: You can use the Embedded Metric Format to emit detailed application data from your Lambda functions as structured JSON log lines. CloudWatch extracts metrics from these lines automatically, without any extra cost for PutMetricData API calls. Next, let us review the strategies that keep alerting from becoming a nightmare. &lt;/p&gt;
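&lt;p&gt;A minimal sketch of what one EMF log line looks like; the namespace, dimension, and metric names here are hypothetical:&lt;/p&gt;

```python
import json, time

def emf_record(namespace, service, metric_name, value, unit):
    """Build one Embedded Metric Format log line (hypothetical helper)."""
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),   # epoch milliseconds
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["Service"]],
                "Metrics": [{"Name": metric_name, "Unit": unit}],
            }],
        },
        "Service": service,      # the dimension value
        metric_name: value,      # the metric value itself
    })

# In a Lambda handler you would simply print() this line;
# CloudWatch Logs turns it into a metric with no extra API calls.
print(emf_record("Checkout", "payments", "CheckoutLatency", 212, "Milliseconds"))
```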

&lt;h2&gt;
  
  
  Thresholds &amp;amp; Evaluation Periods
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The M-of-N Pattern&lt;/strong&gt;: An alarm should go off only when several of the recent data points (M out of N) cross the limit. Don’t trigger an alarm just because 1 out of 1 data point did, unless it’s a clear failure, like having zero healthy hosts.&lt;/p&gt;
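&lt;p&gt;A small illustration of why M-of-N suppresses noise, using made-up datapoints:&lt;/p&gt;

```python
def breaches(datapoints, threshold):
    """Count how many datapoints cross the threshold."""
    return sum(1 for d in datapoints if d > threshold)

window = [40, 97, 42, 41, 43]           # one transient spike in five periods

# 1-of-1 style: any single breach pages someone
print(breaches(window, 90) >= 1)        # True  -> noisy page

# 3-of-5: the breach must be sustained before anyone is woken up
print(breaches(window, 90) >= 3)        # False -> no page
```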

&lt;p&gt;&lt;strong&gt;Set Thresholds from Baselines&lt;/strong&gt;: Observe 2–4 weeks of normal operation, then set thresholds at a meaningful distance from your p95/p99. Avoid round-number intuition like “80% feels high.”&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foclfja4rej43s9pt6ni5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foclfja4rej43s9pt6ni5.png" alt=" CloudWatch Alarm — CloudFormation YAML " width="800" height="349"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Composite Alarms
&lt;/h2&gt;

&lt;p&gt;Composite Alarms combine several alarms using AND, OR, or NOT logic, so the main alarm only triggers when a specific combination of conditions happens. This way, you avoid unnecessary alerts and focus only on real issues. It’s a powerful way to reduce false alarms and make monitoring more accurate.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjbe4eto43w5z5kuz50h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsjbe4eto43w5z5kuz50h.png" alt=" Composite Alarms " width="800" height="268"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Anomaly Detection Alarms
&lt;/h2&gt;

&lt;p&gt;For metrics that naturally vary over time, such as traffic that is higher on weekdays and lower at 3 a.m., fixed thresholds either fire too often during normal peaks or miss real issues. &lt;/p&gt;

&lt;p&gt;CloudWatch Anomaly Detection utilizes machine learning to recognize these patterns, including time of day and day of week, and alerts you only when the metric exceeds the expected range. It requires at least 14 days of data to create an effective model, so start with static thresholds on new services and then switch to anomaly detection once sufficient data is collected.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6erw27srnbrycivp1bty.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6erw27srnbrycivp1bty.png" alt="Anomaly Detection Alarms Using AWS CLI" width="800" height="268"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Best Practices Checklist
&lt;/h2&gt;

&lt;p&gt;Use this checklist when reviewing any CloudWatch Alarm — whether newly created or inherited. Every alarm that cannot satisfy these criteria is a candidate for deletion or rework.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[x] Alarm is actionable — the engineer knows what to do immediately&lt;/li&gt;
&lt;li&gt;[x] Metric correlates directly with user-facing impact&lt;/li&gt;
&lt;li&gt;[x] Threshold set from observed baseline data, not intuition&lt;/li&gt;
&lt;li&gt;[x] M-of-N evaluation configured (minimum 3 of 5 for most metrics)&lt;/li&gt;
&lt;li&gt;[x] TreatMissingData is explicitly configured&lt;/li&gt;
&lt;li&gt;[x] OKAction defined — team gets an automated all-clear&lt;/li&gt;
&lt;li&gt;[x] Correct priority tier / SNS topic assigned&lt;/li&gt;
&lt;li&gt;[x] Runbook URL in alarm description&lt;/li&gt;
&lt;li&gt;[x] Defined in Terraform or CloudFormation — not the console&lt;/li&gt;
&lt;li&gt;[x] Reviewed and tested in the last 90 days&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;Designing alerts that matter takes focus and discipline. CloudWatch offers many powerful features, such as anomaly detection, composite alarms, and metric math, but the goal is not to use everything everywhere. Instead, pick the few alarms that give your team clear, actionable information without the noise.&lt;/p&gt;

&lt;p&gt;Start with the Four Golden Signals. Use composite alarms to reduce false alarms. Use anomaly detection for metrics that have patterns or seasonal changes. Make sure alerts are prioritized right and include runbooks in the alarm descriptions. Define alerts as code so they’re easy to manage. And regularly review your alarms to remove anything outdated—old alarms can cause more harm than good.&lt;/p&gt;

&lt;p&gt;A well-designed alert means your on-call engineer gets a clear alert at 2 AM, knows exactly what’s wrong, where to check, and who to call. That’s what a good alerting system looks like.&lt;/p&gt;

</description>
      <category>aws</category>
      <category>sre</category>
      <category>devops</category>
      <category>ai</category>
    </item>
    <item>
      <title>How to pass the CKA Exam on the first try [GUARANTEED]</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Mon, 26 Jan 2026 19:02:59 +0000</pubDate>
      <link>https://dev.to/monarene/how-to-pass-the-cka-exam-on-the-first-try-guaranteed-44jf</link>
      <guid>https://dev.to/monarene/how-to-pass-the-cka-exam-on-the-first-try-guaranteed-44jf</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Before I took the Certified Kubernetes Administrator (CKA) Exam, I read every article there was about taking the exam, and yet I failed it. I knew every shortcut and every command, and yet I scored 5 marks below the passing score. In this article, I share the quiet but key things that helped me succeed on the second attempt, which I wish I had known the first time. This article is not about what to study; it is about how to study and, above all, how to pass the exam. If you read this article at least twice and diligently note everything in it, I guarantee you will pass the exam. &lt;/p&gt;

&lt;h2&gt;
  
  
  Knowledge is more important than Speed
&lt;/h2&gt;

&lt;p&gt;Every article you read about the CKA exam says you have to be fast, you have to know shortcuts, you have to practise speed, blah blah blah. I insist that, first of all, you have to know what you are doing. Speed in the wrong direction will hurt you more than it would help. &lt;/p&gt;

&lt;p&gt;The CKA Exam is more about breadth of knowledge than depth; almost every single topic listed in the curriculum is tested. Also, if you are pretty fast at solving practice questions that you already have answers to, you are not really improving. Write out all the topics you are weak in or unsure about, and review questions on these topics multiple times. Remember not to practise only your strong areas; prioritise your weak areas too. &lt;/p&gt;

&lt;p&gt;Use the CKA and CKAD practice exams and other questions online, generate AI questions if you have to, but practice everything in the curriculum. &lt;/p&gt;

&lt;h2&gt;
  
  
  Pay attention to details
&lt;/h2&gt;

&lt;p&gt;The most painful part about failing the CKA Exam was that most of the questions I failed were the really easy ones, and the questions I got right were the really difficult ones. Apart from me not mastering the basics during prep, I also did not pay attention to little details. When I started prepping the second time, I realised that I had come across most of the questions that I failed at least once or twice during prep, but I either did not acknowledge them, or I just quickly glossed over them. &lt;/p&gt;

&lt;p&gt;If you intend to pass the CKA Exam, ensure that you pay complete and undivided attention to the most minute details during practice and while studying. &lt;/p&gt;

&lt;h2&gt;
  
  
  Study for the CKAD Exam
&lt;/h2&gt;

&lt;p&gt;The CKA Exam is more like the CKAD++; most of the topics in the CKA Exam are also in the CKAD Exam. In fact, I found practice questions for the CKAD Exam more useful than those for the CKA Exam, I believe because CKA practice questions mostly focus on troubleshooting rather than the basics. &lt;/p&gt;

&lt;p&gt;If you practise only the CKAD material really well, you will most likely get close to at least 60% of the passing score. In fact, the Linux Foundation advises that you study for the CKAD Exam before taking the CKA Exam. &lt;/p&gt;

&lt;p&gt;P.S: You should take the CKAD Exam first; it's pretty straightforward and not difficult to pass. &lt;/p&gt;

&lt;h2&gt;
  
  
  Do not be persistent on a question
&lt;/h2&gt;

&lt;p&gt;If you fail a question, flag it and simply move on. You will probably get it on a second pass.&lt;/p&gt;

&lt;h2&gt;
  
  
  Be conversant with the testing environment
&lt;/h2&gt;

&lt;p&gt;A lot of people have failed the exam simply because they were not conversant with the exam environment. The testing environment runs in the PSI browser with access to a remote virtual machine. It is very important that you understand this environment, especially how to use search and find-in-page in Mozilla Firefox.&lt;/p&gt;

&lt;p&gt;Take time out to play with the testing environment on killer.sh - paying for the exam gives you access to the session. I exhausted the sessions on my first try and paid for some more. Despite that, I still lost the first two minutes when I logged into the exam, and it was zoomed in to the max.&lt;/p&gt;

&lt;p&gt;P.S: If you can use a monitor, it makes a world of difference.&lt;/p&gt;

&lt;h2&gt;
  
  
  Believe you can do it
&lt;/h2&gt;

&lt;p&gt;Mentality is a key part of the game; if you do not believe you can get the job done, you simply won't. Believe you can get the job done and work towards it. First of all, be confident in your abilities, do not make silly mistakes, read the question in detail, and pay attention to the remarks. Ensure that you check your solution works. &lt;/p&gt;

&lt;p&gt;Be calm, you are prepared.&lt;/p&gt;

&lt;p&gt;It's okay to shake a little; who doesn’t? What’s more important is that you are already a Kubernetes administrator even before you sit for this exam, so just go prove yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I deliberately left out specific materials because everyone knows KodeKloud is the best at this kind of stuff; their practice exams just give you the kind of practice you need to pass. &lt;/p&gt;

&lt;p&gt;If this article helps you to pass your exam, I am waiting to hear your success story. &lt;/p&gt;

&lt;p&gt;Please re-read this article until it sticks. Feel free to like, share and subscribe.&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>sre</category>
      <category>linux</category>
    </item>
    <item>
      <title>How to Score 93% in the Prometheus Certified Associate Exam</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Wed, 07 May 2025 10:40:35 +0000</pubDate>
      <link>https://dev.to/monarene/how-to-score-93-in-the-prometheus-certified-associate-exam-1g07</link>
      <guid>https://dev.to/monarene/how-to-score-93-in-the-prometheus-certified-associate-exam-1g07</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Passing technical certifications often feels daunting and intimidating. The topics feel endless, tabs pile up, and you may begin to wonder whether you really know anything or are prepared at all. But with the right approach, time, and resources, it switches from unnerving to both achievable and rewarding. Recently, I sat for the Prometheus Certified Associate exam and scored 93%, and the journey highlighted a few key principles worth sharing. This article is not a hack or a guaranteed route to that exact score; rather, it is a reflection on what helped, and on how anyone preparing for this exam (or any other) can do it with better clarity and less burnout. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkq9jy0q179abdqznuurr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkq9jy0q179abdqznuurr.jpg" alt="Passing the Prometheus Exam" width="800" height="362"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here’s a breakdown for anyone preparing for the exam and aiming for not just a pass, but true mastery. &lt;/p&gt;

&lt;h2&gt;
  
  
  Get to Know Prometheus
&lt;/h2&gt;

&lt;p&gt;Before you start breaking down the grand topic of  Prometheus and building your study plan, you want to focus on really understanding it. Why does Prometheus exist? How does it fit into the tech world? How does it interact with other tools? What problem does it solve?&lt;/p&gt;

&lt;p&gt;To build this base of knowledge for myself, I used the &lt;a href="https://learn.kodekloud.com/user/courses/prometheus-certified-associate-pca" rel="noopener noreferrer"&gt;Prometheus Certified Associate PCA Course on KodeKloud&lt;/a&gt;. It is structured and hands-on, making complex concepts easy to approach. This excellent course from KodeKloud not only taught concepts but also allowed for real-world lab practice. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwg3who3zg9tpjo8gwtc8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwg3who3zg9tpjo8gwtc8.png" alt="Prometheus - KodeKloud" width="800" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you’re preparing for this exam, do not rush past this stage. Spend enough time here to read and understand what Prometheus is about. &lt;/p&gt;

&lt;h2&gt;
  
  
  Build a Structured Study Plan
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5yc6ej8n7wlrbx23evw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn5yc6ej8n7wlrbx23evw.png" alt="Structured Study Plan" width="800" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Consistency beats intensity. Preparing for the exam took roughly 400 minutes per week, which I stretched over eight weeks. Rather than cramming close to exam day, breaking the learning journey into focused daily or weekly sessions is significantly more effective.&lt;br&gt;
Setting a target (for example, 90 minutes per day or a few dedicated hours across the week) keeps momentum steady and learning layered. This is much better than putting off all the reading until a week before the exam; cramming undermines your ability to identify and fill gaps in your knowledge. &lt;/p&gt;

&lt;h2&gt;
  
  
  Document Everything While Studying
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyujof5sozxqye3mj4qx.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsyujof5sozxqye3mj4qx.jpg" alt="Document Everything" width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note-taking isn’t just for school — it’s a superpower.&lt;br&gt;
Throughout my preparation, every topic, command, concept, and tricky detail was documented using Evernote. Having a single repository where everything lives — explanations, commands, tricky questions — builds confidence and makes revision faster. Don't just consume information; organise it.&lt;br&gt;
Create sections, notebooks, or mind maps that can be quickly skimmed before the exam.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prioritise Practice Tests
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3ps4pcjpllam50jien7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc3ps4pcjpllam50jien7.png" alt="Udemy Practice Course" width="800" height="512"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Theory creates familiarity, but practice cements understanding.&lt;br&gt;
Taking practice exams early and often will reveal weak areas that need reinforcement. The Udemy Prometheus Certified Associate Practice Exam I took in preparation mimicked the actual exam’s structure and timing well, offering a good measure of readiness. It included 4 exams of 60 minutes each. Approach practice exams with seriousness: simulate real conditions (no pausing, no peeking at notes) and time every session.&lt;br&gt;
Over time, your scores will improve, but more importantly, your confidence will be built.&lt;/p&gt;

&lt;h2&gt;
  
  
  Manage Time on Exam Day
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgwxfpf1pqxwz6iy51sk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhgwxfpf1pqxwz6iy51sk.png" alt="Time Management" width="299" height="168"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One often overlooked skill is time management.&lt;br&gt;
I completed the Prometheus exam in under 30 minutes — not because I rushed, but because I was familiar with the exam structure and the pace practiced during mock tests.&lt;br&gt;
Pacing is crucial:&lt;br&gt;
Read questions carefully but efficiently.&lt;br&gt;
Eliminate wrong options methodically.&lt;br&gt;
Trust first instincts when confident, and avoid second-guessing unless absolutely necessary.&lt;/p&gt;

&lt;p&gt;If you spend sufficient time preparing, be confident.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Success in technical certifications isn’t about cramming or getting lucky — it’s about building daily discipline, choosing excellent resources, documenting learning clearly, and practicing intentionally. Whether aiming for a Prometheus certification or any other professional milestone, these principles apply universally. And if you follow the steps above diligently, you will find that you are acing your exams and certifications with ease. &lt;br&gt;
Dedicate the time.&lt;/p&gt;

&lt;p&gt;Trust the process.&lt;/p&gt;

&lt;p&gt;And most importantly, enjoy the journey of becoming a little sharper, a little stronger, and a lot more confident.&lt;br&gt;
Thank you for reading. If you have found this helpful, like, share, and follow me for more helpful articles. You can also go through my page to check other articles I’ve written. I’m positive you will find at least one article helpful.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>sre</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Designing a fault-tolerant etcd cluster</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Wed, 07 May 2025 07:39:19 +0000</pubDate>
      <link>https://dev.to/monarene/designing-a-fault-tolerant-etcd-cluster-17d0</link>
      <guid>https://dev.to/monarene/designing-a-fault-tolerant-etcd-cluster-17d0</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this article, we discuss a strongly consistent, distributed key-value datastore used for shared configuration, service discovery, and scheduler coordination in Kubernetes: etcd (pronounced et-see-dee). This article is part of a series focused on understanding, mastering, and designing efficient etcd clusters. Here we cover the justification for using etcd, the leader election process, and the consensus algorithm etcd uses; in the following parts, we will follow up with a technical implementation of a highly available etcd cluster and with backing up an etcd database to guard against failures. This article requires a basic understanding of Kubernetes, algorithms, and system design. &lt;/p&gt;

&lt;h2&gt;
  
  
  etcd
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fscg24m4mip6xv4yyn9uz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fscg24m4mip6xv4yyn9uz.png" alt="Image description" width="385" height="489"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;etcd (&lt;a href="https://etcd.io/" rel="noopener noreferrer"&gt;https://etcd.io/&lt;/a&gt;) is an open-source, leader-based, distributed key-value datastore designed by a team of engineers at CoreOS in 2013 and donated to the Cloud Native Computing Foundation (CNCF) in 2018. Since then, etcd has been adopted as the datastore in major projects like Kubernetes, CoreDNS, and OpenStack. etcd is built to be simple, secure, reliable, and fast (benchmarked at 10,000 writes/sec); it is written in Go and uses the Raft consensus algorithm to manage a highly available replicated log. etcd is strongly consistent because it offers strict serializability (&lt;a href="https://jepsen.io/consistency/models/strict-serializable" rel="noopener noreferrer"&gt;https://jepsen.io/consistency/models/strict-serializable&lt;/a&gt;), meaning a consistent global ordering of events; in practical terms, no client subscribed to an etcd database will ever see a stale view (this is not the case for NoSQL databases, due to their eventual consistency). Also, unlike traditional SQL databases, etcd is distributed in nature, allowing high availability without sacrificing consistency. &lt;/p&gt;

&lt;p&gt;etcd is that guy. &lt;/p&gt;

&lt;h2&gt;
  
  
  Why etcd?
&lt;/h2&gt;

&lt;p&gt;Why is etcd used in Kubernetes as the key-value store? Why not some SQL database or a NoSQL database? The key to answering this question is understanding the core storage requirements of the Kubernetes API-server.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
    </item>
    <item>
      <title>Designing a fault-tolerant etcd cluster on AWS</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Mon, 04 Nov 2024 10:40:38 +0000</pubDate>
      <link>https://dev.to/monarene/designing-a-fault-tolerant-etcd-cluster-1da7</link>
      <guid>https://dev.to/monarene/designing-a-fault-tolerant-etcd-cluster-1da7</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this article, we discuss a strongly consistent, distributed key-value datastore used for shared configuration, service discovery, and scheduler coordination in Kubernetes: etcd (pronounced et-see-dee). This article is part of a series focused on understanding, mastering, and designing efficient etcd clusters. Here we cover the justification for using etcd, the leader election process, and the consensus algorithm etcd uses; in the following parts, we will follow up with a technical implementation of a highly available etcd cluster and with backing up an etcd database to guard against failures. This article requires a basic understanding of Kubernetes, algorithms, and system design. We will focus on AWS as the cloud provider.&lt;/p&gt;

&lt;h2&gt;
  
  
  etcd
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fscg24m4mip6xv4yyn9uz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fscg24m4mip6xv4yyn9uz.png" alt="Infrastructure Built on ETCD" width="385" height="489"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://etcd.io/" rel="noopener noreferrer"&gt;etcd&lt;/a&gt; is an open-source leader-based distributed key-value datastore designed by a vibrant team of engineers at CoreOS in 2013 and donated to Cloud Native Computing Foundation (CNCF) in 2018. Since then, etcd has grown to be adopted as a datastore in major projects like Kubernetes, CoreDNS, OpenStack, and other relevant tools. etcd is built to be simple, secure, reliable, and fast (benchmarked 10,000 writes/sec), it is written in Go and uses the Raft consensus algorithm to manage a highly-available replicated log. etcd is strongly consistent because it has &lt;a href="https://jepsen.io/consistency/models/strict-serializable" rel="noopener noreferrer"&gt;strict serializability&lt;/a&gt;, which means a consistent global ordering of events, to be practical, no client subscribed to an etcd database will ever see a stale database (this isn't the case for NoSQl databases due to the eventual consistency of NoSQL databases ). Also unlike traditional SQL databases, etcd is distributed in nature, allowing high availability without sacrificing consistency. &lt;/p&gt;

&lt;p&gt;etcd is that guy. &lt;/p&gt;

&lt;h2&gt;
  
  
  Why etcd?
&lt;/h2&gt;

&lt;p&gt;Why is etcd used in Kubernetes as the key-value store? Why not some SQL database or a NoSQL database? The key to answering this question is understanding the core storage requirements of the Kubernetes API-server.&lt;/p&gt;

&lt;p&gt;The datastore attached to Kubernetes must meet the following requirements: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Change notification&lt;/strong&gt;: Kubernetes is designed to watch the state of a cluster and steer it toward the desired state; the API server acts as a central coordinator between the different clients, streaming changes to the different parts of the control plane. The optimal datastore for Kubernetes lets the API server conveniently subscribe to a key or a set of keys, promptly notifies it of any key change, and performs these updates efficiently at scale. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt;: Kubernetes is a fast-moving orchestrator, and it would be a disaster if the datastore powering the kube api-server were eventually consistent. Imagine your Kubernetes cluster creating two deployments from one deployment spec because the datastore did not broadcast that a deployment already exists. etcd is strongly consistent, meaning every node serves the same view of the data; in fact, etcd does not consider a write complete until it has been replicated to a majority (a quorum) of the members of the cluster. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability&lt;/strong&gt;: the kube api-server requires its datastore to be highly available, because if the datastore is unavailable the kube api-server goes down immediately and Kubernetes is unavailable with it. etcd solves this problem by running as a distributed cluster of many nodes: if the leader node becomes unhealthy another leader is elected, and if a follower node is unhealthy requests are no longer sent to it. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The aforementioned characteristics highlight why etcd is the best choice as a datastore for Kubernetes. Other datastores like HashiCorp Consul, ZooKeeper, and CockroachDB could replace etcd in the way it is used in Kubernetes, but etcd is by far the most popular, largely due to its sheer performance.  &lt;/p&gt;

&lt;h2&gt;
  
  
  etcd internals
&lt;/h2&gt;

&lt;p&gt;etcd is itself a distributed, consensus-based system: for the cluster to function there must be a leader, and the rest of the nodes are followers. &lt;a href="https://en.wikipedia.org/wiki/Consensus_(computer_science)" rel="noopener noreferrer"&gt;Distributed consensus&lt;/a&gt; poses one of computer science's biggest problems: "How do multiple independent processes decide on a single value for something?". etcd solves this problem by using the &lt;a href="https://raft.github.io/" rel="noopener noreferrer"&gt;Raft algorithm&lt;/a&gt;. The Raft algorithm is etcd's secret tool for maintaining a balance between strong consistency and high availability.&lt;/p&gt;

&lt;p&gt;The Raft algorithm works by electing a leader among the nodes in the cluster and ensuring all write requests go through that leader. Any changes made by the leader are broadcast to the other nodes, and a write is not complete until a majority of the nodes have acknowledged it; hence, the larger the etcd cluster, the longer a write takes. Since all the nodes converge on the same state, a read request can be sent to any node.  &lt;/p&gt;

&lt;p&gt;How does Raft elect the leader in a group of nodes in a cluster? At the beginning of every etcd cluster, every node is a follower. In general, nodes can exist in three states: follower, candidate, and leader. If a follower at any point stops hearing the heartbeat of a leader, it can become a candidate and request votes from the other nodes; the nodes respond with their votes, and a candidate that gathers a majority of the votes becomes the leader. &lt;/p&gt;
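&lt;p&gt;The election flow above can be sketched as a toy model in Python (purely illustrative of the voting rule, not etcd's actual implementation; the &lt;code&gt;Node&lt;/code&gt; class and the simplified term handling are assumptions made for the sketch):&lt;/p&gt;

```python
# Toy model of a single Raft election round. A node that stops hearing
# the leader's heartbeat becomes a candidate, votes for itself, and
# requests votes from its peers; it wins only with a majority.
FOLLOWER, CANDIDATE, LEADER = "follower", "candidate", "leader"

class Node:
    def __init__(self, name):
        self.name = name
        self.state = FOLLOWER
        self.term = 0

def elect(nodes, timed_out):
    """Run one simplified election with `timed_out` as the candidate."""
    timed_out.state = CANDIDATE
    timed_out.term += 1          # a new election starts a new term
    votes = 1                    # the candidate votes for itself
    for peer in nodes:
        # A peer grants its vote if the candidate's term is newer.
        if peer is not timed_out and timed_out.term > peer.term:
            peer.term = timed_out.term
            votes += 1
    majority = len(nodes) // 2 + 1
    if votes >= majority:        # a majority is required, not a plurality
        timed_out.state = LEADER
    return timed_out.state

cluster = [Node(f"n{i}") for i in range(3)]
print(elect(cluster, cluster[0]))  # -> leader (3 of 3 votes, majority is 2)
```

&lt;p&gt;In real Raft, each node's election timeout is randomized so that split votes resolve quickly; that detail is omitted from this sketch.&lt;/p&gt;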

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp1ywklgx2uzxwbapa8r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp1ywklgx2uzxwbapa8r.png" alt="Raft Algorithm" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Maintaining Quorum
&lt;/h2&gt;

&lt;p&gt;In distributed consensus-based systems that use the Raft algorithm for leader election and voting, decisions are made by majority vote, which means that if a majority cannot be reached the cluster becomes unavailable. In a cluster of 3 nodes, the majority is 2: the cluster survives the loss of one node, but if two nodes go offline the remaining node cannot reach a majority on its own, and the cluster becomes unavailable. The quorum of this cluster is therefore 2. &lt;/p&gt;

&lt;p&gt;Theoretically, quorum refers to the minimum number of members or nodes in a distributed system that must agree (reach consensus) for the cluster to make a decision or perform an operation. The quorum is typically defined as floor(N/2) + 1, where N is the total number of nodes in the cluster and the division is rounded down. For example, in a 5-node cluster, the quorum would be 3 nodes (5/2 rounded down is 2, plus 1 = 3). This means at least 3 nodes must agree to proceed with a given operation.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fst0ybqpmi341w517hx0x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fst0ybqpmi341w517hx0x.png" alt="Quorum" width="800" height="419"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Fault tolerance is the number of nodes that can be allowed to fail for a cluster to still maintain Quorum, For example, in a 5-node cluster, losing 2 nodes still allows the system to maintain a quorum (3 nodes), but losing 3 nodes would cause the cluster to lose its quorum, hence the fault tolerance of a 5-node cluster is 2. &lt;/p&gt;

&lt;p&gt;Fault Tolerance = N - Q&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4ncbygnt52pk238s95s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4ncbygnt52pk238s95s.png" alt="Fault Tolerance" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you look closely at the image above, you will notice that an odd-numbered cluster and the next even-numbered cluster have the same fault tolerance (for example, 3- and 4-node clusters both tolerate one failure, and 5- and 6-node clusters both tolerate two). There is genuinely no competitive advantage to an even-numbered cluster, hence it is a rule of thumb for consensus-based clusters to have an odd number of nodes. &lt;/p&gt;
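&lt;p&gt;The quorum and fault-tolerance formulas above are easy to check in a few lines of Python (a small illustrative helper, not part of etcd itself):&lt;/p&gt;

```python
def quorum(n: int) -> int:
    """Minimum number of nodes that must agree: floor(N/2) + 1."""
    return n // 2 + 1

def fault_tolerance(n: int) -> int:
    """Nodes that may fail while the cluster keeps quorum: N - Q."""
    return n - quorum(n)

for n in range(1, 8):
    print(f"nodes={n} quorum={quorum(n)} tolerates={fault_tolerance(n)}")
# 3- and 4-node clusters both tolerate 1 failure; 5- and 6-node
# clusters both tolerate 2 -- the extra even node buys nothing.
```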

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we discussed one of the most fundamental parts of Kubernetes: its datastore, etcd. We covered the requirements the Kubernetes API server has for its datastore, went in-depth on why etcd is the most popular datastore and what competitive advantages it has over SQL and NoSQL databases, and finally discussed concepts fundamental to the design of distributed clusters and how to guarantee availability for distributed clusters in general, not just etcd. In the next article, we will build an etcd cluster from the ground up and put into practice the things discussed here. If you enjoyed reading this article, feel free to share it and subscribe to my page. &lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>sre</category>
      <category>sitereliabilityengineering</category>
    </item>
    <item>
      <title>Inside the AWS Kubernetes Control Plane</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Wed, 10 Apr 2024 08:18:53 +0000</pubDate>
      <link>https://dev.to/monarene/inside-the-kubernetes-control-plane-28ie</link>
      <guid>https://dev.to/monarene/inside-the-kubernetes-control-plane-28ie</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this article, I will discuss the essential parts of the Kubernetes brainbox, namely the components of the Kubernetes control plane. This article is the foundation for a deep dive into Kubernetes that I will be doing throughout the year. It aims to give a clear grounding in the different parts of the Kubernetes control plane and how they work, individually and collectively, to ensure pods are deployed smoothly and efficiently. This body of work is solely focused on the master node components; components that run on the worker nodes will be discussed in a later article. To get the best out of this article, you require no prior knowledge of Kubernetes or any programming experience. &lt;/p&gt;

&lt;p&gt;Kindly note that the word "Kubernetes" is used interchangeably with the word "kube", especially when describing a core component of the control plane, this is an abbreviation that has been accepted as a standard in the community. &lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Kubernetes Control Plane
&lt;/h2&gt;

&lt;p&gt;Kubernetes is an open-source orchestration platform that is used to manage, deploy, and scale containerized applications anywhere. Kubernetes (abbreviated k8s) is portable and extensible and can be used to manage workloads that facilitate both declarative configuration and automation. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Fun fact: The name Kubernetes originates from Greek, meaning helmsman or pilot&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In this article, we are focused on a set of components in the Kubernetes ecosystem called the control plane. The Kubernetes control plane is a set of components that collectively work to reconfigure the state of a cluster with the main goal of achieving a desired state. The control plane utilizes information such as cluster activity and node data to ensure that deployed applications are fault-tolerant and highly available. &lt;/p&gt;

&lt;p&gt;In this article, we will discuss the four major parts of the control plane namely, &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kube api-server &lt;/li&gt;
&lt;li&gt;etcd &lt;/li&gt;
&lt;li&gt;Kube-scheduler &lt;/li&gt;
&lt;li&gt;Kube controller-manager
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Caveat: I am not including the Cloud controller manager because this article is solely focused on native parts of Kubernetes, outside of any cloud provider or third-party system. 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feo2v2u94q2j1li9sb4ng.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feo2v2u94q2j1li9sb4ng.png" alt="A flow chart showing the different parts of the Kubernetes Control-Plane" width="800" height="477"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Kube api-server
&lt;/h2&gt;

&lt;p&gt;The Kube api-server exposes the Kubernetes API through multiple pathways (such as HTTP, gRPC, and kubectl), making it the only external-facing component of the control plane. The kube api-server validates and configures data for native Kubernetes objects such as pods, deployments, and services. The api-server is also solely responsible for authentication of API requests, authorization of roles and groups, and admission control of Kubernetes objects.&lt;/p&gt;

&lt;p&gt;The api-server's state is stored in a distributed and highly consistent database (etcd) so the api-server itself is stateless and can be easily replicated across different machines. The api-server supports both the SPDY protocol, as well as HTTP2/WebSocket however, SPDY is being deprecated for HTTP2/Websocket. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b6ys9my81rb6h1ubqxk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2b6ys9my81rb6h1ubqxk.png" alt="Flowchart detailing how the kube api-server interacts with other components of the Kubernetes control plane" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  etcd
&lt;/h2&gt;

&lt;p&gt;etcd is an open-source, strongly consistent distributed key-value store that is used to store configuration, state data, and meta information for Kubernetes. Kubernetes utilizes the highly available nature of etcd to provide a single consistent source of truth about the status of its distributed clusters, this is inclusive of the pods and application instances deployed on the same pods. &lt;/p&gt;

&lt;p&gt;etcd is designed to have no single point of failure and gracefully tolerate hardware failure and network partitions, it is also pretty fast as it has been benchmarked to be able to handle over 1,000 writes per second (&lt;a href="https://etcd.io/docs/v3.5/benchmarks/etcd-2-1-0-alpha-benchmarks/" rel="noopener noreferrer"&gt;https://etcd.io/docs/v3.5/benchmarks/etcd-2-1-0-alpha-benchmarks/&lt;/a&gt;). &lt;/p&gt;

&lt;p&gt;Due to the highly sensitive nature of the data that resides in Kubernetes, etcd also supports automatic Transport Layer Security (TLS) and secure socket layer (SSL) client certificate authentication. &lt;/p&gt;

&lt;p&gt;The kube api-server stores each cluster's state data in etcd, it also makes use of etcd's watch function to monitor saved state data and initiate a response when the state data deviates from the intended data. &lt;/p&gt;
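&lt;p&gt;To make the watch idea concrete, here is a minimal in-memory sketch of a key-value store with etcd-style change notification (illustrative only; real etcd exposes this as the Watch API over gRPC, and the key name below is made up):&lt;/p&gt;

```python
from collections import defaultdict

class TinyWatchStore:
    """A toy key-value store where clients subscribe to a key and are
    notified on every change, mimicking how the api-server watches etcd."""
    def __init__(self):
        self._data = {}
        self._watchers = defaultdict(list)

    def watch(self, key, callback):
        self._watchers[key].append(callback)

    def put(self, key, value):
        old = self._data.get(key)
        self._data[key] = value
        for cb in self._watchers[key]:
            cb(key, old, value)  # push the change to every subscriber

events = []
store = TinyWatchStore()
store.watch("/registry/deployments/web",
            lambda key, old, new: events.append((key, old, new)))
store.put("/registry/deployments/web", {"replicas": 3})
print(events)  # one (key, old_value, new_value) event was delivered
```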

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb5duvtalya9gayf4a5br.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb5duvtalya9gayf4a5br.png" alt="etcd: The ultimate database for distributed systems " width="800" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Kube scheduler
&lt;/h2&gt;

&lt;p&gt;The Kubernetes scheduler is the component of the control plane that is responsible for assigning pods to nodes. The scheduler watches for newly created pods and assigns the best available node to each of them. Multiple schedulers can be used in the same cluster, depending on the intended scheduling algorithm.  &lt;/p&gt;

&lt;p&gt;In a cluster, nodes that satisfy the scheduling requirements of a pod are called &lt;em&gt;feasible nodes&lt;/em&gt;; if none of the nodes are suitable, the pod is left in a pending state until it can be allocated to a node. &lt;/p&gt;

&lt;p&gt;The process by which the kube-scheduler assigns pods to nodes is pretty straightforward. When a new pod is created, the scheduler picks up its requirements and proceeds to find feasible nodes for the pod in a process known as &lt;em&gt;filtering&lt;/em&gt;. After filtering, the scheduler runs a set of functions to score feasible nodes in a process known as &lt;em&gt;scoring&lt;/em&gt;. After scoring is complete, the scheduler then picks the node with the highest score among the feasible nodes to run the pod and proceeds to attach that node to the pod, this process is known as &lt;em&gt;binding&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;The kube scheduler assigns the pod to the node with the highest score, and if there is more than one node with the highest score, the scheduler selects one at random and attaches the pod to that node. &lt;/p&gt;
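&lt;p&gt;The filtering, scoring, and binding steps can be sketched in a few lines of Python (the node capacities and the free-CPU scoring rule are invented for illustration; the real scheduler uses many weighted scoring plugins):&lt;/p&gt;

```python
import random

nodes = {"node-a": {"free_cpu": 2.0},
         "node-b": {"free_cpu": 4.0},
         "node-c": {"free_cpu": 0.5}}
pod = {"name": "web-1", "cpu_request": 1.0}

# Filtering: keep only feasible nodes that can satisfy the pod's request.
feasible = {name: meta for name, meta in nodes.items()
            if meta["free_cpu"] >= pod["cpu_request"]}

# Scoring: here we simply prefer the node with the most free CPU.
scores = {name: meta["free_cpu"] for name, meta in feasible.items()}

# Binding: pick the highest score; ties are broken at random.
best = max(scores.values())
winner = random.choice([name for name, s in scores.items() if s == best])
print(f"bind {pod['name']} -> {winner}")  # node-c was filtered out; node-b wins
```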

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd11o44l4xr6wvxxjwa9r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd11o44l4xr6wvxxjwa9r.png" alt="Kube scheduler flow chart" width="715" height="535"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Kube controller-manager
&lt;/h2&gt;

&lt;p&gt;To completely understand the role of the Kubernetes controller-manager, it is imperative that you first understand the function of a controller.  &lt;/p&gt;

&lt;p&gt;A controller is a non-terminating loop that regulates the state of the Kubernetes system, a controller works to bring the whole system to the desired functioning state. When the current state of an object deviates from the desired state, the control loop takes corrective steps to make sure that the current state is the same as the desired state. A controller can perform the aforementioned duties with the help of the kube api-server. &lt;/p&gt;

&lt;p&gt;An example of a controller is the node controller, which is responsible for monitoring the state of nodes and taking the necessary actions to keep applications running when a node is faulty. Another example is the replication controller, this controller monitors the state of the replica sets and ensures that the desired number of pods are available at all times within the set. If a pod terminates, another one will be created in its place. &lt;/p&gt;
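&lt;p&gt;The reconciliation behaviour of a replication controller can be sketched as a single loop pass (a toy model; real controllers create and delete pods through the kube api-server rather than editing a list):&lt;/p&gt;

```python
def reconcile_replicas(desired: int, current: list) -> list:
    """Compare the current pods with the desired count and converge."""
    current = list(current)
    while desired > len(current):
        current.append(f"pod-{len(current)}")  # create a replacement pod
    while len(current) > desired:
        current.pop()                          # remove a surplus pod
    return current

# A pod terminated, leaving 2 of 3 replicas; one pass restores the third.
print(reconcile_replicas(3, ["pod-0", "pod-1"]))  # -> ['pod-0', 'pod-1', 'pod-2']
```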

&lt;p&gt;The kube controller-manager is a core component of the control plane that is responsible for running the multiple controllers that maintain the desired state of a cluster. The kube controller-manager runs as a daemon on the control plane and comes pre-packaged with the controllers needed to successfully run a Kubernetes cluster. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Forjcur14v0282zmfywul.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Forjcur14v0282zmfywul.png" alt="Differents controllers that are packaged in the Kubernetes Controller-Manager - KodeKloud" width="800" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we have described the core parts of the Kubernetes Control plane, their roles, and how they collectively function to ensure that the cluster is healthy. This article is a primer to a series of articles that I will be writing on Kubernetes through the course of the year. Thank you for reading! Please like, share, and subscribe to enjoy more articles from me. Thank you!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>sre</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Hardening Cluster Security in Google Kubernetes Engine</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Tue, 12 Dec 2023 19:34:51 +0000</pubDate>
      <link>https://dev.to/monarene/hardening-cluster-security-in-google-kubernetes-engine-3n30</link>
      <guid>https://dev.to/monarene/hardening-cluster-security-in-google-kubernetes-engine-3n30</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;This technical article is a detailed continuation of my talk at &lt;a href="https://x.com/gdgikorodu/status/1732741112811421924?s=20" rel="noopener noreferrer"&gt;DevFest Ikorodu&lt;/a&gt;, where I spoke extensively about key security concepts in Kubernetes and how to build a truly secure cluster network. In this article, I will highlight security concepts that are wholly focused on Google Kubernetes Engine (GKE) on the Google Cloud Platform, I will discuss security policies native to Google Cloud and particularly to Kubernetes, I will also discuss container-optimized images and some security practices that run at deploy-time. This article requires some knowledge of Kubernetes, Google Cloud, and a passion for building secure systems. &lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Shared responsibility model
&lt;/h2&gt;

&lt;p&gt;Google's ideology towards building secure systems is called the &lt;a href="https://cloud.google.com/architecture/framework/security/shared-responsibility-shared-fate" rel="noopener noreferrer"&gt;Shared Responsibility Model&lt;/a&gt;, the shared responsibility model details that the security of your workloads, networks, and data is a joint liability between Google and the Client (You). As regards the Google Kubernetes Engine, Google has a responsibility to secure the master control plane and its components like the API server, etcd database, and controller manager while the user is responsible for securing nodes, containers, and pods. Across IaaS, PaaS, and SaaS, Google properly defines its responsibility and also the Client's responsibility. This is clearly shown in the diagram below &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58igrrw3rm389rpe78pf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F58igrrw3rm389rpe78pf.png" alt="Google's Shared Responsibility Model"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Security Practices in Kubernetes on Google Cloud
&lt;/h2&gt;

&lt;p&gt;Google Cloud provides a series of services, policies, and configurations that strengthen authentication and authorization across Kubernetes networks, data systems, and workloads. A majority of these policies are configurable and this makes it the whole responsibility of the client to ensure that their cluster has the appropriate security policy in use. In this article, we will focus briefly on the following security concepts,  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Network Policies &lt;/li&gt;
&lt;li&gt;Shielded GKE nodes &lt;/li&gt;
&lt;li&gt;Container-optimised OS images&lt;/li&gt;
&lt;li&gt;Binary Authorization&lt;/li&gt;
&lt;li&gt;Private Clusters&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Network Policies
&lt;/h2&gt;

&lt;p&gt;By default all pods in Kubernetes can communicate with each other, however, Kubernetes provides access to objects that can limit inter-pod communication. These Kubernetes objects are called network policies. Kubernetes network policies allow you to specify how a pod can communicate with various network entities based on pods matching label selectors or specific IP addresses with port combinations. These policies can be defined for both ingress and egress. &lt;/p&gt;
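&lt;p&gt;As a sketch of what such an object looks like, the following policy only admits ingress traffic to pods labelled &lt;code&gt;app: backend&lt;/code&gt; from pods labelled &lt;code&gt;app: frontend&lt;/code&gt; on TCP port 8080 (the names, labels, and port are illustrative):&lt;/p&gt;

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend
spec:
  podSelector:
    matchLabels:
      app: backend        # the policy applies to these pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend   # only these pods may connect
      ports:
        - protocol: TCP
          port: 8080
```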

&lt;p&gt;GKE provides the option to enforce the use of a network policy when a cluster is created to ensure that inter-pod communication is controlled. You can easily configure this by running the following command,&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

# Enforce network policy on new clusters 
gcloud container clusters create &amp;lt;cluster-name&amp;gt; --enable-network-policy

# Enforce network policy on existing clusters
gcloud container clusters update &amp;lt;cluster-name&amp;gt; --update-addons=NetworkPolicy=ENABLED


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;An example of a simple network policy can be found &lt;a href="https://kubernetes.io/docs/concepts/services-networking/network-policies/" rel="noopener noreferrer"&gt;here&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Shielded GKE nodes
&lt;/h2&gt;

&lt;p&gt;Google also provides Shielded GKE nodes, which increase cluster security by providing strong, verifiable node identity and integrity. Google Cloud simply uses Shielded Compute Engine virtual machines as Kubernetes cluster nodes. These virtual machines are protected against compromise at the boot and kernel level because they use virtual Trusted Platform Modules (vTPM) and secure boot. Shielded VMs enforce and verify the signatures of all components in the boot process to make sure that the individual components and modules in the VMs are safe and secure. &lt;/p&gt;

&lt;p&gt;Shielded GKE nodes prevent attackers from impersonating nodes in a cluster in the event of a pod vulnerability being exploited. &lt;/p&gt;

&lt;p&gt;You can enable Shielded nodes in new/existing clusters with the following commands, &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

# Enable Shielded GKE nodes on new cluster
gcloud container clusters create &amp;lt;cluster-name&amp;gt; --enable-shielded-nodes

# Enable Shielded GKE nodes on existing cluster
gcloud container clusters update &amp;lt;cluster-name&amp;gt; --enable-shielded-nodes

# Verify that Shielded GKE nodes are enabled (check for enabled under shieldedNodes as true) 
gcloud container clusters describe &amp;lt;cluster-name&amp;gt;


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;There is no extra cost for using Shielded nodes in GKE, however, they generate more logs which will generally lead to an overall increase in cost. &lt;/p&gt;

&lt;h2&gt;
  
  
  Container Optimised OS images
&lt;/h2&gt;

&lt;p&gt;Container-Optimized OS (also known as the &lt;code&gt;cos_containerd&lt;/code&gt; image) is a Linux-based operating system image provided by Google to run secure and production-ready workloads. It is optimized and hardened specifically for running enterprise workloads: it is continuously scanned for vulnerabilities at the kernel level, and packages are patched and updated when a vulnerability is found. Its root filesystem is always mounted read-only, which prevents attackers from making changes to the filesystem. It is completely stateless, but can be customized to allow writes to specific directories. &lt;/p&gt;

&lt;h2&gt;
  
  
  Binary Authorization
&lt;/h2&gt;

&lt;p&gt;Binary Authorization is a deploy-time security service provided by Google that ensures only trusted containers are deployed to clusters in GKE. Binary Authorization integrates seamlessly with Container Analysis, a GCP service that scans container images stored in Container Registry for vulnerabilities. A Binary Authorization policy comprises one or more rules that must be satisfied before an image is allowed to be deployed to the cluster. Binary Authorization can also require that only attested images are deployed; an attested image is one that has been verified by an attestor. At deploy time, Binary Authorization uses the attestor to verify the attestation. Any image that does not satisfy the Binary Authorization policy is rejected and will not be deployed. &lt;/p&gt;

&lt;p&gt;The following commands will enable binary authorization on GKE clusters, &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

# Enable binary authorization on a new cluster 
gcloud container clusters create  CLUSTER_NAME --binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE --zone ZONE

# Enable binary authorization on an existing cluster
gcloud container clusters update CLUSTER_NAME --binauthz-evaluation-mode=PROJECT_SINGLETON_POLICY_ENFORCE --zone ZONE


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Even though Binary Authorization prevents unauthorized images from being deployed, you can specify the break-glass flag as an annotation in the pod specification to allow a pod to be created even if its image violates the Binary Authorization policy. The following is an example of a pod specification that uses the &lt;code&gt;break-glass&lt;/code&gt; annotation:&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;

&lt;p&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;&lt;br&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pod&lt;/span&gt;&lt;br&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;br&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-break-glass-pod&lt;/span&gt;&lt;br&gt;
    &lt;span class="na"&gt;annotations&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;&lt;br&gt;
        &lt;span class="na"&gt;alpha.image-policy.k8s.io/break-glass&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;true"&lt;/span&gt;&lt;/p&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Private Clusters
&lt;/h2&gt;

&lt;p&gt;GKE private clusters isolate a cluster's network connectivity from the public internet, for both inbound and outbound traffic. This is possible because the nodes in the cluster have only internal private IP addresses and no public-facing IP address. If nodes require outbound internet access, a managed Network Address Translation (NAT) gateway is used. For inbound access, external clients can reach the applications inside the cluster through Kubernetes service objects of type &lt;code&gt;NodePort&lt;/code&gt; or &lt;code&gt;LoadBalancer&lt;/code&gt;. Since direct internet traffic is not allowed, it is impossible to pull public Docker images directly into the cluster. To access public images, it is advised that you create a Cloud NAT gateway, or upload the images to a private Container Registry and point your cluster to it.&lt;/p&gt;

&lt;p&gt;The level of access to a private cluster via endpoints can be controlled through any of the following configurations;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Public endpoint access disabled &lt;/li&gt;
&lt;li&gt;Public endpoint access enabled; authorized networks enabled for limited access &lt;/li&gt;
&lt;li&gt;Public endpoint access enabled; authorized networks disabled&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following code snippets will enable you to create private clusters in any of the aforementioned configurations,&lt;/p&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
&lt;h1&gt;
  
  
  Public Access Disabled
&lt;/h1&gt;

&lt;p&gt;gcloud container clusters create my-private-cluster --create-subnetwork name=my-subnet --enable-master-authorized-networks --enable-ip-alias --enable-private-nodes --enable-private-endpoint --master-ipv4-cidr 172.20.4.32/28&lt;/p&gt;
&lt;h1&gt;
  
  
  Public endpoint access enabled; authorized networks enabled for limited access
&lt;/h1&gt;

&lt;p&gt;gcloud container clusters create my-private-cluster-1 --create-subnetwork name=my-subnet-1 --enable-master-authorized-networks --enable-ip-alias --enable-private-nodes --master-ipv4-cidr 172.20.8.0/28&lt;/p&gt;
&lt;h1&gt;
  
  
  Public endpoint access enabled; authorized networks disabled
&lt;/h1&gt;

&lt;p&gt;gcloud container clusters create my-private-cluster-2 --create-subnetwork name=my-subnet-2 --no-enable-master-authorized-networks --enable-ip-alias --enable-private-nodes --master-ipv4-cidr 172.20.10.32/28&lt;/p&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, I discussed in detail the active steps you can take to secure your cluster and ensure that your Kubernetes workloads are safe and secure. If you enjoyed reading this article, kindly follow me on &lt;a href="https://twitter.com/monnarene" rel="noopener noreferrer"&gt;Twitter&lt;/a&gt;. You can also like and share this article with anyone interested in learning about securing their GKE clusters. Thank you and be safe!&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Google-Cloud-DevOps-Engineers-certification/dp/1839218010" rel="noopener noreferrer"&gt;Google Cloud for DevOps Engineers: A practical guide to SRE and achieving Google's Professional Cloud DevOps Engineer certification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/kubernetes-engine/docs/concepts/security-overview" rel="noopener noreferrer"&gt;GKE Security Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://sre.google/" rel="noopener noreferrer"&gt;SRE at Google&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>gcp</category>
      <category>security</category>
      <category>devops</category>
    </item>
    <item>
      <title>Taints and Tolerations in Kubernetes: A Pocket Guide</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Sat, 30 Sep 2023 20:00:55 +0000</pubDate>
      <link>https://dev.to/monarene/taints-and-tolerations-in-kubernetes-a-pocket-guide-37jb</link>
      <guid>https://dev.to/monarene/taints-and-tolerations-in-kubernetes-a-pocket-guide-37jb</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;This article is the conclusion of a three-part series on Pod scheduling in Kubernetes. We started the series with &lt;a href="https://dev.to/monarene/intentional-kubernetes-pod-scheduling-nodeselector-3p7i"&gt;Intentional Pod Scheduling using Node Selectors&lt;/a&gt;, went on to explore &lt;a href="https://dev.to/monarene/kubernetes-node-affinity-a-love-story-between-nodes-and-pods-2cfg"&gt;Pod scheduling with Node Affinity&lt;/a&gt;, and will now finish with Pod scheduling using taints and tolerations. In this article, we will discuss the practical use of taints and tolerations in pod scheduling, how to apply taints to multiple nodes/node pools, and how to apply tolerations to pods.&lt;/p&gt;

&lt;p&gt;This article requires that you have a working knowledge of Kubernetes and YAML files, and that you understand how to use the command-line interface (CLI). &lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Taints and Tolerations
&lt;/h2&gt;

&lt;p&gt;Kubernetes version 1.8 came with a feature called "Taints and Tolerations", whose main goal is to prevent unwanted pods from being scheduled on particular nodes. Kubernetes also uses this feature to prevent pods from being scheduled on the master node, keeping it free from regular workloads. Taints are applied to nodes to prevent unwanted scheduling, while tolerations are applied to pods to allow them to be scheduled on nodes that have taints 🥲. &lt;/p&gt;

&lt;p&gt;Another practical application of taints is scheduling pods with special compute requirements onto nodes that provide those resources; this way, we can deliberately schedule compute-intensive pods onto nodes with specialized hardware. &lt;/p&gt;

&lt;h2&gt;
  
  
  Tainting a node
&lt;/h2&gt;

&lt;p&gt;To taint a node, we will need to run the following command,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl taint nodes &amp;lt;node name&amp;gt; &amp;lt;taint key&amp;gt;=&amp;lt;taint value&amp;gt;:&amp;lt;taint effect&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here, &lt;code&gt;&amp;lt;node name&amp;gt;&lt;/code&gt; is the name of the node that you want to taint, and the taint itself is described by the key-value pair. In the command above, &lt;code&gt;&amp;lt;taint key&amp;gt;&lt;/code&gt; is mapped to &lt;code&gt;&amp;lt;taint value&amp;gt;&lt;/code&gt;, and the taint effect is attached to that key-value pair.&lt;/p&gt;

&lt;p&gt;Taint effects also define what will happen to pods if they don’t tolerate the taints. The three taint effects are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;NoSchedule: A strong effect where the system lets pods already scheduled on the node keep running, but refuses to schedule new pods that do not tolerate the taint.&lt;/li&gt;
&lt;li&gt;PreferNoSchedule: A soft effect where the system will try to avoid placing a pod that does not tolerate the taint on the node.&lt;/li&gt;
&lt;li&gt;NoExecute: A strong effect where all previously scheduled pods are evicted, and new pods that don’t tolerate the taint will not be scheduled.&lt;/li&gt;
&lt;/ul&gt;
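
&lt;p&gt;To make this concrete, here is a hypothetical example (the node name &lt;code&gt;node-1&lt;/code&gt; and the key-value pair &lt;code&gt;app=critical&lt;/code&gt; are illustrative) showing how to apply, inspect, and remove a taint; note the trailing minus sign used for removal:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Apply a taint: only pods that tolerate app=critical may be scheduled here
kubectl taint nodes node-1 app=critical:NoSchedule

# Inspect the taints currently set on the node
kubectl describe node node-1 | grep Taints

# Remove the taint (note the trailing "-")
kubectl taint nodes node-1 app=critical:NoSchedule-
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;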

&lt;h2&gt;
  
  
  Adding a Toleration to a pod
&lt;/h2&gt;

&lt;p&gt;Tolerations help you schedule pods on nodes with taints. Tolerations are usually applied to pod manifests in the following format.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
  tolerationSeconds: 3600
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the toleration matches the taint, the pod can be scheduled on that node; however, a pod with a toleration can still be scheduled on any other node, even one without the taint. This is why it is advised to taint all the nodes in your cluster if you intend to use taints for pod scheduling.  &lt;/p&gt;
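
&lt;p&gt;A toleration can also match a taint by key alone using the &lt;code&gt;Exists&lt;/code&gt; operator, in which case no &lt;code&gt;value&lt;/code&gt; field is needed; a minimal sketch, assuming a taint with key &lt;code&gt;key1&lt;/code&gt;, looks like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tolerations:
- key: "key1"
  operator: "Exists"
  effect: "NoSchedule"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;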

&lt;p&gt;In the end, your pod would look like this,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:latest
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        memory: "128Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"
  tolerations:
  - key: &amp;lt;taint key&amp;gt;
    operator: "Equal"
    value: &amp;lt;taint value&amp;gt;
    effect: &amp;lt;taint effect&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Understanding and effectively utilizing pod scheduling in Kubernetes through taints and tolerations is essential for optimizing resource allocation, ensuring high availability, and maintaining the reliability of your containerized applications. By carefully defining taints on nodes and specifying tolerations in your pod specifications, you can achieve a fine-grained level of control over where and how your pods are placed within the cluster.&lt;/p&gt;

&lt;p&gt;Please give this article a like if you enjoyed reading it, and feel free to subscribe to my page. Thank You!&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>sre</category>
    </item>
    <item>
      <title>Kubernetes Node Affinity; A Love Story between Nodes and Pods</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Thu, 31 Aug 2023 19:33:35 +0000</pubDate>
      <link>https://dev.to/monarene/kubernetes-node-affinity-a-love-story-between-nodes-and-pods-2cfg</link>
      <guid>https://dev.to/monarene/kubernetes-node-affinity-a-love-story-between-nodes-and-pods-2cfg</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this article, we will discuss designing a one-to-one heartfelt relationship between a node and a Kubernetes object (pod/deployment). At its core, this article is about intentional pod scheduling in Kubernetes with a focus on Node Affinity. In this piece of art, we will explore the definition of Node Affinity, the reasons we instruct certain pods to be deployed on certain nodes, why this form of pod scheduling is preferred, and finally how to set up Node Affinity for a set of nodes and some pods. To follow this body of work, it is important that you understand the basic concepts of Kubernetes, have a cluster up and running, and genuinely have your heart open to a love story. &lt;/p&gt;

&lt;h2&gt;
  
  
  A Love Story
&lt;/h2&gt;

&lt;p&gt;In this love story, we have two friends called Claire and Fiona. Claire and Fiona are looking to get married, but each has specific requirements for a suitor. Both insist on marrying tall men, while Fiona would additionally like a tall man who has a beard (though she is not adamant about it). Both friends would rather not get married if they don't find tall men; however, given a surplus of tall men, Fiona would prefer one with a beard. Our task in this article is to design "Men" for both Claire and Fiona. &lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Node affinity
&lt;/h2&gt;

&lt;p&gt;To understand Node affinity, assume the two friends are pods and the suitors (men) are nodes. Some pods have &lt;code&gt;Hard&lt;/code&gt; requirements (tall men); these requirements are compulsory, and in their absence the pod remains unscheduled. Other pods have &lt;code&gt;Soft&lt;/code&gt; requirements (bearded men); these are optional and simply streamline the process of finding a suitable node.  &lt;/p&gt;

&lt;p&gt;We deliberately place such requirements on pods to ensure that those pods are deployed on certain nodes; for example, pods that require a lot of compute/memory resources should be deployed on nodes with ample compute/memory. The primary reason for such a design is deliberate pod scheduling, overriding the default scheduling mechanism of the kube-scheduler. We capture the properties of these nodes using labels and then design the pods to look for those labels. Let's move on to designing nodes that will be suitable for both Claire and Fiona. &lt;/p&gt;

&lt;h2&gt;
  
  
  Finding a Node for Claire
&lt;/h2&gt;

&lt;p&gt;For this article, it is assumed that we have two fresh nodes running. On the terminal, run the following command to view the nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foz41mu1ygaxywigfwwnq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foz41mu1ygaxywigfwwnq.png" alt="Nodes running in a cluster" width="800" height="64"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Notice that we have two distinct nodes: one whose name ends in 54 on the first substring and another ending in 197 on the first substring. &lt;/p&gt;

&lt;p&gt;To deploy the claire pod, save the following configuration into &lt;code&gt;claire.yaml&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: claire
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: height
            operator: In
            values:
            - tall
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Take a keen look at &lt;code&gt;spec.affinity.nodeAffinity&lt;/code&gt;; you will find a field called &lt;code&gt;requiredDuringSchedulingIgnoredDuringExecution&lt;/code&gt;. &lt;code&gt;requiredDuringScheduling&lt;/code&gt; means that for the pod to be assigned a node, the node must match the expressions on the pod; in this case, the requirement is that the node must carry the label &lt;code&gt;height=tall&lt;/code&gt;. &lt;code&gt;IgnoredDuringExecution&lt;/code&gt; means that after the pod is scheduled, further changes to the node's labels are ignored; in simpler terms, even if the node's label changes to &lt;code&gt;height=short&lt;/code&gt;, the pod remains assigned to the node. &lt;/p&gt;

&lt;p&gt;To start the deployment, run the following command on the terminal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; claire.yaml 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After a few seconds, check the state of the pod by running the following command,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl get pods 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjoc3exmnwn8pdx8bl1lg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjoc3exmnwn8pdx8bl1lg.png" alt="Deployment of pods" width="800" height="51"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We see that claire is stuck in the Pending phase. This is because none of the nodes have the label &lt;code&gt;height=tall&lt;/code&gt;; hence claire will not be scheduled on any of the nodes, as this is a &lt;code&gt;Hard&lt;/code&gt; requirement for scheduling. &lt;/p&gt;

&lt;p&gt;To further verify that this is the case, run the following command,&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl describe pods claire 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14zemfmhxy13rdufpfly.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F14zemfmhxy13rdufpfly.png" alt="kubectl describe pods claire" width="800" height="52"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To fix this, assign the label &lt;code&gt;height=tall&lt;/code&gt; to the node ending in 54 by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl label nodes ip-172-31-15-54.us-east-2.compute.internal &lt;span class="nv"&gt;height&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tall
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let us check the status of the pod by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl get pods
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feplo1cnjbjbqhvmdes1y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feplo1cnjbjbqhvmdes1y.png" alt="kubectl get pods" width="800" height="38"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can now see that the pod is running as expected, and claire has been paired with a suitor 🥰. To further confirm that claire was scheduled to the node ending in 54, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl describe pods claire
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4x0gt959x337lj4lrt90.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4x0gt959x337lj4lrt90.png" alt="kubectl describe pods claire" width="800" height="86"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Finding a Node for Fiona
&lt;/h2&gt;

&lt;p&gt;To set up the pod definition for Fiona, save the following configuration into &lt;code&gt;fiona.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: Pod
metadata:
  name: fiona
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: height
            operator: In
            values:
            - tall
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: beard
            operator: In
            values:
            - present
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As per the configuration, you can see that Fiona requires that her man be tall, but among tall men she would also prefer a bearded one. This is illustrated in the &lt;code&gt;preferredDuringScheduling&lt;/code&gt; field; this requirement is what we referred to as a &lt;code&gt;Soft&lt;/code&gt; requirement. To further illustrate this, we will assign the node ending in 197 the labels &lt;code&gt;height=tall&lt;/code&gt; and &lt;code&gt;beard=present&lt;/code&gt;. We can do this by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl label nodes ip-172-31-37-197.us-east-2.compute.internal &lt;span class="nv"&gt;height&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;tall &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; kubectl label nodes ip-172-31-37-197.us-east-2.compute.internal &lt;span class="nv"&gt;beard&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;present
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we deploy fiona and see which node she moves to. We do this by running the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; fiona.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, we check to see if she is running as expected: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fktwj82n4kjdym7z68e3v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fktwj82n4kjdym7z68e3v.png" alt="kubectl get pods" width="800" height="80"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We can see that Fiona is running and has been successfully scheduled to a node. To check which node she has been scheduled to, we run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl get pods &lt;span class="nt"&gt;-o&lt;/span&gt; wide
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdz8ujj6x5vdfq1citj9d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdz8ujj6x5vdfq1citj9d.png" alt="kubectl get pods -o wide" width="800" height="52"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We see that Fiona has been scheduled on the appropriate node, the suitor that is tall and also has a beard. This is how we schedule pods to suitable nodes. &lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we used the illustration of a courting process to describe intentional pod scheduling using Node Affinity. We configured two nodes with appropriate labels, deployed our pods to their respective suitors (nodes), and showed how to assign labels to nodes and match pods to those labels. Feel free to like, share, and comment on this article. Don't forget to subscribe to receive notifications when I post new articles. Thank you!&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>aws</category>
      <category>gcp</category>
    </item>
    <item>
      <title>Intentional Kubernetes Pod Scheduling - NodeSelector</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Mon, 31 Jul 2023 20:02:09 +0000</pubDate>
      <link>https://dev.to/monarene/intentional-kubernetes-pod-scheduling-nodeselector-3p7i</link>
      <guid>https://dev.to/monarene/intentional-kubernetes-pod-scheduling-nodeselector-3p7i</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this article, we will be discussing deliberate pod scheduling in Kubernetes with a focus on using Node Selectors. First, we will look at the default mechanism for pod scheduling in Kubernetes, then we will justify the need to deliberately schedule pods to specific nodes, outlining the different methods of pod scheduling. Finally, we will do a run-through of scheduling a pod to a specific node on an actual cluster using the nodeSelector method. This article requires that you have a strong working knowledge of Kubernetes and that you are conversant with using &lt;code&gt;kubectl&lt;/code&gt;.  &lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is a 3-part series that focuses on different methods of pod scheduling. This is part 1, focused on the NodeSelector method; the other parts cover Node Affinity with Inter-pod Affinity and Anti-affinity, and finally Taints and Tolerations.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  How does Kubernetes Schedule Pods by default?
&lt;/h2&gt;

&lt;p&gt;Kubernetes schedules pods with a component of the control plane called the kube-scheduler. The kube-scheduler is responsible for selecting an optimal node to run newly created or not-yet-scheduled (unscheduled) pods. It is also built to allow you to write your own scheduling component when you need to. &lt;/p&gt;

&lt;p&gt;Nodes that meet a pod's requirements are called &lt;em&gt;feasible nodes&lt;/em&gt;. The kube-scheduler is tasked with finding feasible nodes for a pod among the nodes in a cluster (a process called filtering), running a set of algorithms on the feasible nodes to pick the node with the highest score (a process called scoring), and finally assigning the pod to that node. If there are no feasible nodes to run a pod, the pod remains unscheduled. If more than one node shares the highest score, the kube-scheduler selects one of them at random to run the pod. &lt;/p&gt;

&lt;p&gt;At the end of this selection process, the kube-scheduler notifies the kube-apiserver of the final node selected to run the pod in a separate process called binding, and the pod is then deployed to that node. &lt;/p&gt;

&lt;h2&gt;
  
  
  Why deliberate pod scheduling?
&lt;/h2&gt;

&lt;p&gt;Deliberate pod scheduling matters most when you have multiple node pools in a customized cluster. For example, if you want a pod to run on a node that has an SSD for faster processing, you can schedule the pod to run on just that node. Perhaps you want to co-locate pods on a particular node in the same zone for lower latency, or you want strongly related services to run on the same node; deliberate pod scheduling lets you override the default scoring process used by the kube-scheduler. &lt;/p&gt;

&lt;p&gt;There are a number of distinct ways to deliberately schedule pods on nodes; below are a few of them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;NodeSelector &lt;/li&gt;
&lt;li&gt;Node Affinity &lt;/li&gt;
&lt;li&gt;Inter-pod affinity and Anti-affinity&lt;/li&gt;
&lt;li&gt;NodeName&lt;/li&gt;
&lt;li&gt;Taints and Tolerations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  NodeSelector
&lt;/h2&gt;

&lt;p&gt;In this article, we will focus on using a NodeSelector and we will explore other options in a later article. Using &lt;code&gt;nodeSelector&lt;/code&gt; is the simplest recommended form of node selection constraint. You can add the nodeSelector field to your Pod specification and specify the node labels you want the target node to have. &lt;/p&gt;

&lt;p&gt;Kubernetes only schedules the Pod on nodes that have each of the labels you specify. NodeSelector is a Pod attribute that forces kube-scheduler to schedule a pod only against a node with a matching label and corresponding value for the label. &lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up NodeSelector
&lt;/h2&gt;

&lt;p&gt;The first thing to do when setting up NodeSelector is to view the labels already on your intended node; you can use kubectl to do this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl get nodes
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz93t87n3yqtstnspdx2t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz93t87n3yqtstnspdx2t.png" alt="Result of kubectl get"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, select the intended node and view the labels on it using kubectl: &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl describe nodes ip-172-31-28-239.us-east-2.compute.internal
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbqknwrlpfszarc689dzx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbqknwrlpfszarc689dzx.png" alt="Describing the Selected node"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You then add a label to the intended node using &lt;code&gt;kubectl&lt;/code&gt;; note that the structure of the command is &lt;code&gt;kubectl label nodes &amp;lt;node-name&amp;gt; &amp;lt;label-key&amp;gt;=&amp;lt;label-value&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl label nodes ip-172-31-28-239.us-east-2.compute.internal &lt;span class="nv"&gt;platform&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;web
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Verify that the label was added to the node using &lt;code&gt;kubectl describe&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl describe nodes ip-172-31-28-239.us-east-2.compute.internal
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffim27pi9hqssdv6z4mcd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffim27pi9hqssdv6z4mcd.png" alt="label added to node"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You can see the new label added to the node. &lt;/p&gt;

&lt;p&gt;Now assign a pod to the node you just labeled. Save the spec below to &lt;code&gt;test-pod.yaml&lt;/code&gt;: &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

apiVersion: v1
kind: Pod
metadata:
  name: httpd
  labels:
    env: prod
spec:
  containers:
  - name: httpd
    image: httpd
    imagePullPolicy: IfNotPresent
  nodeSelector:
    platform: web


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Go ahead and deploy the pod using the &lt;code&gt;kubectl create&lt;/code&gt; command. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl create &lt;span class="nt"&gt;-f&lt;/span&gt; test-pod.yaml
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Finally, verify that the pod is scheduled on the right node by using the &lt;code&gt;kubectl get&lt;/code&gt; command. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;kubectl get pods &lt;span class="nt"&gt;-o&lt;/span&gt; wide
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzy9s9ubp5m9k0af4z8ox.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzy9s9ubp5m9k0af4z8ox.png" alt="Pod running on labeled node"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Security Constraints
&lt;/h2&gt;

&lt;p&gt;To prevent malicious users from steering workloads onto nodes they control, choose label keys that the kubelet cannot modify. This prevents a compromised node from setting those labels on itself so that the scheduler places workloads onto it. &lt;/p&gt;

&lt;p&gt;Kubernetes ships a NodeRestriction admission plugin that prevents kubelets from setting or modifying labels with the &lt;code&gt;node-restriction.kubernetes.io/&lt;/code&gt; prefix. Add your scheduling labels under that prefix on your Node objects, and reference those labels in your node selectors. &lt;/p&gt;
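&lt;p&gt;As an illustration (the key &lt;code&gt;node-restriction.kubernetes.io/platform&lt;/code&gt; below is only an example name, not one used earlier in this article), you label the node under the restricted prefix and then select on that same key in the pod spec:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

# Label the node with a key the kubelet itself cannot set:
# kubectl label nodes &amp;lt;NODE-NAME&amp;gt; node-restriction.kubernetes.io/platform=web

apiVersion: v1
kind: Pod
metadata:
  name: httpd-restricted
spec:
  containers:
  - name: httpd
    image: httpd
    imagePullPolicy: IfNotPresent
  nodeSelector:
    node-restriction.kubernetes.io/platform: web

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;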

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we discussed intentional Pod scheduling in Kubernetes and explored using a nodeSelector to schedule pods onto specific nodes. If you enjoyed this article, feel free to like, share, and subscribe. Thank you!&lt;/p&gt;

</description>
      <category>devops</category>
      <category>kubernetes</category>
      <category>cloud</category>
    </item>
    <item>
      <title>Designing a Kubernetes Cluster in GCP using OSS terraform modules</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Fri, 30 Jun 2023 09:17:42 +0000</pubDate>
      <link>https://dev.to/monarene/designing-a-kubernetes-cluster-in-gcp-using-oss-terraform-modules-1m3j</link>
      <guid>https://dev.to/monarene/designing-a-kubernetes-cluster-in-gcp-using-oss-terraform-modules-1m3j</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this article, we will leverage Terraform modules from the &lt;a href="https://cloud.google.com/foundation-toolkit" rel="noopener noreferrer"&gt;Google Foundation toolkit&lt;/a&gt; to deploy a private compute network and a Kubernetes cluster. The Google Foundation toolkit is a set of tools, modules, and packages that follow Google's best practices for deploying and maintaining architecture on the Google Cloud Platform. First, we will deploy a private compute network, and then we will deploy a Kubernetes cluster in the same network using only open-source modules. This article requires a working knowledge of Terraform and at least some familiarity with the Google Cloud Platform. &lt;/p&gt;

&lt;h2&gt;
  
  
  Project structure
&lt;/h2&gt;

&lt;p&gt;The entire project is uploaded in this &lt;a href="https://github.com/Monarene/deploy-gcp-k8s-modules" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;. The folder structure used in the project is defined below,&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;auth.tf&lt;/strong&gt; : This file contains code for creating the service account and adding the necessary IAM permissions to those service accounts. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gcloud.sh&lt;/strong&gt;: This file contains the commands that enable the container and compute services on Google Cloud. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;gke.tf&lt;/strong&gt;: This file contains the code that deploys the Google Kubernetes Engine cluster using the open-source module from the Google Foundation toolkit. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;outputs.tf&lt;/strong&gt;: This file defines the outputs of the configuration. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;provider.tf&lt;/strong&gt;: This file contains code that initializes the entire configuration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;terraform.tfvars&lt;/strong&gt;: This file contains code that sets values for the variables declared in variables.tf&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;variables.tf&lt;/strong&gt;: This file contains the variable declarations used in the configuration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vpc.tf&lt;/strong&gt;: This file contains the code that deploys the private compute network in Google Cloud using the open-source foundation toolkit&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Authentication
&lt;/h2&gt;

&lt;p&gt;I consider this the most important section, because authenticating with Google Cloud can be frustrating if not done properly. First, head over to the official documentation on authenticating with Google Cloud (&lt;a href="https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/getting_started" rel="noopener noreferrer"&gt;https://registry.terraform.io/providers/hashicorp/google/latest/docs/guides/getting_started&lt;/a&gt;). Ensure your project is configured with the appropriate permissions, and make sure the service account you are using has the &lt;code&gt;Service Account Token Creator&lt;/code&gt; role, as we will be creating service accounts on the fly. &lt;/p&gt;
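&lt;p&gt;If you are running Terraform from your own machine, one common option among those covered in the documentation above is Application Default Credentials via the gcloud CLI:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;gcloud auth application-default login
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;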

&lt;p&gt;For the next step head over to &lt;code&gt;gcloud.sh&lt;/code&gt; and run the script to enable both compute and container APIs on your Google Cloud Account. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

gcloud services enable container.googleapis.com
gcloud services enable compute.googleapis.com


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You can choose to run these commands individually or run the file as a script. &lt;/p&gt;

&lt;p&gt;Next head over to &lt;code&gt;auth.tf&lt;/code&gt; to see how we create the IAM role and enable the necessary permissions. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

resource "google_project_service" "this" {
  for_each           = toset(var.services)
  service            = "${each.key}.googleapis.com"
  disable_on_destroy = false
}

resource "google_service_account" "this" {
  account_id   = var.service_account.name
  display_name = "${var.service_account.name} Service Account"
}

resource "google_project_iam_member" "this" {
  project = var.project_id
  count   = length(var.service_account.roles)
  role    = "roles/${var.service_account.roles[count.index]}"
  member  = "serviceAccount:${google_service_account.this.email}"
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here we define a list of services to enable, and we create a service account plus an IAM member binding for each service account role, so that the service account is properly authorized to deploy our compute network and cluster. You can view the list of enabled services in &lt;code&gt;terraform.tfvars&lt;/code&gt;. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

services = [
  "cloudresourcemanager",
  "compute",
  "iam",
  "servicenetworking",
  "container"
]


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;You don't need to run anything here, as everything will be created when we apply the Terraform configuration. &lt;/p&gt;
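&lt;p&gt;The &lt;code&gt;auth.tf&lt;/code&gt; code above expects a &lt;code&gt;service_account&lt;/code&gt; variable with a name and a list of roles. A sketch of what the corresponding entry in &lt;code&gt;terraform.tfvars&lt;/code&gt; could look like (the names and roles below are illustrative, not necessarily the repository's values):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

# Illustrative shape only; see the repository's terraform.tfvars for actual values
service_account = {
  name  = "gke-cluster-sa"
  roles = [
    "container.admin",
    "compute.networkAdmin"
  ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;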

&lt;h2&gt;
  
  
  Deploying the Compute network
&lt;/h2&gt;

&lt;p&gt;To deploy the compute network, head over to &lt;code&gt;vpc.tf&lt;/code&gt;. Here we define the module for the compute network, the version to use, and the compute project IAM before deploying the network. We deploy a single subnet with secondary ranges. Finally, we enable Identity-Aware Proxy to allow SSH access over port 22 to our cluster nodes. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

module "vpc" {
  source  = "terraform-google-modules/network/google"
  version = "5.2.0"

  depends_on = [google_project_service.this["compute"]]

  project_id   = var.project_id
  network_name = var.network.name

  subnets = [
    {
      subnet_name           = var.network.subnetwork_name
      subnet_ip             = var.network.nodes_cidr_range
      subnet_region         = var.region
      subnet_private_access = "true"
    },
  ]

  secondary_ranges = {
    (var.network.subnetwork_name) = [
      {
        range_name    = "${var.network.subnetwork_name}-pods"
        ip_cidr_range = var.network.pods_cidr_range
      },
      {
        range_name    = "${var.network.subnetwork_name}-services"
        ip_cidr_range = var.network.services_cidr_range
      },
    ]
  }

  firewall_rules = [
    {
      name      = "${var.network.name}-allow-iap-ssh-ingress"
      direction = "INGRESS"
      ranges    = ["35.235.240.0/20"]
      allow = [{
        protocol = "tcp"
        ports    = ["22"]
      }]
    },
  ]
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h2&gt;
  
  
  Designing the Kubernetes Cluster
&lt;/h2&gt;

&lt;p&gt;Now to the exciting part: head over to &lt;code&gt;gke.tf&lt;/code&gt; to see how we deploy the Kubernetes cluster. First, we use a data object to grab the default Google client config, then we initialize the Kubernetes provider with a token and the cluster certificate.&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

data "google_client_config" "default" {
}
provider "kubernetes" {
  host                   = "https://${module.gke.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(module.gke.ca_certificate)
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Next, we define the Kubernetes module. We set the network to the VPC from the earlier module and grab the subnet from that same module. We enable horizontal pod autoscaling and HTTP load balancing. We also override the default node pool with a custom node pool definition. The rest of the defaults are the Google-advised parameters to get our cluster production-ready. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google"
  version = "23.3.0"
  project_id = var.project_id
  region     = var.region
  name     = var.gke.name
  regional = var.gke.regional
  zones    = var.gke.zones
  network           = module.vpc.network_name
  subnetwork        = local.subnetwork_name
  ip_range_pods     = "${local.subnetwork_name}-pods"
  ip_range_services = "${local.subnetwork_name}-services"
  service_account = google_service_account.this.email
  node_pools = [
    {
      name               = var.node_pool.name
      machine_type       = var.node_pool.machine_type
      disk_size_gb       = var.node_pool.disk_size_gb
      spot               = var.node_pool.spot
      initial_node_count = var.node_pool.initial_node_count
      max_count          = var.node_pool.max_count
      disk_type          = "pd-ssd"
    },
  ]
  # Fixed values
  network_policy             = true
  horizontal_pod_autoscaling = true
  http_load_balancing        = true
  create_service_account     = false
  initial_node_count       = 1
  remove_default_node_pool = true
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
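&lt;p&gt;The &lt;code&gt;node_pool&lt;/code&gt; variable referenced above holds the machine definition for the custom pool. A sketch of the shape it could take in &lt;code&gt;terraform.tfvars&lt;/code&gt; (the values here are illustrative, not the repository's):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

# Illustrative values; adjust to your workload and budget
node_pool = {
  name               = "primary-pool"
  machine_type       = "e2-standard-2"
  disk_size_gb       = 50
  spot               = true
  initial_node_count = 1
  max_count          = 3
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;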
&lt;h2&gt;
  
  
  Deploying the services
&lt;/h2&gt;

&lt;p&gt;This is definitely my favorite part. Before you deploy the service, configure your project ID in &lt;code&gt;terraform.tfvars&lt;/code&gt;&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

project_id = "&amp;lt;PROJECT_ID&amp;gt;"


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;To deploy, first initialize the entire configuration by running &lt;code&gt;terraform init&lt;/code&gt;, &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform init
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Next, we preview the changes to be deployed by running &lt;code&gt;terraform plan&lt;/code&gt; on the configuration, &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform plan
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frj5k7e1l4wp79i07gqy3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frj5k7e1l4wp79i07gqy3.png" alt="Terraform Plan on Kubernetes Configuration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next we deploy the configuration by running the following, &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform apply &lt;span class="nt"&gt;-var-file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;terraform.tfvars &lt;span class="nt"&gt;--auto-approve&lt;/span&gt;
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the deployment is successful, log in to the &lt;a href="https://console.cloud.google.com" rel="noopener noreferrer"&gt;console&lt;/a&gt; to verify it. First, we check &lt;a href="https://console.cloud.google.com/kubernetes" rel="noopener noreferrer"&gt;Kubernetes Engine&lt;/a&gt; to verify that our cluster is properly deployed. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5w9wvpnp80w2tn64dh68.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5w9wvpnp80w2tn64dh68.png" alt="Kubernetes Deployed Cluster"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we check the &lt;a href="https://console.cloud.google.com/networking/networks/" rel="noopener noreferrer"&gt;VPC Network tab&lt;/a&gt; to verify that the Compute network is deployed correctly. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxfe9xqz9zi6al7ga7dxp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxfe9xqz9zi6al7ga7dxp.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Lastly, we check our node pool to verify that the instances were indeed created, &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat81iv1n95xefcc4jsb3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fat81iv1n95xefcc4jsb3.png" alt="Node Pool Created in GCP"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Yayyyy! Our deployment is successful and we have a working Kubernetes Cluster in a Private Compute Network. &lt;/p&gt;

&lt;p&gt;Lastly, please delete all the resources you created by running the following command,&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform destroy &lt;span class="nt"&gt;--auto-approve&lt;/span&gt;
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Please do this so as not to incur additional charges for the deployment.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Thank you for following along with this deployment. Feel free to extend it and even raise a PR against the main repository (&lt;a href="https://github.com/Monarene/deploy-gcp-k8s-modules" rel="noopener noreferrer"&gt;https://github.com/Monarene/deploy-gcp-k8s-modules&lt;/a&gt;). If you enjoyed this article, feel free to share it and star the repository. Thank you!&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.amazon.com/Terraform-Google-Cloud-Essential-Guide/dp/1804619620" rel="noopener noreferrer"&gt;Terraform for Google Cloud Essential Guide&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://cloud.google.com/foundation-toolkit" rel="noopener noreferrer"&gt;Google Foundation Tool-Kit&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://registry.terraform.io/modules/terraform-google-modules/network/google/latest" rel="noopener noreferrer"&gt;Terraform Module for Compute Network by Google&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://registry.terraform.io/modules/terraform-google-modules/kubernetes-engine/google/latest" rel="noopener noreferrer"&gt;Terraform Module for Kubernetes Engine&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
    </item>
    <item>
      <title>Utilizing Google Cloud Storage as a remote backend for Terraform</title>
      <dc:creator>Michael Mekuleyi</dc:creator>
      <pubDate>Wed, 31 May 2023 08:39:08 +0000</pubDate>
      <link>https://dev.to/monarene/utilizing-google-cloud-storage-as-a-remote-backend-for-terraform-3ijk</link>
      <guid>https://dev.to/monarene/utilizing-google-cloud-storage-as-a-remote-backend-for-terraform-3ijk</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In this article, I will discuss using Google Cloud Storage as a remote backend for your Terraform configuration. This article is a sequel to my article on &lt;a href="https://hackernoon.com/deploying-a-terraform-remote-state-backend-with-aws-s3-and-dynamodb" rel="noopener noreferrer"&gt;Deploying a Remote backend with AWS S3 and Terraform&lt;/a&gt;; feel free to check out that article to learn more about remote state backends on AWS. &lt;/p&gt;

&lt;p&gt;In this article, we will provision a Google Cloud Storage (GCS) bucket and use it to store its own state; then we will provision a compute instance on the Google Cloud Platform and store its state file in the remote backend we enabled earlier. This article assumes a working knowledge of Google Cloud (cloud.google.com) and an understanding of Terraform (&lt;a href="https://www.terraform.io/" rel="noopener noreferrer"&gt;https://www.terraform.io/&lt;/a&gt;). You can find the repository for this tutorial &lt;a href="https://github.com/Monarene/gcp-remote-backend-tf" rel="noopener noreferrer"&gt;here&lt;/a&gt;. &lt;/p&gt;

&lt;h2&gt;
  
  
  Setting up the remote backend
&lt;/h2&gt;

&lt;p&gt;The idea of a remote backend is to move your state file safely from your local computer to a reliable remote location, which eases collaboration and multi-tasking. To get started, head to the &lt;code&gt;global-resources&lt;/code&gt; folder in the &lt;a href="https://github.com/Monarene/gcp-remote-backend-tf" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt; to view the configuration scripts. First, we will deploy a GCS bucket using the local state, then we will use the GCS bucket to manage its own state. Head over to &lt;code&gt;global-resources/terraform.tf&lt;/code&gt;. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

terraform {
required_version = "&amp;gt;= 1.3.0, &amp;lt; 2.0.0"
 /* backend "gcs" {
    bucket = "&amp;lt;YOUR-BUCKET-NAME&amp;gt;"
    prefix = "global-resources/"
  } */

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~&amp;gt; 4.40"
    }
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here we initialize the required providers and set the necessary values for the GCS bucket, note that the backend is commented out, this is because we are yet to deploy the GCS bucket. Head over to &lt;code&gt;global-resources/bucket.tf&lt;/code&gt; to see the configuration to deploy the cloud storage bucket. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

resource "google_storage_bucket" "default" {
  name          = var.bucket_name
  force_destroy = true
  location      = "US"
  storage_class = "STANDARD"
  versioning {
    enabled = true
  }
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here, we define the bare minimum of values for a GCS bucket, and we enable versioning to help preserve state history. Also, remote backends on GCS support state locking by default, hence there is no need to provision a key store. After entering the necessary variables in &lt;code&gt;global-resources/variables.tf&lt;/code&gt;, we go on to deploy the configuration. &lt;/p&gt;

&lt;p&gt;First we initialize the configuration,&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform init
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm77hhrvaqqimgv37kg23.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm77hhrvaqqimgv37kg23.png" alt="Terraform Init on Configuration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then we go on to check the configuration plan,&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform plan
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flvtllzhsqp3p9m3ehv9o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flvtllzhsqp3p9m3ehv9o.png" alt="Terraform plan on Configuration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next we apply the configuration, &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform apply &lt;span class="nt"&gt;--auto-approve&lt;/span&gt;
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Next, we log in to the GCP console to check that the storage bucket is already created. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jscwnlx33k57m1ppbjl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0jscwnlx33k57m1ppbjl.png" alt="Google Cloud Storage Bucket on GCP"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we are going to switch the storage bucket to use itself to manage its state file. Head over to &lt;code&gt;global-resources/terraform.tf&lt;/code&gt; and uncomment the &lt;code&gt;backend&lt;/code&gt; object in the terraform block. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

terraform {
required_version = "&amp;gt;= 1.3.0, &amp;lt; 2.0.0"
 backend "gcs" {
    bucket = "&amp;lt;YOUR-BUCKET-NAME&amp;gt;"
    prefix = "global-resources/"
  }

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~&amp;gt; 4.40"
    }
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now migrate to the remote state by re-initializing the Terraform configuration. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform init
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When prompted on copying the existing state to the new backend, type "yes". &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fudqy803vh4o3fd67yv21.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fudqy803vh4o3fd67yv21.png" alt="Re-initializing the Terraform State file"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we head over to the console to confirm that our remote state is in GCS; you can find the state file under the &lt;code&gt;global-resources&lt;/code&gt; folder in the GCS bucket. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg1sqh0f3kd7tn2w9xfv8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg1sqh0f3kd7tn2w9xfv8.png" alt="State file in Google Cloud Storage Bucket"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We now have a well-configured backend backed by a Google Cloud Storage bucket. &lt;/p&gt;
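&lt;p&gt;A useful side effect of a remote backend is that other configurations can read this state's outputs through Terraform's &lt;code&gt;terraform_remote_state&lt;/code&gt; data source. A minimal sketch, assuming the same bucket name and the &lt;code&gt;global-resources/&lt;/code&gt; prefix used above:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

data "terraform_remote_state" "global" {
  backend = "gcs"
  config = {
    bucket = "&amp;lt;YOUR-BUCKET-NAME&amp;gt;"
    prefix = "global-resources/"
  }
}

# Outputs of the global-resources configuration are then available as
# data.terraform_remote_state.global.outputs.&amp;lt;OUTPUT-NAME&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;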

&lt;h2&gt;
  
  
  Applying the remote backend in other configurations
&lt;/h2&gt;

&lt;p&gt;Now we will provision three compute instances using the &lt;code&gt;count&lt;/code&gt; meta-argument in Terraform and store the state file in the GCS bucket. First, head over to &lt;code&gt;compute-instance/terraform.tf&lt;/code&gt; to see the terraform configuration. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

terraform {
  required_version = "~&amp;gt; 1.3"
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~&amp;gt; 4.40"
    }
  }
  backend "gcs" {
    bucket = "&amp;lt;YOUR-BUCKET-NAME&amp;gt;"
    prefix = "compute-instance"
  }
}

provider "google" {
  project = var.project_id
  region  = var.region
  zone    = var.zone
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here we declare the providers needed to run the configuration; notice that we set the backend object in the terraform block to &lt;code&gt;gcs&lt;/code&gt;, pointing to the remote backend we created earlier. Let's head over to &lt;code&gt;compute-instance/main.tf&lt;/code&gt; to view the main configuration. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

resource "google_compute_instance" "this" {
  provider     = google
  count        = 3
  name         = "${var.server_name}-${count.index}"
  machine_type = var.machine_type
  zone         = var.zone

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }
  network_interface {
    network = "default"
    access_config {
      // Ephemeral public IP
    }
  }
  metadata_startup_script = file("startup.sh")

  tags = ["http-server"]
}


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here we define 3 compute instances using the &lt;code&gt;count&lt;/code&gt; argument, and we set other important variables like the machine type and server name. You can also check out &lt;code&gt;compute-instance/startup.sh&lt;/code&gt; for the startup script that runs when each server boots. Finally, I have added an &lt;code&gt;http-server&lt;/code&gt; tag to allow ingress on the default HTTP ports. Please go ahead and study the configuration to understand the different connecting parts. &lt;/p&gt;
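&lt;p&gt;For context, a startup script of this kind is typically a short shell script executed at first boot. A minimal sketch of what such a script could look like for a Debian image serving HTTP (the actual &lt;code&gt;startup.sh&lt;/code&gt; in the repository may differ):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;

#!/bin/bash
# Example only; see compute-instance/startup.sh for the repository's actual script.
# Install and start Apache so the instance answers on the default HTTP port.
apt-get update
apt-get install -y apache2
systemctl enable --now apache2
echo "Hello from $(hostname)" &amp;gt; /var/www/html/index.html

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;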

&lt;p&gt;To deploy this configuration, we start by initializing it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform init
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpgtggxbwoea87az7u12.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpgtggxbwoea87az7u12.png" alt="Initiliazing the Google Compute Resource on Terraform"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note the migration to the Google Cloud Storage backend. Next, we view the plan for the configuration, &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform plan
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1qe5glneu54pzrmq6tj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh1qe5glneu54pzrmq6tj.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Next, we apply the configuration and get its outputs. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;

&lt;/span&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform apply &lt;span class="nt"&gt;--auto-approve&lt;/span&gt;
&lt;span class="go"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0pul4r0fikdwequ71in.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0pul4r0fikdwequ71in.png" alt="Applying the Terraform Configuration"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To verify that our compute instances have been deployed correctly, we log in to the Compute Engine console on GCP: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft0pn08s5hejjxfb88at9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft0pn08s5hejjxfb88at9.png" alt="GCP Compute Console showing the deployed Compute Resources"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Finally, we check our GCS bucket to verify that the Terraform state file is stored there. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjklrxkks3tqxuctpfx0k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjklrxkks3tqxuctpfx0k.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We have successfully created a remote backend with a GCS bucket and used it to store our state files. Please remember to destroy all the resources you have created to avoid extra billing charges. &lt;/p&gt;
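&lt;p&gt;Everything created here can be torn down in one step:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;michael@monarene:~$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;terraform destroy &lt;span class="nt"&gt;--auto-approve&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;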

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;In this article, we explored creating a Google Cloud Storage bucket, using it to store our state files, and then utilizing those state files for further deployments. We did all of this with Terraform as an IaC tool for managing infrastructure. You can find the GitHub repository for this article &lt;a href="https://github.com/Monarene/gcp-remote-backend-tf" rel="noopener noreferrer"&gt;here&lt;/a&gt;. I hope you learnt a lot; feel free to like, share, and comment on this article. Thank you!&lt;/p&gt;

</description>
      <category>terraform</category>
      <category>googlecloud</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
