Avesh
Kubernetes Use Case: Deploying and Managing a Scalable Web Application

Introduction to Kubernetes

Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. Originally developed at Google and now maintained by the Cloud Native Computing Foundation (CNCF), Kubernetes is widely used for cloud-native development and excels at running applications that require high availability, scalability, and fault tolerance.

In this article, we’ll explore a real-world use case for Kubernetes by setting up a scalable web application. We'll go through step-by-step instructions for deploying and managing this application on Kubernetes.

Use Case Scenario: Scalable Web Application

Consider a scenario where we have a web application that experiences high traffic. We need the application to be available at all times, able to handle scaling dynamically based on demand, and have failover capabilities to recover from unexpected failures.

Requirements:

  1. Scalability: The application must scale out (add more instances) or scale in (reduce instances) based on demand.
  2. Load Balancing: Incoming traffic should be evenly distributed across all instances of the application.
  3. Resilience: The application should be able to self-heal, automatically replacing any failed instances.

In this example, we’ll deploy a simple Node.js web application on Kubernetes and use Kubernetes features like Deployments, Services, and Horizontal Pod Autoscalers to fulfill these requirements.

Kubernetes Components Used

  1. Pods: The smallest deployable unit in Kubernetes, wrapping one or more containers that share networking and storage.
  2. Deployment: Defines the desired state and manages the number of replicas for our application.
  3. Service: Exposes our application and balances the load across pods.
  4. Horizontal Pod Autoscaler (HPA): Automatically scales the number of pod replicas based on CPU or memory usage.

Example Architecture

  • Node.js Web Application: A simple HTTP server that returns a "Hello, World!" message.
  • Nginx Ingress: An ingress controller that routes and load-balances incoming requests (see the Ingress sketch in step 4).
  • Kubernetes Cluster: Running locally (using Minikube) or in the cloud (e.g., Google Kubernetes Engine, AWS EKS).

Step-by-Step Implementation

1. Setting Up Kubernetes Environment

If you don’t have a Kubernetes cluster set up, you can use Minikube for local development or a managed Kubernetes service (like GKE or EKS) for production-grade deployments. Here’s how to set up Minikube:

# Install Minikube
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube

# Start Minikube
minikube start

Verify that the cluster is running:

kubectl get nodes

2. Creating the Node.js Application

For demonstration, we’ll use a simple Node.js application that returns "Hello, World!" when accessed.

// app.js
const http = require('http');
const PORT = process.env.PORT || 3000;

const requestHandler = (req, res) => {
  res.end('Hello, World!');
};

const server = http.createServer(requestHandler);
server.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});
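
You can sanity-check the server locally before containerizing it (assuming Node.js is installed):

node app.js
# In a second terminal:
curl http://localhost:3000    # prints "Hello, World!"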

Create a Dockerfile to containerize the application:

# Dockerfile
FROM node:18-alpine
WORKDIR /app
COPY app.js .
EXPOSE 3000
CMD ["node", "app.js"]

Build and push the Docker image:

docker build -t <your_dockerhub_username>/node-app:v1 .
docker push <your_dockerhub_username>/node-app:v1
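
Before pushing, you can run the container locally to confirm it works. If you are developing against Minikube, recent Minikube versions can also load the image straight into the cluster, skipping the registry push:

# Test the container locally
docker run --rm -p 3000:3000 <your_dockerhub_username>/node-app:v1
# In a second terminal:
curl http://localhost:3000

# Alternative for Minikube: load the local image into the cluster
minikube image load <your_dockerhub_username>/node-app:v1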

3. Creating Kubernetes Deployment and Service

Define a Deployment YAML file (deployment.yaml) for the Node.js app:

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
      - name: node-container
        image: <your_dockerhub_username>/node-app:v1
        ports:
        - containerPort: 3000
        resources:
          requests:
            cpu: 100m      # a CPU request is required for the CPU-based HPA in step 5
            memory: 64Mi
          limits:
            cpu: 250m
            memory: 128Mi

Create a Service YAML file (service.yaml) to expose the application within the Kubernetes cluster:

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: node-app-service
spec:
  selector:
    app: node-app
  ports:
  - protocol: TCP
    port: 80
    targetPort: 3000
  type: LoadBalancer

Apply these configurations:

kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
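
After applying, you can confirm that the Service has discovered both replicas; the listed endpoints are the pod IPs matched by the app: node-app selector:

kubectl get endpoints node-app-service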

4. Exposing the Service

To expose the service, you can use minikube service (for Minikube):

minikube service node-app-service

Or, in a managed Kubernetes cluster, the cloud provider provisions an external load balancer for a Service of type LoadBalancer automatically; alternatively, you can configure an Ingress to expose the application.
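
As an illustration, a minimal Ingress manifest might look like the sketch below. It assumes an NGINX ingress controller is already installed in the cluster and uses node-app.example.com as a placeholder hostname:

# ingress.yaml -- illustrative sketch; hostname and ingress class are assumptions
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: node-app-ingress
spec:
  ingressClassName: nginx           # assumes the NGINX ingress controller
  rules:
  - host: node-app.example.com      # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: node-app-service  # the Service created in step 3
            port:
              number: 80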

5. Setting Up Auto-Scaling

Define a Horizontal Pod Autoscaler (HPA) that automatically adjusts the number of pods based on CPU usage. Create a file (hpa.yaml):

# hpa.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: node-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: node-app
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
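
Equivalently, the same autoscaler can be created imperatively with kubectl:

kubectl autoscale deployment node-app --cpu-percent=50 --min=2 --max=10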

Apply the HPA configuration:

kubectl apply -f hpa.yaml
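
One prerequisite worth noting: the HPA reads CPU usage from the Kubernetes metrics API, so the cluster needs metrics-server running. Managed clusters usually ship it; on Minikube it is an addon:

# Enable metrics-server on Minikube
minikube addons enable metrics-server

# Verify that pod metrics are being collected
kubectl top pods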

The HPA monitors average CPU utilization across the pods, measured against the CPU request set in the Deployment. If utilization exceeds the 50% target, replicas are added (up to ten); if it falls, replicas are removed, down to a minimum of two.
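
To make the scaling math concrete: the controller computes the desired replica count as roughly ceil(currentReplicas × currentUtilization ÷ targetUtilization). For example, if the two replicas average 90% CPU against the 50% target, the HPA scales out to ceil(2 × 90 / 50) = 4 pods.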

6. Testing the Deployment

  1. Verify the deployment: Check the status of your pods and services.

   kubectl get pods
   kubectl get services

  2. Generate load for scaling: Simulate load to trigger the HPA.

   kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh
   # Inside the load generator shell, run:
   while true; do wget -q -O- http://node-app-service; done

   Here node-app-service resolves through the cluster's internal DNS, so there is no need to look up the Service IP manually. The HPA should scale up additional pods once the CPU usage threshold is reached.

  3. Monitor scaling: Observe the scaling activity.

   kubectl get hpa
   kubectl get pods -w

Conclusion

This example illustrates how Kubernetes enables us to deploy a scalable, highly available web application with minimal configuration and management overhead. With just a few resource definitions, we can:

  • Automatically scale our application to handle increased load.
  • Load balance requests among multiple instances.
  • Ensure high availability and fault tolerance through Kubernetes’ self-healing capabilities.

By applying this approach to larger, more complex applications, teams can improve operational efficiency and ensure that applications are resilient and responsive to changing demands.
