<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Keyvan Soleimani</title>
    <description>The latest articles on DEV Community by Keyvan Soleimani (@keyvan_soleimani).</description>
    <link>https://dev.to/keyvan_soleimani</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3122023%2Facbdf4a5-5dc9-445a-9b4f-100a22643a21.jpg</url>
      <title>DEV Community: Keyvan Soleimani</title>
      <link>https://dev.to/keyvan_soleimani</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/keyvan_soleimani"/>
    <language>en</language>
    <item>
      <title>PySpark &amp; Jupyter Notebooks Deployed On Kubernetes</title>
      <dc:creator>Keyvan Soleimani</dc:creator>
      <pubDate>Mon, 05 May 2025 12:15:24 +0000</pubDate>
      <link>https://dev.to/keyvan_soleimani/pyspark-jupyter-notebooks-deployed-on-kubernetes-4ck0</link>
      <guid>https://dev.to/keyvan_soleimani/pyspark-jupyter-notebooks-deployed-on-kubernetes-4ck0</guid>
      <description>&lt;p&gt;&lt;strong&gt;PySpark&lt;/strong&gt; is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Jupyter Notebook&lt;/strong&gt; is used to create interactive notebook documents that can contain live code, equations, visualizations, media and other computational outputs. Jupyter Notebook is often used by programmers, data scientists and students to document and demonstrate coding workflows or simply experiment with code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes&lt;/strong&gt; is an open-source container orchestration system for automating software deployment, scaling, and management, originally designed by Google.&lt;/p&gt;




&lt;p&gt;The Kubernetes control-plane &amp;amp; worker node addresses are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;192.168.56.115
192.168.56.116
192.168.56.117
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpoa8ky8vzrbs6oudvzwd.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpoa8ky8vzrbs6oudvzwd.webp" alt="Image description" width="800" height="388"&gt;&lt;/a&gt;&lt;br&gt;
Kubernetes cluster nodes:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdntmx5dqiaw7xvxcnbep.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdntmx5dqiaw7xvxcnbep.webp" alt="Image description" width="800" height="66"&gt;&lt;/a&gt;&lt;br&gt;
you can install Helm via the link &lt;a href="https://helm.sh/docs/intro/install/" rel="noopener noreferrer"&gt;helm&lt;/a&gt;.&lt;/p&gt;



&lt;p&gt;Install Spark on Kubernetes via the Bitnami Helm chart.&lt;/p&gt;

&lt;p&gt;The steps:&lt;/p&gt;

&lt;p&gt;You can install the Helm chart via the link &lt;a href="https://artifacthub.io/packages/helm/bitnami/spark/8.7.2?modal=install" rel="noopener noreferrer"&gt;helm chart&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; the Spark version deployed by the Helm chart must match the PySpark version installed in Jupyter.&lt;/p&gt;

&lt;p&gt;Install Spark via the Bitnami Helm chart:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuddonyujcspzwxd1zjlg.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuddonyujcspzwxd1zjlg.webp" alt="Image description" width="800" height="328"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm search repo bitnami
$ helm install kayvan-release bitnami/spark --version 8.7.2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Deploy the Jupyter workloads:&lt;/p&gt;

&lt;p&gt;jupyter.yaml:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: apps/v1
kind: Deployment
metadata:
   name: jupiter-spark
   namespace: default
spec:
   replicas: 1
   selector:
      matchLabels:
         app: spark
   template:
      metadata:
         labels:
            app: spark
      spec:
         containers:
            - name: jupiter-spark-container
              image: docker.arvancloud.ir/jupyter/all-spark-notebook
              imagePullPolicy: IfNotPresent
              ports:
              - containerPort: 8888
              env: 
              - name: JUPYTER_ENABLE_LAB
                value: "yes"
---
apiVersion: v1
kind: Service
metadata:
   name: jupiter-spark-svc
   namespace: default
spec:
   type: NodePort
   selector:
      app: spark
   ports:
      - port: 8888
        targetPort: 8888
        nodePort: 30001
---
apiVersion: v1
kind: Service
metadata:
  name: jupiter-spark-driver-headless
spec:
  clusterIP: None
  selector:
    app: spark
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f jupyter.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The installed &lt;strong&gt;pods&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne5jos0qxie0oeoh3a93.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fne5jos0qxie0oeoh3a93.webp" alt="Image description" width="741" height="186"&gt;&lt;/a&gt;&lt;br&gt;
and &lt;strong&gt;Services&lt;/strong&gt; (headless, for stateful workloads):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkfoub55mhjtbf6ckaw3.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqkfoub55mhjtbf6ckaw3.webp" alt="Image description" width="800" height="177"&gt;&lt;/a&gt;&lt;br&gt;
Note: the &lt;strong&gt;Spark master&lt;/strong&gt; URL is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spark://kayvan-release-spark-master-0.kayvan-release-spark-headless.default.svc.cluster.local:7077
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
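&lt;p&gt;The master URL above follows Kubernetes' DNS naming for a pod behind a headless service: pod-name.service-name.namespace.svc.cluster.local. A small Python sketch of how that address is composed:&lt;/p&gt;

```python
# Compose the Spark master URL from its Kubernetes DNS components:
# <pod>.<headless-service>.<namespace>.svc.cluster.local:<port>
pod = "kayvan-release-spark-master-0"
service = "kayvan-release-spark-headless"
namespace = "default"
master_url = f"spark://{pod}.{service}.{namespace}.svc.cluster.local:7077"
print(master_url)
```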



&lt;p&gt;Open a &lt;strong&gt;Jupyter notebook&lt;/strong&gt;, write some Python code based on PySpark, and press Shift + Enter on each cell to execute it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#import os

#os.environ['PYSPARK_SUBMIT_ARGS']='pyspark-shell'
#os.environ['PYSPARK_PYTHON']='/opt/bitnami/python/bin/python'
#os.environ['PYSPARK_DRIVER_PYTHON']='/opt/bitnami/python/bin/python'

import socket

from pyspark.sql import SparkSession

spark = SparkSession.builder.master("spark://kayvan-release-spark-master-0.kayvan-release-spark-headless.default.svc.cluster.local:7077")\
            .appName("Mahla").config('spark.driver.host', socket.gethostbyname(socket.gethostname()))\
            .getOrCreate()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;socket.gethostbyname(socket.gethostname()) ---&amp;gt; returns the jupyter pod's ip address
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
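&lt;p&gt;Inside the pod, this resolves the container's own hostname to its cluster IP, which executors then use to connect back to the driver. A quick standalone check:&lt;/p&gt;

```python
import socket

# Resolve this machine's hostname to an IP address, exactly as the
# notebook does to set spark.driver.host inside the pod.
hostname = socket.gethostname()
try:
    driver_ip = socket.gethostbyname(hostname)
except socket.gaierror:
    # Fall back if the hostname is not resolvable outside a pod.
    driver_ip = "127.0.0.1"
print(hostname, "->", driver_ip)
```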



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd5hmldktwaaqqeeycmdk.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd5hmldktwaaqqeeycmdk.webp" alt="Image description" width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8u977e1ynx3oigd9urb5.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8u977e1ynx3oigd9urb5.webp" alt="Image description" width="702" height="572"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4p3i959l6ohjz7e2njs.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4p3i959l6ohjz7e2njs.webp" alt="Image description" width="676" height="317"&gt;&lt;/a&gt;&lt;br&gt;
Enjoy sending Python code to the Spark cluster on Kubernetes via Jupyter.&lt;/p&gt;

&lt;p&gt;Note: you can of course work with single-node PySpark installed in Jupyter, without Kubernetes; once you are sure the code is correct, submit it to the Spark cluster on Kubernetes via spark-submit or a session like the one above.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Deploy on Docker Desktop:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;docker-compose.yml:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: '3.6'

services:

  spark-master:
    container_name: spark
    image: docker.arvancloud.ir/bitnami/spark:3.5.0
    environment:
      - SPARK_MODE=master
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=root   
      - PYSPARK_PYTHON=/opt/bitnami/python/bin/python3
    ports:
      - 127.0.0.1:8081:8080
      - 127.0.0.1:7077:7077
    networks:
      - spark-network

  spark-worker:
    image: docker.arvancloud.ir/bitnami/spark:3.5.0
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark:7077
      - SPARK_WORKER_MEMORY=2G
      - SPARK_WORKER_CORES=2
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=root
      - PYSPARK_PYTHON=/opt/bitnami/python/bin/python3
    networks:
      - spark-network


  jupyter:
    image:  docker.arvancloud.ir/jupyter/all-spark-notebook:latest
    container_name: jupyter
    ports:
      - "8888:8888"
    environment:
      - JUPYTER_ENABLE_LAB=yes
    networks:
      - spark-network
    depends_on:
      - spark-master


networks:
  spark-network:
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run in a terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker-compose up --scale spark-worker=2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3humtxgqw2xmkod2m1oa.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3humtxgqw2xmkod2m1oa.webp" alt="Image description" width="800" height="237"&gt;&lt;/a&gt;&lt;br&gt;
Copy the CSV file into each Spark worker container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker cp file.csv spark-worker-1:/opt/file
docker cp file.csv spark-worker-2:/opt/file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open a Jupyter notebook, write some Python code based on PySpark, and press Shift + Enter on each cell to execute it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from pyspark.sql import SparkSession

# Create a Spark session; the master address here is the Spark master
# container's hostname (its container ID). With the compose file above,
# spark://spark:7077 should also resolve via the container name.
spark = SparkSession.builder.appName("YourAppName")\
            .master("spark://8fa1bd982ade:7077").getOrCreate()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;data = spark.read.csv("/opt/file/file.csv", header=True)
data.limit(3).show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
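&lt;p&gt;Here header=True tells Spark to treat the first CSV row as column names rather than data. The same idea in plain Python, stdlib only, with a made-up inline sample standing in for /opt/file/file.csv:&lt;/p&gt;

```python
import csv
import io

# Hypothetical sample rows standing in for the real file.csv.
raw = "name,age\nAli,30\nSara,25\nReza,41\n"

# DictReader consumes the first row as the header, like header=True.
rows = list(csv.DictReader(io.StringIO(raw)))
print(rows[:3])  # analogous to data.limit(3).show()
```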





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;spark.stop()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqg6w4wztll7b9p94dvx.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwqg6w4wztll7b9p94dvx.webp" alt="Image description" width="800" height="483"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7s688192vhlxahy2squd.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7s688192vhlxahy2squd.webp" alt="Image description" width="800" height="333"&gt;&lt;/a&gt;&lt;br&gt;
Note again: you can work with single-node PySpark installed in Jupyter, without a Spark cluster; once you are sure the code is correct, submit it to the Spark cluster on Docker Desktop via spark-submit or a session like the one above.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Execute on the PySpark installed in Jupyter, not on the Spark cluster&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Copy the CSV file into the Jupyter container:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker cp file.csv jupyter:/opt/file
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from pyspark.sql import SparkSession

# Create a Spark session
spark = SparkSession.builder.appName("YourAppName").getOrCreate()

data = spark.read.csv("/opt/file/file.csv", header=True)
data.limit(3).show()

spark.stop()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can also practice with single-node PySpark in Jupyter:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2bfjxzz1d6bu39odg14.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn2bfjxzz1d6bu39odg14.webp" alt="Image description" width="684" height="502"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Congratulations 🍹&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>docker</category>
      <category>pyspark</category>
      <category>jupyter</category>
    </item>
    <item>
      <title>Spark On Kubernetes</title>
      <dc:creator>Keyvan Soleimani</dc:creator>
      <pubDate>Mon, 05 May 2025 11:45:11 +0000</pubDate>
      <link>https://dev.to/keyvan_soleimani/spark-on-kubernetes-2l9d</link>
      <guid>https://dev.to/keyvan_soleimani/spark-on-kubernetes-2l9d</guid>
      <description>&lt;p&gt;Spark On Kubernetes via &lt;strong&gt;helm chart&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This article introduces Apache Spark on Kubernetes, a feature officially announced as production-ready and generally available with the Apache Spark 3.1 release in March 2021.&lt;/p&gt;

&lt;p&gt;To gain hands-on experience with the feature, we walk through running Apache Spark on a local Kubernetes cluster, exploring the nuances of environment setup and the arguments used when submitting jobs.&lt;/p&gt;




&lt;p&gt;The &lt;strong&gt;control-plane&lt;/strong&gt; &amp;amp; &lt;strong&gt;worker&lt;/strong&gt; node addresses are:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;192.168.56.115
192.168.56.116
192.168.56.117
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6pbjz5iy15dtzukwsc0v.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6pbjz5iy15dtzukwsc0v.webp" alt="Image description" width="800" height="388"&gt;&lt;/a&gt;&lt;br&gt;
Kubernetes cluster &lt;strong&gt;nodes&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frid4w4k0vtf6nwh0pthu.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frid4w4k0vtf6nwh0pthu.webp" alt="Image description" width="800" height="66"&gt;&lt;/a&gt;&lt;br&gt;
you can install Helm via the link &lt;a href="https://helm.sh/docs/intro/install" rel="noopener noreferrer"&gt;helm&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Installing Apache Spark&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The steps:&lt;/p&gt;

&lt;p&gt;1) Install Spark via the &lt;strong&gt;Bitnami&lt;/strong&gt; Helm chart:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm search repo bitnami
$ helm install kayvan-release oci://registry-1.docker.io/bitnamicharts/spark
$ helm upgrade kayvan-release bitnami/spark --set worker.replicaCount=5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The six installed &lt;strong&gt;pods&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5yxbpxxdgjl27muf2i6l.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5yxbpxxdgjl27muf2i6l.webp" alt="Image description" width="699" height="181"&gt;&lt;/a&gt;&lt;br&gt;
and &lt;strong&gt;Services&lt;/strong&gt; (headless, for stateful workloads):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpjxavsrksjxeqynio09.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpjxavsrksjxeqynio09.webp" alt="Image description" width="800" height="85"&gt;&lt;/a&gt;&lt;br&gt;
and the &lt;strong&gt;Spark master UI&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffs73r72y4aclki5jykom.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffs73r72y4aclki5jykom.webp" alt="Image description" width="800" height="389"&gt;&lt;/a&gt;&lt;br&gt;
2) Run the commands below against the Kubernetes &lt;strong&gt;kube-apiserver&lt;/strong&gt; (via kubectl):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec -it kayvan-release-spark-master-0 -- ./bin/spark-submit \
 --class org.apache.spark.examples.SparkPi \
 --master spark://kayvan-release-spark-master-0.kayvan-release-spark-headless.default.svc.cluster.local:7077 \
 ./examples/jars/spark-examples_2.12-3.4.1.jar 1000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;or&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec -it kayvan-release-spark-master-0 -- /bin/bash


./bin/spark-submit \
 --class org.apache.spark.examples.SparkPi \
 --master spark://kayvan-release-spark-master-0.kayvan-release-spark-headless.default.svc.cluster.local:7077 \
 ./examples/jars/spark-examples_2.12-3.4.1.jar 1000


./bin/spark-submit \
 --master spark://kayvan-release-spark-master-0.kayvan-release-spark-headless.default.svc.cluster.local:7077 \
 ./examples/src/main/python/pi.py 1000


./bin/spark-submit \
 --master spark://kayvan-release-spark-master-0.kayvan-release-spark-headless.default.svc.cluster.local:7077 \
 ./examples/src/main/python/wordcount.py //filepath
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ik3ro5isopf74mt8xt5.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ik3ro5isopf74mt8xt5.webp" alt="Image description" width="800" height="92"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmoil9pxxqoru5vf3ur6.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsmoil9pxxqoru5vf3ur6.webp" alt="Image description" width="761" height="516"&gt;&lt;/a&gt;&lt;br&gt;
the exact Scala &amp;amp; Python code of spark-examples_2.12-3.4.1.jar, pi.py &amp;amp; wordcount.py:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkPi.scala" rel="noopener noreferrer"&gt;SparkPi.scala&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/apache/spark/blob/master/examples/src/main/python/pi.py" rel="noopener noreferrer"&gt;pi.py&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/apache/spark/blob/master/examples/src/main/python/wordcount.py" rel="noopener noreferrer"&gt;wordcount.py&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;3) The &lt;strong&gt;final result&lt;/strong&gt; is 🍹 :&lt;/p&gt;

&lt;p&gt;for scala :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F042qr8qwl1do5erklu4p.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F042qr8qwl1do5erklu4p.webp" alt="Image description" width="800" height="222"&gt;&lt;/a&gt;&lt;br&gt;
for python :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkr1ktvnwsa95778qhisi.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkr1ktvnwsa95778qhisi.webp" alt="Image description" width="800" height="216"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favtwhgrr6dnq3f38i3j1.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Favtwhgrr6dnq3f38i3j1.webp" alt="Image description" width="800" height="227"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;The other &lt;strong&gt;Python&lt;/strong&gt; program:&lt;/p&gt;

&lt;p&gt;1) Copy &lt;strong&gt;People.csv&lt;/strong&gt; (a large file) into the Spark worker pods:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fow0b5atij83lw94vfy3a.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fow0b5atij83lw94vfy3a.webp" alt="Image description" width="661" height="96"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl cp people.csv kayvan-release-spark-worker-{x}:/opt/bitnami/spark
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;you can download the file from this &lt;a href="https://www.datablist.com/learn/csv/download-sample-csv-files" rel="noopener noreferrer"&gt;link&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;you can also read the large CSV file from an &lt;strong&gt;NFS shared folder&lt;/strong&gt; instead of copying it into the pods.&lt;/li&gt;
&lt;/ul&gt;
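&lt;p&gt;For the NFS option, a shared folder can be mounted with Kubernetes' built-in nfs volume type. A minimal illustrative pod-spec fragment (the server address and export path are hypothetical; with the Bitnami chart you would wire this in through the chart's worker volume values rather than editing pods directly):&lt;/p&gt;

```yaml
# Illustrative fragment: mount an NFS export so workers can read
# people.csv without copying it into each pod.
spec:
  containers:
    - name: spark-worker
      volumeMounts:
        - name: shared-data
          mountPath: /mnt/data   # read people.csv from /mnt/data/people.csv
  volumes:
    - name: shared-data
      nfs:
        server: 192.168.56.120   # hypothetical NFS server
        path: /exports/spark     # hypothetical export path
```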

&lt;p&gt;2) Write the following Python code in &lt;strong&gt;readcsv.py&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from pyspark.sql import SparkSession
#from pyspark.sql.functions import sum
from pyspark.context import SparkContext

spark = SparkSession\
 .builder\
 .appName("Mahla")\
 .getOrCreate()


sc = spark.sparkContext

path = "people.csv"

df = spark.read.options(delimiter=",", header=True).csv(path)

df.show()

#df.groupBy("Job Title").sum().show()

df.createOrReplaceTempView("Peopletable")
df2 = spark.sql("select Sex, count(1) countsex, sum(Index) sex_sum " \
                     "from peopletable group by Sex")

df2.show()

#df.select(sum(df.Index)).show()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
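&lt;p&gt;The spark.sql query groups the rows by Sex, counting them and summing Index per group. The same aggregation in plain Python over a tiny made-up sample (the real people.csv is assumed to contain Index and Sex columns, as the query implies):&lt;/p&gt;

```python
import collections

# Hypothetical rows standing in for people.csv: (Index, Sex).
rows = [(1, "Male"), (2, "Female"), (3, "Female"), (4, "Male"), (5, "Male")]

counts = collections.Counter(sex for _, sex in rows)
sums = collections.defaultdict(int)
for idx, sex in rows:
    sums[sex] += idx

# Mirrors: select Sex, count(1) countsex, sum(Index) sex_sum ... group by Sex
result = {sex: (counts[sex], sums[sex]) for sex in counts}
print(result)
```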



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8m6g0zje4wi2dl2jqgb6.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8m6g0zje4wi2dl2jqgb6.webp" alt="Image description" width="660" height="508"&gt;&lt;/a&gt;&lt;br&gt;
3) Copy readcsv.py into the &lt;strong&gt;Spark master&lt;/strong&gt; pod:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl cp readcsv.py kayvan-release-spark-master-0:/opt/bitnami/spark
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;4) Run the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl exec -it kayvan-release-spark-master-0 -- ./bin/spark-submit 
   --class org.apache.spark.examples.SparkPi
   --master spark://kayvan-release-spark-master-0.kayvan-release-spark-headless.default.svc.cluster.local:7077 
       readcsv.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;5) Showing some data:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy1jp1ovhh94lwqrgeabx.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy1jp1ovhh94lwqrgeabx.webp" alt="Image description" width="800" height="338"&gt;&lt;/a&gt;&lt;br&gt;
6) The aggregated result:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0zyfo4tugtl99zqm8pam.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0zyfo4tugtl99zqm8pam.webp" alt="Image description" width="630" height="223"&gt;&lt;/a&gt;&lt;br&gt;
7) The processing time:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1g9sv5yymb8fyrhrg67f.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1g9sv5yymb8fyrhrg67f.webp" alt="Image description" width="800" height="372"&gt;&lt;/a&gt;&lt;/p&gt;



&lt;p&gt;Another Python program, this time on &lt;strong&gt;Docker Desktop&lt;/strong&gt; :&lt;/p&gt;

&lt;p&gt;docker-compose.yml :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;version: '3.6'

services:

  spark:
    container_name: spark
    image: bitnami/spark:latest
    environment:
      - SPARK_MODE=master
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=root   
      - PYSPARK_PYTHON=/opt/bitnami/python/bin/python3
    ports:
      - 127.0.0.1:8081:8080

  spark-worker:
    image: bitnami/spark:latest
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark:7077
      - SPARK_WORKER_MEMORY=2G
      - SPARK_WORKER_CORES=2
      - SPARK_RPC_AUTHENTICATION_ENABLED=no
      - SPARK_RPC_ENCRYPTION_ENABLED=no
      - SPARK_LOCAL_STORAGE_ENCRYPTION_ENABLED=no
      - SPARK_SSL_ENABLED=no
      - SPARK_USER=root
      - PYSPARK_PYTHON=/opt/bitnami/python/bin/python3
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker-compose up --scale spark-worker=2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgycuqxa66g9d0ry3dzud.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgycuqxa66g9d0ry3dzud.webp" alt="Image description" width="800" height="162"&gt;&lt;/a&gt;&lt;br&gt;
Copy the required files into the containers :&lt;/p&gt;

&lt;p&gt;For example :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;docker cp file.csv spark-worker-1:/opt/bitnami/spark
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Python code to run on the master :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("Writingjson").getOrCreate()

df = spark.read.option("header", True).csv("csv/file.csv").coalesce(2)

df.show()

df.write.partitionBy('name').mode('overwrite').format('json').save('file_name.json')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
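As a rough illustration of the directory layout that `partitionBy('name')` produces, here is a plain-Python sketch that mimics the output shape (this is not Spark itself, and the row values are made up for the example):

```python
import json
import os
import tempfile
from collections import defaultdict

def write_partitioned_json(rows, partition_col, out_dir):
    """Group rows by partition_col and write each group to
    out_dir/<col>=<value>/part-00000.json, one JSON object per line --
    roughly the layout Spark's partitionBy produces."""
    groups = defaultdict(list)
    for row in rows:
        # the partition column is encoded in the directory name,
        # so it is dropped from the record itself (as Spark does)
        groups[row[partition_col]].append(
            {k: v for k, v in row.items() if k != partition_col})
    for value, group in groups.items():
        part_dir = os.path.join(out_dir, f"{partition_col}={value}")
        os.makedirs(part_dir, exist_ok=True)
        with open(os.path.join(part_dir, "part-00000.json"), "w") as f:
            for rec in group:
                f.write(json.dumps(rec) + "\n")
    return sorted(groups)

# made-up sample rows standing in for csv/file.csv
rows = [
    {"name": "kayvan", "age": "40"},
    {"name": "sara", "age": "25"},
    {"name": "kayvan", "age": "41"},
]
out = tempfile.mkdtemp()
partitions = write_partitioned_json(rows, "name", out)
```

This is why a `name=kayvan` directory appears on the worker in the screenshots below: each distinct value of the partition column becomes its own subdirectory.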



&lt;p&gt;Run the code in the Spark &lt;strong&gt;master&lt;/strong&gt; container (the host in the master URL is the container ID) :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;./bin/spark-submit --master spark://4f28330ce077:7077 csv/ctp.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Showing some data on the master :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwcvmfyxpq5zfs1jjf2u7.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwcvmfyxpq5zfs1jjf2u7.webp" alt="Image description" width="324" height="611"&gt;&lt;/a&gt;&lt;br&gt;
and the separate JSON files, partitioned by name, on worker 2 :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxkosm43q8se90qxcw25.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpxkosm43q8se90qxcw25.webp" alt="Image description" width="800" height="656"&gt;&lt;/a&gt;&lt;br&gt;
data for name=kayvan :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dm2plblyrwq16w0gioz.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dm2plblyrwq16w0gioz.webp" alt="Image description" width="675" height="685"&gt;&lt;/a&gt;&lt;br&gt;
Congratulations 🍹&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>spark</category>
      <category>docker</category>
      <category>bigdata</category>
    </item>
    <item>
      <title>Nginx Ingress Controller in Kubernetes</title>
      <dc:creator>Keyvan Soleimani</dc:creator>
      <pubDate>Mon, 05 May 2025 11:05:46 +0000</pubDate>
      <link>https://dev.to/keyvan_soleimani/nginx-ingress-controller-in-kubernetes-422p</link>
      <guid>https://dev.to/keyvan_soleimani/nginx-ingress-controller-in-kubernetes-422p</guid>
<description>&lt;p&gt;Setting up the &lt;strong&gt;NGINX Ingress Controller&lt;/strong&gt; (as a router) along with HAProxy for microservices deployed inside a &lt;strong&gt;Kubernetes cluster&lt;/strong&gt; (bare-metal servers)&lt;/p&gt;

&lt;p&gt;The NGINX Ingress Controller is an &lt;u&gt;Ingress Controller&lt;/u&gt; implementation for NGINX and NGINX Plus that can load balance Websocket, gRPC, TCP and UDP applications. It supports standard Ingress features such as content-based routing and TLS/SSL termination. Several NGINX and NGINX Plus features are available as extensions to Ingress resources through Annotations and the ConfigMap resource.&lt;/p&gt;

&lt;p&gt;The NGINX Ingress Controller supports the VirtualServer and VirtualServerRoute resources as alternatives to Ingress, enabling traffic splitting and advanced content-based routing. It also supports TCP, UDP and TLS Passthrough load balancing using TransportServer resources.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;General Design (big picture) :&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff518gh33o66un04u0ale.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff518gh33o66un04u0ale.webp" alt="Image description" width="800" height="314"&gt;&lt;/a&gt;&lt;br&gt;
The control-plane &amp;amp; worker node addresses are :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;192.168.56.115
192.168.56.116
192.168.56.117
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzin7rnqfcahzmgwp5obn.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzin7rnqfcahzmgwp5obn.webp" alt="Image description" width="800" height="388"&gt;&lt;/a&gt;&lt;br&gt;
and HAProxy as a Load Balancer :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;192.168.56.118
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Kubernetes cluster nodes :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffarty7tre9861bp82vix.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffarty7tre9861bp82vix.webp" alt="Image description" width="800" height="66"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Note: All manifests needed for the steps below are available in my &lt;a href="https://github.com/kayvansol/Ingress/tree/main/Manifests" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Steps :&lt;/p&gt;

&lt;p&gt;1) Install the NGINX Ingress Controller (bare-metal manifest) :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.1.3/deploy/static/provider/baremetal/deploy.yaml
kubectl get all --namespace=ingress-nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ingress-Nginx workloads (ingress port in our case is &lt;strong&gt;30798&lt;/strong&gt;) :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyj1bn70q9xc6v6tvqzw.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyyj1bn70q9xc6v6tvqzw.webp" alt="Image description" width="800" height="230"&gt;&lt;/a&gt;&lt;br&gt;
2) On the nodes where the Pods will be located (node1 and node2 in our case), create the host-path directory :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DIRNAME="vol1"
mkdir -p /mnt/disk/$DIRNAME 
chcon -Rt svirt_sandbox_file_t /mnt/disk/$DIRNAME
chmod 777 /mnt/disk/$DIRNAME
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;3) Deploy the Storage Class &amp;amp; PV &amp;amp; PVC :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f storageClass.yaml
kubectl apply -f persistentVolume.yaml
kubectl apply -f persistentVolume1.yaml
kubectl apply -f persistentVolumeClaim.yaml
kubectl apply -f persistentVolumeClaim1.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwtlldcoy1ldsi4uqgngu.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwtlldcoy1ldsi4uqgngu.webp" alt="Image description" width="800" height="157"&gt;&lt;/a&gt;&lt;br&gt;
4) Deploy the &lt;strong&gt;web apps&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f http-pod.yaml
kubectl apply -f http-pod1.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9yymbi7uat0b22bscgy9.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9yymbi7uat0b22bscgy9.webp" alt="Image description" width="736" height="97"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj044q0gvmfb74tf7jz8n.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj044q0gvmfb74tf7jz8n.webp" alt="Image description" width="743" height="223"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffk6o6k3v7njjsh2lu3y5.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffk6o6k3v7njjsh2lu3y5.webp" alt="Image description" width="651" height="258"&gt;&lt;/a&gt;&lt;br&gt;
Get the Pod IP &amp;amp; curl the related web app :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POD_IP=$(kubectl get pod www2-c5644ff98-trk4d -o yaml | grep podIP | awk '{print $2}'); echo $POD_IP
curl $POD_IP
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;5) Deploy the &lt;strong&gt;Services&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f IngressService.yaml
kubectl apply -f IngressService1.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmi8vakszv0zdaursz0q.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmi8vakszv0zdaursz0q.webp" alt="Image description" width="800" height="135"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3oi6k1dlqf3xsht5iito.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3oi6k1dlqf3xsht5iito.webp" alt="Image description" width="563" height="366"&gt;&lt;/a&gt;&lt;br&gt;
6) Deploy the &lt;strong&gt;Ingress resource&lt;/strong&gt; :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl apply -f Ingress.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
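A representative Ingress resource for this setup might look like the sketch below (the host, service name, and annotation here are illustrative, not copied from the repo's Ingress.yaml):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress            # hypothetical name
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: www1.example.local   # hypothetical host; a DNS record points it at HAProxy
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: ingress-service   # one of the Services deployed in step 5
            port:
              number: 80
```

The controller watches for such resources and routes requests by `host` and `path` to the matching Service.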



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl2owyci3zw0k7d45xxjl.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl2owyci3zw0k7d45xxjl.webp" alt="Image description" width="789" height="72"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gp076oqia8981tlipa1.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gp076oqia8981tlipa1.webp" alt="Image description" width="746" height="367"&gt;&lt;/a&gt;&lt;br&gt;
Note: you can also secure the NGINX ingress with TLS via cert-manager: &lt;strong&gt;&lt;a href="https://cert-manager.io/docs/tutorials/acme/nginx-ingress/" rel="noopener noreferrer"&gt;Securing NGINX-ingress&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;7) &lt;strong&gt;HAProxy&lt;/strong&gt; config as a &lt;strong&gt;Load Balancer&lt;/strong&gt; (On 192.168.56.118) :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo nano /etc/haproxy/haproxy.cfg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbud7jdbj6ap18zex4o6k.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbud7jdbj6ap18zex4o6k.webp" alt="Image description" width="444" height="303"&gt;&lt;/a&gt;&lt;br&gt;
8) DNS Record (On DNS Server) :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcf5y14uhggzivev03xfs.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcf5y14uhggzivev03xfs.webp" alt="Image description" width="543" height="353"&gt;&lt;/a&gt;&lt;br&gt;
The final results are 🍹 :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft17wgbjmon93j3mtwy1y.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft17wgbjmon93j3mtwy1y.webp" alt="Image description" width="497" height="288"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0sduzxkojrgnfdsdi14.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe0sduzxkojrgnfdsdi14.webp" alt="Image description" width="384" height="286"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>devops</category>
      <category>nginxingresscontroller</category>
      <category>letsencrypt</category>
    </item>
    <item>
      <title>Deploying Kubernetes Cluster On On-premises</title>
      <dc:creator>Keyvan Soleimani</dc:creator>
      <pubDate>Sun, 04 May 2025 13:08:45 +0000</pubDate>
      <link>https://dev.to/keyvan_soleimani/deploying-kubernetes-cluster-on-on-premises-2pgf</link>
      <guid>https://dev.to/keyvan_soleimani/deploying-kubernetes-cluster-on-on-premises-2pgf</guid>
      <description>&lt;p&gt;&lt;strong&gt;Kubernetes&lt;/strong&gt; assembles one or more computers, either virtual machines or bare metal, into a cluster which can run workloads in containers. It works with various container runtimes, such as Docker.&lt;/p&gt;

&lt;p&gt;The Kubernetes &lt;strong&gt;master&lt;/strong&gt; node handles the Kubernetes &lt;strong&gt;control plane&lt;/strong&gt; of the cluster, managing its workload and directing communication across the system, &lt;strong&gt;etcd&lt;/strong&gt; is a persistent, lightweight, distributed, key-value data store (originally developed for Container Linux). It reliably stores the configuration data of the cluster, representing the overall state of the cluster at any given point of time.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;API server&lt;/strong&gt; serves the Kubernetes API using JSON over HTTP, which provides both the internal and external interface to Kubernetes. The API server processes, validates REST requests, and updates the state of the API objects in etcd, thereby allowing clients to configure workloads and containers across worker nodes.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;scheduler&lt;/strong&gt; is an extensible component that selects the node that an unscheduled pod (the basic unit of workloads to be scheduled) runs, based on resource availability and other constraints. The scheduler tracks resource allocation on each node to ensure that workload is not scheduled in excess of available resources.&lt;/p&gt;
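The scheduler's decision can be sketched as a toy filter-then-score function (greatly simplified; the real kube-scheduler runs many more filter and scoring plugins, and the numbers below are made up):

```python
def pick_node(pod_request, nodes):
    """Toy version of the scheduling decision: filter out nodes that
    cannot fit the pod's CPU/memory request, then pick the feasible
    node with the most free CPU."""
    feasible = [
        n for n in nodes
        if n["free_cpu"] >= pod_request["cpu"]
        and n["free_mem"] >= pod_request["mem"]
    ]
    if not feasible:
        return None  # no node fits: the pod stays Pending
    return max(feasible, key=lambda n: n["free_cpu"])["name"]

# hypothetical cluster state
nodes = [
    {"name": "node1", "free_cpu": 2.0, "free_mem": 4096},
    {"name": "node2", "free_cpu": 3.5, "free_mem": 2048},
]
chosen = pick_node({"cpu": 1.0, "mem": 1024}, nodes)
```

This also illustrates why the scheduler must track allocations per node: without the filter step, workloads could be scheduled in excess of available resources.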

&lt;p&gt;A &lt;strong&gt;controller&lt;/strong&gt; is a reconciliation loop that drives the actual cluster state toward the desired state, communicating with the API server to create, update, and delete the resources it manages.&lt;/p&gt;

&lt;p&gt;A node, also known as a &lt;strong&gt;worker&lt;/strong&gt; or a minion, is a machine where containers (workloads) are deployed. Every node in the cluster must run a container runtime, as well as the below-mentioned components, for communication with the primary network configuration of these containers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;kubelet&lt;/strong&gt; is responsible for the running state of each node, ensuring that all containers on the node are healthy. It takes care of starting, stopping, and maintaining application containers organized into pods as directed by the control plane.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;container runtime&lt;/strong&gt; is responsible for the lifecycle of containers, including launching, reconciling and killing of containers. kubelet interacts with container runtimes via the Container Runtime Interface (&lt;strong&gt;CRI&lt;/strong&gt;), which decouples the maintenance of core Kubernetes from the actual CRI implementation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;kube-proxy&lt;/strong&gt; is an implementation of a network proxy and a load balancer, and it supports the service abstraction along with the other networking operations. It is responsible for routing traffic to the appropriate container based on IP and port number of the incoming request.&lt;/p&gt;




&lt;p&gt;This article walks through deploying an on-premises Kubernetes cluster (bare-metal servers).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50e74p71ha62ippxk1wv.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50e74p71ha62ippxk1wv.webp" alt="Kubernetes Cluster" width="800" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note : The &lt;a href="https://github.com/kayvansol/Kubernetes/tree/main/Kubernetes" rel="noopener noreferrer"&gt;kubernetes&lt;/a&gt; folder contains the files for preparing a server to install the Kubernetes cluster or join it.&lt;/p&gt;

&lt;p&gt;Run the following Ansible commands against all servers :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo ansible servers -m ping -i inventory.ini -u root

sudo ansible-playbook -i inventory.ini Kubernetes/ServerPrepare.yml -u root
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Kubernetes Cluster :&lt;/p&gt;

&lt;p&gt;The control-plane nodes addresses are :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;192.168.56.120
192.168.56.121
192.168.56.122
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The worker nodes addresses are :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;192.168.56.123
192.168.56.124
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The VMs hosted on VirtualBox follow the schema below :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h3teyujsr90h55i6bjx.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2h3teyujsr90h55i6bjx.webp" alt="vms" width="800" height="388"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The HAProxy server (load balancer for the kube-apiserver) address is :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;192.168.56.118
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;haproxy.cfg :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;stats enable
(frontend bind to 192.168.56.118:6443)
(backend bind to 192.168.56.120:6443 192.168.56.121:6443 192.168.56.122:6443)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
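Expanded into real HAProxy syntax, the pseudo-config above corresponds to something like this sketch (the section names and stats port are illustrative; the actual file is shown in the screenshot at the end):

```plaintext
frontend kube-apiserver
    bind 192.168.56.118:6443
    mode tcp
    default_backend control-planes

backend control-planes
    mode tcp
    balance roundrobin
    server cp1 192.168.56.120:6443 check
    server cp2 192.168.56.121:6443 check
    server cp3 192.168.56.122:6443 check

listen stats
    bind :9000          # assumed stats port
    stats enable
    stats uri /stats
```

TCP mode is required here because the apiserver speaks TLS end-to-end; HAProxy just forwards the encrypted stream.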



&lt;p&gt;To bootstrap the Kubernetes cluster, follow the steps below :&lt;/p&gt;

&lt;p&gt;Run the script below only on 192.168.56.120 :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo kubeadm init  --control-plane-endpoint="192.168.56.118:6443"   
      --upload-certs  --apiserver-advertise-address=192.168.56.120
      --pod-network-cidr=192.168.0.0/16  
      --cri-socket=unix:///var/run/cri-dockerd.sock  
      --ignore-preflight-errors=all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run the following on each control-plane node to configure kubectl :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Install the Calico network plugin (manifest for on-premises deployments of 50 nodes or fewer) :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;curl https://raw.githubusercontent.com/projectcalico/calico/v3.27.2/manifests/calico.yaml -O

kubectl apply -f calico.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and run the following on the other servers to join them to the cluster :&lt;/p&gt;

&lt;p&gt;On the other control-plane nodes (e.g. 192.168.56.122) :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubeadm join 192.168.56.118:6443 --token c4c6wt.2rzubblajmxx7wf1 \
     --discovery-token-ca-cert-hash sha256:91877d933445148c650e5fa11acca05d455fe1e9e53cd33f8497ad06a2126142 \
     --control-plane --certificate-key 2e8c3d0a1f2d4aec3e4ccb09a0dd6f43756344269c0b414cdd83c0ef02c0293d \
     --apiserver-advertise-address=192.168.56.122 
     --cri-socket=unix:///var/run/cri-dockerd.sock 
     --ignore-preflight-errors=all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
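The `--discovery-token-ca-cert-hash` value is `sha256:` followed by the hex digest of the cluster CA certificate's DER-encoded public key, which lets joining nodes verify they are talking to the right control plane. A minimal sketch of the formatting, using placeholder bytes rather than a real `ca.crt` (which lives in `/etc/kubernetes/pki` on the control plane):

```python
import hashlib

def ca_cert_hash(spki_der: bytes) -> str:
    """Format the discovery hash the way kubeadm expects:
    'sha256:' + hex SHA-256 digest of the CA public key's DER bytes."""
    return "sha256:" + hashlib.sha256(spki_der).hexdigest()

# placeholder bytes standing in for the DER-encoded public key
# extracted from /etc/kubernetes/pki/ca.crt
example = ca_cert_hash(b"\x30\x82\x01\x0a")
```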



&lt;p&gt;On worker nodes :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubeadm join 192.168.56.118:6443 --token  c4c6wt.2rzubblajmxx7wf1 \
     --discovery-token-ca-cert-hash sha256:91877d933445148c650e5fa11acca05d455fe1e9e53cd33f8497ad06a2126142 \
     --cri-socket=unix:///var/run/cri-dockerd.sock 
     --ignore-preflight-errors=all
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and as the final step, enjoy your cluster :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get nodes -o wide

kubectl get pod -A
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F339an2qebv9qsliiucig.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F339an2qebv9qsliiucig.webp" alt="nodes" width="800" height="93"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiqxh3eu6a00slt2nw3sv.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiqxh3eu6a00slt2nw3sv.webp" alt="pods" width="697" height="603"&gt;&lt;/a&gt;&lt;br&gt;
HAProxy Stats :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopdb4muh5uzrjc3c884z.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fopdb4muh5uzrjc3c884z.webp" alt="HAProxy Stats" width="800" height="149"&gt;&lt;/a&gt;&lt;br&gt;
haproxy.cfg :&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0lngbyqmyn4fln1lp70.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp0lngbyqmyn4fln1lp70.webp" alt="haproxy.cfg" width="556" height="333"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Congratulations 🍹&lt;/p&gt;

</description>
      <category>kubernetes</category>
      <category>docker</category>
      <category>devops</category>
      <category>linux</category>
    </item>
  </channel>
</rss>
