<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rafif</title>
    <description>The latest articles on DEV Community by Rafif (@rafif_1999).</description>
    <link>https://dev.to/rafif_1999</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3012331%2F60fb9962-041d-40d8-bccb-6f690dad94a9.jpeg</url>
      <title>DEV Community: Rafif</title>
      <link>https://dev.to/rafif_1999</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/rafif_1999"/>
    <language>en</language>
    <item>
      <title>Real-Time Video Streaming Using Kafka, Flask &amp; OpenCV</title>
      <dc:creator>Rafif</dc:creator>
      <pubDate>Wed, 23 Jul 2025 11:29:47 +0000</pubDate>
      <link>https://dev.to/rafif_1999/real-time-video-streaming-using-kafka-flask-opencv-j7m</link>
      <guid>https://dev.to/rafif_1999/real-time-video-streaming-using-kafka-flask-opencv-j7m</guid>
      <description>&lt;h3&gt;
  
  
  1. Overview
&lt;/h3&gt;

&lt;p&gt;This document presents the architecture for a real-time video streaming web application that uses &lt;strong&gt;Apache Kafka&lt;/strong&gt; as the data pipeline, Flask as the web server, OpenCV for video frame processing, and &lt;strong&gt;Gunicorn&lt;/strong&gt; + &lt;strong&gt;NGINX&lt;/strong&gt; for production-grade deployment.&lt;br&gt;
The application captures video (live or file-based), streams it through Kafka, and displays it on the web in real time.&lt;/p&gt;


&lt;h3&gt;
  
  
  2. System Goals
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Goal&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Time Video Streaming&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Provide smooth and low-latency video feed from producers to web clients&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Kafka enables distributed data handling, supporting horizontal scaling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Modularity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Components are decoupled: producer, broker, consumer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintainability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Python-based and well-structured for readability and testing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h3&gt;
  
  
  3. Technology Stack
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Tool / Library&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;💻 &lt;strong&gt;OS&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Ubuntu Server 24.04.2&lt;/td&gt;
&lt;td&gt;Host operating system (&lt;a href="https://ubuntu.com/download/server" rel="noopener noreferrer"&gt;download&lt;/a&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;💬 &lt;strong&gt;Messaging&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Apache Kafka&lt;/td&gt;
&lt;td&gt;Message broker for real-time data streaming&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🐍 &lt;strong&gt;Backend&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Python 3, Kafka-Python&lt;/td&gt;
&lt;td&gt;Business logic and video frame handling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🌐 &lt;strong&gt;Web App&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Flask&lt;/td&gt;
&lt;td&gt;Micro web framework for streaming video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🖼 &lt;strong&gt;Vision&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;OpenCV&lt;/td&gt;
&lt;td&gt;Frame capture, processing and encoding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🚀 &lt;strong&gt;Server&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;Gunicorn + NGINX&lt;/td&gt;
&lt;td&gt;WSGI and reverse proxy setup for production&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h3&gt;
  
  
  4. System Architecture Diagram
&lt;/h3&gt;
&lt;h4&gt;
  
  
  4.1 Infrastructure Diagram
&lt;/h4&gt;

&lt;p&gt;This diagram focuses on the deployment and hosting architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24jo4o2vq9pnntmxpqkw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24jo4o2vq9pnntmxpqkw.png" alt=" " width="800" height="2230"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  4.2 Messaging System Diagram
&lt;/h4&gt;

&lt;p&gt;This focuses on the video processing pipeline and Kafka message flow:&lt;/p&gt;
&lt;p&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbo4jwsy9gg3nddvsh1bs.png" alt=" " width="800" height="1447"&gt;&lt;/p&gt;
&lt;h3&gt;
  
  
  5. Guided Implementation
&lt;/h3&gt;
&lt;h4&gt;
  
  
  5.1 Configuring Python
&lt;/h4&gt;

&lt;p&gt;We start by creating a virtual environment:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt install python3 python3-pip
pip3 install --upgrade pip
pip3 install virtualenv

mkdir kafka-video-streaming
cd kafka-video-streaming
virtualenv venv
source venv/bin/activate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8hr49fpglktvc7405n2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp8hr49fpglktvc7405n2.png" alt=" " width="800" height="152"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27r5pe28fgx73o432vb5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F27r5pe28fgx73o432vb5.png" alt=" " width="800" height="99"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  5.2 Installing Requirements
&lt;/h4&gt;

&lt;h4&gt;
  
  
  Essential Components for Running the Application in the Browser:
&lt;/h4&gt;

&lt;p&gt;To build and run a real-time web streaming application using Python, we need to set up a few core components that allow smooth communication between the server and the browser. This system relies on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flask&lt;/strong&gt; – to build the backend web server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NGINX&lt;/strong&gt; – as a reverse proxy for secure and scalable request handling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gunicorn&lt;/strong&gt; – a production-grade WSGI server to serve the Flask app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kafka&lt;/strong&gt; – to stream video data between the producer and consumer components.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenCV&lt;/strong&gt; – an open-source library designed for computer vision and machine learning applications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We install the Kafka client library (&lt;code&gt;kafka-python&lt;/code&gt; is the package our code imports from, so the separate &lt;code&gt;kafka&lt;/code&gt; package on PyPI is not needed):&lt;br&gt;
&lt;code&gt;pip install kafka-python&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fltf885woe1jcidio8l4l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fltf885woe1jcidio8l4l.png" alt=" " width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install flask&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp9iz8vlp66v9kgu5axf2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp9iz8vlp66v9kgu5axf2.png" alt=" " width="800" height="320"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This command also installs essential Flask dependencies, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Werkzeug&lt;/strong&gt; – a comprehensive WSGI utility library for request/response handling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Jinja2&lt;/strong&gt; – a templating engine used by Flask.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MarkupSafe&lt;/strong&gt; – helps escape characters in Jinja templates.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ItsDangerous&lt;/strong&gt; – provides secure signing for data like cookies and tokens.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenCV:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install opencv-contrib-python&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Now we install NGINX with the following commands:&lt;br&gt;
&lt;code&gt;sudo apt update&lt;br&gt;
sudo apt install nginx&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0yiye7t9cyowaptb46k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fx0yiye7t9cyowaptb46k.png" alt=" " width="800" height="308"&gt;&lt;/a&gt;&lt;br&gt;
Last but not least, Gunicorn:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install gunicorn&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F988gxrlgpfvq7cyk68t0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F988gxrlgpfvq7cyk68t0.png" alt=" " width="800" height="177"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  5.3 Configuring NGINX
&lt;/h4&gt;

&lt;p&gt;Before setting up our custom NGINX configuration for the Flask app, remove the default site configuration to avoid conflicts:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;sudo rm /etc/nginx/sites-enabled/default&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Now we create a new configuration file inside the sites-available directory. Then, create a symbolic link from this file to the sites-enabled directory to activate the configuration.&lt;br&gt;
This tells NGINX to use our custom settings instead of the default ones.&lt;br&gt;
&lt;code&gt;sudo touch /etc/nginx/sites-available/flask_settings&lt;/code&gt;&lt;br&gt;
&lt;code&gt;sudo ln -s /etc/nginx/sites-available/flask_settings /etc/nginx/sites-enabled/flask_settings&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90wbipg4prqiucm5qy7c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F90wbipg4prqiucm5qy7c.png" alt=" " width="800" height="39"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let's open the file located in the sites-enabled directory and add the following configuration to it:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnr8sajyn6lt8ly2gc1cc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnr8sajyn6lt8ly2gc1cc.png" alt=" " width="800" height="228"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These lines are intended to configure the web server to act as a reverse proxy. Inside the location block, we specify the parameters that enable this behavior:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;proxy_pass&lt;/code&gt;: This line forwards all incoming requests to the target server. In this case, we are using the loopback address (127.0.0.1), and the requests are forwarded to port 8000, which we will discuss shortly.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;proxy_set_header Host $host;&lt;/code&gt;: This line ensures the original host header from the client is passed along with the request to the upstream server.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;proxy_set_header X-Real-IP $remote_addr;&lt;/code&gt;: This line ensures the client’s real IP address is passed to the upstream server, rather than the IP address of the reverse proxy (NGINX).&lt;/li&gt;
&lt;/ol&gt;
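&lt;p&gt;The configuration shown in the screenshot is along these lines; a minimal sketch, assuming Gunicorn will listen on 127.0.0.1:8000:&lt;/p&gt;

```nginx
# Sketch of /etc/nginx/sites-available/flask_settings.
server {
    listen 80;
    server_name _;

    location / {
        # Forward all incoming requests to Gunicorn on the loopback address.
        proxy_pass http://127.0.0.1:8000;
        # Pass the original Host header to the upstream server.
        proxy_set_header Host $host;
        # Pass the client's real IP rather than the proxy's.
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```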

&lt;p&gt;Now let's restart NGINX and check its status:&lt;br&gt;
&lt;code&gt;sudo systemctl restart nginx&lt;/code&gt;&lt;br&gt;
&lt;code&gt;sudo systemctl status nginx.service&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjjvf6necoczphmnq4mo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjjvf6necoczphmnq4mo.png" alt=" " width="800" height="268"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With all of this configured, we arrive at the following architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4z0g4vwpay00avm47w76.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4z0g4vwpay00avm47w76.png" alt=" " width="" height=""&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nginx&lt;/strong&gt; acts as a reverse proxy, meaning it receives incoming requests and decides where to forward them next. It also serves static files like images and CSS, and supports encryption via the SSL protocol.&lt;br&gt;
In the second layer, we have &lt;strong&gt;Gunicorn&lt;/strong&gt;. When &lt;strong&gt;Nginx&lt;/strong&gt; receives a request (for example, to www.domain.com), it checks the configuration files and sees that it should forward the request to &lt;strong&gt;Gunicorn&lt;/strong&gt;. We’ve specified port &lt;code&gt;8000&lt;/code&gt; in the web application configuration, so &lt;strong&gt;Nginx&lt;/strong&gt; forwards the request to Gunicorn through this port.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gunicorn’s&lt;/strong&gt; role is to handle dynamic content. It receives the request passed from &lt;strong&gt;Nginx&lt;/strong&gt; via the &lt;code&gt;proxy_pass&lt;/code&gt; directive and generates the appropriate dynamic response.&lt;/p&gt;

&lt;h4&gt;
  
  
  5.4 Kafka Producer – Video Publisher
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ficy1pm2fd5yvnujl4n7j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ficy1pm2fd5yvnujl4n7j.png" alt=" " width="800" height="615"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this section of the code, we import the required libraries and define the topic. We then define the &lt;code&gt;pub_video&lt;/code&gt; function, whose purpose is to stream a pre-recorded video selected by the user. This function takes one parameter: &lt;code&gt;video_file&lt;/code&gt;, which is a string representing the path to the video file that will be streamed.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;First statement&lt;/strong&gt; inside the function assigns the bootstrap server to the producer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Second statement&lt;/strong&gt; opens the video file using OpenCV. To do this, we create a &lt;code&gt;VideoCapture&lt;/code&gt; object from the &lt;code&gt;cv2&lt;/code&gt; module. This object takes one argument: either the index of the device to stream from or the name of the video file.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Third statement&lt;/strong&gt; prints a simple log message.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fourth statement&lt;/strong&gt; enters a while loop, where the condition is video.isOpened(). In some cases, the video capture may not be initialized correctly, so it's important to check whether the file has been successfully opened before proceeding with streaming.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fifth statement&lt;/strong&gt;: If the video file is successfully opened, the video will be read frame by frame.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sixth statement&lt;/strong&gt;: If the video cannot be opened, an error message is printed and the loop exits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seventh statement&lt;/strong&gt;: This line converts the frame to PNG format. The ret value indicates whether the frame was successfully captured (True or False). The buffer variable stores the result of applying the imencode() function to the frame, which compresses and encodes the image to memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eighth statement&lt;/strong&gt;: Converts the image to bytes format to be transmitted between the producer and the consumer, as required.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ninth statement&lt;/strong&gt;: Adds a short delay using time.sleep(0.2) to give time for each frame to be sent to the Kafka topic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tenth statement&lt;/strong&gt;: Releases the video file after the streaming is complete.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now we write the code for &lt;code&gt;live streaming&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmgwghm1w3if1g25y6xzi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmgwghm1w3if1g25y6xzi.png" alt=" " width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The process is similar to streaming a pre-recorded video, but with a few differences. In the second statement inside the function, we assign the value 0 to &lt;code&gt;cv2.VideoCapture()&lt;/code&gt;, which indicates the index of the &lt;strong&gt;camera&lt;/strong&gt; that &lt;strong&gt;OpenCV&lt;/strong&gt; will use to capture input.&lt;br&gt;
As for publishing the video to the Kafka topic, it follows the exact same method. Once the streaming is complete, we call the &lt;code&gt;camera.release()&lt;/code&gt; function to release the camera resource.&lt;/p&gt;

&lt;p&gt;The last snippet of the producer code:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdriyuefl5mjcsgo5cj0k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdriyuefl5mjcsgo5cj0k.png" alt=" " width="800" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;First statement:&lt;/strong&gt; Before executing the code, the Python interpreter reads the source file and defines global variables. If the interpreter runs this source file as the main program, it assigns the special variable &lt;code&gt;__name__&lt;/code&gt; the value &lt;code&gt;__main__&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Second statement:&lt;/strong&gt; &lt;code&gt;sys.argv&lt;/code&gt; is a Python list that contains the arguments passed to the script via the command line. Here, we check the length of this list—if it’s greater than 1, we assume the user has provided a second argument, which will be treated as the path to the video file to be streamed. In this case, we call the &lt;code&gt;pub_video&lt;/code&gt; function and pass that file as a parameter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Third statement:&lt;/strong&gt; If the user doesn’t provide any arguments, the program will stream live video by default.&lt;/li&gt;
&lt;/ul&gt;
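&lt;p&gt;The command-line dispatch explained above can be sketched with placeholder publisher functions (the &lt;code&gt;print&lt;/code&gt; calls stand in for the real producers):&lt;/p&gt;

```python
import sys

def pub_video(video_file):
    # Stand-in for the file-streaming producer described earlier.
    print(f"streaming file: {video_file}")

def pub_live():
    # Stand-in for the live-camera producer described earlier.
    print("streaming live camera")

if __name__ == "__main__":
    # One optional argument: a video file path. No argument: stream live.
    if len(sys.argv) > 1:
        pub_video(sys.argv[1])
    else:
        pub_live()
```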

&lt;h4&gt;
  
  
  5.5 Kafka Consumer + Flask Web Server
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Responsibility:&lt;/strong&gt; Subscribes to the &lt;code&gt;video-stream&lt;/code&gt; topic, retrieves video frames, and sends them as multipart responses to the client browser.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Femslmny71p99clzzucgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Femslmny71p99clzzucgw.png" alt=" " width="800" height="476"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;After importing the required libraries and defining both the &lt;strong&gt;Kafka&lt;/strong&gt; consumer and the &lt;strong&gt;Flask&lt;/strong&gt; app, we define the &lt;code&gt;index()&lt;/code&gt; function, which simply returns the home page of the web application (an optional step).&lt;/li&gt;
&lt;li&gt;We then define a route &lt;code&gt;/kaf/&lt;/code&gt; for streaming the video and allow the HTTP &lt;code&gt;GET&lt;/code&gt; method (the route could also accept &lt;code&gt;POST&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Next, we define the &lt;code&gt;video()&lt;/code&gt; function — this is the core of the streaming logic. Inside this function, we use &lt;code&gt;multipart&lt;/code&gt; responses, a key mechanism in streaming applications where each piece of data replaces the previous one. This enables continuous video playback in the browser.&lt;/li&gt;
&lt;li&gt;The idea is that each &lt;strong&gt;data chunk&lt;/strong&gt; is treated as an &lt;strong&gt;image&lt;/strong&gt;, and by replacing each image with the next one in sequence, we simulate a live video stream. To enable real-time frame updates, we use what’s called a &lt;code&gt;multipart boundary,&lt;/code&gt; which segments each part of the response.&lt;/li&gt;
&lt;li&gt;We're using the content type:
&lt;code&gt;multipart/x-mixed-replace&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;This format is ideal for video streaming where each part replaces the previous one in a &lt;strong&gt;pipeline-like manner&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Then we define the &lt;code&gt;kafs()&lt;/code&gt; function, which receives images from the Kafka server and converts them into a format compatible with &lt;strong&gt;Flask&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;When the program runs, the &lt;code&gt;video()&lt;/code&gt; function is called first. It then calls the&lt;code&gt;kafs()&lt;/code&gt; function, which handles converting each frame. These converted frames are passed back to &lt;code&gt;video()&lt;/code&gt;, which returns them as a response — each one replacing the previous, creating a live stream effect in the browser.&lt;/li&gt;
&lt;li&gt;Finally, we call &lt;code&gt;app.run()&lt;/code&gt; in the main program block to launch the &lt;strong&gt;Flask server&lt;/strong&gt;, specifying the host &lt;strong&gt;IP address&lt;/strong&gt; that will serve the video stream.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  5.6 Running The App
&lt;/h4&gt;

&lt;p&gt;We run the producer with the following commands:&lt;br&gt;
To stream a pre-recorded video:&lt;br&gt;
&lt;code&gt;python3 &amp;lt;file_name&amp;gt;.py &amp;lt;video_file&amp;gt;&lt;/code&gt;&lt;br&gt;
To stream a live video:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgxyxlgv4sxnapkd91he1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgxyxlgv4sxnapkd91he1.png" alt=" " width="800" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now we run the app using Gunicorn and open the browser:&lt;br&gt;
&lt;code&gt;gunicorn app_name:app&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24deyyp12jy9eqld4jbs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F24deyyp12jy9eqld4jbs.png" alt=" " width="800" height="530"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;To see the video, we append &lt;code&gt;/kaf&lt;/code&gt; to the browser URL:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funjf2odm5wmawlupkd34.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funjf2odm5wmawlupkd34.png" alt=" " width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Live Video:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72hyl6oeiomwfetiz553.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F72hyl6oeiomwfetiz553.png" alt=" " width="800" height="63"&gt;&lt;/a&gt;&lt;br&gt;
Now we re-open the browser:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7pa4x93s4fu4zyzygvau.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7pa4x93s4fu4zyzygvau.png" alt=" " width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And Voila!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c87mipshy9hbu7i27fh.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3c87mipshy9hbu7i27fh.gif" alt=" " width="362" height="381"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next?
&lt;/h2&gt;

&lt;p&gt;We will build a web application that &lt;strong&gt;detects faces&lt;/strong&gt; and &lt;strong&gt;identifies motion&lt;/strong&gt; by calculating the difference between consecutive video frames. &lt;strong&gt;Motion detection&lt;/strong&gt; is widely used in &lt;strong&gt;security applications&lt;/strong&gt;, especially for identifying moving objects within a monitored area — particularly people. For example, if unusual movement is detected in a surveillance camera feed, the system can raise an alert to indicate suspicious or unexpected activity.&lt;br&gt;
In the next project, we will combine &lt;strong&gt;motion detection&lt;/strong&gt; with &lt;strong&gt;face detection&lt;/strong&gt; to simulate a basic &lt;strong&gt;security system&lt;/strong&gt; — similar to what you might find in intelligent surveillance software.&lt;/p&gt;

</description>
      <category>kafka</category>
      <category>opensource</category>
      <category>programming</category>
    </item>
    <item>
      <title>Getting Started with Apache Kafka: A Beginner’s Guide to Real-Time Data Streaming</title>
      <dc:creator>Rafif</dc:creator>
      <pubDate>Thu, 03 Apr 2025 14:28:53 +0000</pubDate>
      <link>https://dev.to/rafif_1999/getting-started-with-apache-kafka-a-beginners-guide-to-real-time-data-streaming-4gf0</link>
      <guid>https://dev.to/rafif_1999/getting-started-with-apache-kafka-a-beginners-guide-to-real-time-data-streaming-4gf0</guid>
      <description>&lt;h3&gt;
  
  
  1. Introduction
&lt;/h3&gt;

&lt;p&gt;With the rapid evolution of computing systems, the need for a fast, scalable, and fault-tolerant messaging system has grown significantly. Apache Kafka has emerged as one of the most powerful and widely used messaging systems, providing a highly efficient way to process large volumes of data in real time. Without requiring massive computational resources, Kafka can handle thousands of messages per second with minimal latency, making it a preferred choice for major tech companies like LinkedIn, Twitter, Mozilla, Netflix, and Oracle.&lt;/p&gt;

&lt;p&gt;Modern businesses rely on data to understand trends, analyze customer behavior, and automate processes. Kafka plays a crucial role in real-time data processing and predictive analytics by reducing the time between event registration and system response. Originally developed at LinkedIn and open-sourced in 2011, Kafka was later donated to the Apache Software Foundation, with much of its ongoing development driven by Confluent, a company founded by Kafka's original creators: Jay Kreps, Neha Narkhede, and Jun Rao.&lt;/p&gt;

&lt;p&gt;Kafka's core philosophy revolves around treating data as a continuous stream rather than static storage. This approach is particularly useful in machine learning, security monitoring, and real-time video analytics, where data needs to be processed and responded to instantly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1fikj2mwke43jfeyiy9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj1fikj2mwke43jfeyiy9.png" alt="Image description" width="737" height="350"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  1.1 Messaging Systems
&lt;/h3&gt;

&lt;p&gt;Messaging systems facilitate communication between applications by transferring data asynchronously. They are categorized into two main types:&lt;/p&gt;

&lt;h4&gt;
  
  
  1.1.1 Point-to-Point (Queue-Based Messaging)
&lt;/h4&gt;

&lt;p&gt;A producer sends messages to a queue, where a single consumer retrieves them.&lt;br&gt;
Once consumed, messages are removed from the queue.&lt;br&gt;
If the consumer is unavailable, the message remains in the queue until it is processed.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfc1wzri2ehj1trfg5p1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flfc1wzri2ehj1trfg5p1.png" alt="Image description" width="800" height="184"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  1.1.2 Publish-Subscribe (Pub/Sub)
&lt;/h4&gt;

&lt;p&gt;A publisher sends messages to a topic, which multiple subscribers can read from.&lt;br&gt;
Subscribers must be available to receive messages, or they may be lost.&lt;br&gt;
Unlike the queue model, messages are not deleted after being read by one subscriber.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3vm0bwupno0xa0ankqj3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3vm0bwupno0xa0ankqj3.png" alt="Image description" width="800" height="283"&gt;&lt;/a&gt;&lt;/p&gt;
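&lt;p&gt;The difference between the two models can be sketched in a few lines of plain Python. This is a toy illustration of the delivery semantics only, not Kafka client code:&lt;/p&gt;

```python
from collections import deque

# Point-to-point: each message is consumed by exactly one consumer.
work_queue = deque(["m1", "m2", "m3"])
consumer_a = work_queue.popleft()  # "m1" is removed from the queue
consumer_b = work_queue.popleft()  # "m2" goes to a different consumer
# Only "m3" remains: consumed messages are gone.

# Pub/sub: every subscriber receives every published message.
subscribers = {"s1": [], "s2": []}

def publish(message):
    for inbox in subscribers.values():
        inbox.append(message)  # each subscriber gets its own copy

publish("event-1")
publish("event-2")
# Both subscribers now hold ["event-1", "event-2"].
```

&lt;p&gt;As we will see, Kafka blends both models: consumers in the same group share the work like a queue, while separate groups behave like independent subscribers.&lt;/p&gt;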


&lt;h3&gt;
  
  
  1.2 Apache Kafka Concept
&lt;/h3&gt;

&lt;p&gt;Kafka is a distributed, real-time streaming platform designed to handle millions of messages per second. It processes continuous data streams by collecting, storing, and distributing records efficiently across different consumers.&lt;/p&gt;

&lt;p&gt;Kafka offers three key functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Message Publishing &amp;amp; Subscription: Stores and distributes records sequentially, ensuring reliability.&lt;/li&gt;
&lt;li&gt;Fault-Tolerance: Ensures system stability even in case of failures.&lt;/li&gt;
&lt;li&gt;Real-Time Processing: Supports instant processing of high-speed data streams.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  1.3 Why Use Kafka?
&lt;/h3&gt;

&lt;p&gt;Traditional messaging systems often face challenges like high latency and message buildup under heavy traffic. Kafka addresses these limitations with a modern, robust design that offers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High-throughput message storage&lt;/strong&gt; that's both scalable and efficient.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fault-tolerant architecture&lt;/strong&gt; that ensures no data is lost, even during failures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time data streaming&lt;/strong&gt; for instant processing and analytics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unified platform&lt;/strong&gt; that seamlessly combines messaging, storage, and stream processing.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;
  
  
  1.4 Kafka Workflow
&lt;/h3&gt;

&lt;p&gt;Even though publish-subscribe (pub/sub) and queuing are different messaging patterns, Kafka combines both to support various use cases. Sometimes, Kafka functions as a traditional topic-based pub/sub system, where data is sent to topics as a continuous stream of records. These records are structured in a sequential and ordered manner, and multiple subscribers (consumers) can process them independently.&lt;/p&gt;
&lt;h4&gt;
  
  
  1.4.1 Pub/Sub
&lt;/h4&gt;

&lt;p&gt;In the pub/sub model, an application (the producer) connects to Kafka and publishes messages to a topic. Kafka stores these messages in a structured log that is divided into partitions (segments). Multiple consumers can subscribe to a topic, and Kafka ensures that each consumer gets assigned a specific partition. When another application (consumer) connects, it reads and processes records from its assigned partition.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xshbox7g8dqe66jztet.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1xshbox7g8dqe66jztet.png" alt="Image description" width="800" height="170"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  1.4.2 Queuing
&lt;/h4&gt;

&lt;p&gt;In the queuing model, both producers and consumers connect to Kafka in a similar way. However, unlike pub/sub, queuing ensures that each message is delivered to only one consumer for processing. Messages are stored in a queue until a consumer retrieves them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp1bi84pn9et4zqnbgk7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbp1bi84pn9et4zqnbgk7.png" alt="Image description" width="800" height="182"&gt;&lt;/a&gt;&lt;br&gt;
A key characteristic of this approach is that a given message is delivered to exactly one consumer rather than being processed by several consumers at once. This design is ideal for workloads where requests need to be handled sequentially rather than in parallel.&lt;/p&gt;

&lt;p&gt;For instance, in large-scale machine learning (ML) applications, requests are often processed one after another and then stored in a queue before being passed to the next stage. Since ML workloads can be computationally heavy, handling requests in a sequential pipeline ensures better resource utilization. This is where Kafka’s consumer API plays a crucial role in managing and distributing these workloads efficiently.&lt;/p&gt;
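&lt;p&gt;The sequential pipeline described above can be sketched with Python's standard library: a single worker drains a queue one request at a time, so each request is handled by exactly one consumer and order is preserved. This is a toy stand-in for the queue model, not real Kafka client code:&lt;/p&gt;

```python
import queue
import threading

requests = queue.Queue()
results = []

def worker():
    # A single worker processes requests one after another,
    # mimicking the one-consumer-per-message queue model.
    while True:
        item = requests.get()
        if item is None:  # sentinel: shut down the worker
            requests.task_done()
            break
        results.append(item.upper())  # stand-in for a heavy ML step
        requests.task_done()

t = threading.Thread(target=worker)
t.start()
for r in ["req-1", "req-2", "req-3"]:
    requests.put(r)
requests.put(None)
t.join()
# results preserves the original order: processing was sequential
```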


&lt;h3&gt;
  
  
  1.5 Kafka Cluster Architecture
&lt;/h3&gt;

&lt;p&gt;Kafka consists of several core components:&lt;/p&gt;
&lt;h4&gt;
  
  
  1.5.1 Brokers:
&lt;/h4&gt;

&lt;p&gt;A Kafka broker is a server responsible for handling incoming requests and managing topics. It plays a key role in distributing and storing records efficiently.&lt;/p&gt;

&lt;p&gt;In a Kafka cluster, there can be one or multiple brokers working together. Each broker holds a copy of the data and manages the topics created within the cluster.&lt;/p&gt;

&lt;p&gt;Producers send records to a broker, which then forwards them to the appropriate topic. On the other end, consumers retrieve records from the broker as needed.&lt;/p&gt;

&lt;p&gt;The primary reason for using multiple brokers is to take advantage of replication, ensuring data availability and fault tolerance within Kafka.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqtslmr0k533a95a4nbro.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqtslmr0k533a95a4nbro.png" alt="Image description" width="800" height="192"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this example, the topic is divided into eight partitions to enable parallel processing and scalability. These partitions allow multiple consumers to read data concurrently, ensuring efficient data distribution.&lt;/p&gt;

&lt;p&gt;Each broker in the Kafka cluster manages one or more partitions, balancing the load and improving fault tolerance. This partitioning mechanism enhances throughput and performance, making it easier to handle large-scale data streams.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmln1hnl9y4375zxdg30d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmln1hnl9y4375zxdg30d.png" alt="Image description" width="800" height="226"&gt;&lt;/a&gt;&lt;/p&gt;
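&lt;p&gt;How does a record end up in a particular partition? When a record has a key, Kafka's default partitioner hashes the key (using Murmur2) and takes the result modulo the partition count, so all records with the same key land in the same partition. The idea can be approximated with a stable CRC32 hash; the key below is made up for illustration:&lt;/p&gt;

```python
import zlib

NUM_PARTITIONS = 8  # e.g. a topic with eight partitions

def partition_for(key):
    # Stable hash of the key, mapped onto the partition range.
    # Kafka itself uses Murmur2; CRC32 merely illustrates the idea.
    return zlib.crc32(key) % NUM_PARTITIONS

# Records with the same key always map to the same partition,
# which preserves per-key ordering.
p1 = partition_for(b"camera-1")
p2 = partition_for(b"camera-1")
assert p1 == p2
```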
&lt;h4&gt;
  
  
  1.5.2 ZooKeeper:
&lt;/h4&gt;

&lt;p&gt;Zookeeper is a distributed coordination service that Kafka uses to maintain synchronization and manage metadata across brokers. It ensures fault tolerance and efficient communication between Kafka components.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Offset Management: Consumers use offsets stored in logs to keep track of their position while reading data.&lt;/li&gt;
&lt;li&gt;Cluster Coordination: If a broker fails or a new one joins, Zookeeper helps rebalance the cluster.&lt;/li&gt;
&lt;li&gt;Leader Election: It manages leader selection for partitions, ensuring smooth operations.&lt;/li&gt;
&lt;li&gt;Service Discovery: Producers and consumers use Zookeeper to locate active brokers and topics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gagygodl382zklak22j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6gagygodl382zklak22j.png" alt="Image description" width="800" height="342"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  1.5.3 Producers:
&lt;/h4&gt;

&lt;p&gt;Send messages to Kafka topics.&lt;/p&gt;
&lt;h4&gt;
  
  
  1.5.4 Consumers:
&lt;/h4&gt;

&lt;p&gt;Retrieve and process messages from topics using offset tracking to ensure correct message order.&lt;/p&gt;
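&lt;p&gt;Offset tracking can be pictured as a cursor into an append-only log. The toy sketch below (an illustration only, not the Kafka client API) shows how a consumer resumes from its last committed offset after a restart:&lt;/p&gt;

```python
# A partition is an append-only log; each record sits at a fixed offset.
log = ["record-0", "record-1", "record-2", "record-3"]

class ToyConsumer:
    def __init__(self, committed_offset=0):
        # Resume from the last committed offset, so no record
        # is skipped or reprocessed after a restart.
        self.offset = committed_offset

    def poll(self):
        if self.offset == len(log):
            return None  # caught up: nothing new to read
        record = log[self.offset]
        self.offset += 1  # "commit" by advancing the offset
        return record

c = ToyConsumer()
first_two = [c.poll(), c.poll()]  # reads offsets 0 and 1 in order
restarted = ToyConsumer(committed_offset=c.offset)
resumed = restarted.poll()  # continues at offset 2
```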


&lt;h3&gt;
  
  
  2. Install and Run Kafka
&lt;/h3&gt;
&lt;h4&gt;
  
  
  2.1 Installation
&lt;/h4&gt;

&lt;p&gt;Kafka requires Java to run. Install OpenJDK 8 using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sudo apt update
sudo apt install openjdk-8-jdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify the installation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;java -version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Go to the Apache Kafka website and download the latest version. Alternatively, you can use wget to download it directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;wget https://downloads.apache.org/kafka/&amp;lt;latest_version&amp;gt;/kafka_&amp;lt;latest_version&amp;gt;.tgz
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Extract the downloaded file and check its content:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tar -xzf kafka_&amp;lt;latest_version&amp;gt;.tgz
cd kafka_&amp;lt;latest_version&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik10lp86ej7d08cpoe8j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik10lp86ej7d08cpoe8j.png" alt="Image description" width="800" height="56"&gt;&lt;/a&gt;&lt;br&gt;
In this project, we rely on the bin and config directories to run and manage Apache Kafka services.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;bin directory: Contains all the necessary shell scripts for starting, stopping, and managing Kafka services, such as running Zookeeper, Brokers, and handling topics.&lt;/li&gt;
&lt;li&gt;config directory: Includes configuration files for Kafka, covering settings for Brokers, Zookeeper, Topics, Producers, and Consumers.
By exploring the contents of these directories, we can access operational tools that simplify system management and control data flow within Kafka.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkccekkl1ru3fkg4ecfyc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkccekkl1ru3fkg4ecfyc.png" alt="Image description" width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyq8d1c631nwa1ksut9sb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyq8d1c631nwa1ksut9sb.png" alt="Image description" width="800" height="164"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  2.2 Startup Kafka
&lt;/h4&gt;

&lt;p&gt;To run Kafka, we must first start Zookeeper. Zookeeper is essential for managing the Kafka cluster and for keeping the brokers synchronized and coordinated. Consequently, if Zookeeper fails to start, Kafka will not start either, since it depends heavily on Zookeeper.&lt;br&gt;
We start Zookeeper using the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bin/zookeeper-server-start.sh config/zookeeper.properties
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We will get the following output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2k690mi6s06axemwl3ac.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2k690mi6s06axemwl3ac.png" alt="Image description" width="800" height="377"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0iva05bcqj81bl8m4z8w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0iva05bcqj81bl8m4z8w.png" alt="Image description" width="800" height="550"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The message enclosed in a circle indicates that Zookeeper is listening on port 2181, which is its designated port. From this notification, we can confirm that Zookeeper is running without any issues. Now, to start Kafka, we open a new session within the virtual machine we're working on by pressing Alt + F3 (this may vary depending on the device). After opening the new session and logging in, we can start Kafka using the command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bin/kafka-server-start.sh config/server.properties
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now we get the following output:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rrfbl7n68708x73ed2h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8rrfbl7n68708x73ed2h.png" alt="Image description" width="800" height="481"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Since the log output contains "started," there are no issues. We also notice that the server we just started was assigned the &lt;strong&gt;number 0&lt;/strong&gt;: this is the broker ID (broker.id in config/server.properties), which defaults to 0 for the first broker in the cluster.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Kafka Topics
&lt;/h3&gt;

&lt;p&gt;A topic is a collection of partitions that contain immutable, ordered records, each identified by a unique offset, making the records sequential. The primary goal of having multiple partitions is to allow users to read from the topic in a &lt;em&gt;&lt;strong&gt;parallel manner&lt;/strong&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;To create a topic, we execute the command in a new session (ensure that both Kafka and Zookeeper are running):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic &amp;lt;topic name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The replication factor indicates the number of copies of the topic that should exist within the Kafka cluster; it cannot exceed the number of brokers in the cluster. A value greater than 1 stores a backup of the data on another broker for fault tolerance and load balancing. Partitions, on the other hand, segment the topic to enable parallel processing: a topic consists of partitions that spread the data across multiple brokers, as we mentioned earlier.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7pjsh7bh9qmzuivuc6zf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7pjsh7bh9qmzuivuc6zf.png" alt="Image description" width="800" height="87"&gt;&lt;/a&gt;&lt;br&gt;
After creating the topic, we will send messages from the producer to be received by the consumer. We open the producer and the consumer in separate sessions and execute the following commands:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;producer:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bin/kafka-console-producer.sh --broker-list localhost:9092 --topic &amp;lt;topic name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Consumer:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic &amp;lt;topic name&amp;gt; --from-beginning
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fchb01dvur01ou6ro7fmq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fchb01dvur01ou6ro7fmq.png" alt="Image description" width="800" height="192"&gt;&lt;/a&gt;&lt;br&gt;
We will notice that messages travel from the producer to the consumer almost instantly, which highlights one of Kafka's main advantages: its low latency.&lt;/p&gt;

&lt;p&gt;And that’s it! In the upcoming chapters, we’ll dive into more advanced features for data streaming—including video and live video streaming using Kafka.&lt;/p&gt;

</description>
      <category>microservices</category>
      <category>kafka</category>
      <category>pubsub</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
