<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Abhiraj Adhikary</title>
    <description>The latest articles on DEV Community by Abhiraj Adhikary (@abhirajadhikary06).</description>
    <link>https://dev.to/abhirajadhikary06</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2096578%2F92bca5c8-a4a6-4407-8ff7-e2c0b7e5a9e5.png</url>
      <title>DEV Community: Abhiraj Adhikary</title>
      <link>https://dev.to/abhirajadhikary06</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/abhirajadhikary06"/>
    <language>en</language>
    <item>
      <title>Parallel &amp; Concurrent Computing</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Tue, 27 Jan 2026 08:30:14 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/parallel-concurrent-computing-42g3</link>
      <guid>https://dev.to/abhirajadhikary06/parallel-concurrent-computing-42g3</guid>
      <description>&lt;p&gt;Parallel and concurrent computing are no longer niche topics for high-performance researchers; they are essential for anyone wanting to squeeze real performance out of modern hardware.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Motivation: The End of "Free Lunch"
&lt;/h2&gt;

&lt;p&gt;For decades, software got faster simply because hardware engineers increased CPU clock speeds. However, around 2004, we hit a "Power Wall." Increasing clock speeds further generated more heat than could be dissipated.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Clock Speed Stagnation:&lt;/strong&gt; Instead of making one core faster (increasing GHz), manufacturers began adding &lt;strong&gt;more cores&lt;/strong&gt; to a single chip.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Shift:&lt;/strong&gt; To gain performance now, developers must write code that can run across these multiple cores simultaneously.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2. Serial vs. Parallel Execution
&lt;/h2&gt;

&lt;p&gt;The difference lies in how tasks are queued and processed.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Serial Execution&lt;/th&gt;
&lt;th&gt;Parallel Execution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Workflow&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One task must finish before the next begins.&lt;/td&gt;
&lt;td&gt;Multiple tasks (or parts of a task) run at the same time.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Hardware&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Uses a single processor core.&lt;/td&gt;
&lt;td&gt;Uses multiple cores or multiple processors.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A single grocery checkout line.&lt;/td&gt;
&lt;td&gt;Multiple checkout lanes open at once.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  3. Key Definitions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Concurrency vs. Parallelism
&lt;/h3&gt;

&lt;p&gt;These terms are often used interchangeably, but they describe different concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency:&lt;/strong&gt; The &lt;em&gt;art&lt;/em&gt; of dealing with many things at once. It’s about &lt;strong&gt;structure&lt;/strong&gt;. A system is concurrent if it can handle multiple tasks by switching between them (interleaving).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallelism:&lt;/strong&gt; The &lt;em&gt;act&lt;/em&gt; of doing many things at once. It’s about &lt;strong&gt;execution&lt;/strong&gt;. It requires hardware capable of running tasks at the exact same moment.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deterministic vs. Non-deterministic Execution
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic:&lt;/strong&gt; Given the same input, the program always produces the same output and follows the same execution path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-deterministic:&lt;/strong&gt; The outcome or the order of execution can change between runs, even with the same input. This is common in parallel systems because the &lt;strong&gt;thread scheduler&lt;/strong&gt; decides when each task runs, so the same program can interleave its threads differently on every run.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  4. Common Pitfalls
&lt;/h2&gt;

&lt;p&gt;Writing parallel code is notoriously difficult because of the "bugs" that only appear when timing is just right (or wrong).&lt;/p&gt;

&lt;h3&gt;
  
  
  Race Conditions
&lt;/h3&gt;

&lt;p&gt;A race condition occurs when the output depends on the sequence or timing of uncontrollable events. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Example:&lt;/strong&gt; Two threads try to increment a counter simultaneously. If they both read "10," add 1, and write back "11," the counter only increases by 1 instead of 2.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Deadlocks
&lt;/h3&gt;

&lt;p&gt;A deadlock is a "Mexican Standoff" in code. It happens when:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Thread A holds Resource 1 and waits for Resource 2.&lt;/li&gt;
&lt;li&gt; Thread B holds Resource 2 and waits for Resource 1.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Neither can proceed, and the program freezes.&lt;/p&gt;
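
&lt;p&gt;One common way to avoid this standoff is to make every thread acquire its locks in the same order. The sketch below is illustrative only: the names &lt;code&gt;resource_1&lt;/code&gt;, &lt;code&gt;resource_2&lt;/code&gt;, and &lt;code&gt;use_both&lt;/code&gt; are hypothetical and stand in for whatever resources your program protects.&lt;/p&gt;

```python
import threading

resource_1 = threading.Lock()
resource_2 = threading.Lock()
completed = []

def use_both(name):
    # Every thread takes resource_1 first, then resource_2.
    # No thread ever holds resource_2 while waiting for resource_1,
    # so the circular wait described above cannot form.
    with resource_1:
        with resource_2:
            completed.append(name)

threads = [threading.Thread(target=use_both, args=(n,)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(completed))
```

&lt;p&gt;If Thread B instead took &lt;code&gt;resource_2&lt;/code&gt; first, the two threads could each end up holding one lock and waiting forever for the other.&lt;/p&gt;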

&lt;h3&gt;
  
  
  Synchronization Issues
&lt;/h3&gt;

&lt;p&gt;To prevent race conditions, we use "locks" or "mutexes." However, over-synchronizing leads to problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Contention:&lt;/strong&gt; Too many threads fighting for the same lock, which slows the system down to serial speeds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Starvation:&lt;/strong&gt; A thread is perpetually denied access to resources because other "greedier" threads keep taking them.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;Understanding how memory is allocated is the "make or break" moment for designing parallel systems. It dictates how your workers (threads or processes) talk to each other and how much they’ll fight over resources.&lt;/p&gt;




&lt;h2&gt;
  
  
  2.1 Shared Memory Parallelism (Multithreading)
&lt;/h2&gt;

&lt;p&gt;In this model, multiple &lt;strong&gt;threads&lt;/strong&gt; live within a single &lt;strong&gt;process&lt;/strong&gt;. Imagine a single kitchen (the memory) where multiple chefs (threads) are working at the same counter.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shared Space:&lt;/strong&gt; All threads can see and modify the same variables. This makes communication lightning-fast because you don't have to "send" data; it's already there.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Synchronization Tax:&lt;/strong&gt; Since everyone is touching the same "ingredients," you need strict rules (locks/mutexes) to prevent them from chopping the same carrot at the same time. This adds significant logic complexity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Python Catch (GIL):&lt;/strong&gt; In standard Python (CPython), the &lt;strong&gt;Global Interpreter Lock (GIL)&lt;/strong&gt; ensures only one thread executes Python bytecode at a time. Even on a 16-core machine, multithreading in Python won't give you a 16x speedup for CPU-heavy math; it’s mostly useful for I/O tasks like downloading files.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  2.2 Distributed Memory Parallelism (Multiprocessing)
&lt;/h2&gt;

&lt;p&gt;Here, you have multiple &lt;strong&gt;processes&lt;/strong&gt;, each with its own private "kitchen." No process can peek into another's memory.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Independence:&lt;/strong&gt; Since memory isn't shared, you don't have to worry about one process accidentally overwriting another’s variables. This eliminates many race conditions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message Passing:&lt;/strong&gt; If Process A needs data from Process B, it must be explicitly "sent" over a communication channel (like a Pipe or Queue). This is called &lt;strong&gt;Message Passing&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;True Parallelism:&lt;/strong&gt; Because each process has its own memory and its own instance of the Python interpreter, the &lt;strong&gt;GIL is bypassed&lt;/strong&gt;. This is the go-to method for &lt;strong&gt;compute-bound tasks&lt;/strong&gt; (e.g., heavy data processing, image rendering).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Overhead:&lt;/strong&gt; Creating a new process is "heavier" and slower than creating a thread, and sending large amounts of data between processes can be a performance bottleneck.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Multithreading (Shared)&lt;/th&gt;
&lt;th&gt;Multiprocessing (Distributed)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shared among all threads&lt;/td&gt;
&lt;td&gt;Private to each process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Communication&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast (Shared variables)&lt;/td&gt;
&lt;td&gt;Slower (Message passing)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High (Needs locks/semaphores)&lt;/td&gt;
&lt;td&gt;Lower (Isolation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Python GIL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Restricted by GIL&lt;/td&gt;
&lt;td&gt;Bypasses GIL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best Use Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;I/O-bound (API calls, DB reads)&lt;/td&gt;
&lt;td&gt;CPU-bound (Math, Data Science)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;The &lt;strong&gt;Global Interpreter Lock (GIL)&lt;/strong&gt; is perhaps the most famous (and infamous) technical detail of the Python programming language. It is essentially a "safety latch" that has shaped how the entire Python ecosystem handles performance.&lt;/p&gt;




&lt;h2&gt;
  
  
  3.1 What is the GIL and Why Does It Exist?
&lt;/h2&gt;

&lt;p&gt;The GIL is a &lt;strong&gt;mutex&lt;/strong&gt; (a lock) that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Reason:&lt;/strong&gt; Python uses &lt;strong&gt;reference counting&lt;/strong&gt; for memory management. If two threads increment or decrement the "use count" of an object simultaneously, it could lead to memory leaks or, worse, deleting an object that is still in use.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Benefit:&lt;/strong&gt; It makes the implementation of CPython (the standard Python version) much simpler and faster for single-threaded programs. It also makes integrating C libraries (which might not be thread-safe) much easier.&lt;/li&gt;
&lt;/ul&gt;
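
&lt;p&gt;To make the "use count" concrete, here is a small sketch that uses &lt;code&gt;sys.getrefcount&lt;/code&gt;, a CPython-specific helper. The variable names are made up for illustration, and the absolute numbers it reports vary between Python versions, so only the difference is printed.&lt;/p&gt;

```python
import sys

payload = ["some", "data"]
holders = []

before = sys.getrefcount(payload)
holders.append(payload)  # a second container now refers to the same list
after = sys.getrefcount(payload)

# The count went up by one because one more reference exists.
print(after - before)
```

&lt;p&gt;Without a lock around these count updates, two threads changing the count at the same moment could lose an update, which is the problem the GIL was introduced to avoid.&lt;/p&gt;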




&lt;h2&gt;
  
  
  3.2 Impact on Python Multithreading
&lt;/h2&gt;

&lt;p&gt;Because of the GIL, even if your computer has 32 CPU cores, a standard Python program using &lt;code&gt;threading&lt;/code&gt; will only utilize &lt;strong&gt;one core&lt;/strong&gt; at a time for execution.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Illusion of Parallelism:&lt;/strong&gt; To a human, it looks like threads are running in parallel because the interpreter hands the GIL from one thread to another very quickly (CPython's default switch interval is 5 ms).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPU-Bound Bottleneck:&lt;/strong&gt; If your code is doing heavy math in pure Python (CPU-bound), multithreading gives no speedup and is often &lt;strong&gt;slower&lt;/strong&gt; than a single-threaded program. This is because of the "lock overhead": the time threads spend contending for the GIL and switching context.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3.3 How the GIL is Bypassed
&lt;/h2&gt;

&lt;p&gt;The GIL isn't an impenetrable wall; it’s more like a gate that can be opened under specific conditions.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Native Extensions (The "C" Escape)
&lt;/h3&gt;

&lt;p&gt;Libraries like &lt;strong&gt;NumPy, SciPy, and Pandas&lt;/strong&gt; implement their core routines in C, Cython, or Fortran. Many of these routines "release" the GIL while they run compiled code and "grab" it back only when they are done. For a massive matrix multiplication, NumPy typically hands the work to a BLAS library, which may itself spread the heavy lifting across multiple cores.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; This is why Python is a powerhouse for Data Science despite the GIL.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  2. I/O Operations
&lt;/h3&gt;

&lt;p&gt;When a thread is waiting for something external—like a website to respond, a file to be read from a disk, or a database query—it &lt;strong&gt;voluntarily releases the GIL&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;While Thread A waits for a download, Thread B can take the GIL and start working. This makes Python threads excellent for &lt;strong&gt;network-heavy tasks&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Multiprocessing
&lt;/h3&gt;

&lt;p&gt;As we discussed earlier, the GIL is per-interpreter. By using the &lt;code&gt;multiprocessing&lt;/code&gt; module, you launch &lt;strong&gt;entirely separate instances&lt;/strong&gt; of the Python interpreter.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each process has its own GIL.&lt;/li&gt;
&lt;li&gt;Each process can sit on its own CPU core.&lt;/li&gt;
&lt;li&gt;This is the standard way to achieve "True Parallelism" in Python for pure Python code.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Summary: Threading vs. Multiprocessing in Python
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task Type&lt;/th&gt;
&lt;th&gt;Recommended Approach&lt;/th&gt;
&lt;th&gt;Why?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;CPU-Bound&lt;/strong&gt; (Math, Compression)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;multiprocessing&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Bypasses GIL, uses all cores.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;I/O-Bound&lt;/strong&gt; (Web Scraping, API)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;threading&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Efficiently uses "waiting time."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scientific Computing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NumPy / Pandas&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Releases GIL internally in C code.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;In Python, the &lt;code&gt;threading&lt;/code&gt; module is the go-to choice for tasks where the bottleneck isn't your CPU's speed, but rather the &lt;strong&gt;latency&lt;/strong&gt; of external systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  4.1 Threading Use Cases
&lt;/h2&gt;

&lt;p&gt;Threads are ideal for &lt;strong&gt;I/O-bound&lt;/strong&gt; workloads. In these scenarios, the processor spends most of its time idle, waiting for a response from a device or network.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Network Requests:&lt;/strong&gt; Fetching data from multiple APIs or web scraping. While Thread A waits for a server in New York to respond, the GIL is released, allowing Thread B to start a request to a server in London.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disk Operations:&lt;/strong&gt; Reading or writing multiple files. Since disk I/O is orders of magnitude slower than the CPU, threads allow you to overlap the "wait time" of different file operations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User Interfaces (GUIs):&lt;/strong&gt; Keeping the interface responsive. One thread handles the "click" events while a background thread does the heavy lifting, preventing the window from freezing.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  4.2 ThreadPoolExecutor
&lt;/h2&gt;

&lt;p&gt;Modern Python development favors &lt;code&gt;concurrent.futures.ThreadPoolExecutor&lt;/code&gt; over creating and joining &lt;code&gt;threading.Thread&lt;/code&gt; objects by hand. It provides a higher-level interface for managing a "pool" of threads.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is a Thread Pool?
&lt;/h3&gt;

&lt;p&gt;Instead of creating and destroying a thread for every single task (which is expensive), you create a &lt;strong&gt;Pool&lt;/strong&gt; of workers that stay alive and pick up tasks from a queue as they become available.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Methods: &lt;code&gt;map&lt;/code&gt; vs. &lt;code&gt;submit&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; offers two primary ways to run tasks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;&lt;code&gt;map(func, *iterables)&lt;/code&gt;:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Works like the built-in &lt;code&gt;map&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Applies the function to every item in the iterable, running the calls concurrently across the pool's threads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt; Very simple; yields results in the same order as the inputs, regardless of which call finishes first.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;&lt;code&gt;submit(func, *args)&lt;/code&gt;:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Schedules a single callable and returns a &lt;strong&gt;Future&lt;/strong&gt; object.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt; More flexible; allows you to handle individual task completion and different arguments for each task.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
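
&lt;p&gt;The difference between the two methods is easiest to see side by side. In this minimal sketch, &lt;code&gt;slow_square&lt;/code&gt; is a hypothetical stand-in for an I/O call, written so that larger inputs finish sooner.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def slow_square(n):
    # Hypothetical stand-in for an I/O call: larger inputs wait less.
    time.sleep(0.05 * (4 - n))
    return n * n

numbers = [1, 2, 3]

with ThreadPoolExecutor(max_workers=3) as executor:
    # map: results come back in input order, whatever the finish order.
    mapped = list(executor.map(slow_square, numbers))

    # submit: one Future per call; as_completed yields each as it finishes.
    futures = {executor.submit(slow_square, n): n for n in numbers}
    finished = [(futures[f], f.result()) for f in as_completed(futures)]

print(mapped)    # [1, 4, 9]
print(finished)  # same pairs, listed in completion order
```

&lt;p&gt;Reach for &lt;code&gt;map&lt;/code&gt; when you want every result in input order, and for &lt;code&gt;submit&lt;/code&gt; with &lt;code&gt;as_completed&lt;/code&gt; when you want to react to each task as soon as it finishes.&lt;/p&gt;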

&lt;h3&gt;
  
  
  Code Example: Efficiently Fetching Data
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;urls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://google.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://python.org&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://github.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Using a context manager ensures threads are cleaned up automatically
&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# 'map' handles the distribution of URLs to the 3 threads
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fetch_status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why use a Pool instead of manual Threads?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resource Management:&lt;/strong&gt; It prevents you from accidentally spawning 10,000 threads and crashing your system.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cleanliness:&lt;/strong&gt; Using the &lt;code&gt;with&lt;/code&gt; statement (context manager) ensures that all threads are joined and resources are released even if an error occurs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Future Objects:&lt;/strong&gt; It provides "Futures," which are placeholders for results that haven't happened yet, allowing you to check whether a task is "done" or was "cancelled."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To master parallel computing, you must be able to diagnose the &lt;strong&gt;bottleneck&lt;/strong&gt;. Is your code waiting for the "brain" (CPU) or the "delivery truck" (I/O)? Choosing the wrong tool for the workload can actually make your code slower.&lt;/p&gt;




&lt;h2&gt;
  
  
  6.1 Identifying Workload Characteristics
&lt;/h2&gt;

&lt;h3&gt;
  
  
  CPU-Bound (Compute-Heavy)
&lt;/h3&gt;

&lt;p&gt;The speed is limited by the &lt;strong&gt;CPU's clock speed and core count&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Examples:&lt;/strong&gt; Matrix multiplication, image processing, data compression, searching for prime numbers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Significance:&lt;/strong&gt; These tasks keep at least one CPU core close to 100% utilization.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  I/O-Bound (Wait-Heavy)
&lt;/h3&gt;

&lt;p&gt;The speed is limited by &lt;strong&gt;Input/Output operations&lt;/strong&gt;. The CPU often sits idle, waiting for data.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Examples:&lt;/strong&gt; Web scraping (Network), reading thousands of small CSVs (Disk), waiting for a database query to return.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Significance:&lt;/strong&gt; Processor usage is usually low; the system is waiting on external latency.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  6.2 Performance Comparison Table
&lt;/h2&gt;

&lt;p&gt;Here is how each execution style behaves under different pressures:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workload Type&lt;/th&gt;
&lt;th&gt;Serial Execution&lt;/th&gt;
&lt;th&gt;Multithreading&lt;/th&gt;
&lt;th&gt;Multiprocessing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;I/O-Bound&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Very Slow (Total wait time)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Fastest&lt;/strong&gt; (Overlaps wait time)&lt;/td&gt;
&lt;td&gt;Fast (But uses more memory)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CPU-Bound&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Slowest&lt;/strong&gt; (GIL overhead + context switching)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Fastest&lt;/strong&gt; (Uses all cores)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  6.3 Demonstrations (Mental Model)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  I/O-Bound: The &lt;code&gt;sleep()&lt;/code&gt; Test
&lt;/h3&gt;

&lt;p&gt;Imagine a task that does nothing but &lt;code&gt;time.sleep(1)&lt;/code&gt;. This simulates waiting for a network response.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Serial:&lt;/strong&gt; To do this 10 times, it takes &lt;strong&gt;10 seconds&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multithreading:&lt;/strong&gt; You spawn 10 threads. They all start "sleeping" at the same time. The total time is roughly &lt;strong&gt;1 second&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why?&lt;/strong&gt; The GIL is released during &lt;code&gt;sleep()&lt;/code&gt;, letting threads wait in parallel.&lt;/li&gt;
&lt;/ul&gt;
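
&lt;p&gt;The mental model above can be checked with a few lines of code. This is a scaled-down sketch (5 tasks of 0.2 seconds each instead of 10 tasks of 1 second), and the function names are chosen here for illustration.&lt;/p&gt;

```python
import threading
import time

def wait_for_response():
    time.sleep(0.2)  # stands in for a slow network call

def run_serial(n):
    start = time.perf_counter()
    for _ in range(n):
        wait_for_response()
    return time.perf_counter() - start

def run_threaded(n):
    start = time.perf_counter()
    threads = [threading.Thread(target=wait_for_response) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serial_time = run_serial(5)
threaded_time = run_threaded(5)
print(f"serial: {serial_time:.2f}s, threaded: {threaded_time:.2f}s")
```

&lt;p&gt;On a typical machine the serial run should take about one second and the threaded run about a fifth of that, because the sleeps overlap.&lt;/p&gt;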

&lt;h3&gt;
  
  
  CPU-Bound: The Mathematical Loop
&lt;/h3&gt;

&lt;p&gt;Imagine calculating the sum of squares for 50 million numbers.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Serial:&lt;/strong&gt; Takes &lt;strong&gt;X&lt;/strong&gt; seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multithreading:&lt;/strong&gt; Takes &lt;strong&gt;X + overhead&lt;/strong&gt; seconds. Because of the GIL, only one thread can do math at a time. The CPU is essentially "juggling" threads, which wastes time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiprocessing:&lt;/strong&gt; If you have 4 cores, it takes roughly &lt;strong&gt;X / 4&lt;/strong&gt; seconds in the best case, plus the cost of starting the processes and moving data between them. Each core handles a chunk of the numbers independently.&lt;/li&gt;
&lt;/ul&gt;
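
&lt;p&gt;Here is one way the multiprocessing version might look. It is a sketch under stated assumptions: the helper names &lt;code&gt;sum_of_squares&lt;/code&gt;, &lt;code&gt;split_range&lt;/code&gt;, and &lt;code&gt;parallel_sum_of_squares&lt;/code&gt; are invented for this example, and the input is scaled down from 50 million so that it finishes quickly.&lt;/p&gt;

```python
from concurrent.futures import ProcessPoolExecutor

def sum_of_squares(start, stop):
    # CPU-bound work on one chunk: the half-open range [start, stop).
    return sum(i * i for i in range(start, stop))

def split_range(n, parts):
    # Split range(n) into contiguous (start, stop) chunks, assuming n is positive.
    step = -(-n // parts)  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]

def parallel_sum_of_squares(n, workers=4):
    chunks = split_range(n, workers)
    starts = [c[0] for c in chunks]
    stops = [c[1] for c in chunks]
    # Each chunk is summed in a separate process with its own GIL.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, starts, stops))

if __name__ == "__main__":
    n = 2_000_000  # scaled down from the 50 million in the text
    print(parallel_sum_of_squares(n) == sum_of_squares(0, n))
```

&lt;p&gt;Each worker returns a single integer, so very little data crosses the process boundary. That is the situation in which multiprocessing comes closest to the ideal speedup.&lt;/p&gt;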




&lt;h2&gt;
  
  
  Summary: The Decision Tree
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Is the CPU usage low while the program is running?&lt;/strong&gt; → It's I/O-bound. Use &lt;code&gt;threading&lt;/code&gt; or &lt;code&gt;asyncio&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Is one core pegged at 100%?&lt;/strong&gt; → It's CPU-bound. Use &lt;code&gt;multiprocessing&lt;/code&gt; or a library like &lt;code&gt;NumPy&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Are you limited by memory?&lt;/strong&gt; → Be careful with &lt;code&gt;multiprocessing&lt;/code&gt;, as each worker process needs its own copy of the data it works on.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;While &lt;strong&gt;Multithreading&lt;/strong&gt; is like having one chef with multiple hands, &lt;strong&gt;Multiprocessing&lt;/strong&gt; is like hiring four chefs in four separate kitchens. In standard CPython, with the GIL in place, separate processes are the standard way to achieve "true" parallelism for Python-native code.&lt;/p&gt;




&lt;h2&gt;
  
  
  7.1 The &lt;code&gt;multiprocessing&lt;/code&gt; Module
&lt;/h2&gt;

&lt;p&gt;This module bypasses the GIL by running each worker in its own process, with its own instance of the Python interpreter.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Process-based parallelism:&lt;/strong&gt; Each process has its own memory space and its own GIL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Safety:&lt;/strong&gt; Since memory isn't shared by default, one process can't accidentally corrupt another's data.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  7.2 Pool, Map, and Starmap
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;multiprocessing.Pool&lt;/code&gt; class is the workhorse for data-parallel tasks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;map(func, iterable)&lt;/code&gt;:&lt;/strong&gt; The simplest way to parallelize. It chops the iterable into chunks and sends them to the worker processes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;starmap(func, iterable_of_tuples)&lt;/code&gt;:&lt;/strong&gt; Used when your function requires multiple arguments. 

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Example:&lt;/em&gt; If &lt;code&gt;func(x, y)&lt;/code&gt; is your function, &lt;code&gt;starmap&lt;/code&gt; takes &lt;code&gt;[(1, 2), (3, 4)]&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
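
&lt;p&gt;A short sketch of both calls follows. The functions &lt;code&gt;square&lt;/code&gt; and &lt;code&gt;power&lt;/code&gt; are hypothetical examples, defined at module level so that worker processes can import them.&lt;/p&gt;

```python
from multiprocessing import Pool

def square(x):
    return x * x

def power(base, exponent):
    return base ** exponent

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        # map: one argument per call
        print(pool.map(square, [1, 2, 3, 4]))                  # [1, 4, 9, 16]
        # starmap: each tuple is unpacked into positional arguments
        print(pool.starmap(power, [(2, 3), (3, 2), (10, 0)]))  # [8, 9, 1]
```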

&lt;h3&gt;
  
  
  &lt;code&gt;ProcessPoolExecutor&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Found in &lt;code&gt;concurrent.futures&lt;/code&gt;, this provides an identical interface to the &lt;code&gt;ThreadPoolExecutor&lt;/code&gt; we saw earlier. It is generally preferred in modern code for its consistency and better error handling.&lt;/p&gt;




&lt;h2&gt;
  
  
  7.3 Communication &amp;amp; Shared Memory
&lt;/h2&gt;

&lt;p&gt;Sometimes processes &lt;strong&gt;do&lt;/strong&gt; need to talk to each other. Since they don't share memory, we use special constructs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Best For...&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;Value&lt;/code&gt; / &lt;code&gt;Array&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Allocates a small piece of shared memory (C-style) that all processes can see.&lt;/td&gt;
&lt;td&gt;Simple counters or flags.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;Queue&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A thread- and process-safe FIFO (First-In-First-Out) pipe.&lt;/td&gt;
&lt;td&gt;Passing complex objects or results back to the main process.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;Pipe&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A direct connection between two processes.&lt;/td&gt;
&lt;td&gt;Fast, two-way communication between exactly two workers.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
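
&lt;p&gt;As an illustration of the &lt;code&gt;Queue&lt;/code&gt; row, the sketch below has two worker processes send partial sums back to the main process. The &lt;code&gt;worker&lt;/code&gt; function and the sample data are assumptions made for this example.&lt;/p&gt;

```python
from multiprocessing import Process, Queue

def worker(numbers, out_queue):
    # Each worker sends its partial result back as a message.
    out_queue.put(sum(numbers))

if __name__ == "__main__":
    results = Queue()
    chunks = [[1, 2, 3], [4, 5, 6]]
    procs = [Process(target=worker, args=(chunk, results)) for chunk in chunks]
    for p in procs:
        p.start()
    # Read the results before joining so no worker blocks on a full pipe.
    partials = [results.get() for _ in procs]
    for p in procs:
        p.join()
    print(sum(partials))  # 21
```

&lt;p&gt;The partial sums may arrive in either order, which is why the example only relies on their total.&lt;/p&gt;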




&lt;h2&gt;
  
  
  7.4 Limitations in Interactive Environments (Jupyter)
&lt;/h2&gt;

&lt;p&gt;A common "gotcha" for data scientists is that the &lt;code&gt;multiprocessing&lt;/code&gt; module often fails or behaves unpredictably in &lt;strong&gt;Jupyter Notebooks&lt;/strong&gt; or the &lt;strong&gt;IPython&lt;/strong&gt; console.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Serialization (Pickling):&lt;/strong&gt; Python must "pickle" (serialize) your function and data to send it to the other process. If you define a function inside a notebook cell, the worker process might not be able to find its definition.&lt;/li&gt;
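&lt;li&gt; &lt;strong&gt;The &lt;code&gt;if __name__ == "__main__":&lt;/code&gt; block:&lt;/strong&gt; On platforms where child processes are started with the &lt;code&gt;spawn&lt;/code&gt; method (the default on Windows and macOS), each child re-imports your main module. You &lt;strong&gt;must&lt;/strong&gt; wrap the code that launches processes in this block so that the children do not try to launch processes of their own.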

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Jupyter doesn't always handle this entry point correctly.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Workaround:&lt;/strong&gt; If you run into issues in Jupyter, move your functions into a separate &lt;code&gt;.py&lt;/code&gt; file and import them into your notebook.&lt;/p&gt;




&lt;h2&gt;
  
  
  7.5 Summary: When to use Multiprocessing
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;YES:&lt;/strong&gt; For "number crunching" (e.g., calculating $\pi$ to a billion digits).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;YES:&lt;/strong&gt; For heavy image/video processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NO:&lt;/strong&gt; For simple I/O (it uses way more RAM than threads).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NO:&lt;/strong&gt; When you need to share massive amounts of data (the "pickling" overhead will kill your performance).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When multiple threads or processes try to change the same piece of data at the same time, you enter the world of &lt;strong&gt;Race Conditions&lt;/strong&gt;. This is the most common source of "heisenbugs"—bugs that seem to disappear when you try to look for them.&lt;/p&gt;




&lt;h2&gt;
  
  
  8.1 Shared State Modification Problems
&lt;/h2&gt;

&lt;p&gt;A race condition occurs when the final outcome of a program depends on the &lt;strong&gt;timing&lt;/strong&gt; or &lt;strong&gt;scheduling&lt;/strong&gt; of the execution. &lt;/p&gt;

&lt;p&gt;If two threads are incrementing a shared variable, the operation looks like one step in Python (&lt;code&gt;x += 1&lt;/code&gt;), but it is actually carried out as three distinct steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Read&lt;/strong&gt; the current value of $x$.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Add&lt;/strong&gt; 1 to that value.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Write&lt;/strong&gt; the new value back to memory.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If Thread A is interrupted after step 1, and Thread B finishes all three steps, Thread A will eventually overwrite Thread B's work with an outdated value.&lt;/p&gt;
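&lt;p&gt;The interleaving described above can be replayed by hand, without any real threads. This is only an illustration: the variables &lt;code&gt;a_local&lt;/code&gt; and &lt;code&gt;b_local&lt;/code&gt; are hypothetical stand-ins for each thread's private copy of the value.&lt;/p&gt;

```python
x = 0

# Thread A performs step 1 (read) and is then paused by the scheduler.
a_local = x

# Thread B runs all three steps without interruption.
b_local = x            # read
b_local = b_local + 1  # add
x = b_local            # write: x is now 1

# Thread A resumes with its stale copy and finishes steps 2 and 3.
a_local = a_local + 1  # add, based on the old value 0
x = a_local            # write: overwrites Thread B's update

print(x)  # 1, although two increments were performed
```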




&lt;h2&gt;
  
  
  8.2 Demonstration of Incorrect Results
&lt;/h2&gt;

&lt;p&gt;In a perfectly synchronized world, if you have 10 threads each adding 1 to a counter 100,000 times, the result should be &lt;strong&gt;1,000,000&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In a race condition scenario, the result might be &lt;strong&gt;742,384&lt;/strong&gt;. This happens because thousands of "updates" were lost when threads stomped on each other’s data.&lt;/p&gt;
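&lt;p&gt;A small sketch of how such lost updates can be provoked. It deliberately splits the increment into a read and a write and calls &lt;code&gt;time.sleep(0)&lt;/code&gt; in between to invite a thread switch; that widening of the race window is an assumption made for the demo, since a plain &lt;code&gt;counter += 1&lt;/code&gt; may lose updates far less often on some Python versions. The thread and iteration counts are scaled down, and the exact total varies from run to run.&lt;/p&gt;

```python
import threading
import time

counter = 0


def unsafe_increment(times):
    """Increment the shared counter with an explicit read-modify-write."""
    global counter
    for _ in range(times):
        current = counter      # read
        time.sleep(0)          # yield to other threads, widening the race window
        counter = current + 1  # write back a possibly stale value


def run_unsafe(num_threads=4, times=500):
    """Reset the counter, run the threads, and return the final total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=unsafe_increment, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    total = run_unsafe()
    print(f"expected 2000, got {total}")
```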




&lt;h2&gt;
  
  
  8.3 Threading vs. Multiprocessing Behavior
&lt;/h2&gt;

&lt;p&gt;The way these two handle "shared state" is fundamentally different, which changes how they fail.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In Multithreading:&lt;/strong&gt; Race conditions are &lt;strong&gt;common and dangerous&lt;/strong&gt;. Because all threads share the same memory, they can all "see" and "touch" the same variables globally.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In Multiprocessing:&lt;/strong&gt; Race conditions are &lt;strong&gt;rare&lt;/strong&gt; by default. Since each process has its own memory, incrementing &lt;code&gt;x&lt;/code&gt; in Process A does nothing to &lt;code&gt;x&lt;/code&gt; in Process B.

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Exception:&lt;/em&gt; You only face race conditions in multiprocessing if you explicitly use &lt;strong&gt;Shared Memory constructs&lt;/strong&gt; (like &lt;code&gt;Value&lt;/code&gt; or &lt;code&gt;Array&lt;/code&gt;) or shared external resources (like a database or a file on disk).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  8.4 Synchronization Primitives
&lt;/h2&gt;

&lt;p&gt;To fix these issues, we use tools that force threads to "wait their turn."&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The Lock (Mutex)
&lt;/h3&gt;

&lt;p&gt;A Lock is the simplest tool. It has two states: &lt;strong&gt;locked&lt;/strong&gt; and &lt;strong&gt;unlocked&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A thread must "acquire" the lock before touching the shared data.&lt;/li&gt;
&lt;li&gt;If another thread holds the lock, everyone else must wait.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analogy:&lt;/strong&gt; The "talking stick" in a meeting. You can't speak unless you hold the stick.&lt;/li&gt;
&lt;/ul&gt;
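&lt;p&gt;As a sketch, here is a shared counter protected by a &lt;code&gt;threading.Lock&lt;/code&gt;. The function names are illustrative. Because only one thread at a time can be inside the &lt;code&gt;with&lt;/code&gt; block, no update is lost and the total is exact.&lt;/p&gt;

```python
import threading

counter = 0
counter_lock = threading.Lock()


def safe_increment(times):
    """Increment the shared counter, holding the lock for each update."""
    global counter
    for _ in range(times):
        with counter_lock:  # acquired on entry, released on exit
            counter += 1


def run_safe(num_threads=10, times=10_000):
    """Reset the counter, run the threads, and return the final total."""
    global counter
    counter = 0
    threads = [threading.Thread(target=safe_increment, args=(times,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print(run_safe())  # 100000
```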

&lt;h3&gt;
  
  
  2. The Semaphore
&lt;/h3&gt;

&lt;p&gt;A Semaphore is like a Lock, but it allows a &lt;strong&gt;specific number&lt;/strong&gt; of threads to enter.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Analogy:&lt;/strong&gt; A restaurant with 10 tables. The first 10 groups get in; the 11th must wait until someone leaves.&lt;/li&gt;
&lt;/ul&gt;
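&lt;p&gt;The restaurant analogy can be sketched with &lt;code&gt;threading.Semaphore&lt;/code&gt;. The capacity of 3 and the names &lt;code&gt;dine&lt;/code&gt; and &lt;code&gt;run_restaurant&lt;/code&gt; are assumptions chosen to keep the demo short; the code also records the highest number of groups seated at once to show that the limit holds.&lt;/p&gt;

```python
import threading
import time

TABLES = 3  # illustrative capacity; the analogy above uses 10
tables = threading.Semaphore(TABLES)
stats_lock = threading.Lock()
seated = 0
peak_seated = 0


def dine(seconds=0.01):
    """Wait for a free table, stay for a moment, then leave."""
    global seated, peak_seated
    with tables:  # blocks while all tables are taken
        with stats_lock:
            seated += 1
            peak_seated = max(peak_seated, seated)
        time.sleep(seconds)
        with stats_lock:
            seated -= 1


def run_restaurant(groups=10):
    """Let `groups` threads compete for the tables; return the peak occupancy."""
    global seated, peak_seated
    seated = peak_seated = 0
    threads = [threading.Thread(target=dine) for _ in range(groups)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak_seated


if __name__ == "__main__":
    print(f"peak occupancy: {run_restaurant()} of {TABLES} tables")
```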

&lt;h3&gt;
  
  
  3. The RLock (Re-entrant Lock)
&lt;/h3&gt;

&lt;p&gt;A standard Lock can cause a thread to "deadlock itself" if it tries to acquire the same lock twice. An &lt;code&gt;RLock&lt;/code&gt; allows the &lt;em&gt;same&lt;/em&gt; thread to acquire the lock multiple times without freezing.&lt;/p&gt;
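&lt;p&gt;A short sketch of where re-entrancy matters: one locked method calling another locked method on the same object. The &lt;code&gt;Inventory&lt;/code&gt; class is hypothetical and exists only for this illustration.&lt;/p&gt;

```python
import threading


class Inventory:
    """Toy shared object whose public methods all take the same lock."""

    def __init__(self):
        self._lock = threading.RLock()
        self.items = 0

    def add(self, count):
        with self._lock:
            self.items += count

    def add_pair(self):
        # Holds the lock and then calls add(), which acquires it again.
        # With a plain threading.Lock this nested acquire would block forever.
        with self._lock:
            self.add(1)
            self.add(1)


inventory = Inventory()
inventory.add_pair()
print(inventory.items)  # 2
```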




&lt;h2&gt;
  
  
  Summary: The Cost of Safety
&lt;/h2&gt;

&lt;p&gt;While synchronization prevents data corruption, it comes with a performance price:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Overhead:&lt;/strong&gt; Managing locks takes CPU time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serial Bottlenecks:&lt;/strong&gt; If every thread is waiting for the same lock, your "parallel" program is actually running one-by-one (serial).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Numerical integration is a "perfect" parallel problem. It follows the &lt;strong&gt;embarrassingly parallel&lt;/strong&gt; pattern, where a large task can be easily divided into independent sub-tasks that don't need to communicate with each other.&lt;/p&gt;




&lt;h2&gt;
  
  
  9.1 The Grid-Based Technique (Rectangle Rule)
&lt;/h2&gt;

&lt;p&gt;To find the area under a curve $f(x)$ between $a$ and $b$, we divide the interval into $N$ small rectangles, each of width $\Delta x = (b - a)/N$. The total area is the sum of the areas of these rectangles.&lt;/p&gt;

&lt;p&gt;$$Area \approx \sum_{i=0}^{N-1} f(x_i) \Delta x$$&lt;/p&gt;

&lt;p&gt;In a &lt;strong&gt;serial&lt;/strong&gt; approach, a single CPU core calculates rectangle #1, then #2, then #3, all the way to $N$. If $N$ is 100 million, this takes a significant amount of time.&lt;/p&gt;
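&lt;p&gt;A serial version of the rectangle rule might look like the sketch below. It assumes left-endpoint rectangles, $x_i = a + i \Delta x$, which matches the sum running from $i = 0$ to $N - 1$; the function names and the test integrand $x^2$ are illustrative.&lt;/p&gt;

```python
def rectangle_rule(f, a, b, n):
    """Approximate the integral of f over [a, b] with n left-endpoint rectangles."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x_i = a + i * dx  # left edge of rectangle i
        total += f(x_i)
    return total * dx


def square(x):
    return x * x


if __name__ == "__main__":
    # The exact value of the integral of x^2 over [0, 1] is 1/3.
    print(rectangle_rule(square, 0.0, 1.0, 100_000))
```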




&lt;h2&gt;
  
  
  9.2 Identifying Parallelizable Regions
&lt;/h2&gt;

&lt;p&gt;The beauty of integration is that the calculation of "Rectangle #500" does not depend on the result of "Rectangle #499."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The Strategy:&lt;/strong&gt; Split the total range $[a, b]$ into sub-intervals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Workers:&lt;/strong&gt; If you have 4 cores, Core 1 handles the first 25%, Core 2 the second 25%, and so on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Reduction:&lt;/strong&gt; Once all cores finish their local sums, you add those 4 sums together to get the final answer.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  9.3 Implementation Strategies
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Multithreading Approach
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; Low. Because integration is &lt;strong&gt;CPU-bound&lt;/strong&gt; (pure math), the Python GIL will prevent the threads from running the math in parallel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case:&lt;/strong&gt; Only beneficial if the function $f(x)$ involved an I/O wait (e.g., fetching a coordinate from a remote database), which is rare in pure math.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Multiprocessing Approach
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; High. This is the correct tool. By using a &lt;code&gt;ProcessPoolExecutor&lt;/code&gt;, each core gets a chunk of the grid.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficiency:&lt;/strong&gt; You get nearly "linear scaling." If 1 core takes 10 seconds, 4 cores should take roughly 2.5 seconds.&lt;/li&gt;
&lt;/ul&gt;
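&lt;p&gt;The multiprocessing approach could be sketched as follows, assuming the integrand $f(x) = x^2$ and left-endpoint rectangles. The helper names &lt;code&gt;make_chunks&lt;/code&gt;, &lt;code&gt;partial_sum&lt;/code&gt; and &lt;code&gt;parallel_integrate&lt;/code&gt; are hypothetical. Each worker receives a contiguous range of rectangle indices, and the parent adds up the local sums.&lt;/p&gt;

```python
from concurrent.futures import ProcessPoolExecutor


def f(x):
    return x * x


def partial_sum(args):
    """Sum f at the left edges of rectangles start..stop-1."""
    a, dx, start, stop = args
    return sum(f(a + i * dx) for i in range(start, stop))


def make_chunks(a, b, n, workers):
    """Split rectangle indices 0..n-1 into `workers` contiguous ranges."""
    dx = (b - a) / n
    bounds = [n * k // workers for k in range(workers + 1)]
    return [(a, dx, bounds[k], bounds[k + 1]) for k in range(workers)]


def parallel_integrate(a, b, n, workers=4):
    chunks = make_chunks(a, b, n, workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        local_sums = pool.map(partial_sum, chunks)  # one task per chunk
        return sum(local_sums) * (b - a) / n        # the reduction step


if __name__ == "__main__":
    print(parallel_integrate(0.0, 1.0, 1_000_000))
```

&lt;p&gt;Passing plain tuples keeps the pickled payload tiny: only four numbers travel to each worker, and one float comes back.&lt;/p&gt;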




&lt;h2&gt;
  
  
  9.4 Performance Measurement
&lt;/h2&gt;

&lt;p&gt;To prove the speedup, we use the &lt;code&gt;time&lt;/code&gt; module; &lt;code&gt;time.perf_counter()&lt;/code&gt; is the clock intended for measuring short durations. It is vital to measure only the calculation, excluding the time it takes to set up the data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# ... Parallel Integration Logic ...
&lt;/span&gt;
&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Execution Time: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;end_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; seconds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Critical Metrics:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Speedup ($S$):&lt;/strong&gt; $S = \frac{T_{serial}}{T_{parallel}}$&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Efficiency ($E$):&lt;/strong&gt; $E = \frac{S}{p}$, where $p$ is the number of cores (ideally, this is close to 1.0 or 100%).&lt;/li&gt;
&lt;/ol&gt;
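&lt;p&gt;Both metrics are simple ratios. The sketch below plugs in hypothetical timings, 10.0 s for the serial run and 2.5 s on 4 cores, which is the ideal case mentioned earlier; the function names are illustrative.&lt;/p&gt;

```python
def speedup(t_serial, t_parallel):
    """S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, cores):
    """E = S / number of cores."""
    return speedup(t_serial, t_parallel) / cores


# Hypothetical timings, not measurements.
print(speedup(10.0, 2.5))        # 4.0
print(efficiency(10.0, 2.5, 4))  # 1.0
```

&lt;p&gt;Real runs land below this ideal because of process start-up and data transfer costs.&lt;/p&gt;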




&lt;h2&gt;
  
  
  9.5 Summary Table: Integration Performance
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Execution&lt;/th&gt;
&lt;th&gt;Expected Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Serial&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One core, one by one.&lt;/td&gt;
&lt;td&gt;1x (Baseline)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multithreading&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One thread at a time (GIL), with switching overhead.&lt;/td&gt;
&lt;td&gt;~0.9x (Slower due to overhead)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multiprocessing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multiple cores simultaneously.&lt;/td&gt;
&lt;td&gt;~3.8x (on a 4-core machine)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;NumPy (Vectorized)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optimized C-backend/SIMD.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Fastest&lt;/strong&gt; (often 50x - 100x)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;To wrap up our foundations, we look at the "low-hanging fruit" of the computing world. An &lt;strong&gt;Embarrassingly Parallel&lt;/strong&gt; problem is one where little to no effort is needed to separate the problem into a number of parallel tasks.&lt;/p&gt;




&lt;h2&gt;
  
  
  10.1 Definition and Characteristics
&lt;/h2&gt;

&lt;p&gt;A problem is embarrassingly parallel if there is &lt;strong&gt;no dependency&lt;/strong&gt; (or very little) between the sub-tasks. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No Communication:&lt;/strong&gt; Task A doesn't need to know what Task B is doing to finish its job.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Shared State:&lt;/strong&gt; Workers don't need to update a global variable constantly (which avoids those pesky race conditions).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High Scalability:&lt;/strong&gt; These problems scale almost perfectly; doubling your CPU cores usually halves the execution time.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  10.2 Core Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Monte Carlo Simulations
&lt;/h3&gt;

&lt;p&gt;These simulations use repeated random sampling to obtain numerical results (like predicting stock market trends or calculating $\pi$). Since every "random trial" is independent, you can run a million trials on one core or divide them across a thousand cores with zero logic changes.&lt;/p&gt;
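&lt;p&gt;As a rough sketch of this independence, the estimate of $\pi$ below is built from separate chunks of trials, each with its own seeded random generator. The names and the chunk sizes are assumptions. The built-in &lt;code&gt;map()&lt;/code&gt; is used here for simplicity; since no chunk depends on another, it could be swapped for a process pool's &lt;code&gt;map()&lt;/code&gt; without touching &lt;code&gt;count_hits&lt;/code&gt;.&lt;/p&gt;

```python
import random


def count_hits(seed, trials):
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)  # private generator: no state shared between chunks
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()
        if 1.0 >= x * x + y * y:
            hits += 1
    return hits


def estimate_pi(chunks=4, trials_per_chunk=50_000):
    """Combine the independent chunk results into one estimate of pi."""
    seeds = range(chunks)
    hits = sum(map(count_hits, seeds, [trials_per_chunk] * chunks))
    return 4.0 * hits / (chunks * trials_per_chunk)


if __name__ == "__main__":
    print(estimate_pi())
```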

&lt;h3&gt;
  
  
  Weather Ensemble Models
&lt;/h3&gt;

&lt;p&gt;Meteorologists don't just run one weather forecast; they run dozens of "ensembles" with slightly different starting conditions. Since Forecast A doesn't affect Forecast B, they are computed in parallel across massive supercomputers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Batch Data Processing
&lt;/h3&gt;

&lt;p&gt;Imagine you have 10,000 high-resolution photos to resize. Resizing photo #1 has nothing to do with photo #100. This is a classic "Map" operation where a worker pool can chew through the pile of files as fast as the disk can provide them.&lt;/p&gt;

&lt;h3&gt;
  
  
  CNN (Convolutional Neural Network) Workloads
&lt;/h3&gt;

&lt;p&gt;In Deep Learning, a Convolutional layer applies filters to an image. Each "pixel" calculation or each "filter" application can be done independently. This is why GPUs—which have thousands of tiny cores—are so much faster than CPUs for AI tasks.&lt;/p&gt;

&lt;h3&gt;
  
  
  FFT (Fast Fourier Transform)
&lt;/h3&gt;

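&lt;p&gt;While the classic DFT is $O(N^2)$, the FFT reduces complexity to $O(N \log N)$. In many implementations, the data is split into "even" and "odd" parts that can be processed recursively in parallel, making it a staple of digital signal processing. Strictly speaking, only the two halves are independent: their results must be combined at every level of the recursion, so a single FFT needs some communication. The truly embarrassingly parallel case is a batch of separate FFTs, for example one per audio frame or image row.&lt;/p&gt;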
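&lt;p&gt;The even/odd split can be sketched as a recursive radix-2 FFT in plain Python. This is a teaching sketch that assumes the input length is a power of two, not a replacement for an optimized library; the direct &lt;code&gt;dft&lt;/code&gt; function is included only as a reference to compare against.&lt;/p&gt;

```python
import cmath


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])  # independent sub-problem
    odd = fft(values[1::2])   # independent sub-problem
    half = n // 2
    result = [0j] * n
    for k in range(half):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        result[k] = even[k] + twiddle         # the combine step needs both halves
        result[k + half] = even[k] - twiddle
    return result


def dft(values):
    """Direct O(N^2) transform, used here only as a reference."""
    n = len(values)
    return [sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


if __name__ == "__main__":
    print(fft([1, 0, 0, 0]))  # a unit impulse transforms to all ones
```

&lt;p&gt;The two recursive calls could run in parallel, but the loop that follows has to wait for both of them.&lt;/p&gt;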




&lt;h2&gt;
  
  
  Summary of the "Parallel Spectrum"
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Communication Needs&lt;/th&gt;
&lt;th&gt;Difficulty to Parallelize&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Embarrassingly Parallel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Very Easy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Coarse-Grained&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Occasional&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fine-Grained&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Constant/Frequent&lt;/td&gt;
&lt;td&gt;Hard (High risk of overhead)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;While Python’s &lt;code&gt;multiprocessing&lt;/code&gt; is great for a single machine, &lt;strong&gt;MPI (Message Passing Interface)&lt;/strong&gt; is the gold standard for high-performance computing (HPC) across clusters of multiple computers. It is the language of supercomputers.&lt;/p&gt;




&lt;h2&gt;
  
  
  11.1 MPI Fundamentals
&lt;/h2&gt;

&lt;p&gt;Unlike the shared-memory models we’ve discussed, MPI is built entirely on the &lt;strong&gt;Distributed-Memory Model&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Independent Processes:&lt;/strong&gt; Each process has its own address space. There is no shared "global variable." If Process 0 has a variable &lt;code&gt;x&lt;/code&gt;, Process 1 cannot see it unless Process 0 explicitly sends it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The "Rank":&lt;/strong&gt; Every process in an MPI job is assigned a unique ID called a &lt;strong&gt;Rank&lt;/strong&gt; (starting from 0). You use this rank to tell each process what part of the work it should do.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The "Communicator":&lt;/strong&gt; This is a group of processes that can talk to each other. The default group containing all your processes is called &lt;code&gt;COMM_WORLD&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  11.2 mpi4py: MPI for Python
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;mpi4py&lt;/code&gt; library provides the Python bindings for the MPI standard. It allows Python scripts to communicate across a network.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Concepts
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;COMM_WORLD&lt;/code&gt;:&lt;/strong&gt; The primary communicator.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Get_size()&lt;/code&gt;:&lt;/strong&gt; Tells you the total number of processes running.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;Get_rank()&lt;/code&gt;:&lt;/strong&gt; Tells the current process its unique ID.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Point-to-Point Communication:&lt;/strong&gt; Using &lt;code&gt;send()&lt;/code&gt; and &lt;code&gt;recv()&lt;/code&gt; to move data between specific ranks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collective Communication:&lt;/strong&gt; Using &lt;code&gt;bcast()&lt;/code&gt; (one-to-all) or &lt;code&gt;reduce()&lt;/code&gt; (all-to-one) to synchronize data.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  11.3 Running MPI Programs
&lt;/h2&gt;

&lt;p&gt;Running an MPI script with plain &lt;code&gt;python script.py&lt;/code&gt; typically starts only a single process (rank 0 of 1). To run in parallel you must use a process manager, typically &lt;strong&gt;&lt;code&gt;mpirun&lt;/code&gt;&lt;/strong&gt; or &lt;strong&gt;&lt;code&gt;mpiexec&lt;/code&gt;&lt;/strong&gt;, which handles the launching of multiple instances across your CPU cores or network nodes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Command:&lt;/strong&gt;&lt;br&gt;
&lt;code&gt;mpirun -n 4 python3 my_script.py&lt;/code&gt;&lt;br&gt;
&lt;em&gt;(This launches 4 independent instances of your script.)&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: The "Who Am I?" Pattern
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mpi4py&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MPI&lt;/span&gt;

&lt;span class="c1"&gt;# Get the communicator
&lt;/span&gt;&lt;span class="n"&gt;comm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MPI&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;COMM_WORLD&lt;/span&gt;

&lt;span class="c1"&gt;# Get the size (total processes) and rank (my ID)
&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Get_size&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Get_rank&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello! I am process &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; out of &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; total processes.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;value&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dest&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Process 0 sent data to Process 1.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;comm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Process 1 received: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Summary: MPI vs. Multiprocessing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;&lt;code&gt;multiprocessing&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;
&lt;code&gt;mpi4py&lt;/code&gt; (MPI)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single Machine (Multi-core)&lt;/td&gt;
&lt;td&gt;Multi-Node (Clusters/Supercomputers)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Shared-memory constructs available&lt;/td&gt;
&lt;td&gt;Strictly Distributed (Message Passing)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Launch&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Standard Python interpreter&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;mpirun&lt;/code&gt; / &lt;code&gt;mpiexec&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scaling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Limited by one motherboard&lt;/td&gt;
&lt;td&gt;Scales to thousands of CPUs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;strong&gt;That covers the core toolbox of Parallel &amp;amp; Concurrent Computing.&lt;/strong&gt; We've traveled from CPU core stagnation all the way to distributed supercomputing.&lt;/p&gt;


&lt;p&gt;In MPI, communication is how independent processes coordinate to solve a single problem. There are two primary ways processes "talk": one-to-one (&lt;strong&gt;Point-to-Point&lt;/strong&gt;) or all-together (&lt;strong&gt;Collective&lt;/strong&gt;).&lt;/p&gt;




&lt;h2&gt;
  
  
  12.1 Point-to-Point Communication
&lt;/h2&gt;

&lt;p&gt;This is the most basic form of messaging, involving exactly two processes: a &lt;strong&gt;sender&lt;/strong&gt; and a &lt;strong&gt;receiver&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;send(obj, dest)&lt;/code&gt;&lt;/strong&gt;: The source process sends a Python object to a specific rank.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;recv(source)&lt;/code&gt;&lt;/strong&gt;: The destination process waits to receive an object from a specific rank.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blocking Communication:&lt;/strong&gt; By default, these operations are "blocking." The sender waits until the message is safely in the transmission buffer, and the receiver waits (sleeps) until the message actually arrives. If one process calls &lt;code&gt;recv()&lt;/code&gt; and no process ever calls the matching &lt;code&gt;send()&lt;/code&gt;, your program will hang forever.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  12.2 Collective Communication
&lt;/h2&gt;

&lt;p&gt;Collective operations involve &lt;strong&gt;all&lt;/strong&gt; processes in a communicator (e.g., &lt;code&gt;COMM_WORLD&lt;/code&gt;). These are highly optimized and usually much faster than writing multiple point-to-point loops.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Analogy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Broadcast (&lt;code&gt;bcast&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One process sends the same data to everyone else.&lt;/td&gt;
&lt;td&gt;A teacher giving a handout to the whole class.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scatter (&lt;code&gt;scatter&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One process takes a list and gives one piece to each process.&lt;/td&gt;
&lt;td&gt;Dealing a deck of cards to players.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gather (&lt;code&gt;gather&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One process collects a piece of data from everyone else into a list.&lt;/td&gt;
&lt;td&gt;A teacher collecting homework from every student.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reduce (&lt;code&gt;reduce&lt;/code&gt;)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Everyone sends data to one process, which "crunches" it (e.g., Sum, Max).&lt;/td&gt;
&lt;td&gt;Everyone votes, and the teller announces only the total count.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
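&lt;p&gt;A hedged sketch of two of these collectives working together: &lt;code&gt;bcast&lt;/code&gt; shares the problem parameters and &lt;code&gt;reduce&lt;/code&gt; sums the partial results of a rectangle-rule integral of $x^2$. The helper &lt;code&gt;local_sum&lt;/code&gt;, the strided assignment of rectangles to ranks, and the file name used below are assumptions made for this example. The optional import only lets the helper be used where &lt;code&gt;mpi4py&lt;/code&gt; is not installed.&lt;/p&gt;

```python
try:
    from mpi4py import MPI
except ImportError:  # lets the helper below be used without mpi4py installed
    MPI = None


def local_sum(rank, size, a, b, n):
    """Sum x_i^2 over the rectangles owned by this rank (i = rank, rank + size, ...)."""
    dx = (b - a) / n
    return sum((a + i * dx) ** 2 for i in range(rank, n, size))


def main():
    if MPI is None:
        print("mpi4py is not installed; skipping the MPI run")
        return
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()

    # Rank 0 chooses the problem and broadcasts it to every rank.
    params = {"a": 0.0, "b": 1.0, "n": 1_000_000} if rank == 0 else None
    params = comm.bcast(params, root=0)

    mine = local_sum(rank, size, params["a"], params["b"], params["n"])

    # Every rank contributes its local sum; only rank 0 receives the total.
    total = comm.reduce(mine, op=MPI.SUM, root=0)
    if rank == 0:
        dx = (params["b"] - params["a"]) / params["n"]
        print(f"Integral estimate: {total * dx:.6f}")


if __name__ == "__main__":
    main()
```

&lt;p&gt;Saved under a hypothetical name such as &lt;code&gt;integrate_mpi.py&lt;/code&gt;, this would be launched with &lt;code&gt;mpirun -n 4 python3 integrate_mpi.py&lt;/code&gt;.&lt;/p&gt;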




&lt;h2&gt;
  
  
  12.3 Performance: The "Case" Matters
&lt;/h2&gt;

&lt;p&gt;In &lt;code&gt;mpi4py&lt;/code&gt;, there is a massive performance difference between lowercase methods (e.g., &lt;code&gt;send&lt;/code&gt;) and uppercase methods (e.g., &lt;code&gt;Send&lt;/code&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  Lowercase Methods (&lt;code&gt;send&lt;/code&gt;, &lt;code&gt;recv&lt;/code&gt;, &lt;code&gt;bcast&lt;/code&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mechanism:&lt;/strong&gt; Uses &lt;strong&gt;pickle&lt;/strong&gt; to serialize Python objects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexibility:&lt;/strong&gt; Can send almost any Python object (dicts, lists, custom classes).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; &lt;strong&gt;Slower&lt;/strong&gt;. The overhead of pickling and unpickling large amounts of data can create a bottleneck.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Uppercase Methods (&lt;code&gt;Send&lt;/code&gt;, &lt;code&gt;Recv&lt;/code&gt;, &lt;code&gt;Bcast&lt;/code&gt;)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Mechanism:&lt;/strong&gt; Uses &lt;strong&gt;Buffer-based communication&lt;/strong&gt;. It points directly to a contiguous block of memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexibility:&lt;/strong&gt; Requires data to be in a buffer-like format, typically &lt;strong&gt;NumPy arrays&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; &lt;strong&gt;Extremely Fast&lt;/strong&gt;. This approaches C speed because it avoids the pickling overhead and transfers the raw memory directly.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Rule of Thumb:&lt;/strong&gt; If you are moving NumPy arrays for math, &lt;strong&gt;always&lt;/strong&gt; use the uppercase methods (e.g., &lt;code&gt;comm.Send(my_array, dest=1)&lt;/code&gt;).&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  12.4 Summary: When to use what?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;Point-to-Point&lt;/strong&gt; for complex logic where specific workers need unique instructions.&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Collective&lt;/strong&gt; for mathematical synchronization (e.g., summing partial results of an integral).&lt;/li&gt;
&lt;li&gt;Use &lt;strong&gt;Uppercase methods&lt;/strong&gt; whenever you are doing heavy data lifting with NumPy.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>architecture</category>
      <category>computerscience</category>
      <category>performance</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>Building a Complete DevOps Pipeline: Flask App with Docker, Jenkins, GitHub Actions, Prometheus, and Grafana</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Sat, 01 Nov 2025 16:09:51 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/building-a-complete-devops-pipeline-flask-app-with-docker-jenkins-github-actions-prometheus-4bam</link>
      <guid>https://dev.to/abhirajadhikary06/building-a-complete-devops-pipeline-flask-app-with-docker-jenkins-github-actions-prometheus-4bam</guid>
      <description>&lt;p&gt;In today's fast-paced software development world, DevOps practices are essential for streamlining workflows, ensuring reliable deployments, and monitoring applications effectively. This tutorial walks you through a hands-on DevOps project using Flask as the web framework, Pytest and Playwright for testing, Docker for containerization, GitHub Actions and Jenkins for CI/CD, and Prometheus with Grafana for monitoring. Whether you're a beginner or an experienced engineer, this guide will help you build, test, deploy, and monitor a simple Flask app.&lt;/p&gt;

&lt;p&gt;By the end, you'll have a production-ready setup that demonstrates key DevOps principles like automation, containerization, and observability. Let's dive in!&lt;/p&gt;

&lt;h2&gt;
  
  
  Project Overview: What We're Building
&lt;/h2&gt;

&lt;p&gt;This DevOps project creates a basic Flask web application that serves a simple HTML template. We integrate testing, containerization, CI/CD pipelines, and monitoring to create a robust ecosystem. Key tools include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flask&lt;/strong&gt;: For the backend web app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pytest &amp;amp; Playwright&lt;/strong&gt;: For unit and end-to-end (E2E) testing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker&lt;/strong&gt;: For building and orchestrating containers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Actions &amp;amp; Jenkins&lt;/strong&gt;: For automated CI/CD.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prometheus &amp;amp; Grafana&lt;/strong&gt;: For metrics collection and visualization.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full source code is available on &lt;a href="https://github.com/abhirajadhikary06/devops-1" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Setting Up the Flask Application
&lt;/h2&gt;

&lt;p&gt;Start by creating a simple Flask app. Install dependencies like &lt;code&gt;flask&lt;/code&gt; and &lt;code&gt;prometheus_flask_exporter&lt;/code&gt; for metrics exposure.&lt;/p&gt;

&lt;p&gt;Here's the core &lt;code&gt;app.py&lt;/code&gt; code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;render_template&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;prometheus_flask_exporter&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;PrometheusMetrics&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;metrics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PrometheusMetrics&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;home&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;render_template&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;index.html&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This app renders &lt;code&gt;index.html&lt;/code&gt; from the &lt;code&gt;templates&lt;/code&gt; folder and exposes metrics at &lt;code&gt;/metrics&lt;/code&gt;. The &lt;code&gt;static&lt;/code&gt; folder holds the CSS for the UI. Run it locally with &lt;code&gt;python app.py&lt;/code&gt; and open &lt;code&gt;http://localhost:5000&lt;/code&gt;.&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 2: Implementing Tests with Pytest and Playwright
&lt;/h2&gt;

&lt;p&gt;Testing is crucial in DevOps. We use Pytest for backend unit tests and Playwright for E2E browser automation.&lt;/p&gt;

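&lt;p&gt;Install the libraries with &lt;code&gt;pip install pytest playwright&lt;/code&gt;, then run &lt;code&gt;playwright install&lt;/code&gt; once to download the browser binaries Playwright drives.&lt;/p&gt;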

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend Tests (&lt;code&gt;tests/test_app.py&lt;/code&gt;)&lt;/strong&gt;: Verifies routes and responses.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;  &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_home&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
      &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test_client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
      &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;E2E Tests (&lt;code&gt;tests/test_e2e.py&lt;/code&gt;)&lt;/strong&gt;: Simulates browser interactions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Start the app first and keep it running on &lt;code&gt;localhost:5000&lt;/code&gt;, because Playwright tests the live UI. Then run &lt;code&gt;pytest&lt;/code&gt;. All 4 tests (2 backend, 2 E2E) should pass.&lt;/p&gt;

&lt;p&gt;This ensures code quality before deployment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Containerizing with Docker
&lt;/h2&gt;

&lt;p&gt;Docker makes deployments consistent. We use a multistage &lt;code&gt;Dockerfile&lt;/code&gt; for efficiency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="c"&gt;# Builder stage&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;python:3.12-slim&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;AS&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;builder&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; requirements.txt .&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; . .&lt;/span&gt;

&lt;span class="c"&gt;# Runtime stage&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.12-slim&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;
&lt;span class="k"&gt;COPY&lt;/span&gt;&lt;span class="s"&gt; --from=builder /app .&lt;/span&gt;
&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 5000&lt;/span&gt;
&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["python", "app.py"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Build and push: &lt;code&gt;docker build -t yourusername/flask-app:latest .&lt;/code&gt; and &lt;code&gt;docker push yourusername/flask-app:latest&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;For orchestration, &lt;code&gt;docker-compose.yml&lt;/code&gt; spins up the full stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3'&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5000:5000"&lt;/span&gt;
  &lt;span class="na"&gt;prometheus&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;prom/prometheus&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./prometheus.yml:/etc/prometheus/prometheus.yml&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;9090:9090"&lt;/span&gt;
  &lt;span class="na"&gt;grafana&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;grafana/grafana&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3000:3000"&lt;/span&gt;
  &lt;span class="na"&gt;jenkins&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;jenkins/jenkins:lts&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;50000:50000"&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;jenkins_home:/var/jenkins_home&lt;/span&gt;
&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;jenkins_home&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run with &lt;code&gt;docker-compose up&lt;/code&gt;. Access services at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Flask: &lt;code&gt;http://localhost:5000&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Prometheus: &lt;code&gt;http://localhost:9090&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Grafana: &lt;code&gt;http://localhost:3000&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Jenkins: &lt;code&gt;http://localhost:8080&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;prometheus.yml&lt;/code&gt; configures scraping:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;scrape_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;job_name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;flask'&lt;/span&gt;
    &lt;span class="na"&gt;scrape_interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;15s&lt;/span&gt;
    &lt;span class="na"&gt;metrics_path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/metrics'&lt;/span&gt;
    &lt;span class="na"&gt;static_configs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;targets&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;app:5000'&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  Step 4: CI/CD with GitHub Actions and Jenkins
&lt;/h2&gt;

&lt;p&gt;Automation is the heart of DevOps.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
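&lt;strong&gt;GitHub Actions (&lt;code&gt;.github/workflows/ci.yml&lt;/code&gt;)&lt;/strong&gt;: Triggers on every push and pull request. It runs the tests, then builds and pushes the Docker image if they pass. Pushing requires a registry login step (for example &lt;code&gt;docker/login-action&lt;/code&gt;) before the build step, which is omitted here.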
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;CI&lt;/span&gt;
  &lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;pull_request&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
  &lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
      &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v2&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Set up Python&lt;/span&gt;
          &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/setup-python@v2&lt;/span&gt;
          &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;python-version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;3.12'&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pip install -r requirements.txt&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pytest&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build and Push Docker&lt;/span&gt;
          &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;success()&lt;/span&gt;
          &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/build-push-action@v2&lt;/span&gt;
          &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;push&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
            &lt;span class="na"&gt;tags&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;yourusername/flask-app:latest&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Jenkins (&lt;code&gt;Jenkinsfile&lt;/code&gt;)&lt;/strong&gt;: Pipeline for build, test, and deploy.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;  &lt;span class="n"&gt;pipeline&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
      &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="n"&gt;docker&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="s1"&gt;'python:3.12'&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
      &lt;span class="n"&gt;stages&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
          &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Build'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="s1"&gt;'pip install -r requirements.txt'&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
          &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Test'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt; &lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="s1"&gt;'pytest'&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="o"&gt;}&lt;/span&gt;
          &lt;span class="n"&gt;stage&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'Deploy'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
              &lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                  &lt;span class="n"&gt;withCredentials&lt;/span&gt;&lt;span class="o"&gt;([&lt;/span&gt;&lt;span class="n"&gt;usernamePassword&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;credentialsId:&lt;/span&gt; &lt;span class="s1"&gt;'dockerhub'&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nl"&gt;usernameVariable:&lt;/span&gt; &lt;span class="s1"&gt;'USER'&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nl"&gt;passwordVariable:&lt;/span&gt; &lt;span class="s1"&gt;'PASS'&lt;/span&gt;&lt;span class="o"&gt;)])&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                      &lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="s1"&gt;'docker build -t $USER/flask-app:latest .'&lt;/span&gt;
                      &lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="s1"&gt;'echo $PASS | docker login -u $USER --password-stdin'&lt;/span&gt;
                      &lt;span class="n"&gt;sh&lt;/span&gt; &lt;span class="s1"&gt;'docker push $USER/flask-app:latest'&lt;/span&gt;
                  &lt;span class="o"&gt;}&lt;/span&gt;
              &lt;span class="o"&gt;}&lt;/span&gt;
          &lt;span class="o"&gt;}&lt;/span&gt;
      &lt;span class="o"&gt;}&lt;/span&gt;
  &lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set up Jenkins: install the Docker, Docker Pipeline, and Git plugins, then restart via &lt;code&gt;http://localhost:8080/restart&lt;/code&gt;. Finally, create a pipeline job that points at your GitHub repo.&lt;/p&gt;


&lt;h2&gt;
  
  
  Step 5: Monitoring with Prometheus and Grafana
&lt;/h2&gt;

&lt;p&gt;Expose metrics via &lt;code&gt;prometheus_flask_exporter&lt;/code&gt;. In Grafana, add Prometheus as a data source (&lt;code&gt;http://prometheus:9090&lt;/code&gt;), then create dashboards for app metrics such as request counts and response times.&lt;/p&gt;

&lt;p&gt;This setup provides real-time insight into the running app.&lt;/p&gt;
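
&lt;p&gt;Before building a dashboard, it can help to sanity-check what the &lt;code&gt;/metrics&lt;/code&gt; endpoint returns. The sketch below is a minimal, standard-library reader for the Prometheus text format. The function name and the metric names in the sample are made up for illustration and are not necessarily the ones &lt;code&gt;prometheus_flask_exporter&lt;/code&gt; emits, and the parser ignores optional sample timestamps.&lt;/p&gt;

```python
def total_for_metric(metrics_text, metric_name):
    """Sum every sample of metric_name across all of its label sets."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated token on the line.
        name_and_labels, _, value = line.rpartition(" ")
        name = name_and_labels.split("{", 1)[0]
        if name == metric_name:
            total += float(value)
    return total


# Illustrative sample only; these metric names are hypothetical.
SAMPLE = """\
# HELP app_requests_total Illustrative request counter
# TYPE app_requests_total counter
app_requests_total{method="GET",status="200"} 42.0
app_requests_total{method="GET",status="404"} 3.0
app_up 1.0
"""

print(total_for_metric(SAMPLE, "app_requests_total"))
```

&lt;p&gt;In practice you would paste in the text copied from &lt;code&gt;http://localhost:5000/metrics&lt;/code&gt; in place of the sample.&lt;/p&gt;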

&lt;h2&gt;
  
  
  Conclusion: Why This DevOps Project Matters
&lt;/h2&gt;

&lt;p&gt;This project showcases a full DevOps lifecycle: from coding and testing to deployment and monitoring. It's scalable, automated, and observable—perfect for modern apps. Fork the repo, experiment, and level up your skills!&lt;/p&gt;

&lt;p&gt;For more repos, follow me on &lt;a href="https://github.com/abhirajadhikary06" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. Share your thoughts in the comments!&lt;/p&gt;

</description>
      <category>devops</category>
      <category>cicd</category>
      <category>flask</category>
      <category>jenkins</category>
    </item>
    <item>
      <title>Building a YouTube Video Search App with Flask, Whisper, and RAG</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Thu, 09 Oct 2025 14:26:35 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/building-a-youtube-video-search-app-with-flask-whisper-and-rag-ebl</link>
      <guid>https://dev.to/abhirajadhikary06/building-a-youtube-video-search-app-with-flask-whisper-and-rag-ebl</guid>
      <description>&lt;h1&gt;
  
  
  Building a YouTube Video Search App with Flask, Whisper, and RAG
&lt;/h1&gt;

&lt;p&gt;Ever wanted to search for specific moments in a YouTube video by just typing a keyword? Imagine pinpointing that exact timestamp where someone explains "machine learning" in a 5-minute tutorial—without scrubbing through the whole thing. I built a Flask-based web app called &lt;strong&gt;video-rag-search&lt;/strong&gt; that does exactly this, using Retrieval-Augmented Generation (RAG), OpenAI's Whisper, and a sprinkle of AI magic. In this post, I'll walk you through what it does, how it works, and why it's a fun project for developers to explore.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Does It Do?
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;video-rag-search&lt;/strong&gt; app lets you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Paste a YouTube video link (up to 5 minutes long).&lt;/li&gt;
&lt;li&gt;Automatically download and transcribe the audio using OpenAI's Whisper.&lt;/li&gt;
&lt;li&gt;Generate 5 key topics from the transcript using Groq's LLM.&lt;/li&gt;
&lt;li&gt;Search for moments in the video by selecting a topic, with results linked to exact timestamps.&lt;/li&gt;
&lt;li&gt;Cache results for speed and store data in a MariaDB database for persistence.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Think of it as a smart search engine for YouTube videos, powered by semantic search and AI transcription. Whether you're a student skimming lectures or a developer digging through tech talks, this tool saves time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Build This?
&lt;/h2&gt;

&lt;p&gt;I wanted to combine my love for Flask, AI, and video content into a practical tool. YouTube is a treasure trove of knowledge, but finding specific moments can be a pain. By leveraging RAG (Retrieval-Augmented Generation), we can make video content searchable in a way that's intuitive and developer-friendly. Plus, it's a great excuse to play with cutting-edge AI libraries like Whisper and SentenceTransformers!&lt;/p&gt;

&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;p&gt;Here's the lineup of tools and libraries powering the app:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flask&lt;/strong&gt;: Lightweight Python web framework for the backend and UI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI Whisper&lt;/strong&gt;: Transcribes YouTube audio to text with timestamps.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Groq LLM&lt;/strong&gt;: Generates meaningful keywords from transcripts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SentenceTransformers&lt;/strong&gt;: Creates semantic embeddings for search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MariaDB&lt;/strong&gt;: Stores transcripts and embeddings for persistence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;yt-dlp&lt;/strong&gt;: Downloads YouTube audio efficiently.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flask-Caching&lt;/strong&gt;: Speeds up repeated searches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;pydub&lt;/strong&gt;: Handles audio file processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NumPy&lt;/strong&gt;: Computes similarity scores for search.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You'll also need a Groq API key (free tier available) and a MariaDB instance (local or cloud).&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;p&gt;Let's break down the app's workflow, from YouTube link to search results.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Input a YouTube Link
&lt;/h3&gt;

&lt;p&gt;The user submits a YouTube URL via a simple form (&lt;code&gt;index.html&lt;/code&gt;). The app validates it using a regex to ensure it's a proper YouTube link (e.g., &lt;code&gt;youtube.com/watch?v=...&lt;/code&gt; or &lt;code&gt;youtu.be/...&lt;/code&gt;). Whitespace and quotes are stripped for cleanliness.&lt;/p&gt;
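
&lt;p&gt;The validation step might look roughly like the sketch below. The pattern and the function name &lt;code&gt;extract_video_id&lt;/code&gt; are assumptions rather than the app's actual code; it also assumes 11-character video IDs and only accepts links where &lt;code&gt;v&lt;/code&gt; is the first query parameter.&lt;/p&gt;

```python
import re

# Hypothetical pattern: accepts youtube.com/watch?v=... and youtu.be/... links only.
YOUTUBE_URL = re.compile(
    r"^(?:https?://)?(?:www\.)?"
    r"(?:youtube\.com/watch\?v=|youtu\.be/)"
    r"([A-Za-z0-9_-]{11})"
)


def extract_video_id(raw_link):
    """Return the 11-character video ID, or None if the link is not recognised."""
    # Strip surrounding whitespace and stray quotes before matching.
    cleaned = raw_link.strip().strip("'\"")
    match = YOUTUBE_URL.match(cleaned)
    return match.group(1) if match else None


print(extract_video_id('  "https://youtu.be/dQw4w9WgXcQ"  '))
print(extract_video_id("https://example.com/watch?v=dQw4w9WgXcQ"))
```

&lt;p&gt;Returning the ID instead of a boolean means the same call can feed the database key and the cache key later on.&lt;/p&gt;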

&lt;h3&gt;
  
  
  2. Download Audio
&lt;/h3&gt;

&lt;p&gt;Using &lt;code&gt;yt-dlp&lt;/code&gt;, the app downloads the audio as an MP3 file. It checks the video's duration (via &lt;code&gt;pydub&lt;/code&gt;) and enforces a 5-minute limit to keep processing manageable. If the video's too long, you get a friendly error message.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Transcribe with Whisper
&lt;/h3&gt;

&lt;p&gt;OpenAI's Whisper (&lt;code&gt;medium&lt;/code&gt; model) transcribes the audio, producing segments with text and timestamps (e.g., &lt;code&gt;[10.2 - 12.5] "Welcome to AI basics"&lt;/code&gt;). Empty or invalid segments are filtered out to ensure quality.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Store in MariaDB
&lt;/h3&gt;

&lt;p&gt;Each segment is saved in a MariaDB table (&lt;code&gt;video_data&lt;/code&gt;) with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Video ID (from the YouTube URL).&lt;/li&gt;
&lt;li&gt;Segment text, start/end times, and a timestamped YouTube link.&lt;/li&gt;
&lt;li&gt;Semantic embeddings (as JSON, generated later).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The table is created dynamically if it doesn't exist, with defensive migrations to handle schema changes.&lt;/p&gt;
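
&lt;p&gt;As a rough illustration of steps 3 and 4 together, the sketch below filters Whisper-style segments (dictionaries with &lt;code&gt;text&lt;/code&gt;, &lt;code&gt;start&lt;/code&gt;, and &lt;code&gt;end&lt;/code&gt;) and shapes them into rows. The function name, the row keys, and the &lt;code&gt;youtu.be&lt;/code&gt; link format are assumptions, not the app's actual schema.&lt;/p&gt;

```python
def build_rows(video_id, segments):
    """Turn Whisper-style segments into rows ready for storage."""
    rows = []
    for seg in segments:
        text = (seg.get("text") or "").strip()
        start, end = seg.get("start"), seg.get("end")
        # Skip empty text and segments with missing or inverted timestamps.
        if not text or start is None or end is None or start >= end:
            continue
        rows.append({
            "video_id": video_id,
            "text": text,
            "start": start,
            "end": end,
            # Link that opens the video at the segment's start second.
            "link": f"https://youtu.be/{video_id}?t={int(start)}",
        })
    return rows


segments = [
    {"text": " Welcome to AI basics ", "start": 10.2, "end": 12.5},
    {"text": "   ", "start": 12.5, "end": 13.0},  # dropped: empty text
]
for row in build_rows("dQw4w9WgXcQ", segments):
    print(row["start"], row["link"], row["text"])
```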

&lt;h3&gt;
  
  
  5. Generate Keywords with Groq
&lt;/h3&gt;

&lt;p&gt;The transcript is sent to Groq's LLM (model: &lt;code&gt;openai/gpt-oss-20b&lt;/code&gt;) with a prompt to extract 5 relevant keywords. For example, a machine learning tutorial might yield:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Neural Networks&lt;/li&gt;
&lt;li&gt;Backpropagation&lt;/li&gt;
&lt;li&gt;Overfitting&lt;/li&gt;
&lt;li&gt;Gradient Descent&lt;/li&gt;
&lt;li&gt;Activation Functions&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The app parses the LLM's response, prioritizing bold (&lt;code&gt;**...**&lt;/code&gt;), numbered, or bulleted lists, and cleans up markdown artifacts.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Semantic Search with Embeddings
&lt;/h3&gt;

&lt;p&gt;To enable smart searching, the app uses &lt;code&gt;SentenceTransformers&lt;/code&gt; (&lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;) to create embeddings for each transcript segment. These are stored as JSON in MariaDB. When a user selects a keyword (e.g., "Neural Networks"), the app:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Encodes the keyword into an embedding.&lt;/li&gt;
&lt;li&gt;Computes cosine similarity against stored segment embeddings.&lt;/li&gt;
&lt;li&gt;Returns the best-matching segment (if similarity ≥ 0.5) with its timestamp and a clickable link.&lt;/li&gt;
&lt;/ul&gt;
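
&lt;p&gt;The three steps above can be sketched as follows. The app itself uses SentenceTransformers and NumPy; this version uses plain Python lists and tiny three-dimensional vectors so it runs on its own. The function names and the &lt;code&gt;embedding&lt;/code&gt; key are hypothetical.&lt;/p&gt;

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def best_match(query_embedding, segments, threshold=0.5):
    """Return (segment, score) for the closest segment, or None below threshold."""
    best = None
    for seg in segments:
        score = cosine_similarity(query_embedding, seg["embedding"])
        if best is None or score > best[1]:
            best = (seg, score)
    if best is not None and best[1] >= threshold:
        return best
    return None


# Toy embeddings; real all-MiniLM-L6-v2 vectors have 384 dimensions.
segments = [
    {"text": "Neural networks stack layers", "start": 10.2, "embedding": [0.9, 0.1, 0.0]},
    {"text": "Thanks for watching", "start": 290.0, "embedding": [0.0, 1.0, 0.0]},
]
print(best_match([1.0, 0.0, 0.0], segments))
print(best_match([0.0, 0.0, 1.0], segments))
```

&lt;p&gt;The threshold is what lets the app report "no match" instead of returning an unrelated segment.&lt;/p&gt;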

&lt;h3&gt;
  
  
  7. Caching for Speed
&lt;/h3&gt;

&lt;p&gt;Results are cached using &lt;code&gt;Flask-Caching&lt;/code&gt; with the video ID as the key. If the same video is searched again within an hour, the app skips processing and loads from cache.&lt;/p&gt;
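
&lt;p&gt;To show the idea without pulling in Flask, here is a dependency-free stand-in for that behaviour. It is not the &lt;code&gt;Flask-Caching&lt;/code&gt; API; &lt;code&gt;TTLCache&lt;/code&gt; and &lt;code&gt;get_or_process&lt;/code&gt; are hypothetical names, and the one-hour timeout mirrors the description above.&lt;/p&gt;

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)


def get_or_process(cache, video_id, process):
    """Return cached results for video_id, calling process() only on a miss."""
    cached = cache.get(video_id)
    if cached is not None:
        return cached
    result = process(video_id)
    cache.set(video_id, result)
    return result


cache = TTLCache(ttl_seconds=3600)
print(get_or_process(cache, "dQw4w9WgXcQ", lambda vid: {"keywords": ["demo"]}))
```

&lt;p&gt;Keying on the video ID means two differently formatted links to the same video share one cache entry.&lt;/p&gt;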

&lt;h3&gt;
  
  
  8. User Interface
&lt;/h3&gt;

&lt;p&gt;The UI (built with Jinja2 templates) guides users through three steps:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input Link&lt;/strong&gt;: Enter a YouTube URL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Select Keyword&lt;/strong&gt;: Choose from 5 auto-generated keywords.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;View Results&lt;/strong&gt;: See the matching timestamp, transcript snippet, and a link to jump to that moment in the video.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Errors (e.g., invalid URL, failed transcription) are logged and displayed as user-friendly messages.&lt;/p&gt;

&lt;h2&gt;
  
  
  Code Highlights
&lt;/h2&gt;

&lt;p&gt;Here's a peek at some key functions (simplified for brevity):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;download_audio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;youtube_link&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;yt-dlp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-x&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--audio-format&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mp3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;video.mp3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;youtube_link&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;returncode&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Download failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stderr&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;video.mp3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Audio file not found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parse_keywords&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;bold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\*\*(.+?)\*\*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bold&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;bold&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^\s*\d+\.\s*(.+)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;M&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[\s\.,;:!]+$&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;][:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/select_keyword/&amp;lt;int:index&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;methods&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;GET&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_keyword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;keywords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;keywords&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;keywords&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;embedder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SentenceTransformer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;embedder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;convert_to_tensor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ... (fetch segments, compute cosine similarity, return best match)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
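
&lt;p&gt;The elided step in the route above (fetch segments, compute cosine similarity, return best match) could look roughly like the sketch below. It is not the app's actual code: the &lt;code&gt;rows&lt;/code&gt; shape, the &lt;code&gt;best_segment&lt;/code&gt; helper, and the use of plain Python lists instead of tensors are all assumptions made for illustration, with embeddings stored as JSON strings.&lt;/p&gt;

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def best_segment(query_vec, rows):
    """Return the row whose stored embedding is closest to query_vec.

    rows is a hypothetical list of dicts with 'start', 'text' and
    'embedding' keys, where 'embedding' is a JSON-encoded list of floats.
    Returns None when there are no rows.
    """
    return max(
        rows,
        key=lambda row: cosine_similarity(query_vec, json.loads(row["embedding"])),
        default=None,
    )
```

&lt;p&gt;The &lt;code&gt;start&lt;/code&gt; value of the winning row is what you would use to jump to the matching moment in the video.&lt;/p&gt;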



&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;Want to try it yourself? Here's how to set it up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Clone the Repo&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   git clone &amp;lt;your-repo&amp;gt;
   &lt;span class="nb"&gt;cd &lt;/span&gt;video-rag-search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Install Dependencies&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   pip &lt;span class="nb"&gt;install &lt;/span&gt;flask openai-whisper sentence-transformers groq pydub flask-caching mariadb numpy yt-dlp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Set Up Environment&lt;/strong&gt;:
Create a &lt;code&gt;.env&lt;/code&gt; file:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   GROQ_API_KEY=your_groq_key
   DB_USER=root
   DB_PASSWORD=RootPass123!
   DB_HOST=localhost
   DB_PORT=3306
   DB_NAME=youtube_search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
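
&lt;p&gt;As a rough sketch of how those variables might be turned into connection settings, assuming they have already been loaded into the process environment (for example by a dotenv loader, which is an assumption here): the &lt;code&gt;load_db_config&lt;/code&gt; helper is hypothetical, and its keys are chosen to line up with the usual &lt;code&gt;user&lt;/code&gt;, &lt;code&gt;password&lt;/code&gt;, &lt;code&gt;host&lt;/code&gt;, &lt;code&gt;port&lt;/code&gt; and &lt;code&gt;database&lt;/code&gt; connection arguments.&lt;/p&gt;

```python
import os


def load_db_config(env=None):
    """Build connection settings from the variables in the .env example."""
    env = os.environ if env is None else env
    return {
        "user": env.get("DB_USER", "root"),
        "password": env["DB_PASSWORD"],  # no default: fail loudly if unset
        "host": env.get("DB_HOST", "localhost"),
        # Environment values are always strings, so convert the port.
        "port": int(env.get("DB_PORT", "3306")),
        "database": env.get("DB_NAME", "youtube_search"),
    }
```

&lt;p&gt;Keep the real &lt;code&gt;.env&lt;/code&gt; out of version control, and use your own password rather than the sample value.&lt;/p&gt;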



&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set Up MariaDB&lt;/strong&gt;:&lt;br&gt;
Install MariaDB locally or use a cloud provider. Create a &lt;code&gt;youtube_search&lt;/code&gt; database.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Run the App&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   python app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Visit &lt;code&gt;http://localhost:5000&lt;/code&gt; and paste a YouTube link (try a short tech tutorial!).&lt;/p&gt;

&lt;h2&gt;
  
  
  Challenges and Lessons
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Whisper Load Time&lt;/strong&gt;: The &lt;code&gt;medium&lt;/code&gt; model is heavy. Preloading or using a smaller model (&lt;code&gt;tiny&lt;/code&gt;) could speed things up, but I prioritized accuracy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding Storage&lt;/strong&gt;: Storing embeddings as JSON in MariaDB works but isn't ideal for scale. A vector database like FAISS or Pinecone would be better (planned for v2!).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM Parsing&lt;/strong&gt;: Groq's output varies, so robust parsing (e.g., handling markdown) was key to consistent keywords.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching&lt;/strong&gt;: &lt;code&gt;Flask-Caching&lt;/code&gt; with a simple in-memory store is fine for development, but production needs a shared backend such as Redis.&lt;/li&gt;
&lt;/ul&gt;
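
&lt;p&gt;To make the caching point concrete, here is a minimal sketch of switching between the two backends. The &lt;code&gt;cache_config&lt;/code&gt; helper and the 300-second timeout are assumptions for illustration; the config keys are the ones Flask-Caching uses for its in-memory and Redis backends.&lt;/p&gt;

```python
def cache_config(redis_url=None, timeout=300):
    """Pick Flask-Caching settings: Redis when a URL is given, else in-memory."""
    config = {"CACHE_DEFAULT_TIMEOUT": timeout}
    if redis_url:
        # Shared across workers and survives app restarts.
        config["CACHE_TYPE"] = "RedisCache"
        config["CACHE_REDIS_URL"] = redis_url
    else:
        # Per-process only: each worker has its own cache, lost on restart.
        config["CACHE_TYPE"] = "SimpleCache"
    return config
```

&lt;p&gt;The result could then be passed along the lines of &lt;code&gt;Cache(app, config=cache_config(os.environ.get("REDIS_URL")))&lt;/code&gt;, where &lt;code&gt;REDIS_URL&lt;/code&gt; is a hypothetical variable name.&lt;/p&gt;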

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;I'm excited to extend this project with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Quiz Generation&lt;/strong&gt;: Turn transcripts into MCQ quizzes for learning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User Accounts&lt;/strong&gt;: Add login/register to track search history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud DB&lt;/strong&gt;: Move to Neon Postgres or Render for scalability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio Readout&lt;/strong&gt;: Use text-to-speech for accessibility.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaderboard&lt;/strong&gt;: Rank users by search activity or quiz scores.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try It Out!
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;video-rag-search&lt;/strong&gt; app is a fun blend of AI, web dev, and data science. It’s open-source, so fork it, tweak it, or add your own spin! Got ideas for features or hit a snag? Drop a comment on Dev.to or open an issue on the repo.&lt;/p&gt;

&lt;p&gt;Happy coding, and let’s make YouTube videos searchable! This was built during the MariaDB hackathon by &lt;a class="mentioned-user" href="https://dev.to/anikchand461"&gt;@anikchand461&lt;/a&gt; and me.&lt;/p&gt;

</description>
      <category>whisper</category>
      <category>flask</category>
      <category>mariadb</category>
      <category>hackathon</category>
    </item>
    <item>
      <title>GitHub Profile Summarizer with n8n and Bright Data</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Sat, 30 Aug 2025 11:20:20 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/github-profile-summarizer-with-n8n-and-bright-data-53di</link>
      <guid>https://dev.to/abhirajadhikary06/github-profile-summarizer-with-n8n-and-bright-data-53di</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/brightdata-n8n-2025-08-13"&gt;AI Agents Challenge powered by n8n and Bright Data&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Building a Chat with GitHub with n8n and Bright Data
&lt;/h2&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;I created an AI-powered GitHub Profile Summarizer using n8n and Bright Data. This agent takes a GitHub username as input, scrapes the user's public profile, and generates a concise HTML summary of their bio, top repositories, and contributions. It leverages Bright Data's web scraping capabilities and Mistral AI's language model to deliver a polished, human-readable output. The workflow handles invalid usernames gracefully, ensuring a robust user experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfuoj11zxe4cw1ut59pt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwfuoj11zxe4cw1ut59pt.png" alt="Chat with GitHub" width="800" height="285"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://youtu.be/EgtpIjSm-cQ" rel="noopener noreferrer"&gt;Watch the demo video&lt;/a&gt; showcasing the workflow generating a summary for a sample GitHub profile.&lt;/p&gt;

&lt;h3&gt;
  
  
  n8n Workflow
&lt;/h3&gt;

&lt;p&gt;The workflow JSON is available in this &lt;a href="https://gist.github.com/abhirajadhikary06/059647a819f306cf1567b12e9a71f186" rel="noopener noreferrer"&gt;GitHub Gist&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Implementation
&lt;/h2&gt;

&lt;p&gt;The agent is built using an n8n workflow with the following components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Webhook&lt;/strong&gt;: Receives a GET request with a &lt;code&gt;username&lt;/code&gt; query parameter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Set Username&lt;/strong&gt;: Extracts and sets the GitHub username, defaulting to &lt;code&gt;abhirajadhikary06&lt;/code&gt; if none is provided.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate Username&lt;/strong&gt;: Uses regex (&lt;code&gt;^[a-zA-Z0-9][a-zA-Z0-9-]{0,37}[a-zA-Z0-9]$&lt;/code&gt;) to ensure the username is valid.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bright Data Scraper&lt;/strong&gt;: Scrapes the GitHub profile using Bright Data's verified node.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mistral AI&lt;/strong&gt;: Uses the &lt;code&gt;mistral-large-latest&lt;/code&gt; model with a prompt to summarize the scraped data into a 200-word Markdown summary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: Maintains a context window of 50 interactions for conversational continuity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI Agent&lt;/strong&gt;: Configured as a conversational agent with a prefix: "You are a helpful assistant summarizing GitHub profiles."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Markdown to HTML&lt;/strong&gt;: Converts the AI-generated Markdown summary to HTML.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chat Trigger&lt;/strong&gt;: Supports chat-based input for interactive use cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling&lt;/strong&gt;: Returns a 400 status code with an error message for invalid usernames.&lt;/li&gt;
&lt;/ul&gt;
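
&lt;p&gt;The validation step can be tried outside n8n with a few lines of Python. This is only a sketch: the pattern is the one quoted above, while the &lt;code&gt;is_valid_username&lt;/code&gt; helper is a hypothetical name. Note that, as written, the pattern needs at least two characters and allows at most 39, so a single-character name is rejected.&lt;/p&gt;

```python
import re

# Pattern quoted in the workflow description above.
USERNAME_RE = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9-]{0,37}[a-zA-Z0-9]$")


def is_valid_username(name):
    """True when name matches the workflow's username pattern."""
    return USERNAME_RE.fullmatch(name) is not None
```

&lt;p&gt;Names that start or end with a hyphen fail the check, which is the case the workflow answers with a 400 status code.&lt;/p&gt;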

&lt;h3&gt;
  
  
  Bright Data Verified Node
&lt;/h3&gt;

&lt;p&gt;The Bright Data verified node is central to the workflow, scraping the GitHub profile page (&lt;code&gt;https://github.com/{{ $json.githubUsername }}&lt;/code&gt;) using the dataset ID &lt;code&gt;gd_lyrexgxc24b3d4imjt&lt;/code&gt;. It reliably extracts structured data (bio, repositories, contributions) without triggering GitHub's rate limits or CAPTCHAs, thanks to Bright Data's proxy management. The scraped data is passed to the AI Agent for summarization.&lt;/p&gt;

&lt;h2&gt;
  
  
  Journey
&lt;/h2&gt;

&lt;p&gt;Building this agent was a rewarding challenge. Integrating Bright Data's scraper required fine-tuning the dataset configuration to extract relevant profile data consistently. The Mistral AI model needed a precise prompt to produce concise summaries, which I iterated on to balance detail and brevity. Handling invalid usernames robustly was another hurdle, solved by tightening the regex validation. Learning to chain n8n nodes with AI and web scraping tools deepened my understanding of automation and data processing. The biggest lesson was the power of combining reliable data extraction (Bright Data) with intelligent processing (Mistral AI) in a seamless n8n workflow.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>n8nbrightdatachallenge</category>
      <category>ai</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Building EventStack – A Lightweight, Real-Time Doodle &amp; Luma Clone Using Tornado</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Thu, 19 Jun 2025 20:37:21 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/building-eventstack-a-lightweight-real-time-doodle-luma-clone-using-tornado-1ogo</link>
      <guid>https://dev.to/abhirajadhikary06/building-eventstack-a-lightweight-real-time-doodle-luma-clone-using-tornado-1ogo</guid>
      <description>&lt;p&gt;Have you ever struggled to coordinate a meeting time with a group? Tools like Doodle make scheduling easier — but I wanted to create something simpler, open-source, and custom-built with a modern stack. That’s how &lt;strong&gt;&lt;a href="https://eventstack-production.up.railway.app" rel="noopener noreferrer"&gt;EventStack&lt;/a&gt;&lt;/strong&gt; was born.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://eventstack-production.up.railway.app" rel="noopener noreferrer"&gt;EventStack&lt;/a&gt; is a lightweight event scheduling app that allows users to propose time slots, vote on availability, and finalize meetings — all with a slick frontend and real-time updates.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I Built It
&lt;/h2&gt;

&lt;p&gt;I wanted to explore &lt;strong&gt;Tornado&lt;/strong&gt;, a powerful Python framework known for handling asynchronous and real-time web apps. Unlike Flask or Django, Tornado gives fine-grained control over sockets, routing, and performance. I also wanted to integrate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub OAuth&lt;/strong&gt; for easy login&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; as a robust backend&lt;/li&gt;
&lt;li&gt;A beautiful frontend using &lt;strong&gt;Tailwind CSS&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Potential for &lt;strong&gt;WebSocket-based real-time voting&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This project was a perfect way to combine learning with utility.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: &lt;a href="https://www.tornadoweb.org/en/stable/" rel="noopener noreferrer"&gt;Tornado&lt;/a&gt; – asynchronous Python framework&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Tailwind CSS + custom HTML templates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth&lt;/strong&gt;: GitHub OAuth2 (manual token exchange using &lt;code&gt;requests&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database&lt;/strong&gt;: PostgreSQL (used NeonDB Postgres during initial dev, later moved to local)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hosting&lt;/strong&gt;: Runs locally and is deployable to platforms such as Railway.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Authentication with GitHub
&lt;/h2&gt;

&lt;p&gt;OAuth integration was handled manually — bypassing libraries like Authlib — to better understand the token exchange process. Users log in via GitHub, and their profile data is stored securely in the database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Get GitHub token manually using requests
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://github.com/login/oauth/access_token&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{...},&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Accept&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;application/json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
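
&lt;p&gt;For readers who want to see the shape of that request, here is a dependency-free sketch that builds it without sending it. It swaps &lt;code&gt;requests&lt;/code&gt; for the standard library's &lt;code&gt;urllib&lt;/code&gt;, the &lt;code&gt;build_token_request&lt;/code&gt; helper is hypothetical, and the form fields (&lt;code&gt;client_id&lt;/code&gt;, &lt;code&gt;client_secret&lt;/code&gt;, &lt;code&gt;code&lt;/code&gt;) are the ones GitHub documents for its web application flow rather than values taken from EventStack's code.&lt;/p&gt;

```python
import urllib.parse
import urllib.request

TOKEN_URL = "https://github.com/login/oauth/access_token"


def build_token_request(client_id, client_secret, code):
    """Build (but do not send) the POST that swaps a code for a token."""
    form = urllib.parse.urlencode({
        "client_id": client_id,
        "client_secret": client_secret,
        "code": code,  # the temporary code GitHub appends to the callback URL
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=form,
        headers={"Accept": "application/json"},  # ask for a JSON response
        method="POST",
    )
```

&lt;p&gt;Sending it with &lt;code&gt;urllib.request.urlopen&lt;/code&gt; and parsing the JSON body should yield an &lt;code&gt;access_token&lt;/code&gt; field when the code is valid.&lt;/p&gt;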






&lt;h2&gt;
  
  
  Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;✅ Secure GitHub login&lt;/li&gt;
&lt;li&gt;✅ Create events with multiple time slots&lt;/li&gt;
&lt;li&gt;✅ Vote for available slots&lt;/li&gt;
&lt;li&gt;✅ Real-time voting updates&lt;/li&gt;
&lt;li&gt;✅ Auto-finalization and notifications (planned)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Frontend Preview
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A clean dashboard for users to view and manage events&lt;/li&gt;
&lt;li&gt;Interactive voting interface&lt;/li&gt;
&lt;li&gt;Markdown-ready comment section (coming)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All templates are rendered server-side with Jinja2 and styled using Tailwind for responsiveness and polish.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Tornado requires more boilerplate than Flask, but it pays off for async control.&lt;/li&gt;
&lt;li&gt;GitHub OAuth is surprisingly easy when broken down.&lt;/li&gt;
&lt;li&gt;NeonDB's PostgreSQL is handy for prototyping — but local or cloud-managed Postgres is better for production.&lt;/li&gt;
&lt;li&gt;Real-time updates will require integrating &lt;code&gt;tornado.websocket.WebSocketHandler&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
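
&lt;p&gt;The last point can be sketched without any framework. The &lt;code&gt;VoteBoard&lt;/code&gt; class below is a hypothetical illustration, not EventStack's code: it keeps a tally per slot and pushes it to every subscriber. In Tornado, a &lt;code&gt;WebSocketHandler&lt;/code&gt; subclass could plausibly subscribe its &lt;code&gt;write_message&lt;/code&gt; method in &lt;code&gt;open()&lt;/code&gt; and unsubscribe it in &lt;code&gt;on_close()&lt;/code&gt;.&lt;/p&gt;

```python
import json


class VoteBoard:
    """Tracks votes per slot and pushes the tally to every subscriber."""

    def __init__(self):
        self.votes = {}        # maps a slot label to the set of voter ids
        self.subscribers = []  # callables that accept one JSON string

    def subscribe(self, send):
        self.subscribers.append(send)

    def unsubscribe(self, send):
        self.subscribers.remove(send)

    def vote(self, slot, user_id):
        # A set keeps each user to one vote per slot.
        self.votes.setdefault(slot, set()).add(user_id)
        self.broadcast()

    def tally(self):
        return {slot: len(users) for slot, users in self.votes.items()}

    def broadcast(self):
        message = json.dumps({"type": "tally", "counts": self.tally()})
        # Iterate over a copy so a subscriber may unsubscribe while notified.
        for send in list(self.subscribers):
            send(message)
```

&lt;p&gt;Keeping the tally logic separate from the socket handler also makes it easy to test without opening a connection.&lt;/p&gt;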




&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Email or GitHub notifications on finalization&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;EventStack is more than just a clone — it’s a showcase of how you can build something powerful, fast, and modern with minimal libraries. If you’re looking to build real-time apps in Python, give Tornado a try.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Want to contribute? The &lt;a href="https://github.com/abhirajadhikary06/eventstack" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; will be public soon. Drop a ⭐️ if you like the project!&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>productivity</category>
      <category>python</category>
    </item>
    <item>
      <title>My Contribution to Kharagpur Winter of Code 2024 (KWOC)</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Sun, 19 Jan 2025 17:38:53 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/my-contribution-to-kharagpur-winter-of-code-2024-kwoc-3k32</link>
      <guid>https://dev.to/abhirajadhikary06/my-contribution-to-kharagpur-winter-of-code-2024-kwoc-3k32</guid>
      <description>&lt;p&gt;As Kharagpur Winter of Code (KWOC) 2024 draws to a close, I am thrilled to share my journey, contributions, and learnings. KWOC provided me with a platform to contribute to open-source projects, hone my technical skills, and collaborate with a vibrant community of developers. Here’s an overview of the work I accomplished during this enriching experience:&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Projects I Worked On&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Beautiify&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Beautiify is a dynamic project focused on enhancing web design components. I contributed to multiple features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Infinite Scroll Emoji Background&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;PR Link:&lt;/strong&gt; &lt;a href="https://github.com/Rakesh9100/Beautiify/pull/1391" rel="noopener noreferrer"&gt;#1391&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Description:&lt;/strong&gt; Implemented a visually appealing background with infinitely scrolling emojis. Users can add custom components to it via HTML.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Responsive Feedback Form-2&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;PR Link:&lt;/strong&gt; &lt;a href="https://github.com/Rakesh9100/Beautiify/pull/1392" rel="noopener noreferrer"&gt;#1392&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Description:&lt;/strong&gt; Made the feedback form responsive across devices. Enhanced the design with a gradient background, green borders for placeholders, and star animations.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Swag Shipment Form&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;PR Link:&lt;/strong&gt; &lt;a href="https://github.com/Rakesh9100/Beautiify/pull/1397" rel="noopener noreferrer"&gt;#1397&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Description:&lt;/strong&gt; Designed a comprehensive swag shipment form with all essential placeholders.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Error Pages Category and Component&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;PR Link:&lt;/strong&gt; &lt;a href="https://github.com/Rakesh9100/Beautiify/pull/1405" rel="noopener noreferrer"&gt;#1405&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Description:&lt;/strong&gt; Introduced a category for error pages and added a reusable error component for contributors to build upon.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Spooky Themed Hero Component Responsiveness&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;PR Link:&lt;/strong&gt; &lt;a href="https://github.com/Rakesh9100/Beautiify/pull/1439" rel="noopener noreferrer"&gt;#1439&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Description:&lt;/strong&gt; Made the spooky-themed hero component fully responsive, ensuring images adapt seamlessly across devices.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;2. Eventica&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Eventica is a platform designed for managing events efficiently. My contribution included:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Home Page Design Enhancement&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;PR Link:&lt;/strong&gt; &lt;a href="https://github.com/Rakesh9100/Eventica/pull/45" rel="noopener noreferrer"&gt;#45&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Description:&lt;/strong&gt; Revamped the home page with changing images, adding a vibrant and engaging touch to the design.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;3. MindDrive&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;MindDrive is an innovative project aimed at improving user experiences. My contribution was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Arrow Visibility in Dark Mode&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;PR Link:&lt;/strong&gt; &lt;a href="https://github.com/sristy17/MindDrive/pull/65" rel="noopener noreferrer"&gt;#65&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Description:&lt;/strong&gt; Ensured arrows are clearly visible in dark mode, enhancing accessibility and user experience.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Summary of My Work and Learnings&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;During KWOC 2024, I explored various aspects of front-end development, including:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Responsive Design:&lt;/strong&gt; I learned the importance of making components adaptable to different devices and screen sizes.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Enhanced Aesthetics:&lt;/strong&gt; Implementing gradients, animations, and dynamic backgrounds sharpened my design sensibilities.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Collaboration:&lt;/strong&gt; Working with mentors and fellow contributors taught me effective communication and the value of feedback.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Version Control:&lt;/strong&gt; Gained deeper insights into Git and GitHub workflows, managing multiple branches, and resolving conflicts.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;KWOC 2024 has been an incredible learning journey. Each project challenged me to push my boundaries and equipped me with skills that I will carry forward in my development journey. I am grateful for the opportunity to contribute to impactful projects and collaborate with talented individuals.&lt;/p&gt;

&lt;p&gt;To future contributors: Open source is not just about code—it's about community, learning, and growth. Dive in, explore, and enjoy the process!&lt;/p&gt;





</description>
      <category>kwoc</category>
      <category>opensource</category>
      <category>github</category>
    </item>
    <item>
      <title>Leveraging DEM using Daytona</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Sat, 28 Dec 2024 21:29:42 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/leveraging-dem-using-daytona-1115</link>
      <guid>https://dev.to/abhirajadhikary06/leveraging-dem-using-daytona-1115</guid>
      <description>&lt;p&gt;In this blog, we'll dive 🌊🤿 into building a Streamlit-based dashboard for analyzing &lt;strong&gt;Spotify User Sentiment&lt;/strong&gt; using &lt;strong&gt;Airbyte&lt;/strong&gt; for data extraction, &lt;strong&gt;Motherduck (DuckDB)&lt;/strong&gt; for storage and querying, and &lt;strong&gt;Daytona&lt;/strong&gt; for streamlined development environments. This project explores how these technologies integrate with Streamlit to create an interactive and insightful data analysis application.&lt;/p&gt;




&lt;h2&gt;
  
  
  📁 &lt;strong&gt;Folder Structure Overview&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SPOTIFY-REVIEWS-ANALYSIS
├── .devcontainer
│   ├── devcontainer.json
├── .streamlit
│   ├── config.toml
├── assets
│   ├── main.png
├── src
│   ├── config
│   │   ├── __init__.py
│   │   ├── config.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── database.py
│   ├── app.py
├── .env
├── venv
├── .gitignore
├── LICENSE.md
├── README.md
├── requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;.devcontainer/devcontainer.json:&lt;/strong&gt; Configures the development environment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;.streamlit/config.toml:&lt;/strong&gt; Streamlit's UI style and configuration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;assets:&lt;/strong&gt; Stores static assets like images.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;src/config/config.py:&lt;/strong&gt; Handles environment variables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;src/utils/database.py:&lt;/strong&gt; Queries data from Motherduck.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;src/app.py:&lt;/strong&gt; Streamlit dashboard and logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;.env:&lt;/strong&gt; Stores environment variables securely.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  👉 &lt;strong&gt;Tips On Folder Structure&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SPOTIFY-REVIEWS-ANALYSIS:&lt;/strong&gt; This is the root folder of the repository.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;src:&lt;/strong&gt; This folder holds the &lt;strong&gt;config&lt;/strong&gt; and &lt;strong&gt;utils&lt;/strong&gt; packages along with &lt;code&gt;app.py&lt;/code&gt;, the dashboard entry point.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  ☀️ &lt;strong&gt;Daytona Integration&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Daytona&lt;/strong&gt; is an open-source Development Environment Manager (DEM) designed to simplify and streamline the process of setting up development environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  🛠️ &lt;strong&gt;Why Daytona?&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consistency:&lt;/strong&gt; Ensures uniform development environments across all team members.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability:&lt;/strong&gt; Manages multiple environments seamlessly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security:&lt;/strong&gt; Isolates and secures environments.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficiency:&lt;/strong&gt; Reduces overhead during setup and switching between environments.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📚 &lt;strong&gt;Daytona Setup&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Installation:&lt;/strong&gt; Follow the &lt;a href="https://www.daytona.io/docs/installation/" rel="noopener noreferrer"&gt;official installation guide&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuration:&lt;/strong&gt; Create a &lt;code&gt;daytona.yaml&lt;/code&gt; file with dependencies and environment configurations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment Initialization:&lt;/strong&gt; Run Daytona commands to set up your development environment.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;spotify-reviews-analysis&lt;/span&gt;
  &lt;span class="na"&gt;dependencies&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;python&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;pip&lt;/span&gt;
  &lt;span class="na"&gt;scripts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;streamlit&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;run&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;src/app.py"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Daytona ensures every developer has an identical and functional environment for running the project seamlessly.&lt;/p&gt;




&lt;h2&gt;
  
  
  🎏 &lt;strong&gt;Streamlit Setup&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Streamlit&lt;/strong&gt; is an open-source Python library that enables developers to create interactive web apps for data science and machine learning projects.&lt;/p&gt;

&lt;h3&gt;
  
  
  📜 &lt;strong&gt;Code Snippet: Streamlit Core Structure&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;streamlit&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;plotly.express&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;utils.database&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_reviews_for_sentiment&lt;/span&gt;

&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_page_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Spotify Analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_icon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🗳️&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wide&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Title
&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;## 🗳️ Spotify Sentiment Analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Sidebar
&lt;/span&gt;&lt;span class="n"&gt;sentiment_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sidebar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;selectbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sentiment Analysis Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polarity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Subjectivity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Fetch and Display Data
&lt;/span&gt;&lt;span class="n"&gt;reviews_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_reviews_for_sentiment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataframe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reviews_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run &lt;code&gt;app.py&lt;/code&gt; with &lt;code&gt;streamlit run src/app.py&lt;/code&gt;, the dashboard launches at &lt;strong&gt;&lt;a href="http://localhost:8501/" rel="noopener noreferrer"&gt;http://localhost:8501&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 &lt;strong&gt;Core Logic of Sentiment Analysis&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Sentiment analysis is powered by &lt;strong&gt;TextBlob&lt;/strong&gt; to determine the polarity (positive/negative sentiment) and subjectivity (factual/opinionated content) of reviews.&lt;/p&gt;
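
&lt;p&gt;TextBlob reports polarity as a float between -1.0 and 1.0, so a raw score is easier to read on a dashboard once it is bucketed into a label. The sketch below is one way to do that; the &lt;code&gt;label_polarity&lt;/code&gt; helper and its cut-offs are illustrative assumptions and are not part of the project code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from bisect import bisect_right

# Hypothetical cut-offs, chosen only for illustration; tune them for your data.
CUTOFFS = [-0.1, 0.1]
LABELS = ["negative", "neutral", "positive"]


def label_polarity(score):
    """Map a polarity score in [-1.0, 1.0] to a coarse label."""
    # bisect_right counts how many cut-offs are at or below the score,
    # and that count doubles as an index into LABELS.
    return LABELS[bisect_right(CUTOFFS, score)]


for score in (-0.6, 0.0, 0.45):
    print(score, label_polarity(score))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these assumed cut-offs, scores close to zero count as neutral, which keeps short or mixed reviews from being forced into one of the two extremes.&lt;/p&gt;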

&lt;h3&gt;
  
  
  🧠 &lt;strong&gt;Sentiment Analysis Function&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;textblob&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TextBlob&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_sentiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;blob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextBlob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;polarity&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;sentiment_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polarity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subjectivity&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  📈 &lt;strong&gt;Visualization Example&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reviews_df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sentiment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Sentiment Distribution&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plotly_chart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🦆 &lt;strong&gt;Database Integration with Motherduck&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🔗 &lt;strong&gt;database.py&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;duckdb&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config.config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MOTHERDUCK_TOKEN&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_connection&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;duckdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;md:?token=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MOTHERDUCK_TOKEN&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_reviews_for_sentiment&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    SELECT content, score FROM spotify_reviews WHERE content IS NOT NULL
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetch_df&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code fetches Spotify review data from Motherduck, authenticating with the &lt;strong&gt;MOTHERDUCK_TOKEN&lt;/strong&gt; that &lt;code&gt;config.py&lt;/code&gt; loads from &lt;code&gt;.env&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  🗂️ &lt;strong&gt;config.py&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;MOTHERDUCK_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MOTHERDUCK_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔄 &lt;strong&gt;Connection Between &lt;code&gt;app.py&lt;/code&gt; and &lt;code&gt;database.py&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;app.py&lt;/code&gt; imports &lt;code&gt;get_reviews_for_sentiment&lt;/code&gt; from &lt;code&gt;database.py&lt;/code&gt;, creating a seamless flow of data into the dashboard.&lt;/p&gt;




&lt;h2&gt;
  
  
  🎯 &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We successfully built a &lt;strong&gt;Spotify Reviews Sentiment Analysis Dashboard&lt;/strong&gt; using &lt;strong&gt;Airbyte&lt;/strong&gt;, &lt;strong&gt;Motherduck&lt;/strong&gt;, &lt;strong&gt;Streamlit&lt;/strong&gt;, and &lt;strong&gt;Daytona&lt;/strong&gt;. This project demonstrates the power of consistent development environments, robust data storage, and insightful visualization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ym4we9zyri0tx206szj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ym4we9zyri0tx206szj.png" alt="Sneak peek" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;👨‍💻 &lt;strong&gt;Check out the complete code on &lt;a href="https://github.com/abhirajadhikary06/Spotify-Sentiment-Analysis" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/strong&gt;.&lt;br&gt;
📺 &lt;strong&gt;Live PROJECT&lt;/strong&gt; &lt;a href="https://spotify-sentiment-analysis.streamlit.app" rel="noopener noreferrer"&gt;https://spotify-sentiment-analysis.streamlit.app&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Sentiment Analysis #Happy Coding! #Daytona 🚀🦆
&lt;/h1&gt;

</description>
    </item>
    <item>
      <title>📊 Dropbox User Sentiment Analysis using Airbyte 🪼 and Motherduck 🦆</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Sat, 28 Dec 2024 11:58:59 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/dropbox-user-sentiment-analysis-using-airbyte-and-motherduck-1ggd</link>
      <guid>https://dev.to/abhirajadhikary06/dropbox-user-sentiment-analysis-using-airbyte-and-motherduck-1ggd</guid>
      <description>&lt;p&gt;In this blog, we'll dive 🌊🤿 into building a Streamlit-based dashboard for analyzing &lt;strong&gt;Dropbox User Sentiment&lt;/strong&gt; using &lt;strong&gt;Airbyte&lt;/strong&gt; for data extraction and &lt;strong&gt;Motherduck (DuckDB)&lt;/strong&gt; for storage and querying. This post continues from our previous discussion in &lt;em&gt;"&lt;a href="https://dev.to/abhirajadhikary06/leveraging-airbyte-and-motherduck-for-sentiment-analysis-13km"&gt;Leveraging Airbyte 🪼 and Motherduck 🦆 for Sentiment Analysis&lt;/a&gt;"&lt;/em&gt; and explores how these technologies integrate with Streamlit to create an interactive and insightful data analysis application.&lt;/p&gt;




&lt;h2&gt;
  
  
  📁 &lt;strong&gt;Folder Structure Overview&lt;/strong&gt;
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;DROPBOX-REVIEWS-ANALYSIS
├── .devcontainer
│   ├── devcontainer.json
├── .streamlit
│   ├── config.toml
├── assets
│   ├── main.png
├── dropbox-reviews-analytics
│   ├── src
│   │   ├── config
│   │   │   ├── __init__.py
│   │   │   ├── config.py
│   │   ├── utils
│   │   │   ├── __init__.py
│   │   │   ├── database.py
│   │   ├── app.py
├── .env
├── venv
├── .gitignore
├── LICENSE.md
├── README.md
├── requirements.txt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;.devcontainer/devcontainer.json:&lt;/strong&gt; Configures the development environment.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;.streamlit/config.toml:&lt;/strong&gt; Streamlit's UI style and configuration.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;assets:&lt;/strong&gt; Stores static assets like images.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;src/config/config.py:&lt;/strong&gt; Handles environment variables.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;src/utils/database.py:&lt;/strong&gt; Queries data from Motherduck.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;src/app.py:&lt;/strong&gt; Streamlit dashboard and logic.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;.env:&lt;/strong&gt; Stores environment variables securely.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  👉 &lt;strong&gt;Tips On Folder Structure&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;DROPBOX-REVIEWS-ANALYSIS:&lt;/strong&gt; This is the outer folder of the repository on GitHub. When building the project on your own, you don't need to create it.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;dropbox-reviews-analytics:&lt;/strong&gt; This is the main project folder. Create &lt;strong&gt;src&lt;/strong&gt; inside it, then add &lt;strong&gt;config&lt;/strong&gt; and &lt;strong&gt;utils&lt;/strong&gt; inside &lt;strong&gt;src&lt;/strong&gt;.
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🎏 &lt;strong&gt;Streamlit Setup&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Streamlit&lt;/strong&gt; is an open-source Python library that enables developers to create interactive web apps for data science and machine learning projects.&lt;/p&gt;

&lt;h3&gt;
  
  
  📜 &lt;strong&gt;Code Snippet: Streamlit Core Structure&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;streamlit&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;plotly.express&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;utils.database&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;get_reviews_for_sentiment&lt;/span&gt;

&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_page_config&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Dropbox Analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_icon&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🗳️&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;layout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wide&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Title
&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;markdown&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;## 🗳️ Dropbox Sentiment Analysis&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Sidebar
&lt;/span&gt;&lt;span class="n"&gt;sentiment_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sidebar&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;selectbox&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sentiment Analysis Type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polarity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Subjectivity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Fetch and Display Data
&lt;/span&gt;&lt;span class="n"&gt;reviews_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_reviews_for_sentiment&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataframe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reviews_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When you run &lt;code&gt;app.py&lt;/code&gt; with &lt;code&gt;streamlit run src/app.py&lt;/code&gt;, the dashboard launches at &lt;strong&gt;&lt;a href="http://localhost:8501/" rel="noopener noreferrer"&gt;http://localhost:8501&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 &lt;strong&gt;Core Logic of Sentiment Analysis&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Sentiment analysis is powered by &lt;strong&gt;TextBlob&lt;/strong&gt; to determine the polarity (positive/negative sentiment) and subjectivity (factual/opinionated content) of reviews.&lt;/p&gt;

&lt;h3&gt;
  
  
  🧠 &lt;strong&gt;Sentiment Analysis Function&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;textblob&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TextBlob&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_sentiment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;blob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextBlob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;polarity&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;sentiment_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Polarity&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sentiment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subjectivity&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
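
&lt;p&gt;The histogram in the visualization example reads a &lt;code&gt;sentiment&lt;/code&gt; column, but the snippets in this post do not show where that column is created. One possible way to add it is sketched below. The &lt;code&gt;add_sentiment_column&lt;/code&gt; and &lt;code&gt;textblob_polarity&lt;/code&gt; helpers are hypothetical names, and the sketch assumes the reviews arrive as a pandas DataFrame with a &lt;code&gt;content&lt;/code&gt; column, as returned by &lt;code&gt;get_reviews_for_sentiment&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import pandas as pd


def textblob_polarity(text):
    """Default scorer (assumed): TextBlob polarity of the given text."""
    # Imported lazily so that other scorers can be used without TextBlob installed.
    from textblob import TextBlob
    return TextBlob(str(text)).sentiment.polarity


def add_sentiment_column(reviews_df, scorer=textblob_polarity):
    """Return a copy of reviews_df with a 'sentiment' column scored from 'content'."""
    scored = reviews_df.copy()
    scored["sentiment"] = scored["content"].apply(scorer)
    return scored
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;In &lt;code&gt;app.py&lt;/code&gt; this could be called as &lt;code&gt;reviews_df = add_sentiment_column(reviews_df, scorer=get_sentiment)&lt;/code&gt; before plotting, so that the sidebar choice between polarity and subjectivity still decides which score is shown.&lt;/p&gt;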



&lt;h3&gt;
  
  
  📈 &lt;strong&gt;Visualization Example&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;fig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;px&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;histogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reviews_df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;sentiment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Sentiment Distribution&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;plotly_chart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🦆 &lt;strong&gt;Database Integration with Motherduck&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🔗 &lt;strong&gt;database.py&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;duckdb&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;config.config&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MOTHERDUCK_TOKEN&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_connection&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;duckdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;md:?token=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;MOTHERDUCK_TOKEN&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_reviews_for_sentiment&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_connection&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    SELECT content, score FROM dropbox_reviews WHERE content IS NOT NULL
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;fetch_df&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code fetches Dropbox review data from Motherduck, authenticating with the &lt;strong&gt;MOTHERDUCK_TOKEN&lt;/strong&gt; that &lt;code&gt;config.py&lt;/code&gt; loads from &lt;code&gt;.env&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  🗂️ &lt;strong&gt;config.py&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;dotenv&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;load_dotenv&lt;/span&gt;

&lt;span class="nf"&gt;load_dotenv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;MOTHERDUCK_TOKEN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MOTHERDUCK_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔄 &lt;strong&gt;Connection Between &lt;code&gt;app.py&lt;/code&gt; and &lt;code&gt;database.py&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;app.py&lt;/code&gt; imports &lt;code&gt;get_reviews_for_sentiment&lt;/code&gt; from &lt;code&gt;database.py&lt;/code&gt;, creating a seamless flow of data into the dashboard.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚙️ &lt;strong&gt;Why &lt;code&gt;devcontainer.json&lt;/code&gt; and &lt;code&gt;config.toml&lt;/code&gt;?&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;devcontainer.json:&lt;/strong&gt; Provides a consistent development environment that anyone using Docker for containerization can reuse.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;config.toml:&lt;/strong&gt; Controls Streamlit UI customization (e.g., colors, fonts, themes).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example &lt;code&gt;config.toml&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="nn"&gt;[theme]&lt;/span&gt;
&lt;span class="py"&gt;primaryColor&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"#0061FE"&lt;/span&gt;
&lt;span class="py"&gt;backgroundColor&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"#0E1117"&lt;/span&gt;
&lt;span class="py"&gt;secondaryBackgroundColor&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"#262730"&lt;/span&gt;
&lt;span class="py"&gt;textColor&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"#FAFAFA"&lt;/span&gt;
&lt;span class="py"&gt;font&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Monospace"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  ⚠️ &lt;strong&gt;Deployment Challenges&lt;/strong&gt;
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  Avoid pinning exact library versions in &lt;code&gt;requirements.txt&lt;/code&gt;. For example, write &lt;code&gt;plotly&lt;/code&gt; rather than &lt;code&gt;plotly == 5.24.1&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;  Ensure &lt;code&gt;.env&lt;/code&gt; is configured correctly in deployment environments.&lt;/li&gt;
&lt;li&gt;  Validate database connection tokens during runtime.&lt;/li&gt;
&lt;/ul&gt;
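
&lt;p&gt;For the last point, a small check at startup can turn a missing token into a clear error instead of a failed database connection later on. The sketch below uses only the standard library; &lt;code&gt;require_token&lt;/code&gt; is a hypothetical helper and is not part of the project code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import os


def require_token(env=None, name="MOTHERDUCK_TOKEN"):
    """Return the token from the environment, or raise a clear error if it is unset."""
    # Fall back to the real process environment when no mapping is passed in.
    env = os.environ if env is None else env
    token = (env.get(name) or "").strip()
    if not token:
        raise RuntimeError(f"{name} is not set; add it to .env or the deployment secrets.")
    return token


# Demo with an explicit mapping so the example runs without a real token.
print(require_token({"MOTHERDUCK_TOKEN": "demo-token"}))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One place to call it would be &lt;code&gt;config.py&lt;/code&gt;, right after &lt;code&gt;load_dotenv()&lt;/code&gt;, so the app stops early when the token is missing.&lt;/p&gt;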

&lt;p&gt;&lt;strong&gt;Backend Deployment Flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Load environment variables from &lt;code&gt;.env&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; Establish connection with Motherduck DB.&lt;/li&gt;
&lt;li&gt; Fetch and process data.&lt;/li&gt;
&lt;li&gt; Render dashboard in Streamlit.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  🎯 &lt;strong&gt;Conclusion&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We successfully built a &lt;strong&gt;Dropbox Reviews Sentiment Analysis Dashboard&lt;/strong&gt; using &lt;strong&gt;Airbyte&lt;/strong&gt;, &lt;strong&gt;Motherduck&lt;/strong&gt;, and &lt;strong&gt;Streamlit&lt;/strong&gt;. This project demonstrates the power of data analysis and visualization.&lt;/p&gt;

&lt;p&gt;👨‍💻 &lt;strong&gt;Check out the complete code on &lt;a href="https://github.com/abhirajadhikary06/Dropbox-Sentiment-Analysis" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/strong&gt;.&lt;br&gt;
📺 &lt;strong&gt;Live PROJECT&lt;/strong&gt; &lt;a href="https://airbyte-motherduck-hackathon-sentiment-analysis.streamlit.app" rel="noopener noreferrer"&gt;https://airbyte-motherduck-hackathon-sentiment-analysis.streamlit.app&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Sentiment Analysis #Happy Coding! #Airbyte🪼🦆
&lt;/h1&gt;

</description>
      <category>airbyte</category>
      <category>opensource</category>
      <category>datascience</category>
      <category>motherduck</category>
    </item>
    <item>
      <title>Leveraging Airbyte 🪼 and Motherduck 🦆 for Sentiment Analysis</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Fri, 27 Dec 2024 09:46:31 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/leveraging-airbyte-and-motherduck-for-sentiment-analysis-13km</link>
      <guid>https://dev.to/abhirajadhikary06/leveraging-airbyte-and-motherduck-for-sentiment-analysis-13km</guid>
      <description>&lt;p&gt;This blog is part of the &lt;strong&gt;Airbyte + Motherduck Hackathon&lt;/strong&gt;, where I’ll demonstrate how to connect &lt;strong&gt;Google Sheets&lt;/strong&gt; with &lt;strong&gt;Motherduck&lt;/strong&gt; using Airbyte. This setup forms the backbone of my &lt;strong&gt;Dropbox Sentiment Analysis Dashboard&lt;/strong&gt;, enabling seamless data integration and storage for analysis. It should make it easy to create your first &lt;strong&gt;Airbyte&lt;/strong&gt; connection between a source and a destination; I recommend going through the &lt;a href="https://docs.airbyte.com" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt; afterwards. Let’s dive in! 🤿🌊&lt;/p&gt;




&lt;h2&gt;
  
  
  Overview of the Project 🗺️
&lt;/h2&gt;

&lt;p&gt;The goal is to analyze user reviews of the Dropbox app using sentiment analysis techniques. Here's a breakdown of the workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Dataset Source&lt;/strong&gt;: A CSV dataset of Dropbox app user reviews, downloaded from Kaggle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preprocessing&lt;/strong&gt;: Uploaded the CSV to Google Sheets for basic formatting (e.g., converting ratings from text to integers).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Airbyte Integration&lt;/strong&gt;: Used Airbyte to connect Google Sheets (source) with Motherduck (destination).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Destination Setup&lt;/strong&gt;: Motherduck stores the data in DuckDB, which is queried with standard SQL.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analysis&lt;/strong&gt;: Built a sentiment analysis dashboard using Python and Streamlit.&lt;/li&gt;
&lt;/ol&gt;
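
&lt;p&gt;In this project the rating cleanup in step 2 was done by hand in Google Sheets. The same conversion could also be scripted; the sketch below shows one way with Python's standard library. The sample rows are made up for illustration, and the &lt;code&gt;content&lt;/code&gt; and &lt;code&gt;score&lt;/code&gt; column names are an assumption about the CSV export, so adjust them to match the actual file.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import csv
import io

# Hypothetical sample rows; the real Kaggle export may use different column names.
RAW_CSV = """content,score
Great app,5
Sync keeps failing, 2
No rating given,
"""


def parse_score(value):
    """Convert a text rating such as ' 2' to an int; return None if it is blank or not a whole number."""
    text = (value or "").strip()
    return int(text) if text.isdigit() else None


def load_reviews(csv_text):
    """Read reviews from CSV text and convert the 'score' column to integers."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["score"] = parse_score(row["score"])
        rows.append(row)
    return rows


for review in load_reviews(RAW_CSV):
    print(review)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Rows with a missing or malformed rating keep &lt;code&gt;None&lt;/code&gt; rather than a guessed value, so they are easy to filter out before the data is synced.&lt;/p&gt;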

&lt;p&gt;Let me walk you through the setup process for Airbyte and Motherduck. 🎮&lt;/p&gt;




&lt;h2&gt;
  
  
  What is &lt;a href="https://docs.airbyte.com" rel="noopener noreferrer"&gt;Airbyte&lt;/a&gt;? 🧐
&lt;/h2&gt;

&lt;p&gt;Airbyte is an open-source data integration platform that helps synchronize data between different sources and destinations. It provides a wide range of connectors and a user-friendly interface to automate data workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is &lt;a href="https://motherduck.com/docs/getting-started/" rel="noopener noreferrer"&gt;Motherduck&lt;/a&gt;? 🤔
&lt;/h2&gt;

&lt;p&gt;Motherduck is a cloud-based platform built on DuckDB, a fast and lightweight SQL engine. It allows efficient data analysis and management, making it an excellent choice for scalable and real-time data handling.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq43qx8ha48cgkxtfxwct.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq43qx8ha48cgkxtfxwct.png" alt="Airbyte + Motherduck" width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Setting Up Airbyte 🪼
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Step 1&lt;/strong&gt;: Go to &lt;a href="https://airbyte.com" rel="noopener noreferrer"&gt;Airbyte&lt;/a&gt; and log in.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You’ll land in the Airbyte workspace. Follow these steps:&lt;/p&gt;

&lt;h3&gt;
  
  
  Create a New Connection
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Click on &lt;strong&gt;New Connection&lt;/strong&gt; and choose &lt;strong&gt;Google Sheets&lt;/strong&gt; as your source.&lt;/li&gt;
&lt;li&gt;Share your dataset on Google Sheets and copy the link.&lt;/li&gt;
&lt;li&gt;Paste the shared link into the placeholder in Airbyte.&lt;/li&gt;
&lt;li&gt;Authenticate your Google account (ensure it's the same account linked to the Google Sheet).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5ckk0k3z1g6zg2lw7dl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5ckk0k3z1g6zg2lw7dl.png" alt="Airbyte Source Setup" width="800" height="429"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Select Destination
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Under the &lt;strong&gt;Marketplace&lt;/strong&gt;, search for and select &lt;strong&gt;Motherduck&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Authenticate Motherduck as the destination (the steps are described in the next section).&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Configuring Motherduck 🦆
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Step 2&lt;/strong&gt;: Head over to &lt;a href="https://motherduck.com" rel="noopener noreferrer"&gt;Motherduck&lt;/a&gt; and sign up.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ol&gt;
&lt;li&gt;After signing up, delete the sample workspace (not needed for this setup).&lt;/li&gt;
&lt;li&gt;Navigate to &lt;strong&gt;Settings&lt;/strong&gt; under your profile.&lt;/li&gt;
&lt;li&gt;In the &lt;strong&gt;General&lt;/strong&gt; tab, generate a &lt;strong&gt;Motherduck token&lt;/strong&gt; (API Key).&lt;/li&gt;
&lt;li&gt;Copy the token and paste it into Airbyte when prompted.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F30o23cm4ca8d9z4fpnj2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F30o23cm4ca8d9z4fpnj2.png" alt="Airbyte-Motherduck Connection" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Schedule the Sync 🎗️
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Configure the sync schedule to keep your Motherduck database updated with any changes in the Google Sheet.&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;Next&lt;/strong&gt; to finalize the connection.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Validating the Connection 🔄
&lt;/h2&gt;

&lt;p&gt;After completing the setup, check that the source data has reached the destination:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;On the left panel of your Motherduck page, find &lt;strong&gt;Attached Databases&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Under &lt;code&gt;my_db&lt;/code&gt;, navigate to &lt;code&gt;main&lt;/code&gt;, where you’ll see your dataset (e.g., &lt;code&gt;dropbox_reviews&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Start a new notebook and run queries to confirm the data transfer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Example query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;my_db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dropbox_reviews&lt;/span&gt;
&lt;span class="k"&gt;select&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;reviewId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_airbyte_raw_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_airbyte_extracted_at&lt;/span&gt;
&lt;span class="k"&gt;limit&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
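
&lt;p&gt;The same check can be scripted from Python. What follows is only a minimal sketch: the helper names &lt;code&gt;qualified_name&lt;/code&gt; and &lt;code&gt;row_count&lt;/code&gt; are hypothetical, and the code assumes a connection object that exposes &lt;code&gt;execute(sql).fetchone()&lt;/code&gt;, which is the shape offered by the &lt;code&gt;duckdb&lt;/code&gt; Python package (and by the standard-library &lt;code&gt;sqlite3&lt;/code&gt; module, which is handy for trying it locally).&lt;/p&gt;

```python
import re

# Plain identifiers only, so names can be interpolated into SQL safely.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def qualified_name(*parts: str) -> str:
    """Join parts such as ("my_db", "main", "dropbox_reviews") with dots."""
    if not parts:
        raise ValueError("at least one identifier is required")
    for part in parts:
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"unsafe identifier: {part!r}")
    return ".".join(parts)


def row_count(con, *parts: str) -> int:
    """Return the number of rows in the given table.

    Assumption: `con` exposes execute(sql).fetchone(), as duckdb and
    sqlite3 connections do.
    """
    sql = f"SELECT count(*) FROM {qualified_name(*parts)}"
    return con.execute(sql).fetchone()[0]
```

&lt;p&gt;Assuming the &lt;code&gt;duckdb&lt;/code&gt; package is installed and your token is exported as the &lt;code&gt;motherduck_token&lt;/code&gt; environment variable, a call such as &lt;code&gt;row_count(duckdb.connect("md:my_db"), "my_db", "main", "dropbox_reviews")&lt;/code&gt; should return a non-zero count once the first sync has finished.&lt;/p&gt;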



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwb1fypfyggto77u491h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwb1fypfyggto77u491h.png" alt="Google Sheet - Airbyte - Motherduck Connection" width="800" height="430"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What’s Next? 🚞
&lt;/h2&gt;

&lt;p&gt;This blog covers the setup of Airbyte and Motherduck for seamless data integration. In my next post, I’ll dive into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Project Structure&lt;/strong&gt;: A detailed walkthrough of the Dropbox Sentiment Analysis project.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coding Logic&lt;/strong&gt;: Explanation of Python libraries used for sentiment analysis.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard Deployment&lt;/strong&gt;: How to deploy the application on Streamlit.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PROJECT 📊 : &lt;a href="https://airbyte-motherduck-hack-dropbox-sentiment-analysis.streamlit.app" rel="noopener noreferrer"&gt;https://airbyte-motherduck-hack-dropbox-sentiment-analysis.streamlit.app&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Stay tuned for an exciting journey into sentiment analysis of Dropbox User Reviews! 🚀🌕🪼🦆&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Edit:&lt;/strong&gt; The post "&lt;a href="https://dev.to/abhirajadhikary06/dropbox-user-sentiment-analysis-using-airbyte-and-motherduck-1ggd"&gt;Dropbox User Review Sentiment Analysis&lt;/a&gt;" is out as of 28th December.&lt;/p&gt;

&lt;p&gt;#AirbyteHQ #Motherduck #HappyConnecting&lt;/p&gt;

</description>
      <category>airbyte</category>
      <category>motherduck</category>
      <category>opensource</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Unlock Your Creativity: 6 End-to-End Python Projects Using Open-Source APIs</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Thu, 19 Dec 2024 05:41:28 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/unlock-your-creativity-6-end-to-end-python-projects-using-open-source-apis-2eb0</link>
      <guid>https://dev.to/abhirajadhikary06/unlock-your-creativity-6-end-to-end-python-projects-using-open-source-apis-2eb0</guid>
      <description>&lt;p&gt;Are you looking to build impactful projects with Python and open-source APIs? Whether you're an aspiring developer or a seasoned coder, crafting end-to-end applications can showcase your skills and enhance your portfolio. This blog explores six innovative project ideas that leverage Python as the main language and integrate different open-source tools, with features like GitHub OAuth using Supabase. Let’s dive in!&lt;/p&gt;




&lt;h3&gt;
  
  
  1. Personalized Job Finder Platform
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;: Create a platform where users can find jobs tailored to their skills and location, track applications, and save resumes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub OAuth login using Supabase.&lt;/li&gt;
&lt;li&gt;Job recommendations based on user preferences.&lt;/li&gt;
&lt;li&gt;Application tracking system.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Open-Source Tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://supabase.com/" rel="noopener noreferrer"&gt;Supabase&lt;/a&gt;: For user authentication and database management.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;: To develop a robust backend.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.crummy.com/software/BeautifulSoup/" rel="noopener noreferrer"&gt;BeautifulSoup&lt;/a&gt;: For web scraping job listings.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://streamlit.io/" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;: To create an interactive front end.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/jsvine/pdfplumber" rel="noopener noreferrer"&gt;PDFPlumber&lt;/a&gt;: For parsing uploaded resumes.&lt;/li&gt;
&lt;/ul&gt;
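
&lt;p&gt;To make the "job recommendations based on user preferences" feature more concrete, here is a minimal sketch of one possible ranking step. It uses Jaccard similarity over skill sets purely as an illustration (this is my stand-in technique, not something the tools above prescribe), and the function names and the job dictionary shape are hypothetical.&lt;/p&gt;

```python
def skill_overlap(user_skills, job_skills) -> float:
    """Jaccard similarity between two skill collections (case-insensitive)."""
    user = {skill.strip().lower() for skill in user_skills}
    job = {skill.strip().lower() for skill in job_skills}
    if not user or not job:
        return 0.0
    return len(user.intersection(job)) / len(user.union(job))


def recommend(user_skills, jobs, top_n=3):
    """Rank jobs by skill overlap and return the best-matching titles.

    Hypothetical job shape: {"title": str, "skills": list of str}.
    """
    scored = [
        (skill_overlap(user_skills, job["skills"]), job["title"])
        for job in jobs
    ]
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:top_n] if score > 0]
```

&lt;p&gt;In a real build, the job list would come from the scraping step and the user's skills from the parsed resume; a plain overlap score is just an easy starting point before trying anything smarter.&lt;/p&gt;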




&lt;h3&gt;
  
  
  2. AI-Powered Recipe Generator
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;: Develop a tool that generates recipes based on available ingredients and analyzes their nutritional value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Save recipes via Supabase.&lt;/li&gt;
&lt;li&gt;AI-generated recipes using text models.&lt;/li&gt;
&lt;li&gt;Nutrition analysis of recipes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Open-Source Tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://supabase.com/" rel="noopener noreferrer"&gt;Supabase&lt;/a&gt;: For recipe storage and user authentication.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://huggingface.co/transformers/" rel="noopener noreferrer"&gt;Hugging Face Transformers&lt;/a&gt;: For generating recipe suggestions.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://spoonacular.com/food-api" rel="noopener noreferrer"&gt;Spoonacular API&lt;/a&gt;: For nutrition analysis.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;: To handle backend operations.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://streamlit.io/" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;: For a seamless UI experience.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  3. Collaborative Study Platform
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;: Build a platform where users can collaborate on notes in real time and participate in gamified study challenges.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time collaborative document editing.&lt;/li&gt;
&lt;li&gt;Gamification with leaderboards.&lt;/li&gt;
&lt;li&gt;GitHub OAuth for login.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Open-Source Tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://supabase.com/" rel="noopener noreferrer"&gt;Supabase&lt;/a&gt;: For managing users and storing notes.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://socket.io/" rel="noopener noreferrer"&gt;Socket.IO&lt;/a&gt;: For real-time collaboration.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://quilljs.com/" rel="noopener noreferrer"&gt;Quill.js&lt;/a&gt;: To integrate a rich text editor.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.mongodb.com/" rel="noopener noreferrer"&gt;MongoDB&lt;/a&gt;: For storing documents.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;: Backend development.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  4. Eco-Friendly Shopping Assistant
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;: A web app that helps users evaluate products for eco-friendliness and calculates the carbon footprint of their shopping habits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Barcode scanner for product lookup.&lt;/li&gt;
&lt;li&gt;Eco-friendliness ratings of products.&lt;/li&gt;
&lt;li&gt;Carbon footprint calculations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Open-Source Tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://supabase.com/" rel="noopener noreferrer"&gt;Supabase&lt;/a&gt;: For user authentication and data storage.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/zxing/zxing" rel="noopener noreferrer"&gt;ZXing API&lt;/a&gt;: To scan barcodes.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://world.openfoodfacts.org/data" rel="noopener noreferrer"&gt;Open Food Facts API&lt;/a&gt;: For product information.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pandas.pydata.org/" rel="noopener noreferrer"&gt;Pandas&lt;/a&gt;: To calculate and analyze data.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://streamlit.io/" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;: For visualizing the insights.&lt;/li&gt;
&lt;/ul&gt;
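
&lt;p&gt;As a rough idea of the product lookup step, the sketch below builds an Open Food Facts request from a scanned barcode and trims the response down to a couple of fields. Treat the details as assumptions: the endpoint shape, the &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;product_name&lt;/code&gt; and &lt;code&gt;ecoscore_grade&lt;/code&gt; fields, and all helper names are my reading of the public API, so check the current API docs before relying on them.&lt;/p&gt;

```python
import json
import urllib.request

# Assumed endpoint shape of the public Open Food Facts read API (v2).
API_TEMPLATE = "https://world.openfoodfacts.org/api/v2/product/{barcode}.json"


def product_url(barcode: str) -> str:
    """Build the lookup URL for a numeric barcode (EAN/UPC)."""
    if not barcode.isdigit():
        raise ValueError(f"barcode must be digits only: {barcode!r}")
    return API_TEMPLATE.format(barcode=barcode)


def summarize(payload: dict) -> dict:
    """Pick out the fields used for an eco rating; missing ones become None."""
    if payload.get("status") != 1:
        return {"found": False, "name": None, "ecoscore_grade": None}
    product = payload.get("product", {})
    return {
        "found": True,
        "name": product.get("product_name"),
        "ecoscore_grade": product.get("ecoscore_grade"),
    }


def fetch_product(barcode: str, timeout: float = 10.0) -> dict:
    """Fetch and summarize one product (performs a network request)."""
    request = urllib.request.Request(
        product_url(barcode),
        headers={"User-Agent": "eco-assistant-sketch/0.1"},  # hypothetical app name
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize(json.load(response))
```

&lt;p&gt;Keeping &lt;code&gt;summarize&lt;/code&gt; separate from the network call makes the parsing easy to test with a hand-written dictionary, and the summarized rows can then be collected into a Pandas DataFrame for the footprint calculations.&lt;/p&gt;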




&lt;h3&gt;
  
  
  5. Fitness Tracker with Social Features
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;: A fitness tracker that lets users monitor their progress and share achievements with friends.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Track fitness goals and daily activity.&lt;/li&gt;
&lt;li&gt;Social sharing of fitness achievements.&lt;/li&gt;
&lt;li&gt;GitHub OAuth for login.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Open-Source Tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://supabase.com/" rel="noopener noreferrer"&gt;Supabase&lt;/a&gt;: For managing user data and achievements.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developers.google.com/fit" rel="noopener noreferrer"&gt;Google Fit API&lt;/a&gt;: To sync fitness data.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://matplotlib.org/" rel="noopener noreferrer"&gt;Matplotlib&lt;/a&gt;: For creating visualizations of progress.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dash.plotly.com/" rel="noopener noreferrer"&gt;Dash&lt;/a&gt;: Interactive dashboards for users.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;: Backend services.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  6. AI-Powered Code Review Assistant
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Description&lt;/strong&gt;: Develop a tool that integrates with GitHub to perform automated code reviews and provide suggestions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Features&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub OAuth for authentication.&lt;/li&gt;
&lt;li&gt;Automated code analysis with actionable insights.&lt;/li&gt;
&lt;li&gt;Integration with pull requests for seamless code reviews.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Open-Source Tools&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://supabase.com/" rel="noopener noreferrer"&gt;Supabase&lt;/a&gt;: Authentication and user management.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://docs.github.com/en/rest" rel="noopener noreferrer"&gt;GitHub API&lt;/a&gt;: To fetch and manage pull requests.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://huggingface.co/transformers/" rel="noopener noreferrer"&gt;Hugging Face Transformers&lt;/a&gt;: For analyzing and improving code.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://fastapi.tiangolo.com/" rel="noopener noreferrer"&gt;FastAPI&lt;/a&gt;: Backend for handling requests.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://streamlit.io/" rel="noopener noreferrer"&gt;Streamlit&lt;/a&gt;: UI to display review results.&lt;/li&gt;
&lt;/ul&gt;
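
&lt;p&gt;For the pull request integration, a first step could be listing open pull requests through the GitHub REST API. The sketch below is a hedged example using only the standard library: the helper names are hypothetical, and it assumes the &lt;code&gt;/repos/{owner}/{repo}/pulls&lt;/code&gt; endpoint with the &lt;code&gt;number&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;user.login&lt;/code&gt; fields in each returned object.&lt;/p&gt;

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def pulls_url(owner: str, repo: str, state: str = "open", per_page: int = 30) -> str:
    """Build the URL that lists pull requests for a repository."""
    if state not in ("open", "closed", "all"):
        raise ValueError(f"unsupported state: {state!r}")
    query = urllib.parse.urlencode({"state": state, "per_page": per_page})
    path = f"/repos/{urllib.parse.quote(owner)}/{urllib.parse.quote(repo)}/pulls"
    return f"{API_ROOT}{path}?{query}"


def summarize_pulls(pulls: list) -> list:
    """Reduce pull request objects to (number, title, author login) tuples."""
    return [(pr["number"], pr["title"], pr["user"]["login"]) for pr in pulls]


def fetch_pulls(owner: str, repo: str, token=None) -> list:
    """Fetch open pull requests (performs a network request).

    `token` is optional; a token raises the API rate limit.
    """
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "code-review-sketch/0.1",  # hypothetical app name
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(pulls_url(owner, repo), headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return summarize_pulls(json.load(response))
```

&lt;p&gt;From here, each pull request's diff could be passed to whichever model you pick for the analysis step, with the FastAPI backend exposing the results to the Streamlit UI.&lt;/p&gt;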




&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;These projects are excellent for mastering Python and open-source tools while building real-world applications. Whether it’s a job finder, recipe generator, or code review assistant, the possibilities are endless. By integrating APIs like Supabase, Hugging Face, or Open Food Facts, you’ll learn to create efficient, scalable solutions.&lt;/p&gt;

&lt;p&gt;Start building today, and let your creativity shine!&lt;/p&gt;

</description>
      <category>python</category>
      <category>startup</category>
      <category>opensource</category>
    </item>
    <item>
      <title>UI Card Library</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Sat, 14 Dec 2024 19:57:37 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/ui-card-library-5d4k</link>
      <guid>https://dev.to/abhirajadhikary06/ui-card-library-5d4k</guid>
      <description>&lt;p&gt;Participating in the &lt;a href="https://dev.to/challenges/frontend-2024-12-04"&gt;Frontend Challenge - December Edition, CSS Art: December&lt;/a&gt; has been an inspiring journey into the world of CSS art.&lt;/p&gt;

&lt;h2&gt;
  
  
  Inspiration
&lt;/h2&gt;

&lt;p&gt;A curated collection of beautifully designed UI cards with direct access to their Figma designs. Each card includes creator details with links to their LinkedIn and Twitter profiles. Perfect for inspiration and collaboration!&lt;/p&gt;

&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;p&gt;You can view the CSS art piece I created for this challenge below:&lt;/p&gt;

&lt;p&gt;GitHub Repo: &lt;a href="https://github.com/abhirajadhikary06/UI-Card-Library" rel="noopener noreferrer"&gt;https://github.com/abhirajadhikary06/UI-Card-Library&lt;/a&gt;&lt;br&gt;
Live Preview: &lt;a href="https://ui-card-library.vercel.app/" rel="noopener noreferrer"&gt;https://ui-card-library.vercel.app/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ooxj3bch4aqgpkodm3k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ooxj3bch4aqgpkodm3k.png" alt="UI Card Library" width="800" height="643"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Journey
&lt;/h2&gt;

&lt;p&gt;Embarking on this project allowed me to delve deeper into CSS techniques, such as positioning, transformations, and animations. I learned how to manipulate simple HTML elements to create intricate designs, enhancing both my technical skills and artistic expression.&lt;/p&gt;

&lt;p&gt;One of the key takeaways was understanding the importance of planning and sketching designs before coding, which streamlined the development process. I'm particularly proud of how the final piece reflects the initial concept, demonstrating the potential of CSS in creating art.&lt;/p&gt;

&lt;p&gt;Looking ahead, I aim to experiment with more complex animations and interactive elements, further blending art with functionality in web design.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: The code for this project is open-source and available under the &lt;a href="https://opensource.org/licenses/MIT" rel="noopener noreferrer"&gt;MIT License&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Thank you for viewing my submission! &lt;/p&gt;

</description>
      <category>frontendchallenge</category>
      <category>devchallenge</category>
      <category>css</category>
      <category>opensource</category>
    </item>
    <item>
      <title>My Journey from Code Explorer to Open Source Contributor: 2024 Hacktoberfest</title>
      <dc:creator>Abhiraj Adhikary</dc:creator>
      <pubDate>Tue, 29 Oct 2024 12:12:24 +0000</pubDate>
      <link>https://dev.to/abhirajadhikary06/my-journey-from-code-explorer-to-open-source-contributor-2024-hacktoberfest-5foe</link>
      <guid>https://dev.to/abhirajadhikary06/my-journey-from-code-explorer-to-open-source-contributor-2024-hacktoberfest-5foe</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/hacktoberfest"&gt;2024 Hacktoberfest Writing challenge&lt;/a&gt;: Contributor Experience&lt;/em&gt;&lt;/p&gt;





&lt;p&gt;The term "Open Source" was a part of my vocabulary early in my coding journey, as was GitHub. Yet, contributing actively never felt like a step I was encouraged to take—until a webinar hosted by ISTE at Haldia Institute of Technology opened my eyes. They shared not only the importance of contributing to Open Source but also the exciting opportunities it offers developers to grow, learn, and even gain recognition. That single session was my "light bulb" moment, and from that day, my journey into Open Source began.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Out: My First Steps in the Ocean of Code
&lt;/h3&gt;

&lt;p&gt;The first few days of this journey felt like diving into an endless ocean of code. After the initial excitement, I found myself staring at vast repositories filled with complex codebases. But instead of rushing, I took a step back, studied the repository structure, navigated the issues, and researched using AI tools to gain insights. This methodical approach paid off. I learned how to understand issue requirements, the layout of repos, and how code contributions were expected to flow in each project.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hacktoberfest: Taking the Leap
&lt;/h3&gt;

&lt;p&gt;With Hacktoberfest approaching, I set myself a challenge: to push beyond the minimum four PRs and make a meaningful impact across multiple projects. In just two weeks, I contributed to several renowned repositories. My process was to thoroughly research, submit PRs, and actively seek feedback from maintainers. A few PRs required modifications, which were challenging but essential learning opportunities. By the end of two weeks, I had raised &lt;strong&gt;10 PRs&lt;/strong&gt;—far more than I initially imagined I could!&lt;/p&gt;

&lt;p&gt;Here are some of the repositories I contributed to during Hacktoberfest:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gradle:&lt;/strong&gt; &lt;a href="https://github.com/gradle/gradle" rel="noopener noreferrer"&gt;https://github.com/gradle/gradle&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GuacSec Documentation:&lt;/strong&gt; &lt;a href="https://github.com/guacsec/guac-docs" rel="noopener noreferrer"&gt;https://github.com/guacsec/guac-docs&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;URLShortener:&lt;/strong&gt; &lt;a href="https://github.com/AvgBlank/URLShortener" rel="noopener noreferrer"&gt;https://github.com/AvgBlank/URLShortener&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configu:&lt;/strong&gt; &lt;a href="https://github.com/configu/configu" rel="noopener noreferrer"&gt;https://github.com/configu/configu&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;QuickBlox Hacktoberfest Challenge:&lt;/strong&gt; &lt;a href="https://github.com/QuickBlox/quickblox-hacktoberfest-challenge" rel="noopener noreferrer"&gt;https://github.com/QuickBlox/quickblox-hacktoberfest-challenge&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cadmi AI:&lt;/strong&gt; &lt;a href="https://github.com/Sohammhatre10/Cadmi_AI" rel="noopener noreferrer"&gt;https://github.com/Sohammhatre10/Cadmi_AI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gradle Cookbook:&lt;/strong&gt; &lt;a href="https://github.com/gradle/cookbook" rel="noopener noreferrer"&gt;https://github.com/gradle/cookbook&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Easy Fix:&lt;/strong&gt; &lt;a href="https://github.com/vatsalsinghkv/easy-fix" rel="noopener noreferrer"&gt;https://github.com/vatsalsinghkv/easy-fix&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gofr:&lt;/strong&gt; &lt;a href="https://github.com/gofr-dev/gofr" rel="noopener noreferrer"&gt;https://github.com/gofr-dev/gofr&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;QuestDB Documentation:&lt;/strong&gt; &lt;a href="https://github.com/questdb/documentation" rel="noopener noreferrer"&gt;https://github.com/questdb/documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Aardwolf Social:&lt;/strong&gt; &lt;a href="https://github.com/Aardwolf-Social/aardwolf-social" rel="noopener noreferrer"&gt;https://github.com/Aardwolf-Social/aardwolf-social&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ISTE Contribution React:&lt;/strong&gt; &lt;a href="https://github.com/Arijit-017/ISTE-Contribution-React" rel="noopener noreferrer"&gt;https://github.com/Arijit-017/ISTE-Contribution-React&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flyte (No-Code Contribution):&lt;/strong&gt; &lt;a href="https://github.com/flyteorg/flyte" rel="noopener noreferrer"&gt;https://github.com/flyteorg/flyte&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Steampipe (No-Code Contribution):&lt;/strong&gt; &lt;a href="https://github.com/turbot/steampipe" rel="noopener noreferrer"&gt;https://github.com/turbot/steampipe&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Lessons Learned and Skills Gained
&lt;/h3&gt;

&lt;p&gt;My Hacktoberfest experience was packed with learnings. The most significant takeaways were:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Patience and Persistence:&lt;/strong&gt; Many contributions require a deep understanding of codebases and issues, often needing multiple iterations to get right.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Effective Communication:&lt;/strong&gt; Tagging maintainers, seeking feedback, and updating PRs based on suggestions taught me the importance of clear, concise communication in Open Source.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Community and Collaboration:&lt;/strong&gt; Open Source is, at its core, community-driven. Meeting fellow contributors and understanding their perspectives made me appreciate the collective spirit that keeps these projects alive.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Tips for Newcomers: Getting Started in Open Source
&lt;/h3&gt;

&lt;p&gt;Are you new to Open Source and wondering how to make your first contribution? Here are some tips to get started:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Find Beginner-Friendly Repositories:&lt;/strong&gt; Start with repositories whose issues carry labels like “good first issue” or “beginner-friendly.” These are often curated specifically for new contributors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start Small:&lt;/strong&gt; Begin with small issues, such as documentation updates or minor bug fixes. This will help you get familiar with the project and build confidence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Study the Repository Structure:&lt;/strong&gt; Before jumping into code, take time to understand the structure of the repository. Read the documentation and navigate through the folders to understand where things are located.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ask Questions:&lt;/strong&gt; If you’re unsure about something, don’t hesitate to ask. Maintainers and other contributors are usually very welcoming and ready to help.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Follow Contribution Guidelines:&lt;/strong&gt; Every project has its guidelines on contributing. Make sure to read them and follow all instructions to make the review process smoother.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Join Hacktoberfest:&lt;/strong&gt; If you’re unsure where to start, participating in Hacktoberfest can provide structured goals and a supportive community to guide you.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Future: GSoC and Beyond
&lt;/h3&gt;

&lt;p&gt;The thrill of contributing, the satisfaction of raising PRs, and the welcoming feedback from the community have inspired me to continue even beyond Hacktoberfest. My next goal? To try my hand at cracking Google Summer of Code (GSoC). I feel ready to tackle bigger challenges, contribute to even more complex projects, and work alongside brilliant minds in the Open Source community.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Open Source Is Worth It: More Than Just Code
&lt;/h3&gt;

&lt;p&gt;Participating in Open Source has opened doors to learn, grow, and even receive some amazing swag from projects I contributed to! But the true reward has been the journey itself: the challenges, the community, and the newfound confidence. This experience has left a fire in me to keep contributing, learning, and growing as a developer.&lt;/p&gt;

&lt;p&gt;You can check out my GitHub and follow along with my journey here: &lt;a href="https://github.com/abhirajadhikary06" rel="noopener noreferrer"&gt;https://github.com/abhirajadhikary06&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Let’s keep pushing the boundaries of what we can achieve together in Open Source!&lt;/strong&gt; 🚀&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>hacktoberfest</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
