<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Anhaj Uwaisulkarni</title>
    <description>The latest articles on DEV Community by Anhaj Uwaisulkarni (@anhaj0).</description>
    <link>https://dev.to/anhaj0</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3773551%2Ff468f771-fd5a-4b02-b44c-a1c4a8688b26.png</url>
      <title>DEV Community: Anhaj Uwaisulkarni</title>
      <link>https://dev.to/anhaj0</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/anhaj0"/>
    <language>en</language>
    <item>
      <title>Building a Live Bus Tracker: AI Crowd Analysis &amp; Real-Time Sync (Part 2)</title>
      <dc:creator>Anhaj Uwaisulkarni</dc:creator>
      <pubDate>Sun, 15 Feb 2026 07:20:11 +0000</pubDate>
      <link>https://dev.to/anhaj0/building-a-live-bus-tracker-ai-crowd-analysis-real-time-sync-part-2-46pk</link>
      <guid>https://dev.to/anhaj0/building-a-live-bus-tracker-ai-crowd-analysis-real-time-sync-part-2-46pk</guid>
      <description>&lt;p&gt;Getting hardware to transmit data from a moving bus is hard. Processing that data into actionable, real-time insights is a completely different challenge.&lt;/p&gt;

&lt;p&gt;In Part 1, I built an ESP32-CAM node to push GPS and image data over a 2G cellular network.&lt;/p&gt;

&lt;p&gt;Now, I need a backend capable of catching that data, analyzing the crowd size using AI, and syncing it to a mobile app in milliseconds. Here is how I architected the cloud infrastructure using FastAPI, Hugging Face, Google Gemini, and React.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. The FastAPI Receiver
&lt;/h3&gt;

&lt;p&gt;The ESP32 sends a continuous stream of HTTP POST requests. I deployed a Python FastAPI backend on Hugging Face Spaces to catch them.&lt;/p&gt;

&lt;p&gt;Memory on the hardware side is tight. I couldn't send massive JSON payloads without crashing the board.&lt;/p&gt;

&lt;p&gt;Instead, I injected the GPS coordinates and speed directly into the HTTP headers. The actual image is sent as raw binary data in the request body.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
@app.post("/api/update-bus")
async def update_bus(
    request: Request,
    bus_id: str = Header(..., alias="bus-id"),
    bus_lat: float = Header(..., alias="bus-lat"),
    bus_lng: float = Header(..., alias="bus-lng"),
    bus_speed: float = Header(..., alias="bus-speed"),
):
    # Read raw image binary directly from the request body
    image_bytes = await request.body()

    if not image_bytes:
        return JSONResponse({"status": "error", "message": "No image data"}, status_code=400)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
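&lt;p&gt;To sanity-check this contract without hardware, you can build the same request shape in Python. Everything below is illustrative: the localhost URL, the coordinates, and the fake JPEG bytes are placeholders, and the request is only prepared, never sent.&lt;/p&gt;

```python
from urllib.request import Request

# Placeholder payload: JPEG magic bytes plus padding stands in for a real frame
fake_jpeg = b"\xff\xd8\xff\xe0" + bytes(128)

# Telemetry rides in custom headers; the image is the raw body (no JSON wrapper)
req = Request(
    "http://localhost:7860/api/update-bus",  # hypothetical local deployment
    data=fake_jpeg,
    headers={
        "bus-id": "bus-42",
        "bus-lat": "6.0535",
        "bus-lng": "80.2210",
        "bus-speed": "32.5",
        "Content-Type": "application/octet-stream",
    },
    method="POST",
)

# urllib normalizes header keys to Capitalized form
print(req.get_header("Bus-id"), len(req.data))  # -> bus-42 132
```

&lt;p&gt;Note that the body is the bare JPEG bytes, exactly what &lt;code&gt;await request.body()&lt;/code&gt; reads on the FastAPI side.&lt;/p&gt;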



&lt;h3&gt;
  
  
  2. Dual AI Processing (The Flex)
&lt;/h3&gt;

&lt;p&gt;Relying on a single AI model for object detection in a production environment is risky. If the API rate-limits or fails, the entire tracking app breaks.&lt;/p&gt;

&lt;p&gt;To solve this, I used two different models: a Hugging Face DETR model and Google's Gemini 2.5 Flash API.&lt;/p&gt;

&lt;p&gt;Running them sequentially would take too long and introduce severe latency to the real-time map. I used Python's ThreadPoolExecutor to run both AI inferences concurrently on separate threads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
    # 1. Concurrent AI Analysis
    gemini_count = 0
    hf_count = 0

    with ThreadPoolExecutor(max_workers=2) as executor:
        future_gem = executor.submit(analyze_with_gemini, image_bytes)
        future_hf = executor.submit(analyze_with_huggingface, image_bytes)
        gemini_count = future_gem.result()
        hf_count = future_hf.result()

    # 2. Calculate Average &amp;amp; Crowd Level
    if gemini_count &amp;gt; 0 and hf_count &amp;gt; 0:
        avg_count = round((gemini_count + hf_count) / 2)
    else:
        avg_count = max(gemini_count, hf_count) # Fallback if one API fails
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running both models in parallel roughly halves the processing time. It also guarantees a crowd count even if one API goes offline. The avg_count is then mapped to a string status: Low, Moderate, High, or Very High.&lt;/p&gt;
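&lt;p&gt;The post doesn't list the exact thresholds, so here is a minimal sketch of the count-to-status mapping; the cutoffs are assumptions to be tuned to the bus's real capacity.&lt;/p&gt;

```python
def crowd_status(avg_count: int) -> str:
    """Map an averaged head count onto a display status.

    The cutoffs below are illustrative, not the article's actual values.
    """
    if avg_count <= 10:
        return "Low"
    if avg_count <= 25:
        return "Moderate"
    if avg_count <= 40:
        return "High"
    return "Very High"

print(crowd_status(8), crowd_status(33), crowd_status(50))
# -> Low High Very High
```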

&lt;h3&gt;
  
  
  3. Firebase Real-Time Sync
&lt;/h3&gt;

&lt;p&gt;Once the backend calculates the crowd level, it needs to push the data to the frontend immediately.&lt;/p&gt;

&lt;p&gt;I used Firebase Firestore. The critical detail here is updating the live location and crowd status without overwriting static database fields (like the bus route number or destination).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
        try:
            bus_ref = db.collection('public').document('data').collection('buses').document(bus_id)

            bus_data = {
                "id": bus_id,
                "lat": bus_lat,
                "lng": bus_lng,
                "speed": bus_speed,
                "peopleCount": avg_count,
                "crowdLevel": crowd_status,
                "lastUpdated": firestore.SERVER_TIMESTAMP
            }

            # merge=True prevents overwriting existing route data
            bus_ref.set(bus_data, merge=True)
        except Exception as e:
            print(f"Firestore Write Failed: {e}")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
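&lt;p&gt;The merge semantics are worth spelling out. Ignoring the Firestore client entirely, merge=True behaves like a field-level update on the stored document, which a plain dict models well enough (the field values here are made up):&lt;/p&gt;

```python
# Plain-dict model of set(..., merge=True); illustrative, not the Firestore client
stored = {"id": "bus-42", "route": "101", "destination": "Galle Fort"}  # static fields
live = {"lat": 6.0535, "lng": 80.2210, "peopleCount": 18, "crowdLevel": "Moderate"}

stored.update(live)  # live telemetry lands; route and destination survive

print(stored["route"], stored["crowdLevel"])
# -> 101 Moderate
```

&lt;p&gt;Without merge=True, set() replaces the whole document, so the static route fields would vanish on every telemetry update.&lt;/p&gt;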



&lt;h3&gt;
  
  
  4. The React Frontend
&lt;/h3&gt;

&lt;p&gt;The final piece of the architecture is the user interface. The React app actively listens to the Firestore database.&lt;/p&gt;

&lt;p&gt;When a user searches for a route, the app dynamically renders the bus cards. It reads the crowdLevel string and applies Tailwind CSS classes to switch the UI colors: amber for moderate, red for very crowded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
// Determining color coding for crowd levels
const crowd = bus.crowdLevel || 'Moderate';
const crowdIsVery = crowd.toLowerCase().includes('very');

// Inside the component return:
&amp;lt;div
  className={`flex items-center gap-1.5 text-sm font-bold px-3 py-1 rounded-full ${
    crowdIsVery ? 'bg-red-50 text-red-600' : 'bg-amber-50 text-amber-600'
  }`}
&amp;gt;
  &amp;lt;Users size={14} /&amp;gt;
  &amp;lt;span&amp;gt;{crowd}&amp;lt;/span&amp;gt;
&amp;lt;/div&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Result
&lt;/h3&gt;

&lt;p&gt;A hardware node captures the real world. A Python backend processes it concurrently with dual AI models. A React app visualizes it for the user in real time.&lt;/p&gt;

&lt;p&gt;This architecture successfully bridges physical IoT with cloud AI.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>fastapi</category>
      <category>iot</category>
      <category>python</category>
    </item>
    <item>
      <title>Building a Live Bus Tracker with ESP32-CAM, GPS, and Cellular Data (Part 1)</title>
      <dc:creator>Anhaj Uwaisulkarni</dc:creator>
      <pubDate>Sun, 15 Feb 2026 06:51:09 +0000</pubDate>
      <link>https://dev.to/anhaj0/building-a-live-bus-tracker-with-esp32-cam-gps-and-cellular-data-part-1-66c</link>
      <guid>https://dev.to/anhaj0/building-a-live-bus-tracker-with-esp32-cam-gps-and-cellular-data-part-1-66c</guid>
      <description>&lt;p&gt;Public transportation has a massive data problem. Commuters constantly face unpredictable arrival times and have no idea how crowded a bus is until it pulls up.&lt;/p&gt;

&lt;p&gt;I decided to fix this by building a self-contained IoT device for buses. It tracks live GPS coordinates and uses an onboard camera to capture the interior.&lt;/p&gt;

&lt;p&gt;This is Part 1 of my case study. I will break down the hardware node, the C++ firmware, and how I managed to reliably transmit image data over a 2G cellular network.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Hardware Stack
&lt;/h3&gt;

&lt;p&gt;The goal was to keep the unit low-cost but capable of handling network failovers and image processing.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ESP32-CAM:&lt;/strong&gt; The central brain of the operation. It handles the logic, captures the JPEG image, and manages the network connections. &lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NEO-6M GPS:&lt;/strong&gt; Connected via serial to constantly pull latitude, longitude, and speed data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SIM800L Module:&lt;/strong&gt; Handles the cellular data transmission. Getting images over a 2G connection is tough, but necessary for mobile transit tracking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LM2596 Buck Converter:&lt;/strong&gt; Steps down the noisy 12V/24V bus electrical supply to a clean 5V to keep the modules from frying.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Firmware: Solving Network Drops
&lt;/h3&gt;

&lt;p&gt;I wrote the C++ firmware to handle network failovers automatically. The ESP32 first attempts a WiFi connection (useful for debugging at the terminal). If that fails, it instantly restarts the SIM module and falls back to a GPRS cellular connection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
  // Network Failover Logic
  if (WiFi.status() == WL_CONNECTED) {
    Serial.println("\n✅ WiFi Connected!");
    activeClient = &amp;amp;wifiClient;
    connected = true;
  } else {
    Serial.println("\n❌ WiFi Failed. Trying SIM800L...");
    modem.restart();

    // Attempt GPRS connection with a 10-second timeout
    unsigned long gsmStart = millis();
    while (!modem.isGprsConnected() &amp;amp;&amp;amp; millis() - gsmStart &amp;lt; 10000) {
      modem.gprsConnect(apn, gprsUser, gprsPass);
    }
  }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The Firmware: Chunking Image Data
&lt;/h3&gt;

&lt;p&gt;The biggest headache in IoT development is memory management. The ESP32-CAM does not have the RAM to load a massive HTTP POST request into memory all at once.&lt;/p&gt;

&lt;p&gt;If you try to send the entire JPEG buffer and the HTTP headers in a single client.print() command, the board will crash and reboot.&lt;/p&gt;

&lt;p&gt;To solve this, I injected the GPS coordinates into custom HTTP headers, and then sent the actual image binary in strict 1024-byte chunks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
      // Sending the whole JPEG buffer in chunks to prevent crashes
      uint8_t *fbBuf = fb-&amp;gt;buf;
      size_t fbLen = fb-&amp;gt;len;
      size_t sent = 0;
      const size_t CHUNK_SIZE = 1024;

      while (sent &amp;lt; fbLen) {
        size_t toSend = CHUNK_SIZE;
        if (fbLen - sent &amp;lt; CHUNK_SIZE) {
          toSend = fbLen - sent;  // Last chunk
        }

        activeClient-&amp;gt;write(fbBuf + sent, toSend);
        sent += toSend;
        delay(50);  // Buffer breathing room
      }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This 50ms delay between chunks ensures the SIM800L module does not get overwhelmed and drop packets over the slow 2G network.&lt;/p&gt;
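&lt;p&gt;A quick back-of-envelope estimate shows what that pacing costs per frame. Both the 30 KB image size and the ~10 KB/s GPRS uplink rate are assumed figures, not measurements from the device:&lt;/p&gt;

```python
# Illustrative per-image upload estimate over 2G (assumed figures)
IMAGE_SIZE = 30 * 1024        # assumed 30 KB JPEG
CHUNK_SIZE = 1024             # bytes per write, matching the firmware
CHUNK_DELAY_S = 0.050         # the 50 ms pause between chunks
GPRS_RATE = 10 * 1024         # assumed ~10 KB/s sustained uplink

chunks = -(-IMAGE_SIZE // CHUNK_SIZE)    # ceiling division
pacing_s = chunks * CHUNK_DELAY_S        # time spent in delay(50)
wire_s = IMAGE_SIZE / GPRS_RATE          # time spent transmitting

print(f"{chunks} chunks, ~{pacing_s + wire_s:.1f} s per image")
# -> 30 chunks, ~4.5 s per image
```

&lt;p&gt;Under these assumptions the pacing adds about 1.5 seconds per frame, a fair price for not dropping packets on a 2G link.&lt;/p&gt;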

&lt;h3&gt;
  
  
  What's Next?
&lt;/h3&gt;

&lt;p&gt;Getting the hardware to reliably capture and transmit data from a moving vehicle is only half the battle.&lt;/p&gt;

&lt;p&gt;In Part 2, I will break down the cloud infrastructure. I will show how I deployed a Python backend to Hugging Face, used the Google Gemini API to analyze the images for crowd density, and synced it all in real-time to a React frontend.&lt;/p&gt;

</description>
      <category>iot</category>
      <category>esp32</category>
      <category>cpp</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
