<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Arnab Datta</title>
    <description>The latest articles on DEV Community by Arnab Datta (@arnab500th).</description>
    <link>https://dev.to/arnab500th</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3818372%2F3c17c00d-7efd-4aee-84f5-fe9658073d32.jpeg</url>
      <title>DEV Community: Arnab Datta</title>
      <link>https://dev.to/arnab500th</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arnab500th"/>
    <language>en</language>
    <item>
      <title>Rapid Interest Shifts in Recommender Systems: A Case Study on Instagram Reels</title>
      <dc:creator>Arnab Datta</dc:creator>
      <pubDate>Thu, 16 Apr 2026 08:09:43 +0000</pubDate>
      <link>https://dev.to/arnab500th/rapid-interest-shifts-in-recommender-systems-a-case-study-on-instagram-reels-1eh1</link>
      <guid>https://dev.to/arnab500th/rapid-interest-shifts-in-recommender-systems-a-case-study-on-instagram-reels-1eh1</guid>
      <description>&lt;h3&gt;
  
  
  A late-night experiment revealing how fast recommendation systems actually adapt
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;An informal, timestamped experiment showing how quickly Instagram's recommendation system adapts to new inputs — often within minutes — and what that reveals about modern recommender systems.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Key Observations (TL;DR)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Feed adaptation latency: &lt;strong&gt;~2 minutes&lt;/strong&gt; consistently across genres&lt;/li&gt;
&lt;li&gt;Subgenre-level clustering observed (not just category-level)&lt;/li&gt;
&lt;li&gt;Content classification appears independent of hashtags&lt;/li&gt;
&lt;li&gt;Cross-user candidate pool overlap observed&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flll4r61z5x9psuzyeowz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flll4r61z5x9psuzyeowz.png" alt="Time Line" width="800" height="1200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Accidental Beginning
&lt;/h2&gt;

&lt;p&gt;This wasn't supposed to be any glorified research. It started as harassment.&lt;/p&gt;

&lt;p&gt;I was chatting with a friend late at night and casually sent her a reel — a (golgappa) street food video, nothing special. Eight minutes later she sent one back. Then another. Then at 11:29 PM she messaged me: &lt;em&gt;her entire feed is just food now&lt;/em&gt;. Which, to be fair, was the objective.&lt;/p&gt;

&lt;p&gt;I laughed. Then I got curious. Then I got obsessive about it.&lt;/p&gt;

&lt;p&gt;What followed was a highly controlled two-hour experimental session (i.e., I spammed her with reels) across completely different genres — food, coding, anime, gaming, gym, Harry Potter — timing exactly how fast her feed shifted each time. She was a very willing participant (trust me).&lt;/p&gt;

&lt;p&gt;What I found was genuinely surprising, and it lines up in interesting ways with what we know about how Instagram's recommendation system actually works under the hood.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on methodology:&lt;/strong&gt; This wasn't a controlled lab experiment. We were actively chatting throughout, her app was in normal use the whole time, and I wasn't running any formal measurement tools. These are real timestamps from our chat logs. I'd call it a &lt;em&gt;naturalistic informal experiment&lt;/em&gt; — messy, but honest. I think that actually makes it more interesting.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The Friend Factor: Why Her Feed and Not Mine
&lt;/h2&gt;

&lt;p&gt;Before the timeline, one thing worth noting: &lt;strong&gt;the same reels I sent her barely affected my own feed.&lt;/strong&gt; My feed stayed mostly stable throughout the night. Hers was rewriting itself every few minutes.&lt;/p&gt;

&lt;p&gt;This asymmetry is the most interesting starting observation. We're both active users on old accounts. So why was she so much more "algorithmically reactive"?&lt;/p&gt;

&lt;p&gt;Toward the end of our (very productive) experiment she started sending me reels from her shifting feed — but almost none of it migrated to mine. The reels she sent me just didn't move the needle. Same input, completely different output.&lt;/p&gt;

&lt;p&gt;My hypothesis — which I'll come back to later — is what I'm calling &lt;strong&gt;low engagement inertia&lt;/strong&gt;. But first, the meticulously observed data.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Timeline
&lt;/h2&gt;

&lt;p&gt;Here's what actually happened, timestamped from our chat:&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1 — Food (10:56 PM to 11:29 PM)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;What I Did&lt;/th&gt;
&lt;th&gt;What Happened&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;10:56 PM&lt;/td&gt;
&lt;td&gt;Sent her a golgappa reel&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11:04 PM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She sends a golgappa reel back&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11:11 PM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She sends more golgappa reels&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11:12 PM&lt;/td&gt;
&lt;td&gt;Sent 2–3 more food reels&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11:20 PM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She starts sending food reels unprompted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11:29 PM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;em&gt;"Her full feed is just food"&lt;/em&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;33 minutes from first reel → complete feed takeover.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I felt like a god, able to manipulate a feed at will.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 2 — The Concept Test (12:12 AM to 12:33 AM)
&lt;/h3&gt;

&lt;p&gt;This is where things got interesting.&lt;/p&gt;

&lt;p&gt;At 12:12 AM I sent her a specific meme format — the kind where one person says &lt;em&gt;"if no one told you today, you're such a good mother"&lt;/em&gt; and then the account creator reacts with something like &lt;em&gt;"bro I became a mother just scrolling."&lt;/em&gt; Just a joke between us. Not a genre reel, no obvious category.&lt;/p&gt;

&lt;p&gt;At &lt;strong&gt;12:18 AM — six minutes later&lt;/strong&gt; — she sent me a reel with the exact same &lt;strong&gt;first half&lt;/strong&gt; — same video — but a completely different reaction from a different creator.&lt;/p&gt;

&lt;p&gt;The behavior suggests the system may be matching structural patterns in content, not just hashtags or genre labels. Two reels, different creators, same comedic template. That's a surprisingly granular level of content understanding.&lt;/p&gt;

&lt;p&gt;Then at 12:30 AM I sent a coding reel. By &lt;strong&gt;12:33 AM&lt;/strong&gt; she was sending me back multiple coding reels.&lt;/p&gt;

&lt;p&gt;Then something genuinely weird happened. Around the same time, a near-identical reel — far removed from coding — appeared on &lt;strong&gt;both&lt;/strong&gt; our feeds simultaneously: same creator, same video, but with different audio and a different concept.&lt;/p&gt;

&lt;p&gt;Well that was not supposed to happen.&lt;/p&gt;

&lt;p&gt;We'd apparently both been pulled from the same creator's content pool at the same moment — probably because of candidate generation overlap when two users get freshly classified into similar interest clusters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 3 — Pushing the Limits (1:01 AM to 1:17 AM)
&lt;/h3&gt;

&lt;p&gt;By now I had fully abandoned the pretense that this was a normal conversation and was just running (very productive) experiments on my willing participant's feed at 1 AM, to see how consistent these shifts could get and how much entirely new content would change her feed.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Time&lt;/th&gt;
&lt;th&gt;What I Did&lt;/th&gt;
&lt;th&gt;What Happened&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1:01 AM&lt;/td&gt;
&lt;td&gt;Sent Valorant reel&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:02 AM&lt;/td&gt;
&lt;td&gt;Sent &lt;em&gt;Your Lie in April&lt;/em&gt; anime reel&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:03 AM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She sends a PUBG reel (~2 min after gaming reel)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:03 AM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She sends &lt;em&gt;I Want to Eat Your Pancreas&lt;/em&gt; reel (same sad/romantic anime concept)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:03 AM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She sends a &lt;em&gt;Your Lie in April&lt;/em&gt; reel specifically — &lt;strong&gt;with my like on it&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:04 AM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;FF reel, then continuous anime + gaming content&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:08 AM&lt;/td&gt;
&lt;td&gt;Sent a gym reel (sub-100 likes — not viral)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:10 AM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She sends gym reel&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:15 AM&lt;/td&gt;
&lt;td&gt;Sent Harry Potter edit reel&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1:17 AM&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;She sends 2 Harry Potter reels&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The consistent pattern: &lt;strong&gt;~2 minutes from send to feed shift&lt;/strong&gt;, across completely unrelated genres, back to back. At this point, the system wasn’t reacting. It was anticipating.&lt;/p&gt;

&lt;p&gt;The anime response deserves special attention. I sent a &lt;em&gt;Your Lie in April&lt;/em&gt; reel. Within 2 minutes she received:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A different anime with the same emotional subgenre (sad, romantic)&lt;/li&gt;
&lt;li&gt;Then the exact same anime I sent&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is consistent with subgenre-level clustering — the system appears to track not just "anime" as a category, but the emotional and stylistic signature within it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The No-Hashtag Reel
&lt;/h3&gt;

&lt;p&gt;For the gym test I specifically chose a reel with under 100 likes — not viral, not trending, just a guy doing lunges in mediocre lighting. It worked anyway.&lt;/p&gt;

&lt;p&gt;One of the gaming reels that appeared on her feed after my Valorant send had &lt;strong&gt;zero hashtags&lt;/strong&gt;. No caption, no tags, no metadata hints whatsoever.&lt;/p&gt;

&lt;p&gt;The system still correctly categorized it as gaming content and served it within the 2-minute window. This is consistent with Instagram using visual and audio content analysis rather than relying solely on metadata — it wasn't reading the label. It was watching the video.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Happens: The Technical Explanation
&lt;/h2&gt;

&lt;p&gt;Based on publicly available information about how Meta's recommendation systems work, here's what's likely going on:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Pipeline
&lt;/h3&gt;

&lt;p&gt;Instagram's Reels recommendation isn't one model — it's a multi-stage pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Candidate Generation&lt;/strong&gt; — pulls a large pool of potential reels from followed accounts, trending content, and category clusters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;First-Stage Ranking&lt;/strong&gt; — a lightweight model scores candidates quickly (Instagram reportedly uses a Two Towers neural network here, which can cache embeddings efficiently)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Second-Stage Ranking&lt;/strong&gt; — a heavier multi-task model (MTML) predicts engagement probability for top candidates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reranking + Filters&lt;/strong&gt; — diversity rules, content moderation, eligibility checks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reels Chaining&lt;/strong&gt; — selects what plays &lt;em&gt;next&lt;/em&gt; to keep the session going within a content cluster&lt;/li&gt;
&lt;/ol&gt;
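&lt;p&gt;The funnel above can be sketched in a few lines of JavaScript. This is a toy illustration — the field names (&lt;code&gt;embedding&lt;/code&gt;, &lt;code&gt;predictedEngagement&lt;/code&gt;, &lt;code&gt;creator&lt;/code&gt;) and the scoring are my assumptions, not Instagram's actual code:&lt;/p&gt;

```javascript
// Toy multi-stage ranking funnel (illustrative only, not Instagram's code).
function dot(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

function recommend(candidates, userVec) {
  // Stage 2: cheap two-tower-style score, keep only the top candidates
  const firstPass = candidates
    .map(c => ({ ...c, score: dot(userVec, c.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 50);
  // Stage 3: heavier-model stand-in — reweight by an engagement estimate
  const secondPass = firstPass
    .map(c => ({ ...c, score: c.score * c.predictedEngagement }))
    .sort((a, b) => b.score - a.score);
  // Stage 4: trivial diversity rule — no two consecutive reels from one creator
  const out = [];
  for (const c of secondPass) {
    if (out.length === 0 || out[out.length - 1].creator !== c.creator) out.push(c);
  }
  return out;
}
```

&lt;p&gt;Even this toy shows the key property: the expensive model only ever sees a small, pre-filtered slice of the candidate pool, which is what makes per-session re-ranking cheap enough to run constantly.&lt;/p&gt;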

&lt;p&gt;The speed we observed (~2 minutes) suggests the interest profile update is happening in near real-time — what's called &lt;strong&gt;online learning&lt;/strong&gt;, where user interaction signals stream into the system without requiring a full model retrain.&lt;/p&gt;
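&lt;p&gt;A minimal sketch of what such an online update could look like — an exponential moving average nudging the user's interest vector toward each watched item's embedding. The form and the parameter &lt;code&gt;alpha&lt;/code&gt; are my assumptions, not Meta's documented implementation:&lt;/p&gt;

```javascript
// Hypothetical online interest update: move the user vector a fraction
// alpha of the way toward the embedding of the item just watched.
function updateInterestVector(userVec, itemVec, alpha) {
  return userVec.map((u, i) => (1 - alpha) * u + alpha * itemVec[i]);
}
```

&lt;p&gt;Repeated sends compound fast: with &lt;code&gt;alpha = 0.3&lt;/code&gt;, three back-to-back watches in a new genre move the vector roughly 66% of the way there (1 − 0.7³ ≈ 0.657) — which would be enough to explain a feed flipping within minutes.&lt;/p&gt;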

&lt;h3&gt;
  
  
  Why Content Gets Classified Without Hashtags
&lt;/h3&gt;

&lt;p&gt;Instagram's behavior is consistent with running &lt;strong&gt;computer vision and audio analysis&lt;/strong&gt; on every reel, independent of metadata. The system can identify objects, scenes, on-screen text, audio patterns, and visual style. A gaming reel with zero hashtags still has game UI on screen, specific audio, and recognizable visual patterns — enough to generate a content embedding without any metadata hints at all.&lt;/p&gt;
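&lt;p&gt;Conceptually, a multimodal classifier of this kind fuses per-modality features into a single content vector. A hypothetical sketch — the function and the concatenate-and-normalise scheme are illustrative stand-ins, not Instagram's architecture:&lt;/p&gt;

```javascript
// Hypothetical fusion of per-modality embeddings into one content vector.
// Real systems use learned fusion layers; concatenation is the simplest stand-in.
function fuseEmbeddings(visionVec, audioVec, textVec) {
  const v = [...visionVec, ...audioVec, ...textVec];
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map(x => x / norm); // L2-normalise so dot products act like cosine
}
```

&lt;p&gt;The point is that the text slot can be empty (no hashtags, no caption) and the vector is still well-defined — vision and audio alone carry enough signal to place the reel in a cluster.&lt;/p&gt;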

&lt;h3&gt;
  
  
  Why She Got the Same Anime Subgenre (Not Just "Anime")
&lt;/h3&gt;

&lt;p&gt;The Two Towers model doesn't classify content into broad buckets — it generates &lt;strong&gt;dense embedding vectors&lt;/strong&gt; that capture fine-grained stylistic and thematic features. &lt;em&gt;Your Lie in April&lt;/em&gt; and &lt;em&gt;I Want to Eat Your Pancreas&lt;/em&gt; likely sit close together in embedding space because they share visual aesthetics, pacing, color palette, and emotional tone — not just the genre tag "anime."&lt;/p&gt;

&lt;p&gt;When I sent her one, the system updated her interest vector toward that specific region of embedding space, and served content from the same neighborhood.&lt;/p&gt;
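&lt;p&gt;"Same neighborhood" here just means high cosine similarity between embedding vectors. A toy illustration with made-up 3-D vectors (real embeddings have hundreds of dimensions; the axis labels are invented):&lt;/p&gt;

```javascript
// Cosine similarity: 1 = same direction, 0 = unrelated.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Invented axes: (sad-romance, slice-of-life, action)
const yourLieInApril = [0.90, 0.10, 0.00];
const pancreas       = [0.85, 0.15, 0.05]; // I Want to Eat Your Pancreas
const shonenAction   = [0.10, 0.20, 0.95];
```

&lt;p&gt;&lt;code&gt;cosine(yourLieInApril, pancreas)&lt;/code&gt; comes out near 1, while the action vector scores far lower — so a nearest-neighbor lookup seeded by one sad-romance anime naturally returns the other.&lt;/p&gt;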

&lt;h3&gt;
  
  
  The Asymmetry: Why My Feed Didn't Change
&lt;/h3&gt;

&lt;p&gt;My feed has years of varied engagement history — diverse genres, lots of signals, a well-established taste graph. Shifting it requires overcoming accumulated inertia.&lt;/p&gt;

&lt;p&gt;Her feed, by contrast, appears to have a &lt;strong&gt;lower signal diversity&lt;/strong&gt; — not because she's a new or inactive user, but because her engagement pattern is cleaner and less fragmented. Each watch signal she sends is relatively uncontested, so new signals propagate faster.&lt;/p&gt;

&lt;p&gt;I'd call this &lt;strong&gt;low engagement inertia&lt;/strong&gt; — a state where the algorithm has a highly responsive, low-noise profile to work with. It's not a flaw in the system. It's the system working exactly as designed, just made visible.&lt;/p&gt;




&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;This was a single informal observation, not a controlled study. Some important caveats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single-user observation (n=1)&lt;/li&gt;
&lt;li&gt;No control over watch time, likes, or skips during the session&lt;/li&gt;
&lt;li&gt;Background app activity may have influenced results&lt;/li&gt;
&lt;li&gt;Feed state prior to the experiment was not fully quantified&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These results should be interpreted as exploratory rather than conclusive. The patterns are consistent with known system behavior, but cannot be treated as proof of specific mechanisms.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Actually Means
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Filter bubbles can shift in minutes, not days.&lt;/strong&gt; The popular narrative is that algorithmic filter bubbles form slowly over time. What we saw suggests that at least for users with low engagement inertia, a single session can substantially reorient the feed. That has real implications for how quickly someone can get pulled into a content rabbit hole.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The system understands content, not just categories.&lt;/strong&gt; The concept-level match at 12:18 AM and the no-hashtag gaming reel both point to the same thing: Instagram's content understanding appears to go well beyond keyword matching. Hashtags are a hint, not a requirement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intentional feed influence is surprisingly accessible.&lt;/strong&gt; I shifted her feed across six completely different genres in one night by simply sending her reels. Anyone could do this — a friend, a family member, or theoretically someone with less benign intent. The system has no apparent mechanism to distinguish "organic interest signal" from "someone else sent this to you."&lt;/p&gt;




&lt;p&gt;I started this by sending my friend a food reel as a bit. I ended it two hours later having documented six genre shifts, a concept-level meme match, and a hashtag-free classification. She has not forgiven me. Understandably.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;In the right conditions, your feed isn't a reflection of your interests — it's a reflection of your last 10 minutes.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.geeksforgeeks.org/how-instagram-reel-uses-recommender-systems/" rel="noopener noreferrer"&gt;How Instagram Reel Uses Recommender Systems — GeeksforGeeks&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://abhiverse01.medium.com/how-instagram-reels-recommendations-actually-work-behind-the-scenes-84e59eb7059e" rel="noopener noreferrer"&gt;How Instagram Reels Recommendations Actually Work Behind the Scenes — Abhishek Shah, Medium&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.accio.com/blog/how-does-instagram-reels-algorithm-work-a-superb-guide" rel="noopener noreferrer"&gt;How Does Instagram Reels Algorithm Work: A Superb Guide — Roy Nnalue&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Thanks to my friend — credited as "the unwilling test subject" at her request — for tolerating two hours of me hijacking her feed. This was originally her repayment for me harassing her algorithm.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Written at 4 AM while sleep-deprived. At no point did this feel like a bad idea to me, at least.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this interesting, I write about ML, computer science, and things I accidentally stumble into at &lt;a href="https://dev.to/arnab500th"&gt;@Arnab500th&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;#machinelearning&lt;/code&gt; &lt;code&gt;#ai&lt;/code&gt; &lt;code&gt;#datascience&lt;/code&gt; &lt;code&gt;#socialmedia&lt;/code&gt; &lt;code&gt;#technology&lt;/code&gt;&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>socialmedia</category>
      <category>ai</category>
      <category>datascience</category>
    </item>
    <item>
      <title>I Built a Chrome Extension to Bypass Spotify's Mini-Player Paywall (Because I'm a Pirate 🏴‍☠️)</title>
      <dc:creator>Arnab Datta</dc:creator>
      <pubDate>Sun, 29 Mar 2026 12:16:38 +0000</pubDate>
      <link>https://dev.to/arnab500th/i-built-a-chrome-extension-to-bypass-spotifys-mini-player-paywall-because-im-a-pirate--41e5</link>
      <guid>https://dev.to/arnab500th/i-built-a-chrome-extension-to-bypass-spotifys-mini-player-paywall-because-im-a-pirate--41e5</guid>
      <description>&lt;p&gt;I love listening to music while coding.&lt;/p&gt;

&lt;p&gt;Not background noise. Not lo-fi beats. Actual music — I want to see what's playing, change tracks without breaking flow, and keep the player somewhere visible so I can glance at it without switching tabs.&lt;/p&gt;

&lt;p&gt;Spotify Web has a mini-player. You probably know this. What you might also know is that &lt;strong&gt;it's locked behind Premium&lt;/strong&gt;. Free tier users get the full-page player or nothing. You can't make it small. You can't tuck it into a corner. Spotify decided that's a Premium feature.&lt;/p&gt;

&lt;p&gt;I disagreed.&lt;/p&gt;

&lt;p&gt;So I built my own.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem (In Case You've Never Hit This)
&lt;/h2&gt;

&lt;p&gt;Here's the workflow I wanted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spotify Web open in one tab, playing music&lt;/li&gt;
&lt;li&gt;A tiny floating player sitting in the corner of my screen&lt;/li&gt;
&lt;li&gt;Full controls — play, pause, skip, seek, volume — without switching tabs&lt;/li&gt;
&lt;li&gt;A Picture-in-Picture window I can drag to another monitor while I code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Spotify's answer: &lt;strong&gt;pay for Premium&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;My answer: build a Chrome Extension that injects its own mini-player directly onto the page.&lt;/p&gt;

&lt;p&gt;Will I submit it to the Chrome Web Store? Probably not — it's kind of piracy and I don't think they'd approve it. But who cares. I'm a pirate. 🏴‍☠️&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Spotify Float&lt;/strong&gt; is a Chrome Extension (Manifest V3) that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Injects a floating, draggable, resizable mini-player into &lt;code&gt;open.spotify.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Supports three modes: Full (art + info + controls), Compact, and Mini pill&lt;/li&gt;
&lt;li&gt;Has a real &lt;strong&gt;Document Picture-in-Picture&lt;/strong&gt; window — a separate always-on-top OS window showing the album art with controls on hover&lt;/li&gt;
&lt;li&gt;Works entirely on the &lt;strong&gt;free tier&lt;/strong&gt; — no Spotify API, no login, no data collection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's what it looks like:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rusyvox214km553qz6k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0rusyvox214km553qz6k.png" alt="Spotify Float Mini Player" width="800" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And how it works:&lt;br&gt;
  &lt;iframe src="https://www.youtube.com/embed/UGurwvHmOHs"&gt;
  &lt;/iframe&gt;
&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Actually Works
&lt;/h2&gt;

&lt;p&gt;The interesting part: this extension &lt;strong&gt;doesn't use the Spotify API at all&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Spotify's Web Player is a React app. All the playback state lives in the DOM — track titles, artist names, play state, progress — all exposed through &lt;code&gt;data-testid&lt;/code&gt; attributes and &lt;code&gt;aria-label&lt;/code&gt; values. So instead of going through any API, I just... read the DOM directly and simulate clicks.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Finding the play button — no API needed&lt;/span&gt;
&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;SELECTORS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;playPauseButton&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[data-testid="control-button-playpause"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button[aria-label="Play"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;button[aria-label="Pause"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;// ...more fallbacks&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every control has a priority-ordered list of selectors. The primary selector is &lt;code&gt;data-testid&lt;/code&gt; (the most stable), with CSS-class fallbacks for when Spotify updates its DOM.&lt;/p&gt;
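&lt;p&gt;A resolver over such a priority list is only a few lines. This is a sketch of the idea rather than the extension's exact helper; &lt;code&gt;root&lt;/code&gt; stands in for &lt;code&gt;document&lt;/code&gt; so the function is testable outside a browser:&lt;/p&gt;

```javascript
// Try each selector in priority order; return the first element that matches.
function resolveSelector(selectors, root) {
  for (const sel of selectors) {
    const el = root.querySelector(sel);
    if (el) return el;
  }
  return null; // nothing matched — Spotify's DOM may have changed again
}
```

&lt;p&gt;Returning &lt;code&gt;null&lt;/code&gt; instead of throwing matters here: when Spotify ships a redesign, the player should degrade gracefully (a dead button) rather than crash the whole content script.&lt;/p&gt;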

&lt;h3&gt;
  
  
  The Sync Loop
&lt;/h3&gt;

&lt;p&gt;Every 500ms (or 2000ms when paused), &lt;code&gt;syncNow()&lt;/code&gt; reads the current state from the DOM and updates the floating player:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;MutationObserver (DOM changes on the now-playing widget)
    ↓ debounce 200ms
    ↓
syncNow()
  ├── readText('trackTitle')
  ├── readText('artistName')
  ├── cachedResolve('albumArt')
  ├── calcProgress()  — 3-strategy fallback
  ├── playBtn aria-label → play state
  ├── shuffleBtn aria-label → shuffle state
  └── repeatBtn aria-label → repeat mode
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sync loop &lt;strong&gt;completely stops&lt;/strong&gt; when the mini-player is hidden — no background polling, no CPU waste.&lt;/p&gt;
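&lt;p&gt;That polling policy reduces to one pure function plus a self-rescheduling timer — a sketch of the behaviour described above, not the extension's exact code:&lt;/p&gt;

```javascript
// Polling policy: fast while playing, slow while paused, off while hidden.
function nextSyncDelay(state) {
  if (!state.visible) return null;   // hidden → stop the loop entirely
  return state.playing ? 500 : 2000; // ms until the next syncNow() call
}

// Self-rescheduling loop: re-reads the state before every tick, so a
// pause or hide takes effect at the next scheduling decision.
function scheduleSync(getState, syncNow) {
  const delay = nextSyncDelay(getState());
  if (delay === null) return;
  setTimeout(() => { syncNow(); scheduleSync(getState, syncNow); }, delay);
}
```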

&lt;h3&gt;
  
  
  Seeking Without an API
&lt;/h3&gt;

&lt;p&gt;Seeking was the trickiest part. Spotify's progress bar is a range input, but you can't just set &lt;code&gt;.value&lt;/code&gt; — React controls it and ignores direct assignment. The trick is using the native property setter:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleSeek&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pct&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;sl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;resolveSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;SELECTORS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;seekSlider&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;setter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getOwnPropertyDescriptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;HTMLInputElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;prototype&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;value&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="kd"&gt;set&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;setter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;pct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;parseFloat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;max&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;100&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="nx"&gt;sl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatchEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;bubbles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
    &lt;span class="nx"&gt;sl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatchEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;change&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;bubbles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This bypasses React's synthetic event system and fires a real native event that Spotify's player actually listens to.&lt;/p&gt;

&lt;h3&gt;
  
  
  Shadow DOM — Fully Isolated
&lt;/h3&gt;

&lt;p&gt;The floating player is built inside a Shadow DOM. This means Spotify's CSS can't bleed in and break the player's styles, and the player's styles can't accidentally affect Spotify's page. Complete isolation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createElement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;div&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;spotify-float-host&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;documentElement&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;appendChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;shadow&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attachShadow&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;open&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// Everything inside is fully encapsulated&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  The PiP Bug That Took Me a While to Crack
&lt;/h2&gt;

&lt;p&gt;The Document Picture-in-Picture API is relatively new (&lt;code&gt;window.documentPictureInPicture&lt;/code&gt;) and it does something unexpected: &lt;strong&gt;CSS &lt;code&gt;:hover&lt;/code&gt; doesn't work inside a PiP window&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The PiP window is a separate OS-level window with its own document. When your mouse is inside it, the main page's document doesn't receive hover events. So all the CSS I had like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="nf"&gt;#player&lt;/span&gt;&lt;span class="nd"&gt;:hover&lt;/span&gt; &lt;span class="nf"&gt;#ctrl&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;opacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nf"&gt;#player&lt;/span&gt;&lt;span class="nd"&gt;:hover&lt;/span&gt; &lt;span class="nf"&gt;#pw&lt;/span&gt;   &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;opacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;...never fired. The controls were stuck invisible and unclickable. Forever.&lt;/p&gt;

&lt;p&gt;The fix was to stop relying on CSS hover entirely and switch to JS events registered directly on the PiP window's document:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;pipWindow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mouseenter&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pipPlayer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;pipPlayer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;classList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pip-hovered&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nx"&gt;pipWindow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mouseleave&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pipPlayer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;pipPlayer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;classList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;pip-hovered&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then all the hover styles reference &lt;code&gt;.pip-hovered&lt;/code&gt; instead of &lt;code&gt;:hover&lt;/code&gt;. Simple fix once you know the root cause, but CSS &lt;code&gt;:hover&lt;/code&gt; silently not working across window contexts is not an obvious thing to debug.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;p&gt;No frameworks. No build tools. No npm packages at runtime.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Manifest V3&lt;/strong&gt; Chrome Extension&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vanilla JS&lt;/strong&gt; — single bundled IIFE in &lt;code&gt;content.js&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shadow DOM&lt;/strong&gt; for style encapsulation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document Picture-in-Picture API&lt;/strong&gt; for the PiP window&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;chrome.storage.local&lt;/code&gt;&lt;/strong&gt; for persisting position, size, mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node.js&lt;/strong&gt; (built-in &lt;code&gt;zlib&lt;/code&gt; only) for icon generation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The whole thing is five files that Chrome loads directly. No webpack, no transpilation, no dependencies.&lt;/p&gt;

&lt;p&gt;One thing worth noting: Chrome MV3 content scripts &lt;strong&gt;cannot use ES module &lt;code&gt;import&lt;/code&gt;/&lt;code&gt;export&lt;/code&gt;&lt;/strong&gt; without some manifest workarounds that introduce their own issues. So the entire UI, selector system, and logic are bundled into one self-contained IIFE. If you're building a Chrome Extension and hitting &lt;code&gt;Uncaught SyntaxError: Cannot use import statement outside a module&lt;/code&gt; — that's why.&lt;/p&gt;




&lt;h2&gt;
  
  
  Features at a Glance
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What&lt;/th&gt;
&lt;th&gt;How&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Play / Pause / Skip / Shuffle / Repeat&lt;/td&gt;
&lt;td&gt;DOM click simulation with retry backoff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Seek&lt;/td&gt;
&lt;td&gt;Native range input setter + bubbling events&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Volume&lt;/td&gt;
&lt;td&gt;Same native setter technique&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Drag&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;mousedown&lt;/code&gt; on handle → &lt;code&gt;mousemove&lt;/code&gt; on document, viewport-clamped&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Resize&lt;/td&gt;
&lt;td&gt;SE corner handle, 200–500px × 120–700px&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PiP&lt;/td&gt;
&lt;td&gt;&lt;code&gt;window.documentPictureInPicture.requestWindow()&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Persistence&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;chrome.storage.local&lt;/code&gt; — position, size, mode, visibility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Keyboard&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Space&lt;/code&gt;, &lt;code&gt;Ctrl+→&lt;/code&gt;, &lt;code&gt;Ctrl+←&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Get It
&lt;/h2&gt;

&lt;p&gt;The extension is open source on GitHub:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/Arnab500th/Spotify-miniplayer-chrome-extension-By-pass-premium-pay-walls" rel="noopener noreferrer"&gt;github.com/Arnab500th/Spotify-miniplayer-chrome-extension-By-pass-premium-pay-walls&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Download the ZIP from the &lt;a href="https://github.com/Arnab500th/Spotify-miniplayer-chrome-extension-By-pass-premium-pay-walls/releases/latest" rel="noopener noreferrer"&gt;Releases page&lt;/a&gt;, unzip it, go to &lt;code&gt;chrome://extensions&lt;/code&gt;, enable Developer Mode, click Load unpacked, select the folder. Done.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Better selector recovery when Spotify does a major redesign&lt;/li&gt;
&lt;li&gt;Maybe a volume keyboard shortcut&lt;/li&gt;
&lt;li&gt;Possibly a lyrics overlay if I can figure out a non-API approach&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And yes, I know the repo name is a bit on the nose. But it's accurate. 🏴‍☠️&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Not affiliated with Spotify AB. This is a personal project built for fun and learning. Use it responsibly.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>chromeextension</category>
      <category>spotify</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How We Built an AI Littering Detection System in 4 Days — and Won 2nd Place</title>
      <dc:creator>Arnab Datta</dc:creator>
      <pubDate>Tue, 24 Mar 2026 13:39:15 +0000</pubDate>
      <link>https://dev.to/arnab500th/how-we-built-an-ai-littering-detection-system-in-4-days-and-won-2nd-place-1d3e</link>
      <guid>https://dev.to/arnab500th/how-we-built-an-ai-littering-detection-system-in-4-days-and-won-2nd-place-1d3e</guid>
      <description>&lt;p&gt;We had 4 days, one laptop with an RTX 2050, and a problem nobody on our team had fully solved before. This is the story of building TRACE — and everything that broke along the way.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is TRACE?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;TRACE (Trash Recognition and Automated Civic Enforcement)&lt;/strong&gt; is a real-time AI surveillance pipeline that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Detects littering events across multiple live camera feeds&lt;/li&gt;
&lt;li&gt;Confirms offender identity using a 5-state behaviour machine&lt;/li&gt;
&lt;li&gt;Reads license plates via OCR for vehicle offenders&lt;/li&gt;
&lt;li&gt;Routes WhatsApp alerts with evidence snapshots to the &lt;strong&gt;nearest municipality ward office&lt;/strong&gt; using GPS distance&lt;/li&gt;
&lt;li&gt;Streams live annotated video to a real-time analytics dashboard&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stack: Python · YOLOv8 · ByteTrack · EasyOCR · FastAPI · SQLite · Twilio · OpenCV · HTML/CSS/JS&lt;/p&gt;

&lt;p&gt;We won &lt;strong&gt;🥈 2nd place at NextGenHack 2026&lt;/strong&gt;. This was my first ever hackathon, first semester of college, competing against seniors.&lt;/p&gt;

&lt;p&gt;Here's what actually happened during those 4 days.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 1: Detecting &lt;em&gt;Behaviour&lt;/em&gt;, Not Just Objects
&lt;/h2&gt;

&lt;p&gt;The obvious approach — detect trash, flag it — doesn't work. Trash appears in a frame for a lot of reasons that aren't littering. Someone carrying a bag. A bin. A parked vehicle with litter near it. You'd get false alerts constantly.&lt;/p&gt;

&lt;p&gt;We looked at &lt;strong&gt;Human Action Recognition (HAR) models&lt;/strong&gt; first. The idea was to classify the action — "person dropping object" — directly. But every model we tested was either too slow for real-time inference on our hardware, trained on datasets that didn't cover littering specifically, or produced too many false positives on adjacent actions like "person placing object on surface."&lt;/p&gt;

&lt;p&gt;No perfect fit existed. So I designed something from scratch.&lt;/p&gt;

&lt;h3&gt;
  
  
  The 5-State Machine
&lt;/h3&gt;

&lt;p&gt;Every tracked trash object moves through states independently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;UNKNOWN → CARRYING → SEPARATION → STATIONARY → ALERTED
                                ↘ CANCELLED (owner returns)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;UNKNOWN&lt;/strong&gt;: Trash first appears. Looking for the nearest suspect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CARRYING&lt;/strong&gt;: Suspect within 150px of the object — assumed being carried.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SEPARATION&lt;/strong&gt;: Suspect has moved away. Timer starts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;STATIONARY&lt;/strong&gt;: Object hasn't moved more than 15px in 30+ frames since separation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ALERTED&lt;/strong&gt;: Suspect is beyond 200px — confirmed littering. Evidence captured, alert dispatched.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CANCELLED&lt;/strong&gt;: Owner identified by ByteTrack ID returns — false alarm cleared.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key detail: owner identity is verified using &lt;strong&gt;ByteTrack track IDs&lt;/strong&gt;, not just position. A different person walking near a stationary object doesn't cancel the alert. Without this, any passerby would reset the timer.&lt;/p&gt;

&lt;p&gt;This took hours of whiteboarding. Getting the transition logic right — especially the cancellation paths — was harder than the model training.&lt;/p&gt;
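&lt;p&gt;The real implementation isn't in this post, but the transition logic can be sketched roughly like this. Class and parameter names here are mine, not the project's; the thresholds (150px carry, 200px abandon, 15px over 30 frames for stationary) are the ones quoted above:&lt;/p&gt;

```python
# Rough sketch of the 5-state machine. Names are illustrative; thresholds
# come from the post (150px carry, 200px abandon, 15px/30-frame stationary).
CARRY_DIST, ABANDON_DIST = 150, 200
STATIONARY_MOVE, STATIONARY_FRAMES = 15, 30

class TrashTrack:
    def __init__(self, track_id):
        self.track_id = track_id
        self.state = "UNKNOWN"
        self.owner_id = None        # ByteTrack ID of the suspect, once assigned
        self.still_frames = 0

    def update(self, dist_to_owner, moved_px, nearest_person_id, nearest_dist):
        if self.state == "UNKNOWN" and nearest_dist is not None and CARRY_DIST >= nearest_dist:
            # Trash just appeared near someone: assume they are carrying it.
            self.owner_id, self.state = nearest_person_id, "CARRYING"
        elif self.state == "CARRYING" and dist_to_owner is not None and dist_to_owner > CARRY_DIST:
            # Owner walked away from the object: start the stationary timer.
            self.state, self.still_frames = "SEPARATION", 0
        elif self.state == "SEPARATION":
            self.still_frames = self.still_frames + 1 if STATIONARY_MOVE >= moved_px else 0
            if self.still_frames >= STATIONARY_FRAMES:
                self.state = "STATIONARY"
        elif self.state == "STATIONARY":
            # Only the original owner's track ID can cancel; a passerby cannot.
            if nearest_person_id == self.owner_id and nearest_dist is not None and CARRY_DIST >= nearest_dist:
                self.state = "CANCELLED"
            elif dist_to_owner is None or dist_to_owner > ABANDON_DIST:
                self.state = "ALERTED"
        return self.state
```

&lt;p&gt;The &lt;code&gt;STATIONARY&lt;/code&gt; branch is where the ByteTrack ID check lives: the cancel path fires only for the recorded owner ID.&lt;/p&gt;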




&lt;h2&gt;
  
  
  Problem 2: Single Camera to Multi-Camera
&lt;/h2&gt;

&lt;p&gt;Getting one camera working was straightforward. Getting three to run simultaneously without everything collapsing was a different problem entirely.&lt;/p&gt;

&lt;p&gt;The issues hit in layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Threading&lt;/strong&gt;: Each camera needs its own detection loop. Python's GIL means threads alone can't give true parallelism for CPU-bound work; inference, though, is largely GPU-bound, and the GIL is released while kernels run, so threads are enough here. We moved to one thread per camera, each with its own model instances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shared state&lt;/strong&gt;: ByteTrack maintains tracking state across frames. If two cameras share a tracker, their track IDs collide and the state machine breaks completely. Solution: each camera thread gets its own ByteTrack instance. No sharing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MJPEG streaming&lt;/strong&gt;: The dashboard needs live video. Naive implementation — encode frame, POST to backend, serve — blocks the detection loop and tanks FPS. We decoupled it: a separate sender thread reads from a shared frame buffer and POSTs independently. The detection loop writes one frame to the buffer (microseconds) and moves on. If the backend is slow, the sender skips to the latest frame. Detection runs at full GPU speed regardless.&lt;/p&gt;
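&lt;p&gt;The pattern is worth spelling out, because it applies to any fast producer feeding a slow consumer. A minimal sketch of the latest-frame buffer (names are mine, not the project's):&lt;/p&gt;

```python
import threading

class FrameBuffer:
    """Holds only the newest frame; the writer never waits on the network."""
    def __init__(self):
        self._lock = threading.Lock()
        self._frame, self._seq = None, 0

    def put(self, frame):
        # Called by the detection loop: just a locked assignment (microseconds).
        with self._lock:
            self._frame, self._seq = frame, self._seq + 1

    def get_latest(self, last_seq):
        # Called by the sender; returns (None, last_seq) if nothing new arrived.
        with self._lock:
            if self._seq == last_seq:
                return None, last_seq
            return self._frame, self._seq

def sender_loop(buf, post_fn, stop_event):
    """Posts whatever frame is newest; a slow backend just means skipped frames."""
    seq = 0
    while not stop_event.is_set():
        frame, new_seq = buf.get_latest(seq)
        if frame is not None:
            post_fn(frame)        # e.g. an HTTP POST to the dashboard backend
            seq = new_seq
        stop_event.wait(0.03)     # roughly a 30 fps upper bound on the sender
```

&lt;p&gt;If the backend stalls, frames simply pile up as overwrites in the buffer and the sender resumes from the newest one; detection FPS never notices.&lt;/p&gt;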




&lt;h2&gt;
  
  
  Problem 3: Round 2 Surprise — Add Geofencing
&lt;/h2&gt;

&lt;p&gt;Midway through the hackathon, the judges told us to add geofencing. New requirement, mid-build.&lt;/p&gt;

&lt;p&gt;The goal: instead of hardcoding a phone number per camera, alerts should automatically route to the &lt;strong&gt;nearest municipality ward office&lt;/strong&gt; based on the camera's GPS coordinates.&lt;/p&gt;

&lt;p&gt;My first instinct was Euclidean distance — just subtract the coordinates. That's wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1 degree of latitude ≈ 111 km&lt;/strong&gt;, and a degree of longitude shrinks as you move away from the equator. Raw degree subtraction treats coordinates as flat 2D points, which gives distorted distances at any real-world scale: a camera 200 metres from an office can rank farther than one 2 km away, depending on which direction you measure.&lt;/p&gt;

&lt;p&gt;The correct formula is &lt;strong&gt;Haversine&lt;/strong&gt;, which accounts for the Earth's curvature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;haversine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lng1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lat2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lng2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;R&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;6_371_000&lt;/span&gt;  &lt;span class="c1"&gt;# Earth radius in metres
&lt;/span&gt;    &lt;span class="n"&gt;phi1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phi2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dphi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lat2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;lat1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;dlambda&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;radians&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lng2&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;lng1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dphi&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phi1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phi2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dlambda&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;R&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;atan2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now &lt;code&gt;nearest_office(cam_lat, cam_lng)&lt;/code&gt; iterates through every entry in &lt;code&gt;MUNICIPALITY_OFFICES&lt;/code&gt;, computes Haversine distance, and returns the closest one. Adding a new ward office requires one dict entry in config. No camera config changes needed — routing updates automatically.&lt;/p&gt;
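&lt;p&gt;For completeness, here's roughly what that lookup looks like. The &lt;code&gt;MUNICIPALITY_OFFICES&lt;/code&gt; entries below are made-up placeholders, not the real config:&lt;/p&gt;

```python
import math

def haversine(lat1, lng1, lat2, lng2):
    # Same formula as above: great-circle distance in metres.
    R = 6_371_000
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return R * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))

# Placeholder coordinates; real entries would also carry a WhatsApp number.
MUNICIPALITY_OFFICES = {
    "ward_12": {"lat": 22.5726, "lng": 88.3639},
    "ward_15": {"lat": 22.6012, "lng": 88.4120},
}

def nearest_office(cam_lat, cam_lng):
    """Return the (name, record) of the office closest to the camera."""
    best_name, best_dist = None, float("inf")
    for name, office in MUNICIPALITY_OFFICES.items():
        d = haversine(cam_lat, cam_lng, office["lat"], office["lng"])
        if best_dist > d:
            best_name, best_dist = name, d
    return best_name, MUNICIPALITY_OFFICES[best_name]
```

&lt;p&gt;A linear scan is fine at municipal scale; with thousands of offices you'd reach for a spatial index instead.&lt;/p&gt;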

&lt;p&gt;We also added &lt;strong&gt;high sensitivity zones&lt;/strong&gt; — schools, stations, heritage sites — where cameras within a defined radius never drop below MEDIUM priority surveillance.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 4: The GPU Was Choking
&lt;/h2&gt;

&lt;p&gt;More cameras meant the GPU was hitting its ceiling. On an RTX 2050, we could run 4 cameras at full inference before FPS started dropping hard.&lt;/p&gt;

&lt;p&gt;I looked at standard rate-control approaches:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Token bucket&lt;/strong&gt;: Solves contention between producers sharing one resource. But each camera owns its own thread and model instances — there's no shared queue to arbitrate. Doesn't fit.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frame differencing&lt;/strong&gt;: Gates inference on pixel-change detection. Sounds good, but lighting changes, wind, insects — all produce false triggers. Also creates irregular frame gaps that ByteTrack's &lt;code&gt;persist=True&lt;/code&gt; wasn't designed for, breaking track continuity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'd actually had simple frame skipping in an earlier version — run detection every Nth frame regardless of what's happening. We scrapped it because it broke tracking. ByteTrack needs consistent temporal input to maintain IDs reliably.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Dynamic Priority System
&lt;/h3&gt;

&lt;p&gt;The insight: most cameras are idle most of the time. A camera pointed at an empty street at 2am doesn't need the same inference rate as one that just detected a littering event.&lt;/p&gt;

&lt;p&gt;Each camera thread tracks time since its last confirmed trash detection and assigns itself a priority:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_camera_skip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_trash_time&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;PRIORITY_HIGH_WINDOW&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="c1"&gt;# 5 seconds
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;PRIORITY_HIGH_SKIP&lt;/span&gt;         &lt;span class="c1"&gt;# skip=1, every frame
&lt;/span&gt;    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;PRIORITY_MEDIUM_WINDOW&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# 30 seconds
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;PRIORITY_MEDIUM_SKIP&lt;/span&gt;       &lt;span class="c1"&gt;# skip=5
&lt;/span&gt;    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;PRIORITY_LOW_SKIP&lt;/span&gt;          &lt;span class="c1"&gt;# skip=8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key design decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trash model runs every frame regardless&lt;/strong&gt; — only person detection is skipped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cameras start at LOW automatically&lt;/strong&gt; — &lt;code&gt;last_trash_time=0.0&lt;/code&gt; means elapsed ≈ 1.7 billion seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skipped frames reuse &lt;code&gt;last_known_persons&lt;/code&gt; cache&lt;/strong&gt; — ByteTrack state is preserved between detection frames&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Priority transitions POST to backend only on change&lt;/strong&gt; — not every frame&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: went from 4 cameras at full load to &lt;strong&gt;6-9 cameras&lt;/strong&gt; on the same RTX 2050.&lt;/p&gt;

&lt;p&gt;The difference from the old frame skipping: this version is &lt;em&gt;activity-aware&lt;/em&gt;. It doesn't skip blindly on a fixed schedule — it skips based on what's actually happening in the scene. A camera that just detected a littering event immediately jumps to HIGH (every frame) for 5 seconds. An idle camera at LOW still runs the trash model every frame, just not person detection.&lt;/p&gt;
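&lt;p&gt;Putting the pieces together, the per-frame flow looks roughly like this. Function and field names are illustrative, not the project's actual API, and &lt;code&gt;get_camera_skip&lt;/code&gt; is restated from above so the snippet runs standalone:&lt;/p&gt;

```python
import time

PRIORITY_HIGH_WINDOW, PRIORITY_MEDIUM_WINDOW = 5, 30          # seconds
PRIORITY_HIGH_SKIP, PRIORITY_MEDIUM_SKIP, PRIORITY_LOW_SKIP = 1, 5, 8

def get_camera_skip(ctx):
    elapsed = time.time() - ctx.last_trash_time
    if PRIORITY_HIGH_WINDOW > elapsed:
        return PRIORITY_HIGH_SKIP
    if PRIORITY_MEDIUM_WINDOW > elapsed:
        return PRIORITY_MEDIUM_SKIP
    return PRIORITY_LOW_SKIP

class CameraCtx:
    def __init__(self):
        self.last_trash_time = 0.0     # epoch 0, so cameras boot at LOW
        self.last_known_persons = []   # cache reused on skipped frames

def process_frame(ctx, frame, frame_idx, trash_model, person_model):
    trash = trash_model(frame)                        # trash runs every frame
    if frame_idx % get_camera_skip(ctx) == 0:
        ctx.last_known_persons = person_model(frame)  # fresh person detections
    if trash:
        ctx.last_trash_time = time.time()             # bumps this camera to HIGH
    return trash, ctx.last_known_persons
```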




&lt;h2&gt;
  
  
  The Dashboard Problem (JS with No JS Experience)
&lt;/h2&gt;

&lt;p&gt;None of our team were JavaScript developers. The dashboard needed to be live, multi-camera, handle MJPEG streams, update charts every 5 seconds, and look presentable to judges.&lt;/p&gt;

&lt;p&gt;We deliberately chose &lt;strong&gt;plain HTML/CSS/JS&lt;/strong&gt; — no React, no build step, no npm. Zero risk of build failures mid-demo. It opens directly in any browser and polls the FastAPI backend every 5 seconds.&lt;/p&gt;

&lt;p&gt;Chart.js for the graphs. Native &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tags for MJPEG streams — the browser handles multipart decode natively, no JS required. &lt;code&gt;fetch()&lt;/code&gt; for everything else. It works. It held up through the entire demo.&lt;/p&gt;




&lt;h2&gt;
  
  
  What We Shipped
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Multi-camera real-time detection (threaded, one worker per camera)&lt;/li&gt;
&lt;li&gt;5-state littering behaviour machine with ByteTrack ID-based owner verification&lt;/li&gt;
&lt;li&gt;EasyOCR license plate recognition with Indian format validation&lt;/li&gt;
&lt;li&gt;Haversine geofencing — nearest ward office routing&lt;/li&gt;
&lt;li&gt;Dynamic HIGH/MEDIUM/LOW priority inference system&lt;/li&gt;
&lt;li&gt;imgbb snapshot upload → Twilio WhatsApp alert with zone label&lt;/li&gt;
&lt;li&gt;FastAPI backend, SQLite, MJPEG streaming&lt;/li&gt;
&lt;li&gt;Live dashboard with priority badges and zone sensitivity indicators&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Model performance:&lt;/strong&gt; YOLOv8s fine-tuned on TACO dataset, mAP50 = 0.81&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;The state machine thresholds (150px carry distance, 200px abandon distance) were tuned empirically on test videos. They work, but they're pixel-based — which means they're resolution and camera-angle dependent. A proper implementation would normalize by estimated person height in frame.&lt;/p&gt;
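&lt;p&gt;One way to do that normalization, with factors I haven't tuned and show only to make the idea concrete:&lt;/p&gt;

```python
# Hypothetical fix: express thresholds as multiples of the suspect's
# bounding-box height instead of raw pixels, so they survive changes in
# resolution and camera distance. Factors are illustrative, not tuned.
CARRY_FACTOR = 0.85     # roughly 150px for a ~175px-tall detection
ABANDON_FACTOR = 1.15   # roughly 200px at the same scale

def carry_threshold_px(person_box_height_px):
    return CARRY_FACTOR * person_box_height_px

def abandon_threshold_px(person_box_height_px):
    return ABANDON_FACTOR * person_box_height_px
```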

&lt;p&gt;The &lt;code&gt;seen_trash_ids&lt;/code&gt; set that tracks confirmed events is never pruned. Over a long session it grows indefinitely. Simple fix with a timestamp-based TTL, just didn't make the hackathon cut.&lt;/p&gt;
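&lt;p&gt;That TTL fix is only a few lines: swap the set for a dict of timestamps. The TTL value below is a guess, not something we measured:&lt;/p&gt;

```python
import time

TRASH_ID_TTL = 600.0   # seconds; illustrative value, not from the project

# Replace the unbounded set with {track_id: last_seen_timestamp}.
seen_trash_ids = {}

def mark_seen(track_id, now=None):
    seen_trash_ids[track_id] = time.time() if now is None else now

def prune_seen(now=None):
    # Drop entries that haven't been seen within the TTL window.
    now = time.time() if now is None else now
    stale = [tid for tid, ts in seen_trash_ids.items() if now - ts > TRASH_ID_TTL]
    for tid in stale:
        del seen_trash_ids[tid]
```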

&lt;p&gt;Frame differencing as a &lt;em&gt;complement&lt;/em&gt; to the priority system — gating the trash model on truly static scenes — would be the next meaningful optimization. The priority system handles person detection well. The trash model still runs every frame regardless.&lt;/p&gt;




&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/Arnab500th/Hackathon-Automated-Littering-Detection-and-Alert-System-for-Public-Spaces" rel="noopener noreferrer"&gt;github.com/Arnab500th/Hackathon-Automated-Littering-Detection-and-Alert-System-for-Public-Spaces&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Demo&lt;/strong&gt; (Round 1 prototype showcase): &lt;a href="https://youtu.be/U9AvOBRZ0JI" rel="noopener noreferrer"&gt;youtu.be/U9AvOBRZ0JI&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Images&lt;/strong&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0raps1ale0vntn7gm232.jpg" alt="Sample Person detected Snapshot"&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Flnlmjigr42sj4x0ebcrc.png" alt="Dashboard"&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupqe7orbwht1dr3if7gz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fupqe7orbwht1dr3if7gz.png" alt="Live Feed"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;First hackathon. First semester. If you're a first-year student reading this wondering whether to enter one — just enter it.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>computervision</category>
      <category>hackathon</category>
      <category>machinelearning</category>
      <category>python</category>
    </item>
  </channel>
</rss>
