<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vishwa Kumaresh</title>
    <description>The latest articles on DEV Community by Vishwa Kumaresh (@jackbright).</description>
    <link>https://dev.to/jackbright</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F945329%2Fad6fc644-69b6-4cb1-9653-1924057323ae.jpeg</url>
      <title>DEV Community: Vishwa Kumaresh</title>
      <link>https://dev.to/jackbright</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jackbright"/>
    <language>en</language>
    <item>
      <title>MeraSociety — I Turned My Apartment Society's WhatsApp Chaos into a Real App</title>
      <dc:creator>Vishwa Kumaresh</dc:creator>
      <pubDate>Mon, 02 Mar 2026 02:19:56 +0000</pubDate>
      <link>https://dev.to/jackbright/merasociety-i-turned-my-apartment-societys-whatsapp-chaos-into-a-real-app-4o85</link>
      <guid>https://dev.to/jackbright/merasociety-i-turned-my-apartment-societys-whatsapp-chaos-into-a-real-app-4o85</guid>
      <description>&lt;p&gt;&lt;em&gt;This is a submission for the &lt;a href="https://dev.to/challenges/weekend-2026-02-28"&gt;DEV Weekend Challenge: Community&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  The Community
&lt;/h1&gt;

&lt;p&gt;I live in an apartment society in Bengaluru, India. &lt;strong&gt;500 families. Multiple WhatsApp groups. Pure chaos.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is not a hypothetical. Here's proof — straight from my mom's phone:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0g0mnmthah2gl3m0qvec.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0g0mnmthah2gl3m0qvec.png" alt="Buy and Sell Group"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv0f1rzpkzekubl8223m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbv0f1rzpkzekubl8223m.png" alt="Announcements"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb940wv9jifi7vxzcg2hb.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb940wv9jifi7vxzcg2hb.jpeg" alt="Security Phone"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The daily reality:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;The Problem&lt;/th&gt;
&lt;th&gt;What Actually Happens&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Lost announcements&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Committee posts important notice → buried under "👍" and "ok noted" within the hour&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security calls&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Guard's only workflow: call residents at 2 AM with &lt;em&gt;"Sir, koi Ramesh Kumar aaya hai…"&lt;/em&gt; ("Sir, someone called Ramesh Kumar is here…"), because your in-laws arrived a day early&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Buy/sell nobody sees&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"Selling Samsung washing machine, 8000 rs" — scrolled past by 150 people, seen by 3. Who knows if someone sent a private DM??&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Court booking wars&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Four people claim they booked badminton at 6 PM. Nobody has proof. Loudest person wins.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Home food, buried&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Neighbor's incredible biryani offer drowns between a parking complaint and a meme&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Almost every apartment society in India with 50–500 flats runs on WhatsApp groups&lt;/strong&gt; that were never designed for any of this. Tens of millions of families deal with this daily.&lt;/p&gt;

&lt;p&gt;So I built the thing we actually need.&lt;/p&gt;




&lt;h1&gt;
  
  
  What I Built
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;MeraSociety&lt;/strong&gt; is a private, verified community platform for apartment societies — one app that replaces the group chaos with structured workflows, AI agents, and real-time collaboration.&lt;/p&gt;

&lt;p&gt;The core insight: people don't want to fill out forms. They want to type the same messy way they do in WhatsApp and have the system figure it out. So every major feature is powered by AI that bridges that gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Overview — Every Pain Has a Proper Fix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;The WhatsApp Pain&lt;/th&gt;
&lt;th&gt;The MeraSociety Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Announcements&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Buried in 20 minutes&lt;/td&gt;
&lt;td&gt;Pinned, priority-tagged, seen tracking + threaded comments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Time Chat&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One giant noisy group&lt;/td&gt;
&lt;td&gt;Multiple topic channels (General, Buy &amp;amp; Sell, Services, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bazaar Marketplace&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Listings vanish in minutes&lt;/td&gt;
&lt;td&gt;AI extracts structured listings from messy text; AI matches buyers to sellers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security Passes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Guard calls you at 2 AM&lt;/td&gt;
&lt;td&gt;QR-coded digital passes — guest shows code at gate, zero phone calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sports Booking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"I messaged first!" fights&lt;/td&gt;
&lt;td&gt;Slot grid with a database-enforced fairness cap (2 hours per flat per court per day)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Court Booking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manual slot hunting&lt;/td&gt;
&lt;td&gt;"Book badminton tomorrow evening" → confirmed booking in 2 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chat-to-Listing AI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Selling messages get lost&lt;/td&gt;
&lt;td&gt;AI detects listings in chat → one-click post to Bazaar&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Composer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Admins can't write notices&lt;/td&gt;
&lt;td&gt;Rough notes → polished English + Hindi announcement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP Translation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Half the society reads Hindi&lt;/td&gt;
&lt;td&gt;Model Context Protocol server for English↔Hindi translation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Admin Dashboard&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Volunteer admins overwhelmed&lt;/td&gt;
&lt;td&gt;Member management, approvals, full audit trail&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  🏠 Dashboard — Your Society at a Glance
&lt;/h2&gt;

&lt;p&gt;Member count, active listings, pending security passes, today's bookings — all at a glance. Quick action buttons for the four things people do most. A live activity feed shows what's happening in your society right now.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6upza49i0kdk8wm8ks8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6upza49i0kdk8wm8ks8.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  📢 Announcements — Notices That Don't Disappear
&lt;/h2&gt;

&lt;p&gt;Water tank cleaning notice? Pinned at the top, marked &lt;strong&gt;urgent&lt;/strong&gt; with a red badge. Each notice tracks threaded comments and a seen count, so the admin knows who hasn't read it. Filter by all, pinned, or urgent. Still there weeks later.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdh9e4jl17yqn2fcf42ba.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdh9e4jl17yqn2fcf42ba.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl81bzmrsd3zvl9wfocrx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl81bzmrsd3zvl9wfocrx.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  💬 Real-Time Chat — Multiple Channels, Not One Firehose
&lt;/h2&gt;

&lt;p&gt;Six topic channels in my society: General, Buy &amp;amp; Sell, Services, Food Corner, Sports, Maintenance. Every message shows sender's photo, name, and flat number. Mobile-responsive with a slide-out channel list.&lt;/p&gt;

&lt;p&gt;But chat isn't just chat — it's where the AI agents live.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fji8xl8gm8u2cn6b1xxzx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fji8xl8gm8u2cn6b1xxzx.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🛒 Bazaar — A Marketplace That Understands WhatsApp
&lt;/h2&gt;

&lt;p&gt;Nobody writes structured listings. They type:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Selling my 2 year old Samsung washing machine, 7kg, works perfectly. 8000 rs. DM me flat B-302"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;AI extraction&lt;/em&gt; turns that into:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Samsung 7kg Washing Machine"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"buy_sell"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"electronics"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"washing-machine"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"samsung"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"condition"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"good"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;One click → listing is live. AI matching on the buyer side: "looking for a washing machine under 10k" → relevance-scored results with reasons.&lt;/p&gt;
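&lt;p&gt;The extracted JSON above could be validated before a listing goes live. Here is a minimal sketch of that step, written in Python for brevity (the project itself is TypeScript). The field names come from the example above; the &lt;code&gt;Listing&lt;/code&gt; class, the &lt;code&gt;parse_listing&lt;/code&gt; helper, and the allowed condition values are assumptions for illustration, not the app's actual code.&lt;/p&gt;

```python
import json
from dataclasses import dataclass

# Assumed set of allowed values; the real app may use a different list.
ALLOWED_CONDITIONS = {"new", "like_new", "good", "fair"}


@dataclass
class Listing:
    title: str
    category: str
    price: int
    tags: list[str]
    condition: str


def parse_listing(raw: str) -> Listing:
    """Parse model output and reject anything that is not a usable listing."""
    data = json.loads(raw)
    listing = Listing(
        title=str(data["title"]).strip(),
        category=str(data["category"]),
        price=int(data["price"]),
        tags=[str(t).lower() for t in data.get("tags", [])],
        condition=str(data.get("condition", "good")),
    )
    if not listing.title:
        raise ValueError("title is empty")
    if 0 > listing.price:
        raise ValueError("price must not be negative")
    if listing.condition not in ALLOWED_CONDITIONS:
        raise ValueError(f"unknown condition: {listing.condition}")
    return listing


raw_output = """{
  "title": "Samsung 7kg Washing Machine",
  "category": "buy_sell",
  "price": 8000,
  "tags": ["electronics", "washing-machine", "samsung"],
  "condition": "good"
}"""
listing = parse_listing(raw_output)
print(listing.title, listing.price)
```

&lt;p&gt;Rejecting malformed output at this point means a bad extraction shows up as an error to retry rather than as a broken listing in the Bazaar.&lt;/p&gt;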

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97igzaxh3lrqf67dml4t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F97igzaxh3lrqf67dml4t.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ithjnhb49i8gwhrlxqs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0ithjnhb49i8gwhrlxqs.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F189tj49is2zknmg1y8hz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F189tj49is2zknmg1y8hz.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpuw4dov3ye1ydk1v54mf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpuw4dov3ye1ydk1v54mf.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  🔒 Security Passes
&lt;/h2&gt;

&lt;p&gt;Pre-register visitors (Guest, Contractor, Delivery). Each pass gets a 6-character code and a QR code. Guest shows it at the gate → guard verifies on screen → done. Every pass has a tracked status (&lt;code&gt;Active&lt;/code&gt;, &lt;code&gt;Used&lt;/code&gt;, &lt;code&gt;Expired&lt;/code&gt;, or &lt;code&gt;Cancelled&lt;/code&gt;), and everything is logged.&lt;/p&gt;
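&lt;p&gt;As a rough illustration of the pass codes and statuses described above, here is a Python sketch. The alphabet, the &lt;code&gt;new_pass_code&lt;/code&gt; and &lt;code&gt;can_transition&lt;/code&gt; helpers, and the rule that only an active pass can change state are assumptions, not details taken from the MeraSociety code.&lt;/p&gt;

```python
import secrets
from enum import Enum

# Alphabet without look-alike characters (0/O, 1/I); an assumption for readability at the gate.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


class PassStatus(Enum):
    ACTIVE = "active"
    USED = "used"
    EXPIRED = "expired"
    CANCELLED = "cancelled"


def new_pass_code(length: int = 6) -> str:
    """Return a random code using a cryptographically secure generator."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def can_transition(current: PassStatus, target: PassStatus) -> bool:
    """Only an active pass may change state; the other three states are final."""
    return current is PassStatus.ACTIVE and target is not PassStatus.ACTIVE


code = new_pass_code()
print(code, can_transition(PassStatus.ACTIVE, PassStatus.USED))
```

&lt;p&gt;Using &lt;code&gt;secrets&lt;/code&gt; rather than &lt;code&gt;random&lt;/code&gt; keeps codes hard to guess, which matters when the code is the only thing a guard checks.&lt;/p&gt;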

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvttizr8tzmyj3aidp15.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvttizr8tzmyj3aidp15.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  🏸 Sports Booking — Fair, Provable, Argument-Free
&lt;/h2&gt;

&lt;p&gt;Slot grid: pick a court, pick a date, book. A 2-hour cap per flat per court per day is enforced in the database, so browser dev tools can't bypass it. The database itself says no.&lt;/p&gt;
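&lt;p&gt;The fairness rule itself is simple arithmetic. The sketch below expresses it in Python purely to show the logic; in the app the check lives in the database, and the &lt;code&gt;Booking&lt;/code&gt; class and helper names here are hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import date

MAX_HOURS_PER_DAY = 2.0  # cap stated in the post


@dataclass(frozen=True)
class Booking:
    flat: str
    court: str
    day: date
    hours: float


def hours_booked(bookings: list[Booking], flat: str, court: str, day: date) -> float:
    """Total hours this flat already holds on this court for this day."""
    return sum(
        b.hours for b in bookings
        if (b.flat, b.court, b.day) == (flat, court, day)
    )


def can_book(bookings: list[Booking], request: Booking) -> bool:
    """Allow the request only if it keeps the flat within the daily cap."""
    used = hours_booked(bookings, request.flat, request.court, request.day)
    return MAX_HOURS_PER_DAY >= used + request.hours


day = date(2026, 3, 3)
existing = [Booking("B-302", "Badminton Court A", day, 1.0)]
print(can_book(existing, Booking("B-302", "Badminton Court A", day, 1.0)))
print(can_book(existing, Booking("B-302", "Badminton Court A", day, 2.0)))
```

&lt;p&gt;With one hour already booked, a second hour fits under the cap and a two-hour request does not.&lt;/p&gt;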

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyinndjadw1xtwmb8qgj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuyinndjadw1xtwmb8qgj.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  🤖 AI Agents — Natural Language → Real Actions
&lt;/h2&gt;

&lt;p&gt;Three AI Agent workflows + MCP integration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;AI Court Booking Agent&lt;/em&gt; — type in the Sports chat channel:
&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "Book me a badminton court tomorrow evening"

🤖 Agent: ✅ Booked! Badminton Court A on 2026-03-03, 6:00–7:00 PM.
📊 Fairness: 1.0 hours remaining (max 2h/day per flat).
💡 Also available: 19:00-20:00, Tennis Court 18:00-19:00
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;An &lt;em&gt;agentic loop&lt;/em&gt;: natural language → database queries → booking insert → explanation.&lt;/p&gt;
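&lt;p&gt;To make the first two steps of that loop concrete, here is a simplified Python sketch. It swaps the language model for a plain keyword lookup, so it only illustrates the shape of the flow (phrase to date and time window, then window to a free slot). The &lt;code&gt;TIME_WINDOWS&lt;/code&gt; values and both helper functions are assumptions.&lt;/p&gt;

```python
from datetime import date, timedelta
from typing import Optional

# Assumed meaning of vague time words, as (first start hour, last start hour).
TIME_WINDOWS = {"morning": (6, 11), "afternoon": (12, 16), "evening": (18, 20)}


def resolve_request(text: str, today: date) -> tuple[date, tuple[int, int]]:
    """Turn a phrase like 'tomorrow evening' into a date and an hour window."""
    words = text.lower().split()
    day = today + timedelta(days=1) if "tomorrow" in words else today
    window = next((TIME_WINDOWS[w] for w in words if w in TIME_WINDOWS), (6, 20))
    return day, window


def first_free_hour(window: tuple[int, int], taken: set[int]) -> Optional[int]:
    """Return the earliest start hour in the window that is not already booked."""
    start, end = window
    return next((h for h in range(start, end + 1) if h not in taken), None)


day, window = resolve_request(
    "Book me a badminton court tomorrow evening", date(2026, 3, 2)
)
hour = first_free_hour(window, taken={17})
print(day.isoformat(), hour)
```

&lt;p&gt;In the real agent the model would handle far messier phrasing, and the set of taken hours would come from a bookings query instead of a literal.&lt;/p&gt;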

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn4uf0yb34oa7eou6vo8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn4uf0yb34oa7eou6vo8.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funjerjbe7f3zxkyuv3ts.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Funjerjbe7f3zxkyuv3ts.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;Chat-to-Marketplace Agent&lt;/em&gt; — post "Selling my Samsung TV, 8000 rs" in Buy &amp;amp; Sell. AI silently detects it (confidence &amp;gt; 50%) and offers a one-click post to Bazaar. No form-filling.&lt;/li&gt;
&lt;/ul&gt;
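&lt;p&gt;The confidence gate for chat-to-listing detection can be sketched in a few lines of Python. The shape of the &lt;code&gt;detection&lt;/code&gt; dictionary and the &lt;code&gt;listing_prompt&lt;/code&gt; helper are assumptions; only the 50% threshold comes from the description above.&lt;/p&gt;

```python
from typing import Optional

CONFIDENCE_THRESHOLD = 0.5  # the post says confidence must exceed 50%


def listing_prompt(detection: dict) -> Optional[str]:
    """Return the text of a one-click prompt, or None to stay silent."""
    if not detection.get("is_listing"):
        return None
    if not detection.get("confidence", 0.0) > CONFIDENCE_THRESHOLD:
        return None
    draft = detection["listing"]
    title, price = draft["title"], draft["price"]
    return f"Post '{title}' to Bazaar for Rs {price}?"


detection = {
    "is_listing": True,
    "confidence": 0.92,
    "listing": {"title": "Samsung TV", "price": 8000},
}
print(listing_prompt(detection))
print(listing_prompt({"is_listing": True, "confidence": 0.5}))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; below the threshold is what keeps the agent silent for ordinary chat messages.&lt;/p&gt;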

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaa1gjvx9s75u9fenrfc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiaa1gjvx9s75u9fenrfc.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;AI Announcement Composer&lt;/em&gt; — admins type &lt;em&gt;"water tank cleaning tmrw 10am-2pm no water sorry"&lt;/em&gt; → polished English + Hindi announcement, auto-suggested priority, pin recommendation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxlpm49p6bjs3p0uttal.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdxlpm49p6bjs3p0uttal.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frxtuaw0azeewfzjsknqj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frxtuaw0azeewfzjsknqj.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv5wpiycklwps5jtvk7aa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv5wpiycklwps5jtvk7aa.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;MCP Translation Server&lt;/strong&gt; — &lt;code&gt;/api/mcp/translate&lt;/code&gt; exposes &lt;code&gt;translate_to_hindi&lt;/code&gt;, &lt;code&gt;translate_to_english&lt;/code&gt;, and &lt;code&gt;detect_language&lt;/code&gt; as MCP tools. Any MCP-compatible client can discover and use them, so an AI client can translate searches written in different languages into semantic searches over the same knowledge base.&lt;/li&gt;
&lt;/ul&gt;
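&lt;p&gt;MCP clients talk to a server using JSON-RPC 2.0 messages, with &lt;code&gt;tools/list&lt;/code&gt; for discovery and &lt;code&gt;tools/call&lt;/code&gt; for execution. The Python sketch below only builds those request bodies; it does not send them. The argument name &lt;code&gt;text&lt;/code&gt; is an assumption, since the real input schema would come from the server's &lt;code&gt;tools/list&lt;/code&gt; response.&lt;/p&gt;

```python
import json
from itertools import count

_request_ids = count(1)


def tools_list_request() -> dict:
    """JSON-RPC 2.0 request asking an MCP server which tools it offers."""
    return {"jsonrpc": "2.0", "id": next(_request_ids), "method": "tools/list"}


def tools_call_request(tool: str, arguments: dict) -> dict:
    """JSON-RPC 2.0 request invoking one named MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# The argument name "text" is an assumption; check the schema returned by tools/list.
request = tools_call_request(
    "translate_to_hindi", {"text": "Water supply is off tomorrow"}
)
print(json.dumps(request, indent=2))
```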
&lt;h2&gt;
  
  
  👨‍💼 Admin Dashboard &amp;amp; More
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Admin Dashboard&lt;/strong&gt; — member directory, pending approvals, full audit log&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notifications&lt;/strong&gt; — deep-linked alerts for listings, bookings, verifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feedback&lt;/strong&gt; — bug reports + feature requests tracked &lt;code&gt;open → in progress → resolved&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt; — invite-code gated signup → admin approval → full access&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Demo
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Live demo: &lt;a href="https://merasociety.vercel.app" rel="noopener noreferrer"&gt;https://merasociety.vercel.app&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;YouTube link: &lt;a href="https://www.youtube.com/watch?v=6TNUxPRyHKA" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=6TNUxPRyHKA&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Code
&lt;/h2&gt;

&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/Vishwa-docs" rel="noopener noreferrer"&gt;
        Vishwa-docs
      &lt;/a&gt; / &lt;a href="https://github.com/Vishwa-docs/merasociety" rel="noopener noreferrer"&gt;
        merasociety
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;🏘️ MeraSociety — Your Society, Connected&lt;/h1&gt;

&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;DEV.to Weekend Build Hackathon — Feb 28, 2026&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A verified, private mini-social network for apartment societies — replacing chaotic WhatsApp groups with structured, searchable workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://merasociety.vercel.app" rel="nofollow noopener noreferrer"&gt;Live Demo →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://dev.to/jackbright/merasociety-i-turned-my-apartment-societys-whatsapp-chaos-into-a-real-app-4o85" rel="nofollow"&gt;Read the Blog Post →&lt;/a&gt;&lt;/strong&gt; | &lt;strong&gt;&lt;a href="https://www.youtube.com/watch?v=6TNUxPRyHKA" rel="nofollow noopener noreferrer"&gt;Watch on YouTube →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Quick Start&lt;/h2&gt;

&lt;/div&gt;

&lt;div class="highlight highlight-source-shell notranslate position-relative overflow-auto js-code-highlight"&gt;
&lt;pre&gt;&lt;span class="pl-c1"&gt;cd&lt;/span&gt; merasociety
npm install
npm run dev&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;See &lt;a href="https://github.com/Vishwa-docs/merasociety/merasociety/README.md" rel="noopener noreferrer"&gt;merasociety/README.md&lt;/a&gt; for full documentation.&lt;/p&gt;
&lt;/div&gt;

  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/Vishwa-docs/merasociety" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;





&lt;p&gt;GitHub URL: &lt;a href="https://github.com/Vishwa-docs/merasociety" rel="noopener noreferrer"&gt;https://github.com/Vishwa-docs/merasociety&lt;/a&gt;&lt;/p&gt;




&lt;h1&gt;
  
  
  How I Built It
&lt;/h1&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Framework&lt;/td&gt;
&lt;td&gt;Next.js 16 (App Router)&lt;/td&gt;
&lt;td&gt;Server components + API routes + file routing in one project&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;TypeScript&lt;/td&gt;
&lt;td&gt;27 routes across 14 pages — type safety isn't optional at this scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Styling&lt;/td&gt;
&lt;td&gt;Tailwind CSS v4&lt;/td&gt;
&lt;td&gt;Consistent UI, fast iteration, zero component library dependency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;Supabase&lt;/td&gt;
&lt;td&gt;Auth + Postgres + Realtime — no separate backend to deploy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;Azure OpenAI (GPT-4o)&lt;/td&gt;
&lt;td&gt;Few-shot prompting for extraction, semantic matching, three agentic workflows&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP&lt;/td&gt;
&lt;td&gt;Custom MCP Server&lt;/td&gt;
&lt;td&gt;Model Context Protocol translation server — tool discovery + execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;State&lt;/td&gt;
&lt;td&gt;Zustand&lt;/td&gt;
&lt;td&gt;20 lines of code for the entire client-side auth store&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;QR&lt;/td&gt;
&lt;td&gt;qrcode.js&lt;/td&gt;
&lt;td&gt;Branded QR codes for visitor passes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deploy&lt;/td&gt;
&lt;td&gt;Vercel&lt;/td&gt;
&lt;td&gt;Push to main → live&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Database: 14 Tables
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;societies → members → announcements → announcement_comments
                                    → announcement_seen
                   → channels → messages
                   → listings
                   → visitor_passes
                   → courts → bookings
                   → notifications
                   → feedback
                   → audit_log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Key Design Decisions
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Invite-code isolation. Each society has a unique code. No code = no account. No "browse societies" by design — total data privacy per community.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Three roles. &lt;code&gt;admin&lt;/code&gt;, &lt;code&gt;resident&lt;/code&gt;, &lt;code&gt;guard&lt;/code&gt; — each sees a different app. Permissions enforced at the application layer, not just hidden in UI.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Server-side fairness. &lt;code&gt;check_booking_fairness()&lt;/code&gt; is a PostgreSQL function that runs on every booking INSERT. I could've just disabled the button, but anyone with dev tools could bypass that. The database enforces the 2-hour cap. You can't cheat math.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Full audit trail. Every approval, verification, booking, announcement → logged with timestamp. Definitive answers when questions arise.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
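&lt;p&gt;Enforcing roles at the application layer, as described in the list above, usually means checking the role on the server before any handler runs. A minimal Python sketch follows; the permission table, the action names, and the &lt;code&gt;require&lt;/code&gt; helper are all hypothetical and only illustrate the idea.&lt;/p&gt;

```python
# Hypothetical permission table; the action names are illustrative only.
PERMISSIONS = {
    "admin": {"post_announcement", "approve_member", "create_pass", "book_court"},
    "resident": {"create_pass", "book_court"},
    "guard": {"verify_pass"},
}


class Forbidden(Exception):
    """Raised when a role attempts an action it is not allowed to perform."""


def require(role: str, action: str) -> None:
    """Server-side check, run before the handler touches any data."""
    if action not in PERMISSIONS.get(role, set()):
        raise Forbidden(f"{role} may not {action}")


require("guard", "verify_pass")  # allowed, so nothing happens
try:
    require("resident", "approve_member")
except Forbidden as err:
    print(err)
```

&lt;p&gt;Because the check raises instead of hiding a button, a crafted request from the browser console fails the same way as one from the UI.&lt;/p&gt;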

&lt;h2&gt;
  
  
  Seven AI Endpoints
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/api/ai/extract&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Messy text → structured listing JSON (few-shot prompted)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/api/ai/match&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Natural language query → scored listing matches&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/api/ai/summarize&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Long announcements → 1–2 sentence summaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/api/ai/book-court&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"Book badminton tomorrow" → confirmed booking via agentic loop&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/api/ai/chat-to-listing&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Chat message → confidence-scored listing detection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/api/ai/compose-announcement&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Rough notes → polished bilingual announcement&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;/api/mcp/translate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;MCP-compatible English↔Hindi translation server&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What I'd Build Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Voice-to-Listing — speak your listing, AI transcribes + structures it (huge for elderly residents)&lt;/li&gt;
&lt;li&gt;Push notifications — the app only works if people open it&lt;/li&gt;
&lt;li&gt;UPI payments — monthly maintenance collection&lt;/li&gt;
&lt;li&gt;AI Community Pulse — scan chat + feedback to surface trending issues for admins&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But honestly? Even as it stands, this is the app I wish my society had instead of hundreds of WhatsApp groups and a spreadsheet.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you live in an apartment society and this resonated — hit ❤️ and share it with your society's WhatsApp group. The irony is intentional.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Built with love and the kind of frustration that only comes from living in an Indian apartment society.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;NOTE: Since I started developing this project, Supabase has been &lt;a href="https://status.supabase.com" rel="noopener noreferrer"&gt;blocked in India&lt;/a&gt;. The app is deployed on Vercel, so this should not be an issue, but if you cannot log in from an Indian network, please use a VPN or change your DNS provider for supabase.co (as I did).&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>showdev</category>
      <category>devchallenge</category>
      <category>weekendchallenge</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Built a Real-Time AI Vision Assistant in 1 Week — Here's What I Learned About Multimodal AI</title>
      <dc:creator>Vishwa Kumaresh</dc:creator>
      <pubDate>Sun, 01 Mar 2026 14:02:01 +0000</pubDate>
      <link>https://dev.to/jackbright/i-built-a-real-time-ai-vision-assistant-in-1-week-heres-what-i-learned-about-multimodal-ai-232i</link>
      <guid>https://dev.to/jackbright/i-built-a-real-time-ai-vision-assistant-in-1-week-heres-what-i-learned-about-multimodal-ai-232i</guid>
      <description>&lt;h2&gt;
  
  
  The Idea That Wouldn't Let Go
&lt;/h2&gt;

&lt;p&gt;For the &lt;strong&gt;466 million&lt;/strong&gt; people with disabling hearing loss and the &lt;strong&gt;43 million&lt;/strong&gt; who are blind, two questions define daily life:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"What did you say?"&lt;/em&gt; and &lt;em&gt;"What's in front of me?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;These aren't minor inconveniences. They're barriers — to independence, to safety, to just walking down a street.&lt;/p&gt;

&lt;p&gt;When I saw the &lt;a href="https://www.wemakedevs.org/hackathons/vision" rel="noopener noreferrer"&gt;WeMakeDevs Vision Possible Hackathon&lt;/a&gt;, I knew exactly what I wanted to build: a system that turns a camera into an intelligent companion that can see, speak, navigate, and translate — in real-time.&lt;/p&gt;

&lt;p&gt;No buttons. No screens to read. Just natural voice conversation with an AI that has eyes.&lt;/p&gt;

&lt;p&gt;That's &lt;strong&gt;WorldLens&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Is WorldLens?
&lt;/h2&gt;

&lt;p&gt;WorldLens is a dual-mode assistive vision platform built on the &lt;a href="https://visionagents.ai" rel="noopener noreferrer"&gt;Vision Agents SDK&lt;/a&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  GuideLens — Your Walking Companion
&lt;/h3&gt;

&lt;p&gt;For visually impaired users. Point any camera — laptop, phone, or even a tiny M5Stack edge device — and GuideLens becomes your eyes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;YOLO11 object detection across 80 classes — people, cars, obstacles, animals, furniture&lt;/li&gt;
&lt;li&gt;Hazard tracking with approach speed and direction estimation (left/center/right, near/medium/far)&lt;/li&gt;
&lt;li&gt;Real-time OCR — reads signs, building names, bus numbers aloud&lt;/li&gt;
&lt;li&gt;Turn-by-turn walking navigation via Google Maps&lt;/li&gt;
&lt;li&gt;Spatial memory — remembers every object it's seen, queryable by voice&lt;/li&gt;
&lt;li&gt;Natural voice conversation — you talk, it sees and responds&lt;/li&gt;
&lt;/ul&gt;
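&lt;p&gt;To make the hazard-tracking bullet concrete, here is a minimal sketch of how direction, distance band, and approach speed can be derived from bounding boxes alone. It is not the WorldLens code: the &lt;code&gt;Box&lt;/code&gt; class, the function names, and every threshold are assumptions chosen for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)

    @property
    def center_x(self) -> float:
        return (self.x1 + self.x2) / 2.0


def direction(box: Box, frame_width: float) -> str:
    """Split the frame into three equal vertical bands."""
    ratio = box.center_x / frame_width
    if ratio >= 2 / 3:
        return "right"
    if ratio >= 1 / 3:
        return "center"
    return "left"


def distance_band(box: Box, frame_area: float) -> str:
    """Use the share of the frame covered by the box as a rough distance proxy."""
    share = box.area / frame_area
    if share >= 0.25:  # assumed threshold
        return "near"
    if share >= 0.05:  # assumed threshold
        return "medium"
    return "far"


def growth_rate(prev: Box, curr: Box, dt: float) -> float:
    """Relative change in box area per second; positive means the object is growing."""
    if prev.area == 0 or dt == 0:
        return 0.0
    return (curr.area - prev.area) / prev.area / dt


def describe(label: str, prev: Box, curr: Box, dt: float, frame_w: float, frame_h: float) -> str:
    """Turn two consecutive detections of the same object into a short spoken phrase."""
    motion = "approaching" if growth_rate(prev, curr, dt) >= 0.5 else "steady"  # assumed threshold
    return f"{label} {motion}, {direction(curr, frame_w)}, {distance_band(curr, frame_w * frame_h)}"
```

&lt;p&gt;A box that grows quickly between frames reads as "approaching" even without depth data, which is why a single camera is enough for a coarse warning. Real footage would need the same object matched across frames and some smoothing before these numbers are trustworthy.&lt;/p&gt;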

&lt;h3&gt;
  
  
  SignBridge — Sign Language Translation (Prototype Level)
&lt;/h3&gt;

&lt;p&gt;A real-time sign language → spoken English bridge:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;YOLO11 Pose extracts 17 body keypoints&lt;/li&gt;
&lt;li&gt;MediaPipe tracks 21 hand landmarks per hand&lt;/li&gt;
&lt;li&gt;ASL finger-spelling recognition for letters like A, B, D, I, L, V, W, Y&lt;/li&gt;
&lt;li&gt;Gesture classification (wave, point, thumbs up) via 30-frame buffer analysis&lt;/li&gt;
&lt;/ul&gt;
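&lt;p&gt;As an illustration of the 30-frame buffer idea, the sketch below flags a wave when the wrist changes horizontal direction several times inside the window. The post does not say how WorldLens classifies gestures, so the &lt;code&gt;WaveDetector&lt;/code&gt; class, the use of a normalized wrist x-coordinate, and both thresholds are assumptions.&lt;/p&gt;

```python
from collections import deque


class WaveDetector:
    """Flags a wave when the wrist reverses horizontal direction several times
    within a sliding window of recent frames (hypothetical heuristic)."""

    def __init__(self, window: int = 30, min_reversals: int = 3, min_step: float = 0.02):
        self.xs = deque(maxlen=window)      # normalized wrist x, one value per frame
        self.min_reversals = min_reversals  # assumed threshold
        self.min_step = min_step            # ignore jitter smaller than this

    def update(self, wrist_x: float) -> bool:
        """Add one frame and report whether the buffer currently looks like a wave."""
        self.xs.append(wrist_x)
        if len(self.xs) != self.xs.maxlen:
            return False  # wait until the buffer is full
        return self._count_reversals() >= self.min_reversals

    def _count_reversals(self) -> int:
        reversals = 0
        last_sign = 0
        points = list(self.xs)
        for prev, curr in zip(points, points[1:]):
            step = curr - prev
            if abs(step) >= self.min_step:
                sign = 1 if step > 0 else -1
                if last_sign != 0 and sign != last_sign:
                    reversals += 1
                last_sign = sign
        return reversals
```

&lt;p&gt;Feeding it one wrist position per frame keeps the classifier stateless apart from the buffer, so a missed frame only delays detection instead of breaking it.&lt;/p&gt;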




&lt;h2&gt;
  
  
  The Architecture — How It All Fits Together
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Camera (Webcam / M5Stack K210)
    │
    ▼
GetStream Edge Network (WebRTC)
    │
    ▼
┌─────────── Vision Agents Backend ───────────┐
│                                              │
│  YOLO11 Detection ─── Hazard Tracking        │
│  YOLO11 Pose ──────── MediaPipe Hands        │
│  Multi-VLM OCR                               │
│                                              │
│  Event Bus (pub/sub)                         │
│       │                                      │
│       ▼                                      │
│  Gemini 2.5 Flash Realtime                   │
│  Speech-to-Speech @ 5 FPS                    │
│  + 12 MCP Tools (Maps, Memory, Weather...)   │
│                                              │
└──────────────────────────────────────────────┘
    │
    ▼
React 19 Frontend (WebRTC + Alerts)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The entire system is &lt;strong&gt;one real-time voice+vision conversation&lt;/strong&gt;. The user speaks, the AI sees and responds. No manual triggers. Gemini autonomously decides when to call tools — "Take me to the train station" triggers Google Maps directions, "What does that sign say?" triggers the OCR pipeline.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Build — 7 Days, One Vision
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Day 1: Infrastructure
&lt;/h3&gt;

&lt;p&gt;Got the Vision Agents SDK running with GetStream WebRTC transport. Built the React frontend skeleton. Established dual-mode architecture (GuideLens / SignBridge). Wired up camera input.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 2: Computer Vision
&lt;/h3&gt;

&lt;p&gt;Integrated YOLO11 for both object detection and pose estimation. Built the multi-VLM provider chain with automatic failover across 5 providers (Gemini → Grok → Azure GPT-4o → NVIDIA Cosmos → HuggingFace). Mode switching working end-to-end.&lt;/p&gt;
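&lt;p&gt;The failover idea can be sketched in a few lines: try each provider in order and return the first answer that does not raise. This is only an outline under assumptions. The provider callables below are stubs, and &lt;code&gt;describe_with_failover&lt;/code&gt; and &lt;code&gt;AllProvidersFailed&lt;/code&gt; are hypothetical names, not part of any SDK.&lt;/p&gt;

```python
from typing import Callable, Sequence, Tuple

# A provider is a (name, callable) pair; the callable takes image bytes and a prompt.
Provider = Tuple[str, Callable[[bytes, str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has errored."""


def describe_with_failover(image: bytes, prompt: str, providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; return (provider_name, answer) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(image, prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stub providers standing in for real API clients (assumptions for the demo).
def rate_limited(image: bytes, prompt: str) -> str:
    raise TimeoutError("rate limited")


def working(image: bytes, prompt: str) -> str:
    return "EXIT sign above a glass door"


chain = [("gemini", rate_limited), ("grok", working)]
```

&lt;p&gt;Calling &lt;code&gt;describe_with_failover(b"", "Read the sign", chain)&lt;/code&gt; skips the failing stub and returns the second provider's answer. Collecting the error messages along the way makes it easier to see which provider failed and why when the whole chain is down.&lt;/p&gt;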

&lt;h3&gt;
  
  
  Day 3: Advanced Visuals
&lt;/h3&gt;

&lt;p&gt;OCR processor with multi-VLM chain. NVIDIA Cosmos integration for dense scene descriptions. 3D avatar with lip-sync using React Three Fiber (later discontinued). OCR text overlay on the frontend.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 4: Agentic Intelligence
&lt;/h3&gt;

&lt;p&gt;This was the breakthrough day. Google Maps API integration for live walking directions. SQLite spatial memory database. MediaPipe hand landmarks for ASL finger-spelling. Priority-based navigation engine with announcement cooldowns.&lt;/p&gt;
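&lt;p&gt;For a sense of what a spatial memory table might look like, here is a small sketch that stores sightings and skips repeats inside a 30-second window. It uses the standard-library &lt;code&gt;sqlite3&lt;/code&gt; module instead of the async driver so it runs on its own, and the schema and function names are assumptions, not the project's actual design.&lt;/p&gt;

```python
import sqlite3
import time
from typing import Optional, Tuple

DEDUP_SECONDS = 30  # the post mentions a 30-second deduplication cooldown


def open_memory(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the (hypothetical) sightings table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sightings ("
        "label TEXT NOT NULL, direction TEXT NOT NULL, seen_at REAL NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, label: str, direction: str, now: Optional[float] = None) -> bool:
    """Store a sighting unless the same label and direction was stored within the cooldown."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT MAX(seen_at) FROM sightings WHERE label = ? AND direction = ?",
        (label, direction),
    ).fetchone()
    if row[0] is not None and DEDUP_SECONDS > now - row[0]:
        return False  # duplicate inside the cooldown window
    conn.execute("INSERT INTO sightings VALUES (?, ?, ?)", (label, direction, now))
    conn.commit()
    return True


def last_seen(conn: sqlite3.Connection, label: str) -> Optional[Tuple[str, float]]:
    """Answer 'where did you last see X?' with the most recent sighting, if any."""
    return conn.execute(
        "SELECT direction, seen_at FROM sightings WHERE label = ? ORDER BY seen_at DESC LIMIT 1",
        (label,),
    ).fetchone()
```

&lt;p&gt;A voice query such as "where was the bench?" then becomes a single &lt;code&gt;last_seen(conn, "bench")&lt;/code&gt; lookup. Deduplicating at write time keeps the table small even when the detector reports the same object on every frame.&lt;/p&gt;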

&lt;h3&gt;
  
  
  Day 5: Polish &amp;amp; Testing
&lt;/h3&gt;

&lt;p&gt;Wired up all 12 MCP tools. Built AlertOverlay v2 with Web Audio API chimes and severity-based haptic feedback. Enterprise-grade telemetry panel. Glassmorphism UI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Day 6: A LOT of bug fixes :)
&lt;/h3&gt;

&lt;h3&gt;
  
  
  Day 7: Deployment Activities (Docker and M5Stack K210 Camera)
&lt;/h3&gt;




&lt;h2&gt;
  
  
  The Hard Lessons
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Edge Deployment Is TOUGH
&lt;/h3&gt;

&lt;p&gt;I connected an M5Stack UnitV K210 — a RISC-V chip with 8 MB of SRAM and a hardware neural accelerator. Getting YOLO v2 tiny to run on it at ~15 FPS taught me more about real-world constraints than any tutorial.&lt;/p&gt;

&lt;p&gt;You can't just "deploy to edge." You're fighting memory limits, model quantization, serial communication protocols, and the fact that a 224×224 input resolution means your detection accuracy drops significantly. Edge AI sounds great in blog posts. In practice, it's an engineering discipline unto itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Real-Time Is Possible — But It Takes Architecture
&lt;/h3&gt;

&lt;p&gt;My first approach was naive: detect objects → send to LLM → speak response. It crumbled instantly. Duplicate announcements every frame. Hazard alerts drowning out navigation. The LLM getting overwhelmed with events.&lt;/p&gt;

&lt;p&gt;The solution was an entire event-driven architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;BaseEvent&lt;/code&gt; pub/sub for decoupled communication&lt;/li&gt;
&lt;li&gt;Priority-based announcement queues with configurable cooldowns&lt;/li&gt;
&lt;li&gt;Bounding box growth rate estimation for approach speed (not just "car detected" but "car approaching from the left, getting closer")&lt;/li&gt;
&lt;li&gt;30-second deduplication cooldowns in spatial memory&lt;/li&gt;
&lt;li&gt;User speech suppression during active navigation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Real-time isn't about speed. It's about &lt;strong&gt;knowing what NOT to say&lt;/strong&gt;.&lt;/p&gt;
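&lt;p&gt;One way to picture a priority-based announcement queue with cooldowns is a heap keyed by urgency plus a record of when each message was last accepted. The sketch below is an assumption-laden outline: the &lt;code&gt;AnnouncementQueue&lt;/code&gt; class, the event kinds, and the priority and cooldown values are all made up for illustration.&lt;/p&gt;

```python
import heapq
import itertools
from typing import Dict, List, Optional, Tuple

# Lower number = more urgent. Cooldowns are in seconds. All values are assumptions.
PRIORITY = {"hazard": 0, "navigation": 1, "ocr": 2, "scene": 3}
COOLDOWN = {"hazard": 3.0, "navigation": 8.0, "ocr": 15.0, "scene": 30.0}


class AnnouncementQueue:
    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str, str]] = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order within a priority
        self._last_accepted: Dict[Tuple[str, str], float] = {}

    def push(self, kind: str, text: str, now: float) -> bool:
        """Queue a message unless the same message was accepted within its cooldown."""
        key = (kind, text)
        last = self._last_accepted.get(key)
        if last is not None and COOLDOWN[kind] > now - last:
            return False  # suppressed: the user heard this recently
        self._last_accepted[key] = now
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._order), kind, text))
        return True

    def pop(self) -> Optional[str]:
        """Return the most urgent pending message, or None if nothing is queued."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[3]
```

&lt;p&gt;With this shape, a hazard pushed after a scene description is still spoken first, and a detector that fires on every frame produces one announcement per cooldown window instead of one per frame.&lt;/p&gt;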

&lt;h3&gt;
  
  
  3. Voice-First Design Changes Everything
&lt;/h3&gt;

&lt;p&gt;This was the deepest lesson. When your user can't see a screen:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can't show a loading spinner — you have to say "one moment" or stay silent&lt;/li&gt;
&lt;li&gt;You can't display status text — you have to speak it naturally&lt;/li&gt;
&lt;li&gt;You can't use visual hierarchy — everything is sequential audio&lt;/li&gt;
&lt;li&gt;Error messages become spoken apologies&lt;/li&gt;
&lt;li&gt;"Tap to retry" becomes "just ask me again"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every single UX pattern I knew was wrong. Voice-first isn't a feature. It's a complete paradigm shift.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Vision Agents SDK Made This Possible
&lt;/h2&gt;

&lt;p&gt;I want to be real about this: building an agentic real-time video+voice system from scratch would have taken months. The Vision Agents SDK gave me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Agent&lt;/code&gt; class for lifecycle management&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Realtime&lt;/code&gt; mode for speech-to-speech with Gemini&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;VideoProcessorPublisher&lt;/code&gt; base class for all my vision pipelines&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;BaseEvent&lt;/code&gt; for event-driven architecture&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;register_function()&lt;/code&gt; for MCP tool registration&lt;/li&gt;
&lt;li&gt;GetStream Edge integration for WebRTC transport&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I could focus on the &lt;em&gt;what&lt;/em&gt; (assistive vision) instead of the &lt;em&gt;how&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tech&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Orchestration&lt;/td&gt;
&lt;td&gt;Vision Agents SDK&lt;/td&gt;
&lt;td&gt;Agent lifecycle, processors, events, MCP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reasoning&lt;/td&gt;
&lt;td&gt;Gemini 2.5 Flash Realtime&lt;/td&gt;
&lt;td&gt;Speech-to-speech @ 5 FPS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Detection&lt;/td&gt;
&lt;td&gt;YOLO11 (Ultralytics)&lt;/td&gt;
&lt;td&gt;80-class detection + 17-keypoint pose&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hand Tracking&lt;/td&gt;
&lt;td&gt;MediaPipe&lt;/td&gt;
&lt;td&gt;21 keypoints/hand, ASL recognition&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Navigation&lt;/td&gt;
&lt;td&gt;Google Maps APIs&lt;/td&gt;
&lt;td&gt;Directions, Places, Geocoding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;aiosqlite&lt;/td&gt;
&lt;td&gt;Persistent spatial object history&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Transport&lt;/td&gt;
&lt;td&gt;GetStream Edge (WebRTC)&lt;/td&gt;
&lt;td&gt;Real-time video + audio&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;React 19 + Vite 7 + TypeScript&lt;/td&gt;
&lt;td&gt;WebRTC client, 3D avatar, alerts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Edge Device&lt;/td&gt;
&lt;td&gt;M5Stack K210 (RISC-V)&lt;/td&gt;
&lt;td&gt;On-device YOLO v2 tiny&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;td&gt;Docker (multi-stage)&lt;/td&gt;
&lt;td&gt;Single-container deployment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Testing&lt;/td&gt;
&lt;td&gt;pytest + Vitest&lt;/td&gt;
&lt;td&gt;70 tests (24 + 46)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;WorldLens is a proof-of-concept, but the vision (pun intended) is bigger:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full mobile edge deployment with SIM card connectivity — truly portable, untethered navigation&lt;/li&gt;
&lt;li&gt;Lip reading to speech — supplement audio in noisy environments&lt;/li&gt;
&lt;li&gt;Caller vibration alerts — detect when someone is speaking to you, alert via haptics&lt;/li&gt;
&lt;li&gt;Full SignBridge two-user mode — real-time bidirectional deaf ↔ hearing translation&lt;/li&gt;
&lt;li&gt;Expanded ASL vocabulary — beyond finger-spelling to full conversational signs&lt;/li&gt;
&lt;li&gt;Offline fallback — on-device YOLO + edge TTS for basic hazard detection without internet (I faced a lot of connectivity issues)&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Final Thought
&lt;/h2&gt;

&lt;p&gt;I built WorldLens because I believe multimodal AI shouldn't just be impressive demos — it should solve real problems for real people. For someone who can't see, a camera that speaks is not a gimmick. It's independence. For someone who signs, an AI that translates in real-time isn't a novelty. It's being heard.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built for the &lt;a href="https://www.wemakedevs.org/hackathons/vision" rel="noopener noreferrer"&gt;WeMakeDevs Vision Possible Hackathon&lt;/a&gt; (February 2026) using the &lt;a href="https://visionagents.ai" rel="noopener noreferrer"&gt;Vision Agents SDK&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;GitHub: &lt;a href="https://github.com/Vishwa-docs/WeMakeDevs-Vision-Possible-Hackathon" rel="noopener noreferrer"&gt;WorldLens Repository&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;#VisionPossible #VisionAgents #AI #Accessibility #Hackathon&lt;/p&gt;

</description>
      <category>ai</category>
      <category>vision</category>
      <category>agents</category>
      <category>aiforgood</category>
    </item>
  </channel>
</rss>
