<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: somyabhalani</title>
    <description>The latest articles on DEV Community by somyabhalani (@somyabhalani).</description>
    <link>https://dev.to/somyabhalani</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3940938%2Fcd61aa83-66a5-4106-b1db-095bd98d8cc7.png</url>
      <title>DEV Community: somyabhalani</title>
      <link>https://dev.to/somyabhalani</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/somyabhalani"/>
    <language>en</language>
    <item>
      <title>How We Automated Catalog Image Extraction using Computer Vision &amp; FastAPI</title>
      <dc:creator>somyabhalani</dc:creator>
      <pubDate>Tue, 19 May 2026 19:24:39 +0000</pubDate>
      <link>https://dev.to/somyabhalani/how-we-automated-catalog-image-extraction-using-computer-vision-fastapi-5han</link>
      <guid>https://dev.to/somyabhalani/how-we-automated-catalog-image-extraction-using-computer-vision-fastapi-5han</guid>
      <description>&lt;p&gt;For businesses in the stone, marble, and interior design industries, managing digital catalog assets is a massive headache. &lt;/p&gt;

&lt;p&gt;When a new product catalog arrives as a 100-page PDF, design teams spend hours manually cropping out individual tile samples to upload to their websites or inventory sheets.&lt;/p&gt;

&lt;p&gt;To automate this, we built &lt;strong&gt;Tile Extractor&lt;/strong&gt;, a high-performance, automated parsing engine designed specifically to isolate tile samples from raw catalog documents.&lt;/p&gt;




&lt;h3&gt;How it Works (Under the Hood)&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;PDF Ingestion&lt;/strong&gt;: A &lt;strong&gt;FastAPI&lt;/strong&gt; backend ingests multi-page PDFs. We process each page with &lt;strong&gt;PyMuPDF&lt;/strong&gt; to extract raw page vectors and high-resolution layout structures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Object Detection &amp;amp; Border Cleaning&lt;/strong&gt;: Instead of relying on slow, expensive cloud Vision APIs, we use local &lt;strong&gt;Pillow&lt;/strong&gt; and OpenCV-based spatial algorithms. The engine analyzes:

&lt;ul&gt;
&lt;li&gt;Edge density to isolate individual tile boundaries.&lt;/li&gt;
&lt;li&gt;Aspect ratios to filter out page noise (like page numbers or logos).&lt;/li&gt;
&lt;li&gt;Color distributions using RGB histograms.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lossless Cropping&lt;/strong&gt;: Once a tile is identified and classified, the engine crops it directly from the PDF's high-resolution asset stream, so the output keeps the full resolution of the source image.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch ZIP Packaging&lt;/strong&gt;: The isolated tile PNGs are packaged into a single ZIP file and returned to the user as one download.&lt;/li&gt;
&lt;/ol&gt;
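
&lt;p&gt;As a rough illustration of the aspect-ratio filter and the ZIP packaging step, here is a minimal sketch using only the Python standard library. The names &lt;code&gt;Box&lt;/code&gt;, &lt;code&gt;looks_like_tile&lt;/code&gt;, and &lt;code&gt;package_zip&lt;/code&gt;, along with every threshold value, are assumptions made for this example and are not taken from the Tile Extractor codebase.&lt;/p&gt;

```python
import io
import zipfile
from typing import Dict, NamedTuple


class Box(NamedTuple):
    """Hypothetical candidate region on a page, in pixels."""
    x: int
    y: int
    width: int
    height: int


def looks_like_tile(box: Box, min_side: int = 100,
                    min_ratio: float = 0.5, max_ratio: float = 2.0) -> bool:
    """Keep regions that are large enough and roughly tile-shaped.

    All thresholds are illustrative assumptions, not tuned values.
    """
    # Reject tiny regions such as page numbers (also avoids dividing by zero).
    if max(min_side, 1) > min(box.width, box.height):
        return False
    ratio = box.width / box.height
    # Reject very wide or very tall strips such as headers and logos.
    return ratio >= min_ratio and max_ratio >= ratio


def package_zip(tiles: Dict[str, bytes]) -> bytes:
    """Bundle already-encoded PNG payloads into one in-memory ZIP archive."""
    buffer = io.BytesIO()
    # PNG data is already compressed, so entries are stored as-is.
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_STORED) as archive:
        for name, payload in tiles.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


candidates = [
    Box(40, 60, 400, 400),     # square tile sample
    Box(500, 60, 600, 300),    # 2:1 plank-style tile
    Box(20, 1500, 60, 20),     # page number: too small
    Box(0, 0, 1200, 150),      # header strip: too wide
]
tiles = [box for box in candidates if looks_like_tile(box)]
print(len(tiles))  # 2
```

&lt;p&gt;Building the archive in memory means a web handler could return the resulting bytes directly without writing temporary files, though a real service might stream very large archives instead.&lt;/p&gt;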




&lt;h3&gt;Why it Matters for B2B Automation&lt;/h3&gt;

&lt;p&gt;What used to take a human designer 2 hours now takes our engine 5 seconds. By running computer vision algorithms locally instead of calling cloud APIs, we eliminate usage fees and keep client data fully private.&lt;/p&gt;

&lt;p&gt;If your business manages product catalogs, you can try the tool for free here:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Try Tile Extractor&lt;/strong&gt;: &lt;a href="https://tile-extractor.onrender.com" rel="noopener noreferrer"&gt;https://tile-extractor.onrender.com&lt;/a&gt;&lt;br&gt;
👉 &lt;strong&gt;Explore our work&lt;/strong&gt;: &lt;a href="https://anantalabs.app/" rel="noopener noreferrer"&gt;https://anantalabs.app/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsfx8zj9fej4cwioootvy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsfx8zj9fej4cwioootvy.png" alt=" " width="799" height="518"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>fastapi</category>
      <category>computervision</category>
      <category>automation</category>
    </item>
    <item>
      <title>How We Built a Contactless Digital Signature App inside the Browser (No Servers, 100% Private)</title>
      <dc:creator>somyabhalani</dc:creator>
      <pubDate>Tue, 19 May 2026 19:21:46 +0000</pubDate>
      <link>https://dev.to/somyabhalani/how-we-built-a-contactless-digital-signature-app-inside-the-browser-no-servers-100-private-18n2</link>
      <guid>https://dev.to/somyabhalani/how-we-built-a-contactless-digital-signature-app-inside-the-browser-no-servers-100-private-18n2</guid>
      <description>&lt;p&gt;Traditional digital signature platforms have two major issues: &lt;strong&gt;privacy&lt;/strong&gt; and &lt;strong&gt;cost&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;To sign a document, you have to upload sensitive agreements to a third-party server. And as a developer, running server-side document rendering and signatures can lead to heavy API bills and database management overhead.&lt;/p&gt;

&lt;p&gt;At &lt;strong&gt;Ananta Labs&lt;/strong&gt;, we wanted to see if we could build a completely secure, contactless alternative that runs entirely on the client side using browser-native AI. &lt;/p&gt;

&lt;p&gt;Here is how we built &lt;strong&gt;AirSign&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;The Architecture: 100% Client-Side&lt;/h3&gt;

&lt;p&gt;Instead of hosting heavy machine learning models on a GPU server, we compiled our hand-tracking models to run locally inside the user's browser.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Gesture Capture&lt;/strong&gt;: We use &lt;strong&gt;MediaPipe's&lt;/strong&gt; hand-landmarker models compiled to WebAssembly. This lets the browser track 21 3D hand landmarks in real time at 30 FPS using a standard webcam.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contactless Canvas&lt;/strong&gt;: Using &lt;strong&gt;WebGL&lt;/strong&gt;, we map the index-finger coordinate to an HTML5 canvas. We implemented a custom interpolation algorithm to smooth out hand jitter and render a fluid, realistic signature line.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local PDF Stamp&lt;/strong&gt;: Once the user finishes drawing their signature in the air, we generate the final document. The signature coordinate vector is parsed and stamped onto the PDF using a client-side library.&lt;/li&gt;
&lt;/ol&gt;
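
&lt;p&gt;The custom interpolation algorithm is not described in detail here, so the following is only a sketch of one common way to damp hand jitter: exponential smoothing of successive fingertip positions. It is written in Python for brevity; in the browser the same logic would run in JavaScript. The function name &lt;code&gt;smooth_path&lt;/code&gt; and the &lt;code&gt;alpha&lt;/code&gt; parameter are assumptions made for this example.&lt;/p&gt;

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def smooth_path(points: Iterable[Point], alpha: float = 0.5) -> List[Point]:
    """Exponentially smooth a stream of fingertip positions.

    An alpha near 1 follows the raw input closely; an alpha near 0
    smooths harder but makes the drawn line lag behind the finger.
    """
    if not (alpha > 0 and 1 >= alpha):
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[Point] = []
    for x, y in points:
        if smoothed:
            # Blend the new sample with the previous smoothed point.
            prev_x, prev_y = smoothed[-1]
            x = alpha * x + (1 - alpha) * prev_x
            y = alpha * y + (1 - alpha) * prev_y
        smoothed.append((x, y))
    return smoothed


# A jittery horizontal stroke: x jumps between 0 and 4.
raw = [(0.0, 0.0), (4.0, 0.0), (0.0, 0.0), (4.0, 0.0)]
print(smooth_path(raw))
# [(0.0, 0.0), (2.0, 0.0), (1.0, 0.0), (2.5, 0.0)]
```

&lt;p&gt;The smoothed points swing over a narrower range than the raw samples, which is what makes the rendered stroke look steadier. The trade-off is a small lag that grows as &lt;code&gt;alpha&lt;/code&gt; shrinks.&lt;/p&gt;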




&lt;h3&gt;Why this is the Future of AI Integration&lt;/h3&gt;

&lt;p&gt;By moving the computation from the server to the client:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Absolute Privacy&lt;/strong&gt;: No video frames, coordinate points, or document bytes are transmitted to any server or database.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero Server Overhead&lt;/strong&gt;: The hosting cost for this app is exactly $0 since it runs on the user's CPU/GPU.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Instant Load Times&lt;/strong&gt;: Zero network latency during signature interpolation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Try it Yourself&lt;/h3&gt;

&lt;p&gt;AirSign is completely open and free to test. We’d love to hear your feedback on the hand-tracking latency and mobile performance:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;Try AirSign&lt;/strong&gt;: &lt;a href="https://airsign-red.vercel.app/" rel="noopener noreferrer"&gt;https://airsign-red.vercel.app/&lt;/a&gt;&lt;br&gt;
👉 &lt;strong&gt;Explore our work&lt;/strong&gt;: &lt;a href="https://anantalabs.app/" rel="noopener noreferrer"&gt;https://anantalabs.app/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>ai</category>
      <category>computervision</category>
    </item>
  </channel>
</rss>
