<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Michael Liu</title>
    <description>The latest articles on DEV Community by Michael Liu (@voqusa).</description>
    <link>https://dev.to/voqusa</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3916315%2F74f6e881-8106-4bed-b2c0-248856a6b767.jpg</url>
      <title>DEV Community: Michael Liu</title>
      <link>https://dev.to/voqusa</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/voqusa"/>
    <language>en</language>
    <item>
      <title>Built a Chrome extension to right-click any social video and get an AI transcript</title>
      <dc:creator>Michael Liu</dc:creator>
      <pubDate>Thu, 28 May 2026 11:24:05 +0000</pubDate>
      <link>https://dev.to/voqusa/built-a-chrome-extension-to-right-click-any-social-video-and-get-an-ai-transcript-2e5f</link>
      <guid>https://dev.to/voqusa/built-a-chrome-extension-to-right-click-any-social-video-and-get-an-ai-transcript-2e5f</guid>
      <description>&lt;p&gt;I shipped a Chrome extension this week that I've been wanting for a while: right-click any social video on a page and get a clean transcript back, without leaving the tab.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I built this
&lt;/h2&gt;

&lt;p&gt;I write content about short-form video and spend a lot of time pulling hooks, scripts, and outlines out of TikToks, Reels, and Shorts. The flow used to be:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Copy URL&lt;/li&gt;
&lt;li&gt;Open another transcription tool&lt;/li&gt;
&lt;li&gt;Paste, wait, download&lt;/li&gt;
&lt;li&gt;Tab back to where I was&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's a lot of context switches for "I just want the words from this video." A right-click context menu felt like the right interface.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://chromewebstore.google.com/detail/voqusa-social-video-trans/ifojkfjgiombchkkijefngjhgbmehhmo" rel="noopener noreferrer"&gt;Voqusa Chrome Extension&lt;/a&gt; is the result. Right-click a video on any web page (or paste a video URL into the popup) and a transcript appears in 30–60 seconds.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Supports TikTok, YouTube Shorts, Instagram Reels, Facebook, Twitter/X, LinkedIn, Pinterest&lt;/li&gt;
&lt;li&gt;Whisper-grade speech-to-text on the backend&lt;/li&gt;
&lt;li&gt;Anonymous users get 3 free transcripts; sign in at &lt;a href="https://www.voqusa.com" rel="noopener noreferrer"&gt;voqusa.com&lt;/a&gt; to use the full free tier or pay-as-you-go credits&lt;/li&gt;
&lt;li&gt;133 KiB install, no tracking pixels&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;The extension itself is tiny — it just hands the video URL to the cloud transcription API. The constraints I cared about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Privacy:&lt;/strong&gt; the extension only sends the URL you explicitly submit. No DOM, no cookies, no browsing history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anonymous-first:&lt;/strong&gt; transcripts are stored against a local device ID, so you can try it without signing in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fast:&lt;/strong&gt; Whisper-grade model, and only the audio is downloaded (not the full video).&lt;/li&gt;
&lt;/ul&gt;
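
&lt;p&gt;As a rough illustration of the privacy constraint, here is a minimal sketch (in Python for brevity, not the extension's actual code). The field names, the &lt;code&gt;SUPPORTED_HOSTS&lt;/code&gt; list, and &lt;code&gt;build_transcript_request&lt;/code&gt; are hypothetical; the point is only that the payload carries the submitted URL and a local device ID, nothing else.&lt;/p&gt;

```python
from urllib.parse import urlparse

# Hypothetical host list based on the platforms named above; the real
# extension may match different domains.
SUPPORTED_HOSTS = {
    "tiktok.com", "youtube.com", "instagram.com", "facebook.com",
    "twitter.com", "x.com", "linkedin.com", "pinterest.com",
}


def build_transcript_request(video_url, device_id):
    """Return a payload holding only the submitted URL and a local device ID."""
    parsed = urlparse(video_url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("expected an http(s) URL")
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host not in SUPPORTED_HOSTS:
        raise ValueError("unsupported platform: " + host)
    # Nothing else is attached: no page DOM, no cookies, no history.
    return {"url": video_url, "device_id": device_id}
```

&lt;p&gt;Rejecting unknown hosts before any request is made keeps unrelated page URLs from ever leaving the browser in this sketch.&lt;/p&gt;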

&lt;h2&gt;
  
  
  What I'm using it for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Pulling hooks and outlines from competitor short-form videos&lt;/li&gt;
&lt;li&gt;Turning lecture clips into citable text&lt;/li&gt;
&lt;li&gt;Building swipe files of high-performing scripts&lt;/li&gt;
&lt;li&gt;Reading along when audio isn't an option (open offices, late nights)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you find yourself transcribing social videos often, give it a try and let me know what's missing.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>showdev</category>
      <category>socialmedia</category>
    </item>
    <item>
      <title>Transcribing TikTok and short-form social videos: a quick comparison of approaches</title>
      <dc:creator>Michael Liu</dc:creator>
      <pubDate>Sun, 10 May 2026 14:06:17 +0000</pubDate>
      <link>https://dev.to/voqusa/transcribing-tiktok-and-short-form-social-videos-a-quick-comparison-of-approaches-1p3m</link>
      <guid>https://dev.to/voqusa/transcribing-tiktok-and-short-form-social-videos-a-quick-comparison-of-approaches-1p3m</guid>
      <description>&lt;p&gt;When I started analyzing viral content for a side project, I assumed transcription would be the easy part. It's not — at least not for short-form social video. Here's what I learned trying a few different approaches.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem with file-based tools
&lt;/h2&gt;

&lt;p&gt;Most popular transcription tools (Otter, Descript, &lt;a href="https://videotranscriber.ai/" rel="noopener noreferrer"&gt;VideoTranscriber.ai&lt;/a&gt;, Whisper-based desktop apps) expect you to feed them an audio or video &lt;strong&gt;file&lt;/strong&gt;. That's fine for podcasts, Zoom recordings, or YouTube long-form videos you've already downloaded. But for TikTok / Reels / Shorts you usually start with a &lt;strong&gt;public URL&lt;/strong&gt;, and converting that into a file means:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find or pay for a TikTok/IG/X video downloader&lt;/li&gt;
&lt;li&gt;Wait for the download&lt;/li&gt;
&lt;li&gt;Upload to the transcription tool&lt;/li&gt;
&lt;li&gt;Wait again for the transcription&lt;/li&gt;
&lt;li&gt;Repeat for every single clip&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For a 30-clip swipe file that's a real time sink.&lt;/p&gt;

&lt;h2&gt;
  
  
  URL-native transcription
&lt;/h2&gt;

&lt;p&gt;The approach I ended up using is &lt;a href="https://www.voqusa.com" rel="noopener noreferrer"&gt;Voqusa&lt;/a&gt; — you paste the public URL of the video and it returns the transcript. Supports TikTok, YouTube, Instagram, Facebook, Twitter/X, LinkedIn, and Pinterest. Captions are free; speech-to-text is pay-as-you-go (no subscription) and failed transcripts cost zero credits, which is a nice detail when you're testing it on borderline-quality audio.&lt;/p&gt;

&lt;p&gt;Support for 14 languages also helped when I was looking at Spanish and Portuguese creators in the same niche.&lt;/p&gt;

&lt;h2&gt;
  
  
  When each fits
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;File-based tools (Descript, VideoTranscriber.ai, Otter):&lt;/strong&gt; long-form, multi-speaker, podcasts, meetings, anything you already have on disk. Editor features matter most here.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;URL-based tools (Voqusa):&lt;/strong&gt; short-form social, viral analysis, content repurposing, quick research where you just need the text fast.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not a strict either/or — I use both depending on the input I'm starting from.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tradeoffs to be aware of
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;URL-based tools depend on the social platform's public access. If a creator's account is private, you'll need a downloader anyway.&lt;/li&gt;
&lt;li&gt;For very low-volume use, captions-only mode (free on Voqusa) is enough. If you need diarization or punctuation cleanup, file-based editors are still ahead.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Mostly posting this so I stop getting DMs asking how I'm pulling 50+ TikTok transcripts a week without losing my mind.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>tutorial</category>
      <category>contentcreators</category>
    </item>
    <item>
      <title>OCR for handwriting and math: comparing tools in 2026</title>
      <dc:creator>Michael Liu</dc:creator>
      <pubDate>Sun, 10 May 2026 07:48:50 +0000</pubDate>
      <link>https://dev.to/voqusa/ocr-for-handwriting-and-math-comparing-tools-in-2026-b6m</link>
      <guid>https://dev.to/voqusa/ocr-for-handwriting-and-math-comparing-tools-in-2026-b6m</guid>
      <description>&lt;p&gt;If you've ever tried to OCR handwritten notes or math equations from a screenshot, you know the standard tools (Google Vision, Tesseract, AWS Textract) all hit a wall once you leave printed Latin text.&lt;/p&gt;

&lt;p&gt;I spent some time benchmarking what's out there in 2026. Here's what's actually working.&lt;/p&gt;

&lt;h2&gt;
  
  
  What breaks in generic OCR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Handwriting&lt;/strong&gt; — especially cursive in non-Latin scripts. Most OCRs were trained on printed text and treat ligatures as noise.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Math equations&lt;/strong&gt; — generic OCR returns "x2 + y2 = 1" instead of &lt;code&gt;x² + y² = 1&lt;/code&gt; or LaTeX.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tables&lt;/strong&gt; — column structure flattens into a paragraph; you lose the relationships.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CJK&lt;/strong&gt; — character recognition is OK; vertical-text and traditional-character handling are not.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Tools I tried
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://scanread.ai" rel="noopener noreferrer"&gt;ScanRead.ai&lt;/a&gt; — free OCR for the gap cases
&lt;/h3&gt;

&lt;p&gt;Built on PP-OCRv5 + PaddleOCR-VL (~0.9B params). It has a dedicated &lt;strong&gt;Math → LaTeX&lt;/strong&gt; path that preserves multi-line derivations when there's clear bracketing, and CJK accuracy that's competitive with Vision/Textract on my test set. It offers 22 specialized tools (handwriting, receipts, tables, etc.). The free tier is 20 pages/day; Pro starts at $10/mo for batch + watermark-free export.&lt;/p&gt;

&lt;h3&gt;
  
  
  Google Cloud Vision API
&lt;/h3&gt;

&lt;p&gt;Best general-purpose OCR for printed Latin text. Falls apart on handwriting and math structure. ~$1.50 / 1000 pages.&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS Textract
&lt;/h3&gt;

&lt;p&gt;Strongest on tables and forms in printed documents. Math support is essentially nonexistent. Pricier.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mistral OCR (released earlier this year)
&lt;/h3&gt;

&lt;p&gt;Strong on document layout. Fewer specialized routes than purpose-built tools.&lt;/p&gt;

&lt;h3&gt;
  
  
  Tesseract (open source)
&lt;/h3&gt;

&lt;p&gt;Free, but in 2026 its main use case is "I need to OCR something offline." Quality on handwriting is poor.&lt;/p&gt;

&lt;h2&gt;
  
  
  Picking one
&lt;/h2&gt;

&lt;p&gt;For most indie/dev use cases I'd lean on &lt;strong&gt;ScanRead&lt;/strong&gt; for the free tier and CJK + math; &lt;strong&gt;Vision&lt;/strong&gt; if you're processing printed English at volume; and &lt;strong&gt;Textract&lt;/strong&gt; if you have heavy form-extraction needs.&lt;/p&gt;

&lt;p&gt;What's your stack? Curious what people are using for handwriting specifically — that's still the hardest case for me.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>productivity</category>
      <category>software</category>
    </item>
    <item>
      <title>OCR is back: replacing Tesseract with PP-OCRv5 in my document pipelines</title>
      <dc:creator>Michael Liu</dc:creator>
      <pubDate>Fri, 08 May 2026 14:23:58 +0000</pubDate>
      <link>https://dev.to/voqusa/ocr-is-back-replacing-tesseract-with-pp-ocrv5-in-my-document-pipelines-15og</link>
      <guid>https://dev.to/voqusa/ocr-is-back-replacing-tesseract-with-pp-ocrv5-in-my-document-pipelines-15og</guid>
      <description>&lt;h2&gt;
  
  
  OCR is back: how I'm replacing Tesseract with PP-OCRv5 in my pipelines
&lt;/h2&gt;

&lt;p&gt;I've been wrangling OCR pipelines for years — Tesseract for plain text, Google Vision when CJK comes up, AWS Textract for tables. Each has its own pain (Tesseract drops handwritten characters, Vision is pricey at scale, Textract's bbox layout is opinionated).&lt;/p&gt;

&lt;p&gt;Recently I've been quietly piping a lot of work through &lt;a href="https://scanread.ai" rel="noopener noreferrer"&gt;ScanRead.ai&lt;/a&gt; instead. It's a free OCR tool built on &lt;strong&gt;PP-OCRv5&lt;/strong&gt; and the new &lt;strong&gt;PaddleOCR-VL&lt;/strong&gt; model. Here's what changed for me.&lt;/p&gt;

&lt;h3&gt;
  
  
  What it actually does
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Image → text in 100+ languages (including Arabic, Japanese, Chinese, Hindi, Thai)&lt;/li&gt;
&lt;li&gt;22 specialized tools: image-to-text, PDF-to-Word, screenshot-to-text, handwriting recognition, math-to-LaTeX, receipt OCR&lt;/li&gt;
&lt;li&gt;Outputs to .txt, .md, or .docx — Markdown export is great for pipelines into Notion or Obsidian&lt;/li&gt;
&lt;li&gt;Free tier is generous: 20 pages/day, no signup&lt;/li&gt;
&lt;li&gt;Pro is $10/mo for 3,000 pages with batch (up to 20 files at once)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Where it shined for me
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Handwritten meeting notes.&lt;/strong&gt; Tesseract gives me garbage on cursive. ScanRead reconstructed three pages of a colleague's whiteboard photos with maybe two errors per page. That's the difference between "useful" and "I'll just retype it."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CJK receipts.&lt;/strong&gt; I had a folder of Japanese receipts to reconcile. PaddleOCR-VL handles vertical text and mixed kanji/kana way better than I expected — competitive with Google Vision in my spot-check, at zero cost.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Math → LaTeX.&lt;/strong&gt; Pasting screenshots of equations from PDFs and getting back LaTeX source is the kind of small thing that saves a real amount of time over a week.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where it's weaker
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Layout reconstruction for complex multi-column PDFs is okay but Textract is still better for forms with deep nested tables.&lt;/li&gt;
&lt;li&gt;The free tier is rate-limited per day, not per minute — fine for humans, awkward for batch jobs.&lt;/li&gt;
&lt;li&gt;No public API yet (as of writing); Pro batch UI is the workaround.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why I'm sharing
&lt;/h3&gt;

&lt;p&gt;If you're paying for Vision/Textract for occasional OCR, try the free tier first. If you do batch scans, the $10/mo Pro plan can undercut both, depending on your volume and which features you use. Link: &lt;a href="https://scanread.ai" rel="noopener noreferrer"&gt;https://scanread.ai&lt;/a&gt;&lt;/p&gt;
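
&lt;p&gt;A quick sanity check on pricing is to convert the flat plan into a per-1,000-page rate, using the $10 and 3,000-page Pro figures above. The helper name is only illustrative, and the result assumes the full monthly quota is used.&lt;/p&gt;

```python
def cost_per_thousand(monthly_price, pages_per_month):
    """Effective price per 1,000 pages for a flat monthly plan."""
    return monthly_price / pages_per_month * 1000


pro_rate = cost_per_thousand(10.0, 3000)
print(round(pro_rate, 2))  # prints 3.33
```

&lt;p&gt;Compare that figure with the current per-1,000-page rate on your provider's price sheet for the features you actually use.&lt;/p&gt;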

&lt;p&gt;Curious if anyone else has switched off Tesseract for handwriting. What's your stack?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>machinelearning</category>
      <category>tooling</category>
    </item>
    <item>
      <title>How I Turn TikTok Videos into Searchable Transcripts in Seconds (Free Tool)</title>
      <dc:creator>Michael Liu</dc:creator>
      <pubDate>Wed, 06 May 2026 16:14:45 +0000</pubDate>
      <link>https://dev.to/voqusa/how-i-turn-tiktok-videos-into-searchable-transcripts-in-seconds-free-tool-h6</link>
      <guid>https://dev.to/voqusa/how-i-turn-tiktok-videos-into-searchable-transcripts-in-seconds-free-tool-h6</guid>
      <description>&lt;h2&gt;
  
  
  Why I needed transcripts
&lt;/h2&gt;

&lt;p&gt;I spend a lot of time studying short-form video — TikTok hooks, YouTube Shorts, Instagram Reels — and the part I actually want is the &lt;strong&gt;script&lt;/strong&gt;, not the video. Re-watching to copy down a 30-second hook is painful, and most "free transcript tools" hide behind a signup wall or only work on YouTube.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;&lt;a href="https://www.voqusa.com" rel="noopener noreferrer"&gt;Voqusa&lt;/a&gt;&lt;/strong&gt; — paste a TikTok / YouTube / Instagram / Facebook / Twitter / LinkedIn / Pinterest URL, get the transcript instantly. No signup, no paywall on captions.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Paste the video URL.&lt;/li&gt;
&lt;li&gt;Voqusa pulls the audio + any embedded captions.&lt;/li&gt;
&lt;li&gt;AI speech-to-text fills in the rest (14 languages supported).&lt;/li&gt;
&lt;li&gt;Copy the text and search/repurpose/study it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A few deliberate choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No account required for caption-based transcripts.&lt;/strong&gt; You only spend a credit when the AI has to do speech-to-text from scratch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failed transcripts cost 0 credits.&lt;/strong&gt; If we can't pull it, you don't pay.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy:&lt;/strong&gt; URLs and transcripts aren't kept after your session ends.&lt;/li&gt;
&lt;/ul&gt;
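
&lt;p&gt;The caption-first flow and the credit rules above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Voqusa's actual code: the function name, its arguments, and the one-credit-per-transcript cost are hypothetical.&lt;/p&gt;

```python
def transcribe(captions, speech_to_text, credits):
    """Return (transcript, credits_left) using captions first, then STT.

    captions: caption text pulled from the video, or None if there is none.
    speech_to_text: zero-argument callable that returns text or raises.
    credits: number of credits available (assumed cost: one per STT run).
    """
    if captions:
        # Caption-based transcripts are free.
        return captions, credits
    if credits == 0:
        raise RuntimeError("no credits left for speech-to-text")
    try:
        text = speech_to_text()
    except Exception:
        # A failed transcript costs 0 credits.
        return None, credits
    return text, credits - 1
```

&lt;p&gt;Deducting the credit only after speech-to-text succeeds is what makes a failed run free in this sketch.&lt;/p&gt;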

&lt;h2&gt;
  
  
  What I use it for
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Reverse-engineering viral hooks (collect 50 transcripts, find patterns)&lt;/li&gt;
&lt;li&gt;Building swipe files of proven video structures&lt;/li&gt;
&lt;li&gt;Summarizing podcast clips into LinkedIn posts&lt;/li&gt;
&lt;li&gt;Accessibility — adding text alternatives to video content&lt;/li&gt;
&lt;/ul&gt;
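
&lt;p&gt;For the hook-pattern use case, one simple starting point is to count the most common opening phrases across a batch of transcripts. The sketch below assumes the transcripts are already collected as plain strings; &lt;code&gt;common_openers&lt;/code&gt; and its parameters are illustrative names, not part of Voqusa.&lt;/p&gt;

```python
from collections import Counter


def common_openers(transcripts, n_words=3, top=5):
    """Count the most frequent opening phrases across transcripts."""
    counts = Counter()
    for text in transcripts:
        words = text.lower().split()
        if len(words) == 0:
            # Skip empty transcripts.
            continue
        counts[" ".join(words[:n_words])] += 1
    return counts.most_common(top)
```

&lt;p&gt;Calling &lt;code&gt;common_openers(transcripts)&lt;/code&gt; on a list of 50 transcripts returns up to five (phrase, count) pairs, which is usually enough to see whether a few openers dominate.&lt;/p&gt;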

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;If you ever wanted "Ctrl+F for video," it's at &lt;strong&gt;&lt;a href="https://www.voqusa.com" rel="noopener noreferrer"&gt;voqusa.com&lt;/a&gt;&lt;/strong&gt;. Captions are free; speech-to-text is pay-as-you-go (no subscription, credits valid 12 months). Curious if anyone has other use cases — drop them in the comments.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>tutorial</category>
      <category>tools</category>
    </item>
  </channel>
</rss>
