<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lahla</title>
    <description>The latest articles on DEV Community by Lahla (@suktoyeajs4).</description>
    <link>https://dev.to/suktoyeajs4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3713337%2Fa27546d3-3c1b-457e-9cfc-cb56311a7dda.png</url>
      <title>DEV Community: Lahla</title>
      <link>https://dev.to/suktoyeajs4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/suktoyeajs4"/>
    <language>en</language>
    <item>
      <title>Tesseract OCR</title>
      <dc:creator>Lahla</dc:creator>
      <pubDate>Thu, 15 Jan 2026 19:34:56 +0000</pubDate>
      <link>https://dev.to/suktoyeajs4/tesseract-ocr-4h92</link>
      <guid>https://dev.to/suktoyeajs4/tesseract-ocr-4h92</guid>
      <description>&lt;p&gt;I scan a lot of paper documents for work: invoices, handwritten notes, old contracts. On macOS, this sounds trivial until you actually try to extract clean, editable text from mixed-quality scans. Preview can copy text sometimes, but it breaks the moment the scan isn’t perfect. That’s where I ended up with &lt;strong&gt;gImageReader&lt;/strong&gt; after a few frustrating evenings.&lt;/p&gt;

&lt;p&gt;The core problem I had was accuracy and control. I didn’t just want “some text”: I needed to choose the recognition language, the page regions, and the output format without uploading sensitive documents to random cloud services.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why gImageReader solved my actual problem
&lt;/h3&gt;

&lt;p&gt;gImageReader is a graphical frontend for &lt;strong&gt;Tesseract OCR&lt;/strong&gt;, which is still one of the most reliable open-source OCR engines. The difference compared to online tools is immediate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Works completely offline&lt;/li&gt;
&lt;li&gt;Lets you define recognition areas manually&lt;/li&gt;
&lt;li&gt;Supports multiple languages in one document&lt;/li&gt;
&lt;li&gt;Doesn’t mangle formatting as aggressively as online tools do&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I was scanning multilingual PDFs (English + German), and cloud OCR tools kept guessing wrong. With gImageReader, I explicitly selected language packs and ran OCR per page. The error rate dropped noticeably.&lt;/p&gt;
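
&lt;p&gt;For reference, the same explicit language selection can be sketched with the Tesseract command line directly. This is a minimal sketch under assumptions, not what gImageReader runs internally: &lt;code&gt;lang_spec&lt;/code&gt; and &lt;code&gt;ocr_command&lt;/code&gt; are hypothetical helper names, the file names are placeholders, and the German pack (&lt;code&gt;deu&lt;/code&gt;) is assumed to be installed. The command is printed rather than executed.&lt;/p&gt;

```shell
# Join Tesseract language codes with "+", the form the CLI's -l option accepts.
# lang_spec is a hypothetical helper name, not part of Tesseract.
lang_spec() {
  local IFS='+'
  printf '%s\n' "$*"
}

# Print (rather than run) the OCR command for one page image, so the sketch
# also works on a machine without Tesseract installed.
# Arguments: image path, output base name, then one or more language codes.
ocr_command() {
  local image="$1" out_base="$2"
  shift 2
  printf 'tesseract %s %s -l %s\n' "$image" "$out_base" "$(lang_spec "$@")"
}

# Placeholder file names; English plus German as in the scans described above.
ocr_command page-01.png page-01 eng deu
```

&lt;p&gt;Running &lt;code&gt;tesseract --list-langs&lt;/code&gt; shows which language packs are actually installed. With Homebrew, additional languages usually come from the separate &lt;code&gt;tesseract-lang&lt;/code&gt; formula.&lt;/p&gt;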

&lt;p&gt;Official project resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/manisandro/gImageReader" rel="noopener noreferrer"&gt;https://github.com/manisandro/gImageReader&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://tesseract-ocr.github.io/" rel="noopener noreferrer"&gt;https://tesseract-ocr.github.io/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  macOS-specific issues I ran into (and fixed)
&lt;/h3&gt;

&lt;p&gt;This is where real-world usage matters. On macOS, gImageReader doesn’t always “just work” out of the box.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem 1: App launches but OCR does nothing&lt;/strong&gt;&lt;br&gt;
Cause: macOS permissions combined with a missing Tesseract path.&lt;/p&gt;

&lt;p&gt;Fix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install Tesseract via Homebrew:
&lt;a href="https://brew.sh" rel="noopener noreferrer"&gt;https://brew.sh&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;tesseract
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;In gImageReader settings, manually point to &lt;code&gt;/opt/homebrew/bin/tesseract&lt;/code&gt; (Apple Silicon) or &lt;code&gt;/usr/local/bin/tesseract&lt;/code&gt; (Intel Macs)&lt;/li&gt;
&lt;/ul&gt;
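
&lt;p&gt;If it is unclear which path applies, a short check along these lines can confirm where Homebrew put the binary. It is a sketch only: &lt;code&gt;find_tesseract&lt;/code&gt; is a hypothetical helper, and the two candidate paths assume Homebrew’s default prefixes (&lt;code&gt;/opt/homebrew&lt;/code&gt; on Apple Silicon, &lt;code&gt;/usr/local&lt;/code&gt; on Intel). A custom prefix would need a different list.&lt;/p&gt;

```shell
# find_tesseract is a hypothetical helper: it prints the first candidate path
# that exists and is executable, and returns 1 if none of them is.
find_tesseract() {
  local candidate
  for candidate in "$@"; do
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Assumed default Homebrew locations: Apple Silicon first, then Intel.
if path="$(find_tesseract /opt/homebrew/bin/tesseract /usr/local/bin/tesseract)"; then
  echo "Tesseract found at: $path"
else
  echo "Tesseract not found in the default Homebrew locations"
fi
```

&lt;p&gt;Whatever path it prints is the one to enter in the settings. If it reports nothing, the install step above probably did not complete.&lt;/p&gt;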

&lt;p&gt;&lt;strong&gt;Problem 2: No access to scanned PDFs&lt;/strong&gt;&lt;br&gt;
macOS blocks filesystem access silently.&lt;/p&gt;

&lt;p&gt;Fix:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;System Settings → Privacy &amp;amp; Security → Files and Folders&lt;/li&gt;
&lt;li&gt;Allow gImageReader access to Documents / Desktop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Apple’s official documentation on app permissions explains this behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://support.apple.com/guide/mac-help/control-access-to-files-and-folders-mchld5a35146/mac" rel="noopener noreferrer"&gt;https://support.apple.com/guide/mac-help/control-access-to-files-and-folders-mchld5a35146/mac&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Real workflow that worked for me
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Scan documents as grayscale PDF (not color)&lt;/li&gt;
&lt;li&gt;Open PDF directly in gImageReader&lt;/li&gt;
&lt;li&gt;Manually select text-heavy regions&lt;/li&gt;
&lt;li&gt;Run OCR per page instead of bulk&lt;/li&gt;
&lt;li&gt;Export as plain text or searchable PDF&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This avoids the “OCR soup” you get from one-click tools and keeps the result usable.&lt;/p&gt;
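
&lt;p&gt;The per-page part of this workflow can also be approximated on the command line. The sketch below is an assumption-heavy stand-in for what I did in the GUI: &lt;code&gt;plan_ocr&lt;/code&gt; is a hypothetical helper, the page file names are placeholders, and it assumes the PDF has already been split into grayscale page images (for example with poppler’s &lt;code&gt;pdftoppm&lt;/code&gt;). It prints one Tesseract command per page for review instead of running anything, and it does not cover manual region selection.&lt;/p&gt;

```shell
# plan_ocr is a hypothetical helper. It prints one tesseract command per page
# image instead of executing it, so each page can be reviewed and run on its own.
# Arguments: output format (txt or pdf), language spec, then the page images.
plan_ocr() {
  local format="$1" langs="$2" image
  shift 2
  for image in "$@"; do
    # The output base is the image path without its extension.
    printf 'tesseract %s %s -l %s %s\n' "$image" "${image%.*}" "$langs" "$format"
  done
}

# Assumed preparation step (not run here): rasterise the scan to grayscale PNGs,
# for example with:  pdftoppm -gray -r 300 -png scan.pdf page
plan_ocr txt eng+deu page-1.png page-2.png
```

&lt;p&gt;Passing &lt;code&gt;pdf&lt;/code&gt; instead of &lt;code&gt;txt&lt;/code&gt; as the first argument asks Tesseract for a searchable PDF per page rather than plain text.&lt;/p&gt;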




&lt;h3&gt;
  
  
  When gImageReader makes sense (and when it doesn’t)
&lt;/h3&gt;

&lt;p&gt;It’s worth using if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You care about privacy&lt;/li&gt;
&lt;li&gt;You handle multilingual documents&lt;/li&gt;
&lt;li&gt;You want control over OCR behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It’s not ideal if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You expect a polished macOS-native UI&lt;/li&gt;
&lt;li&gt;You want zero configuration&lt;/li&gt;
&lt;li&gt;You rely on handwriting recognition (Tesseract is trained mainly on printed text)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Reference link used during setup
&lt;/h3&gt;

&lt;p&gt;I originally found the macOS build and usage notes here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://vbpyz.com/office-and-productivity/82791-gimagereader.html" rel="noopener noreferrer"&gt;https://vbpyz.com/office-and-productivity/82791-gimagereader.html&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That page helped confirm package details and avoid incompatible builds before installing.&lt;/p&gt;




&lt;h3&gt;
  
  
  Final takeaway
&lt;/h3&gt;

&lt;p&gt;gImageReader isn’t flashy, but it does one thing very well: reliable OCR without cloud dependency. If you’ve been fighting with inaccurate text extraction on macOS and don’t want your documents leaving your machine, this tool is still one of the most practical solutions, as long as you’re willing to spend a few minutes configuring it properly.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>privacy</category>
      <category>productivity</category>
      <category>tooling</category>
    </item>
  </channel>
</rss>
