<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: joe wang</title>
    <description>The latest articles on DEV Community by joe wang (@joe_wang_6a4a3e51566e8b52).</description>
    <link>https://dev.to/joe_wang_6a4a3e51566e8b52</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3782706%2Fad97c097-2ca6-4cab-854e-22a1c039feab.png</url>
      <title>DEV Community: joe wang</title>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/joe_wang_6a4a3e51566e8b52"/>
    <language>en</language>
    <item>
      <title>Optimizing OCR Performance on Mobile: From 5 Seconds to Under 1 Second</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Sat, 21 Feb 2026 04:39:32 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/optimizing-ocr-performance-on-mobile-from-5-seconds-to-under-1-second-332m</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/optimizing-ocr-performance-on-mobile-from-5-seconds-to-under-1-second-332m</guid>
      <description>&lt;p&gt;OCR on mobile needs to be fast. Users expect results in under 2 seconds. When I started building &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt;, our initial OCR pipeline took 4-5 seconds per screen capture. That's an eternity when you're trying to read a game menu or translate a chat message in real time.&lt;/p&gt;

&lt;p&gt;Here's how we got it down to under 1 second on modern devices.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bottlenecks
&lt;/h2&gt;

&lt;p&gt;Before optimizing, we profiled the pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Screen capture&lt;/strong&gt;: ~200ms (MediaProjection API)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image preprocessing&lt;/strong&gt;: ~800ms 😱&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OCR inference&lt;/strong&gt;: ~2500ms 😱😱&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Translation API call&lt;/strong&gt;: ~500ms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UI rendering&lt;/strong&gt;: ~100ms&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Total: ~4100ms. Steps 2 and 3 were the obvious targets.&lt;/p&gt;
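&lt;p&gt;If you want to collect per-stage numbers like these yourself, a few lines of Kotlin go a long way. The sketch below is illustrative only; our production profiler hooks into the real pipeline classes, and the &lt;code&gt;StageProfiler&lt;/code&gt; name is made up for this post:&lt;/p&gt;

```kotlin
import kotlin.system.measureNanoTime

// Minimal stage profiler: run each pipeline stage through time(),
// then read back per-stage wall-clock durations in milliseconds.
class StageProfiler {
    // Seeded with "total" so Kotlin infers the String-to-Long map type.
    private val stages = mutableMapOf("total" to 0L)

    fun time(name: String, block: () -> Unit) {
        val ms = measureNanoTime(block) / 1_000_000
        stages[name] = ms
        stages["total"] = stages.getValue("total") + ms
    }

    fun millis(name: String): Long = stages.getValue(name)
}
```

&lt;p&gt;Wrap each stage call (&lt;code&gt;captureScreen&lt;/code&gt;, preprocessing, OCR, translation) in &lt;code&gt;time(...)&lt;/code&gt; and log the map; that is enough to find a 4-second bottleneck.&lt;/p&gt;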

&lt;h2&gt;
  
  
  Optimization 1: Smart Image Downscaling
&lt;/h2&gt;

&lt;p&gt;The biggest win came from not feeding full-resolution screenshots to the OCR engine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;optimizeForOCR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;maxDimension&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1280&lt;/span&gt; &lt;span class="c1"&gt;// Sweet spot for accuracy vs speed&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;scale&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;minOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;maxDimension&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toFloat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;maxDimension&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toFloat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;/&lt;/span&gt; &lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="mf"&gt;1f&lt;/span&gt; &lt;span class="c1"&gt;// Don't upscale&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;1f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bitmap&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createScaledBitmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toInt&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toInt&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="k"&gt;true&lt;/span&gt; &lt;span class="c1"&gt;// Bilinear filtering&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A 2400x1080 screenshot scaled to 1280x576 processes 3x faster with negligible accuracy loss for screen text.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: Image preprocessing dropped from 800ms to 250ms.&lt;/p&gt;
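&lt;p&gt;The dimension math is plain arithmetic, so it is easy to unit-test off-device. Here is the same computation as a pure function; &lt;code&gt;ocrTargetSize&lt;/code&gt; is an illustrative name, and no Android &lt;code&gt;Bitmap&lt;/code&gt; is involved:&lt;/p&gt;

```kotlin
// Mirrors the scaling logic above: cap the longer side at maxDimension,
// keep the aspect ratio, never upscale. Returns intArrayOf(width, height).
fun ocrTargetSize(width: Int, height: Int, maxDimension: Int = 1280): IntArray {
    val scale = minOf(
        maxDimension.toFloat() / width,
        maxDimension.toFloat() / height,
        1f // never upscale
    )
    return intArrayOf((width * scale).toInt(), (height * scale).toInt())
}
```

&lt;p&gt;With the example above, &lt;code&gt;ocrTargetSize(2400, 1080)&lt;/code&gt; gives 1280x576, and smaller inputs pass through unchanged.&lt;/p&gt;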

&lt;h2&gt;
  
  
  Optimization 2: Region of Interest (ROI) Detection
&lt;/h2&gt;

&lt;p&gt;Why OCR the entire screen when the user only cares about a specific area?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;detectTextRegions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;List&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Rect&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Convert to grayscale&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;gray&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;toGrayscale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// Apply adaptive threshold&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;binary&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;adaptiveThreshold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gray&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;// Find contours and merge nearby text blocks&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;contours&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;findContours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;binary&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;mergeNearbyContours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contours&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mergeDistance&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By detecting text regions first (which is fast — ~50ms), we only run the expensive OCR on areas that actually contain text. For a typical app screen, this means processing 30-40% of the image instead of 100%.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: OCR inference dropped from 2500ms to ~800ms.&lt;/p&gt;
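&lt;p&gt;The helpers above are elided, so here is a self-contained sketch of the merge step. It represents boxes as &lt;code&gt;intArrayOf(left, top, right, bottom)&lt;/code&gt; instead of &lt;code&gt;android.graphics.Rect&lt;/code&gt; so it runs off-device; the union-until-stable loop is our assumption about the approach, not the exact production code:&lt;/p&gt;

```kotlin
// Two boxes are "near" if expanding either one by `gap` pixels would make
// them intersect. `and` is a non-short-circuit boolean combine; both sides
// are cheap, so that is fine here.
fun nearOrOverlap(a: IntArray, b: IntArray, gap: Int): Boolean {
    val xOk = minOf(a[2], b[2]) + gap >= maxOf(a[0], b[0])
    val yOk = minOf(a[3], b[3]) + gap >= maxOf(a[1], b[1])
    return xOk and yOk
}

// Smallest box covering both inputs.
fun union(a: IntArray, b: IntArray) = intArrayOf(
    minOf(a[0], b[0]), minOf(a[1], b[1]),
    maxOf(a[2], b[2]), maxOf(a[3], b[3])
)

// Fold each box into the accumulator, repeatedly absorbing any region
// within `gap` pixels until nothing else is close enough.
fun mergeNearby(gap: Int, vararg boxes: IntArray) = buildList {
    for (box in boxes) {
        var merged = box
        var hit = firstOrNull { nearOrOverlap(it, merged, gap) }
        while (hit != null) {
            remove(hit)
            merged = union(hit, merged)
            hit = firstOrNull { nearOrOverlap(it, merged, gap) }
        }
        add(merged)
    }
}
```

&lt;p&gt;Tuning the merge distance trades fewer OCR calls against accidentally fusing unrelated lines into one region.&lt;/p&gt;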

&lt;h2&gt;
  
  
  Optimization 3: ML Kit On-Device vs Cloud
&lt;/h2&gt;

&lt;p&gt;We use Google ML Kit's on-device text recognition as the default. It's free, fast, and works offline. For CJK languages (Chinese, Japanese, Korean), we use the V2 API, which has significantly better accuracy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;recognizer&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextRecognition&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scriptType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;ScriptType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;LATIN&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;TextRecognizerOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DEFAULT_OPTIONS&lt;/span&gt;
        &lt;span class="nc"&gt;ScriptType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;CJK&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;ChineseTextRecognizerOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Builder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nc"&gt;ScriptType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;KOREAN&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;KoreanTextRecognizerOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Builder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nc"&gt;ScriptType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;JAPANESE&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;JapaneseTextRecognizerOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Builder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nc"&gt;ScriptType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DEVANAGARI&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;DevanagariTextRecognizerOptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Builder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: &lt;strong&gt;choose the right recognizer upfront&lt;/strong&gt;. Running the Latin recognizer on Japanese text wastes time and gives garbage results. We detect the likely script from user settings and previous results.&lt;/p&gt;
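&lt;p&gt;When there is no user setting and no previous result yet, a cheap code-point heuristic can supply the first guess. This is a hypothetical fallback sketch; the &lt;code&gt;guessScript&lt;/code&gt; function and its string labels are illustrative and not part of ML Kit:&lt;/p&gt;

```kotlin
// Count characters per Unicode block and pick the most informative script.
// Kana implies Japanese even when CJK ideographs are also present.
fun guessScript(previousText: String): String {
    var cjk = 0
    var hangul = 0
    var kana = 0
    var devanagari = 0
    for (ch in previousText) {
        when (Character.UnicodeBlock.of(ch)) {
            Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS -> cjk++
            Character.UnicodeBlock.HANGUL_SYLLABLES -> hangul++
            Character.UnicodeBlock.HIRAGANA,
            Character.UnicodeBlock.KATAKANA -> kana++
            Character.UnicodeBlock.DEVANAGARI -> devanagari++
            else -> {}
        }
    }
    return when {
        kana > 0 -> "JAPANESE"
        hangul > 0 -> "KOREAN"
        cjk > 0 -> "CJK"
        devanagari > 0 -> "DEVANAGARI"
        else -> "LATIN"
    }
}
```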

&lt;h2&gt;
  
  
  Optimization 4: Background Threading with Coroutines
&lt;/h2&gt;

&lt;p&gt;Never block the main thread. We use Kotlin coroutines with a dedicated dispatcher:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;ocrDispatcher&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Dispatchers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Default&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;limitedParallelism&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;suspend&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;processScreen&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nc"&gt;TranslationResult&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;withContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ocrDispatcher&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;capture&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;captureScreen&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;           &lt;span class="c1"&gt;// ~200ms&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;optimized&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;optimizeForOCR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;capture&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;// ~50ms&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;regions&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;detectTextRegions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimized&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;// ~50ms&lt;/span&gt;

    &lt;span class="c1"&gt;// Process regions in parallel&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;results&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;regions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt;
        &lt;span class="nf"&gt;async&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;cropped&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;cropRegion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimized&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;recognizeText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cropped&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}.&lt;/span&gt;&lt;span class="nf"&gt;awaitAll&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;// Translate in batch&lt;/span&gt;
    &lt;span class="nf"&gt;translateBatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;                  &lt;span class="c1"&gt;// ~400ms&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Processing multiple text regions in parallel on multi-core devices gives us another 20-30% speedup.&lt;/p&gt;
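&lt;p&gt;If you want to experiment with the fan-out pattern on a plain JVM without kotlinx-coroutines, the same shape can be sketched with a fixed thread pool. This is a simplified stand-in for the coroutine version above, using integers in place of bitmap regions:&lt;/p&gt;

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Run `work` over every input on a bounded pool (like limitedParallelism(2)),
// preserving input order in the results.
fun mapInParallel(inputs: IntArray, threads: Int, work: (Int) -> Int): IntArray {
    val pool = Executors.newFixedThreadPool(threads)
    try {
        val futures = inputs.map { x -> pool.submit(Callable { work(x) }) }
        return futures.map { f -> f.get() }.toIntArray()
    } finally {
        pool.shutdown()
    }
}
```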

&lt;h2&gt;
  
  
  Optimization 5: Caching
&lt;/h2&gt;

&lt;p&gt;If the screen hasn't changed much, don't re-OCR everything.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OCRCache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;maxSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;cache&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LruCache&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Long&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;OCRResult&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;(&lt;/span&gt;&lt;span class="n"&gt;maxSize&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getOrProcess&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;OCRResult&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;OCRResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;hash&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;computePerceptualHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;let&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;also&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;computePerceptualHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Downscale to 8x8, convert to grayscale, compute average&lt;/span&gt;
        &lt;span class="c1"&gt;// Compare each pixel to average -&amp;gt; 64-bit hash&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;small&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createScaledBitmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bitmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;// ... hash computation&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Perceptual hashing means slightly different screenshots (e.g., a blinking cursor) still hit the cache.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Result&lt;/strong&gt;: Repeated translations are instant (~10ms).&lt;/p&gt;
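&lt;p&gt;The elided hash computation is standard average hashing (aHash). Here is a pure-Kotlin sketch over an 8x8 grayscale grid (64 values in 0..255, which is our assumed representation), plus the Hamming distance you would use if you wanted near-match lookups instead of the exact-key &lt;code&gt;LruCache&lt;/code&gt; above:&lt;/p&gt;

```kotlin
// Average hash: each of the 64 pixels contributes one bit, set when the
// pixel is brighter than the mean. Similar images give similar bit patterns.
fun averageHash(gray8x8: IntArray): Long {
    require(gray8x8.size == 64) { "expected an 8x8 grayscale grid" }
    val mean = gray8x8.sum() / 64
    var hash = 0L
    for (i in 0 until 64) {
        if (gray8x8[i] > mean) hash = hash or (1L shl i)
    }
    return hash
}

// Number of differing bits; a small distance means "effectively the same screen".
fun hammingDistance(a: Long, b: Long): Int = java.lang.Long.bitCount(a xor b)
```

&lt;p&gt;A blinking cursor disappears entirely in the 8x8 downscale, so both frames hash to the same key.&lt;/p&gt;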

&lt;h2&gt;
  
  
  Final Numbers
&lt;/h2&gt;

&lt;p&gt;After all optimizations on a mid-range device (Snapdragon 695):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Before&lt;/th&gt;
&lt;th&gt;After&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Screen capture&lt;/td&gt;
&lt;td&gt;200ms&lt;/td&gt;
&lt;td&gt;200ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Image preprocessing&lt;/td&gt;
&lt;td&gt;800ms&lt;/td&gt;
&lt;td&gt;50ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ROI detection&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;50ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OCR inference&lt;/td&gt;
&lt;td&gt;2500ms&lt;/td&gt;
&lt;td&gt;400ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Translation&lt;/td&gt;
&lt;td&gt;500ms&lt;/td&gt;
&lt;td&gt;400ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UI rendering&lt;/td&gt;
&lt;td&gt;100ms&lt;/td&gt;
&lt;td&gt;50ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4100ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1150ms&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;On flagship devices (Snapdragon 8 Gen 3), we're seeing 400-500ms total.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Profile first&lt;/strong&gt; — don't guess where the bottleneck is&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Downscale aggressively&lt;/strong&gt; — screen text is high contrast, OCR handles lower resolution well&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ROI detection&lt;/strong&gt; is cheap and saves massive OCR time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose the right ML model&lt;/strong&gt; for the script type&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache everything&lt;/strong&gt; — screens don't change that often&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallelize&lt;/strong&gt; where possible with coroutines&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These techniques aren't specific to our app. If you're building anything with on-device OCR, these patterns will help.&lt;/p&gt;

&lt;p&gt;If you want to see these optimizations in action, check out &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator on Google Play&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What OCR performance challenges have you faced on mobile? Drop your experiences in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>kotlin</category>
      <category>performance</category>
      <category>mobile</category>
    </item>
    <item>
      <title>Android Foreground Services in 2026: What Changed and How to Adapt</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Sat, 21 Feb 2026 04:32:40 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/android-foreground-services-in-2026-what-changed-and-how-to-adapt-2o3d</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/android-foreground-services-in-2026-what-changed-and-how-to-adapt-2o3d</guid>
      <description>&lt;p&gt;If you're an Android developer working with foreground services, 2026 has brought some significant changes you need to know about. Starting with Android 14 (API 34), Google introduced strict foreground service type requirements that affect how apps like screen translators, media players, and location trackers operate.&lt;/p&gt;

&lt;p&gt;In this post, I'll walk through what changed, why it matters, and how we adapted our app — &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt; — to comply with the new rules.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changed in Android 14+
&lt;/h2&gt;

&lt;p&gt;Before Android 14, you could start a foreground service without specifying a type. The system didn't care what your service was doing — as long as you showed a notification, you were good.&lt;/p&gt;

&lt;p&gt;Android 14 changed that. Now, every foreground service &lt;strong&gt;must&lt;/strong&gt; declare a &lt;code&gt;foregroundServiceType&lt;/code&gt; in the manifest. If you don't, your app will crash with a &lt;code&gt;MissingForegroundServiceTypeException&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Here are the available types:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;service&lt;/span&gt;
    &lt;span class="na"&gt;android:name=&lt;/span&gt;&lt;span class="s"&gt;".MyForegroundService"&lt;/span&gt;
    &lt;span class="na"&gt;android:foregroundServiceType=&lt;/span&gt;&lt;span class="s"&gt;"mediaProjection|camera|microphone|location|dataSync|mediaPlayback|phoneCall|connectedDevice|remoteMessaging|health|systemExempted|shortService|specialUse"&lt;/span&gt;
    &lt;span class="na"&gt;android:exported=&lt;/span&gt;&lt;span class="s"&gt;"false"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You must pick the type(s) that match what your service actually does.&lt;/p&gt;

&lt;h2&gt;
  
  
  The MediaProjection Challenge
&lt;/h2&gt;

&lt;p&gt;For apps that capture the screen — like Screen Translator — the relevant type is &lt;code&gt;mediaProjection&lt;/code&gt;. This is one of the most restricted types because screen capture has obvious privacy implications.&lt;/p&gt;

&lt;p&gt;Here's what you need to declare in your &lt;code&gt;AndroidManifest.xml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;uses-permission&lt;/span&gt; &lt;span class="na"&gt;android:name=&lt;/span&gt;&lt;span class="s"&gt;"android.permission.FOREGROUND_SERVICE"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;uses-permission&lt;/span&gt; &lt;span class="na"&gt;android:name=&lt;/span&gt;&lt;span class="s"&gt;"android.permission.FOREGROUND_SERVICE_MEDIA_PROJECTION"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;service&lt;/span&gt;
    &lt;span class="na"&gt;android:name=&lt;/span&gt;&lt;span class="s"&gt;".ScreenCaptureService"&lt;/span&gt;
    &lt;span class="na"&gt;android:foregroundServiceType=&lt;/span&gt;&lt;span class="s"&gt;"mediaProjection"&lt;/span&gt;
    &lt;span class="na"&gt;android:exported=&lt;/span&gt;&lt;span class="s"&gt;"false"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And when starting the service in Kotlin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;serviceIntent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ScreenCaptureService&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SDK_INT&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nc"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VERSION_CODES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;UPSIDE_DOWN_CAKE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;startForegroundService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;serviceIntent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;startForegroundService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;serviceIntent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the service, you must call &lt;code&gt;startForeground()&lt;/code&gt; with the correct type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;override&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;onStartCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;intent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;?,&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;startId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;notification&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createNotification&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SDK_INT&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nc"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VERSION_CODES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;UPSIDE_DOWN_CAKE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;startForeground&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;NOTIFICATION_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;notification&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nc"&gt;ServiceInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FOREGROUND_SERVICE_TYPE_MEDIA_PROJECTION&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;startForeground&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;NOTIFICATION_ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;notification&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;START_STICKY&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Android 15: Even More Restrictions
&lt;/h2&gt;

&lt;p&gt;Android 15 (API 35) added another layer. Now, &lt;code&gt;dataSync&lt;/code&gt; foreground services have a 6-hour time limit. After that, the system will stop your service. If your app relied on long-running data sync services, you need to rethink your architecture.&lt;/p&gt;

&lt;p&gt;For screen translators, the key restriction is that &lt;code&gt;mediaProjection&lt;/code&gt; capture requires the user to grant consent &lt;strong&gt;every time&lt;/strong&gt; the app starts a new capture session, a requirement introduced in Android 14 that remains in force in 15. You cannot cache or reuse a &lt;code&gt;MediaProjection&lt;/code&gt; grant across sessions or app restarts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="c1"&gt;// This no longer works in Android 15+&lt;/span&gt;
&lt;span class="c1"&gt;// val savedToken = sharedPrefs.getString("projection_token", null)&lt;/span&gt;

&lt;span class="c1"&gt;// You must request a new MediaProjection each time&lt;/span&gt;
&lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;mediaProjectionManager&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getSystemService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;MEDIA_PROJECTION_SERVICE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nc"&gt;MediaProjectionManager&lt;/span&gt;
&lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;captureIntent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mediaProjectionManager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createScreenCaptureIntent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="nf"&gt;startActivityForResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;captureIntent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;REQUEST_CODE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  How Screen Translator Adapted
&lt;/h2&gt;

&lt;p&gt;When building &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt;, we had to make several architectural changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Declared the correct service type&lt;/strong&gt;: We use &lt;code&gt;mediaProjection&lt;/code&gt; for screen capture and &lt;code&gt;specialUse&lt;/code&gt; for our floating bubble overlay service.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Handled the per-session permission flow&lt;/strong&gt;: Instead of caching the projection token, we now guide users through a clean permission flow each time they start a translation session.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Optimized service lifecycle&lt;/strong&gt;: We stop the foreground service as soon as the user dismisses the floating bubble, rather than keeping it running in the background.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Graceful degradation&lt;/strong&gt;: On older Android versions, we fall back to the simpler foreground service API without type declarations.&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;startCaptureService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resultCode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;serviceIntent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Intent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ScreenCaptureService&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;apply&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;putExtra&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"resultCode"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resultCode&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;putExtra&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nc"&gt;ContextCompat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startForegroundService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;serviceIntent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Common Pitfalls
&lt;/h2&gt;

&lt;p&gt;Here are mistakes I've seen developers make:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Declaring types you don't use&lt;/strong&gt;: Google Play will reject your app if you declare &lt;code&gt;camera&lt;/code&gt; or &lt;code&gt;location&lt;/code&gt; foreground service types but don't actually use those capabilities.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Not handling the permission dialog&lt;/strong&gt;: On Android 14+, the MediaProjection permission dialog looks different and includes a "single app" vs "entire screen" option. Make sure your app handles both cases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Forgetting the notification channel&lt;/strong&gt;: Foreground services still require a notification, and on Android 13+ you need the &lt;code&gt;POST_NOTIFICATIONS&lt;/code&gt; permission too.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Testing Tips
&lt;/h2&gt;

&lt;p&gt;Test your foreground service changes on multiple API levels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="c1"&gt;// In your test setup&lt;/span&gt;
&lt;span class="nd"&gt;@Test&lt;/span&gt;
&lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;testForegroundServiceType&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VERSION&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SDK_INT&lt;/span&gt; &lt;span class="p"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nc"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;VERSION_CODES&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;UPSIDE_DOWN_CAKE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Verify service type is declared correctly&lt;/span&gt;
        &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;serviceInfo&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;packageManager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getServiceInfo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nc"&gt;ComponentName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;ScreenCaptureService&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nc"&gt;PackageManager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GET_META_DATA&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;assertTrue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;serviceInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;foregroundServiceType&lt;/span&gt; &lt;span class="n"&gt;and&lt;/span&gt; 
            &lt;span class="nc"&gt;ServiceInfo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;FOREGROUND_SERVICE_TYPE_MEDIA_PROJECTION&lt;/span&gt; &lt;span class="p"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;The foreground service changes in Android 14 and 15 are significant, but they make sense from a user privacy perspective. As developers, we need to be explicit about what our services do and why they need to run in the foreground.&lt;/p&gt;

&lt;p&gt;If you're building an app that uses screen capture, OCR, or floating overlays, these changes will directly affect you. Plan your migration early and test thoroughly across API levels.&lt;/p&gt;

&lt;p&gt;For a real-world example of an app that handles all of this, check out &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator on Google Play&lt;/a&gt; — it's a floating bubble translator that uses MediaProjection for OCR-based screen translation.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have questions about foreground services or MediaProjection? Drop a comment below — happy to help!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>mobile</category>
      <category>kotlin</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Handle Vertical Japanese Text in Android OCR</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Sat, 21 Feb 2026 02:40:43 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/how-to-handle-vertical-japanese-text-in-android-ocr-1mj9</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/how-to-handle-vertical-japanese-text-in-android-ocr-1mj9</guid>
      <description>&lt;p&gt;Japanese text can be written vertically (tategaki, 縦書き) — top-to-bottom, right-to-left columns. This is the standard layout in manga, many games, and traditional Japanese documents. If you're building an OCR-based translation tool for Android, handling vertical text is one of the trickiest challenges you'll face.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Most OCR engines are optimized for horizontal left-to-right text. When you feed them a manga page with vertical Japanese text, you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Garbled character order&lt;/li&gt;
&lt;li&gt;Merged text from adjacent columns&lt;/li&gt;
&lt;li&gt;Missing characters at column boundaries&lt;/li&gt;
&lt;li&gt;Completely wrong reading direction&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Detection Strategies
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Aspect Ratio Analysis
&lt;/h3&gt;

&lt;p&gt;Vertical text blocks tend to be taller than wide. If a detected text region has a height-to-width ratio &amp;gt; 2:1, it's likely vertical text.&lt;/p&gt;
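&lt;p&gt;The heuristic above can be sketched as a tiny pure function (the 2:1 threshold is a tunable assumption, not the app's exact value):&lt;/p&gt;

```kotlin
// Heuristic: a text region much taller than it is wide is probably
// a vertical (tategaki) column. The default 2:1 threshold is tunable.
fun isLikelyVerticalText(widthPx: Int, heightPx: Int, threshold: Float = 2f): Boolean =
    heightPx.toFloat() > threshold * widthPx
```

In practice you would run this per detected region, since manga pages mix horizontal and vertical blocks on the same page.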

&lt;h3&gt;
  
  
  2. Character Spacing Patterns
&lt;/h3&gt;

&lt;p&gt;In vertical text, characters are stacked with consistent vertical spacing. Analyze the spatial distribution of detected characters — if they cluster along vertical axes, rotate the region 90° before OCR.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. ML Kit's Built-in Support
&lt;/h3&gt;

&lt;p&gt;Google's ML Kit (used in many Android OCR apps) has improved vertical text support in recent versions. The &lt;code&gt;TextRecognition&lt;/code&gt; API with the Japanese script recognizer handles vertical text reasonably well out of the box.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Solution: Screen Translator's Approach
&lt;/h2&gt;

&lt;p&gt;In &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt;, we handle vertical Japanese text with a multi-step pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Capture&lt;/strong&gt; — MediaProjection API captures the screen&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pre-process&lt;/strong&gt; — Detect text regions and analyze orientation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rotate if needed&lt;/strong&gt; — Vertical regions get rotated 90° CW&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OCR&lt;/strong&gt; — Run text recognition on normalized images&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post-process&lt;/strong&gt; — Reorder characters to correct reading order&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Translate&lt;/strong&gt; — Send properly ordered text to translation API&lt;/li&gt;
&lt;/ol&gt;
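&lt;p&gt;Step 5 (reordering) can be sketched with a simplified model where each recognized glyph carries its bounding-box position. The &lt;code&gt;Glyph&lt;/code&gt; type and the fixed-width column bucketing are illustrative assumptions, not the app's actual data model:&lt;/p&gt;

```kotlin
// A recognized character with its on-screen position (illustrative model).
data class Glyph(val text: String, val x: Int, val y: Int)

// Tategaki reading order: columns right-to-left, each column top-to-bottom.
// Glyphs are bucketed into columns by x position using a fixed column width.
fun toReadingOrder(columnWidth: Int, vararg glyphs: Glyph): String =
    glyphs.groupBy { it.x / columnWidth }          // bucket glyphs into columns
        .entries
        .sortedByDescending { it.key }             // rightmost column first
        .joinToString("") { (_, column) ->
            column.sortedBy { it.y }               // top-to-bottom within a column
                .joinToString("") { it.text }
        }
```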

&lt;h2&gt;
  
  
  Tips for Developers
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Always test with real manga pages, not just synthetic test images&lt;/li&gt;
&lt;li&gt;Japanese text in games often uses custom fonts that reduce OCR accuracy&lt;/li&gt;
&lt;li&gt;Furigana (small reading aids above kanji) can confuse OCR — consider filtering by text size&lt;/li&gt;
&lt;li&gt;Mixed horizontal/vertical layouts (common in manga) need per-region orientation detection&lt;/li&gt;
&lt;/ul&gt;
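&lt;p&gt;The furigana tip can be made concrete with a size filter: drop lines whose height is well below the median line height, since furigana glyphs are much smaller than the base text they annotate. The 0.6 factor here is an assumption to tune:&lt;/p&gt;

```kotlin
// Keep only line heights at or near the median; furigana lines are
// typically around half the base text height, so they fall below the cut.
fun keepBaseTextHeights(vararg lineHeights: Int, factor: Double = 0.6): IntArray {
    if (lineHeights.isEmpty()) return IntArray(0)
    val median = lineHeights.sorted()[lineHeights.size / 2]
    return lineHeights.filter { it > median * factor }.toIntArray()
}
```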

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;With proper vertical text handling, OCR accuracy on manga pages jumps from ~40% to ~85%+ for clean digital scans. The key insight is that text orientation detection must happen &lt;em&gt;before&lt;/em&gt; OCR, not after.&lt;/p&gt;

&lt;p&gt;If you're working on similar problems, I'd love to hear your approach. Drop a comment below!&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Screen Translator is a free floating overlay translator for Android that handles vertical Japanese text, manga, games, and more: &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Google Play&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
    </item>
    <item>
      <title>Why OCR for CJK Languages Is Still a Hard Problem in 2026 — And How I'm Tackling It</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Fri, 20 Feb 2026 18:46:38 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/why-ocr-for-cjk-languages-is-still-a-hard-problem-in-2026-and-how-im-tackling-it-5fge</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/why-ocr-for-cjk-languages-is-still-a-hard-problem-in-2026-and-how-im-tackling-it-5fge</guid>
      <description>&lt;p&gt;If you've ever tried to build an OCR system that handles Chinese, Japanese, or Korean text, you know the pain. Latin-script OCR has been "good enough" for years, but CJK languages? Still a minefield in 2026.&lt;/p&gt;

&lt;p&gt;I've been working on &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt;, an Android app that uses a floating bubble to OCR and translate on-screen text in real time. Building it forced me to confront every ugly corner of CJK text recognition. Here's what I learned.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Character Set Problem
&lt;/h2&gt;

&lt;p&gt;English has 26 letters. Chinese spans tens of thousands of characters (the GB18030 encoding standard covers over 70,000 CJK ideographs, even though a few thousand account for most everyday text). Japanese mixes three scripts — Hiragana, Katakana, and Kanji — sometimes in the same sentence. Korean Hangul has 11,172 possible syllable blocks.&lt;/p&gt;

&lt;p&gt;For an OCR engine, this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Massive classification space&lt;/strong&gt;: Instead of distinguishing ~70 characters (upper/lower + digits + punctuation), you're classifying among tens of thousands&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Visually similar characters&lt;/strong&gt;: 土/士, 末/未, 己/已/巳 — these differ by a single pixel-level stroke&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixed scripts&lt;/strong&gt;: A Japanese game UI might show "HP回復アイテム" — that's Latin, Kanji, and Katakana in one string&lt;/li&gt;
&lt;/ul&gt;
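&lt;p&gt;One concrete way to see the mixed-script issue is to tag each character by Unicode block. This is a rough sketch; real pipelines also handle full-width forms, punctuation, and extension blocks:&lt;/p&gt;

```kotlin
// Rough per-character script tagging by Unicode block.
fun scriptOf(c: Char): String = when (c) {
    in '\u3040'..'\u309F' -> "hiragana"   // Hiragana block
    in '\u30A0'..'\u30FF' -> "katakana"   // Katakana block
    in '\u4E00'..'\u9FFF' -> "kanji"      // CJK Unified Ideographs
    in '\uAC00'..'\uD7A3' -> "hangul"     // Hangul syllables
    in 'A'..'Z', in 'a'..'z' -> "latin"
    else -> "other"
}
```

Running this over "HP回復アイテム" yields three different scripts in a single eight-character string, which is exactly what the recognizer has to cope with.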

&lt;h2&gt;
  
  
  Why Standard OCR Pipelines Struggle
&lt;/h2&gt;

&lt;p&gt;Most OCR pipelines follow: Detection → Recognition → Post-processing.&lt;/p&gt;

&lt;p&gt;For CJK, each step has unique failure modes:&lt;/p&gt;

&lt;h3&gt;
  
  
  Detection
&lt;/h3&gt;

&lt;p&gt;CJK text can be vertical or horizontal. Game UIs love vertical text. Manga reads right-to-left. Most detection models are trained on horizontal Latin text and simply miss vertical CJK layouts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recognition
&lt;/h3&gt;

&lt;p&gt;The standard CRNN (CNN + RNN + CTC) architecture works well for Latin scripts but struggles with CJK because:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Simplified comparison
Latin: Fixed-width character assumption mostly works
CJK: Character width varies dramatically
     Full-width: ＡＢＣ (each takes 2x space)
     Half-width: ABC
     Mixed: 「Hello世界」
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CTC (Connectionist Temporal Classification) loss function assumes characters appear in sequence without overlap. CJK characters in stylized fonts (especially in games and manga) often break this assumption.&lt;/p&gt;

&lt;h3&gt;
  
  
  Post-processing
&lt;/h3&gt;

&lt;p&gt;For English, you can use dictionary lookup and language models to fix OCR errors. "teh" → "the" is trivial. But for Chinese, a single wrong character can completely change meaning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;大人 (adult) vs 犬人 (not a word — but OCR might produce it)&lt;/li&gt;
&lt;li&gt;Context-based correction requires much larger language models&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What Actually Works in 2026
&lt;/h2&gt;

&lt;p&gt;After months of iteration, here's what I found effective:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Multi-scale text detection
&lt;/h3&gt;

&lt;p&gt;Using a CRAFT-like detector with explicit vertical text support. Training data must include vertical Japanese manga panels and Chinese calligraphy-style game text.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Attention-based recognition over CTC
&lt;/h3&gt;

&lt;p&gt;Transformer-based recognition models handle variable-width CJK characters much better than CTC-based approaches. The attention mechanism naturally handles the alignment problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Script-aware preprocessing
&lt;/h3&gt;

&lt;p&gt;Before feeding text to the recognizer, detect the dominant script and adjust:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;preprocess_for_script&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detected_script&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;detected_script&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ja&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;zh&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="c1"&gt;# CJK benefits from higher resolution input
&lt;/span&gt;        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;upscale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;factor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Binarization helps with stylized game fonts
&lt;/span&gt;        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;adaptive_threshold&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;is_vertical&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;rotate_90&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Game/Manga-specific fine-tuning
&lt;/h3&gt;

&lt;p&gt;Generic OCR models fail on stylized text. Fine-tuning on screenshots from actual games and manga pages made a huge difference in my app's accuracy.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real-World Test
&lt;/h2&gt;

&lt;p&gt;The ultimate test for Screen Translator was Japanese gacha games. These combine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stylized fonts with outlines and shadows&lt;/li&gt;
&lt;li&gt;Text over complex backgrounds (character art, particle effects)&lt;/li&gt;
&lt;li&gt;Mixed Japanese/English/numbers&lt;/li&gt;
&lt;li&gt;Small text in UI elements&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Getting reliable OCR in this environment required all the techniques above, plus aggressive image preprocessing to isolate text from backgrounds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons for Fellow Developers
&lt;/h2&gt;

&lt;p&gt;If you're building anything that touches CJK OCR:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Don't assume horizontal text&lt;/strong&gt; — support vertical from day one&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test on real content&lt;/strong&gt; — synthetic training data alone won't cut it for games/manga&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Character-level confidence matters&lt;/strong&gt; — when OCR confidence is low on a CJK character, it's better to show the user than to guess wrong&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Translation quality depends on OCR quality&lt;/strong&gt; — garbage in, garbage out. A mistranslation from bad OCR is worse than showing "recognition failed"&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I'm still iterating on Screen Translator's OCR pipeline. If you're working on similar problems or have found good approaches for CJK text recognition, I'd love to hear about it in the comments.&lt;/p&gt;

&lt;p&gt;You can try the app here: &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator on Google Play&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What's your experience with CJK OCR? Have you found any tricks that work well for specific use cases? Let me know below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>machinelearning</category>
      <category>showdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>How I Built a Floating Bubble OCR Translator for Android — Lessons Learned</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Fri, 20 Feb 2026 18:28:12 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/how-i-built-a-floating-bubble-ocr-translator-for-android-lessons-learned-1k6o</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/how-i-built-a-floating-bubble-ocr-translator-for-android-lessons-learned-1k6o</guid>
      <description>&lt;p&gt;As a solo Android developer, I spent the last few months building a floating bubble OCR translator. The idea was simple: tap a bubble on your screen, select any text area, and get an instant translation — without leaving whatever app you're in.&lt;/p&gt;

&lt;p&gt;Here's what I learned along the way, and some technical challenges that might help if you're building something similar.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a Floating Bubble?
&lt;/h2&gt;

&lt;p&gt;Most translation apps require you to switch contexts. Copy text, open the translator, paste, read the result, switch back. For use cases like reading manga, playing foreign-language games, or chatting on international messaging apps, this flow is painfully slow.&lt;/p&gt;

&lt;p&gt;A floating bubble overlay stays on top of everything. One tap, drag to select, instant result. The UX difference is massive.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Technical Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Language&lt;/strong&gt;: Kotlin&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OCR Engine&lt;/strong&gt;: ML Kit for on-device text recognition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Translation&lt;/strong&gt;: Google ML Kit Translation API (on-device models)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overlay&lt;/strong&gt;: Android's &lt;code&gt;SYSTEM_ALERT_WINDOW&lt;/code&gt; permission&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Screen Capture&lt;/strong&gt;: &lt;code&gt;MediaProjection&lt;/code&gt; API&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Challenge 1: Getting the Overlay Right
&lt;/h2&gt;

&lt;p&gt;Android's overlay permission (&lt;code&gt;SYSTEM_ALERT_WINDOW&lt;/code&gt;) is one of those things that sounds simple but has a ton of edge cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Since Android 6.0 (API 23), you need to explicitly request the permission by sending users to &lt;code&gt;Settings.ACTION_MANAGE_OVERLAY_PERMISSION&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Some OEMs (looking at you, Xiaomi and OPPO) have additional overlay restrictions&lt;/li&gt;
&lt;li&gt;The bubble needs to be draggable but also respond to taps — handling the touch event delegation correctly took more iterations than I'd like to admit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key insight: use &lt;code&gt;WindowManager.LayoutParams.FLAG_NOT_FOCUSABLE&lt;/code&gt; for the bubble itself, but switch to a focusable window when the selection overlay is active.&lt;/p&gt;
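&lt;p&gt;The tap-versus-drag delegation mentioned above boils down to a small classifier: if the finger moves beyond a slop distance or stays down too long, it's a drag, otherwise a tap. This is a minimal sketch; a real implementation would read the slop from &lt;code&gt;ViewConfiguration.getScaledTouchSlop()&lt;/code&gt; rather than hard-coding it:&lt;/p&gt;

```kotlin
import kotlin.math.hypot

// Classify a completed touch gesture as a tap (open the selection overlay)
// or a drag (move the bubble). Thresholds are illustrative defaults.
fun isTap(
    downX: Float, downY: Float,
    upX: Float, upY: Float,
    durationMs: Long,
    slopPx: Float = 24f,
    maxTapMs: Long = 300L
): Boolean {
    val moved = hypot(upX - downX, upY - downY)   // total finger travel
    return slopPx >= moved && maxTapMs >= durationMs
}
```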

&lt;h2&gt;
  
  
  Challenge 2: OCR on CJK Languages
&lt;/h2&gt;

&lt;p&gt;ML Kit's text recognition works great for Latin scripts out of the box. But for Japanese, Chinese, and Korean — which are the primary use cases for screen translation — you need the CJK-specific models.&lt;/p&gt;

&lt;p&gt;Some gotchas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vertical text&lt;/strong&gt;: Japanese manga is written vertically. ML Kit handles this, but you need to configure the recognizer for Japanese specifically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mixed scripts&lt;/strong&gt;: Manga often mixes kanji, hiragana, katakana, and sometimes romaji in the same panel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Small text&lt;/strong&gt;: OCR accuracy drops significantly with small text. I added a zoom hint in the UI to encourage users to zoom in before capturing&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Challenge 3: Screen Capture Performance
&lt;/h2&gt;

&lt;p&gt;Using &lt;code&gt;MediaProjection&lt;/code&gt; to capture the screen is straightforward, but performance matters:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Capture only the selected region, not the full screen&lt;/span&gt;
&lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;bitmap&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Bitmap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createBitmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;fullScreenBitmap&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;selectionRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;left&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;selectionRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;top&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;selectionRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;width&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;selectionRect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;height&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cropping to just the selected area before running OCR makes a huge difference in processing time. On a mid-range phone, full-screen OCR takes 800ms+, but a cropped manga panel takes ~200ms.&lt;/p&gt;
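&lt;p&gt;One gotcha with the crop: &lt;code&gt;Bitmap.createBitmap&lt;/code&gt; throws if the selection extends past the source bitmap, which happens when users drag to the screen edge. A clamping step avoids that. The &lt;code&gt;Region&lt;/code&gt; type here is a stand-in for &lt;code&gt;android.graphics.Rect&lt;/code&gt; so the logic stays framework-free:&lt;/p&gt;

```kotlin
// Stand-in for android.graphics.Rect (left/top origin plus size).
data class Region(val left: Int, val top: Int, val width: Int, val height: Int)

// Clamp a user selection to the captured bitmap's bounds so the
// subsequent crop call can never go out of range.
fun clampToScreen(sel: Region, screenW: Int, screenH: Int): Region {
    val left = sel.left.coerceIn(0, screenW)
    val top = sel.top.coerceIn(0, screenH)
    val width = sel.width.coerceAtMost(screenW - left).coerceAtLeast(0)
    val height = sel.height.coerceAtMost(screenH - top).coerceAtLeast(0)
    return Region(left, top, width, height)
}
```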

&lt;h2&gt;
  
  
  Challenge 4: Translation Quality
&lt;/h2&gt;

&lt;p&gt;On-device translation models are convenient (no API costs, works offline), but the quality varies. For Japanese → English, the results are "good enough" for understanding context, but not publication-quality.&lt;/p&gt;

&lt;p&gt;I found that keeping the source text visible alongside the translation helps users who know some of the source language fill in the gaps.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Result
&lt;/h2&gt;

&lt;p&gt;The app is called &lt;strong&gt;Screen Translator&lt;/strong&gt; and it's live on Google Play:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator on Google Play&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Main use cases people are finding:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reading raw manga without waiting for fan translations&lt;/li&gt;
&lt;li&gt;Playing Japanese/Korean/Chinese mobile games&lt;/li&gt;
&lt;li&gt;Translating chat messages in foreign-language messaging apps&lt;/li&gt;
&lt;li&gt;Reading foreign social media posts&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start with CJK support from day one&lt;/strong&gt; — I initially built for Latin scripts and retrofitted CJK support. Should have been the other way around given the target audience.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Battery optimization earlier&lt;/strong&gt; — Screen capture + OCR + translation is battery-hungry. I should have implemented smart capture intervals from the start.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User onboarding&lt;/strong&gt; — The overlay permission flow confuses a lot of users. A step-by-step tutorial on first launch would have saved me a lot of support emails.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Building an overlay-based Android app is a unique challenge. You're essentially building a mini-app that lives on top of the entire OS. The permission model, touch handling, and performance constraints are all different from a standard app.&lt;/p&gt;

&lt;p&gt;If you're thinking about building something similar, feel free to ask questions in the comments. Happy to share more technical details about any specific part of the implementation.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm a solo dev building tools that help people break language barriers on mobile. If you find this interesting, check out the app and let me know what you think!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Best Screen Translation Apps for Android in 2026</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Fri, 20 Feb 2026 14:06:20 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/best-screen-translation-apps-for-android-in-2026-1la2</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/best-screen-translation-apps-for-android-in-2026-1la2</guid>
      <description>&lt;p&gt;Need to translate text on your phone screen without switching apps? Whether you're playing foreign language games, reading untranslated manga, or chatting with someone in another language, a screen translation app can save you a lot of time.&lt;/p&gt;

&lt;p&gt;I tested several screen translation apps for Android to find out which ones actually work well. Here's what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Is a Screen Translation App?
&lt;/h2&gt;

&lt;p&gt;Unlike regular translation apps where you type or paste text, screen translation apps work directly on your screen. They use OCR (Optical Character Recognition) to detect text in any app — games, manga readers, social media, chat apps — and translate it without you having to leave what you're doing.&lt;/p&gt;

&lt;p&gt;Most of them use a floating bubble or overlay that stays on top of other apps.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Tested For
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OCR accuracy&lt;/strong&gt; — Can it correctly recognize text, especially Japanese/Korean/Chinese?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Translation quality&lt;/strong&gt; — How good are the translations?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt; — How fast does it translate?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ease of use&lt;/strong&gt; — Is the interface intuitive?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Language support&lt;/strong&gt; — How many languages does it handle?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price&lt;/strong&gt; — Is it free or does it require a subscription?&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Apps
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Screen Translator
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rating: ★★★★☆&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Google Play Link&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A floating bubble translator that works on any screen. Tap the bubble and it captures the screen, recognizes the text via OCR, and shows the translation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I liked:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clean floating bubble interface — doesn't get in the way&lt;/li&gt;
&lt;li&gt;Good OCR for Japanese (including vertical text) and Korean&lt;/li&gt;
&lt;li&gt;Works with games, manga readers, social media, chat apps&lt;/li&gt;
&lt;li&gt;Supports 100+ languages&lt;/li&gt;
&lt;li&gt;Free to use with ads, premium removes ads and adds features&lt;/li&gt;
&lt;li&gt;Offline OCR scanning available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What could be better:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Translation speed could be slightly faster on older devices&lt;/li&gt;
&lt;li&gt;Handwritten or heavily stylized text can be missed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Gamers playing Japanese/Korean games, manga readers, people browsing foreign social media.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Google Translate (Camera/Lens)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rating: ★★★☆☆&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The most well-known option. Google Translate's camera feature can translate text in real time through your camera, and Google Lens can translate text in screenshots.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I liked:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Free and pre-installed on most Android phones&lt;/li&gt;
&lt;li&gt;Excellent translation quality (it's Google)&lt;/li&gt;
&lt;li&gt;Camera mode works for physical text (signs, menus)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What could be better:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No floating bubble — you have to leave your current app&lt;/li&gt;
&lt;li&gt;Can't translate directly on screen in other apps&lt;/li&gt;
&lt;li&gt;Tedious workflow: screenshot → open Google Translate → import image&lt;/li&gt;
&lt;li&gt;Not practical for gaming or manga reading&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Translating physical text (signs, menus, documents) with your camera.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Papago
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rating: ★★★★☆&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Developed by Naver (Korea's biggest search engine), Papago excels at Korean, Japanese, and Chinese translation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I liked:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Best Korean translation quality among all apps&lt;/li&gt;
&lt;li&gt;Has an image translation feature&lt;/li&gt;
&lt;li&gt;Clean interface&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What could be better:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No floating bubble overlay for on-screen translation&lt;/li&gt;
&lt;li&gt;Image translation requires taking a screenshot first&lt;/li&gt;
&lt;li&gt;Focused on Asian languages (weaker for European languages)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Korean language translation, especially for Korean learners.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Universal Copy
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Rating: ★★★☆☆&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not a translator itself, but lets you copy text from any app (even apps that normally don't allow text selection). You can then paste the copied text into any translator.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I liked:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Works system-wide&lt;/li&gt;
&lt;li&gt;Useful for copying text from apps that block selection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What could be better:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extra step required — copy then paste into translator&lt;/li&gt;
&lt;li&gt;Doesn't work with text in images (no OCR)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Copying text from restrictive apps to use with your preferred translator.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;App&lt;/th&gt;
&lt;th&gt;Floating Bubble&lt;/th&gt;
&lt;th&gt;OCR&lt;/th&gt;
&lt;th&gt;Game Translation&lt;/th&gt;
&lt;th&gt;Manga Translation&lt;/th&gt;
&lt;th&gt;Free&lt;/th&gt;
&lt;th&gt;Offline&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Screen Translator&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅ (with ads)&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Translate&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ (camera/screenshot import)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Papago&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅ (screenshot)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Universal Copy&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Which One Should You Use?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If you play foreign language games:&lt;/strong&gt; Screen Translator — the floating bubble is essential for gaming since you can't easily switch apps mid-game.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you read raw manga/manhwa:&lt;/strong&gt; Screen Translator — OCR works on manga panels and handles vertical Japanese text.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you need to translate physical text (menus, signs):&lt;/strong&gt; Google Translate camera mode — point and translate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you're learning Korean:&lt;/strong&gt; Papago — best Korean translation quality.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you need to copy text from restrictive apps:&lt;/strong&gt; Universal Copy + any translator.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Recommendation
&lt;/h2&gt;

&lt;p&gt;For most use cases — especially gaming, manga reading, and social media browsing — a floating bubble translator like Screen Translator gives you the best experience. Being able to translate without leaving your current app is a game-changer once you get used to it.&lt;/p&gt;

&lt;p&gt;Google Translate is still the king for general translation quality, but its lack of a floating overlay makes it impractical for on-screen translation.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Have you tried any of these apps? Let me know which one works best for you in the comments!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>mobile</category>
      <category>tooling</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How to Read Raw Manga and Manhwa Without Knowing Japanese or Korean</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Fri, 20 Feb 2026 14:02:17 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/how-to-read-raw-manga-and-manhwa-without-knowing-japanese-or-korean-3oof</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/how-to-read-raw-manga-and-manhwa-without-knowing-japanese-or-korean-3oof</guid>
      <description>&lt;p&gt;You found an amazing manga series, but there's no English translation yet. Or the translation is 50 chapters behind. Sound familiar?&lt;/p&gt;

&lt;p&gt;Reading raw (untranslated) manga and manhwa is frustrating when you can't read the language. But there are ways to enjoy these series without waiting months for fan translations.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Raw Manga Problem
&lt;/h2&gt;

&lt;p&gt;New manga chapters release in Japanese first. Popular series might get English translations within days, but many titles — especially niche ones — take weeks or months, and some never get translated at all.&lt;/p&gt;

&lt;p&gt;Korean manhwa and webtoons have the same issue. The Korean releases are always ahead of English versions.&lt;/p&gt;

&lt;p&gt;If you're tired of waiting, here's how to read raw manga and manhwa right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 1: On-Screen Translation Apps
&lt;/h2&gt;

&lt;p&gt;The fastest way to read raw manga on your phone. These apps use OCR (Optical Character Recognition) to detect text in manga panels and translate it in real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open your manga reader app (Tachiyomi, MangaDex, etc.)&lt;/li&gt;
&lt;li&gt;Enable the floating translator bubble&lt;/li&gt;
&lt;li&gt;Navigate to the page you want to translate&lt;/li&gt;
&lt;li&gt;Tap the bubble — it scans the page and translates all text&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Why this works well for manga:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Handles both vertical text (Japanese manga) and horizontal text (Korean manhwa)&lt;/li&gt;
&lt;li&gt;Works with any manga reader app&lt;/li&gt;
&lt;li&gt;Translates speech bubbles, sound effects, and narration boxes&lt;/li&gt;
&lt;li&gt;No need to take screenshots or switch apps&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt; is one app that does this — it's free on Android and supports both Japanese vertical text and Korean horizontal text. The OCR is decent for clean manga text, though handwritten or stylized fonts can be tricky.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best for:&lt;/strong&gt; Catching up on untranslated chapters, reading niche series with no English release.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 2: Manga-Specific Translation Tools
&lt;/h2&gt;

&lt;p&gt;Some tools are built specifically for manga translation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Manga Translator (various apps)&lt;/strong&gt; — dedicated manga translation with panel detection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Google Lens&lt;/strong&gt; — point your camera at a physical manga page to translate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These can work, but most require you to take screenshots first, which breaks the reading flow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 3: Wait for Fan Translations
&lt;/h2&gt;

&lt;p&gt;The most common approach. Fan translation groups (scanlation groups) translate manga and manhwa for free.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where to find fan translations:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MangaDex — largest fan translation aggregator&lt;/li&gt;
&lt;li&gt;Webtoon (official) — for licensed Korean webtoons&lt;/li&gt;
&lt;li&gt;Various scanlation group websites&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The downside:&lt;/strong&gt; You're always behind the latest chapters, and some series never get picked up.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 4: Machine Translation Aggregator Sites
&lt;/h2&gt;

&lt;p&gt;Some websites automatically machine-translate raw manga. The quality varies wildly — sometimes readable, sometimes nonsensical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Instant access to latest chapters&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Translation quality is often poor, especially for nuanced dialogue&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparing the Methods
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Speed&lt;/th&gt;
&lt;th&gt;Quality&lt;/th&gt;
&lt;th&gt;Effort&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;On-screen translator&lt;/td&gt;
&lt;td&gt;Instant&lt;/td&gt;
&lt;td&gt;Good (for clean text)&lt;/td&gt;
&lt;td&gt;Very low&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manga-specific tools&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fan translations&lt;/td&gt;
&lt;td&gt;Slow (days-weeks)&lt;/td&gt;
&lt;td&gt;Best&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Machine translation sites&lt;/td&gt;
&lt;td&gt;Instant&lt;/td&gt;
&lt;td&gt;Poor-Medium&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Tips for Reading Raw Manga
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start with manhwa&lt;/strong&gt; — Korean text is horizontal and generally easier for OCR to recognize than vertical Japanese text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Choose series with clean art&lt;/strong&gt; — manga with lots of screen tones or busy backgrounds can interfere with text recognition&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Learn common manga vocabulary&lt;/strong&gt; — words like 何 (what), 俺 (I/me), 行く (go), 来い (come) appear constantly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use translation for dialogue, skip sound effects&lt;/strong&gt; — sound effects (onomatopoeia) are often untranslatable anyway&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read the raw first, then the translation&lt;/strong&gt; — if a fan translation exists, read the raw chapter first with a translator, then compare it with the fan translation to improve your understanding&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  For Korean Manhwa / Webtoon Readers
&lt;/h2&gt;

&lt;p&gt;Korean manhwa and webtoons are actually easier to translate than Japanese manga because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text is horizontal (standard reading direction)&lt;/li&gt;
&lt;li&gt;Speech bubbles are usually clean with standard fonts&lt;/li&gt;
&lt;li&gt;Less stylized text compared to Japanese manga&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're reading Korean webtoons raw, an on-screen translator will give you pretty good results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You don't have to wait for translations anymore. With on-screen translation tools, you can read raw manga and manhwa the day they release. The translation won't be perfect, but it's good enough to follow the story and enjoy the art.&lt;/p&gt;

&lt;p&gt;For Android users, &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt; works well with most manga reader apps — just tap the floating bubble on any page to get an instant translation.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What raw manga or manhwa are you reading? Share your recommendations below!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>tutorial</category>
      <category>beginners</category>
      <category>mobile</category>
    </item>
    <item>
      <title>How to Play Japanese Games Without Knowing Japanese (2026 Guide)</title>
      <dc:creator>joe wang</dc:creator>
      <pubDate>Fri, 20 Feb 2026 14:00:52 +0000</pubDate>
      <link>https://dev.to/joe_wang_6a4a3e51566e8b52/how-to-play-japanese-games-without-knowing-japanese-2026-guide-a3a</link>
      <guid>https://dev.to/joe_wang_6a4a3e51566e8b52/how-to-play-japanese-games-without-knowing-japanese-2026-guide-a3a</guid>
      <description>&lt;p&gt;If you've ever wanted to play a Japanese mobile game but gave up because you couldn't read the menus, quests, or story — you're not alone. Tons of amazing games never get an official English release, and even when they do, it can take months or years.&lt;/p&gt;

&lt;p&gt;The good news? You don't need to learn Japanese to enjoy these games. Here are the practical methods that actually work in 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Japanese Games Are Worth Playing
&lt;/h2&gt;

&lt;p&gt;Japan produces some of the best mobile games in the world — gacha games like Fate/Grand Order, Monster Strike, and Granblue Fantasy have massive player bases. Many JRPGs, visual novels, and strategy games launch exclusively in Japan first.&lt;/p&gt;

&lt;p&gt;The problem? Most of these games are entirely in Japanese. Menus, dialogue, quest descriptions, equipment stats — everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 1: Use a Floating Screen Translator
&lt;/h2&gt;

&lt;p&gt;This is the most convenient method for mobile gamers. A floating screen translator sits on top of your game as a small bubble. When you need to translate something, you just tap the bubble — it captures the text on screen using OCR and translates it instantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open the translator app and enable the floating bubble&lt;/li&gt;
&lt;li&gt;Launch your game&lt;/li&gt;
&lt;li&gt;When you see Japanese text you can't read, tap the floating bubble&lt;/li&gt;
&lt;li&gt;The app scans the screen, recognizes the text, and shows the translation&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Works with any game (no root or special setup needed)&lt;/li&gt;
&lt;li&gt;Translates menus, dialogue, quests, equipment — anything on screen&lt;/li&gt;
&lt;li&gt;No need to switch between apps or take screenshots&lt;/li&gt;
&lt;li&gt;Supports Japanese, Korean, Chinese, and 100+ other languages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Works best with static text (not ideal for fast-scrolling content)&lt;/li&gt;
&lt;li&gt;Translation quality depends on the OCR accuracy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One app that does this well is &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt; — it's a free Android app with a floating bubble that translates any on-screen text. I've been using it for Japanese gacha games and it handles kanji recognition pretty well.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 2: Screenshot + Google Translate
&lt;/h2&gt;

&lt;p&gt;The old-school method. Take a screenshot, open Google Translate, use the camera/image feature to translate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Free and widely available&lt;/li&gt;
&lt;li&gt;Google Translate is decent for Japanese&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extremely tedious — you have to leave the game every time&lt;/li&gt;
&lt;li&gt;Breaks your flow, especially during story-heavy sections&lt;/li&gt;
&lt;li&gt;Doesn't work well for games that block screenshots&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Method 3: Learn Basic Gaming Japanese
&lt;/h2&gt;

&lt;p&gt;If you're serious about Japanese games, learning some common gaming vocabulary helps a lot:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Japanese&lt;/th&gt;
&lt;th&gt;Romaji&lt;/th&gt;
&lt;th&gt;English&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;攻撃&lt;/td&gt;
&lt;td&gt;kougeki&lt;/td&gt;
&lt;td&gt;Attack&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;防御&lt;/td&gt;
&lt;td&gt;bougyo&lt;/td&gt;
&lt;td&gt;Defense&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;体力&lt;/td&gt;
&lt;td&gt;tairyoku&lt;/td&gt;
&lt;td&gt;HP / Stamina&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;スキル&lt;/td&gt;
&lt;td&gt;sukiru&lt;/td&gt;
&lt;td&gt;Skill&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ガチャ&lt;/td&gt;
&lt;td&gt;gacha&lt;/td&gt;
&lt;td&gt;Gacha (lottery)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;クエスト&lt;/td&gt;
&lt;td&gt;kuesuto&lt;/td&gt;
&lt;td&gt;Quest&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;装備&lt;/td&gt;
&lt;td&gt;soubi&lt;/td&gt;
&lt;td&gt;Equipment&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;強化&lt;/td&gt;
&lt;td&gt;kyouka&lt;/td&gt;
&lt;td&gt;Enhance / Upgrade&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This won't help with story dialogue, but it makes navigating menus much easier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Method 4: Community Translations and Wikis
&lt;/h2&gt;

&lt;p&gt;For popular games, the English-speaking community often creates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fan-translated wikis with menu guides&lt;/li&gt;
&lt;li&gt;Reddit threads explaining game mechanics&lt;/li&gt;
&lt;li&gt;Discord servers with translation help channels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check Reddit (r/gachagaming) and Discord for your specific game.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which Method Should You Use?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Effort Level&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Floating translator&lt;/td&gt;
&lt;td&gt;Casual play, multiple games&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Screenshot + Google Translate&lt;/td&gt;
&lt;td&gt;Occasional translation&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Learn gaming Japanese&lt;/td&gt;
&lt;td&gt;Long-term players&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Community wikis&lt;/td&gt;
&lt;td&gt;Popular games only&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For most people, a floating screen translator is the best balance of convenience and coverage. You can play any game without preparation, and the translation is instant.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tips for Playing Japanese Games
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start with games that have simple UI&lt;/strong&gt; — puzzle and rhythm games have far less text to translate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use a translator for story sections&lt;/strong&gt; — skip translating every menu once you memorize the layout&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Join the game's English community&lt;/strong&gt; — other players have already figured out most things&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't try to translate everything&lt;/strong&gt; — focus on quest objectives and important dialogue&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Playing Japanese games without knowing Japanese is totally doable in 2026. Between floating translators, community resources, and basic vocabulary, you can enjoy almost any game regardless of the language barrier.&lt;/p&gt;

&lt;p&gt;If you're on Android, give &lt;a href="https://play.google.com/store/apps/details?id=com.screentranslator.app" rel="noopener noreferrer"&gt;Screen Translator&lt;/a&gt; a try — the floating bubble makes it really easy to translate game screens on the fly.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What Japanese games are you playing? Drop a comment and let me know if you have any translation tips to share!&lt;/em&gt;&lt;/p&gt;

</description>
      <category>android</category>
      <category>gamedev</category>
      <category>tutorial</category>
      <category>mobile</category>
    </item>
  </channel>
</rss>
