<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Cool Light Shop Co,. LTD</title>
    <description>The latest articles on DEV Community by Cool Light Shop Co,. LTD (@cool_lightshopcoltd_).</description>
    <link>https://dev.to/cool_lightshopcoltd_</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2123443%2F1be5b76c-61b4-4d31-8e84-29d9c14c5f74.png</url>
      <title>DEV Community: Cool Light Shop Co,. LTD</title>
      <link>https://dev.to/cool_lightshopcoltd_</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cool_lightshopcoltd_"/>
    <language>en</language>
    <item>
      <title>On-Device AI in iOS: How I Built a Screenshot Classifier Without Any Cloud Calls</title>
      <dc:creator>Cool Light Shop Co,. LTD</dc:creator>
      <pubDate>Tue, 26 May 2026 04:29:48 +0000</pubDate>
      <link>https://dev.to/cool_lightshopcoltd_/on-device-ai-in-ios-how-i-built-a-screenshot-classifier-without-any-cloud-calls-1971</link>
      <guid>https://dev.to/cool_lightshopcoltd_/on-device-ai-in-ios-how-i-built-a-screenshot-classifier-without-any-cloud-calls-1971</guid>
      <description>&lt;h2&gt;
  
  
  Why On-Device AI Matters for Screenshots
&lt;/h2&gt;

&lt;p&gt;Screenshots are sensitive. They contain prices, flight details, personal conversations, and banking info. Uploading them to a cloud AI is a non-starter for most users.&lt;/p&gt;

&lt;p&gt;The good news: iOS gives you everything you need to build a capable AI pipeline that runs entirely on-device. Here's how I did it for &lt;strong&gt;Snaap&lt;/strong&gt;, an AI screenshot cleaner.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pipeline
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Find the Screenshots
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;options&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;PHFetchOptions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;predicate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;NSPredicate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nv"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"mediaType == %d AND (mediaSubtypes &amp;amp; %d) != 0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kt"&gt;PHAssetMediaType&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rawValue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kt"&gt;PHAssetMediaSubtype&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;photoScreenshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rawValue&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sortDescriptors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;NSSortDescriptor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"creationDate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;ascending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;screenshots&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;PHAsset&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetchAssets&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;with&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;iOS natively tags screenshots — no ML model needed for detection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Extract Text with Vision OCR
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kt"&gt;VNRecognizeTextRequest&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compactMap&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;$0&lt;/span&gt; &lt;span class="k"&gt;as?&lt;/span&gt; &lt;span class="kt"&gt;VNRecognizedTextObservation&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compactMap&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;topCandidates&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;joined&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;separator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;??&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recognitionLevel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;accurate&lt;/span&gt;
&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;usesLanguageCorrection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;VNRecognizeTextRequest&lt;/code&gt; at the &lt;code&gt;.accurate&lt;/code&gt; level catches even small text on product screenshots. Processing ~600 images takes about 90 seconds on an iPhone 14 Pro.&lt;/p&gt;
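
&lt;p&gt;The snippet above only configures the request. A minimal sketch of running it might look like this, assuming you already have a &lt;code&gt;CGImage&lt;/code&gt; for the screenshot. The &lt;code&gt;recognizeText(in:)&lt;/code&gt; helper is a hypothetical name for illustration, not Snaap's actual code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;import Vision
import CoreGraphics

// Hypothetical helper: runs OCR synchronously and returns the recognized
// lines joined by newlines. Call it from a background queue, because
// perform(_:) blocks until recognition finishes.
func recognizeText(in image: CGImage) throws -&amp;gt; String {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate
    request.usesLanguageCorrection = true

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Once perform(_:) returns, the observations are available on the request.
    let observations = request.results ?? []
    return observations
        .compactMap { $0.topCandidates(1).first?.string }
        .joined(separator: "\n")
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Reading &lt;code&gt;request.results&lt;/code&gt; after &lt;code&gt;perform&lt;/code&gt; avoids the completion closure entirely, which keeps the control flow linear when you are already off the main thread.&lt;/p&gt;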

&lt;h3&gt;
  
  
  Step 3: Rule-Based Classification
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="kd"&gt;func&lt;/span&gt; &lt;span class="nf"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;ocrText&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;Category&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ocrText&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lowercased&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;// Travel: keywords, or a flight code like VN123. The regex runs on the&lt;/span&gt;
    &lt;span class="c1"&gt;// original-case ocrText, because [A-Z] can never match the lowercased copy.&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"boarding pass"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"flight"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt;
       &lt;span class="n"&gt;ocrText&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;of&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"[A-Z]{2}&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;d{3,4}"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;regularExpression&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;travel&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Receipt: price patterns + keywords&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;pricePattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"[$£€¥]&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;s*&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;d+[.,]&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;d{2}"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"total"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; 
       &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;of&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pricePattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;options&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;regularExpression&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;receipt&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Recipe: multiple cooking keywords&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;recipeWords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"ingredients"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"tbsp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"preheat"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"bake"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"simmer"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;recipeWords&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recipe&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Code: programming keywords&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nv"&gt;codeWords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;"func "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"const "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"import "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"async"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"await"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;codeWords&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;other&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key insight: screenshots in the same category share highly predictable vocabulary. A flight booking almost always says "boarding pass" or "gate." A receipt almost always has a price and the word "total." You don't need an LLM for this; in a domain this narrow, domain-specific heuristics tend to work better.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Context Generation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="kd"&gt;func&lt;/span&gt; &lt;span class="nf"&gt;generateSentence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nv"&gt;screenshot&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;Screenshot&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;String&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;screenshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;travel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isDatePast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;screenshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extractedDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Flight to &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;destination&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt;. You already landed."&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"Flight to &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;destination&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt; — &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="nf"&gt;formatDate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;screenshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extractedDate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt;."&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nv"&gt;product&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;weeksAgo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;screenshot&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt; · &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt;. Saved &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;weeks&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt; weeks — still want it?"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt; · &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt; from &lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="s"&gt;."&lt;/span&gt;
    &lt;span class="c1"&gt;// ... etc&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The sentences are designed to &lt;strong&gt;prompt a decision&lt;/strong&gt;. "You already landed" makes it safe to delete. "Still want it?" keeps the door open. The goal isn't perfect accuracy — it's removing the fear of deleting.&lt;/p&gt;
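
&lt;p&gt;The template code calls &lt;code&gt;isDatePast&lt;/code&gt; and &lt;code&gt;weeksAgo&lt;/code&gt; without showing them. A minimal sketch of what such helpers could look like follows. The signatures, the optional &lt;code&gt;Date&lt;/code&gt; parameter, and the injectable &lt;code&gt;now&lt;/code&gt; are assumptions made here for illustration and testability.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;import Foundation

// Hypothetical helper: true when a date was extracted and it lies before `now`.
// A missing date returns false, so undated screenshots are never called expired.
func isDatePast(_ date: Date?, now: Date = Date()) -&amp;gt; Bool {
    guard let date = date else { return false }
    return date &amp;lt; now
}

// Hypothetical helper: whole weeks elapsed between `date` and `now`.
func weeksAgo(_ date: Date, now: Date = Date()) -&amp;gt; Int {
    let days = Calendar.current.dateComponents([.day], from: date, to: now).day ?? 0
    return days / 7
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; as a parameter keeps the sentence logic deterministic in unit tests.&lt;/p&gt;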

&lt;h3&gt;
  
  
  Step 5: Duplicate Detection with Perceptual Hashing
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;&lt;span class="kd"&gt;func&lt;/span&gt; &lt;span class="nf"&gt;computeHash&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nv"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;UIImage&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="kt"&gt;String&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Resize to 8x8 grayscale&lt;/span&gt;
    &lt;span class="c1"&gt;// Compute average brightness&lt;/span&gt;
    &lt;span class="c1"&gt;// Build 64-bit string: each bit = pixel &amp;gt; average&lt;/span&gt;
    &lt;span class="c1"&gt;// Hamming distance &amp;lt; 10 = duplicate&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



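&lt;p&gt;Strictly speaking, the scheme in these comments is an average hash (aHash), the simplest member of the perceptual-hash family; classic pHash adds a DCT step. It catches visually near-identical screenshots even if one is slightly cropped or shows a different timestamp. It found 42 duplicates in my library that I never knew existed.&lt;/p&gt;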
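
&lt;p&gt;The function above is left as comments. Here is a minimal sketch of the hashing and comparison steps those comments describe, under two assumptions: the image has already been resized to an 8x8 grayscale thumbnail (64 brightness values), and the hash is stored as a &lt;code&gt;UInt64&lt;/code&gt; rather than the &lt;code&gt;String?&lt;/code&gt; in the original signature. The function names are illustrative.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;// Builds a 64-bit hash from 64 grayscale values (0...255).
// Bit i is set when pixel i is brighter than the average of all 64 pixels.
func averageHash(grayPixels: [UInt8]) -&amp;gt; UInt64? {
    guard grayPixels.count == 64 else { return nil }
    let total = grayPixels.reduce(0) { $0 + Int($1) }
    let average = total / 64
    var hash: UInt64 = 0
    for (index, pixel) in grayPixels.enumerated() where Int(pixel) &amp;gt; average {
        hash |= UInt64(1) &amp;lt;&amp;lt; UInt64(index)
    }
    return hash
}

// Number of bit positions in which two hashes differ.
func hammingDistance(_ a: UInt64, _ b: UInt64) -&amp;gt; Int {
    (a ^ b).nonzeroBitCount
}

// The threshold of 10 comes from the post; tune it against your own library.
func isDuplicate(_ a: UInt64, _ b: UInt64, threshold: Int = 10) -&amp;gt; Bool {
    hammingDistance(a, b) &amp;lt; threshold
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Storing the hash as an integer makes the comparison a single XOR plus a bit count, which matters when every new screenshot is checked against hundreds of existing ones.&lt;/p&gt;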

&lt;h2&gt;
  
  
  Why Not Use an LLM?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Speed:&lt;/strong&gt; Rule-based classification is instant. No API latency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy:&lt;/strong&gt; Nothing leaves the device. Critical for screenshot content.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; $0 vs. paying per token.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability:&lt;/strong&gt; No hallucinations, no API outages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Offline:&lt;/strong&gt; Works on airplanes, in subways, anywhere.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For a constrained domain like screenshot classification, LLMs are overkill. The vocabulary is predictable, the categories are well-defined, and the cost of a misclassification is low (the user just taps "other").&lt;/p&gt;

&lt;h2&gt;
  
  
  Results &amp;amp; App
&lt;/h2&gt;

&lt;p&gt;Snaap is free on the App Store: &lt;a href="https://apps.apple.com/app/snaap-voucher-reminder-ai/id6770817204" rel="noopener noreferrer"&gt;https://apps.apple.com/app/snaap-voucher-reminder-ai/id6770817204&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The entire AI pipeline — OCR, classification, context generation, duplicate detection, expiry checking — runs in about 0.15 seconds per screenshot on device. No network calls, no backend, no user accounts.&lt;/p&gt;

&lt;p&gt;If you're building an iOS app that touches user data, I'd strongly recommend exploring on-device AI first. The frameworks are solid, the privacy story is compelling, and users genuinely appreciate it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Vision, PhotoKit, GRDB, SwiftUI + UIKit. iOS 16+.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ios</category>
      <category>swift</category>
      <category>ai</category>
      <category>privacy</category>
    </item>
    <item>
      <title>I Built an AI That Reads Your Screenshots and Tells You Why You Saved Them</title>
      <dc:creator>Cool Light Shop Co,. LTD</dc:creator>
      <pubDate>Tue, 26 May 2026 04:29:34 +0000</pubDate>
      <link>https://dev.to/cool_lightshopcoltd_/i-built-an-ai-that-reads-your-screenshots-and-tells-you-why-you-saved-them-29nn</link>
      <guid>https://dev.to/cool_lightshopcoltd_/i-built-an-ai-that-reads-your-screenshots-and-tells-you-why-you-saved-them-29nn</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;I had 2,847 screenshots on my iPhone. Boarding passes from 2023. Products I screenshotted but never bought. Recipes I saved and forgot. Memes I already sent to everyone.&lt;/p&gt;

&lt;p&gt;Apple's Photos app treats them as... photos. Not as the half-finished intentions they actually are.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Insight
&lt;/h2&gt;

&lt;p&gt;Every screenshot is an &lt;strong&gt;externalized intention&lt;/strong&gt; — something you meant to do, buy, watch, or remember. But the camera roll is a terrible task manager.&lt;/p&gt;

&lt;p&gt;I realized the reason people don't delete screenshots isn't laziness — it's &lt;strong&gt;fear of deleting something important&lt;/strong&gt;. If you knew what each screenshot was, you'd decide in one second. The problem is information, not motivation.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Snaap&lt;/strong&gt; — an iOS app that reads every screenshot with on-device AI and generates a one-sentence explanation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Boarding pass for flight VN123 — departed Feb 14. Trip is over."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Nike Air Max $89 from Instagram. Saved 3 months ago — not bought yet."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Pasta recipe with 8 ingredients. Saved 4 weeks ago — never cooked."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once you know &lt;em&gt;why&lt;/em&gt; you saved it, the decision to keep or delete becomes instant.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works (Tech Stack)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Screenshot Detection
&lt;/h3&gt;

&lt;p&gt;iOS exposes &lt;code&gt;PHAssetMediaSubtype.photoScreenshot&lt;/code&gt; — no ML needed to find screenshots. PhotoKit handles ingestion and change observation so new screenshots appear automatically.&lt;/p&gt;
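
&lt;p&gt;As a rough illustration of the change-observation part, the sketch below keeps a screenshot fetch result current using PhotoKit's &lt;code&gt;PHPhotoLibraryChangeObserver&lt;/code&gt;. The &lt;code&gt;ScreenshotObserver&lt;/code&gt; class and its &lt;code&gt;onNewScreenshots&lt;/code&gt; callback are hypothetical names, and the sketch assumes photo library access has already been granted.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;import Photos

// Hypothetical observer that keeps a screenshot fetch result up to date.
final class ScreenshotObserver: NSObject, PHPhotoLibraryChangeObserver {
    private var screenshots: PHFetchResult&amp;lt;PHAsset&amp;gt;
    // Called with the screenshots inserted since the previous update.
    var onNewScreenshots: (([PHAsset]) -&amp;gt; Void)?

    init(screenshots: PHFetchResult&amp;lt;PHAsset&amp;gt;) {
        self.screenshots = screenshots
        super.init()
        PHPhotoLibrary.shared().register(self)
    }

    deinit {
        PHPhotoLibrary.shared().unregisterChangeObserver(self)
    }

    // PhotoKit calls this on a background queue whenever the library changes,
    // so hop to the main queue before touching UI state.
    func photoLibraryDidChange(_ changeInstance: PHChange) {
        guard let details = changeInstance.changeDetails(for: screenshots) else { return }
        screenshots = details.fetchResultAfterChanges
        let inserted = details.insertedObjects
        if !inserted.isEmpty {
            onNewScreenshots?(inserted)
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;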

&lt;h3&gt;
  
  
  2. On-Device OCR
&lt;/h3&gt;

&lt;p&gt;Apple's Vision framework (&lt;code&gt;VNRecognizeTextRequest&lt;/code&gt;) extracts all text from each screenshot. OCR runs on a background queue, and the results are cached in SQLite.&lt;/p&gt;
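
&lt;p&gt;A cache like that could be sketched with GRDB roughly as follows. The &lt;code&gt;OCRCache&lt;/code&gt; type, the &lt;code&gt;ocr_cache&lt;/code&gt; table, and its column names are invented for this example and are not Snaap's actual schema; the key is assumed to be the asset's &lt;code&gt;localIdentifier&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;import GRDB

// Hypothetical cache of OCR text, keyed by PHAsset.localIdentifier.
struct OCRCache {
    let dbQueue: DatabaseQueue

    init(path: String) throws {
        dbQueue = try DatabaseQueue(path: path)
        try dbQueue.write { db in
            try db.execute(sql: """
                CREATE TABLE IF NOT EXISTS ocr_cache (
                    asset_id TEXT PRIMARY KEY,
                    text TEXT NOT NULL
                )
                """)
        }
    }

    // Stores or overwrites the OCR text for one asset.
    func save(text: String, forAssetID id: String) throws {
        try dbQueue.write { db in
            try db.execute(
                sql: "INSERT OR REPLACE INTO ocr_cache (asset_id, text) VALUES (?, ?)",
                arguments: [id, text])
        }
    }

    // Returns the cached text, or nil if the asset has not been scanned yet.
    func text(forAssetID id: String) throws -&amp;gt; String? {
        try dbQueue.read { db in
            try String.fetchOne(
                db,
                sql: "SELECT text FROM ocr_cache WHERE asset_id = ?",
                arguments: [id])
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Checking the cache before running Vision means a rescan only pays the OCR cost for screenshots added since the last session.&lt;/p&gt;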

&lt;h3&gt;
  
  
  3. Rule-Based Classification
&lt;/h3&gt;

&lt;p&gt;No GPT, no cloud, no API calls. A keyword + regex classifier buckets screenshots into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Receipt&lt;/strong&gt; (price patterns, "total", "invoice")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Travel&lt;/strong&gt; (flight codes, "boarding pass", "gate")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recipe&lt;/strong&gt; ("ingredients", "tbsp", "preheat")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product&lt;/strong&gt; ("add to cart", "buy now", "sale")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code&lt;/strong&gt; ("func", "const", "import")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meme/Other&lt;/strong&gt; (fallback)&lt;/li&gt;
&lt;/ul&gt;
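
&lt;p&gt;The keyword half of such a classifier can be sketched as a small rule table. The keywords below are copied from the list above; the &lt;code&gt;KeywordRule&lt;/code&gt; type, the rule order, and the two-hit threshold are assumptions, and the regex checks (price patterns, flight codes) are left out to keep the example short.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;import Foundation

enum Category {
    case receipt, travel, recipe, product, code, other
}

// Hypothetical rule type: a category plus the keywords that suggest it.
struct KeywordRule {
    let category: Category
    let keywords: [String]
}

// Rules are checked in order, so earlier entries win ties.
let keywordRules = [
    KeywordRule(category: .receipt, keywords: ["total", "invoice"]),
    KeywordRule(category: .travel, keywords: ["boarding pass", "gate"]),
    KeywordRule(category: .recipe, keywords: ["ingredients", "tbsp", "preheat"]),
    KeywordRule(category: .product, keywords: ["add to cart", "buy now", "sale"]),
    KeywordRule(category: .code, keywords: ["func ", "const ", "import "]),
]

// Returns the first category with at least `minimumHits` matching keywords.
func classifyByKeywords(_ ocrText: String, minimumHits: Int = 2) -&amp;gt; Category {
    let text = ocrText.lowercased()
    for rule in keywordRules {
        let hits = rule.keywords.filter { text.contains($0) }.count
        if hits &amp;gt;= minimumHits { return rule.category }
    }
    return .other
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Requiring two hits trades some recall for fewer false positives: a single "sale" or "gate" in a chat screenshot is not enough to claim a category.&lt;/p&gt;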

&lt;h3&gt;
  
  
  4. Context Generation
&lt;/h3&gt;

&lt;p&gt;A template engine turns classification + extracted data into human-readable sentences. Dates, prices, source apps, and time-since-saved are all woven in.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Duplicate Detection
&lt;/h3&gt;

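&lt;p&gt;Perceptual hashing (an average hash, the simplest variant): resize to 8x8 grayscale, compare each pixel with the mean brightness to build a 64-bit string, and treat two screenshots as duplicates when the Hamming distance between their hashes is &amp;lt; 10.&lt;/p&gt;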

&lt;h3&gt;
  
  
  6. Expiry Detection
&lt;/h3&gt;

&lt;p&gt;A regex extracts dates from travel screenshots and compares them with the current date. An expired boarding pass is safe to delete.&lt;/p&gt;
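
&lt;p&gt;A minimal sketch of that idea follows, assuming dates written like "14 Feb 2025". Real boarding passes use many formats, so the pattern, the &lt;code&gt;extractDate&lt;/code&gt; and &lt;code&gt;isExpired&lt;/code&gt; names, and the UTC time zone are illustrative assumptions rather than the app's actual rules.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight swift"&gt;&lt;code&gt;import Foundation

// Finds the first date written like "14 Feb 2025" in the OCR text.
func extractDate(from text: String) -&amp;gt; Date? {
    let pattern = "\\b\\d{1,2} [A-Za-z]{3} \\d{4}\\b"
    guard let range = text.range(of: pattern, options: .regularExpression) else {
        return nil
    }
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(identifier: "UTC")
    formatter.dateFormat = "d MMM yyyy"
    return formatter.date(from: String(text[range]))
}

// An item counts as expired once its date is strictly before `now`.
// Text without a recognizable date is never treated as expired.
func isExpired(_ text: String, now: Date = Date()) -&amp;gt; Bool {
    guard let date = extractDate(from: text) else { return false }
    return date &amp;lt; now
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Returning "not expired" when no date is found errs on the side of keeping a screenshot, which fits the goal of removing the fear of deleting.&lt;/p&gt;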

&lt;h3&gt;
  
  
  Architecture
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SwiftUI (Splash, Scan, Home, Settings)
    +
UIKit (Inbox card stack with UIPanGestureRecognizer swipes)
    |
PhotoKit → Vision OCR → Classifier → Sentence Engine → GRDB/SQLite
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No backend.&lt;/strong&gt; No API calls. No user accounts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;100% on-device.&lt;/strong&gt; Screenshots never leave the phone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid SwiftUI + UIKit.&lt;/strong&gt; SwiftUI for static screens, UIKit for the gesture-heavy card stack.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;My first real session:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;634 screenshots scanned in ~90 seconds&lt;/li&gt;
&lt;li&gt;89 expired items auto-detected&lt;/li&gt;
&lt;li&gt;42 duplicates found&lt;/li&gt;
&lt;li&gt;612 cleaned in 4 minutes&lt;/li&gt;
&lt;li&gt;Freed ~1.2 GB&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rule-based AI is underrated.&lt;/strong&gt; For a constrained domain like screenshots, regex + keywords outperform LLMs on speed, cost, and privacy. No hallucinations either.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The swipe UX is the product.&lt;/strong&gt; The AI just enables fast decisions — the card stack + gesture interface is what makes it feel good.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;"Why you saved it" &amp;gt; "organize."&lt;/strong&gt; Users don't want another filing system. They want closure on unfinished intentions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;iOS has great primitives.&lt;/strong&gt; &lt;code&gt;PHAssetMediaSubtype.photoScreenshot&lt;/code&gt; and Vision OCR ship with the OS, and GRDB (a third-party SQLite library) covers storage. Together they give you everything you need for a privacy-first AI app.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;Snaap is free on the App Store: &lt;a href="https://apps.apple.com/app/snaap-voucher-reminder-ai/id6770817204" rel="noopener noreferrer"&gt;https://apps.apple.com/app/snaap-voucher-reminder-ai/id6770817204&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Would love feedback from the dev community — especially on the classification approach and swipe UX!&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Swift, SwiftUI, UIKit, Vision, PhotoKit, and GRDB. iOS 16+.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ios</category>
      <category>swift</category>
      <category>ai</category>
      <category>indiedev</category>
    </item>
  </channel>
</rss>
