DEV Community

Cool Light Shop Co,. LTD

I Built an AI That Reads Your Screenshots and Tells You Why You Saved Them

The Problem

I had 2,847 screenshots on my iPhone. Boarding passes from 2023. Products I screenshotted but never bought. Recipes I saved and forgot. Memes I already sent to everyone.

Apple's Photos app treats them as... photos. Not as the half-finished intentions they actually are.

The Insight

Every screenshot is an externalized intention — something you meant to do, buy, watch, or remember. But the camera roll is a terrible task manager.

I realized the reason people don't delete screenshots isn't laziness — it's fear of deleting something important. If you knew what each screenshot was, you'd decide in one second. The problem is information, not motivation.

What I Built

Snaap — an iOS app that reads every screenshot with on-device AI and generates a one-sentence explanation:

"Boarding pass for flight VN123 — departed Feb 14. Trip is over."

"Nike Air Max $89 from Instagram. Saved 3 months ago — not bought yet."

"Pasta recipe with 8 ingredients. Saved 4 weeks ago — never cooked."

Once you know why you saved it, the decision to keep or delete becomes instant.

How It Works (Tech Stack)

1. Screenshot Detection

iOS exposes PHAssetMediaSubtype.photoScreenshot — no ML needed to find screenshots. PhotoKit handles ingestion and change observation so new screenshots appear automatically.
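
As a rough sketch of that lookup, the snippet below fetches all images and keeps the ones flagged as screenshots. The helper name `fetchScreenshots` is made up for illustration, and it assumes photo library access has already been granted; it is not Snaap's actual code.

```swift
import Photos

// Hypothetical helper: returns every screenshot in the library, newest first.
// Assumes photo library authorization has already been granted.
func fetchScreenshots() -> [PHAsset] {
    let options = PHFetchOptions()
    options.sortDescriptors = [NSSortDescriptor(key: "creationDate", ascending: false)]

    let images = PHAsset.fetchAssets(with: .image, options: options)
    var screenshots: [PHAsset] = []
    images.enumerateObjects { asset, _, _ in
        // mediaSubtypes is an option set; screenshots carry .photoScreenshot.
        if asset.mediaSubtypes.contains(.photoScreenshot) {
            screenshots.append(asset)
        }
    }
    return screenshots
}
```

For the "appear automatically" part, PhotoKit's `PHPhotoLibraryChangeObserver` protocol (registered through `PHPhotoLibrary.shared().register(_:)`) is the usual hook for reacting to library changes.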

2. On-Device OCR

Apple's Vision framework (VNRecognizeTextRequest) extracts all text from each screenshot. OCR runs on a background queue, and the results are cached in SQLite.
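
A minimal sketch of the OCR call might look like this. The function name `recognizeText` and the choice of the `.accurate` recognition level are assumptions for illustration; the post does not say which settings Snaap uses.

```swift
import Vision
import CoreGraphics

// Hypothetical helper: returns the recognized lines of text in reading order
// as reported by Vision. Synchronous, so call it off the main thread.
func recognizeText(in image: CGImage) throws -> [String] {
    let request = VNRecognizeTextRequest()
    request.recognitionLevel = .accurate  // assumption: favor accuracy over speed

    let handler = VNImageRequestHandler(cgImage: image, options: [:])
    try handler.perform([request])

    // Each observation holds ranked candidates; keep the best one per line.
    return (request.results ?? []).compactMap { observation in
        observation.topCandidates(1).first?.string
    }
}
```

Because `perform(_:)` blocks until recognition finishes, one plausible way to match the background-queue setup is to wrap the call in `DispatchQueue.global(qos: .utility).async { ... }` and write the returned lines to the cache from there.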

3. Rule-Based Classification

No GPT, no cloud, no API calls. A keyword + regex classifier buckets screenshots into:

  • Receipt (price patterns, "total", "invoice")
  • Travel (flight codes, "boarding pass", "gate")
  • Recipe ("ingredients", "tbsp", "preheat")
  • Product ("add to cart", "buy now", "sale")
  • Code ("func", "const", "import")
  • Meme/Other (fallback)
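
To make the idea concrete, here is a small sketch of a keyword + regex classifier over the buckets above. The type names, the patterns, and the "most matching patterns wins" scoring are all assumptions for illustration; the real rule set is not shown in this post.

```swift
import Foundation

// Hypothetical category names mirroring the buckets listed above.
enum ScreenshotCategory {
    case receipt, travel, recipe, product, code, other
}

// One rule per bucket. Patterns are illustrative, not the app's real rules.
struct Rule {
    let category: ScreenshotCategory
    let patterns: [String]
}

let rules = [
    Rule(category: .receipt, patterns: [#"[$€£]\s?\d+(\.\d{2})?"#, #"\btotal\b"#, #"\binvoice\b"#]),
    Rule(category: .travel,  patterns: [#"\b[A-Z]{2}\d{2,4}\b"#, #"boarding pass"#, #"\bgate\b"#]),
    Rule(category: .recipe,  patterns: [#"\bingredients\b"#, #"\btbsp\b"#, #"\bpreheat\b"#]),
    Rule(category: .product, patterns: [#"add to cart"#, #"buy now"#, #"\bsale\b"#]),
    Rule(category: .code,    patterns: [#"\bfunc\b"#, #"\bconst\b"#, #"\bimport\b"#]),
]

// Case-insensitive regex search over the OCR text.
func matches(_ pattern: String, in text: String) -> Bool {
    text.range(of: pattern, options: [.regularExpression, .caseInsensitive]) != nil
}

// Picks the bucket with the most matching patterns; no match falls back to .other.
// Ties go to the rule listed first.
func classify(_ text: String) -> ScreenshotCategory {
    var best = ScreenshotCategory.other
    var bestScore = 0
    for rule in rules {
        let score = rule.patterns.filter { matches($0, in: text) }.count
        if score > bestScore {
            best = rule.category
            bestScore = score
        }
    }
    return best
}
```

Scoring by match count rather than first match is one way to stop a product page with a price on it from landing in the receipt bucket: `classify("Nike Air Max $89 Add to cart Buy now")` matches one receipt pattern but two product patterns.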

4. Context Generation

A template engine turns classification + extracted data into human-readable sentences. Dates, prices, source apps, and time-since-saved are all woven in.
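
A template step like this can be sketched in a few lines. The `ScreenshotFacts` struct, its fields, and the wording of each template are hypothetical stand-ins for whatever the app actually stores and prints.

```swift
import Foundation

// Hypothetical container for what the classifier and extractors produced.
struct ScreenshotFacts {
    let category: String      // e.g. "product", "recipe"
    let title: String
    let price: String?
    let sourceApp: String?
    let savedAt: Date
}

// Turns elapsed time into a rough phrase such as "3 months ago".
func ageDescription(from savedAt: Date, to now: Date) -> String {
    let days = Int(now.timeIntervalSince(savedAt) / 86_400)
    func plural(_ n: Int, _ unit: String) -> String {
        "\(n) \(unit)\(n == 1 ? "" : "s") ago"
    }
    switch days {
    case ..<1:   return "today"
    case 1..<7:  return plural(days, "day")
    case 7..<30: return plural(days / 7, "week")
    default:     return plural(days / 30, "month")
    }
}

// Fills a per-category template; optional fields are simply left out when missing.
func sentence(for facts: ScreenshotFacts, now: Date = Date()) -> String {
    let age = ageDescription(from: facts.savedAt, to: now)
    let price = facts.price.map { " \($0)" } ?? ""
    let source = facts.sourceApp.map { " from \($0)" } ?? ""
    switch facts.category {
    case "product":
        return "\(facts.title)\(price)\(source). Saved \(age), not bought yet."
    case "recipe":
        return "\(facts.title). Saved \(age), never cooked."
    default:
        return "\(facts.title). Saved \(age)."
    }
}
```

Passing `now` as a parameter keeps the sentence deterministic in tests, since the "time since saved" part otherwise changes every day.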

5. Duplicate Detection

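Perceptual hashing (pHash): resize each screenshot to 8x8 grayscale, compute a 64-bit string from the pixels, and flag two screenshots as duplicates when the Hamming distance between their hashes is below 10.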
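
The hashing and comparison can be sketched as follows. This post does not say how each bit is derived, so the sketch assumes the simplest option, thresholding every pixel against the mean (often called an average hash); a DCT-based pHash would compute the bits differently. It also assumes the 8x8 grayscale thumbnail has already been produced elsewhere and arrives as 64 bytes.

```swift
// Builds a 64-bit hash from an 8x8 grayscale thumbnail (row-major, 64 values).
// Assumption: a bit is 1 when the pixel is brighter than the mean.
func averageHash(_ pixels: [UInt8]) -> UInt64 {
    precondition(pixels.count == 64, "expects an 8x8 grayscale thumbnail")
    let mean = pixels.reduce(0) { $0 + Int($1) } / pixels.count
    var hash: UInt64 = 0
    for (index, value) in pixels.enumerated() where Int(value) > mean {
        hash |= UInt64(1) << UInt64(index)
    }
    return hash
}

// Number of bit positions where the two hashes differ.
func hammingDistance(_ a: UInt64, _ b: UInt64) -> Int {
    (a ^ b).nonzeroBitCount
}

// Threshold of 10 follows the text above.
func isLikelyDuplicate(_ a: UInt64, _ b: UInt64, threshold: Int = 10) -> Bool {
    hammingDistance(a, b) < threshold
}
```

Storing the hash as a `UInt64` keeps the comparison down to one XOR and a bit count, which stays cheap even when every screenshot is compared against every other.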

6. Expiry Detection

Regex extracts dates from travel screenshots, compares with current date. Expired boarding passes = safe to delete.
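
Here is one way the date check could look. It only understands dates written like "14 Feb 2024", which is an assumption made to keep the sketch short; real boarding passes use many formats, and the function names are hypothetical.

```swift
import Foundation

// Finds dates written as "14 Feb 2024" in OCR text. Other formats are ignored.
func extractDates(from text: String) -> [Date] {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(identifier: "UTC")
    formatter.dateFormat = "d MMM yyyy"

    let pattern = #"\b\d{1,2} (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \d{4}\b"#
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }

    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, range: fullRange).compactMap { match in
        Range(match.range, in: text).flatMap { formatter.date(from: String(text[$0])) }
    }
}

// Expired means the latest date on the screenshot ended at least a day ago.
// Text with no recognizable date is never reported as expired.
func isExpired(_ text: String, now: Date = Date()) -> Bool {
    guard let latest = extractDates(from: text).max() else { return false }
    return latest.addingTimeInterval(86_400) <= now
}
```

Using the latest date found, plus a one-day grace period, is a deliberately cautious choice: a pass for today's flight should not be offered for deletion before the trip happens.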

Architecture

SwiftUI (Splash, Scan, Home, Settings)
    +
UIKit (Inbox card stack with UIPanGestureRecognizer swipes)
    |
PhotoKit → Vision OCR → Classifier → Sentence Engine → GRDB/SQLite
  • No backend. No API calls. No user accounts.
  • 100% on-device. Screenshots never leave the phone.
  • Hybrid SwiftUI + UIKit. SwiftUI for static screens, UIKit for the gesture-heavy card stack.

Results

My first real session:

  • 634 screenshots scanned in ~90 seconds
  • 89 expired items auto-detected
  • 42 duplicates found
  • 612 cleaned in 4 minutes
  • ~1.2 GB of storage freed

What I Learned

  1. Rule-based AI is underrated. For a constrained domain like screenshots, regex + keywords outperform LLMs on speed, cost, and privacy. No hallucinations either.

  2. The swipe UX is the product. The AI just enables fast decisions — the card stack + gesture interface is what makes it feel good.

  3. "Why you saved it" > "organize." Users don't want another filing system. They want closure on unfinished intentions.

  4. iOS has great primitives. PHAssetMediaSubtype.photoScreenshot and Vision OCR ship with the OS, and with GRDB (a third-party SQLite library) on top you have everything you need for a privacy-first AI app.

Try It

Snaap is free on the App Store: https://apps.apple.com/app/snaap-voucher-reminder-ai/id6770817204

Would love feedback from the dev community — especially on the classification approach and swipe UX!


Built with Swift, SwiftUI, UIKit, Vision, PhotoKit, and GRDB. iOS 16+.
