I built an open-source tool that cleans a decade-old mailbox with local-first AI

#ai #python #opensource #showdev

The mailbox that ate ten years

I'd had the same Gmail account for ~14 years. Opening it had become depressing: ~40,000 newsletters I never read, spam from services that no longer exist, receipts and notifications from tools I stopped using in 2016. The kind of mailbox you don't clean because the act of cleaning is the chore.

I'm a software architect (~10 years), so my instinct was: surely there's a tool. There wasn't - not really. There are scattered Python scripts and a few half-abandoned utilities, but nothing that felt like a complete, safe product: something that lets you say "show me everything from these 40 senders, tell me how many there are, and let me preview before I touch anything" - or better, "just figure out what's junk and clean it for me, without shipping my whole inbox to a cloud."

So I wrote a few lines of Python over IMAP, just for myself. Then it kept growing, and I decided to finish it properly and open-source it: IMAP Cleanup Tool.

Install with pip install "imap-cleanup-tool[web,ai]", then run imap-cleanup-tool-web.
Repo: https://github.com/mrpickles007/imap-cleanup-tool
Site: https://imapcleanuptool.com

The headline: AI Cleanup that runs local-first

The feature I'm proudest of is AI Cleanup: hand "which of these do I actually want?" to a model, safely - and efficiently. The key design choice is that it works on aggregated per-sender statistics, not your individual emails. It never feeds a whole mailbox to an LLM (that would be slow and make the token count explode); a local heuristic does the bulk of the work, and only a short list of borderline senders ever reaches the model.

A local heuristic scores every sender 0-10 from signals read on your machine: List-Unsubscribe, the share of unread messages, send frequency, Precedence: bulk, and sender patterns (noreply@, newsletter@). Weights are calibrated and tunable. This step never leaves your machine.
The LLM only sees the borderline cases. Only senders at or above your threshold go to a model, with a few sample subjects each (plus stats) - never the message body. It returns strict JSON verdicts (pydantic-validated, retried on bad output) saying what to delete.
You stay in control. A Report only mode shows exactly what would be deleted and changes nothing; actually deleting is a separate, deliberate step (off by default).
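
To make the scoring step concrete, here is a minimal sketch of a per-sender heuristic built from the signals listed above. The weights, the `SenderStats` fields, and the function names are hypothetical and chosen only for illustration; they are not the tool's calibrated values or its actual API.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; they sum to 10 so the score
# naturally lands in the 0-10 range described in the post.
WEIGHTS = {
    "list_unsubscribe": 3.0,
    "precedence_bulk": 2.0,
    "bulk_local_part": 1.5,
    "unread_share": 2.5,  # multiplied by the fraction of unread messages
    "frequency": 1.0,     # multiplied by messages per day, capped at 1
}
BULK_LOCAL_PARTS = ("noreply", "no-reply", "newsletter")


@dataclass
class SenderStats:
    address: str
    total: int
    unread: int
    has_list_unsubscribe: bool
    precedence_bulk: bool
    per_week: float


def junk_score(s: SenderStats) -> float:
    """Score one sender from 0 (keep) to 10 (almost certainly bulk)."""
    local = s.address.split("@", 1)[0].lower()
    score = 0.0
    if s.has_list_unsubscribe:
        score += WEIGHTS["list_unsubscribe"]
    if s.precedence_bulk:
        score += WEIGHTS["precedence_bulk"]
    if local.startswith(BULK_LOCAL_PARTS):
        score += WEIGHTS["bulk_local_part"]
    unread_share = s.unread / s.total if s.total else 0.0
    score += WEIGHTS["unread_share"] * unread_share
    score += WEIGHTS["frequency"] * min(s.per_week / 7.0, 1.0)
    return round(min(score, 10.0), 2)


def senders_for_llm(stats, threshold: float):
    """Only senders at or above the threshold are passed on to the model."""
    return [s.address for s in stats if junk_score(s) >= threshold]
```

Everything here runs on aggregated statistics, so the only thing that could ever leave the machine is the short list returned by `senders_for_llm`.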

💸 What does it cost? In my testing, cleaning a ~40,000-message Gmail mailbox with gpt-4o-mini cost about €0.03 and removed ~13,000 emails in ~5 minutes. Only senders at or above the threshold ever reach the LLM (with a few subjects each), so cost stays tiny - and a local Ollama model is free. It even gets cheaper the more you run it: senders already saved as spam are not sent to the model again on later runs, so each cleanup sends fewer addresses than the last.

The two things that make this something I'd actually run on my own inbox:

Local-first. Configure a free local model via Ollama (ollama/llama3 works out of the box) and nothing leaves your machine - not the subjects, not the stats, nothing.
BYOA (Bring Your Own API key). Prefer a cloud model? Plug in OpenAI, OpenRouter, or anything litellm supports with your own key. Optional per-model cost tracking tells you what each run cost.

And the prompt has an explicit safeguard: it must KEEP anything that looks like online orders/receipts, appointments/bookings, medical/health, travel, banking/tax, security/2FA, or personal mail - only obvious bulk (newsletters, promotions, notifications) is ever marked deletable. The senders it flags also land in a per-account "spam addresses" list, so you can report them to the server and push their future mail straight to spam - or unsubscribe from them in bulk (see the next section).

One-click unsubscribe from newsletters

Deleting is half the battle; the other half is making the junk stop arriving. From that same spam addresses list you can bulk-unsubscribe from newsletters. The tool captures each sender's List-Unsubscribe header during an AI report, then, for the senders you select, does the unsubscribe automatically where the standard allows it:

a mailto: unsubscribe, sent for you from your active SMTP profile, or
an RFC 8058 one-click HTTPS POST.

Senders that only offer a plain confirmation-page link can't be automated - so instead of opening a wall of browser tabs, the list filters itself to those leftovers and you open each with its per-row link ↗. Honest framing: automatic for most, open-the-page for the rest. And it's a deliberate, outbound action meant for real newsletters - never for actual spam, where unsubscribing just confirms your address is live.
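
The three cases above (one-click POST, mailto:, and page-only links) can be told apart from the headers alone. The sketch below shows one way to classify them; the function name, the return values, and the preference for one-click over mailto: are assumptions for illustration, not necessarily how the tool decides.

```python
import re
from typing import Optional, Tuple

# RFC 8058 signals one-click support with this exact header value.
ONE_CLICK = "list-unsubscribe=one-click"


def classify_unsubscribe(
    list_unsub: str, list_unsub_post: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Return (method, target); method is 'one-click', 'mailto', 'manual' or 'none'."""
    # List-Unsubscribe carries one or more URIs in angle brackets.
    uris = re.findall(r"<\s*([^<>]+?)\s*>", list_unsub or "")
    https = [u for u in uris if u.lower().startswith("https://")]
    mailto = [u for u in uris if u.lower().startswith("mailto:")]
    post = (list_unsub_post or "").replace(" ", "").lower()

    if https and post == ONE_CLICK:
        return ("one-click", https[0])   # automate with an HTTPS POST
    if mailto:
        return ("mailto", mailto[0])     # automate by sending an email
    web = [u for u in uris if u.lower().startswith(("http://", "https://"))]
    if web:
        return ("manual", web[0])        # confirmation page: open by hand
    return ("none", None)
```

For the one-click case, RFC 8058 has the client send a POST whose body is `List-Unsubscribe=One-Click` to the HTTPS URI.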

Modern auth (OAuth2) where you need it

Microsoft has turned off password login over IMAP/SMTP for personal and 365 accounts, which quietly breaks a lot of older tools. So there's a "Sign in with Microsoft" built in: an OAuth2 device-code flow (open a URL, type a short code) that stores only an encrypted refresh token - no password - and authenticates with XOAUTH2. It works for reading mail (IMAP), for sending the notification emails (SMTP), and even for unattended scheduled jobs, because access tokens are minted silently from the refresh token. I implemented it with just the standard library (urllib / imaplib / smtplib), no vendor SDK, and the device-code flow has no localhost redirect, so it works headless / over SSH.
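
For readers who haven't seen XOAUTH2 before, the IMAP side needs nothing beyond the standard library. This is a minimal sketch, assuming you already hold a valid access token; the function names are hypothetical and the token refresh step is left out.

```python
import imaplib


def xoauth2_string(user: str, access_token: str) -> bytes:
    # SASL XOAUTH2 initial response: fields separated by Ctrl-A (0x01),
    # terminated by two Ctrl-A bytes.
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01".encode()


def login_xoauth2(host: str, user: str, access_token: str) -> imaplib.IMAP4_SSL:
    """Open an IMAP-over-TLS connection and authenticate with a bearer token."""
    conn = imaplib.IMAP4_SSL(host)
    # imaplib base64-encodes whatever the callback returns, so the raw
    # (unencoded) string is passed here.
    conn.authenticate("XOAUTH2", lambda _challenge: xoauth2_string(user, access_token))
    return conn
```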

Every other provider (Gmail, Yahoo, iCloud, your own domain) just connects with a normal or app password. And the OAuth providers live in a JSON config file (endpoints, client id, optional secret, scope), so adding Google - or pointing the tool at your own organization's app registration, the setup you'd want for an enterprise deployment - is a config edit, not a code change.

Still a precise manual tool when you want control

AI is optional. Under the hood it's still a complete, scriptable IMAP cleaner:

See who's flooding you. Export a CSV of every sender and how many emails each sent, ranked.
Match by anything. Sender, domain, or nested AND/OR rules built in a visual builder - no query syntax to memorize.
Count before you touch, with dry-run by default.
Move instead of delete. Send matches to another folder (or a Gmail label), create/delete folders on the fly. Great for archiving instead of nuking.
Fast on huge folders via server-side IMAP search, with batched deletes.
Email notifications: get a mail when a run finishes, with the AI report attached as CSV.
Web UI and CLI (full AI parity on the CLI), plus a guided tour on first run.
Scheduling through your OS (Task Scheduler / cron) - including scheduled AI jobs - for recurring cleanups.
Local-only. Your credentials never leave your machine. No cloud, no tracking.
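
The sender ranking in the first item is simple enough to sketch with the standard library. This assumes the From headers have already been fetched; the function names and CSV column names are hypothetical, not the tool's actual export format.

```python
import csv
import io
from collections import Counter
from email.utils import parseaddr


def rank_senders(from_headers):
    """Count messages per sender address, most frequent first."""
    counts = Counter()
    for raw in from_headers:
        _name, addr = parseaddr(raw)  # strips the display name
        if addr:
            counts[addr.lower()] += 1
    return counts.most_common()


def to_csv(ranked) -> str:
    """Render (sender, count) pairs as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sender", "count"])
    writer.writerows(ranked)
    return buf.getvalue()
```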

A 60-second tour

Let the AI build a report (nothing is deleted):

imap-cleanup-tool --host imap.gmail.com --user you@gmail.com \
    --ai-cleanup --ai-report-only --ai-report-csv report.csv

Find your worst offenders the manual way:

imap-cleanup-tool --host imap.gmail.com --user you@gmail.com --list-senders --save-senders senders.csv

Preview a rule-based cleanup (changes nothing):

imap-cleanup-tool --host imap.gmail.com --user you@gmail.com \
    --targets junk.txt --dry-run

Archive instead of delete - move matches into a folder/label:

imap-cleanup-tool --host HOST --user USER --create-folder "Archive/2013"
imap-cleanup-tool --host HOST --user USER --targets old.txt \
    --move --dest-folder "Archive/2013" --dry-run

Or just use the web UI - imap-cleanup-tool-web - if you'd rather click.

The part I actually care about: how it's built

This is where I let the engineering show, because the way something is built is the difference between a script and a tool you'd trust with your mailbox.

A UI-agnostic core. All IMAP logic lives in one module that knows nothing about argument parsing or HTML. It uses only the Python standard library (imaplib), which means the CLI has zero runtime dependencies; the web UI ([web]) and the AI features ([ai]) are optional, lazily-imported extras. Both front-ends call the same core, so there's no logic duplicated and no "works in the UI but not the CLI" drift.
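
One common way to keep extras optional is to import them only on the code path that needs them. The helper below is a sketch of that pattern; `require_extra` is a hypothetical name, not a function from the project.

```python
import importlib


def require_extra(module: str, extra: str):
    """Import an optional dependency only when a feature actually needs it."""
    try:
        return importlib.import_module(module)
    except ImportError as exc:
        # Turn a bare ImportError into an actionable install hint.
        raise RuntimeError(
            f'This feature needs the optional "{extra}" extra: '
            f'pip install "imap-cleanup-tool[{extra}]"'
        ) from exc
```

With this shape, the plain CLI never touches the optional packages unless an AI or web feature is invoked, for example via `require_extra("litellm", "ai")`.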

The AI layer is provider-agnostic and defensive. It uses litellm so local (Ollama) and cloud models share one path; LLM calls are batched with timeouts and cooperative cancellation; verdicts are validated with pydantic and retried before giving up. Only subjects and stats are ever sent.
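
The validate-then-retry loop is the part worth illustrating. The tool uses pydantic for validation; to keep this sketch dependency-free it swaps in a hand-written check with the standard library, and the `Verdict` fields and function names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    sender: str
    delete: bool


def parse_verdicts(raw: str) -> List[Verdict]:
    """Parse model output strictly; raise ValueError on anything unexpected."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of verdicts")
    verdicts = []
    for item in data:
        if not isinstance(item, dict):
            raise ValueError(f"malformed verdict: {item!r}")
        sender, delete = item.get("sender"), item.get("delete")
        if not isinstance(sender, str) or not isinstance(delete, bool):
            raise ValueError(f"malformed verdict: {item!r}")
        verdicts.append(Verdict(sender, delete))
    return verdicts


def ask_with_retry(ask: Callable[[], str], attempts: int = 3) -> List[Verdict]:
    """Call the model until its output validates, or give up after `attempts`."""
    last_error = None
    for _ in range(attempts):
        try:
            return parse_verdicts(ask())
        except ValueError as exc:
            last_error = exc
    raise RuntimeError(f"no valid verdicts after {attempts} attempts") from last_error
```

Here `ask` stands in for whatever function performs the actual LLM call, so the retry logic stays independent of the provider.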

Rules as a serializable tree. A rule is a Condition/Group tree (AND/OR, arbitrarily nestable) that compiles to an IMAP SEARCH string. The visual builder and the CLI's --rule grammar both produce the same tree - so a rule you build with clicks can be saved into a scheduled job and run headless.
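
To show what "compiles to an IMAP SEARCH string" can mean in practice, here is a small sketch. The `Condition` and `Group` names come from the description above, but their fields and the `compile_rule` function are assumptions for illustration. It relies on two facts about IMAP SEARCH: keys placed side by side are implicitly ANDed, and OR is a prefix operator that takes exactly two keys.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Condition:
    field: str  # an IMAP search key such as FROM or SUBJECT
    value: str


@dataclass
class Group:
    op: str  # "AND" or "OR"
    children: List[Union["Condition", "Group"]]


def _quote(value: str) -> str:
    # IMAP quoted strings escape backslash and double quote.
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def compile_rule(node: Union[Condition, Group], top: bool = True) -> str:
    """Compile a rule tree into an IMAP SEARCH criteria string."""
    if isinstance(node, Condition):
        return f"{node.field.upper()} {_quote(node.value)}"
    parts = [compile_rule(child, top=False) for child in node.children]
    if not parts:
        raise ValueError("empty group")
    if len(parts) == 1:
        return parts[0]
    if node.op == "AND":
        joined = " ".join(parts)
        # A nested AND must be parenthesized to count as a single key.
        return joined if top else f"({joined})"
    if node.op == "OR":
        # Fold n children into nested binary ORs: OR a OR b c
        expr = parts[-1]
        for part in reversed(parts[:-1]):
            expr = f"OR {part} {expr}"
        return expr
    raise ValueError(f"unknown operator: {node.op!r}")
```

Because the tree is plain dataclasses, `dataclasses.asdict` turns it into JSON-friendly dicts, which is one way a rule built in a UI could be stored in a scheduled job.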

Safety is a design constraint, not a footnote. Dry-run is the default everywhere. Deleting only flags messages; nothing is removed for good until you pass --expunge (or empty a folder) - and that expunge is batched, after I hit a real EXPUNGE => System Error emptying a Trash of ~11.5k messages in one shot. You can't move a folder into itself, and system folders (Trash, Sent, ...) can't be deleted - they're detected via IMAP special-use flags, not by name, so it works on localized and Gmail mailboxes too.
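
Batching the expunge can be sketched in a few lines of imaplib. This is one possible approach, not necessarily the tool's: it flags a chunk of UIDs as \Deleted and expunges before moving on, so each EXPUNGE removes at most one chunk. The batch size of 500 and the function names are hypothetical, and the sketch assumes no other messages in the selected folder are already flagged \Deleted.

```python
from typing import Iterator, List, Sequence


def batches(uids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` UIDs."""
    for start in range(0, len(uids), size):
        yield list(uids[start:start + size])


def expunge_in_batches(conn, uids: Sequence[str], size: int = 500) -> int:
    """Flag and expunge `uids` chunk by chunk.

    `conn` is an imaplib.IMAP4 (or IMAP4_SSL) with a folder already selected.
    Returns the number of UIDs submitted; server responses are not checked
    in this sketch.
    """
    submitted = 0
    for chunk in batches(uids, size):
        conn.uid("STORE", ",".join(chunk), "+FLAGS.SILENT", r"(\Deleted)")
        conn.expunge()
        submitted += len(chunk)
    return submitted
```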

Boring-but-important hygiene. 270+ automated tests; CI runs the suite on Python 3.10 through 3.14; releases go to PyPI via Trusted Publishing (no API tokens), each tag auto-creating a GitHub release. I tested it end-to-end on real Gmail and Outlook/Microsoft accounts and on a mailbox on my own domain.

A note on transparency: I designed and architected this, and used an AI assistant to speed up writing code. I leaned on the test suite and CI rather than trusting generated code blindly - and the code is all there to read. (That's separate from the product's own AI Cleanup feature.)

Install

# Python 3.10+; [web,ai] adds the web UI + AI Cleanup on top of the CLI
pip install "imap-cleanup-tool[web,ai]"
imap-cleanup-tool-web        # opens the local web UI

Everything is local. For AI, point it at a local Ollama model (nothing leaves your machine) or paste your own cloud API key. On Gmail, use an app password (not your real password) and you're set. Don't need AI? pip install "imap-cleanup-tool[web]".

Where it's going / contributing

It's AGPL-3.0, with issue templates and a contributing guide. I'd love help with edge cases on non-Gmail providers, more rule fields, translations, and feedback on which local models work well for the AI step. If you try it, a GitHub star genuinely helps, and bug reports/feature requests are read and answered.

Repo: https://github.com/mrpickles007/imap-cleanup-tool
Site: https://imapcleanuptool.com

That's it. I hope it gives you back a clean inbox.