Vortrix5

Posted on Jun 12

How I made deleting files hard to get wrong: building Sifty, a safety-first Windows cleaner

#ai #productivity #microsoft #cleaner

A free, open-source Windows maintenance tool for the terminal — and the design decisions behind trusting a program to delete your files.

I don't trust cleanup tools. That's an awkward thing to admit right before telling you I built one, but it's also the entire reason I built it.

The category has a reputation problem. The most famous Windows cleaner shipped a bundled cryptominer in one version and was compromised in a supply-chain attack in another. Most of them phone home, upsell you a "Pro" tier to fix problems they invented, and (the part that actually scares me) delete files permanently. You click "Clean," a progress bar fills, and whatever it decided was junk is gone. No Recycle Bin. No undo. You're trusting a closed-source binary's judgment with no take-backs.

So I wrote Sifty: a Windows 10/11 maintenance tool that runs in the terminal, is MIT-licensed, has zero telemetry, and is built from the ground up so that deleting the wrong thing is hard to do even on purpose. This post is about that last part — the design, and the code.

What it actually does

Quickly, so the rest makes sense. Sifty is a scriptable CLI plus a full-screen TUI that:

cleans junk and caches (temp files, browser caches, crash dumps, update leftovers — 11+ categories)
finds duplicate files (SHA-256, and NTFS-aware so hardlinks aren't double-counted) and your biggest space hogs
manages installed apps, startup items, services, and updates (via winget)
purges the clutter dev machines accumulate — node_modules, dist, __pycache__, orphaned git worktrees, bloated WSL2 virtual disks
has an optional AI assistant that runs locally via Ollama

pipx install sifty
sifty checkup              # one read-only scan of everything
sifty junk clean           # preview what it would remove (dry-run)
sifty junk clean --apply   # actually do it (asks first)

But the feature list isn't the interesting part. The interesting part is the constraint I put on myself: a tool that deletes files has no margin for "oops." Here's how that constraint shaped the code.

Principle 1: There is exactly one way to delete something

The single most important design rule in the whole codebase is this: nothing in Sifty calls os.remove, os.unlink, shutil.rmtree, or Path.unlink. Ever. Every deletion in the entire application funnels through one function, safety.trash():

def trash(
    path: str | Path,
    allow_subtrees: Sequence[str | Path] = (),
    extra_protected: Iterable[str | Path] = (),
    *,
    dry_run: bool = True,
) -> bool:
    """Send `path` to the Recycle Bin after a safety check."""
    assert_safe(path, allow_subtrees, extra_protected)
    if dry_run:
        return True
    send_to_trash(path)          # Send2Trash -> Recycle Bin, never permanent
    audit(f"TRASH {path}")
    return True

Three things are load-bearing here, and they're all in those few lines:

It routes to the Recycle Bin, not oblivion. send_to_trash is the project's one and only call to the Send2Trash library. If Sifty makes a mistake, the file is sitting in your Recycle Bin where you can drag it back. Permanent deletion is not a code path that exists.
dry_run=True is the default value of the parameter. Not a flag you remember to pass — the safe behavior is what you get if you do nothing. To actually delete, a caller has to explicitly opt out of safety. This inverts the usual danger: forgetting an argument makes Sifty more cautious, not less.
Every real deletion is audited. audit() appends a timestamped line to %APPDATA%\sifty\audit.log. If you ever wonder "what did this thing touch," there's a paper trail.
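For illustration, an append-only audit helper can be as small as the sketch below. This is not Sifty's actual audit(): the fallback to the home directory when APPDATA is unset, the UTC ISO timestamp format, and the log_path parameter are all assumptions made so the example runs anywhere.

```python
import os
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def audit_log_path() -> Path:
    # Assumption: fall back to the home directory when APPDATA is unset
    # (for example when running off Windows).
    base = os.environ.get("APPDATA") or str(Path.home())
    return Path(base) / "sifty" / "audit.log"


def audit(message: str, log_path: Optional[Path] = None) -> None:
    """Append one timestamped line to the log; existing lines are never rewritten."""
    path = log_path or audit_log_path()
    path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with path.open("a", encoding="utf-8") as fh:  # "a" = append-only
        fh.write(f"{stamp} {message}\n")
```

Opening the file in append mode means earlier entries are never touched, which is the property you want from a paper trail.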

Because this is the only delete path, I can make one airtight guarantee about the whole program by reasoning about one function. And I enforce it the dumb, reliable way: a test greps the source tree and fails CI if os.remove, os.unlink, shutil.rmtree, or Path.unlink shows up anywhere outside this file. You can't accidentally reintroduce a raw delete in a feature PR; the build goes red.
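A check like that fits in a few lines. The sketch below is an illustration of the idea, not the test from the Sifty repo: the function name find_raw_deletes, the regex, and the assumption that the exempt file is called safety.py are all hypothetical.

```python
import re
from pathlib import Path

# Raw delete calls that should never appear outside the safety module.
FORBIDDEN = re.compile(
    r"\b(?:os\.remove|os\.unlink|shutil\.rmtree)\s*\(|\.unlink\s*\("
)


def find_raw_deletes(src_root, exempt=("safety.py",)):
    """Return (file name, line number, line) for every forbidden call found."""
    hits = []
    for py_file in sorted(Path(src_root).rglob("*.py")):
        if py_file.name in exempt:
            continue
        text = py_file.read_text(encoding="utf-8")
        for lineno, line in enumerate(text.splitlines(), start=1):
            code = line.split("#", 1)[0]  # crude: ignore trailing comments
            if FORBIDDEN.search(code):
                hits.append((py_file.name, lineno, line.strip()))
    return hits
```

In a test you would assert that find_raw_deletes("src") returns an empty list. A regex scan is deliberately blunt: it can be fooled by aliased imports, but it is cheap, fast, and catches the honest mistake, which is the case that matters in a feature PR.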

Principle 2: Some paths are refused no matter what

The Recycle Bin saves you from permanent loss, but sending C:\Windows to the Recycle Bin still bricks your machine. So before anything gets trashed, it goes through assert_safe(), which asks is_protected(). This is where the actual judgment lives, and it's a two-tier model that took me a few iterations to get right.

The naive approach — "refuse a hardcoded list of system folders" — has a subtle bug. If you protect C:\Windows but the user aims a delete at C:\, you've just authorized deleting the parent, which takes Windows with it. And if you protect C:\ itself by refusing everything under it, you've made the user's entire disk undeletable, which makes the cleaner useless.

The fix was to split protected roots into two kinds:

def is_protected(path, allow_subtrees=(), extra_protected=()) -> bool:
    target = _norm(Path(path))
    allowed = [_norm(Path(a)) for a in allow_subtrees]

    for root in contents_protected_roots(extra_protected):
        # Deleting the root itself - OR AN ANCESTOR of it - is always refused.
        if target == root or _is_relative_to(root, target):
            return True
        # Deleting something *inside* it is refused unless a caller vouched.
        if _is_relative_to(target, root):
            if any(_is_relative_to(target, a) for a in allowed):
                return False
            return True

    for root in self_protected_roots():
        # Only the root itself (or an ancestor) is off-limits; contents are OK.
        if target == root or _is_relative_to(root, target):
            return True

    return False

Contents-protected roots — C:\Windows, the Program Files trees, ProgramData — refuse the root and everything inside it. You can't touch them or their contents.

Self-protected roots (the drive root C:\ and your user profile C:\Users\you) refuse only the root itself or an ancestor of it. Ordinary files inside stay deletable (otherwise the tool couldn't clean your Downloads), but the root can't be nuked wholesale.

And notice the _is_relative_to(root, target) check in both loops: that's the ancestor guard. Aim at C:\ and the check sees that a protected root (C:\Windows) lives underneath your target, so it refuses — closing the "delete the parent" hole.
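To see the two tiers and the ancestor guard behave, here is a stripped-down, runnable version. It is a sketch under assumptions, not Sifty's code: the root lists are hardcoded for illustration, is_within and is_protected_demo are hypothetical names, and PureWindowsPath stands in for the real normalisation so the example runs on any OS. It does not resolve ".." or symlinks, which a real _norm would need to handle.

```python
from pathlib import PureWindowsPath

# Illustrative roots only; the real lists come from Sifty's own helpers.
CONTENTS_PROTECTED = [
    PureWindowsPath(r"C:\Windows"),
    PureWindowsPath(r"C:\Program Files"),
]
SELF_PROTECTED = [
    PureWindowsPath("C:\\"),
    PureWindowsPath(r"C:\Users\you"),
]


def is_within(child: PureWindowsPath, parent: PureWindowsPath) -> bool:
    """True if child equals parent or lies underneath it (component-wise)."""
    return child == parent or parent in child.parents


def is_protected_demo(path: str) -> bool:
    target = PureWindowsPath(path)
    for root in CONTENTS_PROTECTED:
        # Refuse the root, anything inside it, and any ancestor containing it.
        if is_within(target, root) or is_within(root, target):
            return True
    for root in SELF_PROTECTED:
        # Refuse only the root itself or an ancestor; contents stay deletable.
        if is_within(root, target):
            return True
    return False


print(is_protected_demo("C:\\"))                              # True
print(is_protected_demo(r"C:\Users\you\Downloads\old.zip"))   # False
```

C:\ is refused by the very first loop because C:\Windows sits underneath it, while a file in Downloads passes both loops untouched.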

The key safety property: these checks fire even with --apply --yes. There is no override flag, no --force, no "I really mean it." A protected path is simply not deletable by Sifty, full stop.

But cleaners need to touch system folders

Here's the tension: the whole point of a cleaner is to clear C:\Windows\Temp, which lives inside a contents-protected root. A blanket refusal would block the feature.

That's what allow_subtrees is for. It's a per-call carve-out where a specific module vouches for a specific subtree. The junk-cleaning module passes allow_subtrees=[r"C:\Windows\Temp"], and only then is that one folder permitted — while the rest of C:\Windows stays locked. The permission is narrow, explicit, and lives at the call site where a human decided it was OK, not buried in a global allowlist. Default-deny, with auditable exceptions.
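The carve-out is easiest to understand in isolation. The sketch below models a single contents-protected root; refused and is_within are hypothetical names, and the paths are assumed to be already normalised (a path containing ".." would need resolving before a check like this can be trusted).

```python
from pathlib import PureWindowsPath

# One contents-protected root, hardcoded for illustration.
WINDOWS = PureWindowsPath(r"C:\Windows")


def is_within(child, parent):
    """True if child equals parent or lies underneath it (component-wise)."""
    return child == parent or parent in child.parents


def refused(path, allow_subtrees=()):
    """Default-deny inside the protected root unless a caller vouched for the subtree."""
    target = PureWindowsPath(path)
    if is_within(WINDOWS, target):
        return True  # the root itself or an ancestor: no carve-out applies
    if is_within(target, WINDOWS):
        allowed = [PureWindowsPath(a) for a in allow_subtrees]
        return not any(is_within(target, a) for a in allowed)
    return False


temp_only = [r"C:\Windows\Temp"]
print(refused(r"C:\Windows\Temp\a.tmp"))               # True: nobody vouched
print(refused(r"C:\Windows\Temp\a.tmp", temp_only))    # False: carve-out applies
print(refused(r"C:\Windows\System32\x.dll", temp_only))  # True: outside the carve-out
```

Comparing path components rather than string prefixes matters here: C:\Windows\Temporary does not sneak through a carve-out for C:\Windows\Temp.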

Principle 3: The AI is advisory, local, and blind to your file contents

Plenty of "AI-powered" tools mean "we ship your data to a cloud model." I wanted the opposite, on every axis.

Sifty's optional assistant runs on Ollama — a local model on your own machine. Nothing leaves your computer. But "local" wasn't enough on its own; I gave it three hard limits:

It only ever sees metadata. The advisor builds prompts from file names, sizes, and paths — never file contents. The model can reason that a 40 GB folder of .mp4 files in Downloads is probably reclaimable; it cannot read your documents, because they're never in the prompt.
It cannot delete anything. The AI has no access to trash(). It's advisory: it explains and recommends, and that's the end of its authority.
Every action it proposes needs your click. It's agentic — it can propose running a scan or a cleanup — but each proposed tool call shows up as Run / Skip buttons inline in the conversation. The AI suggests; you decide. It never acts on its own.
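Two of those limits are easy to express in code. The sketch below is a hypothetical illustration, not Sifty's advisor: FileMeta, collect_metadata, build_prompt, ProposedAction, and resolve are invented names. The point is structural: the prompt is built from os.stat results only, so file contents are never read, and a proposed action is just data until a human approves it.

```python
import os
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FileMeta:
    path: str
    size_bytes: int


def collect_metadata(paths):
    """Stat each path; files are never opened, so contents cannot leak."""
    return [FileMeta(str(p), os.stat(p).st_size) for p in paths]


def build_prompt(items):
    """Build the model prompt from paths and sizes only."""
    lines = [f"{m.path} ({m.size_bytes} bytes)" for m in items]
    return "Which of these look reclaimable?\n" + "\n".join(lines)


@dataclass(frozen=True)
class ProposedAction:
    description: str
    run: Callable[[], str]


def resolve(action, approved: bool):
    """Run a proposed action only if the user clicked Run; Skip does nothing."""
    return action.run() if approved else None
```

Because the model only ever produces ProposedAction values and resolve is called from the UI, the "hand on the keyboard" stays with the user by construction.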

The mental model I kept: the AI is a knowledgeable advisor sitting next to you, not a hand on the keyboard.

Why this is a library with a CLI on top

One more decision that pays off for safety and contributors: Sifty is layered, and the layers point one direction only.

cli/  tui/        <- thin frontends (Typer, Textual)
        |
       core/      <- the engine: junk, disk, apps, safety, ...  (no UI code)
        |
   windows/  infra/   <- OS primitives (winget, Recycle Bin, UAC) + config/logging

Frontends call core; core calls windows/infra; nothing imports upward, and OS-specific calls are quarantined in windows/. The CLI and TUI are deliberately dumb — they parse arguments and print results. All the logic that matters lives in plain, testable functions like junk.scan() and disk.find_duplicates().

Two payoffs:

The safety guarantees are testable in isolation. The tests for protected paths don't go anywhere near a terminal. They monkeypatch the environment (SystemRoot, ProgramFiles, Path.home) and point everything at a tmp_path sandbox, so the protected-path logic runs deterministically — and the suite passes on CI's Linux runners even though Sifty is a Windows tool. The safety layer is the most heavily tested code in the repo by a wide margin.
A future GUI is a frontend swap, not a rewrite. It would call the same core functions, inheriting the same single delete path and the same protections for free.
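The sandboxing trick from the first payoff looks roughly like this. Sifty's tests use pytest's monkeypatch and tmp_path; this sketch swaps in the standard library's unittest.mock.patch.dict and tempfile so it runs without pytest. contents_protected_roots_demo is a hypothetical stand-in that assumes the protected roots are derived from environment variables.

```python
import os
import tempfile
from pathlib import Path
from unittest import mock


def contents_protected_roots_demo():
    """Hypothetical stand-in: derive protected roots from the environment."""
    names = ("SystemRoot", "ProgramFiles")
    return [Path(os.environ[n]) for n in names if n in os.environ]


def run_sandboxed_check():
    """Point the 'system' folders at a throwaway directory and read them back."""
    with tempfile.TemporaryDirectory() as sandbox:
        fake_env = {
            "SystemRoot": str(Path(sandbox) / "Windows"),
            "ProgramFiles": str(Path(sandbox) / "Program Files"),
        }
        # patch.dict restores the original environment on exit.
        with mock.patch.dict(os.environ, fake_env):
            roots = contents_protected_roots_demo()
            return [r.relative_to(sandbox).as_posix() for r in roots]


print(run_sandboxed_check())  # ['Windows', 'Program Files']
```

Nothing here touches a real system folder, which is why the same check can run on a Linux CI runner.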

What I'd tell anyone building something that deletes

If there's one transferable lesson, it's this: make the safe thing the default and the dangerous thing loud. Concretely, the moves that worked:

One choke point. Funnel every destructive action through a single function, then enforce it mechanically (a test that greps for the raw calls). One function to audit beats one hundred call sites to trust.
Reversibility over correctness. I will never write a perfect "is this junk?" classifier. So I didn't try — I made the cost of being wrong small. Recycle Bin, audit log, sifty undo.
Default to the cautious behavior. Dry-run as the parameter default means a forgotten argument fails safe. Safety you have to remember to turn on is safety you'll eventually forget.
No override for the truly dangerous stuff. Protected paths have no --force. A door that can always be forced open isn't a lock.

Try it / tear it apart

Sifty is on PyPI and MIT-licensed:

pipx install sifty        # or: pip install sifty
sifty checkup             # read-only - see what it'd find, deletes nothing

There's also a scoop bucket and a standalone sifty.exe on the releases page if you'd rather not install Python.

The repo is github.com/Vortrix5/sifty, and I'd genuinely love eyes on the safety model — especially anyone who can think of a way to make it delete something it shouldn't (there's a SECURITY.md for that). It's open to contributors, the test suite is fast and cross-platform, and CONTRIBUTING.md will get you running in two commands.

Break it before your users do. ⭐ if it's useful.
