
Denis

Originally published at github.com

Building a Beautiful macOS Cleaner with Python and Rich UI

Ever wondered how much disk space is hiding in cache folders, old Xcode archives, or forgotten Docker containers? As a macOS developer, I found myself constantly battling with "Storage Almost Full" notifications. Commercial cleaning apps felt like overkill, so I decided to build my own: macOS Cleaner - a powerful, safe, and beautiful console application.

🔗 GitHub: QDenka/MacCleanCLI
Star if you find it useful!
🐛 Issues & PRs welcome

The Problem

macOS is notorious for accumulating gigabytes of "Other" storage:

  • System caches that grow indefinitely
  • Xcode DerivedData eating 10-20GB
  • Docker images forgotten after projects
  • Browser caches across Safari, Chrome, Firefox, Brave...
  • Homebrew package caches
  • Old node_modules in abandoned projects

I wanted a tool that was:

  1. Safe - Never delete critical system files
  2. Transparent - Show exactly what will be deleted
  3. Beautiful - Console doesn't mean ugly
  4. Smart - Categorize and prioritize cleaning targets
  5. Fast - Multi-threaded scanning and cleaning

The Solution: Category-Based Architecture

The core design decision was to organize everything around a FileCategory enum:

from enum import Enum, auto


class FileCategory(Enum):
    """Categories of files that can be cleaned."""
    SYSTEM_CACHE = auto()
    USER_CACHE = auto()
    BROWSER_CACHE = auto()
    XCODE_DERIVED_DATA = auto()
    XCODE_ARCHIVES = auto()
    DOCKER_DATA = auto()
    HOMEBREW_CACHE = auto()
    NODE_MODULES = auto()
    PYTHON_CACHE = auto()
    TEMPORARY_FILES = auto()
    LOG_FILES = auto()
    DOWNLOADS = auto()
    TRASH = auto()
    # ... and more

Each category has:

  • Predefined scan paths - Where to look for files
  • Cleaning priority - HIGH, MEDIUM, LOW, OPTIONAL
  • Safety rules - What's protected

Technical Deep Dive

1. Multi-Threaded Scanning with ThreadPoolExecutor

Scanning hundreds of thousands of files can be slow. The solution? Parallel processing:

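from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List, Optional


class SystemScanner:
    def scan(self, categories: Optional[List[FileCategory]] = None) -> ScanResult:
        # Default to every category when none are given (assumed behavior)
        categories_to_scan = categories or list(FileCategory)
        scan_result = ScanResult()  # assumed to be constructible without arguments

        with ThreadPoolExecutor(max_workers=self.config.max_workers) as executor:
            futures = {}

            # Submit one scan task per category
            for category in categories_to_scan:
                future = executor.submit(self._scan_category, category)
                futures[future] = category

            # Collect results as each category finishes
            for future in as_completed(futures):
                category = futures[future]
                result = future.result()
                scan_result.add_category_result(result)

        return scan_result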

Result: ~500-1000 files/second scan speed on typical macOS systems
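If you want to try the pattern outside the project, here is a self-contained sketch of the same idea: one task per directory, results collected as they finish. dir_stats and scan_roots are hypothetical helpers written for this post, not functions from MacCleanCLI, and the error handling is deliberately simple:

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def dir_stats(root: Path) -> tuple[int, int]:
    """Return (file_count, total_bytes) for everything under root."""
    count = size = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable, skip it
            count += 1
            size += st.st_size
    return count, size


def scan_roots(roots: list[Path], max_workers: int = 4) -> dict[Path, tuple[int, int]]:
    """Scan each root in its own worker thread."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(dir_stats, root): root for root in roots}
        for future in as_completed(futures):
            root = futures[future]
            try:
                results[root] = future.result()
            except Exception as exc:  # keep going if one root fails
                print(f"scan of {root} failed: {exc}")
    return results
```

Threads fit here because the work is dominated by filesystem calls, which release the GIL while they wait on the disk.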

2. Type-Safe Data Models with Dataclasses

Dataclasses (in the standard library since Python 3.7; the project itself uses Python 3.10+) keep the models clean and maintainable:

from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass
class FileInfo:
    """Information about a file."""
    path: Path
    size: int
    modified_time: datetime
    accessed_time: datetime
    category: FileCategory
    priority: CleaningPriority
    is_safe_to_delete: bool = True

    @property
    def size_mb(self) -> float:
        return self.size / (1024 * 1024)

    @property
    def age_days(self) -> int:
        return (datetime.now() - self.modified_time).days
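For a feel of how such a model gets populated, here is a small sketch that builds one from os.stat data. FileStat is a trimmed, hypothetical stand-in for FileInfo (no category or priority), and describe is an illustrative helper, not part of the project:

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path


@dataclass
class FileStat:  # trimmed, hypothetical stand-in for FileInfo
    path: Path
    size: int
    modified_time: datetime

    @property
    def size_mb(self) -> float:
        return self.size / (1024 * 1024)


def describe(path: Path) -> FileStat:
    """Build a FileStat from the file's stat() data."""
    st = path.stat()
    return FileStat(
        path=path,
        size=st.st_size,
        modified_time=datetime.fromtimestamp(st.st_mtime),
    )
```

Note that dataclass annotations are not enforced at runtime; the type safety comes from running a checker such as mypy over the code.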

3. Safety-First Design

Multiple layers of protection prevent catastrophic mistakes:

# Protected paths that are NEVER touched
self.protected_paths = {
    Path("/System"),
    Path("/Library/System"),
    Path("/private/var/db"),
    Path("/usr/bin"),
    Path("/usr/sbin"),
}

# Protected file extensions
protected_extensions = {".dmg", ".pkg", ".app"}

# Always confirm before deletion
if not dry_run:
    confirm = input("Proceed with deletion? [y/N]: ")
    if confirm.lower() != 'y':
        return
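The check itself can be quite small. Below is a sketch of how the two rules might be combined into one predicate; is_protected is a hypothetical function name, and the comparison is purely lexical, so a real tool would probably resolve symlinks before calling it:

```python
from pathlib import Path

PROTECTED_PATHS = (
    Path("/System"),
    Path("/Library/System"),
    Path("/private/var/db"),
    Path("/usr/bin"),
    Path("/usr/sbin"),
)
PROTECTED_EXTENSIONS = {".dmg", ".pkg", ".app"}


def is_protected(path: Path) -> bool:
    """True if path is under a protected directory or has a protected extension."""
    for root in PROTECTED_PATHS:
        # Match the directory itself and anything below it
        if path == root or root in path.parents:
            return True
    # Check every component so files inside a Foo.app bundle are covered too
    return any(Path(part).suffix.lower() in PROTECTED_EXTENSIONS for part in path.parts)
```

Comparing against path.parents instead of using a string prefix avoids false matches such as /usr/binary being treated as part of /usr/bin.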

Backup System: Optional automatic backup before deletion with configurable retention:

# Backups stored with timestamps
~/.macos-cleaner/backups/2024-10-06_170000/
├── Caches/
└── manifest.json
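Here is a rough sketch of what such a backup step could look like. backup_files and the manifest fields are my own assumptions for illustration; the project's real manifest format may differ, and retention handling is left out:

```python
import json
import shutil
from datetime import datetime
from pathlib import Path


def backup_files(files: list[Path], backup_root: Path) -> Path:
    """Copy files into a timestamped folder and write a manifest.json next to them."""
    target = backup_root / datetime.now().strftime("%Y-%m-%d_%H%M%S")
    target.mkdir(parents=True, exist_ok=False)

    manifest = []
    for index, src in enumerate(files):
        # Prefix with an index so two files with the same name cannot collide
        dest = target / f"{index:06d}_{src.name}"
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        manifest.append(
            {"original": str(src), "backup": dest.name, "size": src.stat().st_size}
        )

    (target / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return target
```

Recording the original path in the manifest is what makes a restore possible later, since the backup folder itself is flat.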

4. Beautiful Console UI with Rich

The Rich library transforms console output:

from rich.console import Console
from rich.table import Table
from rich.progress import Progress, SpinnerColumn, BarColumn

console = Console()

# Create beautiful tables
table = Table(title="Scan Results", show_header=True)
table.add_column("Category", style="cyan")
table.add_column("Files", justify="right", style="green")
table.add_column("Size", justify="right", style="yellow")

for category, result in scan_result.categories.items():
    table.add_row(
        category.name,
        f"{result.file_count:,}",
        f"{result.total_size_gb:.2f} GB"
    )

console.print(table)

5. New Feature: File Preview with Pagination

One of the latest additions - users can preview exactly what will be deleted:

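from rich.panel import Panel
from rich.prompt import Confirm


# Method of the UI class (excerpt)
def show_file_details(self, files: List[FileInfo], batch_size: int = 20) -> None: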
    """Show detailed file list with pagination."""
    for i in range(0, len(files), batch_size):
        batch = files[i:i + batch_size]

        panel = Panel(
            self._create_file_list_table(batch),
            title=f"📁 Files {i+1}-{min(i+batch_size, len(files))} of {len(files)}",
            border_style="blue"
        )

        console.print(panel)

        if i + batch_size < len(files):
            if not Confirm.ask("Continue to next page?", default=True):
                break
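The paging arithmetic is easy to get off by one, so it can help to pull it out of the UI code. This is a sketch of that idea using a hypothetical paginate generator (not a function from the project) that has no Rich dependency:

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def paginate(
    items: Sequence[T], batch_size: int = 20
) -> Iterator[tuple[int, int, Sequence[T]]]:
    """Yield (first, last, batch) with 1-based, inclusive item numbers."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        yield start + 1, start + len(batch), batch


pages = list(paginate(list(range(45)), batch_size=20))
print([(first, last) for first, last, _ in pages])
# → [(1, 20), (21, 40), (41, 45)]
```

The (first, last) pair maps directly onto the "Files 1-20 of 45" style panel title, and the logic can be unit-tested without a terminal.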

Developer-Focused Categories ⚡

As developers, we accumulate specific types of bloat:

Xcode

FileCategory.XCODE_DERIVED_DATA: [
    Path.home() / "Library/Developer/Xcode/DerivedData",
]
# Often 10-20GB of build artifacts

Docker

FileCategory.DOCKER_DATA: [
    Path.home() / ".docker",
    Path.home() / "Library/Containers/com.docker.docker",
]
# Unused containers, images, volumes

Node.js

FileCategory.NODE_MODULES: [
    # Recursively find all node_modules
]
# Old project dependencies
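Since node_modules has no fixed location, it has to be discovered by walking the tree. A minimal sketch of one way to do that, assuming a hypothetical find_node_modules helper (this is not the project's actual implementation):

```python
import os
from pathlib import Path


def find_node_modules(root: Path) -> list[Path]:
    """Return the outermost node_modules directories found under root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "node_modules" in dirnames:
            found.append(Path(dirpath) / "node_modules")
            # Prune in place so os.walk does not descend into nested dependencies
            dirnames.remove("node_modules")
    return sorted(found)
```

Pruning matters for speed: a single node_modules folder can contain tens of thousands of files, and everything inside it will be deleted together anyway.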

Homebrew

FileCategory.HOMEBREW_CACHE: [
    Path.home() / "Library/Caches/Homebrew",
    Path("/usr/local/Homebrew/Library/Homebrew/vendor/cache"),
]

Testing Strategy

87 tests with 41% overall coverage, concentrated on the critical paths:

# tests/test_scanner.py
def test_scan_user_cache(scanner):
    """Test user cache scanning."""
    cache_dir = Path.home() / "Library" / "Caches"

    result = scanner._scan_cache_files(
        FileCategory.USER_CACHE,
        [cache_dir]
    )

    assert result.category == FileCategory.USER_CACHE
    assert result.file_count >= 0
    assert all(f.category == FileCategory.USER_CACHE for f in result.files)

Running the suite with a coverage report:

# Run tests
pytest --cov=. --cov-report=term-missing

# Coverage by module
core/scanner.py      87%
core/cleaner.py      76%
models/              92%
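A test that scans the real ~/Library/Caches can only make loose assertions, because the contents differ on every machine. Here is a sketch of the temp-directory approach, which allows exact checks. total_size is a hypothetical helper standing in for the code under test:

```python
import tempfile
from pathlib import Path


def total_size(directory: Path) -> int:
    """Hypothetical helper under test: sum of file sizes below directory."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def test_total_size_uses_throwaway_directory():
    # Build a fake cache tree so the expected numbers are known in advance
    with tempfile.TemporaryDirectory() as tmp:
        cache = Path(tmp) / "Library" / "Caches"
        cache.mkdir(parents=True)
        (cache / "a.cache").write_bytes(b"x" * 10)
        (cache / "b.cache").write_bytes(b"y" * 5)

        assert total_size(Path(tmp)) == 15
```

With pytest, the built-in tmp_path fixture does the same job with less setup code.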

Installation & Usage

# Quick install
git clone https://github.com/QDenka/MacCleanCLI.git
cd MacCleanCLI
pip install -e .

# Run interactive mode
macos-cleaner

# Or use the short alias
mclean

# Command-line options
macos-cleaner --scan-only         # Preview only
macos-cleaner --auto              # Auto-clean recommended
macos-cleaner --dry-run --verbose # Safe preview
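For readers curious how flags like these are usually wired up, here is a sketch using argparse from the standard library. Whether the project uses argparse or another CLI library is an assumption on my part; only the flag names come from the commands above:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags shown above (argparse is an assumption)."""
    parser = argparse.ArgumentParser(prog="macos-cleaner")
    parser.add_argument("--scan-only", action="store_true", help="preview only")
    parser.add_argument("--auto", action="store_true", help="auto-clean recommended")
    parser.add_argument("--dry-run", action="store_true", help="safe preview")
    parser.add_argument("--verbose", action="store_true", help="more detailed output")
    return parser


args = build_parser().parse_args(["--dry-run", "--verbose"])
print(args.dry_run, args.verbose, args.scan_only)
# → True True False
```

argparse converts the dashes to underscores, so --dry-run is read back as args.dry_run.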

Architecture Highlights

Clean Separation of Concerns:

MacCleanCLI/
├── core/           # Business logic
│   ├── scanner.py  # Multi-threaded scanning
│   ├── cleaner.py  # Safe deletion
│   └── optimizer.py # System optimizations
├── models/         # Type-safe data structures
│   └── scan_result.py
├── ui/             # Rich-based interface
│   ├── interface.py
│   └── components.py
└── utils/          # Configuration, logging, backup

SOLID Principles:

  • Single Responsibility: Each class has one job
  • Open/Closed: Easy to add new categories
  • Liskov Substitution: Dataclass inheritance
  • Interface Segregation: Minimal dependencies
  • Dependency Inversion: Config-driven behavior

Performance Benchmarks

On a typical macOS system:

  • Scan Speed: 500-1000 files/second
  • Memory Usage: 50-100 MB during scan
  • Clean Speed: 200-400 files/second
  • Thread Count: Configurable (default: 4 workers)

What I Learned

  1. Rich is amazing - Seriously transforms CLI UX
  2. Safety is paramount - Multiple protection layers are essential
  3. Dataclasses rock - Type safety with minimal boilerplate
  4. ThreadPoolExecutor - Simple parallel processing
  5. Testing file operations - Use temp directories and mocks
  6. macOS paths are complex - Safari has 5+ cache locations!

Future Roadmap

  • 📊 Visual reports - HTML/PDF scan summaries
  • 🔄 Scheduled cleaning - LaunchDaemon integration
  • 📱 iOS simulator cleanup - Support iOS DerivedData
  • 🌍 Localization - Multi-language support
  • 🔌 Plugin system - Custom cleaning categories
  • 📈 Historical tracking - Storage trends over time

Try It Yourself!

The project is fully open-source under the MIT license:

🔗 GitHub: QDenka/MacCleanCLI
Star if you find it useful!
🐛 Issues & PRs welcome

Key Takeaways

If you're building a CLI tool, consider:

  1. Use Rich for beautiful, modern console UI
  2. Design around clear abstractions (categories, priorities)
  3. Implement safety by default (dry-run, backups, protection)
  4. Leverage parallel processing for I/O-heavy operations
  5. Use dataclasses for type-safe, maintainable code
  6. Test thoroughly, especially file operations

What's your experience with building CLI tools? Have you tried any interesting libraries for console UI? Drop a comment below! 👇


Built with ❤️ for macOS developers. If this saved your disk space, consider giving it a ⭐ on GitHub!
