Ever wondered how much disk space is hiding in cache folders, old Xcode archives, or forgotten Docker containers? As a macOS developer, I found myself constantly battling with "Storage Almost Full" notifications. Commercial cleaning apps felt like overkill, so I decided to build my own: macOS Cleaner - a powerful, safe, and beautiful console application.
🔗 GitHub: QDenka/MacCleanCLI
⭐ Star if you find it useful!
🐛 Issues & PRs welcome
The Problem
macOS is notorious for accumulating gigabytes of "Other" storage:
- System caches that grow indefinitely
- Xcode DerivedData eating 10-20GB
- Docker images forgotten after projects
- Browser caches across Safari, Chrome, Firefox, Brave...
- Homebrew package caches
- Old node_modules in abandoned projects
I wanted a tool that was:
- Safe - Never delete critical system files
- Transparent - Show exactly what will be deleted
- Beautiful - Console doesn't mean ugly
- Smart - Categorize and prioritize cleaning targets
- Fast - Multi-threaded scanning and cleaning
The Solution: Category-Based Architecture
The core design decision was to organize everything around a FileCategory enum:
from enum import Enum, auto

class FileCategory(Enum):
    """Categories of files that can be cleaned."""
    SYSTEM_CACHE = auto()
    USER_CACHE = auto()
    BROWSER_CACHE = auto()
    XCODE_DERIVED_DATA = auto()
    XCODE_ARCHIVES = auto()
    DOCKER_DATA = auto()
    HOMEBREW_CACHE = auto()
    NODE_MODULES = auto()
    PYTHON_CACHE = auto()
    TEMPORARY_FILES = auto()
    LOG_FILES = auto()
    DOWNLOADS = auto()
    TRASH = auto()
    # ... and more
Each category has:
- Predefined scan paths - Where to look for files
- Cleaning priority - HIGH, MEDIUM, LOW, OPTIONAL
- Safety rules - What's protected
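To make the three attributes concrete, here is a minimal sketch of how a category could be tied to its scan paths, priority, and protected entries. The `CategoryRule` class, the `CATEGORY_RULES` mapping, the `expanded_paths` helper, and the priority values are all assumptions for illustration, not the project's actual code:

```python
import os
from dataclasses import dataclass
from enum import Enum, auto
from pathlib import Path


class FileCategory(Enum):
    USER_CACHE = auto()
    XCODE_DERIVED_DATA = auto()


class CleaningPriority(Enum):
    HIGH = auto()
    MEDIUM = auto()
    LOW = auto()
    OPTIONAL = auto()


@dataclass(frozen=True)
class CategoryRule:
    """Hypothetical container; the real project may structure this differently."""
    scan_paths: tuple[str, ...]                    # may start with "~"
    priority: CleaningPriority
    protected_names: frozenset[str] = frozenset()  # entries that are never deleted


# The priorities assigned here are illustrative assumptions.
CATEGORY_RULES = {
    FileCategory.USER_CACHE: CategoryRule(
        scan_paths=("~/Library/Caches",),
        priority=CleaningPriority.MEDIUM,
    ),
    FileCategory.XCODE_DERIVED_DATA: CategoryRule(
        scan_paths=("~/Library/Developer/Xcode/DerivedData",),
        priority=CleaningPriority.HIGH,
    ),
}


def expanded_paths(category: FileCategory) -> list[Path]:
    """Expand "~" in each configured scan path for the given category."""
    return [Path(os.path.expanduser(p)) for p in CATEGORY_RULES[category].scan_paths]
```

With a mapping like this, adding a new category means adding one enum member and one entry in the table, without touching the scanner.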
Technical Deep Dive
1. Multi-Threaded Scanning with ThreadPoolExecutor
Scanning hundreds of thousands of files can be slow. The solution? Parallel processing:
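from concurrent.futures import ThreadPoolExecutor, as_completed

class SystemScanner:
    def scan(self, categories: Optional[List[FileCategory]] = None) -> ScanResult:
        # categories_to_scan and scan_result are prepared earlier in the method (omitted here)
        with ThreadPoolExecutor(max_workers=self.config.max_workers) as executor:
            # Submit one task per category and remember which future belongs to which
            futures = {}
            for category in categories_to_scan:
                future = executor.submit(self._scan_category, category)
                futures[future] = category
            # Collect results in completion order, not submission order
            for future in as_completed(futures):
                category = futures[future]
                result = future.result()
                scan_result.add_category_result(result)
        return scan_result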
Result: ~500-1000 files/second scan speed on typical macOS systems
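The excerpt above depends on the rest of the scanner class, so here is a self-contained sketch of the same pattern that you can run on its own. The function names `scan_directory` and `scan_all` are made up for this example; it only counts files and bytes rather than building the project's result objects:

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def scan_directory(root: Path) -> tuple[int, int]:
    """Return (file_count, total_bytes) for everything under root."""
    count = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                # lstat does not follow symlinks, so link targets are not counted
                total += os.lstat(os.path.join(dirpath, name)).st_size
                count += 1
            except OSError:
                # File vanished or is unreadable; skip it
                continue
    return count, total


def scan_all(roots: list[Path], max_workers: int = 4) -> dict[Path, tuple[int, int]]:
    """Scan each root in its own worker thread and collect the results."""
    results: dict[Path, tuple[int, int]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(scan_directory, root): root for root in roots}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

Threads help here because directory walking is dominated by filesystem calls, during which Python releases the GIL.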
2. Type-Safe Data Models with Dataclasses
The data models are plain dataclasses, written for Python 3.10+, which keeps the code clean and maintainable:
from dataclasses import dataclass
from datetime import datetime
from pathlib import Path

@dataclass
class FileInfo:
    """Information about a file."""
    path: Path
    size: int
    modified_time: datetime
    accessed_time: datetime
    category: FileCategory
    priority: CleaningPriority
    is_safe_to_delete: bool = True

    @property
    def size_mb(self) -> float:
        """File size in mebibytes."""
        return self.size / (1024 * 1024)

    @property
    def age_days(self) -> int:
        """Whole days since the file was last modified."""
        return (datetime.now() - self.modified_time).days
3. Safety-First Design
Multiple layers of protection prevent catastrophic mistakes:
# Protected paths that are NEVER touched
self.protected_paths = {
    Path("/System"),
    Path("/Library/System"),
    Path("/private/var/db"),
    Path("/usr/bin"),
    Path("/usr/sbin"),
}

# Protected file extensions
protected_extensions = {".dmg", ".pkg", ".app"}

# Always confirm before deletion
if not dry_run:
    confirm = input("Proceed with deletion? [y/N]: ")
    if confirm.lower() != 'y':
        return
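As a rough sketch of how those two sets could be combined into a single check, the helper below rejects a path if it has a protected extension, is a protected path, or sits anywhere beneath one. The name `is_protected` and the module-level constants are assumptions for this example; the project's real check may be stricter:

```python
from pathlib import Path

# Same values as the sets shown above
PROTECTED_PATHS = {
    Path("/System"),
    Path("/Library/System"),
    Path("/private/var/db"),
    Path("/usr/bin"),
    Path("/usr/sbin"),
}
PROTECTED_EXTENSIONS = {".dmg", ".pkg", ".app"}


def is_protected(path: Path) -> bool:
    """Return True if path must never be deleted."""
    path = Path(path)
    # Extension check is case-insensitive, so "Installer.DMG" is covered too
    if path.suffix.lower() in PROTECTED_EXTENSIONS:
        return True
    # Match the protected path itself or any location beneath it
    return any(path == root or root in path.parents for root in PROTECTED_PATHS)
```

Note that this comparison is purely lexical. It does not resolve symlinks, so a production version would probably want to normalize paths first, for example with `Path.resolve()`.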
Backup System: Optional automatic backup before deletion with configurable retention:
# Backups stored with timestamps
~/.macos-cleaner/backups/2024-10-06_170000/
├── Caches/
└── manifest.json
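Here is a simplified sketch of what a timestamped backup with a manifest could look like. The function `backup_files` and the manifest fields are assumptions, and unlike the layout above it flattens files into one folder instead of preserving subdirectories such as Caches/:

```python
import json
import shutil
from datetime import datetime
from pathlib import Path


def backup_files(files: list[Path], backup_root: Path) -> Path:
    """Copy files into a new timestamped folder and record them in manifest.json."""
    # Folder name follows the pattern shown above, e.g. 2024-10-06_170000
    target = backup_root / datetime.now().strftime("%Y-%m-%d_%H%M%S")
    target.mkdir(parents=True, exist_ok=False)

    manifest = []
    for index, src in enumerate(files):
        # Prefix with an index so two files with the same name cannot collide
        dest = target / f"{index:06d}_{src.name}"
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        manifest.append({
            "original": str(src),
            "backup": dest.name,
            "size": src.stat().st_size,
        })

    (target / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return target
```

Recording the original path in the manifest is what makes a restore possible later, since the backup folder alone no longer says where each file came from.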
4. Beautiful Console UI with Rich
The Rich library transforms console output:
from rich.console import Console
from rich.table import Table
from rich.progress import Progress, SpinnerColumn, BarColumn

console = Console()

# Create beautiful tables
table = Table(title="Scan Results", show_header=True)
table.add_column("Category", style="cyan")
table.add_column("Files", justify="right", style="green")
table.add_column("Size", justify="right", style="yellow")

for category, result in scan_result.categories.items():
    table.add_row(
        category.name,
        f"{result.file_count:,}",
        f"{result.total_size_gb:.2f} GB"
    )

console.print(table)
5. New Feature: File Preview with Pagination
One of the latest additions - users can preview exactly what will be deleted:
from rich.panel import Panel
from rich.prompt import Confirm

# Method of the UI class (excerpt)
def show_file_details(self, files: List[FileInfo], batch_size: int = 20):
    """Show detailed file list with pagination."""
    for i in range(0, len(files), batch_size):
        batch = files[i:i + batch_size]
        panel = Panel(
            self._create_file_list_table(batch),
            title=f"📁 Files {i+1}-{min(i+batch_size, len(files))} of {len(files)}",
            border_style="blue"
        )
        console.print(panel)
        # Ask before showing the next page, unless this was the last one
        if i + batch_size < len(files):
            if not Confirm.ask("Continue to next page?", default=True):
                break
Developer-Focused Categories ⚡
As developers, we accumulate specific types of bloat:
Xcode
FileCategory.XCODE_DERIVED_DATA: [
    "~/Library/Developer/Xcode/DerivedData",
]
# Often 10-20GB of build artifacts
Docker
FileCategory.DOCKER_DATA: [
    "~/.docker",
    "~/Library/Containers/com.docker.docker",
]
# Unused containers, images, volumes
Node.js
FileCategory.NODE_MODULES: [
    # Recursively find all node_modules
]
# Old project dependencies
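Since node_modules has no fixed location, it has to be discovered by walking the tree. One way to do that is sketched below; `find_node_modules` is a name I made up for this example. The key trick is pruning: once a node_modules folder is found, the walk does not descend into it, which avoids both wasted time and reporting nested copies separately:

```python
import os
from pathlib import Path


def find_node_modules(root: Path) -> list[Path]:
    """Return the outermost node_modules directories found under root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "node_modules" in dirnames:
            found.append(Path(dirpath) / "node_modules")
            # Editing dirnames in place tells os.walk not to descend into it
            dirnames.remove("node_modules")
    return sorted(found)
```

Combined with a "last modified" check on the parent project, this gives a reasonable list of candidates from abandoned projects.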
Homebrew
FileCategory.HOMEBREW_CACHE: [
    "~/Library/Caches/Homebrew",
    "/usr/local/Homebrew/Library/Homebrew/vendor/cache",
]
Testing Strategy
The suite has 87 tests and 41% overall coverage. The tests concentrate on the critical paths, so the core modules score well above that total:
# tests/test_scanner.py
def test_scan_user_cache(scanner):
    """Test user cache scanning."""
    cache_dir = Path.home() / "Library" / "Caches"
    result = scanner._scan_cache_files(
        FileCategory.USER_CACHE,
        [cache_dir]
    )
    assert result.category == FileCategory.USER_CACHE
    assert result.file_count >= 0
    assert all(f.category == FileCategory.USER_CACHE for f in result.files)
# Run tests
pytest --cov=. --cov-report=term-missing
# Coverage by module
core/scanner.py    87%
core/cleaner.py    76%
models/            92%
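The test above reads the real ~/Library/Caches, so its assertions have to stay loose. For exact assertions, a temp directory works better. Below is a minimal sketch using pytest's built-in `tmp_path` fixture; `total_size` is a hypothetical helper standing in for whatever function you want to test:

```python
from pathlib import Path


def total_size(directory: Path) -> int:
    """Hypothetical helper under test: sum of file sizes below directory."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def test_total_size_counts_nested_files(tmp_path: Path) -> None:
    # tmp_path is a fresh, empty directory that pytest creates for each test
    (tmp_path / "a.cache").write_bytes(b"x" * 100)
    nested = tmp_path / "nested"
    nested.mkdir()
    (nested / "b.cache").write_bytes(b"y" * 50)

    # Known inputs allow an exact expected value
    assert total_size(tmp_path) == 150
```

Because the fixture controls every file in the directory, the test gives the same result on any machine and never touches real user data.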
Installation & Usage
# Quick install
git clone https://github.com/QDenka/MacCleanCLI.git
cd MacCleanCLI
pip install -e .
# Run interactive mode
macos-cleaner
# Or use the short alias
mclean
# Command-line options
macos-cleaner --scan-only # Preview only
macos-cleaner --auto # Auto-clean recommended
macos-cleaner --dry-run --verbose # Safe preview
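If you want to offer similar flags in your own tool, a minimal argparse sketch could look like this. It only mirrors the four options listed above; the `build_parser` function and the help texts are my assumptions, and the project itself may use a different CLI library:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the four flags shown above (illustrative only)."""
    parser = argparse.ArgumentParser(prog="macos-cleaner")
    parser.add_argument("--scan-only", action="store_true",
                        help="scan and report without cleaning")
    parser.add_argument("--auto", action="store_true",
                        help="clean the recommended items")
    parser.add_argument("--dry-run", action="store_true",
                        help="show what would be deleted without deleting")
    parser.add_argument("--verbose", action="store_true",
                        help="print extra detail")
    return parser
```

argparse converts dashes to underscores, so `--dry-run` is read back as `args.dry_run` after `build_parser().parse_args()`.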
Architecture Highlights
Clean Separation of Concerns:
MacCleanCLI/
├── core/ # Business logic
│ ├── scanner.py # Multi-threaded scanning
│ ├── cleaner.py # Safe deletion
│ └── optimizer.py # System optimizations
├── models/ # Type-safe data structures
│ └── scan_result.py
├── ui/ # Rich-based interface
│ ├── interface.py
│ └── components.py
└── utils/ # Configuration, logging, backup
SOLID Principles:
- Single Responsibility: Each class has one job
- Open/Closed: Easy to add new categories
- Liskov Substitution: Dataclass inheritance
- Interface Segregation: Minimal dependencies
- Dependency Inversion: Config-driven behavior
Performance Benchmarks
On a typical macOS system:
- Scan Speed: 500-1000 files/second
- Memory Usage: 50-100 MB during scan
- Clean Speed: 200-400 files/second
- Thread Count: Configurable (default: 4 workers)
What I Learned
- Rich is amazing - Seriously transforms CLI UX
- Safety is paramount - Multiple protection layers are essential
- Dataclasses rock - Type safety with minimal boilerplate
- ThreadPoolExecutor - Simple parallel processing
- Testing file operations - Use temp directories and mocks
- macOS paths are complex - Safari has 5+ cache locations!
Future Roadmap
- 📊 Visual reports - HTML/PDF scan summaries
- 🔄 Scheduled cleaning - LaunchDaemon integration
- 📱 iOS simulator cleanup - Support iOS DerivedData
- 🌍 Localization - Multi-language support
- 🔌 Plugin system - Custom cleaning categories
- 📈 Historical tracking - Storage trends over time
Try It Yourself!
The project is fully open-source under the MIT license:
🔗 GitHub: QDenka/MacCleanCLI
⭐ Star if you find it useful!
🐛 Issues & PRs welcome
Key Takeaways
If you're building a CLI tool, consider:
- Use Rich for beautiful, modern console UI
- Design around clear abstractions (categories, priorities)
- Implement safety by default (dry-run, backups, protection)
- Leverage parallel processing for I/O-heavy operations
- Use dataclasses for type-safe, maintainable code
- Test thoroughly, especially file operations
What's your experience with building CLI tools? Have you tried any interesting libraries for console UI? Drop a comment below! 👇
Built with ❤️ for macOS developers. If this saved your disk space, consider giving it a ⭐ on GitHub!