I just shipped CleanCloud v0.4.0 with major performance improvements through parallel scanning. Here's how we did it.
What's CleanCloud?
If you missed the original announcement, CleanCloud is a read-only CLI tool that scans AWS/Azure for orphaned resources (unattached volumes, old snapshots, infinite CloudWatch log retention).
Unlike aggressive cleanup tools, CleanCloud gives you conservative signals so you can review before taking action. No auto-delete, no risk.
The Performance Problem
v0.3.x had a bottleneck: sequential scanning.
```python
# Old approach (v0.3.x)
findings = []
for region in regions_to_scan:
    click.echo(f"🔍 Scanning region {region}")
    findings.extend(_scan_aws_region(profile, region))
```
Result: Each region scanned one at a time. For accounts with resources in multiple regions, this added up quickly.
The Solution: Parallel Scanning
v0.4.0 introduces concurrent scanning at two levels:
1. Parallel Region Scanning
```python
# New approach (v0.4.0)
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List, Optional

def scan_aws_regions(profile: Optional[str], regions_to_scan: List[str]) -> List[Finding]:
    findings = []
    with ThreadPoolExecutor(max_workers=min(5, len(regions_to_scan))) as executor:
        futures = {
            executor.submit(_scan_aws_region, profile, region): region
            for region in regions_to_scan
        }
        for future in as_completed(futures):
            region = futures[future]
            click.echo(f"✅ Completed region {region}")
            findings.extend(future.result())
    return findings
```
Key decisions:
- `max_workers=min(5, len(regions_to_scan))` caps parallelism to avoid API rate limits
- `as_completed()` reports progress as each region finishes, not just at the end
- Results are collected in the main thread, so `findings.extend()` is thread-safe without a lock
2. Parallel Rule Execution
Within each region, we also parallelized individual rules:
```python
AWS_RULES = [
    find_unattached_ebs_volumes,
    find_old_ebs_snapshots,
    find_inactive_cloudwatch_logs,
    find_aws_untagged_resources,
]

def _scan_aws_region(profile: Optional[str], region: str) -> List[Finding]:
    session = create_aws_session(profile=profile, region=region)
    findings = []
    with ThreadPoolExecutor(max_workers=min(4, len(AWS_RULES))) as executor:
        futures = [executor.submit(rule, session, region) for rule in AWS_RULES]
        for future in as_completed(futures):
            try:
                rule_findings = future.result()
                findings.extend(rule_findings)
            except Exception as e:
                # Never fail the entire scan due to one rule
                click.echo(f"⚠️ Rule failed in {region}: {e}")
    return findings
```
Benefits:
- All 4 rules run concurrently per region
- Exception isolation (one failing rule doesn't break the scan)
- Better resource utilization
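The isolation behavior is easy to demonstrate on its own: an exception raised inside a worker is only re-raised when `future.result()` is called, so the other futures are unaffected. The toy rules below are illustrative, not CleanCloud's real ones:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def good_rule():
    return ["finding-a", "finding-b"]

def bad_rule():
    raise RuntimeError("API throttled")

findings, errors = [], []
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(rule) for rule in (good_rule, bad_rule)]
    for future in as_completed(futures):
        try:
            findings.extend(future.result())
        except Exception as e:
            # The failing rule is recorded; the scan keeps its partial results
            errors.append(str(e))

print(sorted(findings), errors)  # → ['finding-a', 'finding-b'] ['API throttled']
```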
Performance Improvements
Real-world results from testing:
Single region scan:
- Before: ~20-25 seconds
- After: ~15-18 seconds
- Improvement: ~30% faster
Multi-region scan (5 regions):
- Before: ~100-120 seconds (sequential)
- After: ~20-25 seconds (parallel)
- Improvement: ~5x faster
The key insight: The more regions you scan, the bigger the improvement. Parallel execution shines when there's actual work to parallelize.
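The effect is easy to reproduce with a toy benchmark in which each "region scan" is just network-style waiting (`time.sleep`): five 200 ms tasks finish in roughly one task's time when spread across five workers, instead of a full second run back to back. Threads help here precisely because the work is I/O-bound; the GIL is released while waiting.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_region_scan(region: str) -> str:
    time.sleep(0.2)  # stand-in for I/O-bound cloud API calls
    return region

regions = ["us-east-1", "us-west-2", "eu-west-1", "ap-south-1", "sa-east-1"]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=min(5, len(regions))) as executor:
    scanned = list(executor.map(fake_region_scan, regions))
elapsed = time.perf_counter() - start

print(f"{len(scanned)} regions in {elapsed:.2f}s")  # ~0.2s, vs ~1.0s sequentially
```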
Azure Gets the Same Treatment
Azure subscriptions are now scanned in parallel too:
```python
def scan_azure_subscriptions(
    subscription_ids: List[str],
    credential,
    region_filter: Optional[str],
) -> List[Finding]:
    all_findings = []
    with ThreadPoolExecutor(max_workers=min(4, len(subscription_ids))) as executor:
        futures = {
            executor.submit(
                _scan_azure_subscription,
                subscription_id=sub_id,
                credential=credential,
                region_filter=region_filter,
            ): sub_id
            for sub_id in subscription_ids
        }
        for future in as_completed(futures):
            sub_id = futures[future]
            try:
                all_findings.extend(future.result())
                click.echo(f"✅ Completed subscription {sub_id}")
            except Exception as e:
                click.echo(f"⚠️ Subscription {sub_id} failed: {e}")
    return all_findings
```
Same benefits for Azure users with multiple subscriptions.
Other v0.4.0 Improvements
🔒 Safety Integration Tests
We now have automated tests that verify CleanCloud's read-only guarantees:
```python
def test_scan_is_read_only():
    """Ensure no write operations occur during a scan."""
    # Run a full scan
    scan_result = scan_all_regions()

    # Check CloudTrail for write operations
    cloudtrail_events = get_recent_events()
    write_events = [
        e for e in cloudtrail_events
        if e["EventName"] not in READ_ONLY_OPERATIONS
    ]

    # Fail if ANY writes were detected
    assert len(write_events) == 0, f"Write operations detected: {write_events}"
```
These run in CI on every PR against real AWS/Azure accounts. If CleanCloud ever tries to write, the build fails.
Why this matters: You can trust that CleanCloud is truly read-only, not just claiming to be.
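One simple way to classify events is by API verb; this is an illustrative heuristic, not necessarily what CleanCloud's test suite does (CloudTrail event records also carry a `readOnly` attribute you can filter on directly):

```python
# Heuristic: AWS read-only APIs overwhelmingly start with these verbs
READ_ONLY_PREFIXES = ("Describe", "List", "Get", "Head")

def is_read_only(event_name: str) -> bool:
    return event_name.startswith(READ_ONLY_PREFIXES)

events = ["DescribeVolumes", "ListBuckets", "GetBucketPolicy", "DeleteVolume"]
writes = [e for e in events if not is_read_only(e)]
print(writes)  # → ['DeleteVolume']
```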
🩺 Enhanced Doctor Command
The cleancloud doctor command now provides actionable IAM diagnostics:
```
$ cleancloud doctor --provider aws

# Before (v0.3.x):
❌ Permission denied

# After (v0.4.0):
❌ Missing IAM permission: ec2:DescribeVolumes

Suggested IAM policy:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["ec2:DescribeVolumes"],
    "Resource": "*"
  }]
}
```
Much more helpful for debugging permission issues.
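AWS access-denied errors embed the blocked action in a fairly stable message format ("User ... is not authorized to perform: ec2:DescribeVolumes ..."), so a diagnostic along these lines can be built with a regex. A minimal sketch, with a function name of my choosing, not CleanCloud's actual implementation:

```python
import json
import re
from typing import Optional

def suggest_policy(error_message: str) -> Optional[str]:
    """Extract the denied action from an AccessDenied message and
    render a minimal IAM policy that would allow it."""
    match = re.search(r"not authorized to perform: (\S+)", error_message)
    if not match:
        return None
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [match.group(1)],
            "Resource": "*",
        }],
    }
    return json.dumps(policy, indent=2)

msg = ("User: arn:aws:iam::123456789012:user/dev is not authorized "
       "to perform: ec2:DescribeVolumes on resource: *")
print(suggest_policy(msg))
```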
📣 Post-Scan Feedback
After each scan, you'll see a feedback prompt (disabled in CI/CD with --no-feedback):
```
--- Scan Summary ---
Total findings: 23

CleanCloud feedback
-------------------
If this scan surfaced useful findings, we'd love to hear about it.
Share feedback: https://github.com/cleancloud-io/cleancloud/discussions
```
This helps us improve detection rules based on real user feedback.
Real-World Impact
Since launch, CleanCloud users have reported finding:
💰 Cost Savings:
- $8K-12K/year in forgotten CloudWatch logs (infinite retention)
- $500-2K/year in unattached EBS volumes
- $300-1K/year in old snapshots

🎯 Common Findings:
- 50-100 unattached volumes per account
- 100-300 old snapshots from deleted instances
- 20-50 log groups with infinite retention

⏱️ Time to Value:
- Scan time: 20-30 seconds (v0.4.0)
- Review time: 5-10 minutes
- First cleanup: Same day
- ROI: Immediate
Installation & Usage
```shell
# Install
pip install cleancloud

# Scan all active AWS regions (auto-detects which have resources)
cleancloud scan --provider aws --all-regions

# Check IAM permissions
cleancloud doctor --provider aws

# Scan a specific region
cleancloud scan --provider aws --region us-east-1

# Scan Azure
cleancloud scan --provider azure
```
Example output:
```
🚀 Starting CleanCloud scan...
Provider: aws

🔍 Auto-detecting regions with resources...
✅ Found 3 active regions: us-east-1, us-west-2, eu-west-1

✅ Completed region us-east-1
✅ Completed region us-west-2
✅ Completed region eu-west-1

--- Scan Summary ---
Total findings: 47
By confidence: {'HIGH': 12, 'MEDIUM': 23, 'LOW': 12}
Regions scanned: us-east-1, us-west-2, eu-west-1
```
Technical Deep Dive: Threading Challenges
Building the parallel scanning wasn't trivial. Here are some challenges we hit:
1. Thread Safety with boto3
boto3 clients are not thread-safe. We had to create separate sessions per thread:
```python
def _scan_aws_region(profile: Optional[str], region: str) -> List[Finding]:
    # Create a NEW session per thread
    session = create_aws_session(profile=profile, region=region)

    # Now safe to use in this thread
    findings = []
    # ... scanning logic
    return findings
```
Lesson: Never share boto3 clients across threads. Create new sessions per worker.
2. Rate Limiting
Running 5 regions in parallel meant more concurrent API calls. We had to be smart about worker limits:
```python
# Limit parallelism to avoid throttling
max_workers = min(5, len(regions_to_scan))  # Cap at 5 workers
```
Also: boto3's built-in retry logic with adaptive mode handles most throttling gracefully.
3. Error Isolation
One region failing shouldn't kill the entire scan:
```python
for future in as_completed(futures):
    try:
        rule_findings = future.result()
        findings.extend(rule_findings)
    except Exception as e:
        # Log the error but continue with the remaining rules
        click.echo(f"⚠️ Rule failed: {e}")
```
Result: Partial results if some regions fail. Trust-first means never failing the entire scan.
4. Progress Feedback
Users need to know what's happening during parallel scans:
```python
for future in as_completed(futures):
    region = futures[future]
    click.echo(f"✅ Completed region {region}")
```
Better UX: Show progress as regions complete, not just at the end.
What's Next
Roadmap for v0.5.0:
- 🌍 GCP support - Extend beyond AWS/Azure
- ⚙️ Configurable thresholds - Adjust age/confidence per environment
- 💵 Cost calculations - Show potential savings in dollars
- 🔁 CI/CD templates - GitHub Actions, GitLab CI examples
- 📊 JSON export improvements - Better integration with other tools
Want to contribute? We welcome PRs! Check out the issues.
Why Open Source?
CleanCloud is MIT licensed with:
- ✅ Zero telemetry
- ✅ No phone-home
- ✅ No tracking
- ✅ All code visible
Why?
Trust is critical for cloud security tools. Open source means you can verify CleanCloud is truly read-only. No need to trust my promises - read the code.
Plus: Building in public creates better software through community feedback.
Try It Out
```shell
pip install cleancloud
cleancloud scan --provider aws --all-regions
```
Links:
- 📦 PyPI: https://pypi.org/project/cleancloud
- 💻 GitHub: https://github.com/cleancloud-io/cleancloud
- 📖 Docs: https://github.com/cleancloud-io/cleancloud#readme
Feedback Welcome!
What cloud hygiene checks would be useful? What other resources should CleanCloud scan?
Drop a comment or open an issue on GitHub. We'd love to hear what you find!