<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Suresh</title>
    <description>The latest articles on DEV Community by Suresh (@sureshmandalapu).</description>
    <link>https://dev.to/sureshmandalapu</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3685594%2Fc9b267ea-3265-4dc1-ae53-6af5ec0a39d4.png</url>
      <title>DEV Community: Suresh</title>
      <link>https://dev.to/sureshmandalapu</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sureshmandalapu"/>
    <language>en</language>
    <item>
      <title>The Hidden Cost of Idle AI/ML Infrastructure: SageMaker, AML, and Vertex AI</title>
      <dc:creator>Suresh</dc:creator>
      <pubDate>Wed, 08 Apr 2026 10:11:49 +0000</pubDate>
      <link>https://dev.to/sureshmandalapu/the-hidden-cost-of-idle-aiml-infrastructure-sagemaker-aml-and-vertex-ai-2kom</link>
      <guid>https://dev.to/sureshmandalapu/the-hidden-cost-of-idle-aiml-infrastructure-sagemaker-aml-and-vertex-ai-2kom</guid>
      <description>&lt;p&gt;TL;DR: Idle AI/ML endpoints burn $500-$23K per month unnoticed. Here's how to detect them across AWS, Azure, and GCP — and stop them automatically in CI.&lt;/p&gt;

&lt;p&gt;Tools like &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;CleanCloud&lt;/a&gt; surface these across AWS, Azure, and GCP — and can enforce detection in CI before the bill hits.&lt;/p&gt;




&lt;h2&gt;
  
  
  AI/ML Waste: The New Blind Spot
&lt;/h2&gt;

&lt;p&gt;For the past several years, the dominant form of cloud waste was infrastructure sprawl — orphaned EBS volumes, idle NAT gateways, stopped VMs that were never deallocated. These resources are expensive in aggregate, but individually small. A forgotten EBS volume costs $10-$40/month. An unattached Elastic IP costs pennies.&lt;/p&gt;

&lt;p&gt;AI/ML infrastructure operates at a completely different scale. A single idle SageMaker endpoint backed by a GPU instance can cost more than a thousand stopped EC2 instances. A Vertex AI endpoint serving a model that nobody calls keeps billing for its accelerator around the clock, and a SageMaker endpoint on ml.p4d.24xlarge can burn through $23,000 a month with zero traffic.&lt;/p&gt;

&lt;p&gt;This is the new waste category that FinOps dashboards were never designed to catch.&lt;/p&gt;

&lt;p&gt;Tools like &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;CleanCloud&lt;/a&gt; approach this differently: instead of dashboards, they scan for deterministic signals (like zero invocations over time) and flag idle endpoints directly — making them enforceable in CI/CD rather than something you discover weeks later in billing.&lt;/p&gt;

&lt;p&gt;Key stats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SageMaker GPU endpoint, idle: ~$500/month (minimum)&lt;/li&gt;
&lt;li&gt;SageMaker p4d.24xlarge endpoint, idle: $23K+/month&lt;/li&gt;
&lt;li&gt;Default idle window before detection: 7 days (SageMaker) or 14 days (AML, Vertex AI)&lt;/li&gt;
&lt;li&gt;Typical "forgotten" endpoint: 6+ weeks of unnoticed billing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern is consistent across all three major clouds. A data scientist spins up an endpoint to serve a model — for a hackathon, a proof-of-concept, a demo. The event ends. The endpoint doesn't. Six weeks later, finance flags an anomaly in the ML budget. By then, the team has scattered, the model is stale, and the endpoint has cost more than the original experiment was worth.&lt;/p&gt;




&lt;h2&gt;
  
  
  AWS: Idle SageMaker Endpoints
&lt;/h2&gt;

&lt;p&gt;SageMaker endpoints are always-on serving infrastructure. Unlike Lambda or Fargate, an endpoint doesn't scale to zero when traffic stops. The underlying EC2 instance keeps running, and AWS keeps billing — at full on-demand rates, including any attached GPU capacity.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the rule works
&lt;/h3&gt;

&lt;p&gt;The aws.sagemaker.endpoint.idle rule queries CloudWatch for each endpoint's invocation count (the Invocations metric in the AWS/SageMaker namespace) across the idle window (default: 7 days). An endpoint with zero invocations over that period is flagged. The confidence level and cost estimate depend on the instance type:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Instance Type&lt;/th&gt;
&lt;th&gt;Approx. Monthly Cost (Idle)&lt;/th&gt;
&lt;th&gt;Confidence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ml.t3.medium&lt;/td&gt;
&lt;td&gt;~$30&lt;/td&gt;
&lt;td&gt;MEDIUM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ml.m5.xlarge&lt;/td&gt;
&lt;td&gt;~$140&lt;/td&gt;
&lt;td&gt;MEDIUM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ml.g4dn.xlarge&lt;/td&gt;
&lt;td&gt;~$500&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ml.g4dn.12xlarge&lt;/td&gt;
&lt;td&gt;~$1,800&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ml.p3.2xlarge&lt;/td&gt;
&lt;td&gt;~$2,800&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ml.p4d.24xlarge&lt;/td&gt;
&lt;td&gt;~$23,000+&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;GPU-backed endpoints are rated HIGH confidence because the cost impact is unambiguous — zero invocations over 7 days on a GPU instance is definitively idle waste. CPU-backed endpoints at lower cost tiers are rated MEDIUM because they may be handling low but real traffic below the detection threshold.&lt;/p&gt;
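
&lt;p&gt;As a rough illustration of the logic described above, the check can be sketched in a few lines of Python. This is not CleanCloud's implementation: the function names, the set of GPU instance families, and the 0.69 USD/hour rate are assumptions made for the example, and the monthly figure simply multiplies an hourly rate by an average of 730 hours per month.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
# Sketch of an idle-endpoint check. All names here are hypothetical,
# not CleanCloud's implementation.
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)

# Assumed GPU instance families, used only to pick a confidence level.
GPU_FAMILIES = {"g4dn", "g5", "p3", "p4d"}


def instance_family(instance_type):
    """Return the family part of a type such as 'ml.g4dn.xlarge'."""
    return instance_type.split(".")[1]


def evaluate_endpoint(name, instance_type, daily_invocations, hourly_rate):
    """Return a finding dict if the endpoint had zero invocations, else None.

    daily_invocations: one invocation total per day of the idle window.
    hourly_rate: on-demand USD per hour, supplied by the caller.
    """
    if sum(daily_invocations) > 0:
        return None  # the endpoint served traffic, so it is not idle
    is_gpu = instance_family(instance_type) in GPU_FAMILIES
    return {
        "resource": name,
        "instance": instance_type,
        "idle_days": len(daily_invocations),
        "confidence": "HIGH" if is_gpu else "MEDIUM",
        "monthly_cost": round(hourly_rate * HOURS_PER_MONTH),
    }


# Example: a GPU endpoint with 14 days of zero traffic.
# The 0.69 USD/hour rate is illustrative, not a quoted AWS price.
finding = evaluate_endpoint("fraud-detection-v2", "ml.g4dn.xlarge", [0] * 14, 0.69)
print(finding)
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under those assumptions, 14 days of zero traffic on ml.g4dn.xlarge produces a HIGH-confidence finding of about $504/month, while a CPU family such as m5 would be rated MEDIUM.&lt;/p&gt;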

&lt;h3&gt;
  
  
  Example CleanCloud output
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Rule: aws.sagemaker.endpoint.idle
Resource: fraud-detection-v2 (us-east-1)
Instance: ml.g4dn.xlarge
Idle: 14 days (zero InvocationCount)
Confidence: HIGH
Estimated Cost: ~$504/month
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Scanning for idle SageMaker endpoints in CI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/ai-hygiene.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AI/ML Waste Detection&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cron&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;9&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;MON'&lt;/span&gt;  &lt;span class="c1"&gt;# Weekly&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;permissions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;id-token&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;write&lt;/span&gt;
      &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;read&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::${{ vars.AWS_ACCOUNT_ID }}:role/CleanCloudCIReadOnly&lt;/span&gt;
          &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;us-east-1&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cleancloud-io/scan-action@v1&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws&lt;/span&gt;
          &lt;span class="na"&gt;category&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai&lt;/span&gt;
          &lt;span class="na"&gt;fail-on-cost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;
          &lt;span class="na"&gt;output&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;json&lt;/span&gt;
          &lt;span class="na"&gt;output-file&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;findings.json&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Report to Slack (optional)&lt;/span&gt;
        &lt;span class="na"&gt;if&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;failure()&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;slackapi/slack-github-action@v1&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;payload&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
            &lt;span class="s"&gt;{&lt;/span&gt;
              &lt;span class="s"&gt;"text": "SageMaker endpoints detected with idle waste ($500+/month): Check findings.json"&lt;/span&gt;
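            &lt;span class="s"&gt;}&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;SLACK_WEBHOOK_URL&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.SLACK_WEBHOOK_URL }}&lt;/span&gt;  &lt;span class="c1"&gt;# assumed secret name&lt;/span&gt;
          &lt;span class="na"&gt;SLACK_WEBHOOK_TYPE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;INCOMING_WEBHOOK&lt;/span&gt;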
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Azure: Idle Machine Learning Compute Clusters
&lt;/h2&gt;

&lt;p&gt;Azure Machine Learning (AML) compute clusters are provisioned clusters that you manage. Unlike serverless options, an AML cluster configured with a minimum node count above zero keeps those nodes running, and billing, even when no training jobs are active.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the rule works
&lt;/h3&gt;

&lt;p&gt;The azure.ml.compute.idle rule checks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Cluster has minimum node count &amp;gt; 0&lt;/li&gt;
&lt;li&gt;No training jobs submitted in the idle window (default: 14 days)&lt;/li&gt;
&lt;li&gt;Cluster is still in "Running" state (not deallocated)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Result: A cluster provisioned for a pilot project that finished months ago is still spinning, still billing.&lt;/p&gt;
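
&lt;p&gt;The three checks combine with a logical AND. The following Python sketch shows that combination on plain data; the ComputeCluster fields and the is_idle helper are hypothetical names for illustration, not part of any Azure SDK or of CleanCloud.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
from dataclasses import dataclass


@dataclass
class ComputeCluster:
    """Minimal stand-in for cluster metadata; field names are hypothetical."""
    name: str
    min_nodes: int
    days_since_last_job: int
    state: str


def is_idle(cluster, idle_window_days=14):
    """All three conditions must hold for the cluster to be flagged."""
    return (
        cluster.min_nodes > 0
        and cluster.days_since_last_job >= idle_window_days
        and cluster.state == "Running"
    )


pilot = ComputeCluster("pilot-gpu", min_nodes=1, days_since_last_job=90, state="Running")
autoscaled = ComputeCluster("nightly-train", min_nodes=0, days_since_last_job=90, state="Running")
print(is_idle(pilot))       # True: nodes are pinned and nothing has run
print(is_idle(autoscaled))  # False: the cluster can scale to zero
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;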

&lt;h3&gt;
  
  
  Cost examples
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cluster Type&lt;/th&gt;
&lt;th&gt;Approx. Monthly Cost (Idle)&lt;/th&gt;
&lt;th&gt;Confidence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Standard_D2s_v3 (2 CPU, 8GB RAM)&lt;/td&gt;
&lt;td&gt;~$60/month&lt;/td&gt;
&lt;td&gt;MEDIUM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standard_NC6s_v3 (6 CPU, 1 GPU, 112GB RAM)&lt;/td&gt;
&lt;td&gt;~$600/month&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Standard_ND40rs_v2 (8 GPU, 40 CPU, 672GB RAM)&lt;/td&gt;
&lt;td&gt;~$15,000/month&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Detection in CI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cleancloud-io/scan-action@v1&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azure&lt;/span&gt;
    &lt;span class="na"&gt;category&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai&lt;/span&gt;
    &lt;span class="na"&gt;fail-on-cost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  GCP: Idle Vertex AI Prediction Endpoints
&lt;/h2&gt;

&lt;p&gt;Vertex AI is Google's managed ML platform. Prediction endpoints are always-on infrastructure for real-time model serving. Unlike batch jobs, endpoints don't stop — they keep running and billing even with zero predictions.&lt;/p&gt;

&lt;h3&gt;
  
  
  How the rule works
&lt;/h3&gt;

&lt;p&gt;The gcp.vertex.prediction.endpoint.idle rule queries Vertex AI metrics for prediction activity over the idle window (default: 14 days). An endpoint with zero or near-zero predictions is flagged. Confidence is HIGH for GPU-backed endpoints.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cost examples
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Machine or Accelerator Type&lt;/th&gt;
&lt;th&gt;Approx. Monthly Cost (Idle)&lt;/th&gt;
&lt;th&gt;Confidence&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;n1-standard-4 (CPU)&lt;/td&gt;
&lt;td&gt;~$150/month&lt;/td&gt;
&lt;td&gt;MEDIUM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;nvidia-tesla-k80 (1 GPU)&lt;/td&gt;
&lt;td&gt;~$800/month&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;nvidia-tesla-v100 (1 GPU)&lt;/td&gt;
&lt;td&gt;~$2,500/month&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;nvidia-tesla-a100 (1 GPU)&lt;/td&gt;
&lt;td&gt;~$10,000/month&lt;/td&gt;
&lt;td&gt;HIGH&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Detection in CI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cleancloud-io/scan-action@v1&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gcp&lt;/span&gt;
    &lt;span class="na"&gt;category&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai&lt;/span&gt;
    &lt;span class="na"&gt;all-projects&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
    &lt;span class="na"&gt;fail-on-cost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Multi-Cloud Detection: A Real Example
&lt;/h2&gt;

&lt;p&gt;One organization ran:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;3 idle SageMaker endpoints (ml.g4dn.xlarge) = $1,500/month&lt;/li&gt;
&lt;li&gt;1 idle AML cluster (GPU-backed) = $600/month&lt;/li&gt;
&lt;li&gt;2 idle Vertex AI endpoints (Tesla K80) = $1,600/month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total monthly waste: $3,700&lt;/p&gt;

&lt;p&gt;Detected by: Single weekly scan across all three clouds&lt;/p&gt;
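
&lt;p&gt;The total follows directly from the per-resource figures quoted earlier in this post. A short Python check, with the list layout and variable names chosen only for this example, also shows what the same waste adds up to over six weeks and over a year if nobody notices.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
# (description, count, approximate monthly cost per resource in USD)
idle_resources = [
    ("SageMaker ml.g4dn.xlarge endpoint", 3, 500),
    ("AML GPU-backed cluster", 1, 600),
    ("Vertex AI Tesla K80 endpoint", 2, 800),
]

monthly_total = sum(count * unit_cost for _, count, unit_cost in idle_resources)
annual_total = monthly_total * 12
# Six weeks expressed as a share of a 52-week year.
six_week_total = round(annual_total / 52 * 6)

print(monthly_total)   # 3700
print(annual_total)    # 44400
print(six_week_total)  # 5123
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;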




&lt;h2&gt;
  
  
  Detecting AI/ML Waste at Scale
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Option 1: Weekly scheduled scan (CI-based)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Run from a weekly CI schedule (e.g. Mondays at 9am)&lt;/span&gt;
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--category&lt;/span&gt; ai &lt;span class="nt"&gt;--all-regions&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; azure &lt;span class="nt"&gt;--category&lt;/span&gt; ai &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; gcp &lt;span class="nt"&gt;--category&lt;/span&gt; ai &lt;span class="nt"&gt;--all-projects&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--fail-on-cost&lt;/span&gt; 500 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; json &lt;span class="nt"&gt;--output-file&lt;/span&gt; ai-waste.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 2: Per-deployment scan
&lt;/h3&gt;

&lt;p&gt;Run AI/ML detection before each production deployment to catch idle endpoints early:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Pre-deploy AI/ML check&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;cleancloud scan --provider aws --category ai \&lt;/span&gt;
      &lt;span class="s"&gt;--fail-on-confidence HIGH&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option 3: Policy-as-code enforcement
&lt;/h3&gt;

&lt;p&gt;Suppress findings for intentional AI/ML infrastructure, with a reason and an expiry date:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# cleancloud.yaml&lt;/span&gt;
&lt;span class="na"&gt;exceptions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws.sagemaker.endpoint.idle&lt;/span&gt;
    &lt;span class="na"&gt;resource_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;demo-endpoint-prod&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Production&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;serving&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;endpoint&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;customer&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;X"&lt;/span&gt;
    &lt;span class="na"&gt;expires_at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-12-31"&lt;/span&gt;

&lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;fail_on_cost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;  &lt;span class="c1"&gt;# CI gate: fail if monthly AI/ML waste &amp;gt; $500&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Data science teams: Spot forgotten endpoints before they blow budgets&lt;/li&gt;
&lt;li&gt;Platform teams: Enforce endpoint lifecycle policies automatically&lt;/li&gt;
&lt;li&gt;FinOps teams: AI/ML waste visibility in CI/CD, not quarterly surprises&lt;/li&gt;
&lt;li&gt;Finance teams: Audit trail of detected waste and enforcement decisions&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Try the demo
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;cleancloud
cleancloud demo &lt;span class="nt"&gt;--category&lt;/span&gt; ai
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This shows sample AI/ML waste findings without needing cloud credentials.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Scan your cloud
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--category&lt;/span&gt; ai &lt;span class="nt"&gt;--all-regions&lt;/span&gt;
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; azure &lt;span class="nt"&gt;--category&lt;/span&gt; ai
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; gcp &lt;span class="nt"&gt;--category&lt;/span&gt; ai &lt;span class="nt"&gt;--all-projects&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Add to CI/CD
&lt;/h3&gt;

&lt;p&gt;Copy the workflow example above and commit it as .github/workflows/ai-hygiene.yml.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Use policy-as-code
&lt;/h3&gt;

&lt;p&gt;Add cleancloud.yaml to document which endpoints are intentional (and when they expire).&lt;/p&gt;




&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;Learn more:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full guide: &lt;a href="https://www.getcleancloud.com/blog/idle-ai-ml-infrastructure-cost.html" rel="noopener noreferrer"&gt;https://www.getcleancloud.com/blog/idle-ai-ml-infrastructure-cost.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Detection rules: &lt;a href="https://github.com/cleancloud-io/cleancloud/blob/main/docs/rules.md" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud/blob/main/docs/rules.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Configuration: &lt;a href="https://github.com/cleancloud-io/cleancloud/blob/main/docs/configuration.md" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud/blob/main/docs/configuration.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Try it: cleancloud scan --provider aws --category ai --all-regions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;What's your biggest source of AI/ML waste? SageMaker endpoints, AML clusters, or Vertex AI endpoints? Share in the comments.&lt;/p&gt;




&lt;p&gt;Originally published on getcleancloud.com&lt;/p&gt;

</description>
      <category>aws</category>
      <category>ai</category>
      <category>azure</category>
      <category>gcp</category>
    </item>
    <item>
      <title>Stop Managing Cloud Exceptions in Spreadsheets — Use Policy-as-Code Instead</title>
      <dc:creator>Suresh</dc:creator>
      <pubDate>Wed, 08 Apr 2026 10:07:08 +0000</pubDate>
      <link>https://dev.to/sureshmandalapu/stop-managing-cloud-exceptions-in-spreadsheets-use-policy-as-code-instead-21a8</link>
      <guid>https://dev.to/sureshmandalapu/stop-managing-cloud-exceptions-in-spreadsheets-use-policy-as-code-instead-21a8</guid>
      <description>&lt;p&gt;TL;DR: Cloud exceptions in spreadsheets rot. Policy-as-code puts them in git with automatic expiry dates, git reviews, and cost tracking. Here's how.&lt;/p&gt;

&lt;p&gt;Tools like &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;CleanCloud&lt;/a&gt; move exceptions into Git with expiry dates, PR reviews, and enforceable cost thresholds.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Exception Spreadsheet Nobody Trusts
&lt;/h2&gt;

&lt;p&gt;It starts innocently enough. Your team runs a cloud hygiene scan. A bastion host is intentionally stopped. A dev database has been idle for 45 days but is still in use. A NAT gateway shows no traffic because its workload is seasonal. Legitimate exceptions, all of them.&lt;/p&gt;

&lt;p&gt;So you create a list.&lt;/p&gt;

&lt;p&gt;Three months later:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Nobody knows which exceptions are still valid&lt;/li&gt;
&lt;li&gt;Nobody remembers who approved them&lt;/li&gt;
&lt;li&gt;Nobody knows when they expire&lt;/li&gt;
&lt;li&gt;The spreadsheet hasn't been updated since 2024&lt;/li&gt;
&lt;li&gt;You're suppressing the alerts instead of fixing the waste&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~40% of FinOps exceptions have zero documented expiry&lt;/li&gt;
&lt;li&gt;Average exception age before questioned: 6+ months&lt;/li&gt;
&lt;li&gt;Typical cost of "forgotten" exceptions: $10-50K+/month&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The exceptions spreadsheet is the FinOps equivalent of commenting out a security alert. &lt;/p&gt;

&lt;p&gt;The waste is still running. &lt;/p&gt;

&lt;p&gt;You've just stopped seeing it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What If Exceptions Were Code?
&lt;/h2&gt;

&lt;p&gt;This is exactly the problem policy-as-code solves.&lt;/p&gt;

&lt;p&gt;Instead of tracking exceptions in spreadsheets or tickets, tools like &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;CleanCloud&lt;/a&gt; treat them as version-controlled configuration — living alongside your infrastructure, with built-in expiry, review, and enforcement.&lt;/p&gt;

&lt;p&gt;Your exceptions live in Git, in a cleancloud.yaml file at the repo root:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# cleancloud.yaml - commit to your repo root&lt;/span&gt;

&lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MEDIUM&lt;/span&gt;      &lt;span class="c1"&gt;# skip low-signal findings&lt;/span&gt;
  &lt;span class="na"&gt;min_cost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;            &lt;span class="c1"&gt;# ignore cheap findings&lt;/span&gt;

&lt;span class="na"&gt;exceptions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws.ec2.instance.stopped&lt;/span&gt;
    &lt;span class="na"&gt;resource_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;i-0bastion1234&lt;/span&gt;  &lt;span class="c1"&gt;# bastion host&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bastion&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;started&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;on&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;demand&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;debugging"&lt;/span&gt;
    &lt;span class="na"&gt;expires_at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-12-31"&lt;/span&gt;      &lt;span class="c1"&gt;# auto-expires (forces review)&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws.rds.instance.idle&lt;/span&gt;
    &lt;span class="na"&gt;resource_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db-test-*"&lt;/span&gt;      &lt;span class="c1"&gt;# wildcard (all test databases)&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Test&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;databases&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ephemeral&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;and&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;intentionally&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;idle"&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws.nat.idle&lt;/span&gt;
    &lt;span class="na"&gt;resource_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;nat-12345678&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NAT&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;gateway&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;for&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;seasonal&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;workload&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;(Jan-Mar&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;only)"&lt;/span&gt;
    &lt;span class="na"&gt;expires_at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2026-03-31"&lt;/span&gt;      &lt;span class="c1"&gt;# expires after season ends&lt;/span&gt;

&lt;span class="na"&gt;thresholds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;fail_on_confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;HIGH&lt;/span&gt;        &lt;span class="c1"&gt;# CI gate: block on HIGH findings&lt;/span&gt;
  &lt;span class="na"&gt;fail_on_cost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;               &lt;span class="c1"&gt;# CI gate: block if waste &amp;gt; $500/month&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now every exception is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reviewable - Changes show up in pull requests, with the reason documented&lt;/li&gt;
&lt;li&gt;Auditable - Git history shows who approved what&lt;/li&gt;
&lt;li&gt;Self-expiring - Each exception can carry an expiry date, so none is quietly forgotten&lt;/li&gt;
&lt;li&gt;Enforceable - CI fails when an exception expires or a threshold is breached&lt;/li&gt;
&lt;li&gt;Version-controlled - Treated like infrastructure code (because it is)&lt;/li&gt;
&lt;/ul&gt;
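
&lt;p&gt;To make the expiry and wildcard behaviour concrete, here is a minimal Python sketch of how an exception list like the one above might be applied to a single finding. It is an illustration under stated assumptions, not CleanCloud's code: the is_suppressed helper is hypothetical, wildcards are treated as shell-style patterns, and an exception is assumed to stay valid through its expires_at date.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;
```python
from datetime import date
from fnmatch import fnmatchcase

# Two of the exceptions from the YAML above, as plain dicts.
EXCEPTIONS = [
    {
        "rule_id": "aws.ec2.instance.stopped",
        "resource_id": "i-0bastion1234",
        "expires_at": "2026-12-31",
    },
    {"rule_id": "aws.rds.instance.idle", "resource_id": "db-test-*"},
]


def is_suppressed(finding, exceptions, today):
    """Return True if an unexpired exception matches the finding."""
    for exc in exceptions:
        if exc["rule_id"] != finding["rule_id"]:
            continue
        # Shell-style match, so "db-test-*" covers every test database.
        if not fnmatchcase(finding["resource_id"], exc["resource_id"]):
            continue
        expires_at = exc.get("expires_at")
        if expires_at is not None and today > date.fromisoformat(expires_at):
            continue  # expired: the finding is reported again
        return True
    return False


bastion = {"rule_id": "aws.ec2.instance.stopped", "resource_id": "i-0bastion1234"}
print(is_suppressed(bastion, EXCEPTIONS, date(2026, 6, 1)))  # True
print(is_suppressed(bastion, EXCEPTIONS, date(2027, 1, 1)))  # False
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second call is the point of the expiry date: once it has passed, the finding resurfaces and someone has to renew or remove the exception in a pull request.&lt;/p&gt;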




&lt;h2&gt;
  
  
  How It Works in Practice
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Run scans with your config
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--all-regions&lt;/span&gt;
&lt;span class="c"&gt;# Automatically picks up cleancloud.yaml from repo root&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without cleancloud.yaml: 47 findings&lt;br&gt;
With cleancloud.yaml: 12 findings (the exceptions are suppressed)&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Enforce in CI/CD
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/cloud-hygiene.yml&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Cloud Hygiene Check&lt;/span&gt;

&lt;span class="na"&gt;on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;schedule&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;cron&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;9&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;MON'&lt;/span&gt;  &lt;span class="c1"&gt;# Weekly scan&lt;/span&gt;

&lt;span class="na"&gt;jobs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;scan&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;runs-on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ubuntu-latest&lt;/span&gt;
    &lt;span class="na"&gt;steps&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;actions/checkout@v4&lt;/span&gt;

      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Scan cloud for waste&lt;/span&gt;
        &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;getcleancloud/scan-action@v1&lt;/span&gt;
        &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;provider&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws&lt;/span&gt;
          &lt;span class="na"&gt;all-regions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;
          &lt;span class="na"&gt;fail-on-cost&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;500&lt;/span&gt;      &lt;span class="c1"&gt;# Exit code 2 if waste &amp;gt; $500/month&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Result: Your build fails if an exception expires or the waste threshold is breached. No surprises.&lt;/p&gt;
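
&lt;p&gt;To make the cost gate concrete, here is a minimal Python sketch of the kind of check a &lt;code&gt;fail-on-cost&lt;/code&gt; threshold implies. The finding dictionaries, the &lt;code&gt;monthly_cost_usd&lt;/code&gt; field, and both helper functions are assumptions for illustration, not CleanCloud's actual implementation; only the exit code 2 and the $500 threshold come from the workflow above.&lt;/p&gt;

```python
from typing import Dict, List


def total_monthly_waste(findings: List[Dict]) -> float:
    """Sum the estimated monthly cost across all findings."""
    return sum(f["monthly_cost_usd"] for f in findings)


def exit_code_for(findings: List[Dict], fail_on_cost: float) -> int:
    """Return 2 when estimated waste exceeds the threshold, otherwise 0."""
    return 2 if total_monthly_waste(findings) > fail_on_cost else 0


# Hypothetical findings; resource names and costs are made up.
findings = [
    {"resource_id": "endpoint-a", "monthly_cost_usd": 380.0},
    {"resource_id": "vol-0abc", "monthly_cost_usd": 35.0},
    {"resource_id": "nat-0def", "monthly_cost_usd": 120.0},
]

print(exit_code_for(findings, fail_on_cost=500))
```

&lt;p&gt;In this sketch the gate looks at the total across findings, so many small items can trip the threshold even when no single resource is expensive.&lt;/p&gt;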
&lt;h3&gt;
  
  
  3. Update exceptions via PR
&lt;/h3&gt;

&lt;p&gt;When your bastion exception expires (2026-12-31), the next scan will fail:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Cloud Hygiene Check: FAILED
  [HIGH] Stopped EC2 instance: i-0bastion1234
  Reason: Exception expired on 2026-12-31

  Action: Remove the exception or update expires_at
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Someone opens a PR that either removes the exception or updates its expiry date. Your team reviews the PR and decides whether to renew the exception or delete it. Zero magic. Total visibility.&lt;/p&gt;
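
&lt;p&gt;The expiry check itself can be sketched in a few lines of Python. This is an illustration under stated assumptions, not CleanCloud's code: the rule ID &lt;code&gt;aws.ec2.instance.stopped&lt;/code&gt; is hypothetical, and the sketch assumes an exception stays valid through its &lt;code&gt;expires_at&lt;/code&gt; date and counts as expired from the following day.&lt;/p&gt;

```python
from datetime import date
from typing import Dict, List, Optional


def is_expired(expires_at: Optional[str], today: date) -> bool:
    """No expires_at means the exception never expires (assumption: inclusive end date)."""
    if expires_at is None:
        return False
    return today > date.fromisoformat(expires_at)


def expired_exceptions(exceptions: List[Dict], today: date) -> List[Dict]:
    """Return the exceptions whose expiry date has passed."""
    return [e for e in exceptions if is_expired(e.get("expires_at"), today)]


# Hypothetical exception entries mirroring the YAML fields used in this post.
exceptions = [
    {"rule_id": "aws.ec2.instance.stopped", "resource_id": "i-0bastion1234",
     "expires_at": "2026-12-31"},
    {"rule_id": "aws.rds.instance.idle", "resource_id": "db-failover",
     "expires_at": "2027-03-31"},
    {"rule_id": "aws.rds.instance.idle", "resource_id": "db-*"},
]

stale = expired_exceptions(exceptions, today=date(2027, 1, 4))
print([e["resource_id"] for e in stale])
```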




&lt;h2&gt;
  
  
  Real-World Example: Multi-Account Exception
&lt;/h2&gt;

&lt;p&gt;Managing exceptions across your AWS org:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;exceptions&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="c1"&gt;# Production RDS: kept for failover (intentional)&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws.rds.instance.idle&lt;/span&gt;
    &lt;span class="na"&gt;account_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;123456789012"&lt;/span&gt;        &lt;span class="c1"&gt;# prod account&lt;/span&gt;
    &lt;span class="na"&gt;region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;us-east-1&lt;/span&gt;
    &lt;span class="na"&gt;resource_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;db-failover&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Standby&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;RDS&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;in&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;active-passive&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;setup"&lt;/span&gt;
    &lt;span class="na"&gt;expires_at&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2027-03-31"&lt;/span&gt;          &lt;span class="c1"&gt;# annual review date&lt;/span&gt;

  &lt;span class="c1"&gt;# Staging: ephemeral, safe to ignore&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;rule_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws.rds.instance.idle&lt;/span&gt;
    &lt;span class="na"&gt;account_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;210987654321"&lt;/span&gt;        &lt;span class="c1"&gt;# staging account&lt;/span&gt;
    &lt;span class="na"&gt;resource_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;db-*"&lt;/span&gt;               &lt;span class="c1"&gt;# glob pattern&lt;/span&gt;
    &lt;span class="na"&gt;reason&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Staging&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;databases&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;are&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ephemeral"&lt;/span&gt;
    &lt;span class="c1"&gt;# No expires_at = permanent exception (reviewed manually)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
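
&lt;p&gt;One way to read the config above: an exception applies when the rule ID matches, any scoping fields that are present match exactly, and the resource ID matches the glob. The Python sketch below shows that interpretation using the standard library's &lt;code&gt;fnmatch&lt;/code&gt;. The &lt;code&gt;matches&lt;/code&gt; and &lt;code&gt;is_suppressed&lt;/code&gt; helpers and the finding shape are assumptions; CleanCloud's real matching rules may differ.&lt;/p&gt;

```python
from fnmatch import fnmatchcase
from typing import Dict, List


def matches(exception: Dict, finding: Dict) -> bool:
    """Check one exception against one finding (illustrative semantics only)."""
    if exception["rule_id"] != finding["rule_id"]:
        return False
    # Scoping fields are optional; when omitted they match anything.
    for key in ("account_id", "region"):
        wanted = exception.get(key)
        if wanted is not None and wanted != finding.get(key):
            return False
    # resource_id may be a literal ID or a glob such as "db-*".
    return fnmatchcase(finding["resource_id"], exception.get("resource_id", "*"))


def is_suppressed(finding: Dict, exceptions: List[Dict]) -> bool:
    return any(matches(e, finding) for e in exceptions)


exceptions = [
    {"rule_id": "aws.rds.instance.idle", "account_id": "123456789012",
     "region": "us-east-1", "resource_id": "db-failover"},
    {"rule_id": "aws.rds.instance.idle", "account_id": "210987654321",
     "resource_id": "db-*"},
]

prod_standby = {"rule_id": "aws.rds.instance.idle", "account_id": "123456789012",
                "region": "us-east-1", "resource_id": "db-failover"}
prod_other = {"rule_id": "aws.rds.instance.idle", "account_id": "123456789012",
              "region": "us-east-1", "resource_id": "db-reporting"}
staging_db = {"rule_id": "aws.rds.instance.idle", "account_id": "210987654321",
              "region": "eu-west-1", "resource_id": "db-test-42"}

for f in (prod_standby, prod_other, staging_db):
    print(f["resource_id"], is_suppressed(f, exceptions))
```

&lt;p&gt;Note how the staging glob suppresses every &lt;code&gt;db-*&lt;/code&gt; instance in that account regardless of region, which is why broad patterns deserve extra scrutiny in review.&lt;/p&gt;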






&lt;h2&gt;
  
  
  The Policy-as-Code Difference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Exceptions&lt;/th&gt;
&lt;th&gt;Audit Trail&lt;/th&gt;
&lt;th&gt;Expiry&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Spreadsheet&lt;/td&gt;
&lt;td&gt;Unstructured&lt;/td&gt;
&lt;td&gt;Slack message&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Free (but $50K waste)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ticketing&lt;/td&gt;
&lt;td&gt;In Jira&lt;/td&gt;
&lt;td&gt;Comments&lt;/td&gt;
&lt;td&gt;Forgotten&lt;/td&gt;
&lt;td&gt;Free (but lost time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;UI Toggle&lt;/td&gt;
&lt;td&gt;In vendor SaaS&lt;/td&gt;
&lt;td&gt;Dashboard logs&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Vendor cost&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Policy-as-Code&lt;/td&gt;
&lt;td&gt;In git&lt;/td&gt;
&lt;td&gt;Full history&lt;/td&gt;
&lt;td&gt;Automatic&lt;/td&gt;
&lt;td&gt;Free (and tighter control)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Getting Started (5 Minutes)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Try it without exceptions
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pipx &lt;span class="nb"&gt;install &lt;/span&gt;cleancloud
cleancloud demo                    &lt;span class="c"&gt;# See sample findings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Add a config file
&lt;/h3&gt;

&lt;p&gt;Create cleancloud.yaml in your repo root (the YAML above).&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Scan with your config
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--all-regions&lt;/span&gt;
&lt;span class="c"&gt;# Now suppresses the exceptions listed in cleancloud.yaml&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Commit to git
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add cleancloud.yaml
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Add cloud hygiene exceptions with expiry dates"&lt;/span&gt;
git push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your exceptions are now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Version-controlled&lt;/li&gt;
&lt;li&gt;Code-reviewed&lt;/li&gt;
&lt;li&gt;Automatically expiring&lt;/li&gt;
&lt;li&gt;Auditable&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why This Matters for Your Team
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Platform teams: Enforce waste thresholds across departments without manual hunting&lt;/li&gt;
&lt;li&gt;FinOps teams: Audit trail + expiry dates = zero "forgotten" waste&lt;/li&gt;
&lt;li&gt;DevOps/SREs: Exceptions treated like infrastructure code (belong in git)&lt;/li&gt;
&lt;li&gt;Security/Compliance: Every exception is a documented, reviewable approval&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Next Steps
&lt;/h2&gt;

&lt;p&gt;Learn more:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Full guide: &lt;a href="https://www.getcleancloud.com/blog/policy-as-code-cloud-governance.html" rel="noopener noreferrer"&gt;https://www.getcleancloud.com/blog/policy-as-code-cloud-governance.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Configuration reference: &lt;a href="https://github.com/cleancloud-io/cleancloud/blob/main/docs/configuration.md" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud/blob/main/docs/configuration.md&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Try it: cleancloud demo --category ai (also detects idle AI/ML waste)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;What's your current approach to managing exceptions? Spreadsheets, Jira, or something else? Drop a comment below.&lt;/p&gt;




&lt;p&gt;Originally published on getcleancloud.com&lt;/p&gt;

</description>
      <category>finop</category>
      <category>devops</category>
      <category>cloudcomputing</category>
      <category>ai</category>
    </item>
    <item>
      <title>CleanCloud v0.4.0: How We Made Cloud Hygiene Scanning 10x Faster</title>
      <dc:creator>Suresh</dc:creator>
      <pubDate>Fri, 02 Jan 2026 06:15:44 +0000</pubDate>
      <link>https://dev.to/sureshmandalapu/cleancloud-v040-how-we-made-cloud-hygiene-scanning-10x-faster-with-benchmarkspublished-true-4png</link>
      <guid>https://dev.to/sureshmandalapu/cleancloud-v040-how-we-made-cloud-hygiene-scanning-10x-faster-with-benchmarkspublished-true-4png</guid>
      <description>&lt;p&gt;I just shipped CleanCloud v0.4.0 with major performance improvements through parallel scanning. Here's how we did it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's CleanCloud?
&lt;/h2&gt;

&lt;p&gt;If you missed the &lt;a href="https://dev.to/suresh_564529bdc18d6e32f4/i-built-a-read-only-awsazure-hygiene-scanner-because-auto-delete-is-too-risky-34go"&gt;original announcement&lt;/a&gt;, CleanCloud is a &lt;strong&gt;read-only&lt;/strong&gt; CLI tool that scans AWS/Azure for orphaned resources (unattached volumes, old snapshots, infinite CloudWatch log retention).&lt;/p&gt;

&lt;p&gt;Unlike aggressive cleanup tools, CleanCloud gives you conservative signals so you can review before taking action. No auto-delete, no risk.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Performance Problem
&lt;/h2&gt;

&lt;p&gt;v0.3.x had a bottleneck: &lt;strong&gt;sequential scanning&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Old approach (v0.3.x)
&lt;/span&gt;&lt;span class="n"&gt;findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions_to_scan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;click&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🔍 Scanning region &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;findings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;_scan_aws_region&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Each region was scanned one at a time, so total scan time grew with every additional region. For accounts with resources in multiple regions, this added up quickly.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: Parallel Scanning
&lt;/h2&gt;

&lt;p&gt;v0.4.0 introduces &lt;strong&gt;concurrent scanning&lt;/strong&gt; at two levels:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Parallel Region Scanning
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# New approach (v0.4.0)
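&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_completed&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;click&lt;/span&gt;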

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scan_aws_regions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;regions_to_scan&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Finding&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions_to_scan&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;futures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_scan_aws_region&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; 
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;regions_to_scan&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;findings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
            &lt;span class="n"&gt;click&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Completed region &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;findings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key decisions:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;max_workers=min(5, len(regions_to_scan))&lt;/code&gt; - Limits parallelism to avoid rate limits&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;as_completed()&lt;/code&gt; - Shows progress as regions complete&lt;/li&gt;
&lt;li&gt;Thread-safe result collection - Results are merged in the calling thread, so no locks are needed&lt;/li&gt;
&lt;/ul&gt;
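
&lt;p&gt;If you want to try the pattern without cloud credentials, here is a self-contained sketch with a stand-in scanner. &lt;code&gt;fake_scan_region&lt;/code&gt; and its short sleep are assumptions that simulate network latency; the executor and &lt;code&gt;as_completed&lt;/code&gt; usage follow the snippet above.&lt;/p&gt;

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import List


def fake_scan_region(region: str) -> List[str]:
    """Stand-in for a real region scan; sleeps briefly to mimic API latency."""
    time.sleep(0.05)
    return [f"{region}: unattached volume"]


def scan_regions(regions: List[str]) -> List[str]:
    findings: List[str] = []
    # max(1, ...) guards the empty-list case: ThreadPoolExecutor rejects max_workers=0.
    workers = max(1, min(5, len(regions)))
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = {executor.submit(fake_scan_region, r): r for r in regions}
        for future in as_completed(futures):
            # Only this calling thread touches `findings`, so no lock is needed.
            findings.extend(future.result())
    return findings


regions = ["us-east-1", "us-west-2", "eu-west-1"]
results = scan_regions(regions)
print(sorted(results))
```

&lt;p&gt;Because &lt;code&gt;as_completed&lt;/code&gt; yields futures in finish order, the order of findings can vary between runs; sort them if you need stable output.&lt;/p&gt;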

&lt;h3&gt;
  
  
  2. Parallel Rule Execution
&lt;/h3&gt;

&lt;p&gt;Within each region, we also parallelized individual rules:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;AWS_RULES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="n"&gt;find_unattached_ebs_volumes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;find_old_ebs_snapshots&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;find_inactive_cloudwatch_logs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;find_aws_untagged_resources&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_scan_aws_region&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Finding&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_aws_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AWS_RULES&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;futures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rule&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;rule&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;AWS_RULES&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;rule_findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="n"&gt;findings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rule_findings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="c1"&gt;# Never fail entire scan due to one rule
&lt;/span&gt;                &lt;span class="n"&gt;click&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚠️ Rule failed in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;findings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All 4 rules run concurrently per region&lt;/li&gt;
&lt;li&gt;Exception isolation (one failing rule doesn't break the scan)&lt;/li&gt;
&lt;li&gt;Better resource utilization&lt;/li&gt;
&lt;/ul&gt;
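
&lt;p&gt;Exception isolation is easy to verify in miniature. The sketch below uses two made-up rules, one of which always raises, and maps each future back to its rule name so the warning can say which rule failed. The rule functions and the &lt;code&gt;run_rules&lt;/code&gt; helper are hypothetical and only illustrate the pattern.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List, Sequence, Tuple


def find_old_snapshots(region: str) -> List[str]:
    return [f"{region}: old snapshot"]


def find_broken(region: str) -> List[str]:
    raise RuntimeError("AccessDenied")


def run_rules(rules: Sequence[Callable[[str], List[str]]],
              region: str) -> Tuple[List[str], List[str]]:
    """Run every rule concurrently; collect findings and per-rule errors separately."""
    findings: List[str] = []
    errors: List[str] = []
    with ThreadPoolExecutor(max_workers=max(1, min(4, len(rules)))) as executor:
        futures = {executor.submit(rule, region): rule.__name__ for rule in rules}
        for future in as_completed(futures):
            try:
                findings.extend(future.result())
            except Exception as exc:
                # One failing rule is recorded, not re-raised.
                errors.append(f"{futures[future]}: {exc}")
    return findings, errors


findings, errors = run_rules([find_old_snapshots, find_broken], "us-east-1")
print(findings)
print(errors)
```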




&lt;h2&gt;
  
  
  Performance Improvements
&lt;/h2&gt;

&lt;p&gt;Real-world results from testing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single region scan:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Before: ~20-25 seconds&lt;/li&gt;
&lt;li&gt;After: ~15-18 seconds&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improvement: ~30% faster&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Multi-region scan (5 regions):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Before: ~100-120 seconds (sequential)&lt;/li&gt;
&lt;li&gt;After: ~20-25 seconds (parallel)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Improvement: ~5x faster&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The key insight:&lt;/strong&gt; The more regions you scan, the bigger the improvement. Parallel execution shines when there's actual work to parallelize.&lt;/p&gt;
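
&lt;p&gt;A back-of-the-envelope model helps explain these numbers. The sketch below assumes every region takes the same time and ignores rate limiting and thread overhead, so treat it as a rough estimate rather than a benchmark. The 22-second figure is a hypothetical per-region time picked from the ranges above.&lt;/p&gt;

```python
import math


def estimated_wall_time(num_regions: int, seconds_per_region: float,
                        max_workers: int) -> float:
    """Regions run in batches of `max_workers`; each batch takes one region's time."""
    workers = max(1, min(max_workers, num_regions))
    batches = math.ceil(num_regions / workers)
    return batches * seconds_per_region


for n in (1, 5, 10):
    sequential = estimated_wall_time(n, 22.0, max_workers=1)
    parallel = estimated_wall_time(n, 22.0, max_workers=5)
    print(n, sequential, parallel, sequential / parallel)
```

&lt;p&gt;Under this model the speedup grows with the region count until it reaches the worker cap of 5, after which additional regions add batches rather than more parallelism.&lt;/p&gt;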




&lt;h2&gt;
  
  
  Azure Gets the Same Treatment
&lt;/h2&gt;

&lt;p&gt;Azure subscriptions are now scanned in parallel too:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scan_azure_subscriptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;subscription_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;credential&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;region_filter&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Finding&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;all_findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;subscription_ids&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;futures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;_scan_azure_subscription&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;subscription_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sub_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;credential&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;credential&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;region_filter&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;region_filter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="n"&gt;sub_id&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;sub_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;subscription_ids&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;sub_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;all_findings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                &lt;span class="n"&gt;click&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Completed subscription &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sub_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;click&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚠️ Subscription &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sub_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;all_findings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Same benefits for Azure users with multiple subscriptions.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Other v0.4.0 Improvements
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🔒 Safety Integration Tests
&lt;/h3&gt;

&lt;p&gt;We now have automated tests that verify CleanCloud's read-only guarantees:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_scan_is_read_only&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Ensure no write operations during scan.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Run full scan
&lt;/span&gt;    &lt;span class="n"&gt;scan_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;scan_all_regions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Check CloudTrail for write operations
&lt;/span&gt;    &lt;span class="n"&gt;cloudtrail_events&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_recent_events&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;write_events&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cloudtrail_events&lt;/span&gt; 
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;EventName&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;READ_ONLY_OPERATIONS&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Fail if ANY writes detected
&lt;/span&gt;    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;write_events&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write operations detected: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;write_events&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These run in CI on every PR against real AWS/Azure accounts. If CleanCloud ever tries to write, the build fails.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this matters:&lt;/strong&gt; You can trust that CleanCloud is truly read-only, not just claiming to be.&lt;/p&gt;
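
&lt;p&gt;A minimal sketch of how such a check could classify events, assuming a prefix-based allowlist. The names &lt;code&gt;READ_ONLY_PREFIXES&lt;/code&gt; and &lt;code&gt;find_write_events&lt;/code&gt; are hypothetical, not CleanCloud's actual test helpers, and the events are plain dicts rather than live CloudTrail records:&lt;/p&gt;

```python
# Hypothetical allowlist: API calls whose names start with these
# prefixes are treated as read-only. This is an assumption for
# illustration, not CleanCloud's actual READ_ONLY_OPERATIONS list.
READ_ONLY_PREFIXES = ("Describe", "List", "Get")


def find_write_events(events):
    """Return every event whose name does not look read-only."""
    return [
        event for event in events
        if not event["EventName"].startswith(READ_ONLY_PREFIXES)
    ]


sample_events = [
    {"EventName": "DescribeVolumes"},
    {"EventName": "ListBuckets"},
    {"EventName": "DeleteVolume"},
]
write_events = find_write_events(sample_events)
print([event["EventName"] for event in write_events])
```

&lt;p&gt;A prefix allowlist is deliberately strict: any call that does not look like a read is reported, so a new write path would likely fail the build rather than slip through.&lt;/p&gt;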

&lt;h3&gt;
  
  
  🩺 Enhanced Doctor Command
&lt;/h3&gt;

&lt;p&gt;The &lt;code&gt;cleancloud doctor&lt;/code&gt; command now provides &lt;strong&gt;actionable IAM diagnostics&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;cleancloud doctor &lt;span class="nt"&gt;--provider&lt;/span&gt; aws

&lt;span class="c"&gt;# Before (v0.3.x):&lt;/span&gt;
❌ Permission denied

&lt;span class="c"&gt;# After (v0.4.0):&lt;/span&gt;
❌ Missing IAM permission: ec2:DescribeVolumes

Suggested IAM policy:
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"Version"&lt;/span&gt;: &lt;span class="s2"&gt;"2012-10-17"&lt;/span&gt;,
  &lt;span class="s2"&gt;"Statement"&lt;/span&gt;: &lt;span class="o"&gt;[{&lt;/span&gt;
    &lt;span class="s2"&gt;"Effect"&lt;/span&gt;: &lt;span class="s2"&gt;"Allow"&lt;/span&gt;,
    &lt;span class="s2"&gt;"Action"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"ec2:DescribeVolumes"&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;,
    &lt;span class="s2"&gt;"Resource"&lt;/span&gt;: &lt;span class="s2"&gt;"*"&lt;/span&gt;
  &lt;span class="o"&gt;}]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Much more helpful for debugging permission issues.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  📊 Post-Scan Feedback
&lt;/h3&gt;

&lt;p&gt;After each scan, you'll see a feedback prompt (pass &lt;code&gt;--no-feedback&lt;/code&gt; to disable it in CI/CD):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;--- Scan Summary ---
Total findings: 23

CleanCloud feedback
-------------------
If this scan surfaced useful findings, we'd love to hear about it.

Share feedback: https://github.com/cleancloud-io/cleancloud/discussions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This helps us improve detection rules based on real user feedback.&lt;/p&gt;




&lt;h2&gt;
  
  
  Real-World Impact
&lt;/h2&gt;

&lt;p&gt;Since launch, CleanCloud users have reported finding:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;💰 Cost Savings:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$8K-12K/year in forgotten CloudWatch logs (infinite retention)&lt;/li&gt;
&lt;li&gt;$500-2K/year in unattached EBS volumes&lt;/li&gt;
&lt;li&gt;$300-1K/year in old snapshots&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;🎯 Common Findings:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;50-100 unattached volumes per account&lt;/li&gt;
&lt;li&gt;100-300 old snapshots from deleted instances&lt;/li&gt;
&lt;li&gt;20-50 log groups with infinite retention&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;⏱️ Time to Value:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scan time: 20-30 seconds (v0.4.0)&lt;/li&gt;
&lt;li&gt;Review time: 5-10 minutes&lt;/li&gt;
&lt;li&gt;First cleanup: Same day&lt;/li&gt;
&lt;li&gt;ROI: Immediate&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Installation &amp;amp; Usage
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;cleancloud

&lt;span class="c"&gt;# Scan all active AWS regions (auto-detects which have resources)&lt;/span&gt;
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--all-regions&lt;/span&gt;

&lt;span class="c"&gt;# Check IAM permissions&lt;/span&gt;
cleancloud doctor &lt;span class="nt"&gt;--provider&lt;/span&gt; aws

&lt;span class="c"&gt;# Scan specific region&lt;/span&gt;
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1

&lt;span class="c"&gt;# Scan Azure&lt;/span&gt;
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; azure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Example output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;🔍 Starting CleanCloud scan...

Provider: aws

🔍 Auto-detecting regions with resources...
✓ Found 3 active regions: us-east-1, us-west-2, eu-west-1

✅ Completed region us-east-1
✅ Completed region us-west-2
✅ Completed region eu-west-1

--- Scan Summary ---
Total findings: 47
By confidence: {'HIGH': 12, 'MEDIUM': 23, 'LOW': 12}
Regions scanned: us-east-1, us-west-2, eu-west-1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Technical Deep Dive: Threading Challenges
&lt;/h2&gt;

&lt;p&gt;Building the parallel scanning wasn't trivial. Here are some challenges we hit:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Thread Safety with boto3
&lt;/h3&gt;

&lt;p&gt;boto3 sessions and resources are &lt;strong&gt;not thread-safe&lt;/strong&gt; (low-level clients generally are, but every client still comes from a session). We create a separate session per thread:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_scan_aws_region&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Finding&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="c1"&gt;# Create NEW session per thread
&lt;/span&gt;    &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_aws_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Now safe to use in this thread
&lt;/span&gt;    &lt;span class="n"&gt;findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="c1"&gt;# ... scanning logic
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;findings&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Never share boto3 sessions or resources across threads. Create a new session per worker.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Rate Limiting
&lt;/h3&gt;

&lt;p&gt;Running 5 regions in parallel meant more concurrent API calls. We had to be smart about worker limits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Limit parallelism to avoid throttling
&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;regions_to_scan&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;  &lt;span class="c1"&gt;# Cap at 5 workers
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



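&lt;p&gt;&lt;strong&gt;Also:&lt;/strong&gt; boto3's built-in retry logic handles most throttling gracefully once it is configured with the adaptive retry mode (for example via botocore's &lt;code&gt;Config(retries={"mode": "adaptive"})&lt;/code&gt;).&lt;/p&gt;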

&lt;h3&gt;
  
  
  3. Error Isolation
&lt;/h3&gt;

&lt;p&gt;One region failing shouldn't kill the entire scan:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;rule_findings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;findings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rule_findings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Log error but continue
&lt;/span&gt;        &lt;span class="n"&gt;click&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚠️ Rule failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; If some regions fail, you still get partial results from the rest. Trust-first means one failure never aborts the entire scan.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Progress Feedback
&lt;/h3&gt;

&lt;p&gt;Users need to know what's happening during parallel scans:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;region&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;futures&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;click&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;echo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Completed region &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Better UX:&lt;/strong&gt; Show progress as regions complete, not just at the end.&lt;/p&gt;
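
&lt;p&gt;The worker cap, error isolation, and progress reporting described above can be combined in one loop. The following is a rough, self-contained sketch using only the standard library. &lt;code&gt;scan_region&lt;/code&gt; and &lt;code&gt;scan_regions&lt;/code&gt; are hypothetical stand-ins that make no cloud calls, and the simulated failure in &lt;code&gt;eu-west-1&lt;/code&gt; exists only to exercise the error path:&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scan_region(region):
    """Hypothetical stand-in for a per-region scan; makes no cloud calls."""
    if region == "eu-west-1":
        raise RuntimeError("simulated API failure")
    return [f"finding-in-{region}"]


def scan_regions(regions):
    """Scan regions in parallel, keeping results from regions that succeed."""
    findings, failed = [], []
    if not regions:
        return findings, failed
    max_workers = min(5, len(regions))  # cap parallelism to limit throttling
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its region for progress and error messages.
        futures = {pool.submit(scan_region, region): region for region in regions}
        for future in as_completed(futures):
            region = futures[future]
            try:
                findings.extend(future.result())
                print(f"Completed region {region}")
            except Exception as exc:
                # One region failing must not abort the others.
                failed.append(region)
                print(f"Region {region} failed: {exc}")
    return findings, failed


findings, failed = scan_regions(["us-east-1", "us-west-2", "eu-west-1"])
print(sorted(findings), failed)
```

&lt;p&gt;Returning the failed regions alongside the findings lets the caller report a partial scan explicitly instead of silently presenting incomplete results.&lt;/p&gt;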




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Roadmap for v0.5.0:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🌐 &lt;strong&gt;GCP support&lt;/strong&gt; - Extend beyond AWS/Azure&lt;/li&gt;
&lt;li&gt;⚙️ &lt;strong&gt;Configurable thresholds&lt;/strong&gt; - Adjust age/confidence per environment&lt;/li&gt;
&lt;li&gt;💵 &lt;strong&gt;Cost calculations&lt;/strong&gt; - Show potential savings in dollars&lt;/li&gt;
&lt;li&gt;🔗 &lt;strong&gt;CI/CD templates&lt;/strong&gt; - GitHub Actions, GitLab CI examples&lt;/li&gt;
&lt;li&gt;📊 &lt;strong&gt;JSON export improvements&lt;/strong&gt; - Better integration with other tools&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Want to contribute?&lt;/strong&gt; We welcome PRs! Check out the &lt;a href="https://github.com/cleancloud-io/cleancloud/issues" rel="noopener noreferrer"&gt;issues&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Open Source?
&lt;/h2&gt;

&lt;p&gt;CleanCloud is &lt;strong&gt;MIT licensed&lt;/strong&gt; with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Zero telemetry&lt;/li&gt;
&lt;li&gt;✅ No phone-home&lt;/li&gt;
&lt;li&gt;✅ No tracking&lt;/li&gt;
&lt;li&gt;✅ All code visible&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Trust is critical for cloud security tools. Open source means you can verify CleanCloud is truly read-only. No need to trust my promises - read the code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Plus:&lt;/strong&gt; Building in public creates better software through community feedback.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It Out
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;cleancloud
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--all-regions&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Links:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;📦 PyPI: &lt;a href="https://pypi.org/project/cleancloud" rel="noopener noreferrer"&gt;https://pypi.org/project/cleancloud&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💻 GitHub: &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📖 Docs: &lt;a href="https://github.com/cleancloud-io/cleancloud#readme" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud#readme&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Feedback Welcome!
&lt;/h2&gt;

&lt;p&gt;What cloud hygiene checks would be useful? What other resources should CleanCloud scan?&lt;/p&gt;

&lt;p&gt;Drop a comment or open an issue on GitHub. Would love to hear what you find! 🚀&lt;/p&gt;

</description>
      <category>azure</category>
      <category>aws</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>I built a read-only AWS/Azure hygiene scanner (because auto-delete is too risky)</title>
      <dc:creator>Suresh</dc:creator>
      <pubDate>Tue, 30 Dec 2025 08:08:51 +0000</pubDate>
      <link>https://dev.to/sureshmandalapu/i-built-a-read-only-awsazure-hygiene-scanner-because-auto-delete-is-too-risky-34go</link>
      <guid>https://dev.to/sureshmandalapu/i-built-a-read-only-awsazure-hygiene-scanner-because-auto-delete-is-too-risky-34go</guid>
      <description>&lt;p&gt;After getting burned by an auto-cleanup tool that deleted a "test" database (it wasn't a test), I built CleanCloud.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Modern cloud environments are messy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams spin up resources constantly&lt;/li&gt;
&lt;li&gt;Deployments create and destroy infrastructure&lt;/li&gt;
&lt;li&gt;Resources get orphaned when instances are terminated&lt;/li&gt;
&lt;li&gt;Nobody knows what's safe to delete&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most cloud hygiene tools fall into two camps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Auto-delete everything&lt;/strong&gt; → Too dangerous for production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flag everything&lt;/strong&gt; → Too noisy to be useful&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Both approaches fail when you have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Elastic infrastructure (autoscaling, spot instances)&lt;/li&gt;
&lt;li&gt;Multiple teams with different ownership&lt;/li&gt;
&lt;li&gt;Resources that look unused but are actually important&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Auto-Delete Fails
&lt;/h2&gt;

&lt;p&gt;I learned this the hard way.&lt;/p&gt;

&lt;p&gt;A "smart" cleanup tool we tried:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Saw a database with no connections for 7 days&lt;/li&gt;
&lt;li&gt;Assumed it was orphaned&lt;/li&gt;
&lt;li&gt;Deleted it automatically&lt;/li&gt;
&lt;li&gt;Turned out it was a quarterly reporting database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cost of that mistake:&lt;/strong&gt; 3 days of recovery, angry CFO, lost trust in automation.&lt;/p&gt;

&lt;p&gt;The cost of deleting the wrong resource is &lt;strong&gt;orders of magnitude higher&lt;/strong&gt; than the cost of leaving it running for a few more weeks.&lt;/p&gt;

&lt;h2&gt;
  
  
  CleanCloud's Approach: Signal First, Act Later
&lt;/h2&gt;

&lt;p&gt;Instead of automating cleanup, CleanCloud answers a safer question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Which resources deserve human review — and how confident are we?"&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Core principles:&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Read-Only Always
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Required AWS permissions - notice no Delete* or Modify*&lt;/span&gt;
&lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="s2"&gt;"Action"&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;
    &lt;span class="s2"&gt;"ec2:DescribeVolumes"&lt;/span&gt;,
    &lt;span class="s2"&gt;"ec2:DescribeSnapshots"&lt;/span&gt;,
    &lt;span class="s2"&gt;"logs:DescribeLogGroups"&lt;/span&gt;,
    &lt;span class="s2"&gt;"s3:ListAllMyBuckets"&lt;/span&gt;
  &lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No write permissions. Ever. Safe to run in production.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Conservative Signals
&lt;/h3&gt;

&lt;p&gt;Not just "is this unattached?" but:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How long has it been unattached? (14+ days = HIGH confidence)&lt;/li&gt;
&lt;li&gt;Multiple signals required (age + state + tags)&lt;/li&gt;
&lt;li&gt;Explicit confidence levels: LOW, MEDIUM, HIGH&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;🔴 HIGH confidence: Volume unattached for 45 days
🟡 MEDIUM confidence: Volume unattached for 10 days  
🟢 LOW confidence: Volume unattached for 3 days (probably autoscaling)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
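
&lt;p&gt;As a rough illustration of age-based confidence, here is a minimal sketch. The 14-day HIGH threshold comes from the list above. The 7-day MEDIUM threshold and the function name &lt;code&gt;confidence_for_age&lt;/code&gt; are assumptions made for this example, and the real rules also weigh state and tags:&lt;/p&gt;

```python
def confidence_for_age(days_unattached):
    """Map days unattached to a confidence label.

    The 14-day HIGH threshold is taken from the article; the 7-day
    MEDIUM threshold is an assumption made for this sketch.
    """
    if days_unattached >= 14:
        return "HIGH"
    if days_unattached >= 7:
        return "MEDIUM"
    return "LOW"


for days in (45, 10, 3):
    print(days, confidence_for_age(days))
```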



&lt;h3&gt;
  
  
  3. Review-Only Recommendations
&lt;/h3&gt;

&lt;p&gt;CleanCloud never says "delete this." It says:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"This volume has been unattached for 45 days, has no tags, and doesn't match any known deployment patterns. &lt;strong&gt;Worth reviewing.&lt;/strong&gt;"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Humans make the final call.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Detects
&lt;/h2&gt;

&lt;h3&gt;
  
  
  AWS Rules (4 currently)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unattached EBS volumes&lt;/strong&gt; (14+ days = HIGH confidence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Old snapshots&lt;/strong&gt; (365+ days = HIGH confidence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CloudWatch logs with infinite retention&lt;/strong&gt; (30+ days = HIGH confidence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Untagged resources&lt;/strong&gt; (ownership unclear = MEDIUM confidence)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Azure Rules (4 currently)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unattached managed disks&lt;/strong&gt; (14+ days = HIGH confidence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Old snapshots&lt;/strong&gt; (90+ days = HIGH confidence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unused public IPs&lt;/strong&gt; (immediate = HIGH confidence)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Untagged resources&lt;/strong&gt; (MEDIUM confidence)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Week 1 Results
&lt;/h2&gt;

&lt;p&gt;Released last week. Here's what happened:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stats:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;300+ downloads (170 real users, rest are PyPI mirrors)&lt;/li&gt;
&lt;li&gt;0 production incidents (because read-only!)&lt;/li&gt;
&lt;li&gt;Most common finding: 15-30 unattached EBS volumes per AWS account&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;User feedback themes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Finally, a tool I can trust in production"&lt;/li&gt;
&lt;li&gt;"Found $2K/month in waste in first scan"&lt;/li&gt;
&lt;li&gt;"Love that it explains WHY something was flagged"&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quick Start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;cleancloud

&lt;span class="c"&gt;# Validate credentials&lt;/span&gt;
cleancloud doctor &lt;span class="nt"&gt;--provider&lt;/span&gt; aws

&lt;span class="c"&gt;# Scan single region&lt;/span&gt;
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1

&lt;span class="c"&gt;# Scan all active regions&lt;/span&gt;
cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--all-regions&lt;/span&gt;

&lt;span class="c"&gt;# Output to JSON&lt;/span&gt;
cleancloud scan &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--all-regions&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output&lt;/span&gt; json &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--output-file&lt;/span&gt; results.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Example Output
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;cleancloud scan &lt;span class="nt"&gt;--provider&lt;/span&gt; aws &lt;span class="nt"&gt;--region&lt;/span&gt; us-east-1

🔍 Scanning region us-east-1

Found 12 findings:
  HIGH confidence: 8
  MEDIUM confidence: 4

Top findings:
  • vol-0abc123 - Unattached volume &lt;span class="o"&gt;(&lt;/span&gt;45 days, 100GB&lt;span class="o"&gt;)&lt;/span&gt; - ~&lt;span class="nv"&gt;$10&lt;/span&gt;/mo
  • snap-0def456 - Old snapshot &lt;span class="o"&gt;(&lt;/span&gt;120 days, 500GB&lt;span class="o"&gt;)&lt;/span&gt; - ~&lt;span class="nv"&gt;$25&lt;/span&gt;/mo
  • log-group-xyz - Infinite retention &lt;span class="o"&gt;(&lt;/span&gt;2.1GB stored&lt;span class="o"&gt;)&lt;/span&gt; - ~&lt;span class="nv"&gt;$6&lt;/span&gt;/mo

💰 Estimated monthly waste: ~&lt;span class="nv"&gt;$156&lt;/span&gt;

Review findings and decide what to delete.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  CI/CD Integration
&lt;/h2&gt;

&lt;p&gt;Built for pipelines with predictable exit codes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# GitHub Actions example&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run hygiene scan&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;pip install cleancloud&lt;/span&gt;
    &lt;span class="s"&gt;cleancloud scan \&lt;/span&gt;
      &lt;span class="s"&gt;--provider aws \&lt;/span&gt;
      &lt;span class="s"&gt;--all-regions \&lt;/span&gt;
      &lt;span class="s"&gt;--fail-on-confidence HIGH&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Exit codes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;0&lt;/code&gt; = Success (no policy violations)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;1&lt;/code&gt; = Configuration error&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;2&lt;/code&gt; = Policy violation (findings detected)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;3&lt;/code&gt; = Missing credentials&lt;/li&gt;
&lt;/ul&gt;
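
&lt;p&gt;A wrapper script could interpret these codes along the following lines. This is only a sketch: the code-to-meaning table mirrors the list above, while &lt;code&gt;describe_exit_code&lt;/code&gt; and the blocking policy in &lt;code&gt;should_block_merge&lt;/code&gt; are hypothetical and not part of CleanCloud:&lt;/p&gt;

```python
# Exit codes as listed in the article; the handling policy is an assumption.
EXIT_CODE_MEANINGS = {
    0: "success (no policy violations)",
    1: "configuration error",
    2: "policy violation (findings detected)",
    3: "missing credentials",
}


def describe_exit_code(code):
    """Return a readable meaning for a scan exit code."""
    return EXIT_CODE_MEANINGS.get(code, f"unknown exit code {code}")


def should_block_merge(code):
    """Hypothetical policy: block on findings and on any setup failure."""
    return code != 0


for code in (0, 2, 3):
    print(code, describe_exit_code(code), should_block_merge(code))
```

&lt;p&gt;In a real pipeline the code would come from the scan process itself, for example the &lt;code&gt;returncode&lt;/code&gt; of a &lt;code&gt;subprocess.run&lt;/code&gt; call.&lt;/p&gt;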

&lt;p&gt;&lt;strong&gt;Use cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Block PRs with HIGH confidence findings&lt;/li&gt;
&lt;li&gt;Generate weekly hygiene reports&lt;/li&gt;
&lt;li&gt;Enforce tagging standards&lt;/li&gt;
&lt;li&gt;Prevent resource leaks in development&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Authentication: OIDC First
&lt;/h2&gt;

&lt;p&gt;No long-lived credentials needed:&lt;/p&gt;

&lt;h3&gt;
  
  
  AWS (GitHub Actions)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Configure AWS credentials (OIDC)&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;aws-actions/configure-aws-credentials@v4&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;role-to-assume&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;arn:aws:iam::ACCOUNT:role/CleanCloudReadOnly&lt;/span&gt;
    &lt;span class="na"&gt;aws-region&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;us-east-1&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Scan&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cleancloud scan --provider aws&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Azure (GitHub Actions)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Azure Login (OIDC)&lt;/span&gt;
  &lt;span class="na"&gt;uses&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;azure/login@v2&lt;/span&gt;
  &lt;span class="na"&gt;with&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;client-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AZURE_CLIENT_ID }}&lt;/span&gt;
    &lt;span class="na"&gt;tenant-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AZURE_TENANT_ID }}&lt;/span&gt;
    &lt;span class="na"&gt;subscription-id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.AZURE_SUBSCRIPTION_ID }}&lt;/span&gt;

&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Scan&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;cleancloud scan --provider azure&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



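&lt;p&gt;No &lt;code&gt;AWS_SECRET_ACCESS_KEY&lt;/code&gt; or &lt;code&gt;AZURE_CLIENT_SECRET&lt;/code&gt; needed. ✅ Both workflows do need &lt;code&gt;permissions: id-token: write&lt;/code&gt; on the job so that GitHub can issue the OIDC token.&lt;/p&gt;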

&lt;h2&gt;
  
  
  What CleanCloud is NOT
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Not a cost optimization tool&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Doesn't access billing data&lt;/li&gt;
&lt;li&gt;Doesn't recommend rightsizing&lt;/li&gt;
&lt;li&gt;Focuses on hygiene, not savings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not a FinOps platform&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No dashboards&lt;/li&gt;
&lt;li&gt;No cost tracking&lt;/li&gt;
&lt;li&gt;Just clean signals&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not an auto-remediation service&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Will never delete anything&lt;/li&gt;
&lt;li&gt;Will never modify resources&lt;/li&gt;
&lt;li&gt;Will never tag resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a &lt;strong&gt;strategic design choice&lt;/strong&gt;, not a limitation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Privacy &amp;amp; Telemetry
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;CleanCloud collects zero telemetry.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No analytics. No tracking. No phone-home.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why?&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Security tools shouldn't send data anywhere&lt;/li&gt;
&lt;li&gt;Works in air-gapped environments&lt;/li&gt;
&lt;li&gt;No opt-out flags needed&lt;/li&gt;
&lt;li&gt;Zero risk of leaking account info&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We improve based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub issues&lt;/li&gt;
&lt;li&gt;Direct feedback&lt;/li&gt;
&lt;li&gt;Community contributions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;v0.3.1&lt;/strong&gt; just shipped with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complete documentation overhaul&lt;/li&gt;
&lt;li&gt;Smarter AWS region auto-detection&lt;/li&gt;
&lt;li&gt;Enhanced diagnostics with security grading&lt;/li&gt;
&lt;li&gt;Fixed region detection bugs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Coming soon:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GCP support&lt;/li&gt;
&lt;li&gt;Additional rules (unused Elastic IPs, old AMIs)&lt;/li&gt;
&lt;li&gt;Rule filtering (&lt;code&gt;--rules&lt;/code&gt; flag)&lt;/li&gt;
&lt;li&gt;Historical tracking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not planned:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automated cleanup&lt;/li&gt;
&lt;li&gt;Cost optimization&lt;/li&gt;
&lt;li&gt;Billing data access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CleanCloud will remain focused on &lt;strong&gt;safe hygiene detection&lt;/strong&gt;, not automation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Design Philosophy
&lt;/h2&gt;

&lt;p&gt;Three core principles:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Conservative by Default
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Age-based confidence thresholds&lt;/li&gt;
&lt;li&gt;Multiple signals required&lt;/li&gt;
&lt;li&gt;Prefer false negatives over false positives&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Read-Only Always
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;No Delete* permissions&lt;/li&gt;
&lt;li&gt;No Tag* permissions
&lt;/li&gt;
&lt;li&gt;No modification APIs&lt;/li&gt;
&lt;li&gt;Safe for production&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Review-Only Recommendations
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Findings are candidates for review, not automated action&lt;/li&gt;
&lt;li&gt;Clear reasoning for each finding&lt;/li&gt;
&lt;li&gt;Humans stay in control&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Who Is This For?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Primary users:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SRE teams&lt;/li&gt;
&lt;li&gt;Platform engineers&lt;/li&gt;
&lt;li&gt;Infrastructure teams&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Stakeholders:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Security (read-only = passes security reviews)&lt;/li&gt;
&lt;li&gt;Compliance (SOC2/ISO27001 friendly)&lt;/li&gt;
&lt;li&gt;FinOps (identifies waste without aggressive optimization)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Not for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Teams wanting auto-cleanup&lt;/li&gt;
&lt;li&gt;Cost optimization as primary goal&lt;/li&gt;
&lt;li&gt;Aggressive savings recommendations&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Real Talk: Why I Built This
&lt;/h2&gt;

&lt;p&gt;I've seen too many "smart" automation tools cause outages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto-scaler that scaled to zero during a traffic spike&lt;/li&gt;
&lt;li&gt;Cleanup tool that deleted "unused" security groups (broke production)&lt;/li&gt;
&lt;li&gt;Cost optimizer that downsized a database (performance disaster)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The pattern:&lt;/strong&gt; Automation is confident. Humans are cautious. Production requires caution.&lt;/p&gt;

&lt;p&gt;CleanCloud is designed for &lt;strong&gt;teams who value trust over automation&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/cleancloud-io/cleancloud" rel="noopener noreferrer"&gt;https://github.com/cleancloud-io/cleancloud&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Install:&lt;/strong&gt; &lt;code&gt;pip install cleancloud&lt;/code&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Docs:&lt;/strong&gt; Complete setup guides for AWS and Azure&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Looking for feedback:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What cloud hygiene tools do you currently use?&lt;/li&gt;
&lt;li&gt;Would read-only signals be useful for your team?&lt;/li&gt;
&lt;li&gt;What features would make this production-ready for you?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Open source, MIT license. Contributions welcome!&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;If you found this useful:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;⭐ Star the repo&lt;/li&gt;
&lt;li&gt;💬 Share your cloud hygiene horror stories in the comments&lt;/li&gt;
&lt;li&gt;🐛 Report issues or suggest features&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Built for SRE teams who value trust over automation.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>aws</category>
      <category>azure</category>
      <category>devops</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
