Pascal CESCATO
Why Streamlit + Cloud Run is a Billing Trap (and How I Fixed It)

The Initial Shock: When Your Demo Becomes a Target

On January 4th, 2026, I deployed my "Knowledge Graph CV" app on Cloud Run. Just a small demo for the New Year, New You Portfolio Challenge on Dev.to, which I describe in the article Beyond the Linear CV. Nothing too complicated. A CV transformed into an interactive graph via Gemini AI, some Plotly visualizations, all wrapped in a nice Streamlit interface.

I thought: "Estimated budget: €5-8/month. Should be fine." I set the budget limit at €10, to keep a small safety margin.

Less than 72 hours later, I received a Google Cloud notification. My usage was already at €5.20.

In the Google console: Projected cost: €28/month.

The next day: €35.

The day after: €42.

Something was attacking my app in real time.


The Painful Statistics

Analysis of the Cloud Run logs (first 6 days):

| Origin | Requests | Type | Behavior |
| --- | --- | --- | --- |
| 🇵🇱 Poland | 975/day | WebSocket | 30s connections each |
| 🇺🇸 USA (Comcast IPv6) | 478/day | WebSocket | 60s timeout |
| 🇻🇳 Vietnam | 476/day | WebSocket | Keeps reconnecting |
| Total bots | 1929/day | 101 (WebSocket) | 16.7h CPU/day |

Real cost: €0.40/day = €12/month for CPU alone.

Add RAM (368Mi), I/O, network... and you easily reach €25-30/month.

For a side project. That shouldn't exceed €10.

I had two options:

  1. shut down the app
  2. understand what was happening

I didn't want to close my app. I needed to understand. And fast.


The Culprit: Streamlit and Its Immortal WebSockets

Streamlit is great for building interactive dashboards. The catch? Everything relies on a persistent WebSocket connection.

Here's what happens when a bot arrives:

1. Bot hits my URL
2. Streamlit opens a WebSocket (Status 101)
3. Python starts, loads libs (pandas, plotly, gemini...)
4. The bot... never closes the connection
5. Cloud Run bills until timeout (30-60s)
6. The bot... immediately reopens a new connection

Result: one bot = 30s of CPU billed. 50 bots/hour = 25 minutes of CPU. 1200 bots/day = 10 hours of CPU.

At €0.024/CPU-hour, that's €9/month just for the bots.
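The multiplication above is worth making explicit. Here's a back-of-the-envelope sketch of the cost model (the CPU rate and connection time are this article's estimates, not official Cloud Run pricing):

```python
# Rough cost model for bots holding Streamlit WebSockets open on Cloud Run.
# Figures are this article's estimates, not official pricing.
CPU_EUR_PER_HOUR = 0.024  # estimated billed rate per vCPU-hour
HOLD_SECONDS = 30         # each bot keeps its WebSocket open ~30s

def daily_cpu_hours(bots_per_day: int) -> float:
    """CPU-hours billed per day just to hold idle bot connections."""
    return bots_per_day * HOLD_SECONDS / 3600

hours = daily_cpu_hours(1200)
print(f"{hours:.0f} CPU-hours/day -> ~{hours * CPU_EUR_PER_HOUR:.2f} EUR/day")
# prints "10 CPU-hours/day -> ~0.24 EUR/day"
```

Ten CPU-hours a day of pure idle connections, for an app that does nothing for those visitors.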

And that's when I understood the problem:

If your Python code sees the request, it's already too late. You've already paid.


The False Leads (Or How I Wasted 3 Days)

Attempt 1: Filter in Python

import streamlit as st
import os

def block_bots():
    # First X-Forwarded-For entry is the client's real IP on Cloud Run
    ip = st.context.headers.get("X-Forwarded-For", "").split(",")[0].strip()
    if ip.startswith("185.136.92."):
        os._exit(0)  # Brutal process kill

Result: ❌ Complete failure.

Why? Because the bot hits /_stcore/stream (Streamlit's WebSocket endpoint). Python only executes after the WebSocket is established.

The 30 seconds are already billed.


Attempt 2: Password Protection

Idea: put a password page before the app.

if "authenticated" not in st.session_state:  # init on first run
    st.session_state.authenticated = False

if not st.session_state.authenticated:
    password = st.text_input("Password:", type="password")
    if password == os.getenv("DEMO_PASSWORD"):
        st.session_state.authenticated = True
        st.rerun()
    st.stop()  # block the rest of the app until authenticated

Result: ❌ Partial failure.

The bot stays blocked on the password page... but the WebSocket stays open for 30s.

Cost: still €9/month.


Attempt 3: Reduce Timeout to 15s

gcloud run deploy --timeout=15s

Result: ⚠️ Works but...

Each bot now costs 15s instead of 30s (a 50% saving), BUT the app becomes unusable for real users: CV analyses take 15-20s.

Not acceptable.


Attempt 4: Consult 3 Different AIs

I asked for help from:

  • ChatGPT (rate limiting, environment variables, IP filtering)
  • Gemini (monitoring, memory optimizations)
  • Claude (architecture, debugging)

Result: Each gave me valuable leads, but none found THE solution.

Why? Because we were in an off-the-beaten-path case. AIs suggest standard solutions. Here, I needed to improvise.


The Breakthrough: "Block BEFORE Python"

Then, reading my logs for the umpteenth time, I noticed something:

{
  "remote_ip": "169.254.169.126",        // Cloud Run internal IP
  "X-Forwarded-For": "185.136.92.136",  // Bot's real IP
  "status": 101,
  "latency": "30.003234081s",
  "uri": "/_stcore/stream"               // Direct WebSocket!
}

The bot doesn't even go through the homepage. It hits the WebSocket directly.

Conclusion:

An application firewall is too late. You need a bodyguard IN FRONT of Streamlit.

A reverse proxy.


Why Caddy (And Not NGINX)

I already use Caddy for other projects (as a reverse proxy in front of PostgreSQL in Docker). I know how light and simple it is.

NGINX? Too heavy for a Cloud Run container:

  • Image ~50 MB (vs ~15 MB for Caddy)
  • Verbose configuration
  • Additional modules needed
  • RAM: ~15-20 MB (vs ~5-10 MB Caddy)

In a 368Mi container, every byte counts.

Caddy is a sniper. NGINX is a tank.

I needed a sniper.


The Final Architecture: The "Thermal Shield"

BEFORE (€42/month): [architecture diagram]

AFTER (€1.59/month): [architecture diagram]

Architecture breakdown: [diagram]


The Code: 3 Files, ~60 Lines

1. Caddyfile (~35 lines)

{
    admin off
    auto_https off
    servers {
        trusted_proxies static 169.254.0.0/16
    }
}

:8080 {
    # Grouped matcher for all banned IPs
    @denylist {
        header X-Forwarded-For *185.136.92.*
        header X-Forwarded-For *115.98.235.*
        header X-Forwarded-For *119.111.248.*
        header X-Forwarded-For *115.96.83.*
        header X-Forwarded-For *2601:600:cb80:*
        header X-Forwarded-For *57.151.128.*
        header X-Forwarded-For *2402:3a80:*
    }

    # Respond 403 to any of these matches
    handle @denylist {
        respond "Access Denied" 403
    }

    # Everything else goes to Streamlit
    handle {
        reverse_proxy localhost:8501 {
            header_up X-Real-IP {http.request.header.X-Forwarded-For}
            header_up X-Forwarded-For {http.request.header.X-Forwarded-For}
        }
    }

    log {
        output stdout
        format console
    }
}

Key config points:

  1. trusted_proxies static 169.254.0.0/16: Tells Caddy to trust Cloud Run's internal proxy (which always has an IP in 169.254.x.x). Without this, Caddy doesn't read X-Forwarded-For correctly.

  2. @denylist: Grouped matcher - if any of the patterns match, we block.

  3. header X-Forwarded-For *185.136.92.*: Simple wildcard - blocks all IPs starting with 185.136.92 (the entire /24 range).

Principle: Caddy reads the X-Forwarded-For header (where Cloud Run puts the real IP), compares it to the blacklist, and blocks BEFORE Python starts.
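To make the matcher's behavior concrete, here's the same check simulated in Python: shell-style wildcards against the raw X-Forwarded-For value, whose first entry is the client's real IP. This is an illustration of the matching logic, not Caddy's actual code:

```python
from fnmatch import fnmatch

# Patterns mirror the Caddyfile @denylist (wildcards, not regexes)
DENYLIST = ["*185.136.92.*", "*115.98.235.*", "*2601:600:cb80:*"]

def is_blocked(x_forwarded_for: str) -> bool:
    """Simulate Caddy's header matcher on the raw X-Forwarded-For value."""
    return any(fnmatch(x_forwarded_for, pattern) for pattern in DENYLIST)

print(is_blocked("185.136.92.136, 169.254.1.1"))  # True  -> 403, Python never starts
print(is_blocked("203.0.113.7, 169.254.1.1"))     # False -> proxied to Streamlit
```

Because the check happens in Caddy, a match costs a few milliseconds of CPU instead of a 30-second billed WebSocket.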


2. Dockerfile (20 lines)

FROM python:3.11-slim
WORKDIR /app

# Install Caddy (single binary, 15MB)
ADD https://caddyserver.com/api/download?os=linux&arch=amd64 /usr/bin/caddy
RUN chmod +x /usr/bin/caddy

# Install Python dependencies
COPY --from=ghcr.io/astral-sh/uv:latest /uv /bin/uv
COPY pyproject.toml .
COPY . .
RUN uv pip install --system -e .

# Config & startup
COPY Caddyfile start.sh ./
RUN chmod +x start.sh

EXPOSE 8080
CMD ["./start.sh"]

3. start.sh (5 lines)

#!/bin/bash
# Launch Streamlit in background (internal only)
streamlit run app.py --server.port=8501 --server.address=127.0.0.1 &

# Launch Caddy in foreground (public-facing)
caddy run --config /app/Caddyfile

That's it.

Two processes in one container. Caddy in front, Streamlit behind.


The Numbers: Crushing Victory

Results after 1 hour of production (heavy load):

| Type | Requests | Action | CPU | Cost/month |
| --- | --- | --- | --- | --- |
| Bots | 904 | 403 (blocked) | 2.17s | €0.01 |
| Humans | 22 | 101 (passed) | 328.68s | €1.58 |
| TOTAL | 926 | Mixed | 330.85s | €1.59 |

Cost evolution (projected):

D+1: Deployment (€1)
D+2: Bots discover the app (€5 → €28 projected)
D+3-4: Escalation (€42 at peak)
D+5: Caddy v1 partial (€25)
D+6: Caddy v2 finalized (€1.59) ✅

Reduction: -96.2% (€42 → €1.59)


What This War Taught Me

1. The Real Cloud Cost Isn't AI

Gemini API? €0.05/day.

Bots camping on WebSockets? A fortune.

2. WebSockets Are the Achilles' Heel of Serverless

A classic HTTP connection: <1s billed.

A lingering WebSocket: 30s billed.

Multiply by 1000 bots, and you understand the problem.

3. IPv6 Makes Blacklists Obsolete

A bot in IPv4: 185.136.92.136

The same bot in IPv6: 2601:600:cb80:fba0:48a9:f979:23ca:193b

Good luck blacklisting all variations.
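The practical workaround is to block prefixes instead of individual addresses. Python's ipaddress module shows the idea; the /24 and /48 below are assumed prefixes for the bots seen in the logs, not confirmed allocations:

```python
import ipaddress

# Block whole prefixes instead of chasing individual addresses.
# These prefixes are assumptions based on the IPs seen in the logs.
BLOCKED_NETS = [
    ipaddress.ip_network("185.136.92.0/24"),
    ipaddress.ip_network("2601:600:cb80::/48"),
]

def is_banned(ip: str) -> bool:
    """True if the IP falls inside any blocked prefix (v4 or v6)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BLOCKED_NETS)

print(is_banned("185.136.92.136"))                          # True
print(is_banned("2601:600:cb80:fba0:48a9:f979:23ca:193b"))  # True
print(is_banned("8.8.8.8"))                                 # False
```

One /48 entry covers every IPv6 address the bot can rotate through inside its allocation, which is exactly what the `*2601:600:cb80:*` wildcard in the Caddyfile approximates.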

4. A Reverse Proxy = Best Economic Defense

Cloud Armor (Google WAF)? $7/month.

HTTPS Load Balancer? €18/month.

Caddy in the container? €0.

5. AIs Help, But Don't Find Everything

ChatGPT, Gemini, Claude all helped me explore. But the final solution? Me, at 2 AM, reading logs.

6. You Learn More in Prod Than in Tutorials

No Udemy course would have taught me:

  • How to read Cloud Run logs
  • Why X-Forwarded-For vs remote_ip
  • How Streamlit handles WebSockets
  • Why a reverse proxy saves at least €25/month

Production > Tutorials. Always.


The Bottom Line

I thought I was building an AI project for a Dev.to challenge.

I ended up at war with an international botnet.

Result:

✅ Functional and public app
✅ Controlled costs (€1.59/month projected)
✅ 900+ bots blocked/day
✅ A story to tell

And most importantly: I learned more in 6 days of debugging than in 6 months of tutorials.

If you're launching a serverless app exposed to the public, put a reverse proxy in front. Not for security (though...), but for your wallet.


Running Streamlit on Cloud Run? Check your billing NOW. Drop your monthly cost in comments - let's compare notes 👇


Resources & Code

Complete source code: GitHub - knowledge-graph-cv
Detailed technical article: Beyond the Linear CV
