Supabase Edge Functions + AI: Real Cost Breakdown and Optimization Patterns

"What Does It Actually Cost to Run AI Through Edge Functions?"

Jibun Kaisha routes all AI API calls through Supabase Edge Functions (Deno). When I started building this architecture, the cost picture was completely opaque. Here's what twelve months of real usage revealed.


The Architecture

Flutter Web
  └→ Supabase Edge Function (Deno)
        ├→ Claude API (Anthropic)
        ├→ Gemini API (Google)
        └→ OpenAI API

Why not call APIs directly from the frontend? API keys stay server-side, and Row Level Security lets us enforce per-user rate limits without custom auth logic.
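
As a sketch of that shape (assuming the standard supabase-js and Anthropic SDK clients; the auth wiring below is illustrative, not the app's actual code):

// supabase/functions/ai-assistant/index.ts (illustrative sketch)
import Anthropic from "npm:@anthropic-ai/sdk";
import { createClient } from "npm:@supabase/supabase-js@2";

Deno.serve(async (req) => {
  // The Anthropic key lives only in the function's secrets,
  // never in the Flutter bundle shipped to browsers.
  const anthropic = new Anthropic({
    apiKey: Deno.env.get("ANTHROPIC_API_KEY")!,
  });

  // Forward the caller's JWT so RLS applies to every query.
  const authHeader = req.headers.get("Authorization") ?? "";
  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_ANON_KEY")!,
    { global: { headers: { Authorization: authHeader } } },
  );
  const { data: { user } } = await supabase.auth.getUser(
    authHeader.replace("Bearer ", ""),
  );
  if (!user) return new Response("Unauthorized", { status: 401 });

  const { prompt } = await req.json();
  const message = await anthropic.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    messages: [{ role: "user", content: prompt }],
  });
  return Response.json({ result: message.content });
});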


Real Monthly Cost Breakdown

Supabase (Edge Functions)

Plan | Monthly | Edge Function Calls | Overage
---- | ------- | ------------------- | -------
Free | $0 | 500,000 | Stops
Pro | $25 | 2,000,000 | $2/1M

Actual usage: Pro plan. ~150,000 Edge Function calls per month. Well within limits.

AI API Costs

API | Monthly Tokens | Monthly Cost
--- | -------------- | ------------
Claude Sonnet 4.6 | 8M input + 1.2M output | ~$30
Gemini 1.5 Flash | 12M input + 2M output | ~$4
OpenAI GPT-4o mini | 3M input + 0.5M output | ~$2

Total: Supabase $25 + AI APIs $36 ≈ $61/month


Edge Functions Runtime Costs

Cold Start Latency

Deno Edge Functions cold-start in 200–500ms. Since AI API calls already take 1–5 seconds, the user experience impact is minimal — but infrequently-called functions will cold-start on every request.

Fix: Warm up key functions with a scheduled GitHub Actions workflow (the secret names below are placeholders for your own):

# .github/workflows/keep-warm.yml
on:
  schedule:
    - cron: "*/10 * * * *" # every 10 minutes
jobs:
  warm:
    runs-on: ubuntu-latest
    env:
      PROJECT_REF: ${{ secrets.SUPABASE_PROJECT_REF }}
      ANON_KEY: ${{ secrets.SUPABASE_ANON_KEY }}
    steps:
      - name: Warm up AI functions
        run: |
          curl -s "https://${PROJECT_REF}.supabase.co/functions/v1/ai-assistant?ping=1" \
            -H "Authorization: Bearer $ANON_KEY" &
          curl -s "https://${PROJECT_REF}.supabase.co/functions/v1/daily-judgment?ping=1" \
            -H "Authorization: Bearer $ANON_KEY" &
          wait
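
For the warmup to stay cheap, each function should short-circuit ping requests before touching auth or any AI client. A minimal sketch, assuming the ?ping=1 convention from the workflow above:

// At the very top of the function handler: answer warmup pings
// immediately, before any auth check or AI API call runs.
const url = new URL(req.url);
if (url.searchParams.get("ping") === "1") {
  return new Response("ok", { status: 200 });
}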

Timeout Limit

Default timeout is 150 seconds. Long Claude outputs can breach this.

Fix: Use streaming responses, so tokens are forwarded to the client as they are generated instead of buffering the whole completion inside the function:

import Anthropic from "npm:@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: Deno.env.get("ANTHROPIC_API_KEY")! });
const corsHeaders = { "Access-Control-Allow-Origin": "*" };

// stream() returns immediately; the response body is fed token by
// token, so the function never sits on a multi-minute completion.
const stream = anthropic.messages.stream({
  model: "claude-sonnet-4-6",
  max_tokens: 2048,
  messages: [{ role: "user", content: prompt }],
});

return new Response(stream.toReadableStream(), {
  headers: {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    ...corsHeaders,
  },
});

Cost Optimization Patterns

1. Circuit Breaker

When an AI API starts returning errors, stop sending requests immediately.

// Check the breaker state before calling the provider.
const { data: breaker } = await supabase
  .from("ai_circuit_breaker")
  .select("state, expires_at")
  .eq("provider", "anthropic")
  .single();

// While the breaker is open and not yet expired, fail fast
// instead of paying for requests that are likely to error out.
if (breaker?.state === "open" && new Date(breaker.expires_at) > new Date()) {
  return json({ error: "AI service temporarily unavailable" }, 503);
}
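
The other half of the pattern is tripping the breaker when calls fail. A sketch of the failure path; the failure_count column, the threshold of 5, and the 5-minute cool-down are illustrative choices, not the production values:

// Call after a failed AI request. Opens the breaker once a run of
// consecutive failures crosses the threshold (illustrative values).
async function recordFailure(provider: string) {
  const { data: breaker } = await supabase
    .from("ai_circuit_breaker")
    .select("failure_count")
    .eq("provider", provider)
    .single();

  const failures = (breaker?.failure_count ?? 0) + 1;
  await supabase.from("ai_circuit_breaker").upsert({
    provider, // assumes provider is the table's primary key
    failure_count: failures,
    state: failures >= 5 ? "open" : "closed",
    expires_at: new Date(Date.now() + 5 * 60_000).toISOString(), // 5-min cool-down
  });
}

A successful call should reset failure_count to 0 so the breaker only opens on consecutive errors.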

2. Response Caching

Cache identical prompts in Supabase DB to avoid redundant API calls:

import { createHash } from "node:crypto";

// Hash the prompt to get a stable, fixed-length cache key.
const cacheKey = createHash("sha256").update(prompt).digest("hex");

const { data: cached } = await supabase
  .from("ai_cache")
  .select("response")
  .eq("key", cacheKey)
  .gt("expires_at", new Date().toISOString())
  .single();

if (cached) return json({ result: cached.response, cached: true });
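
The write side runs after a cache miss. In this sketch, callAiApi stands in for whichever provider call applies, and the 24-hour TTL is an arbitrary choice:

// On a miss: call the provider, then store the response with an
// expiry so stale answers age out (callAiApi is a placeholder).
const result = await callAiApi(prompt);
await supabase.from("ai_cache").upsert({
  key: cacheKey,
  response: result,
  expires_at: new Date(Date.now() + 24 * 60 * 60 * 1000).toISOString(), // 24h TTL
});
return json({ result, cached: false });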

Real result: Daily report functions hit ~40% cache rate — multiple users requesting the same data on the same day.

3. Model Tiering

Match model cost to task complexity:

Task | Model | Reason
---- | ----- | ------
Complex reasoning / strategy | claude-sonnet-4-6 | Accuracy-first
Templated report generation | gemini-1.5-flash | Cost-first
Tagging / classification | gpt-4o-mini | Speed + cost
Batch processing | claude-haiku-4-5 | High-volume
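
In code this can be a plain lookup keyed on task type. A minimal sketch mirroring the table above; the TaskType union is an assumption, not the app's actual type:

type TaskType = "reasoning" | "report" | "classification" | "batch";

// Map each task category to the cheapest model that handles it well.
const MODEL_FOR_TASK: Record<TaskType, string> = {
  reasoning: "claude-sonnet-4-6", // accuracy-first
  report: "gemini-1.5-flash", // cost-first
  classification: "gpt-4o-mini", // speed + cost
  batch: "claude-haiku-4-5", // high-volume
};

const model = MODEL_FOR_TASK["report"]; // "gemini-1.5-flash"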

4. Prompt Caching (Anthropic)

Anthropic's Prompt Caching discounts repeated system-prompt tokens by 90% on cache reads (cache writes cost about 25% extra, and the ephemeral cache expires after roughly five minutes of inactivity):

const response = await anthropic.messages.create({
  model: "claude-sonnet-4-6",
  system: [
    {
      type: "text",
      text: LONG_SYSTEM_PROMPT, // 2000+ tokens; Sonnet models cache prompts of 1024+ tokens
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: userMessage }],
});

Real result: The cs-check function (automated customer support replies) has a 3,000-token system prompt. After enabling caching: 63% reduction in input costs.


Edge Function Cost Attribution

Edge Function | Monthly Calls | AI Model | Est. Monthly Cost
------------- | ------------- | -------- | -----------------
ai-assistant | 1,200 | claude-sonnet-4-6 | $8
daily-judgment | 30 | claude-sonnet-4-6 | $4
cs-check (GHA) | 60 | claude-sonnet-4-6 + cache | $3
ai-university-update | 1,440 | gemini-1.5-flash | $2
get-home-dashboard | 8,000 | (no AI) | $0
Others | 140,000 | (no AI) | $0
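
Numbers like these are easy to gather if each call logs its token usage. A sketch assuming a hypothetical ai_usage table; the Anthropic response exposes usage.input_tokens and usage.output_tokens:

// After each AI call: record per-function token counts so monthly
// spend can be attributed later (ai_usage is a hypothetical table).
await supabase.from("ai_usage").insert({
  function_name: "ai-assistant",
  model: "claude-sonnet-4-6",
  input_tokens: message.usage.input_tokens,
  output_tokens: message.usage.output_tokens,
});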

Summary: What to Expect

  • $60–80/month covers Supabase Pro + Claude/Gemini/OpenAI for a solo project
  • Circuit breaker + caching + Prompt Caching can cut costs by 40–60%
  • The real cost driver is AI API usage, not Supabase itself
  • Cold starts are manageable with scheduled warmup requests

Supabase Edge Functions are production-ready for AI backends. If you need API key protection and per-user rate limiting without running your own server, this architecture is worth the setup cost.


Related Posts


Jibun Kaisha — integrating the best of 21 competitors into one life management app

Live: https://my-web-app-b67f4.web.app/
