Everyone asks "how much does it cost to build an AI SaaS?" and gets vague answers like "it depends." So I built calculators for every layer of the stack and actually ran the numbers at three scales.
Here's the full breakdown for a typical AI SaaS — think a document Q&A tool, a customer support copilot, or an AI writing assistant.
The Stack
Every AI SaaS has roughly the same infrastructure layers:
LLM API — the brain (GPT-5.4, Claude Sonnet, Gemini Flash)
Vector Database — long-term memory (Pinecone, Qdrant, pgvector)
Hosting — where it runs (Hetzner, AWS, Vercel)
Auth — who can log in (Supabase Auth, Clerk, Auth0)
Payments — how you get paid (Stripe, Paddle, Lemon Squeezy)
Serverless — background jobs, webhooks, cron (Lambda, Cloudflare Workers)
Most cost guides only talk about the LLM layer. But I've seen startups where auth costs more than their AI budget, and others where the vector database quietly became their biggest line item.
Scale 1: Startup — 1,000 Users
Your first paying customers. Maybe $5K-10K MRR. You're optimizing for speed, not cost.
| Layer | Cheap Option | Cost | Premium Option | Cost |
|---|---|---|---|---|
| LLM API | GPT-5.4 nano | $15/mo | Claude Sonnet 4.6 | $180/mo |
| Vector DB | Qdrant self-hosted | $7/mo | Pinecone Serverless | $22/mo |
| Hosting | Hetzner CAX21 | $6/mo | AWS t3.small | $30/mo |
| Auth | Supabase Auth | $0/mo | Clerk | $0/mo (free tier) |
| Payments | Stripe | 2.9% | Paddle | 5% |
| Serverless | Cloudflare Workers | $0/mo | AWS Lambda | $0/mo |
Cheapest viable stack: ~$28/month
Premium stack: ~$232/month
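The two totals can be reproduced by summing the flat monthly fees in the table. The snippet below is a minimal sketch: the dictionary keys and function names are made up for illustration, and payments are left out because at this scale they are quoted as a percentage of revenue rather than a flat fee.

```python
# Flat monthly costs in USD, copied from the 1K-user table above.
# Payments are excluded: they are a percentage of revenue, not a flat fee.
CHEAP_1K = {
    "llm_api": 15,     # GPT-5.4 nano
    "vector_db": 7,    # Qdrant self-hosted
    "hosting": 6,      # Hetzner CAX21
    "auth": 0,         # Supabase Auth
    "serverless": 0,   # Cloudflare Workers
}
PREMIUM_1K = {
    "llm_api": 180,    # Claude Sonnet 4.6
    "vector_db": 22,   # Pinecone Serverless
    "hosting": 30,     # AWS t3.small
    "auth": 0,         # Clerk (free tier)
    "serverless": 0,   # AWS Lambda
}


def monthly_total(stack: dict[str, float]) -> float:
    """Sum the flat monthly cost of every layer in a stack."""
    return sum(stack.values())


def llm_share(stack: dict[str, float]) -> float:
    """Fraction of the flat monthly bill that goes to the LLM API."""
    return stack["llm_api"] / monthly_total(stack)


for name, stack in [("cheap", CHEAP_1K), ("premium", PREMIUM_1K)]:
    print(f"{name}: ${monthly_total(stack)}/month, LLM share {llm_share(stack):.0%}")
```

In both stacks the LLM API accounts for more than half of the flat bill, because everything else sits at or near a free tier.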
At 1K users, most services are within free tiers. The LLM API is your only real variable cost. If your users average 50 queries/day on GPT-5.4 nano, that's ~$15/month. With Sonnet, it's ~$180.
The 12x difference between nano and Sonnet sounds scary, but here's the thing: for most tasks (classification, extraction, simple Q&A), nano is good enough. Save Sonnet for the complex reasoning chains.
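One way to picture "nano by default, Sonnet for the hard stuff" is a small router plus a blended-cost estimate. This is only a sketch: the tier labels, the task-type strings, and the 10% premium share in the usage line are assumptions for illustration, and the estimate assumes cost scales linearly with the share of traffic sent to each tier.

```python
# Task types the paragraph above calls "good enough" for a nano-class model.
SIMPLE_TASKS = {"classification", "extraction", "simple_qa"}

# Hypothetical tier labels; map them to real model IDs in your own client code.
CHEAP_MODEL = "nano"
PREMIUM_MODEL = "sonnet"


def pick_model(task_type: str) -> str:
    """Route simple tasks to the cheap tier and everything else to the premium tier."""
    return CHEAP_MODEL if task_type in SIMPLE_TASKS else PREMIUM_MODEL


def blended_monthly_cost(premium_fraction: float,
                         cheap_cost: float = 15.0,
                         premium_cost: float = 180.0) -> float:
    """Estimate the monthly LLM bill when only part of the traffic uses the premium tier.

    Defaults are the 1K-user figures from the table above.
    """
    if not 0.0 <= premium_fraction <= 1.0:
        raise ValueError("premium_fraction must be between 0 and 1")
    return (1 - premium_fraction) * cheap_cost + premium_fraction * premium_cost


print(pick_model("extraction"))             # cheap tier
print(pick_model("multi_step_reasoning"))   # premium tier
print(round(blended_monthly_cost(0.10), 2))
```

Under those assumptions, sending one request in ten to the premium tier lands at roughly $31.50/month at 1K users, much closer to the nano figure than to the Sonnet one.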
Scale 2: Growth — 10,000 Users
Things get interesting here. Free tiers end, costs become real, and bad architecture decisions start hurting.
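| Layer | Cheap Option | Cost | Premium Option | Cost |
|---|---|---|---|---|
| LLM API | GPT-5.4 nano | $150/mo | Claude Sonnet 4.6 | $1,800/mo |
| Vector DB | Qdrant self-hosted | $36/mo | Pinecone | $210/mo |
| Hosting | Hetzner | $17/mo | AWS | $120/mo |
| Auth | Supabase Auth | $25/mo | Clerk | $275/mo |
| Payments | Stripe | ~$290/mo | Paddle | ~$500/mo |
| Serverless | CF Workers | $5/mo | Lambda | $45/mo |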
Cheapest viable: ~$523/month
Premium: ~$2,950/month
This is where auth pricing becomes a trap. Clerk at 10K users is $275/month. At 1K it was free. That's the steepest curve in the entire stack. If you started on Clerk's free tier thinking "I'll worry about cost later," later just arrived.
The LLM cost at this scale depends entirely on your caching strategy. If you're re-computing embeddings or re-running the same prompts, you're burning money. A Redis cache in front of your LLM calls can cut costs 30-50%.
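The caching idea can be sketched as a thin wrapper that hashes each request and only calls the model on a miss. Everything here is illustrative: `CachedLLM`, `fake_llm`, and the `(model, prompt)` call signature are hypothetical, and a plain dict stands in for Redis so the example runs without any external service.

```python
import hashlib
import json
from typing import Callable


class CachedLLM:
    """Wrap an LLM call so identical (model, prompt) pairs are only paid for once."""

    def __init__(self, call_llm: Callable[[str, str], str]):
        self._call_llm = call_llm         # hypothetical backend: (model, prompt) -> completion
        self._store: dict[str, str] = {}  # in-memory stand-in for Redis
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Hash the normalized request so keys stay short and uniform.
        payload = json.dumps({"model": model, "prompt": prompt.strip()}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._call_llm(model, prompt)  # the only place money is spent
        self._store[key] = result
        return result

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def fake_llm(model: str, prompt: str) -> str:
    """Fake backend so the sketch runs without network access."""
    return f"[{model}] answer to: {prompt}"


llm = CachedLLM(fake_llm)
for question in ["What is our refund policy?",
                 "What is our refund policy?",
                 "How do I reset my password?"]:
    llm.complete("nano", question)
print(f"hits={llm.hits} misses={llm.misses} hit_rate={llm.hit_rate():.0%}")
```

An exact-match cache like this only pays off when prompts really do repeat, so the savings track the hit rate. In production you would likely swap the dict for Redis and add an expiry so stale answers age out.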
Scale 3: Scaling — 100,000 Users
This is where architecture choices made at 1K users either pay off or blow up.
| Layer | Cheap Option | Cost | Premium Option | Cost |
|---|---|---|---|---|
| LLM API | GPT-5.4 nano | $1,500/mo | Claude Sonnet 4.6 | $18,000/mo |
| Vector DB | Qdrant self-hosted | $480/mo | Pinecone | $1,900/mo |
| Hosting | Hetzner cluster | $120/mo | AWS | $800/mo |
| Auth | Supabase Auth | $25/mo | Clerk | $1,825/mo |
| Payments | Stripe | ~$2,900/mo | Paddle | ~$5,000/mo |
| Serverless | CF Workers | $25/mo | Lambda | $300/mo |
Cheapest viable: ~$5,050/month
Premium: ~$27,825/month
The difference between cheap and premium is now $22,775/month, or about $273K/year. At this scale, every architecture decision has a five- or six-figure annual impact.
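To see how the gap grows, the stated totals for each scale can be annualized side by side. This is a small sketch using only the totals quoted above; the names are illustrative, and note that the 1K totals exclude payment fees while the 10K and 100K totals include them.

```python
# (cheap, premium) monthly totals in USD, as stated for each scale above.
TOTALS = {
    1_000: (28, 232),
    10_000: (523, 2_950),
    100_000: (5_050, 27_825),
}


def annual_gap(cheap: int, premium: int) -> int:
    """Yearly difference between the premium and cheap stacks."""
    return (premium - cheap) * 12


for users, (cheap, premium) in TOTALS.items():
    gap = premium - cheap
    print(f"{users:>7,} users: ${gap:,}/month, ${annual_gap(cheap, premium):,}/year")
```

The annual gap goes from four figures at 1K users to five at 10K and six at 100K.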
The wildest number: auth. Supabase Auth at 100K MAU is $25/month. Clerk is $1,825, a 73x difference for the same core feature: letting people log in. Auth0 would be $5,000+.
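Ranking every layer by its premium-to-cheap multiple shows why auth stands out. The sketch below uses the 100K-user table as its only input; the function and variable names are assumptions for illustration.

```python
# Monthly cost in USD at 100K users: layer -> (cheap option, premium option).
SCALE_100K = {
    "LLM API": (1_500, 18_000),
    "Vector DB": (480, 1_900),
    "Hosting": (120, 800),
    "Auth": (25, 1_825),
    "Payments": (2_900, 5_000),
    "Serverless": (25, 300),
}


def premium_multiples(table: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Return {layer: premium / cheap}, sorted from largest to smallest multiple."""
    ratios = {layer: premium / cheap for layer, (cheap, premium) in table.items()}
    return dict(sorted(ratios.items(), key=lambda item: item[1], reverse=True))


for layer, multiple in premium_multiples(SCALE_100K).items():
    print(f"{layer:<10} {multiple:5.1f}x")
```

Auth tops the list at 73x, well ahead of the 12x for the LLM API and serverless, while payments sit at the bottom at under 2x.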
What I Learned Building These Calculators
- LLM costs are overestimated. Everyone worries about the AI bill, but at startup scale it's small in absolute terms, even when it's the largest line in a cheap stack. A well-architected app with caching and nano-class models runs for $15-50/month at 1K users.
- Auth costs are underestimated. Clerk and Auth0 have aggressive pricing curves that feel invisible at small scale and devastating at medium scale. Check the pricing page before you npm install.
- Self-hosting saves 70-80% on vector databases. Qdrant on a Hetzner box vs Pinecone managed: the performance is identical, and in the tables above the cost is roughly 3-6x lower. The trade-off is operational overhead, which is real but manageable if you know Docker.
- Payment processor choice is permanent. Migrating from Stripe to Paddle means re-integrating billing for every customer. Choose once, choose carefully. The Stripe vs Paddle decision isn't about 2.9% vs 5% — it's about whether you want to handle global tax compliance yourself.
- Serverless is effectively free at startup scale. Cloudflare Workers gives you 10M requests/month free. Lambda gives you 1M. Don't spin up dedicated servers for background jobs until you actually need to.
Run Your Own Numbers
Every SaaS has different usage patterns. I built free calculators for each layer:
LLM API Cost Calculator
Vector Database Cost Calculator
Cloud VPS Comparison
Auth Provider Cost Calculator
Payment Processor Fees
Serverless Cost Calculator
No signup, everything runs in your browser, and the open-source pricing data is updated monthly.
What does your AI SaaS stack look like, and what's your biggest cost surprise been? I'm especially curious about anyone running at 50K+ users — does the math hold up?