How We Self-Host an AI Image Platform on 7 RTX 5090s (2026 Cost Breakdown)
I run infrastructure for ZSky AI. We serve 35,000+ creators on 7 privately-owned NVIDIA RTX 5090 GPUs in a basement in Florida. This is the unit economics writeup.
Every indie AI founder building a generative product faces the same decision early: rent from cloud, or own metal. Most people rent. They're probably wrong. Here's the math.
The hardware
- 7x NVIDIA RTX 5090 (32 GB VRAM each, Blackwell architecture, released January 2025)
- 224 GB total VRAM in the cluster
- 32-core / 64-thread CPU (AMD Threadripper 7970X)
- 256 GB system RAM
- 2x 8 TB NVMe (primary) + 30 TB spinning rust (archive)
- 2000W PSU per GPU node
- Custom liquid cooling on the 5090s (factory air cooling runs hot under sustained load)
- 2.5 Gbit home symmetric fiber connection
- UPS + second-line generator backup
Total all-in hardware cost, April 2026:
- 7x RTX 5090 @ $3,500 street price = $24,500
- Threadripper + motherboard + RAM + storage + PSUs + cooling = $8,000
- Racks, UPS, network gear = $2,500
- Total: ~$35,000 amortizable over 3-4 years
At 3-year amortization: $972/month straight-line hardware cost.
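The amortization math above, as a quick sketch (all figures are the post's own):

```python
# Hardware cost breakdown from the build list above.
gpus = 7 * 3_500       # 7x RTX 5090 at street price
platform = 8_000       # Threadripper, board, RAM, storage, PSUs, cooling
infra = 2_500          # racks, UPS, network gear

total = gpus + platform + infra   # all-in hardware cost
monthly_3yr = total / 36          # straight-line over 3 years

print(f"total: ${total:,}")                   # total: $35,000
print(f"monthly (3-yr): ${monthly_3yr:,.0f}") # monthly (3-yr): $972
```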
The cloud alternative
Same capacity rented from the leading providers in April 2026:
| Provider | 5090-equivalent / hr | Monthly (24/7) | Notes |
|---|---|---|---|
| Lambda Labs H100 PCIe | ~$2.29/hr | ~$1,649 | 1 GPU only |
| RunPod H100 SXM | ~$2.99/hr | ~$2,153 | 1 GPU only |
| AWS p5e.48xlarge | Not 5090-eq, varies | $5,000-7,000+ | Spot pricing fluctuates wildly |
| Paperspace H100 | ~$2.24/hr | ~$1,613 | Limited availability |
For 7 GPUs running 24/7 on any of these: $11,000-$15,000 per month. Plus egress bandwidth (which adds another $500-2,000/month depending on how much you serve).
So the math is:
- Self-hosted: $972/mo hardware + ~$300/mo electricity + ~$200/mo networking = $1,472/mo
- Cloud equivalent: ~$12,000/mo
That's an 8x difference. Over 3 years: ~$380,000 saved relative to renting.
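The rent-vs-own comparison, using the post's round numbers (the $12,000 cloud figure is the midpoint of the $11K-$15K range quoted above):

```python
# Monthly cost of owning vs renting, per the figures in this post.
own_monthly = 972 + 300 + 200   # hardware amortization + electricity + networking
cloud_monthly = 12_000          # midpoint of the quoted cloud range

ratio = cloud_monthly / own_monthly            # how many times cheaper owning is
saved_3yr = (cloud_monthly - own_monthly) * 36 # savings over the amortization window

print(f"{ratio:.1f}x cheaper")       # 8.2x cheaper
print(f"${saved_3yr:,} over 3 yrs")  # $379,008 over 3 yrs
```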
What cloud is actually good for
Before I sound like a hosting fundamentalist — cloud is the right choice when:
- You have no idea what your utilization will look like. If you might be at 10% one week and 110% the next, elastic rental is correct. Self-hosted only works if your average utilization is >60% of peak capacity.
- You need a specific region for latency. You can't physically place a server in Frankfurt and São Paulo and Tokyo all at once.
- Your team is remote and nobody wants to drive to a datacenter. This is real. On-call for metal is painful.
- You need enterprise compliance (SOC 2, HIPAA, FedRAMP). Cloud providers have already done the compliance work you haven't.
If none of those apply — and for most indie AI tools they don't — self-hosted wins.
The gotchas I wish somebody had told me
1. Electricity pricing matters more than hardware pricing.
My 7-GPU cluster draws ~3.2 kW under full load. At $0.13/kWh (Florida residential), that's ~$300/mo. At $0.35/kWh (California residential), it would be ~$806/mo. If I were in California I'd reconsider the whole thing — electricity alone would add roughly $500/mo and meaningfully erode the ownership advantage.
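The electricity sensitivity is easy to check yourself — plug in your own rate (the draw figure is the post's; assumes full load 24/7, 30-day month):

```python
# Monthly electricity cost for a cluster at a given residential rate.
def monthly_power_cost(kw_draw, rate_per_kwh, hours=24 * 30):
    """kW draw * hours * $/kWh = dollars per month."""
    return kw_draw * hours * rate_per_kwh

florida = monthly_power_cost(3.2, 0.13)     # ~$300/mo
california = monthly_power_cost(3.2, 0.35)  # ~$806/mo
```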
2. Cooling is the second-hardest problem after electricity.
Factory air coolers on the 5090s run the GPU junction temp above 85°C under sustained inference load. That's within spec but shortens lifetime. I retrofitted all 7 cards with custom liquid blocks. Cost: ~$400/card. Worth every dollar. Card temps now sit at 62-68°C under the same workload.
3. Your ISP's "symmetric fiber" probably isn't truly symmetric.
I had to switch from Xfinity (which throttles upload after 2 TB/day regardless of plan) to a local regional fiber ISP that guarantees symmetric 2.5 Gbit with no caps. Cost: +$30/mo. Worth it because ZSky's video delivery would otherwise be rate-limited.
4. Your insurance policy probably doesn't cover "business equipment running 24/7 in your residence."
I got a separate rider on my homeowner's policy that explicitly covers the GPU cluster as business equipment. Cost: +$22/mo. Without it, a fire or theft claim would be denied.
5. Backup power is not optional.
I have a 3000W UPS for clean switchover, backed by a 9000W natural gas generator for extended outages. Total backup power cost: ~$4,500. It's paid for itself twice already — one hurricane and one unplanned substation failure last year.
Why it works for ZSky specifically
The self-hosted math only closes if your utilization is high. Mine is, because ZSky is queue-based: at any given moment several requests are pending, so the GPUs are rarely idle. Our average utilization is 68%. Cloud providers bill for 100% of reserved capacity, so renting would mean paying for the 32% of idle time on top of the provider's margin.
For a tool where users log in sporadically and expect instant response (say, a chat product), self-hosted is harder. Cloud's elasticity matters more.
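To put a dollar figure on that idle-capacity point (using the post's $12K/mo cloud figure and 68% utilization):

```python
# Dollars per month billed for idle GPUs on fixed reserved cloud capacity.
cloud_monthly = 12_000  # reserved-capacity bill from the comparison above
utilization = 0.68      # ZSky's average utilization

wasted = cloud_monthly * (1 - utilization)  # billed but unused

print(f"${wasted:,.0f}/mo paid for idle capacity")  # $3,840/mo paid for idle capacity
```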
What I'd tell another founder
If your product is:
- Queue-tolerant (users accept some wait)
- Utilization-heavy (you're >50% capacity most hours)
- Cost-sensitive (you can't pass through $12K/mo to end users)
- Geography-flexible (you don't need multi-region)
Then self-hosted GPUs are the right answer in 2026. The hardware is cheaper than it's ever been. Used H100s are showing up on eBay at 40% of new. The 5090 is a monster for its price point. And the cloud providers' margins on GPU rental are frankly obscene.
If your product is the opposite — real-time, bursty, multi-region — rent and don't look back.
TL;DR
- Owning 7x RTX 5090 beats renting the equivalent by 8x per month
- But only if utilization > 50%, geography is flexible, and you can tolerate on-call
- Electricity pricing is more important than hardware pricing
- Custom liquid cooling is worth the $400/card
- Insurance rider + backup power are not optional
The right infrastructure choice depends on your workload shape, not on which is "trendy" right now. Run the math for your specific case.
— Cemhan
I run ZSky AI, a free AI image and video platform built on privately-owned hardware. I'm happy to answer questions about the GPU economics in the comments.