DEV Community

Om Prakash
I turned my idle GPUs into an AI API that undercuts cloud pricing by 2-20x

I've been running GPU hardware for a rental business for years. When demand slowed down, I had four NVIDIA RTX cards sitting mostly idle — a 6000 Ada 48GB, two 4070s, and a 3060.

Instead of selling them off, I decided to build something with them.

The idea

Every AI API I looked at charges a premium because they're running on cloud GPUs. AWS, GCP, Lambda — the GPU rental markup is insane. A single A100 hour can cost $3-4, and that cost gets passed to developers.

I already own the hardware. My electricity is cheap. So what if I just... offered the same models at near-cost pricing?

That's PixelAPI. Thirteen AI models behind a single REST API, running on my own metal.

What it does

The API covers the bread-and-butter stuff most apps actually need:

  • Background removal — for product photos, profile pics, e-commerce
  • Image generation — FLUX Schnell and SDXL
  • Video generation — WAN 2.1 14B (text-to-video, just launched)
  • Upscaling — Real-ESRGAN 4x, useful for low-res assets
  • Face restoration — GFPGAN for old/damaged photos
  • Object & text removal — LaMa-based inpainting
  • Music generation — MusicGen (because why not)
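As a rough sketch, a call to one of these endpoints might look like the following. The base URL, endpoint path, auth header, and request/response fields are illustrative assumptions, not the documented API — check the real docs for the actual shapes:

```python
import json
import urllib.request

BASE_URL = "https://api.pixelapi.example"  # illustrative, not the real endpoint

def remove_background(image_url: str, api_key: str) -> dict:
    """Request a background removal for an image at a URL.

    The path, auth header, and payload shape are assumptions for illustration.
    """
    req = urllib.request.Request(
        f"{BASE_URL}/v1/remove-background",
        data=json.dumps({"image_url": image_url}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)

# Usage (hits the network, so commented out here):
# result = remove_background("https://example.com/product.jpg", "your-api-key")
```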

The pricing thing

Here's where it gets interesting. Because I'm not paying cloud rates, I can price aggressively:

| What | PixelAPI | What others charge |
| --- | --- | --- |
| Remove a background | $0.001 | $0.02 – $0.20 |
| Generate an image | $0.001 | $0.003 – $0.05 |
| Generate video (per sec) | $0.017 | $0.035 – $0.05 |
| Upscale 4x | $0.002 | $0.01+ |

Is it sustainable? At scale — yes. The GPUs are paid off. Electricity in India is cheap. My margins are thin but the hardware is mine, so there's no monthly cloud bill eating into revenue.

Technical details

The stack is straightforward:

  • Gateway: FastAPI on a dedicated server, handles auth/billing/routing
  • Workers: Custom Python workers on each GPU machine, pulling jobs from Redis queues
  • Video: ComfyUI with WAN 2.1 on the 48GB card (needs the VRAM)
  • Storage: S3-compatible for outputs
  • SDKs: Python (pip install pixelapi) and JavaScript (npm install pixelapi)

No Kubernetes. No microservices architecture diagram. Just Redis queues and Python workers.
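The worker side of that is not much more than a blocking loop. Here's a minimal sketch, assuming a job is a JSON blob with an `id` and a `task` field — the queue name, job schema, and result channel are my illustrative assumptions, and the handlers are stubs standing in for the real GPU pipelines:

```python
import json

QUEUE = "jobs:image"  # queue name is illustrative

def process(job: dict) -> dict:
    """Dispatch a job to the right handler. Stubs stand in for real models."""
    handlers = {
        "remove_background": lambda j: {"status": "done", "task": j["task"]},
        "upscale": lambda j: {"status": "done", "task": j["task"]},
    }
    handler = handlers.get(job["task"])
    if handler is None:
        return {"status": "error", "reason": f"unknown task {job['task']}"}
    return handler(job)

def run_worker(client) -> None:
    """Block on the queue and process jobs one at a time.

    `client` is anything with blpop/rpush, e.g. a redis.Redis() instance.
    """
    while True:
        # BLPOP blocks until a job arrives; returns (queue_name, payload)
        _, payload = client.blpop(QUEUE)
        job = json.loads(payload)
        result = process(job)
        # Push the result to a per-job reply list for the gateway to collect
        client.rpush(f"result:{job['id']}", json.dumps(result))
```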

What I learned building this

Cold starts are real and they suck. On cloud GPU platforms, your model gets evicted if nobody's using it. Then the next request waits 30-60 seconds while the model loads. Since I control the hardware, models stay warm 24/7.
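The warm-model pattern itself is simple: load once per worker process, reuse for every request after that. A minimal sketch — the loader body is a placeholder, not the real pipeline:

```python
import functools

@functools.lru_cache(maxsize=None)
def get_model(name: str) -> dict:
    """Load a model once per process; later calls reuse the cached object.

    The body is a stand-in -- in practice this would load weights onto the GPU,
    which is the 30-60 second step you only want to pay once.
    """
    return {"name": name, "loaded": True}  # placeholder for the real model

def handle_request(model_name: str, payload) -> dict:
    model = get_model(model_name)  # warm after the first call
    return {"model": model["name"], "input": payload}
```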

Pricing is harder than engineering. I spent more time on credit math than on the actual GPU pipeline. Getting the balance right between "cheap enough to attract devs" and "not literally losing money on every request" took weeks.
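The shape of that credit math looks roughly like this — every number below is a made-up assumption for illustration, not PixelAPI's actual costs or rates:

```python
# Assumed GPU-seconds of compute per request type (illustrative values)
GPU_SECONDS = {"remove_background": 0.5, "generate_image": 2.0}

# Assumed electricity cost per GPU-second at local rates (illustrative)
ELECTRICITY_PER_GPU_SECOND = 0.00002

def cost_floor(task: str) -> float:
    """Raw electricity cost of serving one request (other overheads ignored)."""
    return GPU_SECONDS[task] * ELECTRICITY_PER_GPU_SECOND

def price_with_margin(task: str, margin: float = 0.5) -> float:
    """Price a request so the stated margin survives the cost floor."""
    return cost_floor(task) / (1 - margin)
```

With paid-off hardware, the floor is basically electricity, which is what lets the prices in the table above stay an order of magnitude below cloud-backed APIs.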

Most developers need boring AI, not cutting-edge. Nobody's asking me for the latest research model. They want reliable background removal for their Shopify app. The unsexy stuff is where the demand is.

Try it

There's a free tier — 100 credits on signup, no credit card. That's enough for 100 background removals or a few video clips.

Happy to answer questions about the architecture, pricing, or running production AI on consumer-ish GPUs.
