Self-hosting Kokoro is annoying. Here's what it actually takes:
- Provision a GPU instance with NVIDIA drivers + CUDA
- Set up a Python venv with specific torch versions
- Download the 82M parameter model weights
- Write an inference server that handles concurrent requests without VRAM explosions
- Configure a reverse proxy for SSL
- Pay ~$0.44/hr for the GPU whether you're using it or not
For a prototype or small app, that's 3-6 hours of setup plus ongoing infrastructure cost you pay even when idle.
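The concurrency point in that list is the fiddly part: one model on one GPU means requests have to be serialized (or batched) or VRAM use grows with every in-flight request. A minimal sketch of the serialization pattern, with a stub standing in for the real Kokoro call (the actual model API will differ):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the real Kokoro pipeline call.
def synthesize(text: str) -> bytes:
    return b"WAV:" + text.encode()

# One model on one GPU: a lock (or a work queue) serializes inference so
# concurrent requests can't each allocate activation memory at once.
_gpu_lock = threading.Lock()

def handle_request(text: str) -> bytes:
    with _gpu_lock:
        return synthesize(text)

# Simulate a few concurrent HTTP requests hitting the handler.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle_request, ["hello", "world"]))
```

A real server would put this behind an HTTP framework and bound the queue depth, but the lock is the part that stops the VRAM explosions.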
## What I built instead

Kokoro running on an RTX 3090 GPU pod, behind a simple HTTP endpoint:

```shell
curl -X POST https://the-service.live/synthesize \
  -H 'Content-Type: application/json' \
  -d '{"text": "Hello, this is Kokoro."}' \
  --output speech.wav
```

Returns WAV audio. That's the whole API.
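The same call from Python, using only the standard library (endpoint URL and JSON shape taken from the curl example above):

```python
import json
import urllib.request

def build_tts_request(text: str) -> urllib.request.Request:
    """Build the POST request for the synthesize endpoint."""
    return urllib.request.Request(
        "https://the-service.live/synthesize",
        data=json.dumps({"text": text}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it (needs network access):
# with urllib.request.urlopen(build_tts_request("Hello, this is Kokoro.")) as resp:
#     with open("speech.wav", "wb") as f:
#         f.write(resp.read())
```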
## Pricing comparison
| Option | Cost | Notes |
|---|---|---|
| ElevenLabs | $0.015/character | 1,000-char paragraph = $15 |
| Self-hosting Kokoro | ~$0.44/hr (GPU idle) | Plus 3-6hr setup |
| xpay.tools hosted Kokoro | $0.02/call | x402 micropayment |
| the-service.live | $0.01/call | 3 free/day, then $0.01 |
Same Kokoro-82M model quality, at half the price of the nearest hosted competitor. You maintain nothing.
## When self-hosting actually makes sense
- You have a dedicated GPU server already sitting idle
- You need more than ~10,000 calls/day (at that volume, own your infra)
- You need custom voices or fine-tuning on your data
- You need air-gapped / on-premise deployment
For most indie projects and prototypes, none of those apply.
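The volume threshold can be sanity-checked with quick arithmetic (prices taken from the table above; this is back-of-envelope and ignores setup and ops time):

```python
# Rough cost comparison: always-on GPU vs. per-call hosted pricing.
GPU_HOURLY = 0.44                    # self-hosted RTX 3090, $/hr, paid even idle
HOSTED_PER_CALL = 0.01               # hosted endpoint, $/call

gpu_daily = GPU_HOURLY * 24          # ~$10.56/day regardless of traffic
break_even_calls = gpu_daily / HOSTED_PER_CALL   # ~1,056 calls/day
hosted_at_10k = 10_000 * HOSTED_PER_CALL         # ~$100/day at 10k calls/day
```

The raw break-even lands near 1,100 calls/day; the higher ~10,000 figure in the list leaves room for the setup and maintenance overhead this arithmetic ignores.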
## How the endpoint works
Kokoro-82M runs on a RunPod GPU pod (RTX 3090). Requests come in over HTTPS, the model generates speech, WAV comes back. Simple queue.
Payment is an x402 micropayment: $0.01 in USDC on the Base chain, sent alongside the request. No account, no API key, no subscription. Pay per call.
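From the client's side, the x402 flow as I understand the spec is: make the request unpaid, get HTTP 402 back with a price quote, attach a signed payment in the `X-PAYMENT` header, and retry. A sketch with the wallet-signing step stubbed out (a real client needs a wallet library for that part):

```python
# Sketch of the x402 request flow; header name and retry shape follow
# the x402 protocol, the payment signing is a hypothetical stub.
def call_with_x402(send, sign_payment):
    """send(headers) -> (status, body); sign_payment(quote) -> header value."""
    status, body = send({})                      # 1. unauthenticated attempt
    if status == 402:                            # 2. server quotes a price
        payment = sign_payment(body)             # 3. sign $0.01 USDC on Base
        status, body = send({"X-PAYMENT": payment})  # 4. retry with payment
    return status, body

# Demo with a stub transport standing in for the real endpoint.
def fake_send(headers):
    if "X-PAYMENT" in headers:
        return 200, b"RIFF....WAVE"              # WAV bytes
    return 402, {"price": "0.01", "asset": "USDC", "network": "base"}

status, body = call_with_x402(fake_send, lambda quote: "signed-usdc-payload")
```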
## Free tier
3 calls/day, no payment required. Enough to evaluate quality before you commit to anything.
```shell
# Try it right now ('open' is macOS; use xdg-open on Linux)
curl -X POST https://the-service.live/synthesize \
  -H 'Content-Type: application/json' \
  -d '{"text": "Testing Kokoro TTS via hosted API."}' \
  --output test.wav && open test.wav
```
Full docs: the-service.live/docs
What are you building with TTS? Curious whether podcast generation, accessibility features, or AI agent voice output is the common use case here.