What if you could describe a voice the way you'd describe it to a friend?
"A warm British narrator in his 50s, slightly raspy, like he's telling a story by the fireplace."
That's a real voice instruction. And it works.
from leanvox import Leanvox
client = Leanvox()
result = client.generate(
    text="The kingdom had stood for a thousand years. Tonight, it would fall.",
    model="max",
    voice_instructions="A warm British narrator in his 50s, slightly raspy, like he's telling a story by the fireplace"
)
print(result.audio_url)
No voice IDs. No uploading samples. Just words.
Three Tiers, One API
| Tier | Model | Price/1K chars | Best for |
|---|---|---|---|
| Standard | Kokoro | $0.005 | High volume, low cost |
| Pro | Chatterbox | $0.01 | Voice cloning, emotion, 40 curated voices |
| Max | Qwen3-TTS | $0.03 | Instruction-based voice design |
All three tiers share the same API, the same SDKs, and the same MCP server. To switch tiers, just change the model parameter.
How It Works
Instead of picking from a preset voice catalog, you write a description:
{
  "text": "Welcome back to the show!",
  "model": "max",
  "voice_instructions": "Energetic female podcast host, mid-30s, American accent, upbeat and conversational"
}
The model interprets your instructions and generates a unique voice. Every generation returns a generated_voice_id you can reuse for consistency.
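As a rough sketch, the request body shown above can be assembled and sanity-checked in plain Python before it is sent. The field names text, model, and voice_instructions come from the example; the helper name build_request and its empty-string checks are assumptions made for illustration and are not part of the Leanvox SDK.

```python
import json


def build_request(text: str, voice_instructions: str, model: str = "max") -> str:
    """Return a JSON request body shaped like the example in the post.

    Hypothetical helper: only the three field names are taken from the post.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    if not voice_instructions.strip():
        raise ValueError("voice_instructions must not be empty")
    payload = {
        "text": text,
        "model": model,
        "voice_instructions": voice_instructions,
    }
    return json.dumps(payload, indent=2)


body = build_request(
    "Welcome back to the show!",
    "Energetic female podcast host, mid-30s, American accent, upbeat and conversational",
)
print(body)
```

Catching an empty description locally avoids spending credits on a request that has nothing for the model to interpret.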
Voice Instructions That Work
Be specific:
- Gender & age: "Young woman in her 20s" or "Elderly man, 70s"
- Accent: "British RP", "Southern American", "Australian"
- Tone: "Warm and reassuring", "Cold and clinical"
- Character: "Like a nature documentary narrator"
- Quality: "Slightly raspy", "Crystal clear", "Deep and resonant"
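One way to keep descriptions consistent is to compose them from the five attributes listed above. This is a minimal sketch: describe_voice is a hypothetical helper, and joining the parts with commas is an assumption about what reads well, not a format the model is documented to require.

```python
def describe_voice(gender_age, accent=None, tone=None, character=None, quality=None):
    """Join the attributes that were provided into one instruction string.

    Hypothetical helper; missing or blank attributes are skipped.
    """
    parts = [gender_age, accent, tone, character, quality]
    return ", ".join(p.strip() for p in parts if p and p.strip())


instructions = describe_voice(
    "Elderly man, 70s",
    accent="British RP",
    tone="warm and reassuring",
    character="like a nature documentary narrator",
    quality="slightly raspy",
)
print(instructions)
```

The resulting string can then be passed as voice_instructions in the same way as the hand-written descriptions elsewhere in this post.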
Multi-Speaker Dialogue
Max works with the dialogue endpoint too:
episode = client.dialogue(
    model="max",
    lines=[
        {
            "text": "So what actually happened in production last night?",
            "voice_instructions": "Calm female tech lead, direct and analytical"
        },
        {
            "text": "Honestly? A config typo took down the whole cluster.",
            "voice_instructions": "Sheepish junior engineer, male, nervous laugh energy"
        }
    ],
    gap_ms=500
)
Two distinct voices, one API call, no presets needed.
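For longer scripts, repeating the same description on every line gets tedious. The sketch below builds the lines list from a script of (speaker, text) pairs and a per-speaker description table. The helper build_lines and the speaker labels are hypothetical; only the text and voice_instructions keys are taken from the example above.

```python
def build_lines(script, speakers):
    """Turn (speaker, text) pairs into dialogue line dictionaries.

    Hypothetical helper: `speakers` maps a speaker label to its voice description.
    """
    lines = []
    for speaker, text in script:
        if speaker not in speakers:
            raise KeyError(f"no voice description for speaker {speaker!r}")
        lines.append({"text": text, "voice_instructions": speakers[speaker]})
    return lines


speakers = {
    "lead": "Calm female tech lead, direct and analytical",
    "junior": "Sheepish junior engineer, male, nervous laugh energy",
}
script = [
    ("lead", "So what actually happened in production last night?"),
    ("junior", "Honestly? A config typo took down the whole cluster."),
]
lines = build_lines(script, speakers)
print(len(lines))
```

The resulting list has the same shape as the lines argument in the dialogue call, so each speaker's description is written once and reused on every line they speak.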
Works Everywhere
pip install leanvox # Python
npm install leanvox # Node.js/TypeScript
npx leanvox-mcp # MCP (Claude, Cursor, VS Code)
Pricing
Max costs $0.03 per 1,000 characters. No subscription. Credits never expire.
A 500-word narration (roughly 3,000 characters) costs ~$0.09. Still cheaper than every competitor's base tier.
$1.00 free on signup — enough for ~50 Max generations to experiment with.
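The arithmetic behind these figures can be sketched as follows. The per-1,000-character prices are taken from the tier table; the dictionary keys "standard" and "pro" are labels chosen here for illustration and are not confirmed model parameter values, and the function names are hypothetical.

```python
from decimal import Decimal

# Prices per 1,000 characters, from the tier table in this post.
# The "standard" and "pro" keys are illustrative labels, not confirmed API values.
PRICE_PER_1K_CHARS = {
    "standard": Decimal("0.005"),
    "pro": Decimal("0.01"),
    "max": Decimal("0.03"),
}


def estimate_cost(num_chars: int, tier: str = "max") -> Decimal:
    """Estimated cost in dollars for num_chars characters on the given tier."""
    return Decimal(num_chars) / Decimal(1000) * PRICE_PER_1K_CHARS[tier]


def chars_for_budget(budget: Decimal, tier: str = "max") -> int:
    """Whole number of characters a dollar budget covers on the given tier."""
    return int(budget / PRICE_PER_1K_CHARS[tier] * 1000)


narration_cost = estimate_cost(3000)               # about 500 words
free_chars = chars_for_budget(Decimal("1.00"))     # signup credit on Max
print(narration_cost, free_chars)
```

Assuming about six characters per word including spaces, 500 words is about 3,000 characters, which gives the $0.09 figure. The $1.00 signup credit covers 33,333 characters on Max, so 50 generations works out to roughly 660 characters each.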
When to Use Each Tier
- Standard ($0.005) — Bulk narration, IVR, notifications
- Pro ($0.01) — Podcasts, audiobooks, 40 curated voices, voice cloning
- Max ($0.03) — Creative projects, dynamic characters, describe any voice
→ Get your API key · Docs · Python SDK · Node SDK
Originally published at leanvox.com/blog