A few weeks ago I shipped an AI Rap Name Generator as
part of a multi-tool music app. It looks dead simple from the outside — pick a genre, pick a few "vibes," click
generate, get 6 rapper names back. Behind that simplicity there were three things that took longer than I expected to
get right. Sharing in case you're building something similar.
## The stack
- Next.js 15 (App Router, RSC where possible)
- Gemini 2.5 Flash for the LLM (fast + cheap, perfect for short-form structured generation)
- Drizzle ORM + Postgres for credit tracking
- Zod for input validation
- TypeScript everywhere
Total cost per 6-name generation: roughly $0.0002 with Gemini Flash. That's basically free, which is why I could
afford a generous free tier without going broke.
## Tricky part #1: Prompt engineering for strict JSON output
LLMs love to add chatty prefixes and trailing markdown fences. Out of the box, the model would happily return:
````
Sure! Here are 6 rapper names for you:

```json
[{"name": "...", "vibe": "..."}]
```
````
That breaks `JSON.parse()`. The fix was twofold: a strict instruction in the prompt plus defensive parsing on the way out:
```ts
const prompt = `You are a creative rap name specialist. Generate exactly 6 unique rapper stage names.

Genre: ${p.genre}
Vibe: ${vibeStr}
Gender: ${genderStr}

Rules:
- Names must authentically fit the ${p.genre} genre
- Each name should be 1–3 words, memorable, and original
- No slurs or offensive language
- Vary the style: some single-word, some two-word, some with numbers

Return ONLY a valid JSON array with exactly 6 objects, no other text:
[{"name":"Example Name","vibe":"Short vibe description here"},...]`;
```
And the parser:
```ts
const jsonMatch = rawText.match(/\[[\s\S]*\]/);
const names = jsonMatch ? JSON.parse(jsonMatch[0]) : [];
```
The regex grabs the first [...] block in the response and ignores whatever fluff came before or after. Catches ~99%
of cases. The remaining 1% (Gemini occasionally truncates if maxOutputTokens is too low) gets handled by the user
just clicking generate again — and crucially, they don't get charged for it (more on that below).
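Putting the two halves together, the full defensive path looks roughly like this. (A sketch: `parseRapNames` is a hypothetical helper name, and the inline shape check is a minimal stand-in for the Zod schema the real code uses.)

```ts
type RapName = { name: string; vibe: string };

// Extract the first [...] block, parse it, and verify the shape —
// treating every step as if the model lied to us.
function parseRapNames(rawText: string): RapName[] {
  const jsonMatch = rawText.match(/\[[\s\S]*\]/);
  if (!jsonMatch) return [];

  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonMatch[0]);
  } catch {
    return []; // truncated or malformed JSON → caller shows a retry prompt
  }

  if (!Array.isArray(parsed)) return [];
  // Keep only well-formed entries instead of trusting the whole array.
  return parsed.filter(
    (x): x is RapName =>
      typeof x === "object" &&
      x !== null &&
      typeof (x as RapName).name === "string" &&
      typeof (x as RapName).vibe === "string"
  );
}
```

An empty array from any failure mode funnels into the same "try again" UI state, so the route handler never has to distinguish chatty prefixes from truncation.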
## Tricky part #2: Gemini's 503s
About 1 in 30 requests comes back with a 503 ("Service Unavailable" / "high demand"). The first time I shipped without
retry logic, my support inbox filled up within a day.
The fix is a simple retry with backoff:
```ts
let data;
for (let attempt = 0; attempt < 3; attempt++) {
  if (attempt > 0) await new Promise((r) => setTimeout(r, 1500));
  const res = await fetch(GEMINI_URL, { method: "POST", body: geminiBody });
  if (res.status === 503 && attempt < 2) continue;
  if (!res.ok) throw new Error(`Gemini API error ${res.status}`);
  data = await res.json();
  break;
}
```
Three attempts, 1.5s sleep between them. That brings the effective failure rate from ~3% to under 0.1%.
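If you end up doing this in more than one route, it's worth pulling into a helper. A sketch, with two of my own tweaks on top of the article's loop: `fetchWithRetry` is a hypothetical name, and the fixed 1.5s delay becomes exponential backoff.

```ts
// Retry a fetch on 503s, with exponential backoff between attempts.
async function fetchWithRetry(
  url: string,
  init: RequestInit,
  maxAttempts = 3,
  baseDelayMs = 1500,
): Promise<Response> {
  let lastStatus = 0;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (attempt > 0) {
      // 1.5s, 3s, 6s, ... — give the overloaded service room to recover
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
    const res = await fetch(url, init);
    if (res.ok) return res;
    lastStatus = res.status;
    if (res.status !== 503) break; // only retry "high demand" errors
  }
  throw new Error(`Gemini API error ${lastStatus}`);
}
```

Breaking out early on non-503 errors matters: a 400 from a malformed request will never succeed on retry, so sleeping on it just wastes the user's time.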
## Tricky part #3: The credit freeze/settle/release pattern
Users buy credits up front. A naive flow would be:
- Deduct credits
- Call Gemini
- Return result
But what if Gemini fails on attempt 3 of 3? You've charged the user for nothing. Refunding adds support overhead. The
right pattern:
```ts
const holdUuid = `rapname_${nanoid(21)}`;
await creditService.freeze({
  userId: user.id,
  credits: RAP_NAME_GENERATION_CREDITS,
  videoUuid: holdUuid,
});
try {
  // ... call Gemini, parse result ...
  await creditService.settle(holdUuid); // success → actually consume
} catch (err) {
  await creditService.release(holdUuid); // failure → refund
  throw err;
}
```
Three states for credits: available, frozen, consumed. Frozen credits are reserved (so concurrent requests
can't double-spend) but not yet charged. On success they settle into consumed; on failure they release back to
available.
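Concretely, the service is a small state machine over credit holds. Here's a minimal in-memory sketch of the three transitions — the real version lives in Postgres via Drizzle, with the freeze done in a transaction so concurrent requests can't double-spend, and all names here are illustrative:

```ts
type HoldState = "frozen" | "consumed" | "released";

class CreditService {
  private balance: number; // "available" credits
  private holds = new Map<string, { credits: number; state: HoldState }>();

  constructor(initialCredits: number) {
    this.balance = initialCredits;
  }

  // Reserve credits without charging them yet.
  freeze(holdUuid: string, credits: number): void {
    if (this.balance < credits) throw new Error("insufficient credits");
    this.balance -= credits; // moved out of "available"
    this.holds.set(holdUuid, { credits, state: "frozen" });
  }

  // Success: the frozen credits are consumed for good.
  settle(holdUuid: string): void {
    const hold = this.holds.get(holdUuid);
    if (!hold || hold.state !== "frozen") throw new Error("no frozen hold");
    hold.state = "consumed";
  }

  // Failure: the frozen credits go back to "available".
  release(holdUuid: string): void {
    const hold = this.holds.get(holdUuid);
    if (!hold || hold.state !== "frozen") throw new Error("no frozen hold");
    hold.state = "released";
    this.balance += hold.credits;
  }

  available(): number {
    return this.balance;
  }
}
```

The guard that only `frozen` holds can settle or release is what makes the pattern safe to retry: a double-settle or a settle-then-release simply throws instead of corrupting the balance.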
The user-facing payoff is one sentence in the error response:
"The AI is experiencing high demand right now. No credits were deducted — please try again in a few minutes."
That sentence — "no credits were deducted" — single-handedly killed about 80% of refund requests.
This pattern also scaled well when I added long-running jobs (music generation, where the AI provider takes 60+
seconds). Same primitives, different durations.
## Lessons
- Design the failure path before the happy path. Retries, refunds, friendly errors. The happy path is one if-statement; the failure paths are 80% of the work.
- Defensive parsing > strict prompts. You can ask the LLM nicely to return clean JSON, but always parse like it lied to you.
- Make the cheapest model your default. Gemini 2.5 Flash is plenty for short-form structured output. Save the expensive models for tasks that actually benefit.
If you want to try the live tool, it's at
melodycraftai.com/rap-name-generator. Curious what
other tricky parts people have hit when shipping LLM features — drop them in the comments.