Pulkit Goyal

Posted on Jun 13 • Edited on Jun 20 • Originally published at pulkito4.hashnode.dev

I Built an App to Finally Understand the Songs I Listen To (And Then Had to Rewrite the Entire Backend)

#ai #webdev #nextjs #systemdesign

I listen to a lot of Punjabi music. Always have. The beats are incredible — but I'd be lying if I said I understood more than 60% of what was actually being said. The usual fix is to Google the lyrics, find a translation, switch tabs, lose the vibe entirely. Or paste the lyrics into ChatGPT every single time. It works, but it's friction.

So I built SongBhaav — a webapp where you search any song and get line-by-line translations, emotional interpretation, cultural context, and what I like to call the "kavi kya kehna chahte hain" treatment. The thing your English teacher used to do with poems in school. Breaking down what the artist actually meant, not just what the words say.

It started as a weekend side project. Then I made it a proper webapp. Then two days after going live, half the backend broke in production.

This post is about how the app works, what broke, and the architectural decisions that came out of fixing it.

What SongBhaav Actually Does

The core flow is simple: search for a song → get a full breakdown.

The breakdown includes:

Line-by-line translation — not just a raw word-for-word translation but one that preserves the intent
Interpretation — what the song is actually about, the emotional arc, the metaphors
Fun facts — context about the artist, the album, cultural references in the lyrics
Emotional themes — the overall sentiment and mood

I use it daily now. Mostly for Punjabi tracks but it works across languages — anything Gemini can process, which is quite a lot.

The tech stack: Next.js (App Router) on the frontend, Supabase (PostgreSQL) as the database, Gemini models for AI processing, and — as we'll get to — Upstash QStash for the async architecture.

How v1 Worked (And Why It Was Naive)

The first version was a straightforward synchronous pipeline:

User searches song → POST /api/process → Fetch lyrics → Gemini AI → Return response

For songs already in the database, it was fast — just a cache hit on Supabase and done. The problem was cache misses. Any song that hadn't been analyzed before required the full pipeline: fetch lyrics from an external source, send them to Gemini, wait for a structured JSON analysis to come back. That whole flow was consistently taking 15 to 25 seconds depending on song complexity and Gemini's response time.

Vercel's Hobby tier enforces a hard 10-second execution limit on serverless functions. So for any new song, users would wait while the browser spun, and once the function hit that limit they'd get a FUNCTION_INVOCATION_TIMEOUT error. Not a great first impression.

I knew this was a problem during development. I launched anyway, telling myself it "probably wouldn't be that bad in practice."

It was that bad in practice.

What Broke in Production

Problem 1: Cloudflare Killed My Lyrics Scraper

Before integrating official APIs, I was scraping lyrics directly from Genius. It worked fine locally, worked fine in the first couple of days after launch.

Then Genius added Cloudflare protection to their website.

The errors started showing up in my pipeline_logs table — not clean API errors, but this:

SyntaxError: Unexpected token '<', "<!DOCTYPE "... is not valid JSON

What was happening: my serverless function would make a request to scrape the Genius page, Cloudflare would intercept it, identify it as an automated agent, and serve back an HTML "Access Denied" page instead. My code would then try to parse that HTML as JSON and crash immediately.

The fix was straightforward once I understood the cause — register an official Genius Developer API client, switch to authenticated API calls with Bearer tokens, bypass the anti-bot layer entirely. Since making that switch, lyrics fetching has had zero failures.
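The crash itself came from treating whatever came back as JSON. A minimal sketch of a guard that fails with a readable error instead, assuming the caller already has the HTTP status, the Content-Type header and the raw body in hand. The names parseUpstreamJson and UpstreamFormatError are hypothetical and not taken from the SongBhaav code:

```typescript
// Hypothetical error type for upstream responses that are not the JSON we expected.
class UpstreamFormatError extends Error {
  status: number;
  constructor(message: string, status: number) {
    super(message);
    this.name = "UpstreamFormatError";
    this.status = status;
  }
}

// Parse an upstream response body only if it succeeded and claims to be JSON.
// An HTML challenge or "Access Denied" page fails here with a short preview
// of the body, rather than as a SyntaxError deep inside JSON.parse.
function parseUpstreamJson(
  status: number,
  contentType: string | null,
  body: string,
): unknown {
  const isJson = (contentType ?? "").toLowerCase().includes("application/json");
  const isSuccess = status >= 200 && status < 300;
  if (!isSuccess || !isJson) {
    const preview = body.slice(0, 80).replace(/\s+/g, " ");
    throw new UpstreamFormatError(
      `Expected JSON, got status ${status} (${contentType ?? "no content-type"}): ${preview}`,
      status,
    );
  }
  return JSON.parse(body);
}
```

With fetch, the three arguments would come from res.status, res.headers.get("content-type") and await res.text(). Logging the thrown message makes a blocked request obvious at a glance.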

But the lesson was more important than the fix: you cannot deploy a web app and consider it done. Third-party services update, add protections, change behavior. The breakage two days after launch wasn't caused by anything I changed; the external environment shifted under me. Production monitoring isn't optional. It's how you find out about this before your users do.

Problem 2: The 10-Second Ceiling

This one I knew was coming and hadn't properly solved. The serverless timeout wasn't an occasional edge case — it was guaranteed for every new song search. There was no optimizing my way out of it within a synchronous request model. The question was: what's the right architecture for a task that inherently takes longer than your platform allows?

The v2 Rewrite: Going Async

The core insight was simple: don't make the user wait for the slow thing. Start the slow thing, tell the user you've started it, then notify them when it's done.

This is a standard pattern in distributed systems — sometimes called a "front desk / back office" model. The front desk takes your request immediately and gives you a ticket. The back office does the actual work. You don't stand at the counter for 25 seconds.

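Here's what the new flow looks like:

User searches song → POST /api/start-job → QStash → background worker → Supabase
Client polls /api/check-job?id=... until the job is completed, then renders the result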

Three components make this work:

Upstash QStash is the message broker. When a user requests a new song, /api/start-job pushes a message to QStash with the track metadata and immediately returns a job_id to the client — the whole thing resolves in under 50ms. QStash then handles triggering the background worker, with automatic retries built in if anything fails.

One practical note: QStash delivers messages via HTTP POST to your worker endpoint, which means it can't reach localhost in local development. I handled this by detecting the environment in /api/start-job and calling the worker directly in dev, bypassing QStash entirely. Not elegant, but it works cleanly and keeps the local development loop fast.
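The dispatch decision can be sketched roughly like this. It is an illustration under assumptions, not the actual /api/start-job handler: the TrackJob fields, the Dispatchers interface and the isDev flag are all hypothetical, and the queue publish is injected so the sketch runs without network access. In a real handler, publishToQueue would wrap the QStash client's publish call pointed at the worker URL.

```typescript
// Hypothetical shape of the message handed to the worker.
interface TrackJob {
  jobId: string;
  title: string;
  artist: string;
}

// Injected side effects (assumed names), so the logic is testable offline.
interface Dispatchers {
  // Production path: enqueue the job for delivery to the worker endpoint.
  publishToQueue: (job: TrackJob) => Promise<void>;
  // Development path: run the worker logic in-process.
  runWorkerDirectly: (job: TrackJob) => Promise<void>;
}

async function startJob(
  track: { title: string; artist: string },
  isDev: boolean,
  deps: Dispatchers,
  newId: () => string,
): Promise<{ jobId: string; via: "queue" | "direct" }> {
  const job: TrackJob = { jobId: newId(), ...track };

  if (isDev) {
    // The queue cannot reach localhost, so kick off the worker ourselves.
    // Not awaited here (a design choice in this sketch) so the client still
    // gets its job id immediately and polls exactly as in production.
    void deps.runWorkerDirectly(job).catch(() => {
      /* a real handler would mark the job as failed here */
    });
    return { jobId: job.jobId, via: "direct" };
  }

  // Only wait for the enqueue, never for the slow pipeline itself.
  await deps.publishToQueue(job);
  return { jobId: job.jobId, via: "queue" };
}
```

Keeping both paths behind the same return shape means the frontend never needs to know which environment it is talking to.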

The polling loop handles the feedback to the user. While the background worker runs, the frontend hits /api/check-job?id=... every few seconds. The endpoint reads the job's row in the background_jobs table, and the client keeps asking until the status flips from processing to completed. Once it does, the client fetches the result and swaps the skeleton loader for the actual data. Simple, lightweight, and it avoids the complexity and connection quota costs of maintaining a persistent WebSocket.
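A minimal sketch of such a polling loop, with the status check and the delay injected so it can run anywhere. The "failed" status and the "timed_out" result are assumptions added for illustration; the post only mentions processing and completed.

```typescript
// "failed" is an assumed extra state; the post names processing and completed.
type JobStatus = "processing" | "completed" | "failed";

interface PollOptions {
  intervalMs: number; // delay between checks
  maxAttempts: number; // give up after this many checks
}

// Ask for the job status until it leaves "processing" or attempts run out.
async function pollJob(
  checkJob: () => Promise<JobStatus>,
  sleep: (ms: number) => Promise<void>,
  opts: PollOptions,
): Promise<JobStatus | "timed_out"> {
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const status = await checkJob();
    if (status !== "processing") return status;
    // No point sleeping after the final attempt.
    if (attempt < opts.maxAttempts) await sleep(opts.intervalMs);
  }
  return "timed_out";
}
```

In the browser, checkJob would call fetch on /api/check-job?id=... and read the status out of the response (the response shape is an assumption), and sleep would wrap setTimeout in a Promise. The attempt cap keeps a stuck job from polling forever.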

The worker runs the actual pipeline: concurrent lyrics fetching via Promise.any across LRCLIB and Genius (whichever succeeds first wins, and a failure from one source is ignored as long as the other succeeds), then Gemini processing, then writing the structured result to processed_songs.
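The Promise.any step might look something like the sketch below. The LyricsSource type and fetchLyricsFromAny are hypothetical names; the real LRCLIB and Genius fetchers are not shown in the post, so they are represented here as plain functions that return a Promise.

```typescript
// A lyrics source is any function that resolves with lyrics or rejects.
type LyricsSource = (title: string, artist: string) => Promise<string>;

// Start every source at once and take the first one that fulfills.
// Promise.any ignores rejections unless every source rejects.
async function fetchLyricsFromAny(
  sources: LyricsSource[],
  title: string,
  artist: string,
): Promise<string> {
  try {
    return await Promise.any(sources.map((source) => source(title, artist)));
  } catch {
    // Promise.any rejects with an AggregateError only when all sources failed.
    throw new Error(`No lyrics source succeeded for "${title}" by ${artist}`);
  }
}
```

One detail worth keeping in mind: a source that resolves with an empty string still counts as a win for Promise.any, so each source should reject when it finds nothing rather than resolve with an empty result.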

Handling Gemini 429s

One edge case I didn't think about until it happened: what if Gemini rate limits mid-job?

Before adding proper error handling, a 429 from Gemini would cause the worker to throw an unhandled error. QStash would mark the delivery as failed, and background_jobs would stay stuck at processing forever. The user would sit on a skeleton loader indefinitely with no feedback or resolution.

The fix: explicitly return a 500 status when Gemini throws a 429 or 503. QStash treats any non-2xx response as a failed delivery and automatically schedules a retry using exponential backoff. QStash's backoff formula is min(86400, e^(2.5*n)) seconds, so roughly 12 seconds after the first failure, ~2.5 minutes after the second, ~30 minutes after the third. Quick enough that the job finishes soon after Gemini's rate window resets, spaced out enough not to hammer it again immediately.
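To make those numbers concrete, here is the quoted formula as code, next to a small helper for deciding which Gemini statuses should trigger a retry. Both function names are hypothetical; only the formula and the 429/503 rule come from the text above.

```typescript
// Delay before retry n (1-based), per the formula quoted above:
// min(86400, e^(2.5 * n)) seconds, capped at one day.
function retryDelaySeconds(n: number): number {
  return Math.min(86400, Math.exp(2.5 * n));
}

// Rate limits and temporary unavailability are worth retrying.
function isTransientGeminiStatus(status: number): boolean {
  return status === 429 || status === 503;
}

// Status the worker should send back to the queue for a transient failure.
// Any non-2xx response is treated as a failed delivery and retried.
const RETRY_RESPONSE_STATUS = 500;

const delays = [1, 2, 3, 4, 5].map((n) => Math.round(retryDelaySeconds(n)));
// delays is [12, 148, 1808, 22026, 86400]
```

That works out to about 12 seconds, 2.5 minutes, 30 minutes and 6 hours, after which every further retry waits the full day.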

This means transient rate limit errors self-heal entirely without manual intervention.

Securing the Worker Endpoint

Easy thing to overlook with async architectures: your background worker is just an HTTP endpoint sitting on the public internet. Anyone who finds the URL can POST to it and trigger AI processing, draining your API credits.

QStash solves this with cryptographic signature verification. Every request QStash sends includes an Upstash-Signature header, a JWT signed with your QStash signing key. The worker validates this using the QStash SDK before executing anything; invalid or missing signatures get a 401 immediately.
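The important part is the ordering: verify first, do work second. A sketch of that gate, with the verifier injected so the example stays self-contained. WorkerRequest, Verify and handleWorkerRequest are hypothetical names; in the real worker the verify function would delegate to the QStash SDK's verification helper (such as its Receiver class) rather than anything hand-rolled.

```typescript
// Minimal view of the incoming request (assumed shape for this sketch).
interface WorkerRequest {
  signature: string | null; // value of the signature header, if present
  rawBody: string; // the body exactly as received, before JSON parsing
}

// Stand-in for the SDK verifier: resolves true for a valid signature.
type Verify = (signature: string, rawBody: string) => Promise<boolean>;

async function handleWorkerRequest(
  req: WorkerRequest,
  verify: Verify,
  runPipeline: (payload: unknown) => Promise<void>,
): Promise<{ status: number }> {
  // A missing header is rejected without touching the verifier.
  if (!req.signature) return { status: 401 };

  let valid = false;
  try {
    // Verify against the raw body: re-serialized JSON may differ byte for byte.
    valid = await verify(req.signature, req.rawBody);
  } catch {
    // Treat a verifier that throws the same as an invalid signature.
    valid = false;
  }
  if (!valid) return { status: 401 };

  // Only now is it safe to spend lyrics lookups and AI credits.
  await runPipeline(JSON.parse(req.rawBody));
  return { status: 200 };
}
```

Reading the body as text before parsing matters here, because the signature covers the exact bytes that were sent.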

For rate limiting the public-facing endpoints, I used Upstash Redis with the @upstash/ratelimit SDK inside Next.js middleware. Since middleware runs at the edge (CDN level on Vercel), spam requests are blocked before they ever touch the database or trigger any compute.
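To illustrate the idea only, here is a tiny in-memory fixed-window limiter. This is a stand-in, not the @upstash/ratelimit API and not the algorithm SongBhaav necessarily uses: an in-memory Map does not survive across serverless or edge instances, which is exactly why a shared store like Redis is needed in production. The class name and parameters are hypothetical.

```typescript
// In-memory fixed-window rate limiter (illustration only).
class FixedWindowLimiter {
  private hits = new Map<string, { windowStart: number; count: number }>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request identified by key may proceed at time nowMs.
  allow(key: string, nowMs: number): boolean {
    const entry = this.hits.get(key);
    // First request, or the previous window has expired: start a new window.
    if (!entry || nowMs - entry.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: nowMs, count: 1 });
      return true;
    }
    // Window still open and quota used up: block.
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Example: at most 2 requests per second per client IP.
const limiter = new FixedWindowLimiter(2, 1000);
const decisions = [0, 10, 20, 1000].map((t) => limiter.allow("203.0.113.7", t));
// decisions is [true, true, false, true]
```

In middleware, the key would typically be the client IP or a session identifier, and a blocked request would get a 429 before any route handler runs.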

What I'd Do Differently

Start async from day one. The v1 synchronous architecture was always going to hit the timeout wall for new songs. Building the QStash pipeline upfront isn't significantly more complex than the naive approach — it would have saved the rushed production rewrite.

Set up alerting before going live, not after. I found out about the Cloudflare failure by manually checking pipeline_logs. A simple alert on error rate spikes would have caught it immediately. Logging without alerting is only half the job.
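A simple error-rate check could be as small as the sketch below. Everything in it is an assumption for illustration: the LogEntry shape is a simplified stand-in for rows in pipeline_logs, and the thresholds are placeholders to tune.

```typescript
// Simplified stand-in for one pipeline log row.
interface LogEntry {
  timestampMs: number;
  ok: boolean; // false when the pipeline run ended in an error
}

// Decide whether the recent error rate warrants an alert.
function shouldAlert(
  logs: LogEntry[],
  nowMs: number,
  windowMs: number, // how far back to look
  minRequests: number, // ignore windows with too little traffic
  maxErrorRate: number, // e.g. 0.5 means alert above 50% errors
): boolean {
  const recent = logs.filter(
    (l) => l.timestampMs <= nowMs && nowMs - l.timestampMs <= windowMs,
  );
  // Too few samples: one failed request should not page anyone.
  if (recent.length < minRequests) return false;
  const errors = recent.filter((l) => !l.ok).length;
  return errors / recent.length > maxErrorRate;
}
```

Run on a schedule against the last few minutes of logs, a check like this would have flagged the wave of HTML-instead-of-JSON errors within one interval instead of whenever someone happened to look at the table.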

Don't assume external services stay stable. Genius adding Cloudflare wasn't predictable. The right response isn't to avoid third-party dependencies — official APIs are fine — but to monitor them actively and build fallbacks where possible.

If you want to try SongBhaav, drop any song in the search bar: SongBhaav

Feedback welcome — especially if something in the architecture could be done better.
