# I Rage-Quit a Music App at 1am. Then Built My Own Spotify. (React + Node + Capacitor + Groq AI)

Full-stack music streaming platform, shipped as both a web app and a real native Android APK, built solo and released free. Here's every technical decision, every mistake, and every architecture choice that made it work.

Live: soul-sync-beta.vercel.app
APK: Download SoulSync.apk
GitHub: itslokeshx/SoulSync
## The Night That Started Everything
It was 1am.
I had 60 songs I wanted in one playlist. Tamil classics, Anirudh hits, a few AR Rahman deep cuts. The kind of playlist that takes you back somewhere.
So I opened the app and started:
Song 1. Search → find → add. ✓
Song 2. Search → wrong version → search again → add. ✓
Song 3. Search → add. ✓
...
Song 48. 45 minutes in. 12 songs left.
I closed the app.
Put my phone down.
Went to sleep.
Woke up the next morning, still annoyed.
That annoyance built SoulSync.
Not inspiration. Not a market gap. Not a YouTube tutorial that said "build a music app." Pure, unfiltered frustration at 1am.
## What Is SoulSync?

A full-stack music streaming platform that ships as both a web app and a real native Android APK, all from a single React codebase.

Here's what it does that Spotify doesn't (or charges for):

| Feature | SoulSync | Spotify Premium (₹119/mo) |
|---|---|---|
| AI playlist from song list | ✓ Free | ✗ Doesn't exist |
| Listen together + live chat | ✓ Free | ⚠ Paid, no chat |
| Unlimited downloads | ✓ Free | ⚠ Paid only |
| Offline without account | ✓ APK only | ✗ Never |
| Intelligent NLP search | ✓ Free | ✗ Literal only |
| Open source | ✓ MIT | ✗ |
| Price | ₹0. Forever. | ₹119/month |
## The Full Stack

- Frontend: React 18 + TypeScript + Vite + Tailwind + Zustand
- Backend: Node.js + Express + MongoDB Atlas
- Realtime: Socket.io
- AI: Groq SDK + LLaMA 3.3 70B
- Mobile: Capacitor 6 (web → native Android APK)
- Search: JioSaavn API + 7-layer caching
- Cache: Upstash Redis
- Deploy: Vercel (frontend) + Render (backend)
One backend. One database. One auth system. Serves both the web app and APK.
## Feature 1: The AI Playlist Builder (The Feature That Started Everything)
That 45-minute nightmare is now 10 seconds.
### How it works

```text
User pastes 100 song names
        ↓
Groq AI (LLaMA 3.3 70B) parses the list
        ↓
One optimized search query generated per song
        ↓
Parallel JioSaavn search (Promise.all, 10 concurrent)
        ↓
Confidence scoring: exact / partial / not found
        ↓
Named playlist handed back to the user
```
### The chunking problem
LLaMA has a context window limit. 100 songs in one shot causes hallucinations and timeouts. The solution: chunked processing.
```typescript
const CHUNK_SIZE = 20
const MAX_CONCURRENT_GROQ = 3 // rate limit safe
const BATCH_DELAY = 500 // ms between batches

async function buildPlaylist(songs: string[]) {
  const chunks = chunkArray(songs, CHUNK_SIZE)
  const results: Song[] = []

  for (let i = 0; i < chunks.length; i += MAX_CONCURRENT_GROQ) {
    const batch = chunks.slice(i, i + MAX_CONCURRENT_GROQ)
    const batchResults = await Promise.all(
      batch.map(chunk => processChunkWithGroq(chunk))
    )
    results.push(...batchResults.flat())

    // SSE progress event to frontend
    emitProgress({
      processed: Math.min((i + MAX_CONCURRENT_GROQ) * CHUNK_SIZE, songs.length),
      total: songs.length,
      found: results.length
    })

    if (i + MAX_CONCURRENT_GROQ < chunks.length) {
      await delay(BATCH_DELAY)
    }
  }
  return results
}
```
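The two helpers the loop leans on are tiny. A minimal sketch of what `chunkArray` and `delay` look like (these are my reconstructions, not necessarily the repo's exact code):

```typescript
// Split an array into fixed-size chunks; the last chunk may be shorter.
function chunkArray<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

// Promise-based sleep, used to space out Groq batches.
function delay(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms))
}
```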
### SSE streaming so the user sees progress in real time
Instead of a spinner for 30 seconds, users see a live progress bar:
```typescript
// Backend: stream results as they come in
router.get('/build-playlist/stream', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream')
  res.setHeader('Cache-Control', 'no-cache')

  const send = (event: string, data: any) => {
    res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`)
  }

  // `songs` comes from the session created when the user submitted the list
  buildPlaylist(songs, {
    onProgress: (p) => send('progress', p),
    onChunk: (songs) => send('chunk_complete', { songs }),
    onComplete: (all) => {
      send('complete', { songs: all })
      res.end()
    }
  })
})
```
```typescript
// Frontend: consume the stream
const source = new EventSource(`${BACKEND}/api/ai/build-playlist/stream?sessionId=${id}`)

source.addEventListener('progress', (e) => {
  const { processed, total, found } = JSON.parse(e.data)
  setProgress(Math.round((processed / total) * 100))
  setMatchCount(found)
})

source.addEventListener('chunk_complete', (e) => {
  const { songs } = JSON.parse(e.data)
  setSongs(prev => [...prev, ...songs]) // live list grows as songs are found
})
```
Result: User sees songs appearing one batch at a time, with a real progress bar. Not a fake loading animation.
## Feature 2: SoulLink (Listen Together)

Create a room → share a 6-character code → friend joins → everything syncs live.

### The sync architecture
```text
Host device               Server                 Guest device
     |                       |                        |
     |---- duo:join -------->|                        |
     |                       |<------ duo:join -------|
     |                       |                        |
     |---- duo:sync-play --->|---- duo:sync-play ---->|
     |                       |                        |
     |<--- duo:message ----->|<----- duo:message ---->|
     |                       |                        |
     |---- duo:heartbeat --->|<---- duo:heartbeat ----|
```
### The hardest problem: seek sync
When the host seeks to 2:34, the guest needs to jump to 2:34 + network latency. Without compensation, they're always slightly out of sync.
```typescript
// Host sends seek with timestamp
socket.emit('duo:seek', {
  position: audio.currentTime,
  sentAt: Date.now() // ping compensation
})

// Guest receives and compensates
// (assumes the two device clocks roughly agree)
socket.on('duo:seek', ({ position, sentAt }) => {
  const latency = (Date.now() - sentAt) / 1000 // seconds
  audio.currentTime = position + latency
})
```
It's not perfect, but it's imperceptibly close on good connections.
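One reason it can't be perfect: comparing `Date.now()` across two devices bakes clock skew into the latency estimate. A common refinement (not something SoulSync necessarily does) is an NTP-style ping/pong to estimate the peer's clock offset first, then compensate in the peer's timeline. A sketch, with hypothetical function names:

```typescript
// Estimate how far the peer's clock is ahead of ours (NTP-style).
// t0: our send time, peerTime: the peer's timestamp on its reply, t1: our receive time.
function estimateClockOffset(t0: number, peerTime: number, t1: number): number {
  const roundTrip = t1 - t0
  return peerTime - (t0 + roundTrip / 2) // assumes symmetric network delay
}

// Convert the peer's sentAt into our clock, then add the one-way delay in seconds.
function compensatedPosition(position: number, sentAt: number, now: number, offset: number): number {
  const oneWayMs = now - (sentAt - offset) // sentAt - offset ≈ sentAt on our clock
  return position + Math.max(0, oneWayMs) / 1000
}
```

With the offset folded in, the seek handler no longer cares whether the host's clock runs a second fast.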
## Feature 3: The Search That Went From 10s to 150ms
This was the biggest engineering challenge.
### Why it was slow

The old flow was sequential:

```text
Query 1 → JioSaavn → wait 1200ms
Query 2 → JioSaavn → wait 1200ms
Query 3 → JioSaavn → wait 1200ms
Query 4 → JioSaavn → wait 1200ms
Total: ~4800ms minimum + debounce + Render cold start = 10s+
```
### The 7-layer fix

- Layer 0: startup warmup (top 100 queries pre-cached at boot)
- Layer 1: client-side cache (Map<query, result>, 5-minute TTL, ~0ms)
- Layer 2: debounce + abort (150ms, down from 400ms, plus cancelling in-flight requests)
- Layer 3: Redis smart cache (normalized keys, ~5ms)
- Layer 4: parallel fetch (Promise.allSettled, all queries at once)
- Layer 5: SSE streaming (first results visible at ~200ms TTFB)
- Layer 6: intent parser (5 targeted queries instead of 1 vague one)
- Layer 7: static index (top 100 artists return instantly)
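Layer 2 is the one people get subtly wrong: debouncing alone still lets a slow earlier response overwrite a newer one. The pattern is to pair the debounce with an AbortController that cancels the stale request. A sketch of that pattern, with names of my own choosing, wrapping any fetcher:

```typescript
type Fetcher = (query: string, signal: AbortSignal) => Promise<string[]>

// Debounce keystrokes, and abort the previous in-flight request before firing the next.
function makeDebouncedSearch(
  fetcher: Fetcher,
  onResults: (results: string[]) => void,
  waitMs = 150,
) {
  let timer: ReturnType<typeof setTimeout> | undefined
  let inflight: AbortController | undefined

  return (query: string) => {
    clearTimeout(timer) // restart the debounce window
    timer = setTimeout(async () => {
      inflight?.abort() // cancel the stale request, if any
      inflight = new AbortController()
      try {
        onResults(await fetcher(query, inflight.signal))
      } catch (err) {
        if ((err as Error).name !== 'AbortError') throw err // aborts are expected
      }
    }, waitMs)
  }
}
```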
### The real fix: the NLP intent parser
Speed alone wasn't enough. The search also needed to be intelligent.
The problem: "ranjini 173" is a Tamil movie. The old search sent "ranjini 173" to JioSaavn as a song name and got 0 results.
The fix: an intent parser that classifies every query before searching:
```typescript
type IntentType =
  | 'movie_songs'    // "ranjini 173", "leo songs", "kgf 2"
  | 'artist_recent'  // "latest anirudh", "new vijay songs"
  | 'artist_all'     // "arijit singh songs"
  | 'mood_search'    // "sad songs", "lofi tamil night"
  | 'bgm_search'     // "harris jayaraj bgm"
  | 'language_hits'  // "tamil hits 2025"
  | 'song_direct'    // exact song name

function parseIntent(query: string): ParsedIntent {
  // 1. Movie database check (500+ movies including "ranjini 173")
  const movieMatch = findMovieMatch(query)
  if (movieMatch) return { intent: 'movie_songs', movie: movieMatch }

  // 2. Artist dictionary (500+ aliases)
  //    "arr" → "A.R. Rahman", "thalapathy" → "Vijay"
  const artist = findArtistMatch(query)
  if (artist) return buildArtistIntent(artist, query)

  // 3. Mood detection
  //    "sad lofi" → expands to 5 mood variants
  const mood = detectMood(query)
  if (mood) return buildMoodIntent(mood, query)

  // 4. Fallback: language + era detection
  //    "90s tamil hits" → year range + language filter
  return buildDirectIntent(query)
}
```
For "ranjini 173":

```text
Intent: movie_songs
Expanded queries (all fired in parallel):
  1. "Ranjini 173 songs"            weight: 1.0
  2. "Ranjini 173 movie songs"      weight: 0.95
  3. "Ranjini 173 audio jukebox"    weight: 0.90
  4. "songs from Ranjini 173"       weight: 0.85
  5. (album search via separate endpoint)
```

Movie songs appear, not a song titled "ranjini 173".
For "sad anirudh 2024":

```text
Intent: artist_recent + mood
Artist: "Anirudh Ravichander"
Year: 2024
Mood: sad
Queries:
  1. "Anirudh Ravichander songs 2024"  weight: 1.0
  2. "sad Anirudh songs 2024"          weight: 0.95
  3. "Anirudh 2024 hits"               weight: 0.90
  4. "emotional Anirudh 2024"          weight: 0.85
```
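Firing the expanded queries and merging their results can be sketched like this. The scoring formula is illustrative (query weight decayed by rank within that query's results); the real ranking logic is the repo's, not mine:

```typescript
interface WeightedQuery { q: string; weight: number }
type SearchFn = (q: string) => Promise<{ id: string; title: string }[]>

// Fire all expanded queries in parallel, merge by song id, keep each song's best score.
async function searchWeighted(queries: WeightedQuery[], search: SearchFn) {
  const settled = await Promise.allSettled(queries.map(({ q }) => search(q)))
  const best = new Map<string, { id: string; title: string; score: number }>()

  settled.forEach((res, i) => {
    if (res.status !== 'fulfilled') return // one failing query shouldn't sink the rest
    res.value.forEach((song, rank) => {
      // Illustrative score: the query's weight, decayed by position in its result list
      const score = queries[i].weight * (1 / (1 + rank))
      const prev = best.get(song.id)
      if (!prev || score > prev.score) best.set(song.id, { ...song, score })
    })
  })

  return [...best.values()].sort((a, b) => b.score - a.score)
}
```

Using Promise.allSettled rather than Promise.all means one throttled upstream query degrades the results instead of failing the whole search.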
### Result timings

- Client cache hit: ~0ms (repeat search, same session)
- Redis cache hit: ~5ms (server cached)
- Cold query (TTFB): ~200ms (first results streamed)
- Cold query (full): ~600ms (all results scored + ranked)
- Before: 10,000ms+
## Feature 4: One Codebase → Web App + Native Android APK
This was the decision that changed the whole project.
### Why not React Native?
React Native would have meant:
- Separate component library
- Separate navigation (React Navigation vs React Router)
- Separate state management patterns
- Essentially rebuilding the entire app
### Why Capacitor won
Capacitor wraps your existing web app in a native shell. Same React components. Same Zustand stores. Same backend calls. Native capabilities added via plugins.
```text
React App (web)
      ↓
npx cap add android
      ↓
npx cap sync
      ↓
Android Studio → assembleRelease
      ↓
SoulSync.apk ✓
```
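The whole wrapper is driven by one small config file. A minimal sketch of what it looks like; the `appId` and `webDir` values here are illustrative, not necessarily SoulSync's actual ones:

```typescript
// capacitor.config.ts (minimal sketch; values are illustrative)
import type { CapacitorConfig } from '@capacitor/cli'

const config: CapacitorConfig = {
  appId: 'com.example.soulsync', // hypothetical reverse-DNS id
  appName: 'SoulSync',
  webDir: 'dist',                // Vite's default build output
}

export default config
```

`npx cap sync` copies `webDir` into the native project, so every web build flows straight into the APK.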
### What native capabilities I added
```typescript
import { Haptics, ImpactStyle } from '@capacitor/haptics'
import { Filesystem, Directory } from '@capacitor/filesystem'
import { Network } from '@capacitor/network'
import { LocalNotifications } from '@capacitor/local-notifications'

// Haptic on every song tap
await Haptics.impact({ style: ImpactStyle.Light })

// Download song to native filesystem
await Filesystem.writeFile({
  path: `songs/${song.id}.mp3`,
  data: base64Audio,
  directory: Directory.Data
})

// Network-aware offline mode
const status = await Network.getStatus()
if (!status.connected) setOfflineMode(true)

// Lock screen media controls
await LocalNotifications.schedule({
  notifications: [{
    title: song.name,
    body: song.artist,
    id: 1,
    extra: { type: 'now_playing' }
  }]
})
```
### Platform detection pattern
```typescript
import { Capacitor } from '@capacitor/core'

const isAPK = Capacitor.isNativePlatform()
const platform = Capacitor.getPlatform() // 'android' | 'ios' | 'web'

// Different storage per platform
const storage = isAPK
  ? new NativeFileStorage() // Capacitor Filesystem
  : new WebStorage()        // IndexedDB

// Different audio URL handling
function getPlayableUrl(song: Song): string {
  if (isAPK && song.filePath) {
    return Capacitor.convertFileSrc(song.filePath) // file:// → WebView-safe URL
  }
  return song.streamUrl
}
```
## Feature 5: Personalized Dashboard
Every session rebuilds from:
- Your listening history (last 90 days, MongoDB TTL index)
- Your language preferences (set at onboarding)
- Time of day (morning = energetic, night = chill)
```typescript
async function buildDashboard(userId: string): Promise<Section[]> {
  // prefs must resolve first: the trending fetch depends on prefs.languages
  const prefs = await getUserPreferences(userId)
  const [history, trending] = await Promise.all([
    getListeningHistory(userId, 90), // days
    getTrendingByLanguage(prefs.languages)
  ])

  const hour = new Date().getHours()
  const timeContext = hour < 12 ? 'morning'
    : hour < 18 ? 'afternoon'
    : hour < 22 ? 'evening'
    : 'late_night'

  return [
    buildTimeSection(timeContext, history),          // "Morning Fresh"
    buildArtistSpotlight(history),                   // "Because You Listened"
    buildLanguageSection(prefs.languages, trending), // "Trending Tamil"
    buildRecentlyPlayed(history),
    buildRecommended(history, prefs),
  ]
}
```
## The 5 Hardest Problems I Solved

### 1. Background audio on the Android APK dying when the screen locks
The web audio API pauses when the screen locks on Android. Fix: Capacitor's @capacitor-community/background-runner + a MediaSession API registration.
```typescript
navigator.mediaSession.metadata = new MediaMetadata({
  title: song.name,
  artist: song.primaryArtists,
  artwork: [{ src: song.image, sizes: '512x512' }]
})

navigator.mediaSession.setActionHandler('play', () => audioRef.current?.play())
navigator.mediaSession.setActionHandler('pause', () => audioRef.current?.pause())
navigator.mediaSession.setActionHandler('nexttrack', () => playNext())
navigator.mediaSession.setActionHandler('previoustrack', () => playPrev())
```
### 2. Offline playback notification not working
The original code had this guard:
```typescript
// Bug: this silently killed offline notifications
if (!song.streamUrl) return
showNowPlayingNotification(song)
```
Offline songs have `filePath`, not `streamUrl`. Removing that guard and routing through `getPlayableUrl()` fixed it.
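The fixed guard can be sketched in isolation. `playableUrl` here mirrors the `getPlayableUrl` from the platform-detection section (simplified, and with hypothetical names) so the notification check covers both online streams and offline downloads:

```typescript
interface Song { name: string; streamUrl?: string; filePath?: string }

// Simplified mirror of getPlayableUrl: in the APK, a downloaded song
// is identified by filePath, not streamUrl.
function playableUrl(song: Song, isAPK: boolean): string | undefined {
  if (isAPK && song.filePath) return song.filePath // the app converts this via convertFileSrc
  return song.streamUrl
}

// Fixed guard: a song is notifiable if it is playable at all,
// so offline downloads no longer short-circuit the notification.
function shouldNotify(song: Song, isAPK: boolean): boolean {
  return Boolean(playableUrl(song, isAPK))
}
```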
### 3. Search cache key collision
"sad arijit" and "arijit sad" should hit the same cache. The fix: normalize by sorting words alphabetically.
```typescript
function normalizeKey(query: string): string {
  return query
    .toLowerCase()
    .trim()
    .replace(/[^\w\s]/g, '')
    .split(' ')
    .filter(Boolean)
    .sort() // ← the critical line
    .join('_')
}

// "sad arijit"  → "arijit_sad"
// "arijit sad"  → "arijit_sad"  ✓ same key
```
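Here's how that normalized key could front the Redis layer (Layer 3). This is a sketch with an in-memory-friendly `Cache` interface of my own; the real code would sit on an Upstash Redis client:

```typescript
interface Cache {
  get(key: string): Promise<string | null>
  set(key: string, value: string, ttlSeconds: number): Promise<void>
}

// Same normalization as above, inlined so the sketch is self-contained.
const normalizeKey = (q: string) =>
  q.toLowerCase().trim().replace(/[^\w\s]/g, '').split(' ').filter(Boolean).sort().join('_')

// Check the cache under the normalized key; fall through to the live search on a miss.
async function cachedSearch(
  cache: Cache,
  query: string,
  search: (q: string) => Promise<object>,
): Promise<object> {
  const key = `search:${normalizeKey(query)}`
  const hit = await cache.get(key)
  if (hit) return JSON.parse(hit)                  // the ~5ms path
  const fresh = await search(query)
  await cache.set(key, JSON.stringify(fresh), 300) // 5-minute TTL
  return fresh
}
```

With this in place, "sad arijit" warms the cache for "arijit sad" and every other word order.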
### 4. Groq rate limits with 100 songs

5 API keys in rotation, one picked per request:
```typescript
const GROQ_KEYS = [
  process.env.GROQ_KEY_1,
  process.env.GROQ_KEY_2,
  // ...
]
let keyIndex = 0

function getGroqClient() {
  const key = GROQ_KEYS[keyIndex % GROQ_KEYS.length]
  keyIndex++
  return new Groq({ apiKey: key })
}
```
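Round-robin alone doesn't react to a key that's actually throttled mid-batch. A natural extension (my sketch, not necessarily what the repo does) is to retry on a rate-limit error with the next key, giving up only after every key has failed:

```typescript
// On a rate-limit error, advance to the next key and retry;
// give up only after every key has been throttled.
async function withKeyRotation<T>(
  keys: string[],
  call: (key: string) => Promise<T>,
  isRateLimited: (err: unknown) => boolean,
): Promise<T> {
  let lastError: unknown
  for (const key of keys) {
    try {
      return await call(key)
    } catch (err) {
      if (!isRateLimited(err)) throw err // real failures propagate immediately
      lastError = err                    // throttled: try the next key
    }
  }
  throw lastError                        // every key was rate-limited
}
```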
### 5. Socket.io rooms leaking memory
Rooms weren't being cleaned up when users disconnected. After 48 hours the server would slow to a crawl.
```typescript
socket.on('disconnect', () => {
  const room = getRoomForSocket(socket.id)
  if (!room) return

  room.members.delete(socket.id)

  if (room.members.size === 0) {
    // Last person left → clean up completely
    activeRooms.delete(room.code)
    console.log(`Room ${room.code} cleaned up`)
  } else {
    // Notify remaining members
    socket.to(room.code).emit('duo:member-left', { socketId: socket.id })
  }
})
```
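The disconnect handler covers clean departures, and the duo:heartbeat events make a second safety net possible: a periodic sweep that evicts rooms whose members stopped heartbeating (a client killed without ever firing disconnect). This is a hypothetical companion sketch, not code from the repo:

```typescript
interface Room { code: string; members: Set<string>; lastHeartbeat: number }

// Periodic sweep: evict rooms whose heartbeats went silent.
// Returns the evicted codes so the caller can log them.
function sweepStaleRooms(rooms: Map<string, Room>, now: number, maxIdleMs: number): string[] {
  const evicted: string[] = []
  for (const [code, room] of rooms) {
    if (now - room.lastHeartbeat > maxIdleMs) {
      rooms.delete(code) // deleting during iteration is safe for a JS Map
      evicted.push(code)
    }
  }
  return evicted
}
```

Run it on a `setInterval` alongside the disconnect handler and a 48-hour-old process stays flat on memory.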
## What I'd Do Differently
1. Redis from day one, not an afterthought.
I added caching after the search was already "working" at 10 seconds. If I'd designed the cache layer first, the search would have been fast from the start.
2. Intent parser before building the search UI.
I built the search UI for a simple text match, then had to refactor it when I realized literal string matching was useless for natural language queries.
3. Capacitor from day one.
I built the entire web app first, then added Capacitor. It worked, but there were 2-3 days of fixing things that broke in the native context (audio, storage, network detection). Designing with Capacitor in mind from the start would have been cleaner.
4. SSE instead of polling for playlist building.
I used polling for the first version of the AI playlist builder (setInterval every 2 seconds checking if it was done). SSE was a 2-hour rewrite that made the experience 10x better.
## Performance Numbers

- Search (cold, intent-matched): ~600ms (was 10,000ms+)
- Search (Redis cache hit): ~5ms
- Search (client cache hit): ~0ms
- AI playlist (100 songs): ~18s (streamed, first results at ~3s)
- SoulLink room creation: ~120ms
- Page load (Vercel edge): ~800ms
- APK cold start: ~1.2s
## What's Next
- [ ] Play Store submission (APK is signed and ready)
- [ ] Lyrics sync display
- [ ] iOS build (Capacitor makes this straightforward)
- [ ] Audio visualizer on the full player
- [ ] Push notifications for new releases from followed artists
## Try It

Web app: soul-sync-beta.vercel.app

Android APK (best experience, direct install): SoulSync.apk

Star on GitHub: github.com/itslokeshx/SoulSync
## The Takeaway
The best product you'll build isn't the one with the best market research.
It's the one where you woke up the next morning still annoyed.
That's your signal.
Questions about any part of the architecture? Drop them in the comments; happy to go deep on anything.