TAMSIV

Posted on Mar 16

I Built a Voice-Powered Task Manager with AI in 650 Commits — Here's What I Learned

#react #ai #productivity #android

Five months ago, I had a simple problem: my family of four couldn't keep track of groceries, appointments, and who's picking up the kids. Post-it notes on the fridge weren't cutting it.

So I built TAMSIV — a voice-powered task and memo manager with AI. You speak, the AI understands, and everything gets organized automatically. No typing, no tapping through forms. Just talk.

650 commits later, here's what I learned building it solo.

The Stack

Frontend: React Native 0.81 + TypeScript (New Architecture enabled)
Backend: Node.js/Express with WebSocket for real-time voice streaming
Database: Supabase (PostgreSQL) with Row Level Security
AI: OpenRouter (400+ LLMs), Deepgram (STT), OpenAI (TTS)
Hosting: Railway (backend) + Vercel (website)

The Voice Pipeline

This is the core of TAMSIV. The flow looks like this:

User speaks → PCM audio chunks via WebSocket
  → Deepgram (real-time STT with VAD)
  → LLM via OpenRouter (context-aware)
  → Function calling (create_task, create_memo, create_event)
  → OpenAI TTS → Audio response back to user

The tricky part? Voice Activity Detection (VAD). You need to know when the user has stopped talking before you hand the transcript to the LLM. Too early and you cut them off. Too late and the app feels sluggish.

I ended up with a dual-mode system:

Push-to-talk: User controls when they start/stop. More reliable.
VAD mode: Deepgram detects silence automatically. More natural but harder to get right.
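
To make the "too early vs. too late" trade-off concrete, here is a minimal sketch of a silence-timeout endpointer. It is not TAMSIV's code and not Deepgram's API: the `Endpointer` class, the per-frame `isSpeech` flag, and the 800 ms default are all assumptions chosen for illustration.

```typescript
// Hypothetical sketch: decide when a spoken turn has ended.
// In the real pipeline the STT provider does this detection.
type Mode = "push-to-talk" | "vad";

class Endpointer {
  private readonly mode: Mode;
  private readonly silenceMs: number;
  private lastSpeechAt: number | null = null;

  constructor(mode: Mode, silenceMs: number = 800) {
    this.mode = mode;
    this.silenceMs = silenceMs;
  }

  // Feed one audio frame; returns true when the turn should be finalized.
  onFrame(isSpeech: boolean, nowMs: number): boolean {
    if (this.mode === "push-to-talk") return false; // only the button ends a turn
    if (isSpeech) {
      this.lastSpeechAt = nowMs;
      return false;
    }
    if (this.lastSpeechAt === null) return false; // silence before any speech
    if (nowMs - this.lastSpeechAt >= this.silenceMs) {
      this.lastSpeechAt = null; // reset for the next utterance
      return true;
    }
    return false;
  }

  // Push-to-talk: releasing the button always ends the turn.
  onButtonRelease(): boolean {
    return this.mode === "push-to-talk";
  }
}
```

A lower `silenceMs` makes the app feel snappier but cuts off people who pause mid-sentence; a higher one does the opposite. That single number is most of what "harder to get right" means here.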

What the AI Actually Does

The LLM doesn't just transcribe — it understands intent. Say:

"Remind me to buy milk tomorrow at 5pm, it's urgent"

The AI extracts:

Type: Task
Title: "Buy milk"
Due date: Tomorrow 5pm
Priority: High
Category: Auto-detected based on context

It uses function calling to create structured data, not free text. The backend returns a function_result message, and the frontend handles the actual database operations.
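
As a rough sketch of what that can look like: a tool definition in the OpenAI-compatible `tools` format, plus a small validator that turns the model's raw JSON arguments into a `function_result` message. The parameter names (`title`, `dueDate`, `priority`) and the message shape are assumptions for illustration, not TAMSIV's actual schema.

```typescript
// Tool definition in the OpenAI-compatible "tools" format.
// Parameter names are illustrative assumptions.
const createTaskTool = {
  type: "function",
  function: {
    name: "create_task",
    description: "Create a task from the user's spoken request",
    parameters: {
      type: "object",
      properties: {
        title: { type: "string" },
        dueDate: { type: "string", description: "ISO 8601 date-time" },
        priority: { type: "string", enum: ["low", "normal", "high"] },
      },
      required: ["title"],
    },
  },
} as const;

type Priority = "low" | "normal" | "high";

interface TaskArgs {
  title: string;
  dueDate?: string;
  priority: Priority;
}

// Hypothetical shape of the message the backend sends to the app.
interface FunctionResult {
  type: "function_result";
  name: "create_task";
  args: TaskArgs;
}

// Validate the model's JSON arguments instead of trusting them blindly.
function parseCreateTask(rawArguments: string): FunctionResult {
  const data = JSON.parse(rawArguments) as Record<string, unknown>;
  const title = data["title"];
  const dueDate = data["dueDate"];
  const p = data["priority"];
  if (typeof title !== "string" || title.trim() === "") {
    throw new Error("create_task: title is required");
  }
  // Unknown priorities fall back to "normal" rather than failing the request.
  const priority: Priority = p === "low" ? "low" : p === "high" ? "high" : "normal";
  const args: TaskArgs = { title: title.trim(), priority };
  if (typeof dueDate === "string") args.dueDate = dueDate;
  return { type: "function_result", name: "create_task", args };
}
```

The point of validating on the backend is that the model's output is still just text: it can omit a required field or invent a priority level, and the app should never see that.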

The Hardest Bugs

1. LLMs Can't Do Dates

Ask an LLM "what's next Tuesday?" and, in my experience, you'll get a wrong answer about 30% of the time. The model has no clock, so it has to guess both today's date and the weekday arithmetic. My fix: inject a date lookup table into the system prompt with the current date, day of week, and the next 14 days mapped out. Accuracy went from ~70% to ~99%.
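
A lookup table like that can be generated in a few lines. This is a minimal sketch, not the app's actual prompt code: the function name `buildDateTable` and the line format are assumptions, and it uses UTC for simplicity where a real app would probably use the user's time zone.

```typescript
const WEEKDAYS = [
  "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday",
];

// Build one line per day, from today through `days` days ahead,
// so the model can look dates up instead of computing them.
function buildDateTable(today: Date, days: number = 14): string {
  const lines: string[] = [];
  for (let offset = 0; offset <= days; offset++) {
    // Date.UTC normalizes day overflow, so month and year rollovers are handled.
    const d = new Date(
      Date.UTC(today.getUTCFullYear(), today.getUTCMonth(), today.getUTCDate() + offset)
    );
    const iso = d.toISOString().slice(0, 10);
    const label =
      offset === 0 ? "today" : offset === 1 ? "tomorrow" : `in ${offset} days`;
    lines.push(`${iso} = ${WEEKDAYS[d.getUTCDay()]} (${label})`);
  }
  return lines.join("\n");
}

// Example: prepend the table to the system prompt.
const systemPrompt =
  "Use this table to resolve relative dates:\n" +
  buildDateTable(new Date("2026-03-16T12:00:00Z"));
```

For that example date, the table starts with `2026-03-16 = Monday (today)` and `2026-03-17 = Tuesday (tomorrow)`, so "next Tuesday" becomes a lookup rather than a calculation.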

2. The Singleton Cache Leak

After logout, the next user would see the previous user's data. The singleton services (ConversationService, ContentCacheService) outlive the session, so they kept the previous user's data in memory. Fix: clear every singleton's in-memory state whenever the auth state changes.
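
One way to make that cleanup hard to forget is a small registry that every singleton signs up with. This is a sketch under assumptions: `SessionScope`, `Resettable`, and the `ContentCache` stand-in are hypothetical names, not the app's real services.

```typescript
// Anything holding per-user state implements reset().
interface Resettable {
  reset(): void;
}

class SessionScope {
  private services: Resettable[] = [];
  private currentUserId: string | null = null;

  register(service: Resettable): void {
    this.services.push(service);
  }

  // Call from the auth listener with the new user id (null on logout).
  onAuthChange(userId: string | null): void {
    if (userId === this.currentUserId) return; // same user, e.g. a token refresh
    this.currentUserId = userId;
    for (const service of this.services) service.reset();
  }
}

// Stand-in for a singleton such as a content cache.
class ContentCache implements Resettable {
  private items = new Map<string, string>();

  set(key: string, value: string): void {
    this.items.set(key, value);
  }

  get(key: string): string | undefined {
    return this.items.get(key);
  }

  reset(): void {
    this.items.clear();
  }
}
```

With Supabase, `onAuthChange` would typically be called from the `supabase.auth.onAuthStateChange` callback, passing the session's user id or `null`. Comparing against the previous user id avoids wiping the caches on every token refresh.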

3. Six Hooks DDoSing Supabase

Six different React hooks were all calling getGroups() independently on mount. That's 6 concurrent identical requests per screen load. Fix: centralize in a GroupsContext with a single fetch.
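
The idea underneath a shared context is that every caller awaits the same promise. Here is that idea as a framework-free sketch. It is not the GroupsContext itself: `SharedLoader` is a hypothetical helper that a context provider could wrap, and the fake fetcher stands in for getGroups().

```typescript
// Shares one fetch among all callers until invalidate() is called.
class SharedLoader<T> {
  private promise: Promise<T> | null = null;
  private readonly fetcher: () => Promise<T>;

  constructor(fetcher: () => Promise<T>) {
    this.fetcher = fetcher;
  }

  load(): Promise<T> {
    if (this.promise === null) {
      this.promise = this.fetcher().catch((err: unknown) => {
        this.promise = null; // let the next caller retry after a failure
        throw err;
      });
    }
    return this.promise;
  }

  // Call after a mutation (e.g. a group was created) to force a refetch.
  invalidate(): void {
    this.promise = null;
  }
}

// Usage: six "hooks" mounting at once trigger a single request.
let requestCount = 0;
const groupsLoader = new SharedLoader<string[]>(() => {
  requestCount += 1;
  return Promise.resolve(["Family"]);
});
const pending = [1, 2, 3, 4, 5, 6].map(() => groupsLoader.load());
```

All six entries in `pending` are the same promise, so the fetcher runs once. In React, the provider would call `load()` once and expose the result through context, which also gives every consumer the same re-render when the data changes.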

Features That Surprised Me

Gamification works. I added levels, badges, streaks, and daily challenges almost as an afterthought. Turns out, getting a "5-day streak!" notification makes you actually want to use the app. 12 levels, 5 database tables, full schema — overkill? Maybe. But engagement went up.

Hierarchical folders are underrated. Most task apps give you flat lists or one level of categories. TAMSIV supports unlimited depth: Work > Project Alpha > Sprint 3 > Backend. It sounds simple, but the recursive navigation was a real challenge in React Native.
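
For a sense of what unlimited depth involves, here is a minimal sketch that builds a breadcrumb from flat rows. It assumes folders are stored with a `parentId` pointer (a common layout in PostgreSQL, but an assumption about TAMSIV's schema), and `folderPath` is a hypothetical helper.

```typescript
interface Folder {
  id: string;
  name: string;
  parentId: string | null; // null marks a root folder
}

// Walk up the parent chain and return "Root > Child > Grandchild".
function folderPath(folders: Folder[], id: string): string {
  const byId = new Map<string, Folder>();
  for (const folder of folders) byId.set(folder.id, folder);

  const names: string[] = [];
  const seen = new Set<string>(); // guards against cycles in bad data
  let current = byId.get(id);
  while (current !== undefined && !seen.has(current.id)) {
    seen.add(current.id);
    names.unshift(current.name);
    current = current.parentId === null ? undefined : byId.get(current.parentId);
  }
  return names.join(" > ");
}

const folders: Folder[] = [
  { id: "work", name: "Work", parentId: null },
  { id: "alpha", name: "Project Alpha", parentId: "work" },
  { id: "s3", name: "Sprint 3", parentId: "alpha" },
  { id: "be", name: "Backend", parentId: "s3" },
];
const breadcrumb = folderPath(folders, "be");
```

The loop is iterative rather than recursive, so depth is bounded only by the data, and the `seen` set keeps a corrupted parent pointer from hanging the UI.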

AI-generated images add personality. I integrated Runware (FLUX.1) for automatic image generation on tasks and memos. But the real trick: an LLM first analyzes your content and generates an optimized prompt before sending it to the image model. Way better results than raw text.

The Numbers

650+ commits in 5 months (solo dev)
6 languages (FR, EN, DE, ES, IT, PT)
400+ AI models available via OpenRouter
12 gamification levels
Real-time collaboration with groups, assignments, comments, reactions
3 voice modes: Standard, Live Streaming, Push-to-talk

What's Next

TAMSIV is currently in closed alpha on Google Play with 12 testers. Production launch is planned for late March 2026.

If you're interested in:

Testing the app (Android)
The technical architecture
How I handle voice streaming in React Native
Supabase RLS patterns for multi-tenant apps

Drop a comment or check out tamsiv.com.

Building in public, one commit at a time. Follow along for more technical deep dives.

DEV Community