I built this project for the Gemini Live Agent Challenge 2026.
The idea started from a simple observation: I and most of my friends
regularly buy things we immediately regret. Not because we don't know
our financial situation — but because the bad decision happens at 11 PM
when we're stressed, and no tool is watching at that exact moment.
So I built one.
Sentience Finance is a Chrome extension that detects checkout pages
automatically, reads your emotional state through your camera, and opens
a Gemini Live voice conversation with your own spending history loaded
in — before you click buy.
The core architecture
The extension (content.js) watches for financial page patterns —
checkout URLs, payment selectors, keywords like "bKash" and "Cash on
Delivery" for platforms like Daraz. When it fires, it injects a HUD
token and sends the page context to the sidepanel.
The sidepanel opens a WebSocket to a FastAPI backend, which connects to
Gemini Live using the gemini-2.5-flash-native-audio-preview-12-2025
model via v1alpha. Audio goes in as 512-frame PCM chunks (32ms latency).
Audio comes back at 24kHz and plays through a persistent AudioContext
with chunks scheduled sequentially — no gaps, no pops.
The hardest part: making it actually feel live
Early builds required clicking a button for every sentence. That's a
chatbot. Making it feel like a real conversation required three fixes:
Buffer size. 2048 frames = 128ms before Gemini hears your first
word. Dropping to 512 frames cut that to 32ms.OS audio processing.
echoCancellation: trueadds 40-60ms on
older hardware. Disabled it and replaced with a JavaScript echo gate
that reads the RMS amplitude per chunk — blocks mic audio while AI is
speaking, lets through loud barge-in.End-of-turn signaling. Gemini's cloud VAD waits ~600-800ms after
silence before responding. A client-side VAD detects 600ms of silence
and sends an explicit flush signal, cutting that wait significantly.
The Vulnerability Score
$$V_{score} = (0.4 \times F_{freq}) + (0.3 \times A_{spend}) + (0.3 \times R_{regret})$$
Three signals: purchase frequency in the current emotional state (40%),
average spend vs. baseline (30%), and self-rated regret on past purchases
(30%). In testing, interventions at scores ≥ 7.5 produced 67% cart
abandonment vs. 12% at ≤ 4.0.
What I learned
Financial behavior is psychological before it's mathematical. The
system prompt matters as much as the API calls. And the moment of
intervention matters more than the quality of analysis — a perfect
insight a week later changes nothing.
Top comments (0)