<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Lana</title>
    <description>The latest articles on DEV Community by Lana (@lem36).</description>
    <link>https://dev.to/lem36</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1298781%2F196244a6-7268-42a1-8aa5-848929a65da7.png</url>
      <title>DEV Community: Lana</title>
      <link>https://dev.to/lem36</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/lem36"/>
    <language>en</language>
    <item>
      <title>I Built an AI That Whispers Interview Answers in Real Time</title>
      <dc:creator>Lana</dc:creator>
      <pubDate>Fri, 10 Apr 2026 09:58:14 +0000</pubDate>
      <link>https://dev.to/lem36/i-built-an-ai-that-whispers-interview-answers-in-real-time-e43</link>
      <guid>https://dev.to/lem36/i-built-an-ai-that-whispers-interview-answers-in-real-time-e43</guid>
      <description>&lt;p&gt;You know that moment in a job interview when someone asks you a question and your brain just... empties? You have prepared for hours. You know the answer. But under pressure, the words vanish.&lt;/p&gt;

&lt;p&gt;I built TalkBuoy to fix that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What It Actually Does&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TalkBuoy is a PWA that sits on your phone or tablet during an interview, presentation, or any high-pressure conversation. It listens to what the other person says, processes it through AI, and silently displays suggested talking points on your screen.&lt;/p&gt;

&lt;p&gt;You glance down, see a structured response with key points you should hit, and weave them into your own words. The other person never knows.&lt;/p&gt;

&lt;p&gt;Think of it as a teleprompter, but one that writes itself in real time based on what is actually being discussed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How It Works Under the Hood&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The app captures audio through the device microphone and runs it through speech recognition to convert what the interviewer says into text. That text gets sent to an AI model that understands the context of the conversation and generates relevant talking points.&lt;/p&gt;
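The capture-to-suggestion pipeline described above could be wired up roughly like this in a browser. Everything here is a hypothetical sketch: the `/api/suggest` endpoint, the `buildPrompt` helper, and the prompt wording are invented for illustration, since TalkBuoy's internals are not public.

```javascript
// Hypothetical sketch of the capture -> transcribe -> suggest loop.
// buildPrompt() and /api/suggest are invented names, not TalkBuoy's real API.
function buildPrompt(latestQuestion, history) {
  return [
    { role: 'system', content: 'Suggest 3-5 short talking points for the candidate.' },
    ...history,
    { role: 'user', content: latestQuestion },
  ];
}

function startListening(onSuggestions) {
  // Browser-only: Web Speech API (still prefixed in Chromium).
  const SR = window.SpeechRecognition || window.webkitSpeechRecognition;
  const rec = new SR();
  rec.continuous = true;
  rec.interimResults = false;
  rec.onresult = async (e) => {
    // Take the latest final transcript segment.
    const text = e.results[e.results.length - 1][0].transcript;
    const reply = await fetch('/api/suggest', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ messages: buildPrompt(text, []) }),
    }).then((r) => r.json());
    onSuggestions(reply.points);
  };
  rec.start();
}
```
A production version would stream partial transcripts and reuse the conversation history instead of passing an empty array.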

&lt;p&gt;&lt;strong&gt;The key technical challenges were:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Voice filtering. You do not want the app responding to your own voice. TalkBuoy distinguishes between the user and the other speaker, filtering out your speech so suggestions only respond to the interviewer's questions.&lt;/p&gt;

&lt;p&gt;Speed. In a real conversation, you have maybe 2 to 3 seconds of natural pause before silence gets awkward. The entire pipeline from audio capture to displayed suggestion needs to complete in that window. Every millisecond matters.&lt;/p&gt;

&lt;p&gt;Smart auto-mute. While you are reading the suggestions, the microphone needs to behave. TalkBuoy silences the mic during reading so the other person never hears anything unusual from your end.&lt;/p&gt;
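In browser terms, the auto-mute trick likely comes down to toggling `enabled` on the microphone's audio tracks, which sends silence without tearing down the capture pipeline. This is a minimal sketch, not TalkBuoy's actual code; it works on any object exposing `getAudioTracks()`.

```javascript
// Sketch of auto-mute: disable the outgoing mic track while the user is
// reading suggestions, re-enable when they start speaking again.
// Accepts any stream-like object with getAudioTracks(), so it can be
// exercised without a real MediaStream.
function setMicEnabled(stream, enabled) {
  for (const track of stream.getAudioTracks()) {
    track.enabled = enabled; // disabled tracks transmit silence, capture continues
  }
  return stream;
}

// In the browser you would pass the stream from getUserMedia:
// const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
// setMicEnabled(stream, false); // mute while reading suggestions
```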

&lt;p&gt;Context awareness. The AI does not just answer the current question in isolation. It maintains context from the entire conversation so suggestions build on what has already been discussed. If you mentioned a project earlier, and the interviewer follows up on it, the AI references your earlier points.&lt;/p&gt;
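One plausible way to maintain context without blowing the model's input limit is to keep the full history but trim the oldest turns once a rough token budget is exceeded. The heuristic below (about 4 characters per token) is an assumption for illustration, not a real tokenizer.

```javascript
// Crude token estimate: ~4 characters per token. Good enough for budgeting,
// not for billing.
function approxTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep as many recent turns as fit in the budget; the newest turns
// always survive because we walk the history newest-first.
function trimHistory(history, budget = 3000) {
  const kept = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = approxTokens(history[i].content);
    if (used + cost > budget) break;
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```
With this in place, a follow-up question about an earlier project still carries the earlier exchange, as long as it fits the budget.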

&lt;p&gt;&lt;strong&gt;The Use Cases That Surprised Me&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I originally built this for job interviews. That is still the primary use case, and it handles behavioral questions, technical questions, and panel interviews well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But users started using it for things I did not expect:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Public speaking Q&amp;amp;A.&lt;/strong&gt; Speakers use it during the question period after presentations. You can nail a prepared talk but freeze when someone asks something unexpected. TalkBuoy fills that gap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sales calls.&lt;/strong&gt; Sales reps use it to handle objections in real time. When a prospect raises a concern, suggested responses appear instantly with relevant data points and rebuttals.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Language practice.&lt;/strong&gt; People learning a second language use it as a conversation partner. It helps them formulate responses in the target language when they get stuck during actual conversations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Customer service.&lt;/strong&gt; Support reps dealing with complex product questions get suggested answers without putting the customer on hold to search a knowledge base.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Podcast interviews.&lt;/strong&gt; Hosts use it when interviewing guests on unfamiliar topics. It suggests follow-up questions based on what the guest just said, leading to better, more natural conversations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Ethics Question&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every time I show this to someone, the first reaction is either "that is genius" or "that is cheating." Fair enough, let me address it.&lt;/p&gt;

&lt;p&gt;TalkBuoy does not put words in your mouth. It suggests talking points. You still need to understand the material, speak naturally, and think on your feet. If you have zero knowledge of a topic, a list of bullet points in your ear will not save you. The interviewer will hear the difference.&lt;/p&gt;

&lt;p&gt;What it does is prevent the specific failure mode where you know the answer but cannot access it under pressure. That is a performance anxiety problem, not a knowledge problem. Most people who use it say they stop needing it after a few interviews because the practice with the safety net builds genuine confidence.&lt;/p&gt;

&lt;p&gt;It is the same principle as training wheels. You use them until you do not need them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Privacy&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This was non-negotiable from day one. No audio is stored. No conversations are saved to servers. Everything is processed and discarded. There is no recording, no transcript history, no data mining.&lt;/p&gt;

&lt;p&gt;The app works as a PWA, which means it runs in the browser with no app store installation required. It works on phones, tablets, and laptops. Place your device on the table or prop it up where you can glance at it naturally.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I Learned Building It&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Latency is everything in real-time AI. A suggestion that arrives 5 seconds late is useless. I spent more time optimizing the pipeline speed than on any other feature.&lt;/p&gt;

&lt;p&gt;People do not read paragraphs under pressure. Early versions generated full paragraph responses. Nobody could process that while maintaining eye contact and a conversation. Bullet points with 5 to 8 words each turned out to be the right format.&lt;/p&gt;

&lt;p&gt;The voice filtering problem is harder than it sounds. Distinguishing between two voices in the same room, especially through a phone microphone with room echo, required a lot of iteration. Getting this wrong means the app starts responding to your own answers, which creates a bizarre feedback loop.&lt;/p&gt;

&lt;p&gt;PWA was the right choice. No app store approval process, no platform restrictions, instant updates. Users just visit the URL and it works. For a tool that people might need on short notice before an interview, eliminating friction was critical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try It&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="//talkbuoy.com"&gt;TalkBuoy&lt;/a&gt; is live and free to try. If you have an interview coming up, a presentation to give, or you just want to see how real-time AI coaching feels, give it a shot.&lt;/p&gt;

&lt;p&gt;Would love to hear what you think, and if you have ideas for other use cases I have not considered.&lt;/p&gt;

&lt;p&gt;What is the worst interview freeze moment you have experienced? Drop it in the comments.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>career</category>
    </item>
    <item>
      <title>How I Built a Free AI Voice Separator for Podcasts and Interviews</title>
      <dc:creator>Lana</dc:creator>
      <pubDate>Wed, 18 Mar 2026 08:59:41 +0000</pubDate>
      <link>https://dev.to/lem36/how-i-built-a-free-ai-voice-separator-for-podcasts-and-interviews-16b9</link>
      <guid>https://dev.to/lem36/how-i-built-a-free-ai-voice-separator-for-podcasts-and-interviews-16b9</guid>
      <description>&lt;p&gt;My first Dev.to post about building &lt;a href="https://www.toolsonfire.com" rel="noopener noreferrer"&gt;ToolsOnFire&lt;/a&gt; covered the broad overview. This time I want to go deep on one specific tool: the &lt;a href="https://www.toolsonfire.com/en/voice-separator" rel="noopener noreferrer"&gt;Voice Separator&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;*&lt;em&gt;The Problem *&lt;/em&gt;                                                                                                                                                           &lt;/p&gt;

&lt;p&gt;I kept seeing the same requests in podcasting and journalism communities: "I recorded an interview and need to edit just one speaker's audio" or "I need a transcript that shows who said what."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk0iwrxx88yg2af3byeyg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk0iwrxx88yg2af3byeyg.png" alt=" " width="800" height="583"&gt;&lt;/a&gt;&lt;br&gt;
                                                                                                                                                                           The existing options were either expensive (Descript at $24/month), required desktop software, or didn't actually separate the audio - they just labelled who spoke when.&lt;br&gt;
                                                                                                                                                                           I  wanted to build something that:                                                                                                                                        1. Identifies each speaker in a recording&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Creates separate downloadable audio files per speaker&lt;/li&gt;
&lt;li&gt;Allows users to separate background music into a separate file
&lt;/li&gt;
&lt;li&gt;Produces a timestamped transcript with speaker labels&lt;/li&gt;
&lt;li&gt;Is free to try without creating an account
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The Challenges&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Transcript accuracy is never perfect. This was the biggest reality check. No matter which AI model you use, transcripts will have errors - especially with accents, technical jargon, mumbling, or background noise. I spent a long time chasing 100% accuracy before accepting that even professional human transcribers don't achieve that. The goal became "accurate enough to be useful" rather than perfect.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9371ggtxli4is0p0jz3m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9371ggtxli4is0p0jz3m.png" alt=" " width="800" height="583"&gt;&lt;/a&gt;                                                                                           &lt;/p&gt;

&lt;p&gt;Speaker misidentification. The AI sometimes assigns the wrong speaker label to short utterances, especially when speakers have similar voices or one person only says a few words. I had to build post-processing logic to smooth out these errors - grouping nearby utterances and correcting obvious misattributions.&lt;/p&gt;

&lt;p&gt;Overlapping speech is the hardest problem. When two people talk over each other, basic diarization falls apart. My free tier handles this reasonably well for brief interruptions, but I built a premium tier with a more advanced pipeline specifically for recordings with heavy crosstalk - panel discussions, heated interviews, group meetings.&lt;/p&gt;
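The smoothing step for misattributed labels can be illustrated with a simple rule: if a very short utterance is sandwiched between two utterances from the same other speaker, reassign it. The word-count threshold and data shape below are invented for illustration; the production logic is more involved.

```javascript
// Illustrative label smoothing: reassign short sandwiched fragments.
// `utterances` is assumed to be [{ speaker, text }, ...] in time order.
function smoothSpeakerLabels(utterances, maxWords = 3) {
  const out = utterances.map((u) => ({ ...u })); // don't mutate the input
  for (let i = 1; i < out.length - 1; i++) {
    const prev = out[i - 1];
    const cur = out[i];
    const next = out[i + 1];
    const isShort = cur.text.trim().split(/\s+/).length <= maxWords;
    if (isShort && prev.speaker === next.speaker && cur.speaker !== prev.speaker) {
      cur.speaker = prev.speaker; // likely a misattributed fragment
    }
  }
  return out;
}
```
A real pipeline would also weigh timing gaps and diarization confidence before overriding a label, since genuine short interjections ("uh huh") do happen.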

&lt;p&gt;Audio quality varies wildly. A studio-recorded podcast processes beautifully. A phone call recorded on speakerphone in a noisy cafe is a completely different challenge. I had to set expectations clearly in the UI and add guidance about what makes a good recording for separation.&lt;/p&gt;

&lt;p&gt;Processing time and user feedback. Some recordings take 30-60 seconds to process. My initial spinning loader felt broken for anything longer than 10 seconds. I replaced it with a simulated progress bar that moves through phases (uploading... processing... generating results...). Users need to see movement even when I have no real progress data from the AI.&lt;/p&gt;
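The phase-based fake progress bar can be reduced to a pure function that maps elapsed time to a phase label and a percentage that keeps creeping but never hits 100% until the backend actually returns. The phase cutoffs and curve shape here are made up for illustration.

```javascript
// Simulated progress: the UI polls this with elapsed seconds and renders
// whatever it returns. No real progress data from the AI is needed.
const PHASES = [
  { until: 5, label: 'Uploading...' },
  { until: 20, label: 'Processing audio...' },
  { until: Infinity, label: 'Generating results...' },
];

function simulatedProgress(elapsedSec, expectedSec = 45) {
  const phase = PHASES.find((p) => elapsedSec < p.until);
  // Asymptotic curve: fast at first, slower later, capped at 95% so the
  // bar only completes when the real result arrives.
  const pct = Math.min(
    95,
    Math.round(100 * (1 - Math.exp(-elapsedSec / (expectedSec / 3))))
  );
  return { label: phase.label, pct };
}
```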

&lt;p&gt;*&lt;em&gt;Cost control *&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The AI APIs cost real money per minute of audio processed. Without proper limits, someone could process hours of audio for free. I built a tiered system with minute-based quotas, prepaid credit packs, and usage tracking.&lt;/p&gt;
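A minute-based quota check like the one described might look like this. The user record fields (`freeMinutesUsed`, `creditMinutes`) and the 5-minute free tier are assumptions for the sketch, not ToolsOnFire's actual schema.

```javascript
// Quota gate run before any audio is sent to the paid AI APIs.
// Spends remaining free-tier minutes first, then prepaid credits.
function canProcess(user, audioMinutes, freeTierMinutes = 5) {
  const freeLeft = Math.max(0, freeTierMinutes - user.freeMinutesUsed);
  const available = freeLeft + user.creditMinutes;
  if (audioMinutes > available) {
    return { allowed: false, reason: 'quota_exceeded', available };
  }
  const fromFree = Math.min(audioMinutes, freeLeft);
  return {
    allowed: true,
    charge: { freeMinutes: fromFree, creditMinutes: audioMinutes - fromFree },
  };
}
```
The returned `charge` split is what you would persist for usage tracking after the job succeeds.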

&lt;p&gt;*&lt;em&gt;What I Learned *&lt;/em&gt; &lt;/p&gt;

&lt;p&gt;Free tiers drive conversions. Letting people try with no account was the right call. Most users try it once and leave, but the ones who find it useful come back and upgrade. If I'd required sign-up from the start, most people would never have tried it.&lt;/p&gt;

&lt;p&gt;Podcasters are the sweet spot. I built this for a broad audience but podcasters are by far the most engaged users. They record regularly, always need to edit individual speakers, and the time savings are immediate. If I were starting over, I'd market specifically to podcasters from day one.&lt;/p&gt;

&lt;p&gt;Managing expectations matters more than improving accuracy. Users who understand the limitations upfront are happy with 90% accuracy. Users who expect perfection are frustrated at 95%. Clear communication about what the tool can and can't do made a bigger difference to satisfaction than any technical improvement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try It&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://www.toolsonfire.com/en/voice-separator" rel="noopener noreferrer"&gt;Voice Separator&lt;/a&gt; is free to try - upload any recording with 2 speakers, up to 5 minutes, no account needed.&lt;/p&gt;

&lt;p&gt;I also built a &lt;a href="https://www.meetingrecordntranscribe.app" rel="noopener noreferrer"&gt;Meeting Recorder, Transcriber and Summarizer&lt;/a&gt; for live recording and transcription, and &lt;a href="https://talkbuoy.com" rel="noopener noreferrer"&gt;Talkbuoy&lt;/a&gt; for AI speech coaching.&lt;/p&gt;

&lt;p&gt;Have you built anything with audio or speech processing? I'd love to hear what challenges you ran into.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>buildinpublic</category>
      <category>nextjs</category>
    </item>
    <item>
      <title>I Built 35+ Free Browser Tools as a Solo Dev - Here's How</title>
      <dc:creator>Lana</dc:creator>
      <pubDate>Tue, 17 Mar 2026 14:20:16 +0000</pubDate>
      <link>https://dev.to/lem36/i-built-35-free-browser-tools-as-a-solo-dev-heres-how-4b5n</link>
      <guid>https://dev.to/lem36/i-built-35-free-browser-tools-as-a-solo-dev-heres-how-4b5n</guid>
      <description>

&lt;p&gt;I've been building &lt;a href="https://www.toolsonfire.com" rel="noopener noreferrer"&gt;ToolsOnFire&lt;/a&gt; - a collection of 35+ free online tools. Video converters, AI voice separation, a sticker creator, markdown converter, and more.&lt;/p&gt;

&lt;p&gt;I'm a solo developer from the UK based in Norway. No team, no funding. Just a frustration with paying subscriptions for simple tasks, so I built free alternatives.  &lt;/p&gt;

&lt;p&gt;*&lt;em&gt;The Stack  *&lt;/em&gt;  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Next.js 14 (App Router) with TypeScript&lt;/li&gt;
&lt;li&gt;Firebase for auth and database&lt;/li&gt;
&lt;li&gt;Stripe for premium tiers&lt;/li&gt;
&lt;li&gt;Vercel for hosting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Key Decision: Process in the Browser&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Most of my tools process files entirely in the user's browser rather than on a server.&lt;/p&gt;

&lt;p&gt;My &lt;a href="https://www.toolsonfire.com/en/video-to-gif-converter" rel="noopener noreferrer"&gt;Video to Gif Converter&lt;/a&gt; and &lt;a href="https://www.toolsonfire.com/en/VideoToMp3Converter" rel="noopener noreferrer"&gt;Video to audio converter&lt;/a&gt; use FFmpeg.wasm - a WebAssembly port of FFmpeg. This means files never leave the user's device. No uploads, no privacy concerns, and I don't pay for server compute.&lt;/p&gt;

&lt;p&gt;The trade-off is browser memory limits. I had to add device detection to cap file sizes on mobile and optimise how large files are handled.&lt;/p&gt;
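The device-detection cap could look something like this. `navigator.deviceMemory` (Chromium only, reported in GB) gives a direct signal when available; otherwise a user-agent check is a crude fallback. The thresholds below are illustrative, not ToolsOnFire's real limits.

```javascript
// Device-aware upload cap for in-browser file processing.
// Pass navigator.deviceMemory and navigator.userAgent from the browser.
function maxUploadBytes(deviceMemoryGb, userAgent) {
  if (typeof deviceMemoryGb === 'number') {
    // Allow roughly 1/16 of reported RAM, hard-capped at 2 GB because
    // FFmpeg.wasm needs headroom for decoded frames.
    return Math.min(2 * 1024 ** 3, (deviceMemoryGb * 1024 ** 3) / 16);
  }
  // Fallback: conservative fixed caps by rough device class.
  const isMobile = /Android|iPhone|iPad/i.test(userAgent || '');
  return (isMobile ? 100 : 500) * 1024 ** 2;
}
```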

&lt;p&gt;*&lt;em&gt;AI-Powered Tools *&lt;/em&gt;                                                                                                                                                      &lt;/p&gt;

&lt;p&gt;Some tools need real AI models:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.toolsonfire.com/en/voice-separator" rel="noopener noreferrer"&gt;Voice Separator and background music extractor&lt;/a&gt; - Splits recordings by speaker using Deepgram and Replicate. Useful for podcasts, interviews, and meetings.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.toolsonfire.com/en/sticker-creator" rel="noopener noreferrer"&gt;Sticker Creator&lt;/a&gt; - Generates custom stickers with DALL-E 3, or removes backgrounds from photos client-side using an ONNX model.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What I've Learned&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You have to try things to see if they work. My top 5 tools drive 90% of traffic, but I wouldn't have known which 5 without building all of them. Some tools I expected to be popular get no visits. Others I built on a whim end up surprising me.&lt;/p&gt;

&lt;p&gt;SEO is the real challenge. Building the tools was the easy part. Organic traffic growth is still slow. If I could go back, I'd start writing content and building backlinks from day one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My Other Projects&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.meetingrecordntranscribe.app" rel="noopener noreferrer"&gt;Meeting Transcriber&lt;/a&gt; - Records and transcribes meetings with automatic speaker identification&lt;/li&gt;
&lt;li&gt;&lt;a href="https://talkbuoy.com" rel="noopener noreferrer"&gt;Speech Assistant Tool&lt;/a&gt; - AI speech coaching with real-time feedback&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I'm always building and experimenting with new tools. If something works, I double down. If it doesn't, I move on.&lt;/p&gt;

&lt;p&gt;Would love to hear from anyone who's grown a tools site from zero - what worked for you?&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>nextjs</category>
      <category>javascript</category>
      <category>buildinpublic</category>
    </item>
  </channel>
</rss>
