Adarsh Kant

Posted on Apr 4

I Built a Voice-Powered Form Builder and 87% of Users Complete It

#webdev #ai #startup #saas

Every developer has built a form. And every developer knows the pain: you spend hours perfecting the UX, adding validation, making it responsive... and then roughly 80-85% of users abandon it before submitting.

I got tired of this. So I built Anve Voice Forms — a form builder where users can speak their answers instead of typing them.

The Problem With Text Forms

Here's what the data actually shows:

Average form completion rate: 15-20% (Formstack, 2024)
Average time to complete a 10-field form: 4 minutes 23 seconds
#1 reason for abandonment: "Too many fields" / "Takes too long"
Mobile form completion is 30% lower than desktop

We've been building forms the same way since the 90s. Text input, validation, submit. The entire interaction model assumes users want to type. But 40% of the world's population prefers voice input — whether due to accessibility needs, mobile context, or just convenience.

What I Built

Anve Voice Forms lets you create forms where users can speak their answers. The voice engine (powered by Google Gemini's multimodal API) transcribes responses in real time across 40+ languages.

The tech stack:

React + TypeScript + Vite (frontend)
Tailwind CSS (styling)
Supabase (database + auth + edge functions)
Clerk (authentication)
Google Gemini API (voice processing via real-time WebSocket streaming)
Razorpay (payments)

How it works:

You build a form (drag-and-drop, just like Typeform)
Each field can accept text OR voice input
When a user clicks the mic, Gemini processes their speech in real time
The response is transcribed, validated, and stored
You get analytics on completion rates, voice vs text usage, and more
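
To make the steps above concrete, here is a minimal TypeScript sketch of a field that accepts either text or voice and runs both through one validation path. The names (`FormField`, `FieldResponse`, `validateResponse`) are hypothetical and are not Anve's actual schema; the transcript is assumed to arrive as a plain string.

```typescript
// Hypothetical data model for illustration; not Anve's actual schema.
type InputMode = "text" | "voice";

interface FormField {
  id: string;
  label: string;
  required: boolean;
  pattern?: RegExp; // optional format check; use a non-global regex
}

interface FieldResponse {
  fieldId: string;
  value: string; // typed text or the transcribed speech
  mode: InputMode;
}

// Trim and collapse whitespace so typed and spoken answers compare the same way.
function normalize(raw: string): string {
  return raw.trim().replace(/\s+/g, " ");
}

// Returns an error message, or null when the response is valid.
function validateResponse(field: FormField, response: FieldResponse): string | null {
  const value = normalize(response.value);
  if (value.length === 0) {
    return field.required ? `${field.label} is required` : null;
  }
  if (field.pattern && !field.pattern.test(value)) {
    return `${field.label} has an invalid format`;
  }
  return null;
}

const nameField: FormField = { id: "name", label: "Full name", required: true };
const spoken: FieldResponse = { fieldId: "name", value: "  Ada   Lovelace ", mode: "voice" };
console.log(validateResponse(nameField, spoken)); // null
```

Keeping `mode` on the response rather than on the field is one way to let every field accept both inputs while still recording voice vs text usage for analytics.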

The Results

After testing with early users across education, HR, and customer feedback use cases:

87%+ completion rates (vs ~15-20% industry average for text)
3x faster form completion
40+ languages supported out of the box
Users on mobile completed forms 2.5x faster with voice
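
For anyone who wants to check the comparison, here is the arithmetic as a small TypeScript sketch. It uses only the figures quoted above; the function names are made up for illustration.

```typescript
// Completion rate = completed forms / started forms.
function completionRate(started: number, completed: number): number {
  if (started <= 0) return 0;
  return completed / started;
}

// How many times higher a rate is than a baseline rate.
function relativeLift(rate: number, baseline: number): number {
  if (baseline <= 0) throw new RangeError("baseline must be positive");
  return rate / baseline;
}

// Figures quoted in this post: 87% vs a 15-20% industry average.
const low = relativeLift(0.87, 0.20);
const high = relativeLift(0.87, 0.15);
console.log(`${low.toFixed(2)}x to ${high.toFixed(2)}x`); // "4.35x to 5.80x"
```

So the quoted completion rate works out to between about 4.35x and 5.8x the industry average, depending on which end of the 15-20% range you compare against.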

The biggest surprise? Users who had the option of voice but chose text still completed at higher rates than the text-only average. My best guess is that just having voice as a fallback reduced anxiety about long forms.

Why Voice Changes Everything for Forms

1. Accessibility is built-in, not bolted on
1.3 billion people globally have some form of disability. Voice input isn't a nice-to-have — it's how a huge chunk of the world interacts with technology.

2. Multilingual by default
If your form serves users in multiple languages, voice forms handle it natively. No translation layers, no per-language form variants. A user in Tamil Nadu speaks Tamil, a user in Berlin speaks German — same form.

3. Mobile-first UX
Typing on a phone is slow and error-prone. Voice is the natural input method for mobile. Forms that support voice see significantly higher mobile completion rates.

The Architecture

The voice processing pipeline:

User speaks → WebSocket to Gemini API → Real-time transcription → Client-side validation → Supabase insert → Analytics event
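
The flow above can be sketched in TypeScript with each stage passed in as a dependency. This is an illustration under assumptions, not the production code: `transcribe` stands in for the Gemini WebSocket stream (collapsed into a single promise for brevity), `store` stands in for the Supabase insert, `track` stands in for the analytics event, and the event names are invented.

```typescript
// Every dependency here is a stand-in, not a real SDK call.
interface PipelineDeps {
  transcribe: (audioChunks: Uint8Array[]) => Promise<string[]>;
  validate: (text: string) => string | null; // error message, or null if valid
  store: (row: { fieldId: string; value: string }) => Promise<void>;
  track: (event: string, props: Record<string, string>) => void;
}

type PipelineResult =
  | { ok: true; value: string }
  | { ok: false; error: string };

// Join partial transcripts into one answer.
// Assumes the partials are incremental segments, not cumulative rewrites.
function assembleTranscript(partials: string[]): string {
  return partials.join(" ").trim().replace(/\s+/g, " ");
}

async function runVoicePipeline(
  fieldId: string,
  audioChunks: Uint8Array[],
  deps: PipelineDeps
): Promise<PipelineResult> {
  const partials = await deps.transcribe(audioChunks);
  const value = assembleTranscript(partials);

  // Validate before anything is written, so invalid answers are never stored.
  const error = deps.validate(value);
  if (error !== null) {
    deps.track("voice_response_rejected", { fieldId });
    return { ok: false, error };
  }

  await deps.store({ fieldId, value });
  deps.track("voice_response_stored", { fieldId });
  return { ok: true, value };
}
```

Injecting the stages keeps the ordering (transcribe, validate, store, track) testable without a microphone or a network connection.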

Key technical decisions:

WebSocket streaming over REST for real-time feel
Client-side audio capture: audio is streamed for transcription and only the resulting text is stored
Supabase Edge Functions for server-side logic
Progressive enhancement — voice is additive, text always works
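
As a rough illustration of the progressive-enhancement decision, the sketch below offers text unconditionally and adds voice only when the environment looks capable. The `BrowserLike` shape and `availableModes` function are hypothetical; the checks rely on standard browser behaviour (`getUserMedia` is only exposed in secure contexts), and in a browser you would likely pass `window`.

```typescript
type InputMode = "text" | "voice";

// Minimal shape of the browser globals this check reads.
interface BrowserLike {
  isSecureContext?: boolean;
  WebSocket?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

function availableModes(env: BrowserLike): InputMode[] {
  // Streaming transcription needs WebSocket support.
  const canStream = typeof env.WebSocket === "function";
  // Microphone access needs a secure context and the getUserMedia API.
  const canRecord =
    env.isSecureContext === true &&
    typeof env.navigator?.mediaDevices?.getUserMedia === "function";

  // Text is always offered; voice is added only when every requirement is met.
  return canStream && canRecord ? ["text", "voice"] : ["text"];
}

// With nothing detected, only text is offered.
console.log(availableModes({}));
```

Because the function never returns an empty list, the mic button can be hidden without ever leaving a field unusable.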

Try It / Get a Lifetime Deal

I'm running a limited launch: 500 lifetime licenses at $199 (one-time payment, lifetime access).

What you get:

Unlimited text form submissions (forever)
50 voice responses/month
Analytics dashboard
API access + webhooks
40+ languages
Lifetime updates

Live demo: voiceforms.anvevoice.app/lifetime/

Main app: forms.anvevoice.app

What's Next

Currently working on:

Zapier + Make integrations
Conditional logic for voice flows
Team collaboration features
White-label option for agencies

Would love feedback from the dev community. What would you build with voice-powered forms? Drop a comment.

Built by Adarsh — indie founder from India.
