DEV Community

Agustin Montoya

Why We Built Bilingual Voice Agents Instead of English-Only

TL;DR

  • Most voice AI is built for English speakers and treats Spanish as an afterthought
  • 500+ million Spanish speakers globally, yet the market gets ignored by US startups
  • Building bilingual from day 1 unlocked a cost arbitrage opportunity from Argentina
  • Voice AI by Triqual handles both languages natively — not as a translation layer

I was on a demo call last month. A small logistics company in Mexico City explained they had tried three different voice AI platforms. All of them worked great in English. In Spanish? One pronounced "José" as "Joe-say" (rhyming with "okay"). Another couldn't handle the customer who switched mid-sentence from Spanish to English. The third just gave up entirely.

They were ready to abandon voice AI completely.

That's when it clicked. Everyone's building voice agents for San Francisco engineers. Meanwhile, there's a massive, underserved market right next door.

Why Bilingual Isn't a Nice-to-Have

Spanish is the second most spoken native language on the planet. 500+ million people. The US alone has 42 million native Spanish speakers — more than Spain. Latin America's digital economy is growing 15% year over year. Mexico has the highest e-commerce growth rate in the Americas.

Yet walk into any YC demo day and count the voice AI startups building for Spanish-first markets. I'll wait.

Here's the uncomfortable truth: most voice AI treats non-English as a translation layer. They run speech-to-text in Spanish, translate to English, process the intent, translate back, then text-to-speech. It's a game of telephone with your customer's experience.
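To make the contrast concrete, here's a toy sketch of the two architectures. Every function below is a stub I invented for illustration; none of this is any vendor's real API. The point is the number of hops, not the implementations:

```python
# Stub "components" so the two pipelines run end to end.
def speech_to_text(audio: str, lang: str) -> str:
    return audio  # pretend the audio is already transcribed text

def translate(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"  # each hop leaves a visible mark

def run_intent(text: str) -> str:
    return f"handled({text})"

def text_to_speech(text: str, lang: str) -> str:
    return text

def translation_layer_agent(audio: str) -> str:
    """The 'game of telephone': two translation hops wrap the model."""
    text_es = speech_to_text(audio, lang="es")
    text_en = translate(text_es, source="es", target="en")
    reply_en = run_intent(text_en)
    reply_es = translate(reply_en, source="en", target="es")
    return text_to_speech(reply_es, lang="es")

def native_agent(audio: str) -> str:
    """Bilingual model path: no translation hops at all."""
    text = speech_to_text(audio, lang="auto")
    return text_to_speech(run_intent(text), lang="auto")

print(translation_layer_agent("necesito ayuda"))  # output carries both hop markers
print(native_agent("necesito ayuda"))             # output carries none
```

Run it and the translation-layer output comes back wrapped in two hop markers. Every one of those hops is real latency and a real chance to mangle meaning.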

The result? 2-3 second latency. Weird phrasing. Names butchered. Context lost.

When your competitor is a human who speaks the language natively, that's not a fair fight.

The Argentina Angle

I'm based in Argentina. I think in Spanish, code-switch constantly, and build for people who do the same.

This gave me something most US startups don't have: a built-in stress test for bilingual voice AI. I couldn't ignore Spanish even if I wanted to. My beta testers, my friends, my family — they all speak Spanish. English-only wasn't an option unless I wanted to build in a vacuum.

It also created a cost arbitrage. Building from Argentina for a market (Latin America) that US competitors ignore means lower operational costs and less competition. I'm not competing with OpenAI's voice product. I'm competing with the local call center that charges $8/hour and speaks perfect Spanish. That's a winnable fight.

How It Actually Works

I won't bore you with the stack. The point is architectural.

Voice AI by Triqual processes Spanish and English as first-class citizens. No translation layer. The agent detects language on the fly and handles code-switching naturally — because that's how people actually talk.

When a customer says "Necesito hablar con el manager about my refund," the agent doesn't glitch out. It understands. Context carries across languages.
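The key idea is that language gets decided per span, not per call. Here's a deliberately tiny illustration using word lists; real systems lean on acoustic and language-model signals, and these vocabularies are made up for the demo:

```python
# Toy token-level language tagger for a code-switched utterance.
ES_WORDS = {"necesito", "hablar", "con", "el", "mi", "quiero"}
EN_WORDS = {"about", "my", "refund", "the", "manager", "please"}

def tag_tokens(utterance: str) -> list[tuple[str, str]]:
    """Tag each token 'es' or 'en'; unknown tokens inherit the previous tag."""
    tags: list[tuple[str, str]] = []
    for token in utterance.lower().split():
        if token in ES_WORDS:
            tags.append((token, "es"))
        elif token in EN_WORDS:
            tags.append((token, "en"))
        else:
            tags.append((token, tags[-1][1] if tags else "es"))
    return tags

print(tag_tokens("Necesito hablar con el manager about my refund"))
```

The tagger finds the switch point at "manager" and keeps both halves intact, which is exactly what a translate-everything pipeline can't do.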

Names are tricky. We spent weeks on pronunciation rules for Spanish names, regional accents, the whole spectrum. "Yolanda" shouldn't sound like a gringo trying to order at a taqueria.
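To show the flavor of that work, here's a minimal rule-based grapheme-to-phoneme sketch for Spanish names. The rules and hint strings are illustrative stand-ins, not our actual pronunciation tables, and a complete Spanish G2P has far more rules than this:

```python
import re

# Ordered rewrite rules: (pattern, rough English-reader phonetic hint).
SPANISH_G2P_RULES = [
    (r"ñ", "ny"),   # Ñoño -> nyonyo
    (r"ll", "y"),   # standard yeísmo; Rioplatense would use "sh" instead
    (r"j", "h"),    # José -> hose, not "Joe-say"
    (r"é", "e"), (r"á", "a"), (r"í", "i"), (r"ó", "o"), (r"ú", "u"),
]

def phonetic_hint(name: str) -> str:
    """Rewrite a Spanish name into a crude pronunciation hint string."""
    out = name.lower()
    for pattern, repl in SPANISH_G2P_RULES:
        out = re.sub(pattern, repl, out)
    return out

print(phonetic_hint("José"))   # "hose"
print(phonetic_hint("Ñoño"))   # "nyonyo"
```

Rule order matters: "j" must be rewritten before the accent rules strip the "é", and regional variants mean the "ll" rule can't be one global constant.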

The latency? Under 800ms end-to-end. Because there's no translation layer adding friction.

What Went Wrong

Week 3: First Spanish voice test. I asked it to say "Buenos días, soy el asistente de Triqual." It said "Buenos dias, soy el ass-is-tent of Tree-kwal." I wanted to cry.

Week 7: Accent detection was garbage. The agent worked fine with Mexican Spanish. Argentine Spanish? It heard "ll" sounds and just... panicked. Had to rebuild phoneme mapping from scratch.
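The rebuild boiled down to layering regional overrides on a base phoneme table instead of hardcoding one "Spanish". A minimal sketch, with illustrative symbols and region tags rather than our real tables:

```python
# Base Spanish phoneme mapping (rough IPA, simplified for illustration).
BASE_PHONEMES = {"ll": "ʝ", "y": "ʝ", "s": "s"}

# Per-region overrides layered on top of the base table.
REGIONAL_OVERRIDES = {
    "es-MX": {},                      # Mexican Spanish: base mapping works
    "es-AR": {"ll": "ʃ", "y": "ʃ"},   # Rioplatense sheísmo: "calle" sounds like "cashe"
}

def phoneme_map(region: str) -> dict[str, str]:
    """Merge the base table with a region's overrides."""
    return {**BASE_PHONEMES, **REGIONAL_OVERRIDES.get(region, {})}

print(phoneme_map("es-AR"))
print(phoneme_map("es-MX"))
```

With this shape, adding Colombian or Chilean Spanish later is a new override dict, not another from-scratch rebuild.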

Week 12: Real customer call. Customer's name was "Ñoño." The agent pronounced it "N-yo-n-yo" instead of "Nyoh-nyoh." The customer hung up. Lost a potential deal over one letter.

Month 4: Tried a "unified voice" that spoke both languages. Turns out Spanish speakers can hear the subtle English phonetic influence, and it creeps them out. Had to split into language-specific voice models. Doubled inference costs overnight.

Each failure taught me something. Bilingual isn't just about translation. It's about cultural fluency. The rhythm of speech. The formality levels. When to use "tú" vs "usted."
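Formality is a good example of a decision the agent has to make before it says a single word. Here's a toy sketch of register selection; in production this would draw on customer metadata, region, and conversational cues, and these rules are invented for illustration:

```python
def formal_register(region: str, first_contact: bool) -> bool:
    """Return True when the agent should default to 'usted'."""
    if region == "es-AR":
        return False        # Rioplatense leans informal ("vos"), even on first contact
    return first_contact    # elsewhere, open formal and relax only if invited

def greeting(region: str, first_contact: bool = True) -> str:
    if formal_register(region, first_contact):
        return "¿En qué puedo ayudarle?"  # usted
    return "¿En qué te puedo ayudar?"     # tú / vos

print(greeting("es-MX"))
print(greeting("es-AR"))
```

Getting this wrong doesn't crash anything. It just makes the agent sound like a stranger being rude, which is worse.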

What's Next

Portuguese is next. Brazil has 215 million people. Same problem, same opportunity.

Also experimenting with regional accent models. Mexican Spanish vs Colombian Spanish vs Argentine Spanish. They're different. Treating "Spanish" as one language is like treating a New Yorker and a Texan as identical speakers.


If you're curious what bilingual voice AI actually sounds like, check out Voice AI by Triqual. Built for businesses that serve Spanish and English customers without making either group feel like an afterthought.

What languages are your AI agents speaking? And more importantly — are they actually speaking them, or just translating?
