DEV Community: Christopher Pribadi

Nobody Built a Meeting Tool for the Way Southeast Asia Actually Talks

Christopher Pribadi — Tue, 16 Jun 2026 21:19:03 +0000

*Every meeting tool assumes you speak one language.*

Otter.ai. Fireflies. Notion AI. Microsoft Copilot. They are all built for a boardroom in San Francisco where everyone speaks English, with a slight accent at most.

That is not how Southeast Asia works.

I grew up in Indonesia. Every meeting I have ever been in has been a mix. Bahasa Indonesia when we are relaxed and talking fast. English when we are pitching or presenting. Mandarin when someone senior from a Chinese-owned conglomerate joins the call. Sometimes all three in the same sentence.

This is not unusual. This is just how ***business* gets done here**.

When I started looking for a tool that could handle this, I found nothing. The best I could do was run Otter.ai and watch it struggle with every Bahasa sentence, every code-switched phrase, every moment where the conversation naturally moved between languages. The transcripts were unusable. The summaries were worse.

That was the moment I knew Bysik had to exist.

The gap nobody was talking about

Southeast Asia has 670 million people. Indonesia alone has 270 million. The region's digital economy is on track to hit $1 trillion by 2030. Startups are scaling fast, enterprises are expanding across borders, and remote and hybrid work is now the default for any professional team.

And yet every single AI meeting tool was built by American companies, for American users, with English as the only language that matters.

The few tools that claimed multilingual support were doing post-processing translation. You would record your meeting in one language, wait for it to be processed, and get a translated transcript hours later. That is not translation. That is transcription with a dictionary attached.

Real multilingual work is real-time. It switches mid-sentence. It does not wait.

Why on-device changes everything

When we started building Bysik, we made one architectural decision that changed everything else: the AI model runs on-device.

No external API. No audio routed through a server in Virginia. No third-party dependency.

This was not just a privacy decision, though it matters enormously for Indonesian companies navigating data localisation requirements under PP 71/2019. It was a product decision.

Running the model locally means Bysik works offline. It works in a Jakarta office building with inconsistent WiFi. It works on a laptop on a flight between Taipei and Singapore. It works in any environment where cloud-dependent tools fail.

It also means our unit economics are fundamentally different from every competitor. No per-minute API cost. No usage-based billing that punishes power users. A flat, predictable product that scales without our costs scaling with it.

What we are building

*Bysik is an AI meeting intelligence platform with four core features: real-time multilingual translation, automatic meeting notes, speech-to-text, and a productivity suite we call Launch Pad.*

The real-time translation is the feature I am most proud of. You open Bysik before a meeting. As people speak, regardless of whether they switch between Bahasa, English, or Mandarin, subtitles appear in real time. No delay. No post-processing. Just the meeting, understood.

We are post-beta. We are approaching launch. Indonesia is our first and primary market.

Why now

AI adoption in Southeast Asia is accelerating faster than most people outside the region realise. Indonesia's AI startup funding grew over 200% in the first half of 2025. The enterprise market is hungry for tools that actually fit how they work, not tools built for someone else that they have been told to adapt to.

We are not trying to be a cheaper Otter.ai. We are building the meeting intelligence tool that Southeast Asia deserves. One that was built here, for here, from the ground up.

If that resonates with you, follow along. We are just getting started.

*Bysik is currently in pre-launch. Visit bysik.app to learn more.*

The Hidden Cost of Bad Meeting Transcription (It's Not What You Think)

Christopher Pribadi — Tue, 19 May 2026 04:42:18 +0000

You probably think the cost of bad meeting transcription is time spent fixing transcripts.
It's not.
That's the visible cost. The real cost is what you're not seeing.

The Visible Cost

You record a 60-minute meeting. The transcription accuracy is 60-70%. Someone spends 20-30 minutes fixing it.
For a team of 20: 8+ hours/day of cleanup = $103k/year in wasted time.

That's measurable. That's what CFOs see.
But it's the smallest cost.

The Hidden Costs

Knowledge Loss ($250k/year): When transcripts are useless, people stop using them. They use personal notes instead. Now three people have three different records of what was said. Rework, slow onboarding, missed decisions.

Decision Latency ($150k/year): By the time someone reads bad notes, they've already made decisions based on what they remember, not what was said. Clarification calls, deal delays, lost revenue.

Compliance Risk ($50k/year amortized): Bad transcription = unreliable records. If you ever need to prove what was discussed, you can't. PDPA violations in SE Asia = fines + legal fees.

Team Friction ($100k/year): "I thought we agreed to X" / "No, we said Y" / "Check the notes!" / "The notes don't make sense." Bad transcription becomes a source of conflict.

Scaling Friction ($200k/year): New hires can't learn from written records. Everything requires 1-on-1 explanation. You can't scale beyond founder involvement.

The Math

For a 50-person regional team:

| Cost | Annual |
| --- | --- |
| Fixing transcripts | $103,750 |
| Knowledge loss | $250,000 |
| Decision latency | $150,000 |
| Compliance risk | $50,000 |
| Team friction | $100,000 |
| Scaling friction | $200,000 |
| TOTAL | $853,750 |

That's $17k per employee per year.
A good transcription tool costs $200-300/person/year.
ROI: roughly 57x-85x ($17,075 per employee divided by $300 or $200).
If you're using a tool that doesn't work for your team's language patterns, you're not saving money. You're burning it.
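
As a quick check on the arithmetic, the figures above can be added up in a few lines of Python. This is only a sketch: the dollar amounts are the estimates quoted in this post, and the variable names are made up for illustration.

```python
# Annual cost estimates quoted above (USD, 50-person regional team).
costs = {
    "Fixing transcripts": 103_750,
    "Knowledge loss": 250_000,
    "Decision latency": 150_000,
    "Compliance risk": 50_000,
    "Team friction": 100_000,
    "Scaling friction": 200_000,
}
team_size = 50
tool_price_range = (200, 300)  # USD per person per year, from the text

total = sum(costs.values())
per_employee = total / team_size

# Ratio of estimated cost to tool spend at each end of the price range.
ratios = {price: per_employee / price for price in tool_price_range}

print(f"Total: ${total:,}")
print(f"Per employee: ${per_employee:,.0f}")
for price, ratio in ratios.items():
    print(f"Tool at ${price}/person/year -> {ratio:.0f}x")
```

With these inputs the per-employee cost comes to $17,075, which is about 85 times the price of a $200 tool and about 57 times the price of a $300 one.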

That's why we created a multilingual meeting notetaker:

www.bysik.app

How Code-Switching Breaks AI (And Why That Matters for Southeast Asia)

Christopher Pribadi — Sat, 16 May 2026 05:42:42 +0000

Your Singapore team is on a call. Someone says: "Eh, so we need to deploy this feature lah, but the database query very slow lor. How we optimize? Boleh ask the backend team?"

That sentence has English, Malay, and Singlish grammar patterns all mixed together.

Try to transcribe it with Otter.ai or Google Meet's built-in captions. You'll get something like: "Eh, so we need to deploy this feature la, but the database query very slow or how we optimize..."
The words are garbled. The meaning is lost. And if you're using that transcript as your meeting notes, you've got a mess.
This is code-switching. And it's breaking almost every AI transcription tool on the market.

What Is Code-Switching?
Code-switching is when a bilingual or multilingual speaker mixes two or more languages in a single conversation, often within the same sentence.
It's not broken English. It's not bad grammar. It's actually a sign of linguistic sophistication. Bilingual speakers code-switch because it's the most efficient way to communicate with other bilingual people in their community.
It's normal in:

Singapore: Singlish (English + Malay + Mandarin + Tamil grammar patterns)
Malaysia: Bahasa Rojak (English + Malay + Cantonese)
Philippines: Taglish (Tagalog + English)
Thailand: Tinglish (English + Thai)
Indonesia: Bahasa Campur (Indonesian + English + regional languages)
Vietnam: Vietglish (Vietnamese + English)

Every Southeast Asian professional does this. It's how we communicate.
But here's the problem: AI transcription models were trained on monolingual speech. They've never seen this before.

Why AI Breaks on Code-Switching
Modern speech-to-text models work by:

1. Tokenization: Breaking audio into phonemes (individual sounds)
2. Language identification: Figuring out which language it is
3. Pattern matching: Matching sound patterns to known words
4. Grammar correction: Using language models to fix errors

Code-switching breaks at step 2.
When you say a sentence in Singlish, the AI hears English words and Malay grammar patterns simultaneously. The language identification model gets confused. Is this English or Malay? The model has to pick one. It picks wrong. Everything downstream breaks.
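
To make the failure at step 2 concrete, here is a toy sketch. It is not how any real speech model works: the word lists are tiny hypothetical lexicons, the labels ("en", "ms", "particle") are invented for this example, and real systems decide from audio rather than from spelled-out words. It only shows what happens when a pipeline has to commit to one language per utterance.

```python
from collections import Counter

# Hypothetical mini-lexicons, for illustration only.
LEXICON = {
    "en": {"so", "we", "need", "to", "deploy", "this", "feature", "but",
           "the", "database", "query", "very", "slow", "how", "optimize",
           "ask", "backend", "team"},
    "ms": {"boleh"},
    "particle": {"eh", "lah", "lor"},
}

def tag_tokens(utterance: str) -> list[tuple[str, str]]:
    """Label each word with the lexicon it appears in, or 'unk'."""
    tagged = []
    for raw in utterance.lower().split():
        word = raw.strip(".,?!")
        lang = next((name for name, words in LEXICON.items() if word in words), "unk")
        tagged.append((word, lang))
    return tagged

def utterance_language(tagged: list[tuple[str, str]]) -> str:
    """Pick ONE label for the whole utterance by majority vote."""
    counts = Counter(lang for _, lang in tagged if lang != "unk")
    return counts.most_common(1)[0][0]

sentence = ("Eh, so we need to deploy this feature lah, but the database "
            "query very slow lor. How we optimize? Boleh ask the backend team?")

tagged = tag_tokens(sentence)
chosen = utterance_language(tagged)
# Words that do not belong to the language the pipeline committed to.
out_of_language = [word for word, lang in tagged if lang != chosen]

print(chosen)
print(out_of_language)
```

The majority vote picks English, and the four words that give the sentence its Singlish character ("eh", "lah", "lor", "boleh") fall outside the chosen language. That is the point where a single-language decoder has to start guessing.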
Here's a real example:
What was said: "Eh, cannot lah, server go down already."
What Google Transcribe hears: English (because the root words are English)
What Google Transcribe outputs: "Eh, cannot la server go down already" (it strips the Singlish particles because they don't fit English grammar)
What the speaker meant: "No, we can't do that right now, because the server has crashed."
The meaning was there. But the model didn't preserve it.
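
One common way to put a number on transcription quality is word error rate (WER): the word-level edit distance between what was said and what was transcribed, divided by the number of words spoken. The sketch below applies it to the example above. The normalisation choices (lowercasing, dropping punctuation) are assumptions, and the accuracy figures quoted in this post are not necessarily computed this way.

```python
import string

def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation, split into words."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()

def word_errors(reference: list[str], hypothesis: list[str]) -> int:
    """Levenshtein distance over words."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a spoken word is missing
                curr[j - 1] + 1,     # insertion: an extra word was output
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]

said = "Eh, cannot lah, server go down already."
output = "Eh, cannot la server go down already"

ref, hyp = normalize(said), normalize(output)
errors = word_errors(ref, hyp)
wer = errors / len(ref)

print(f"{errors} error(s) in {len(ref)} words, WER = {wer:.1%}")
```

Here the score is one error in seven words, about 14% WER. The number looks respectable, yet the one changed word is the particle that carried the speaker's tone, so a word-level score alone does not tell you whether the meaning survived.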

The Training Data Problem
Why does this happen?
Because the datasets used to train these models don't include code-switched speech.
Google, OpenAI, and other major AI labs trained their speech models on:

English audio (by far the largest share of the data)
Mandarin audio (large amounts)
Spanish, French, German, etc. (smaller amounts each)

But Singlish? Taglish? Bahasa Campur?
There's almost no training data.
Why? Because code-switched speech is:

Hard to label (Is this English or Malay? Both? The labeler has to make a judgment call)
Not standardized (Singlish spoken in Singapore sounds different from Singlish in Malaysia)
Seen as "low prestige" by researchers (academic datasets tend to focus on formal, monolingual speech)
Computationally expensive to include (mixed-language models are harder to train)

So the models just ignore it. And when they encounter it, they fail.

Why This Matters for SE Asian Teams
Imagine you're a regional PM. You record a standup with your Singapore, Bangkok, and Manila teams. Everyone code-switches naturally — it's how they communicate best.
You use Otter.ai to transcribe. You get:

60% accuracy on the English parts
40% accuracy on the code-switched parts
Completely butchered grammar and meaning in the mixed sentences
Useless meeting notes

So you spend 20 minutes manually fixing the transcript. Or you just don't use it.
Either way, you've lost the main benefit of transcription: saving time.
Scale this across a team. You're losing hours every week to bad transcription.
For a team of 10, that's 500 hours a year of wasted time. For a 50-person team, it's 2,500 hours.
That's real money.

The Current Solutions (And Why They Don't Work)
Option 1: Use a tool built for your specific language
There are some tools built for Singlish or Taglish specifically. But they only work if you speak one code-switched language. If your team spans multiple countries, you're out of luck.

Option 2: Record separate videos in each language
Some teams do this. One person records the English parts, someone else records the Malay parts. This is absurd and doesn't reflect how people actually talk.

Option 3: Use Google Meet or Zoom's built-in captions
They're improving, but still 50-60% accurate on code-switched speech. Better than nothing, but still not usable for meeting notes.

Option 4: Hire someone to manually transcribe
Expensive and slow. But it works because humans can understand code-switching. A human transcriber gets the meaning right even if the grammar is mixed.

Option 5: Just don't transcribe
Most teams do this. They record meetings but don't transcribe them because the tools are so bad at code-switching.

How We're Solving This at BYSIK
When we started building BYSIK, we noticed this problem immediately.
Our first user was a Singapore startup with a team across SG, MY, and ID. They said: "Every transcription tool fails on our meetings because we speak Singlish. We just stopped using transcription."
That's when we realized: the existing solutions aren't built for Southeast Asia.
So we did something different:
1. We trained on code-switched speech
We built a dataset of actual code-switched audio from SE Asian professionals. Singlish, Taglish, Bahasa Campur, mixed Thai-English, all of it. We labeled it carefully and trained our speech-to-text model on this specific data.
2. We use language-agnostic embedding models
Instead of deciding "Is this English or Malay?" upfront, we use embedding models that represent words from different languages in a shared semantic space. A discourse particle like "lah" and an English word like "already" are then placed by how they function in context, not by which language they come from. The model learns this.
3. We added dialect and accent handling
Singlish from Singapore sounds different from Singlish from Malaysia. Bangkok Thai-English is different from Northern Thai-English. We built models that handle these variations.
4. We preserve the original speech patterns
Instead of "correcting" code-switched speech into monolingual grammar, we keep it as spoken. If you said "cannot lah," the transcript says "cannot lah," not "cannot." The meaning is preserved.
The result? 85%+ accuracy on code-switched speech, compared to 40-50% on existing tools.
Is it perfect? No. Code-switching is inherently ambiguous sometimes — even humans disagree on what was said. But it's accurate enough to be useful for meeting notes, which is the point.
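
The shared semantic space idea from point 2 can be illustrated with a toy lookup. The vectors below are hand-made 3-dimensional stand-ins, not output from BYSIK's model or any real embedding model, and the `nearest` helper is hypothetical. The point is only that similarity is computed on the vectors, so no step has to decide first which language a word belongs to.

```python
import math

# Hand-made vectors, purely illustrative. Real embeddings are learned
# and have hundreds of dimensions.
EMBEDDINGS = {
    ("en", "can"):    (0.9, 0.1, 0.0),
    ("ms", "boleh"):  (0.8, 0.2, 0.1),   # Malay: can / may
    ("en", "server"): (0.0, 0.9, 0.2),
    ("en", "slow"):   (0.1, 0.1, 0.9),
    ("ms", "lambat"): (0.2, 0.0, 0.8),   # Malay: slow
}

def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(key: tuple[str, str]) -> tuple[str, str]:
    """Closest other entry by cosine similarity; the language tag is never consulted."""
    return max(
        (other for other in EMBEDDINGS if other != key),
        key=lambda other: cosine(EMBEDDINGS[key], EMBEDDINGS[other]),
    )

print(nearest(("ms", "boleh")))
print(nearest(("en", "slow")))
```

With these toy vectors, the nearest neighbour of Malay "boleh" is English "can", and the nearest neighbour of "slow" is "lambat", even though the lookup never checks which language either word came from.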

The Bigger Picture: Why SE Asia Keeps Getting Left Behind
This code-switching problem is a microcosm of a bigger issue.
AI is built in the US and China. The training data is English, Mandarin, and a few other "high-resource" languages.
Everything else — including all of Southeast Asia's languages and dialects — gets treated as an edge case.
So we have tools that work great for monolingual English speakers in San Francisco. They're okay for Mandarin speakers in Shanghai. But for a multilingual team in Singapore? For developers in Manila who code-switch naturally? For regional teams that communicate across languages?
The tools fail.
And the assumption is: "That's an edge case. Most of the world speaks monolingual English or Mandarin anyway."
But Southeast Asia is home to more than 650 million people. It's not an edge case. It's a massive market that's been ignored because building for multilingual, code-switched speech is harder than building for monolingual English.

What Needs to Change
For researchers: Start collecting and publishing datasets of code-switched speech. It's harder than monolingual data, but it's important. Southeast Asia's languages matter.
For AI companies: Stop pretending code-switching is an edge case. Train your models on it. Your users in SE Asia deserve tools that work.
For regional companies: If existing tools don't work for you, you don't have to accept it. Build your own, or support tools (like BYSIK) that are built for your market.
For SE Asian founders: This is an opportunity. The entire region is using transcription tools that don't work for how we actually speak. That's a problem worth solving.

The Practical Takeaway
If you're running a team in Southeast Asia and you've been frustrated with transcription accuracy, now you know why.
It's not your audio quality. It's not your accent. It's not that you're speaking "wrong."
It's that the tools were built for a different market. They were trained on monolingual speech. Code-switching breaks them.
There are solutions now. Tools are getting better at handling multilingual and code-switched speech. If you've given up on transcription because it didn't work, try again. The tech has caught up.
And if you find a tool that actually understands how your team talks — that preserves meaning instead of "correcting" your language — stick with it. You've found something rare.

P.S. — If you want to geek out about the linguistics of code-switching, there's a whole field of research on it. Start here: Poplack's "The Bilingual's Linguistic System: Evidence for Asymmetric Competence." It's fascinating stuff.
And if you're building tools for Southeast Asia, feel free to reach out. I'm always interested in talking to founders who are solving regional problems instead of just copying the US.

Full disclosure: I founded BYSIK AI because of this exact problem. We're solving code-switching for transcription. But even if you use a different tool, I hope this helped you understand why transcription has been hard in SE Asia and why it's getting better.