Dhruv

From Voice Demo to Operational Voice Assistant: Reviving Ovela AI

GitHub "Finish-Up-A-Thon" Challenge Submission

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

Ovela AI started as a side project driven by a question that kept pulling me back:

What would it take for a business to genuinely trust a voice AI system?

At first, I thought the answer was simple: make conversations sound natural.

The original prototype could answer calls, respond to questions, and carry a conversation reasonably well. From a technical perspective, it looked impressive.

But after speaking with accommodation providers and small business owners, I realized I was focused on the wrong problem.

Businesses don't trust a system because it sounds human.

They trust it because it behaves reliably.

Can it check availability correctly?

Can it update reservations safely?

Can it collect payments?

Can it transfer a call when confidence is low?

Can staff see exactly what happened afterward?

That realization changed the direction of the project completely.

Ovela AI evolved from a voice demo into an operational voice assistant designed to help businesses handle real customer interactions while keeping humans in control of important decisions.

Today, Ovela can:

  • Handle inbound phone calls
  • Check room availability
  • Create reservations
  • Process payments through Stripe
  • Answer property and local information questions
  • Transfer calls when needed
  • Keep staff synchronized through a management dashboard

More importantly, every improvement is guided by a simple principle:

AI should support human operations, not blindly replace them.


Demo

📞 Live Demo (Australia)

Phone: +61 3 4823 6219

Due to abuse protection and testing limits, availability may occasionally be restricted.

Try asking:

  • "Do you have any rooms available this weekend?"
  • "Can I make a reservation?"
  • "What attractions are nearby?"
  • "What's the weather like today?"

๐ŸŒ Website: https://ovela.dev

๐Ÿ™ GitHub Repository:https://github.com/My-CMDhub/Ovela-AI


The Comeback Story

Like many side projects, Ovela reached a point where the prototype worked well enough to demonstrate the idea.

Then it sat untouched, not because the project failed, but because other priorities took over.

Months later, after more conversations with business owners and more exposure to real operational challenges, I came back to the project with a very different perspective.

The biggest lesson was surprisingly non-technical.

The challenge isn't making AI speak.

The challenge is making AI behave appropriately within human workflows.

A real receptionist doesn't simply answer questions.

They:

  • Recognize interruptions
  • Acknowledge requests before acting
  • Handle uncertainty
  • Understand when information is missing
  • Escalate sensitive situations
  • Maintain context across an entire conversation

Most voice demos don't fail because speech recognition is poor.

They fail because the operational behavior doesn't match what people expect from a trusted assistant.

That became the focus of the revival.

What Changed

Multi-Agent Architecture

The original system relied on a much simpler flow.

The new version uses a multi-agent architecture built around Google's Agent Development Kit (ADK), allowing different agents to handle reservations, business operations, and information requests independently.
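To make the idea of independent agents concrete, here is a minimal sketch of intent routing in plain Python. It does not use the ADK API and is not Ovela's actual implementation: the agent names, the keyword lists, and the `route` helper are all assumptions made for illustration, and a real system would likely classify intent with a model rather than keywords.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Agent:
    name: str
    keywords: Tuple[str, ...]
    handle: Callable[[str], str]


def make_agent(name: str, keywords: Tuple[str, ...]) -> Agent:
    # Placeholder handler; a real agent would call tools or an LLM here.
    return Agent(name, keywords, lambda utterance: f"[{name}] {utterance}")


# Hypothetical agents mirroring the three areas named in the post.
AGENTS = [
    make_agent("reservations", ("book", "reservation", "room", "available")),
    make_agent("operations", ("payment", "refund", "invoice")),
    make_agent("information", ("weather", "attraction", "nearby", "parking")),
]

# Used when no agent matches, so unclear requests go to a person.
FALLBACK = make_agent("human_handoff", ())


def route(utterance: str) -> Agent:
    """Pick the agent whose keywords match the utterance most often."""
    text = utterance.lower()
    best = max(AGENTS, key=lambda agent: sum(k in text for k in agent.keywords))
    if not any(k in text for k in best.keywords):
        return FALLBACK
    return best
```

Calling `route("What attractions are nearby?")` selects the information agent, while an utterance that matches nothing falls through to the handoff agent instead of guessing.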

Lower Latency Conversations

Voice interactions are highly sensitive to delays.

Several architectural bottlenecks were removed to improve response times and reduce awkward pauses during calls.

Stronger Context Awareness

One of the most interesting challenges was interruption handling.

People interrupt constantly during real conversations.

The system now maintains awareness of what information has already been spoken, allowing it to continue naturally instead of restarting or losing context.
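One way to picture this is a small tracker that records which sentences of a planned reply were actually played before the caller interrupted. This is a sketch under assumptions, not Ovela's code: the `SpokenTracker` class and its method names are hypothetical, and it assumes the telephony layer can report when each sentence finishes playing.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpokenTracker:
    planned: List[str] = field(default_factory=list)
    spoken_count: int = 0

    def plan(self, sentences: List[str]) -> None:
        """Start a new reply and reset the playback position."""
        self.planned = list(sentences)
        self.spoken_count = 0

    def mark_spoken(self) -> None:
        """Call once each time a sentence finishes playing to the caller."""
        if self.spoken_count < len(self.planned):
            self.spoken_count += 1

    def spoken(self) -> List[str]:
        """Sentences the caller has already heard."""
        return self.planned[: self.spoken_count]

    def remaining(self) -> List[str]:
        """Sentences that were cut off by an interruption."""
        return self.planned[self.spoken_count :]
```

When an interruption arrives, the assistant can pass `spoken()` to the next turn as context and decide whether anything in `remaining()` is still worth saying, rather than restarting the whole reply.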

Operational Reliability

Reservation workflows, payment handling, availability checks, and dashboard synchronization were rebuilt to behave more like real business processes rather than isolated AI actions.

Abuse Protection

Real phone systems attract misuse.

Rate limits, call protections, and operational safeguards were added to prevent abuse while keeping legitimate usage frictionless.
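As an illustration of the rate-limiting part, here is a minimal sliding-window limiter keyed by caller number. The post does not describe how Ovela implements its limits, so the class name, the default thresholds, and the in-memory storage are assumptions; a deployed system would more likely keep this state in a shared store.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class CallRateLimiter:
    """Allow at most `max_calls` per caller within a sliding time window."""

    def __init__(
        self,
        max_calls: int = 3,               # hypothetical default
        window_seconds: float = 3600.0,   # hypothetical default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_calls = max_calls
        self.window = window_seconds
        self.clock = clock
        self._calls: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, caller_id: str) -> bool:
        now = self.clock()
        calls = self._calls[caller_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

Because rejected calls are not recorded, a blocked caller regains access as soon as their oldest accepted call leaves the window, which keeps the check cheap for legitimate callers.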


My Experience with GitHub Copilot

Returning to a codebase that has sat inactive for months is often harder than starting a brand new one. You inherit your own past decisions without fully remembering why you made them.

For the revival of Ovela AI, I didn't use GitHub Copilot as a simple autocomplete tool for boilerplate code. Instead, I used it as a high-level engineering partner and data auditor to manage complex architectural shifts and harden my system's reliability.

Here are the two major ways Copilot helped me cross the finish line, each drawn from real interactions during development:

1. Translating Complex Systems into Architecture Diagrams

As Ovela AI transitioned to a multi-agent setup, mapping out component connections, telephony triggers, and dashboard synchronization endpoints became a major cognitive bottleneck. I prompted Copilot within my workspace to act as a principal solutions architect. Once I fed it my core file dependencies, it mapped out a clean, production-ready system workflow directly in Mermaid.js syntax for the repository documentation.

[Image: Architecture diagrams breakdown]

2. Eval Hardening (The Supreme Judge)

Building a reliable voice AI requires robust testing. I simulate conversations between two LLMs and dump the evaluation telemetry into local .json log files. However, default automated grading scripts are notoriously prone to false positives (e.g., grading a hallucinated response highly simply because it sounded polite).

I utilized Copilot as a Supreme AI Evaluation Auditor. I passed it raw JSON conversation objects, prompting it to critically audit the automated scores, spot misleading feedback, and generate an adjusted "Supreme Score" with a bulleted logical justification. This drastically reduced noise in my evaluation pipeline.
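A cheap pre-filter can help decide which records are worth sending for that kind of audit. The sketch below is not the Copilot workflow itself, and the record fields (`id`, `auto_score`, `assistant_reply`, `expected_facts`) are a hypothetical schema, since the post does not show the real log format. It flags conversations that scored highly even though the reply omits a fact it should have stated.

```python
import json
from typing import Dict, List


def load_records(path: str) -> List[Dict]:
    """Read a JSON log file containing a list of evaluation records."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def flag_suspicious(records: List[Dict], min_score: float = 0.8) -> List[str]:
    """Return ids of records with a high automated score but missing facts."""
    flagged = []
    for record in records:
        reply = record["assistant_reply"].lower()
        missing = [
            fact for fact in record["expected_facts"]
            if fact.lower() not in reply
        ]
        # A high score plus a missing fact suggests a possible false positive.
        if record["auto_score"] >= min_score and missing:
            flagged.append(record["id"])
    return flagged
```

Substring matching is deliberately crude here; it only narrows the set of records a human or an LLM auditor needs to read closely.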

[Image: Eval score finalisation]

[Image: Solidifying eval score of simulation test]

Operational Code Polish & Balancing

Beyond these two core pillars, Copilot served as an excellent "cleanup crew" throughout this journey, even after I hit rate limits ✋. It helped track down legacy typing issues, review asynchronous edge cases, and generate clean inline documentation.

Ultimately, the biggest value Copilot provided wasn't writing lines of code faster. It was accelerating complex architectural decisions and data validation while I revived a stale codebase.


What I Learned

The most valuable lesson wasn't technical.

It was understanding the difference between a convincing demo and a useful product.

A demo succeeds when the AI says the right thing.

A business system succeeds when the right thing actually happens afterward.

That distinction changed how I think about voice AI.

Natural conversation matters.

Latency matters.

Speech quality matters.

But trust matters more.

Trust comes from reliability, transparency, and knowing when humans should remain part of the process.

I don't believe current voice AI systems perfectly replicate human interaction, and that's not really the goal.

What interests me is the space between humans and AI:

How can AI handle repetitive operational work while humans remain responsible for judgment, relationships, and important decisions?

Reviving Ovela helped me explore that question far more deeply than when I first started the project.

And honestly, that's what made finishing it worthwhile.
