giursan

Building Aura: What We Learned Building a Real-Time AI Mentor

Imagine you're preparing for the pitch of your life. The slides are perfect, the data is solid, but you’re worried about how you’re delivering it. Are you making eye contact? Is your posture projecting confidence, or are you "shrimping" over your laptop? Are you speaking too fast or stumbling over your own words?

That's why we built Aura: a real-time, AI-powered pitch mentor designed to turn high-stakes nerves into high-performance delivery.

In this post, we’re pulling back the curtain on how we built Aura, the technical mountains we had to climb, and why we spent so much time obsessing over UI.


Our Vision: More Than Just a Webcam App

We didn't want to build just another video recorder. We wanted an Arena.

Aura uses a combination of Google Gemini for high-level content analysis and MediaPipe for granular, frame-by-frame body language tracking. From the start, our goal was clear: immediate, actionable feedback that feels visceral.

The Stack

  • Frontend: Next.js with a custom "Aura CI" design system
  • Computer Vision: MediaPipe (Face, Pose & Gesture Recognizer)
  • AI Brain: Google Gemini API for deep content intel & coaching personas
  • Real-time Engine: Custom React hooks for low-latency metric processing

Learning #1: The "Shrimp" Problem

Early on, we realized that "good posture" isn't a universal constant. If you're tall, your natural "neutral" looks different from that of someone shorter. If we had just hard-coded thresholds, we’d have been telling half our users they were slouching when they were just... existing.

We had to implement a Gesture-Driven Calibration system.
By asking users to "show a thumbs up" once they were in their best posture, we could capture a personalized baseline. We developed custom metrics like Neck Ratio and Shoulder Expansion to quantify "shrimp" (kyphotic) posture.
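
To make the idea concrete, here is a minimal TypeScript sketch of baseline-relative scoring. The metric definitions, the `PoseSample` shape, and the `SHRIMP_TOLERANCE` value are all assumptions for illustration; they are not Aura's actual formulas, and in a real app the points would come from MediaPipe's pose landmarks.

```typescript
// Hypothetical types and formulas: the post does not specify how
// Neck Ratio or Shoulder Expansion are actually computed.
interface Point { x: number; y: number; }

interface PoseSample {
  nose: Point;
  leftShoulder: Point;
  rightShoulder: Point;
}

interface PostureMetrics {
  neckRatio: number;     // vertical nose-to-shoulder distance divided by shoulder width
  shoulderWidth: number; // horizontal distance between the shoulders
}

// Assumed tolerance: flag a slouch when the neck ratio drops below 85% of baseline.
const SHRIMP_TOLERANCE = 0.85;

function measure(pose: PoseSample): PostureMetrics {
  const shoulderWidth = Math.abs(pose.leftShoulder.x - pose.rightShoulder.x);
  const shoulderMidY = (pose.leftShoulder.y + pose.rightShoulder.y) / 2;
  // Image coordinates: y grows downward, so the nose has the smaller y value.
  const neckLength = shoulderMidY - pose.nose.y;
  const neckRatio = shoulderWidth > 0 ? neckLength / shoulderWidth : 0;
  return { neckRatio, shoulderWidth };
}

// Scores are relative to the user's own calibrated pose: 1.0 means "same as baseline".
function compareToBaseline(current: PostureMetrics, baseline: PostureMetrics) {
  const neck = baseline.neckRatio > 0 ? current.neckRatio / baseline.neckRatio : 0;
  const expansion =
    baseline.shoulderWidth > 0 ? current.shoulderWidth / baseline.shoulderWidth : 0;
  return { neck, expansion, isShrimping: neck < SHRIMP_TOLERANCE };
}
```

In this sketch, the baseline would be captured by calling `measure` on the frame where the thumbs-up gesture is detected. Dividing by shoulder width keeps the ratio roughly independent of how far the user sits from the camera.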

Data is useless without context. Personalization isn't a feature; it’s the foundation.

Learning #2: Premium Design as a Functional Requirement

We spent days refining the "Aura CI." We moved away from generic UI to a "Premium Arena" aesthetic—think extra-bold typography (font-black), capsule buttons, and subtle overlays.

Why? Because pitching is high-stress. A clunky, unintuitive UI makes you more nervous. A premium, polished environment makes you feel like a pro before you even open your mouth.


Struggle #1: The "Metric Freeze"

One of our biggest hurdles was handling the "Pause" state. When a user pauses their session to reflect, they want to see their metrics frozen in time. Initially, our hooks would either keep processing (wasting battery) or reset to zero (losing the data). We had to refactor our entire hook architecture to support a "Stable Metric View", freezing the last known good data point until the session resumed.
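
The freezing behavior can be sketched without React at all. The `StableMetricView` class below is a hypothetical illustration, not Aura's actual hook: it ignores new samples while paused and always exposes the last valid one, so the display neither keeps updating nor resets to zero.

```typescript
// Hypothetical helper: holds the last valid sample and ignores input while paused.
class StableMetricView<T> {
  private lastGood: T | null = null;
  private paused = false;
  private readonly isValid: (sample: T) => boolean;

  constructor(isValid: (sample: T) => boolean) {
    this.isValid = isValid;
  }

  pause(): void {
    this.paused = true;
  }

  resume(): void {
    this.paused = false;
  }

  // Returns true when the sample was accepted as the new displayed value.
  push(sample: T): boolean {
    if (this.paused || !this.isValid(sample)) return false;
    this.lastGood = sample;
    return true;
  }

  // The value to render: frozen during a pause, null before any valid sample.
  get value(): T | null {
    return this.lastGood;
  }
}
```

A caller could check the paused state before running the expensive per-frame analysis at all, which is what saves the battery; the class only guarantees that whatever arrives during a pause cannot overwrite the frozen value.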

Struggle #2: Audio Ghosting

We hit a wall where the microphone RMS amplitude kept reporting 0.0000 despite the user screaming into the mic. Debugging the browser's AudioContext and ensuring permissions for both video and audio were handled synchronously was a masterclass in async race conditions.
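
For reference, RMS amplitude is the square root of the mean of the squared samples. The sketch below computes it from a buffer such as the one filled by `AnalyserNode.getFloatTimeDomainData`. The `diagnoseSilence` helper is hypothetical and reflects only one possible cause of all-zero readings (an `AudioContext` that is still suspended and needs `resume()` after a user gesture); it is not necessarily the bug we hit.

```typescript
// Root mean square of a block of audio samples in the range [-1, 1].
function computeRms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  return Math.sqrt(sumSquares / samples.length);
}

// Hypothetical debugging aid: turns the context state plus an RMS reading into a hint.
// The state strings mirror the standard AudioContext.state values.
type ContextState = "suspended" | "running" | "closed";

function diagnoseSilence(state: ContextState, rms: number): string {
  if (rms > 0) return "signal present";
  if (state === "suspended") return "context suspended: call resume() after a user gesture";
  if (state === "closed") return "context closed: create a new AudioContext";
  return "context running but silent: check mic permission and stream wiring";
}
```

Logging the context state next to each RMS reading makes it easier to tell a silent microphone from an audio graph that never started.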

Struggle #3: "Shark Mode" API Logic

Gemini's reasoning capabilities are incredible, but they come with a "thought signature." We spent a good chunk of time cleaning up console warnings and ensuring that our "Shark" persona (the brutal, demanding coach) got the right data packets without overloading the API.


💡 Final Thoughts: The Future of Pitching

Building Aura taught us that AI is at its best when it's invisible. The user shouldn't be thinking about "Pose Landmarkers" - they should be thinking about their pitch, while the UI gently nudges them to stand taller or slow down.

We built this project as part of the Google Gemini Live Agent Challenge Hackathon and wanted to document our journey in this blog post (we also want the bonus points for the submission...! 🚀)

We’re just getting started. Next up? More personas, deeper congruity checks, and much more!

Are you ready for your next pitch in front of Aura?
