
Rishika Thulasi

How I Built Clara: A Mobile-First AI Form-Filling Companion with Gemini and Google Cloud

This blog post was created for the purposes of entering the Gemini Live Agent Challenge hackathon. #GeminiLiveAgentChallenge


The Problem

I kept missing job opportunities — not because I wasn't qualified, but because I couldn't bring myself to fill out another application on my phone. Tiny fields, endless scrolling, and those dreaded open-ended questions: "Why do you want this role?" I'd open the form on the subway, stare at it, and close the tab.

I built Clara to fix this.


What is Clara?

Clara is a mobile-first AI form-filling companion. Snap a screenshot, upload a PDF, or paste a URL — Clara reads any form and guides you through filling it with a simple conversation.


How I Built It with Google AI and Google Cloud

Gemini Vision for Form Understanding

The core magic is Gemini 2.5 Flash's vision capability. When you upload a form, Clara sends the image or document to Gemini, which extracts:

  • Every field label
  • Field types (text, dropdown, checkbox)
  • Bounding box coordinates for each field

This lets Clara understand any form — job applications, medical intake, government paperwork — without pre-built templates.
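To make the extraction step concrete, here is a minimal Python sketch of how a backend might validate the JSON a vision model returns before trusting it. The schema (`label`, `type`, `box`), the `FormField` class, and the `parse_fields` helper are assumptions for illustration, not Clara's actual code. The box is assumed to follow the `[ymin, xmin, ymax, xmax]` convention, normalized to 0-1000, that Gemini's documentation describes for bounding boxes.

```python
import json
from dataclasses import dataclass

# Field types named in this post; anything else falls back to "text".
ALLOWED_TYPES = {"text", "dropdown", "checkbox"}


@dataclass(frozen=True)
class FormField:
    label: str
    type: str
    box: tuple  # assumed (ymin, xmin, ymax, xmax), normalized to 0-1000


def parse_fields(raw: str) -> list:
    """Parse the model's JSON reply, dropping entries that fail validation."""
    fields = []
    for item in json.loads(raw):
        if not isinstance(item, dict):
            continue
        label = str(item.get("label", "")).strip()
        box = item.get("box")
        # Skip entries without a label or without a four-number box.
        if not label or not isinstance(box, list) or len(box) != 4:
            continue
        if not all(isinstance(v, (int, float)) and 0 <= v <= 1000 for v in box):
            continue
        ymin, xmin, ymax, xmax = box
        if ymin >= ymax or xmin >= xmax:
            continue  # degenerate or inverted box
        ftype = str(item.get("type", "text"))
        if ftype not in ALLOWED_TYPES:
            ftype = "text"
        fields.append(FormField(label, ftype, (int(ymin), int(xmin), int(ymax), int(xmax))))
    return fields
```

Validating up front means a malformed entry costs one skipped field rather than a crash halfway through the conversation.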

3-Layer Smart Prefill

Clara doesn't just read forms — it fills them intelligently using a 3-layer matching system:

  1. Learned aliases: If you previously confirmed "Mobile Phone" maps to your phone number, Clara remembers.
  2. Gemini semantic matching: AI matches new fields to your profile with confidence scores.
  3. Keyword fallback: Static synonym matching as a safety net.
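The three layers can be sketched as a single lookup that tries each one in order. This is a simplified illustration under stated assumptions: the `KEYWORDS` table, the `match_field` function, and the 0.8 confidence threshold are all hypothetical, and the `semantic` argument stands in for whatever call asks Gemini for a match and a confidence score.

```python
# Hypothetical synonym table for the keyword fallback layer.
KEYWORDS = {
    "phone": ("phone", "mobile", "cell", "telephone"),
    "email": ("email", "e-mail"),
    "full_name": ("full name", "your name"),
}


def normalize(label: str) -> str:
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def match_field(label, aliases, semantic, threshold=0.8):
    """Return (profile_key, layer) for a form label, or None if nothing matches.

    aliases:  dict mapping normalized labels to profile keys the user confirmed.
    semantic: callable taking a label and returning (profile_key, confidence) or None.
    """
    key = normalize(label)
    # Layer 1: aliases the user has confirmed before.
    if key in aliases:
        return aliases[key], "alias"
    # Layer 2: model-based match, accepted only at or above the threshold.
    guess = semantic(label)
    if guess is not None and guess[1] >= threshold:
        return guess[0], "semantic"
    # Layer 3: static keyword fallback.
    for profile_key, words in KEYWORDS.items():
        if any(word in key for word in words):
            return profile_key, "keyword"
    return None
```

Returning the layer name alongside the match makes it easy to decide when to ask the user for confirmation, for example only when the match did not come from a learned alias.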

Open-Ended Answer Coaching

The hardest part of forms? Those open-ended questions. Clara uses Gemini to draft personalized answers based on your profile and the specific role, then lets you approve or edit.
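As a rough idea of what the drafting request could look like, the sketch below assembles a prompt from the question, the role, and the user's profile. The `build_answer_prompt` function, the prompt wording, and the 120-word limit are assumptions for illustration; the actual prompt Clara sends is not shown in this post.

```python
def build_answer_prompt(question, profile, role, max_words=120):
    """Assemble a drafting prompt from the question, the role, and non-empty profile facts."""
    # Leave out empty profile entries so the model is not tempted to fill them in.
    facts = "\n".join(f"- {key}: {value}" for key, value in profile.items() if value)
    return (
        f"You are helping an applicant answer a form question for the role: {role}.\n"
        f"Question: {question}\n"
        f"Applicant profile:\n{facts}\n"
        f"Write a first-person draft of at most {max_words} words. "
        "Use only facts from the profile; do not invent experience."
    )
```

The returned string would then go to the model, and the draft comes back to the user to approve or edit.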

Voice I/O with Gemini TTS

Clara speaks. Using Gemini TTS, Clara reads questions aloud and supports barge-in — interrupt anytime by typing or tapping.

Google Cloud Infrastructure

  • Cloud Run: Serverless deployment with auto-scaling
  • Firestore: Stores user profiles, sessions, and form progress
  • Cloud Storage: Holds uploaded resumes, cover letters, and generated documents

Architecture Overview

Browser SPA (Vanilla JS) 
    ↓ HTTP REST
Flask Backend (~3000 lines)
    ↓
┌─────────────┬─────────────┬─────────────┐
│ Gemini API  │  Firestore  │   Cloud     │
│ Vision+TTS  │  Profiles   │   Storage   │
└─────────────┴─────────────┴─────────────┘

Challenges

  • Field matching ambiguity: Forms label fields inconsistently. The 3-layer system solved this.
  • Mobile-first UX: Making a chat interface feel natural for complex forms took iteration.
  • PDF annotation: Rendering answers as overlays meant mapping Gemini's bounding box coordinates onto PDF page coordinates.
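The coordinate mapping behind the PDF overlay can be sketched in a few lines. This assumes the box is `[ymin, xmin, ymax, xmax]` normalized to 0-1000 with a top-left origin, that the analyzed image covers the whole unrotated page, and that the target is the standard PDF coordinate space (points, origin at the bottom-left). The `box_to_pdf_rect` helper is hypothetical.

```python
def box_to_pdf_rect(box, page_width, page_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box (0-1000, top-left origin)
    into a PDF rectangle (x0, y0, x1, y1) in points with a bottom-left origin."""
    ymin, xmin, ymax, xmax = box
    x0 = xmin / 1000 * page_width
    x1 = xmax / 1000 * page_width
    # Flip the vertical axis: image y grows downward, PDF y grows upward.
    y0 = page_height - ymax / 1000 * page_height
    y1 = page_height - ymin / 1000 * page_height
    return (x0, y0, x1, y1)
```

On a US Letter page (612 by 792 points), a box covering the full image maps to the full page rectangle. Some PDF libraries expose a top-left origin instead, in which case the vertical flip should be dropped.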

What's Next

  • Browser extension for direct website form-filling
  • Clara Profile API for third-party integrations
  • Expanded document generation (tax forms, visa applications)

Try It

Clara is live and deployed on Google Cloud Run. Any form, anywhere, from your phone.


