Aayush Gupta

Posted on Dec 6, 2025

BowSensei 🏹— An AI Powered Archery Coach. 🎯

#googleaichallenge #ai #agents #devchallenge

Google AI Challenge Submission

This is a submission for the Google AI Agents Writing Challenge.

Key Learnings & Insights That Resonated Most With Me

The most powerful insight I gained during this intensive was realizing that “agentic AI” is not just a smarter chatbot — it is a system capable of perceiving, reasoning, planning, and acting like an intelligent collaborator.

I understood how dramatically AI transforms when you give it autonomy, memory, multimodal inputs, and the ability to call tools.

Suddenly, it stops being reactive and starts behaving like a true assistant.

Another key lesson was how multimodal reasoning unlocks real-world impact.

The moment I saw an agent interpret images, draw inferences, and make decisions, I realized how AI can shift from passive to genuinely assistive.

Finally, the biggest insight for me was this:

AI does not replace human skill — it amplifies it.

This philosophy shaped the entire foundation of my project.

What Concept Resonated Most With Me?

The concept that resonated most with me came from the Agent Quality whitepaper (whitepaper 4), especially the idea that evaluating an agent is not just about its final answer, but about understanding its trajectory of decisions.

This completely shifted my thinking. Earlier, I believed a “good agent” was simply one that produced correct outputs.

But this paper showed me that high-quality agents:

follow coherent, step-by-step reasoning,
maintain consistency in their internal workflows,
show stability in decision-making under uncertainty,
and leave behind traceable reasoning paths that can be inspected and improved.

This “inside-out” perspective (the Glass-Box approach) was a click moment for me.

It made me realize that reliable agents are built through structured internal workflows, even if the whitepaper doesn’t explicitly use that phrase.
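
To make the idea of a traceable trajectory concrete, here is a minimal sketch in Python. The `Step` and `Trajectory` names and the agent labels are hypothetical, chosen only for illustration; they are not taken from the whitepaper or from any Google library.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    agent: str   # which agent acted (hypothetical label)
    action: str  # what kind of step it was, e.g. "llm_response" or "tool_call"
    detail: str  # short human-readable note for later inspection


@dataclass
class Trajectory:
    steps: list = field(default_factory=list)

    def record(self, agent: str, action: str, detail: str) -> None:
        """Append one step so the full decision path can be inspected later."""
        self.steps.append(Step(agent, action, detail))

    def agents_in_order(self) -> list:
        """Return agent names in the order they first appeared."""
        seen = []
        for step in self.steps:
            if step.agent not in seen:
                seen.append(step.agent)
        return seen

    def followed_expected_order(self, expected: list) -> bool:
        """Check the path taken, not just the final answer."""
        return self.agents_in_order() == expected
```

A check like `followed_expected_order(["intake", "coach", "plan"])` judges how the agent got to its answer, which is the shift in perspective described above.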

Another concept that stayed with me was how tool-use and function-calling directly shape agent quality.

An agent that can measure, retrieve, analyze, and act with precision naturally becomes more trustworthy.

How Has My Understanding of AI Agents Evolved?

Before this program, my understanding of AI was very basic.

I used to think of AI as something similar to a smart autocomplete system:

Input → Output.

You give a query, it gives you an answer — nothing more.

I first heard the term Agentic AI a few months ago during a hackathon in Lucknow (#HackWithUttarPradesh). I participated, submitted my idea, and didn’t get selected 😭.

But the experience made me curious about what “agentic AI” really means.

At that time, I believed agents were just normal AI models wrapped inside an application.

But after attending Google's 5-day Agentic AI Workshop and studying all five whitepapers, my understanding became far more mature:

Agents can perceive (images, videos, multimodal signals)
Agents can reason (multi-step planning, constraint solving)
Agents can act (tool calling, memory usage, environment interaction)
Agents can self-correct (evaluate their own outputs)
Agents can improve over time (memory + feedback loops)
Agents can be deployed at scale (Vertex AI Agents)

I now think of AI agents as digital teammates who can collaborate with humans on complex, dynamic tasks.

This shift changed not only how I build AI systems but also how I imagine future workflows.

Capstone Project

1- What I Built

BowSensei — AI Powered Archery Coach

BowSensei is an AI-powered multi-agent, multimodal archery coach built using Gemini.
It analyzes an archer’s technique through images, video, audio, or text, identifies form mistakes, and generates personalized corrective drills + a 7-day training plan.

Tech Intelligence: Multi-Agent Reasoning + Memory Layer + Multimodal Analysis

2- Why I Built This

As a professional competitive recurve archer, I know how hard it is to get consistent, high-quality feedback.

Technique mistakes like:

anchor position shifting every shot
inconsistent release
bow arm collapsing
posture + shoulder alignment issues

are extremely difficult to diagnose without a coach watching you daily.

Coaches are expensive, not available 24/7, and beginners often don’t understand what they are doing wrong or how to fix it.

I personally hit a plateau around 322/360, training every day but still unable to identify the subtle mistakes holding me back.

So during the 5-day Agentic AI intensive organised by Google and Kaggle, I decided to build something that solves a real problem from my own life, based on one realization:

AI agents can now perceive, reason, and coach — exactly like real experts.

So I built BowSensei:

An agentic AI archery coach that can:

Analyze shooting through photos (multiple angles)
Understand short shooting videos
Listen to audio descriptions of problems
Interpret long or short text inputs from the archer
Detect technique mistakes
Provide personalized drills + explanation
Generate a custom 7-day training plan
Remember past sessions using a memory layer

BowSensei solves a real sports problem:

giving every archer — from beginner to professional — a personal AI coach that is available anytime, anywhere.

3- How BowSensei Works (Multi-Agent Architecture)

BowSensei uses three coordinated agents, each with a clear job.

i) Agent 1: Intake Agent

Goal: Convert ANY user input (text, audio transcript, video frame summary, images) into a clean structured JSON profile.

It extracts:

bow type
draw weight
experience level
main issues
goals

This ensures every next step is consistent and structured.
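
As a rough illustration of what that structured profile could look like, here is a minimal Python sketch. The field names, defaults, and the `parse_profile` helper are my assumptions for this write-up, not the exact schema used in the notebook.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class ArcherProfile:
    # Field names are illustrative assumptions, mirroring the list above.
    bow_type: str = "unknown"
    draw_weight_lbs: Optional[float] = None
    experience_level: str = "unknown"
    main_issues: list = field(default_factory=list)
    goals: list = field(default_factory=list)


def parse_profile(raw_json: str) -> ArcherProfile:
    """Turn the Intake Agent's JSON text into a profile with safe defaults."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object for the profile")
    known = {f.name for f in fields(ArcherProfile)}
    # Ignore unexpected keys so a chatty model cannot break later steps.
    cleaned = {k: v for k, v in data.items() if k in known}
    return ArcherProfile(**cleaned)
```

Missing fields fall back to defaults, so the Coach and Plan agents can always rely on the same keys being present.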

ii) Agent 2: Coach Agent

Goal: Analyse technique problems and suggest fixes.

It identifies issues such as:

posture problems
inconsistent anchor
unstable bow arm
bad release timing
poor shoulder alignment
weak back tension

Output:

likely mistakes
corrective drills
explanations
Hinglish-friendly coaching notes

iii) Agent 3: Plan Agent

Goal: Build a detailed 7-day structured training plan.

Includes:

daily technical drills
strength drills
bow-arm + back-tension work
rest days
volume distribution
mental training cues

The plan is fully personalized to the archer’s profile + mistakes.
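
In BowSensei the plan itself comes from the model, but the idea of volume distribution with rest days can be sketched in plain Python. Everything here is a hypothetical illustration: the `build_week` function, the arrow counts, and the focus labels are assumptions, not values from the notebook.

```python
def build_week(total_arrows: int, rest_days: set, focus: list) -> list:
    """Spread a weekly arrow count over training days and rotate focus areas."""
    training_days = [d for d in range(1, 8) if d not in rest_days]
    if not training_days:
        raise ValueError("at least one training day is required")

    # Split the volume evenly; the first `extra` training days get one more arrow.
    base, extra = divmod(total_arrows, len(training_days))

    plan = []
    slot = 0  # index of the current training day (rest days do not count)
    for day in range(1, 8):
        if day in rest_days:
            plan.append({"day": day, "type": "rest", "arrows": 0, "focus": None})
            continue
        arrows = base + (1 if slot < extra else 0)
        day_focus = focus[slot % len(focus)] if focus else None
        plan.append({"day": day, "type": "training", "arrows": arrows, "focus": day_focus})
        slot += 1
    return plan


# Example with made-up numbers: 502 arrows, rest on days 4 and 7.
week = build_week(502, {4, 7}, ["anchor consistency", "bow-arm stability"])
```

Keeping this part deterministic is one possible design choice: the model proposes what to work on, and simple code guarantees the week always has seven days and the requested total volume.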

iv) Memory Layer — Behaves Like a Real Coach

The memory system stores:

past mistakes
improvements
session-to-session consistency
recurring problems

This allows BowSensei to “remember” the archer and coach them long-term.
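
A minimal sketch of how such a memory layer could track recurring problems and improvements between sessions. The `SessionMemory` class is a hypothetical, in-memory illustration; the notebook's actual storage may work differently.

```python
from collections import Counter


class SessionMemory:
    """Keeps the list of detected mistakes for each coaching session, in order."""

    def __init__(self):
        self._sessions = []

    def add_session(self, mistakes: list) -> None:
        # Store each mistake once per session so counts mean "number of sessions".
        self._sessions.append(sorted(set(mistakes)))

    def recurring(self, min_sessions: int = 2) -> list:
        """Mistakes that showed up in at least `min_sessions` sessions."""
        counts = Counter(m for session in self._sessions for m in session)
        return sorted(m for m, c in counts.items() if c >= min_sessions)

    def improvements(self) -> list:
        """Mistakes present in the previous session but absent from the latest one."""
        if len(self._sessions) < 2:
            return []
        previous, latest = self._sessions[-2], self._sessions[-1]
        return sorted(set(previous) - set(latest))
```

Feeding `recurring()` and `improvements()` back into the Coach Agent's prompt is one way the system could coach across sessions instead of starting from zero each time.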

4- System Architecture

5- Technical Implementation (Tech Stack)

Python
Gemini 2.5 Flash (via google.generativeai)
Google Agent Development Kit (ADK)
Multi-agent orchestration (architecture layer)
Kaggle Notebook Runtime
Google Cloud (Vertex AI Deployment)
Google AI Studio (Model testing & configuration)

6- Demo:

📌 Important Note:

For this project write-up, I am showing only one demo, using text as the input.

If you want to see how BowSensei works with image, video, or audio inputs, the full notebook is linked below.

Kaggle Notebook Link: https://www.kaggle.com/code/san1357/bowsensei-an-ai-archery-coach-by-aayush-gupta

You can open the notebook and run those demos yourself.

All other input types (image/video/audio) produce the same final output format as Demo 1 (text input).

7- What I Learned From Building BowSensei

This project taught me the core skills required to build real-world, agentic AI systems — not just chatbots.

Here’s everything I learned, grouped into three clear sections:

i) Technical Learnings

1. Multi-Agent System Design

How to design Intake → Coach → Plan agents
How agents collaborate using structured JSON
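
The hand-off between the three agents can be sketched as a chain of functions that pass plain dictionaries. The stub agents below stand in for real model calls and are purely hypothetical; only the Intake → Coach → Plan ordering comes from BowSensei itself.

```python
from typing import Callable

# Each agent takes a dict and returns a dict (i.e. JSON-compatible data).
Agent = Callable[[dict], dict]


def run_pipeline(user_input: str, intake: Agent, coach: Agent, plan: Agent) -> dict:
    """Run Intake -> Coach -> Plan, passing structured data between steps."""
    profile = intake({"raw_input": user_input})
    analysis = coach({"profile": profile})
    week = plan({"profile": profile, "analysis": analysis})
    return {"profile": profile, "analysis": analysis, "plan": week}


# Stub agents for illustration only; a real version would call the model.
def stub_intake(payload: dict) -> dict:
    return {"bow_type": "recurve", "main_issues": ["inconsistent anchor"]}


def stub_coach(payload: dict) -> dict:
    issues = payload["profile"]["main_issues"]
    return {"likely_mistakes": issues, "drills": [f"drill for {i}" for i in issues]}


def stub_plan(payload: dict) -> dict:
    drills = payload["analysis"]["drills"]
    return {"days": [{"day": d, "drills": drills} for d in range(1, 8)]}


result = run_pipeline("My anchor keeps moving", stub_intake, stub_coach, stub_plan)
```

Because every step only sees structured data, each agent can be tested or swapped out on its own.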

2. Strict JSON Enforcement & Error Handling

Forcing LLMs to output clean, valid JSON
Building fallback logic when the model breaks
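
One common way to handle this, sketched below under my own assumptions (the `extract_json` helper is hypothetical and not copied from the notebook): try to parse the reply as-is, then look inside a fenced code block, then take the outermost braces, and only then fall back to a default.

```python
import json
import re

FENCE = "`" * 3  # a literal Markdown code fence, built this way for readability


def extract_json(text: str, fallback: dict) -> dict:
    """Best-effort extraction of a JSON object from a model reply."""
    candidates = [text.strip()]

    # 1) Content of the first fenced code block, with an optional "json" tag.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1).strip())

    # 2) Everything between the first "{" and the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed

    # 3) Nothing parsed: return a copy of the fallback so the pipeline keeps running.
    return dict(fallback)
```

Returning a known fallback instead of raising keeps the next agent's input predictable, at the cost of hiding the failure unless it is logged separately.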

3. Multimodal AI Implementation

How Gemini 2.5 Flash processes images + video + audio + text
How multimodal reasoning improves accuracy

4. End-to-End AI Pipeline Architecture

Structuring the full pipeline from user → agents → final plan
Designing memory-ready systems
Thinking in terms of real-world deployment, not just notebooks
Making the system reliable, debuggable, and predictable

ii) AI & Agentic Understanding

1. Agentic AI ≠ Chatbot

Agents perceive → reason → plan → act, not just reply
Autonomy + memory make them behave like real coaches

2. Why Roles Matter (Intake / Coach / Plan)

Splitting responsibilities leads to clearer, better reasoning
Each agent focuses on one layer of intelligence

iii) Personal Growth

1. Built a Real-World AI Product

Learned to think like an engineer building for real users

2. Deep Debugging & Prompt Engineering

Understanding where LLMs fail and how to fix them

3. Converting Domain Knowledge → AI Logic

Turning archery concepts into structured rules, drills, and plans

8- Final Reflection

This project taught me how real agent-based AI systems are built, from prompts to pipelines to structured reasoning.
