DEV Community

Cover image for Echo Location Project [Google AI Studio Multimodal Challenge]
William Henry King
William Henry King

Posted on

Echo Location Project [Google AI Studio Multimodal Challenge]

๐ŸŽฏ This is a submission for the Google AI Studio Multimodal Challenge ๐Ÿš€

๐ŸŒฟ๐Ÿ” Echo Location Project ๐Ÿฆ…โœจ

๐Ÿš€ What I Built

I built ๐ŸŒŸ Echo Location ๐ŸŒŸ, a web app that reframes the concept of an animal identifier into an interactive and purpose-driven conservation quest! ๐ŸŽฏ๐ŸŒ The experience isn't just about answering "What animal is this?" ๐Ÿค”; it's about solving the deeper problem of the disconnect between humanity and the natural world! ๐Ÿ’š๐ŸŒฑ

๐Ÿšจ It's actually meant for mobile...but in this case it can also be seen as used on desktop, tablet...etc. ๐Ÿ“ฑ๐Ÿ’ป๐Ÿ“ฒ

๐ŸŽชโœจ๐ŸŒŸโญ๐Ÿ’ซโšก๐Ÿ”ฅ๐Ÿ’ฅ๐ŸŽ‰๐ŸŽŠ๐ŸŒˆโœจ

๐ŸŒŸ The Experience ๐ŸŒŸ

Echo Location transforms every user into an "Eco-Scout" ๐Ÿงญ๐ŸŒฟ on a mission! When a user uploads a photo ๐Ÿ“ธ, video ๐ŸŽฅ, or even an audio clip ๐ŸŽต of wildlife, they're not just getting a name. They're filing a "Sighting Report" ๐Ÿ“‹ that initiates an immersive learning journey! ๐ŸŽ“โœจ

๐Ÿค– TECH MAGIC: The app uses Gemini 2.5 Pro and Flash ๐Ÿง โšก to perform a deep multimodal analysis!

๐ŸŽฏ Generated Field Report Features:

๐Ÿฆ† The animal's story

๐Ÿšจ Official conservation status

โš ๏ธ Primary threats

๐Ÿ“ Simulated geolocation based on its environment

๐ŸŽฎ Gamification Elements:

This journey is gamified through a "Ranger's Field Journal" ๐Ÿ“– where users:

  • ๐Ÿ“ˆ Level up
  • ๐Ÿ… Earn badges for completing ecosystem collections
  • ๐Ÿ”“ Unlock "Hope Spotlights"โ€”real stories of conservation success! ๐ŸŒŸ

๐ŸŽช MOST IMPORTANTLY: Echo Location bridges the digital-to-real-world gap by issuing actionable "Field Missions," like plastic cleanups ๐Ÿ—‘๏ธ or pollinator pledges ๐Ÿ, turning learning into tangible, positive environmental action! ๐Ÿ’ช๐ŸŒ

๐ŸŽฌ Demo

๐Ÿ“ฑ Deployed Link: https://echo-location-849419792496.us-west1.run.app/

๐Ÿ“ธ Screenshots:

โœจ๐ŸŒŸโญ๐Ÿ’ซโšก๐Ÿ”ฅ๐Ÿ’ฅ๐ŸŽ‰๐ŸŽŠ๐ŸŽช๐ŸŒˆโœจ

๐Ÿง  How I Used Google AI Studio

Google AI Studio was the central nervous system ๐Ÿงฌ for developing Echo Location! Gemini 2.5 Pro ๐Ÿค– is not just a feature; it is the brain ๐Ÿง  and the heart โค๏ธ of the entire application!

๐ŸŽฏ WORKFLOW MAGIC: My primary workflow revolved around extensive prompt engineering in the AI Studio! ๐Ÿ› ๏ธโœจ

I crafted a detailed system prompt that establishes the persona of "Gem," ๐Ÿ’Ž our AI Field Biologist! This prompt instructs Gemini to act as an enthusiastic guide ๐Ÿ—บ๏ธ and to structure all its responses in the format of our "Field Report." ๐Ÿ“‹

๐Ÿš€ The Process

๐Ÿ”„ THE MAGIC FLOW:

๐Ÿ“ฑ User uploads media (image, video, or audio)
        โ†“
๐Ÿš€ Backend sends to Gemini 2.5 Pro and Flash  
        โ†“
๐Ÿง  Prompt directs chain-of-thought analysis
        โ†“  
๐Ÿ“Š Generates immersive Field Report!
Enter fullscreen mode Exit fullscreen mode

๐ŸŽฏ Chain-of-Thought Analysis Breakdown:

๐Ÿ” Identify: Identify the species, its scientific name, and its conservation status

๐ŸŒฟ Analyze: Analyze the surrounding environment, flora, and context to deduce a probable ecosystem and geolocation

๐Ÿ“– Narrate: Weave this data into an engaging, educational story in Gem's persona

โšก Act: Based on the species and its threats, generate a relevant, actionable "Field Mission" for the user

๐ŸŽช PRO TIP: Google AI Studio was indispensable for rapidly testing and refining these complex prompts! Being able to quickly iterate on inputs and outputs and then deploy the model via Cloud Run made the entire development cycle seamless! ๐ŸŒŸ

๐ŸŽจ Multimodal Features

The multimodal capabilities of Gemini 2.5 Pro are what elevate Echo Location from a simple app to an immersive experience! ๐ŸŽชโœจ

๐ŸŒ Holistic Scene and Video Understanding

๐ŸŽฏ This is the core input! The app doesn't just recognize an animal; it understands the scene! ๐ŸŽฌ When a user uploads a video of a bear ๐Ÿป catching a salmon ๐ŸŸ, Gemini comprehends:

๐ŸŽฏ The action (hunting)

๐Ÿค The interaction between species

๐Ÿ“Š Context differences (e.g., a lion resting ๐Ÿ˜ด vs. a lion on the prowl ๐Ÿ‘€)

๐ŸŒŸ MAGIC MOMENT: This contextual understanding allows "Gem" to generate narratives that are incredibly rich and specific to the moment captured by the user, making every Field Report unique and insightful! ๐Ÿ’ซ

๐ŸŽต Eco-Acoustic Analysis

๐ŸŽช A standout feature is the ability to accept audio uploads! ๐ŸŽ™๏ธ A user can record:

๐Ÿฆ Bird calls in their backyard

๐Ÿฆ— The sound of insects at night

๐Ÿค– GEMINI MAGIC: Gemini analyzes this soundscape to identify potential species ("That distinct call belongs to a Northern Cardinal!" ๐Ÿฆ) and describe the health of the local ecosystem! ๐ŸŒฟ

This turns the user's own environment into a subject of discovery and makes them feel like a true field biologist ๐Ÿ”ฌ using advanced tools! ๐Ÿ› ๏ธ

๐Ÿ”„ Context-Driven Content Generation

The app demonstrates a powerful multimodal feedback loop! ๐ŸŒช๏ธ The visual ๐Ÿ‘๏ธ or audio ๐Ÿ‘‚ input is the catalyst for all generated content!

๐ŸŽฏ Examples in Action:

๐Ÿข Sea Turtle Image โ†’ doesn't just trigger a story about turtles; it triggers the generation of the "Plastic Patrol Mission" ๐Ÿ—‘๏ธ

๐Ÿ Garden Bee Audio โ†’ triggers the generation of the "Pollinator Pledge Mission" ๐ŸŒป

๐ŸŽช THE SECRET SAUCE: This ensures that the conservation message and the call-to-action are always directly relevant to the user's discovery, creating a powerful and persuasive user experience that drives the app's core mission forward! ๐Ÿš€๐Ÿ’š

โœจ๐ŸŒŸโญ๐Ÿ’ซโšก๐Ÿ”ฅ๐Ÿ’ฅ๐ŸŽ‰๐ŸŽŠ๐ŸŽช๐ŸŒˆโœจ

Made with โค๏ธ by @williamhenryking and @linfordlee14 ! ๐Ÿ‘จโ€๐Ÿ’ป๐Ÿ‘จโ€๐Ÿ’ป

๐Ÿ™ Thanks for reading our submission! ๐ŸŽ‰๐Ÿš€

Top comments (2)

Collapse
 
linfordlee14 profile image
Linford

It was amazing collaborating on this project! Learned a lot about multimodal AI and Google AI Studio along the way. Proud of what we achieved together!

Collapse
 
linfordlee14 profile image
Linford

Echo Location Project is such a creative application of AI! Projects like this show how much potential there is in multimodal AI for real-world solutions.