This is a submission for the Google AI Studio Multimodal Challenge
What I Built
I built "DIY-AI Fix-It", a comprehensive, AI-powered assistant that guides users through common household repairs. The app is designed to solve the entire problem from start to finish: from not knowing what's wrong, to successfully completing the fix with confidence.
A user simply uploads a photo of their problem and describes it. The AI then generates a complete, professional-grade repair plan (a rough sketch of its structure follows the list below), including:
A Clear Diagnosis: A simple explanation of what's wrong.
Complete Tool & Material List: A checklist of everything needed for the job.
Difficulty & Time Estimates: So you know what you're getting into before you start.
Step-by-Step Instructions: Safe, easy-to-follow steps, with automated warnings for hazardous tasks.
Potential Pitfalls: Critical advice on common mistakes to avoid.
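To make that concrete, here is a minimal sketch of the kind of structured response the app works with. The field names are illustrative assumptions, not the exact schema my deployed prompt returns:

```typescript
// Illustrative shape of the structured repair plan.
// Field names are assumptions, not the exact schema of the deployed prompt.
interface RepairPlan {
  diagnosis: string;            // plain-language explanation of what's wrong
  toolsAndMaterials: string[];  // checklist of everything needed for the job
  difficulty: "easy" | "moderate" | "hard";
  estimatedTime: string;        // e.g. "30-45 minutes"
  steps: { instruction: string; safetyWarning?: string }[];
  pitfalls: string[];           // common mistakes to avoid
}
```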
The experience is fully interactive. Users can ask clarifying questions in a Contextual Follow-up Chat and even have the instructions read aloud with a Text-to-Speech feature for hands-free guidance during a repair.
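For the read-aloud feature, the browser's standard Web Speech API is enough for a hands-free flow. This is a minimal sketch of that approach; the details in the deployed app may differ:

```typescript
// Minimal sketch of hands-free guidance using the browser's built-in
// SpeechSynthesis API (assumed approach; the deployed app may differ).
function readStepAloud(stepText: string): void {
  const utterance = new SpeechSynthesisUtterance(stepText);
  utterance.rate = 0.9;             // slightly slower, easier to follow mid-repair
  window.speechSynthesis.cancel();  // stop any step already being read
  window.speechSynthesis.speak(utterance);
}
```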
Link to my deployed applet:
Screenshots of the app in action:
The main interface where a user uploads an image and describes the problem.
The initial, structured solution provided by the AI assistant.
The conversational follow-up feature, where the user can ask for more help.
How I Used Google AI Studio
I used Google AI Studio's Freeform prompt to build the entire logic core of my AI assistant. The prompt I created is designed to be highly intelligent and state-aware. It instructs the Gemini model to analyze a user's request and determine if it's an initial query or a follow-up question within an existing conversation.
This conditional logic, built entirely within a single prompt, allows the model to respond in two distinct ways:
For new problems, it provides a structured, predictable JSON output that my React app can easily parse and display.
For follow-up questions, it switches to a conversational mode, providing helpful, context-aware answers in natural language.
I was able to test, refine, and perfect this complex logic flow entirely within the AI Studio environment before deploying it.
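To give a feel for that conditional logic, here is a simplified sketch of how a state-aware prompt can branch on whether a chat history exists. The wording and field names are illustrative, not the exact prompt I iterated on in AI Studio:

```typescript
// Simplified sketch of a state-aware prompt. The real prompt refined in
// AI Studio is longer; the wording here is illustrative only.
function buildPrompt(userMessage: string, chatHistory: string[]): string {
  const isFollowUp = chatHistory.length > 0;
  return [
    "You are a household repair assistant. Analyze the attached photo.",
    isFollowUp
      ? "This is a FOLLOW-UP question in an existing conversation. " +
        "Answer conversationally in plain language, grounded in the photo " +
        "and the conversation so far:\n" + chatHistory.join("\n")
      : "This is a NEW problem. Respond ONLY with JSON containing: " +
        "diagnosis, toolsAndMaterials, difficulty, estimatedTime, steps, pitfalls.",
    "User message: " + userMessage,
  ].join("\n\n");
}
```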
Multimodal Features
My app is built from the ground up on Gemini's powerful multimodal capabilities, specifically combining image/video analysis with contextual text.
The core feature is the AI's ability to ground its entire analysis and conversation in the visual information provided by the user. When it provides instructions, it's referring directly to the components it can "see" in the photo.
The most advanced multimodal feature is the conversational repair logic. The AI must hold the initial image/video in context while processing the new, text-based questions from the chat_history. This creates an experience where a user can essentially have a conversation about a physical object, asking questions like "Are you sure I should turn that blue knob?" and getting an intelligent response. This deep integration of visual data and conversational text is what makes the app so uniquely helpful.
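As a rough sketch of how the image and the ongoing conversation can travel in a single request (assuming the @google/generative-ai JavaScript SDK; the model name and wiring in the deployed app may differ):

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

// Sketch: send the original photo plus the follow-up question in one request,
// so the model answers with the physical object still in view.
// (Assumes the @google/generative-ai SDK; model name is illustrative.)
async function askFollowUp(
  apiKey: string,
  imageBase64: string,    // the user's original photo, base64-encoded
  chatHistory: string[],  // prior questions and answers, as plain text
  question: string        // e.g. "Are you sure I should turn that blue knob?"
): Promise<string> {
  const genAI = new GoogleGenerativeAI(apiKey);
  const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

  const result = await model.generateContent([
    { inlineData: { data: imageBase64, mimeType: "image/jpeg" } },
    "Conversation so far:\n" + chatHistory.join("\n"),
    "Follow-up question: " + question,
  ]);
  return result.response.text();
}
```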