DEV Community

Sercan Koç


Road Accident Report - AI Assistant

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I built the AI Accident Assistant, a web application that addresses a real-world problem: filling out a vehicle accident report at the roadside is stressful, confusing, and error-prone.

An accident is a chaotic event. The last thing anyone wants is to manually copy information while trying to recall details under pressure. My application transforms this experience.

A user simply:

  • Takes photos of official documents (driver's license, insurance policy).
  • Captures the scene with photos or videos.
  • Records a brief voice memo describing what happened.

The AI assistant then takes over, processing this multimedia evidence to generate a pre-filled draft accident report that the user reviews and corrects. The final output is an organized Evidence Package (.zip) containing a formal HTML report, the AI-generated sketch, and all original media files, ready for submission to an insurance company.

[Screenshots: Landing Page, Landing Page 2]

Demo

Deployed Applet: https://road-accident-report-ai-assistant-121419357176.us-west1.run.app

The demo showcases the full, end-to-end user journey:

Setup: The user is greeted by a clean landing page and then selects the report's jurisdiction (e.g., UK, California) and language.

[Screenshot: Setup Page]

Upload: The user uploads all their evidence (documents, scene photos, and audio statements) for each driver involved. They can also use their device's GPS to log the accident location.

[Screenshot: Upload]

AI-Powered Verification: The user reviews the AI-generated draft report. If the AI has questions due to missing or conflicting data, it presents a conversational chat interface for clarification. The user can also visually adjust the AI-generated SVG diagram and review a separate AI-generated artistic sketch of the scene.

[Screenshot: AI-Powered Verification]

Download: After signing digitally and providing consent, the user downloads the final Evidence Package.

[Screenshots: Download, Downloaded]

How I Used Google AI Studio

The core of this application is a chain of multimodal prompts sent to Google's Gemini 2.5 Flash and Imagen 4.0 models.

I engineered a multi-step AI workflow within the application:

Data Extraction (Gemini): The first call sends all media files (images, audio) along with contextual data (like GPS location) to Gemini. I leveraged Gemini's native JSON output mode by providing a strict schema, ensuring a reliable and structured data response that populates the report draft.
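To make the "strict schema" idea concrete, here is a minimal sketch of the client-side half of that step: a typed report shape and a function that rejects any model response that does not match it. The field names (`jurisdiction`, `drivers`, `unreadableDocuments`, and so on) and the `parseReportDraft` helper are assumptions for illustration, not the app's actual schema or code, and the Gemini call itself is left out.

```typescript
// Hypothetical report shape; the real app's schema is not shown in this post.
interface DriverInfo {
  fullName: string;
  licenseNumber: string;
  insurancePolicyNumber: string;
}

interface ReportDraft {
  jurisdiction: string;
  location: { latitude: number; longitude: number } | null;
  drivers: DriverInfo[];
  narrative: string;
  unreadableDocuments: string[]; // documents the model flagged as blurry or unreadable
}

function isString(v: unknown): v is string {
  return typeof v === "string";
}

function isDriver(v: unknown): v is DriverInfo {
  if (typeof v !== "object" || v === null) return false;
  const d = v as Record<string, unknown>;
  return (
    isString(d.fullName) &&
    isString(d.licenseNumber) &&
    isString(d.insurancePolicyNumber)
  );
}

// Parse the raw JSON text returned by the model and check it against the shape above.
// Throws if the text is not valid JSON or does not match the expected structure.
function parseReportDraft(raw: string): ReportDraft {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Response is not a JSON object");
  }
  const r = data as Record<string, unknown>;
  const { jurisdiction, drivers, narrative, unreadableDocuments } = r;
  const loc = r.location as Record<string, unknown> | null | undefined;
  const locationOk =
    loc === null ||
    (typeof loc === "object" &&
      typeof loc.latitude === "number" &&
      typeof loc.longitude === "number");

  if (
    !isString(jurisdiction) ||
    !locationOk ||
    !Array.isArray(drivers) ||
    !drivers.every(isDriver) ||
    !isString(narrative) ||
    !Array.isArray(unreadableDocuments) ||
    !unreadableDocuments.every(isString)
  ) {
    throw new Error("Response does not match the expected report schema");
  }
  return data as ReportDraft;
}
```

Even when the model is asked for schema-constrained JSON, a check like this on the client is a cheap safeguard before the data populates the report draft.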

Interactive Diagram Generation (Gemini): A second, targeted prompt asks Gemini to analyze the scene photos and the extracted data and to generate a clean SVG diagram of the accident. The prompt explicitly instructs the model to assign specific IDs to the SVG groups so that the UI can make the corresponding diagram elements draggable.
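As a rough illustration of why stable group IDs matter, the sketch below repositions one group in an SVG string by rewriting its `transform` attribute. The `setGroupTranslate` helper and the IDs `vehicle-a` and `vehicle-b` are hypothetical; the post does not list the IDs the prompt actually requests, and the real app most likely works on DOM nodes with pointer events rather than on strings.

```typescript
// Set (or replace) the translate transform on the <g> element with the given id.
// Works on the SVG markup as a string; assumes double-quoted attributes.
function setGroupTranslate(
  svg: string,
  groupId: string,
  dx: number,
  dy: number
): string {
  // Escape characters that have a special meaning in a regular expression.
  const escaped = groupId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Match the opening <g ...> tag that carries id="<groupId>".
  const openTag = new RegExp(`<g\\b([^>]*\\sid="${escaped}"[^>]*)>`);

  return svg.replace(openTag, (_match: string, attrs: string) => {
    // Drop any existing transform so the new position replaces the old one.
    const withoutOld = attrs.replace(/\s+transform="[^"]*"/, "");
    return `<g${withoutOld} transform="translate(${dx} ${dy})">`;
  });
}

// Usage: move the (hypothetical) second vehicle after a drag ends.
const diagram =
  '<svg><g id="vehicle-a"><rect width="10" height="20"/></g>' +
  '<g id="vehicle-b" transform="translate(1 2)"><rect width="10" height="20"/></g></svg>';
const movedDiagram = setGroupTranslate(diagram, "vehicle-b", 30, 40);
```

Because each draggable element is addressed by ID, the UI can update one vehicle's position without regenerating or re-parsing the rest of the diagram.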

Sketch Generation (Imagen): A third call sends a descriptive prompt to Imagen, which generates a top-down, black-and-white schematic sketch of the accident, providing an alternative visual representation for the final report.

Conversational Clarification (Gemini): If information is missing, the application initiates a conversational loop. The user's text answers are sent back to Gemini with the current report data, and the model intelligently updates the JSON with the new information.
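The clarification loop can be sketched on the application side as two small steps: find the fields that are still empty, then merge the model's updated values back into the report. The flat `Report` type and the `missingFields` and `applyUpdate` helpers below are illustrative assumptions; the post does not show the app's real data structures, and the Gemini call that produces the update is omitted.

```typescript
// Hypothetical flat report: every field is either filled in or still null.
type Report = Record<string, string | null>;

// Fields the assistant still needs to ask about.
function missingFields(report: Report): string[] {
  return Object.keys(report).filter(
    (key) => report[key] === null || report[key] === ""
  );
}

// Merge an update (for example, parsed from the model's JSON reply) into the report.
// Unknown keys and empty values are ignored, so a reply cannot add fields
// or erase data that was already extracted. The input report is not mutated.
function applyUpdate(report: Report, update: Partial<Report>): Report {
  const next: Report = { ...report };
  for (const [key, value] of Object.entries(update)) {
    if (key in report && value != null && value !== "") {
      next[key] = value;
    }
  }
  return next;
}

// Usage: one round of the loop with made-up values.
const draft: Report = { driverName: "Alex Example", licensePlate: null };
const stillMissing = missingFields(draft); // fields to ask the user about
const updated = applyUpdate(draft, { licensePlate: "AB12 CDE", unknownField: "x" });
```

The loop would repeat until `missingFields` returns an empty list or the user chooses to leave a field blank.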

Multimodal Features

The multimodal capabilities of Google's AI models are the foundation of this app's user experience.

Image Understanding (Documents & Scene)

What it does: Gemini analyzes official documents to extract key details and simultaneously interprets scene photos to understand road conditions, vehicle positions, and impact points.

Why it's better: This eliminates manual data entry, reduces human error, and saves critical time. The AI is even prompted to detect and flag unreadable or blurry documents, giving the user a chance to re-upload for better accuracy.

Audio Intelligence (Voice Memos)

What it does: The user can record a voice memo describing the accident. Gemini transcribes the statement and cross-references the narrative with visual evidence from the photos.

Why it's better: This offers a natural way for users to provide their statement while events are fresh. The AI's ability to flag contradictions between the spoken account and the photos is a useful validation step that helps produce a more consistent report.

Intelligent Generation (SVG Diagram & PNG Sketch)

What it does: The app leverages both Gemini and Imagen to generate two distinct visual aids: a clean, interactive SVG diagram for precise adjustments, and a simple, easy-to-understand PNG sketch.

Why it's better: This provides users with multiple ways to visualize the incident. The interactive SVG diagram, in particular, lets the user fine-tune the AI's output, making report-building a collaborative process that can yield a more accurate final document.
