Osama_Osman

Posted on Sep 10

Sketch2Web

#devchallenge #googleaichallenge #ai #gemini

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

Sketch2Web is an AI-powered web development environment that turns your ideas into fully functional, multi-page websites in minutes. It's built for creators, entrepreneurs, and anyone with a vision who wants to skip the complexity of traditional coding.

The core problem Sketch2Web solves is the high barrier to entry in web development: the time, resources, and technical expertise it normally requires. With Sketch2Web, you simply describe your ideal website, and the AI agent handles the rest, generating clean, responsive HTML, CSS, and JavaScript.

Key Features at a Glance:
Conversational Creation & Refinement: Engage in a natural, iterative dialogue with the AI. Start with a simple idea and refine it step by step. Ask for changes like "make the navigation bar sticky" or "add a testimonials section," and watch them happen in real time.

Live Visual Editor: Why wait to see your changes? Sketch2Web provides a live preview of your website. Click directly on any element—a button, a headline, an image—to open an intuitive popover and edit its styling, content, or attributes without touching a line of code.

Multimodal Prompting: We believe ideas come in many forms. That's why you can prompt Sketch2Web in whatever way feels most natural:

Text: Describe your website in detail.

Voice: Use your microphone to dictate your ideas.

Images: Upload a wireframe, a mockup, or even a sketch on a napkin, and the AI will turn your visual concept into a coded reality.

Documents: Provide content briefs, project specs, or text documents to have the AI populate your site with the right information from the start.

AI-Powered Image Generation: Need the perfect image for your hero section? Just describe it. Our built-in image generator creates stunning, royalty-free visuals on the fly, eliminating the need to search for stock photos.

One-Click Deployment: When your masterpiece is ready, deploy it instantly. With a single click, Sketch2Web publishes your site to a unique, shareable URL, perfect for previewing on different devices or sharing your progress with collaborators.

Demo

https://ai.studio/apps/drive/15-CaS81Ai1rGg1lI9oJ21bgPoYzZqyTD

Here is a brief overview of the workflow:
Prompting: A user starts by describing their desired website in the main input, by using the guided wizard, or by starting from a template.

Generation: The AI generates the complete set of files (HTML, CSS, JS) and displays a live preview.

Visual Editing: The user can click any element in the preview to bring up an editor popover, allowing for direct manipulation of text and styles.

AI Image Generation: The user opens the image generator, describes an image, and the AI creates a custom asset.

Deployment: The user clicks the deploy button to get a unique, shareable URL for their new website.

How I Used Google AI Studio

I used the Gemini API, accessible via Google AI Studio, as the core intelligence for Sketch2Web. The application is built around two primary AI capabilities:

Code Generation (gemini-2.5-flash): The generateWebsite function in src/services/geminiService.ts is the engine of the app. It builds a request to the Gemini API that includes a comprehensive system prompt, the user's request, the conversation history, and the current state of the website files. The key piece is the highly detailed system instruction, which tells the model to act as a "Production Agent" and to return complete, well-structured code in a specific --- file: ... --- endfile format. Because the output is structured this way, the application can reliably parse the response and render the website.
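To make the parsing step concrete, here is a minimal sketch of how a delimited response could be split back into files. The post only names the format as --- file: ... --- endfile, so the exact header and footer lines, the SiteFiles type, and the function names serializeFiles and parseFiles are assumptions for illustration, not the app's actual code.

```typescript
// Hypothetical types and delimiter shape. The post only names the format as
// "--- file: ... --- endfile", so the exact markers below are assumptions.
type SiteFiles = Record<string, string>;

// Assumed header line:  --- file: <path> ---
const FILE_HEADER = /^--- file: (.+?) ---$/;
// Assumed footer line:  --- endfile ---
const FILE_FOOTER = "--- endfile ---";

/** Serialize the current files so they can be embedded in a prompt. */
function serializeFiles(files: SiteFiles): string {
  return Object.keys(files)
    .map((path) => `--- file: ${path} ---\n${files[path] ?? ""}\n${FILE_FOOTER}`)
    .join("\n");
}

/** Parse a model response into a path -> content map. */
function parseFiles(response: string): SiteFiles {
  const files: SiteFiles = {};
  let currentPath: string | null = null;
  let buffer: string[] = [];

  for (const line of response.split(/\r?\n/)) {
    if (currentPath === null) {
      // Outside a block: only a header line is meaningful; any other text
      // (for example model commentary) is skipped.
      const match = FILE_HEADER.exec(line.trim());
      if (match) {
        currentPath = (match[1] ?? "").trim();
        buffer = [];
      }
    } else if (line.trim() === FILE_FOOTER) {
      // Footer closes the block and commits the collected lines.
      files[currentPath] = buffer.join("\n");
      currentPath = null;
    } else {
      buffer.push(line);
    }
  }
  // A block that never reaches its footer is dropped rather than
  // rendered half-written.
  return files;
}
```

Dropping unterminated blocks is a design choice in this sketch: a truncated response then leaves the previous version of that file in place instead of overwriting it with partial code.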

Image Generation (imagen-4.0-generate-001): The generateImage function and the ImageGeneratorModal.tsx component provide an in-app tool for asset creation. The tool sends the user's text description to the Imagen API and gets back photorealistic images that can be used directly in the website, which solves the common problem of finding placeholder or final imagery.

Multimodal Features

Sketch2Web is fundamentally multimodal, creating an intuitive and powerful workflow that blends different types of user input to generate a cohesive web output.

Text-to-Code: This is the core functionality. The app translates a user's written description directly into a complete set of website files.

Image-to-Code (Sketch-to-Code): Users can upload an image (like a wireframe, a mockup, or even a sketch on a napkin), which is passed to the Gemini model as inlineData. The model analyzes the visual layout, structure, and content of the image to inform the website it generates, effectively turning a visual concept into code.
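As a rough illustration of the inlineData step, the sketch below turns an uploaded image (as a base64 data URL, which is what FileReader.readAsDataURL produces in the browser) into the text and inlineData parts of a Gemini request. The helper names parseDataUrl and buildSketchParts and the local Part types are hypothetical; the app's real upload handling may differ.

```typescript
// Local part shapes mirroring the Gemini API's text and inlineData parts.
// They are declared here so the sketch stands alone without the SDK.
type TextPart = { text: string };
type InlineDataPart = { inlineData: { mimeType: string; data: string } };
type Part = TextPart | InlineDataPart;

/**
 * Split a base64 data URL into its MIME type and raw base64 payload.
 * Throws if the input is not a base64 data URL.
 */
function parseDataUrl(dataUrl: string): { mimeType: string; data: string } {
  const match = /^data:([^;,]+);base64,(.*)$/.exec(dataUrl);
  if (!match) {
    throw new Error("Expected a base64-encoded data URL");
  }
  return { mimeType: match[1] ?? "", data: match[2] ?? "" };
}

/** Hypothetical helper: pair an uploaded sketch with the user's text prompt. */
function buildSketchParts(prompt: string, sketchDataUrl: string): Part[] {
  const { mimeType, data } = parseDataUrl(sketchDataUrl);
  // The image part comes first, followed by the instruction text.
  return [{ inlineData: { mimeType, data } }, { text: prompt }];
}
```

The resulting array would typically be sent as the parts of a user message in the content-generation request, alongside the system instruction described earlier.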

Document-to-Code: Similar to image uploads, users can provide .pdf or .docx files containing content, instructions, or specifications. The model reads and understands the document to populate the website with the correct text and build the requested structure.

Speech-to-Text: To make the process even more accessible, the app integrates the Web Speech API (useSpeechRecognition.ts). Users can simply speak their ideas, which are transcribed in real time into the text prompt for the AI.
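The transcription step could look something like the sketch below, which folds the results delivered by a speech recognition result event into confirmed and still-changing text. The function name collectTranscript and the structural types are assumptions; they cover only the fields of the Web Speech API result list that the sketch reads, and the contents of useSpeechRecognition.ts are not shown in this post.

```typescript
// Minimal structural types for the parts of SpeechRecognitionResultList
// this sketch uses. Declared locally so the code compiles outside a browser.
interface AlternativeLike {
  transcript: string;
}
interface ResultLike {
  isFinal: boolean;
  length: number;
  [index: number]: AlternativeLike;
}
interface ResultListLike {
  length: number;
  [index: number]: ResultLike;
}

/**
 * Hypothetical helper: split recognition results into finalized text and
 * interim text that the recognizer may still revise.
 */
function collectTranscript(results: ResultListLike): { finalText: string; interimText: string } {
  let finalText = "";
  let interimText = "";
  for (let i = 0; i < results.length; i++) {
    const result = results[i];
    // Index 0 holds the recognizer's top alternative for this result.
    const best = result ? result[0] : undefined;
    if (!result || !best) continue;
    if (result.isFinal) {
      finalText += best.transcript;
    } else {
      interimText += best.transcript;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}
```

In a browser, a hook would call this from the recognizer's onresult handler with event.results and write finalText into the prompt field. Note that Chromium-based browsers expose the constructor as webkitSpeechRecognition.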

These multimodal features improve the user experience by letting creators use whichever input method feels most natural. They can show a sketch, upload a document, speak an idea, or describe an image, and the AI combines these inputs to build and refine the final website.
