Umar Pathan
GeminiLens: The Photo Editing Revolution You Need to See

This is a submission for the Google AI Studio Multimodal Challenge

🚀 What I Built

Imagine turning a good photo into a headline-grabbing image with a single sentence or a single click.

I built an app called PixelSculpt that makes professional photo retouching so simple anyone can do it.

✨ Point at a blemish → it disappears

🌅 Say "warm sunlight" → your picture glows like golden hour

🎬 Want film noir or soft pastel vibes? → type it and watch the transformation

This tool tackles two problems most creators run into:

  1. Editing that looks great usually takes hours and expensive tools.
  2. Creative ideas live in your head but are hard to translate into sliders and settings.

PixelSculpt bridges that gap by letting people edit with natural language + precise point-and-click controls.

The result → fast, stunning edits that feel personal and intentional.

🎥 Demo

🔗 Live Demo

▶️ Watch the 2-minute walkthrough video to see PixelSculpt in action.

The demo shows:

  • One-click blemish removal at a precise spot
  • A prompt that turns midday light into a soft golden glow
  • A creative filter chain that transforms a portrait into a retro magazine cover

📸 Screenshots in the demo highlight:

  • Before & After frames
  • Interactive click-to-remove or add elements with pixel-level accuracy
  • Three starter presets: Quick Fix, Cinematic Mood, and Dream Pastel

🛠 How I Used Google AI Studio

PixelSculpt was built inside Google AI Studio to leverage its multimodal hosting and fast image runtimes.

The studio handled model orchestration, prompt handling, and secure asset storage, freeing me to focus on the user experience and creative controls.

Key technical pieces:

  • 🖼 Image understanding & region selection → Google’s multimodal models for accurate object masks when clicking a point
  • 💬 Text-to-edit translation → natural language prompts → safe, deterministic editing operations
  • ⚡ Lightweight inference endpoints → near-instant previews so users get real-time feedback while experimenting
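
To make the "safe, deterministic editing operations" idea concrete, here is a minimal sketch of one way it could work. It is not PixelSculpt's actual code. It assumes the model returns its plan as a JSON list of `{"op", "amount"}` objects and that the app only applies operations from a fixed allowlist. The names `ALLOWED_OPS` and `validate_operations`, the operation names, and the ranges are all hypothetical.

```python
import json

# Hypothetical allowlist: operation name -> (minimum, maximum) amount.
ALLOWED_OPS = {
    "exposure": (-1.0, 1.0),
    "contrast": (-1.0, 1.0),
    "blur": (0.0, 1.0),
}


def validate_operations(raw_json):
    """Parse model output and keep only allowlisted, range-clamped operations."""
    try:
        candidates = json.loads(raw_json)
    except json.JSONDecodeError:
        return []  # Unparseable output results in no edits at all.
    if not isinstance(candidates, list):
        return []

    accepted = []
    for item in candidates:
        if not isinstance(item, dict):
            continue
        op = item.get("op")
        amount = item.get("amount")
        # Drop unknown operations and non-numeric amounts.
        if not isinstance(op, str) or op not in ALLOWED_OPS:
            continue
        if isinstance(amount, bool) or not isinstance(amount, (int, float)):
            continue
        low, high = ALLOWED_OPS[op]
        # Clamp the amount so the same plan always yields a bounded edit.
        accepted.append({"op": op, "amount": max(low, min(high, float(amount)))})
    return accepted
```

Anything the model invents outside the allowlist is dropped rather than executed, which keeps the editing step predictable even when the prompt is not.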

🎨 Multimodal Features

Precise Retouching

  • Click anywhere → localized segmentation + inpainting remove blemishes, objects, or distractions
  • Synthesizes nearby textures → blends naturally (not just blur!)
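
The click-to-remove flow can be illustrated with a toy version in plain Python. This is a sketch under loud assumptions: a circular mask stands in for the model-generated segmentation, and a simple neighbor-averaging fill stands in for real inpainting. The function names are hypothetical and the image is a small grayscale grid, not what the app actually processes.

```python
def circular_mask(width, height, cx, cy, radius):
    """Return a height x width grid of booleans, True inside the clicked circle."""
    r2 = radius * radius
    return [
        [(x - cx) ** 2 + (y - cy) ** 2 <= r2 for x in range(width)]
        for y in range(height)
    ]


def fill_from_neighbors(image, mask, passes=10):
    """Naive inpainting stand-in: repeatedly replace each masked pixel
    with the average of its 4-connected neighbors."""
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]
    for _ in range(passes):
        updated = [row[:] for row in result]
        for y in range(height):
            for x in range(width):
                if not mask[y][x]:
                    continue  # Pixels outside the mask are never modified.
                neighbors = [
                    result[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < height and 0 <= nx < width
                ]
                updated[y][x] = sum(neighbors) / len(neighbors)
        result = updated
    return result


# A flat gray 5x5 patch with one dark "blemish" in the center.
patch = [[100.0] * 5 for _ in range(5)]
patch[2][2] = 0.0
clicked = circular_mask(5, 5, cx=2, cy=2, radius=1)
repaired = fill_from_neighbors(patch, clicked)
```

Surrounding values flow into the masked region over successive passes, so the blemish is replaced by nearby tones. A blur, by contrast, would smear the dark value into its neighbors.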

Prompt-Driven Creative Filters

  • Type phrases like “vintage film grain with teal shadows” or “futuristic neon glow”
  • Generates layered, dynamic filters → image-aware, preserves skin tones & fine details

Pro Adjustments Without the Learning Curve

  • Plain language like “lift shadows” or “cinematic contrast”
  • Mapped to exposure, selective blur, color grading, and tone mapping → respects facial features
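
As a rough illustration of mapping plain language to adjustments, the sketch below uses a fixed phrase table in place of the model. The phrases come from this post, but the parameter names, the numeric values, and the `plan_adjustments` helper are hypothetical stand-ins, not the app's real mapping.

```python
# Hypothetical phrase table: each phrase nudges one or more adjustment parameters.
PHRASE_TO_ADJUSTMENTS = {
    "lift shadows": {"shadows": 0.3},
    "cinematic contrast": {"contrast": 0.25, "saturation": -0.1},
    "warm sunlight": {"temperature": 0.4, "exposure": 0.1},
}


def plan_adjustments(prompt):
    """Combine the deltas of every known phrase found in the prompt."""
    text = prompt.lower()
    plan = {}
    for phrase, deltas in PHRASE_TO_ADJUSTMENTS.items():
        if phrase in text:
            for name, delta in deltas.items():
                total = plan.get(name, 0.0) + delta
                # Keep each parameter within a normalized -1..1 range.
                plan[name] = max(-1.0, min(1.0, total))
    return plan


print(plan_adjustments("Lift shadows and add cinematic contrast"))
```

A keyword table like this only covers phrases it already knows. Interpreting open-ended descriptions is the part that a multimodal model would take over.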

Assistive Suggestions & Safety

  • Suggested prompts based on photo content
  • Built-in safety checks prevent harmful/deceptive edits

💡 Why this matters:

PixelSculpt blends point-and-click precision with natural language creativity → making bold edits faster, easier, and more personal.

🎯 UX & Psychology Choices

  • Short wins & instant feedback → create momentum
  • Before/after preview → triggers delight
  • Progressive disclosure → clean UI, advanced options for power users
  • Friendly defaults & suggestions → reduce decision paralysis, encourage sharing
