Digitalization Social Creativity Multimedia Content Creator

#devchallenge #googleaichallenge #ai #gemini

Google AI Challenge Submission

🛠️ What I Developed

Digital Social Creative is a lightweight, ready-to-deploy application that transforms a short brief or image snippet into:

Platform-specific post variations (LinkedIn, Instagram, X, Facebook, TikTok)
A full 7-day posting calendar
A refined image prompt (with optional image generation)
Seamless export options (ZIP bundle, CSV file, ICS calendar, Markdown snapshot)
A rapid A/B test generator with performance scoring and CTA suggestions

Built for ease of use, the interface features a card-style layout for posts and a JSON view for advanced users.

If the Gemini 2.5 Flash Image model isn’t accessible, the app defaults to branded placeholder visuals—ensuring the flow remains uninterrupted.

🤖 Powered by Google AI Studio

Text generation via the gemini-2.5-flash API enables:

Multi-platform post creation (with channel-specific formatting and constraints)
Smart 7-day scheduling (ISO timestamps + rationale)
Custom 1080×1080 image prompts aligned with brand identity
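To make the calendar shape concrete, here is a minimal sketch of a 7-day schedule with ISO timestamps. The function name `build_week_calendar`, the field names, and the fixed 09:00 posting hour are assumptions for illustration; in the app itself the timestamps and rationale come from the model.

```python
from datetime import datetime, timedelta, timezone


def build_week_calendar(start, platforms, hour=9):
    """Return 7 slots, one per day, cycling through the given platforms.

    Hypothetical helper: the field names and the fixed posting hour
    are assumptions, not the app's actual schema.
    """
    slots = []
    for day in range(7):
        when = (start + timedelta(days=day)).replace(
            hour=hour, minute=0, second=0, microsecond=0
        )
        slots.append(
            {
                "platform": platforms[day % len(platforms)],
                "scheduled_at": when.isoformat(),  # ISO 8601 timestamp
                "rationale": "",  # filled in by the model in the real flow
            }
        )
    return slots


week = build_week_calendar(
    datetime(2025, 1, 6, tzinfo=timezone.utc), ["LinkedIn", "Instagram"]
)
```

Passing a timezone-aware `start` keeps the UTC offset in every timestamp, which makes the later ICS export unambiguous.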

Optional image generation uses a configurable model (e.g., gemini-2.5-flash-image-preview). If unavailable, the app substitutes placeholder PNGs to maintain user experience.
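The fallback can be sketched roughly as follows. This is an illustration only: `generate` stands in for whatever call reaches the image model, and `placeholder_png`, `image_or_placeholder`, and the default color are hypothetical names and values, not the app's code. The placeholder is a solid-color PNG built with the standard library.

```python
import struct
import zlib


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Encode one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def placeholder_png(hex_color: str, size: int = 64) -> bytes:
    """Build a solid-color RGB PNG of size x size pixels."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    row = b"\x00" + bytes((r, g, b)) * size  # filter byte 0, then RGB pixels
    ihdr = struct.pack(">IIBBBBB", size, size, 8, 2, 0, 0, 0)  # 8-bit truecolor
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(row * size))
        + _chunk(b"IEND", b"")
    )


def image_or_placeholder(generate, prompt, brand_color="#4285F4"):
    """Try the image model; on any failure return a branded placeholder."""
    try:
        data = generate(prompt)
        if data:
            return data, "generated"
    except Exception:
        # Broad catch is deliberate in this sketch: any model error
        # (missing access, quota, timeout) should not break the flow.
        pass
    return placeholder_png(brand_color), "placeholder"
```

Returning the source label alongside the bytes lets the UI show whether a visual was generated or substituted.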

🧠 Multimodal Intelligence

Inputs supported:

Text briefs
Images

Image analysis includes:

Captions, object detection, color palette, style, product type, and mood

Brand customization:

Injects brand_name and brand_color into post and image prompts
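One simple way to do this kind of injection is to append brand notes to the prompt before sending it to the model. The sketch below assumes that approach; the helper name `with_brand` and the wording of the notes are hypothetical, and only `brand_name` and `brand_color` come from the app.

```python
def with_brand(prompt: str, brand_name: str = "", brand_color: str = "") -> str:
    """Append brand details to a prompt, skipping any that are empty."""
    notes = []
    if brand_name:
        notes.append(f"Brand name: {brand_name}.")
    if brand_color:
        notes.append(f"Primary brand color: {brand_color}.")
    if not notes:
        return prompt  # nothing to inject
    return prompt.rstrip() + "\n\n" + " ".join(notes)


branded = with_brand("Write a LinkedIn post about our launch.", "Acme", "#FF6600")
```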

Image generation:

/images/zip endpoint produces multiple image variants per post, with optional style reference uploads

A/B testing:

Generates two post versions for a selected platform, scores them, and suggests a CTA

📤 Export Options

ZIP bundle with all assets and README
ICS calendar events
CSV for post operations
Markdown summary
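As a rough illustration of the ICS and CSV exports, the sketch below turns a list of scheduled posts into both formats using only the standard library. The post fields (`platform`, `scheduled_at`, `body`), the function names, and the `example.invalid` UID domain are assumptions, not the app's actual schema.

```python
import csv
import io
from datetime import datetime, timezone


def to_ics(posts):
    """Render posts as a minimal iCalendar file with one VEVENT each.

    Assumes each "scheduled_at" is an ISO timestamp with a UTC offset.
    """
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//social-calendar//EN",
    ]
    for index, post in enumerate(posts):
        start = datetime.fromisoformat(post["scheduled_at"]).astimezone(timezone.utc)
        stamp = start.strftime("%Y%m%dT%H%M%SZ")
        lines += [
            "BEGIN:VEVENT",
            f"UID:post-{index}@example.invalid",
            f"DTSTAMP:{stamp}",
            f"DTSTART:{stamp}",
            f"SUMMARY:{post['platform']} post",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"  # iCalendar uses CRLF line endings


def to_csv(posts):
    """Render posts as CSV text; unknown keys are ignored."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["platform", "scheduled_at", "body"],
        extrasaction="ignore",
    )
    writer.writeheader()
    writer.writerows(posts)
    return buffer.getvalue()


posts = [
    {
        "platform": "LinkedIn",
        "scheduled_at": "2025-01-06T09:00:00+00:00",
        "body": "Hello, world",
    }
]
ics_text = to_ics(posts)
csv_text = to_csv(posts)
```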

✨ User Experience Highlights

Card layout for readable post previews (title, body, hashtags, CTA) with copy functionality
JSON view for raw data with copy button
“Quick Example” button auto-fills the form and loads sample media
Clear status indicators, file size limits (20 MB for images, 100 MB for media), and a 60s timeout to prevent frontend stalls
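The size limits above could be enforced with a check like the one below. This is a sketch under assumptions: the name `check_upload`, the `"image"`/`"media"` labels, and treating 1 MB as 1024 × 1024 bytes are illustrative choices, not confirmed details of the app.

```python
MB = 1024 * 1024  # assumption: binary megabytes
LIMITS = {"image": 20 * MB, "media": 100 * MB}


def check_upload(kind: str, size_bytes: int):
    """Return (ok, message) for an upload of the given kind and size."""
    limit = LIMITS.get(kind)
    if limit is None:
        return False, f"unknown upload kind: {kind}"
    if size_bytes > limit:
        return False, f"{kind} exceeds {limit // MB} MB limit"
    return True, "ok"
```

Rejecting oversized files before any model call keeps the status indicator honest and avoids spending the 60s timeout on a request that cannot succeed.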

🧱 Architecture Overview

Frontend: Static HTML/CSS/JS served via FastAPI’s StaticFiles
Backend: FastAPI + Gemini SDK (text/image), pydub + ffmpeg for media trimming, speech_recognition for transcription
Deployment: Docker → Cloud Build → Cloud Run (with warm-start instance); secrets managed via Secret Manager; CORS enabled for demo; favicon and sample files included for clean logs

💡 Why Multimodal Is Essential

Marketing teams rarely begin with a polished text brief. They start with assets—a Google AI Studio’s multimodal capabilities