Join our FREE AI Community: https://www.skool.com/ai-with-apex/about
Everyone’s talking about Lyria 3 making music from photos.
They’re missing the real opportunity.
Creation just became live, not linear.
DeepMind’s new model can take an image and a vibe prompt.
Then it generates a ~30-second song with vocals and instruments.
Not after you hit “render.”
While you’re still steering.
The part that stands out most is the streaming.
Audio comes in 2-second chunks.
That means you can guide the track mid-flight.
Like directing a session, not exporting a file.
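To make that loop concrete, here's a minimal sketch of what "steer while it streams" could look like in code. The client, method names, and chunk format below are all stand-ins I made up for illustration; the post doesn't document a real SDK. The shape is the point: consume ~2-second audio chunks as they arrive and push a new steering prompt partway through.

```python
# Hypothetical sketch only: "lyria_client" and its methods are stand-ins,
# not a real SDK. The point is the loop shape: receive ~2-second audio
# chunks while sending new steering prompts mid-generation.
import asyncio

from lyria_client import MusicSession  # assumed/hypothetical client


async def steer_live_track(image_path: str, vibe: str) -> bytes:
    audio = bytearray()
    async with MusicSession(image=image_path, prompt=vibe) as session:
        chunk_count = 0
        async for chunk in session.stream():   # ~2-second audio chunks
            audio.extend(chunk.pcm_bytes)      # collect raw audio as it lands
            chunk_count += 1
            if chunk_count == 5:               # ~10 seconds in, change direction
                await session.send_prompt("brighter, more energy, add strings")
    return bytes(audio)


if __name__ == "__main__":
    track = asyncio.run(
        steer_live_track("snowy_street.jpg", "warm, nostalgic, lo-fi pop")
    )
    print(f"Generated {len(track)} bytes of audio")
```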
This changes how teams work.
Marketing can prototype sound for a campaign in minutes.
Product teams can test sonic branding faster.
Creators can iterate without booking studio time.
A simple example.
You’re launching a winter collection.
You upload a snowy street photo.
You prompt “warm, nostalgic, lo-fi pop.”
In under a minute, you have a draft to react to.
Then you adjust mood, tempo, and energy live.
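If mood, tempo, and energy were exposed as named controls rather than free text, the live adjustment could look like the snippet below. Again, this is a hypothetical continuation of the sketch above; the parameter names are illustrative, not from a documented API.

```python
# Hypothetical sketch: refining the winter-collection draft live with
# named controls plus a follow-up prompt. Parameter names are
# illustrative, not from a real SDK.
async def refine_draft(session) -> None:
    # Slow it down and soften it for the "warm, nostalgic" brief.
    await session.update_controls(tempo_bpm=78, energy=0.4, mood="nostalgic")
    await session.send_prompt("keep the lo-fi texture, pull back the drums")
```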
Here’s the framework I’d use this week ↓
↳ Start with one image that already performs.
↳ Write 3 vibe prompts, not 30.
↳ Generate 5 drafts, then pick one direction.
↳ Iterate live until it fits the brand.
↳ Document what changed and why.
One more detail matters.
SynthID watermarks the waveform for traceability.
That’s not a footnote.
That’s how this becomes usable at scale.
Where would “live-steerable” music help you most?