This post is my submission for the DEV Education Track: Build Apps with Google AI Studio.
## What I Built (and Why)
I set out to build the "Personalized Storybook Illustrator," a web app designed to prove that you don't need to be a professional artist to create something beautiful; you just need a better paintbrush. The idea was simple: write a story, describe a character, pick an art style, and let a symphony of AI models turn your fleeting thoughts into a fully-illustrated, one-of-a-kind digital storybook.
The app uses Gemini 2.5 Flash for the narrative heavy-lifting (like generating titles and story ideas) and Imagen 3 for the artistic magic. In the end, it packages your creation into a flippable digital book and a downloadable PDF, ready for you to share and prove that the story about the grumpy, space-faring capybara you just invented deserves to be immortalized.
## Demo
**Quick Note:** The app needs a Google AI API key to run. You can get a free one from Google AI Studio. The free key won't generate images in the live app (you'll get an error), but you can import the project files into AI Studio and run it from there to see the full magic.
- Live App: Vercel Deployment
- Video Demo: YouTube
- Sample PDF: GitHub
## My Experience: A Chronicle of Calamities
Naturally, this all worked perfectly on the first try. Just kidding. It was a glorious dumpster fire, and here are some of the "character-building experiences" I faced.
### 1. My Fox is a Bear Now
The first sign of trouble was consistency. I had a simple story:
- Page 1: A curious fox finds a shiny key.
- Page 2: The fox uses the key to open a tiny door.
Imagen generated a lovely picture of a fox for page one. Success! Then for page two, it generated a picture of... what appeared to be a small, reddish bear fumbling with a doorknob. The art style was different. The character was different. My cohesive storybook was a chaotic fever dream.
**The Solution: AI Middle Management.** I couldn't fix the model's memory, but I could give it a much, much more detailed set of instructions for every single call. This led to the core "aha!" moment of the project: meta-prompting. I decided to use one AI to boss another one around.
Instead of sending the user's text directly to the image model, I had Gemini act as an expert "prompt engineer." It takes the user's simple input (`The fox finds a shiny key`) and transforms it into a rich, detailed prompt for Imagen, complete with art style, a consistent character description, and scene details. This AI middle-manager was the key. It was slower, but the results were exponentially more consistent.
### 2. I Don't Want a Dialogue
You know what happens when you ask an AI to generate an image for a "storybook." For some reason, it hears "book" and thinks, "Oh, there should be text bubbles." At one point, it even quoted Lorem ipsum on the "The End" page. It was... not the vibe.
**The Solution: More AI Middle Management, but Louder.** The only way to fix this was to get aggressive. The meta-prompting function became our hero, armed with a screamingly loud negative constraint.
```javascript
// A snippet from the prompt-generating function
// ...
const systemInstruction =
  "You are an expert prompt engineer... Your output is ONLY the final, ready-to-use prompt for the image AI.";
// ...
const content = `Generate a rich, detailed, and effective prompt...
- Art Style: ${artStyle}.
- Main Character: ${characterDescription}.
- Scene to Illustrate: ${pageText}.
- CRITICAL RULE: The prompt you generate must instruct the AI to create a purely visual image with absolutely NO text, words, or letters. Any text is a failure.`;
// ...
```
Giving the AI a role (expert prompt engineer), a clear format, and a "CRITICAL RULE" was the equivalent of putting a sticky note on your roommate's monitor that says, "DO NOT EAT MY LEFTOVERS. I WILL FIND YOU." It’s not subtle, but it works.
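One way to make the sticky note even harder to ignore is a cheap client-side check before spending an image-generation call. This is a hypothetical extra (the post's app relies on the prompt rule alone, and the helper name and word list here are mine): scan Gemini's generated prompt for phrases that explicitly ask for lettering, and regenerate if any show up.

```javascript
// Hypothetical safety net: validate the meta-prompted output before
// handing it to the image model.
function promptLooksSafe(generatedPrompt) {
  // Reject empty output outright.
  if (!generatedPrompt || !generatedPrompt.trim()) return false;
  // Reject prompts that explicitly request text in the image.
  const asksForText = /\b(speech bubble|caption|title text|lorem ipsum)\b/i;
  return !asksForText.test(generatedPrompt);
}

// Usage: regenerate (up to some retry limit) until the prompt passes.
promptLooksSafe("A watercolor fox holding a shiny brass key");   // safe
promptLooksSafe("A fox with a speech bubble saying hello");       // rejected
```

It's a blunt instrument, but so is the CRITICAL RULE, and blunt instruments are exactly what worked here.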
### 3. Deployment Troubles
I just wanted to put the app someplace where it could be accessed by others. That was all I wanted. Why AI Studio decided this simple request should turn into a three-day-long existential crisis, I'll never understand. It kept getting stuck in a logic loop over a `<script type="importmap">` block in the `index.html` file:
- AI Studio: "This block is the problem. I will remove it."
- Me: "Okay, great."
- AI Studio: "The block is removed."
- Me: (checks file) "No, it isn't."
- AI Studio: "You're right. It is still there. I will attempt to remove it again."
- Return to step 1 and repeat until sanity frays.
I had to revert to a previous version multiple times. Then, today, it dawned on me that I could just... remove the block from the file myself. So I did that, uploaded the whole thing to GitHub, and it worked. And now I'm here telling you this tale.
## My Key Takeaways
- AI is a Tool, Not an Oracle. It's incredibly powerful, but it needs specific, detailed instructions. The real skill is getting good at telling the AI what you want. Or, sometimes, realizing it's faster to just do the thing yourself.
- Embrace Strategic Laziness. Why manually generate test inputs when you can build a "Surprise Me!" button that has the AI do it for you? The same principle led to the meta-prompting strategy—it's easier to teach one AI to boss around another than to perfect every image prompt by hand.
- Client-Side AI is Viable (and Kinda Cool). Running this all in the browser means no complex backend, no managing user keys on a server, and the user's data stays with them. It's a surprisingly elegant architecture for certain types of apps.
Overall, this was a fun, challenging, and at times head-scratchingly annoying experience.
10/10, would do again.