We created this piece of content for the purposes of entering the Gemini Live Agent Challenge. #GeminiLiveAgentChallenge
Enpitsu (鉛筆) — A...
It was very interesting to read about some of the transformations and adjustments for consistency you used. I wanted to make one suggestion: adding a license to the GitHub code so that it is clear if and how others can use it. I didn’t see a license when I checked it out, but please do let me know if I missed it. MIT seems to be a popular default for many AI-related projects.
Thanks for the kind words and the suggestion! You're correct, we don't have a license on the repo yet. We're currently in the judging phase of the hackathon and have been asked to hold off on any changes to the codebase until winners are announced. Once that's done, we'll add a license (MIT sounds like a great fit). Appreciate you flagging it!
That's really great news! Having a license attached will make it usable for testing in my workflow. Thank you for taking the time to reply, and of course thank you for developing this fascinating code.
Hi, great stuff you are building! Unfortunately, your GitHub repository can't be found anymore… Can you share it, please?
Hello, thank you!
Sorry about that, it was set to private initially. You can check it now.
Awesome!
This is awesome. Do you have examples of generated outputs? Would be nice to include in the post if you have time to grab some!
Thanks so much! I've updated the post with a dedicated examples section: you can see the character model sheets Gemini generates, full storyboard pages with screentone and speech bubbles, and a sketch-to-manga before/after. The sketch conversion is honestly one of my favourite parts: you draw the rough composition and Gemini handles the inking, screentones, and line weight.
Wow! It's even better than I was expecting! Great work
Thank you very much!🚀
This is amazing work! I love how you handled character consistency with Gemini’s multimodal input — the sketch-to-manga feature is especially impressive. Really inspiring approach for anyone wanting to generate full manga with AI.
Thanks so much! The multimodal input was honestly the key insight, passing character sheets as image references on every panel call is what made consistency actually work. Glad it's inspiring!
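To make the pattern concrete, here's a minimal sketch of what re-attaching a character sheet on every panel call can look like. This uses the Gemini API's `contents`/`parts` request shape; the function name, prompt, and character are our own illustrative assumptions, not code from the project.

```python
import base64

def build_panel_request(sheet_png: bytes, panel_prompt: str) -> dict:
    """Assemble one multimodal panel request, with the character
    model sheet included as an inline image reference every time."""
    return {
        "role": "user",
        "parts": [
            # Character sheet goes first so the model conditions on it.
            {"inline_data": {
                "mime_type": "image/png",
                "data": base64.b64encode(sheet_png).decode("ascii"),
            }},
            # Then the per-panel text prompt.
            {"text": panel_prompt},
        ],
    }

# Hypothetical panel call: sheet bytes would come from the generated model sheet.
req = build_panel_request(b"\x89PNG...", "Panel 3: Aiko turns, rain, low angle")
print(len(req["parts"]))  # 2: image reference + text prompt
```

Because the sheet rides along with every request, the model never has to "remember" the character between panels, which is what keeps the design stable across a whole page.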
Very creative use of Gemini! Did you consider using a platform to make calling multiple AI models easier?
Thanks! We did consider orchestration frameworks like LangChain early on, but decided against them for this project. Since we're going deep on a single provider (Gemini) rather than swapping models, the abstraction layer didn't buy us much — and the Google GenAI SDK's native async support and multimodal Part API were exactly what we needed for the character sheet consistency technique. Going direct also meant one less dependency and easier debugging when image generation behaved unexpectedly. For a multi-provider setup it'd be a different call though!
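For readers curious what "native async support" buys here, a rough sketch of generating a page's panels concurrently might look like this. `generate_panel` is a stand-in for an awaited SDK call (e.g. something like `client.aio.models.generate_content(...)` in the Google GenAI SDK); the prompts and panel count are made up for illustration.

```python
import asyncio

async def generate_panel(panel_prompt: str) -> str:
    # Real code would await the Gemini image-generation call here;
    # we simulate it so the sketch is self-contained.
    await asyncio.sleep(0)  # placeholder for network latency
    return f"panel for: {panel_prompt}"

async def generate_page(panel_prompts: list[str]) -> list[str]:
    # Fire all panel requests at once instead of waiting on each in turn.
    return await asyncio.gather(*(generate_panel(p) for p in panel_prompts))

panels = asyncio.run(
    generate_page(["establishing shot", "close-up", "reaction"])
)
print(len(panels))  # 3
```

Since each panel call already carries its own character-sheet reference, the requests are independent and parallelize cleanly, so page generation time is bounded by the slowest panel rather than the sum of all of them.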
This really resonated with me.
Really appreciate you saying that! It was a fun week of building, hope it sparks some ideas for your own projects.