I only use AI to create cover images for blog posts, so I don’t have high expectations. At first I thought that Google was better than OpenAI, but I’ve changed my mind: I don’t know whether it’s a matter of hidden system prompts or a model issue, but I’m noticing huge limitations in Gemini.
Sora vs. Nano Banana for Images
I’m not interested in generating videos for now. I’m not even trying. I only need to create cool pictures to use as cover images for my blog posts: that’s why I’m definitely not approaching this process as a professional. On the contrary, I’m trying to use the service as an end user would. That is, very badly.
I’m not really writing “prompts”, for example. I’m just describing at a high level what I want, as I would when talking to someone. I’m also using the free versions of these services: since this is just a hobby, I didn’t consider any subscription. Most of the time, I now rely on the recently added built-in feature of this platform.
But I think I’ll do otherwise, since I noticed that Nano Banana is subject to a sort of system prompt that limits the service’s capabilities a priori. Even though I tried different approaches, it always ends up proposing similar images. Good, but not great… since I want images that represent the content of my articles.
Same Prompt, Different Results
Of course, Sora and Nano Banana use different models: keep in mind that I’m Italian and I live in Italy, so I can’t use Sora 2. To test the prompt, I sent the same one to both services and checked their results. My description is, as I mentioned, high-level and generic.
A picture to represent the challenge between Sora and Nano Banana as image generators powered by AI.
You can check Nano Banana’s result above by having a look at the cover image of this post. Below, the result from Sora: this is just my opinion, of course, but I think it’s far better. Please note that I didn’t change a single word to generate the second image. As I’ve already said, I’m using the free version of both services.
Things change a bit if I try Nano Banana Pro from my own Google AI Studio instance: the standard version outputs only a smaller, square picture. Even so, I didn’t find significant differences. I still prefer Sora’s result, even though I’m using an outdated GPT model version.
This Is Not a Benchmark
Please note that I’m not presenting a benchmark. I’m just sharing my experience as, let’s say, a regular user who tries two different services as a prospect. I could fine-tune the prompt (well, it would be rewritten from scratch) to get better results, but that’s not my purpose here. I already do that for other needs.
I know you could think I’m not testing them properly. And, from a developer’s perspective, you’re right: this is not a professional approach. Comparing Sora and Nano Banana this way is unfair, because I’m comparing two different services based on two different models at the same time.
It makes sense to adapt the prompt to the model in use, as well as to the service, but I wanted an immediate result: this is what I got, leaving most of the details to be defined by the LLM itself. I didn’t add any filter, even though I really like Cartoonify for Sora. I wrote just two lines of text.
Nano Banana and Its Pro Version
While the cover image for this post was generated by the built-in Nano Banana version in the DEV editor, I also tried to get a result from the standard interface. I obtained a very different picture, so I think that Forem is injecting something between the prompt and the model.
I didn’t suggest any percentage to the LLM. It’s funny to see that Nano Banana Pro attributed better performance to its rival: is it a strategy? There’s also a Gemini watermark in the lower right corner. I assume it showed up because I’m not paying for the service, but in general I’m far from satisfied with it.
I Will Continue to Use Sora
No matter the service and the tier, Gemini is generating worse pictures at the moment. GPT offers the best results, and I can’t even access the latest version. It would be interesting to try the APIs, sending a real prompt, to better evaluate their performance. I will definitely do that in the future, since it’s crucial in my opinion.
OpenAI’s Responses API added image generation, while the Chat Completions API lacks it. I’ve never used Google’s API to do the same with Gemini, so I don’t know whether 2.5 and 3 have different capabilities. Unfortunately, Big G drastically reduced the free tier limits, so I can’t “play” with them anymore.
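Just to give an idea of what such a test could look like, here is a minimal sketch using the OpenAI Python SDK. I haven’t actually run it yet, so treat the model name, the output handling, and the file name as assumptions on my part rather than verified code.

```python
# Minimal sketch: generating a cover image through OpenAI's
# Responses API with its built-in image generation tool.
# Model name and output handling are assumptions, not verified.
import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.responses.create(
    # Assumed model: any text model that supports the image tool.
    model="gpt-4.1-mini",
    input=(
        "A picture to represent the challenge between Sora and "
        "Nano Banana as image generators powered by AI."
    ),
    tools=[{"type": "image_generation"}],
)

# The generated image comes back as a base64-encoded output item.
for item in response.output:
    if item.type == "image_generation_call":
        with open("cover.png", "wb") as f:
            f.write(base64.b64decode(item.result))
```

The same prompt could then be sent to Gemini’s API to make the comparison a bit more even, but that’s exactly the kind of free-tier “play” that has become harder now.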
That’s why I’m not even trying to take a professional approach. The fastest way to get a decent picture is to open Sora and use its web interface: I think this is what most end users do. Next, I will give TranslateGemma a chance, but that’s a completely different use case, and there I will be more technical.
If you’d like, follow me on Bluesky and/or GitHub for more content. I enjoy networking.