
TL;DR
If you want to skip to the conclusion, here’s a quick summary of the findings comparing OpenAI o3, Gemini 2.5, and OpenAI o4-Mini:...
Interesting. I've had better luck with o4-mini and o3 for coding. Granted, my work is very iterative and human-in-the-loop. Gemini tends to make a mess of things for me after round 1. o4-mini is my go-to, but o3 is for debugging. I still use Claude a lot for the tradeoff of speed.
My expectations are lower, though. As a first step, anything that runs and is roughly a game is pretty good. I'd expect to iterate more with more clarifying prompts. The fully autonomous thing seems like a pipe dream for now for anything beyond "implement this API".
Nicely said. I agree with you. o3 seems to be better at debugging stuff.
Nice comparison, Shrijal!
Appreciate you, Arindam 🫡
I'm loving all the hype around these AI models nowadays. Gemini 2.5 seems to be the one to go for.
It's the worst. Always hallucinates on a bigger codebase.
Does it? Haven't really used it on a bigger codebase. What's your typical usecase?
Not always. As said, choose the models based on your use case.
OpenAI o3 handles complex code well, while Gemini 2.5 is strong in logic and structure.
o4-Mini is fast and efficient, great for everyday coding tasks on the go.
Well said!
Love the zero-shot concept for testing. This really tests raw performance. But I don't think Gemini is a good choice for a bigger codebase; maybe it works for smaller projects like these. Go for o1, that's more than enough for most use cases. A jack of all trades.
Glad you liked it. I've heard good things about this model o1, and it definitely sounds great. I haven't tried it, though…
Honestly, I get super annoyed when I can't just pick one model for everything - feels like it's always tradeoffs. Makes me double check every AI response now, not gonna lie.
Yeah, same for me. Thank you for checking it out, Nevo! :)
I WANT TO JOIN
What exactly? 👀
Share your thoughts on the comments! 👇
Love this detailed comparison, Shrijal! 🤍
Thank you, Lara! I appreciate you taking the time to check out the blog post. 😊
Wow, amazing! But why did you include DeepSeek in the comparison?
Thank you, Satria, but where did I mention DeepSeek exactly?
Am I the only one still sticking with codellama? It just gets my job done. I've never had to look back.
I'm surprised. How do you use it? 👀 Maybe locally with Ollama?
lol yeah, local setup. For vscode, I use an extension. Here you can find it: marketplace.visualstudio.com/items...
🙌
Love this.
Thanks!
Thank you! I agree that the recent smaller models are really great. I love Gemini 1.5 Flash, and also Gemma 3 27B, a pretty small but solid recent model. Have you tried it out?
Impressive!
Thank you, @ricorizz ✌️