DEV Community

Daniel Shashko


Best AI Models for Agentic Vibe Coding in VS Code (December 2025)

Hey everyone! In this short article, I’ll cover the AI models that worked best for me when doing agentic coding in VS Code.

Nothing here is backed by formal research. This is all based on hands-on, daily usage, real projects, and a lot of trial and error.

Let’s go!

1. Claude Opus 4.5

This one was probably expected.

I use Claude Opus 4.5 mainly for multi-step execution, combined with a custom “Beast Mode” agent written by Burke Holland from the Copilot team at Microsoft. If you haven’t tried it yet, I strongly recommend it. You can find it here or read my LinkedIn post about it.

In my experience, Opus 4.5 is the best model for working across multiple large files. It’s currently the only model I trust to run without constant manual code review. The code is clean, focused, and avoids unnecessary boilerplate.

The main downside is the quota cost: Opus 4.5 usage is counted at x3.

Best use case

  • Large features
  • Multi-file refactors
  • End-to-end implementation with minimal supervision

Usage quota: x3

2. Claude Sonnet 4.5

Need a quick review? A small UI tweak? Minor backend logic improvements? This is the model I choose for those.

With only x1 quota usage, Sonnet 4.5 gives the best balance of speed, quality, and cost for smaller tasks.

I usually use it right after a long Opus 4.5 session to polish things. It performs best with strong context, so I always test my app using Simple Browser and then point Sonnet to the exact elements I want to change.

I also rely on it for summarization and for creating changelogs after long coding sessions.

Example of a changelog created by Sonnet 4.5

Best use case

  • Small fixes after large coding sessions
  • UI tweaks
  • Code reviews
  • Changelog and summary generation

Usage quota: x1

3. GPT-5.2

I have mixed feelings about GPT-5.2, but it is clearly very good in one specific area.

I don’t enjoy using it for direct coding. Compared to Opus 4.5, I often see more boilerplate and more bugs. That said, where GPT-5.2 really stands out is planning and architecture.

It consistently produces excellent:

  • System architecture
  • Stack decisions
  • Data models
  • Step-by-step development plans

With a single, well-defined prompt, I often get a deeply structured plan with clear todos, many relevant fetched sources, and strong reasoning. Beast Mode helps a lot here. See this example:

Full architecture and dev plan generated by GPT-5.2

Best use case

  • Architecture planning
  • System design
  • Data modeling
  • Breaking ideas into actionable steps

Usage quota: x1

4. Gemini 3 Pro

Gemini 3 Pro is a strong backend-focused model in my experience, but it is much weaker for complex UI work.

I usually choose it for database logic and authentication because it requires very little back-and-forth to reach a solid result (at least for me). It tends to “get it” quickly when it comes to structured backend problems.

It’s also very good at writing documentation and general content.

A GIF showing a detailed README.md generated by Gemini 3 Pro

Best use case

  • Database implementation
  • Auth flows
  • Backend logic
  • README and documentation generation

Usage quota: x1

5. GPT-5 Codex (Preview)

This is my go-to model for smaller coding tasks when I want precision and predictability.

GPT-5 Codex is much less “opinionated” than GPT-5.2 and stays very focused on the code. It’s not great for planning, but it’s very good at implementing exactly what you ask for, especially for backend code with a clear scope. For UI work, it has been less impressive for me.

I use it a lot for isolated features, functions, and clean implementations.

Best use case

  • Writing specific backend functions
  • Implementing clear specs
  • Code-first tasks with minimal discussion

Usage quota: x1

6. Claude Haiku 4.5

Claude Haiku 4.5 is my “fast response” model.

It’s not meant for heavy logic or large features, but it’s extremely useful when I want a quick answer without wasting quota on stronger models. It’s responsive, clear, and surprisingly accurate for small tasks.

I mostly use it as a helper during development, never as the execution model. I haven’t even tried using it for execution, so feel free to share your experience with it in the comments.

Best use case

  • Quick questions
  • Small helpers
  • Explaining existing code

Usage quota: x0.33

Final Thoughts

No single model does everything perfectly.

For me:

  • Opus 4.5 is the heavy lifter
  • Sonnet 4.5 is the finisher
  • GPT-5.2 is the architect
  • Gemini 3 Pro owns backend logic
  • GPT-5 Codex is the clean coder
  • Haiku 4.5 is the fast assistant

If you combine them instead of sticking to one model, the results are on a completely different level (and your quota too 🫠).
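To make the quota trade-off concrete, here is a minimal sketch of how the roles and multipliers above could be written down and added up. The task-type keys, `pick_model`, and `session_cost` are hypothetical names made up for illustration and are not part of VS Code or Copilot. The multipliers are simply the usage quota values quoted in this article, and the sketch assumes each request is charged at its model's multiplier.

```python
# Hypothetical routing table: task type -> (model, quota multiplier).
# Model roles and multipliers are taken from the sections above;
# the task-type keys are illustrative names, not an official API.
MODELS = {
    "large_feature": ("Claude Opus 4.5", 3.0),
    "polish": ("Claude Sonnet 4.5", 1.0),
    "architecture": ("GPT-5.2", 1.0),
    "backend": ("Gemini 3 Pro", 1.0),
    "clear_spec": ("GPT-5 Codex (Preview)", 1.0),
    "quick_question": ("Claude Haiku 4.5", 0.33),
}


def pick_model(task_type: str) -> str:
    """Return the model name this article suggests for a task type."""
    return MODELS[task_type][0]


def session_cost(requests: dict[str, int]) -> float:
    """Sum quota usage, assuming each request costs its model's multiplier."""
    total = sum(MODELS[task][1] * count for task, count in requests.items())
    return round(total, 2)


if __name__ == "__main__":
    # An example session: one plan, a few Opus runs, some polish, quick questions.
    session = {
        "architecture": 1,
        "large_feature": 4,
        "polish": 3,
        "quick_question": 5,
    }
    for task in session:
        print(f"{task}: {pick_model(task)}")
    print("Total quota used:", session_cost(session))
```

In a mix like this, the Opus requests account for most of the total, which is why I keep the smaller tasks on the x1 and x0.33 models.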
