TL;DR: A blend of frontier and open models is more cost-efficient and reviews faster. NVIDIA Nemotron is now supported for CodeRabbit self-hosted customers.
We are delighted to share that CodeRabbit now supports the NVIDIA Nemotron family of open models in the blend of Large Language Models (LLMs) it uses for AI code reviews. Support for Nemotron 3 Nano is initially enabled for CodeRabbit’s self-hosted customers running its container image on their own infrastructure. Nemotron powers the context-gathering and summarization stage of the code review workflow, before frontier models from OpenAI and Anthropic take over for deep reasoning and for generating review comments that suggest bug fixes.
How Nemotron helps: Context gathering at scale
This new blend of open and frontier models improves both the overall speed of context gathering and cost efficiency by routing different parts of the review workflow to the appropriate model family, while delivering review accuracy on par with running frontier models alone.
High-quality AI code reviews that can find deep-lying, hidden bugs require extensive context gathering around the code being analyzed. The most frequent (and most token-hungry) work is summarizing and refreshing that context: what changed in the code and whether it matches developer intent, how those changes connect with the rest of the codebase, what the repo conventions or custom rules are, what external data sources are available to aid the review, and so on.
This context-building stage is the workhorse of the overall AI code review process, and it runs several times iteratively throughout the review workflow. NVIDIA Nemotron 3 Nano was built for high-efficiency tasks; its large context window (1 million tokens) and high throughput make it well suited to gathering large amounts of data and running several iterations of context summarization and retrieval.
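To make the iterative pattern concrete, here is a minimal sketch of a context-refresh loop that repeatedly folds new context into a running summary under a token budget. The `summarize` stub stands in for a call to a fast, large-context model; the function names and budget logic are illustrative assumptions, not CodeRabbit’s actual implementation.

```python
def summarize(text: str, budget: int) -> str:
    """Stub: a real implementation would call the open model's API."""
    # Crude truncation stands in for model-driven summarization.
    return text[:budget]

def refresh_context(chunks: list[str], budget: int = 100, rounds: int = 3) -> str:
    """Repeatedly fold new context chunks into a running summary,
    keeping the summary within a fixed budget on every iteration."""
    summary = ""
    for _ in range(rounds):
        for chunk in chunks:
            summary = summarize(summary + "\n" + chunk, budget)
    return summary
```

Because the summary is re-compressed on every pass, the loop can keep absorbing fresh signals (new diffs, linter output, retrieved snippets) without the context growing unboundedly.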
CodeRabbit architecture with Nemotron support
Blend of frontier and open models
When you open a Pull Request (PR), CodeRabbit’s code review workflow is triggered, starting in an isolated, secure sandbox environment where CodeRabbit analyzes code from a clone of the repo. In parallel, CodeRabbit pulls in context signals from several sources:
- Code and PR index
- Linter / Static App Security Tests (SAST)
- Code graph
- Coding agent rules files
- Custom review rules and Learnings
- Issue tickets (Jira, Linear, GitHub Issues)
- Public MCP servers
- Web search
To dive deeper into our context engineering approach, check out our blog post: The art and science of context engineering for AI code reviews.
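The parallel fan-out over those sources can be sketched as follows. The source names mirror the list above, but `fetch_signal` and the overall shape are hypothetical placeholders, not CodeRabbit’s real API:

```python
from concurrent.futures import ThreadPoolExecutor

# Context sources mirroring the list above (names are illustrative).
CONTEXT_SOURCES = [
    "code_and_pr_index",
    "linters_and_sast",
    "code_graph",
    "agent_rules_files",
    "custom_rules_and_learnings",
    "issue_tickets",
    "public_mcp_servers",
    "web_search",
]

def fetch_signal(source: str) -> dict:
    """Stub: each source would query an index, tool, or external service."""
    return {"source": source, "snippets": [f"context from {source}"]}

def gather_context(sources: list[str]) -> list[dict]:
    """Pull all context signals in parallel, then hand them to summarization."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(fetch_signal, sources))

signals = gather_context(CONTEXT_SOURCES)
```

Fetching signals concurrently keeps total latency close to the slowest single source rather than the sum of all of them.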
A lot of this context, along with the code diff being analyzed, is used to generate a PR summary before any review comments are generated. This is where open models come in. Instead of sending all of the context to frontier models, CodeRabbit now uses Nemotron 3 Nano to gather and summarize the relevant context. Summarization is at the heart of every code review and is key to delivering a high signal-to-noise ratio in the review comments.
After the summarization stage completes, the frontier models (e.g., OpenAI GPT-5.2-Codex and Anthropic Claude Opus/Sonnet 4.5) perform deep reasoning to generate review comments, and execute agentic steps like review verification, pre-merge checks, and “finishing touches” (including docstring and unit-test suggestions).
What this means for our customers
CodeRabbit is now enabling Nemotron-3-Nano-30B support (initially for its self-hosted customers) for the context summarization part of the review workflow, alongside the frontier models from OpenAI and Anthropic. The result is faster code reviews without compromising quality.
We are also pleased to support NVIDIA’s announcement today expanding its Nemotron family of open models, and we are excited to work with the company to help accelerate AI coding adoption across every industry.
Get in touch with our team to access CodeRabbit’s container image if you would like to run AI code reviews on your self-hosted infrastructure.
