Lewis

Posted on Jun 5 • Originally published at Medium

State of AI Code Review | May 2026 Roundup

#ai #agents #codereview #productivity

Welcome to my second monthly roundup of everything significant that shipped across the AI code review space.

Most of the players on the field spent May catching up with the leaders. We saw severity scores, effort dials, and better dashboards turning up across tool after tool. The bigger bets came from two companies:

Macroscope shipped the month's most ambitious launch, Check Run Agents: programmable code review that automates any review standard you care about, not just bugs, on every PR.
CodeRabbit kept pushing to make giant AI-written PRs reviewable with Change Stack.

There's actually a deeper strategic rift sitting between these two features - one that gets at the question hanging over the whole space.

What role will AI play in the future of code review?

Will it take on an ever-larger share of the job? Or will the burden remain with humans, with AI's job being to augment the reviewer?

Macroscope and CodeRabbit each cater to both scenarios - but Macroscope is clearly making the bigger bet on shifting the work to AI, while CodeRabbit is more focused on improving the human reviewer's cockpit.

1/7 Macroscope

Check Run Agents (May 1)

The most ambitious feature launched anywhere in the code review space all month.

Macroscope already hunts for bugs on every PR out of the box. Now Check Run Agents let you add your own checks on top.

Say you want every PR checked against your team's coding conventions, or an architecture diagram of the change drawn up and posted to Slack so reviewers grasp it at a glance, or each PR run through your database-migration checklist.

Well, now you can write each check as a plain-English markdown file, drop it in your repo, and Macroscope will run it automatically on every PR it applies to - posting a pass/fail result right in the PR, which you can set to block the merge or leave as an FYI.

Checks can also be wired to other tools (Slack, Jira/Linear, Sentry, your analytics), so they can verify things beyond the code itself - like whether a PR actually does what its linked ticket asked.

In short, it lets you automate any review standard your team cares about - not just bugs - on every PR.
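As a rough mental model of the pass/fail gating described above, here is a minimal Python sketch. It is not Macroscope's API or file format: `CheckResult`, `merge_allowed`, `summarize`, and the check names are all hypothetical, and the only behavior it models is "a failed check either blocks the merge or is reported as an FYI".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one check on one PR (hypothetical shape, for illustration)."""
    name: str
    passed: bool
    blocking: bool  # True: a failure blocks the merge; False: FYI only


def merge_allowed(results: list[CheckResult]) -> bool:
    """A PR may merge unless at least one blocking check failed."""
    return not any(r.blocking and not r.passed for r in results)


def summarize(results: list[CheckResult]) -> list[str]:
    """Render one status line per check, in the order given."""
    lines = []
    for r in results:
        if r.passed:
            status = "pass"
        elif r.blocking:
            status = "FAIL (blocking)"
        else:
            status = "fail (FYI)"
        lines.append(f"{r.name}: {status}")
    return lines


# Hypothetical results for the three example checks mentioned earlier.
results = [
    CheckResult("coding-conventions", passed=True, blocking=True),
    CheckResult("migration-checklist", passed=False, blocking=True),
    CheckResult("architecture-diagram", passed=False, blocking=False),
]

print("\n".join(summarize(results)))
print("merge allowed:", merge_allowed(results))
```

The point of separating `blocking` from `passed` is that the same failed check can be a hard gate for one team and a nudge for another.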

CLI (May 8)

Brings Macroscope's review to your own machine: run it from your terminal or editor to catch and fix issues before you push, rather than waiting on the PR. Includes an autopilot mode that loops review-and-fix on its own, and plugins for Claude Code, Codex, Cursor, and OpenCode.

Claude Opus 4.8 support (May 29)

Minor: Opus 4.8 is now selectable as the model powering Check Run Agents.

2/7 CodeRabbit

Change Stack (May 7)

A new review interface built for big, AI-written PRs.

The problem it tackles: AI generates sprawling changes, and GitHub hands them to you as a flat, alphabetical list of files you have to mentally reassemble before you can judge anything.

Change Stack reorganizes that into a guided, layer-by-layer walkthrough - it groups the change into a few self-contained storylines, orders each so the foundational changes come before the code that builds on them, captions every chunk in plain English, and draws a diagram where one helps. You review and approve right there in the view.
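The "foundational changes first" ordering is, in spirit, a topological sort. The sketch below shows that idea with Python's standard `graphlib`; it is an illustration only, not how CodeRabbit builds Change Stack, and the file names and dependency map are made up.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map for one PR: each changed file maps to the
# changed files it builds on.
depends_on: dict[str, set[str]] = {
    "api/handlers.py": {"services/billing.py"},
    "services/billing.py": {"models/invoice.py"},
    "models/invoice.py": set(),
    "tests/test_billing.py": {"services/billing.py"},
}


def review_order(deps: dict[str, set[str]]) -> list[str]:
    """Order files so each one appears after everything it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = review_order(depends_on)
for position, path in enumerate(order, start=1):
    print(position, path)
```

Compared with a flat alphabetical list, this puts `models/invoice.py` in front of the service and handler code that uses it, so a reviewer meets each definition before its callers.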

CodeRabbit built it out twice more in May: a semantic diff view (May 17) and "Code Peek" - click any name to jump to where it's defined (May 18).

CLI matured (throughout May)

CodeRabbit's command-line reviewer kept iterating across four releases. The standout addition is coderabbit doctor, which checks your setup and connectivity before you run a review.

3/7 Cursor Bugbot

Effort Levels (May 11)

Bugbot (Cursor's PR reviewer) now lets you set how hard it works on a review.

"Default" is today's behavior - fast and cheap.

"High" makes it think longer - slower and pricier, but it catches more bugs.

"Custom" lets you write plain-English rules for which PRs get the heavy treatment (e.g. "go high on anything touching auth or payments, default elsewhere"), and Cursor sets the effort per-PR. Requires usage-based billing.

The numbers Cursor shared:

Default finds ~0.7 bugs per review, and 79% of those get fixed before merge;
High finds ~0.95 - roughly 36% more bugs (a little over a third) for the extra spend.
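To make those figures concrete, here is the arithmetic as a short Python sketch. The three inputs are the approximate numbers quoted above; the variable names are mine, and the 79% fix rate is applied only to Default because that is the only tier it was reported for.

```python
default_rate = 0.7   # bugs found per review at Default (approximate)
high_rate = 0.95     # bugs found per review at High (approximate)
fix_share = 0.79     # share of Default findings fixed before merge

# Relative improvement of High over Default.
relative_gain = (high_rate - default_rate) / default_rate

# Absolute difference, scaled to 100 reviews for readability.
extra_per_100_reviews = (high_rate - default_rate) * 100

# Default-tier findings that actually get fixed, per 100 reviews.
fixed_per_100_default = default_rate * fix_share * 100

print(f"relative gain: {relative_gain:.0%}")
print(f"extra bugs per 100 reviews: {extra_per_100_reviews:.0f}")
print(f"Default bugs fixed per 100 reviews: {fixed_per_100_default:.1f}")
```

In other words, High buys roughly 25 extra findings per 100 reviews; whether that is worth it depends on what the extra spend comes to under usage-based billing.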

4/7 Greptile

After a quiet April in which it shipped nothing, Greptile rolled out a batch of features in May via a launch video - no individual dates.

Revamped memory

Greptile keeps its own internal docs on your codebase and coding standards, and updates them every time someone uses it, so it gets sharper over time.

Severity scores + agent hand-off

Comments now show how serious an issue is, and you can fire a comment off to a coding agent (Claude, Codex, Cursor, or Devin) to fix it. Both are catch-up moves, though - rivals have had findings-to-coding-agent hand-off and severity scoring for months.

Rebuilt web app

Manage multiple GitHub and GitLab orgs in one place, configure Greptile down to the repo level, and pull analytics. (The detail they shared was light.)

5/7 GitHub Copilot

Code review comment improvements (May 12)

Copilot is playing catch-up here, adding two review-comment features most dedicated rivals have had for a while:

Severity labels - each comment now carries a High, Medium, or Low tag, so you can see what to prioritise.
Grouped comments - repetitive suggestions are merged into one (e.g. the same variable-rename flagged once, not on every occurrence), cutting noise on big PRs.
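The grouping idea can be sketched in a few lines of Python. This is a simplified illustration, not Copilot's implementation: it assumes two comments are "the same" when their suggestion text matches exactly, and the `Comment` type, file paths, and suggestions are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    """One review comment (hypothetical shape, for illustration)."""
    path: str
    line: int
    suggestion: str


def group_comments(comments: list[Comment]) -> dict[str, list[tuple[str, int]]]:
    """Merge comments that make the same suggestion, keeping every location."""
    groups: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for c in comments:
        groups[c.suggestion].append((c.path, c.line))
    return dict(groups)


comments = [
    Comment("app/jobs.py", 12, "rename `tmp` to `retry_count`"),
    Comment("app/jobs.py", 40, "rename `tmp` to `retry_count`"),
    Comment("app/worker.py", 7, "rename `tmp` to `retry_count`"),
    Comment("app/worker.py", 19, "handle the empty-queue case"),
]

grouped = group_comments(comments)
for suggestion, locations in grouped.items():
    print(f"{suggestion} ({len(locations)} occurrence(s))")
```

Four raw comments collapse to two entries here, which is the noise reduction the feature is after on big PRs.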

More control when handing fixes to its agent (May 19)

Another catch-up. Copilot could already turn a review comment into a fix via its cloud agent (GitHub's autonomous coding agent, which writes the change and opens a pull request). May adds the controls rivals already had: clicking Fix with Copilot now opens a pop-up to choose how the fix lands - on the current PR or a new one, which model to use, any extra instructions - and a new Fix batch with Copilot lets you pick several comments and fix them in one go.

Minor - review suggestions now measurable by type (May 8)

The usage-metrics API breaks Copilot's review suggestions down by category (security, bug_risk, etc.) and reports how many developers actually applied each - so admins see what Copilot flags and what gets acted on.

6/7 Qodo

Findings page in the portal - Beta (May 13)

Qodo joined other leading tools by adding a central dashboard: one place that collects every issue its reviewer has flagged across all your PRs and repos.

Team leads can browse, filter, and track findings, with 30-day analytics on code-quality trends. It's a visibility layer over the review output, not a change to the reviewing itself.

Auto-imports rules from common formats (May 13)

Qodo now automatically pulls in review rules from Cursor's rule files (.cursor/rules/, .cursorrules) and skill files (SKILL.md), so existing config carries over without re-creating it. (Same move Greptile made this month - reading rules from where they already live.)

7/7 Claude Code Code Review

New /code-review command (May 21)

A single slash command for running code review inside Claude Code.

Rather than adding new functionality, it consolidates Claude's existing review abilities into one command. You shape the review by adding a suffix to it:

How deep it looks - add a depth word, anywhere from /code-review low to /code-review ultra. low and medium are a quick pass (just the issues it's confident about), high and max dig deeper and flag more (some of which won't pan out), and ultra kicks off the big multi-agent review out in the cloud.
What it reviews - add a PR number (/code-review 128) to have it review that specific pull request. If you don't include a number, it reviews your current changes by default.
What it does with the findings - add --comment (/code-review --comment) and it posts its findings as inline comments on your GitHub PR; otherwise it just outputs them in the chat. Add --fix and it applies the fixes to your code.
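To summarize the three kinds of suffix, here is a toy Python parser that turns a command string into a structured request. It is a reading aid, not Claude Code's implementation. It assumes the five depth words named above are the full set and that suffixes can be combined freely in one command, neither of which is confirmed here. `ReviewRequest` and `parse_code_review` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Depth words named in this post; assumed here to be the complete list.
DEPTHS = ("low", "medium", "high", "max", "ultra")


@dataclass
class ReviewRequest:
    depth: Optional[str] = None      # None: no depth word was given
    pr_number: Optional[int] = None  # None: review the current changes
    comment: bool = False            # True: post inline comments on the PR
    fix: bool = False                # True: apply the fixes to the code


def parse_code_review(command: str) -> ReviewRequest:
    """Parse a '/code-review ...' string into a ReviewRequest."""
    tokens = command.split()
    if not tokens or tokens[0] != "/code-review":
        raise ValueError("not a /code-review command")
    request = ReviewRequest()
    for token in tokens[1:]:
        if token in DEPTHS:
            request.depth = token
        elif token.isdigit():
            request.pr_number = int(token)
        elif token == "--comment":
            request.comment = True
        elif token == "--fix":
            request.fix = True
        else:
            raise ValueError(f"unrecognized suffix: {token}")
    return request


print(parse_code_review("/code-review 128"))
print(parse_code_review("/code-review ultra"))
print(parse_code_review("/code-review --comment"))
```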

It joins the existing review commands rather than replacing them - /simplify (cleanup only - won't hunt for bugs), /review (reviews an open PR), and /ultrareview (the standalone heavy cloud review, same as /code-review ultra) all remain.

The State Of AI Code Review in May 2026

May in a nutshell: Macroscope made code review programmable, CodeRabbit made giant AI PRs reviewable, and everyone else played catch-up - severity scores, effort dials, and dashboards all round.

See you at the start of July with June's moves.

DEV Community