DEV Community

Bradley Matera

Review the Logic, Not Whether the Junior Used AI

Code review has a new lazy question:

"Did AI write this?"

Sometimes that question matters.

If private code was pasted into an unapproved tool, it matters.

If generated code was shipped without understanding, it matters.

If the team has a compliance policy, it matters.

But as a review standard, that question is weak.

The better question is:

"Is this code correct, tested, maintainable, and owned by the person submitting it?"

That standard catches bad AI code. It also catches bad human code.


AI panic can hide ordinary review failures

Take a SQL example.

A report needs to include analytics events even when the segment reference is missing, because missing references are part of the data-quality signal.

The wrong query looks clean:

SELECT a.id, a.user_id, a.segment_id
FROM analytics_events a
JOIN segments s ON a.segment_id = s.id
WHERE a.deleted_at IS NULL;

The problem is the inner JOIN: it silently drops every event whose segment_id has no matching row in segments.

If missing segment references are supposed to remain visible, the query needs a LEFT JOIN and an explicit signal:

SELECT
  a.id,
  a.user_id,
  a.segment_id,
  s.id IS NULL AS missing_segment
FROM analytics_events a
LEFT JOIN segments s ON a.segment_id = s.id
WHERE a.deleted_at IS NULL;

That bug has nothing to do with whether AI was used.

A human can make it. AI can make it. A rushed senior can make it. A junior can make it.

The review should catch it either way.
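If the join semantics feel abstract, here is a minimal in-memory sketch of the same behavior in plain JavaScript. The data is made up, but it shows exactly what the inner join drops and what the left join preserves:

```javascript
// Hypothetical data mirroring the SQL example above.
const events = [
  { id: 1, userId: 'u1', segmentId: 's1', deletedAt: null },
  { id: 2, userId: 'u2', segmentId: 's-missing', deletedAt: null }, // orphaned
];
const segments = [{ id: 's1', name: 'Known segment' }];

// INNER JOIN semantics: only events with a matching segment survive.
const innerJoin = events
  .filter((e) => e.deletedAt === null)
  .filter((e) => segments.some((s) => s.id === e.segmentId));

// LEFT JOIN semantics: every active event survives, with the missing
// reference surfaced as an explicit flag instead of a vanished row.
const leftJoin = events
  .filter((e) => e.deletedAt === null)
  .map((e) => ({
    ...e,
    missingSegment: !segments.some((s) => s.id === e.segmentId),
  }));

console.log(innerJoin.length); // 1 -- the orphaned event is silently gone
console.log(leftJoin.length); // 2 -- the orphaned event stays, flagged
```

The point is that the row does not disappear with an error. It disappears quietly, which is why the review has to ask what data gets dropped.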

Code review is supposed to teach and protect

Research on code review does not describe it as a rubber stamp.

The 2013 paper Expectations, Outcomes, and Challenges of Modern Code Review found that while defect finding remains a major motivation, code review also supports knowledge transfer, team awareness, and alternative solution discovery.

Google's 2018 paper Modern Code Review: A Case Study at Google describes review as serving readability, education, maintainability, and correctness goals.

Chart: Code review research from 2013 and 2018 frames review as knowledge transfer, education, maintainability, and correctness work

Sources: ICSE 2013 modern code review paper and Google Modern Code Review case study.

That matters for juniors. If review only says:

"This looks AI-generated."

it teaches nothing useful.

If review says:

"This join drops orphaned events. Add a fixture that proves missing segments stay visible."

That teaches.

Ask review questions that expose behavior

Here is the shift teams need:

| Weak review question | Strong review question |
| --- | --- |
| Did AI write this? | Can the author explain the logic? |
| Is this generated? | What behavior does this guarantee? |
| Does it look clean? | What data does it drop? |
| Did the junior follow policy? | Did the reviewer verify the assumption? |
| Is this allowed? | What risk does this introduce? |
| Can we reject it? | What test would make it safe? |

The tool-origin question is not useless. It is just not enough.

AI disclosure should be boring

Teams need policies that make disclosure normal.

Example:

AI assistance:
- used AI to understand the existing SQL join behavior
- used AI to brainstorm edge cases
- final query was manually reviewed and tested

Validation:
- added fixture for missing segment reference
- verified deleted events are excluded
- verified active orphaned events remain visible

That is a reviewable note.

It gives the reviewer a path:

  • inspect the behavior
  • inspect the test
  • ask what was rejected
  • verify the edge case

Compare that with the usual vague policy:

"Use AI responsibly."

That is not a policy. That is a future argument.
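One way to make disclosure boring is to automate the nag. Here is a minimal sketch of a PR-description check a team might wire into CI. The heading names and regexes are assumptions, not a standard; adapt them to whatever disclosure format the team actually adopts:

```javascript
// Hypothetical CI check: if a PR body mentions AI assistance, require a
// structured disclosure note. Section headings are assumed conventions.
function checkDisclosure(prBody) {
  const mentionsAI = /\bai\b|copilot|chatgpt|claude/i.test(prBody);
  const hasDisclosure = /^AI assistance:/m.test(prBody);
  const hasValidation = /^Validation:/m.test(prBody);

  if (!mentionsAI) return { ok: true, reason: 'no AI mention' };
  if (hasDisclosure && hasValidation) return { ok: true, reason: 'disclosed' };
  return { ok: false, reason: 'AI mentioned without a disclosure note' };
}

const good = checkDisclosure('AI assistance:\n- brainstormed edge cases\nValidation:\n- added fixture');
const bad = checkDisclosure('used ChatGPT for the join');
console.log(good.ok, bad.ok); // true false
```

A check like this does not judge whether AI was used. It only makes the disclosure note as routine as a changelog entry.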

Tests are where the tool debate gets real

For the SQL example, a useful test fixture includes:

  • event with valid segment
  • event with missing segment
  • deleted event
  • expected output that preserves the missing-segment row

Example:

it('preserves active analytics events with missing segment references', async () => {
  // Assumes the segments fixture contains 'known-segment' but not 'missing-segment'.
  await seedAnalyticsEvent({ id: 1, segmentId: 'known-segment', deletedAt: null });
  await seedAnalyticsEvent({ id: 2, segmentId: 'missing-segment', deletedAt: null });
  // Soft-deleted event: must be excluded from the report entirely.
  await seedAnalyticsEvent({ id: 3, segmentId: 'deleted-segment', deletedAt: new Date() });

  const rows = await reportRepository.getCampaignSegmentRows();

  // The orphaned event (id 2) stays visible, flagged instead of dropped.
  expect(rows).toEqual([
    expect.objectContaining({ id: 1, missing_segment: false }),
    expect.objectContaining({ id: 2, missing_segment: true }),
  ]);
});

That test does not care who wrote the query. It cares whether the business rule survives.

That is the right level of discipline.

Developers already know AI is unreliable

The common narrative is that juniors trust AI too much.

Some do. But the broader developer population is not blindly confident either.

Stack Overflow's 2025 Developer Survey says more developers distrust AI output accuracy than trust it: 46% versus 33%. It also says the biggest frustration is AI answers that are "almost right, but not quite." [Stack Overflow AI survey]

Chart: Stack Overflow 2025 shows 46% distrust AI output accuracy, 33% trust it, and 66% cite almost-right answers as a frustration

Source: Stack Overflow 2025 Developer Survey.

That phrase should be printed above every code review tool:

almost right, but not quite

That is the danger zone. It applies to human code too.

Bad AI use versus smart AI use

Bad AI use:

  • generate code
  • paste it
  • cannot explain it
  • no tests
  • no docs
  • no edge cases
  • hide the tool use

Smart AI use:

  • ask AI to explain unfamiliar code
  • ask for edge cases
  • compare with docs
  • write or improve tests
  • reject wrong suggestions
  • disclose meaningful help
  • own the final change

Teams should punish the first pattern and teach the second.

The senior double standard

The uncomfortable part is that many seniors have a double standard.

They treat junior AI use as suspicious but treat senior intuition as trustworthy.

That is not engineering. Engineering is evidence.

A senior's hand-written code can still be wrong.

A junior's AI-assisted code can still be correct.

Neither gets a free pass.

The standard should be:

| Standard | Applies to |
| --- | --- |
| Explain the code | Everyone |
| Test risky behavior | Everyone |
| Follow privacy policy | Everyone |
| Disclose required AI use | Everyone |
| Accept review | Everyone |
| Own production impact | Everyone |

That is how teams avoid turning AI policy into status politics.

What hiring managers should ask

If companies want AI-aware juniors, they should evaluate AI-aware review skills.

Ask candidates:

  • How would you verify AI-generated code?
  • What makes an AI answer unsafe?
  • When would you refuse to use generated output?
  • How would you test this SQL query?
  • What would you disclose in a PR?

That is better than pretending AI does not exist. It is also better than rewarding candidates who hide their workflow.

Bottom line

Review the logic. Review the tests. Review the assumptions. Review the data that disappears.

Review the security boundary. Review the production risk.

Yes, review AI usage too.

But do not confuse tool suspicion with engineering rigor.

The goal is not to keep AI out of the conversation. The goal is to keep bad code out of production.

Interested in code review, AI, and engineering culture? Explore #code-review on DEV.
