IslandBytes
I built a merge gate that quizzes developers on their own code changes — here's why and how

We've all done it. Merged our own PR on autopilot. Clicked through an AI suggestion without really reading it. Approved a teammate's "small change" that turned out to be anything but.

AI coding tools have made this worse. The volume of code being generated and merged without genuine human understanding is growing fast. Linters catch style. Tests catch regressions. Nothing catches the case where the developer simply never understood what they were shipping.

So I built Commit Comprehension Gate (CCG).


What it does
CCG is a GitHub Action that intercepts pull requests and generates multiple-choice questions from the actual diff before allowing a merge.
When a PR is opened:

  1. The diff is sent to Claude via the Anthropic API
  2. Claude generates 3 multiple-choice questions about the actual logic: not trivia, but questions that require reading the code to answer
  3. Questions are posted as a PR comment, merge is blocked via a required status check
  4. The PR author answers in the comment thread
  5. All 3 correct → merge unlocked. Wrong answers → stays blocked

The author can retry as many times as needed. Pushing new commits regenerates fresh questions from the updated diff.
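A minimal Python sketch of the generation step in that flow. The function names, prompt wording, and JSON response shape are my assumptions for illustration, not the repo's actual code; the commented-out API call assumes the official Anthropic Python SDK:

```python
import json

def build_prompt(diff: str) -> str:
    """Ask the model for exactly 3 multiple-choice questions as strict JSON."""
    return (
        "Generate exactly 3 multiple-choice questions (options A-D) that "
        "require reading this diff to answer. Respond with JSON only: "
        '{"questions": [{"q": "...", "choices": {"A": "..."}, "answer": "A"}]}'
        "\n\n" + diff
    )

def parse_questions(model_text: str) -> list[dict]:
    """Parse the model's JSON reply and enforce the 3-question contract."""
    questions = json.loads(model_text)["questions"]
    assert len(questions) == 3, "expected exactly 3 questions"
    return questions

# The live call (requires ANTHROPIC_API_KEY) might look roughly like:
# client = anthropic.Anthropic()
# resp = client.messages.create(model="claude-...", max_tokens=1024,
#     messages=[{"role": "user", "content": build_prompt(diff)}])
# questions = parse_questions(resp.content[0].text)
```

Keeping the prompt and the parser as pure functions makes the contract with the model testable without burning API calls.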


The interesting implementation challenge
The design constraint I set for myself: no external database, no storage service, completely stateless.

The answer key needed to persist between two decoupled GitHub Actions workflows — the question generation workflow and the answer verification workflow — without any shared state.

The solution: the answer key is stored as a base64-encoded hidden HTML comment embedded directly in the PR comment itself:

`<!-- comprehension-gate: eyJxdWVzdGlvbnMiOiBb... -->`

Human readable on the surface. Answer key invisible. No database required.
This means:

  • Zero infrastructure to maintain
  • Works in any repo with zero setup beyond 3 files and one API key
  • Answer key automatically refreshes when new commits are pushed

How answer checking works
Claude is called exactly once per PR state — once on open, once per new commit. All subsequent answer checking is instant local string comparison. No additional API calls regardless of how many times the author retries.
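That local comparison could look something like this, assuming answers are submitted in a reply formatted like `1: B 2: A 3: C` (the reply format and string question keys are my assumptions):

```python
import re

def parse_reply(reply: str) -> dict:
    """Pull '1: B'-style answer pairs out of the author's reply."""
    pairs = re.findall(r"(\d+)\s*[:.)]?\s*([A-Da-d])\b", reply)
    return {num: letter.upper() for num, letter in pairs}

def all_correct(answer_key: dict, reply: str) -> bool:
    """Pure string comparison: no API call, so retries are free."""
    given = parse_reply(reply)
    return all(given.get(q) == ans for q, ans in answer_key.items())
```

Since the key is already in the comment thread, every retry is a regex and a dict lookup, which is why retries cost nothing.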

This keeps costs predictable:

| Diff size | Estimated cost |
| --- | --- |
| Small (< 2 KB) | ~$0.01 |
| Medium (2–6 KB) | ~$0.04 |
| Large (6–12 KB) | ~$0.07 |

Typical cost: $0.05–$0.10 per PR.


One more design decision worth explaining
Only the PR author can pass the gate.

The answer verification workflow checks commenter identity against PR metadata before evaluating any answers. Teammates can't answer on your behalf. Automated bots can't satisfy the gate. The person who wrote the code has to demonstrate they understand it.
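Sketched against the shape of GitHub's `issue_comment` webhook payload (PR comments are delivered as issue comments), that identity check might look like this; the function name is mine, not the repo's:

```python
def is_pr_author(event: dict) -> bool:
    """Only accept answers from the human who opened the PR."""
    commenter = event["comment"]["user"]["login"]
    pr_author = event["issue"]["user"]["login"]
    is_bot = event["comment"]["user"].get("type") == "Bot"
    return commenter == pr_author and not is_bot
```

Checking the login against the PR metadata (rather than, say, a configured allowlist) is what makes the gate follow the code's author automatically.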


Who this is for
This is built for teams where "LGTM" has become a rubber stamp. Where AI-generated code is getting merged faster than anyone can review it. Where a regression ships and nobody can explain what the change was supposed to do.

It's not a linter. It's not another AI reviewer adding comments to your PR. It's a forcing function — if you can't answer basic questions about your own change, you shouldn't be merging it.


Try it
The repo is open source and MIT licensed:
github.com/islandbytesio/commit_comprehension_gate

Setup takes about 5 minutes — copy 3 files into your repo, add your Anthropic API key as a repo secret, and configure the required status check.

Would love feedback from anyone who tries it on a real codebase — especially around question quality on different languages and diff sizes.


Patent Pending — IslandBytes LLC
