I gave Claude the same code review twice. It missed a SQL injection the second time. So I encoded 12 engineering books into it.

"The bearing of a child takes nine months, no matter how many women are assigned."
— Frederick Brooks, The Mythical Man-Month (1975)

Fifty years later, Brooks was still right. And so was the rest of my shelf.


I ran the same code review with Claude twice.

First time: caught the SQL injection, flagged separation of concerns. Solid review.
Second time: focused on naming conventions. Missed the injection entirely.

Same code. Same model. Completely different results.

That's not a Claude problem. That's a consistency problem. And it's fixable — if you give the model a framework to work from.

So I encoded 12 classic engineering books into a Claude Code skill. Here's what happened.

The Problem

Most code quality tools count lines and measure cyclomatic complexity. That's useful, but it misses the deeper problems that slow teams down for months before anyone notices: architectural drift, knowledge silos, domain model distortion.

Meanwhile, the software engineering classics have had answers to these problems for decades. Brooks, Fowler, Martin, McConnell, Evans, Ousterhout — twelve books, fifty years of hard-won wisdom. The insights haven't changed. We just stopped encoding them consistently.

What I Built

brooks-lint is a Claude Code skill (it also works with Gemini CLI and Codex CLI) that diagnoses code against 12 decay-risk dimensions synthesized from 12 classic engineering books. Every run produces structured findings with book citations, severity labels, and concrete remedies.

The 12 Books

| Book | Author |
| --- | --- |
| The Mythical Man-Month | Frederick Brooks |
| Code Complete | Steve McConnell |
| Refactoring | Martin Fowler |
| Clean Architecture | Robert C. Martin |
| The Pragmatic Programmer | Hunt & Thomas |
| Domain-Driven Design | Eric Evans |
| A Philosophy of Software Design | John Ousterhout |
| Software Engineering at Google | Winters, Manshreck & Wright |
| xUnit Test Patterns | Gerard Meszaros |
| The Art of Unit Testing | Roy Osherove |
| How Google Tests Software | Whittaker, Arbon & Carollo |
| Working Effectively with Legacy Code | Michael Feathers |

The Six Production Code Decay Risks

| Risk | Diagnostic Question |
| --- | --- |
| 🧠 Cognitive Overload | How much mental effort to understand this? |
| 🔗 Change Propagation | How many unrelated things break on one change? |
| 📋 Knowledge Duplication | Is the same decision expressed in multiple places? |
| 🌀 Accidental Complexity | Is the code more complex than the problem? |
| 🏗️ Dependency Disorder | Do dependencies flow in a consistent direction? |
| 🗺️ Domain Model Distortion | Does the code faithfully represent the domain? |

Every finding follows the same chain: Symptom → Source (book + chapter) → Consequence → Remedy
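
To make that chain concrete as data, here is a rough sketch of how a single finding could be modeled. The class and field names are my own illustration (brooks-lint does not expose this structure); they simply mirror the four parts of the chain:

```python
from dataclasses import dataclass, field

# Illustrative only: the fields mirror the Symptom -> Source -> Consequence -> Remedy chain.
@dataclass
class Finding:
    severity: str          # "critical" | "warning" | "suggestion"
    risk: str              # e.g. "Change Propagation"
    symptom: str           # what the code does today
    sources: list[str] = field(default_factory=list)  # book + chapter citations
    consequence: str = ""  # why it will hurt later
    remedy: str = ""       # the concrete fix to apply

example = Finding(
    severity="critical",
    risk="Change Propagation",
    symptom="update_profile mixes profile updates, notifications, and loyalty points",
    sources=["Fowler, Refactoring: Divergent Change"],
    consequence="A loyalty formula change can break email notifications",
    remedy="Extract NotificationService and LoyaltyService",
)
```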

The Six Test-Suite Decay Risks (New in v0.5)

brooks-lint now also audits your test suite against six test-suite decay risks sourced from xUnit Test Patterns, The Art of Unit Testing, How Google Tests Software, and Working Effectively with Legacy Code:

| Risk | Diagnostic Question |
| --- | --- |
| 🔍 Test Obscurity | Can you understand what this test verifies at a glance? |
| 🧱 Test Brittleness | Does this test break when unrelated implementation details change? |
| 📋 Test Duplication | Are the same scenarios covered in multiple places? |
| 🎭 Mock Abuse | Are mocks hiding real design problems? |
| 📊 Coverage Illusion | Does high coverage give false confidence? |
| 🏗️ Architecture Mismatch | Do tests reflect the production architecture? |
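
To make a couple of these concrete, here is a deliberately trivial bad test. It is my own illustration, not brooks-lint output, and it reuses the same magic-number loyalty formula that appears in the UserService example below. It passes and raises coverage, but trips both Test Obscurity and Coverage Illusion:

```python
def loyalty_points(login_count: int) -> int:
    # Same magic-number formula used inside update_profile further down.
    return login_count * 10 + 500

def test_1():  # Test Obscurity: the name reveals nothing about intent
    assert loyalty_points(3) == 530   # restates 10 and 500 verbatim
    assert loyalty_points(0) == 500   # Coverage Illusion: every line is covered,
                                      # but nothing pins down what 10 or 500 mean
                                      # or which inputs are even valid
```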

What It Looks Like

Given this code:

```python
class UserService:
    def update_profile(self, user_id, name, email, avatar_url):
        user = self.db.query(f"SELECT * FROM users WHERE id = {user_id}")
        user['email'] = email
        if user['email'] != email:  # always False — silent bug
            self.smtp.send(...)
        points = user['login_count'] * 10 + 500
        self.db.execute(f"UPDATE loyalty SET points={points} WHERE user_id={user_id}")
```

brooks-lint produces:

```
Health Score: 28/100

🔴 Change Propagation — Single Method Changes for Four Unrelated Business Reasons
Symptom: update_profile performs profile updates, email notifications, loyalty
         points recalculation, and cache invalidation all in one method body.
Source:  Fowler — Refactoring — Divergent Change
         Hunt & Thomas — The Pragmatic Programmer — Orthogonality
Consequence: Any change to the loyalty formula risks breaking email notifications.
Remedy: Extract NotificationService, LoyaltyService, and UserCacheInvalidator.

🔴 Domain Model Distortion — Silent Logic Bug: Email Notification Never Fires
Symptom: user['email'] = email overwrites the old value before the comparison —
         the condition is always False. The notification is dead code.
Source:  McConnell — Code Complete — Ch. 17: Unusual Control Structures
Consequence: Users are never notified when their email address changes.
Remedy: Capture old_email = user['email'] before any mutation.

(+ 6 more findings including SQL injection, dependency disorder, magic numbers)
```
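
For contrast, here is a minimal sketch of what the first two remedies could look like once applied. The parameterized query and the old_email capture come straight from the findings above; the collaborator names (self.notifications, self.loyalty) and the DB-API-style placeholder are my own assumptions, not brooks-lint output:

```python
class UserService:
    def update_profile(self, user_id, name, email, avatar_url):
        # Parameterized query instead of f-string interpolation (SQL injection remedy).
        user = self.db.query("SELECT * FROM users WHERE id = %s", (user_id,))

        # Capture the old value before mutating, so the change check is meaningful.
        old_email = user['email']
        user['email'] = email
        if old_email != email:
            self.notifications.notify_email_change(user_id, old_email, email)

        # Loyalty recalculation delegated to its own service (Divergent Change remedy).
        self.loyalty.recalculate(user_id, user['login_count'])
```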

Architecture Audit with Dependency Graph (v0.6)

In Mode 2 (/brooks-audit), brooks-lint generates a Mermaid dependency graph color-coded by severity: red = Critical, yellow = Warning, green = clean. It renders natively in GitHub, VS Code, and Notion.
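
For a feel for the output, the generated graph is plain Mermaid source along these lines. The module names here are hypothetical and the specific findings are invented; only the red/yellow/green convention follows the description above:

```mermaid
graph TD
    OrderService --> PaymentGateway
    OrderService --> InventoryRepo
    PaymentGateway --> OrderService

    %% Red = Critical (e.g. a module caught in a dependency cycle),
    %% yellow = Warning, green = clean.
    style OrderService fill:#f87171
    style PaymentGateway fill:#facc15
    style InventoryRepo fill:#4ade80
```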

Four Modes

| Command | Short Form | Action |
| --- | --- | --- |
| /brooks-lint:brooks-review | /brooks-review | PR-level code review |
| /brooks-lint:brooks-audit | /brooks-audit | Architecture audit with Mermaid dependency graph |
| /brooks-lint:brooks-debt | /brooks-debt | Tech debt assessment with prioritized roadmap |
| /brooks-lint:brooks-test | /brooks-test | Test suite health review |

Benchmark Results

Tested across 3 real-world scenarios (PR review, architecture audit, tech debt):

| Criterion | brooks-lint | Plain Claude |
| --- | --- | --- |
| Structured findings | ✅ 100% | ❌ 0% |
| Book citations | ✅ 100% | ❌ 0% |
| Severity labels | ✅ 100% | ❌ 0% |
| Health Score (0–100) | ✅ 100% | ❌ 0% |
| Overall pass rate | 94% | 16% |

The gap isn't what Claude can find — it's what it consistently finds, with traceable evidence and actionable remedies every time.

How It Compares

Where ESLint/Pylint, GitHub Copilot, and plain Claude each cover at most a slice of this, brooks-lint combines a structured diagnosis chain, findings traced to the classic books, architecture-level insights, domain model analysis, zero config with no plugins, and support for any language.

brooks-lint doesn't replace your linter. It catches what linters can't.

Installation

Claude Code (Recommended)

```
/plugin marketplace add hyhmrright/brooks-lint
```

Gemini CLI

```
/extensions install https://github.com/hyhmrright/brooks-lint
```

Codex CLI

```
$brooks-review  # skills trigger automatically on code quality discussions
```

Manual Install

```
cp commands/*.md ~/.claude/commands/
cp -r skills/ ~/.claude/skills/brooks-lint
```

Configuration (v0.7)

Place a .brooks-lint.yaml in your project root to customize behavior:

```yaml
version: 1
disable:
  - T5  # skip coverage metrics check
severity:
  R1: suggestion  # downgrade Cognitive Overload for this domain
ignore:
  - "**/*.generated.*"
  - "**/vendor/**"
```

GitHub: https://github.com/hyhmrright/brooks-lint — MIT licensed, free to use.


AI can help you write code faster, but it can't tell you whether you're building a cathedral or a tar pit. brooks-lint bridges that gap.


If you've used AI for code reviews, I'm curious: what's your biggest frustration with consistency? Drop it in the comments — I'd love to hear what decay risks you're seeing most.

If this was useful, a ❤️ or unicorn helps others find it.
