In February 2025, Andrej Karpathy coined "vibe coding" to describe programming's new reality: give in to the vibes, accept all changes, "forget that the code even exists." He called it "not too bad for throwaway weekend projects." But for production systems? That's where the trouble starts.
I've watched AI-generated codebases accumulate the same mess developers spent decades learning to avoid—duplication everywhere, inconsistent naming, missing edge cases. Then it hit me: these are exactly the problems Robert C. Martin warned about in Clean Code almost two decades ago.
So I went back to the book, specifically Chapter 17's catalog of 66 code smells and heuristics. These aren't just relevant to AI coding—they're more relevant. AI makes exactly the mistakes Uncle Bob warned us about, just faster and at scale.
The solution? Skills—instruction files that AI agents read before writing code. I've translated Clean Code's complete catalog into Python skills you can use today. They work in Google's Antigravity IDE, Anthropic's Claude Code, and anywhere that supports the Agent Skills standard.
Let me show you why we need this, and how to implement it.
Even Linus Torvalds Vibe Codes (Sometimes)
In January 2026, Linus Torvalds revealed a side project called AudioNoise—a digital audio effects simulator he'd been tinkering with over the holidays. The Python visualizer, he noted, was "basically written by vibe-coding."
In his own words from the repo:
"I know more about analog filters—and that's not saying much—than I do about python. It started out as my typical 'google and do the monkey-see-monkey-do' kind of programming, but then I cut out the middle-man—me—and just used Google Antigravity to do the audio sample visualizer."
The Hacker News discussion revealed two camps. Some saw it as validation: "It's official, vibe coding is legit." Others noted the crucial context: Torvalds used AI for the part he lacks expertise in (Python visualization) while hand-coding the parts he knows (C and digital signal processing).
One commenter nailed it: "There's a big difference between vibe-coding an entire project and having an AI build a component that you lack competency for."
Another observation cut deeper: "If anyone on the planet knows how to do vibe coding right, it's him"—because Torvalds spent decades mastering code review. He can spot bad code instantly. Most of us can't.
But here's what's telling: Torvalds wrote tests for his hand-coded C—numerical accuracy checks for the DSP primitives he understands. The vibe-coded Python visualizer? No tests, no type hints, and a duplicated function definition that slipped right through. The same four-line method appears twice in a row—the first a do-nothing fragment, the second the real implementation. It's textbook "Accept All, don't read the diffs." The code runs fine (Python silently overwrites the first definition), but it's exactly the kind of dead code that accumulates into maintenance nightmares.
This works for Torvalds' toy project precisely because it's a throwaway learning exercise. The moment that visualizer needs to be production code, those missing guardrails become technical debt.
The same week, Torvalds rejected "AI slop" submissions to the Linux kernel, arguing that documentation telling people not to submit garbage won't help because "the people who would submit it won't read the documentation anyway."
The lesson isn't that vibe coding is bad. It's that context matters. Skills let you define when to enforce rigor and when to let the vibes flow.
The Data: AI Code Quality Is Getting Worse
Google's DORA Report found AI adoption shows a negative relationship with software delivery stability. The 2025 report's central finding: "AI doesn't fix a team; it amplifies what's already there." Without robust control systems—strong testing, mature practices, fast feedback loops—increased AI-generated code leads to instability. Skills are exactly those control systems, encoded as instructions.
Carnegie Mellon researchers analyzed 807 GitHub repositories after Cursor adoption: +30% static analysis warnings, +41% code complexity. The speed gains were transient; the quality problems compounded.
GitClear's analysis of 211 million lines of code from Google, Microsoft, Meta, and enterprise repositories found code duplication increased 4x with AI adoption. For the first time in their dataset, copy/pasted code exceeded refactored code.
Even Anthropic's Agentic Coding Trends Report shows the gap: developers use AI in roughly 60% of their work, but can fully delegate only 0-20% of tasks. The rest requires "thoughtful setup, active supervision, and human judgment."
That gap—between what AI touches and what AI can own—is exactly what skills address. The setup is the skill. The supervision is the rules.
The Pattern: AI Recreates Classic Code Smells
The research consistently identifies the same failure patterns. Here's how they map to specific Clean Code violations:
Naming and Consistency Problems
- Inconsistent variable names across similar functions
- Vague names like `data`, `tmp`, `proc`
- Mixing naming conventions (camelCase and snake_case)
- Clean Code rules: N1 (descriptive names), G11 (consistency), G24 (conventions)
Code Duplication
- Copy/paste instead of extracting shared logic
- Same calculation appearing in multiple places
- Pattern repetition that should be abstracted
- Clean Code rule: G5 (DRY - Don't Repeat Yourself)
Missing Safety Checks
- No validation of input boundaries
- Assumptions about data structure without verification
- Missing null/None checks
- Clean Code rules: G3 (boundary conditions), G4 (don't override safeties), G26 (be precise)
Readability Issues
- Magic numbers without explanation (what does 86400 mean?)
- Unused variables cluttering code
- Functions mixing multiple abstraction levels
- Clean Code rules: G12 (remove clutter), G16 (no obscured intent), G34 (single abstraction level)
Design and Performance Problems
- Functions doing multiple things at once
- Exposing internal data unnecessarily
- Nested loops that could be optimized
- Clean Code rules: G8 (minimize public interface), G30 (functions do one thing)
These aren't arbitrary style preferences—they're the exact problems that make code hard to maintain, debug, and extend. The skills we'll build enforce these rules automatically.
The fix isn't to stop using AI. It's to give AI the explicit rules it needs to follow.
That's what skills do.
What Are Skills?
Skills are markdown files containing domain-specific instructions that AI agents read before working on your code. They follow the Agent Skills open standard and work in Google Antigravity, Anthropic's Claude Code, and other compatible agents.
The architecture is called Progressive Disclosure. Instead of dumping every instruction into the agent's context at once (causing what Antigravity's docs call "Context Saturation"), skills work in layers:
- Discovery: The agent sees only a lightweight menu of skill names and descriptions
- Activation: When your request matches a skill's description, the full instructions load
- Execution: Scripts and templates are read only when the task requires them
This keeps the agent fast and focused. It's not thinking about database migrations when you're writing a React component.
The format is simple:
```markdown
---
name: skill-name
description: When this skill should activate
---

# Skill Title

Your instructions, examples, and rules here.
```
The description field is crucial—it's the trigger phrase. The agent semantically matches your request against all available skill descriptions to decide which ones to load. "Enforces function best practices" is vague. "Use when writing or refactoring Python functions" tells the agent exactly when to activate.
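Under the hood, discovery can be as simple as reading only that frontmatter. Here's a minimal sketch of the pattern (my own illustration, not Antigravity's or Claude Code's actual implementation):

```python
from pathlib import Path

def discover_skills(skills_dir: str) -> dict[str, str]:
    """Discovery layer: read only each SKILL.md's frontmatter, never the body."""
    menu = {}
    for skill_file in Path(skills_dir).glob("*/SKILL.md"):
        lines = skill_file.read_text().splitlines()
        meta = {}
        if lines and lines[0] == "---":
            for line in lines[1:]:
                if line == "---":  # end of frontmatter; stop before the body
                    break
                key, _, value = line.partition(":")
                meta[key.strip()] = value.strip()
        menu[meta.get("name", skill_file.parent.name)] = meta.get("description", "")
    return menu

# The agent matches your request against this lightweight menu;
# full instructions load only for skills whose description fits.
print(discover_skills(".agent/skills"))
```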
Skills can do far more than enforce coding standards—the community has built skills for Stripe integration, Metasploit security testing, voice agents, and even multi-agent startup automation. This article focuses on one specific use case: encoding Clean Code principles.
Let me show you how to translate Clean Code's catalog into working skills.
Building the Skills: Three Examples
Rather than catalog all 66 rules exhaustively, I'll show you three critical categories in detail. The complete implementation is at the end.
1. Comments (C1-C5): Code Should Explain Itself
Uncle Bob is famously skeptical of comments—not because documentation is bad, but because comments rot: the code gets updated and the comments describing it don't.
File Reference: `clean-comments/SKILL.md`

```markdown
---
name: clean-comments
description: Use when writing, fixing, editing, or reviewing Python comments and docstrings. Enforces Clean Code principles—no metadata, no redundancy, no commented-out code.
---

# Clean Comments

## C1: No Inappropriate Information

Comments shouldn't hold metadata. Use Git for author names, change history,
ticket numbers, and dates. Comments are for technical notes about code only.

## C2: Delete Obsolete Comments

If a comment describes code that no longer exists or works differently,
delete it immediately. Stale comments become "floating islands of
irrelevance and misdirection."

## C3: No Redundant Comments

# Bad - the code already says this
i += 1  # increment i
user.save()  # save the user

# Good - explains WHY, not WHAT
i += 1  # compensate for zero-indexing in display

## C4: Write Comments Well

If a comment is worth writing, write it well:

- Choose words carefully
- Use correct grammar
- Don't ramble or state the obvious
- Be brief

## C5: Never Commit Commented-Out Code

# DELETE THIS - it's an abomination
# def old_calculate_tax(income):
#     return income * 0.15

Who knows how old it is? Who knows if it's meaningful? Delete it.
Git remembers everything.

## The Goal

The best comment is the code itself. If you need a comment to explain
what code does, refactor first, comment last.
```
2. Functions (F1-F4): Small, Focused, Obvious
Functions should do one thing, do it well, and have an obvious purpose.
File Reference: `clean-functions/SKILL.md`

```markdown
---
name: clean-functions
description: Use when writing or refactoring Python functions. Enforces Clean Code principles—maximum 3 arguments, single responsibility, no flag parameters.
---

# Clean Functions

## F1: Too Many Arguments (Maximum 3)

# Bad - too many parameters
def create_user(name, email, age, country, timezone, language, newsletter):
    ...

# Good - use a dataclass or dict
@dataclass
class UserData:
    name: str
    email: str
    age: int
    country: str
    timezone: str
    language: str
    newsletter: bool

def create_user(data: UserData):
    ...

More than 3 arguments means your function is doing too much or needs
a data structure.

## F2: No Output Arguments

Don't modify arguments as side effects. Return values instead.

# Bad - modifies argument
def append_footer(report: Report) -> None:
    report.append("\n---\nGenerated by System")

# Good - returns new value
def with_footer(report: Report) -> Report:
    return report + "\n---\nGenerated by System"

## F3: No Flag Arguments

Boolean flags mean your function does at least two things.

# Bad - function does two different things
def render(is_test: bool):
    if is_test:
        render_test_page()
    else:
        render_production_page()

# Good - split into two functions
def render_test_page(): ...
def render_production_page(): ...

## F4: Delete Dead Functions

If it's not called, delete it. No "just in case" code. Git preserves history.
```
3. General Principles (G1-G36): The Core Rules
These are the fundamental patterns that separate clean code from legacy nightmares.
File Reference: `clean-general/SKILL.md`

```markdown
---
name: clean-general
description: Use when reviewing Python code quality. Enforces Clean Code's core principles—DRY, single responsibility, clear intent, no magic numbers, proper abstractions.
---

# General Clean Code Principles

## Critical Rules

**G5: DRY (Don't Repeat Yourself)**

Every piece of knowledge has one authoritative representation.

# Bad - duplication
tax_rate = 0.0825
ca_total = subtotal * 1.0825
ny_total = subtotal * 1.07

# Good - single source of truth
TAX_RATES = {"CA": 0.0825, "NY": 0.07}

def calculate_total(subtotal: float, state: str) -> float:
    return subtotal * (1 + TAX_RATES[state])

**G16: No Obscured Intent**

Don't be clever. Be clear.

# Bad - what does this do?
return (x & 0x0F) << 4 | (y & 0x0F)

# Good - obvious intent
return pack_coordinates(x, y)

**G23: Prefer Polymorphism to If/Else**

# Bad - will grow forever
def calculate_pay(employee):
    if employee.type == "SALARIED":
        return employee.salary
    elif employee.type == "HOURLY":
        return employee.hours * employee.rate
    elif employee.type == "COMMISSIONED":
        return employee.base + employee.commission

# Good - open/closed principle
class SalariedEmployee:
    def calculate_pay(self): return self.salary

class HourlyEmployee:
    def calculate_pay(self): return self.hours * self.rate

class CommissionedEmployee:
    def calculate_pay(self): return self.base + self.commission

**G25: Replace Magic Numbers with Named Constants**

# Bad
if elapsed_time > 86400:
    ...

# Good
SECONDS_PER_DAY = 86400
if elapsed_time > SECONDS_PER_DAY:
    ...

**G30: Functions Should Do One Thing**

If you can extract another function, your function does more than one thing.

**G36: Law of Demeter (Avoid Train Wrecks)**

# Bad - reaching through multiple objects
output_dir = context.options.scratch_dir.absolute_path

# Good - one dot
output_dir = context.get_scratch_dir()

## Enforcement Checklist

When reviewing AI-generated code, verify:

- [ ] No duplication (G5)
- [ ] Clear intent, no magic numbers (G16, G25)
- [ ] Polymorphism over conditionals (G23)
- [ ] Functions do one thing (G30)
- [ ] No Law of Demeter violations (G36)
- [ ] Boundary conditions handled (G3)
- [ ] Dead code removed (G9)
```
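The last two checklist items have no sample in the excerpt above, so here's a quick illustration of G3 (boundary conditions). The example is mine, not the book's:

```python
ADULT_AGE = 18

# Bad - the boundary value itself is misclassified (G3)
def is_adult_off_by_one(age: int) -> bool:
    return age > ADULT_AGE  # someone who is exactly 18 gets False

# Good - the boundary is handled explicitly and tested
def is_adult(age: int) -> bool:
    return age >= ADULT_AGE

assert is_adult(18) and not is_adult(17)  # per G3, test at the boundary
```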
The Complete Catalog
I've translated all 66 rules from Clean Code Chapter 17 into skills covering seven categories:

- Comments (C1-C5): Minimal, accurate commenting
- Environment (E1-E2): One-command build and test
- Functions (F1-F4): Small, focused, obvious
- General (G1-G36): Core principles
- Names (N1-N7): Descriptive, unambiguous, right-sized
- Python (P1-P3): Python-specific rules, replacing the book's Java section
- Tests (T1-T9): Fast, independent, exhaustive
Get the complete skill files:
Clean Code Skills for AI Agents
Teach your AI to write code that doesn't suck.
This repository contains Agent Skills that enforce Robert C. Martin's Clean Code principles. They work with Google Antigravity, Anthropic's Claude Code, and any agent that supports the Agent Skills standard.
Why?
AI generates code fast, but research shows it also generates technical debt fast:
- GitClear: 4x increase in code duplication with AI adoption
- Carnegie Mellon: +30% static analysis warnings, +41% code complexity after Cursor adoption
- Google DORA: Negative relationship between AI adoption and software delivery stability
These skills encode battle-tested solutions to exactly these problems—directly into your AI workflow.
What's Included
| Skill | Description | Rules |
|---|---|---|
| `boy-scout` | Orchestrator—always leave code cleaner than you found it | Coordinates all skills |
| `python-clean-code` | Master skill with all 66 rules | C1-C5, E1-E2, F1-F4, G1-G36, N1-N7, P1-P3, T1-T9 |
| `clean-comments` | Minimal, accurate commenting | C1-C5 |
| `clean-functions` | Small, focused, obvious functions | F1-F4 |
The repo includes:
- `boy-scout`: An orchestrator skill that embodies the Boy Scout Rule—"always leave code cleaner than you found it"—and coordinates the other skills
- `python-clean-code`: A master skill with all 66 rules, plus a quick reference table and anti-patterns cheatsheet
- Individual skills for each category (`clean-comments`, `clean-functions`, `clean-general`, `clean-names`, `clean-tests`)—drop in only what you need
- Installation instructions for Antigravity, Claude Code, and other Agent Skills-compatible tools
How to Use These Skills
Skills sit in a specific place in the agent ecosystem. Rules are passive guardrails that are always on. Skills are agent-triggered—the model decides when to equip them based on your intent. If you're using MCP servers (connections to external tools like GitHub or Postgres), think of MCP as the "hands" and skills as the "brains" that direct them.
For Antigravity
- Create `.agent/skills/` in your project root (or `~/.gemini/antigravity/skills/` for global access)
- Save each skill as a folder with a `SKILL.md` file inside (e.g., `.agent/skills/python-clean-code/SKILL.md`)
- Ask the agent to review or write code—it'll automatically apply the rules when relevant
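After the second step, a project with the master skill and one category skill installed looks like this (a sketch; folder names match the repo's table above):

```
.agent/
└── skills/
    ├── python-clean-code/
    │   └── SKILL.md
    └── clean-functions/
        └── SKILL.md
```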
Global vs Project Skills
- Project-specific: `.agent/skills/`
- Global Antigravity: `~/.gemini/antigravity/skills/`
The agent only loads full skill content when needed, so comprehensive skills don't slow down simple requests.
Going Further
The skills in this article are instruction-only—they tell the agent what to do. For stricter enforcement, you could add a `scripts/` folder with a linter that compatible agents can run automatically, or an `examples/` folder with before/after code samples for few-shot learning. The format supports it; we're just keeping things simple here.
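To make that concrete, here's what such a script could look like: a hypothetical `scripts/find_magic_numbers.py` that flags G25 violations with the standard library's ast module. This is a sketch under those assumptions, not a file from the repo:

```python
"""Flag bare numeric literals (G25: magic numbers). An illustrative sketch."""
import ast
import sys

HARMLESS = {0, 1, -1, 2}  # small values that rarely deserve a name

def find_magic_numbers(source: str) -> list[tuple[int, object]]:
    """Return (line, value) pairs for numeric literals not bound to ALL_CAPS names."""
    tree = ast.parse(source)
    named_constants = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Constant):
            if all(isinstance(t, ast.Name) and t.id.isupper() for t in node.targets):
                named_constants.add(id(node.value))  # defining a constant is fine
    return [
        (node.lineno, node.value)
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant)
        and type(node.value) in (int, float)  # excludes bool, str, None
        and node.value not in HARMLESS
        and id(node) not in named_constants
    ]

if __name__ == "__main__":
    path = sys.argv[1]
    for lineno, value in find_magic_numbers(open(path).read()):
        print(f"{path}:{lineno}: magic number {value} - name it (G25)")
```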
A Real-World Example
Here's code that violates multiple Clean Code rules:
```python
from utils import *  # P1
# Author: John, Modified: 2024-01-15  # C1

def proc(d, t, flag=False):  # N1, F1, F3
    # Process the data  # C3
    x = []  # N1
    for i in d:
        if flag:  # F3
            if i['type'] == 'A':  # G23
                x.append(i['val'] * 1.0825)  # G25
            elif i['type'] == 'B':
                x.append(i['val'] * 1.05)  # G25
        else:
            x.append(i['val'])
    with open(f'/tmp/{t}.json', 'w') as f:  # G6
        json.dump(x, f)
    # Old approach  # C5
    # for item in d:
    #     print(item)
    return x
```
Violations: P1, C1, C3, C5, F1, F3, G6, G23, G25, N1
With the Clean Code skill active, ask your AI agent to refactor this:
```python
import json
from pathlib import Path
from typing import List, Literal
from dataclasses import dataclass

TAX_RATE_CA = 0.0825
TAX_RATE_NY = 0.05

TransactionType = Literal['CA', 'NY']

@dataclass
class Transaction:
    value: float
    type: TransactionType

def apply_tax(transaction: Transaction) -> float:
    """Apply state-specific tax to transaction value."""
    tax_rates = {'CA': TAX_RATE_CA, 'NY': TAX_RATE_NY}
    return transaction.value * (1 + tax_rates[transaction.type])

def process_transactions_with_tax(
    transactions: List[Transaction]
) -> List[float]:
    """Calculate taxed values for all transactions."""
    return [apply_tax(t) for t in transactions]

def process_transactions_without_tax(
    transactions: List[Transaction]
) -> List[float]:
    """Extract raw values from all transactions."""
    return [t.value for t in transactions]

def save_results(values: List[float], output_path: Path) -> None:
    """Save processed values to JSON file."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    with output_path.open('w') as f:
        json.dump(values, f)
```
The refactored version:
- ✅ No wildcard imports (P1)
- ✅ No metadata comments (C1)
- ✅ No redundant comments (C3)
- ✅ No commented-out code (C5)
- ✅ Descriptive names (N1)
- ✅ No flag arguments (F3)
- ✅ Named constants instead of magic numbers (G25)
- ✅ Functions do one thing (G30)
- ✅ Polymorphism through data structure (G23)
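A quick smoke test of the refactored API, using the names defined above (results are approximate due to floating point):

```python
transactions = [
    Transaction(value=100.0, type='CA'),
    Transaction(value=200.0, type='NY'),
]

taxed = process_transactions_with_tax(transactions)       # ~[108.25, 210.0]
untaxed = process_transactions_without_tax(transactions)  # [100.0, 200.0]
save_results(taxed, Path('/tmp/results/taxed.json'))
```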
Anatomy of a Vibe-Coded Script
Remember the duplicated function I mentioned in Torvalds' AudioNoise visualizer? Here it is:
```python
def update_slider_text(self, val):
    """Helper to update slider texts (Width and End Point)."""
    start_val, end_val = val
    width = end_val - start_val

def update_slider_text(self, val):
    """Helper to update slider texts (Width and End Point)."""
    start_val, end_val = val
    width = end_val - start_val

    if self.x_mode == 'Time':
        self.slider.valtext.set_text(f"Window: {start_val:.3f} + {width:.3f} s")
    else:
        self.slider.valtext.set_text(f"Window: {int(start_val)} + {int(width)}")
```
The first definition unpacks values, calculates width, then... returns None. The second definition is the real implementation. Python silently overwrites the first with the second, so the code runs. But it's textbook dead code—Clean Code rule G9: Remove dead code.
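You can reproduce the silent overwrite in a three-line experiment (mine, not from AudioNoise):

```python
def greet() -> str:
    return "first definition"

def greet() -> str:  # quietly replaces the one above; no warning, no error
    return "second definition"

print(greet())  # prints "second definition" - the first is unreachable dead code
```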
With the skill active, an agent refactors the entire 600-line script. The duplicate vanishes, magic numbers become constants, and nested functions get extracted into focused methods:
```python
def update_slider_text(self, val: tuple[float, float]):
    """Update slider text with either time or sample count."""
    start_val, end_val = val
    width = end_val - start_val

    if self.x_mode == 'Time':
        self.slider.valtext.set_text(f"Window: {start_val:.3f} + {width:.3f} s")
    else:
        self.slider.valtext.set_text(f"Window: {int(start_val)} + {int(width)}")
```
The refactored version:
- ✅ Dead code removed (G9)
- ✅ Type hints added (clarity)
- ✅ Single, authoritative definition (G5)
- ✅ Magic numbers extracted to constants (G25)
- ✅ Large methods decomposed (G30)
The full diff shows 600+ lines reduced to ~440—not by removing functionality, but by eliminating duplication and extracting reusable patterns.
Why This Matters Now
Vibe coding isn't going away. AI will get better at generating code, not worse. But "better at generating" doesn't mean "better at maintaining."
The research is clear: AI produces code faster, but that code accumulates technical debt faster too. Without guardrails, we're building tomorrow's legacy systems today.
Uncle Bob's Clean Code principles are almost 20 years old, but they're exactly what we need now. They're not arbitrary style preferences—they're battle-tested solutions to the problems AI recreates at scale.
Skills give you the mechanism to encode these rules directly into your AI workflow. Whether you're using Antigravity, Claude Code, or another agent, the approach is the same: define what clean code means, then let the AI follow the rules.
Your agent doesn't know what good code looks like unless you tell it.
So tell it.
Resources
The Book
- Clean Code by Robert C. Martin: Amazon
Skills Documentation
- Agent Skills Standard — The open standard for AI agent instructions
- Antigravity Skills Guide — Google's official documentation
- Claude Code Agent Skills — Anthropic's implementation
Research Cited
- DORA 2025: AI-Assisted Software Development — Google's findings on AI and delivery stability
- Code Quality After Cursor Adoption — Carnegie Mellon's analysis of 807 repositories
- GitClear 2025 Code Quality Report — 211M lines analyzed
- Agentic Coding Trends — Anthropic's delegation gap analysis
Get the Skills
- Clean Code Skills Repository — All 66 rules as ready-to-use skill files
The future of programming is human intent translated by AI. Make sure the translation preserves quality, not just speed.