A drop-in CLAUDE.md for Python projects. Place it at the repository root so AI coding assistants (Claude Code, Cursor, GitHub Copilot, Aider) follow your house rules before they touch a .py file.
Targets Python 3.11+, type-checked with mypy --strict or pyright, tested with pytest, packaged with the src/ layout.
Rule 1: Type hints on every function signature — no exceptions
AI ships untyped helpers because untyped Python "just runs". Three months later you are reading def process(data, opts=None) and you don't know what either argument is.
BAD
def process(data, opts=None):
    out = []
    for x in data:
        if opts and opts.get("upper"):
            x = x.upper()
        out.append(x)
    return out
GOOD
from collections.abc import Iterable
def process(data: Iterable[str], *, upper: bool = False) -> list[str]:
    return [x.upper() if upper else x for x in data]
Why: Types are documentation the type checker enforces. mypy --strict catches the bug before it reaches the test suite.
Rule 2: dataclass (or pydantic.BaseModel) for structured data — never a dict with string keys
AI defaults to {"user_id": 1, "name": "..."} because it scraped a decade of tutorials. The dict has no schema, no autocomplete, and silently accepts typos like usr_id.
BAD
def make_user(name, email):
    return {"name": name, "email": email, "active": True}
u = make_user("Ada", "ada@example.com")
print(u["emial"]) # KeyError at runtime, not import time
GOOD
from dataclasses import dataclass
@dataclass(frozen=True, slots=True)
class User:
    name: str
    email: str
    active: bool = True
u = User(name="Ada", email="ada@example.com")
Why: A dataclass is a typed contract; typos and missing fields fail at construction, not at the database boundary three layers down.
Rule 3: pathlib.Path for filesystem work — never os.path string concatenation
os.path.join(base, "data", filename) is the 2014 idiom AI keeps emitting. It returns a bare string with no methods, so every downstream helper has to re-parse it through yet another os.path call.
BAD
import os
def load(base, name):
    full = os.path.join(base, "data", name)
    if not os.path.exists(full):
        raise FileNotFoundError(full)
    with open(full) as f:
        return f.read()
GOOD
from pathlib import Path
def load(base: Path, name: str) -> str:
    target = base / "data" / name
    if not target.exists():
        raise FileNotFoundError(target)
    return target.read_text(encoding="utf-8")
Why: Path is an object with methods (.exists(), .read_text(), .with_suffix()); strings are not. Every os.path call is a missed method.
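Those methods chain naturally. A quick sketch against a throwaway temp directory (the file names here are just for the demo):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    target = base / "data" / "report.txt"
    target.parent.mkdir(parents=True, exist_ok=True)  # no os.makedirs dance
    target.write_text("42", encoding="utf-8")

    content = target.read_text(encoding="utf-8")
    renamed = target.with_suffix(".bak")  # derive a sibling path, no string surgery
```
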
Rule 4: Context managers for every resource — no manual close(), no try/finally ceremony
Files, locks, DB connections, HTTP sessions, subprocesses, temp dirs — all of them have __enter__/__exit__. AI still writes f = open(...); f.close() and leaks file descriptors when an exception fires between the two.
BAD
f = open("data.json")
data = json.load(f)
f.close() # never reached if json.load raises
GOOD
from pathlib import Path
import json
with Path("data.json").open(encoding="utf-8") as f:
    data = json.load(f)
Why: with guarantees cleanup on normal exit AND on exceptions; manual close() does not.
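The same guarantee extends to your own resources via contextlib. A minimal sketch — the Conn class is a hypothetical stand-in for a real connection — showing that cleanup runs even when the body raises:

```python
from contextlib import contextmanager

class Conn:
    """Stand-in for a real connection; tracks whether it was closed."""
    def __init__(self) -> None:
        self.closed = False
    def close(self) -> None:
        self.closed = True

@contextmanager
def connect():
    conn = Conn()
    try:
        yield conn
    finally:
        conn.close()  # runs on normal exit AND on exceptions

leaked = None
try:
    with connect() as c:
        leaked = c
        raise RuntimeError("boom")  # simulate a failure mid-use
except RuntimeError:
    pass
```
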
Rule 5: No bare except: — catch the exception type you mean, and re-raise what you don't
except: (or except Exception: at module top level) swallows KeyboardInterrupt, SystemExit, and the bug you were supposed to fix. AI uses it because it makes the linter quiet.
BAD
try:
    user = fetch_user(uid)
except:
    user = None  # which error? a 404? a network blip? a bug in fetch_user?
GOOD
import logging
from myapp.errors import UserNotFound
log = logging.getLogger(__name__)
try:
    user = fetch_user(uid)
except UserNotFound:
    user = None
except TimeoutError:
    log.warning("fetch_user timed out for uid=%s", uid)
    raise
Why: Bare except hides bugs and breaks Ctrl-C. Catching specific types makes intent explicit and lets unexpected errors bubble up where they can be fixed.
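The UserNotFound type above implies a small app-specific exception hierarchy. A sketch of how one might look (the names are illustrative, not prescribed by any library):

```python
class AppError(Exception):
    """Base class for errors this application raises on purpose."""

class UserNotFound(AppError):
    def __init__(self, uid: int) -> None:
        super().__init__(f"user {uid} not found")
        self.uid = uid  # carry context the handler can use

def fetch_user(uid: int) -> str:
    raise UserNotFound(uid)  # stand-in for a real lookup

try:
    fetch_user(42)
    result = "found"
except UserNotFound as exc:
    result = f"missing:{exc.uid}"
```

A shared base class lets callers catch "anything this app raises on purpose" without swallowing KeyboardInterrupt or genuine bugs.
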
Rule 6: No mutable default arguments — use None and assign inside
def f(items=[]): items.append(1); return items returns [1], then [1, 1], then [1, 1, 1]. AI writes this every week because it doesn't remember that defaults are evaluated once.
BAD
def add_tag(item, tags=[]):
    tags.append(item)
    return tags
add_tag("a") # ['a']
add_tag("b") # ['a', 'b'] — surprise!
GOOD
def add_tag(item: str, tags: list[str] | None = None) -> list[str]:
    tags = list(tags) if tags is not None else []
    tags.append(item)
    return tags
Why: Default values are evaluated at function definition, not at every call — mutable defaults become shared state across calls.
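You can watch the shared state directly: the default list is stored once, on the function object itself, and every call without an argument mutates that same object:

```python
def add_tag_bad(item, tags=[]):  # the anti-pattern, reproduced for inspection
    tags.append(item)
    return tags

first = add_tag_bad("a")
second = add_tag_bad("b")  # same list object again

# The default lives on the function, evaluated once at definition time:
shared = add_tag_bad.__defaults__[0]
```
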
Rule 7: pydantic.BaseModel at every external boundary — never trust raw JSON
The HTTP handler receives a dict from request.json(), passes it three layers down, and crashes on KeyError deep in business logic. AI plumbs the raw dict because validation feels like ceremony.
BAD
def create_user(payload: dict):
    return User(name=payload["name"], age=int(payload["age"]))
GOOD
from pydantic import BaseModel, EmailStr, Field
class CreateUserRequest(BaseModel):
    name: str = Field(min_length=1, max_length=100)
    email: EmailStr
    age: int = Field(ge=0, le=150)

def create_user(req: CreateUserRequest) -> User:
    return User(name=req.name, email=req.email, age=req.age)
Why: Pydantic validates, coerces, and produces a typed object at the boundary; everything inside the boundary can trust its inputs.
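If pulling in pydantic is not an option, the same boundary discipline can be sketched with a plain dataclass whose __post_init__ validates — a minimal stand-in under that assumption, not a replacement for real schema validation:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CreateUserRequest:
    name: str
    email: str
    age: int

    def __post_init__(self) -> None:
        if not (1 <= len(self.name) <= 100):
            raise ValueError("name must be 1-100 characters")
        if "@" not in self.email:
            raise ValueError("email looks invalid")
        if not (0 <= self.age <= 150):
            raise ValueError("age out of range")

def parse_request(payload: dict) -> CreateUserRequest:
    # The only place raw JSON is touched; everything past here is typed.
    return CreateUserRequest(
        name=str(payload["name"]),
        email=str(payload["email"]),
        age=int(payload["age"]),
    )

ok = parse_request({"name": "Ada", "email": "ada@example.com", "age": "36"})
try:
    parse_request({"name": "", "email": "nope", "age": 1})
    rejected = False
except ValueError:
    rejected = True
```
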
Rule 8: Dependency injection via constructor — never import a singleton inside a function
from myapp.db import db in the middle of a service is how you end up with tests that need a real Postgres to run. AI does this because it's the shortest path to "it works on my machine".
BAD
from myapp.db import db # module-level singleton
def get_user(uid: int) -> User:
    return db.query("SELECT * FROM users WHERE id = %s", uid)
GOOD
from typing import Protocol
class UserRepo(Protocol):
    def get(self, uid: int) -> User: ...

class UserService:
    def __init__(self, repo: UserRepo) -> None:
        self._repo = repo

    def get_user(self, uid: int) -> User:
        return self._repo.get(uid)
Why: Constructor injection makes dependencies visible in the type signature and trivially substitutable in tests — no monkeypatching, no global teardown.
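In a test, the Protocol pays off immediately: any object with the right get method satisfies UserRepo structurally, no database and no inheritance required. A sketch (FakeRepo and the User fields are illustrative):

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class User:
    uid: int
    name: str

class UserRepo(Protocol):
    def get(self, uid: int) -> User: ...

class UserService:
    def __init__(self, repo: UserRepo) -> None:
        self._repo = repo
    def get_user(self, uid: int) -> User:
        return self._repo.get(uid)

class FakeRepo:
    """In-memory stand-in; satisfies UserRepo without subclassing it."""
    def __init__(self, users: dict[int, User]) -> None:
        self._users = users
    def get(self, uid: int) -> User:
        return self._users[uid]

svc = UserService(FakeRepo({1: User(1, "Ada")}))
fetched = svc.get_user(1)
```
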
Rule 9: async def end-to-end — never call sync I/O from inside an async function
A single requests.get(...) inside an async def blocks the entire event loop. AI mixes requests and httpx.AsyncClient because both look like HTTP clients.
BAD
import requests
async def fetch(url: str) -> dict:
    return requests.get(url).json()  # blocks the event loop
GOOD
import httpx
async def fetch(client: httpx.AsyncClient, url: str) -> dict:
    response = await client.get(url)
    response.raise_for_status()
    return response.json()
Why: Async only buys concurrency if every I/O call yields. One sync call inside an async coroutine serializes the whole loop.
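When a sync-only library is unavoidable, the escape hatch is asyncio.to_thread, which runs the blocking call in a worker thread so the loop keeps yielding. A sketch where slow_lookup stands in for any blocking call:

```python
import asyncio
import time

def slow_lookup(key: str) -> str:
    time.sleep(0.05)  # stands in for a blocking HTTP or DB call
    return key.upper()

async def fetch(key: str) -> str:
    # Blocking work is pushed to a thread; the event loop stays free.
    return await asyncio.to_thread(slow_lookup, key)

async def main() -> list[str]:
    # Both lookups overlap instead of running back-to-back.
    return list(await asyncio.gather(fetch("a"), fetch("b")))

results = asyncio.run(main())
```
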
Rule 10: Structured concurrency with asyncio.TaskGroup — never fire-and-forget create_task
AI fires tasks with asyncio.create_task(...) and never awaits them. The task raises, the exception is logged to stderr, and the parent function returns success.
BAD
async def sync_all(users):
    for u in users:
        asyncio.create_task(sync_one(u))  # fire-and-forget; errors lost
GOOD
import asyncio
async def sync_all(users: list[User]) -> None:
    async with asyncio.TaskGroup() as tg:
        for u in users:
            tg.create_task(sync_one(u))
Why: TaskGroup (Python 3.11+) cancels siblings on the first failure and re-raises an ExceptionGroup — orphaned tasks become impossible.
Rule 11: pytest with fixtures and parametrize — no unittest.TestCase, no test classes
AI writes class TestFoo(unittest.TestCase): def test_bar(self): self.assertEqual(...). Pytest doesn't need any of it; the result is half the lines and twice the readability.
BAD
import unittest
class TestSlugify(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_unicode(self):
        self.assertEqual(slugify("Olá Mundo"), "ola-mundo")
GOOD
import pytest
@pytest.mark.parametrize(
    "raw, expected",
    [
        ("Hello World", "hello-world"),
        ("Olá Mundo", "ola-mundo"),
        ("", ""),
    ],
)
def test_slugify(raw: str, expected: str) -> None:
    assert slugify(raw) == expected
Why: parametrize turns N similar tests into one declarative table; failures report which row failed, not which method.
Rule 12: logging.getLogger(__name__) per module — never print(), never the root logger
print("got user", user) ships to production, fills CloudWatch with noise, and gives the SRE no way to filter it out. AI uses print because the prompt didn't ask for logging.
BAD
def charge(amount: int) -> None:
    print(f"charging {amount}")  # not structured, not leveled, not filterable
GOOD
import logging
log = logging.getLogger(__name__)
def charge(amount_cents: int) -> None:
    log.info("charging amount_cents=%d", amount_cents)
Why: Per-module loggers inherit configuration, support levels and handlers, and produce structured output your log aggregator can parse.
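A minimal check that per-module loggers behave as claimed: attach a handler and every record carries the level, the logger name, and the message args separately — exactly the fields an aggregator filters on. The handler setup is illustrative (in production you'd configure handlers once, at startup):

```python
import logging

log = logging.getLogger("myapp.billing")  # normally logging.getLogger(__name__)
log.setLevel(logging.INFO)

records: list[logging.LogRecord] = []

class ListHandler(logging.Handler):
    """Collects records in memory so we can inspect them."""
    def emit(self, record: logging.LogRecord) -> None:
        records.append(record)

log.addHandler(ListHandler())

def charge(amount_cents: int) -> None:
    log.info("charging amount_cents=%d", amount_cents)

charge(1250)
```
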
Rule 13: src/ layout with pyproject.toml — never code at the repo root, never setup.py
AI scaffolds projects with mypackage/__init__.py next to tests/ at the repo root. Tests import from the working directory by accident, the install is broken, and pip install -e . imports the wrong thing.
BAD
myproject/
├── mypackage/
│ └── __init__.py
├── tests/
└── setup.py
GOOD
myproject/
├── pyproject.toml
├── src/
│ └── mypackage/
│ └── __init__.py
└── tests/
└── test_mypackage.py
# pyproject.toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
[project]
name = "mypackage"
version = "0.1.0"
requires-python = ">=3.11"
[tool.hatch.build.targets.wheel]
packages = ["src/mypackage"]
Why: The src/ layout forces tests to import the installed package, not whatever cwd happens to expose — the same code the user will run.
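The same pyproject.toml can carry the tool configuration the earlier rules assume. A hedged sketch — option names reflect recent releases of pytest, mypy, and ruff; verify them against the versions you actually install:

```toml
[tool.pytest.ini_options]
testpaths = ["tests"]

[tool.mypy]
strict = true
packages = ["mypackage"]

[tool.ruff]
src = ["src"]
```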
Rule 14: Composition over inheritance — Protocol for duck typing, not deep class hierarchies
AI loves class PremiumUser(User): class TrialPremiumUser(PremiumUser):. Three months later you're debugging which __init__ ran in what order, and super().method() resolves to a class nobody remembers writing.
BAD
class User:
    def discount(self) -> float: return 0.0

class PremiumUser(User):
    def discount(self) -> float: return 0.1

class TrialPremium(PremiumUser):
    def discount(self) -> float: return 0.05
GOOD
from dataclasses import dataclass
from typing import Protocol
class DiscountPolicy(Protocol):
    def rate(self) -> float: ...

class FlatDiscount:
    def __init__(self, rate: float) -> None:
        self._rate = rate

    def rate(self) -> float:
        return self._rate

@dataclass
class User:
    name: str
    discount: DiscountPolicy

ada = User("Ada", FlatDiscount(0.1))
Why: Composed objects are independently testable and swappable; deep inheritance couples behavior to type identity and breaks the moment requirements change.
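Swapping behavior is then a one-line change at the construction site. A sketch adding a second, hypothetical policy — it satisfies DiscountPolicy structurally, with no new subclass anywhere:

```python
from dataclasses import dataclass
from typing import Protocol

class DiscountPolicy(Protocol):
    def rate(self) -> float: ...

class FlatDiscount:
    def __init__(self, rate: float) -> None:
        self._rate = rate
    def rate(self) -> float:
        return self._rate

class NoDiscount:
    """Second policy; duck-types against DiscountPolicy, no inheritance."""
    def rate(self) -> float:
        return 0.0

@dataclass
class User:
    name: str
    discount: DiscountPolicy

premium = User("Ada", FlatDiscount(0.1))
trial = User("Bob", NoDiscount())
rates = (premium.discount.rate(), trial.discount.rate())
```
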
Wrapping up
These 14 rules don't replace PEP 8 or the Python docs — they encode the failure modes AI repeats most often in real Python codebases. Type hints over untyped helpers, dataclasses over dicts, pathlib over os.path, context managers over manual cleanup, specific except over bare, no mutable defaults, pydantic at the boundary, DI over import-singletons, async end-to-end, TaskGroup over fire-and-forget, pytest over unittest, per-module loggers over print, src/ layout over root-level packages, and composition over inheritance — that's the difference between Python that ships and Python that gets rewritten in six months.
Drop this file at the root of your repo. The next AI prompt produces Python your future self won't have to apologise for in a code review.
— OliviaCraft · oliviacraft.lat
Want 35+ more production rules across 40+ stacks? → https://oliviacraftlat.gumroad.com/l/skdgt
Original Gist: https://gist.github.com/oliviacraft/8ea9ea2459902e31c5e24da39b534e73