Olivia Craft

Cursor Rules for Python: The Complete Guide to AI-Assisted Python Development

Python is the language where "it works on my machine" hides the longest. The interpreter does not stop you from returning a dict typed as Any, reaching for %s formatting in 2026, catching Exception to keep a worker alive, or writing os.path.join in a codebase that has had pathlib available since 3.4. The code runs. CI is green. A refactor six months later reveals that nothing ever had real types, three of your fetches were blocking the event loop, and a file handle in a for loop has been leaking descriptors since launch.

Then you add an AI assistant.

Cursor and Claude Code were trained on a planet's worth of Python. Most of that Python predates pathlib, half of it predates type hints, and most of it treats asyncio as the interesting section at the back of the book. So when you ask for "a function that reads a config file and fetches from an API," the default output is a synchronous open('config.json'), an os.path dance for the directory, a requests.get inside an async def, and a response shape typed as Dict[str, Any]. The code runs. It's not the Python you would ship.

The fix is not better prompting. It is .cursorrules — a single file checked into the repo that tells the AI what idiomatic modern Python looks like in your codebase.

This is the complete guide. Eight rules, each with the failure mode, the rule that prevents it, and a before/after. A copy-paste .cursorrules template at the end. Use it today.


How Cursor Rules Work for Python Projects

Cursor reads project rules from two locations:

  1. .cursorrules — a single file at the repo root (classic format, still supported)
  2. .cursor/rules/*.mdc — modular rule files with frontmatter (recommended for anything bigger than a script)

For Python I recommend modular rules so that a FastAPI service's async conventions don't bleed into a data pipeline's pandas-heavy rules in the same monorepo:

.cursor/
  rules/
    py-core.mdc          # typing, f-strings, pathlib, comprehensions
    py-async.mdc         # asyncio, await discipline, blocking calls
    py-models.mdc        # Pydantic, dataclasses, NamedTuple
    py-resources.mdc     # context managers, file I/O, connection pools
    py-testing.mdc       # pytest layout, fixtures, mocks

Frontmatter controls when each rule activates:

---
description: Python async patterns for FastAPI services
globs: ["**/*.py"]
alwaysApply: false
---

Now the rules.


Rule 1: Type Hints Everywhere — No Any, Use TypedDict or Protocol

The most common AI failure in Python is not missing type hints — it is fake type hints. Cursor slaps Any on anything it cannot figure out, types a heterogeneous return as dict, and hands you a signature that looks typed and tells you nothing. mypy --strict catches it. mypy without flags does not.

The rule:

Every function and method has parameter and return type hints.
`Any` is banned outside of narrowly scoped adapters; use TypedDict,
Protocol, dataclass, or Pydantic models for structured data.

For structural types use Protocol, not ABCs or duck typing in docstrings.
For JSON-shaped payloads use TypedDict (total=False for optional keys).
For generic containers use the builtin generics (list[T], dict[K, V],
tuple[T, ...]); never typing.List, typing.Dict, typing.Tuple.

Use `from __future__ import annotations` in every module so forward
references resolve lazily and circular imports stop biting.

Before — AI-generated signature with hidden Any:

from typing import Any

def enrich_user(user: dict, sources: list) -> dict:
    result = dict(user)
    for s in sources:
        result.update(s.fetch(user["id"]))
    return result

user is dict[str, Any]. sources could hold anything. The return has no shape. Rename user.id and nothing lights up.

After — same function with the rule applied:

from __future__ import annotations

from typing import Protocol, TypedDict


class User(TypedDict):
    id: str
    email: str
    name: str


class EnrichedUser(User, total=False):
    company: str
    plan: str


class Source(Protocol):
    def fetch(self, user_id: str) -> dict[str, str]: ...


def enrich_user(user: User, sources: list[Source]) -> EnrichedUser:
    result: EnrichedUser = {**user}
    for s in sources:
        result.update(s.fetch(user["id"]))  # type: ignore[typeddict-item]
    return result

The shape is public API. mypy now flags a renamed id field in every caller. Source is a structural type, so any class with a matching fetch signature satisfies it — no inheritance required.
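
To make the structural part concrete, here is a minimal sketch of a class that satisfies Source without importing or subclassing it; ApiSource and its canned return value are illustrative, not part of the original example.

class ApiSource:
    """Satisfies Source structurally: same fetch signature, no inheritance."""

    def fetch(self, user_id: str) -> dict[str, str]:
        # A real implementation would call a service; static data keeps the sketch runnable.
        return {"company": "Acme", "plan": "pro"}


enriched = enrich_user(
    {"id": "u_1", "email": "ada@example.com", "name": "Ada"},
    [ApiSource()],
)
# mypy accepts ApiSource as a Source because the method signatures line up.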


Rule 2: Async Discipline — asyncio, One Event Loop, Never Block It

async def is the spot Cursor is most dangerous. It writes an async function, then calls requests.get inside it, or time.sleep(5), or open(path).read(). Each of those blocks the event loop for the entire duration. One slow call, every coroutine on that loop stalls. Then the AI "fixes" a concurrency test by wrapping asyncio.run inside a running loop, which raises RuntimeError in production and looks fine in a unit test.

The rule:

Inside any `async def`, network, filesystem, and CPU work go through
async-native APIs or `asyncio.to_thread`. Never call requests, urllib,
time.sleep, open().read(), subprocess.run, or any sync DB driver from
async code.

Use httpx.AsyncClient, aiofiles, asyncio.sleep, asyncio.create_subprocess_exec,
and the async variants of the DB drivers you use (asyncpg, motor, redis.asyncio).

Run many coroutines with asyncio.gather or asyncio.TaskGroup (3.11+,
preferred); never a list of bare awaits in a loop when they can run
in parallel. Bound concurrency with asyncio.Semaphore.

One event loop per process. The entrypoint is asyncio.run(main()).
Never nest asyncio.run, never call loop.run_until_complete inside
async code, never create a new loop with new_event_loop unless you
are the framework author.

Before — blocking calls inside async, serial awaits:

import time
import requests


async def fetch_prices(symbols: list[str]) -> dict[str, float]:
    prices = {}
    for s in symbols:
        r = requests.get(f"https://api.example.com/price/{s}")
        prices[s] = r.json()["price"]
        time.sleep(0.1)  # "rate limit"
    return prices

requests.get blocks. time.sleep blocks. The for loop serializes everything. The async def is theatre.

After — async-native client, bounded concurrency, TaskGroup:

from __future__ import annotations

import asyncio

import httpx


async def _fetch_one(
    client: httpx.AsyncClient, sem: asyncio.Semaphore, symbol: str
) -> tuple[str, float]:
    async with sem:
        r = await client.get(f"https://api.example.com/price/{symbol}")
        r.raise_for_status()
        return symbol, r.json()["price"]


async def fetch_prices(symbols: list[str]) -> dict[str, float]:
    sem = asyncio.Semaphore(10)
    async with httpx.AsyncClient(timeout=5.0) as client:
        async with asyncio.TaskGroup() as tg:
            tasks = [tg.create_task(_fetch_one(client, sem, s)) for s in symbols]
    return dict(t.result() for t in tasks)

Concurrent, bounded, cancelled cleanly on first failure (that is what TaskGroup gives you), and the loop is never blocked.
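
And a minimal entrypoint matching the rule's "one event loop per process" clause; the symbol list is just example data.

import asyncio


async def main() -> None:
    prices = await fetch_prices(["AAPL", "MSFT", "NVDA"])
    print(prices)


if __name__ == "__main__":
    # One event loop per process, created exactly once at the entrypoint.
    asyncio.run(main())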


Rule 3: Pydantic Models Over Raw Dicts at Every Boundary

JSON comes into your program as a dict. It should not stay one. Cursor will happily pipe a raw payload through ten function signatures as dict[str, Any], and the one time a field is missing you get a KeyError in a background worker two days after deploy. Pydantic v2 parses, validates, coerces, and gives you a typed object you can pass around.

The rule:

Every external payload (HTTP request/response, queue message, config
file, CLI input, file you did not write in the same process) crosses
the boundary through a Pydantic v2 BaseModel. Never pass raw dicts or
JSON strings around the domain layer.

Use model_validate to parse untrusted input, model_dump for
serialization, Field(..., description=...) for schema docs, and
ConfigDict(extra='forbid') so typos raise instead of silently
dropping fields.

Datetimes are datetime objects with tzinfo=UTC, never ISO strings.
Money is Decimal, never float. Enum fields are Python Enum subclasses.

Dataclasses are for internal, trusted values (DTOs that never touch
the network). Pydantic is for anything crossing a process boundary.

Before — raw dict across the boundary:

def create_order(payload: dict) -> dict:
    user_id = payload["user_id"]
    items = payload["items"]
    total = sum(i["price"] * i["qty"] for i in items)
    return {"id": "ord_123", "user_id": user_id, "total": total}

Missing items? KeyError. Integer price? Silent. String qty? TypeError during sum.

After — parsed, validated, typed:

from __future__ import annotations

from decimal import Decimal

from pydantic import BaseModel, ConfigDict, Field


class OrderItem(BaseModel):
    model_config = ConfigDict(extra="forbid")
    sku: str
    price: Decimal = Field(gt=0)
    qty: int = Field(gt=0)


class CreateOrder(BaseModel):
    model_config = ConfigDict(extra="forbid")
    user_id: str
    items: list[OrderItem] = Field(min_length=1)


class Order(BaseModel):
    id: str
    user_id: str
    total: Decimal


def create_order(payload: CreateOrder) -> Order:
    total = sum((i.price * i.qty for i in payload.items), start=Decimal(0))
    return Order(id="ord_123", user_id=payload.user_id, total=total)

extra="forbid" means a typo in user_id raises at the boundary, not fifty lines in. Decimal means you do not lose a cent to float rounding. Invalid shapes produce a structured ValidationError the framework can turn into a 422.


Rule 4: f-strings Only — No %, No .format, No String Concatenation

Python has had f-strings since 3.6 and they are faster, shorter, and more readable than anything that came before. AI keeps emitting "Hello, %s" % name and "path/%s/%s" % (user, file) because the corpus is full of pre-3.6 Python. Python 3.14 also adds t-strings (PEP 750) for safe templating — use them where structured interpolation matters (SQL, shell, HTML).

The rule:

Use f-strings for all string interpolation. Never the `%` operator,
never str.format, never `"a" + str(x)` concatenation.

Format specifiers go inside the f-string: f"{price:.2f}",
f"{ts:%Y-%m-%d}", f"{value=}" for debug logging.

For multi-line templates prefer inspect.cleandoc or textwrap.dedent
with triple-quoted f-strings over string concatenation across lines.

For logging, pass the format string and args separately to the logger
(logger.info("user %s logged in", user_id))  the logger only
interpolates if the level is enabled. f-strings in log calls always
interpolate.

That last clause is the subtle one. f-strings beat % everywhere except logging, where you want lazy evaluation. Most style guides leave this out and AI writes logger.debug(f"...") in a tight loop, burning CPU formatting messages nobody will ever see.
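
Rule 4 is short enough that one combined before/after covers it; the logger and values below are illustrative.

import logging

log = logging.getLogger(__name__)
user_id, price = "u_42", 19.999

# Before: % formatting, concatenation, and an eager f-string inside a log call.
message = "User " + user_id + " paid %s" % price
log.debug(f"charged {user_id} {price:.2f}")  # formats even when DEBUG is disabled

# After: f-string for interpolation, lazy args for logging.
message = f"User {user_id} paid {price:.2f}"
log.debug("charged %s %.2f", user_id, price)  # formats only if the level is enabled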


Rule 5: pathlib Over os.path — Always

os.path.join, os.path.exists, os.path.dirname, os.path.splitext — four function calls to do what pathlib.Path does with one object. Cursor still writes os.path because the training corpus skews heavily toward pre-3.4 Python. pathlib is not just shorter; it gives you a distinct type (Path vs str), and it is cross-platform and composable.

The rule:

Use pathlib.Path for all filesystem paths. Never os.path, never
manual string concatenation with "/", never os.sep.

Read with Path.read_text(encoding='utf-8') / read_bytes(),
write with write_text / write_bytes. Always pass encoding explicitly
for text.

Use / for joining, .parent for directories, .with_suffix for
extension changes, .glob / .rglob for traversal, .mkdir(parents=True,
exist_ok=True) for directory creation.

Function signatures take Path (not str). Accept both with
Path(argument) normalization at the boundary if a CLI hands you
a string.

Before — os.path spaghetti:

import os

def load_configs(root):
    configs = {}
    for f in os.listdir(root):
        if f.endswith(".yaml"):
            full = os.path.join(root, f)
            with open(full) as fh:
                name = os.path.splitext(f)[0]
                configs[name] = fh.read()
    return configs

After — pathlib, typed, no untyped open:

from __future__ import annotations

from pathlib import Path


def load_configs(root: Path) -> dict[str, str]:
    return {
        p.stem: p.read_text(encoding="utf-8")
        for p in root.glob("*.yaml")
    }

Five lines to one expression, cross-platform, typed, with explicit encoding.
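
The rule's write-side methods deserve one quick sketch too; save_report and the directory layout are hypothetical.

from pathlib import Path


def save_report(root: Path, name: str, body: str) -> Path:
    out_dir = root / "reports"
    out_dir.mkdir(parents=True, exist_ok=True)    # idempotent directory creation
    target = (out_dir / name).with_suffix(".md")  # swap whatever extension came in
    target.write_text(body, encoding="utf-8")     # explicit encoding, always
    return target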


Rule 6: Context Managers for Every Resource — Files, Locks, Clients, Transactions

AI loves a bare open(path) followed by a .read(). It works. It also leaks a file descriptor if anything raises between open and the implicit GC close. Multiply across a web request and you hit the per-process file handle limit before lunch. Same pattern with database transactions, thread locks, HTTP sessions, and tempfile.NamedTemporaryFile.

The rule:

Every resource that needs cleanup is acquired inside a `with` (sync)
or `async with` (async) statement. No raw open, no manual close, no
try/finally when a context manager exists.

For multiple resources, use a single with statement with parenthesized
context managers (3.10+):
    with (open(a) as fa, open(b) as fb):
        ...

For optional or dynamic resources, use contextlib.ExitStack
(or AsyncExitStack). For custom resources, write a
@contextlib.contextmanager generator; never a class with
__enter__/__exit__ unless you also need state between enter and exit.

Database transactions: with conn.transaction(): ...
HTTP clients: with httpx.Client() as client: ...
Background tasks: async with asyncio.TaskGroup() as tg: ...

Before — descriptor leak waiting to happen:

def merge_files(paths):
    output = ""
    for p in paths:
        f = open(p)
        output += f.read()
        f.close()
    return output

If any read() raises, the file that was just opened never closes. String concatenation is O(n²).

After — ExitStack for dynamic resources, buffered write:

from __future__ import annotations

import contextlib
from io import StringIO
from pathlib import Path


def merge_files(paths: list[Path]) -> str:
    buf = StringIO()
    with contextlib.ExitStack() as stack:
        for p in paths:
            fh = stack.enter_context(p.open(encoding="utf-8"))
            buf.write(fh.read())
    return buf.getvalue()

Every file closes, including on exception. O(n) concatenation.
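
The rule also asks for @contextlib.contextmanager when a custom resource is needed; here is a minimal sketch, with timed() being a hypothetical example rather than anything from the codebase above.

import contextlib
import time
from collections.abc import Iterator


@contextlib.contextmanager
def timed(label: str) -> Iterator[None]:
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs on success or exception, just like __exit__ would.
        print(f"{label}: {time.perf_counter() - start:.3f}s")


with timed("work"):
    total = sum(range(1_000_000))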


Rule 7: Comprehensions and Generator Expressions Over Loops — When the Loop Is Building Something

AI frequently writes a for loop that appends to a list, when a comprehension is shorter, faster, and clearer. It also materializes a full list comprehension when a generator expression would stream: sum(x for x in xs) never needs the intermediate list. The rule here is about intent: if the loop exists to build a collection, use the right comprehension. If it exists for side effects, keep the for.

The rule:

If a loop's purpose is to build a list/dict/set, use a comprehension.
Never `result = []; for x in xs: result.append(...)`.

Use a generator expression (no brackets) when the caller will iterate
once — sum, any, all, max, min, join, or consumers that stream —
never materialize a list just to throw it away.

Nested comprehensions are fine up to two levels; beyond that, or with
non-trivial filtering, use a named function.

Never comprehensions for side effects. `[print(x) for x in xs]` is a
bug. Use a plain for loop when you need mutation, I/O, or early exit.

dict and set comprehensions are first-class — use them instead of
dict(zip(...)) or dict((k, v) for k, v in ...).

Before — append loop, then another loop for side effects disguised as a comprehension:

def summarize(users):
    names = []
    for u in users:
        if u.active:
            names.append(u.name.upper())
    total = 0
    for u in users:
        total += u.spend
    [log.info(u.name) for u in users]  # side-effect comprehension
    return names, total

After — comprehension, generator, real for loop for side effects:

def summarize(users: list[User]) -> tuple[list[str], Decimal]:
    names = [u.name.upper() for u in users if u.active]
    total = sum((u.spend for u in users), start=Decimal(0))
    for u in users:
        log.info(u.name)
    return names, total

Three patterns, each matched to its intent. sum(... for ...) streams; it never builds an intermediate list.
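
One more sketch for the dict-comprehension clause, since the example above only covers lists and generators; the symbols and prices are example data.

symbols = ["AAPL", "MSFT", "NVDA"]
prices = [189.5, 412.3, 0.0]

# Instead of dict(zip(symbols, prices)): states the shape directly and filters inline.
by_symbol = {s: p for s, p in zip(symbols, prices) if p > 0}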


Rule 8: Dataclasses vs NamedTuple vs Pydantic — Pick the Right One

Cursor reaches for class Foo: def __init__(self, ...) by default. Every argument to __init__ gets assigned to self. It's thirty lines for a value object. Python has @dataclass, NamedTuple, and pydantic.BaseModel, and they do not substitute for each other.

The rule:

For internal, trusted value objects: `@dataclass(frozen=True, slots=True)`.
Frozen unless you have a specific reason to mutate. slots=True when the
type is instantiated hot (smaller memory, faster attribute access).

For small immutable records that act like tuples (coordinates, row
pairs, grouped return values): typing.NamedTuple (not collections.namedtuple).
It tuple-unpacks, supports positional and keyword construction, and
carries type hints.

For anything crossing a process boundary (HTTP, queue, file format,
config): pydantic.BaseModel with ConfigDict(extra='forbid').

Never write __init__ / __eq__ / __repr__ / __hash__ by hand for value
objects; use one of the three above.

Mutable default fields in dataclass use field(default_factory=list);
never default=[] (it is a shared reference and the oldest Python foot
gun in the book).

Before — hand-rolled __init__ with the shared default-list bug:

class Cart:
    def __init__(self, user_id, items=[]):
        self.user_id = user_id
        self.items = items

    def __eq__(self, other):
        return self.user_id == other.user_id and self.items == other.items

    def __repr__(self):
        return f"Cart({self.user_id}, {self.items})"

Cart("a") and Cart("b") share the same items list. Every modification leaks across instances. __eq__ explodes if other is not a Cart. __hash__ is now missing because we defined __eq__.

After — @dataclass, no footguns:

from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True, slots=True)
class Cart:
    user_id: str
    items: list[str] = field(default_factory=list)

Hashable (because frozen=True). Safe defaults. __eq__, __repr__, __init__, __hash__ all generated. Six lines total.
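
For completeness, the NamedTuple branch of the rule looks like this; Point is a stand-in example.

from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float


p = Point(1.5, y=2.0)  # positional and keyword construction
x, y = p               # tuple-unpacks like a plain tuple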


The Complete .cursorrules File

Drop this in the repo root. Cursor picks it up automatically; Claude Code can be pointed at it from CLAUDE.md. To split it into .cursor/rules/*.mdc, put one rule per file; the headings below map directly onto that layout.

# Python — Production Patterns

## Typing
- Every function and method has parameter and return type hints.
- `Any` is banned outside narrow adapters. Use TypedDict / Protocol /
  dataclass / Pydantic for structured data.
- Builtin generics: list[T], dict[K, V], tuple[T, ...]. Never typing.List.
- `from __future__ import annotations` in every module.

## Async
- No blocking calls inside async def: no requests, no time.sleep,
  no open().read(), no sync DB drivers.
- Use httpx.AsyncClient, aiofiles, asyncio.sleep, async DB drivers.
- asyncio.TaskGroup (3.11+) for parallel work; Semaphore for concurrency bounds.
- One event loop per process. Entry is asyncio.run(main()). Never nest.

## Data Models
- External payloads cross the boundary through Pydantic v2 BaseModel.
  model_validate to parse, model_dump to serialize.
- ConfigDict(extra='forbid') so typos raise.
- Datetimes with tzinfo=UTC. Money as Decimal. Never float money, never
  ISO string datetimes in the domain layer.

## Strings
- f-strings for all interpolation. No `%`, no .format, no "a" + str(x).
- Format specs inside: f"{price:.2f}", f"{value=}".
- Logging uses lazy args: logger.info("x=%s", x) — never f-strings in
  log calls.

## Paths
- pathlib.Path for all filesystem paths. Never os.path, never manual "/".
- Path.read_text(encoding='utf-8') / write_text with explicit encoding.
- Function signatures take Path, not str.

## Resources
- Every resource acquired with `with` / `async with`. No bare open,
  no manual close.
- Parenthesized multi-context managers for >1 resource.
- contextlib.ExitStack (AsyncExitStack) for dynamic sets.
- @contextmanager generators for custom resources — class-based only
  when state needs to live between enter/exit.

## Collections
- Comprehensions when the loop builds a list/dict/set. No append loops.
- Generator expressions for one-shot consumers (sum, any, all, join).
- Never comprehensions for side effects. Use plain for for I/O / mutation.
- Max two levels of nesting; extract a named function beyond that.

## Value Objects
- Internal immutable: @dataclass(frozen=True, slots=True).
- Positional tuple-like: typing.NamedTuple.
- Boundary-crossing: pydantic.BaseModel with ConfigDict(extra='forbid').
- Never hand-rolled __init__/__eq__/__repr__ for value objects.
- Mutable defaults: field(default_factory=list). Never default=[].

Two End-to-End Examples

Example 1: "Write a function that loads a config file and hits an API with it."

Without rules:

import json, os, requests

def sync_user(user_id):
    path = os.path.join(os.path.dirname(__file__), "config.json")
    with open(path) as f:
        config = json.load(f)
    r = requests.get(config["api_url"] + "/users/" + user_id)
    return r.json()

Sync in a function that will be called from an async handler. os.path. String concat for the URL. config is dict[str, Any]. No error handling, no timeout.

With rules:

from __future__ import annotations

from pathlib import Path

import httpx
from pydantic import BaseModel, ConfigDict, HttpUrl


class Config(BaseModel):
    model_config = ConfigDict(extra="forbid")
    api_url: HttpUrl


class UserDTO(BaseModel):
    id: str
    email: str


def load_config(path: Path) -> Config:
    # Plain sync: nothing here awaits. Call it before the loop starts (or via asyncio.to_thread).
    return Config.model_validate_json(path.read_text(encoding="utf-8"))


async def sync_user(client: httpx.AsyncClient, config: Config, user_id: str) -> UserDTO:
    r = await client.get(f"{config.api_url}users/{user_id}", timeout=5.0)
    r.raise_for_status()
    return UserDTO.model_validate_json(r.content)

Typed config. Typed response. Async client injected (testable). Explicit timeout. f-string for the URL.
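
A plausible way to wire these together, reusing the imports from the block above and assuming load_config stays synchronous and main() is the process entrypoint:

import asyncio


async def main() -> None:
    config = load_config(Path("config.json"))
    async with httpx.AsyncClient() as client:
        user = await sync_user(client, config, "u_123")
        print(user.email)


if __name__ == "__main__":
    asyncio.run(main())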

Example 2: "Write a worker that processes a queue of jobs concurrently."

Without rules:

import asyncio, time

async def worker(jobs):
    results = []
    for j in jobs:
        time.sleep(0.1)
        results.append({"id": j["id"], "ok": True})
    return results

Three rules broken in seven lines: time.sleep blocks the loop, the for serializes, j and the return are dict[str, Any].

With rules:

from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass(frozen=True, slots=True)
class Job:
    id: str


@dataclass(frozen=True, slots=True)
class JobResult:
    id: str
    ok: bool


async def _process(sem: asyncio.Semaphore, job: Job) -> JobResult:
    async with sem:
        await asyncio.sleep(0.1)
        return JobResult(id=job.id, ok=True)


async def worker(jobs: list[Job]) -> list[JobResult]:
    sem = asyncio.Semaphore(20)
    async with asyncio.TaskGroup() as tg:
        tasks = [tg.create_task(_process(sem, j)) for j in jobs]
    return [t.result() for t in tasks]

Bounded concurrency. Non-blocking sleep. Typed job and result. Cancels cleanly on first failure.


Get the Full Pack

These eight rules cover the Python patterns where AI assistants consistently reach for the wrong idiom. Drop them into .cursorrules and the next prompt you write will look different — typed, async-correct, resource-safe Python, without having to re-prompt.

If you want the expanded pack — these eight plus rules for FastAPI routing, SQLAlchemy 2.0 sessions, pytest fixtures and parametrization, structlog / structured logging, Celery task shape, and the data-pipeline rules I use on pandas/polars code — it is bundled in Cursor Rules Pack v2 ($27, one payment, lifetime updates). Drop it in your repo, stop fighting your AI, ship Python you would actually merge.
