You ask Claude to "store the user and their last 10 orders," and 30 seconds later you ship something that looks fine in review:
- Orders embedded in the user document — until a power user has 9,000 of them and you hit the 16 MB document limit at 3 a.m.
- A find({ email }) query with no index: COLLSCAN over 50 million docs on every login.
- new MongoClient(uri) inside the request handler: your Atlas dashboard shows 4,000 connections and the cluster starts dropping new ones.
- db.users.find({ role: req.query.role }), and someone passes { "$ne": null } to dump every user in the database.
The model didn't fail. It pattern-matched on tutorials where MongoDB is a toy with twelve documents and zero ops concerns. Production makes each one a real incident.
A CLAUDE.md at the root of your repo fixes this. Claude Code reads it on every task. Cursor, Aider, and Copilot do the same. Below are four of the thirteen rules I drop into every MongoDB project — full set in the free gist linked at the end.
Rule 1 — Embed for "contains", reference for "relates to"
Why: MongoDB rewards modelling around access patterns, not normalisation. Embed when a child is owned by the parent, fetched with it, and bounded in size. Reference when the data is shared across documents, grows without limit, or is queried independently. AI defaults to one of two failure modes: relational-style normalisation everywhere (leaning on joins that MongoDB only offers through the comparatively expensive $lookup stage), or embedding unbounded arrays that eventually breach the 16 MB document cap.
Bad:
// User doc with unbounded order history embedded.
{
  _id: ObjectId("..."),
  email: "x@y.com",
  orders: [ /* grows forever; heavy users blow the 16 MB cap */ ]
}
Good:
// User: bounded, profile-shaped data embedded.
{
  _id: ObjectId("..."),
  email: "x@y.com",
  addresses: [ /* a handful, bounded, fetched with the user */ ],
  lastOrderSummary: { id: ObjectId("..."), total: 4200, at: ISODate("...") }
}
// Order: separate collection, queryable on its own, indexed by userId.
{
  _id: ObjectId("..."),
  userId: ObjectId("..."),
  items: [...],
  total: 4200,
  createdAt: ISODate("...")
}
Rule for CLAUDE.md:
Embed when the child is owned by the parent, accessed together, and bounded
(known max size, e.g. addresses, settings, last-N summary).
Reference when the child is shared, unbounded, or queried independently.
Any embedded array must have a documented upper bound — unbounded arrays are
rejected in code review. Bounded arrays enforce the cap on every write with `$push` plus the `$each` and `$slice` modifiers.
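One way to enforce the cap at write time is to build the update document in a single helper, so every append also trims the array. This is a minimal sketch: `cappedPush` and the `recentOrders` field are hypothetical names, and the helper only builds the update object, so it runs without a database.

```javascript
// Hypothetical helper: builds an update document that appends one entry to an
// embedded array and trims it to the newest `cap` entries in the same write.
function cappedPush(field, entry, cap) {
  if (!Number.isInteger(cap) || cap <= 0) {
    throw new RangeError("cap must be a positive integer");
  }
  return {
    $push: {
      [field]: {
        $each: [entry], // the $slice modifier is only valid alongside $each
        $slice: -cap,   // a negative value keeps the last `cap` elements
      },
    },
  };
}

// Assumed usage with the Node.js driver (not executed here):
//   await db.collection("users").updateOne({ _id: userId }, cappedPush("recentOrders", summary, 10));
const update = cappedPush("recentOrders", { id: "o-1", total: 4200 }, 10);
console.log(JSON.stringify(update, null, 2));
```

Keeping the cap in one function means the documented upper bound and the enforced one cannot drift apart.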
Rule 2 — Every query needs an index — verify with explain("executionStats")
Why: A query without a supporting index runs as a COLLSCAN: the server reads every document, every time. On a 50 M-doc collection that is seconds of CPU and gigabytes of disk reads per request. AI writes the query, the test suite passes on a 10-doc fixture, and the regression only shows up under production load. Reading the query plan is non-negotiable.
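Because reading the plan is a mechanical check, it can be scripted. The sketch below walks an explain result and reports problems. It assumes the usual `queryPlanner.winningPlan` and `executionStats` fields; `collectStages`, `checkExplain`, and the thresholds are hypothetical, and the sample object is hand-written rather than real server output.

```javascript
// Collect every stage name in a plan tree. Stages nest through `inputStage`
// (one child) or `inputStages` (several children, e.g. under an OR).
function collectStages(plan, out = []) {
  if (!plan || typeof plan !== "object") return out;
  if (typeof plan.stage === "string") out.push(plan.stage);
  collectStages(plan.inputStage, out);
  for (const child of plan.inputStages ?? []) collectStages(child, out);
  return out;
}

// Hypothetical gate: returns a list of problems, empty when the plan looks healthy.
// The thresholds are illustrative parameters, not MongoDB defaults.
function checkExplain(explain, { maxRatio = 2, maxMillis = 50 } = {}) {
  const winning = explain.queryPlanner.winningPlan;
  // Assumption: some server versions wrap the stage tree in a `queryPlan` field.
  const stages = collectStages(winning.queryPlan ?? winning);
  const { totalDocsExamined, nReturned, executionTimeMillis } = explain.executionStats;

  const problems = [];
  if (stages.includes("COLLSCAN")) problems.push("COLLSCAN in winning plan");
  if (totalDocsExamined > Math.max(nReturned, 1) * maxRatio) {
    problems.push(`examined ${totalDocsExamined} docs to return ${nReturned}`);
  }
  if (executionTimeMillis > maxMillis) problems.push(`took ${executionTimeMillis} ms`);
  return problems;
}

// Hand-written sample in the shape of an explain result.
const sample = {
  queryPlanner: { winningPlan: { stage: "COLLSCAN" } },
  executionStats: { totalDocsExamined: 52113004, nReturned: 1, executionTimeMillis: 41000 },
};
console.log(checkExplain(sample));
```

A check like this only means something when it runs against a dataset of realistic size. On a 10-doc fixture every plan looks fast.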
Bad:
// Looks fine. Runs COLLSCAN in production.
const user = await db.collection("users").findOne({ email });
> db.users.find({ email: "x@y.com" }).explain("executionStats")
winningPlan: { stage: "COLLSCAN" }
totalDocsExamined: 52_113_004
nReturned: 1
Good:
// One-time, idempotent, lives next to the model.
await db.collection("users").createIndex({ email: 1 }, { unique: true });
const user = await db.collection("users").findOne({ email });
winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "email_1" } }
totalDocsExamined: 1
nReturned: 1
Rule for CLAUDE.md:
Every query path is backed by an index. Before merging, run
`db.collection.find(query).explain("executionStats")` and confirm:
- the winning plan contains an `IXSCAN` stage (usually as the `inputStage` of a `FETCH`) and no `COLLSCAN`
- `totalDocsExamined` is within ~1× of `nReturned`
- `executionTimeMillis` is < 50 ms on a representative dataset
Compound queries follow ESR: **E**quality, **S**ort, **R**ange — in that field order.
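To make the ESR ordering concrete, here is a minimal sketch that lays out compound index keys in that order. `esrIndexKeys` is a hypothetical helper, and the example query reuses the order fields from Rule 1 (`userId`, `createdAt`, `total`).

```javascript
// Hypothetical helper: lays out compound index keys in ESR order.
// Relies on JavaScript keeping insertion order for non-numeric string keys.
function esrIndexKeys({ equality = [], sort = {}, range = [] }) {
  const keys = {};
  for (const field of equality) keys[field] = 1;                        // E: exact-match fields first
  for (const [field, dir] of Object.entries(sort)) keys[field] = dir;   // S: sort fields, keeping direction
  for (const field of range) if (!(field in keys)) keys[field] = 1;     // R: range-filtered fields last
  return keys;
}

// Query shape this index is meant to serve:
//   find({ userId, total: { $gte: 1000 } }).sort({ createdAt: -1 })
const keys = esrIndexKeys({
  equality: ["userId"],
  sort: { createdAt: -1 },
  range: ["total"],
});
console.log(keys);

// Assumed usage with the Node.js driver (not executed here):
//   await db.collection("orders").createIndex(keys);
```

Putting the range field last lets the index satisfy the sort without an in-memory sort stage, which is the point of the ordering.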
Rule 7 — One MongoClient per process — never per request
Why: MongoClient is an internally pooled object. One instance per process gives you a warm pool of authenticated, TLS-negotiated sockets that handle thousands of ops per second. A fresh client per request opens a new pool every time — TCP handshake, AUTH round-trip, TLS, and on Atlas you'll see "connection limit reached" before you see real traffic. AI generates new MongoClient() inside the handler because each function "needs its own connection."
Bad:
// Express handler: fresh client (and its pool) per request.
app.get("/users/:id", async (req, res) => {
  const client = new MongoClient(process.env.MONGO_URI);
  await client.connect();
  const user = await client.db().collection("users").findOne({ _id: new ObjectId(req.params.id) });
  res.json(user);
  await client.close(); // and here we go again on the next request
});
Good:
// One client per process, instantiated at startup.
// Assumes an ES module (top-level await) and an existing Express `app`.
import { MongoClient, ObjectId } from "mongodb";

const client = new MongoClient(process.env.MONGO_URI, {
  maxPoolSize: 50,
  minPoolSize: 5,
  serverSelectionTimeoutMS: 5_000,
});
await client.connect();
const db = client.db();

app.get("/users/:id", async (req, res) => {
  const user = await db.collection("users").findOne({ _id: new ObjectId(req.params.id) });
  res.json(user);
});

// Close the pool on shutdown so its sockets are released cleanly.
process.on("SIGTERM", async () => { await client.close(); });
Rule for CLAUDE.md:
Exactly one `MongoClient` per process, instantiated at startup, injected via DI/context.
`maxPoolSize` is set explicitly based on expected concurrency — never the implicit default.
Serverless workloads (Lambda, Vercel) cache the client across invocations on the global
scope and use a connection-pooling proxy or Atlas Serverless — short-lived processes
cannot pool effectively on their own.
Health checks verify the existing client is reused, not re-created.
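For the serverless case, caching on the global scope usually means memoising the connection promise so that concurrent cold-start invocations share one client. A minimal sketch follows. `makeClientGetter` and the cache key are hypothetical, and the client factory is injected so the example runs without a database.

```javascript
// Hypothetical helper: returns a getter that creates the client once and then
// hands every caller the same promise. `cache` defaults to the global scope,
// which is the usual way to survive between warm serverless invocations.
function makeClientGetter(createClient, cache = globalThis) {
  const KEY = "__mongoClientPromise"; // hypothetical cache key
  return function getClient() {
    if (!cache[KEY]) {
      cache[KEY] = new Promise((resolve) => resolve(createClient())).catch((err) => {
        delete cache[KEY]; // a failed connect should not poison later invocations
        throw err;
      });
    }
    return cache[KEY];
  };
}

// Assumed real-world factory (not executed here):
//   () => new MongoClient(process.env.MONGO_URI, { maxPoolSize: 10 }).connect()

// Stand-in factory so the sketch runs without a database.
let connects = 0;
const getClient = makeClientGetter(async () => ({ id: ++connects }), {});
getClient();
getClient();
console.log(connects); // 1: the factory ran once despite two calls
```

Caching the promise rather than the resolved client is a deliberate choice: two requests that arrive during the first connect both await the same attempt instead of opening two pools.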
Rule 13 — Never pass user input as a query operator — sanitize or whitelist
Why: MongoDB queries are JSON. If you pass an HTTP body straight into find(), an attacker can submit {"$ne": null} and dump the collection, {"$gt": ""} to bypass auth checks, or {"$where": "..."} to execute arbitrary JavaScript on the server. AI does this all the time because it looks identical to a string-keyed lookup. NoSQL injection is the SQL injection of 2010 — still shipping in 2026.
Bad:
// req.body = { email: { "$ne": null }, password: { "$ne": null } }
// ⇒ both conditions match any document, so the first user in the collection is returned.
const user = await db.collection("users").findOne({
  email: req.body.email,
  password: req.body.password,
});
Good:
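// Reject anything that is not a plain string, then validate shape with a schema layer.
// Assumes Zod: import { z } from "zod";
const { email, password } = z.object({
  email: z.string().email().max(254),
  password: z.string().min(8).max(256),
}).parse(req.body);
// Both values are now guaranteed to be strings, so no operator object reaches the driver.
// (Real code would look the user up by email and verify a password hash, not match plaintext.)
const user = await db.collection("users").findOne({ email, password });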
Rule for CLAUDE.md:
Never spread an unvalidated request body into a query. All inputs are parsed
through a schema (Zod, Yup, JSON Schema, Pydantic) that constrains each field
to a primitive type before it reaches the driver.
Field names in projections, sorts, and filters come from a whitelist — never
from `req.query` or `req.body` directly.
`$where` and server-side JavaScript (`mapReduce` with JS) are forbidden;
CI greps the codebase and fails the build on either.
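Where a full schema library is not available, the same two guards can be sketched in plain JavaScript: reject non-primitive filter values, and resolve field names through a whitelist. All names here (`safeEquals`, `buildSort`, `SORTABLE`) are hypothetical, and this is an illustration of the idea rather than a replacement for schema validation.

```javascript
// Only strings, numbers, and booleans may be used as filter values.
// Objects such as { "$ne": null } are rejected before they reach a query.
function isPrimitive(value) {
  return ["string", "number", "boolean"].includes(typeof value);
}

// Hypothetical helper: builds an equality filter or throws on non-primitive input.
function safeEquals(field, value) {
  if (!isPrimitive(value)) {
    throw new TypeError(`${field} must be a string, number, or boolean`);
  }
  return { [field]: value };
}

// Hypothetical whitelist: the only fields a caller may sort by.
const SORTABLE = new Set(["createdAt", "total"]);

// Hypothetical helper: maps user-supplied sort input onto whitelisted field names.
function buildSort(field, direction) {
  if (!SORTABLE.has(field)) {
    throw new RangeError(`sorting by "${field}" is not allowed`);
  }
  return { [field]: direction === "desc" ? -1 : 1 };
}

// A parsed JSON body can carry an operator object where a string was expected.
const body = JSON.parse('{"email": {"$ne": null}}');
try {
  safeEquals("email", body.email);
} catch (err) {
  console.log(err.message);
}
console.log(buildSort("total", "desc"));
```

The filter guard and the whitelist cover different holes: one stops operator objects in values, the other stops attacker-chosen keys.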
How to Use These Rules
- Drop a CLAUDE.md at the root of the repo, next to your src/ and your Mongo client wrapper.
- Paste the rules. Edit what doesn't fit your stack (driver, ODM, Atlas vs self-hosted, region setup).
- Restart Claude Code so it picks up the new context file. The same file works for Cursor, Aider, Codex, and Copilot Workspace.
The full set covers ESR-ordered compound indexes, sparse and partial indexes, TTL collections for expiring data, aggregation-pipeline ordering ($match early, $project late), write concern and read preference, when transactions actually buy you something, change streams with resumeToken, JSON Schema validators at the collection level, bulkWrite with ordered: false, and the ObjectId vs UUID trade-off.
Get the Rules
Free MongoDB gist with all 13 rules → gist.github.com/oliviacraft/ca79e55663671652f02d6aaa3b03e196
The 13 MongoDB rules are one chapter of the CLAUDE.md Rules Pack, with editions covering Go, Rust, Python, FastAPI, Next.js, React Native, Terraform, Docker, Kubernetes, PostgreSQL, GraphQL, Java, Redis, MongoDB, and more. Production-tested AI guardrails, packaged as drop-in CLAUDE.md files.
→ Get the full pack on Gumroad: oliviacraftlat.gumroad.com/l/skdgt — one-time payment, lifetime updates.