Ashwin S I

Posted on Jun 28

mongoose-drift: Schema Versioning and Diffing for Mongoose (That SQL Devs Have Had for Years)

#mongodb #mongoose #node #opensource

If you've ever worked with a SQL-backed project, you've seen a migrations folder.

migrations/
  V1__create_users.sql
  V2__add_email_column.sql
  V3__add_roles_table.sql

Prisma has prisma migrate. Flyway has versioned SQL files. Liquibase has changelogs. The concept is the same everywhere: a clear, committed history of every schema change, with version numbers. You know what the schema looked like at any point in time. You can replay it. You can diff it.

Now think about a typical Mongoose project. Your UserSchema lives in one file. You change it. You push it. That's it. There's no migration file, no version number, no diff. Three weeks later something breaks and you're grepping through git blame trying to figure out when you removed that field and why.

I kept looking for something that solved this for Mongoose the way Prisma solves it for PostgreSQL. It didn't exist. So I built it.

What mongoose-drift does

mongoose-drift is a CLI and library that lets you:

Snapshot your Mongoose schema at any point in time
Diff any two snapshots — including your current unsaved state — field by field and index by index
Generate a migrate-mongo compatible migration stub from the diff
Flag potential field renames for review (advisory, never applied automatically)
Wire up your AI coding assistant (Claude Code, Cursor, Copilot, Windsurf) to your schema on install — automatically

Let's walk through all of it.

Installation

npm install -D mongoose-drift

That's it. On install, a postinstall script fires and writes a context block into whatever AI agent config files it finds in your project (CLAUDE.md, .cursorrules, copilot-instructions.md, etc.). More on that later.

The core workflow

Step 1 — Initialize

Point mongoose-drift at your models directory:

npx mongoose-drift init --models ./src/models

This creates .mongoose-drift/default/config.json, which stores the models path so you don't need to repeat it with every command.

Step 2 — Take a snapshot

Say your UserSchema currently looks like this:

// src/models/User.js
import mongoose from 'mongoose';

const userSchema = new mongoose.Schema({
  name:     { type: String, required: true },
  email:    { type: String, required: true, unique: true },
  password: { type: String, required: true },
  age:      Number,
  roles:    [String],
});

export const User = mongoose.model('User', userSchema);

Snapshot it:

npx mongoose-drift snapshot --version 1.0.0

This creates .mongoose-drift/default/1.0.0.json — a normalized JSON representation of every collection in your models directory. Commit this file. It's your schema history.

{
  "version": "1.0.0",
  "createdAt": "2024-06-28T10:00:00.000Z",
  "modelsPath": "./src/models",
  "collections": {
    "User": {
      "fields": {
        "name":     { "type": "String", "required": true },
        "email":    { "type": "String", "required": true, "unique": true },
        "password": { "type": "String", "required": true },
        "age":      { "type": "Number" },
        "roles":    { "type": "Array<String>" }
      },
      "indexes": [
        { "fields": { "email": 1 }, "options": { "unique": true } }
      ]
    }
  }
}

Step 3 — Change your models

Requirements change. You need to add a phone number field, remove the age field (it's not needed anymore), and restrict roles to a specific enum. Your schema now looks like:

const userSchema = new mongoose.Schema({
  name:        { type: String, required: true },
  email:       { type: String, required: true, unique: true },
  password:    { type: String, required: true },
  phoneNumber: { type: String },
  roles:       { type: [String], enum: ['admin', 'user', 'moderator'] },
});

Step 4 — Diff against HEAD

HEAD is a special version meaning "my models right now, on disk." No snapshot needed.

npx mongoose-drift diff 1.0.0 HEAD

Output:

Schema diff: 1.0.0 → HEAD

Collection: User
  + phoneNumber              (String)   [FIELD ADDED]
  - age                      (Number)   [FIELD REMOVED]
  ~ roles                    {"type":"Array<String>"} → {"type":"Array<String>","enum":["admin","user","moderator"]}  [MODIFIED]

Summary: 1 added  1 removed  1 modified

Field by field. Before and after. Exactly what changed.

Step 5 — Generate a migration stub

npx mongoose-drift diff 1.0.0 HEAD --stub

Creates migrations/default/1.0.0-to-HEAD.js:

// Auto-generated by mongoose-drift
// Project: default
// From: 1.0.0  →  To: HEAD
// Generated: 2024-06-28T10:05:00.000Z
//
// Review carefully before running.
// Renames appear as remove + add — update manually if needed.

module.exports = {
  async up(db) {
    // Collection: User

    // TODO: Add field 'phoneNumber'
    // await db.collection('User').updateMany({}, { $set: { phoneNumber: null } });

    // TODO: Remove field 'age' — verify no data dependency first
    // await db.collection('User').updateMany({}, { $unset: { age: '' } });

    // TODO: Field 'roles' was modified — handle data transformation
  },

  async down(db) {
    // TODO: Reverse the above operations
  },
};

Every operation is commented out by default. You review, uncomment what applies, adjust the values, then run with migrate-mongo up.
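For illustration, here is a minimal sketch of what the stub might look like after review. Everything in it is an assumption about one particular project, not output from mongoose-drift: it assumes the collection is named users (Mongoose pluralizes and lowercases model names by default, so it is worth checking the name against your database before running anything), that phoneNumber needs no backfill because it is optional, and that role values outside the new enum should simply be dropped.

```javascript
// Hypothetical filled-in migration for the 1.0.0 to HEAD diff above.
// Assumption: the User model is stored in the 'users' collection.
const COLLECTION = 'users';
const ALLOWED_ROLES = ['admin', 'user', 'moderator'];

const migration = {
  async up(db) {
    const users = db.collection(COLLECTION);

    // Remove 'age' from every document that still carries it.
    await users.updateMany({ age: { $exists: true } }, { $unset: { age: '' } });

    // Strip role values that the new enum would reject.
    await users.updateMany(
      { roles: { $elemMatch: { $nin: ALLOWED_ROLES } } },
      { $pull: { roles: { $nin: ALLOWED_ROLES } } }
    );

    // 'phoneNumber' is optional, so existing documents are left untouched.
  },

  async down(db) {
    // Deleted 'age' values and pulled roles cannot be recovered from the
    // database itself; restoring them would need a backup.
  },
};

// In a real migrate-mongo file this object would be assigned to module.exports.
```

Note that down is intentionally empty here: both operations destroy data, which is exactly why the generated stub leaves them commented out.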

How it works under the hood

This is the part I want to go deeper on because the internals are what make it reliable across different schema styles.

The pipeline

Your model files (.ts / .js)
        │
        ▼
   extractor.ts  ──  walks schema.paths, normalizes each field
        │
        ▼
   snapshot.ts   ──  serializes to .mongoose-drift/<project>/<version>.json
        │
        ▼  (two snapshots loaded)
   diff.ts       ──  field-by-field + index-by-index comparison
        │
        ├──► reporter.ts        ──  colored terminal output or plain text
        └──► stub-generator.ts  ──  migrations/<project>/<from>-to-<to>.js

Each module is independent and communicates through plain data types. No shared state, no hidden coupling.

Extractor — reading Mongoose models without running your app

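This is the trickiest part. Mongoose models are just JavaScript, so mongoose-drift loads your model files with require() and inspects what they export. Keep in mind that require() executes each file's top-level code, so a model file that opens a database connection on import will try to do so here too. It handles three cases: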

Case 1: Default export is a compiled Model
  → exported.schema.paths exists
  → model name comes from exported.modelName

Case 2: Default export is a raw Schema object
  → exported.paths exists and exported.path is a function (the .path() method)
  → model name comes from the filename

Case 3: Named exports
  → iterates Object.entries(exported)
  → applies Cases 1 and 2 to each value
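The three cases can be sketched with plain duck typing. This is an illustration of the idea rather than the code in extractor.ts: the helper names (isModel, isSchema, collectSchemas) are hypothetical, and using the export key as the name for a raw Schema found among named exports is an assumption.

```javascript
// Case 1: a compiled Model exposes .schema.paths and .modelName.
function isModel(value) {
  return Boolean(
    value && value.schema && value.schema.paths && typeof value.modelName === 'string'
  );
}

// Case 2: a raw Schema exposes .paths and a .path() method.
function isSchema(value) {
  return Boolean(value && value.paths && typeof value.path === 'function');
}

// Returns a map of name -> schema for everything found in one module.
function collectSchemas(exported, fileName) {
  const found = {};
  const fallbackName = fileName.replace(/\.[^.]+$/, ''); // strip the extension

  const visit = (value, name) => {
    if (isModel(value)) {
      found[value.modelName] = value.schema;
      return true;
    }
    if (isSchema(value)) {
      found[name] = value;
      return true;
    }
    return false;
  };

  // Cases 1 and 2: the module's export is itself a Model or a Schema.
  if (visit(exported, fallbackName)) return found;

  // Case 3: named exports, each checked against Cases 1 and 2.
  if (exported && typeof exported === 'object') {
    for (const [key, value] of Object.entries(exported)) {
      visit(value, key === 'default' ? fallbackName : key);
    }
  }
  return found;
}
```

Values that match neither shape are skipped silently, so helper functions and constants exported next to a model do not cause errors.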

Once it has a schema object, normalizeFieldFromPath walks schema.paths and converts each SchemaType instance into a portable FieldDefinition:

{
  type: string;       // "String", "Number", "ObjectId", "Array<String>", etc.
  required?: boolean;
  ref?: string;       // populated from ObjectId refs
  enum?: unknown[];
  unique?: boolean;
  index?: boolean;
  default?: unknown;
  // ... min, max, trim, minlength, maxlength, sparse, immutable, select
}

Array types are unwrapped — [String] becomes "Array<String>", [{ type: ObjectId, ref: 'User' }] becomes "Array<ObjectId>". Nested subdocument schemas (embedded schemas) appear as flattened dot-notation paths — address.street, address.city — because that's how schema.paths represents them internally.
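A rough sketch of that normalization step is below. It works on plain objects shaped like Mongoose SchemaType instances (an instance name, an options bag, and a caster for arrays) so it can run without Mongoose installed. The function names and the list of copied options are assumptions for illustration, and real SchemaType objects carry more detail than this handles (for example, required given as an array or a function).

```javascript
// Build the portable type string, unwrapping arrays recursively.
function typeName(schemaType) {
  if (schemaType.instance === 'Array') {
    const inner = schemaType.caster ? typeName(schemaType.caster) : 'Mixed';
    return `Array<${inner}>`;
  }
  return schemaType.instance;
}

// Options copied into the portable definition, always in this order.
const COPIED_OPTIONS = ['required', 'ref', 'enum', 'unique', 'index', 'default'];

function normalizeField(schemaType) {
  const field = { type: typeName(schemaType) };
  const options = schemaType.options || {};
  for (const key of COPIED_OPTIONS) {
    if (options[key] !== undefined) field[key] = options[key];
  }
  return field;
}

// Flattened dot-notation paths need no special handling: each entry in
// schema.paths is normalized on its own.
function normalizePaths(paths) {
  const fields = {};
  for (const [name, schemaType] of Object.entries(paths)) {
    fields[name] = normalizeField(schemaType);
  }
  return fields;
}
```

Copying options in a fixed order keeps the serialized output stable between runs, which matters for any comparison that works on the serialized form.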

If you use TypeScript model files, mongoose-drift tries to load ts-node/register at startup so it can handle .ts files without a build step.

Snapshot format — portable JSON, validated with Zod

Every snapshot is validated on read using Zod schemas. This means if a snapshot file gets corrupted or manually edited in a way that breaks the shape, you get a clear error message instead of a silent wrong diff.

const SchemaSnapshotSchema = z.object({
  version:    z.string(),
  createdAt:  z.string(),
  modelsPath: z.string(),
  collections: z.record(CollectionSchemaSchema),
});

The HEAD version is never written to disk — when you pass HEAD, it runs extractSchemas live and returns the result as a snapshot object in memory.
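The HEAD behaviour can be pictured as a small dispatch function. The sketch below is hypothetical: resolveSnapshot and its injected helpers (extractLive, readSnapshotFile) are illustrative names, not the library's internals, and the helpers are passed in so the example stays self-contained.

```javascript
// Resolve a version label to a snapshot object.
// 'HEAD' is built in memory from the live models; anything else is read from disk.
async function resolveSnapshot(version, { modelsPath, extractLive, readSnapshotFile }) {
  if (version === 'HEAD') {
    return {
      version: 'HEAD',
      createdAt: new Date().toISOString(),
      modelsPath,
      collections: await extractLive(modelsPath),
    };
  }
  return readSnapshotFile(version);
}
```

Because both branches return the same shape, the diff step never needs to know whether it is comparing a saved file or the live models.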

Diff — what "changed" actually means

Two things are diffed per collection: fields and indexes.

Field diff: Takes the union of all field names across both snapshots and classifies each as:

added — exists in after, not in before
removed — exists in before, not in after
modified — exists in both but JSON.stringify(before) !== JSON.stringify(after)

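That stringify comparison is intentional. It catches any change to any property of the field definition: type, required flag, default value, enum list, anything. No special-casing needed. The one caveat is that JSON.stringify is sensitive to key order, so the comparison relies on the normalizer emitting properties in a consistent order.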
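The classification rules fit in a few lines. This is a minimal sketch of the approach described here, not the contents of diff.ts; the function name diffFields is an assumption, and the change objects follow the shape shown in the JSON output later in this post.

```javascript
// Compare two maps of fieldName -> FieldDefinition.
function diffFields(before, after) {
  // Union of all field names across both snapshots.
  const names = new Set([...Object.keys(before), ...Object.keys(after)]);
  const changes = [];

  for (const field of names) {
    const b = before[field];
    const a = after[field];
    if (b === undefined) {
      changes.push({ type: 'added', field, after: a });
    } else if (a === undefined) {
      changes.push({ type: 'removed', field, before: b });
    } else if (JSON.stringify(b) !== JSON.stringify(a)) {
      changes.push({ type: 'modified', field, before: b, after: a });
    }
  }
  return changes;
}
```

Fed the User fields from the two schema versions earlier in this post, it reports phoneNumber as added, age as removed, and roles as modified, while the unchanged fields produce no entry at all.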

Index diff: Serializes each index as JSON, builds two sets, and reports what's in one but not the other. Indexes are either present or absent — there's no "modified index" (a change appears as a remove + add pair).
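As a sketch of that set-based comparison (the name diffIndexes and the returned shape are assumptions for illustration), using the { fields, options } index format from the snapshot example:

```javascript
// Compare two arrays of index definitions: { fields, options }.
function diffIndexes(before, after) {
  const key = (index) => JSON.stringify(index);
  const beforeKeys = new Set(before.map(key));
  const afterKeys = new Set(after.map(key));

  return [
    // Present now, absent before.
    ...after
      .filter((index) => !beforeKeys.has(key(index)))
      .map((index) => ({ type: 'added', index })),
    // Present before, absent now.
    ...before
      .filter((index) => !afterKeys.has(key(index)))
      .map((index) => ({ type: 'removed', index })),
  ];
}
```

Changing only an index's options, say adding sparse: true, alters its serialized form, so it shows up as one added entry and one removed entry rather than a single modification.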

Rename detection: After the field diff runs, detectPotentialRenames looks for (removed, added) pairs that share the same type. This is advisory: the CLI prints a warning and suggests you check whether it's a rename, but it doesn't automatically classify anything as renamed. Your data is your call.
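The pairing heuristic is simple enough to sketch. The function below is a stand-in written for this explanation, with a hypothetical name and return shape; the real detectPotentialRenames may report its findings differently.

```javascript
// Pair every removed field with every added field of the same type.
function findRenameCandidates(changes) {
  const removed = changes.filter((c) => c.type === 'removed');
  const added = changes.filter((c) => c.type === 'added');
  const candidates = [];

  for (const r of removed) {
    for (const a of added) {
      if (r.before.type === a.after.type) {
        candidates.push({ from: r.field, to: a.field, type: r.before.type });
      }
    }
  }
  return candidates;
}
```

In the User example, age (Number) and phoneNumber (String) have different types, so nothing is flagged. Had the removed field been phone (String), the pair would be reported as a candidate and left for a human to confirm.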

Stub generator — commented-out operations, not destructive by default

Every generated operation is commented out. The philosophy is: mongoose-drift tells you what probably needs to happen, but you decide what actually runs. Especially for $unset (removing data) or index creation on large collections — those need human review.
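To make the "commented out by default" idea concrete, here is a hedged sketch of how stub lines could be produced from a field change. The function name stubLinesForField is hypothetical and the wording of the TODO lines is modeled loosely on the generated stub shown earlier, not copied from stub-generator.ts.

```javascript
// Turn one field change into commented-out migration lines.
function stubLinesForField(collection, change) {
  const target = `db.collection('${collection}')`;
  // Quote the key so dot-notation paths such as "address.street" stay valid JS.
  const key = JSON.stringify(change.field);

  switch (change.type) {
    case 'added':
      return [
        `// TODO: Add field ${key}`,
        `// await ${target}.updateMany({}, { $set: { ${key}: null } });`,
      ];
    case 'removed':
      return [
        `// TODO: Remove field ${key} (verify no data dependency first)`,
        `// await ${target}.updateMany({}, { $unset: { ${key}: '' } });`,
      ];
    case 'modified':
      return [`// TODO: Field ${key} was modified, handle data transformation`];
    default:
      return [];
  }
}
```

Every returned line starts with a comment marker, so a stub assembled from these lines is a no-op until someone deliberately uncomments an operation.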

Index changes

Indexes are first-class citizens in the diff. If you add a compound index to your schema:

rentSchema.index({ tenant: 1, dueDate: -1 });
rentSchema.index({ status: 1, isActive: 1 }, { sparse: true });

The diff shows them:

Collection: Rent
  + index {"tenant":1,"dueDate":-1}                   [INDEX ADDED]
  + index {"status":1,"isActive":1} ({"sparse":true})  [INDEX ADDED]

And the stub generates:

// TODO: New index {"tenant":1,"dueDate":-1} — create if needed
// await db.collection('Rent').createIndex({"tenant":1,"dueDate":-1});

// TODO: New index {"status":1,"isActive":1} — create if needed
// await db.collection('Rent').createIndex({"status":1,"isActive":1},{"sparse":true});

Output formats

Three ways to get the diff out:

Pretty terminal output (default)

npx mongoose-drift diff 1.0.0 HEAD

Color-coded. Added fields in green, removed in red, modified in yellow. Designed for reading during development.

JSON

npx mongoose-drift diff 1.0.0 HEAD --json

{
  "collections": {
    "User": {
      "type": "modified",
      "changes": [
        { "type": "added",   "field": "phoneNumber", "after": { "type": "String" } },
        { "type": "removed", "field": "age",         "before": { "type": "Number" } },
        { "type": "modified","field": "roles",
          "before": { "type": "Array<String>" },
          "after":  { "type": "Array<String>", "enum": ["admin","user","moderator"] }
        }
      ]
    }
  }
}

Useful when you want to pipe it into something or let your AI assistant parse it.
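As one example of piping it into something, a CI check could parse the JSON and flag removed fields before a merge. The sketch below only assumes the JSON shape shown above; the function name findRemovedFields is made up for this example.

```javascript
// List every removed field in a parsed --json diff as "Collection.field".
function findRemovedFields(diff) {
  const removed = [];
  for (const [collection, entry] of Object.entries(diff.collections || {})) {
    for (const change of entry.changes || []) {
      if (change.type === 'removed') {
        removed.push(`${collection}.${change.field}`);
      }
    }
  }
  return removed;
}

// Example input, trimmed from the output above.
const exampleDiff = {
  collections: {
    User: {
      type: 'modified',
      changes: [
        { type: 'added', field: 'phoneNumber', after: { type: 'String' } },
        { type: 'removed', field: 'age', before: { type: 'Number' } },
      ],
    },
  },
};

console.log(findRemovedFields(exampleDiff)); // [ 'User.age' ]
```

A wrapper script could exit with a non-zero status whenever the returned list is non-empty, forcing a human to acknowledge destructive schema changes.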

Plain text export

npx mongoose-drift diff 1.0.0 HEAD --txt

Writes a diff.txt file — no color codes, readable anywhere. Useful for sharing with non-dev teammates or pasting into a PR description.

AI integration — the part I hadn't seen anywhere else

When you install mongoose-drift, a postinstall npm script runs and writes a context block into every AI agent config file it finds in your project:

File written                        AI tool it targets
CLAUDE.md                           Claude Code
.cursor/rules/mongoose-drift.mdc    Cursor (MDC format, alwaysApply: true)
.cursorrules                        Cursor (legacy)
.github/copilot-instructions.md     GitHub Copilot
.windsurfrules                      Windsurf
.augment/guidelines.md              Augment
gemini.md                           Gemini CLI

The block tells the AI:

That mongoose-drift is installed
What commands to run to list snapshots, read the current schema, and get diffs
What the JSON output structure looks like (field types, index format)
What HEAD means

After that, you can ask your AI assistant things like:

"What fields does the Rent collection have?"

And instead of asking you to paste the schema, it runs:

npx mongoose-drift show 1.0.0

...reads the JSON output, and answers directly.

Or:

"What changed in the schema since the last snapshot?"

It runs:

npx mongoose-drift diff 1.0.0 HEAD --json

...parses it, and tells you exactly what fields and indexes changed.

How the block injection works

The block is wrapped in HTML comment markers:

<!-- mongoose-drift:start -->
...content...
<!-- mongoose-drift:end -->

This means:

If the file doesn't exist, it's created with just the block
If the file exists but has no markers, the block is appended — your existing content is untouched
If the file already has the markers (from a previous install), only the content between them is replaced

Re-running setup-ai is always safe. Run it after saving new snapshots so agents can see the updated version list:

npx mongoose-drift snapshot --version 1.1.0
npx mongoose-drift setup-ai
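The create, append, and replace behaviour of the marker block can be sketched as a pure string function. This is an illustration of the technique under stated assumptions, not the package's setup code: upsertBlock is a hypothetical name, and passing null for a file that does not exist is a convention chosen for this example.

```javascript
const START = '<!-- mongoose-drift:start -->';
const END = '<!-- mongoose-drift:end -->';

// existing: current file contents, or null if the file does not exist.
// content:  the text to place between the markers.
function upsertBlock(existing, content) {
  const block = `${START}\n${content}\n${END}`;

  // File missing or empty: the result is just the block.
  if (existing == null || existing === '') return `${block}\n`;

  const start = existing.indexOf(START);
  const end = existing.indexOf(END);

  // Markers already present: replace only what sits between them.
  if (start !== -1 && end > start) {
    return existing.slice(0, start) + block + existing.slice(end + END.length);
  }

  // No markers: append, leaving the existing content untouched.
  const separator = existing.endsWith('\n') ? '\n' : '\n\n';
  return `${existing}${separator}${block}\n`;
}
```

Running the function on its own output returns the same string, which is the property that makes repeated runs safe.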

Multi-project support

For monorepos or multi-service architectures, use the -p flag to isolate per service:

# Initialize each service separately
npx mongoose-drift init --models ./services/auth/models -p auth
npx mongoose-drift init --models ./services/billing/models -p billing
npx mongoose-drift init --models ./services/inventory/models -p inventory

# Snapshot each independently
npx mongoose-drift snapshot --version 1.0.0 -p auth
npx mongoose-drift snapshot --version 1.0.0 -p billing

# Diff each independently
npx mongoose-drift diff 1.0.0 HEAD -p billing --stub

Each project stores its snapshots under .mongoose-drift/<project>/ and its migrations under migrations/<project>/. They never interfere with each other.
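The directory layout described in this post can be summarized as a small path helper. The function below is purely illustrative (projectPaths is not part of the package); it only encodes the paths mentioned above.

```javascript
// File locations for one project, relative to the repository root.
function projectPaths(project = 'default') {
  const snapshotDir = `.mongoose-drift/${project}`;
  const migrationDir = `migrations/${project}`;
  return {
    config: `${snapshotDir}/config.json`,
    snapshot: (version) => `${snapshotDir}/${version}.json`,
    stub: (from, to) => `${migrationDir}/${from}-to-${to}.js`,
  };
}

console.log(projectPaths('billing').snapshot('1.0.0'));
// .mongoose-drift/billing/1.0.0.json
```

Because the project name is part of every path, two services can both have a 1.0.0 snapshot without colliding.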

Useful commands at a glance

# Initialize
npx mongoose-drift init --models ./src/models

# Save a snapshot
npx mongoose-drift snapshot --version 1.0.0

# List all snapshots
npx mongoose-drift log

# Read a specific snapshot
npx mongoose-drift show 1.0.0

# Diff (various outputs)
npx mongoose-drift diff 1.0.0 HEAD
npx mongoose-drift diff 1.0.0 HEAD --json
npx mongoose-drift diff 1.0.0 HEAD --txt
npx mongoose-drift diff 1.0.0 HEAD --stub

# Refresh AI agent files after new snapshots
npx mongoose-drift setup-ai

Programmatic API

The whole pipeline is also available as a library if you want to build on top of it:

import {
  extractSchemas,
  saveSnapshot,
  loadSnapshot,
  diffSnapshots,
  detectPotentialRenames,
  generateStub,
} from 'mongoose-drift';

// Extract live schema from disk
const collections = await extractSchemas('./src/models');

// Load two snapshots and diff them
const before = await loadSnapshot('1.0.0', 'default');
const after  = await loadSnapshot('HEAD', 'default');
const diff   = diffSnapshots(before, after);

// Inspect field changes for a specific collection
const userChanges   = diff.collections['User']?.changes ?? [];
const userRenames   = detectPotentialRenames(userChanges);
const indexChanges  = diff.collections['User']?.indexChanges ?? [];

// Generate a stub file
generateStub(diff, '1.0.0', 'HEAD', 'default');

All exported types are available too:

import type {
  SchemaSnapshot,
  DiffResult,
  FieldChange,
  IndexChange,
  CollectionChange,
  FieldDefinition,
} from 'mongoose-drift';

What it doesn't do (yet)

A few things worth being upfront about:

It doesn't run migrations for you. The stub generator produces a migrate-mongo file — you run it with migrate-mongo up. The actual data migration is on you.
It doesn't resolve renames for you. Rename detection is advisory: it flags (removed field, added field) pairs with matching types, but it doesn't rename anything in the stub. You update the stub manually if it was a rename.
It doesn't cover every possible Mongoose option. The field normalizer captures the most common options. Exotic custom validators or plugin-added properties won't appear in the snapshot.

Try it

npm install -D mongoose-drift
npx mongoose-drift init --models ./src/models
npx mongoose-drift snapshot --version 1.0.0

GitHub: github.com/ashwinn-si/mongoose-drift

If you've ever lost track of what changed in a Mongoose schema — or if you're tired of explaining your schema to an AI assistant by hand — give it a try and let me know what you think.
