Imagine you're building a dating app. You’ve hit a major milestone: one million users. Congratulations!
But then, requirements change. In the early days, you kept things simple: to track whether a user had children, you used a single column called kids that stored a plain string, like 'yes- they live with me' or 'no'.
const updatePreferencesSchema = Joi.object({
  kids: Joi.string().valid(
    "yes- they live with me",
    "yes- they don't live with me",
    "no"
  ).optional()
});
However, as your app matures, your product team wants better data. They want to know whether a user has kids, whether those kids live at home, and whether the user wants (more) kids. You need to transform that single string into a nested object:
const updatePreferencesSchema = Joi.object({
  kids: Joi.object({
    hasKids: Joi.boolean().required(),
    liveAtHome: Joi.boolean().required(),
    wantsKids: Joi.string().valid("Yes", "No", "Not Decided")
  }).optional()
});
This might look like an easy change: you just rewrite the model, and boom, it works!
Here is where most devs make a fatal mistake: they think they can just run an ALTER TABLE and deploy the new code.
It doesn't work that way. Even if you only had 10 users, the moment your new code reads user.kids.hasKids from a row that still contains the string "no", it gets undefined instead of a boolean — and if the column is NULL, it throws a TypeError and crashes the request. You can't update every row in the same millisecond you deploy your code. You have to handle the "In-Between State," where old data and new code have to coexist.
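To make that failure concrete, here is what the new read path sees on each row shape (a minimal sketch with inlined rows standing in for query results):

```javascript
// A legacy row still holds the old string; new code expects the object shape.
const oldRow = { kids: "no" };
const newRow = { kids: { hasKids: false, liveAtHome: false, wantsKids: "No" } };

console.log(newRow.kids.hasKids); // false — a boolean, as the new code expects
console.log(oldRow.kids.hasKids); // undefined — the string has no such property

// And if the column was NULL, the read throws outright:
const nullRow = { kids: null };
try {
  nullRow.kids.hasKids;
} catch (err) {
  console.log(err instanceof TypeError); // true — this is the request-crashing case
}
```

The silent undefined is arguably worse than the crash: it corrupts logic downstream without leaving a stack trace.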
To survive this, you use the Expand and Contract Pattern.
Phase 1: The Expand (The Safety Bridge)
Instead of modifying the existing kids column, we "expand" the schema. We add a new JSONB column to hold the new format.
-- Step 1: Add the new column (defaulting to empty so it doesn't break)
ALTER TABLE users ADD COLUMN kids_v2 JSONB DEFAULT '{}';
The Dual-Write Logic
Now, update your application. Every time a user updates their profile, you save data to both columns. This is your insurance policy. It ensures that any new data is compatible with both the old code (for now) and the new code (later).
const oldValue = req.body.kids; // e.g., "yes- they live with me"
const updateData = {
  kids: oldValue,
  kids_v2: {
    hasKids: oldValue.toLowerCase().includes("yes"),
    liveAtHome: oldValue === "yes- they live with me",
    wantsKids: "Not Decided" // Default value for the new requirement
  }
};
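Since kids is optional in the update schema, req.body.kids may be undefined, and calling toLowerCase() on it would throw. One way to guard that is to extract the transform into a pure function — a sketch, with `buildKidsUpdate` as an illustrative name:

```javascript
// Build the dual-write payload from the legacy string, guarding against
// the field being absent (kids is optional in the update schema).
function buildKidsUpdate(oldValue) {
  if (typeof oldValue !== "string") return {}; // nothing to write this request
  const s = oldValue.toLowerCase();
  return {
    kids: oldValue, // keep the old column in sync for now
    kids_v2: {
      hasKids: s.startsWith("yes"),
      liveAtHome: s === "yes- they live with me",
      wantsKids: "Not Decided" // new field; legacy users never answered it
    }
  };
}
```

As a pure function, this transform is trivially unit-testable, which matters: it is the same mapping your backfill will later apply to a million rows.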
Phase 2: The Migration (Backfilling the Past)
You have a million rows where kids_v2 is just an empty {}. You need to transform them. While you could do this with a single SQL query, updating a million rows in one transaction will hold row locks for a long time and kill your performance.
The "Safe" Batch Update Script
Instead, use a Node.js script to process the data in batches. We add a small delay (the "breather") to let the database handle regular user traffic while we work in the background.
const { Pool } = require('pg');
const pool = new Pool();

async function backfillKidsData() {
  const batchSize = 5000;
  for (;;) {
    const res = await pool.query(`
      UPDATE users
      SET kids_v2 = jsonb_build_object(
        'hasKids', CASE WHEN kids LIKE 'yes%' THEN true ELSE false END,
        'liveAtHome', CASE WHEN kids = 'yes- they live with me' THEN true ELSE false END,
        'wantsKids', 'Not Decided'
      )
      WHERE id IN (
        SELECT id FROM users
        WHERE kids_v2 = '{}' AND kids IS NOT NULL
        LIMIT ${batchSize}
      )
      RETURNING id;
    `);
    if (res.rowCount === 0) break; // done: no rows left to migrate
    console.log(`Migrated ${res.rowCount} rows...`);
    // Crucial: give the DB and the event loop 100ms to breathe
    await new Promise(r => setTimeout(r, 100));
  }
  await pool.end();
}
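The script above couples the batching mechanics to this one UPDATE. If you run migrations like this often, the loop itself can be factored into a small reusable helper — a sketch, where `processInBatches` is a name introduced here, not part of pg:

```javascript
// Generic "batch + breather" loop. runBatch does one unit of work and
// returns how many rows it touched; we stop when a batch touches zero.
async function processInBatches(runBatch, { delayMs = 100 } = {}) {
  let total = 0;
  for (;;) {
    const count = await runBatch();
    if (count === 0) break; // nothing left — migration complete
    total += count;
    // The breather: let regular user traffic use the DB between batches.
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
  return total;
}
```

Extracting the loop also makes the stopping condition easy to unit-test with a stubbed runBatch, instead of against a live database.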
Phase 3: The Contract (The Clean Cut)
Once your script finishes and every row has a valid object in kids_v2, you are in a state of "Schema Parity." Now it's time to clean up.
- Switch the API: Update your code to read only from kids_v2.
- Monitor: Watch your logs. If you don't see any "undefined" property errors for a few days, you're in the clear.
- Delete: Drop the old column and remove the dual-write logic from your controllers.
-- The final step
ALTER TABLE users DROP COLUMN kids;
Conclusion
This isn't just about avoiding downtime; it's about Data Integrity. If you jump from a string to an object in one leap, you're going to fall into the gap. The Expand and Contract pattern is the bridge that lets you walk across safely.
What’s your biggest migration horror story? Have you ever locked a production database by accident? Let’s talk about it in the comments!