When running Reform, we often faced the issue of having to migrate data in our database. We were an early-stage startup, so we built features as needed. Down the road we ran into scenarios where we realised we hadn’t chosen the correct data model.
The nature of the product (a form builder) meant we couldn’t have any downtime, because that meant our customers lost leads.
Example
What do I mean by migrating data? I mean any change to the data model, like moving data to a new column or changing a column type.
When we launched Reform, we only supported single-page forms. We stored the page as a JSON object in a `page` column. But we quickly realised we needed to support multi-page forms. That meant we had to replace `page` with `pages` and store an array of pages instead.
3-step approach
We developed a 3-step approach to make these changes with no downtime.
- Preparation: Prepare the database to handle the new data model and migrate existing data
- Launch: Release the new feature
- Clean-up: Remove old data and tweak new columns
The main idea is to create a new column before reading data from it. We can migrate existing data to the new column and since we’re not reading data from this column yet, it doesn’t matter how long it takes. Once all data is migrated, we can update the app logic to read from the new column.
In practice this requires releasing a preparation PR, migrating the data, and then releasing the actual feature (and cleaning up).
Preparation
The first release is a preparation PR. This PR contains a migration to add the new column. Because the new column is created for existing rows, it will have to be nullable for now.
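A minimal sketch of what that preparation migration could look like. The `pages` column name comes from our example; the `forms` table name and the JSON column type are assumptions:

```php
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Sketch of the preparation migration (assumed table name: forms).
Schema::table('forms', function (Blueprint $table) {
    // Must be nullable for now: existing rows have no pages value yet.
    $table->json('pages')->nullable();
});
```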
Next we'll update the app logic to save any changes to both the old column and the new column. In our example we would still save the page as an object in `page`, but we'd also generate a JSON array on the fly and save the page as the only item in the array. This means that if a user saves their data, it syncs to both `page` and `pages`.
```php
// Dummy code
$form->update([
    'page' => $page,
    'pages' => [$page],
]);
```
But we still haven't dealt with rows that users haven't updated. For those, we'll have to create a custom migration script. Reform is built with Laravel, so we'd create the script using a custom command and a custom queue job.
The command fetches unmigrated data from the database. In our example, that means any rows where `pages` is null. Only fetching unmigrated data also means we can re-run the command as many times as needed. We'd often have to re-run it if an error occurred, or if we'd batched it to only run for N rows at a time.
```php
// Dummy code
$forms = Form::whereNull('pages')->get();

$forms->each(function ($form) {
    MigrateFormPageToPages::dispatch($form);
});
```
For each row, it dispatches a queue job responsible for migrating a single row. Having a separate queue job for each row has a couple of benefits:
- If the job fails, we can inspect and re-run it
- Processing rows as individual queue jobs naturally spreads the writes out over time, so we don't overload the database with too many requests at once
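The `MigrateFormPageToPages` job itself could look something like this. The class name comes from the example above; its body, the `Form` model, and the column names are assumptions based on the dual-write logic:

```php
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

// Sketch of a per-row migration job (assumed implementation).
class MigrateFormPageToPages implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(public Form $form)
    {
    }

    public function handle(): void
    {
        // Skip rows the dual-write app logic already migrated
        // between dispatch and execution.
        if ($this->form->pages !== null) {
            return;
        }

        // Wrap the single page in an array, mirroring the app logic.
        $this->form->update([
            'pages' => [$this->form->page],
        ]);
    }
}
```

Because each job re-checks `pages` before writing, dispatching the same row twice is harmless.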
We triggered the command through the server console on DigitalOcean.
Launch
At this point we have a new column in our database. We’ve migrated all existing data so it exists in both columns and updates are synced to both columns.
It'd now be safe to release the new feature that prompted this migration. In our Reform example, that meant launching a new UI that could handle multiple pages. On the backend we also updated the code to only update the `pages` column, since `page` is obsolete.
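The dual write from the preparation step then collapses to a single write. A sketch, where `$pages` is assumed to be the array of pages coming from the new multi-page UI:

```php
// Dummy code: after launch, only the new column is written.
$form->update([
    'pages' => $pages,
]);
```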
Cleanup
Finally, we can clean up the mess we created. We can safely drop the `page` column since it's not needed any more. Initially we made `pages` nullable, but at this point we can add a not-null constraint since all rows have data.
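A sketch of that clean-up migration, again assuming a `forms` table (note that changing an existing column in older Laravel versions requires the `doctrine/dbal` package):

```php
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

// Sketch of the clean-up migration (assumed table name: forms).
Schema::table('forms', function (Blueprint $table) {
    // Drop the obsolete column...
    $table->dropColumn('page');
    // ...and enforce that every row now has pages data.
    $table->json('pages')->nullable(false)->change();
});
```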
I’m currently building a new database client for developers. It’s the tool I wish we had when building Reform. Check out a quick demo here:
https://www.youtube.com/watch?v=KAyeOBe7csc