Running multiple live instances of an application against a single relational database, where the schema and data must stay consistent throughout a deployment, is significantly more complex than migrating with a single instance or in a simple "stop all instances, migrate, deploy" scenario.
Let's explore how to perform seamless rolling migrations with Flyway across multiple instances while ensuring high availability!
Issue at Hand
Our original deployment strategy was straightforward. With Flyway handling database migrations and JPA running with ddl-auto=validate, we would stop all instances, run migrations on the first instance at startup, and then start the remaining instances. This “stop the world” approach worked well under previous requirements: downtime during off-peak hours was acceptable, and availability targets were met.
Over time, business requirements evolved. Clients began expecting constant access, so any downtime, even for maintenance, counted against our SLA (more on that in Thoughts on SLA). While not a crisis, this change showed that our old approach was insufficient for a high-availability environment.
We needed a strategy that allowed database migrations and application updates to happen on the fly, without interrupting service or breaking running instances. This prompted us to rethink how deployments and migrations should be handled in a multi-instance setup.
How Did We Address It?
We adopted a rolling update strategy, updating application instances gradually while keeping others running. The key challenge was interoperability: old and new versions must operate concurrently with the database without causing downtime or errors.
During a rollout, several instances run the old version while at least one instance runs the updated version. This means we must be able to support different versions at the same time if we want to use this deployment strategy. Otherwise, either the old or the new instances might not work as expected.
For a breakdown of deployment strategies, see Deployment Strategies on Baeldung.
Since Flyway triggers our migrations, we needed to verify that it is up to the task. We addressed this with a small POC, which I covered in a previous blog post.
In short, Flyway is forward-compatible: an older application version can still start even when the database's migration history table contains newer migrations that it does not know about.
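To make the idea concrete, here is a minimal sketch of that forward-compatibility rule. It is a simplified model, not Flyway's implementation: the class name, the integer version numbers, and the `ignoreFuture` flag are all hypothetical stand-ins for the behaviour described above.

```java
import java.util.List;
import java.util.stream.Collectors;

// Simplified model of the behaviour described above; this is NOT Flyway's code.
final class FutureMigrationModel {

    // Versions recorded in the history table that are newer than anything this build ships.
    static List<Integer> futureMigrations(List<Integer> shippedByBuild, List<Integer> appliedInDb) {
        int newestShipped = shippedByBuild.stream().mapToInt(Integer::intValue).max().orElse(0);
        return appliedInDb.stream()
                .filter(version -> version > newestShipped)
                .sorted()
                .collect(Collectors.toList());
    }

    // With ignoreFuture = true (hypothetical flag), newer unknown migrations do not block startup.
    static boolean validationPasses(List<Integer> shippedByBuild, List<Integer> appliedInDb,
                                    boolean ignoreFuture) {
        return ignoreFuture || futureMigrations(shippedByBuild, appliedInDb).isEmpty();
    }

    public static void main(String[] args) {
        List<Integer> oldBuild = List.of(1, 2);    // the old build ships migrations 1 and 2
        List<Integer> history = List.of(1, 2, 3);  // a newer build already applied migration 3

        System.out.println(futureMigrations(oldBuild, history));        // [3]
        System.out.println(validationPasses(oldBuild, history, true));  // true
        System.out.println(validationPasses(oldBuild, history, false)); // false
    }
}
```

The point of the sketch is only that the old build tolerates history entries it does not recognise; it says nothing about whether the resulting schema still suits the old build's queries.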
Beyond Flyway's own migration validation, we still have to make sure that the actual schema in the DB works for the queries the old version will still be executing, and that it passes JPA's schema validation at startup. Simply deleting a column from the DB will cause the old version of the service to fail at query time. Also, if an old instance is restarted, JPA schema validation will fail and the instance will not start.
Solution
The solution is a multi-step migration. An intermediary version writes "extra/legacy data" to the DB so that older instances keep functioning, while it moves its own logic off the old columns and onto the new ones. This way, both the old and the new version of the app can work with the DB at the same time. Once all the instances are updated, we deploy yet another version that safely removes the old columns from the DB. Only then is the migration done.
Procedures to Handle the Deployment Process
For the following examples, assume we have 2 instances of the app running version 1.0.0, and we want to deploy a new version.
Adding a New Field
- Create a new version (v1.0.1) of the app with a migration script to add the new field.
- Deploy the new version to all the instances.
(If you want to ensure non-nullability, the following steps are needed:)
- Create a new version (v1.0.2) of the app with a migration script that adds the non-null constraint to the new column with a default value for all the existing rows.
- Deploy the new version (v1.0.2) to at least one instance (to run the migration).
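The reason for splitting this into two releases can be sketched with a toy model of a single insert. Everything here is an assumption made for illustration: the `nickname` column, the class name, and the simplified rule that an omitted column falls back to its default.

```java
import java.util.Map;

// Toy model of one column on insert; not a real database or JPA API.
final class NewColumnModel {

    // Value stored for `column`, or an exception if the database would reject the insert.
    static String storedValue(Map<String, String> insertedValues, String column,
                              boolean notNull, String defaultValue) {
        String value = insertedValues.containsKey(column)
                ? insertedValues.get(column)
                : defaultValue; // omitted columns fall back to the column default
        if (notNull && value == null) {
            throw new IllegalStateException("NULL not allowed in column " + column);
        }
        return value;
    }

    public static void main(String[] args) {
        // v1.0.0 does not know the new column; v1.0.1 sets it (hypothetical data).
        Map<String, String> fromOldBuild = Map.of("name", "Ada");
        Map<String, String> fromNewBuild = Map.of("name", "Ada", "nickname", "ada");

        // v1.0.1 migration: the column is nullable, so old builds keep working.
        System.out.println(storedValue(fromOldBuild, "nickname", false, null)); // null

        // v1.0.2 migration: NOT NULL is safe once every instance sets the column.
        System.out.println(storedValue(fromNewBuild, "nickname", true, "n/a")); // ada

        // NOT NULL without a default while v1.0.0 still runs would reject its inserts.
        try {
            storedValue(fromOldBuild, "nickname", true, null);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this model the constraint only becomes safe after the last v1.0.0 instance is gone, which is why it ships in a separate version.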
Deleting a Field
- Create a new version (v1.0.1) of the app that no longer uses the field in the application logic but does not yet include the migration script that actually deletes the column from the DB.
- Deploy the new version (v1.0.1) to all the instances (no DB migration is run).
- Create a new version (v1.1.0) of the app with a migration script that deletes the column from the DB.
- Deploy the new version (v1.1.0) to at least one instance (to run the migration).
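The ordering of these steps can be checked with a small compatibility model. This is a sketch under assumptions: the `legacy_code` column, the column sets per version, and the class name are hypothetical, and "works" is reduced to "every column the build uses still exists".

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Simplified rollout model for dropping a column; names are illustrative only.
final class DropColumnRollout {

    // Columns each build reads or writes (hypothetical table).
    static final Map<String, Set<String>> COLUMNS_USED = Map.of(
            "1.0.0", Set.of("id", "name", "legacy_code"),
            "1.0.1", Set.of("id", "name"),
            "1.1.0", Set.of("id", "name"));

    // True when every running build only uses columns that exist in the database.
    static boolean allInstancesWork(List<String> runningBuilds, Set<String> columnsInDb) {
        return runningBuilds.stream()
                .allMatch(build -> columnsInDb.containsAll(COLUMNS_USED.get(build)));
    }

    public static void main(String[] args) {
        Set<String> withColumn = Set.of("id", "name", "legacy_code");
        Set<String> withoutColumn = Set.of("id", "name");

        // Rolling out v1.0.1: the column is still there, so the mixed fleet works.
        System.out.println(allInstancesWork(List.of("1.0.0", "1.0.1"), withColumn));    // true

        // Rolling out v1.1.0 after everyone is on v1.0.1: dropping the column is safe.
        System.out.println(allInstancesWork(List.of("1.0.1", "1.1.0"), withoutColumn)); // true

        // Skipping v1.0.1: the column disappears while v1.0.0 still needs it.
        System.out.println(allInstancesWork(List.of("1.0.0", "1.1.0"), withoutColumn)); // false
    }
}
```

The last case is the failure mode described earlier: the old instance breaks at query time and cannot pass schema validation on restart.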
Changing the Type of the Field or Any Other Complex Operation
- Create a new version (v1.0.1) of the app with a migration script that adds the new field and copies/maps the data from the old column, and make sure that any new data is written to both the new column and the old column (write to both so that old instances can still work with the data; read from the new column to enable the later steps).
- Deploy the new version to all the instances (at this point, the app is ready to stop using the old field).
- Create a new version (v1.0.2) that deprecates the old column by writing only to the new column (final sync of the data).
- Deploy the new version (v1.0.2) to all instances.
- Create a new version (v1.1.0) of the app with a migration script that copies/maps the data from the old column to the new one (final sync of the data) and a migration script that deletes the old column from the DB.
- Deploy the new version (v1.1.0) to at least one instance (to run the migration).
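The write paths of the intermediate versions can be sketched as follows. This is an illustration under assumptions, not the original implementation: a row is modelled as a map, the old column `price` (text) and the new column `price_cents` (integer) are hypothetical, and the backfill only touches rows whose new column is still empty.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Dual-write sketch for a type change; column names and class are hypothetical.
final class DualWriteSketch {

    static long toCents(String price) {
        return new BigDecimal(price).movePointRight(2).longValueExact();
    }

    static String toText(long cents) {
        return BigDecimal.valueOf(cents, 2).toPlainString();
    }

    // Migration step: copy/map old -> new where the new column is still empty.
    static void backfill(Map<String, Object> row) {
        if (row.get("price_cents") == null && row.get("price") != null) {
            row.put("price_cents", toCents((String) row.get("price")));
        }
    }

    // v1.0.1: write both columns so v1.0.0 instances still see current data.
    static void writeV101(Map<String, Object> row, long cents) {
        row.put("price_cents", cents);
        row.put("price", toText(cents));
    }

    // v1.0.2: the old column is deprecated, write only the new one.
    static void writeV102(Map<String, Object> row, long cents) {
        row.put("price_cents", cents);
    }

    // From v1.0.1 on, reads come from the new column only.
    static long read(Map<String, Object> row) {
        return (Long) row.get("price_cents");
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("price", "12.50");         // written by a v1.0.0 instance

        backfill(row);
        System.out.println(read(row));     // 1250

        writeV101(row, 999L);
        System.out.println(row.get("price")); // 9.99

        writeV102(row, 500L);
        System.out.println(read(row));     // 500
    }
}
```

Filling only empty rows is a deliberate choice in this sketch: once v1.0.2 writes to the new column alone, blindly copying the old column over it would overwrite newer data.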
Conclusion
Ensuring interoperability between different versions of your application during rolling updates is crucial for enabling rollouts when more than one instance is involved. Flyway's validation defaults play a key role in supporting this process, enabling forward-compatible database migrations that allow older versions of the application to coexist with newer ones.
Achieving this seamless transition requires careful planning and "additional steps". These extra steps, while adding complexity to the deployment process, are essential for ensuring data consistency, keeping rollback possible, and avoiding breaking changes.
Interoperability is key. I hope this blog post provides valuable insights and practical guidance beyond the simplistic approach of merely deploying multiple instances.