DEV Community

Discussion on: What's Your Approach to Data Migration & Database Versioning?

Collapse
 
rouilj profile image
John P. Rouillard

I'm the current maintainer for the Roundup Issue Tracker. This is a somewhat unique application in that it supports both nosql (kv store) and sql (sqlite, postresql, mariadb/mysql) primary databases. Also the database is designed to be modified by a schema written in Python code.

The user can modify the database at will adding/removing tables and relationships. Roundup code migrates the backend database to the new schema automatically. The current user defined schema is stored in a serialized form in the database. It is up to the tracker admin to track addition/removal of fields using their favorite VCS on the tracker's schema.py file.

The serialized schema also tracks the database version that describes what relationships are defined and what data types are used under the hood. This changes with the Roundup Tracker software release. For example a string() property defined in the schema could be a varchar, blob, or text in the backend.

The CLI for the issue tracker includes a migrate command that changes the schema version when an upgrade is installed. It is responsible for running the alter commands on the db. If the database version does not match the version expected by the software, it will exit with an error.

Next month's 2.3.0 release changed the schema . The underlying type used for storing an expiration timestamp changed to prevent truncation. This bumps the schema version from 7 to 8. A prior release bumped the version as it added indexes that were present only in new installations but not in older upgraded databases.

The supported migration path is one way (upgrade). You can manually downgrade the schema version if you need to go to a prior version of the software. (The tests for the last 3 migrations roll back the migration changes as part of testing the forward migration.)

Data integrity is maintained by never deleting data by default. Instead data is retired, so it's still available for record keeping, auditing etc., but is hidden so it doesn't get reused. You can make changes required for GDPR to remove users while still maintaining the referential integrity of the database but with loss of auditing info.