I would like to introduce a recently started open source project designed for database migration. Why build such a tool when so many database migration tools already exist? Let's dive into the details. We were asked to add new features to a legacy project, and once we got into it we realized that the database structure was poorly designed, with complicated issues across its tables. We decided to redesign the database structure while keeping the data as it was.
We looked at many tools to solve this problem, but they all operated in roughly the same way: read the old database structure, convert the data types, change the structure, and transfer it to another database. We had already designed the new database structure, and our main goal was to keep the data. We needed to take data from different tables and move it into new tables. Of course, another option would have been to dump the old database, change everything in the dump, and import it into the new database. Fortunately we did not apply that technique; instead, we decided to build a tool that does all of this for us. That is where the story of MadMigration begins.
MadMigration is written entirely in Python, building on the power of the SQLAlchemy library. In this tutorial, we will explain the details of how to use MadMigration.
Installation
pip install madmigration
or
pip3 install -U madmigration
You can check the installed version:
madmigration --version
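Depending on your source and destination databases, you may also need the matching SQLAlchemy driver packages. A minimal sketch for the MySQL and PostgreSQL URIs used later in this post (the package choice is an assumption, not a MadMigration requirement):
# Drivers for the MySQL and PostgreSQL URIs in this tutorial (assumed; install the ones you need)
pip install mysqlclient psycopg2-binary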
Let's get started. For example, suppose you have an old MySQL database table named user:
-- MySQL
CREATE TABLE `user` (
  `id` int NOT NULL AUTO_INCREMENT,
  `fname` varchar(10) NOT NULL,
  `lname` varchar(10) NOT NULL,
  `nick` varchar(10) NOT NULL,
  `email` varchar(15) NOT NULL,
  `pass` varchar(15) NOT NULL,
  `updated` TIMESTAMP NULL,
  `created` TIMESTAMP NOT NULL,
  `companyID` INT,
  PRIMARY KEY (`id`),
  FOREIGN KEY (`companyID`)
    REFERENCES `company` (`id`)
    ON DELETE SET NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
And you want to migrate this table to a new PostgreSQL table with different column names:
-- PostgreSQL
CREATE TABLE public."User" (
  created_at timestamp NOT NULL,
  updated_at timestamp NULL,
  id serial NOT NULL,
  "name" varchar NOT NULL,
  surname varchar NOT NULL,
  nickname varchar NOT NULL,
  email varchar NOT NULL,
  "password" varchar NOT NULL,
  company_id integer REFERENCES company (id)
    ON DELETE SET NULL,
  CONSTRAINT "User_email_key" UNIQUE (email),
  CONSTRAINT "User_pkey" PRIMARY KEY (id)
);
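Both schemas reference a company table through their foreign keys. For completeness, here is a minimal sketch of such a parent table on the PostgreSQL side (the exact columns are an assumption, not part of the original schemas):
-- Hypothetical parent table assumed by the foreign keys above
CREATE TABLE company (
  id serial PRIMARY KEY,
  name varchar NOT NULL
);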
Then create a YAML file, schema.yaml, for declaring the tables and columns. At the first stage, add the Configs section, which is responsible for connecting to the databases. The Configs section has two subsections, SourceConfig and DestinationConfig:
version: 0.1.8
Configs:
  - SourceConfig:
      dbURI: "mysql://root:admin@127.0.0.1/old"
  - DestinationConfig:
      dbURI: "postgresql://root:admin@127.0.0.1/new"
The migrationTables section is intended to describe information about the tables:
migrationTables:
  - migrationTable:
      SourceTable:
        name: user
      DestinationTable:
        name: User
        create: true
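Since migrationTables is a list, a single file can describe several tables at once; a sketch (the second table here is purely illustrative):
migrationTables:
  - migrationTable:
      SourceTable:
        name: user
      DestinationTable:
        name: User
        create: true
  - migrationTable:
      SourceTable:
        name: company
      DestinationTable:
        name: company
        create: true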
Next, write information about the columns in the MigrationColumns section:
MigrationColumns:
  - sourceColumn:
      name: id
    destinationColumn:
      name: id
      options:
        primary_key: true
        type_cast: bigint
To create a foreign key, specify it in the column options by defining foreign_key options such as ondelete and onupdate. The table_name parameter is a reference to the target table, and column_name is the name of the referenced column.
- sourceColumn:
    name: companyID
  destinationColumn:
    name: company_id
    options:
      type_cast: int
      nullable: true
      foreign_key:
        table_name: company
        column_name: id
        ondelete: "SET NULL"
No options are added in the sourceColumn section; in the destinationColumn section, however, we can use the primary_key, nullable, default, index, unique, autoincrement, foreign_key, length, and type_cast options.
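For instance, a destinationColumn might combine several of these options at once; a sketch (the values here are illustrative):
- sourceColumn:
    name: nick
  destinationColumn:
    name: nickname
    options:
      type_cast: string
      length: 50
      nullable: false
      unique: true
      index: true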
The final YAML structure will look like this:
version: 0.1.8
Configs:
  - SourceConfig:
      dbURI: "mysql://root:admin@127.0.0.1/old"
  - DestinationConfig:
      dbURI: "postgresql://root:admin@127.0.0.1/new"

migrationTables:
  - migrationTable:
      SourceTable:
        name: user
      DestinationTable:
        name: User
        create: true
      MigrationColumns:
        - sourceColumn:
            name: id
          destinationColumn:
            name: id
            options:
              primary_key: true
              type_cast: bigint
        - sourceColumn:
            name: fname
          destinationColumn:
            name: name
            options:
              type_cast: string
        - sourceColumn:
            name: lname
          destinationColumn:
            name: surname
            options:
              type_cast: string
        - sourceColumn:
            name: email
          destinationColumn:
            name: email
            options:
              type_cast: string
              unique: true
        - sourceColumn:
            name: nick
          destinationColumn:
            name: nickname
            options:
              type_cast: string
        - sourceColumn:
            name: pass
          destinationColumn:
            name: password
            options:
              type_cast: string
        - sourceColumn:
            name: updated
          destinationColumn:
            name: updated_at
            options:
              type_cast: datetime
        - sourceColumn:
            name: created
          destinationColumn:
            name: created_at
            options:
              type_cast: datetime
        - sourceColumn:
            name: companyID
          destinationColumn:
            name: company_id
            options:
              type_cast: int
              nullable: true
              foreign_key:
                table_name: company
                column_name: id
                ondelete: "SET NULL"
In addition, you can migrate your data from an RDBMS to NoSQL. Below is a small example of migrating data from PostgreSQL to MongoDB. Work on PostgreSQL-to-MongoDB support is still in progress, and we will add more features as soon as possible.
version: 0.1.8
Configs:
  - SourceConfig:
      dbURI: "postgresql://root:admin@localhost:5432/olddb"
  - DestinationConfig:
      dbURI: "mongodb://localhost:27017/mydb"

migrationTables:
  - migrationTable:
      SourceTable: # PostgreSQL table name
        name: company
      DestinationTable: # the collection name
        name: Company
      MigrationColumns:
        - sourceColumn:
            name: id
          destinationColumn:
            name: id
            options:
              type_cast: uuid
        - sourceColumn:
            name: name
          destinationColumn:
            name: NAME
            options:
              type_cast: varchar
        - sourceColumn:
            name: created
          destinationColumn:
            name: CREATED
            options:
              type_cast: datetime
        - sourceColumn:
            name: email
          destinationColumn:
            name: EMAIL
            options:
              type_cast: string
        - sourceColumn:
            name: updated
          destinationColumn:
            name: UPDATED
            options:
              type_cast: datetime
        - sourceColumn:
            name: code
          destinationColumn:
            name: CODE
            options:
              type_cast: string
Upon finishing the declaration of the YAML file, run the following command:
madmigration -f schema.yaml
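For the MongoDB example above, you can then quickly check that the documents landed in the new collection. A minimal sketch, assuming the mongosh shell is installed, using the database and collection names from the config:
# Inspect one migrated document in the Company collection
mongosh "mongodb://localhost:27017/mydb" --eval 'db.Company.findOne()'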
Alternatively, you can split the migration logic across multiple .yaml or .json files. Importantly, there should be one main .yaml file, and the other files should be imported into it.
The main.yaml file:
version: 1.1
Configs:
  - SourceConfig:
      dbURI: "mysql://root:admin@127.0.0.1/old"
  - DestinationConfig:
      dbURI: "postgresql://root:admin@127.0.0.1/new"

migrationTables:
  - migrationTable: !import company.yaml
  - migrationTable: !import op_conds.json
The company.yaml file:
SourceTable:
  name: company
DestinationTable:
  name: company
  create: true
MigrationColumns:
  - sourceColumn:
      name: id
    destinationColumn:
      name: id
      options:
        primary_key: true
        type_cast: uuid
  - sourceColumn:
      name: name
    destinationColumn:
      name: name
      options:
        length: 120
        type_cast: varchar
        nullable: false
  - sourceColumn:
      name: created
    destinationColumn:
      name: created
      options:
        type_cast: datetime
  - sourceColumn:
      name: updated
    destinationColumn:
      name: updated
      options:
        type_cast: datetime
        nullable: true
The op_conds.json file:
{
  "SourceTable": {
    "name": "operation_conditions"
  },
  "DestinationTable": {
    "name": "operation_conditions",
    "create": true
  },
  "MigrationColumns": [
    {
      "sourceColumn": {
        "name": "id"
      },
      "destinationColumn": {
        "name": "id",
        "options": {
          "primary_key": true,
          "type_cast": "uuid"
        }
      }
    },
    {
      "sourceColumn": {
        "name": "interest"
      },
      "destinationColumn": {
        "name": "interest",
        "options": {
          "type_cast": "varchar",
          "length": 30,
          "nullable": false
        }
      }
    },
    {
      "sourceColumn": {
        "name": "FIFD"
      },
      "destinationColumn": {
        "name": "FIFD",
        "options": {
          "type_cast": "varchar",
          "length": 30,
          "nullable": false
        }
      }
    },
    {
      "sourceColumn": {
        "name": "comission"
      },
      "destinationColumn": {
        "name": "comission",
        "options": {
          "type_cast": "varchar",
          "length": 30,
          "nullable": false
        }
      }
    }
  ]
}
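The multi-file setup is run the same way, pointing the tool at the main file:
madmigration -f main.yaml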
We have not yet fully implemented all functionality for SQL and NoSQL databases, so we are still actively developing the tool and adding new features.
Thanks to our team for their amazing work implementing this tool.