Integrate Database Migration Using Flyway Into CI/CD

#database #migration #flyway #cicd

Practically, Database Migration is changing the structure or data of a database, which includes some activities on the database, like create a table, update or insert data to a table.

Without DB migration, if it is not establihsing new datbase, we need to consider whether the script has been run on the target DB or not. In case we accidentally re-run the script, it will causes duplicate records on the table for inserting data. That is one of reasons for manual deployment.

With the development of CI/CD, there are a lot of great DB migration tools developed. In this post, I will use Flyway for CircleCI pipeline to deploy on Heroku, you can find their configurations in my last article.

Ideas

We may use different platforms for DB migration, but overall they are just around the idea of generating the connection configuration from the connection string. In this session, I will realize that idea by using flyway on Heroku PostgreSQL

Get the DB connection string using platform CLI: For Heroku, it is heroku config:get DATABASE_URL -a <APP_NAME>, then the console will print out postgres://DB_USER:DB_PASSWORD@DB_HOST:5432/DB_NAME.
Extract DB user, password, host, name from connection string: We may any technique to get that info out, in this article, I used sed, cut.
Generate a connection config for DB migration tool: Depending on the DB migration tool, the config formats may be different. In this case, I created a config file like below. You may wonder about the locations, and table, they are SQL scripts directory for the former, and table for version control for the later.

flyway.url=jdbc:postgresql://DB_HOST/DB_NAME
flyway.locations=SQL_FILE_LOCATION
flyway.user=DB_USER
flyway.password=DB_PASSWORD
flyway.table=schema_history
flyway.outOfOrder=true

Call DB migration tool CLI: If we use flyway, the command will be flyway -configFiles=<CONFIG_FILE_DIRECTORY> migrate
Combine all things into a bash file

# Extract data from DB connection
export CONNECTION_URL="$(heroku config:get DATABASE_URL -a ${HEROKU_APP_NAME})"
export SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE}" )" && pwd)"
export SQL_FILE_LOCATION="filesystem:/${SCRIPT_DIR}/migrations/scripts/sql"
export CONFIG_DIR="${SCRIPT_DIR}/migrations/config"
export DB_NAME="$(echo $CONNECTION_URL | sed "s|.*:5432/||")"
export DB_USER="$(echo $CONNECTION_URL | cut -d ":" -f 2 | sed "s|.*//||")"
export DB_PASSWORD="$(echo $CONNECTION_URL | cut -d ":" -f 3 | cut -d "@" -f 1)"
export DB_HOST="$(echo $CONNECTION_URL | cut -d ":" -f 3 | cut -d "@" -f 2)"

# Generate config file
sed -i -e "s|DB_HOST|${DB_HOST}|g" ${CONFIG_DIR}/flyway.conf
sed -i -e "s|DB_NAME|${DB_NAME}|g" ${CONFIG_DIR}/flyway.conf
sed -i -e "s|DB_USER|${DB_USER}|g" ${CONFIG_DIR}/flyway.conf
sed -i -e "s|DB_PASSWORD|${DB_PASSWORD}|g" ${CONFIG_DIR}/flyway.conf
sed -i -e "s|SQL_FILE_LOCATION|${SQL_FILE_LOCATION}|g" ${CONFIG_DIR}/flyway.conf

# Run flyway
flyway -configFiles=${CONFIG_DIR}/flyway.conf migrate

Note: This file should be run by the pipeline, but not manually. Because the information in the config file will change and lead to wrong information for the next run.

Integrate DB migration to CI/CD

Let re-use our CI/CD pipeline from the last post.
CirlceCI supports post-steps, which is the last part of the job. If post-step is failed, the whole job will be marked as failed.
Because CircleCI doesn't support flyway, we need to install it independently by command line. After installing flyway, we are good to call our bash file above to start the migration process.

The pipeline config now will look like this.

orbs:
  node: circleci/node@4.1.0
  heroku: circleci/heroku@1.2.6

version: 2.1

commands:
  deploy-command:
    parameters:
      BRANCH:
        type: string
      HEROKU_APP_NAME:
        type: string
    steps:
      - run: heroku config:set YARN_PRODUCTION=false -a <<parameters.HEROKU_APP_NAME>>
      - heroku/deploy-via-git:
          app-name: <<parameters.HEROKU_APP_NAME>>
          branch: <<parameters.BRANCH>>

jobs:
  test:
    executor: node/default
    steps:
      - checkout
      - node/install-packages:
          cache-path: ~/project/node_modules
          override-ci-command: npm install
      - run: npm test

  deploy:
    executor: heroku/default
    steps:
      - checkout
      - heroku/install

      - when:
          condition:
            equal: [master, << pipeline.git.branch >>]
          steps:
            - deploy-command:
                HEROKU_APP_NAME: production-server
                BRANCH: master

      - when:
          condition:
            equal: [staging, << pipeline.git.branch >>]
          steps:
            - deploy-command:
                HEROKU_APP_NAME: staging-server
                BRANCH: staging

      - when:
          condition:
            equal: [develop, << pipeline.git.branch >>]
          steps:
            - deploy-command:
                HEROKU_APP_NAME: develop-server
                BRANCH: develop

workflows:
  heroku_deploy:
    jobs:
      - test

      - deploy:
          requires:
            - test
          filters:
            branches:
              only:
                - master
                - develop
                - staging
          post-steps:
            - run: 
                name: install flyway
                command: wget -qO- https://repo1.maven.org/maven2/org/flywaydb/flyway-commandline/7.14.0/flyway-commandline-7.14.0-linux-x64.tar.gz | tar xvz && sudo ln -s `pwd`/flyway-7.14.0/flyway /usr/local/bin
            - run: 
                name: run migration scripts
                command: sh migrations/scripts/migrations-run.sh
                environment:
                  GIT_BRANCH: << pipeline.git.branch >>

In this new pipeline, you can see that I have the environment part, which is used to declare environment variables. We have totally 3 environments (develop, staging, production), so, with that variables, we can change APP_NAME depending on GIT_BRANCH. By adding the below condition at the very first of the bash script, the script will get the appropriate DB info for each environment.

if [ "$GIT_BRANCH" = "master" ]; then
  export HEROKU_APP_NAME=production-server
elif [ "$GIT_BRANCH" = "staging" ]; then
  export HEROKU_APP_NAME=staging-server
else
  export HEROKU_APP_NAME=develop-server
fi

# Extract data from DB connection ...

Paypal.

I am really happy to receive your feedback on this article. Thanks for your precious time reading this.

Top comments (2)

Arthur Yip • May 10 '22

Why do you run the migration after deployment? Shouldn't you run it before deployment? If you deploy some application code dependent on the migration, then between the deployment and the migration, there will be (at least) some brief moment when the code doesn't work as it depends on the migration, right?

Dantis Mai • May 11 '22

For saving cost purposes, in commercial projects, when the deployment step fails, the cloud provider (ex AWS) will revert to the previous version of the infrastructure, which is also the basic feature for any cloud provider. Difference from reverting the development, reverting in migration tool, this case is Flyway, is a paid feature (this is free for a side project). So I let the deployment step run first to make it fail fast when issues, but if the migration step is failed, the pipeline still marks the run as a failure.
Before releasing any feature, it must be tested on staging env, which is similar to prod env (dev.to/maithanhdanh/ci-cd-pipeline...). In case it is good to go, it means the code also works with the DB data perfectly, then we will deploy the feature on prod env. The quality engineers will take the responsibility to make sure the feature behaves the same as the business requirement.