Blaž Šelih

Posted on Mar 29, 2018

(Slightly more advanced) deployment with git

#git #python #devops #beginners

Deploying can sometimes be a pain and go wrong

As most developers can confirm, deploying and updating a basic web application isn't really rocket science. Transfer the updated files to the server, maybe do some local edits, restart and hope for the best.
This naive approach may be good enough in many cases, but once database structure updates and dependency changes are involved, things can get messy quickly.
Besides, keeping the development, testing and production database structures in sync by manually editing and executing a bunch of SQL files is tedious and error-prone. The same goes for manually managing packages and dependencies.

Since we should be using a version control system anyway, we'll try to leverage git to perform some of the above tasks.

Git (hooks) to the rescue

There are plenty of "Deploy your app using git" articles on the internet. Most of them show how to use the post-receive hook to dump the files from the repository to the project folder and restart the server.
In this post, I will attempt to expand on that and show how to use shell scripts to somewhat automate database and package upgrades. I will also show how to keep a simple deployment log to provide a basic rollback capability.
This is all pretty basic stuff but it is a first step to more complex systems like continuous integration (CI) and delivery (CD) solutions. Hopefully, even this simple setup will make your life as a developer a bit easier.

As an example, we will be deploying a Python (Flask) application to a WebFaction server. Pip is used to manage the dependencies in the virtual environment and Flask-Migrate to handle the SQLAlchemy database migrations.
The application is structured following Miguel Grinberg's excellent Flask Web Development book and The Flask Mega-Tutorial.
Similar principles should apply no matter the programming language, frameworks and libraries used.

Some basic familiarity with git and shell scripting is assumed. I will not go into every detail and code fragments are here for illustration only. Examples of pre-receive and post-receive hooks are at the bottom for reference.

Local setup and workflow

There is not much to set up on the client side, apart from adding a git remote once the remote repository has been created (see below).

Once we are done developing and testing the new amazing feature, it is time to deploy it. We will have to:

Activate the virtual environment.
Save the current package configuration into a file by executing pip freeze > requirements.txt.
Since pip does not really do complete dependency resolution, it is a good idea to review the file and, if required, edit it manually. See the pip manual.
Once this is done, add requirements.txt to the repo.
Add any new database migration files to the repo and commit the changes.
Push to the remote.

Now the server will take over, execute the pre- and post-receive scripts and deploy the app for us.
Happy days!

All of the above steps can be executed as a shell script as well.
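As a rough sketch, the local steps above could be wrapped in a small script like the one below. The names `run` and `release`, the commit message, and the arguments used in the usage note are assumptions made for illustration, not part of the original setup. The manual review of requirements.txt is not automated here. Setting DRY_RUN=1 only prints the commands instead of running them.

```shell
#!/bin/bash
# Hypothetical local release helper; all names here are illustrative assumptions.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: release <venv-dir> <remote> <branch>
release() {
    local venv="$1" remote="$2" branch="$3"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ pip freeze > requirements.txt"
    else
        # Activate the virtual environment and save the package configuration.
        source "$venv/bin/activate" || return 1
        pip freeze > requirements.txt || return 1
    fi

    # Stage requirements.txt and the migration files, commit, then push.
    # Note: git commit fails (and the release stops) if nothing was staged.
    run git add requirements.txt app/migrations/versions || return 1
    run git commit -m "Release" || return 1
    run git push "$remote" "$branch"
}
```

For example, `DRY_RUN=1 release venv production master` would list the commands for a virtual environment in a folder named venv and a remote named production (both hypothetical names), without touching the repository.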

Initial remote setup

First of all, I would strongly suggest you create identical staging and production environments, except for small non-critical apps. Try out everything in the staging environment before deploying to production.
Also, remote setup will differ slightly depending on your hosting provider but the basics should be the same.

SSH into your server and create a bare git repository (git init --bare). Set up SSH keys as usual, then add the newly created repo as a remote. Push the initial configuration to this remote.

Check out the files from the repo to your working directory. It may not be strictly necessary, but I like to keep my working directory and repository in separate folders (bare repo, see above).

Upload files not included in the version control (security sensitive information) and set the required environment variables.

Configure the database, web server, etc. and check that everything works as expected.

This is now the initial configuration.
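The server-side part of this setup can be sketched as two small functions. This is only an illustration: the function names and the example paths in the usage note are hypothetical, and your hosting provider may require a different layout.

```shell
#!/bin/bash
# Hypothetical helpers for the initial server-side setup; names are illustrative.

# Create a bare repository and a separate, empty working directory.
init_deploy_target() {
    local repo="$1" work_tree="$2"
    git init --bare --quiet "$repo" || return 1
    mkdir -p "$work_tree"
}

# Check out a branch from the bare repository into the working directory.
# Run this only after the initial configuration has been pushed.
checkout_to_work_tree() {
    local repo="$1" work_tree="$2" branch="${3:-master}"
    git --git-dir="$repo" --work-tree="$work_tree" checkout -f "$branch"
}
```

On the server, one might call `init_deploy_target ~/repos/myapp.git ~/webapps/myapp` (example paths only). On the client, `git remote add production user@example.com:repos/myapp.git` followed by `git push production master` would then push the initial configuration, after which `checkout_to_work_tree ~/repos/myapp.git ~/webapps/myapp` populates the working directory. The remote name, host and paths are all placeholders.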

Deploying using server side hooks

Once the push reaches the server, we will have to:

Check if the correct branch is being deployed.
Transfer updated files to the server and update the working directory.
Install and/or update new packages/dependencies.
Upgrade the database.
Finally, log the changes and restart the server.

We will use the pre-receive and post-receive hooks to perform these steps.

1) Branch checks

First, we need to make sure that only the correct branch (master) gets deployed.
Both the pre-receive and post-receive hooks receive a list of pushed references on stdin, one per line, in the form old-hash new-hash reference. We can iterate over this information to perform our checks. If the pre-receive hook exits with a non-zero status, none of the pushed references are accepted, so the deployment is aborted:

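#!/bin/bash
# pre-receive
# Each stdin line has the form: old-hash new-hash reference
while read -r OLD NEW REF
do
    # Accept only the master branch (refs/heads/master).
    if [[ "$REF" != "refs/heads/master" ]] ; then
        echo "Pushing non-master branch $REF, aborting."
        exit 1
    fi
done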

Note that in this case the repository contains only the 'deployable' master branch and all other branches are rejected. If you wish to store other branches, you can easily integrate the above logic into the post-receive hook and not use the pre-receive hook at all.
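A minimal sketch of that alternative is shown below. The helper name `select_deploy_ref` is hypothetical. It reads the pushed references from stdin and prints only the line for master, so the rest of a post-receive hook can deploy that branch and ignore the others.

```shell
#!/bin/bash
# Hypothetical helper for a post-receive hook that keeps all branches
# in the repository but deploys only master.

# Reads "old-hash new-hash reference" lines from stdin and prints the
# line for the branch to deploy (default: refs/heads/master), if any.
select_deploy_ref() {
    local deploy_ref="${1:-refs/heads/master}"
    local old new ref
    while read -r old new ref; do
        if [ "$ref" = "$deploy_ref" ]; then
            echo "$old $new $ref"
            return 0
        fi
    done
    return 1    # the deployable branch was not part of this push
}
```

In a post-receive hook, the output could be captured with `LINE=$(select_deploy_ref)` and split with `read OLD NEW REF <<< "$LINE"`. When the function fails, the hook can simply exit 0 without deploying anything.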

2) File transfer

The remote repository is updated when the push completes. To update the working directory, we need to check out the files from the repository using the post-receive hook:

#!/bin/bash
# post-receive
export GIT_WORK_TREE=/home/...        # working directory
git checkout -f master                # master now points at the pushed commit

Forcing the checkout (-f) overwrites any local changes in the working directory.

3) Packages (virtual environment) update

First we need to determine if any update is required at all. Since we are using pip to manage the packages, we can check if the requirements.txt file was updated during the push:

git diff --name-only $OLD $NEW | grep 'requirements.txt'

If requirements.txt was changed, we'll activate the virtual environment and run pip install -qr requirements.txt:

PIP=0
if git diff --name-only "$OLD" "$NEW" | grep -q 'requirements.txt'; then
    source /home/.../venv/bin/activate
    # The hook does not run inside the working directory, so use the full path.
    pip install -qr "$GIT_WORK_TREE/requirements.txt"
    PIP=1
fi

The -q option suppresses pip's rather verbose output but still shows errors. The PIP variable is used later for logging (see below).
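One caveat with a plain grep on the file list: the pattern matches any path that merely contains the text, such as docs/requirements.txt. If that matters in your project, a stricter match could look like the sketch below. The helper name `paths_include` is hypothetical. It succeeds only if a changed path equals the given path exactly or lies inside it when the path is a directory.

```shell
#!/bin/bash
# Hypothetical helper: exact path matching for a list of changed files.

# Reads one path per line from stdin. Succeeds if any of them equals the
# target path exactly, or lies below it when the target is a directory.
paths_include() {
    local target="${1%/}" path
    while IFS= read -r path; do
        case "$path" in
            "$target"|"$target"/*) return 0 ;;
        esac
    done
    return 1
}
```

It would be used in place of grep -q, for example `git diff --name-only "$OLD" "$NEW" | paths_include requirements.txt`. The same helper also works for a directory such as app/migrations/versions.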

4) Database upgrade

Database migration files are stored in app/migrations/versions, so we can use the same git diff trick to detect any changes:

git diff --name-only $OLD $NEW | grep 'app/migrations/versions'

Again, we need to activate the virtual environment and then run manage.py to upgrade the database:

DB=0
if git diff --name-only "$OLD" "$NEW" | grep -q 'app/migrations/versions'; then
    source /home/.../venv/bin/activate
    # manage.py lives in the working directory, not in the bare repository.
    python "$GIT_WORK_TREE/manage.py" db upgrade
    DB=1
fi

5) Log changes and restart the server

We will use the current time, author, old git hash, new git hash as well as packages and database update status to generate a simple CSV log file:

AUTHOR=$(git show "$NEW" -s --format="%an")    # author of the pushed commit
TIME=$(date -Is)                               # ISO 8601 timestamp (GNU date)
echo "$TIME,$AUTHOR,$OLD,$NEW,$PIP,$DB" >> updates.log
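Because the log is plain CSV, an author name that contains a comma would shift the columns. One possible guard, sketched here with a hypothetical helper name `format_log_row`, is to replace commas in the author field before the row is written. The column order follows the echo line above.

```shell
#!/bin/bash
# Hypothetical helper: build one row of the deployment log.

# Arguments: time, author, old hash, new hash, packages flag, database flag.
# Commas in the author name are replaced by spaces so that every row
# keeps exactly six columns.
format_log_row() {
    local time="$1" author="$2" old="$3" new="$4" pip="$5" db="$6"
    author="$(printf '%s' "$author" | tr ',' ' ')"
    echo "$time,$author,$old,$new,$pip,$db"
}
```

In the hook, the echo line would then become `format_log_row "$TIME" "$AUTHOR" "$OLD" "$NEW" "$PIP" "$DB" >> updates.log`.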

Finally, restarting the server should be fairly self-explanatory.

If everything worked as planned, the updated application should now be running.

Rollback

First, note that a complete rollback is sometimes not even possible, because data may be lost while downgrading the database. It is usually easier to deploy a small fix than to do a complete rollback.

In case you still need to revert to the older version, you can use the information from the log, repository and local environment to get things back into working order.

This is best done by manually reversing the steps from deployment:

Try to figure out what went wrong (by far the most important step)!
Activate the virtual environment.
Consult the log and check whether the database was upgraded. Execute manage.py db downgrade if required. This will downgrade the database structure to the previous version. You may lose some data during this step!
Checkout old files from the repository, forcing overwrite.
Check if the packages were upgraded and, if required, downgrade them by executing pip install -r requirements.txt. Fortunately, most packages are backward compatible and this step is usually not needed.
Restart the server and hope for the best.
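To support these steps, the last row of the deployment log can be read back with a small helper such as the one sketched below. The name `last_deploy` and the output format are assumptions for illustration. The row layout (time, author, old hash, new hash, packages flag, database flag) is the one written by the post-receive hook.

```shell
#!/bin/bash
# Hypothetical helper: summarize the most recent deployment from the log.

# Prints the hashes and update flags from the last row of the given log file.
# Fails if the file is missing or empty.
last_deploy() {
    local log_file="$1"
    [ -s "$log_file" ] || return 1
    tail -n 1 "$log_file" | {
        IFS=, read -r time author old new pip db
        echo "previous=$old current=$new packages_changed=$pip database_changed=$db"
    }
}
```

The previous hash is the one to check out when reverting the files, while the two flags indicate whether the database downgrade and the package reinstall are worth considering at all.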

Examples

pre-receive

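#!/bin/bash
# pre-receive
# Each stdin line has the form: old-hash new-hash reference
while read -r OLD NEW REF
do
    # Accept only the master branch (refs/heads/master).
    if [[ "$REF" != "refs/heads/master" ]] ; then
        echo "Pushing non-master branch $REF, aborting."
        exit 1
    fi
done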

post-receive

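#!/bin/bash
# post-receive

ROOT=/home/.../app-folder
VENV=/home/.../virtual-environment-folder
export GIT_WORK_TREE=$ROOT
PIP=0    # set to 1 when packages are upgraded
DB=0     # set to 1 when the database is upgraded

read OLD NEW REF        # only master was accepted, no need to iterate

# Update the working directory, discarding any local changes.
git checkout -f "$NEW"

# Upgrade packages only if requirements.txt changed in this push.
if git diff --name-only "$OLD" "$NEW" | grep -q 'requirements.txt'; then
    PIP=1
    source "$VENV/bin/activate"
    pip install -qr "$ROOT/requirements.txt"
    echo "Packages upgraded"
else
    echo "No package changes detected"
fi

# Upgrade the database only if migration files changed in this push.
if git diff --name-only "$OLD" "$NEW" | grep -q 'app/migrations/versions'; then
    DB=1
    source "$VENV/bin/activate"
    python3 "$ROOT/manage.py" db upgrade
    echo "Database upgraded"
else
    echo "No database change detected"
fi

# Append a CSV row: time, author, old hash, new hash, packages flag, database flag.
AUTHOR=$(git show "$NEW" -s --format="%an")
TIME=$(date -Is)
echo "$TIME,$AUTHOR,$OLD,$NEW,$PIP,$DB" >> "$ROOT/updates.log"

# Restart the server.
"$ROOT/apache2/bin/restart"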
