DEV Community

Chris White
Chris White

Posted on

Implementing Quality Checks In Your Git Workflow With Hooks and pre-commit

Git is a powerful tool to enable developers to organize and share their code. One of the more interesting features is hooks. This allows for execution of scripts at certain points in the git workflow. In this article I'll be showcasing one of the simple git workflow events: pre-commit. I'll also be introducing a tool to help with automation of it, the aptly named pre-commit. Note that the code used for this is the final step of my python beginners series and can be found in a code repository in GitHub. Simply click on the green "Code" button and select "Download ZIP". Then extract it into your preferred directory of choice.

Hooks and Webhooks

To clear up any potential confusion, Git hooks and GitHub webhooks are different entities (though GitHub webhooks most likely is powered by git hooks). Git hooks are specific to the git software package. GitHub webhooks are a feature of the GitHub platform that pushes JSON payloads based on various GitHub related events. The same goes for similar source control service sites such as GitLab

Git Hook Basics

A git hook is essentially a script that is executed at a certain point in the git workflow. The basic rules of hooks are:

  • Set as executable
  • Strip the extension (with the exception of certain windows extensions such as .exe)
  • Located in a .git/hooks directory under the repository parent directory

To see how this looks download the code mentioned in the introduction paragraph and make sure there's no .git directory in it (remove it if there is). Then run git init to initialize the repository:

$ git init
$ git branch -m main
Enter fullscreen mode Exit fullscreen mode

The second command ensures the default branch is main, which is fairly common for new repositories on major code hosting sites. After the git repository has been initialized it's time to look at the contents of .git/hooks:

$ cd .git/hooks
$ ls -1
applypatch-msg.sample
commit-msg.sample
fsmonitor-watchman.sample
post-update.sample
pre-applypatch.sample
pre-commit.sample
pre-merge-commit.sample
prepare-commit-msg.sample
pre-push.sample
pre-rebase.sample
pre-receive.sample
push-to-checkout.sample
update.sample
Enter fullscreen mode Exit fullscreen mode

Each of these has examples. To make them work we'll need to make a copy with .sample removed and ensure it's executable. Let's take a look at the pre-commit.sample contents:

#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by "git commit" with no arguments.  The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-commit".

if git rev-parse --verify HEAD >/dev/null 2>&1
then
        against=HEAD
else
        # Initial commit: diff against an empty tree object
        against=$(git hash-object -t tree /dev/null)
fi

# If you want to allow non-ASCII filenames set this variable to true.
allownonascii=$(git config --type=bool hooks.allownonascii)

# Redirect output to stderr.
exec 1>&2

# Cross platform projects tend to avoid non-ASCII filenames; prevent
# them from being added to the repository. We exploit the fact that the
# printable range starts at the space character and ends with tilde.
if [ "$allownonascii" != "true" ] &&
        # Note that the use of brackets around a tr range is ok here, (it's
        # even required, for portability to Solaris 10's /usr/bin/tr), since
        # the square bracket bytes happen to fall in the designated range.
        test $(git diff --cached --name-only --diff-filter=A -z $against |
          LC_ALL=C tr -d '[ -~]\0' | wc -c) != 0
then
        cat <<\EOF
Error: Attempt to add a non-ASCII file name.

This can cause problems if you want to work with people on other platforms.

To be portable it is advisable to rename the file.

If you know what you are doing you can disable this check using:

  git config hooks.allownonascii true
EOF
        exit 1
fi

# If there are whitespace errors, print the offending file names and fail.
exec git diff-index --check --cached $against --
Enter fullscreen mode Exit fullscreen mode

So this will do a few basic checks against file names and whitespace issues. As one of the comments mentions To enable this hook, rename this file to "pre-commit". We can then just run git commit afterwards to run it:

$ cp pre-commit.sample pre-commit
$ cd ../../
$ git commit
Enter fullscreen mode Exit fullscreen mode

At this point nothing happened as nothing was added. I'll go ahead and add a file with a Japanese name to see what happens:

$ touch テスト
$ git add テスト
$ git commit
$ git commit
Error: Attempt to add a non-ASCII file name.

This can cause problems if you want to work with people on other platforms.

To be portable it is advisable to rename the file.

If you know what you are doing you can disable this check using:

  git config hooks.allownonascii true
Enter fullscreen mode Exit fullscreen mode

As you can see our hook is working and is preventing the non ASCII named file from being committed. I'll go ahead and cleanup the test:

$ git reset テスト
$ rm テスト
Enter fullscreen mode Exit fullscreen mode

Practical Git Hook Usage

So we've seen a very basic example of how hooks work. Now the more practical example would be to use it to run tests on code before committing. Note that I don't recommend this if your project is in the prototyping phase where you may be doing many commits and may not have tests setup due to the frequency of code architecture changes. In this case the project in question has tox setup which runs various python linting and tests. Before running the examples you'll need to ensure pdm is installed and running under python 3.11. Instructions for this can be found in my pdm tutorial, or you can install it on your own. Once installed we'll make sure the necessary packages are available for tox to work with:

$ pdm install
Enter fullscreen mode Exit fullscreen mode

Then I'll edit the .git/hooks/pre-commit file to contain the following:

#!/bin/sh
pdm run tox

if [ $? -ne 0 ]; then
  echo "tox checks failed" >&2
  exit 1
fi
Enter fullscreen mode Exit fullscreen mode

Now when git commit runs:

$ git commit
lint: install_deps> pdm sync --no-self --group testing --group lint
lint: commands[0]> flake8
Enter fullscreen mode Exit fullscreen mode

tox is run to initiate all the checks. As is the code should be in working condition so you'll see a commit screen after the tox run. Go ahead and close it out without putting a message to abort the comment. When there's an issue with something (I went ahead and commented out one of the imports):

$ git commit
lint: install_deps> pdm sync --no-self --group testing --group lint
lint: commands[0]> flake8
<snip>
  lint: FAIL code 1 (4.34=setup[3.69]+cmd[0.65] seconds)
  test: FAIL code 1 (11.80=setup[10.09]+cmd[1.71] seconds)
  docs: OK (14.44=setup[11.79]+cmd[2.65] seconds)
  evaluation failed :( (30.79 seconds)
tox checks failed
Enter fullscreen mode Exit fullscreen mode

A message at the end shows that tox has failed to run and no commit screen is displayed.

Git Hook Automation With pre-commit

Given how useful checking code quality is before committing, there's actually a framework around git hooks called pre-commit. It's somewhat like a mini-GitHub Actions that can be used to run various commands during the pre-commit phase. This is where you normally want most CI/CD like checks. Before continuing you'll want to remove the existing hook or you'll end up having duplicated tool runs. We'll also go ahead and add all the files in the repo so there's something to check:

$ rm .git/hooks/pre-commit
$ git add .
Enter fullscreen mode Exit fullscreen mode

Given how useful pre-commit is across projects I generally recommend installing via pip install --user, making it part of a tooling virtual environment, or using pipx:

$ pip install --user pre-commit
$ pipx install pre-commit
Enter fullscreen mode Exit fullscreen mode

Next we'll need to create a YAML configuration file in the root of our repository called .pre-commit-config.yaml. To make things simple you can bootstrap a basic one via:

$ pre-commit sample-config > .pre-commit-config.yaml
Enter fullscreen mode Exit fullscreen mode

As is the file looks something like this:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-added-large-files
Enter fullscreen mode Exit fullscreen mode

This already has some great entries for basic git related checks. Now we'll add a new entry to cover tox checking:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-added-large-files
-   repo: local
    hooks:
    -   id: tox check
        name: tox-validation
        entry: pdm run tox
        language: system
        types: [python]
        pass_filenames: false
Enter fullscreen mode Exit fullscreen mode

repo: local tells pre-commit that what I'm doing is not a managed action by code someone else wrote. It's completely custom code written by me. The code itself will run pdm run tox in the system environment when it finds python files have been modified. This is one of the more powerful features of pre-commit over the standard git hook. There's more flexibility on when certain commands get run. You don't need to check tests if only the README file was updated. Now to actually have this integrated with our workflow we'll need to run:

$ pre-commit install
pre-commit installed at .git/hooks/pre-commit
Enter fullscreen mode Exit fullscreen mode

The generated file is simply an entry point to the actual pre-commit program which handles the processing of what we want to do:

#!/usr/bin/env bash
# File generated by pre-commit: https://pre-commit.com
# ID: 138fd403232d2ddd5efb44317e38bf03

# start templated
INSTALL_PYTHON=/home/johndoe/.pyenv/versions/py311/bin/python3.11
ARGS=(hook-impl --config=.pre-commit-config.yaml --hook-type=pre-commit)
# end templated

HERE="$(cd "$(dirname "$0")" && pwd)"
ARGS+=(--hook-dir "$HERE" -- "$@")

if [ -x "$INSTALL_PYTHON" ]; then
    exec "$INSTALL_PYTHON" -mpre_commit "${ARGS[@]}"
elif command -v pre-commit > /dev/null; then
    exec pre-commit "${ARGS[@]}"
else
    echo '`pre-commit` not found.  Did you forget to activate your virtualenv?' 1>&2
    exit 1
fi
Enter fullscreen mode Exit fullscreen mode

Now after this I'll add the pre-commit YAML as there's a sanity check to ensure it's part of the repository:

$ git add .pre-commit-config.yaml
Enter fullscreen mode Exit fullscreen mode

This is because the end goal is that anyone who downloads the code can install your pre-commit setup themselves and be able to run the appropriate tests. Now looking at the result:

$ git commit
[INFO] Installing environment for https://github.com/pre-commit/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing README.md

Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
tox-validation...........................................................Failed
Enter fullscreen mode Exit fullscreen mode

So as the file I purposely broke hasn't been fixed the tox-validation stage has failed. It also found an issue with trailing whitespace in my README.md and even fixed it for me. I'll go ahead and add the updated README and fix the commented out import so things are working again:

$ git add README.md
$ vim src/my_pdm_project_cwprogram_test/mymath.py
$ git add src/my_pdm_project_cwprogram_test/mymath.py
$ git commit
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
tox-validation...........................................................Passed
Enter fullscreen mode Exit fullscreen mode

Now that everything has passed I'm given the ability to commit. The output is also much more user friendly and mimics what you'd expect from a CI/CD or test suite run.

Working With Contributed pre-commit Code

In the configuration file you might have noticed:

repo: https://github.com/pre-commit/pre-commit-hooks
Enter fullscreen mode Exit fullscreen mode

The basic functionality of this works similar to a GitHub repo with GitHub Actions code. You can even see the code for end_of_file_fixer as an example:

def fix_file(file_obj: IO[bytes]) -> int:
    # Test for newline at end of file
    # Empty files will throw IOError here
    try:
        file_obj.seek(-1, os.SEEK_END)
    except OSError:
        return 0
    last_character = file_obj.read(1)
    # last_character will be '' for an empty file
    if last_character not in {b'\n', b'\r'} and last_character != b'':
        # Needs this seek for windows, otherwise IOError
        file_obj.seek(0, os.SEEK_END)
        file_obj.write(b'\n')
        return 1
Enter fullscreen mode Exit fullscreen mode

In fact there's actually a fairly sizeable amount of these listed on the pre-commit website. In fact another one is provided by the PDM project to make sure pdm.lock is up to date. I'll go ahead and add this in:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-added-large-files
-   repo: local
    hooks:
    -   id: tox check
        name: tox-validation
        entry: pdm run tox
        language: system
        types: [python]
        pass_filenames: false
-   repo: https://github.com/pdm-project/pdm
    rev: 2.10.4 # a PDM release exposing the hook
    hooks:
    -   id: pdm-lock-check
Enter fullscreen mode Exit fullscreen mode

Now I'll manually edit one of my dependencies in pyproject.toml and check the result:

dependencies = [
    "numpy>=1.25.1",
    "requests>=2.31.0",
]
Enter fullscreen mode Exit fullscreen mode
$ git add pyproject.tmol .pre-commit-config.yaml
$ git commit
[INFO] Initializing environment for https://github.com/pdm-project/pdm.
[INFO] Installing environment for https://github.com/pdm-project/pdm.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
tox-validation...........................................................Passed
pdm-lock-check...........................................................Failed
- hook id: pdm-lock-check
- exit code: 1

Lock file hash doesn't match pyproject.toml, packages may be outdated
Enter fullscreen mode Exit fullscreen mode

In this case it noticed that my numpy dependency has changed. I'll go ahead and revert this and see what happens:

$ vim pyproject.toml
$ git add pyproject.toml
$ git commit
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check for added large files..............................................Passed
tox-validation...........................................................Passed
pdm-lock-check...........................................................Passed
Enter fullscreen mode Exit fullscreen mode

Now everything is looking good and I'm allowed to commit again. Trying to check if the pdm lock file was synced would have taken a substantial amount of time via a normal git hook, primarily due to not having context on the pdm code base. Instead I can use this hook provided by the developers in less than ten minutes to do it instead.

File Filtering

To showcase this best we'll want to commit what we have now so there's better control over what files get added:

$ git commit -m "Initial Commit"
Enter fullscreen mode Exit fullscreen mode

Now since tox can run specific stages we can use this to break out tasks based on what's actually been committed. Considering when we'd want things to run:

  • All files should have the basic large files, trailing whitespace, and end of files check
  • README.md should have a markdown linter run against it
  • pyproject.toml should have a toml linter run against it
  • Linting should be run if files in src or tests are modified
  • Tests should be run if files in src or tests are modified
  • Documentation building should be run if files in src (for automodule generation) or docs are modified
  • Everything should run if pyproject.toml is updated as it controls settings and manages dependencies

So let's see what a solution like this would look like:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
-   repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v3.2.0
    hooks:
    -   id: trailing-whitespace
    -   id: end-of-file-fixer
    -   id: check-yaml
    -   id: check-toml
    -   id: check-added-large-files
-   repo: local
    hooks:
    -   id: tox lint
        name: tox-validation
        entry: pdm run tox -e test,lint
        language: system
        files: ^src\/.+py$|pyproject.toml|^tests\/.+py$
        types_or: [python, toml]
        pass_filenames: false
    -   id: tox docs
        name: tox-docs
        language: system
        entry: pdm run tox -e docs
        types_or: [python, rst, toml]
        files: ^src\/.+py$|pyproject.toml|^docs\/
        pass_filenames: false
-   repo: https://github.com/pdm-project/pdm
    rev: 2.10.4 # a PDM release exposing the hook
    hooks:
    -   id: pdm-lock-check
-   repo: https://github.com/jumanjihouse/pre-commit-hooks
    rev: 3.0.0
    hooks:
    -   id: markdownlint
Enter fullscreen mode Exit fullscreen mode

First off we have check-toml for linting our pyproject.toml file. markdownlint is taken from jumanjihouse/pre-commit-hooks. Note that check-* and the markdown linter generally has code to check if appropriate files were added so there's no need to add filters. Another thing to keep in mind is that it's recommended to pin rev against a specific revision (usually a tag) versus something like master or main. This will reduce "unexpected" surprises due to code changes. Here we see the actual filtering at work:

    -   id: tox lint
        name: tox-validation
        entry: pdm run tox -e test,lint
        language: system
        files: ^src\/.+py$|pyproject.toml|^tests\/.+py$
        types_or: [python, toml]
Enter fullscreen mode Exit fullscreen mode

So files: ^src\/.+py$|pyproject.toml|^tests\/.+py$ is a regular expression to show what files we're interested in. In this case it's files under src/ and tests/ as well as pyproject.toml. types_or (requires 2.9.0 or later pre-commit) also ensures we're only looking at python or toml files. If you're wondering what to put in for types_or the identify-cli tool will let you know appropriate values that can be used:

$ identify-cli docs/source/index.rst
["file", "non-executable", "rst", "text"]
Enter fullscreen mode Exit fullscreen mode

file is the most generic type. You can also be more specific with rst for example. types can be used if you want something that works off AND comparison instead:

        types: [python, executable]
Enter fullscreen mode Exit fullscreen mode

This will run if a file is a python file and executable as well. Now that we've seen how comparisons work, it's time to put this into practice. I'll add our updated pre-commit YAML, modify README.md, and then run git commit:

$ vim README.md #changes here
$ git add .pre-commit-config.yaml README.md
$ git commit
[INFO] Initializing environment for https://github.com/jumanjihouse/pre-commit-hooks.
[INFO] Installing environment for https://github.com/jumanjihouse/pre-commit-hooks.
[INFO] Once installed this environment will be reused.
[INFO] This may take a few minutes...
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check Toml...........................................(no files to check)Skipped
Check for added large files..............................................Passed
tox-validation.......................................(no files to check)Skipped
tox-docs.............................................(no files to check)Skipped
pdm-lock-check.......................................(no files to check)Skipped
Check markdown files.....................................................Passed
Enter fullscreen mode Exit fullscreen mode

Given that no toml or appropriate python files were modified unnecessary checks are skipped according to our setup. The README.md file does have markdownlint run against it and has passed the checks. I'll go ahead and commit the file with the message "Updated README.md title". Now it's time to see what happens when we make changes to docs/:

$ vim docs/source/index.rst
$ git add docs/source/index.rst
$ git commit
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check Toml...........................................(no files to check)Skipped
Check for added large files..............................................Passed
tox-validation.......................................(no files to check)Skipped
tox-docs.................................................................Passed
Enter fullscreen mode Exit fullscreen mode

In this case tox-docs is run but since no python files are present neither linting nor tests were run. Now I'll revert the docs change and modify one of the tests. This should kick off the linting/tests phase but not do anything with the docs building:

$ git reset docs/source/index.rst
$ git checkout docs/source/index.rst
$ vim tests/test_mymath.py
$ git add tests/test_mymath.py
$ git commit
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check Toml...........................................(no files to check)Skipped
Check for added large files..............................................Passed
tox-validation...........................................................Passed
tox-docs.............................................(no files to check)Skipped
pdm-lock-check.......................................(no files to check)Skipped
Check markdown files.................................(no files to check)Skipped
Enter fullscreen mode Exit fullscreen mode

Finally we'll make an update to pyproject.toml which should run everything (I'll also reset the state of the test file so there are no false positives):

$ git reset tests/test_mymath.py
$ git checkout tests/test_mymath.py
$ vim pyproject.toml
$ git add .pre-commit-config.yaml pyproject.toml
$ git commit
Trim Trailing Whitespace.................................................Passed
Fix End of Files.........................................................Passed
Check Yaml...............................................................Passed
Check Toml...............................................................Passed
Check for added large files..............................................Passed
tox-validation...........................................................Passed
tox-docs.................................................................Passed
pdm-lock-check...........................................................Passed
Check markdown files.................................(no files to check)Skipped
Enter fullscreen mode Exit fullscreen mode

Everything has been run. I will say that technically it wasn't necessary to run this as I only made a description update and nothing was changed in the actual dependencies or tool configuration. That means nothing was changed that would require linting/tests/documentation updates. Even so, trivial pyproject.toml changes will generally be very rare after the project is flushed out and you should only really be updating for dependency or tool configuration changes from there.

Conclusion

pre-commit is definitely one of the tools I wish I knew about sooner in my development career. It's a great way to avoid having people frown at your PRs for having 3 different lint fix only commits. Given that it is another roadblock to pushing commits you'll want to make sure your linting passes on the entire project via pre-commit run -a. Otherwise your coworkers will most definitely find a way to bypass it.

If you like what you see I'm am also available for hire. Those interested can find out more info in my dev.to profile.

Top comments (0)