git: removing or replacing sensitive data

Isaac Adams
I code. Sometimes I do other things.
・3 min read

BFG Repo Cleaner is both faster and easier to use than git filter-branch. It can handle most use cases for editing sensitive information in a git log.

Read the installation instructions for BFG (see the series above) before moving ahead. If you already have it installed, move on to the next section.

Introduction

Accidents happen. We have all been there. You committed sensitive data (e.g. passwords, keys, emails, or phone numbers) to the branch. Ok, no problem. Amend the previous commit. Easy. Right? Wrong.

This time you didn't realize the mistake until several weeks later. The commit is now baked deep into the repository, making it hard to fix with plain git commands. A git rebase is too difficult because of the complexity or number of changes made since then. Other methods like git filter-branch might be too slow. Now what?

[Enter BFG Repo Cleaner]

BFG is a tool that cleans git repositories of sensitive data. It is easy to use and faster than git filter-branch. Using BFG is easy because it expects "expressions" as input. An "expression" (see definition) describes what the sensitive data is, how to search for it and how to edit it. BFG expects one or more expressions as input from the command line or from a local file.

Examples

I will refer to the "expressions file" (see definition) in examples. This is the local file that contains a list of expressions separated by newlines.

The creator of BFG gave an example of how to use the tool in a Stack Overflow answer (see credits).

$ bfg --replace-text replacements.txt -fi '*.php' my-repo.git

replacements.txt

PASSWORD1
PASSWORD2==>examplePass
PASSWORD3==>
regex:password=\w+==>password=
regex:\r(\n)==>$1
| expression example | description |
|---|---|
| PASSWORD1 | Replace the literal string 'PASSWORD1' with '***REMOVED***' (the default) |
| PASSWORD2==>examplePass | Replace with 'examplePass' instead |
| PASSWORD3==> | Replace with the empty string |
| regex:password=\w+==>password= | Replace using a regex |
| regex:\r(\n)==>$1 | Replace Windows newlines with Unix newlines |
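
If you would rather create the expressions file from the terminal than from an editor, a here-document is one way to do it. This is only a sketch: the file name matches the example above, while the expr_file and expr_count variables are names assumed for illustration.

```shell
#!/bin/sh
# Write the expressions file from the example above.
# The delimiter is quoted ('EOF') so the shell leaves $1 and the
# backslashes untouched instead of expanding them.
expr_file="replacements.txt"
cat > "$expr_file" <<'EOF'
PASSWORD1
PASSWORD2==>examplePass
PASSWORD3==>
regex:password=\w+==>password=
regex:\r(\n)==>$1
EOF

# Sanity check: count the expressions that were written.
expr_count=$(wc -l < "$expr_file" | tr -d ' ')
echo "wrote $expr_count expressions to $expr_file"
```

Quoting the delimiter matters here: with an unquoted EOF the shell would try to expand $1 before the line ever reached the file.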

Let's unpack every argument in the example command

  • --replace-text replacements.txt
    • --replace-text <path: file location describing replacements>
    • replacements.txt is the path to the expressions file
  • -fi '*.php'
    • -fi <glob: files to search>
    • the "filter content including" (-fi) option tells bfg to only perform its actions "in these files"
    • configured to only perform the replacements in files that end in .php
    • the quotes keep the shell from expanding the glob before bfg sees it
    • remove this option if you want bfg to search through all files
  • my-repo.git
    • <path: folder location of repo>
    • this is the path to the repo you want bfg to rewrite
    • use . (a dot) as the input when invoking bfg from the root of your target repo
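
One detail about the -fi glob is easy to miss: if it is left unquoted, the shell may expand it against the current directory before the program sees it. The sketch below shows the difference without touching a repo. count_args is a hypothetical stand-in for bfg that only reports how many arguments it received, and the two .php files are dummy files created in a temporary directory.

```shell
#!/bin/sh
# Demonstration only: count_args stands in for bfg and reports how many
# arguments it received (hypothetical helper, not part of BFG).
count_args() { echo "$#"; }

# Scratch directory holding two dummy .php files.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.php" "$demo_dir/b.php"

# Unquoted: the shell expands the glob, so the program gets -fi a.php b.php
unquoted=$(cd "$demo_dir" && count_args -fi *.php)
# Quoted: the glob reaches the program unchanged as a single argument
quoted=$(cd "$demo_dir" && count_args -fi '*.php')

echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
# prints "unquoted: 3 arguments, quoted: 2 arguments"

rm -rf "$demo_dir"
```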

Email Use Case

I want to use BFG to replace my old email address, which is hard coded in files in the repository, since my GitHub account now uses a new one.

Important note: email addresses are best stored in configuration files, rather than hard coded into the source.

For this example, I will be using dummy emails.

$ bfg -rt replace-email.txt .

replace-email.txt

old-email@gmail.com==>my.new.email@gmail.com
  • old-email@gmail.com==>my.new.email@gmail.com
    • [left part of the expression]: old-email@gmail.com is the old email I want to replace
    • [middle part of the expression]: ==> indicates to bfg that I want to give it a value for replacing the old email
    • [right part of the expression]: my.new.email@gmail.com is the value that will replace the old email
  • -rt replace-email.txt
    • -rt is the alternative form of --replace-text
    • replace-email.txt is the path to the expressions file
    • invoking bfg from the same directory as the file makes it easier to reference the file
  • .
    • the dot indicates that the repo I am targeting is the current directory from which I am invoking bfg
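
To make the three parts of the expression concrete, here is a small illustration that splits it with POSIX parameter expansion and then writes it to the expressions file. It is not how BFG itself parses the file, and the variable names are assumptions chosen for this sketch.

```shell
#!/bin/sh
# Illustration only: split a literal expression into its two values.
# This is not BFG's own parser.
expression='old-email@gmail.com==>my.new.email@gmail.com'

search=${expression%%"==>"*}      # left part: everything before the first ==>
replacement=${expression#*"==>"}  # right part: everything after the first ==>

echo "search for:   $search"
echo "replace with: $replacement"

# Write the expression to the file that -rt will read.
printf '%s\n' "$expression" > replace-email.txt
```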

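Running BFG rewrites history: affected commits are replaced with new commits, and by default this happens across all branches and tags, not just the current branch. By default BFG also leaves the contents of your latest commit untouched, so remove the sensitive data from the current files and commit that change first. Be very careful when using this tool! A copy of the branch inside the same repo would be rewritten as well, so in case something goes wrong, make a backup copy of the whole repository before you start.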
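
As a rough sketch of a safer run, the helpers below copy the whole repository directory before invoking bfg, then expire the reflog and garbage-collect, which is the cleanup the BFG usage instructions suggest after a rewrite. backup_repo and replace_text are hypothetical helper names, and the .backup suffix is an assumption.

```shell
#!/bin/sh
# Sketch only: backup_repo and replace_text are hypothetical helpers.

# Copy the whole repository directory (including .git) to a sibling
# directory named <repo>.backup and print the backup path.
backup_repo() {
    repo_path=$(cd "$1" && pwd) || return 1
    backup_path="$repo_path.backup"
    if [ -e "$backup_path" ]; then
        echo "backup already exists: $backup_path" >&2
        return 1
    fi
    cp -R "$repo_path" "$backup_path" && echo "$backup_path"
}

# Run the replacement, then expire the reflog and garbage-collect so the
# old commits no longer linger in the local repository.
replace_text() {
    expr_file=$1
    repo=$2
    command -v bfg >/dev/null 2>&1 || {
        echo "bfg not found on PATH" >&2
        return 127
    }
    bfg --replace-text "$expr_file" "$repo" || return 1
    (
        cd "$repo" || exit 1
        git reflog expire --expire=now --all
        git gc --prune=now --aggressive
    )
}

# Example usage, run from the root of the target repo:
#   backup_repo . && replace_text replace-email.txt .
```

If the result looks wrong, the untouched copy in the .backup directory can take the place of the rewritten repo.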

Definitions

| word | description |
|---|---|
| expression | special syntax describing "the what" and "the how" to replace |
| expressions file | a file that contains expressions separated by newlines |

Credits

I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for rewriting files from Git history.

You should carefully follow these steps here: https://rtyley.github.io/bfg-repo-cleaner/#usage - but the core bit is just this: download the BFG's jar (requires Java 7 or above) and run this command:
