Chris Tooley

Merging old repos into a monolithic git repo archive

I needed to archive some old repositories into a monorepo, and of course I gave myself the requirement of maintaining git history in some way. I tried a couple of solutions, but it wasn't until I stumbled upon the git-filter-repo project at https://github.com/newren/git-filter-repo, plus another (badly documented) article that I've since lost, that I was able to figure out how to do this.
[Figure: branch representation]

The basic steps are as follows:

  1. clone each repository to be archived into a fresh directory
  2. rewrite each clone so that its source code sits under a common directory (I chose sources), in its own subdirectory
  3. add each rewritten clone to the monorepo as a git remote
  4. merge the "remote" repo branches into the appropriate monorepo branches
  5. done

Here are the actual steps I ran:

setup

mkdir oldapps
mkdir legacy_tools
cd legacy_tools && git init #or init with github in whatever way suits you

process

cd ../oldapps #setup (and each previous run) leaves you inside legacy_tools
git clone git@old/repo/location/old-repo.git old-repo
cd old-repo
git checkout master #might not be needed, might be main
git filter-repo --to-subdirectory-filter sources/old-repo --tag-rename :old-repo-
# if filter-repo refuses to run because this doesn't look like a fresh clone, add --force:
# git filter-repo --force --to-subdirectory-filter sources/old-repo --tag-rename :old-repo-
cd ../../legacy_tools
git remote add -f local ../oldapps/old-repo/
git merge --allow-unrelated-histories local/master
# maybe also do for develop or other branches as necessary
# git checkout develop
# git merge --allow-unrelated-histories local/develop
git push origin main #assumes an origin remote exists and your branch is named main
git remote rm local

Repeat the above for all repos you want to archive and Bob's your uncle.
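
For more than a handful of repos, the repetition can be wrapped in a function. What follows is only a minimal sketch under a few assumptions: archive_repo is a made-up name, it is run from the directory that contains both oldapps and legacy_tools, and it uses git clone --branch in place of the separate checkout so the clone still looks fresh to filter-repo.

```shell
# usage: archive_repo <clone-url> <name> [branch]
archive_repo() {
  url="$1"; name="$2"; branch="${3:-master}"

  # Fresh clone with the wanted branch already checked out
  git clone --branch "$branch" "$url" "oldapps/$name" || return 1

  # Move everything under sources/<name> and prefix every tag with "<name>-"
  git -C "oldapps/$name" filter-repo \
    --to-subdirectory-filter "sources/$name" \
    --tag-rename ":$name-" || return 1

  # Merge the rewritten history into the monorepo, then drop the temporary remote
  git -C legacy_tools remote add -f local "../oldapps/$name" || return 1
  git -C legacy_tools merge --allow-unrelated-histories --no-edit \
    "local/$branch" || return 1
  git -C legacy_tools remote rm local
}

# Example call, using the placeholder URL from above:
# archive_repo git@old/repo/location/old-repo.git old-repo master
```

The push is left out on purpose so you can inspect legacy_tools before anything leaves your machine.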

breakdown of steps

cd ../oldapps #setup (and each previous run) leaves you inside legacy_tools
git clone git@old/repo/location/old-repo.git old-repo
cd old-repo
git checkout master #might not be needed, might be main
git filter-repo --to-subdirectory-filter sources/old-repo --tag-rename :old-repo-

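This is you going into your "oldapps" directory, making a fresh clone of the old repo, and rewriting its history so that all the code gets placed under sources/old-repo inside that clone. The --tag-rename option prefixes every tag with old-repo- so that tags from different repos don't collide in the monorepo. When the above is run, the directory structure under oldapps looks somewhat like this: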

.
├── old-repo
│   └── sources
│       └── old-repo

So old-repo now contains a sources directory, and the old-repo directory inside that holds all the actual code.
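
Before merging, it can be worth checking that the rewrite really moved everything. This check is an optional extra rather than one of the steps listed earlier, and check_prefix is a hypothetical helper name. It lists every file at HEAD and reports any whose path does not start with the expected prefix.

```shell
# usage: check_prefix <prefix>   (run inside the rewritten clone)
check_prefix() {
  prefix="$1"
  # Paths at HEAD that do NOT start with "<prefix>/"; ideally there are none
  outside=$(git ls-tree -r --name-only HEAD |
    awk -v p="$prefix/" 'index($0, p) != 1')
  if [ -n "$outside" ]; then
    echo "files outside $prefix/:" >&2
    echo "$outside" >&2
    return 1
  fi
  echo "all files are under $prefix/"
}

# Example, from inside oldapps/old-repo:
# check_prefix sources/old-repo
```

This only inspects the tip commit. Since filter-repo rewrites every commit, checking the tip is a quick sanity test rather than proof about the whole history.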

cd ../../legacy_tools
git remote add -f local ../oldapps/old-repo/

Next we go back to the legacy_tools repository (the monorepo in which we're archiving all our old repos) and add the local repository we just modified as a remote named local. The -f flag fetches from it right away.

git merge --allow-unrelated-histories local/master

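The above then merges the old repo's master branch (or main; our code was so old the main branch was called master) into whatever branch you currently have checked out in legacy_tools. The --allow-unrelated-histories flag is needed because the two repos share no common commits. I also suggest making new branches that correspond to the names of the old repos and keeping those around too.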
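
One way to read that branch suggestion, sketched here as an assumption about what it looks like in practice: create a branch named after the old repo that points at its rewritten history, before the local remote is removed. keep_branch is a hypothetical helper name.

```shell
# usage: keep_branch <name> [branch]
# Run inside legacy_tools, after `git remote add -f local ...`
# and before `git remote rm local`.
keep_branch() {
  # --no-track: the "local" remote is about to go away, so don't track it
  git branch --no-track "$1" "local/${2:-master}"
}

# Example:
# keep_branch old-repo master
# git push origin old-repo   # only if you have an origin
```

The branch keeps the old repo's history reachable on its own, separate from the merged main line.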

The last steps are just pushing to origin, if you have one, and removing the local remote.

Why did I do this? Dunno man, it was asked of me and I found it interesting.

caveats

This merges all the historical git commits from every repo into one shared history, so git log shows them interleaved. If that doesn't work for you, then you should really question how you could keep the history without merging the commits (hint: it's not possible!). I was told I could archive without git history, so I wasn't particularly bent out of shape. I have the logs, so who cares whether the history is mixed?
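
If the mixed history does get in the way, limiting git log to one repo's directory gives a per-repo view. A small sketch, assuming the sources/<name> layout used above; repo_log is a hypothetical helper name.

```shell
# usage: repo_log <name>   (run inside legacy_tools)
repo_log() {
  # The pathspec limits the log to commits that touched this repo's directory
  git log --oneline -- "sources/$1"
}

# Example:
# repo_log old-repo
```

Because every archived repo lives in its own subdirectory, the path filter is enough to separate the histories again when reading them.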

This also won't work great if you plan to keep merging code from branch to branch in legacy_tools but... Why are you archiving the code then?

I'm not going to say I'm a git expert, but this seems to have done the job for me! Hope this helps someone in the future.
