While working on a multitude of open source projects I faced an interesting Git puzzle. How can you move a file between Git repos?
The easy solution is to forget about Git history. If you would like to preserve it, the situation is less straightforward. Below I explain the problem, your options, and how to apply them.
The problem
Let’s assume you work on forkX, which is a fork of repo projectX. In forkX you are collaborating with your colleagues. At the same time, projectX is also moving forward. ForkX includes files with a long history of commits that are not in projectX. You would like to push some of those files to projectX.
Your current situation looks like this:
What do you do?
Option A: Delete history
The easiest option is to push the files without preserving history.
Using the names from the example above, add the relevant remotes and fetch their current state:
git remote add projectX git@github.com:someOrg/projectX
git remote add forkX git@github.com:myOrg/forkX
git fetch --all
Now, create a clean branch that is identical to projectX/master:
git checkout projectX/master
git checkout -b cleanBranch
Then, add your new files:
git checkout forkX/newFeature -- file_1
git add file_1
You can repeat for more files and commit:
git commit -m "Add new feature"
You are now ready to open a pull request. If you have the proper rights, you can just push with:
git push -u projectX cleanBranch
Other than the “Add new feature” commit, the pushed files will have no history upstream. The person who pushed will appear as the sole contributor.
This option is simple but will not work well when many people collaborate on the same files. You will end up deleting all contributors and every commit message along the way.
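To see the effect without touching a real repo, here is a minimal sandbox sketch of Option A. It builds throwaway local stand-ins for projectX and forkX in a temp directory; the author names (Alice, Bob, Demo) and the file contents are made up for illustration.

```shell
#!/bin/sh
# Sandbox sketch of Option A. Everything lives in a throwaway temp directory;
# the names Alice, Bob and Demo are hypothetical.
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"

# Identity used for every commit unless overridden on a single command.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the upstream repo (projectX) with a master branch.
git init -q projectX
git -C projectX symbolic-ref HEAD refs/heads/master
echo "upstream" > projectX/README
git -C projectX add README
git -C projectX commit -q -m "Initial commit"

# Stand-in for the fork: two people work on file_1 on a feature branch.
git clone -q projectX forkX
git -C forkX checkout -q -b newFeature
echo "v1" > forkX/file_1
git -C forkX add file_1
GIT_AUTHOR_NAME="Alice" git -C forkX commit -q -m "Add file_1"
echo "v2" > forkX/file_1
GIT_AUTHOR_NAME="Bob" git -C forkX commit -q -am "Improve file_1"

# Working clone with both remotes, following the steps above.
git init -q work
cd work
git remote add projectX "$sandbox/projectX"
git remote add forkX "$sandbox/forkX"
git fetch -q --all
git checkout -q projectX/master
git checkout -q -b cleanBranch

# Copy the file over and commit it. Checking out a path from another
# branch also stages it, so no extra "git add" is needed here.
git checkout forkX/newFeature -- file_1
git commit -q -m "Add new feature"

# History of file_1 on cleanBranch: one commit, one author.
git log --format='%an: %s' -- file_1
```

In this sandbox the final log should print a single "Demo: Add new feature" line: the commits by Alice and Bob stay behind in the fork.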
Option B: Preserve history
Preserving history is more complicated.
As in the previous option, add the remotes, fetch their current state, and move to a clean branch that is identical to projectX/master:
git remote add projectX git@github.com:someOrg/projectX
git remote add forkX git@github.com:myOrg/forkX
git fetch --all
git checkout projectX/master
git checkout -b cleanBranch
Then start a merge without committing it:
git merge --no-ff --no-commit forkX/newFeature
The command above will stop the merge with all the files staged and ready to be committed. For every file except the new ones, run:
git reset filePath
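Be careful NOT to reset everything and then stage the new files again. A git reset without a path also clears the in-progress merge, so the commit that follows would be an ordinary commit and the history would not come along. When the new files are the only ones staged, commit with: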
git commit -m "Add new feature"
Finally, discard the leftover unstaged changes and untracked files with:
git stash && git stash drop
git clean -f
You are now ready to open a pull request. If you have the proper rights, you can just push with:
git push -u projectX cleanBranch
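The steps of Option B can be tried end to end in a sandbox. The sketch below is only an illustration under assumptions: the repos are local throwaway stand-ins, the author names and file contents are made up, and the loop that unstages everything except file_1 is one possible way to script the per-file git reset, not something the workflow requires.

```shell
#!/bin/sh
# Sandbox sketch of Option B. Everything lives in a throwaway temp directory;
# the names Alice, Bob and Demo are hypothetical.
set -eu

sandbox=$(mktemp -d)
cd "$sandbox"

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the upstream repo (projectX).
git init -q projectX
git -C projectX symbolic-ref HEAD refs/heads/master
echo "upstream" > projectX/README
git -C projectX add README
git -C projectX commit -q -m "Initial commit"

# Stand-in for the fork: history on file_1 plus changes that must stay behind.
git clone -q projectX forkX
git -C forkX checkout -q -b newFeature
echo "v1" > forkX/file_1
git -C forkX add file_1
GIT_AUTHOR_NAME="Alice" git -C forkX commit -q -m "Add file_1"
echo "v2" > forkX/file_1
GIT_AUTHOR_NAME="Bob" git -C forkX commit -q -am "Improve file_1"
echo "fork tweak" >> forkX/README
echo "internal" > forkX/fork_only.txt
git -C forkX add README fork_only.txt
git -C forkX commit -q -m "Fork-only changes"

# Meanwhile, upstream moves forward too.
echo "1.0" > projectX/CHANGELOG
git -C projectX add CHANGELOG
git -C projectX commit -q -m "Add changelog"

# Working clone with both remotes and a clean branch.
git init -q work
cd work
git remote add projectX "$sandbox/projectX"
git remote add forkX "$sandbox/forkX"
git fetch -q --all
git checkout -q projectX/master
git checkout -q -b cleanBranch

# Start the merge but stop before committing.
git merge -q --no-ff --no-commit forkX/newFeature

# Unstage every staged path except the new file, one path at a time,
# so the in-progress merge is kept.
git diff --cached --name-only | while IFS= read -r path; do
    [ "$path" = "file_1" ] || git reset -q -- "$path"
done

git commit -q -m "Add new feature"

# Discard what was left unstaged and untracked.
git stash -q && git stash drop -q
git clean -f -q

# History of file_1 now includes the fork's commits and authors.
git log --format='%an: %s' -- file_1
```

In this sandbox the final log should list "Bob: Improve file_1" followed by "Alice: Add file_1", while README keeps its upstream content and fork_only.txt is gone.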
Example
At Mist.io we rely on Apache Libcloud to interface with the APIs of many cloud providers. We maintain a fork of Libcloud to better control Mist’s dependencies and to develop new features before we push them upstream.
Until recently, we were maintaining a driver for vSphere only in our fork. The driver was big, complicated and introduced new dependencies so we refrained from pushing it upstream. When we felt confident with the code we decided to open a pull request.
The bulk of the new code was in a few files that didn’t exist upstream. However, the work on these files was done over a long period of time by several people on our team. For this reason, we wanted to preserve the history and ended up using option B above.
Here is an example of what the same pull request looks like when pushed without history and with history.
Conclusion
In this post I went over how you can move files between Git repos. I showed two options: a simple one that deletes history and a more complicated one that preserves it. I illustrated the two options with an example from Mist’s Libcloud fork and the upstream Libcloud repo.
I would love to hear your thoughts in the comments!