I picked up another git nugget this week. The TL;DR is that you can see what files were included in a commit with:
git log --name-only
If you'd like the whole story, I'll need to set the scene a bit before I get into the good stuff. When I need to reset my database on this project, its not repopulated with a seed file, but instead we use a copy of the production database. This allows us to have the most accurate data to query. Because of this, and how I setup my commands through docker, I'll temporarily have a copy of the production database in my local repo, usually called
This file shows up as an untracked file in git, and every once and a while my muscle memory gets the best of me, I run a
git add . by accident and that
prod_dump file gets included in the commit. It's happened, I'll reset, remove it, re-commit, and move on with my day. But on this occurrence it wasn't that simple. I was working through some PR comments when it happened and had run a fixup on that particular commit, squashing it into another commit. By the time I realized what I had done, I didn't know which commit contained the addition of
Its a big file, ~500 MB. What clued me into what had happened was that I got an error when I tried to push up the reworked commits:
File prod_dump is 553.85 MB; this exceeds GitHub's file size limit of 100.00 MB
I had run fixup on four or five files in one go, and I couldn't think of an obvious way to trace it back to which commit contained the huge file. I also couldn't utilize Github, because now the changes were too big to push up. My first attempt was to simply delete the now added
prod_dump file, and commit that. I hoped that adding it then removing it would be perceived by git as it never actually being added in the first place. I was wrong, but not surprised. Git is still going to track that the file was added and then deleted, meaning that it would push to Github with the file, and be rejected.
git log, but it only showed the commit, author, date, and message. For some reason I went to VSCode instead of a search engine and started looking around. I don't usually use the VSCode interface for git, but I thought it might have something useful. I right-clicked on the
prod_dump file and saw an option labeled
Clicking on that did pretty much exactly what I expected, showing the git history of that file. Knowing which commit contained the file, I could take the commit I added with the deleted file, squash it into the commit where the file was added, and now as far as git was concerned, the adding of
prod_dump never happened and I could push the changes to Github.
Having solved the problem, but not liking the idea that I relied on a GUI to do it, I turned to the search engine. I was able to find the command in short order:
git log --name-only
This command presents a log, like before, but now includes the name of the files included in the commit, here it is in the docs.
The exact commits and PR this happened with have long since been merged in, but I setup a quick example so I could walk through it again with some visuals.
commit e1efa840200dd7b2f56486e0b31ef4d4958c310f (HEAD -> nw/testing-commits) Author: Noel Worden <email@example.com> Date: Fri Jul 24 12:33:48 2020 -0600 fourth commit, with test_file_c.md test_file_c.md commit 64bf0ba2354f9b7ca228399d0357b747d3666d4b Author: Noel Worden <firstname.lastname@example.org> Date: Fri Jul 24 12:33:20 2020 -0600 third commit, with test_file_b.md test_file_b.md commit 603a6dc4334f68b13313e73c6ef024eb1ffdaaf7 Author: Noel Worden <email@example.com> Date: Fri Jul 24 12:33:05 2020 -0600 second commit, prod_dump prod_dump commit 10dad7c90038d182f922f32101eb982b916b5060 Author: Noel Worden <firstname.lastname@example.org> Date: Fri Jul 24 12:32:16 2020 -0600 first commit, with test_file_a.md test_file_a.md
In the above example, running
git log --name-only there are four commits, all committing one file each, with the
prod_dump as the second commit. If I were to replicate the fixup mistake I described earlier, I would get something that looks like this:
commit 3b8e5e2ae46f5ee704452acb47a2ee20504a3e38 (HEAD -> nw/testing-commits) Author: Noel Worden <email@example.com> Date: Fri Jul 24 12:33:48 2020 -0600 fourth commit, with test_file_c.md test_file_c.md commit 10b5cca013f1163871311a4f48428538a5200426 Author: Noel Worden <firstname.lastname@example.org> Date: Fri Jul 24 12:33:20 2020 -0600 third commit, with test_file_b.md test_file_b.md commit c004e03abf6420d21a40ee1679a1aad57215fe9e Author: Noel Worden <email@example.com> Date: Fri Jul 24 12:32:16 2020 -0600 first commit, with test_file_a.md prod_dump test_file_a.md
git log --name-only shows that
prod_dump is added in the first commit.
As I said earlier, this can also be achieved in VSCode. If you go to the side bar, in
Explorer and right-click the file there are several options:
Open Timeline and at the bottom of the side bar a
Timeline will open, in this case showing that
prod_dump was added with the first commit.
It's funny how it has taken me this long to get to a point where I needed to track which commit contained a particular file. I guess this was the first time I didn't have Github to help with the process. There's always one more thing to learn when it comes to git.
The Ah-Ha Moment
About two minutes into staring this post I realized that I could save myself from this kind of hassle in the future if I just have git ignore the
prod_dump file. I can do this by either the
.gitignore file or better yet, referring to my post from last week, add it to my personal git
exclude file. Should have done it months ago.
This post is part of an ongoing This Week I Learned series. I welcome any critique, feedback, or suggestions in the comments.