DEV Community

Muhamamd Miftah

Reasons to use .gitignore and ignore binary output (like .exe, .dll files)

Browsing the internet and Stack Overflow, I've noticed there isn't really a detailed explanation of why .gitignore should be used and, more specifically for compiled languages like C#, why you should keep binary output files such as .dll and .exe out of your git repo.

Let's break down the many reasons:

  1. For tracking purposes, including the obj and bin directories and any .exe/.dll files is completely redundant, because the build output is already determined by the source code. You don't edit an .exe directly to fix a bug; you edit the source code and rebuild. So why bother tracking changes to the .exe every time the source code changes?
  2. This matters when you hand off your work to another developer. They're not going to edit the .exe/.dll files in the repo; they're going to edit the source code and build a new version, and once that build finishes, the previously committed .exe/.dll files are immediately out of date. So why bother constantly re-committing the binaries? It gets really tedious and wastes space.
  3. Another reason is that it's very difficult to meaningfully compare differences in binary data. A huge benefit of source control like git is the ability to compare your current repo state to a previous one. Source code is just plain text, so comparing branches or commits is easy. For instance, you can choose to ignore whitespace or blank-line changes to make a diff easier to read, which you cannot do with .exe/.dll files.
  4. The C# compiler itself also produces non-deterministic output by default, even for the exact same source code inputs. So rebuilding your .exe when no source code has changed can cause git to detect the .exe as changed when it effectively hasn't, because the file's hash differs between builds. You can avoid this by enabling deterministic builds with the Deterministic property in your project file (newer SDK-style projects generally have it enabled already).
  5. On a related note, you shouldn't include dependency .dll files in your repo either, because managing changes to them gets really tedious. It's rare nowadays, but before NuGet, dependency .dll files (say, controls for WPF or Windows Forms) were committed to repos because they were needed for the build to work. That was excusable then, but it's unnecessary now with NuGet. Back then, when a dependency needed updating, you had to make sure all the required files were present. If a newer version added, removed, or renamed files, you had to manage them by hand and ensure they were referenced correctly in your project file (using relative file paths instead of absolute ones, etc.).
  6. And what if your dependencies had their own dependencies? Managing that manually was kind of a nightmare. Dependency management is actually pretty complex. NuGet now handles all of it for you automatically, and it happens outside of git or any other source control system, so there's no need to track the .dll files in git. The only thing git needs is a pointer to the dependency, in the form of PackageReference elements in your project file (or entries in packages.config), and NuGet will do the rest.
  7. There's also a performance consideration. Going back to dependencies, sometimes the dependencies are bigger than the program's own source code. This is commonly the case with Node projects, where the node_modules folder can balloon to hundreds of megabytes. It should never be tracked in git because of how git's internal database works.
    • For example, if you delete a dependency because it's no longer needed, you can never fully reclaim that disk space (at least for the master branch, short of rewriting history), because a previous commit on the master branch captured the data and git has to keep it in its internal database. As you make more branches, git needs to store the data required to reconstruct your repo in a different state when you switch branches. When a branch has changes measured in kilobytes, checkout is pretty fast, but when the differences balloon to many megabytes because of heavy binary files, checkout between branches or commits can get very slow.
    • This slowdown happens anyway once a repo's source code grows past a certain threshold, beyond hundreds of megabytes, but it's made unnecessarily worse by including binary files. It's one of the reasons Microsoft created VFS for Git: https://github.com/microsoft/VFSForGit.
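
To make point 4 a bit more concrete, here is a small Python sketch of why a rebuilt binary looks "changed" to git. It assumes a repo using git's default SHA-1 object format, where a file's ID is the hash of a short header plus the file's bytes. The fake_build function is purely hypothetical: it stands in for a non-deterministic compiler by mixing a random per-build ID into its output, and is not how the C# compiler actually lays out an .exe.

```python
import hashlib
import uuid


def git_blob_id(content: bytes) -> str:
    """Compute the ID git gives a file (blob) with this content.

    Assumes the default SHA-1 object format: sha1("blob <size>\\0" + content).
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def fake_build(source: str) -> bytes:
    """Hypothetical stand-in for a non-deterministic compiler.

    The output depends on the source plus a random per-build identifier,
    so two builds of identical source differ by a handful of bytes.
    """
    build_id = uuid.uuid4().bytes  # 16 random bytes, different on every call
    return b"BIN" + build_id + source.encode("utf-8")


source = 'Console.WriteLine("hello");'

# Unchanged source text always hashes to the same blob ID.
print(git_blob_id(source.encode("utf-8")) == git_blob_id(source.encode("utf-8")))

# Two "builds" of that same source hash differently, so git reports a change.
first_build = fake_build(source)
second_build = fake_build(source)
print(git_blob_id(first_build) == git_blob_id(second_build))
```

The first comparison is true and the second is false: nothing meaningful changed between the two builds, yet git would record a brand-new copy of the binary each time.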

If in doubt, use a pre-made .gitignore file, like this one: https://github.com/github/gitignore/blob/main/VisualStudio.gitignore
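
As a rough idea of what such a file contains, here is a minimal sketch of a .gitignore for a C# project. It's only an illustration, not a substitute for the full template linked above. The folder patterns assume the default bin and obj output locations, and the last two entries only matter if your repo actually has a node_modules folder or a solution-level NuGet packages folder.

```gitignore
# Build output folders (assumes the default bin/ and obj/ locations)
[Bb]in/
[Oo]bj/

# Compiled binaries and debug symbols, wherever they end up
*.exe
*.dll
*.pdb

# Restored dependencies (assumption: only relevant if these folders exist)
node_modules/
packages/
```

Keep in mind that .gitignore only affects files git isn't tracking yet. If binaries were already committed, you'd also need to untrack them, for example with git rm -r --cached bin obj, and commit that change.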
