<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Yuichi Tanaka</title>
    <description>The latest articles on DEV Community by Yuichi Tanaka (@yuichielectric).</description>
    <link>https://dev.to/yuichielectric</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F236709%2Ff21a7f4f-2806-48bd-ae5e-3ca8d650004e.png</url>
      <title>DEV Community: Yuichi Tanaka</title>
      <link>https://dev.to/yuichielectric</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yuichielectric"/>
    <language>en</language>
    <item>
      <title>Git: rewriting entire history</title>
      <dc:creator>Yuichi Tanaka</dc:creator>
      <pubDate>Fri, 10 Jan 2020 14:02:25 +0000</pubDate>
      <link>https://dev.to/yuichielectric/git-rewriting-entire-history-18li</link>
      <guid>https://dev.to/yuichielectric/git-rewriting-entire-history-18li</guid>
      <description>&lt;p&gt;What if you find that sensitive data has been committed to Git? You should remove that file. What if your repository has grown so large that it takes over an hour to clone? You should remove large files to reduce the repository size.&lt;/p&gt;

&lt;p&gt;However, removing those files and committing that change is not enough. The sensitive data or large files still exist in Git history.&lt;/p&gt;

&lt;p&gt;Therefore, you should remove sensitive data or large files from the entire repository history.&lt;/p&gt;

&lt;p&gt;How do you do that? Use &lt;a href="https://github.com/newren/git-filter-repo"&gt;&lt;code&gt;git-filter-repo&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;code&gt;git-filter-repo&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;git-filter-repo&lt;/code&gt; is a tool to rewrite entire repository history. It's fast and safe.&lt;/p&gt;

&lt;h3&gt;
  
  
  Removing a single file
&lt;/h3&gt;

&lt;p&gt;If you want to remove a file called &lt;code&gt;sensitive.md&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;git filter-repo &lt;span class="nt"&gt;--path&lt;/span&gt; sensitive.md &lt;span class="nt"&gt;--invert-paths&lt;/span&gt;
    Parsed 104 commits
    New &lt;span class="nb"&gt;history &lt;/span&gt;written &lt;span class="k"&gt;in &lt;/span&gt;0.16 seconds&lt;span class="p"&gt;;&lt;/span&gt; now repacking/cleaning...
    Repacking your repo and cleaning out old unneeded objects
    HEAD is now at 58387b2 Modify README
    Enumerating objects: 6, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Counting objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;6/6&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Delta compression using up to 4 threads
    Compressing objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;3/3&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Writing objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;6/6&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Total 6 &lt;span class="o"&gt;(&lt;/span&gt;delta 0&lt;span class="o"&gt;)&lt;/span&gt;, reused 4 &lt;span class="o"&gt;(&lt;/span&gt;delta 0&lt;span class="o"&gt;)&lt;/span&gt;
    Completely finished after 0.38 seconds.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;--path&lt;/code&gt; option specifies which paths to keep in the new history. Combined with the &lt;code&gt;--invert-paths&lt;/code&gt; option, &lt;code&gt;--path&lt;/code&gt; instead specifies which paths to exclude from the new history.&lt;/p&gt;

&lt;p&gt;After that, &lt;code&gt;sensitive.md&lt;/code&gt; is completely removed from the entire history. It looks as if &lt;code&gt;sensitive.md&lt;/code&gt; had never existed, starting from the initial commit.&lt;/p&gt;
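
&lt;p&gt;One way to double-check the result is to ask Git whether any remaining commit still touches the path. The helper below is a minimal sketch, not something &lt;code&gt;git-filter-repo&lt;/code&gt; provides: the name &lt;code&gt;path_in_history&lt;/code&gt; is made up for this example, and it relies only on plain &lt;code&gt;git log&lt;/code&gt;. It checks the exact path given, so it will not notice the same content stored under another name.&lt;/p&gt;

```shell
# Hypothetical helper (not part of git or git-filter-repo).
# Succeeds if any commit reachable from any ref touches the given path.
path_in_history() {
  # git log prints one hash per matching commit; empty output means no match.
  test -n "$(git log --all --format=%H -- "$1")"
}

# Example usage, run inside the rewritten repository:
#   if path_in_history sensitive.md; then echo "still in history"; fi
```

&lt;p&gt;If the function fails for &lt;code&gt;sensitive.md&lt;/code&gt;, no commit on any branch or tag in the local repository references that path any more.&lt;/p&gt;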

&lt;h3&gt;
  
  
  Removing all files bigger than a certain size
&lt;/h3&gt;

&lt;p&gt;If you want to remove files larger than 100KB, you can use the &lt;code&gt;--strip-blobs-bigger-than&lt;/code&gt; option as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    &lt;span class="nv"&gt;$ &lt;/span&gt;git filter-repo &lt;span class="nt"&gt;--strip-blobs-bigger-than&lt;/span&gt; 100K
    Processed 318 blob sizes
    Parsed 106 commits
    New &lt;span class="nb"&gt;history &lt;/span&gt;written &lt;span class="k"&gt;in &lt;/span&gt;0.10 seconds&lt;span class="p"&gt;;&lt;/span&gt; now repacking/cleaning...
    Repacking your repo and cleaning out old unneeded objects
    HEAD is now at 0fc502c Modify README
    Enumerating objects: 312, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Counting objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;312/312&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Delta compression using up to 4 threads
    Compressing objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;209/209&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Writing objects: 100% &lt;span class="o"&gt;(&lt;/span&gt;312/312&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Total 312 &lt;span class="o"&gt;(&lt;/span&gt;delta 98&lt;span class="o"&gt;)&lt;/span&gt;, reused 312 &lt;span class="o"&gt;(&lt;/span&gt;delta 98&lt;span class="o"&gt;)&lt;/span&gt;
    Computing commit graph generation numbers: 100% &lt;span class="o"&gt;(&lt;/span&gt;104/104&lt;span class="o"&gt;)&lt;/span&gt;, &lt;span class="k"&gt;done&lt;/span&gt;&lt;span class="nb"&gt;.&lt;/span&gt;
    Completely finished after 0.31 seconds.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
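
&lt;p&gt;Before picking a threshold, it can help to see which blobs are actually the largest. The following is a rough sketch that uses only stock Git commands. The function name &lt;code&gt;largest_blobs&lt;/code&gt; and its default of 10 entries are assumptions made for illustration. It prints each blob's size in bytes followed by its path, largest first.&lt;/p&gt;

```shell
# Hypothetical helper (not part of git or git-filter-repo).
# Lists the N largest blobs reachable from any ref as "size path" lines.
largest_blobs() {
  # rev-list emits "hash path"; cat-file resolves type and size per object.
  git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |
    cut -d' ' -f2- |
    sort -n -r |
    head -n "${1:-10}"
}

# Example usage, run inside the repository:
#   largest_blobs 5
```

&lt;p&gt;The sizes in the output can then guide the value you pass to &lt;code&gt;--strip-blobs-bigger-than&lt;/code&gt;.&lt;/p&gt;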



&lt;p&gt;There are many other examples in the &lt;code&gt;git-filter-repo&lt;/code&gt; &lt;a href="https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html#EXAMPLES"&gt;man page&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not &lt;code&gt;git filter-branch&lt;/code&gt;?
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;git filter-branch&lt;/code&gt; command used to be an official way to rewrite history. However, you'll see a warning like the one below when you run &lt;code&gt;git filter-branch&lt;/code&gt; on Git 2.24.0 or later.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre class="highlight shell"&gt;&lt;code&gt;    WARNING: git-filter-branch has a glut of gotchas generating mangled &lt;span class="nb"&gt;history
             &lt;/span&gt;rewrites.  Hit Ctrl-C before proceeding to abort, &lt;span class="k"&gt;then &lt;/span&gt;use an
             alternative filtering tool such as &lt;span class="s1"&gt;'git filter-repo'&lt;/span&gt;
             &lt;span class="o"&gt;(&lt;/span&gt;https://github.com/newren/git-filter-repo/&lt;span class="o"&gt;)&lt;/span&gt; instead.  See the
             filter-branch manual page &lt;span class="k"&gt;for &lt;/span&gt;more details&lt;span class="p"&gt;;&lt;/span&gt; to squelch this warning,
             &lt;span class="nb"&gt;set &lt;/span&gt;&lt;span class="nv"&gt;FILTER_BRANCH_SQUELCH_WARNING&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;



&lt;p&gt;Compared to &lt;code&gt;git filter-branch&lt;/code&gt;, &lt;code&gt;git-filter-repo&lt;/code&gt; has several advantages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple&lt;/li&gt;
&lt;li&gt;Fast&lt;/li&gt;
&lt;li&gt;Safe&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, when removing a file that has been modified 100 times, &lt;code&gt;git filter-branch&lt;/code&gt; takes &lt;strong&gt;17 times longer&lt;/strong&gt; than &lt;code&gt;git-filter-repo&lt;/code&gt;! The repository I used for that test is public &lt;a href="https://github.com/yuichielectric/git-filter-repo-playground"&gt;here&lt;/a&gt;, so you can try it yourself. When removing &lt;code&gt;sensitive.md&lt;/code&gt; from this repo, &lt;code&gt;git filter-repo&lt;/code&gt; took 0.84 seconds and &lt;code&gt;git filter-branch&lt;/code&gt; took 14.49 seconds.&lt;/p&gt;

&lt;p&gt;For more details about the &lt;code&gt;git filter-branch&lt;/code&gt; issues, see the &lt;a href="https://git-scm.com/docs/git-filter-branch#PERFORMANCE"&gt;&lt;code&gt;git filter-branch&lt;/code&gt; man page&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;git filter-repo&lt;/code&gt; is not part of the official Git distribution, so you need to install it yourself. If you use a package manager such as Homebrew, you can install it that way.&lt;/p&gt;

&lt;p&gt;For more details, see the &lt;a href="https://github.com/newren/git-filter-repo/blob/master/INSTALL.md"&gt;official installation documentation&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>git</category>
    </item>
  </channel>
</rss>
