I mentioned in a previous post that I had used git and git hooks to upload Hugo blog posts to my server, and that I am now using rsync as a faster, more succinct alternative.

This has left me with a lot of git history in my .git folder that I really don’t need. I don’t think I am ever going to want to recover an old version of a blog post.

So I thought, why not just delete all the history and reduce the junk in the site directories?

The following are the steps I used to clean out the repo and shrink the data down to the latest commit.

  • Let’s check the state of the repo:
git status    # get the current state of repo
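
It is also worth noting how big the repo is at this point, so there is something to compare against afterwards. A minimal sketch, assuming a reasonably recent git; `repo_size` is just a made-up helper name for illustration:

```shell
# Hypothetical helper: report how much disk space git is using for a repo.
# Takes the repo path as its only argument (defaults to the current directory).
repo_size() {
  repo="${1:-.}"
  # Ask git where its data lives (normally <repo>/.git).
  git_dir="$(git -C "$repo" rev-parse --absolute-git-dir)" || return 1
  du -sh "$git_dir"                   # on-disk size of the git data
  git -C "$repo" count-objects -vH    # loose/packed object counts and sizes
}
```

Running `repo_size` from the site directory before and after the cleanup gives a like-for-like comparison.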

Caution: these steps will delete all of the repo’s history.
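
Because there is no undo once the remote has been force-pushed, one option is to stash the old history in a single file outside the site directory first. This is an optional safety net, not one of the steps in this post; the helper name `backup_history` and the paths passed to it are placeholders:

```shell
# Hypothetical helper: save the full history of a repo into one bundle file.
# $1 = repo path, $2 = absolute path for the bundle (ideally outside the repo).
backup_history() {
  repo="$1"
  bundle="$2"
  git -C "$repo" bundle create "$bundle" --all   # pack every branch and tag into one file
  git -C "$repo" bundle verify "$bundle"         # check that the bundle is usable
}
```

If the history is ever wanted back, the bundle can be cloned like any other repo with `git clone <bundle file>`.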

  • First create an orphan branch: an isolated copy of the current state of the repo with no history behind it.
  • Then add the files and commit.
  • Next delete the old master branch and rename the orphan branch to master.
  • Then force-push the new master to GitHub.
  • Finally clean the old crap out of the repo using aggressive garbage collection.
git checkout --orphan <orphanName>  # create an orphan branch from the current state
git add -A                          # stage everything
git commit -am "commit message"     # commit the cleaned branch
git branch -D master                # delete the original master branch
git branch -m master                # rename the orphan branch to master
git push -f origin master           # force-push the new master to GitHub
git gc --aggressive --prune=all     # aggressive garbage collection
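
The sequence above can be rehearsed on a throwaway repo before touching the real one. The sketch below is a dry run under stated assumptions, not the exact commands from this post: it builds a scratch repo in a temp directory, uses a placeholder branch name `fresh`, skips the push because there is no remote, and adds a `git reflog expire` step, since reflog entries that still point at the old commits can otherwise keep them alive through garbage collection.

```shell
set -e
demo="$(mktemp -d)"                        # scratch directory, safe to delete afterwards
cd "$demo"
git init -q
git config user.name "Demo"                # placeholder identity for the scratch repo
git config user.email "demo@example.com"

# Build a little history: three commits to the same file.
for n in 1 2 3; do
  echo "draft $n" > post.md
  git add -A
  git commit -q -m "draft $n"
done
old_branch="$(git symbolic-ref --short HEAD)"   # master or main, depending on git config
old_tip="$(git rev-parse HEAD)"                 # last commit of the old history
before="$(git rev-list --count HEAD)"

git checkout -q --orphan fresh             # new branch with no parent commits
git add -A
git commit -q -m "Fresh start"
git branch -D "$old_branch" >/dev/null     # drop the branch holding the old history
git branch -m "$old_branch"                # rename 'fresh' to the old branch name
git reflog expire --expire=now --all       # forget reflog entries pointing at old commits
git gc --quiet --aggressive --prune=now    # repack and delete unreachable objects

after="$(git rev-list --count HEAD)"
echo "commits before: $before, after: $after"
```

On the real repo the same reflog step would slot in just before the `git gc` line, if the old objects turn out to be sticking around.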

After running these steps, my site directory went from around 60 MB to about 40 MB, a saving of roughly a third.
I don’t have a heavily image-based site, so the history was mostly Markdown text files. That being said, the saving is nice, and on a larger, more graphical site it could equate to a lot of duplicated data.