DEV Community


Caching Docker builds in GitHub Actions: Which approach is the fastest? 🤔 A research.

Thai Pangsakulyanont on April 19, 2020

Abstract: In this post, I experimented with 6 different approaches for caching Docker builds in GitHub Actions to speed up the build process and co...
Márk Sági-Kazár

Thanks a lot for this article and the research!

I've spent some time on the topic as well and I thought I'd share my thoughts/findings here.

tl;dr: Using buildx with the GitHub cache yields the same (or sometimes even better) results, and it's probably a better alternative for multi-stage builds, which are quite common these days.

As someone already pointed out, the official Docker build actions have changed significantly since this article was released. Namely, the action was split into smaller ones (login and some setup steps were separated out) and buildx support was added.

I added a buildx example using GitHub's cache action to the linked repository based on the official buildx cache example: github.com/dtinth/github-actions-d...
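For readers who want the shape of that setup, the official buildx cache example looks roughly like this (a sketch; the tag, cache path, and key prefix are illustrative, not the exact workflow from the linked repository):

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1

- name: Cache Docker layers
  uses: actions/cache@v2
  with:
    path: /tmp/.buildx-cache
    key: ${{ runner.os }}-buildx-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-buildx-

- name: Build
  uses: docker/build-push-action@v2
  with:
    push: false
    tags: myimage:latest
    # Read the layer cache from, and write it back to, the cached directory
    cache-from: type=local,src=/tmp/.buildx-cache
    cache-to: type=local,dest=/tmp/.buildx-cache
```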

After running it a couple of times, it turned out to be as fast as the GitHub Package Registry approach. To be completely fair: I couldn't actually run the GPR example myself, because it has been outdated since GitHub introduced the new Container Registry, so I compared it to the numbers posted in this article instead.

So it looks like using buildx with GitHub cache can be a viable alternative.

Thinking further about the method used to compare the different approaches, I came to the following theories:

  • In the case of a multi-stage build, the GitHub Package Registry approach probably falls short of the buildx + GH cache approach (and probably of others as well), because it only caches the final stage.
  • Using ${{ hashFiles('Dockerfile') }} might have an adverse effect on performance when using the GH cache: Dockerfiles usually don't change, but code and dependencies do, so relying on the Dockerfile in the cache key will result in stale caches most of the time. When working with large caches, downloading the cache itself takes time as well, which is wasted whenever the cache is stale. This is why the official example uses the commit SHA with a fallback instead: builds for the same SHA will always be fast, and when no cache is found, the latest cache matching the fallback key is used.

I haven't actually verified these thoughts, hence they are just theories, but I strongly suspect they are correct, so take them into account when choosing a caching strategy.
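The SHA-plus-fallback key pattern the second theory refers to looks like this in the official example (a sketch; the path and key prefix are illustrative):

```yaml
- uses: actions/cache@v2
  with:
    path: /tmp/.buildx-cache
    # Exact hit when the same commit is rebuilt.
    key: ${{ runner.os }}-buildx-${{ github.sha }}
    # Otherwise restore the most recent cache whose key starts with this prefix.
    restore-keys: |
      ${{ runner.os }}-buildx-
```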

Last but not least: as I pointed out above, the GitHub Package Registry example became outdated. See this announcement blog post for more details.

valentijnscholten

"To test out the potential gain, I tried running docker-compose build on DEV Community's repository. Without any caching, building the web image took 9 minutes and 5 seconds. Using GitHub Package Registry as a cache, the time to build the image was reduced to 37 seconds."

Great article and replies. At the risk of sounding stupid: How do you get docker-compose to use the cache? It doesn't have a --cache-from option.
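(For what it's worth, the `docker-compose build` CLI lacks the flag, but the Compose file format 3.2 and later does support a cache_from list per service; a minimal sketch, with a placeholder image name:)

```yaml
# docker-compose.yml
version: "3.4"
services:
  web:
    image: docker.pkg.github.com/your-org/your-repo/web:latest  # placeholder
    build:
      context: .
      # Images listed here are used as layer-cache sources during the build.
      cache_from:
        - docker.pkg.github.com/your-org/your-repo/web:latest
```

You'd typically `docker pull` the cache image first so its layers are available locally before building.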

alvistar

Hi all, great article. I have been experimenting a bit, porting some workflows from GitLab.

What about this: using buildx's ability to export its cache to local files, and then caching those files with the standard GitHub Actions cache?

Example:
gist.github.com/alvistar/5a5d241bf...
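The gist link above is truncated, but the general shape of that approach (a sketch, not necessarily alvistar's exact workflow; the image tag and cache path are placeholders) is:

```yaml
- name: Restore buildx cache
  uses: actions/cache@v2
  with:
    path: /tmp/.buildx-cache
    key: ${{ runner.os }}-buildx-${{ github.sha }}
    restore-keys: |
      ${{ runner.os }}-buildx-

- name: Build with a local cache export
  run: |
    docker buildx build \
      --cache-from "type=local,src=/tmp/.buildx-cache" \
      --cache-to "type=local,dest=/tmp/.buildx-cache" \
      --tag myimage:latest \
      --load .
```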

Thai Pangsakulyanont • Edited

Thanks for your comment. I have never used buildx before, and I saw that it was an experimental feature, so it might be subject to change as it develops.

If you are curious as to how it fares against other approaches in this article, I would encourage you to try adding the buildx approach to the showdown workflow file. This workflow file contains all the approaches used in this article.

When you fork, edit, and commit, GitHub Actions will run the workflow file automatically. Each approach runs in a separate job, and the jobs run in parallel. So it's like starting a race: you can see how long each approach takes.

Let me know if you find an interesting result!

Brad Root

Fantastic article. Thank you for going to the effort of testing all these methods. I've implemented the GitHub Package Registry solution and it shaved our Action times down by 50%, which is very impressive.

Thai Pangsakulyanont

Happy to hear it is useful to you! Cheers 😁

Bahit Hamid

Awesome insight. Although I'm not sure what I would need Docker builds for other than doing them locally. But I'll keep this in mind when the time comes. (Perhaps this is for CI, which I have yet to explore. 😁 I'd better start learning to test.)

Kye Russell

Hi. Thanks for putting all the legwork into this. Just a heads up that—according to my cursory search (so don't quote me on this)—GHCR + BuildKit is now possible: github.community/t/cache-manifest-...

Patrik Staš

Thank you, well written! I was trying to build images with BuildKit, store them in GitHub Packages, and then use them as a cache on subsequent builds. But I bumped into a problem with this approach: if you are a team of people pushing branches directly to the repo, it's alright, but if someone else forks the repo and submits a PR, their pull request's CI run will not have access to your repository secrets. To use GitHub Packages, ideally you'd use the GITHUB_TOKEN secret, but it's not available from forked repositories. This is often discussed; see this thread for example: github.community/t/token-permissio...
I am planning to try my luck with github.com/actions/cache instead.
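For context, the part that breaks on forked PRs is the registry login, which needs that token (a sketch of the usual login step for the old GitHub Package Registry):

```yaml
- name: Log in to GitHub Package Registry
  run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login docker.pkg.github.com -u ${{ github.actor }} --password-stdin
```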

James

Thanks for doing all the research.

I built a GitHub Action to cache images pulled from Docker Hub using Podman. I save about 30 seconds for two images that are about 1.3 GB.

github.com/jamesmortensen/cache-co...

Tom Stein

I started to use this: github.com/whoan/docker-build-with...

It does the job very well :)

Andri

Great article. Thanks for documenting your findings. I still wish that "Conclusion" had more of a conclusion to it. What do you recommend? What are you doing with your images?

Aleks Volochnev

I registered at dev.to only to say this is incredible research!

Rom's

Thanks a lot, that's definitely useful 🙏😀

Benjamin Nørgaard

I would like to see results for kaniko as well :)