We open sourced drone-cache, a plugin for the popular Continuous Delivery platform Drone. It allows you to cache dependencies and interim files between builds to reduce your build times. This post explains why we are using Drone, why we needed a cache plugin, and what I learned while trying to release drone-cache as open source software.
Read on for the story behind drone-cache, or, if you want to jump straight into action, go to github.com/meltwater/drone-cache and try it for yourself.
Originally published at underthehood.meltwater.com on April 10, 2019.
At Meltwater, we empower self-sufficient teams. Teams are free to choose their technology stacks. As a result, we have a diverse set of tools in our stack. In my team, we had been using a combination of TravisCI, CircleCI and Jenkins as our CI/CD pipeline.
In 2018, we decided to migrate to Kubernetes. In doing so, we wanted to simplify our toolchain and migrate to a more flexible, cloud-native and on-premise CI/CD pipeline solution. We ended up choosing Drone, and with one year of experience under our belt, we are more than happy with it.
My team lives and breathes the “release early, release often” philosophy. We release and deploy our software to production several times a day. When we moved from CircleCI to Drone, our build times went up drastically.
Build times went up so much because, for each build, our package manager was downloading the Internet (you know, usual suspects are npm, RubyGems, etc.). This was not a problem with CircleCI because of their built-in caching facilities. So with our pace of continuous releases and increased build times, we got frustrated quickly.
Since we had been spoiled with the wonderful caching features of CircleCI, we wanted the same features in Drone but they are not available by default. However, Drone offers plugins which are “special Docker containers used to drop preconfigured tasks into a Pipeline”. We found tens of plugins related to caching in Drone.
We first tried drone-volume-cache, but because volumes are local to the currently running Drone worker node, you cannot be sure that your next build will run on the same machine. A storage layer that persists the cache between builds, regardless of which node runs them, would be a better option. So we quickly abandoned this approach.
Our Drone deployment runs on AWS, hence we looked for plugins that use S3 as their storage. We found lots of them and decided to use drone-s3-cache. It’s a well-written, simple Go program which follows the Drone plugin starter conventions.
After using drone-s3-cache for a couple of weeks, we needed to pass another parameter to S3. To do so, we forked drone-s3-cache and modified it. We thought that nobody else would need those minor changes. So rather than contributing back upstream, we built a Docker image of our own and pushed it to our private registry to use as a custom Drone plugin.
Months later, I received a feature request from a colleague on a different team, and I was surprised because I didn’t think other teams used drone-cache. When I checked, I realised that various teams throughout Meltwater relied on it heavily. Then we started to get similar messages and requests from other teams.
I received this message while I was looking for a problem to solve during our internal Hackathon. What are the chances? So I decided to work on this plugin and add the requested feature. Building something that makes life easier for fellow developers always gives me pure joy. Long story short, the stars aligned, and we decided to work on our fork and improve it.
I had not worked with Go much, but I had always wanted to learn it. Thanks to this plugin, I achieved that goal as well. I changed, refactored and churned a lot of code. I experimented with a lot of different ideas. I added features that nobody had asked for. I tried different things just for the sake of trying. That’s why, when I decided to open source my changes, I realised I had rewritten the plugin. So rather than sending a pull request, I created a new repository. drone-cache was born!
What does a Drone cache plugin actually have to accomplish? In Drone, each step in the build pipeline is a container which is thrown away after it serves its purpose. So a caching system has to persist current workspace files between builds. You can think of workspace as the root of your git repository. It is a mounted volume shared by all steps in your Drone build pipeline.
With drone-cache, after your initial pipeline run, a snapshot of your current workspace will be stored. Then you can restore that snapshot in your next build, which saves you time.
The best example would be to use this plugin with your package managers such as npm, Mix, Bundler or Maven. With restored dependencies from a cache, commands such as npm install would only need to download new dependencies, rather than re-download every package on each build.
The most useful feature of drone-cache is that you can provide your own custom cache key templates. This means you can store your cached files under keys that fit your use case. For example, with a custom key generated from a checksum of a file (say package.json), you keep reusing the same cached files until the contents of that file actually change.
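To make the checksum idea concrete, here is a small sketch built on Go's standard text/template package. It is an illustration under assumptions, not the plugin's implementation: the helper `renderKey`, the `.Branch` field and the `checksum` template function are names chosen for this example, and an in-memory map stands in for the files in the workspace.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
	"text/template"
)

// renderKey (hypothetical helper) expands a cache key template.
// files maps a file name to its contents and stands in for the workspace on disk.
func renderKey(tmpl, branch string, files map[string]string) (string, error) {
	funcs := template.FuncMap{
		// checksum returns the hex-encoded SHA-256 digest of the named file.
		"checksum": func(name string) (string, error) {
			content, ok := files[name]
			if !ok {
				return "", fmt.Errorf("file %q not found", name)
			}
			sum := sha256.Sum256([]byte(content))
			return hex.EncodeToString(sum[:]), nil
		},
	}
	t, err := template.New("key").Funcs(funcs).Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, struct{ Branch string }{branch}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	const keyTemplate = `{{ .Branch }}-{{ checksum "package.json" }}`
	files := map[string]string{"package.json": `{"dependencies":{"left-pad":"1.3.0"}}`}

	before, err := renderKey(keyTemplate, "master", files)
	if err != nil {
		panic(err)
	}

	// Touching the manifest changes the checksum, and therefore the key.
	files["package.json"] = `{"dependencies":{"left-pad":"1.3.1"}}`
	after, err := renderKey(keyTemplate, "master", files)
	if err != nil {
		panic(err)
	}

	fmt.Println("key changed after editing package.json:", before != after)
}
```

As long as package.json is untouched, every build on the same branch resolves to the same key and finds the same cached files; editing the manifest produces a new key, so the stale cache is simply never looked up again.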
All the other caching solutions for Drone that we found offer only a single storage option for your cache. drone-cache, in contrast, offers two storage backends out of the box: an S3 bucket or a mounted volume. Even better, drone-cache provides a pluggable backend system, so you can implement your own storage backend.
Last but not least, drone-cache is a small CLI program, written in Go without any external OS dependencies. So even if you are not using Drone as your build system, you can still fork and tinker with drone-cache to make it fit your needs.
Building a caching solution is hard, especially if every team in your company uses it every time they push something to their repositories. It is also fun, because it means you have users who give you feedback from the beginning. We crafted this plugin with the help of my colleagues’ feedback and feature requests.
“There are only two hard things in Computer Science: cache invalidation and naming things.” (Phil Karlton)
What could we have done better? As I mentioned before, rather than forking and building a new code base, we could have contributed back to the original project. We could also have applied the “release early, release often” philosophy to open sourcing this repository, which would have let us collect feedback from the outside world as well. We didn’t, and that’s mostly on me. This is the first time I have actually open sourced a project and contributed back to the community. So next time I will know better :)
At Meltwater, 20 teams now use drone-cache across 120 components. It works and gets things done for us. We learned a lot while building it. We hope it solves similar problems for you as well.
Please try it in your pipeline, give us feedback, feel free to open issues and send us pull-requests. Personally, I am also very interested to discuss your experiences with open sourcing in general, so if you have any thoughts on that, please share them in the comments below.
xkcd.com — How standards proliferate https://xkcd.com/927/