Benchmarking builds with Gradle-Profiler

Tony Robalik (autonomousapps) ・7 min read

(Cover image courtesy of Unsplash user Felix M. Dorn)

I recently had someone tell me this about my plugin:

build times improved by a whopping 36%

I was (1) very gratified to hear this and (2) highly skeptical. Without context, I have no idea what that means. What kind of build was it? Was the daemon warm or cold? How warm was it? How big is your project? How many lines of code, how many modules? Were you also using Slack and Chrome on your machine?

When people ask about build performance, I always encourage them to use gradle-profiler. It is the canonical way to benchmark and profile your Gradle builds, and produces standard summary statistics so you can make reasonable inferences.

It can be complicated, however, and hard to know where to start. Let's fix that!

Installing

There are at least three ways to install gradle-profiler. You can use Homebrew:

$ brew install gradle-profiler

You can use SDKMAN:

$ sdk install gradleprofiler

Or you can build from source:

$ git clone git@github.com:gradle/gradle-profiler.git
$ cd gradle-profiler
$ ./gradlew installDist

and then add gradle-profiler/build/install/gradle-profiler/bin to your PATH.
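
For the build-from-source route, the PATH change might look like the sketch below. It assumes the repository was cloned into your home directory; GRADLE_PROFILER_HOME is just an illustrative variable name, so adjust it to wherever your clone lives.

```shell
# Assumption: the clone lives at $HOME/gradle-profiler. Adjust as needed.
GRADLE_PROFILER_HOME="$HOME/gradle-profiler"

# Append the installDist output directory so `gradle-profiler` resolves on PATH.
export PATH="$PATH:$GRADLE_PROFILER_HOME/build/install/gradle-profiler/bin"
```

Put these lines in your shell's startup file if you want the change to persist across sessions.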

Benchmarking builds

Gradle-profiler supports two use-cases: benchmarking and profiling. The former produces basic summary statistics (such as mean, median, etc), while the latter produces detailed profiles with flamecharts or icicle graphs. You'd use the former to test the impact of build script changes at a high level (did my build get slower or faster? By how much? What's the uncertainty?), while the latter is used to investigate hotspots in your code in depth.

In this post, we will focus on benchmarking.

The basics

Per the docs, you can run a benchmark like so:

$ gradle-profiler --benchmark --project-dir <root-dir-of-build> <task>

Ideally, invoke this from the parent directory of the project you want to benchmark. Note that you name the task to benchmark directly on the command line.

While this is useful, I often find I need something more.

Scenarios

Instead of supplying <task> on the command line, you can provide a scenarios file (written in the Typesafe Config format, also known as HOCON) for the tool to use.

You run gradle-profiler against a scenarios file like so:

$ gradle-profiler --benchmark --project-dir <root-dir-of-build> --scenario-file benchmark.scenarios [<scenarios>...]

Here, [<scenarios>...] is an optional list of the named scenarios to run. If you leave it off, as I do later in this post, every scenario in the file runs.

Let's start with a simple example:

// benchmark.scenarios
configuration {
  tasks = ["help"]
}

This creates a scenario named "configuration" which simply invokes ./gradlew help. This is a useful scenario for looking at improvements to configuration time, vs task execution.

Let's look at another simple example:

noop {
  tasks = [":app:assembleDebug"]
}

This scenario, named noop (for "no-op"), runs the :app:assembleDebug task. I call it "no-op" because there are no file changes in this scenario, so every assemble after the first should be a "no-op." This scenario tests whether your primary daily task is well-configured: the task and all of its dependencies should be UP-TO-DATE, and the build should be very fast.
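
If the configuration and noop scenarios above live in the same benchmark.scenarios file, you can run just those two by naming them. Here is a minimal sketch: bench_scenarios is a hypothetical helper (not part of gradle-profiler) that forwards any scenario names to the command shown earlier.

```shell
# bench_scenarios is a hypothetical helper, not part of gradle-profiler.
# Usage: bench_scenarios <project-dir> [<scenario>...]
bench_scenarios() {
  if [ "$#" -lt 1 ]; then
    echo "usage: bench_scenarios <project-dir> [<scenario>...]" >&2
    return 1
  fi
  project_dir="$1"
  shift
  # Any remaining arguments are passed through as scenario names.
  gradle-profiler --benchmark \
    --project-dir "$project_dir" \
    --scenario-file benchmark.scenarios \
    "$@"
}
```

For example, `bench_scenarios my-project/ configuration noop` would benchmark only those two scenarios, which is handy when the file grows and a full run takes a long time.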

Increasing complexity: testing the build cache

Does the build cache do any good? Let's find out:

clean_build_with_cache {
  tasks = ["clean", ":app:assembleDebug"]
  gradle-args = ["--build-cache"]
}
clean_build_without_cache {
  tasks = ["clean", ":app:assembleDebug"]
  gradle-args = ["--no-build-cache"]
}

This is the first time we have more than one scenario in a scenarios file. In that case, gradle-profiler runs the scenarios in series and produces a summary report once all are complete. These two scenarios are similar to the noop scenario in that there are no source changes, but the filesystem does change: the clean task deletes the build directory before each build. In the first case, we use the build cache, while in the second we don't. If the build cache lives up to its promises, the first scenario should be measurably faster.

Increasing complexity: testing incremental changes

Ok, we're done, right? Well, we would be if software development involved never once touching source code.... For those scenarios in which you must actually change your code (ugh), we can iterate on our benchmarking technique.

incremental_app {
  tasks = [":app:assembleDebug"]
  apply-abi-change-to = "app/src/main/java/com/my/project/manager/AccountManager.java"
}
resource_change {
  tasks = [":app:assembleDebug"]
  apply-android-resource-change-to = "app/src/main/res/values/strings.xml"
}

Gradle-profiler refers to apply-abi-change-to and apply-android-resource-change-to as mutations, and there are many available. These two scenarios benchmark, respectively, build time when making a change to a Java file and a change to a resource file. I used some git magic to choose the files to mutate:

$ git log --pretty=format: --name-only | sort | uniq -c | sort -rg | head -10

This showed me the most frequently changing files in the project; i.e., the ones most representative of normal development.
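
One wrinkle with that pipeline is that the blank lines git prints between commits may get counted too, and they can end up at the top of the list. The sketch below is a small variation: rank_changed_files is a hypothetical helper name, and it assumes grep, sort, uniq, and head are available.

```shell
# rank_changed_files is a hypothetical helper: it ranks the paths read from
# stdin by how often they occur and prints the top N (default 10).
rank_changed_files() {
  # Drop blank separator lines before counting, then sort by count, descending.
  grep -v '^$' | sort | uniq -c | sort -rn | head -n "${1:-10}"
}
```

Something like `git log --pretty=format: --name-only -- '*.java' | rank_changed_files 10` would then restrict the history to Java sources, which is useful when picking a file for apply-abi-change-to.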

Example use-case

I was recently able to remove Jetifier from my project (thank you John Rodriguez for updating Picasso to no longer depend on the support libs!). I wanted to know what impact this would have on build times. So, I did the following:

# gradle.properties
android.enableJetifier=true

I set that to true and ran:

$ gradle-profiler --benchmark --project-dir my-project/ --scenario-file benchmark.scenarios

I then walked away and did something else, because it took a while.

When it was done, I updated my gradle.properties:

android.enableJetifier=false

and ran the benchmark again. Again I walked away, because I didn't want other activity on the computer to interfere with the benchmark.

Before I show the results, here are some relevant details about the benchmarking context.

The project is small, with about 35k LOC. There are seven Gradle modules, a heterogeneous mix of Android app, Android lib, Java lib, and Kotlin lib; most of the code is in the original app module, however. I ran the benchmarks on my 2020 MacBook Pro with 16 GB of memory. I left Firefox, Slack, and Android Studio running while the scenarios ran. And finally, to be clear, this project does not need Jetifier, so the results below simply compare enabling it with disabling it.

Ok, now for the results. This is what I saw:

(All values in milliseconds. Negative numbers indicate an improvement.)

Scenario                     Jetified   Non-Jetified   Difference
configuration
  mean                         1401.5         1188.6       -212.9
  median                       1408.0         1128.0       -280.0
  stddev                       164.94         175.77
noop
  mean                         2772.3         2743.6        -28.7
  median                       2733.0         2705.5        -27.5
  stddev                       110.69         171.99
clean_build_with_cache
  mean                        40811.2        34614.7      -6196.5
  median                      40686.0        34004.0      -6682.0
  stddev                      2274.38        1764.33
clean_build_without_cache
  mean                        85286.2        73584.3     -11701.9
  median                      86250.0        73684.0     -12566.0
  stddev                      7664.00        3266.77
incremental_app
  mean                        32120.2        29754.1      -2366.1
  median                      33160.0        27812.5      -5347.5
  stddev                      4194.69        3698.54
resource_change
  mean                         4853.8         4103.0       -750.8
  median                       4822.0         4084.5       -737.5
  stddev                       321.57          95.75

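You heard it here first: not jetifying is faster than jetifying, even when Jetifier "does nothing," as is the case here. The one exception is noop, where the difference of about 28 ms is well within one standard deviation and so indistinguishable from noise.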
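
To turn those differences into relative terms (the kind of "36%" figure from the start of this post), a small helper can do the arithmetic. This is a minimal sketch: pct_change is a hypothetical function name, not part of gradle-profiler, and it assumes awk is available.

```shell
# pct_change BEFORE AFTER
# Prints the relative change of AFTER versus BEFORE, in percent.
# Negative output means AFTER is faster.
pct_change() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_change 1401.5 1188.6     # configuration, mean
pct_change 85286.2 73584.3   # clean_build_without_cache, mean
```

With the means from the results above, this prints -15.2 and -13.7, so dropping Jetifier improved those two scenarios by roughly 15% and 14% respectively.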

I would expect the performance benefit of dropping Jetifier to scale with the size of the project, although not necessarily linearly.

How it works

In this post, I've mainly focused on why you'd use gradle-profiler, and also how to use it. Now we'll touch on what it does that makes it more reliable than just running builds ad hoc with the bash time command.

One of the most common sources of confusion I have run across when it comes to Gradle is the daemon. From the docs:

The Daemon is a long-lived process, so not only are we able to avoid the cost of JVM startup for every build, but we are able to cache information about project structure, files, tasks, and more in memory.

Gradle-profiler helps provide normalized results by ensuring every scenario is run the same number of times, with an identically-"warm" daemon. It does this by running several warm-up builds (six by default), followed by a number of "measured builds" (ten by default). It then provides a summary for the measured builds. In between each scenario, it kills the daemon so that each starts with a clean slate.
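
If the defaults make a run too slow for a quick check, the number of warm-up and measured builds can be lowered. The following is a hedged sketch: bench_with_counts is a hypothetical helper, and it assumes the --warmups and --iterations options described in the gradle-profiler README, so check `gradle-profiler --help` for your installed version.

```shell
# bench_with_counts is a hypothetical helper, not part of gradle-profiler.
# Usage: bench_with_counts <project-dir> <warm-up builds> <measured builds>
bench_with_counts() {
  if [ "$#" -ne 3 ]; then
    echo "usage: bench_with_counts <project-dir> <warmups> <iterations>" >&2
    return 1
  fi
  # Assumption: --warmups and --iterations override the defaults of 6 and 10.
  gradle-profiler --benchmark \
    --project-dir "$1" \
    --scenario-file benchmark.scenarios \
    --warmups "$2" \
    --iterations "$3"
}
```

For example, `bench_with_counts my-project/ 3 5` would run three warm-up builds and five measured builds per scenario. Keep in mind that fewer measured builds means noisier statistics.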

Here's an example of the console output from running the clean_build_without_cache scenario:

* Running scenario `clean_build_without_cache` using Gradle 6.5.1

* Stopping daemons

* Running warm-up build #1
Execution time 128227 ms

* Running warm-up build #2
Execution time 87976 ms

* Running warm-up build #3
Execution time 81087 ms

* Running warm-up build #4
Execution time 85749 ms

* Running warm-up build #5
Execution time 77478 ms

* Running warm-up build #6
Execution time 76109 ms

* Running measured build #1
Execution time 77715 ms

* Running measured build #2
Execution time 74513 ms

* Running measured build #3
Execution time 81199 ms

* Running measured build #4
Execution time 86559 ms

* Running measured build #5
Execution time 78003 ms

* Running measured build #6
Execution time 90885 ms

* Running measured build #7
Execution time 89237 ms

* Running measured build #8
Execution time 85941 ms

* Running measured build #9
Execution time 88481 ms

* Running measured build #10
Execution time 100329 ms

* Stopping daemons

You can see that the very first build, with a so-called "cold" daemon, took over 128 seconds! The second build took only 88s, an improvement of 31%. You can also see the noise in the results, since I ran this scenario while writing this blog post, listening to music, and chatting on Slack.
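
The summary statistics aren't magic. As a sanity check, the ten measured builds above reproduce the Jetified clean_build_without_cache row of the results. This is a minimal sketch: summarize is a hypothetical helper name, it assumes sort and awk are available, and it assumes the reported stddev is the sample standard deviation (dividing by n - 1), which is what matches the reported value.

```shell
# summarize: read one execution time (ms) per line on stdin and print the
# mean, median, and sample standard deviation.
summarize() {
  sort -n | awk '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR < 2) { print "need at least two values"; exit 1 }
      mean = sum / NR
      # Input is sorted, so the median is the middle value (or middle pair).
      median = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
      for (i = 1; i <= NR; i++) { d = v[i] - mean; ss += d * d }
      printf "mean=%.1f median=%.1f stddev=%.2f\n", mean, median, sqrt(ss / (NR - 1))
    }'
}

# The ten measured builds from the console output above.
printf '%s\n' 77715 74513 81199 86559 78003 90885 89237 85941 88481 100329 | summarize
```

This prints mean=85286.2 median=86250.0 stddev=7664.00, the same values reported for that scenario with Jetifier enabled.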

Special thanks

Thanks again to John Rodriguez for reviewing an early draft.
