re: An unexpected performance regression VIEW POST


There's a nice project called asv (airspeed velocity) which can track benchmark performance across commits, graph them, track different machines separately, track different options separately, and look for step-changes in performance asv.readthedocs.io/en/stable

It's designed for Python, but I've used it with Haskell and Racket projects. There are two steps to making it generic:

  • Use a plugin to set up the "environment". This tells asv how to turn a raw commit into a runnable program (fetching dependencies, compiling, etc.). asv comes with Python-specific virtualenv and conda environments. I made a plugin to use the language-agnostic Nix package manager instead chriswarbo.net/git/asv-nix

  • Use "tracking" benchmarks. asv benchmarks are Python functions; for a tracking benchmark, asv just looks at its return value (i.e. it doesn't get timed or anything). If we want to perform measurements using some other program (e.g. a Rust binary), we can have our Python function execute that binary and return some number parsed from the output. More practically, we could have a "setup function" run the binary and parse out all of our measurements into e.g. a dictionary; then each "tracking benchmark" just returns the relevant value from that dictionary.

asv's "matrix" option allows different versions of dependencies to be benchmarked and compared, e.g. in your case it could be different Rust compilers. In fact these "dependency versions" are just argument strings which are provided to the environment plugin (e.g. my Nix environment passes them into Nix as arguments to the (user-specified) build function), so you could have a "dependency" called use-jemalloc with "versions" true and false, and have your environment build things as appropriate.

It's easy enough to integrate asv with CI (I personally use Laminar). Instead of invoking a local binary we could do it over SSH instead (e.g. on a dedicated machine, to reduce interference); we can do anything we like to get that return value ;)


Thank you very much for the feedback. This looks very promising!

Code of Conduct Report abuse