Jeremy Grifski

Posted on Jun 23, 2020 • Originally published at therenegadecoder.com on Jun 23, 2020

How to Performance Test Python Code: timeit, cProfile, and More

#performance #python #testing

A lot of the articles in this series take advantage of a feature of Python which allows us to performance test our code, and I finally wanted to get around to explaining how it works and how to use it.

In this article, I cover three main techniques: brute force, timeit, and cProfile. Personally, I performance test my code with timeit because I find it easy to understand, but you may find the various profiling tools helpful. At the end, I’ll ask you to demonstrate your newfound skills with a challenge.

Problem Introduction

In order to start talking about how to performance test Python code, we need to define it. At a high level, a performance test is anything that verifies the speed, reliability, scalability, and/or stability of software. For our purposes, we’ll be looking at speed. In particular, we’ll be looking at different ways to compare the speed of two programs using their relative execution times.

When it comes to performance testing software, the process isn’t always easy or obvious. In particular, there are a lot of pitfalls. For example, performance tests can be influenced by all sorts of factors like background processes (e.g. Spotify, Eclipse, GitHub Desktop, etc.).

In addition, performance tests aren’t always written in a way that fairly accounts for differences in implementation. For example, I might have two snippets of code that have the same behavior except one requires a library to be imported. When I run my test, I don’t want the import to affect the test outcome. As a result, I should write my tests such that I don’t start timing until after the library is imported.
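
To make that concrete, here is a minimal sketch using the timeit library covered later in this article. The two snippets (math.sqrt versus the ** operator) and the variable x are just examples I picked for illustration. The idea is that anything placed in setup runs before the clock starts, so the import never shows up in the measurement:

```python
import timeit

# Each setup string runs once, before the clock starts, so neither the
# import nor the variable assignment is counted in the measured time.
with_import = timeit.timeit(
    stmt="math.sqrt(x)",
    setup="import math; x = 144",
    number=100_000,
)

# The second snippet has the same behavior but needs no import at all.
without_import = timeit.timeit(
    stmt="x ** 0.5",
    setup="x = 144",
    number=100_000,
)

print(f"math.sqrt(x): {with_import:.4f} s for 100,000 runs")
print(f"x ** 0.5:     {without_import:.4f} s for 100,000 runs")
```

The exact numbers will vary from machine to machine, so treat them as relative rather than absolute.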

On top of all that, it’s important to take into account different types of scenarios when performance testing. For instance, if we have two similar snippets, one might have better performance for larger data sets. It’s important to test a range of data sets for that reason.
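
As a rough sketch of what testing a range of data sets might look like, the following compares two hypothetical functions (build_with_loop and build_with_comprehension are names I made up for this example) across a few input sizes. It relies on timeit, which is covered later in this article, and the sizes themselves are arbitrary:

```python
import timeit

def build_with_loop(n):
    """Build a list of doubled values with an explicit loop."""
    result = []
    for i in range(n):
        result.append(i * 2)
    return result

def build_with_comprehension(n):
    """Build the same list with a list comprehension."""
    return [i * 2 for i in range(n)]

results = {}
for size in (10, 1_000, 100_000):
    # Use fewer runs for larger inputs so the whole test stays quick
    runs = max(1, 100_000 // size)
    loop_time = timeit.timeit(lambda: build_with_loop(size), number=runs)
    comp_time = timeit.timeit(lambda: build_with_comprehension(size), number=runs)
    # Store the average time per run so different sizes are comparable
    results[size] = (loop_time / runs, comp_time / runs)
    print(f"n={size}: loop {loop_time / runs:.8f} s, comprehension {comp_time / runs:.8f} s")
```

If one snippet only pulls ahead at the larger sizes, that is exactly the kind of detail a single-size test would miss.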

At any rate, the goal of this article is to look at a few different ways we can performance test code in Python. Let’s dig in!

Solutions

As always, I like to share a few ways to accomplish our task. Of course, if you’ve been following along in this series, you know that I prefer to use the timeit library to test snippets. Luckily, there are more options if timeit isn’t for you.

Performance Testing by Brute Force

Even if you’ve never done any performance testing before, you probably have the gist of how to get started. Typically, we want to take a timestamp before and after we run our code snippet. Then, we can calculate the difference between those times and use the result in our comparison with other snippets.

To do this in Python, we can take advantage of the datetime library:

import datetime
start_time = datetime.datetime.now()
# insert code snippet here
end_time = datetime.datetime.now()
print(end_time - start_time)

Of course, this solution leaves a lot to be desired. For example, it only gives us a single data point. Ideally, we’d want to run this a few times to collect an average or at least a lower bound, but this can do in a pinch.
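
If you want more than one data point without leaving brute force territory, something like the following sketch could work. Note that I’ve swapped datetime for time.perf_counter here, which is a clock meant for measuring short durations, and that time_snippet is a hypothetical helper rather than anything from the standard library:

```python
import time

def time_snippet(func, runs=5):
    """Run func several times and return one duration (in seconds) per run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations

durations = time_snippet(lambda: [(a, b) for a in (1, 3, 5) for b in (2, 4, 6)])
print(f"best:    {min(durations):.8f} s")
print(f"average: {sum(durations) / len(durations):.8f} s")
```

The minimum serves as the lower bound, while the average gives a feel for typical behavior on a noisy machine.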

Performance Testing Using the timeit Library

If you’d prefer to have all this timestamp garbage abstracted away with the addition of a few perks, check out the timeit library. With the timeit library, there are basically two main ways to test code: command line or inline. For our purposes, we’ll take a look at the inline version since that’s what I use for all my testing.

To test code using the timeit library, you’ll need to call either the timeit function or the repeat function. Either one is fine, but the repeat function gives a bit more control.

As an example, we’ll test the following code snippet from an earlier article on list comprehensions:

[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]

In this snippet, we’re generating a list of pairs from two tuples. To test it, we could use the timeit function:

import timeit
timeit.timeit("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]")

If done correctly, this will run the snippet a million times and return the total time, in seconds, that all of those runs took together. Of course, you’re welcome to change the number of iterations using the number keyword argument:

import timeit
timeit.timeit("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]", number=1000)
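
Keep in mind that the value timeit.timeit returns is the total time for all runs, in seconds. If you’d rather have a per-run figure, a quick sketch like this divides the total by the number of runs (the variable names are my own):

```python
import timeit

number = 1000

# total is the time in seconds for all 1000 runs combined
total = timeit.timeit("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]", number=number)

# Divide by the run count to get the average time of a single run
per_run = total / number

print(f"total:   {total:.6f} s")
print(f"per run: {per_run:.9f} s")
```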

Naturally, we can take this test a step further by running it multiple times using the repeat function:

import timeit
timeit.repeat("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]")

Instead of returning an execution time, this function returns a list of execution times. In this case, the list will contain five separate execution times, since repeat defaults to 5 as of Python 3.7 (older versions default to 3). Of course, we don’t need all those times. Instead, we can return the smallest execution time, so we can get an idea of the lower bound of the snippet:

import timeit
min(timeit.repeat("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]"))

If you’ve been around a bit, you’ve probably seen this exact syntax in my performance tests in other articles or videos. Of course, I go the extra mile and increase the number of repetitions, but it’s probably overkill. In any case, this is a great way of performance testing Python snippets.
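
For reference, here is a sketch of what bumping up the repetitions might look like. The specific values (10 repetitions of 10,000 runs) are arbitrary, and make_pairs is a hypothetical function I wrapped around the snippet to show how the globals argument lets timeit see names defined in your own code:

```python
import timeit

def make_pairs():
    """The example list comprehension wrapped in a function."""
    return [(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]

# globals hands timeit the names the statement needs;
# repeat controls how many timings we collect, number how many runs each covers
times = timeit.repeat(
    "make_pairs()",
    globals={"make_pairs": make_pairs},
    repeat=10,
    number=10_000,
)

best = min(times)
print(f"collected {len(times)} timings, best: {best:.6f} s per 10,000 runs")
```

In a script, passing globals=globals() is a common shortcut for the same thing.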

Performance Testing Using the cProfile Library

Outside of timeit and outright brute force, you can always leverage other profiling tools like cProfile. Like timeit, we can leverage cProfile to get runtime statistics from a section of code. Of course, cProfile is quite a bit more detailed. For example, we can run the same list comprehension from above as follows:

import cProfile
cProfile.run("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]")

As a result, you get a nice report that looks like this:

4 function calls in 0.000 seconds 

    Ordered by: standard name 

    ncalls tottime percall cumtime percall filename:lineno(function) 
        1 0.000 0.000 0.000 0.000 <string>:1(<listcomp>) 
        1 0.000 0.000 0.000 0.000 <string>:1(<module>) 
        1 0.000 0.000 0.000 0.000 {built-in method builtins.exec} 
        1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}

Here, we get a nice table which includes a lot of helpful information. Specifically, each row indicates a function that was executed, and each column breaks down a different runtime segment. For example, the <listcomp> function was called once (ncalls) and took 0.000 seconds (tottime) excluding calls to subfunctions. To understand everything else in this table, check out the following breakdown of all six columns:

- ncalls: the number of times that particular function was called
  - This number may actually be written as a fraction (e.g. 3/1) where the first value is the number of total calls and the second value is the number of primitive calls (not recursive).
- tottime: the total amount of time the function spent executing not including calls to subfunctions
- percall (first): the ratio of tottime to ncalls (i.e. the average amount of time spent in this function excluding subfunctions)
- cumtime: the total amount of time the function spent executing including calls to subfunctions
- percall (second): the ratio of cumtime to primitive calls (i.e. the average amount of time spent in this function)
- filename:lineno(function): the filename, line number, and function in question

As you can see, cProfile helps you peek at the internal workings of a code snippet. Of course, you don’t get fine-grained timings, so this works better as a complement to timeit rather than a replacement. That said, I think cProfile would be excellent for profiling large scripts. That way, you can determine which functions need optimization.
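
To give a feel for that larger-script use case, here is a small sketch that profiles a couple of functions and sorts the report by cumulative time using the pstats module from the standard library. The functions make_pairs and count_pairs are stand-ins I invented for this example. In a real script, you’d profile your own entry point:

```python
import cProfile
import io
import pstats

def make_pairs(n):
    """Generate every pair of values in range(n)."""
    return [(a, b) for a in range(n) for b in range(n)]

def count_pairs(n):
    """Stand-in for a larger function that calls other functions."""
    return len(make_pairs(n))

# Profile only the code between enable() and disable()
profiler = cProfile.Profile()
profiler.enable()
total = count_pairs(200)
profiler.disable()

# Sort by cumulative time and keep the five most expensive entries
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time tends to surface the functions worth optimizing first, since it accounts for everything they call.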

Performance Testing With External Libraries

While Python provides plenty of ways to benchmark your own code, there are also third-party libraries we can leverage. For instance:

Personally, I’ve never used any of these tools, but I felt I should share them for the sake of completeness. Feel free to follow those links to learn more.

Challenge

At this point, I’d usually share some performance metrics for each of the solutions above, but that doesn’t really make sense in this context. Instead, it’s time to jump straight to the challenge!

Pick one of the articles in this series and run your own performance metrics on each of the solutions. Since I typically run timeit, maybe you could try using one of the other tools from this article. For example, try running cProfile on all the string formatting solutions.

When you’re done, share the best results in the comments. I’m interested to see what you learn! While you’re at it, check my work. I’d love to know if there are other solutions I’m missing.

A Little Recap

As always, I like to finish things out with a list of options. Keep in mind that each solution leverages an example code snippet. In this case, I chose a list comprehension, but you can use any snippet:

# Brute force solution
import datetime
start_time = datetime.datetime.now()
[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)] # example snippet
end_time = datetime.datetime.now()
print(end_time - start_time)

# timeit solution
import timeit
min(timeit.repeat("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]"))

# cProfile solution
import cProfile
cProfile.run("[(a, b) for a in (1, 3, 5) for b in (2, 4, 6)]")

Well, that’s all I’ve got! If you have any performance tools of your own to add to the list, feel free to share them in the comments.

In the meantime, I have plenty of How to Python articles you might be interested in:

If you prefer visual media, I have a YouTube channel which is currently focused on explaining content from this series. Head on over there and throw me a subscribe to help me build up my channel.

Finally, you can always get the latest The Renegade Coder content sent to your inbox through the email list. If you want to go the extra mile, toss me a couple bucks on Patreon. You won’t regret it!

At any rate, until next time!

The post How to Performance Test Python Code: timeit, cProfile, and More appeared first on The Renegade Coder.

Top comments (3)

Mišo • Jun 25 '20

It would be better to use perf_counter instead of datetime.datetime.now() because it is intended for this purpose. You could get really weird results when the clock changes between summer and winter time or during a leap second. More on why in the docs: docs.python.org/3/library/time.htm... 🙂