Today we will discuss an optimization technique in Python. In this article, you will learn how to speed up your code by avoiding the re-evaluation of method lookups, such as nums.append and data.get, inside loops that fill a list or a dictionary.
Here is a decorator I wrote to measure the execution time of a function.
import functools
import time

def timeit(func):
    @functools.wraps(func)
    def newfunc(*args, **kwargs):
        startTime = time.time()
        result = func(*args, **kwargs)
        elapsedTime = time.time() - startTime
        print('function - {}, took {} ms to complete'.format(func.__name__, int(elapsedTime * 1000)))
        # Return the wrapped function's result so the decorator does not swallow it.
        return result
    return newfunc
Let's move on to the actual functions.
Avoid Re-evaluation in Lists
Evaluating nums.append inside the loop
@timeit
def append_inside_loop(limit):
    nums = []
    for num in limit:
        nums.append(num)

append_inside_loop(list(range(1, 9999999)))
In the above function, nums.append is a function reference that is re-evaluated each time through the loop. The total time taken by the above function:
o/p - function - append_inside_loop, took 529 ms to complete
Evaluating nums.append outside the loop
@timeit
def append_outside_loop(limit):
    nums = []
    append = nums.append
    for num in limit:
        append(num)

append_outside_loop(list(range(1, 9999999)))
In the above function, I evaluate nums.append once, outside the loop, and use the local variable append inside the loop. The total time taken by the above function:
o/p - function - append_outside_loop, took 328 ms to complete
As you can see, evaluating append = nums.append outside the for loop and storing it in a local variable took less time and sped up the code by 201 ms (from 529 ms to 328 ms).
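A single wall-clock measurement like the ones above can be noisy. As a rough cross-check, the two loops can be compared with the standard library's timeit module. This is only a minimal sketch: the names append_attr, append_cached and N, and the smaller input size, are assumptions chosen for illustration and are not part of the original benchmark. Results will vary by machine and Python version.

```python
import timeit

N = 100_000  # assumed size, much smaller than the article's so this runs quickly


def append_attr(values):
    """Look up nums.append on every iteration."""
    nums = []
    for value in values:
        nums.append(value)
    return nums


def append_cached(values):
    """Look up nums.append once, before the loop."""
    nums = []
    append = nums.append
    for value in values:
        append(value)
    return nums


values = list(range(N))

# Both versions must build the same list; only the lookup strategy differs.
assert append_attr(values) == append_cached(values)

# Take the best of several repeats to reduce noise from other processes.
t_attr = min(timeit.repeat(lambda: append_attr(values), number=5, repeat=3))
t_cached = min(timeit.repeat(lambda: append_cached(values), number=5, repeat=3))
print(f"attribute lookup in loop: {t_attr:.4f} s")
print(f"cached bound method:      {t_cached:.4f} s")
```

Taking the minimum of several repeats is a common way to filter out interference from other processes, which a single time.time() difference cannot do.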
We can apply the same technique to dictionaries. Look at the example below.
Avoid Re-evaluation in Dictionary
Evaluating data.get each time inside the loop
@timeit
def inside_evaluation(limit):
    data = {}
    for num in limit:
        data[num] = data.get(num, 0) + 1

inside_evaluation(list(range(1, 9999999)))
Total Time taken by the above function -
o/p - function - inside_evaluation, took 1400 ms to complete
Evaluating data.get outside the loop
@timeit
def outside_evaluation(limit):
    data = {}
    get = data.get
    for num in limit:
        data[num] = get(num, 0) + 1

outside_evaluation(list(range(1, 9999999)))
Total time taken by the above function -
o/p - function - outside_evaluation, took 1189 ms to complete
As you can see, we have sped up the code here by 211 ms (from 1400 ms to 1189 ms).
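The cached name works because data.get evaluates to a bound method that remembers the dictionary it was taken from. A small sketch of that behaviour follows; the counts variable and the sample words are assumptions for illustration only.

```python
counts = {}
get = counts.get  # one attribute lookup; the result is bound to counts

# The bound method keeps a reference to the dictionary it was taken from.
assert get.__self__ is counts

for word in ["a", "b", "a"]:
    # get() sees the current contents of counts, including earlier updates.
    counts[word] = get(word, 0) + 1

print(counts)  # {'a': 2, 'b': 1}
```

One caveat: if the name counts is later rebound to a different dictionary, the cached get still refers to the old one, so the cached name should live only as long as the loop that uses it.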
I hope you liked this explanation of the optimization technique in Python for lists and dictionaries. If you have any doubts or suggestions for improvement, ask in the comment section. Also, don't forget to share your own optimization techniques.
Top comments (11)
Wow thanks! I had no idea this was even possible.
For the list append, is it because of the use of len()? Just checked the code because I wanted to know why this is happening. Couldn't figure out the dictionary though.
Both are actually the same effect. When you call data.get(...), you're calling the method, of course, but also looking up the method through the data object's type, in its internal dictionary, called __dict__. What the above article is showing is that there's some savings to be made if you cache the lookup.
He's not optimizing a list at all, but he is optimizing a dict - in both cases.
Because data[num] = ... is actually data.__setitem__(num, ...), there's another candidate for optimization there, as well - but by now you can probably guess what it is.
It was fixed in Python 3.8
Do you have a source on this? How was it fixed?
So v3.8 is faster? (like the article says)
Thanks for sharing.
I wonder though if this reduces code readability.
If we are not talking about mission-critical paths, then it may not be worth the 10-15% reduction.
Yes it does - idiomatic Python code is typically not written this way.
This. Before applying this technique, make sure to measure the performance of your entire application!
Finally, a better piece of advice for beginners (in my opinion) is to use comprehensions when they can:
is more idiomatic and even faster than the technique presented here.
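A rough sketch of the comprehension idea, assuming the goal is the same as in append_inside_loop (copying every item into a new list). The function name append_comprehension is hypothetical, and the loop version is repeated here, without the timing decorator, only so the two results can be compared.

```python
def append_comprehension(limit):
    """Build the list in one expression instead of calling append in a loop."""
    return [num for num in limit]


def append_inside_loop(limit):
    """Loop-and-append version from the article, returning the list for comparison."""
    nums = []
    for num in limit:
        nums.append(num)
    return nums


sample = list(range(1, 1000))

# Both approaches produce the same list.
assert append_comprehension(sample) == append_inside_loop(sample)
```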
@Dimitri Did you try this -
Kindly look at the result below -
and as you said -
Excellent! You took the time to verify what I was saying without giving proof - and you were right to do so!
The problem is that the last line in the second function is actually building a list.
Here's a better example:
This makes me think of how slow Python really must be if the evaluation takes so long... Does anyone have more info on how much that affects performance, and why?
Great article!
Check out the dis module and disassemble the code and it explains why this happens. TLDR: it's one less instruction inside the hot loop.
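As a sketch of that suggestion, the two list functions from the article can be disassembled side by side (shown here without the timing decorator). The helper opnames is a hypothetical name introduced for illustration, and the exact opcode names differ between Python versions.

```python
import dis


def append_inside_loop(limit):
    nums = []
    for num in limit:
        nums.append(num)


def append_outside_loop(limit):
    nums = []
    append = nums.append
    for num in limit:
        append(num)


def opnames(func):
    """Return the list of opcode names in a function's bytecode."""
    return [instr.opname for instr in dis.get_instructions(func)]


# Print the full disassembly to compare the loop bodies by eye.
dis.dis(append_inside_loop)
dis.dis(append_outside_loop)

# In the first function the load of .append sits inside the loop body;
# in the second it sits before the loop and the body only loads a local.
# Opcode names such as LOAD_ATTR or LOAD_METHOD depend on the Python version.
print(opnames(append_inside_loop))
print(opnames(append_outside_loop))
```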