Why should you care about your Python code performance?
Let's be honest first: you're not using Python in your project because it's a fast programming language, but because it makes it easy to turn your idea into a real, working product. Speed was never the priority, so why should you spend your precious time improving it? For me, writing code is not just putting sloppy working pieces together and calling it a day. I want to learn how to write faster, cleaner and simpler code. Because of this, I've been looking for ways to increase my Python code's performance without sacrificing its readability. Let me show you the way of the force!
1. Choose wisely between "Ask for Forgiveness" and "Look before you leap"
- Ask for forgiveness: You run your code as normal, then wrap it in a try/except block.
# Assumes this snippet runs inside a function (hence the return)
try:
    with open("path/to/file.txt", "r") as fp:
        return fp.read()
except IOError:
    # Handle the error or just ignore it
    pass
- Look before you leap: Check everything that could cause a bug or crash in your code before you run it.
import os
from pathlib import Path

# Check if the file exists and if we have permission to read it
if Path("path/to/file.txt").exists() and os.access("path/to/file.txt", os.R_OK):
    with open("path/to/file.txt") as input_file:
        return input_file.read()
- When to use which: In most cases, "Look before you leap" is slower than "Ask for forgiveness" (~30-80% slower). Only when you expect your code to fail a lot does "Look before you leap" become the faster option (4x faster), because handling exceptions is very expensive.
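To see the trade-off on your own machine, here is a minimal timing sketch. The function names read_eafp and read_lbyl and the temporary file are assumptions made for this illustration, and the numbers you get will depend on your system and Python version.

```python
import os
import tempfile
import timeit


def read_eafp(path):
    """Ask for forgiveness: just try, and handle the failure."""
    try:
        with open(path) as fp:
            return fp.read()
    except OSError:
        return None


def read_lbyl(path):
    """Look before you leap: check first, then read."""
    if os.path.exists(path) and os.access(path, os.R_OK):
        with open(path) as fp:
            return fp.read()
    return None


# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello")
    existing_path = tmp.name
missing_path = existing_path + ".missing"

# Both styles return the same content when the file is readable.
eafp_result = read_eafp(existing_path)
lbyl_result = read_lbyl(existing_path)

for label, path in [("file exists", existing_path), ("file missing", missing_path)]:
    eafp = timeit.timeit(lambda: read_eafp(path), number=2000)
    lbyl = timeit.timeit(lambda: read_lbyl(path), number=2000)
    print(f"{label}: EAFP {eafp:.4f}s, LBYL {lbyl:.4f}s")

os.remove(existing_path)
```

The "file missing" row is the case where exceptions are raised on every call, which is where the check-first style has a chance to win.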
2. How to properly filter a list:
There are 3 popular ways to filter a list in Python:
- For Loop: This is the basic way to get your job done, it's easy to read and understand.
# Get all even numbers from a list of 1 million numbers
def filter_odd_number():
    lst = list(range(1000000))
    res = []
    for item in lst:
        if item % 2 == 0:
            res.append(item)
    return res
- List Comprehension: Your code will be drastically faster (~50% faster) than the normal for loop, and shorter as well. It's faster than the for loop because when you use a for loop, on every iteration, you have to look up the variable holding the list and then call its append() function. This doesn't happen in a list comprehension. Instead, there is a special bytecode instruction, LIST_APPEND, that appends the current value to the list you're constructing.
def filter_odd_number():
    lst = list(range(1000000))
    return [item for item in lst if item % 2 == 0]
- Python built-in filter() function: This function returns a lazy iterator rather than the whole list at once. If you need the whole list at once, this is the worst option performance-wise.
def filter_odd_number():
    lst = list(range(1000000))
    return filter(lambda x: x % 2 == 0, lst)
- When to use which: Most of the time, list comprehension is the winner: it's faster, shorter, and some may say it looks cooler as well. Its one limitation is that you cannot pack multiple statements inside it, but don't worry, you can always wrap your logic in a function and call that function inside the list comprehension. If you have a really big list (I'm talking billions of items), consider using filter() instead; you definitely don't want your big list to be duplicated.
- Note: Dictionary comprehension is also faster than a normal for loop for many tasks.
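The workaround and the note above can be sketched as follows. The helper is_wanted and its condition are hypothetical, chosen only to stand in for logic that would not fit inline.

```python
def is_wanted(n):
    """Hypothetical multi-step condition that would not fit inline."""
    if n % 2 != 0:
        return False
    return n % 3 == 0


numbers = list(range(20))

# A helper function keeps the list comprehension readable.
wanted = [n for n in numbers if is_wanted(n)]

# filter() is lazy: nothing is computed until you iterate over it.
lazy = filter(is_wanted, numbers)
first = next(lazy)

# Dictionary comprehension: map each wanted number to its square.
squares = {n: n * n for n in numbers if is_wanted(n)}

print(wanted)   # [0, 6, 12, 18]
print(first)    # 0
print(squares)  # {0: 0, 6: 36, 12: 144, 18: 324}
```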
3. Checking for True or False
We have 3 ways to achieve this:
- if var == True: This is bad, ~120% slower than the winner
- if var is True: This is also bad, ~60% slower than the winner
- if var: Good, recommended by PEP8
- The winner is clearly if var
Wait, it's not going to be that easy. In Python, there's a concept called "truthy & falsy": every value is interpreted as True or False when you run bool(variable) (either explicitly, or implicitly by the interpreter when you put the value in an if clause). So you may need different logic to deal with different types of data.
var = []  # or None or True
# Check None and True first: len() would raise a TypeError on them
if var is None:
    print("null")
elif var is True:
    print("true")
elif len(var) == 0:
    print("empty")
- To check if a variable is equal to True/False (and you don't have to distinguish between True/False and truthy/falsy values), use if variable or if not variable. It's the simplest and fastest way to do this.
- To check that a variable is explicitly True or False (and is not just truthy/falsy), use is (if variable is True).
- To check if a variable is equal to 0 or if a list is empty, use if variable == 0 or if variable == [].
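The difference between the three checks is easy to see side by side. This is a small sketch; classify is a hypothetical helper written for this illustration.

```python
def classify(value):
    """Return (truthy, equals_true, is_true) for a value."""
    return bool(value), value == True, value is True  # noqa: E712


for value in [True, 1, "text", [0], False, 0, "", [], None]:
    truthy, equals_true, is_true = classify(value)
    print(f"{value!r}: truthy={truthy}, == True: {equals_true}, is True: {is_true}")
```

Note that 1 is truthy and equal to True but is not the True object, while a non-empty list such as [0] is truthy without being equal to True at all.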
4. Use in to check if an item exists in a collection of items
- in is a powerful tool to check if an item is in a list/set/tuple or a dictionary (with a dictionary, in only checks whether a key exists). Do not use a for loop for this kind of task (~50% slower).
- Note: If you need to check for membership a lot, consider using a dict or set to store your items, because they have constant average lookup time.
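A rough way to compare the three approaches is sketched below. The function contains_loop and the sample data are made up for this illustration, and the timings will vary between machines.

```python
import timeit

items = list(range(100_000))
items_set = set(items)
target = 99_999  # worst case for the list: the last element


def contains_loop(collection, wanted):
    """Manual membership test with a for loop."""
    for item in collection:
        if item == wanted:
            return True
    return False


loop_time = timeit.timeit(lambda: contains_loop(items, target), number=20)
list_time = timeit.timeit(lambda: target in items, number=20)
set_time = timeit.timeit(lambda: target in items_set, number=20)
print(f"for loop: {loop_time:.4f}s, in list: {list_time:.4f}s, in set: {set_time:.6f}s")

# For a dict, `in` checks the keys only, never the values.
ages = {"alice": 30}
print("alice" in ages, 30 in ages)  # True False
```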
5. Remove duplicate values from a list
There are many ways to implement this task, but 3 stand out the most:
- Use a for loop or list comprehension to manually remove duplicate values: Slow, and takes longer to implement.
- Use list(set(arr)) to do the trick: This is the fastest and shortest way, but it does not preserve the order of your original list (if you don't care about that, then good!).
- Use list(dict.fromkeys(arr)) for Python 3.6 and above (insertion order is a language guarantee only from Python 3.7), or list(OrderedDict.fromkeys(arr)) on older versions: This is a little slower than the set way, but it keeps your original list order.
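A short sketch of the last two options, using a made-up sample list:

```python
from collections import OrderedDict

arr = [3, 1, 3, 2, 1]

unique_any_order = list(set(arr))                 # order is not guaranteed
unique_in_order = list(dict.fromkeys(arr))        # keeps first-seen order
unique_ordered = list(OrderedDict.fromkeys(arr))  # same result, older Pythons

print(sorted(unique_any_order))  # [1, 2, 3]
print(unique_in_order)           # [3, 1, 2]
print(unique_ordered)            # [3, 1, 2]
```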
6. Declare a data structure in style
When you need to declare an empty list, dict, tuple or set, you should use the literal syntax rather than explicitly calling the function, because it is both shorter and faster (~100% faster). This is because when you call the function, Python first has to look up the name (your script could have defined something else with that name) before falling back to its own built-in.
- List: Use [] instead of list()
- Dict: Use {} instead of dict()
- Tuple: Use () instead of tuple()
- Set: Use {x,} instead of set([x]) (there is no literal for an empty set, because {} is an empty dict)
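You can measure the gap yourself with timeit. This is only a sketch: the exact ratio depends on your Python version and machine.

```python
import timeit

for literal, call in [("[]", "list()"), ("{}", "dict()"), ("()", "tuple()")]:
    literal_time = timeit.timeit(literal, number=1_000_000)
    call_time = timeit.timeit(call, number=1_000_000)
    print(f"{literal}: {literal_time:.4f}s   {call}: {call_time:.4f}s")

# The literal and the call build equal objects.
same_results = ([] == list(), {} == dict(), () == tuple(), {1} == set([1]))
print(same_results)  # (True, True, True, True)

# Careful: {} is an empty dict, not an empty set.
print(type({}).__name__, type(set()).__name__)  # dict set
```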
7. String format
Python 3.6 introduced the famous f-string for building formatted strings (before that, we had to use .format() or Template strings), and since then it's hard to find someone who doesn't love them! But is it the optimal way to create a formatted string?
- Yes, the f-string performs consistently better than its competitors. Here's an example:
name, age = "Triet", 24
s = f"My name is {name}, I'm {age} years old"
- String concatenation also deserves an honorable mention, but its performance is not as consistent as the f-string's (sometimes it's even faster than the f-string).
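For comparison, here is a sketch that builds the same string in the four ways mentioned above and times each one. The candidates dictionary is just a convenience for this illustration, and the timings are machine-dependent.

```python
import timeit
from string import Template

name, age = "Triet", 24
template = Template("My name is $name, I'm $age years old")

candidates = {
    "f-string": lambda: f"My name is {name}, I'm {age} years old",
    "str.format": lambda: "My name is {}, I'm {} years old".format(name, age),
    "concat": lambda: "My name is " + name + ", I'm " + str(age) + " years old",
    "Template": lambda: template.substitute(name=name, age=age),
}

# All four approaches produce the same string.
results = {label: build() for label, build in candidates.items()}

for label, build in candidates.items():
    print(f"{label}: {timeit.timeit(build, number=100_000):.4f}s")
```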
8. Map vs List Comprehension
In an earlier section, I said that you should always use a list comprehension instead of a for loop for tasks that require you to loop over a list. One big problem is that a list comprehension cannot hold too much logic (without ruining your beautiful code), so my solution back there was to wrap that logic in a function, then call it inside the list comprehension. It's a good workaround, but should you do it?
- No, you should not! Use map() instead. map() is faster (~50%) and easier to write as well. Here's an example:
def pow2(x):
    return x**2

arr = list(range(1000000))
res = list(map(pow2, arr))
Why do we need the list() outside of map(), you ask? Well, map() returns a lazy iterator that yields items one by one as you consume it, so list() is needed to collect the whole result into a list.
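The lazy behaviour is easy to observe on a tiny input. In this sketch, each item is produced only when it is requested, and the iterator can be consumed only once.

```python
def pow2(x):
    return x ** 2


lazy = map(pow2, [1, 2, 3])

first = next(lazy)       # computes only the first item
rest = list(lazy)        # collects whatever has not been consumed yet
leftover = list(lazy)    # the iterator is now exhausted

print(first)     # 1
print(rest)      # [4, 9]
print(leftover)  # []
```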
9. Inlining Functions
- This is one way to increase your code's performance that goes against the concept of beautiful code: try to fit the whole function's logic into a single line. And yes, it may sound silly at first, but it actually works (~100% faster)! Here's an example:
def inline_functions():
    return sum([sum([sum([1 for _ in range(100)]) for _ in range(100)]) for _ in range(100)])
I've won... But at what cost?
- This should only be done with functions that have very little logic inside! Please don't hurt your colleagues.
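To make the trade-off concrete, here is a sketch that compares a version calling a tiny helper with a version where the helper's body is written inline. The functions square, sum_squares_with_helper and sum_squares_inlined are hypothetical names for this illustration, and the measured gap will differ between Python versions.

```python
import timeit


def square(x):
    return x * x


def sum_squares_with_helper(n):
    # One extra function call per item.
    return sum([square(i) for i in range(n)])


def sum_squares_inlined(n):
    # Same logic with the helper's body written inline.
    return sum([i * i for i in range(n)])


helper_time = timeit.timeit(lambda: sum_squares_with_helper(1000), number=1000)
inlined_time = timeit.timeit(lambda: sum_squares_inlined(1000), number=1000)
print(f"with helper: {helper_time:.4f}s, inlined: {inlined_time:.4f}s")
```

Both functions return the same value; the only difference is the cost of the extra call on every iteration.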
10. Upgrade your Python version
- Yes, you heard me: go upgrade your Python version. It's free and takes effect instantly! Python 3.11 is 10-60% faster than Python 3.10.
Reference
- Sebastian Witowski's blog: https://switowski.com/blog