Python Performance Optimization: Detailed Guide
1. Overview: Optimize What Needs Optimizing
Why This Matters
Premature optimization is the root of all evil. Optimizing the wrong parts of your code wastes time and can make code harder to maintain.
The Right Approach
# Step 1: Get it right first
def calculate_total(items):
    return sum(item['price'] * item['quantity'] for item in items)

# Step 2: Test it's right
def test_calculate_total():
    items = [{'price': 10, 'quantity': 2}, {'price': 5, 'quantity': 3}]
    assert calculate_total(items) == 35

# Step 3: Profile if slow
import cProfile
cProfile.run('calculate_total(large_item_list)')

# Step 4: Optimize based on profiling results
# Step 5: Repeat testing after optimization
Key Point: Always profile first. What you think is slow might not be the bottleneck!
2. Sorting Optimization
❌ Avoid: Using Comparison Functions
# BAD - Comparison function called O(n log n) times
def compare_by_age(person1, person2):
    if person1['age'] < person2['age']:
        return -1
    elif person1['age'] > person2['age']:
        return 1
    return 0

people = [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
# Python 2 style - SLOW
# people.sort(cmp=compare_by_age)
Why This is Slow:
- Comparison function is a Python function call (expensive!)
- Called O(n log n) times during sorting
- For 10,000 items, ~130,000 function calls
- Each call has Python interpreter overhead
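If you're porting Python 2 code that already has a comparison function, `functools.cmp_to_key` can wrap it into a key function. This is a minimal sketch; it works, but a plain key function is still faster:

```python
# Wrapping a legacy comparison function for Python 3's key-based sort.
from functools import cmp_to_key

def compare_by_age(person1, person2):
    return person1['age'] - person2['age']

people = [{'name': 'Alice', 'age': 30}, {'name': 'Bob', 'age': 25}]
people.sort(key=cmp_to_key(compare_by_age))  # Works, but key=itemgetter('age') is faster
```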
✅ Use: key Parameter with operator.itemgetter
from operator import itemgetter
# GOOD - key function called only O(n) times
people = [
{'name': 'Alice', 'age': 30, 'salary': 100000},
{'name': 'Bob', 'age': 25, 'salary': 80000},
{'name': 'Charlie', 'age': 35, 'salary': 120000}
]
# Sort by single field
people.sort(key=itemgetter('age'))
# [{'name': 'Bob', 'age': 25, ...}, {'name': 'Alice', 'age': 30, ...}, ...]
# Sort by multiple fields (age, then salary)
people.sort(key=itemgetter('age', 'salary'))
# For tuples/lists, use index
data = [(1, 'apple', 5), (2, 'banana', 3), (3, 'cherry', 8)]
data.sort(key=itemgetter(2)) # Sort by third element
# [(2, 'banana', 3), (1, 'apple', 5), (3, 'cherry', 8)]
Why This is Fast:
- `itemgetter` is implemented in C
- Key function called only once per item (O(n) calls)
- Native comparisons on extracted keys (fast C code)
- For 10,000 items: 10,000 key calls vs ~130,000 comparison calls
✅ Use: sorted() for Non-Destructive Sorting
# ❌ Bad - modifies original list
original = [3, 1, 4, 1, 5]
original.sort()
print(original) # [1, 1, 3, 4, 5] - original destroyed!
# ✅ Good - preserves original
original = [3, 1, 4, 1, 5]
sorted_copy = sorted(original)
print(original) # [3, 1, 4, 1, 5] - unchanged
print(sorted_copy) # [1, 1, 3, 4, 5]
# Works with any iterable
sorted_set = sorted({3, 1, 4, 1, 5}) # [1, 3, 4, 5]
sorted_dict_keys = sorted({'z': 1, 'a': 2, 'm': 3}) # ['a', 'm', 'z']
Advanced Sorting Techniques
from operator import itemgetter, attrgetter, methodcaller
# 1. Sort objects by attribute
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age
    def __repr__(self):
        return f"Person({self.name}, {self.age})"
people = [Person('Alice', 30), Person('Bob', 25), Person('Charlie', 35)]
# Sort by attribute
people.sort(key=attrgetter('age'))
# [Person(Bob, 25), Person(Alice, 30), Person(Charlie, 35)]
# Sort by multiple attributes
people.sort(key=attrgetter('age', 'name'))
# 2. Sort by method result
words = ['Python', 'java', 'C++', 'javascript']
words.sort(key=methodcaller('lower')) # Case-insensitive sort
# ['C++', 'java', 'javascript', 'Python']
# 3. Reverse sorting (descending)
numbers = [3, 1, 4, 1, 5, 9]
numbers.sort(reverse=True) # [9, 5, 4, 3, 1, 1]
# Or with key
people.sort(key=attrgetter('age'), reverse=True) # Oldest first
# 4. Complex sorting with lambda (when operator functions won't work)
points = [(1, 5), (3, 2), (1, 3), (2, 4)]
# Sort by distance from origin
points.sort(key=lambda p: p[0]**2 + p[1]**2)
# But prefer operator when possible (faster):
points.sort(key=itemgetter(1)) # Sort by y-coordinate
Decorate-Sort-Undecorate (DSU) Pattern (Legacy)
# Old Python 2 pattern - now obsolete with key parameter
# But understanding it helps explain how key works internally
# ❌ Manual DSU (old way)
def sortby_manual(somelist, n):
    # Decorate
    decorated = [(x[n], x) for x in somelist]
    # Sort
    decorated.sort()
    # Undecorate
    return [x for (key, x) in decorated]

# ✅ Modern way (Python 2.4+)
def sortby_modern(somelist, n):
    return sorted(somelist, key=itemgetter(n))
Sorting Stability (Important!)
# Python's sort is STABLE (since 2.3)
# Equal elements maintain their relative order
students = [
{'name': 'Alice', 'grade': 'A', 'age': 20},
{'name': 'Bob', 'grade': 'B', 'age': 19},
{'name': 'Charlie', 'grade': 'A', 'age': 21},
{'name': 'David', 'grade': 'B', 'age': 20}
]
# Sort by grade (stable)
students.sort(key=itemgetter('grade'))
# Within same grade, original order preserved
# Multi-level sorting using stability
# Sort by secondary key first, then primary key
students.sort(key=itemgetter('age')) # Sort by age first
students.sort(key=itemgetter('grade')) # Then by grade (stable!)
# Result: Sorted by grade, and within same grade, sorted by age
Performance Comparison
import timeit
from operator import itemgetter
data = [{'id': i, 'value': i % 100} for i in range(10000)]
# Using lambda
time_lambda = timeit.timeit(
lambda: sorted(data, key=lambda x: x['value']),
number=1000
)
# Using itemgetter
time_itemgetter = timeit.timeit(
lambda: sorted(data, key=itemgetter('value')),
number=1000
)
print(f"Lambda: {time_lambda:.4f}s")
# Output: ~2.5s
print(f"itemgetter: {time_itemgetter:.4f}s")
# Output: ~1.8s (30-40% faster!)
When to Use What
# Use .sort() when:
# - You want to modify the list in place
# - You don't need the original order
# - Slightly more memory efficient
my_list.sort(key=itemgetter('field'))
# Use sorted() when:
# - You need to keep the original
# - Sorting any iterable (not just lists)
# - More functional programming style
new_list = sorted(my_list, key=itemgetter('field'))
# Use itemgetter when:
# - Sorting by dictionary keys or tuple/list indices
# - Need maximum performance
from operator import itemgetter
data.sort(key=itemgetter('age', 'name'))
# Use attrgetter when:
# - Sorting objects by attributes
from operator import attrgetter
objects.sort(key=attrgetter('attribute'))
# Use lambda when:
# - Complex transformation needed
# - itemgetter/attrgetter won't work
data.sort(key=lambda x: (x['category'], -x['priority']))
3. String Concatenation
❌ Avoid: Using += for String Building
# BAD - Creates a new string object on every iteration
def build_html_bad(items):
    html = ""
    for item in items:
        html += "<li>" + item + "</li>"  # Creates new string each time
    return html
# Why it's slow:
# Iteration 1: "" -> "<li>apple</li>" (new string created)
# Iteration 2: "<li>apple</li>" -> "<li>apple</li><li>banana</li>" (another new string)
# Each concatenation copies ALL previous characters again!
Why This is Slow:
- Strings are immutable in Python
- Each `+=` creates a completely new string object
- For n items, this copies characters O(n²) times
- With 10,000 items, you might copy millions of characters
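If you genuinely need to build a string incrementally (for example, across many function calls), an in-memory buffer avoids the repeated copying too. A minimal sketch using the standard library's `io.StringIO`; `join()` remains the usual choice:

```python
# Alternative when you must build incrementally: an in-memory text buffer.
import io

def build_html_buffer(items):
    buf = io.StringIO()
    for item in items:
        buf.write(f"<li>{item}</li>")  # Appends without copying the whole string
    return buf.getvalue()              # Single final string
```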
✅ Use: join() Method
# GOOD - Builds list first, then joins once
def build_html_good(items):
    parts = []
    for item in items:
        parts.append(f"<li>{item}</li>")
    return "".join(parts)

# Even better - list comprehension
def build_html_best(items):
    return "".join(f"<li>{item}</li>" for item in items)
Why This is Fast:
- List operations are cheap
- `join()` calculates the total size once and allocates memory once
- Only one string copy operation at the end
- O(n) time complexity instead of O(n²)
Performance Comparison
import time
items = ['item' + str(i) for i in range(10000)]
# Bad approach
start = time.time()
result = ""
for item in items:
    result += item
print(f"Concatenation: {time.time() - start:.4f}s")
# Output: ~2-5 seconds
# Good approach
start = time.time()
result = "".join(items)
print(f"Join: {time.time() - start:.4f}s")
# Output: ~0.001 seconds (1000x faster!)
String Formatting
# ❌ Avoid concatenation
output = "<html>" + head + prologue + query + tail + "</html>"
# ✅ Use formatting (better)
output = "<html>%s%s%s%s</html>" % (head, prologue, query, tail)
# ✅ Use f-strings (Python 3.6+, best)
output = f"<html>{head}{prologue}{query}{tail}</html>"
3. Loops and Iteration
❌ Avoid: Manual Loop with Append
# BAD - Slow due to repeated attribute lookups and Python loop overhead
def process_words_bad(words):
    result = []
    for word in words:
        result.append(word.upper())
    return result
Why This is Slow:
- Python interpreter overhead for each iteration
- Repeated method lookups (`.append`, `.upper`)
- Function call overhead for each operation
✅ Use: List Comprehensions
# GOOD - Optimized by interpreter
def process_words_good(words):
    return [word.upper() for word in words]
Why This is Fast:
- List comprehensions are optimized at the bytecode level
- Reduces interpreter overhead
- More concise and readable
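A quick way to check this claim on your own machine is a `timeit` comparison; exact numbers will vary with your Python version and hardware:

```python
# Rough benchmark: manual loop vs list comprehension (numbers vary by machine).
import timeit

words = ['hello'] * 10_000
loop_stmt = """
result = []
for w in words:
    result.append(w.upper())
"""
comp_stmt = "[w.upper() for w in words]"

print(timeit.timeit(loop_stmt, globals={'words': words}, number=100))
print(timeit.timeit(comp_stmt, globals={'words': words}, number=100))
```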
✅ Use: map() for Simple Operations
# ALSO GOOD - Pushes loop into C code
def process_words_map(words):
    return list(map(str.upper, words))
Why This is Fast:
- `map()` is implemented in C
- No Python interpreter overhead per iteration
- Very efficient for simple transformations
Map in Detail
The `map()` function in Python is used to apply a function to every item in an iterable (like a list or tuple) and return a map object (which can be converted to a list, set, etc.).
🧠 Syntax of map()
map(function, iterable)
- `function`: A function that will be applied to each item in the iterable.
- `iterable`: A sequence (like a list, tuple, etc.) whose items will be processed by the function.
✅ Example 1: Using Built-in Function str.upper
words = ["hello", "world"]
result = map(str.upper, words)
print(list(result)) # Output: ['HELLO', 'WORLD']
Here, `str.upper` is applied to each word in the list.
✅ Example 2: Using a Lambda Function
numbers = [1, 2, 3, 4]
squared = map(lambda x: x**2, numbers)
print(list(squared)) # Output: [1, 4, 9, 16]
The lambda function `lambda x: x**2` squares each number.
✅ Example 3: Using a Custom Function
def add_prefix(word):
    return "pre_" + word
words = ["fix", "view", "dict"]
result = map(add_prefix, words)
print(list(result)) # Output: ['pre_fix', 'pre_view', 'pre_dict']
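`map()` also accepts several iterables in parallel; the function receives one element from each, and iteration stops at the shortest iterable:

```python
# map() over multiple iterables: pow() gets one element from each list.
bases = [2, 3, 4]
exponents = [10, 2, 3]
print(list(map(pow, bases, exponents)))  # [1024, 9, 64]
```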
Generator Expressions (Memory Efficient)
# When you don't need the full list at once
def process_large_file(filename):
    with open(filename) as f:
        # ❌ Bad - loads entire file into memory
        lines = [line.upper() for line in f]
        # ✅ Good - processes one line at a time
        lines = (line.upper() for line in f)
        for line in lines:
            process(line)  # Only one line in memory at a time
Generator Expression in Detail
The difference between `yield` and generators is a common point of confusion for many Python learners, so let's clarify how they relate to each other.
🧠 What Is a Generator?
A generator is a special type of iterator in Python that produces values one at a time, only when requested. It’s useful for working with large datasets or streams of data because it doesn’t store everything in memory.
There are two ways to create a generator:
1. Using a Generator Function with yield
def my_generator():
    yield 1
    yield 2
    yield 3
- When you call `my_generator()`, it returns a generator object.
- Each time you iterate over it (e.g., with a `for` loop), it runs until it hits a `yield`, returns that value, and pauses.
- When you ask for the next value, it resumes from where it left off.
2. Using a Generator Expression
gen = (x * x for x in range(3))
- This is similar to a list comprehension, but with `()` instead of `[]`.
- It also returns a generator object and evaluates lazily (one item at a time).
🔄 What Is yield?
- `yield` is a keyword used inside a function to turn it into a generator function.
- It's like `return`, but instead of ending the function, it pauses and allows the function to continue later.
✅ Example: Using yield
```python
def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count += 1

gen = count_up_to(3)
for num in gen:
    print(num)
```
Output:
1
2
3
- Each call to `next(gen)` gives the next number.
- The function remembers its state between calls.
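You can also drive the same generator by hand with `next()`; the `for` loop just does this for you and stops cleanly when the generator is exhausted:

```python
# Driving the generator manually with next().
gen = count_up_to(2)
print(next(gen))  # 1
print(next(gen))  # 2
# A further next(gen) raises StopIteration; a for loop catches that to end cleanly.
```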
🔁 What a Generator Does
When you write:
```python
lines = (line.upper() for line in f)
```
This creates a generator object. It doesn’t actually read or process any lines yet. It just sets up the logic for how each line will be processed when requested.
🧠 Why the for Loop Is Needed
```python
for line in lines:
    process(line)
```
The `for` loop triggers the generator to start reading the file line by line, converting each line to uppercase, and passing it to `process()`.
Without the loop, the generator just sits there — it doesn’t do anything.
4. Avoiding Dots (Attribute Lookups)
❌ Avoid: Repeated Attribute Lookups
```python
# BAD - Looks up .append and .upper on every iteration
def process_bad(words):
    result = []
    for word in words:
        result.append(word.upper())  # Two lookups per iteration
    return result
```
Why This is Slow:
- Python does attribute lookup at runtime
- Each dot (`.`) triggers a dictionary lookup
- For 1 million items, that's 2 million dictionary lookups!
✅ Use: Cache Attribute Lookups
```python
# GOOD - Look up once, use many times
def process_good(words):
    result = []
    append = result.append  # Cache the method
    upper = str.upper       # Cache the function
    for word in words:
        append(upper(word))  # Direct reference, no lookup
    return result
```
Why This is Fast:
- Attribute lookup happens only once
- Direct variable access is much faster
- Reduces bytecode instructions per iteration
Real-World Example
```python
# ❌ Bad - repeated lookups
def parse_data_bad(data):
    results = []
    for item in data:
        if item.value > 0:         # Lookup 'value'
            results.append({       # Lookup 'append'
                'id': item.id,     # Lookup 'id'
                'name': item.name  # Lookup 'name'
            })
    return results

# ✅ Good - cache lookups
def parse_data_good(data):
    results = []
    append = results.append
    for item in data:
        value = item.value
        if value > 0:
            append({
                'id': item.id,
                'name': item.name
            })
    return results
```
Caution: Only use this technique in performance-critical loops. It reduces readability, so use it judiciously.
5. Local vs Global Variables
❌ Avoid: Global Variables in Loops
```python
# BAD - Accessing globals is slow
counter = 0

def process_global():
    global counter
    for i in range(1000000):
        counter += 1  # Global lookup on every iteration
```
Why This is Slow:
- Global variables are stored in a dictionary (globals())
- Each access requires a dictionary lookup
- Much slower than local variable access
✅ Use: Local Variables
```python
# GOOD - Local variables use optimized storage
def process_local():
    counter = 0  # Local variable
    for i in range(1000000):
        counter += 1  # Fast local access
    return counter
```
Why This is Fast:
- Local variables are stored in an array-like structure
- Access is by index, not dictionary lookup
- Much faster at the C level
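A related micro-optimization, shown here as a sketch, is to bind a global or built-in as a default argument so the lookup happens once when the function is defined rather than on every call:

```python
# Default-argument binding: the math.sqrt lookup happens once, at definition time.
import math

def norms(points, sqrt=math.sqrt):   # sqrt is now a fast local name
    return [sqrt(x * x + y * y) for (x, y) in points]

print(norms([(3, 4), (6, 8)]))       # [5.0, 10.0]
```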
Best Practice
```python
# ❌ Avoid
import math

def calculate_distances(points):
    distances = []
    for p1, p2 in points:
        # math.sqrt is a global lookup each time
        dist = math.sqrt((p2[0] - p1[0])**2 + (p2[1] - p1[1])**2)
        distances.append(dist)
    return distances

# ✅ Use
import math

def calculate_distances_fast(points):
    distances = []
    append = distances.append
    sqrt = math.sqrt  # Make it local!
    for p1, p2 in points:
        dist = sqrt((p2[0] - p1[0])**2 + (p2[1] - p1[1])**2)
        append(dist)
    return distances
```
6. Dictionary Initialization
❌ Avoid: if-else for Dictionary Keys
```python
# BAD - Dictionary lookup happens twice on every iteration
def count_words_bad(words):
    word_count = {}
    for word in words:
        if word not in word_count:  # First lookup
            word_count[word] = 0
        word_count[word] += 1       # Second lookup
    return word_count
```
Why This is Slow:
- Double dictionary lookup for existing keys
- Double dictionary lookup for existing keys
- The `if` statement is evaluated every single time
- After the first occurrence, the `if` check always fails but is still evaluated
✅ Use: try-except (EAFP)
```python
# GOOD - Only one lookup for existing keys
def count_words_try(words):
    word_count = {}
    for word in words:
        try:
            word_count[word] += 1  # Try to increment
        except KeyError:
            word_count[word] = 1   # Only runs once per unique word
    return word_count
```
Why This is Fast:
- Python's EAFP (Easier to Ask for Forgiveness than Permission) philosophy
- Exceptions are cheap when not raised
- Only one dictionary lookup for existing keys
- Exception only raised once per unique word
✅ Use: dict.get() with Default
```python
# ALSO GOOD - Clear and concise
def count_words_get(words):
    word_count = {}
    for word in words:
        word_count[word] = word_count.get(word, 0) + 1
    return word_count
```
✅ Use: defaultdict (Best for Most Cases)
```python
# BEST - Most Pythonic and readable
from collections import defaultdict

def count_words_defaultdict(words):
    word_count = defaultdict(int)  # int() returns 0
    for word in words:
        word_count[word] += 1      # No checking needed!
    return word_count
```
Why This is Best:
- No explicit initialization needed
- Clear intent
- Very efficient
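For counting specifically, `collections.Counter` is even more direct than `defaultdict(int)`; a small example:

```python
# Counter does the counting loop for you in optimized code.
from collections import Counter

def count_words_counter(words):
    return Counter(words)

counts = count_words_counter(['a', 'b', 'a', 'c', 'a'])
print(counts.most_common(1))  # [('a', 3)]
```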
7. Import Statement Overhead
❌ Avoid: Imports Inside Tight Loops
```python
# BAD - Imports module on every function call
def process_data_bad():
    for i in range(100000):
        import string  # Module lookup happens 100,000 times!
        result = string.ascii_lowercase
```
Why This is Slow:
- Python checks
sys.modules
on every import - Even though module isn't reloaded, the lookup is expensive
- Adds unnecessary overhead to every iteration
✅ Use: Import at Module Level
```python
# GOOD - Import once
import string

def process_data_good():
    for i in range(100000):
        result = string.ascii_lowercase  # Direct access
```
Why This is Fast:
- Module imported only once when file is loaded
- No import overhead in the function
- Variable lookup is much faster
✅ Use: Import Specific Names
```python
# EVEN BETTER - No attribute lookup
from string import ascii_lowercase

def process_data_better():
    for i in range(100000):
        result = ascii_lowercase  # No dot lookup!
```
Lazy Imports (When Needed) see import email part
`python
When module might not be needed
class EmailProcessor:
def __init__(self):
self._email_module = None
def parse_email(self, email_string):
# ✅ Import only when first needed
if self._email_module is None:
import email
self._email_module = email
return self._email_module.message_from_string(email_string)
This is useful when:
1. Import is expensive (large module)
2. Module might not be used in this execution
3. You want faster startup time
`
8. Data Aggregation
❌ Avoid: Processing Items One at a Time
```python
# BAD - Function call overhead for each item
def process_items_bad(items):
    total = 0
    for item in items:
        total = add_to_total(total, item)  # Function call per item
    return total

def add_to_total(total, item):
    return total + item
```
Why This is Slow:
- Python function calls are expensive
- Stack frame creation/destruction for each call
- Parameter passing overhead
- For 1 million items, 1 million function calls!
✅ Use: Process Data in Batches
```python
# GOOD - Process entire collection at once
def process_items_good(items):
    return sum(items)  # Built-in, implemented in C

# Or if you need custom processing
def process_items_batch(items):
    total = 0
    for item in items:  # Loop inside the function
        total += item
    return total
```
Why This is Fast:
- Single function call regardless of data size
- Loop overhead amortized over all items
- Built-in functions use optimized C code
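When one giant call isn't possible, processing in fixed-size chunks still amortizes the overhead. A sketch using `itertools.islice`; `save_batch()` here is a hypothetical batch API, not a real library call:

```python
# Chunked processing: overhead is paid once per batch instead of once per item.
from itertools import islice

def process_in_batches(items, batch_size=1000):
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        save_batch(batch)  # Hypothetical batch API: one call per chunk
```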
Real-World Example
`python
❌ Bad - one API call per item
def save_users_bad(users):
for user in users:
database.save(user) # Network round-trip each time
✅ Good - batch API call
def save_users_good(users):
database.bulk_save(users) # Single network round-trip
❌ Bad - one validation per call
def validate_emails_bad(emails):
valid = []
for email in emails:
if is_valid_email(email): # Function call per email
valid.append(email)
return valid
✅ Good - validate in batch
def validate_emails_good(emails):
return [email for email in emails
if '@' in email and '.' in email] # Inline check
`
9. Advanced Built-in Functions and Techniques
all() and any() - Short-Circuit Evaluation
`python
❌ Bad - checks every element even after finding result
def has_negative_manual(numbers):
found = False
for num in numbers:
if num < 0:
found = True
break # Manual short-circuit
return found
✅ Good - built-in with automatic short-circuit
def has_negative_builtin(numbers):
return any(num < 0 for num in numbers)
Why it's better:
numbers = list(range(-1, 1000000))
any() stops at -1 (first element)
Manual loop in Python is slower even with break
`
Key Point: `any()` and `all()` are implemented in C and stop as soon as the result is determined.
`python
Real-world examples
def validate_data(records):
# Check if all records are valid (stops at first invalid)
return all(record.get('id') and record.get('name') for record in records)
def has_error(responses):
# Check if any response has error (stops at first error)
return any(resp.status_code >= 400 for resp in responses)
`
enumerate() - Better Than range(len())
`python
❌ Bad - manual indexing
items = ['apple', 'banana', 'cherry']
for i in range(len(items)):
print(f"{i}: {items[i]}") # Extra lookup
✅ Good - enumerate provides index and value
for i, item in enumerate(items):
print(f"{i}: {item}") # No lookup needed
Start from different index
for i, item in enumerate(items, start=1):
print(f"{i}: {item}") # 1: apple, 2: banana, 3: cherry
`
zip() - Parallel Iteration
`python
❌ Bad - manual indexing for parallel lists
names = ['Alice', 'Bob', 'Charlie']
ages = [30, 25, 35]
cities = ['NYC', 'LA', 'Chicago']
for i in range(len(names)):
print(f"{names[i]}, {ages[i]}, {cities[i]}")
✅ Good - zip combines iterables
for name, age, city in zip(names, ages, cities):
print(f"{name}, {age}, {city}")
Create dictionary from two lists
keys = ['a', 'b', 'c']
values = [1, 2, 3]
dict_from_zip = dict(zip(keys, values)) # {'a': 1, 'b': 2, 'c': 3}
Unzip (transpose)
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]
numbers, letters = zip(*pairs)
numbers = (1, 2, 3), letters = ('a', 'b', 'c')
`
itertools - The Power Tools
`python
from itertools import chain, islice, groupby, accumulate, product, combinations
1. chain - flatten multiple iterables (no intermediate list!)
❌ Bad
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined = list1 + list2 # Creates new list
✅ Good
combined = chain(list1, list2) # Lazy iterator
for item in combined:
print(item) # No intermediate list created
2. islice - slicing iterators without loading everything
❌ Bad - loads entire file
with open('huge_file.txt') as f:
first_10 = list(f)[:10] # Loads entire file!
✅ Good - only reads what's needed
from itertools import islice
with open('huge_file.txt') as f:
first_10 = list(islice(f, 10)) # Reads only 10 lines
3. groupby - group consecutive items
from operator import itemgetter
data = [
{'category': 'A', 'value': 1},
{'category': 'A', 'value': 2},
{'category': 'B', 'value': 3},
{'category': 'B', 'value': 4}
]
data.sort(key=itemgetter('category')) # MUST be sorted first!
for category, items in groupby(data, key=itemgetter('category')):
print(f"{category}: {list(items)}")
4. accumulate - running totals
from itertools import accumulate
numbers = [1, 2, 3, 4, 5]
running_sum = list(accumulate(numbers)) # [1, 3, 6, 10, 15]
Custom operation
running_product = list(accumulate(numbers, lambda x, y: x * y))
[1, 2, 6, 24, 120]
5. product - cartesian product (nested loops)
❌ Bad - manual nested loops
result = []
for color in ['red', 'blue']:
for size in ['S', 'M', 'L']:
result.append((color, size))
✅ Good - itertools.product
from itertools import product
result = list(product(['red', 'blue'], ['S', 'M', 'L']))
[('red', 'S'), ('red', 'M'), ('red', 'L'), ('blue', 'S'), ...]
6. combinations and permutations
from itertools import combinations, permutations
items = ['A', 'B', 'C']
All 2-item combinations (order doesn't matter)
list(combinations(items, 2))
[('A', 'B'), ('A', 'C'), ('B', 'C')]
All 2-item permutations (order matters)
list(permutations(items, 2))
[('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]
`
set Operations - O(1) Membership Testing
`python
❌ Bad - O(n) lookup in list
allowed_users = ['alice', 'bob', 'charlie'] # List
for user in all_users:
if user in allowed_users: # O(n) check!
process(user)
✅ Good - O(1) lookup in set
allowed_users = {'alice', 'bob', 'charlie'} # Set
for user in all_users:
if user in allowed_users: # O(1) check!
process(user)
Set operations
set1 = {1, 2, 3, 4, 5}
set2 = {4, 5, 6, 7, 8}
Intersection (common elements)
common = set1 & set2 # {4, 5}
Union (all elements)
all_elements = set1 | set2 # {1, 2, 3, 4, 5, 6, 7, 8}
Difference (in set1 but not set2)
only_in_set1 = set1 - set2 # {1, 2, 3}
Symmetric difference (in either but not both)
symmetric = set1 ^ set2 # {1, 2, 3, 6, 7, 8}
Remove duplicates while preserving order (Python 3.7+)
items = [1, 2, 2, 3, 1, 4, 3, 5]
unique = list(dict.fromkeys(items)) # [1, 2, 3, 4, 5]
`
functools - Function Tools
`python
from functools import lru_cache, partial, reduce
1. lru_cache - Memoization (caching results)
❌ Bad - recalculates same values
def fibonacci_slow(n):
if n < 2:
return n
return fibonacci_slow(n-1) + fibonacci_slow(n-2)
fibonacci_slow(35) takes ~5 seconds
✅ Good - caches results
@lru_cache(maxsize=128)
def fibonacci_fast(n):
if n < 2:
return n
return fibonacci_fast(n-1) + fibonacci_fast(n-2)
fibonacci_fast(35) takes ~0.0001 seconds!
Real-world example: expensive API call
@lru_cache(maxsize=100)
def get_user_data(user_id):
# Expensive database query or API call
return database.query(user_id)
2. partial - Pre-fill function arguments
def power(base, exponent):
return base ** exponent
square = partial(power, exponent=2)
cube = partial(power, exponent=3)
print(square(5)) # 25
print(cube(5)) # 125
Useful with map/filter
from functools import partial
from operator import mul
double = partial(mul, 2)
numbers = [1, 2, 3, 4, 5]
doubled = list(map(double, numbers)) # [2, 4, 6, 8, 10]
3. reduce - Cumulative operations
from functools import reduce
from operator import mul
numbers = [1, 2, 3, 4, 5]
product = reduce(mul, numbers) # 1*2*3*4*5 = 120
More readable alternatives exist for common cases:
sum() instead of reduce(add, numbers)
math.prod() (Python 3.8+) instead of reduce(mul, numbers)
`
collections Module Power Tools
`python
from collections import deque, namedtuple, ChainMap
1. deque - Fast appends/pops from both ends
❌ Bad - list is O(n) for left operations
my_list = [1, 2, 3]
my_list.insert(0, 0) # O(n) - shifts all elements
✅ Good - deque is O(1) for both ends
from collections import deque
my_deque = deque([1, 2, 3])
my_deque.appendleft(0) # O(1)
my_deque.append(4) # O(1)
my_deque.popleft() # O(1)
my_deque.pop() # O(1)
Ring buffer / sliding window
recent_items = deque(maxlen=5) # Only keeps last 5 items
for item in range(10):
recent_items.append(item)
print(recent_items) # deque([5, 6, 7, 8, 9])
2. namedtuple - Lightweight object
❌ Bad - dictionary overhead
user = {'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}
print(user['name']) # Dictionary lookup
✅ Good - namedtuple is faster and cleaner
from collections import namedtuple
User = namedtuple('User', ['name', 'age', 'email'])
user = User('Alice', 30, 'alice@example.com')
print(user.name) # Attribute access (faster)
print(user[0]) # Also supports indexing
Immutable and memory efficient
Uses tuple's memory layout but with named access
3. ChainMap - Combine multiple dictionaries
❌ Bad - creates new dictionary
config = {**defaults, **user_config, **override} # Memory overhead
✅ Good - ChainMap provides view without copying
from collections import ChainMap
combined = ChainMap(override, user_config, defaults)
Lookups check each dict in order, no copying
`
Context Managers - Resource Management
`python
❌ Bad - manual resource management
file = open('data.txt', 'r')
try:
data = file.read()
process(data)
finally:
file.close() # Easy to forget!
✅ Good - automatic cleanup
with open('data.txt', 'r') as file:
data = file.read()
process(data)
File automatically closed even if exception occurs
Multiple context managers
with open('input.txt') as infile, open('output.txt', 'w') as outfile:
for line in infile:
outfile.write(process(line))
Custom context manager with contextlib
from contextlib import contextmanager
import time
@contextmanager
def timer(name):
start = time.time()
yield # Control returns to caller
end = time.time()
print(f"{name} took {end-start:.4f}s")
Usage
with timer("Data processing"):
process_large_dataset()
`
Assignment Expressions (Walrus Operator) - Python 3.8+
`python
❌ Bad - compute twice or extra variable
data = get_data()
if len(data) > 0:
process(data)
Or
if len(get_data()) > 0: # Calls get_data() twice!
process(get_data())
✅ Good - compute once, use in condition
if (n := len(data)) > 0:
print(f"Processing {n} items")
process(data)
Useful in list comprehensions
❌ Bad - expensive_function called twice per item
results = [expensive_function(x) for x in items
if expensive_function(x) > threshold]
✅ Good - called once per item
results = [y for x in items
if (y := expensive_function(x)) > threshold]
While loop pattern
❌ Bad
line = file.readline()
while line:
process(line)
line = file.readline()
✅ Good
while (line := file.readline()):
process(line)
`
Let's break down that walrus-operator condition in more detail.
🔍 Code in Question
```python
if (n := len(data)) > 0:
    print(f"Processing {n} items")
    process(data)
```
✅ What's Happening Here?
This line uses the walrus operator `:=`, which was introduced in Python 3.8. It allows you to assign a value to a variable as part of an expression, which is especially useful in conditions.
🧠 Step-by-Step Explanation
- `len(data)`: calculates the number of items in the `data` list (or any iterable).
- `n := len(data)`: assigns the result of `len(data)` to the variable `n`. Because this happens inside the `if` condition, you don't need a separate `n = len(data)` line before the check.
- `if (n := len(data)) > 0:`: checks whether the length of `data` is greater than 0. If true, the block runs and uses `n` (already computed) without recalculating `len(data)`.
- `print(f"Processing {n} items")`: prints how many items are being processed.
- `process(data)`: calls a function named `process()` and passes `data` to it.
📌 Why Is This Good Practice?
- Efficiency: you compute `len(data)` once but use it twice, in the condition and in the print statement.
- Cleaner Code: avoids repetition and keeps the logic compact.
- Readability: once you're familiar with the walrus operator, it makes code more expressive.
🧪 Example Without Walrus Operator
```python
n = len(data)
if n > 0:
    print(f"Processing {n} items")
    process(data)
```
This is perfectly fine too — but the walrus operator lets you do it in one line.
10. Profiling: Finding the Real Bottlenecks
Why Profile?
```python
# You might think THIS is slow:
def process_data(data):
    # Complex calculation
    results = [x**2 + x**3 + x**4 for x in data]  # Suspicious!
    # Simple operation
    return '\n'.join(map(str, results))  # Looks innocent

# But profiling reveals the truth:
# 95% of time is spent in join()!
# Only 5% in the calculation
```
Basic Profiling with cProfile
`python
import cProfile
import pstats
def my_function():
# Your code here
data = list(range(10000))
result = [x**2 for x in data]
return sum(result)
Profile the function
cProfile.run('my_function()', 'output.prof')
Analyze results
stats = pstats.Stats('output.prof')
stats.sort_stats('cumulative')
stats.print_stats(10) # Top 10 slowest
`
Line-by-Line Profiling
`python
Install: pip install line_profiler
Add @profile decorator
@profile
def slow_function():
result = []
for i in range(10000):
result.append(i ** 2) # How slow is this?
total = sum(result) # How about this?
return total
Run: kernprof -l -v script.py
Shows time spent on EACH LINE
`
Memory Profiling
`python
Install: pip install memory_profiler
from memory_profiler import profile
@profile
def memory_hog():
# ❌ Bad - creates huge list
big_list = [i for i in range(10000000)]
# ✅ Good - uses generator
big_gen = (i for i in range(10000000))
return sum(big_gen)
Run: python -m memory_profiler script.py
`
Quick Timing with timeit
`python
import timeit
Compare different approaches
setup = "data = list(range(1000))"
Approach 1
time1 = timeit.timeit(
"result = [x**2 for x in data]",
setup=setup,
number=10000
)
Approach 2
time2 = timeit.timeit(
"result = list(map(lambda x: x**2, data))",
setup=setup,
number=10000
)
print(f"List comp: {time1:.4f}s")
print(f"Map: {time2:.4f}s")
print(f"Winner: {'List comp' if time1 < time2 else 'Map'}")
`
13. Memory Optimization Techniques
Generators vs Lists
`python
❌ Bad - loads everything into memory
def read_large_file_bad(filename):
with open(filename) as f:
return [line.strip() for line in f] # All lines in memory!
For 1GB file, needs 1GB+ RAM
lines = read_large_file_bad('huge.txt')
for line in lines:
process(line)
✅ Good - processes one line at a time
def read_large_file_good(filename):
with open(filename) as f:
for line in f: # Generator, not list!
yield line.strip()
Only one line in memory at a time
for line in read_large_file_good('huge.txt'):
process(line)
Generator expression vs list comprehension
❌ List comprehension - immediate memory allocation
squares_list = [x**2 for x in range(1000000)] # ~8MB memory
✅ Generator expression - lazy evaluation
squares_gen = (x**2 for x in range(1000000)) # ~128 bytes
Values computed on-demand
`
When to Use Generators:
- Processing large datasets
- Pipeline operations (one result feeds into another)
- You don't need the entire result set at once
- Memory is a concern
When to Use Lists:
- Need multiple passes over data
- Need random access (indexing)
- Need to know the length
- Dataset is small
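You can see the footprint difference with `sys.getsizeof`; exact byte counts vary by platform and Python version:

```python
# Rough footprint comparison of a list vs a generator of the same values.
import sys

squares_list = [x * x for x in range(1_000_000)]
squares_gen = (x * x for x in range(1_000_000))
print(sys.getsizeof(squares_list))  # Several megabytes for the list object
print(sys.getsizeof(squares_gen))   # A couple hundred bytes for the generator
```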
`__slots__` - Reduce Object Memory
`python
# ❌ Bad - each instance has __dict__ (overhead)
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Each instance: ~280 bytes (with __dict__)
p = Point(1, 2)
print(p.__dict__)  # {'x': 1, 'y': 2}

# ✅ Good - use __slots__ for memory efficiency
class PointSlots:
    __slots__ = ['x', 'y']  # Fixed attributes
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Each instance: ~48 bytes (no __dict__)
~80% memory reduction!
Real impact with many objects
1 million Points: ~280MB vs ~48MB
points = [PointSlots(i, i) for i in range(1000000)]
`
When to Use `__slots__`:
- Creating millions of instances
- Objects have fixed set of attributes
- Memory is critical (e.g., data science, games)
Tradeoffs:
- Can't add attributes dynamically
- Slightly less flexible
- No `__dict__` attribute
- Can't use with multiple inheritance (complex rules)
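The main tradeoff is easy to demonstrate: instances of a `__slots__` class reject attributes that aren't declared:

```python
# __slots__ instances reject attributes outside the declared set.
p = PointSlots(1, 2)
try:
    p.z = 3                    # 'z' is not in __slots__
except AttributeError as exc:
    print(exc)                 # 'PointSlots' object has no attribute 'z'
```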
Array Module - Compact Numeric Arrays
`python
❌ Bad - list of numbers (lots of overhead)
numbers_list = [1, 2, 3, 4, 5] * 100000
Each integer object: ~28 bytes
500,000 integers: ~14MB
✅ Good - array module (C-style array)
from array import array
numbers_array = array('i', [1, 2, 3, 4, 5] * 100000)
Each integer: 4 bytes
500,000 integers: ~2MB (7x less memory!)
Type codes:
'b': signed char (1 byte)
'i': signed int (4 bytes)
'f': float (4 bytes)
'd': double (8 bytes)
For numerical computing, use numpy (even better)
import numpy as np
numbers_numpy = np.array([1, 2, 3, 4, 5] * 100000, dtype=np.int32)
Optimized for mathematical operations
`
String Interning
`python
Python automatically interns some strings
a = "hello"
b = "hello"
print(a is b) # True - same object in memory
Force interning for frequently used strings
from sys import intern
❌ Bad - many identical strings
tags = ["python", "python", "python"] * 10000
Each "python" might be a separate object
✅ Good - intern frequently used strings
tags = [intern("python")] * 30000
All reference same object
Useful for:
- Large datasets with repeated string values
- Dictionary keys that repeat
- Tag systems, category labels
`
Weak References - Avoid Circular References
`python
import weakref
❌ Bad - circular reference prevents garbage collection
class Node:
def __init__(self, value):
self.value = value
self.parent = None
self.children = []
def add_child(self, child):
child.parent = self # Strong reference cycle!
self.children.append(child)
Creates memory leak - nodes never freed
✅ Good - use weak references
class NodeWeak:
def __init__(self, value):
self.value = value
self._parent = None
self.children = []
@property
def parent(self):
return self._parent() if self._parent else None
@parent.setter
def parent(self, node):
self._parent = weakref.ref(node) if node else None
def add_child(self, child):
child.parent = self # Weak reference, can be freed
self.children.append(child)
`
14. Algorithm Complexity Matters
Choose the Right Data Structure
`python
Searching: O(n) vs O(1)
❌ Bad - O(n) lookup in list
valid_ids = [1, 5, 10, 15, 20, 25, 30] # 7 items
if user_id in valid_ids: # Checks each item: O(n)
grant_access()
✅ Good - O(1) lookup in set
valid_ids = {1, 5, 10, 15, 20, 25, 30}
if user_id in valid_ids: # Hash lookup: O(1)
grant_access()
For 1 million IDs, 1 million lookups:
List: ~trillion operations
Set: ~1 million operations (1000x faster!)
Removing duplicates
❌ Bad - O(n²)
def remove_duplicates_bad(items):
result = []
for item in items:
if item not in result: # O(n) check
result.append(item)
return result
10,000 items: ~100 million operations
✅ Good - O(n)
def remove_duplicates_good(items):
return list(dict.fromkeys(items)) # Preserves order (3.7+)
10,000 items: ~10,000 operations
Or if order doesn't matter:
def remove_duplicates_set(items):
return list(set(items))
`
Avoid Nested Loops When Possible
`python
❌ Bad - O(n²) complexity
def find_common_bad(list1, list2):
common = []
for item1 in list1: # O(n)
for item2 in list2: # O(m)
if item1 == item2:
common.append(item1)
return common
1000 items each: 1 million comparisons
✅ Good - O(n + m) complexity
def find_common_good(list1, list2):
set2 = set(list2) # O(m)
return [item for item in list1 if item in set2] # O(n)
1000 items each: ~2000 operations (500x faster!)
Even better - use set intersection
def find_common_best(list1, list2):
return list(set(list1) & set(list2)) # O(min(n, m))
Practical example: finding duplicate emails
❌ Bad - O(n²)
def find_duplicate_emails_bad(users):
duplicates = []
for i, user1 in enumerate(users):
for user2 in users[i+1:]:
if user1['email'] == user2['email']:
duplicates.append(user1['email'])
return duplicates
✅ Good - O(n)
def find_duplicate_emails_good(users):
seen = set()
duplicates = set()
for user in users:
email = user['email']
if email in seen:
duplicates.add(email)
else:
seen.add(email)
return list(duplicates)
`
Use bisect for Sorted Data
`python
import bisect
❌ Bad - linear search in sorted list O(n)
sorted_numbers = list(range(0, 1000000, 2)) # Even numbers
def find_position_bad(numbers, value):
for i, num in enumerate(numbers):
if num >= value:
return i
return len(numbers)
✅ Good - binary search O(log n)
def find_position_good(numbers, value):
return bisect.bisect_left(numbers, value)
Insert while maintaining sort order
❌ Bad - O(n log n) due to sort
def insert_sorted_bad(numbers, value):
numbers.append(value)
numbers.sort()
✅ Good - O(n) just for insertion
def insert_sorted_good(numbers, value):
bisect.insort(numbers, value)
Real-world example: maintaining sorted timestamps
from bisect import insort
timestamps = []
def add_event(event_time):
insort(timestamps, event_time) # Keeps sorted
def get_events_in_range(start, end):
# O(log n) to find positions
left = bisect.bisect_left(timestamps, start)
right = bisect.bisect_right(timestamps, end)
return timestamps[left:right]
`
15. String Operations Optimization
String Methods vs Regex
`python
import re
import time
text = "Hello, World! This is a test." * 1000
❌ Slower - regex for simple operations
start = time.time()
for _ in range(10000):
result = re.sub(r'test', 'exam', text)
print(f"Regex: {time.time() - start:.4f}s")
Output: ~0.8s
✅ Faster - string method
start = time.time()
for _ in range(10000):
result = text.replace('test', 'exam')
print(f"String method: {time.time() - start:.4f}s")
Output: ~0.1s (8x faster!)
Use regex only when you need pattern matching
❌ Overkill
if re.match(r'hello', text.lower()):
pass
✅ Better
if text.lower().startswith('hello'):
pass
When regex is appropriate:
- Pattern matching: r'\d{3}-\d{3}-\d{4}' for phone numbers
- Complex replacements: r'(\w+)\s+(\w+)' -> r'\2 \1'
- Multiple patterns: r'(cat|dog|bird)'
`
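When a regex genuinely is needed, compile the pattern once outside the hot loop with `re.compile`; a small sketch:

```python
# Pre-compile the pattern so the hot loop only does matching, not compilation.
import re

PHONE = re.compile(r'\d{3}-\d{3}-\d{4}')  # Compiled once at import time

def find_phones(lines):
    return [m.group() for line in lines for m in PHONE.finditer(line)]

print(find_phones(["call 555-123-4567 today"]))  # ['555-123-4567']
```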
String Building with f-strings (Python 3.6+)
`python
Performance comparison
name = "Alice"
age = 30
city = "NYC"
❌ Slowest - concatenation
result = "Name: " + name + ", Age: " + str(age) + ", City: " + city
✅ Fast - % formatting
result = "Name: %s, Age: %d, City: %s" % (name, age, city)
✅ Fast - .format()
result = "Name: {}, Age: {}, City: {}".format(name, age, city)
✅ Fastest - f-strings (and most readable!)
result = f"Name: {name}, Age: {age}, City: {city}"
f-strings can also include expressions
prices = [10.50, 23.75, 5.25]
result = f"Total: ${sum(prices):.2f}" # Total: $39.50
Multi-line f-strings
message = (
f"User: {name}\n"
f"Age: {age}\n"
f"City: {city}"
)
`
str.translate() - Fast Character Replacement
`python
❌ Bad - multiple replace calls
def remove_punctuation_bad(text):
for char in '.,;:!?':
text = text.replace(char, '')
return text
✅ Good - single translate call
def remove_punctuation_good(text):
translator = str.maketrans('', '', '.,;:!?')
return text.translate(translator)
Even more complex translations
def leetspeak(text):
translation_table = str.maketrans({
'a': '4',
'e': '3',
'i': '1',
'o': '0',
's': '5',
't': '7'
})
return text.lower().translate(translation_table)
print(leetspeak("Hello World")) # h3ll0 w0rld
Performance comparison
import time
text = "Hello, World! How are you today?" * 10000
start = time.time()
for _ in range(1000):
result = remove_punctuation_bad(text)
print(f"Multiple replace: {time.time() - start:.4f}s")
Output: ~1.2s
start = time.time()
for _ in range(1000):
result = remove_punctuation_good(text)
print(f"Translate: {time.time() - start:.4f}s")
Output: ~0.15s (8x faster!)
`
16. Comprehensions vs map/filter
When to Use Each
`python
numbers = range(1000)
List comprehension - clear and Pythonic
squares = [x**2 for x in numbers]
map with lambda - slower due to lambda
squares = list(map(lambda x: x**2, numbers))
map with built-in function - FASTEST
from operator import mul
from functools import partial
double = partial(mul, 2)
doubled = list(map(double, numbers))
✅ Use list comprehension when:
- Transformation is complex
- Need filtering with transformation
- Readability is priority
result = [x**2 for x in numbers if x % 2 == 0]
✅ Use map when:
- Applying built-in function or C-function
- Don't need intermediate list (return iterator)
result = map(str.upper, words) # Iterator, no list created
✅ Use filter when:
- Simple predicate function
- Don't need transformation
result = filter(str.isdigit, characters)
`
Nested Comprehensions
`python
❌ Bad - hard to read
matrix = []
for i in range(3):
row = []
for j in range(3):
row.append(i * j)
matrix.append(row)
✅ Good - nested comprehension
matrix = [[i * j for j in range(3)] for i in range(3)]
[[0, 0, 0], [0, 1, 2], [0, 2, 4]]
Flatten nested list
❌ Bad - manual loops
nested = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = []
for sublist in nested:
for item in sublist:
flat.append(item)
✅ Good - list comprehension
flat = [item for sublist in nested for item in sublist]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
✅ Best for deeply nested - use itertools
from itertools import chain
flat = list(chain.from_iterable(nested))
Dictionary comprehension
❌ Bad
word_lengths = {}
for word in words:
word_lengths[word] = len(word)
✅ Good
word_lengths = {word: len(word) for word in words}
Set comprehension
❌ Bad
unique_lengths = set()
for word in words:
unique_lengths.add(len(word))
✅ Good
unique_lengths = {len(word) for word in words}
`
17. Number Operations Optimization
Integer Operations
`python
In Python, some operations are faster than others
❌ Slower
x = 47
result = x * 2 # Multiplication
✅ Faster (though less readable)
result = x << 1 # Bit shift (left shift by 1 = multiply by 2)
BUT: In practice, the difference is tiny
Only optimize this in extremely tight loops
Readability usually trumps micro-optimization
Division: // (floor division) vs / (true division)
import time
True division (returns float)
start = time.time()
for i in range(10000000):
result = i / 2
print(f"True division: {time.time() - start:.4f}s")
Output: ~0.5s
Floor division (returns int)
start = time.time()
for i in range(10000000):
result = i // 2
print(f"Floor division: {time.time() - start:.4f}s")
Output: ~0.4s (20% faster if you need integer result)
Modulo operations
Use & for power-of-2 modulo (much faster)
❌ Slower
if x % 2 == 0: # Check if even
pass
✅ Faster (but less clear)
if x & 1 == 0: # Check if even using bitwise AND
pass
`
Float Operations
`python
Avoid float when integers work
❌ Slower - unnecessary float operations
total = 0.0
for i in range(1000000):
total += float(i)
✅ Faster - integer operations
total = 0
for i in range(1000000):
total += i
Use math module for complex operations
import math
❌ Slower - repeated exponentiation
result = x ** 0.5
✅ Faster - dedicated function
result = math.sqrt(x)
❌ Slower
result = x ** 2
✅ Faster (and clearer)
result = x * x
`
Decimal for Financial Calculations
`python
❌ Bad - float precision issues
price = 0.1
quantity = 3
total = price * quantity
print(total) # 0.30000000000000004 (WRONG for money!)
✅ Good - Decimal for exact precision
from decimal import Decimal
price = Decimal('0.1')
quantity = 3
total = price * quantity
print(total) # 0.3 (CORRECT)
Real-world example
from decimal import Decimal, ROUND_HALF_UP
def calculate_total(prices):
total = sum(Decimal(str(price)) for price in prices)
# Round to 2 decimal places for currency
return total.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
prices = [10.10, 20.20, 30.33]
print(calculate_total(prices)) # 60.63
`
18. Exception Handling Performance
EAFP vs LBYL
`python
LBYL (Look Before You Leap)
❌ Slower when condition is usually true
if key in dictionary:
value = dictionary[key]
else:
value = default
EAFP (Easier to Ask for Forgiveness than Permission)
✅ Faster when exception is rare
try:
value = dictionary[key]
except KeyError:
value = default
Why EAFP is faster:
- No extra dictionary lookup when key exists
- Exceptions are cheap when not raised
- Only pays penalty on actual error
Performance comparison
import time
data = {i: i*2 for i in range(10000)}
LBYL approach
start = time.time()
for i in range(10000):
if i in data: # Extra lookup
value = data[i] # Second lookup
print(f"LBYL: {time.time() - start:.4f}s")
Output: ~0.003s
EAFP approach
start = time.time()
for i in range(10000):
try:
value = data[i] # Single lookup
except KeyError:
pass
print(f"EAFP: {time.time() - start:.4f}s")
Output: ~0.002s (30% faster when key exists)
`
Avoid Exceptions in Hot Paths
`python
When exceptions are common, they're expensive
❌ Bad - exception raised 50% of the time
def process_items_bad(items):
results = []
for item in items:
try:
results.append(expensive_operation(item))
except ValueError: # Raised for 50% of items!
results.append(None)
return results
✅ Good - check first if exceptions are common
def process_items_good(items):
results = []
for item in items:
if is_valid(item): # Cheap check
results.append(expensive_operation(item))
else:
results.append(None)
return results
`
19. Parallel Processing
Threading vs Multiprocessing vs AsyncIO
`python
import time
import threading
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
def cpu_bound_task(n):
"""Simulates CPU-intensive work"""
return sum(i*i for i in range(n))
def io_bound_task(n):
"""Simulates I/O wait (network, disk)"""
time.sleep(0.1) # Simulate I/O wait
return n * 2
❌ Bad - sequential for I/O-bound tasks
def process_io_sequential(tasks):
results = []
for task in tasks:
results.append(io_bound_task(task))
return results
Time: 10 tasks * 0.1s = 1 second
✅ Good - threading for I/O-bound
def process_io_threaded(tasks):
with ThreadPoolExecutor(max_workers=10) as executor:
results = list(executor.map(io_bound_task, tasks))
return results
Time: ~0.1 seconds (10x faster!)
❌ Bad - threading for CPU-bound (GIL limitation)
def process_cpu_threaded(tasks):
with ThreadPoolExecutor(max_workers=4) as executor:
results = list(executor.map(cpu_bound_task, tasks))
return results
No speedup due to Global Interpreter Lock (GIL)
✅ Good - multiprocessing for CPU-bound
def process_cpu_multiprocessing(tasks):
with ProcessPoolExecutor(max_workers=4) as executor:
results = list(executor.map(cpu_bound_task, tasks))
return results
Time: ~1/4 on quad-core processor
Use cases:
Threading: I/O-bound (network requests, file I/O, database queries)
Multiprocessing: CPU-bound (calculations, data processing, image processing)
AsyncIO: Many concurrent I/O operations (thousands of connections)
`
AsyncIO for High Concurrency
`python
import asyncio
import aiohttp # pip install aiohttp
❌ Bad - sequential HTTP requests
import requests
def fetch_urls_sequential(urls):
results = []
for url in urls:
response = requests.get(url)
results.append(response.text)
return results
10 URLs, 0.5s each = 5 seconds
✅ Good - async HTTP requests
async def fetch_url(session, url):
async with session.get(url) as response:
return await response.text()
async def fetch_urls_async(urls):
async with aiohttp.ClientSession() as session:
tasks = [fetch_url(session, url) for url in urls]
results = await asyncio.gather(*tasks)
return results
Run async function
results = asyncio.run(fetch_urls_async(urls))
10 URLs concurrently = ~0.5 seconds (10x faster!)
When to use async:
- Many concurrent I/O operations
- Web scraping (many URLs)
- API calls to multiple services
- Database queries (with async driver)
- WebSocket connections
`
20. Pro Tips and Best Practices
Use Virtual Environments
`bash
Always use virtual environments
python -m venv venv
source venv/bin/activate # Linux/Mac
venv\Scripts\activate # Windows
Why:
- Isolates dependencies
- Prevents version conflicts
- Reproducible environments
`
Utilize Python's Standard Library
`python
Standard library is optimized and battle-tested
Don't reinvent the wheel!
❌ Bad - manual date parsing
def parse_date_bad(date_string):
parts = date_string.split('-')
year = int(parts[0])
month = int(parts[1])
day = int(parts[2])
# ... validation logic ...
return (year, month, day)
✅ Good - use datetime
from datetime import datetime
def parse_date_good(date_string):
return datetime.strptime(date_string, '%Y-%m-%d')
Other powerful standard library modules:
- collections: defaultdict, Counter, deque, namedtuple
- itertools: chain, groupby, combinations, product
- functools: lru_cache, partial, reduce
- operator: itemgetter, attrgetter, methodcaller
- pathlib: modern path handling
- dataclasses: reduce boilerplate (Python 3.7+)
`
Type Hints and Static Analysis
`python
Use type hints for better tooling and catching errors
from typing import List, Dict, Optional, Union
def process_users(users: List[Dict[str, Union[str, int]]]) -> Optional[Dict[str, int]]:
"""Process users and return statistics.
Args:
users: List of user dictionaries
Returns:
Dictionary of statistics or None if empty
"""
if not users:
return None
return {
'total': len(users),
'average_age': sum(u['age'] for u in users) // len(users)
}
Benefits:
- IDEs provide better autocomplete
- Catch type errors before runtime
- Self-documenting code
- Use mypy for static type checking: mypy script.py
`
Dataclasses (Python 3.7+)
`python
❌ Bad - lots of boilerplate
class PointOld:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        return f"Point(x={self.x}, y={self.y})"
    def __eq__(self, other):
        return self.x == other.x and self.y == other.y
✅ Good - dataclass handles boilerplate
from dataclasses import dataclass, field
from typing import List
@dataclass
class Point:
x: float
y: float
def distance(self) -> float:
return (self.x**2 + self.y**2) ** 0.5
Auto-generates: init, repr, eq, and more!
@dataclass
class User:
name: str
age: int
emails: List[str] = field(default_factory=list) # Mutable default
active: bool = True # Simple default
Usage
user = User("Alice", 30)
print(user) # User(name='Alice', age=30, emails=[], active=True)
`
Use Logging, Not Print
`python
❌ Bad - debugging with print
def process_data(data):
print(f"Processing {len(data)} items") # Can't disable!
result = expensive_operation(data)
print(f"Result: {result}") # Clutters output
return result
✅ Good - use logging
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
def process_data(data):
logger.debug(f"Processing {len(data)} items")
result = expensive_operation(data)
logger.info(f"Processing complete")
return result
Benefits:
- Can adjust verbosity (DEBUG, INFO, WARNING, ERROR)
- Can log to files, not just console
- Can disable in production
- Includes timestamps and context
`
Summary: The Golden Rules of Python Performance
1. Measure First, Optimize Later
`python
Always profile before optimizing
import cProfile
cProfile.run('your_function()')
`
2. Use the Right Data Structure
- List: Ordered, allows duplicates, O(n) membership
- Tuple: Immutable list, slightly faster, can be dict key
- Set: Unordered, no duplicates, O(1) membership
- Dict: Key-value pairs, O(1) lookup
- deque: Fast appends/pops from both ends
- defaultdict: Dict with default values
- Counter: Count hashable objects
3. Leverage Built-ins
`python
Built-ins are optimized C code
sum(numbers) # vs manual loop
any(conditions) # vs manual check
all(conditions) # vs manual check
max(numbers) # vs manual comparison
sorted(items) # vs manual sort
`
4. Avoid Premature Optimization
`python
❌ Don't optimize this (runs once)
config = eval(open('config.txt').read()) # Simple is fine
✅ Do optimize this (runs millions of times)
def hot_path(data):
# This needs optimization
pass
`
5. Write Pythonic Code
`python
Pythonic code is often faster AND more readable
Use comprehensions, iterators, context managers, etc.
`
6. When All Else Fails
- Use Cython to compile Python to C
- Use NumPy for numerical operations
- Use PyPy JIT compiler
- Rewrite bottlenecks in C/Rust
- Use numba JIT compilation for numerical code
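As a taste of the NumPy route, here is a hedged sketch of pushing a numeric loop into C-level vectorized code (assumes `numpy` is installed):

```python
# Vectorized arithmetic: no Python-level loop over 10 million elements.
import numpy as np

values = np.arange(10_000_000, dtype=np.float64)
total = np.sum(values * 2.0 + 1.0)
print(total)
```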
Final Wisdom
"Premature optimization is the root of all evil." - Donald Knuth
The best code is:
- Correct first
- Clear and maintainable
- Fast enough for your needs
Only optimize what profiling shows is actually slow!
Summary: The Golden Rules
- Profile First: Don't optimize without measuring
- Use Built-ins: They're implemented in C and heavily optimized
- Avoid String Concatenation: Use `join()` instead
- Cache Lookups: Store frequently accessed attributes in local variables
- Use Local Variables: Much faster than globals
- List Comprehensions: Usually faster and more readable than loops
- EAFP over LBYL: `try-except` is often faster than `if` checks
- Import Wisely: Keep imports out of loops
- Batch Operations: Process data in aggregate, not item-by-item
- Read the Docs: Standard library has optimized solutions for common problems
When to Optimize
`python
❌ Don't optimize this (runs once)
def load_config():
config = {}
for line in open('config.txt'):
# Not worth optimizing
key, value = line.split('=')
config[key] = value
return config
✅ DO optimize this (runs millions of times)
def process_requests(requests):
results = []
append = results.append # Worth optimizing
for request in requests:
append(expensive_operation(request))
return results
`
Remember: Readable code that runs fast enough is better than optimized code that's hard to maintain!