List comprehensions are one of Python's most beloved features. They let you create lists in a single, readable line that would otherwise require a for loop and append calls. Beyond being more concise, comprehensions are often faster than the equivalent loop in CPython, largely because they append with a dedicated bytecode instruction instead of looking up and calling append on every iteration. This guide covers everything from basic syntax to nested comprehensions, with the rules for when to use them and when not to.
Basic Syntax
The anatomy of a list comprehension:
[expression for item in iterable]
[expression for item in iterable if condition]
Read it as: "give me expression for each item in iterable".
# Loop version
squares = []
for x in range(10):
    squares.append(x ** 2)
# List comprehension — same result, one line
squares = [x ** 2 for x in range(10)]
print(squares) # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
# String transformations
words = ['hello', 'world', 'python']
upper = [word.upper() for word in words]
print(upper) # ['HELLO', 'WORLD', 'PYTHON']
# With a function call
import math
roots = [math.sqrt(n) for n in [4, 9, 16, 25]]
print(roots) # [2.0, 3.0, 4.0, 5.0]
Filtering with Conditions
Add an if clause to filter items:
# Only even numbers
evens = [x for x in range(20) if x % 2 == 0]
print(evens) # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
# Strings longer than 4 characters
words = ['cat', 'elephant', 'dog', 'rhinoceros', 'ant']
long_words = [w for w in words if len(w) > 4]
print(long_words) # ['elephant', 'rhinoceros']
# Filter and transform together
numbers = [1, -2, 3, -4, 5, -6, 7, -8]
positive_squares = [x ** 2 for x in numbers if x > 0]
print(positive_squares) # [1, 9, 25, 49]
# Filter objects from a list of dicts
users = [
    {'name': 'Alice', 'active': True},
    {'name': 'Bob', 'active': False},
    {'name': 'Carol', 'active': True},
]
active_names = [u['name'] for u in users if u['active']]
print(active_names) # ['Alice', 'Carol']
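Conditions can also be combined. As a small sketch (the readings list is made up for illustration), chaining two if clauses filters the same way as joining them with and:

```python
# Hypothetical sample data, not from the article
readings = [12, -3, 47, 0, 105, 8, 64]

# Two chained if clauses...
chained = [r for r in readings if r > 0 if r < 100]

# ...filter exactly like a single clause joined with `and`
combined = [r for r in readings if r > 0 and r < 100]

# A chained comparison is usually the most readable form
in_range = [r for r in readings if 0 < r < 100]

print(in_range)  # [12, 47, 8, 64]
```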
Conditional Expressions (Ternary)
You can also use a conditional expression in the output part (not to filter, but to choose between two values):
# Syntax: [value_if_true if condition else value_if_false for item in iterable]
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = ['even' if n % 2 == 0 else 'odd' for n in numbers]
print(labels)
# ['odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even']
# Clamp values
raw = [-5, 10, 200, 3, -1, 50]
clamped = [max(0, min(100, x)) for x in raw]
print(clamped) # [0, 10, 100, 3, 0, 50]
# Replace None with a default
data = [1, None, 3, None, 5]
cleaned = [x if x is not None else 0 for x in data]
print(cleaned) # [1, 0, 3, 0, 5]
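The two kinds of if can appear in the same comprehension: the conditional expression before for picks the output value, and the trailing if decides which items are kept at all. A minimal sketch with made-up scores:

```python
# Hypothetical data: None marks a missing score
scores = [82, None, 45, 91, None, 67]

# The trailing `if` drops the missing entries first;
# the conditional expression then labels the remaining ones
results = ['pass' if s >= 60 else 'fail' for s in scores if s is not None]

print(results)  # ['pass', 'fail', 'pass', 'pass']
```

Because the filter runs before the output expression, the comparison s >= 60 is never evaluated for a None entry.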
Nested List Comprehensions
You can nest comprehensions to iterate over nested structures — but keep readability in mind.
# Flatten a 2D matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]
print(flat) # [1, 2, 3, 4, 5, 6, 7, 8, 9]
# Generate coordinate pairs
coords = [(x, y) for x in range(3) for y in range(3)]
print(coords)
# [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2), (2,0), (2,1), (2,2)]
# Transpose a matrix (rows become columns)
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
transposed = [[row[i] for row in matrix] for i in range(3)]
print(transposed)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
# Filter cells from a matrix
matrix = [[1, -2, 3], [-4, 5, -6], [7, -8, 9]]
positives = [n for row in matrix for n in row if n > 0]
print(positives) # [1, 3, 5, 7, 9]
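The for clauses in a flattening comprehension run in the same order as the equivalent nested loops, outermost first. A sketch of that equivalence, reusing the matrix from the examples above:

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Nested loops: outer loop over rows, inner loop over numbers
flat_loop = []
for row in matrix:
    for num in row:
        flat_loop.append(num)

# Same clauses, same order, written left to right
flat_comp = [num for row in matrix for num in row]

print(flat_loop == flat_comp)  # True

# For transposing, zip(*matrix) is a common alternative that
# does not need the column count hard-coded
transposed = [list(col) for col in zip(*matrix)]
print(transposed)  # [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
```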
Dictionary Comprehensions
The same syntax works for dictionaries with {key: value for ...}:
# Invert a dictionary
original = {'a': 1, 'b': 2, 'c': 3}
inverted = {v: k for k, v in original.items()}
print(inverted) # {1: 'a', 2: 'b', 3: 'c'}
# Create a word frequency counter
words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']
freq = {word: words.count(word) for word in set(words)}
print(freq) # {'cherry': 1, 'banana': 2, 'apple': 3} (key order may vary)
# Same result without list.count(); in real code, collections.Counter is the better tool
freq = {word: sum(1 for w in words if w == word) for word in set(words)}
# Filter and transform a dict
prices = {'apple': 1.20, 'banana': 0.50, 'mango': 3.00, 'grape': 2.00}
expensive = {k: v for k, v in prices.items() if v > 1.0}
print(expensive) # {'apple': 1.2, 'mango': 3.0, 'grape': 2.0}
# From two lists (zip)
keys = ['name', 'age', 'city']
values = ['Alice', 30, 'Paris']
profile = {k: v for k, v in zip(keys, values)}
print(profile) # {'name': 'Alice', 'age': 30, 'city': 'Paris'}
# Square a dictionary's values
squared = {k: v ** 2 for k, v in {'a': 2, 'b': 3, 'c': 4}.items()}
print(squared) # {'a': 4, 'b': 9, 'c': 16}
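The frequency example above calls words.count() once per unique word, so it rescans the list each time. As the comment there hints, collections.Counter from the standard library does the same job in a single pass. A short sketch:

```python
from collections import Counter

words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']

# Counter tallies every word in one pass over the list
freq = Counter(words)

print(freq['apple'])        # 3
print(freq.most_common(1))  # [('apple', 3)]

# Converted to a plain dict, it matches the comprehension-built version
comprehension_freq = {w: words.count(w) for w in set(words)}
print(dict(freq) == comprehension_freq)  # True
```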
Set Comprehensions
# Get unique first letters
words = ['apple', 'banana', 'avocado', 'blueberry', 'cherry']
first_letters = {word[0] for word in words}
print(first_letters) # {'a', 'b', 'c'} (order may vary)
# Get unique domains from email addresses
emails = ['user@gmail.com', 'admin@company.com', 'test@gmail.com']
domains = {email.split('@')[1] for email in emails}
print(domains) # {'gmail.com', 'company.com'} (order may vary)
# Unique absolute values
numbers = [-3, -2, -1, 0, 1, 2, 3]
unique_abs = {abs(n) for n in numbers}
print(unique_abs) # {0, 1, 2, 3}
Generator Expressions
Generator expressions have the same syntax as list comprehensions but use parentheses. They produce values lazily — one at a time — instead of building the entire list in memory. Use them when you only need to iterate once or when the dataset is large.
# List comprehension: builds entire list in memory
squares_list = [x ** 2 for x in range(1_000_000)]
# Generator expression: lazy, uses almost no memory
squares_gen = (x ** 2 for x in range(1_000_000))
# Consume a generator with sum, max, min, any, all
total = sum(x ** 2 for x in range(100))
print(total) # 328350
maximum = max(len(word) for word in words)
# any() and all() short-circuit — generators are perfect here
has_admin = any(u.get('role') == 'admin' for u in users) # .get() because the sample users above have no 'role' key
all_active = all(u['active'] for u in users)
# Pass to list() or tuple() when you need the full sequence
first_ten = list(x ** 2 for x in range(10))
# Generators are single-use
gen = (x for x in range(5))
print(list(gen)) # [0, 1, 2, 3, 4]
print(list(gen)) # [] — exhausted!
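To make the memory difference concrete, here is a rough sketch using sys.getsizeof. It measures only the container object, not the integers it refers to, and the exact byte counts depend on the Python build, so the example prints a comparison instead of specific numbers:

```python
import sys

n = 100_000

squares_list = [x ** 2 for x in range(n)]   # all values stored at once
squares_gen = (x ** 2 for x in range(n))    # values produced on demand

# The list holds n references; the generator holds only its paused state
list_bytes = sys.getsizeof(squares_list)
gen_bytes = sys.getsizeof(squares_gen)
print(list_bytes > gen_bytes)  # True

# The generator still yields the same values when consumed
print(sum(squares_gen) == sum(squares_list))  # True
```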
Real-World Examples
# Parse CSV line (simple)
line = "Alice,30,engineer,Paris"
name, age, role, city = line.split(',')
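# Process all lines (csv_text is sample input; use the csv module if fields can contain quoted commas)
csv_text = "Alice,30,engineer,Paris\nBob,25,designer,Berlin"
rows = [line.split(',') for line in csv_text.strip().split('\n')]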
# Extract specific columns
names = [row[0] for row in rows]
ages = [int(row[1]) for row in rows]
# Build SQL IN clause
ids = [1, 2, 3, 4, 5]
placeholders = ', '.join(['%s'] * len(ids)) # '%s, %s, %s, %s, %s'
# Flatten nested API response
response = {
    "departments": [
        {"name": "Engineering", "employees": ["Alice", "Bob"]},
        {"name": "Design", "employees": ["Carol", "Dave"]},
    ]
}
all_employees = [emp for dept in response['departments']
                 for emp in dept['employees']]
print(all_employees) # ['Alice', 'Bob', 'Carol', 'Dave']
# Find files matching a pattern
import os
py_files = [f for f in os.listdir('.') if f.endswith('.py')]
# Chunk a list into batches
def chunks(lst, n):
    return [lst[i:i + n] for i in range(0, len(lst), n)]
batch_size = 3
batches = chunks(list(range(10)), batch_size)
print(batches) # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
# Remove duplicates while preserving order
items = [3, 1, 3, 2, 1] # sample data
seen = set()
unique = [x for x in items if not (x in seen or seen.add(x))]
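The seen.add() trick above works, but it relies on a side effect inside the comprehension. A sketch of an alternative for hashable items, assuming Python 3.7+ where dicts keep insertion order (the items list is sample data):

```python
# Hypothetical sample data
items = [3, 1, 3, 2, 1, 4]

# dict keys are unique and remember first-insertion order
unique = list(dict.fromkeys(items))

print(unique)  # [3, 1, 2, 4]
```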
Performance Comparison
import timeit
# List comprehension vs for loop vs map()
n = 10_000
loop_time = timeit.timeit(
    stmt="result = []\nfor x in range(n):\n    result.append(x**2)",
    globals={'n': n}, number=1000
)
comp_time = timeit.timeit(
    stmt="result = [x**2 for x in range(n)]",
    globals={'n': n}, number=1000
)
map_time = timeit.timeit(
    stmt="result = list(map(lambda x: x**2, range(n)))",
    globals={'n': n}, number=1000
)
# Results vary by machine and Python version: the comprehension usually beats the explicit loop,
# and map() with a lambda is typically no faster than the comprehension
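The snippet above stores the timings without displaying them. One way to compare them is sketched here with timeit.repeat, taking the best of several runs to reduce noise. The repeat and run counts are arbitrary choices, and the numbers depend on your machine and Python version:

```python
import timeit

n = 10_000
candidates = {
    'for loop': "result = []\nfor x in range(n):\n    result.append(x**2)",
    'comprehension': "result = [x**2 for x in range(n)]",
    'map + lambda': "result = list(map(lambda x: x**2, range(n)))",
}

timings = {}
for label, stmt in candidates.items():
    # Best of 3 repeats, 100 executions each
    runs = timeit.repeat(stmt=stmt, globals={'n': n}, repeat=3, number=100)
    timings[label] = min(runs)

baseline = timings['for loop']
for label, seconds in timings.items():
    print(f"{label:<14} {seconds:.4f}s  ({seconds / baseline:.2f}x the loop time)")
```

Taking the minimum of the repeats follows the timeit documentation's advice that the lowest value is the most useful figure, since higher ones mostly reflect interference from other processes.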
When NOT to Use List Comprehensions
- Side effects: Don't use comprehensions just to run code for side effects (printing, API calls). Use a regular for loop; it's clearer about intent.
- Multiple operations on the same item: If you need to call multiple methods or do multiple things with each item, a loop is often cleaner.
- Complex nested logic: More than two levels of nesting is usually a readability problem. Extract a function instead.
- Very large datasets where you need lazy evaluation: Use a generator expression or the itertools module.
# ❌ Comprehension for side effects — don't do this
[print(x) for x in items] # confusing, creates a list of None
# ✅ Use a loop for side effects
for x in items:
    print(x)
# ❌ Too complex to read
result = [transform(item) for sublist in nested
          for item in sublist if item.condition
          if item.value > threshold]
# ✅ Extract a function
def is_valid(item):
    return item.condition and item.value > threshold
result = [transform(item)
          for sublist in nested
          for item in sublist
          if is_valid(item)]
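For the lazy-evaluation case in the list above, a generator expression can be combined with itertools.islice to take only what is needed. A minimal sketch; the line source and the ERROR filter are invented for illustration:

```python
from itertools import count, islice

# Hypothetical unbounded source: a list comprehension could never finish here
lines = (f"line {i}: {'ERROR' if i % 7 == 0 else 'ok'}" for i in count(1))

# Lazy filter: nothing runs until values are requested
errors = (line for line in lines if 'ERROR' in line)

# Take the first three matches, then stop pulling from the source
first_three = list(islice(errors, 3))
print(first_three)
# ['line 7: ERROR', 'line 14: ERROR', 'line 21: ERROR']
```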
Quick Reference
# List comprehension
[expr for x in iterable]
[expr for x in iterable if condition]
[true_val if cond else false_val for x in iterable]
[expr for x in outer for y in inner] # nested
# Dict comprehension
{k: v for k, v in iterable}
{k: v for k, v in iterable if condition}
# Set comprehension
{expr for x in iterable}
{expr for x in iterable if condition}
# Generator expression (lazy)
(expr for x in iterable)
(expr for x in iterable if condition)
sum(x**2 for x in range(100)) # direct use
List comprehensions are a core part of idiomatic Python. Once you're comfortable with basic and conditional comprehensions, try dict and set comprehensions — they'll clean up a surprising amount of code. And whenever you just need to consume a sequence once, reach for a generator expression instead to keep memory usage low.