楊東霖

Posted on Mar 25 • Originally published at devtoolkit.cc

Python List Comprehension: Complete Guide with Examples


List comprehensions are one of Python's most beloved features. They let you build a list in a single, readable line that would otherwise take a for loop and repeated append calls. Beyond being more concise, comprehensions are often faster than the equivalent loop in CPython, because the interpreter appends each result with a dedicated bytecode instruction instead of looking up and calling append on every iteration. This guide covers everything from basic syntax to nested comprehensions, along with rules of thumb for when to use them and when not to.

Basic Syntax

The anatomy of a list comprehension:

[expression for item in iterable]
[expression for item in iterable if condition]

Read it as: "give me expression for each item in iterable". With the optional if clause, only the items for which condition is true are included.

# Loop version
squares = []
for x in range(10):
    squares.append(x ** 2)

# List comprehension — same result, one line
squares = [x ** 2 for x in range(10)]
print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# String transformations
words = ['hello', 'world', 'python']
upper = [word.upper() for word in words]
print(upper)  # ['HELLO', 'WORLD', 'PYTHON']

# With a function call
import math
roots = [math.sqrt(n) for n in [4, 9, 16, 25]]
print(roots)  # [2.0, 3.0, 4.0, 5.0]

Filtering with Conditions

Add an if clause to filter items:

# Only even numbers
evens = [x for x in range(20) if x % 2 == 0]
print(evens)  # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

# Strings longer than 4 characters
words = ['cat', 'elephant', 'dog', 'rhinoceros', 'ant']
long_words = [w for w in words if len(w) > 4]
print(long_words)  # ['elephant', 'rhinoceros']

# Filter and transform together
numbers = [1, -2, 3, -4, 5, -6, 7, -8]
positive_squares = [x ** 2 for x in numbers if x > 0]
print(positive_squares)  # [1, 9, 25, 49]

# Filter objects from a list of dicts
users = [
    {'name': 'Alice', 'active': True},
    {'name': 'Bob',   'active': False},
    {'name': 'Carol', 'active': True},
]
active_names = [u['name'] for u in users if u['active']]
print(active_names)  # ['Alice', 'Carol']

Conditional Expressions (Ternary)

You can also use a conditional expression in the output part (not to filter, but to choose between two values):

# Syntax: [value_if_true if condition else value_if_false for item in iterable]

numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = ['even' if n % 2 == 0 else 'odd' for n in numbers]
print(labels)
# ['odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even', 'odd', 'even']

# Clamp values
raw = [-5, 10, 200, 3, -1, 50]
clamped = [max(0, min(100, x)) for x in raw]
print(clamped)  # [0, 10, 100, 3, 0, 50]

# Replace None with a default
data = [1, None, 3, None, 5]
cleaned = [x if x is not None else 0 for x in data]
print(cleaned)  # [1, 0, 3, 0, 5]
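
Both kinds of if can appear in the same comprehension. As a small sketch (the readings list and the labels are made up for illustration), the trailing if drops items, while the conditional expression chooses the output value for the items that remain:

```python
# Hypothetical sample data, chosen only for illustration
readings = [12, None, 47, 85, None, 101]

# The trailing `if` filters out None; the conditional expression labels what is left
labels = ['high' if r > 50 else 'low' for r in readings if r is not None]
print(labels)  # ['low', 'low', 'high', 'high']
```

The position is what matters: a condition before the for picks a value, and a condition after it decides whether the item appears at all.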

Nested List Comprehensions

You can nest comprehensions to iterate over nested structures — but keep readability in mind.

# Flatten a 2D matrix
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [num for row in matrix for num in row]
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Generate coordinate pairs
coords = [(x, y) for x in range(3) for y in range(3)]
print(coords)
# [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]

# Transpose a matrix (rows become columns)
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
# Use the row length instead of a hard-coded 3 so non-square matrices work too
transposed = [[row[i] for row in matrix] for i in range(len(matrix[0]))]
print(transposed)
# [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

# Filter cells from a matrix
matrix = [[1, -2, 3], [-4, 5, -6], [7, -8, 9]]
positives = [n for row in matrix for n in row if n > 0]
print(positives)  # [1, 3, 5, 7, 9]
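
The for clauses in a nested comprehension run in the same left-to-right order as the equivalent nested loops, which is the part that tends to trip people up. A minimal sketch that spells out the matrix filter as explicit loops:

```python
matrix = [[1, -2, 3], [-4, 5, -6], [7, -8, 9]]

# Comprehension: clauses read left to right, outermost loop first
positives = [n for row in matrix for n in row if n > 0]

# Equivalent explicit loops, nested in the same order as the clauses above
positives_loop = []
for row in matrix:
    for n in row:
        if n > 0:
            positives_loop.append(n)

print(positives == positives_loop)  # True
```

If a nested comprehension is hard to read, writing it out as loops like this is a quick way to check the clause order.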

Dictionary Comprehensions

The same syntax works for dictionaries with {key: value for ...}:

# Invert a dictionary
original = {'a': 1, 'b': 2, 'c': 3}
inverted = {v: k for k, v in original.items()}
print(inverted)  # {1: 'a', 2: 'b', 3: 'c'}

# Create a word frequency counter
words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']
freq = {word: words.count(word) for word in set(words)}
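print(freq)  # {'cherry': 1, 'banana': 2, 'apple': 3} (key order may vary)

# Same result with a generator expression in place of list.count().
# Both versions rescan the list once per distinct word;
# collections.Counter(words) does the counting in a single pass.
freq = {word: sum(1 for w in words if w == word) for word in set(words)}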

# Filter and transform a dict
prices = {'apple': 1.20, 'banana': 0.50, 'mango': 3.00, 'grape': 2.00}
expensive = {k: v for k, v in prices.items() if v > 1.0}
print(expensive)  # {'apple': 1.2, 'mango': 3.0, 'grape': 2.0}

# From two lists (zip)
keys = ['name', 'age', 'city']
values = ['Alice', 30, 'Paris']
profile = {k: v for k, v in zip(keys, values)}
print(profile)  # {'name': 'Alice', 'age': 30, 'city': 'Paris'}

# Square a dictionary's values
squared = {k: v ** 2 for k, v in {'a': 2, 'b': 3, 'c': 4}.items()}
print(squared)  # {'a': 4, 'b': 9, 'c': 16}
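
For the word-frequency case earlier in this section, collections.Counter from the standard library counts in one pass, whereas calling words.count(word) rescans the list for every distinct word. A sketch of the Counter version using the same sample list:

```python
from collections import Counter

words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']

# Counter tallies every word in a single pass over the list
freq = Counter(words)
print(freq['apple'])        # 3
print(freq.most_common(1))  # [('apple', 3)]

# A Counter is a dict subclass, so dict comprehensions still work on it
repeated = {word: n for word, n in freq.items() if n > 1}
print(repeated)  # {'apple': 3, 'banana': 2}
```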

Set Comprehensions

# Get unique first letters
words = ['apple', 'banana', 'avocado', 'blueberry', 'cherry']
first_letters = {word[0] for word in words}
print(first_letters)  # {'a', 'b', 'c'} (order may vary)

# Get unique domains from email addresses
emails = ['user@gmail.com', 'admin@company.com', 'test@gmail.com']
domains = {email.split('@')[1] for email in emails}
print(domains)  # {'gmail.com', 'company.com'} (order may vary)

# Unique absolute values
numbers = [-3, -2, -1, 0, 1, 2, 3]
unique_abs = {abs(n) for n in numbers}
print(unique_abs)  # {0, 1, 2, 3}

Generator Expressions

Generator expressions have the same syntax as list comprehensions but use parentheses. They produce values lazily — one at a time — instead of building the entire list in memory. Use them when you only need to iterate once or when the dataset is large.

# List comprehension: builds entire list in memory
squares_list = [x ** 2 for x in range(1_000_000)]

# Generator expression: lazy, uses almost no memory
squares_gen = (x ** 2 for x in range(1_000_000))

# Consume a generator with sum, max, min, any, all
total = sum(x ** 2 for x in range(100))
print(total)  # 328350

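words = ['apple', 'banana', 'avocado', 'blueberry', 'cherry']
maximum = max(len(word) for word in words)
print(maximum)  # 9

# any() and all() short-circuit, so generators are a good fit here
users = [
    {'name': 'Alice', 'role': 'admin', 'active': True},
    {'name': 'Bob',   'role': 'user',  'active': False},
]
has_admin = any(u['role'] == 'admin' for u in users)  # True
all_active = all(u['active'] for u in users)          # False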

# Pass to list() or tuple() when you need the full sequence
first_ten = list(x ** 2 for x in range(10))

# Generators are single-use
gen = (x for x in range(5))
print(list(gen))  # [0, 1, 2, 3, 4]
print(list(gen))  # []  — exhausted!
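
To make the memory difference concrete, here is a rough sketch using sys.getsizeof. It measures only the container object, not the integers it refers to, and the exact byte counts vary by Python version, so the example compares the sizes rather than printing them:

```python
import sys

squares_list = [x ** 2 for x in range(100_000)]
squares_gen = (x ** 2 for x in range(100_000))

# getsizeof reports the container only, not the integers it references
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)  # True

# Both produce the same values when consumed
print(sum(squares_list) == sum(squares_gen))  # True
```

The generator stays small no matter how large the range is, because it holds only its current state.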

Real-World Examples

# Parse CSV line (simple)
line = "Alice,30,engineer,Paris"
name, age, role, city = line.split(',')

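# Process all lines (csv_text is sample data for this example)
csv_text = "Alice,30,engineer,Paris\nBob,25,designer,Berlin"
rows = [line.split(',') for line in csv_text.strip().split('\n')]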

# Extract specific columns
names = [row[0] for row in rows]
ages  = [int(row[1]) for row in rows]

# Build SQL IN clause
ids = [1, 2, 3, 4, 5]
placeholders = ', '.join(['%s'] * len(ids))  # '%s, %s, %s, %s, %s'

# Flatten nested API response
response = {
    "departments": [
        {"name": "Engineering", "employees": ["Alice", "Bob"]},
        {"name": "Design",      "employees": ["Carol", "Dave"]},
    ]
}
all_employees = [emp for dept in response['departments']
                 for emp in dept['employees']]
print(all_employees)  # ['Alice', 'Bob', 'Carol', 'Dave']

# Find files matching a pattern
import os
py_files = [f for f in os.listdir('.') if f.endswith('.py')]

# Chunk a list into batches
def chunks(lst, n):
    return [lst[i:i + n] for i in range(0, len(lst), n)]

batch_size = 3
batches = chunks(list(range(10)), batch_size)
print(batches)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]

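# Remove duplicates while preserving order
# (works because set.add() returns None, which is falsy)
items = [3, 1, 3, 2, 1, 5, 2]
seen = set()
unique = [x for x in items if not (x in seen or seen.add(x))]
print(unique)  # [3, 1, 2, 5]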
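
The seen.add trick is compact but easy to misread. For hashable items, dict.fromkeys gives the same order-preserving result, since dicts keep insertion order in Python 3.7 and later. A sketch with a made-up items list:

```python
# Hypothetical sample data
items = [3, 1, 3, 2, 1, 5, 2]

# dict keys are unique and keep first-seen order (guaranteed since Python 3.7)
unique = list(dict.fromkeys(items))
print(unique)  # [3, 1, 2, 5]
```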

Performance Comparison

import timeit

# List comprehension vs for loop vs map()
n = 10_000

loop_time = timeit.timeit(
    stmt="result = []\nfor x in range(n):\n    result.append(x**2)",
    globals={'n': n}, number=1000
)

comp_time = timeit.timeit(
    stmt="result = [x**2 for x in range(n)]",
    globals={'n': n}, number=1000
)

map_time = timeit.timeit(
    stmt="result = list(map(lambda x: x**2, range(n)))",
    globals={'n': n}, number=1000
)

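print(f"loop: {loop_time:.3f}s  comprehension: {comp_time:.3f}s  map: {map_time:.3f}s")

# Typical results: comprehension roughly 30% faster than the loop and
# comparable to map(). Exact numbers depend on the Python version and hardware.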

When NOT to Use List Comprehensions

Side effects: Don't use comprehensions just to run code for side effects (printing, API calls). Use a regular for loop — it's clearer about intent.
Multiple operations on the same item: If you need to call multiple methods or do multiple things with each item, a loop is often cleaner.
Complex nested logic: More than two levels of nesting is usually a readability problem. Extract a function instead.
Very large datasets where you need lazy evaluation: Use a generator expression or the itertools module.

# ❌ Comprehension for side effects — don't do this
[print(x) for x in items]  # confusing, creates a list of None

# ✅ Use a loop for side effects
for x in items:
    print(x)

# ❌ Too complex to read
result = [transform(item) for sublist in nested
          for item in sublist if item.condition
          if item.value > threshold]

# ✅ Extract a function
def is_valid(item):
    return item.condition and item.value > threshold

result = [transform(item)
          for sublist in nested
          for item in sublist
          if is_valid(item)]
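
For the lazy-evaluation case, here is a sketch of what reaching for itertools might look like: chain.from_iterable flattens without building an intermediate list, and islice takes only the first few results. The nested data is invented for illustration.

```python
from itertools import chain, islice

# Hypothetical nested data
nested = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# Lazily flatten, then lazily square; nothing is computed yet
flat = chain.from_iterable(nested)
squares = (n ** 2 for n in flat)

# islice pulls only the first four values through the pipeline
first_four = list(islice(squares, 4))
print(first_four)  # [1, 4, 9, 16]
```

The remaining values are never computed unless something asks for them, which is the main advantage over building the full list first.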

Quick Reference

# List comprehension
[expr for x in iterable]
[expr for x in iterable if condition]
[true_val if cond else false_val for x in iterable]
[expr for x in outer for y in inner]   # nested

# Dict comprehension
{k: v for k, v in iterable}
{k: v for k, v in iterable if condition}

# Set comprehension
{expr for x in iterable}
{expr for x in iterable if condition}

# Generator expression (lazy)
(expr for x in iterable)
(expr for x in iterable if condition)
sum(x**2 for x in range(100))          # direct use

List comprehensions are a core part of idiomatic Python. Once you're comfortable with basic and conditional comprehensions, try dict and set comprehensions — they'll clean up a surprising amount of code. And whenever you just need to consume a sequence once, reach for a generator expression instead to keep memory usage low.
