Every Python tutorial puts sorted() and list.sort() side by side and says something like: "sort() modifies in place, sorted() returns a new list." That is correct. And for a script that runs once and exits, it is probably enough.
But if you write backend services — where request handlers share state, where functions receive lists as arguments, where a cache holds data that multiple parts of the codebase read — this explanation misses the details that cause real incidents.
This article covers those details. By the end you will understand exactly what CPython does in memory during each operation, what the GIL actually protects during a sort (and where a popular explanation gets it wrong), and why the mutation behaviour of list.sort() is a consistent source of bugs that are hard to find.
The One-Line Difference That Misleads People
nums = [3, 1, 4, 1, 5]
# list.sort() — modifies the list, returns None
nums.sort()
print(nums) # [1, 1, 3, 4, 5]
# sorted() — leaves the input alone, returns a new list
result = sorted(nums)
print(nums) # [1, 1, 3, 4, 5] — untouched
print(result) # [1, 1, 3, 4, 5] — new object
In isolation this looks like a minor stylistic choice. In a service, it is not.
The Mutation Problem
When you call my_list.sort(), Python rearranges element pointers inside the original list object. Every variable in your program that holds a reference to that same object immediately sees the new order. No copy is made. No warning is raised. The data just changes.
In a linear script this does not matter. In a service where a list travels from a cache into a handler into a utility function, it creates a category of bug that does not crash — it just produces subtly wrong output.
The aliasing pattern that shows up constantly
raw_orders = [5, 2, 9, 1, 7]
audit_trail = raw_orders # looks like a rename — it is an alias
def process(order_ids):
    order_ids.sort()   # mutates raw_orders silently
    return order_ids
result = process(raw_orders)
print(result) # [1, 2, 5, 7, 9] ← correct output
print(audit_trail) # [1, 2, 5, 7, 9] ← insertion order gone
print(raw_orders) # [1, 2, 5, 7, 9] ← same object, same damage
The audit trail was supposed to record arrival order. It is now sorted. No crash occurred. The data just silently lost its shape.
The fix is one word:
def process(order_ids):
    return sorted(order_ids)   # new list, nothing mutated
What CPython Does in Memory
Understanding why the mutation reaches all aliases requires a quick look at how Python represents a list internally.
CPython stores each list as a struct called PyListObject. It holds a pointer to an array of pointers (ob_item) — one pointer per element. The actual element objects live elsewhere in memory.
/* CPython — Objects/listobject.c (simplified) */
typedef struct {
    PyObject_VAR_HEAD
    PyObject **ob_item;      /* array of pointers to elements */
    Py_ssize_t allocated;    /* capacity */
} PyListObject;
When list.sort() runs: Timsort rearranges the pointers inside ob_item. The integer or string objects on the heap do not move at all. Because every Python name that refers to this list holds the same memory address of the PyListObject, they all observe that rearrangement the instant it finishes.
When sorted() runs: It creates a brand new PyListObject, copies the element pointers into it, and sorts that new object. The original PyListObject's ob_item is never touched.
Before any sort:
original ──► PyListObject @ 0x7f3a [5, 2, 9, 1]
alias ──► PyListObject @ 0x7f3a [5, 2, 9, 1]
After alias.sort():
original ──► PyListObject @ 0x7f3a [1, 2, 5, 9] MUTATED
alias ──► PyListObject @ 0x7f3a [1, 2, 5, 9] same object
After sorted(original):
original ──► PyListObject @ 0x7f3a [5, 2, 9, 1] untouched
alias ──► PyListObject @ 0x7f3a [5, 2, 9, 1] untouched
result ──► PyListObject @ 0x8b1c [1, 2, 5, 9] new object
You can verify the object identity yourself with id():
a = [3, 1, 2]
b = a
b.sort()
print(id(a) == id(b)) # True — one object, two names
a = [3, 1, 2]
b = a
c = sorted(a)
print(id(a) == id(c)) # False — two independent objects
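The pointer copying is also observable from Python: sorted() performs a shallow copy, so the new list contains references to the very same element objects. A quick check with mutable elements makes this visible:

```python
# sorted() copies element pointers, not the elements themselves:
# the new list holds references to the very same objects.
a = [[3], [1], [2]]               # mutable elements make identity easy to see
c = sorted(a, key=lambda x: x[0])

print(c[0] is a[1])               # True — same inner list object
a[1].append(99)                   # mutate through the original name...
print(c[0])                       # [1, 99] — ...visible through the copy too
```

So sorted() protects you from reordering the original list, but not from in-place edits to the elements it shares with the original.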
The GIL and list.sort() — Correcting a Widespread Claim
Many articles — including an earlier version of this one — state that "CPython's Timsort drops the GIL at internal checkpoints, allowing other threads to see a partially sorted list."
That claim is not accurate for standard CPython, and it is worth being precise.
What actually happens
Without a Python key function: when you call list.sort() with no key on a list of built-in types such as ints or strings, every comparison runs in C, so the entire sort executes as a single C call with the GIL held throughout. No other Python thread runs during that call, and the final sorted state becomes visible all at once when it returns. (A C-implemented key like operator.attrgetter usually keeps you on this path, though it can still re-enter Python if the attribute is a property.)
With a Python key function: when you pass key=lambda x: x.score, CPython calls the lambda once per element up front to precompute the keys, not once per comparison, and each of those calls re-enters the bytecode interpreter, where the GIL can release. The same applies when comparisons invoke a Python-level __lt__. Other threads can run during those windows, but even then they never see a partially sorted list. As a CPython implementation detail, list.sort() detaches the list's element array for the duration of the sort, precisely so that concurrent mutation cannot corrupt it. A concurrent reader therefore sees an empty list until the sorted result is swapped back in, and a thread that mutates the list mid-sort causes the sort to finish with ValueError: list modified during sort.
import threading, time
shared = list(range(500, 0, -1))
def reader():
    time.sleep(0.001)
    snap = shared[:]   # what does this thread see?
    print(f"Reader saw {len(snap)} items")
# Case 1: no key — GIL held for the whole sort. The reader sees all
# 500 items, in either the original or the final order, never mid-sort
t1 = threading.Thread(target=reader)
t2 = threading.Thread(target=lambda: shared.sort())
t1.start(); t2.start()
t1.join(); t2.join()
# Case 2: Python lambda key — the GIL can release during key calls.
# The reader may see 0 items (the detached list), never a partial sort
shared = list(range(500, 0, -1))
t3 = threading.Thread(target=reader)
t4 = threading.Thread(target=lambda: shared.sort(key=lambda x: x))
t3.start(); t4.start()
t3.join(); t4.join()
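You do not even need threads to observe the detach trick. A key function runs mid-sort, so it can inspect or mutate the list and show both CPython-specific behaviours (this is an implementation-detail sketch, verified on CPython 3.x):

```python
data = [3, 1, 2]

def nosy_key(x):
    # The list is detached (appears empty) for the duration of the sort.
    print(len(data))   # 0
    return x

data.sort(key=nosy_key)
print(data)            # [1, 2, 3]

# Mutating the list mid-sort is detected once the sort finishes:
data = [3, 1, 2]
try:
    data.sort(key=lambda x: data.append(0) or x)
except ValueError as e:
    print(e)           # list modified during sort
```

The same guard is what protects the threaded case: whatever another thread does to the list during the sort, Timsort is working on the detached array, and the mutation is reported as an error rather than silently merged.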
But the GIL is not the real problem
Whether the GIL drops during the sort or not, the end result is the same: once sort() finishes, the list is permanently reordered. Every reference to that object — in every thread, in every function, in every cache — immediately sees the new state. No notification, no flag, no synchronisation point.
That permanent shared mutation is the actual danger, not the torn-read scenario.
A Real Cache Corruption Pattern
Here is the pattern that causes multi-week debugging sessions.
# cache.py
_store = {}
def get_products():
    if 'products' not in _store:
        _store['products'] = [
            {'name': 'Pen', 'price': 1.50, 'order': 1},
            {'name': 'Desk', 'price': 299.0, 'order': 2},
            {'name': 'Chair', 'price': 189.0, 'order': 3},
            {'name': 'Lamp', 'price': 45.0, 'order': 4},
        ]
    return _store['products']   # returns the actual cache object
# handlers.py
def handler_top_by_price():
    products = get_products()
    # BUG: sorts the shared cache object in place
    products.sort(key=lambda p: p['price'], reverse=True)
    return [p['name'] for p in products]

def handler_listing():
    products = get_products()
    return [p['name'] for p in products]   # expects insertion order
# Simulate requests arriving in this order
print(handler_listing())
# ['Pen', 'Desk', 'Chair', 'Lamp'] ← correct
print(handler_top_by_price())
# ['Desk', 'Chair', 'Lamp', 'Pen'] ← correct output
print(handler_listing())
# ['Desk', 'Chair', 'Lamp', 'Pen'] ← WRONG
# Cache is permanently reordered. Every future request is affected.
# No exception. No log entry. Just wrong data.
The fix is one word — .sort() becomes sorted():
def handler_top_by_price():
    products = get_products()
    top = sorted(products, key=lambda p: p['price'], reverse=True)
    return [p['name'] for p in top]
# Cache is now untouched. handler_listing() always returns insertion order.
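If you control cache.py, a complementary defence is to stop handing out the cache object at all and return a copy instead. A minimal sketch (the deepcopy cost matters for large caches, so measure before adopting this):

```python
import copy

_store = {}

def get_products():
    if 'products' not in _store:
        _store['products'] = [
            {'name': 'Pen', 'price': 1.50, 'order': 1},
            {'name': 'Desk', 'price': 299.0, 'order': 2},
        ]
    # Hand out a copy so no handler can reorder the cached list.
    # list(_store['products']) would protect only the order;
    # deepcopy also protects the dicts inside from in-place edits.
    return copy.deepcopy(_store['products'])
```

With this in place, even a handler that calls .sort() corrupts only its own copy, and the bug becomes impossible rather than merely avoided.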
The investigation for this class of bug typically takes much longer than the fix. The symptom — data in the wrong order — points toward the algorithm, the database query, or the cache expiry policy. The sort call in a request handler is usually the last place anyone looks.
Pipeline Safety
One structural advantage of sorted() that gets overlooked: functions that do not mutate their inputs are independently testable and safely composable.
def filter_active(users):
    return [u for u in users if u['active']]

def rank_by_score(users):
    return sorted(users, key=lambda u: u['score'], reverse=True)

def take_top(users, n=10):
    return users[:n]
# all_users is untouched through every stage.
result = take_top(rank_by_score(filter_active(all_users)))
With sorted(), each function in the chain can be tested like this:
def test_rank_does_not_mutate_input():
    users = [{'score': 82}, {'score': 91}, {'score': 74}]
    before = users[:]
    rank_by_score(users)
    assert users == before   # passes with sorted(), fails with .sort()
This is not possible when rank_by_score uses .sort() internally — the test fixture is mutated by the call, and running the same test twice on the same data produces different results the second time.
Performance — The Honest Picture
sorted() is a little slower because it allocates a new list before sorting. The table below shows approximate timings on CPython 3.12.
| List size | list.sort() | sorted() | Overhead |
|---|---|---|---|
| 100 elements | 0.8 µs | 1.1 µs | +38% |
| 1,000 elements | 8.5 µs | 10.2 µs | +20% |
| 10,000 elements | 28 µs | 31 µs | +11% |
| 100,000 elements | 310 µs | 345 µs | +11% |
| 1,000,000 elements | 3.8 ms | 4.2 ms | +11% |
A few things to note:
- The percentage overhead is largest at small list sizes where the absolute difference is trivially small (0.3 µs).
- At large list sizes the absolute difference is measurable (0.4 ms at 1M elements) — but sorting that many items in a request handler is itself the architectural problem; the sort belongs at the database level.
- A single database round-trip takes roughly 1 ms. The allocation cost of sorted() on a 10,000-element list is 3 µs. The comparison is not close.
The allocation cost of sorted() does not compound across refactors. The mutation risk of .sort() does.
You can run this benchmark yourself with the script in the companion GitHub repository:
# benchmark.py
import timeit, random
def run(n, iterations=300):
    data = random.sample(range(n * 10), n)
    g = {'lst': data}
    # 'lst[:].sort()' copies then sorts (sorting in place repeatedly would
    # only measure Timsort's already-sorted best case). Subtract the bare
    # copy cost so t_sort approximates the in-place sort alone.
    t_copy_sort = timeit.timeit('lst[:].sort()', globals=g, number=iterations)
    t_copy = timeit.timeit('lst[:]', globals=g, number=iterations)
    t_sort = (t_copy_sort - t_copy) / iterations * 1e6
    t_sorted = timeit.timeit('sorted(lst)', globals=g,
                             number=iterations) / iterations * 1e6
    delta = ((t_sorted - t_sort) / t_sort) * 100
    print(f"n={n:>9,} sort={t_sort:>8.2f}µs sorted={t_sorted:>8.2f}µs overhead={delta:>+.1f}%")
for n in [100, 1_000, 10_000, 100_000, 1_000_000]:
run(n)
sorted() Accepts Any Iterable — list.sort() Does Not
A practical difference that matters during refactoring: sorted() accepts any iterable — tuples, sets, generators, dict views. list.sort() is a method on list objects only.
nums_tuple = (3, 1, 4, 1, 5)
nums_set = {3, 1, 4, 1, 5}
nums_gen = (x for x in range(5, 0, -1))
sorted(nums_tuple) # [1, 1, 3, 4, 5] ← works
sorted(nums_set) # [1, 3, 4, 5] ← works (deduped by set)
sorted(nums_gen) # [1, 2, 3, 4, 5] ← works
nums_tuple.sort() # AttributeError: 'tuple' object has no attribute 'sort'
nums_set.sort() # AttributeError: 'set' object has no attribute 'sort'
nums_gen.sort() # AttributeError: 'generator' object has no attribute 'sort'
If your data source changes from a list to a generator or a query result during refactoring, sorted() keeps working. .sort() calls fail at runtime.
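This also covers one of the most common day-to-day cases: sorting a dict's entries by value. A dict_items view has no .sort() method, but sorted() consumes it directly:

```python
prices = {'Desk': 299.0, 'Pen': 1.5, 'Lamp': 45.0}

# Sort the (key, value) pairs by value.
by_price = sorted(prices.items(), key=lambda kv: kv[1])
print(by_price)          # [('Pen', 1.5), ('Lamp', 45.0), ('Desk', 299.0)]

# Just the names, cheapest first:
cheapest_first = [name for name, _ in by_price]
print(cheapest_first)    # ['Pen', 'Lamp', 'Desk']
```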
When list.sort() Is the Right Choice
list.sort() is appropriate when all of the following hold:
- You built the list in the current scope and no other reference to it exists anywhere — not a parameter, not from a cache, not assigned to another name.
- You will not pass this list downstream — no function that calls this one will receive the list after the sort.
- You have profiled and measured that the allocation cost of sorted() is a real bottleneck in your specific workload. Not assumed — measured with a profiler like cProfile or py-spy.
In most backend service code, all three conditions are rarely true at the same time.
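A minimal way to check that third condition — whether sorting even registers in a profile — might look like this (workload() here is a hypothetical stand-in for your handler):

```python
import cProfile
import pstats
import random

def workload():
    # Hypothetical stand-in for a request handler that sorts.
    data = [random.random() for _ in range(10_000)]
    return sorted(data)

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

stats = pstats.Stats(profiler)
# If sorted() is not near the top of this output, it is not your bottleneck.
stats.sort_stats('cumulative').print_stats(5)
```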
# This is fine — you built it, you own it, no external references
def get_report_rows(data):
    rows = [transform(d) for d in data]   # new list, built here
    rows.sort(key=lambda r: r['date'])    # sole owner — acceptable
    return rows

# This is not fine — you received it, you do not own it
def process(records):
    records.sort()   # mutates the caller's data
    return records
Antipatterns Worth Knowing
# 1. Sorting then copying — worst of both worlds
def get_sorted(items):
    items.sort()           # mutates caller
    return sorted(items)   # pays allocation cost anyway
# Use: return sorted(items) — one call, no mutation
# 2. Trusting the return value of sort()
result = items.sort() # result is None
print(result[0]) # TypeError: 'NoneType' object is not subscriptable
# 3. Manual copy + sort — noisier than sorted()
copy = items[:]
copy.sort()
return copy
# Use: return sorted(items) — identical result, one line
# 4. Sorting a parameter without the caller knowing
def rank(items, key_fn):
    items.sort(key=key_fn)   # caller's list is now sorted
    return items             # looks like a pure return, hides the mutation
Both Functions Are Stable — This Is Guaranteed
Both list.sort() and sorted() use Timsort internally. Both are guaranteed stable by the Python language specification — not just CPython's implementation. Equal elements maintain their original relative order.
records = [
    {'dept': 'Engineering', 'hire_date': '2022-03'},
    {'dept': 'Design', 'hire_date': '2021-07'},
    {'dept': 'Engineering', 'hire_date': '2020-11'},
]
# After sorting by department, the two Engineering records appear
# in the same relative order they had in the original list.
sorted_records = sorted(records, key=lambda r: r['dept'])
# Result: Design / Engineering(2022-03) / Engineering(2020-11)
# The two Engineering records keep their original relative order — stable
This guarantee holds across CPython, PyPy, and any other compliant Python implementation. Sorts have been guaranteed stable since Python 2.3, when Timsort was introduced.
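Stability is also what makes multi-key sorting by successive passes correct: sort by the secondary key first, then by the primary key, and ties on the primary key keep the secondary order.

```python
employees = [
    {'dept': 'Eng', 'name': 'Cara'},
    {'dept': 'Design', 'name': 'Ana'},
    {'dept': 'Eng', 'name': 'Ben'},
]

# Pass 1: secondary key. Pass 2: primary key.
# Stability guarantees that ties on 'dept' keep the 'name' order.
by_name = sorted(employees, key=lambda e: e['name'])
by_dept_then_name = sorted(by_name, key=lambda e: e['dept'])

print([(e['dept'], e['name']) for e in by_dept_then_name])
# [('Design', 'Ana'), ('Eng', 'Ben'), ('Eng', 'Cara')]
```

This two-pass idiom only works because both passes are stable; with an unstable sort, the second pass could scramble the name order within each department.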
The Full Test Suite
Here is a test file you can drop into any project to verify sorting behaviour:
# test_sort_safety.py
import copy
import unittest
def rank_buggy(items):
    items.sort(key=lambda x: x['v'])
    return items

def rank_safe(items):
    return sorted(items, key=lambda x: x['v'])

class TestSortSafety(unittest.TestCase):
    def setUp(self):
        self.data = [{'v': 3}, {'v': 1}, {'v': 4}, {'v': 1}, {'v': 5}]
        self.snapshot = copy.deepcopy(self.data)

    def test_sort_mutates_input(self):
        rank_buggy(self.data)
        # The input was changed in place — this assertion passes,
        # documenting the mutation
        self.assertNotEqual(self.data, self.snapshot,
                            "rank_buggy() did not mutate input")

    def test_sorted_preserves_input(self):
        rank_safe(self.data)
        self.assertEqual(self.data, self.snapshot,
                         "rank_safe() mutated the input")

    def test_sorted_returns_new_object(self):
        result = rank_safe(self.data)
        self.assertIsNot(result, self.data)

    def test_sorted_produces_correct_order(self):
        result = rank_safe(self.data)
        vals = [d['v'] for d in result]
        self.assertEqual(vals, sorted(vals))

    def test_stable_sort_preserves_relative_order(self):
        items = [{'v': 1, 'pos': 'first'}, {'v': 2}, {'v': 1, 'pos': 'second'}]
        result = rank_safe(items)
        # Equal elements maintain original order
        equals = [r for r in result if r['v'] == 1]
        self.assertEqual(equals[0].get('pos'), 'first')
        self.assertEqual(equals[1].get('pos'), 'second')

    def test_same_input_gives_same_output_every_time(self):
        r1 = rank_safe(self.data)
        r2 = rank_safe(self.data)
        self.assertEqual([d['v'] for d in r1], [d['v'] for d in r2])
if __name__ == '__main__':
unittest.main()
Run it:
python -m pytest test_sort_safety.py -v
All of the tests should pass. test_sort_mutates_input passes precisely because rank_buggy() mutates its input — the assertion documents the hazard. The remaining tests verify that rank_safe() never touches the data it receives.
Quick Reference
# The rule in one place
# DEFAULT: always safe, works on any iterable
result = sorted(collection, key=..., reverse=...)
# EXCEPTION: only when you built the list here,
# no other references exist, and you have measured a bottleneck
your_list.sort(key=..., reverse=...)
# NEVER:
result = your_list.sort() # result is None
What to Read Next
The full article on emitechlogic.com includes interactive visualisations you cannot see in a Markdown post:
- A live mutation visualiser — click a button and watch all aliased variables change simultaneously
- A CPython memory diagram with tabs showing the ob_item pointer array before and after each operation
- A thread simulator that replays the shared-cache corruption event, step by step with timestamps
- An animated benchmark chart with click-to-expand engineering context for each list size
- A 10-question quiz with per-answer explanations and a final score
- A decision flowchart with five questions that walk you to the correct function for your situation
The companion GitHub repository has all code from this article in runnable form — benchmark script, full test suite, cache corruption demo, and GIL behaviour examples — all with no external dependencies.
Related articles on the same site:
- Python Sorting Algorithms — How Timsort Works
- Mutable vs Immutable in Python
- Python Garbage Collection and Reference Counting
- Scope and Lifetime of Variables in Python
- Parameter Passing Techniques in Python
- Python Optimization Guide
Originally published at emitechlogic.com.