Async/Await: Python's Concurrency Game Changer, 10 Years On
Python has come a long way since its early days. While much digital ink has been spilled over the Python 2-to-3 transition, a quieter, yet equally profound revolution was brewing in the background. In 2015, Python 3.5 dropped with a pair of new keywords: async and await. These weren't just new syntax; they ushered in a paradigm shift, fundamentally changing how Python handles concurrency.
A decade later, as Python 3.14 looms with exciting new features like free-threading and multiple interpreters, it's worth reflecting on how async/await has truly transformed the language, even if its path to universal adoption has been anything but straightforward.
The Problem Async/Await Came to Solve: Waiting Around
Imagine your Python application needs to fetch data from a dozen different web APIs or query a database. In a traditional, synchronous model, your program would do this one by one: fetch data from API 1, wait for the response, then fetch from API 2, wait, and so on. Most of the time, your program isn't doing anything useful during these waits; it's simply blocked, twiddling its thumbs while waiting for an external system to respond.
import requests
import time

def fetch_data_sync(url):
    print(f"Fetching {url} synchronously...")
    response = requests.get(url)
    print(f"Finished fetching {url}.")
    return response.text

urls = [
    "https://jsonplaceholder.typicode.com/todos/1",
    "https://jsonplaceholder.typicode.com/posts/1",
    "https://jsonplaceholder.typicode.com/users/1",
]

start_time = time.time()
for url in urls:
    fetch_data_sync(url)
end_time = time.time()
print(f"Synchronous fetching took {end_time - start_time:.2f} seconds.")
(Note: requests is a synchronous library; each call blocks until the response arrives, which is exactly the behavior this example is meant to demonstrate.)
This is where async/await shines. It allows your program to say, "Hey, I'm going to start fetching from API 1. While I'm waiting for that, I might as well start fetching from API 2 and API 3. When any of them respond, let me know, and I'll deal with it."
This isn't about making a single network call faster; it's about making many network calls concurrently without needing multiple threads or processes, thus improving overall application throughput. The actual "waiting" for network data happens outside your Python process, allowing your code to efficiently manage other tasks.
The Core Idea: Coroutines and the Event Loop
At its heart, async/await introduced coroutines: special functions that can be paused and resumed. The event loop is the orchestrator, managing these coroutines, yielding control from one when it encounters an await expression, and resuming another.
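Before turning to real network code, a tiny self-contained sketch (nothing but asyncio.sleep, no network) makes the pause-and-resume mechanics concrete: each await hands control back to the event loop, which resumes whichever coroutine is ready next.

```python
import asyncio

async def worker(name: str, delay: float) -> str:
    # At this await, the coroutine suspends and hands control
    # back to the event loop, which can run other coroutines.
    await asyncio.sleep(delay)
    return f"{name} done"

async def main() -> list[str]:
    # gather() schedules both coroutines; total time is roughly
    # max(delay), not the sum, because they wait concurrently.
    return await asyncio.gather(worker("a", 0.2), worker("b", 0.1))

results = asyncio.run(main())
print(results)  # gather preserves argument order: ['a done', 'b done']
```

Note that "b" finishes first, yet gather returns results in the order the coroutines were passed in.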
Let's look at the asynchronous equivalent of fetching data, using a library like aiohttp:
import asyncio
import aiohttp
import time

async def fetch_data_async(url):
    print(f"Fetching {url} asynchronously...")
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            print(f"Finished fetching {url}.")
            return await response.text()

async def main_async():
    urls = [
        "https://jsonplaceholder.typicode.com/todos/1",
        "https://jsonplaceholder.typicode.com/posts/1",
        "https://jsonplaceholder.typicode.com/users/1",
    ]
    start_time = time.time()
    tasks = [fetch_data_async(url) for url in urls]
    results = await asyncio.gather(*tasks)  # Run all tasks concurrently
    end_time = time.time()
    print(f"Asynchronous fetching took {end_time - start_time:.2f} seconds.")
    # print(results[0][:50])  # Print a snippet of one result

if __name__ == "__main__":
    asyncio.run(main_async())
You'll immediately see the performance gain. Instead of waiting for each request sequentially, asyncio.gather launches them all, and the event loop efficiently handles responses as they arrive. This is why async/await became the killer feature for web development, database interactions, and other I/O-bound network tasks.
The Nuances and The "Gotchas"
Despite its power, async/await in Python has a steeper learning curve than many expect, and its limitations can be frustrating for newcomers.
1. It's for I/O-Bound, Not CPU-Bound Tasks
This is perhaps the most crucial distinction. async/await helps with tasks that wait for external resources. If your task involves heavy computation (e.g., crunching numbers, image processing), async/await won't make it faster. In fact, a CPU-bound operation within an async function will still block the entire event loop, freezing all other concurrent tasks.
As Will McGugan, creator of Rich and Textual, points out: "A reoccurring problem I see with Textual is folk testing concurrency by dropping in a time.sleep(10) call to simulate the work they are planning. Of course, that blocks the entire loop." This highlights a fundamental misunderstanding: time.sleep() is a blocking operation that halts everything in the current thread. For asynchronous pausing, you need await asyncio.sleep().
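The difference is easy to measure. In this sketch, a "ticker" coroutine records a timestamp every 50 ms; a blocking time.sleep(0.3) freezes it for the full 300 ms, while await asyncio.sleep(0.3) lets it keep ticking (the helper names here are illustrative, not from any library):

```python
import asyncio
import time

async def ticker(ticks: list[float]) -> None:
    # Records a timestamp every 50 ms -- but only when the loop is free.
    for _ in range(4):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)

async def blocking_work() -> None:
    time.sleep(0.3)           # BAD: blocks the whole event loop for 300 ms

async def cooperative_work() -> None:
    await asyncio.sleep(0.3)  # GOOD: suspends, letting other tasks run

async def demo(work) -> float:
    ticks: list[float] = []
    await asyncio.gather(ticker(ticks), work())
    # The largest gap between consecutive ticks reveals how long the loop froze.
    return max(b - a for a, b in zip(ticks, ticks[1:]))

blocked_gap = asyncio.run(demo(blocking_work))
smooth_gap = asyncio.run(demo(cooperative_work))
print(f"worst tick gap with time.sleep:    {blocked_gap:.2f}s")
print(f"worst tick gap with asyncio.sleep: {smooth_gap:.2f}s")
```

With time.sleep the worst gap is around 0.3 s; with asyncio.sleep it stays near the 50 ms tick interval.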
2. Disk I/O is Tricky
You'd think reading/writing files from disk would be a prime candidate for async/await, but asyncio doesn't natively support truly asynchronous file operations. Standard file I/O functions (open, read, write) are blocking.
To work around this, libraries like aiofiles exist. But critically, aiofiles achieves its non-blocking behavior by offloading the actual file I/O to a thread pool. It's not truly asynchronous at the OS level (like io_uring on Linux, which itself has had security concerns), but rather uses threads to prevent the main event loop from blocking.
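That same thread-offloading idea can be sketched with nothing but the standard library: since Python 3.9, asyncio.to_thread() hands a blocking call to a worker thread from the default executor. This is not aiofiles' actual code, just a minimal illustration of the pattern it relies on:

```python
import asyncio
import os
import tempfile

def read_file_blocking(path: str) -> str:
    # Plain blocking file I/O -- would stall the event loop if awaited directly.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

async def read_file_async(path: str) -> str:
    # asyncio.to_thread (Python 3.9+) runs the blocking call in a worker
    # thread, so the event loop stays free while the OS does the read.
    return await asyncio.to_thread(read_file_blocking, path)

async def main() -> str:
    # Create a scratch file, then read it back without blocking the loop.
    with tempfile.NamedTemporaryFile("w", delete=False, encoding="utf-8") as f:
        f.write("hello from a worker thread")
        path = f.name
    try:
        return await read_file_async(path)
    finally:
        os.remove(path)

content = asyncio.run(main())
print(content)
```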
3. The Ever-Present GIL
Python's Global Interpreter Lock (GIL) ensures that only one native thread can execute Python bytecode at a time. This impacts how async/await works. While async/await allows concurrent execution of I/O tasks (by switching between coroutines when one is waiting), it doesn't enable parallel execution of CPU-bound Python code on multiple cores within the same process. For true parallelism of CPU-bound tasks, you still generally need multiple processes (using multiprocessing).
Michael Kennedy notes that the GIL's omnipresence means "most Python people never developed multithreaded/async thinking. Because async/await only works for I/O bound work, not CPU as well, it’s of much less use." This underscores the paradigm shift that async demands.
Python's Async vs. C#'s Async: A Different Philosophy
It's interesting to compare Python's async/await with its C# counterpart, from which the syntax was borrowed. C# implements a Task-based Asynchronous Pattern (TAP) where Task is a higher-level abstraction that can represent either a thread or a coroutine. This allows for broader async support across C#'s core I/O libraries (disk, network, even serialization) and simplifies scheduling between concurrent and parallel tasks.
In Python, an event loop runs in a single thread, and all Tasks (coroutines) execute within that thread. When a Task awaits something, it suspends, and the event loop executes the next Task. This means any blocking operation, even within an async function, will block the entire event loop, unlike in C# where a blocking I/O operation might transparently dispatch to a thread pool without blocking the main event stream.
Merging Worlds: run_in_executor
So, what do you do when you have a blocking I/O operation (like file I/O or a synchronous library call) that you need to integrate into an async application? asyncio offers loop.run_in_executor(). This function allows you to offload a regular (blocking) function call to a separate thread or process pool, preventing it from blocking the event loop.
import asyncio
import concurrent.futures
import httpx
import tempfile
import os

# A synchronous function that involves blocking I/O.
# httpx.stream and tmp_file.write release the GIL while waiting on the OS,
# but the call as a whole is lengthy and blocks the thread it runs in.
def download_file_blocking(url: str) -> str:
    print(f"Starting blocking download for {url}...")
    temp_file_path = ""
    try:
        with tempfile.NamedTemporaryFile(delete=False, mode='wb') as tmp_file:
            temp_file_path = tmp_file.name
            with httpx.stream("GET", url, follow_redirects=True, timeout=30.0) as response:
                response.raise_for_status()  # Raise an exception for bad status codes
                for chunk in response.iter_bytes(chunk_size=8192):
                    tmp_file.write(chunk)
        print(f"Finished blocking download for {url} to {temp_file_path}")
        return temp_file_path
    except httpx.RequestError as exc:
        print(f"An error occurred while requesting {url}: {exc}")
        if os.path.exists(temp_file_path):
            os.remove(temp_file_path)  # Clean up partial download
        raise
    except Exception as exc:
        print(f"An unexpected error occurred for {url}: {exc}")
        if os.path.exists(temp_file_path):
            os.remove(temp_file_path)
        raise

async def main_with_executor():
    loop = asyncio.get_running_loop()
    # Placeholder URLs for demonstration;
    # in a real scenario, these would be actual large files.
    URLS = [
        "https://speed.hetzner.de/100MB.bin",  # Example for a 100MB file
        "https://speed.hetzner.de/100MB.bin",
        "https://speed.hetzner.de/100MB.bin",
    ]
    # Use a ThreadPoolExecutor to run blocking functions without blocking the event loop
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as pool:
        tasks = [loop.run_in_executor(pool, download_file_blocking, url) for url in URLS]
        print("Scheduled all downloads via executor. Awaiting results...")
        downloaded_files = await asyncio.gather(*tasks, return_exceptions=True)  # Collect results
    print("\nAll downloads completed.")
    for i, result in enumerate(downloaded_files):
        if isinstance(result, Exception):
            print(f"Download for {URLS[i]} failed: {result}")
        else:
            print(f"Downloaded file {i+1}: {result}")
            # Clean up the temporary file after demonstration
            try:
                os.remove(result)
                print(f"Cleaned up {result}")
            except OSError as e:
                print(f"Error cleaning up file {result}: {e}")

if __name__ == "__main__":
    asyncio.run(main_with_executor())
This pattern allows you to retain an async API while dealing with blocking operations, though it adds a layer of complexity. The constant challenge of knowing "what blocks and what doesn't" remains a common stumbling block.
The Road Ahead: Free-Threading and Multiple Interpreters
The Python ecosystem is never static. With Python 3.13's "free-threaded" builds and 3.14's continued advancements in this area, we are looking at a future where the GIL might be significantly less restrictive, or even optional. Free-threading aims to replace the single GIL with more granular locks, potentially allowing true parallelism for CPU-bound Python code within a single process. Multiple Interpreters could offer isolation and parallelism akin to lightweight processes.
These advancements don't render async/await obsolete; rather, they build upon its foundation. async/await provided the first structured way for Python developers to think about concurrent execution. Coroutines offer benefits like smaller memory footprints, lower context-switching overhead, and faster startup times compared to traditional threads.
As these new parallelism features stabilize, there's an exciting opportunity for Python to develop a standard library API for task parallelism that complements async/await, allowing developers to choose the right tool for CPU-bound parallelism, I/O-bound concurrency, or a hybrid approach.
Conclusion: A Foundation for the Future
async/await in Python 3.5 was a monumental leap. It didn't solve all of Python's concurrency challenges, particularly those related to CPU-bound parallelism or the complexities of the GIL. However, it undeniably changed Python forever by:
- Introducing structured concurrency: Providing a clean, readable syntax for managing concurrent I/O operations.
- Empowering web development: Making frameworks like FastAPI possible and vastly improving the efficiency of network-heavy applications.
- Shifting developer mindset: Forcing developers to distinguish between I/O-bound and CPU-bound tasks and understand the concept of a non-blocking event loop.
- Laying the groundwork: Creating a robust platform for future concurrency and parallelism features, like those arriving in 3.14 and beyond.
While the journey with async/await has had its fair share of "oops, forgot to await" moments and "why is this still blocking?" puzzles, it has propelled Python into a new era of performance and efficiency. As we look towards Python's next decade, async/await stands as a testament to the language's adaptability and its community's relentless pursuit of better ways to build software. It's a skill worth mastering, as it's truly foundational to modern Python development.