Last month, the product manager cornered me in the break room and fired off: “The dashboard timed out again. Can we please stop making the boss sit there refreshing?” At the time, our metrics sync script took 11 minutes to run — 200 third‑party APIs called one after another, the logs filled with line after line of “waiting for response.” I didn’t bother explaining. I just went back to my desk, opened the editor, and thought: this thing has to go async.
A week later the rewrite went live. Same 200 endpoints, consistently completing in 14 seconds. The monitoring alerts went off immediately — Ops thought we were getting DDoSed. Here’s how that refactor worked, where asyncio shined, where it bit me, and how to dig yourself out gracefully.
How the Event Loop “Steals” Time
There’s no deep magic in asyncio — just a single‑threaded event loop. Think of it as a hyper‑focused dispatcher that does exactly one thing: when Task A fires off an HTTP request and sits idle waiting for the network, the dispatcher suspends it, immediately moves on to Task B, and only switches back when A’s bytes arrive. No thread‑switching overhead, no callback hell — all logic lives inside async/await.
import asyncio
import aiohttp
import time

# Simulate a single API call
async def fetch_api(session, url: str) -> dict:
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
        return await resp.json()
A coroutine function is just a blueprint until you create a task and hand it to the event loop. The most common way to fire many coroutines at once is asyncio.gather — one line that lights them all up concurrently. The total time is no longer the sum of all requests, but the duration of the slowest one.
async def main():
    urls = [f"https://api.example.com/data/{i}" for i in range(200)]
    async with aiohttp.ClientSession() as session:
        start = time.time()
        # Fire all 200 requests at once; total time ≈ the slowest single one
        tasks = [fetch_api(session, url) for url in urls]
        results = await asyncio.gather(*tasks, return_exceptions=True)
        elapsed = time.time() - start
        print(f"Completed {len(urls)} requests in {elapsed:.2f}s")
In the synchronous version, 200 requests add up sequentially. With the code above, all connections go into a waiting state at once, and the event loop only pays the worst‑case IO time once. That’s the biggest shift asyncio brought to my workflow — “waiting in line” became “everything arriving at once.”
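You can see the same effect without any live API. Here is a minimal, stdlib-only sketch where `fake_fetch` and its 0.1s delay are stand-ins for a real network call, not part of the actual pipeline:

```python
import asyncio
import time

async def fake_fetch(i: int, delay: float = 0.1) -> int:
    # Stand-in for a network call: just sleeps to simulate IO wait
    await asyncio.sleep(delay)
    return i

async def sequential(n: int) -> float:
    # One after another: total time is the sum of all delays
    start = time.perf_counter()
    for i in range(n):
        await fake_fetch(i)
    return time.perf_counter() - start

async def concurrent(n: int) -> float:
    # All at once via gather: total time ≈ one delay
    start = time.perf_counter()
    await asyncio.gather(*(fake_fetch(i) for i in range(n)))
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"sequential: {asyncio.run(sequential(20)):.2f}s")  # ~2.0s
    print(f"concurrent: {asyncio.run(concurrent(20)):.2f}s")  # ~0.1s
```

With 20 simulated calls of 0.1s each, the sequential version takes roughly 2 seconds while the concurrent one finishes in about 0.1s, which is the "pay the worst-case IO time once" effect in miniature.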
Controlling Concurrency and Timeouts So You Don’t Shoot Yourself in the Foot
At first I lazily used gather to fire everything at once and immediately blew through the upstream gateway’s rate limit — 429s everywhere. Then I introduced asyncio.Semaphore to cap concurrency at 20 simultaneous requests, along with timeouts and retries. That’s when the pipeline became truly production‑ready.
import asyncio
import aiohttp
from asyncio import Semaphore

CONCURRENCY = 20
MAX_RETRIES = 2

async def fetch_with_limit(sem, session, url):
    async with sem:  # beyond 20 concurrent coroutines, the rest queue on this line
        for attempt in range(MAX_RETRIES + 1):
            try:
                async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
                    resp.raise_for_status()
                    return await resp.json()
            except Exception as e:
                if attempt == MAX_RETRIES:
                    return {"error": str(e), "url": url}
                await asyncio.sleep(2 ** attempt)  # exponential backoff

async def main_controlled():
    urls = [...]
    sem = Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_with_limit(sem, session, u) for u in urls]
        results = await asyncio.gather(*tasks, return_exceptions=True)
        return results
Semaphore acts like a bouncer — only 20 coroutines get in at a time; the rest queue at the `async with sem:` line until a slot frees up. Add timeouts, status checks, and exponential-backoff retries, and you won't nuke the downstream while still tolerating transient hiccups.
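You can verify the cap empirically with a tiny stdlib-only sketch. The names here (`limited_task`, the `peak`/`active` counters) are illustrative, not from the real pipeline:

```python
import asyncio

CONCURRENCY = 3
peak = 0    # highest number of coroutines seen inside the semaphore at once
active = 0  # coroutines currently inside the semaphore

async def limited_task(sem: asyncio.Semaphore) -> None:
    global active, peak
    async with sem:  # excess coroutines queue here
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.05)  # simulated IO
        active -= 1

async def run_demo(n: int) -> int:
    global peak, active
    peak = active = 0
    sem = asyncio.Semaphore(CONCURRENCY)
    await asyncio.gather(*(limited_task(sem) for _ in range(n)))
    return peak

if __name__ == "__main__":
    # Launch 10 tasks, but the observed peak never exceeds the limit of 3
    print("peak concurrency:", asyncio.run(run_demo(10)))
```

Even with 10 tasks launched via gather, the recorded peak stays at the semaphore's limit, which is exactly the behavior you want against a rate-limited upstream.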
Lessons Learned: Three Hours Hunting a Silent Bug
1. Forgetting await Turns Coroutines into Ghost Code
Once I noticed the network requests weren't actually being made, yet the logs said “execution successful.” Half an hour later I spotted fetch_data(url) instead of await fetch_data(url). Calling a coroutine function without await merely creates a coroutine object — the event loop never runs it. Python does emit a RuntimeWarning (“coroutine … was never awaited”), but it's easy to overlook in a long log. The fix: run with python -W error::RuntimeWarning to turn the warning into an exception, or explicitly schedule the coroutine with asyncio.create_task.
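A minimal illustration of the trap — `fetch_data` here is a toy stand-in, not the original script's function:

```python
import asyncio
import inspect

async def fetch_data() -> str:
    return "ok"

# Calling without await just builds a coroutine object; no code runs
coro = fetch_data()
print(inspect.iscoroutine(coro))  # True — it's an object, not a result
coro.close()  # close it explicitly to suppress the "never awaited" RuntimeWarning

# Awaited (here via asyncio.run), the coroutine actually executes
print(asyncio.run(fetch_data()))
```

The first call produces only an inert coroutine object; nothing executes until the event loop awaits it.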
2. Mixing Synchronous Blocking Code Inside a Coroutine
I initially reused old code and called requests.get directly inside an async def. The event loop got blocked, and all concurrency degraded to serial execution. The iron rule: inside an async function, never use any synchronous blocking call. Either switch to the corresponding aio library (aiohttp, aiofiles, etc.) or offload the blocking call to a thread pool with loop.run_in_executor.
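Here is a hedged sketch of the offloading pattern, using time.sleep as a stand-in for a blocking call like requests.get (`blocking_call` is a hypothetical helper; asyncio.to_thread, available since Python 3.9, is a convenience wrapper over run_in_executor with the default thread pool):

```python
import asyncio
import time

def blocking_call(x: int) -> int:
    # Stand-in for a blocking call such as requests.get
    time.sleep(0.1)
    return x * 2

async def blocking_inside_coroutine() -> float:
    # Anti-pattern: blocking the loop, so the calls run serially
    start = time.perf_counter()
    for i in range(5):
        blocking_call(i)
    return time.perf_counter() - start

async def offloaded_to_threads() -> float:
    # Each blocking call runs in the default thread pool; the loop stays free
    start = time.perf_counter()
    await asyncio.gather(*(asyncio.to_thread(blocking_call, i) for i in range(5)))
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"blocking in loop: {asyncio.run(blocking_inside_coroutine()):.2f}s")  # ~0.5s
    print(f"offloaded:        {asyncio.run(offloaded_to_threads()):.2f}s")       # ~0.1s
```

Five 0.1s blocking calls take about 0.5s when they stall the loop, but roughly 0.1s once offloaded — the same serial-vs-concurrent collapse described above, just caused by CPU-side blocking instead of a missing await.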
Refactoring that pipeline taught me that asyncio’s real power isn’t just speed — it’s the ability to handle IO‑bound workloads without the complexity of threading. But it also demands that you be disciplined: respect the event loop, control your concurrency, and always, always await.