Last month, the product manager cornered me in the break room and fired off: “The dashboard timed out again. Can we please stop making the boss sit there refreshing?” At the time, our metrics sync script took 11 minutes to run — 200 third‑party APIs called one after another, the logs filled with line after line of “waiting for response.” I didn’t bother explaining. I just went back to my desk, opened the editor, and thought: this thing has to go async.
A week later the rewrite went live. Same 200 endpoints, consistently completing in 14 seconds. The monitoring alerts went off immediately — Ops thought we were getting DDoSed. Here’s how that refactor worked, where asyncio shined, where it bit me, and how to dig yourself out gracefully.
How the Event Loop “Steals” Time
There’s no deep magic in asyncio — just a single‑threaded event loop. Think of it as a hyper‑focused dispatcher that does exactly one thing: when Task A fires off an HTTP request and sits idle waiting for the network, the dispatcher suspends it, immediately moves on to Task B, and only switches back when A’s bytes arrive. No thread‑switching overhead, no callback hell — all logic lives inside async/await.
import asyncio
import aiohttp
import time

# Simulate a single API call
async def fetch_api(session, url: str) -> dict:
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
        return await resp.json()
A coroutine function is just a blueprint until you create a task and hand it to the event loop. The most common way to fire many coroutines at once is asyncio.gather — one line that lights them all up concurrently. The total time is no longer the sum of all requests, but the duration of the slowest one.
async def main():
    urls = [f"https://api.example.com/data/{i}" for i in range(200)]
    async with aiohttp.ClientSession() as session:
        start = time.time()
        # Fire all 200 requests at once; total time ≈ the slowest single one
        tasks = [fetch_api(session, url) for url in urls]
        results = await asyncio.gather(*tasks, return_exceptions=True)
        elapsed = time.time() - start
        print(f"Completed {len(urls)} requests in {elapsed:.2f}s")
In the synchronous version, 200 requests add up sequentially. With the code above, all connections go into a waiting state at once, and the event loop only pays the worst‑case IO time once. That’s the biggest shift asyncio brought to my workflow — “waiting in line” became “everything arriving at once.”
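You can see the same effect without any live API. Here is a minimal, stdlib-only sketch where `fake_fetch` and its 0.1s delay are stand-ins for a real network call, not part of the actual pipeline:

```python
import asyncio
import time

async def fake_fetch(i: int, delay: float = 0.1) -> int:
    # Stand-in for a network call: just sleeps to simulate IO wait
    await asyncio.sleep(delay)
    return i

async def sequential(n: int) -> float:
    # One after another: total time is the sum of all delays
    start = time.perf_counter()
    for i in range(n):
        await fake_fetch(i)
    return time.perf_counter() - start

async def concurrent(n: int) -> float:
    # All at once via gather: total time ≈ one delay
    start = time.perf_counter()
    await asyncio.gather(*(fake_fetch(i) for i in range(n)))
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"sequential: {asyncio.run(sequential(20)):.2f}s")  # ~2.0s
    print(f"concurrent: {asyncio.run(concurrent(20)):.2f}s")  # ~0.1s
```

With 20 simulated calls of 0.1s each, the sequential version takes roughly 2 seconds while the concurrent one finishes in about 0.1s, which is the "pay the worst-case IO time once" effect in miniature.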
Controlling Concurrency and Timeouts So You Don’t Shoot Yourself in the Foot
At first I lazily used gather to fire everything at once and immediately blew through the upstream gateway’s rate limit — 429s everywhere. Then I introduced asyncio.Semaphore to cap concurrency at 20 simultaneous requests, along with timeouts and retries. That’s when the pipeline became truly production‑ready.
import asyncio
import aiohttp
from asyncio import Semaphore

CONCURRENCY = 20
MAX_RETRIES = 2

async def fetch_with_limit(sem, session, url):
    async with sem:  # beyond 20 concurrent coroutines, the rest queue on this line
        for attempt in range(MAX_RETRIES + 1):
            try:
                async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
                    resp.raise_for_status()
                    return await resp.json()
            except Exception as e:
                if attempt == MAX_RETRIES:
                    return {"error": str(e), "url": url}
                await asyncio.sleep(2 ** attempt)  # exponential backoff

async def main_controlled():
    urls = [...]
    sem = Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_with_limit(sem, session, u) for u in urls]
        results = await asyncio.gather(*tasks, return_exceptions=True)
        return results
Semaphore acts like a bouncer — only 20 coroutines get in at a time; the rest queue at the `async with sem:` line until a slot frees up. Add timeouts, status checks, and exponential-backoff retries, and you won't nuke the downstream while still tolerating transient hiccups.
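You can verify the cap empirically with a tiny stdlib-only sketch. The names here (`limited_task`, the `peak`/`active` counters) are illustrative, not from the real pipeline:

```python
import asyncio

CONCURRENCY = 3
peak = 0    # highest number of coroutines seen inside the semaphore at once
active = 0  # coroutines currently inside the semaphore

async def limited_task(sem: asyncio.Semaphore) -> None:
    global active, peak
    async with sem:  # excess coroutines queue here
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.05)  # simulated IO
        active -= 1

async def run_demo(n: int) -> int:
    global peak, active
    peak = active = 0
    sem = asyncio.Semaphore(CONCURRENCY)
    await asyncio.gather(*(limited_task(sem) for _ in range(n)))
    return peak

if __name__ == "__main__":
    # Launch 10 tasks, but the observed peak never exceeds the limit of 3
    print("peak concurrency:", asyncio.run(run_demo(10)))
```

Even with 10 tasks launched via gather, the recorded peak stays at the semaphore's limit, which is exactly the behavior you want against a rate-limited upstream.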
Lessons Learned: Three Hours Hunting a Silent Bug
1. Forgetting await Turns Coroutines into Ghost Code
Once I noticed the network requests weren't actually being made, yet the logs said “execution successful.” Half an hour later I spotted fetch_data(url) instead of await fetch_data(url). Calling a coroutine function without await merely creates a coroutine object — the event loop never runs it. Python does emit a RuntimeWarning (“coroutine … was never awaited”), but it's easy to overlook in a long log. The fix: run with python -W error::RuntimeWarning to turn the warning into an exception, or explicitly schedule the coroutine with asyncio.create_task.
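A minimal illustration of the trap — `fetch_data` here is a toy stand-in, not the original script's function:

```python
import asyncio
import inspect

async def fetch_data() -> str:
    return "ok"

# Calling without await just builds a coroutine object; no code runs
coro = fetch_data()
print(inspect.iscoroutine(coro))  # True — it's an object, not a result
coro.close()  # close it explicitly to suppress the "never awaited" RuntimeWarning

# Awaited (here via asyncio.run), the coroutine actually executes
print(asyncio.run(fetch_data()))
```

The first call produces only an inert coroutine object; nothing executes until the event loop awaits it.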
2. Mixing Synchronous Blocking Code Inside a Coroutine
I initially reused old code and called requests.get directly inside an async def. The event loop got blocked, and all concurrency degraded to serial execution. The iron rule: inside an async function, never use any synchronous blocking call. Either switch to the corresponding aio library (aiohttp, aiofiles, etc.) or offload the blocking call to a thread pool with loop.run_in_executor.
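Here is a hedged sketch of the offloading pattern, using time.sleep as a stand-in for a blocking call like requests.get (`blocking_call` is a hypothetical helper; asyncio.to_thread, available since Python 3.9, is a convenience wrapper over run_in_executor with the default thread pool):

```python
import asyncio
import time

def blocking_call(x: int) -> int:
    # Stand-in for a blocking call such as requests.get
    time.sleep(0.1)
    return x * 2

async def blocking_inside_coroutine() -> float:
    # Anti-pattern: blocking the loop, so the calls run serially
    start = time.perf_counter()
    for i in range(5):
        blocking_call(i)
    return time.perf_counter() - start

async def offloaded_to_threads() -> float:
    # Each blocking call runs in the default thread pool; the loop stays free
    start = time.perf_counter()
    await asyncio.gather(*(asyncio.to_thread(blocking_call, i) for i in range(5)))
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"blocking in loop: {asyncio.run(blocking_inside_coroutine()):.2f}s")  # ~0.5s
    print(f"offloaded:        {asyncio.run(offloaded_to_threads()):.2f}s")       # ~0.1s
```

Five 0.1s blocking calls take about 0.5s when they stall the loop, but roughly 0.1s once offloaded — the same serial-vs-concurrent collapse described above, just caused by CPU-side blocking instead of a missing await.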
Refactoring that pipeline taught me that asyncio’s real power isn’t just speed — it’s the ability to handle IO‑bound workloads without the complexity of threading. But it also demands that you be disciplined: respect the event loop, control your concurrency, and always, always await.