Last week my boss threw me a task: pull data from 50 third‑party APIs and build an aggregated report. I thought it was a piece of cake — just write a loop with Requests and be done. But when I ran it, I was dumbfounded: the whole thing was synchronously blocking, and cycling through all 50 endpoints took almost 80 seconds. That’s when I naturally reached for asyncio, Python’s silver bullet for IO‑bound concurrency. I jumped in eagerly, only to spend the next three hours glued to my screen hunting down one weird behavior after another.
I thought I understood asyncio — I'd only scratched the surface
The event loop: a single‑threaded time‑management wizard
At the heart of asyncio sits an event loop. It juggles all coroutines inside a single thread. When a coroutine is waiting on something slow — network, disk — it doesn’t block the thread. Instead, it yields control back to the event loop, which then wakes up the next ready coroutine.
You define coroutines with async def and voluntarily yield control with await:
import asyncio

async def fetch_api(url: str) -> str:
    print(f"Requesting {url}")
    await asyncio.sleep(1)  # simulates network IO; in real code use aiohttp
    return f"data from {url}"
Real concurrency: gather and create_task
Throw all 50 tasks together and run them concurrently with asyncio.gather. The total time depends on the slowest one, not the sum of all requests:
async def main():
    urls = [f"https://api.example.com/item/{i}" for i in range(50)]
    tasks = [fetch_api(url) for url in urls]
    results = await asyncio.gather(*tasks)
    print(f"Fetched {len(results)} items")

asyncio.run(main())
Just like that, I went from 80 seconds to under 2 seconds. I nearly slapped the desk in excitement — but that’s exactly when the real traps started lining up.
Full comparison: sync vs async — how big is the gap?
You can literally copy and run the two snippets below. Trust me, you’ll want to see the difference yourself.
Synchronous version (painfully slow)
import time

import requests

def fetch_sync(url: str) -> int:
    resp = requests.get(url, timeout=5)
    return resp.status_code  # the HTTP status code, an int

def main():
    urls = ["https://httpbin.org/delay/1"] * 10  # 10 slow endpoints
    start = time.perf_counter()
    results = [fetch_sync(url) for url in urls]
    elapsed = time.perf_counter() - start
    print(f"Sync elapsed: {elapsed:.2f}s, results: {len(results)}")

if __name__ == "__main__":
    main()
Async version (the right way)
import asyncio
import time

import aiohttp

async def fetch_async(session: aiohttp.ClientSession, url: str) -> int:
    async with session.get(url, timeout=aiohttp.ClientTimeout(total=5)) as resp:
        return resp.status

async def main():
    urls = ["https://httpbin.org/delay/1"] * 10
    start = time.perf_counter()
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_async(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    elapsed = time.perf_counter() - start
    print(f"Async elapsed: {elapsed:.2f}s, results: {len(results)}")

if __name__ == "__main__":
    asyncio.run(main())
The synchronous version runs 10 endpoints in about 12 seconds. The async one finishes in just over 1 second. The difference is impossible to miss.
The traps I fell into — each one more subtle than the last
1. Forgetting await turns coroutines into zombies
tasks = [fetch_async(session, url) for url in urls] # only creates coroutine objects, never executed!
Without await or asyncio.gather to wrap them, those coroutines are never scheduled. The code finishes almost instantly, and your “results” list is full of coroutine objects. The fix is simple: always use gather or create_task.
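If you want a handle on each task instead of one big gather, asyncio.create_task schedules the coroutine on the loop immediately. A minimal sketch, reusing the fetch_api coroutine from earlier:

async def main():
    urls = [f"https://api.example.com/item/{i}" for i in range(3)]
    # create_task wraps each coroutine in a Task and starts it right away
    tasks = [asyncio.create_task(fetch_api(url)) for url in urls]
    # awaiting collects the results; the tasks have been running since creation
    results = [await task for task in tasks]
    print(results)

Either way works; gather is simply the more compact form when you want all the results at once.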
2. Calling time.sleep inside a coroutine freezes the entire loop
import time

async def buggy_fetch(url):
    time.sleep(1)  # blocks the thread — event loop frozen!
    return "data"
time.sleep is a synchronous blocking call. It seizes the only thread and the event loop cannot switch to anything else. You must use await asyncio.sleep(n) or offload synchronous work with loop.run_in_executor.
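Here's a minimal sketch of the run_in_executor escape hatch; blocking_fetch is a hypothetical stand-in for whatever synchronous call you can't rewrite:

import asyncio
import time

def blocking_fetch(url: str) -> str:
    time.sleep(1)  # stand-in for a blocking call you can't rewrite
    return f"data from {url}"

async def safe_fetch(url: str) -> str:
    loop = asyncio.get_running_loop()
    # None means the loop's default ThreadPoolExecutor; the event loop
    # keeps scheduling other coroutines while the worker thread sleeps
    return await loop.run_in_executor(None, blocking_fetch, url)

On Python 3.9+, await asyncio.to_thread(blocking_fetch, url) does the same thing with less ceremony.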
3. No concurrency limit got me blocked by the target API
50 coroutines bombarded the server at the same time, immediately earning a flood of HTTP 429 responses. The cure is a Semaphore:
sem = asyncio.Semaphore(10)  # at most 10 concurrent requests

async def rate_limited_fetch(session, url):
    async with sem:
        return await fetch_async(session, url)
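Wiring it in is a small change to the earlier main; a sketch:

async def main():
    urls = [f"https://api.example.com/item/{i}" for i in range(50)]
    async with aiohttp.ClientSession() as session:
        tasks = [rate_limited_fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)  # still concurrent, capped at 10 in flight
    print(f"Fetched {len(results)} items")

One caveat: on Python 3.9 and earlier, create the Semaphore inside main() rather than at module level, or it can bind to a different event loop than the one asyncio.run creates.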
4. asyncio.run() crashes on Windows
On Windows, the default ProactorEventLoop can raise a RuntimeError, typically "RuntimeError: Event loop is closed" during shutdown, when aiohttp transports are still being cleaned up as asyncio.run tears the loop down.
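A common workaround, assuming you're hitting that shutdown error and don't need subprocess support (which the selector loop lacks on Windows), is to switch event loop policies before calling asyncio.run:

import asyncio
import sys

if sys.platform == "win32":
    # swap the default Proactor loop for the selector loop to avoid
    # the "Event loop is closed" error on shutdown
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

asyncio.run(main())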