DEV Community

BAOFUFAN

I Spent 3 Hours on This Asyncio Bug — Here’s How to Avoid It

Last Friday at 5:30 PM, I was about to shut my laptop and call it a day when our operations colleague pinged me with a request: speed up the competitor price monitoring script. Running sequentially, a single scan took 40 minutes, and the boss was already asking for the data before it even hit the database. I looked at the code — over 100 API calls, all using requests.get one after another. Unacceptable. I immediately rewrote the core fetching logic with asyncio, thinking "add concurrency, finish in five minutes." I ended up debugging until 8:30 PM, hitting more pitfalls in one evening than I'd written bugs all week. Here are those painful lessons, so you don't have to repeat them.


You Think You Understand asyncio? It’s Playing You

The core idea of asyncio isn’t complicated: an event loop schedules all coroutines within a single thread. When a coroutine awaits an I/O operation, the loop suspends it and switches to another one. That’s why it shines for I/O-bound tasks; shove CPU-heavy work into it and you’ll block the whole loop.
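To see that blocking risk concretely, here is a minimal sketch (the `crunch` and `heartbeat` helpers are made up for illustration): calling a CPU-bound function directly inside a coroutine would freeze every other coroutine, so `run_in_executor` hands it to a thread pool while the loop keeps scheduling.

```python
import asyncio

def crunch(n: int) -> int:
    # CPU-bound work: blocks whichever thread runs it
    return sum(i * i for i in range(n))

async def heartbeat() -> None:
    # Should tick every 0.1s as long as the event loop is free
    for _ in range(3):
        await asyncio.sleep(0.1)

async def main() -> int:
    loop = asyncio.get_running_loop()
    # Running crunch(1_000_000) inline would stall heartbeat();
    # run_in_executor moves it off the loop's thread instead.
    result, _ = await asyncio.gather(
        loop.run_in_executor(None, crunch, 1_000_000),
        heartbeat(),
    )
    return result

print(asyncio.run(main()))
```

The rule of thumb: `await` only ever yields at I/O points; pure computation between awaits runs uninterrupted, so anything heavy belongs in an executor.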

import asyncio
import time

async def fetch_price(symbol: str) -> tuple:
    # Simulate a network request taking 0.5 to 1.4 seconds
    await asyncio.sleep(0.5 + hash(symbol) % 10 * 0.1)
    return symbol, round(100 + hash(symbol) % 50, 2)

async def main_naive():
    """❌ Looks concurrent, but actually runs sequentially"""
    symbols = ["AAPL", "GOOGL", "MSFT", "AMZN", "META",
               "TSLA", "NVDA", "BABA", "JD", "PDD"]
    tasks = []
    for sym in symbols:
        # Wrong: awaiting here means strictly sequential execution!
        price = await fetch_price(sym)
        tasks.append(price)
    return tasks

async def main_better():
    """✅ gather gives real concurrency"""
    symbols = ["AAPL", "GOOGL", "MSFT", "AMZN", "META",
               "TSLA", "NVDA", "BABA", "JD", "PDD"]
    coros = [fetch_price(sym) for sym in symbols]
    results = await asyncio.gather(*coros)
    return results

start = time.time()
# asyncio.run(main_naive())   # takes ≈ the sum of all request times
asyncio.run(main_better())    # takes ≈ the single slowest request
print(f"Elapsed: {time.time() - start:.1f}s")

I made this mistake right out of the gate. Putting await inside the for loop kills the concurrency: each coroutine must finish before the next one is even created, so the event loop never has more than one thing to schedule. The correct approach is to build a list of coroutine objects and hand them all to gather at once; only then can the loop run them concurrently.

But is gather enough? That’s just the start of the trouble.


The Real Trap: Unbounded Concurrency Crashes Hard

The moment I added gather, my console exploded with activity — and 5 seconds later, everything crashed:

aiohttp.client_exceptions.ClientOSError: [Errno 24] Too many open files

All 100 requests fired almost simultaneously. I blew straight through the OS file descriptor limit, and the remote server slapped back with wave after wave of 429 Too Many Requests. That's when it clicked: concurrency doesn't mean launching everything at once; you have to rein it in.

The solution is asyncio.Semaphore, a coroutine-friendly semaphore:

import asyncio
import aiohttp

MAX_CONCURRENT = 5          # at most 5 requests in flight
semaphore = asyncio.Semaphore(MAX_CONCURRENT)

SYMBOLS = ["AAPL", "GOOGL", "MSFT", "AMZN", "META",
           "TSLA", "NVDA", "BABA", "JD", "PDD"]

async def fetch_with_limit(session, url):
    """Cap concurrency with a semaphore and handle failures"""
    async with semaphore:   # coroutines over the limit wait here
        try:
            async with session.get(
                url, timeout=aiohttp.ClientTimeout(total=10)
            ) as resp:
                resp.raise_for_status()
                return await resp.json()
        except asyncio.TimeoutError:
            print(f"[timeout] {url}")
        except aiohttp.ClientError as e:
            print(f"[request error] {url}: {e}")
        return None

async def main_controlled():
    urls = [f"https://api.example.com/price/{sym}" for sym in SYMBOLS]
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_with_limit(session, u) for u in urls]
        results = await asyncio.gather(*tasks, return_exceptions=True)
    # Drop failed requests
    return [r for r in results if r is not None and not isinstance(r, Exception)]

Two crucial details here:

  1. async with semaphore ensures that at most MAX_CONCURRENT coroutines are running requests at any time; the rest queue up peacefully. This protects both the server and your own connections — no more file descriptor explosions.
  2. return_exceptions=True tells gather not to abort the entire batch when one coroutine raises. Instead, the exception object is placed into the results list for you to handle afterwards. Without it, the first failure makes gather raise immediately, while the other in-flight tasks keep running in the background, unawaited and unobserved.
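One pattern that pairs well with the semaphore when the server does return a 429: retry with exponential backoff instead of giving up. Below is a generic sketch; `fetch` is a hypothetical stand-in for whatever client call you're making, and the delays (0.5s, 1s, 2s plus jitter) are illustrative defaults, not a prescription.

```python
import asyncio
import random

async def fetch_with_retry(fetch, url, retries=3, base_delay=0.5):
    """Retry a flaky async call with exponential backoff and jitter.

    `fetch` is any callable returning an awaitable (hypothetical
    stand-in for a real HTTP client call).
    """
    for attempt in range(retries):
        try:
            return await fetch(url)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the error
            # Back off 0.5s, 1s, 2s, ... with jitter so that a batch
            # of failed coroutines doesn't retry in lockstep
            delay = base_delay * 2 ** attempt + random.uniform(0, 0.1)
            await asyncio.sleep(delay)
```

The jitter matters under high concurrency: without it, every coroutine that hit a 429 at the same moment retries at the same moment, recreating the very burst that got throttled.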

Once I wrapped everything with a semaphore and added proper error handling, the script ran reliably in under 3 minutes, with zero angry HTTP 429s and zero “too many open files” errors. The three-hour detour turned into three lines of safety — sometimes the smallest constraints make the biggest difference.
