Sending AI requests one at a time is painfully slow: 100 prompts × 2 seconds each = 3+ minutes of waiting.
Here's how to send them concurrently with asyncio and process the same 100 prompts in seconds.
The Slow Way (Don't Do This)
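# 100 prompts = 3 minutes of waiting
from openai import OpenAI

client = OpenAI(api_key="mb-your-key", base_url="https://aibridge-api.com/v1")

results = []
for prompt in prompts:  # prompts: your list of prompt strings
    # Each call blocks until the previous response has arrived
    response = client.chat.completions.create(
        model="deepseek-v4-flash",
        messages=[{"role": "user", "content": prompt}]
    )
    results.append(response.choices[0].message.content)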
The Fast Way: Async + Batch
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(
    api_key="mb-your-key",
    base_url="https://aibridge-api.com/v1"
)

async def process_prompt(prompt):
    # One request; awaiting it lets other requests run in the meantime
    response = await client.chat.completions.create(
        model="deepseek-v4-flash",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=200
    )
    return response.choices[0].message.content

async def batch_process(prompts):
    # Start every request at once and wait for all of them to finish
    tasks = [process_prompt(p) for p in prompts]
    return await asyncio.gather(*tasks)

# Process 100 prompts in about 5 seconds
results = asyncio.run(batch_process(prompts))
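One thing to know about `asyncio.gather`: results come back in the same order as the inputs, but by default the first exception propagates to the caller, so a single failed request can cost you the results of the whole batch. Here is a minimal sketch of a more forgiving version. It assumes a `worker` coroutine such as `process_prompt` above; the helper names `gather_in_order` and `split_outcomes` are made up for illustration.

```python
import asyncio


async def gather_in_order(prompts, worker):
    """Run worker(prompt) for every prompt; return (prompt, outcome) pairs.

    An outcome is either the worker's return value or the exception it raised.
    """
    outcomes = await asyncio.gather(
        *(worker(p) for p in prompts),
        return_exceptions=True,  # collect exceptions instead of raising the first one
    )
    # gather preserves input order, so zip lines prompts up with their outcomes
    return list(zip(prompts, outcomes))


def split_outcomes(pairs):
    """Separate successful (prompt, result) pairs from failed (prompt, error) pairs."""
    succeeded = [(p, r) for p, r in pairs if not isinstance(r, BaseException)]
    failed = [(p, r) for p, r in pairs if isinstance(r, BaseException)]
    return succeeded, failed


# Hypothetical usage with the process_prompt coroutine from the article:
# pairs = asyncio.run(gather_in_order(prompts, process_prompt))
# succeeded, failed = split_outcomes(pairs)
```

The failed list gives you exactly the prompts worth retrying, without rerunning the ones that already succeeded.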
The Results
| Method | 100 Prompts | 1000 Prompts |
|---|---|---|
| Sequential | 3 minutes | 30 minutes |
| Async batch | 5 seconds | 45 seconds |
| Speedup | 36x | 40x |
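If you want to sanity-check the shape of these numbers without spending API credits, here is a rough sketch that simulates requests with `asyncio.sleep`. Everything in it is a placeholder: `fake_request`, `measure`, and the 20 ms latency are not part of any API, and real speedups will be capped by rate limits and server-side throughput.

```python
import asyncio
import time


async def fake_request(latency):
    """Stand-in for one API call: it only waits for `latency` seconds."""
    await asyncio.sleep(latency)
    return "ok"


async def run_sequential(n, latency):
    # Each simulated request starts only after the previous one has finished
    return [await fake_request(latency) for _ in range(n)]


async def run_concurrent(n, latency):
    # All simulated requests are in flight at the same time
    return await asyncio.gather(*(fake_request(latency) for _ in range(n)))


def measure(n=50, latency=0.02):
    """Return (sequential_seconds, concurrent_seconds) for n simulated requests."""
    start = time.perf_counter()
    asyncio.run(run_sequential(n, latency))
    sequential = time.perf_counter() - start

    start = time.perf_counter()
    asyncio.run(run_concurrent(n, latency))
    concurrent = time.perf_counter() - start
    return sequential, concurrent


if __name__ == "__main__":
    seq, con = measure()
    print(f"sequential: {seq:.2f}s, concurrent: {con:.2f}s, speedup: {seq / con:.0f}x")
```

Sequential time grows with the number of requests, while concurrent time stays close to the latency of a single request. That is the pattern the table shows.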
Pro Tips
- Use deepseek-v4-flash for batch jobs (fastest + cheapest)
- Add asyncio.Semaphore(10) to limit concurrency (avoid rate limits)
- Add retry logic for failed tasks
- Save intermediate results (in case of crash)
Try It
- Copy the code above
- Get a free API key → aibridge-api.com
- Replace your sequential loop
- Watch your throughput 10x



