I recently built a backend engine to solve a boring but massive problem in e-commerce: Taxonomy Mapping.
The goal was simple: Take a messy CSV of 20,000 products and map them to the official Google Taxonomy IDs using an LLM.
The problem? Rate Limits.
If you try to Promise.all() 2,000 requests to OpenAI, three things happen:
- Memory Spike: Loading a 15MB+ CSV into a variable kills the Node process.
- 429 Errors: OpenAI starts rejecting requests the instant you blow past the Requests Per Minute (RPM) limit.
- Error Collapse: Promise.all fails fast if one request fails, ruining the whole batch.
Here is the architecture I built to process 450+ requests per minute reliably using Node.js Streams and Bottleneck.
1. The Memory Problem (Streams vs. Arrays)
Loading a large CSV into memory is a rookie mistake. I switched to fs.createReadStream combined with csv-parser. This allows us to pipe the data row-by-row, keeping memory usage almost flat regardless of file size.
```javascript
const fs = require('fs');
const csv = require('csv-parser');

const stream = fs.createReadStream(inputFilePath)
  .pipe(csv())
  .on("data", (row) => {
    // Push the job to the limiter (see next section)
    // RAM usage stays constant even with 500MB files
    limiter.schedule(() => processRow(row));
  });
```
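For context, processRow in that snippet is whatever does the actual mapping for one row. Here is a minimal sketch, assuming each result gets appended straight to an output file so nothing accumulates in memory (outputFilePath and the sku column are placeholders, and callOpenAI is the model call covered in the next sections):

```javascript
// Hypothetical sketch: outputFilePath and the "sku" column are assumptions,
// and callOpenAI is the rate-limited LLM call shown later in this post.
const outputStream = fs.createWriteStream(outputFilePath, { flags: 'a' });

async function processRow(row) {
  const taxonomyId = await callOpenAI(row);
  // Write each result immediately instead of buffering 20k rows in memory
  outputStream.write(`${row.sku},${taxonomyId}\n`);
}
```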
2. The Rate Limit Problem (Bottleneck)
This was the hardest part. OpenAI's Tier 1 limits are strict (both Requests Per Day and Requests Per Minute). I needed a queue system that was aware of time, not just concurrency.
I used the bottleneck library to enforce a strict "Speed Limit" that is aware of concurrency.
- Target Speed: ~450 RPM (Requests Per Minute) to stay safe.
- Calculation: 60,000ms / 450 ≈ 133ms delay between requests.
- Concurrency: We allow 10 concurrent requests so we don't lose time waiting for network latency.
```javascript
const Bottleneck = require("bottleneck");

// Configure the limiter
const limiter = new Bottleneck({
  minTime: 133,      // Wait 133ms between launching requests
  maxConcurrent: 10  // Allow 10 active connections to handle latency
});

// Wrap the AI call
const task = limiter.schedule(async () => {
  return await callOpenAI(row);
});
```
3. Handling "Fatal" vs. "Minor" Errors
When processing thousands of rows, you don't want to stop if one row fails (e.g., bad encoding). But you do want to stop if you run out of API Credits or hit a hard daily limit.
We implemented custom error-handling logic: the agent throws errors prefixed with FATAL_, and the queue listener catches them and calls stream.destroy() immediately.
```javascript
// Simplified logic
limiter.schedule(async () => {
  try {
    return await agent(row);
  } catch (e) {
    if (e.message.startsWith("FATAL_")) {
      // Kill the queue immediately so we don't waste retries
      limiter.stop({ dropWaitingJobs: true });
      stream.destroy();
      console.error("Queue killed: " + e.message);
    }
    // Minor errors (a single bad row) fall through so the rest of the batch keeps running
  }
});
```
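The other half of that contract is the agent deciding what counts as fatal. A rough sketch is below; the status and code checks are assumptions about the OpenAI SDK's error shape, so adapt them to whatever your client actually throws:

```javascript
// Sketch only: field names like e.status / e.code depend on your OpenAI SDK version
async function agent(row) {
  try {
    return await callOpenAI(row);
  } catch (e) {
    // Out of credits or a bad API key dooms the whole batch, so escalate
    if (e.status === 401 || e.code === "insufficient_quota") {
      throw new Error("FATAL_" + (e.code || e.status));
    }
    // Anything else (one malformed row, a flaky request) is minor: skip and move on
    console.warn("Skipping row: " + e.message);
    return null;
  }
}
```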
4. Context-Aware Prompting
Even with the architecture fixed, LLMs have a habit of hallucinating IDs. If a product description says "100% Cotton," the model might return 100 as the ID.
We solved this using Negative Constraints and Few-Shot Prompting to force strict integer validation against the 2024 Taxonomy standard.
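For illustration, a stripped-down version of that approach might look like this (the prompt wording, the few-shot example, and the ID are placeholders, not our production prompt):

```javascript
// Placeholder prompt: negative constraints plus a few-shot example forcing bare integer output
const SYSTEM_PROMPT = `
You map product rows to Google Product Taxonomy IDs (2024 version).
Rules:
- Reply with ONLY a numeric taxonomy ID. No prose, no category names.
- Never reuse numbers that appear in the product text (sizes, "100%", pack counts).
- If nothing fits, reply with 0.

Example:
Product: "Men's 100% Cotton Crew Neck T-Shirt" -> 1604
`;

// validIds is a Set built from the official taxonomy file
function validateTaxonomyId(raw, validIds) {
  const id = parseInt(String(raw).trim(), 10);
  // Rejects prose, floats, and hallucinated numbers like the "100" from "100% Cotton"
  return Number.isInteger(id) && validIds.has(id) ? id : null;
}
```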
The Result
We ran a stress test yesterday against a raw dataset of unorganized products:
- Input: 2,000 Unorganized SKUs (15MB CSV).
- Throughput: ~450 RPM (Requests Per Minute).
- Errors: 0 Rate Limit Errors (429s).
- Time: ~4.5 Minutes total.
- Accuracy: 100% Valid Integer IDs (No text hallucinations).
By combining Node.js Streams for memory management and Bottleneck for flow control, we turned a script that crashed at 500 rows into an engine that handles 50k rows effortlessly.
We just launched on Product Hunt!
I wrapped this engine into an API called CatMap.
It's live on Product Hunt today. If you want to test the speed yourself (or try to break it with a messy CSV), we just opened the Public Demo Key.
Check it out here (and I'd love your support!):
CatMap API on Product Hunt
Let me know in the comments if you have questions about the Node.js implementation or the prompting strategy!
