DEV Community

Ross Coundon


Rate limiting API calls - sometimes a Bottleneck is a good thing

What is Bottleneck and why do I need it in my coding life?

If you've spent any time working with 3rd-party APIs, you'll have come up against an issue where you make a tonne of calls to an API and it doesn't finish giving you what you want. You might get a helpful error like 429 - Too Many Requests, or something less helpful like ECONNRESET.

Either way, what's happening is that, as a consumer of that API, you're only allowed to make a certain number of requests in a given period of time, or the number of concurrent requests you're allowed to make is restricted.

In JavaScript your code might look something like this:


const axios = require('axios');

async function getMyData(data){
  const axiosConfig = {
    url: 'https://really.important/api',
    method: 'post',
    data
  }
  return axios(axiosConfig)
}


async function getAllResults(){

  const sourceIds = []

  // Just some code to let us create a big dataset
  const count = 1000000;
  for(let i = 0; i < count; i++){
    sourceIds.push({
      id: i
    });
  }

  // Map over all the results and call our pretend API, stashing the promises in a new array
  const allThePromises = sourceIds.map(item => {
    return getMyData(item);
  })

  try{
    const results = await Promise.all(allThePromises);
    console.log(results);
  }
  catch(err){
    console.log(err);
  }

}

getAllResults()

What's going to happen here is that the code will call the API 1,000,000 times, as fast as possible, and all the requests will take place in a very short space of time (on my MacBook Pro it's < 700ms).

Understandably, some API owners might be a little upset by this as it's creating a heavy load.

What do we need to do?

We need to be able to limit the number of requests we're making, potentially both in terms of the number of API calls in a space of time and in terms of the number of concurrent requests.

I'd encourage you to attempt to roll your own solution as a learning exercise. For example, there is a reasonably simple solution that can get you out of a hole using setInterval. What I think you'll find is that building a reliable solution that limits rate and concurrency is actually trickier than it looks and requires you to build and manage queues. It's even more complicated if you're clustering.
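As a taste of why it's trickier than it looks, here's a minimal hand-rolled sketch along those lines (`makeThrottle` is a made-up helper, not from any library): it drains a queue of calls at a fixed interval, which limits rate, but concurrency limits, retries, and cross-process coordination are all still unsolved.

```javascript
// Naive setInterval-based rate limiter (illustrative only):
// queued calls are drained one per interval
function makeThrottle(intervalMs) {
  const queue = [];
  const timer = setInterval(() => {
    const job = queue.shift();
    if (job) {
      job.fn().then(job.resolve, job.reject);
    } else {
      clearInterval(timer); // naive: stops for good once the queue drains
    }
  }, intervalMs);
  // Returns a promise that settles when the queued call eventually runs
  return fn => new Promise((resolve, reject) => queue.push({ fn, resolve, reject }));
}
```

You'd use it as `const throttled = makeThrottle(200); throttled(() => axios(config));` - and quickly hit the edge cases, such as calls arriving after the queue has already drained.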

We can instead turn to a gem of a package on NPM - Bottleneck
https://www.npmjs.com/package/bottleneck

The author describes this as:

Bottleneck is a lightweight and zero-dependency Task Scheduler and Rate Limiter for Node.js and the browser.

What you do is create a 'limiter' and use it to wrap the function you want to rate limit. You then simply call the limited version instead.

Our code from earlier becomes:


const axios = require('axios');
const Bottleneck = require('bottleneck');

const limiter = new Bottleneck({
  minTime: 200
});

async function getMyData(data){
  const axiosConfig = {
    url: 'https://really.important/api',
    method: 'post',
    data
  }
  return axios(axiosConfig)
}

const throttledGetMyData = limiter.wrap(getMyData);

async function getAllResults(){

  const sourceIds = []

  // Just some code to let us create a big dataset
  const count = 1000000;
  for(let i = 0; i < count; i++){
    sourceIds.push({
      id: i
    });
  }

  // Map over all the results and call our pretend API, stashing the promises in a new array
  const allThePromises = sourceIds.map(item => {
    return throttledGetMyData(item);
  })


  try{
    const results = await Promise.all(allThePromises);
    console.log(results);
  }
  catch(err){
    console.log(err);
  }

}

getAllResults()

As you can see, we've created a limiter with a minTime property. This defines the minimum number of milliseconds that must elapse between requests. We've set it to 200, so we'll make at most 5 requests per second.
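If an API documents its limit as requests per second rather than as a gap between calls, converting it to a minTime is a one-liner (a sketch - `requestsPerSecond` here is an assumed value, not from any particular API):

```javascript
// Derive Bottleneck's minTime from a documented requests-per-second cap
const requestsPerSecond = 5; // assumed limit from the API's docs
const minTime = Math.ceil(1000 / requestsPerSecond); // 200ms between calls
```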

We then wrap our function using the limiter and call the wrapped version instead:


const throttledGetMyData = limiter.wrap(getMyData);
...
  const allThePromises = sourceIds.map(item => {
    return throttledGetMyData(item);
  })

If there's a chance your requests will take longer than the minTime, you can also easily limit the number of concurrent requests by setting up the limiter like this:

const limiter = new Bottleneck({
  minTime: 200,
  maxConcurrent: 1,
});

Here we ensure that only one request is in flight at a time.
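If you'd rather not wrap the function up front, Bottleneck also provides limiter.schedule(), which throttles an individual call - a sketch using the same pretend API endpoint as above:

```javascript
const axios = require('axios');
const Bottleneck = require('bottleneck');

const limiter = new Bottleneck({ minTime: 200, maxConcurrent: 1 });

// schedule() queues a single call through the limiter instead of
// wrapping the whole function with limiter.wrap()
async function getOneResult(data) {
  return limiter.schedule(() =>
    axios({ url: 'https://really.important/api', method: 'post', data })
  );
}
```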

What else can it do?

There are many options for setting up Bottleneck'ed functions. You can rate limit over a period of time using the reservoir options - e.g. send a maximum of 100 requests every 60 seconds. Or, send an initial batch of requests and then subsequent batches every x seconds.
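For example, the "maximum of 100 requests every 60 seconds" case could be configured with the reservoir options along these lines (the numbers are illustrative):

```javascript
const Bottleneck = require('bottleneck');

// Allow a burst of up to 100 jobs, then top the allowance
// back up to 100 every 60 seconds
const limiter = new Bottleneck({
  reservoir: 100,                     // initial number of jobs allowed
  reservoirRefreshAmount: 100,        // value the reservoir resets to...
  reservoirRefreshInterval: 60 * 1000 // ...every 60s (must be a multiple of 250ms)
});
```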

The documentation over at NPM is excellent so I advise you to read it to get a full appreciation of the power of this package, and also the gotchas for when things don't behave as you expect.

Wrapping up

If you're ever in need of a highly flexible package to rate limit your calls to an API, Bottleneck is your friend.

Top comments (19)

Aroh Sunday

I tried the above suggestion to make multiple requests to a remote third-party API to solve the "socket hang up" (ECONNRESET) issue in Express.js (Node.js), but I'm still getting the error. I would appreciate your help. Thanks.

Ross Coundon

That means the other end of the connection closed it for some reason. Can you share your code?

Aroh Sunday • Edited

This is the code I use to simulate the bulk operation

export async function bulkParserSimulator(arrayOfIds) {
  // Fetch CV urls
  // logg.info('Fetching uploaded resume arrays')
  let cvUrl = await getUploadedCVUrl(arrayOfIds);
  let parsedResult = [];
  let save;
  let start = +new Date();
  let parserError;

  // sequential operation
  for (let index = 0, len = cvUrl.length; index < len; index++) {
    const file = cvUrl[index].fileurl;
    const filename = cvUrl[index].filename;
    const jobpositionIdFromArray = cvUrl[index].jobPositionId;
    // Check if cvUrl exist else go to next loop
    if (!file && !filename) continue;

    let singleResult = await throttleApiCall(filename, file);
    console.log({ singleResult });
    const { data } = singleResult;
    if (data.Error) {
      parserError = true;
    }
    if (data.Results) {
      let newparsed = new ParsersCollection(data);
      newparsed.jobPositionId = jobpositionIdFromArray;
      save = await newparsed.save();
      parsedResult.push(save);

      let end = +new Date();
      console.log({ TimeToFinish: end - start + 'ms' });
    }
  }
  // You can return this result then
  let final = { parsedResult, parserError };
  logger.info('Returning the parsed results');
  return final;
}

Then this is the one for bottleneck

import Bottleneck from 'bottleneck';
import { initializeParsingProcess } from './hire';

// Never more than 1 request running at a time,
// and wait at least 333ms between each request
const limiter = new Bottleneck({
  maxConcurrent: 1,
  minTime: 333,
});

export const throttleApiCall = limiter.wrap(initializeParsingProcess);

Ross Coundon

How you're wrapping the function looks fine. If you call initializeParsingProcess() directly, what happens?

Aroh Sunday • Edited

I get the error below when I call it directly. I used Bottleneck to see whether that would solve it after going through your amazing post, but I'm still getting the same error below.

Error: socket hang up
          at createHangUpError (_http_client.js:323:15)
          at TLSSocket.socketOnEnd (_http_client.js:426:23)
          at TLSSocket.emit (events.js:194:15)
          at TLSSocket.EventEmitter.emit (domain.js:441:20)
          at endReadableNT (_stream_readable.js:1125:12)
          at process._tickCallback (internal/process/next_tick.js:63:19)
        code: 'ECONNRESET',
Ross Coundon

Ah, I see, I thought you were suggesting the problem was with your usage of Bottleneck. Can you share the code that you use to call the API?

Aroh Sunday

I used Axios to make a POST request to a remote server. This is the code below:

import axios from 'axios';
import Agent from 'agentkeepalive';
import http from 'http';
import https from 'https';

//keepAlive pools and reuses TCP connections, so it's faster
const keepAliveAgent = new Agent({
  maxSockets: 100, // Maximum number of sockets to allow per host. Defaults to Infinity.
  maxFreeSockets: 10,
  timeout: 60000, // active socket keepalive for 60 seconds
  freeSocketTimeout: 60000, // keep a free socket alive for 60 seconds before closing it
  socketActiveTTL: 1000 * 60 * 10,
});

export const axiosInstance = axios.create({ httpAgent: keepAliveAgent });

After that, I import it into the file below to make the call:

import { axiosInstance} from './axiosInstance';

 let responseBody = await axiosInstance.post(ROOT_URI, prepareFormData(filename, file), {
      headers: form.getHeaders(),
    });
Ross Coundon

Can you try with a raw axios instance, i.e. without the KeepAliveAgent?

Aroh Sunday

I have done that. It was only after some research that I added the KeepAliveAgent to see whether it would solve the problem, but that also proved abortive.

Ross Coundon

How about making a single, non-bottlenecked call to the API? Does that work?

Aroh Sunday • Edited

Making a single call even with bottleneck works perfectly

Ross Coundon

Then I'm guessing there's something weird in the way you're building the URLs or making the requests when there are multiple API calls.

To clean things up, if I were you I'd change the code to use map() on the array of cvUrl, returning a promise for each call. Then await Promise.all() on the result of that map, and then do your parsing.

Put console.logs in each iteration to determine exactly what you're sending, and wrap things in try/catch to see if you can find any more information about what's actually going wrong with the connection.

Aroh Sunday

This is where I tried it with Promise.all but got the same error. Does the code below look like what you suggested above?

const toReadInParalel = async arrayOfIds => {
  let cvUrl = await getUploadedCVUrl(arrayOfIds);
  let parsedResult = [];
  let save;
  let start = +new Date();
  let parserError;
  await Promise.all(
    cvUrl.map(async url => {
      const file = url.fileurl;
      const filename = url.filename;
      const jobpositionIdFromArray = url.jobPositionId;
      console.log({ file, filename, jobpositionIdFromArray });
      if (!file && !filename) return;

      let singleResult = await throttleApiCall(filename, file);
      console.log({ singleResult });
      const { data } = singleResult;
      if (data.Error) {
        parserError = true;
      }
      if (data) {
        let newparsed = new ParsersCollection(data);
        newparsed.jobPositionId = jobpositionIdFromArray;
        save = await newparsed.save();
        parsedResult.push(save);
      }
    }),
  );
};
Aroh Sunday

I need your assistance to get this resolved. Thanks.

Aroh Sunday • Edited

@ross Coundon, I would appreciate your assistance, from the wealth of your experience interacting with third-party APIs, on how I can make concurrent requests to the server without experiencing the "socket hang up" issue. On the first iteration of the loop shown above I get results from the third-party API, but on the second iteration there is a delay in the response from the third party, and hence the error message below:

Trace: { Error: socket hang up
    at createHangUpError (_http_client.js:323:15)
    at TLSSocket.socketOnEnd (_http_client.js:426:23)
    at TLSSocket.emit (events.js:194:15)
    at TLSSocket.EventEmitter.emit (domain.js:441:20)
    at endReadableNT (_stream_readable.js:1125:12)
    at process._tickCallback (internal/process/next_tick.js:63:19)
  code: 'ECONNRESET',
Ross Coundon

Hi - I'm not sure what to suggest. Are you able to share which 3rd-party API it is? Do they provide any documentation/information on acceptable usage - time between requests, number of concurrent requests, etc.?

Aroh Sunday

They don't have that spelled out in their API documentation. I have sent them an email to inquire about the acceptable usage, the time between requests, and the number of concurrent requests.

Paweł Kowalski

Great stuff, thanks for bringing bottleneck to the broader audience.

Raghav Sharma • Edited

Been using it for quite a while now, it's amazing!