Parthipan Natkunam for Coded Parts

Posted on Jun 13

Reading a Paginated API Without Holding the Whole Thing in Memory

#webdev #programming #javascript #node

Your API hands out 50 records at a time across 400 pages. You need all of them. You do not need them all at once.

Here's a situation that shows up constantly on the backend. Some API returns data in pages, 50 or 100 records at a time, and you need to walk every page to sync the records to your database, export them to a file, or run a report. The endpoint gives you a cursor or a page number and you keep asking until there's nothing left.
The way most of us write it the first time looks like this:

async function getAllRecords() {
  const all = [];
  let cursor = 0;
  while (cursor !== null) {
    const { records, nextCursor } = await fetchPage(cursor);
    all.push(...records);
    cursor = nextCursor;
  }
  return all;
}

const everything = await getAllRecords();
for (const record of everything) {
  process(record);
}

It works. At four hundred records it's fine. The trouble starts when the dataset grows, and it has three separate problems hiding in it.

It holds the entire dataset in memory before you touch a single record. It's all or nothing: if page 380 fails, you've thrown away the 19,000 records you already fetched. And it's eager. You can't start processing record one until the very last page has landed, even if all you wanted was the first ten.

There's a shape in JavaScript built for exactly this, and if you read the first two posts in this series you already have both halves of it.

Two ideas you've already seen

In the CSV post, we pulled rows out of a huge file one at a time with a generator, so the file never fully loaded into memory. Lazy. Pull-based. You ask for the next row, you get the next row, nothing more.

In the async/await post, we saw that a generator can pause at a yield and resume later. A generator can hold its place across an asynchronous gap.

Put those together. A generator that pulls data lazily, and can pause to await something between pulls. That's an async generator, and it's the natural tool for walking a paginated API. You pull records one at a time, and behind each pull it quietly fetches the next page only when you've run out of the current one.

The async generator

Here it is. Notice it's async function*, with the star, and that it both awaits and yields.

async function* allRecords() {
  let cursor = 0;
  while (cursor !== null) {
    const { records, nextCursor } = await fetchPage(cursor);
    for (const record of records) {
      yield record;          // hand out one record at a time
    }
    cursor = nextCursor;     // remember where we are for next time
  }
}

Here's what it does. It fetches a page, awaiting it like any async function, then yields each record in that page one by one. When the page runs out, it moves the cursor forward and fetches the next page.
The function pauses at every yield and sits there, holding its cursor, until someone asks for the next record.
You consume it with for await...of, which is a normal for loop that knows how to wait:

for await (const record of allRecords()) {
  process(record);
}

That reads almost exactly like the eager version's final loop. The difference is what's happening underneath. Each turn of this loop might quietly trigger a network fetch, or might just hand you the next record already sitting in the current page. The loop doesn't care. You write straight-line code and the paging disappears.
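To make "a for loop that knows how to wait" concrete, here is a rough sketch of what for await...of does when written by hand. The tiny fetchPage below is a hypothetical stand-in (three pages of two records), not the article's real API, and the sketch leaves out the cleanup call the real loop makes when you exit early.

```javascript
// Hypothetical stand-in for fetchPage: 3 pages of 2 records each.
async function fetchPage(cursor) {
  const records = [{ id: cursor * 2 }, { id: cursor * 2 + 1 }];
  return { records, nextCursor: cursor < 2 ? cursor + 1 : null };
}

async function* allRecords() {
  let cursor = 0;
  while (cursor !== null) {
    const { records, nextCursor } = await fetchPage(cursor);
    for (const record of records) yield record;
    cursor = nextCursor;
  }
}

// Roughly what `for await (const record of allRecords())` does by hand.
async function walkByHand() {
  const iterator = allRecords()[Symbol.asyncIterator]();
  const ids = [];
  while (true) {
    // Each next() returns a promise; it may or may not trigger a fetch.
    const { value, done } = await iterator.next();
    if (done) break;
    ids.push(value.id);
  }
  return ids;
}

walkByHand().then((ids) => console.log(ids)); // logs the ids 0 through 5
```

Every pull is an awaited next() call, so the consumer sets the pace and the generator only runs when asked.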

I ran this against a fake API holding 20,000 records in pages of 50. It read all of them, in order, no gaps, across exactly 400 fetches. Which is the boring, correct result. The interesting result is what happens when you don't want all of them.

The payoff: you can stop

Here's the thing the eager version can never do. Say you only want the first ten records.

const firstTen = [];
for await (const record of allRecords()) {
  firstTen.push(record.id);
  if (firstTen.length === 10) break;
}

With the collect-everything approach, getting ten records still costs you all 400 page fetches, because it loads the whole dataset before you see record one. With the async generator, I counted the fetches:

pages_fetched=1

One fetch. Not 400. When you break, the generator is paused at a yield, and breaking out of the loop means nobody ever asks it for record eleven. So it never runs the loop body again. It never fetches page two. The laziness here is the entire advantage: you only pay, in requests and in compute, for the pages you actually consume.
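If you want to reproduce the fetch counts yourself, a minimal sketch could look like this. The makeFakeApi helper is hypothetical: it treats the cursor as a plain offset and counts calls in a stats object, which is an assumption about how such a fake might be built, not the author's test harness. allRecords takes fetchPage as a parameter here only so the counter stays local.

```javascript
// Hypothetical in-memory API: `total` records served `pageSize` at a time.
// The cursor is just an offset here; a real API's cursor is usually opaque.
function makeFakeApi(total = 20_000, pageSize = 50) {
  const stats = { pagesFetched: 0 };
  async function fetchPage(cursor) {
    stats.pagesFetched++;
    const end = Math.min(cursor + pageSize, total);
    const records = [];
    for (let id = cursor; id < end; id++) records.push({ id });
    return { records, nextCursor: end < total ? end : null };
  }
  return { fetchPage, stats };
}

// Same shape as the article's generator, with fetchPage passed in.
async function* allRecords(fetchPage) {
  let cursor = 0;
  while (cursor !== null) {
    const { records, nextCursor } = await fetchPage(cursor);
    for (const record of records) yield record;
    cursor = nextCursor;
  }
}

async function walkEverything() {
  const { fetchPage, stats } = makeFakeApi();
  let count = 0;
  for await (const record of allRecords(fetchPage)) count++;
  return { count, pagesFetched: stats.pagesFetched };
}

async function takeFirstTen() {
  const { fetchPage, stats } = makeFakeApi();
  const ids = [];
  for await (const record of allRecords(fetchPage)) {
    ids.push(record.id);
    if (ids.length === 10) break; // stop pulling; page two is never requested
  }
  return { ids, pagesFetched: stats.pagesFetched };
}

async function main() {
  console.log(await walkEverything()); // count 20000, pagesFetched 400
  console.log(await takeFirstTen());   // ids 0 through 9, pagesFetched 1
}
main();
```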

What this does to memory

The eager version's real cost is that it keeps every record alive at once. The streaming version holds about one page at a time. To put numbers on the gap, I measured peak heap growth for both, at three dataset sizes, with the same chunky records:

dataset       collect-all peak     stream peak
10,000 rows        4.0 MB             3.7 MB
100,000 rows      36.1 MB            12.4 MB
500,000 rows     161.1 MB            15.8 MB

Collect-all grows with the dataset: ten times the rows, roughly ten times the memory. The streaming version barely moves, because at any moment it's holding one page and one record, not half a million of them. At ten thousand rows the difference hardly matters. At half a million it's the difference between a job that runs and a job that gets killed.
That's the same lesson as the CSV post, now pointed at the network instead of the disk.
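For a rough way to take a similar measurement yourself, here is a sketch that samples process.memoryUsage().heapUsed in Node while a job runs. It is not the harness behind the table above: fakeRecords, the payload shape, and the sampling interval are all assumptions, and garbage collection timing makes the readings noisy, so expect different numbers.

```javascript
// Hypothetical source: `total` records in pages of 50, each with a small payload.
async function* fakeRecords(total) {
  for (let start = 0; start < total; start += 50) {
    await null; // stand-in for the network round trip
    const end = Math.min(start + 50, total);
    for (let id = start; id < end; id++) {
      yield { id, payload: new Array(20).fill(id) };
    }
  }
}

// Run `job`, sampling heapUsed whenever it calls `sample()`, and return the
// largest growth over the starting heap, in MB.
async function peakHeapGrowthMB(job) {
  const baseline = process.memoryUsage().heapUsed;
  let peak = baseline;
  const sample = () => {
    peak = Math.max(peak, process.memoryUsage().heapUsed);
  };
  await job(sample);
  sample();
  return (peak - baseline) / (1024 * 1024);
}

const collectAll = (total) => async (sample) => {
  const all = [];
  for await (const record of fakeRecords(total)) all.push(record);
  sample(); // the whole dataset is still referenced here
};

const stream = (total) => async (sample) => {
  let seen = 0;
  for await (const record of fakeRecords(total)) {
    if (++seen % 1000 === 0) sample(); // earlier records are already garbage
  }
};

async function main() {
  const total = 100_000;
  const eager = await peakHeapGrowthMB(collectAll(total));
  const lazy = await peakHeapGrowthMB(stream(total));
  console.log('collect-all peak MB:', eager.toFixed(1));
  console.log('stream peak MB:', lazy.toFixed(1));
}
main();
```

Running each variant in a fresh process gives cleaner readings, since garbage left over from the first job can skew the baseline of the second.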

It composes, and stays lazy

The eager array has one more hidden tax. Every transform you bolt on, a filter, a map, walks the whole array again and builds another whole array. With async generators you pipe one into the next and the laziness survives the whole chain.

async function* onlyEven(source) {
  for await (const record of source) {
    if (record.id % 2 === 0) yield record;
  }
}

const firstFiveEven = [];
for await (const record of onlyEven(allRecords())) {
  firstFiveEven.push(record.id);
  if (firstFiveEven.length === 5) break;   // first 5 even records, then stop
}

I asked this pipeline for the first five even records and broke. It fetched one page. The filter pulls from allRecords one record at a time, and allRecords fetches one page at a time, and nothing runs ahead of what you've actually consumed. You can stack filters and maps like this and the chain still only does the work you draw out of the end of it.
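Stacking more stages might look like the sketch below. mapAsync and take are hypothetical helpers written for this example, not library functions, and allRecords is replaced by a small counting stand-in (200 records in pages of 50) so the block runs on its own.

```javascript
// Hypothetical source: 200 records in pages of 50, counting fetches in `stats`.
async function* allRecords(stats) {
  for (let start = 0; start < 200; start += 50) {
    stats.pagesFetched++;
    await null; // stand-in for the network round trip
    for (let id = start; id < start + 50; id++) yield { id };
  }
}

async function* onlyEven(source) {
  for await (const record of source) {
    if (record.id % 2 === 0) yield record;
  }
}

// Lazy map: transforms each item as it is pulled, never builds an array.
async function* mapAsync(source, fn) {
  for await (const item of source) yield fn(item);
}

// Lazy take: passes through at most `n` items, then stops pulling.
async function* take(source, n) {
  if (n <= 0) return;
  let taken = 0;
  for await (const item of source) {
    yield item;
    if (++taken >= n) return; // leaving the loop closes the upstream generators
  }
}

async function firstFiveEvenLabels() {
  const stats = { pagesFetched: 0 };
  const evens = onlyEven(allRecords(stats));
  const pipeline = take(mapAsync(evens, (r) => `record-${r.id}`), 5);
  const labels = [];
  for await (const label of pipeline) labels.push(label);
  return { labels, pagesFetched: stats.pagesFetched };
}

firstFiveEvenLabels().then(console.log);
// labels record-0, record-2, record-4, record-6, record-8; pagesFetched 1
```

Putting the limit in a take stage moves the break out of the consumer, and the early return inside it still closes every generator upstream.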

The part that bites people: cleanup

Now the most overlooked gap, because this is where streaming code leaks in production.

Let's say your generator doesn't fetch from a stateless API. Say it opens a database cursor or a file handle and reads from it. If the consumer breaks early, like in the first-ten example, the generator is left paused forever. Does the handle ever close?

It does, if you write it right. When you break out of a for await...of loop, the loop calls .return() on the generator under the hood. That resumes the paused generator just long enough to run any finally block before it shuts down. So you put cleanup in finally and it fires even on early exit:

async function* withCleanup() {
  let i = 0;
  try {
    while (true) yield i++;
  } finally {
    // runs even when the consumer breaks out early
    console.log('finally ran: connection closed');
  }
}

The important point here is: any resource an async generator holds open goes in a try, and its release goes in the matching finally. Skip that and an early break will quietly leak connections and will probably wake you up at 2 AM through PagerDuty alerts.
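Here is a sketch of that rule with a consumer attached, so you can watch the release happen on an early break. openResource is a hypothetical stand-in for a database cursor or file handle, and the log array exists only to record the order of events.

```javascript
// Hypothetical resource standing in for a DB cursor or file handle.
function openResource(log) {
  log.push('opened');
  let i = 0;
  return {
    async readNext() { return i++; },
    async close() { log.push('closed'); },
  };
}

async function* rows(log) {
  const resource = openResource(log);
  try {
    while (true) yield await resource.readNext();
  } finally {
    await resource.close(); // runs on normal end, on throw, and on early break
  }
}

async function firstThree(log) {
  const seen = [];
  for await (const row of rows(log)) {
    seen.push(row);
    if (seen.length === 3) break; // the loop calls .return() on the generator
  }
  return seen;
}

async function main() {
  const log = [];
  const seen = await firstThree(log);
  console.log(seen, log); // rows 0, 1, 2; log shows 'opened' then 'closed'
}
main();
```

One caveat: the automatic cleanup comes from the loop. If you drive the generator by hand with .next() and simply stop calling it, nothing calls .return() for you, so you have to call it yourself.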

Where async generators are the wrong tool

They're built for I/O that arrives in sequence, so they're sequential by default. allRecords fetches page two only after you've finished page one, so you pay the network latency of every page back to back.

If your API can serve pages in parallel and you need throughput more than you need simple code, a plain Promise.all over known page numbers will beat this, and async generators won't parallelize for free.
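One way to get some parallelism back, sketched under the assumption that the API accepts page numbers and you know the page count up front: fetch a small batch of pages with Promise.all, then yield their records in order. fetchPageByNumber and the batch size are hypothetical names and values chosen for illustration.

```javascript
// Hypothetical page-number API: each page holds 50 records.
async function fetchPageByNumber(page) {
  const records = [];
  for (let i = 0; i < 50; i++) records.push({ id: page * 50 + i });
  return records;
}

// Fetch pages in batches of `concurrency`, yielding records in page order.
async function* allRecordsParallel(totalPages, concurrency = 4) {
  for (let start = 0; start < totalPages; start += concurrency) {
    const end = Math.min(start + concurrency, totalPages);
    const pageNumbers = [];
    for (let page = start; page < end; page++) pageNumbers.push(page);
    // All requests in the batch are in flight at once; results keep array order.
    const pages = await Promise.all(pageNumbers.map((p) => fetchPageByNumber(p)));
    for (const records of pages) {
      for (const record of records) yield record;
    }
  }
}

async function collectIds(totalPages, concurrency) {
  const ids = [];
  for await (const record of allRecordsParallel(totalPages, concurrency)) {
    ids.push(record.id);
  }
  return ids;
}

collectIds(8, 4).then((ids) => console.log(ids.length)); // 400
```

The trade is explicit: memory holds up to one batch of pages instead of one page, and Promise.all rejects as soon as any page in the batch fails.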

Error handling is your job. One thrown page ends the loop, same as the eager version. If you want retries or skip-and-continue, you wrap the fetch inside the generator yourself.
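Wrapping the fetch could look like this minimal sketch. withRetry, its attempt count, and its linear backoff are assumptions made for illustration, and makeFlakyApi is a hypothetical fake whose first request for page 1 fails.

```javascript
// Retry an async call a few times with a growing delay before giving up.
async function withRetry(fn, attempts = 3, delayMs = 100) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= attempts) throw err; // out of retries: surface the error
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
}

async function* allRecords(fetchPage, attempts = 3, delayMs = 100) {
  let cursor = 0;
  while (cursor !== null) {
    const current = cursor;
    const page = await withRetry(() => fetchPage(current), attempts, delayMs);
    for (const record of page.records) yield record;
    cursor = page.nextCursor;
  }
}

// Hypothetical flaky API: 3 pages of 2 records; the first request for page 1 fails.
function makeFlakyApi() {
  const stats = { calls: 0 };
  let failedOnce = false;
  async function fetchPage(cursor) {
    stats.calls++;
    if (cursor === 1 && !failedOnce) {
      failedOnce = true;
      throw new Error('simulated network error');
    }
    return {
      records: [{ id: cursor * 2 }, { id: cursor * 2 + 1 }],
      nextCursor: cursor < 2 ? cursor + 1 : null,
    };
  }
  return { fetchPage, stats };
}

async function main() {
  const { fetchPage, stats } = makeFlakyApi();
  const ids = [];
  for await (const record of allRecords(fetchPage, 3, 10)) ids.push(record.id);
  console.log(ids, stats.calls); // ids 0 through 5, 4 calls (3 pages + 1 retry)
}
main();
```

Because the retry sits inside the generator, the consumer's loop never sees the transient failure. Only an error that survives every attempt ends the iteration.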

And per item, pulling through a generator is slower than indexing into an array. For network-bound work that overhead vanishes next to the latency. For a tight CPU-bound loop over data already in memory, reach for the plain array.

The whole arc, in one mental model

Step back and the three posts in this series are the same idea three times:

Pull local data lazily so a file never fully loads.
Pause and await so async code reads like sync code.
And now, pull remote data lazily so an API never fully lands in memory.

Underneath all of it is one trick: a function that can stop in the middle and pick up later when you ask for more.

The protocol that makes for await...of work, the Symbol.asyncIterator it looks for, the way .return() drives that finally cleanup, the patterns for adding controlled parallelism back in: I pulled the whole async iteration layer apart in a short free book on generators. If this series made the mechanism click and you want the full picture in one place, it's here:
Get Your Free Copy

The next time an API hands you data 50 rows at a time, you don't have to choose between holding all of it and writing a tangle of cursor bookkeeping. You write a loop that looks eager and runs lazy. The paging hides itself, and you only ever pay for the pages you actually walk through.

Cheers :)
