Nico Reyes

API pagination broke differently on each endpoint

Started building a dashboard that pulls data from a client's API. Three endpoints, all paginated. Should be simple, right?

Nope.

First endpoint worked fine

Their /users endpoint used offset pagination. Standard stuff. Pass ?offset=0&limit=100, get 100 results, increment offset by 100, repeat until you get fewer than 100 back.

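import requests

def fetch_users(api_key):
    """Fetch every user via offset pagination."""
    offset = 0
    limit = 100
    all_users = []

    while True:
        response = requests.get(
            "https://api.example.com/users",
            params={"offset": offset, "limit": limit},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
        response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
        data = response.json()
        all_users.extend(data)

        # A short (or empty) page means there is nothing left to fetch.
        if len(data) < limit:
            break
        offset += limit

    return all_users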

Worked first try. Got 847 users. Moved on.
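
For a sense of how many requests that loop makes, here's a rough offline simulation. `fake_users_page` and `count_requests` are made-up names standing in for the real endpoint and loop, and the 847 just matches the count above.

```python
def fake_users_page(offset, limit, total=847):
    """Hypothetical stand-in for GET /users: returns a slice of fake user IDs."""
    return list(range(total))[offset:offset + limit]


def count_requests(total, limit=100):
    """Run the same stop-on-short-page loop against the fake endpoint."""
    offset, requests_made, fetched = 0, 0, []
    while True:
        page = fake_users_page(offset, limit, total)
        requests_made += 1
        fetched.extend(page)
        if len(page) < limit:  # short (or empty) page means we hit the end
            break
        offset += limit
    return requests_made, len(fetched)


print(count_requests(847))  # 8 full pages plus a final page of 47
print(count_requests(800))  # exact multiple: one extra request returns an empty page
```

The second call shows the one quirk of this stop condition: when the total is an exact multiple of the limit, the loop needs one extra request to see an empty page before it stops.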

Second endpoint used cursor tokens

Their /orders endpoint didn't use offsets. Used cursor tokens instead. You get a next_cursor in the response and pass it back as cursor in the next request.

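def fetch_orders(api_key):
    """Fetch every order by following next_cursor until it runs out."""
    cursor = None
    all_orders = []

    while True:
        params = {"limit": 100}
        if cursor:
            params["cursor"] = cursor  # omitted on the first request

        response = requests.get(
            "https://api.example.com/orders",
            params=params,
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
        response.raise_for_status()
        data = response.json()

        all_orders.extend(data["orders"])

        # A missing or empty next_cursor marks the last page.
        cursor = data.get("next_cursor")
        if not cursor:
            break

    return all_orders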

Fine. Different pattern but whatever. Documentation mentioned it so I adjusted.

Products endpoint was broken

Their /products endpoint looked like it used offset pagination. Documentation said it did.

Liar.

First 200 products came back fine with ?offset=0&limit=100 and ?offset=100&limit=100.

Then ?offset=200&limit=100 returned duplicate products. Products from offset 150 through 200 plus 50 new ones. Made zero sense.
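
One way to confirm this kind of overlap is to compare the IDs on consecutive pages. A minimal sketch, assuming each product is a dict with an "id" key; `overlapping_ids` is a hypothetical helper and the sample pages are made up to mimic the behavior described here.

```python
def overlapping_ids(previous_page, current_page):
    """Return IDs that appear on both pages, in current_page order."""
    previous_ids = {p["id"] for p in previous_page}
    return [p["id"] for p in current_page if p["id"] in previous_ids]


# Made-up data: page 2 covers IDs 100-199,
# page 3 repeats 150-199 and adds 50 new ones.
page_2 = [{"id": i} for i in range(100, 200)]
page_3 = [{"id": i} for i in range(150, 250)]

repeats = overlapping_ids(page_2, page_3)
print(len(repeats))  # 50 repeated IDs
```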

Thought it was caching. Waited 10 minutes. Same duplicates.

Tried cursor tokens like the orders endpoint. No next_cursor field.

Tried page numbers instead of offsets. Got different products but duplicates scattered throughout.

Emailed support.

"Use offset pagination it works fine."

Cool thanks.

What I ended up doing

Tracked product IDs myself and filtered duplicates:

def fetch_products(api_key):
    """Fetch products via offset pagination, dropping IDs already seen."""
    offset = 0
    limit = 100
    seen_ids = set()
    all_products = []
    empty_responses = 0

    # Stop after 3 consecutive batches with nothing new (empty or all duplicates).
    while empty_responses < 3:
        response = requests.get(
            "https://api.example.com/products",
            params={"offset": offset, "limit": limit},
            headers={"Authorization": f"Bearer {api_key}"},
            timeout=30,
        )
        response.raise_for_status()
        data = response.json()

        found_new = False
        for product in data:
            # Checking one at a time also catches repeats within a single batch.
            if product["id"] not in seen_ids:
                seen_ids.add(product["id"])
                all_products.append(product)
                found_new = True

        empty_responses = 0 if found_new else empty_responses + 1
        offset += limit

    return all_products

Got 1,243 unique products. Filtered out around 180 duplicates.
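
Since the real endpoint can't be trusted, the dedup logic is easier to check against a fake one. The sketch below is an illustration, not the dashboard's code: `dedupe_pages`, `fetch_page`, and `flaky_page` are hypothetical names, and the fake API only imitates the overlap described earlier.

```python
def dedupe_pages(fetch_page, limit=100, max_stale=3):
    """Collect unique items from an offset-paginated source that may repeat items.

    fetch_page(offset, limit) is a hypothetical callable returning a list of
    dicts with an "id" key. Stops after max_stale consecutive pages that
    contain nothing new.
    """
    seen_ids, items = set(), []
    stale, offset, duplicates = 0, 0, 0
    while stale < max_stale:
        added = 0
        for item in fetch_page(offset, limit):
            if item["id"] in seen_ids:
                duplicates += 1
                continue
            seen_ids.add(item["id"])
            items.append(item)
            added += 1
        stale = 0 if added else stale + 1
        offset += limit
    return items, duplicates


def flaky_page(offset, limit):
    """Fake API over 250 items; pages from offset 200 onward start 50 items early."""
    start = offset - 50 if offset >= 200 else offset
    return [{"id": i} for i in range(start, min(start + limit, 250))]


items, duplicates = dedupe_pages(flaky_page)
print(len(items), duplicates)
```

Passing the fetch function in as an argument keeps the loop testable without network access; the real version would wrap the requests.get call.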

Still don't know why their pagination breaks. Dashboard works now tho, so I stopped asking.
