DEV Community

Cover image for How Should You Design API Pagination for Millions of Records?
Wanda
Wanda

Posted on • Originally published at apidog.com

How Should You Design API Pagination for Millions of Records?

TL;DR

For large datasets, prefer cursor-based or keyset pagination over offset-based pagination. Offset pagination (?page=1&limit=20) performs poorly at scale and can lead to inconsistent data. Modern PetstoreAPI uses cursor-based pagination with opaque tokens and HATEOAS links for efficient, consistent results.

Try Apidog today


Introduction

If your API returns a list of pets and your database contains 10 million records, a request like GET /pets?page=500000&limit=20 translates to OFFSET 10000000 LIMIT 20 in SQL. This approach causes the database to scan millions of rows just to return 20, resulting in poor performance and potential timeouts.

This is the offset pagination problem—fine for small datasets, but not scalable. The legacy Swagger Petstore API doesn't address pagination, whereas Modern PetstoreAPI implements scalable, cursor-based pagination.

💡 Tip: If you’re developing or testing REST APIs, Apidog lets you test pagination behavior, validate responses, and verify your API’s performance with large datasets. You can simulate pagination scenarios and test edge cases.

This guide covers why offset pagination fails, how cursor-based pagination works, and practical implementation examples using Modern PetstoreAPI.


Why Offset Pagination Fails at Scale

Offset pagination is common, but has critical drawbacks for large datasets.

How Offset Pagination Works

GET /pets?page=1&limit=20    → OFFSET 0 LIMIT 20
GET /pets?page=2&limit=20    → OFFSET 20 LIMIT 20
GET /pets?page=3&limit=20    → OFFSET 40 LIMIT 20
Enter fullscreen mode Exit fullscreen mode

The database skips offset rows and returns limit rows.

Problem 1: Performance Degrades with Page Number

Page 1:

SELECT * FROM pets OFFSET 0 LIMIT 20;
-- Fast: scans 20 rows
Enter fullscreen mode Exit fullscreen mode

Page 1000:

SELECT * FROM pets OFFSET 20000 LIMIT 20;
-- Slow: scans 20,020 rows, returns 20
Enter fullscreen mode Exit fullscreen mode

Page 500,000:

SELECT * FROM pets OFFSET 10000000 LIMIT 20;
-- Very slow: scans 10,000,020 rows, returns 20
Enter fullscreen mode Exit fullscreen mode

Performance becomes linearly worse as the page number increases because the database must scan all rows up to the offset.

Problem 2: Inconsistent Results

If data changes while paging:

Request 1:

GET /pets?page=1&limit=2
Returns: [Pet A, Pet B]
Enter fullscreen mode Exit fullscreen mode

Someone adds Pet Z (sorts first alphabetically)

Request 2:

GET /pets?page=2&limit=2
Returns: [Pet B, Pet C]  ← Pet B appears twice!
Enter fullscreen mode Exit fullscreen mode

Duplicates or missing records can occur due to inserts or deletions between requests.

Problem 3: Deep Pagination Is Expensive

Allowing queries like ?page=1000000 forces the backend to process expensive queries. This can be abused for denial-of-service attacks.

When Offset Pagination Is Acceptable

Use offset pagination only when:

  • Dataset is small (< 10,000 records)
  • Internal APIs with controlled usage
  • Admin interfaces with shallow paging
  • Data changes infrequently

For public APIs or large datasets, switch to cursor-based pagination.


Cursor-Based Pagination Explained

Cursor-based pagination uses an opaque token to mark position within the results.

How It Works

Request 1:

GET /pets?limit=20
Enter fullscreen mode Exit fullscreen mode

Response 1:

{
  "data": [...],
  "pagination": {
    "nextCursor": "eyJpZCI6IjAxOWI0MTMyLTcwYWEtNzY0Zi1iMzE1LWUyODAzZDg4MmEyNCJ9",
    "hasMore": true
  }
}
Enter fullscreen mode Exit fullscreen mode

Request 2:

GET /pets?cursor=eyJpZCI6IjAxOWI0MTMyLTcwYWEtNzY0Zi1iMzE1LWUyODAzZDg4MmEyNCJ9&limit=20
Enter fullscreen mode Exit fullscreen mode

The client passes the opaque cursor value from the previous response to fetch the next page.

Benefits

1. Consistent Performance

The query uses an index seek, not a scan:

SELECT * FROM pets
WHERE id > '019b4132-70aa-764f-b315-e2803d882a24'
ORDER BY id
LIMIT 20;
Enter fullscreen mode Exit fullscreen mode

Performance is consistent regardless of position in the dataset.

2. Consistent Results

Cursors prevent duplicates and missing records even if data changes between requests.

3. No Deep Pagination Attacks

Clients must page sequentially, limiting resource abuse.

Cursor Format

Cursors are typically base64-encoded JSON payloads:

// Decoded cursor
{
  "id": "019b4132-70aa-764f-b315-e2803d882a24",
  "createdAt": "2026-03-13T10:30:00Z"
}
Enter fullscreen mode Exit fullscreen mode

The cursor includes enough information (such as ID and sort field) to resume pagination.


Keyset Pagination for Sorted Data

Keyset pagination is a specialized form of cursor-based pagination for sorted data.

How It Works

Use the last value from the previous page to fetch the next:

Request 1:

GET /pets?limit=20&sortBy=createdAt
Enter fullscreen mode Exit fullscreen mode

Response 1:

{
  "data": [
    {"id": "...", "createdAt": "2026-03-13T10:00:00Z"},
    ...
    {"id": "...", "createdAt": "2026-03-13T10:30:00Z"}
  ]
}
Enter fullscreen mode Exit fullscreen mode

Request 2:

GET /pets?limit=20&sortBy=createdAt&after=2026-03-13T10:30:00Z
Enter fullscreen mode Exit fullscreen mode

The after parameter uses the last createdAt value from the previous page.

SQL Query

SELECT * FROM pets
WHERE created_at > '2026-03-13T10:30:00Z'
ORDER BY created_at
LIMIT 20;
Enter fullscreen mode Exit fullscreen mode

Efficient for time-series or naturally sorted data.

When to Use Keyset Pagination

  • Data is naturally sorted (timestamps, IDs)
  • Clients understand the pagination key
  • You want transparent, not opaque, pagination

Modern PetstoreAPI documentation defaults to cursor-based but supports keyset pagination for time-series endpoints.


How Modern PetstoreAPI Implements Pagination

Modern PetstoreAPI uses cursor-based pagination with HATEOAS links for all large datasets.

Request Format

GET /pets?limit=20
GET /pets?cursor={token}&limit=20
Enter fullscreen mode Exit fullscreen mode

Parameters:

  • limit — Number of results per page (default: 20, max: 100)
  • cursor — Opaque pagination token from previous response

Response Format

{
  "data": [
    {
      "id": "019b4132-70aa-764f-b315-e2803d882a24",
      "name": "Fluffy",
      "species": "CAT"
    }
  ],
  "pagination": {
    "limit": 20,
    "hasMore": true,
    "nextCursor": "eyJpZCI6IjAxOWI0MTMyLTcwYWEtNzY0Zi1iMzE1LWUyODAzZDg4MmEyNCJ9"
  },
  "links": {
    "self": "https://petstoreapi.com/pets?limit=20",
    "next": "https://petstoreapi.com/pets?cursor=eyJpZCI6IjAxOWI0MTMyLTcwYWEtNzY0Zi1iMzE1LWUyODAzZDg4MmEyNCJ9&limit=20"
  }
}
Enter fullscreen mode Exit fullscreen mode

Key Features

  • Opaque Cursors: Clients do not parse cursor tokens.
  • HATEOAS Links: The links object provides ready-to-use URLs for pagination.
  • hasMore Flag: Indicates if more results are available.
  • Limit Validation: Enforces a maximum limit of 100 per page.

See Modern PetstoreAPI pagination documentation for full details.


Pagination Response Format

PetstoreAPI wraps paginated responses in a consistent structure for extensibility and usability.

Collection Wrapper

{
  "data": [...],
  "pagination": {...},
  "links": {...}
}
Enter fullscreen mode Exit fullscreen mode

Why wrap collections?

  1. Extensibility—Add metadata without breaking clients
  2. Consistency—Uniform structure for all endpoints
  3. HATEOAS—Links for easy navigation

Pagination Metadata

"pagination": {
  "limit": 20,
  "hasMore": true,
  "nextCursor": "...",
  "totalCount": 1000  // Optional, expensive to compute
}
Enter fullscreen mode Exit fullscreen mode

totalCount is optional and typically omitted for large datasets due to computation cost.


Testing Pagination with Apidog

Apidog is a practical tool for testing pagination scenarios and API robustness.

Test Scenarios

1. First Page

GET /pets?limit=20
Expect: 20 results, hasMore=true, nextCursor present
Enter fullscreen mode Exit fullscreen mode

2. Subsequent Pages

GET /pets?cursor={token}&limit=20
Expect: 20 results, hasMore=true/false, nextCursor present/absent
Enter fullscreen mode Exit fullscreen mode

3. Last Page

GET /pets?cursor={lastToken}&limit=20
Expect: <20 results, hasMore=false, no nextCursor
Enter fullscreen mode Exit fullscreen mode

4. Empty Results

GET /pets?status=NONEXISTENT&limit=20
Expect: 0 results, hasMore=false, no nextCursor
Enter fullscreen mode Exit fullscreen mode

5. Limit Validation

GET /pets?limit=1000
Expect: 400 Bad Request (exceeds max limit)
Enter fullscreen mode Exit fullscreen mode

Apidog Test Configuration

// Test: Pagination structure
pm.test("Response has pagination", () => {
  pm.expect(pm.response.json()).to.have.property('pagination');
  pm.expect(pm.response.json().pagination).to.have.property('hasMore');
});

// Test: HATEOAS links
pm.test("Response has links", () => {
  const links = pm.response.json().links;
  pm.expect(links).to.have.property('self');
  if (pm.response.json().pagination.hasMore) {
    pm.expect(links).to.have.property('next');
  }
});
Enter fullscreen mode Exit fullscreen mode

Choosing the Right Pagination Strategy

Select a pagination strategy based on your use case.

Offset Pagination

Use when:

  • Dataset is small (< 10,000 records)
  • Users need random access (e.g., jump to page 50)
  • Data changes infrequently
  • Internal APIs

Avoid when:

  • Large datasets (> 100,000 records)
  • High performance or data consistency is required
  • Data changes frequently

Cursor-Based Pagination

Use when:

  • Large, dynamic datasets
  • Consistent performance is required
  • Sequential access is sufficient

Avoid when:

  • Users require random access to pages
  • Cursor complexity is a concern

Keyset Pagination

Use when:

  • Data is naturally sorted
  • Transparent, simple pagination is desired
  • High performance is needed

Avoid when:

  • Complex or multi-field sorting is required

Recommendation: For public APIs and large datasets, favor cursor-based pagination.


Conclusion

Effective pagination is essential for scalable APIs. Offset pagination is simple but doesn't scale. Cursor-based pagination delivers consistent performance and reliable results, even with millions of records.

Modern PetstoreAPI uses cursor-based pagination with opaque tokens, HATEOAS links, and robust metadata for a scalable and developer-friendly design.

Test your pagination implementation using Apidog to ensure edge cases and validation are properly handled.

Key takeaways:

  • Avoid offset pagination for large datasets
  • Use cursor-based pagination for scalability
  • Wrap responses with metadata and navigation links
  • Test thoroughly with Apidog
  • Follow Modern PetstoreAPI’s proven patterns

FAQ

Why not just return all results without pagination?

Returning millions of records in a single response causes memory issues, slow transfers, and poor user experience. Pagination is essential for large datasets.

Can clients jump to a specific page with cursor pagination?

No, cursor pagination requires sequential access. If random access is necessary, use offset pagination for small datasets or implement filtering/search.

How do I handle pagination with filtering?

Include filter parameters in paginated requests:

GET /pets?status=AVAILABLE&cursor={token}&limit=20

Cursors should encode both position and filter state.

Should I include total count in pagination responses?

Only if required and the dataset is small. Computing total count is expensive for large datasets.

How do I implement cursor pagination in SQL?

Use a WHERE clause with the cursor value and ensure indexing on the sort column:

SELECT * FROM pets WHERE id > ? ORDER BY id LIMIT 20
Enter fullscreen mode Exit fullscreen mode

What if my cursor tokens become invalid?

Return 400 Bad Request with an error message. Cursors can become invalid if data is deleted or the state expires.

How long should cursors remain valid?

Modern PetstoreAPI cursors remain valid as long as the referenced resource exists. Some APIs expire cursors after 24 hours.

Can I use cursor pagination with multiple sort fields?

Yes, but the cursor must encode all sort fields. This increases complexity. Prefer a single composite sort key if possible.

Top comments (0)