Pagination looks trivial when all you need is page=3&per_page=20 in a CRUD screen. It stops being trivial the moment the same dataset starts serving customer search, CSV exports, background sync jobs, and admin tooling with different correctness requirements.
That is when a list endpoint quietly turns into infrastructure.
The problem is not pagination itself. The problem is pretending one pagination strategy can satisfy every consumer equally well. It cannot. Offset pagination, cursor pagination, keyset pagination, snapshot exports, and bulk traversal each solve different problems. If you force one model across all of them, you usually end up with slow queries, duplicate rows, missing rows, broken exports, or admin screens that feel inconsistent under load.
The practical rule is simple: paginate by product need, not by frontend habit.
If a list is customer-facing and needs numbered pages, optimize for navigation clarity. If a job needs to walk millions of rows safely, optimize for traversal stability. If an export must reflect a coherent slice of data, optimize for snapshot semantics. Treating those as the same problem is how “simple pagination” becomes a source of recurring bugs.
The first decision is not page size. It is the consistency model
Most teams start pagination discussions with UI concerns: page count, next/previous links, infinite scroll, visible totals. Those matter, but they are downstream from a more important question:
What kind of correctness does this consumer expect while the dataset is changing?
That question immediately separates your use cases.
Customer browsing usually wants navigability
A customer looking through products, invoices, or posts usually cares about:
- predictable sorting
- reasonable page-to-page movement
- stable enough results for a short session
- visible counts or progress markers
They do not usually need perfect traversal of a mutating dataset. They need a good browsing experience.
Background jobs want traversal safety
A sync worker or batch processor cares about different things:
- never skipping rows
- never reprocessing rows unless the work is idempotent
- surviving inserts and deletes during traversal
- avoiding deep offset scans
That is not a browsing problem. It is a data movement problem.
Exports want snapshot-like behavior
Exports are even stricter. Users usually assume “export the results I am looking at” means a coherent dataset, not a moving target assembled over several minutes while records keep changing underneath it.
Admin tools sit awkwardly in the middle
Admin screens often want both:
- human-friendly navigation
- filters and search
- stable enough views to investigate issues
- the ability to bulk act on rows safely
That mixed requirement is why admin tooling is where weak pagination design gets exposed fastest.
Offset pagination is fine until it becomes your default hammer
Offset pagination is the first thing most teams ship because it is easy to reason about.
SELECT id, name, created_at
FROM users
ORDER BY created_at DESC, id DESC
LIMIT 50 OFFSET 100;
It works well for simple interfaces where users want page numbers, total counts, and arbitrary jumps.
Where offset pagination wins
Offset is still the best fit when you need:
- numbered pages
- direct jumps to page N
- compatibility with common UI table patterns
- relatively small or moderately sized datasets
- simple mental models for internal tools
That is why it stays popular. For many backoffice screens, it is good enough.
Where offset pagination starts failing
The weaknesses show up when the dataset is large or actively changing.
Deep offsets get expensive
Databases still have to walk past earlier rows to reach the requested offset. On large datasets, page 1 is cheap and page 10,000 is not: at 50 rows per page, page 10,000 means the database must produce and discard nearly half a million ordered rows before returning anything.
Changing data causes drift
If new rows are inserted at the top between page requests, offset-based browsing can produce duplicates or gaps.
A user sees rows 1 to 50, moves to the next page, and now sees some overlapping records because the whole result set shifted.
Exports built on offsets are especially fragile
If you implement export by repeatedly calling the same offset-based list endpoint, you are asking for silent inconsistency under concurrent writes.
That is the point many teams miss: offset pagination is a navigation tool, not a reliable dataset traversal strategy.
Use offset where it belongs
Use offset for human navigation when:
- page numbers matter
- absolute traversal correctness does not
- the dataset is not huge
- filters are reasonably selective
Do not stretch it into batch infrastructure just because the endpoint already exists.
Cursor and keyset pagination are better when the list must survive change
Once you care about stable traversal under inserts and deletes, cursor-style pagination becomes the better tool.
In practice, most production-safe cursor pagination is a form of keyset pagination: “give me the next rows after this ordered position.”
SELECT id, created_at, email
FROM users
WHERE (created_at, id) < ('2026-04-24T12:30:00Z', 98421)
ORDER BY created_at DESC, id DESC
LIMIT 50;
This pattern is dramatically more stable than offset because it does not ask the database to skip an arbitrary number of rows. It asks for rows after a known boundary in a stable sort order.
Why keyset pagination survives production better
It has three big strengths:
- It scales better for deep traversal.
- It behaves more predictably while new rows are inserted.
- It maps naturally to APIs and infinite scroll.
If you are building public APIs, activity feeds, large search result sets, or internal tools that may be traversed deeply, cursor-based pagination is usually the better default.
But cursor pagination is not a free upgrade
It has real tradeoffs.
You need a stable sort key
The order must be deterministic. Sorting only by created_at is not enough if multiple rows share the same timestamp. Add a tiebreaker like id.
Arbitrary page jumps become awkward
Cursor pagination is great for “next” and “previous.” It is bad for “jump to page 87.” If your UI truly depends on numbered navigation, forcing cursors into that experience can make the product worse.
Cursors need careful encoding
Do not leak the raw sort values as loose query parameters that clients start depending on. Encode the cursor cleanly as an opaque token, typically base64-encoded JSON of the boundary values.
{
  "data": [...],
  "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNi0wNC0yNFQxMjozMDowMFoiLCJpZCI6OTg0MjF9"
}
That gives you flexibility to evolve internals later without breaking clients.
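A minimal sketch of that encoding in PHP, with hypothetical helper names; the token is just base64-encoded JSON of the keyset boundary, validated on the way back in:

// Hypothetical helpers: the opaque token wraps the keyset boundary values.
function encodeCursor(string $createdAt, int $id): string
{
    return base64_encode(json_encode(['created_at' => $createdAt, 'id' => $id]));
}

function decodeCursor(string $cursor): ?array
{
    $decoded = json_decode(base64_decode($cursor, true) ?: '', true);

    // Treat malformed or tampered cursors as a client error, not a crash.
    if (!is_array($decoded) || !isset($decoded['created_at'], $decoded['id'])) {
        return null;
    }

    return $decoded;
}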
A solid full-stack pattern for search APIs
If a search page supports filters, sorting, and “load more,” cursor pagination is usually the right choice.
Backend response shape:
{
  "items": [
    {"id": 98421, "name": "Aarav", "created_at": "2026-04-24T12:30:00Z"},
    {"id": 98420, "name": "Sara", "created_at": "2026-04-24T12:29:58Z"}
  ],
  "page_info": {
    "has_next_page": true,
    "next_cursor": "eyJjcmVhdGVkX2F0IjoiMjAyNi0wNC0yNFQxMjozMDo1OFoiLCJpZCI6OTg0MjB9"
  }
}
Frontend usage stays simple: keep filters and sort params stable, pass the cursor forward, append results, and reset the cursor when the query changes.
That is a better long-term pattern than pretending infinite scroll is just offset pagination with a nicer UI.
Exports should almost never reuse the live paginated browsing flow
This is one of the most common production mistakes.
A team already has a list endpoint, so they build CSV export by iterating over its pages until no more results remain. It feels efficient because the endpoint already exists.
It is also usually wrong.
Exports have different semantics from browsing.
Why live pagination is a bad export foundation
If the export takes time and rows are changing underneath it, a live page-by-page export can:
- miss rows inserted after earlier pages were read
- duplicate rows when sorting shifts
- export data with mixed timestamps or inconsistent state
- create confusing mismatches between on-screen counts and exported totals
That is not a pagination bug in isolation. It is a contract bug.
Better export patterns
Pattern 1: export from a fixed filter snapshot
At export start, persist the exact filter and sort configuration plus a cutoff boundary.
For example:
- status = active
- created_at <= export_started_at
- sort by id asc
Then run the export job against that frozen definition, not against the evolving UI query.
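A minimal sketch of that freeze step in Laravel, assuming a hypothetical exports table:

// Hypothetical exports table: persist the frozen query definition up front.
$exportId = DB::table('exports')->insertGetId([
    'filters'    => json_encode(['status' => 'active']),
    'sort'       => 'id asc',
    'cutoff_at'  => now(),  // export_started_at: rows created later are out of scope
    'status'     => 'pending',
    'created_at' => now(),
]);

// The worker rebuilds its query from this stored definition,
// never from whatever the UI query currently looks like.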
Pattern 2: export by ID materialization
For stricter correctness, materialize the matching IDs first, then process them in chunks.
INSERT INTO export_items (export_id, record_id)
SELECT :export_id, users.id
FROM users
WHERE status = 'active'
AND created_at <= :snapshot_time;
Then stream the export off export_items in chunked passes.
This costs more upfront, but it gives you a stable export contract and clean retry semantics.
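A sketch of that chunked pass in Laravel, reusing the table names from the SQL above (the CSV writer is hypothetical):

$lastRecordId = 0;

while (true) {
    $chunk = DB::table('export_items')
        ->join('users', 'users.id', '=', 'export_items.record_id')
        ->where('export_items.export_id', $exportId)
        ->where('export_items.record_id', '>', $lastRecordId)
        ->orderBy('export_items.record_id')
        ->limit(1000)
        ->get(['users.id', 'users.email', 'users.created_at']);

    if ($chunk->isEmpty()) {
        break;
    }

    writeCsvRows($chunk); // hypothetical writer; track progress for clean retries
    $lastRecordId = $chunk->last()->id;
}

Because the ID set is frozen, a retried job can resume from the last written record instead of re-deriving the query.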
Pattern 3: export from a replica or warehouse when latency is acceptable
For analytics-heavy or operationally expensive exports, moving the concern away from the transactional app database is often the right call.
The important idea is this: exports are batch jobs with consistency expectations, not just large paginated reads.
Admin tools need dual-mode pagination, not one-size-fits-all purity
Admin systems are where pagination design gets political. People want page numbers, total counts, fast filters, bulk actions, and safe processing across large datasets.
You will not satisfy all of that with one primitive.
The better approach is to separate admin use cases by intent.
Mode 1: human inspection
For analysts, support staff, or operators browsing a filtered table, offset pagination may still be the right answer.
Why? Because admins often want:
- page numbers
- visible totals
- direct page jumps
- familiar data-table behavior
That is a UI problem first.
Mode 2: bulk operations
The moment an admin selects “apply action to all matching records,” you are no longer in simple browsing mode.
Now you need bulk traversal semantics. That usually means:
- snapshotting the matching set
- materializing IDs
- processing in chunks or keyset order
- making the action idempotent
Do not run bulk operations by replaying the visible page structure. The paginated table is just the discovery layer.
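A minimal sketch of that flow in Laravel, assuming the matching set fits in memory; very large sets should go through a materialized table exactly like the export pattern above:

// Freeze the matching IDs first, independent of what any page shows.
$ids = DB::table('users')
    ->where('status', 'active')
    ->where('created_at', '<=', $snapshotTime)
    ->orderBy('id')
    ->pluck('id');

foreach ($ids->chunk(1000) as $chunk) {
    // Idempotent: already-disabled rows fall out of the WHERE, so retries are safe.
    DB::table('users')
        ->whereIn('id', $chunk->all())
        ->where('status', 'active')
        ->update(['status' => 'disabled']);
}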
A clean admin architecture
A strong pattern looks like this:
- GET /admin/users uses offset or cursor pagination for browsing
- POST /admin/users/export creates a snapshot-backed export job
- POST /admin/users/bulk-disable creates a bulk operation from a frozen filter or materialized ID set
That split avoids the classic anti-pattern where the admin table endpoint quietly becomes the source of truth for every downstream workflow.
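In Laravel route terms, that split might look like this (controller names are hypothetical):

use Illuminate\Support\Facades\Route;

Route::get('/admin/users', [AdminUserController::class, 'index']);                          // browsing
Route::post('/admin/users/export', [AdminUserExportController::class, 'store']);            // snapshot-backed job
Route::post('/admin/users/bulk-disable', [AdminUserBulkDisableController::class, 'store']); // frozen bulk op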
Search changes pagination more than most teams expect
Search is where naive pagination contracts start breaking because relevance ranking is not always stable in the same way as relational sorting.
If your search backend is Elasticsearch, Meilisearch, Typesense, or a hybrid database search layer, pagination behavior depends heavily on ranking stability and index refresh timing.
Why search results are trickier
Search datasets can change because of:
- new documents being indexed
- ranking signals changing
- typo tolerance or synonym behavior
- filter changes
- personalization layers
That means “page 2” may not be a fixed slice of reality in the same way as a table sorted by id.
Good pattern: separate search pagination from database pagination
Do not force your application DB pagination assumptions directly onto search results.
If search is the source of ranking truth, paginate within the search engine’s model and then hydrate records from the database as needed.
That often means cursor-like or engine-specific continuation tokens are more correct than page/offset semantics.
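As one concrete example, Elasticsearch's search_after works exactly this way: you pass forward the sort values of the last hit instead of an offset. A minimal sketch assuming the official elasticsearch-php client as $client:

$body = [
    'size'  => 50,
    'query' => ['match' => ['name' => $searchTerm]],
    // Deterministic sort with a tiebreaker, mirroring the keyset rule above.
    'sort'  => [
        ['created_at' => 'desc'],
        ['id' => 'desc'],
    ],
];

if ($searchAfter !== null) {
    // The sort values of the last hit on the previous page act as the cursor.
    $body['search_after'] = $searchAfter;
}

$response = $client->search(['index' => 'users', 'body' => $body]);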
Bad pattern: search IDs first, then re-sort in SQL
Teams sometimes fetch IDs from search, then run a SQL query that reorders the results differently. That breaks pagination consistency immediately.
Pick the source of ordering truth and keep it consistent through the response.
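When SQL is only hydrating rows that search already ranked, pin the rows to the engine's order instead of letting the database re-sort. A sketch assuming MySQL and a $rankedIds array returned by the search engine:

$placeholders = implode(',', array_fill(0, count($rankedIds), '?'));

$rows = DB::table('users')
    ->whereIn('id', $rankedIds)
    // MySQL's FIELD() preserves the search engine's ranking in the result.
    ->orderByRaw("FIELD(id, {$placeholders})", $rankedIds)
    ->get();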
Search plus exports needs an explicit contract
If users can export search results, define what that means:
- export the currently matching results at export start?
- export a capped relevance window?
- export all records matching the current filters, ignoring ranking drift after snapshot?
If that contract is vague, pagination bugs will show up as product confusion.
The safest production design is usually three separate patterns
Most mature systems converge on a split like this, whether they admit it or not.
Pattern 1: browsing pagination
Use offset or cursor depending on the UX.
Best for:
- customer lists
- dashboards
- admin inspection tables
- public APIs with next/previous navigation
Pattern 2: traversal pagination
Use keyset pagination or chunk-by-ID for workers, syncs, and batch jobs.
Best for:
- backfills
- data sync jobs
- email campaign recipient traversal
- background reconciliation
- bulk reprocessing
A simple example in application code:
$lastId = 0;

while (true) {
    // Keyset chunking: always resume strictly after the last processed id.
    $rows = DB::table('orders')
        ->where('id', '>', $lastId)
        ->orderBy('id')
        ->limit(1000)
        ->get();

    if ($rows->isEmpty()) {
        break;
    }

    foreach ($rows as $row) {
        processOrder($row);
        $lastId = $row->id;
    }
}
This is not flashy, but it is far safer than looping over OFFSET across a large, changing table.
Pattern 3: snapshot pagination
Use frozen filters, materialized IDs, or export manifests for workflows that need coherence.
Best for:
- CSV and Excel exports
- compliance reports
- admin bulk actions with audit requirements
- cross-system syncs that must be retryable
These patterns should be different because the guarantees are different.
What to standardize across the stack
Even if you use multiple pagination patterns, you still want consistency in how the stack expresses them.
Standardize response metadata by intent
For browsing endpoints, expose a predictable shape:
- items
- page_info
- total only when it is truly supported and affordable
- next_cursor or page metadata depending on strategy
For batch and export flows, do not pretend they are normal paginated reads. Expose job resources instead:
- job_id
- status
- snapshot_time
- download_url
- processed_count
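An illustrative job resource shape (field values hypothetical):

{
  "job_id": "exp_8f2c41",
  "status": "processing",
  "snapshot_time": "2026-04-24T12:30:00Z",
  "processed_count": 48210,
  "download_url": null
}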
That distinction keeps clients honest.
Standardize sort rules
Every paginated endpoint should have:
- an explicit default sort
- a deterministic tiebreaker
- documented allowed sort fields
- a clear statement of whether pagination is stable under concurrent writes
A shocking number of production bugs come from undocumented sort ambiguity, not from the pagination primitive itself.
Standardize frontend expectations
Frontend teams should know whether an endpoint supports:
- direct page jumps
- infinite scroll
- stable totals
- export of current filters
- background bulk action handoff
If the UI assumes all list endpoints behave alike, backend pagination differences will leak as weird product behavior.
The practical rule of thumb
Pagination is not one problem. It is at least three:
- navigation for humans
- traversal for systems
- snapshotting for exports and bulk workflows
Treating all three as limit + offset is how simple list endpoints become fragile product infrastructure.
If you want a durable production rule, use this one:
Use offset for navigation, keyset for traversal, and snapshots for exports.
You can bend that rule in specific cases, but if your stack has customer search, admin tables, exports, and background jobs all touching the same data, that baseline split will save you from a lot of quiet bugs.
The real maturity move is not finding one pagination pattern that does everything. It is admitting the dataset now serves different consumers with different correctness needs, and designing each path accordingly.
Read the full post on QCode: https://qcode.in/full-stack-pagination-patterns-that-survive-exports-search-and-admin-tools/