---
title: "Stop Using OFFSET for Pagination — It Won't Scale"
published: true
description: "LIMIT/OFFSET pagination degrades linearly with dataset size. Let me walk you through keyset cursor pagination for near-constant performance at any page depth."
tags: postgresql, api, architecture, performance
canonical_url: https://blog.mvpfactory.co/stop-using-offset-for-pagination
---
## What We're Building
By the end of this tutorial, you'll understand exactly why your paginated queries slow down as your dataset grows — and you'll have a working keyset cursor pagination pattern you can drop into any REST or GraphQL API. We'll use PostgreSQL and Kotlin (Ktor), but the principle applies everywhere.
## Prerequisites
- Basic SQL knowledge (SELECT, WHERE, ORDER BY)
- A PostgreSQL database with a non-trivial table (thousands of rows)
- Familiarity with any backend framework (examples use Ktor)
## Step 1: See the Problem With Your Own Eyes
Run these two queries against any table with 100K+ rows and compare the plans:
```sql
-- OFFSET approach (offset 100,000 ≈ page 5,000 at 20 rows per page)
EXPLAIN ANALYZE
SELECT * FROM orders ORDER BY id LIMIT 20 OFFSET 100000;
-- Result: Index Scan using orders_pkey (reads 100,020 rows, discards 100,000)
-- Planning Time: 0.08 ms
-- Execution Time: 112.45 ms
```
```sql
-- Keyset approach (same logical page)
EXPLAIN ANALYZE
SELECT * FROM orders WHERE id > 100000 ORDER BY id LIMIT 20;
-- Result: Index Scan using orders_pkey (reads 20 rows)
-- Planning Time: 0.07 ms
-- Execution Time: 0.12 ms
```
That's ~900x faster at this depth. With OFFSET, PostgreSQL reads 100,020 rows, discards 100,000, and returns 20. The keyset query seeks directly to the starting point in the index, then reads exactly 20 rows: O(log n + k), where k is your page size, no matter how deep you page.
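The cost asymmetry is easy to model outside the database. Here's a toy Kotlin sketch (a sorted in-memory list standing in for the `orders_pkey` index — this counts entries touched, it is not how PostgreSQL executes SQL) that produces the same page both ways:

```kotlin
// A sorted "index" of 200K ids
val ids = (1..200_000).toList()

// OFFSET: walk the index from the start, discard `offset` entries, keep `limit`
fun offsetPage(offset: Int, limit: Int): Pair<List<Int>, Int> {
    var touched = 0
    val out = mutableListOf<Int>()
    for (id in ids) {
        touched++
        if (touched > offset) out += id
        if (out.size == limit) break
    }
    return out to touched
}

// Keyset: binary-search past the cursor (the index seek), then read `limit` entries
fun keysetPage(afterId: Int, limit: Int): Pair<List<Int>, Int> {
    val pos = ids.binarySearch(afterId).let { if (it >= 0) it + 1 else -(it + 1) }
    val out = ids.subList(pos, minOf(pos + limit, ids.size)).toList()
    return out to out.size  // rows touched beyond the O(log n) seek
}

fun main() {
    val (oRows, oTouched) = offsetPage(100_000, 20)
    val (kRows, kTouched) = keysetPage(100_000, 20)
    check(oRows == kRows)  // same logical page
    println("OFFSET touched $oTouched entries; keyset touched $kTouched")
    // → OFFSET touched 100020 entries; keyset touched 20
}
```

Same 20 rows out, three orders of magnitude difference in work, and the gap widens with every page.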
## Step 2: Build a Cursor-Based API Response
Your API returns an opaque `next_cursor` that the client passes back on the next request:
```json
{
  "data": [...],
  "pagination": {
    "next_cursor": "eyJpZCI6MTAwMDIwfQ==",
    "has_more": true
  }
}
```
The cursor is a Base64-encoded representation of the last row's sort key. Here's the server side in Ktor:
```kotlin
get("/orders") {
    // Decode the opaque cursor back into its sort-key payload
    val cursor = call.parameters["cursor"]
        ?.let { String(Base64.getUrlDecoder().decode(it)) }
        ?.let { Json.decodeFromString<Cursor>(it) }
    val orders = db.orders  // your query DSL; shown with a Ktorm-style API
        .run { if (cursor != null) where { id greater cursor.lastId } else this }
        .orderBy(Orders.id)
        .limit(20)
        .toList()
    // Encode the last row's sort key as the next cursor
    val nextCursor = orders.lastOrNull()?.let {
        Base64.getUrlEncoder().encodeToString(
            Json.encodeToString(Cursor(lastId = it.id)).toByteArray()
        )
    }
    call.respond(PagedResponse(orders, nextCursor))
}
```
This is a pattern I use in every project, and Spring Boot (or any other framework) follows the same shape: decode the cursor, apply a `WHERE` clause, encode the next cursor from the last result.
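The encode/decode halves are worth seeing in isolation. This is a minimal cursor codec sketch using only the JDK — the handler above assumes a JSON library, so here the one-field JSON body is built and parsed by hand (the `Cursor` type mirrors the handler's):

```kotlin
import java.util.Base64

data class Cursor(val lastId: Long)

// Serialize the sort key as JSON, then Base64 it into an opaque token
fun encodeCursor(c: Cursor): String =
    Base64.getEncoder().encodeToString("""{"id":${c.lastId}}""".toByteArray())

// Reverse the process; reject anything that doesn't parse
fun decodeCursor(token: String): Cursor {
    val json = String(Base64.getDecoder().decode(token))
    val id = Regex(""""id"\s*:\s*(\d+)""").find(json)
        ?.groupValues?.get(1)?.toLongOrNull()
        ?: throw IllegalArgumentException("malformed cursor")
    return Cursor(lastId = id)
}

fun main() {
    val token = encodeCursor(Cursor(lastId = 100_020))
    println(token)  // eyJpZCI6MTAwMDIwfQ==
    check(decodeCursor(token) == Cursor(lastId = 100_020))
}
```

Note the round-trip produces exactly the example token from the JSON response above. In a real query string you'd prefer `Base64.getUrlEncoder().withoutPadding()` so the token never contains `+`, `/`, or `=`.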
## Step 3: Handle Non-Unique Sort Columns
When sorting by a non-unique column like `created_at`, you need a composite cursor with a unique tiebreaker. Most pagination tutorials skip this step, but without it, rows with identical timestamps get silently skipped or duplicated at page boundaries.
```sql
SELECT * FROM orders
WHERE (created_at, id) > (:last_timestamp, :last_id)
ORDER BY created_at ASC, id ASC
LIMIT 20;
```
This tuple comparison leverages PostgreSQL's row-value comparison and can use a composite index on `(created_at, id)` efficiently.
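To see the tiebreaker earn its keep, here's a toy in-memory sketch of the same tuple comparison under the worst case: every row sharing one timestamp. The filter predicate is exactly what the SQL row-value comparison expands to.

```kotlin
data class Row(val createdAt: Long, val id: Int)

// (created_at, id) > (after.createdAt, after.id), spelled out
fun page(rows: List<Row>, after: Row?, limit: Int): List<Row> =
    rows.filter { after == null ||
            it.createdAt > after.createdAt ||
            (it.createdAt == after.createdAt && it.id > after.id) }
        .take(limit)

fun main() {
    // Five rows, identical timestamps — sorting on created_at alone can't order them
    val rows = (1..5).map { Row(createdAt = 1_700_000_000L, id = it) }
    val seen = mutableListOf<Row>()
    var cursor: Row? = null
    do {
        val p = page(rows, cursor, 2)
        seen += p
        cursor = p.lastOrNull()
    } while (p.isNotEmpty())
    check(seen == rows)  // every row visited exactly once: no skips, no duplicates
}
```

With only `createdAt > after.createdAt` as the predicate, the second page would exclude every row (all timestamps are equal) and the loop would silently drop rows 3 through 5.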
## Step 4: GraphQL — You're Already Set Up
If you're building a GraphQL API, the Relay Connection specification was built for exactly this:
```graphql
query {
  orders(first: 20, after: "eyJpZCI6MTAwMDIwfQ==") {
    edges {
      node { id total createdAt }
      cursor
    }
    pageInfo { hasNextPage endCursor }
  }
}
```
## Gotchas
Four gotchas that will save you hours:
- **No arbitrary page jumping.** Keyset cursors don't support "jump to page 50." If you need numbered pages on large datasets, use a hybrid: keyset pagination under the hood with a separate cached count query for the page count display.
- **Always include a unique tiebreaker.** Sorting by `created_at` alone will silently skip or duplicate rows when timestamps collide. Always add `id` as a secondary sort.
- **OFFSET is fine for small datasets.** Under ~10,000 rows, users never paging past the first few pages, admin dashboards with modest data — OFFSET works. The problem is when teams pick OFFSET as a default and never revisit the decision as data grows.
- **Duplicate/skipped rows with OFFSET.** When rows are inserted while a user pages forward, rows shift between pages. Cursors are stable — they point to a fixed position in the sort order.
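That last gotcha is also easy to reproduce in miniature. This Kotlin sketch simulates a feed where a row is inserted between two page fetches: the OFFSET pages shift, while the keyset cursor keeps its position in the sort order.

```kotlin
// Two paging strategies over a toy "table" of ids
fun offsetPage(ids: List<Int>, pageIx: Int, size: Int): List<Int> =
    ids.sorted().drop(pageIx * size).take(size)

fun keysetPage(ids: List<Int>, afterId: Int?, size: Int): List<Int> =
    ids.sorted().filter { afterId == null || it > afterId }.take(size)

fun main() {
    val table = mutableListOf(1, 2, 3, 4, 5, 6)

    // The client fetches its first page of three rows...
    check(offsetPage(table, 0, 3) == listOf(1, 2, 3))
    val cursorAfterFirst = keysetPage(table, null, 3).last()  // 3

    // ...then a new row that sorts before the others is inserted
    table.add(0)

    // OFFSET page 2 re-serves id 3; the keyset page picks up cleanly after it
    check(offsetPage(table, 1, 3) == listOf(3, 4, 5))
    check(keysetPage(table, cursorAfterFirst, 3) == listOf(4, 5, 6))
}
```

The mirror-image failure happens on deletion: a row removed from an earlier page makes OFFSET skip one the client never sees.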
## Conclusion
Run `EXPLAIN ANALYZE` on your paginated queries at realistic offsets — page 100, 500, 1000. If execution time grows linearly with the page number, you've got a problem that will only get worse. For any API-facing or feed-style pagination, default to keyset cursors. Encode composite cursors from day one so you handle the non-unique sort column case before it bites you.