Assumptions
- Firestore is used on your API server
- We want to use cursor-based pagination instead of offset-based pagination in the API server
- The client sends the cursor string to the API server, which fetches the next page and returns it
What I want to do
- I want to modularize the pagination process using Firestore as a data source in the API server.
- The module takes a cursor, a limit, and a Firestore Query such as
firestore().collection('posts').where(...).orderBy(...)
- At a minimum, it returns the following:
  - An array of the retrieved documents
  - Whether there is a next page (hasNextPage)
  - The cursor of the last document
Difficulties
startAfter/startAt, which Firestore uses for pagination, accept either "the value of the field specified by orderBy" or "a snapshot of the document".
In the former case, the type of the cursor depends on which field orderBy specifies. In TypeScript, it can be string, number, firestore.Timestamp, etc., but a string is more practical for returning the cursor to the API client (the cursor in GraphQL's Relay-style pagination is a string). A shared pagination module would then need per-query logic along the lines of "the cursor for this query is of type XX, so it must be converted like this", which gets complicated.
In the latter case (document snapshot), the snapshot is an object and cannot be converted directly into a string cursor. So I came up with the idea of using the document's path as the cursor.
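To make the first difficulty concrete, here is a minimal sketch of what a field-value cursor would need in order to round-trip through a string. The FieldCursor type and the serializeFieldCursor/parseFieldCursor helpers are hypothetical names used only for illustration, and a timestamp is modelled as epoch milliseconds rather than a real firestore.Timestamp.

```typescript
// Hypothetical tagged representation of an orderBy field value.
// A timestamp is modelled as epoch milliseconds, not a real firestore.Timestamp.
type FieldCursor =
  | { kind: 'string'; value: string }
  | { kind: 'number'; value: number }
  | { kind: 'timestamp'; millis: number }

// Serialize the value together with its type tag so it can be restored later.
const serializeFieldCursor = (cursor: FieldCursor): string => JSON.stringify(cursor)

// Parse the string back and validate the tag; every supported type needs its own branch.
const parseFieldCursor = (raw: string): FieldCursor => {
  const parsed = JSON.parse(raw) as { kind?: unknown; value?: unknown; millis?: unknown }
  switch (parsed.kind) {
    case 'string':
      if (typeof parsed.value === 'string') return { kind: 'string', value: parsed.value }
      break
    case 'number':
      if (typeof parsed.value === 'number') return { kind: 'number', value: parsed.value }
      break
    case 'timestamp':
      if (typeof parsed.millis === 'number') return { kind: 'timestamp', millis: parsed.millis }
      break
  }
  throw new Error(`Unsupported cursor: ${raw}`)
}
```

Each new orderBy field type adds another branch here, which is the bookkeeping the path-based cursor avoids.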
How to do it
The following is an example of TypeScript code (roughly equivalent to GraphQL's Relay style cursor pagination).
import { firestore } from 'firebase-admin'

// Base64-encode the snapshot's document path
const encodeCursor = (snapshot: firestore.DocumentSnapshot | firestore.QueryDocumentSnapshot) => {
  return Buffer.from(snapshot.ref.path).toString('base64')
}

// Decode a cursor back into a document path
const decodeCursor = (cursor: string) => {
  return Buffer.from(cursor, 'base64').toString('utf8')
}

type Connection = {
  nodes: { id: string }[]
  pageInfo: {
    hasNextPage: boolean
    endCursor?: string | null
  }
}

export const paginateFirestore = async (query: firestore.Query, limit: number, cursor?: string | null): Promise<Connection> => {
  // Get one more item than requested to determine hasNextPage
  let q = query.limit(limit + 1)
  if (cursor) {
    // If a cursor is passed, convert it to a path and get a snapshot of the document
    const path = decodeCursor(cursor)
    const snap = await firestore().doc(path).get()
    if (!snap.exists) {
      // The cursor document no longer exists, so return an empty page
      return { nodes: [], pageInfo: { hasNextPage: false } }
    }
    // Pass the snapshot to startAfter
    q = q.startAfter(snap)
  }
  const snapshot = await q.get()
  const hasNextPage = snapshot.size > limit
  // Drop the extra item before returning
  const docs = snapshot.docs.slice(0, limit)
  // Make the path of the last document the cursor
  const endCursor = hasNextPage ? encodeCursor(docs[docs.length - 1]) : null
  return {
    nodes: docs.map(doc => ({ id: doc.id, ...doc.data() })),
    pageInfo: {
      hasNextPage,
      endCursor,
    },
  }
}
The usage is as follows.
const query = firestore().collection('posts').orderBy('createdAt', 'desc')
const connection = await paginateFirestore(query, 100, args.cursor)
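To see how hasNextPage and endCursor behave across pages without a Firestore instance, the same "fetch limit + 1" logic can be sketched over an in-memory array. This is only an illustration under stated assumptions: paginateArray, collectAll, and the Item/Page types are hypothetical stand-ins, the input array is assumed to be already sorted, and the document ID itself plays the role of the cursor with no base64 step.

```typescript
type Item = { id: string }

type Page<T> = {
  nodes: T[]
  pageInfo: { hasNextPage: boolean; endCursor: string | null }
}

// In-memory stand-in for paginateFirestore: `items` is assumed to be already sorted.
const paginateArray = <T extends Item>(items: T[], limit: number, cursor?: string | null): Page<T> => {
  let start = 0
  if (cursor) {
    const index = items.findIndex(item => item.id === cursor)
    // Unknown cursor: mirror the article's behaviour and return an empty page.
    if (index === -1) return { nodes: [], pageInfo: { hasNextPage: false, endCursor: null } }
    start = index + 1 // plays the role of startAfter
  }
  // Take one more item than requested to learn whether another page exists.
  const fetched = items.slice(start, start + limit + 1)
  const hasNextPage = fetched.length > limit
  const nodes = fetched.slice(0, limit)
  const last = nodes[nodes.length - 1]
  const endCursor = hasNextPage && last ? last.id : null
  return { nodes, pageInfo: { hasNextPage, endCursor } }
}

// Walk every page by feeding each endCursor back in.
const collectAll = <T extends Item>(items: T[], limit: number): T[][] => {
  const pages: T[][] = []
  let cursor: string | null = null
  do {
    const page: Page<T> = paginateArray(items, limit, cursor)
    pages.push(page.nodes)
    cursor = page.pageInfo.endCursor
  } while (cursor)
  return pages
}
```

A client loop against the real API would look like collectAll: keep requesting with the returned endCursor until it comes back null.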
Advantages and disadvantages
Advantages
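- Simplifies the process because the cursor is always a document path, whatever the orderBy field is.
- Clients only need to pass back the generated cursor.
- Pagination by snapshot is more accurate than pagination by orderBy field, because a snapshot identifies a single document even when several documents share the same field value.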
Disadvantages
- Overhead from getting one extra item.
What do you think?
I figured the overhead of retrieving one extra item would not have a big impact over gRPC, so I implemented it this way, prioritizing the simplicity of the code. If you have any thoughts on this, I'd love to hear your feedback!
Top comments (3)
With this approach, does it mean that you have to fetch twice for every page? Two round trips: one to fetch the cursor document and another to fetch the actual page?
Meaning this would always incur an extra read, I guess.
Sorry for the late reply. You are completely right.
Unfortunately, the approach will not work if the last document of the page gets deleted before requesting the next page since it cannot get the document snapshot of a deleted path.
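One possible way around both points raised in these comments (the extra read and the deleted cursor document) is to put the sort key of the last document into the cursor instead of its path. The sketch below is not the article's method and runs over an in-memory array only. The Post and ValueCursor types and the encodeValueCursor, decodeValueCursor, isAfter, and pageAfter helpers are hypothetical names, createdAt is modelled as a plain number, and the ID is used as a tiebreaker. With Firestore, the analogous query would presumably add a second ordering on the document ID and pass both values to startAfter, but that is an assumption and is not shown here.

```typescript
type Post = { id: string; createdAt: number }

// Hypothetical cursor: the sort key of the last returned item, not a document path.
type ValueCursor = { createdAt: number; id: string }

const encodeValueCursor = (post: Post): string =>
  JSON.stringify({ createdAt: post.createdAt, id: post.id })

const decodeValueCursor = (raw: string): ValueCursor => JSON.parse(raw) as ValueCursor

// True when `post` sorts strictly after the cursor (createdAt descending, then id ascending).
const isAfter = (post: Post, cursor: ValueCursor): boolean =>
  post.createdAt < cursor.createdAt ||
  (post.createdAt === cursor.createdAt && post.id > cursor.id)

// `posts` is assumed to be sorted the same way; the cursor document is never looked up,
// so the page can still be computed after that document has been deleted.
const pageAfter = (posts: Post[], limit: number, raw?: string | null): Post[] => {
  const cursor = raw ? decodeValueCursor(raw) : null
  const remaining = cursor ? posts.filter(post => isAfter(post, cursor)) : posts
  return remaining.slice(0, limit)
}
```

The trade-off is the one described under Difficulties: the cursor now depends on the type of the orderBy field, so the module is less generic.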