Working for PDFTron, I’ve had the pleasure of building various APIs using GraphQL. One need that persists across our projects is retrieving data one page at a time, so that we avoid transferring massive amounts of data in a single response.

In this blog, we discuss two patterns you can use for your API -- offset-based and cursor-based pagination. We also look at the pros and cons of each paging method, and in our conclusion, summarize where each method is preferable.

Two Types of Pagination: Offset and Cursor

Pagination, or paging, is an important concept for retrieving list data from a back end. One example of pagination would be showing the ten most recent users who signed up in a table, and then letting the viewer click "next page" to display the ten next-oldest signups (and so on).

There are multiple ways to achieve pagination, but most APIs use one of two patterns:

Offset-based pagination

  • Simpler
  • Easier to implement
  • Less robust, especially with rapidly changing data

Cursor-based pagination

  • More complex
  • More robust, with fewer repeated items when paging
  • Supports bi-directional pagination
  • Provides valuable fields to improve UX ('totalCount', 'hasNextPage', 'hasPreviousPage')

Let’s take a closer look at each pattern.

Offset-Based Pagination

image of offset-based pagination

Offset-based pagination (also known as ‘skip-based’) generally accepts two parameters: 'limit' (or 'first') to set the max number of results, and 'offset' (or 'skip') to set how many results to skip past.

As defined in GraphQL, offset-based pagination is quite simple:

type User {
  id: ID!
}

type Query {
  signedUpUsers(limit: Int, offset: Int): [User!]!
}

As you can see, to add pagination, all you have to do is add the arguments 'limit' and 'offset' to the field 'signedUpUsers'. To get the third page of results in a ten-row table, you would do this:

signedUpUsers(limit: 10, offset: 20) {
  id
}

Pros

  • Very simple to implement.
  • Most SQL databases support 'OFFSET' and 'LIMIT', so it's easy to map the values into SQL or an ORM
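
To make the SQL mapping concrete, here is a minimal Python sketch using the standard-library 'sqlite3' module. The 'users' table, its 'signed_up_at' column, and the 'signed_up_users' function are assumptions made up for illustration; they are not part of any schema described in this post.

```python
import sqlite3

def signed_up_users(conn, limit=10, offset=0):
    """Return one page of users, newest signup first."""
    rows = conn.execute(
        # A stable ORDER BY matters: without it, pages are not well defined.
        "SELECT id FROM users ORDER BY signed_up_at DESC, id DESC LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return [{"id": str(row[0])} for row in rows]

# Hypothetical in-memory table with 25 users; a larger id means a later signup.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, signed_up_at INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(i, i) for i in range(1, 26)])

# Third page of a ten-row table: skip 20 rows, take up to 10.
page_three = signed_up_users(conn, limit=10, offset=20)
print([user["id"] for user in page_three])
```

With only 25 rows in the table, the third page holds the five oldest users rather than a full ten.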

Cons

First, with offset, you will notice repeated data in certain situations, for example when a new row is added while you are paging through the data. This is because you aren’t paging relative to any specific row; instead, you are just skipping ‘n’ rows and then taking up to ‘m’ rows.

image of an offset-based pagination issue

Therefore, if a single new row is added while you look at a three-item page 1 ('offset=0', 'limit=3'), the first item on page 2 will be the same as the last item on page 1. This is because all items are shifted back by one to accommodate the new row.
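
The shift described above can be reproduced with a plain Python list. This is only a simulation under the assumption that results are sorted newest first; the single-letter user names and the 'offset_page' helper are hypothetical.

```python
def offset_page(rows, limit, offset):
    """Offset pagination over a list that is already sorted newest first."""
    return rows[offset:offset + limit]

users = ["E", "D", "C", "B", "A"]   # newest signup first
page_1 = offset_page(users, limit=3, offset=0)

users.insert(0, "F")                # a new user signs up between the two requests
page_2 = offset_page(users, limit=3, offset=3)

print(page_1)  # ['E', 'D', 'C']
print(page_2)  # ['C', 'B', 'A']  ('C' appears on both pages)
```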

Other cons of offset, at least with the simple schema shown above, include:

  • No way to retrieve the last page without already knowing the total number of items (this is a commonly desired feature for table views of data)
  • No way to know if there are more pages (for disabling the next page button)
  • No way to know the total number of items

Cursor-Based Pagination

image of cursor-based pagination

Cursor-based pagination comes in many forms, but I will implement Relay's GraphQL Cursor Connections Specification. Cursor-based pagination is more verbose than offset, but it provides richer functionality and is used by most major GraphQL APIs (e.g., GitHub).

Cursor-based pagination accepts two arguments for forward pagination and two for reverse pagination. For forward pagination, 'first' limits how many items are returned and 'after' provides the cursor to start after. For reverse pagination, 'last' sets the limit and 'before' provides the cursor to end before. We provide some examples of how to use these later on.

The key concepts of cursor-based pagination are:

  1. Define a PageInfo type, which contains info on whether previous and next pages exist, as well as the cursor of the first result ('startCursor') and the last result ('endCursor').

  2. Wrap each data type (in this case 'User') with an Edge to attach a cursor to each user. (The cursor is generally an opaque value which could be generated from the user's ID -- this way you can keep 'User' as pure data with no pagination information.)

  3. Combine all the edges with the page info and a total item count in a Connection type, and return it. ('totalCount' is a common addition rather than a field required by the Relay specification.)
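
As a rough illustration of step 2, an opaque cursor could be produced by base64-encoding the user's ID. This is one possible encoding, not something the Relay specification mandates; the 'user:' prefix and the helper names below are assumptions for the sketch.

```python
import base64

def encode_cursor(user_id):
    """Turn a user id into an opaque, URL-safe string."""
    raw = f"user:{user_id}".encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_cursor(cursor):
    """Recover the user id from a cursor; raises ValueError if it is malformed."""
    raw = base64.urlsafe_b64decode(cursor.encode("ascii")).decode("utf-8")
    prefix, _, user_id = raw.partition(":")
    if prefix != "user" or not user_id:
        raise ValueError("invalid cursor")
    return user_id

def to_edge(user):
    """Wrap a user in an edge so the User itself carries no pagination data."""
    return {"node": user, "cursor": encode_cursor(user["id"])}

edge = to_edge({"id": "42"})
print(edge["cursor"])                 # opaque to the client
print(decode_cursor(edge["cursor"]))  # the server can map it back to the id
```

Clients should treat the cursor as an opaque token and only ever pass it back unchanged; that leaves the server free to change the encoding later.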

Here is the same example from our Offset-based sample, but implemented with Cursor-based pagination:

type PageInfo {
  hasNextPage: Boolean!
  hasPreviousPage: Boolean!
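  startCursor: String # nullable: there is no cursor when the page is empty
  endCursor: String   # nullable: there is no cursor when the page is empty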
}

type User {
  id: ID!
}

type UserEdge {
  node: User!
  cursor: String!
}

type UserConnection {
  totalCount: Int!
  edges: [UserEdge!]!
  pageInfo: PageInfo!
}

type Query {
  signedUpUsers(first: Int, after: String, last: Int, before: String): UserConnection!
}

This pattern aligns more closely with how a client actually pages through results. Your initial call will look something like this:

signedUpUsers(first: 10) {
  totalCount
  edges {
    cursor          # can omit this to save bandwidth
    node {
      id            # node contains the User values
    }
  }
  pageInfo {
    hasNextPage     # if false, can disable next page button
    hasPreviousPage # if false, can disable prev page button
    startCursor     # used for getting prev page
    endCursor       # used for getting next page
  }
}

Then, use 'endCursor' (e.g., '"abc123def456"') to get the next page:

signedUpUsers(first: 10, after: "abc123def456") {
  # ...
}

Or, use 'startCursor' (e.g., '"123abc456def"') to get the previous page:

signedUpUsers(last: 10, before: "123abc456def") {
  # ...
}
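
On the server side, a resolver for these queries might look roughly like the Python sketch below. It pages over an in-memory list and, for readability, uses the plain user ID as the cursor instead of an opaque value. The function name, the dictionary shape, and the data are all assumptions for illustration, not a reference implementation of the Relay specification.

```python
def signed_up_users(users, first=None, after=None, last=None, before=None):
    """Build a connection from a list of users sorted newest first."""
    ids = [user["id"] for user in users]
    # Narrow the window to the items strictly between the two cursors.
    # An unknown cursor raises ValueError from list.index.
    start = ids.index(after) + 1 if after is not None else 0
    end = ids.index(before) if before is not None else len(users)
    # Then apply the limits: 'first' trims the tail, 'last' trims the head.
    if first is not None:
        end = min(end, start + first)
    if last is not None:
        start = max(start, end - last)
    edges = [{"cursor": user["id"], "node": user} for user in users[start:end]]
    return {
        "totalCount": len(users),
        "edges": edges,
        "pageInfo": {
            "hasNextPage": end < len(users),
            "hasPreviousPage": start > 0,
            # An empty page has no cursors, so these fall back to None.
            "startCursor": edges[0]["cursor"] if edges else None,
            "endCursor": edges[-1]["cursor"] if edges else None,
        },
    }

users = [{"id": str(i)} for i in range(25, 0, -1)]  # ids "25" down to "1"
page_1 = signed_up_users(users, first=10)
page_2 = signed_up_users(users, first=10, after=page_1["pageInfo"]["endCursor"])
back = signed_up_users(users, last=10, before=page_2["pageInfo"]["startCursor"])
```

Here 'back' contains the same edges as 'page_1', which is the behavior a "previous page" button relies on.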

The final solution will then look something like this:

image of a cursor-based pagination solution

Pros

  • Provides extra metadata (such as 'totalCount', 'hasNextPage', and 'hasPreviousPage') that can be helpful for UX
  • Supports reverse pagination
  • Pagination is relative to specific rows to avoid issues around dynamic data
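
To see why paging relative to a row helps with dynamic data, consider the following small simulation. It assumes results are sorted newest first and uses the row value itself as the cursor; the 'page_after' helper and the single-letter user names are hypothetical.

```python
def page_after(rows, first, after=None):
    """Return up to `first` rows that follow the row named by `after`."""
    start = rows.index(after) + 1 if after is not None else 0
    return rows[start:start + first]

users = ["E", "D", "C", "B", "A"]   # newest signup first
page_1 = page_after(users, first=3)

users.insert(0, "F")                # a new user signs up between the two requests
page_2 = page_after(users, first=3, after=page_1[-1])

print(page_1)  # ['E', 'D', 'C']
print(page_2)  # ['B', 'A']  (nothing from page 1 is repeated)
```

The new row "F" simply isn't seen until the client returns to the first page, which is usually a more acceptable outcome than a duplicate.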

Cons

  • More verbose than Offset-based pagination
  • Results in larger and more nested queries
  • No way to grab an arbitrary page to start (for example, you can't start on page 3, you need to get each page to retrieve the cursors)
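
The last point can be illustrated with a short sketch: reaching page 3 means requesting pages 1 and 2 first, because each request needs the end cursor of the page before it. The helpers and data below are hypothetical, and the row value doubles as the cursor to keep the example short.

```python
def page_after(rows, first, after=None):
    """Return up to `first` rows that follow the row named by `after`."""
    start = rows.index(after) + 1 if after is not None else 0
    return rows[start:start + first]

def fetch_page_number(rows, page_number, page_size):
    """Reach page N by requesting pages 1..N in order, following each end cursor."""
    cursor, page, requests = None, [], 0
    for _ in range(page_number):
        page = page_after(rows, first=page_size, after=cursor)
        requests += 1
        if not page:
            break               # ran out of data before reaching the page
        cursor = page[-1]       # end cursor of the page just fetched
    return page, requests

users = [f"user-{i}" for i in range(25, 0, -1)]
page_3, requests = fetch_page_number(users, page_number=3, page_size=10)
print(page_3, requests)
```

With offset-based pagination the same page would take a single request ('limit: 10, offset: 20').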

Conclusion

In summary, both cursor-based pagination and offset-based pagination have their uses.

Offset-based is preferable where, for example:

  • You want to keep your app development simple
  • Data doesn’t change very often, or seeing duplicates is not a deal breaker
  • You need to retrieve (i.e., skip to) a specific page at the start

In contrast, cursor-based is preferred where:

  • Your lists are dynamic, changing often (e.g., a newsfeed)
  • Users want to navigate back and forth, or jump to the very first or last page
  • You want a professional pagination experience (i.e., no duplicate entries, additional data like total entries/pages, etc.)

That’s all! We hope this guide was helpful. If you have any questions, about this article or otherwise, don’t hesitate to contact us.