Keyset cursors, not offsets, for Postgres pagination
66 points
10 months ago
| 5 comments
| blog.sequinstream.com
| HN
deathanatos
10 months ago
[-]
reply
eddd-ddde
10 months ago
[-]
I don't know why I never thought of this technique. Nice little resource.

Goes to show why actually knowing SQL will always beat using generalized ORMs.

reply
AlexErrant
10 months ago
[-]
> This user experience is mostly good, as users typically care about domain-level filtering (commits between these time ranges) versus the arbitrary traversal of pages

If you're using SQLite, you can use temp tables as a hack to get arbitrary traversal:

Imagine an MS Excel-style spreadsheet that may have 10k+ rows that a user may search/sort on; I virtualize it and query for data in blocks of 100 rows. Being Excel-style, you can arbitrarily scroll any distance down the spreadsheet. OFFSET has terrible perf when skipping 15k+ rows, and users might want to jump to the bottom of the spreadsheet (or anywhere, really). I handle this in two steps:

    1. Return the first 100 results of a query ASAP.
    2. In a nonblocking manner, add all the relevant ids for that query to a temp table.
When a user tires of the first 100 results and starts randomly scrolling, I can use cursor pagination with the temp table's rowid (WHERE rowid > 15000 ORDER BY rowid LIMIT 100) by taking advantage of the fact that the spreadsheet's row number is now the same as the temp table's rowid. Building this cache is relatively expensive; it takes ~10 seconds on my devbox with ~15k rows. However jumping to the bottom of that result set takes ~200ms, so I'm happy with it. (Note I'm on a fork of wa-sqlite, i.e. it's running in the browser - "real" sqlite is much, much faster.)
reply
bbirk
10 months ago
[-]
>Traversing

With the reverse pagination solution here, the order of the items is inverted. For example if you order by some incrementing value, and page size is 5, then page 1 has items 1,2,3,4,5, in that order, the next page, page 2, has items 6,7,8,9,10, and when you press previous page you get items 5,4,3,2,1 instead of the original order. You can reverse them in the client when you use previous pagination, but this is inconvenient. You can implement this in sql by wrapping it in another order by with the correct order. There is a page on select (sql) on wikipedia that covers this kind of pagination for most relational databases[1].

>Designing interfaces

You mentioned that you don't have a concept of link to page 7 when you use cursor pagination, but you could always combine offset and cursor pagination, like: "6 pages (as offset) after item with lastupdate=2024-01-01". The performance downsides of using offset are not that big with small offsets, and you can add some maximum offset in the backend to make sure the performance is good enough. This means you can also add a feature where the user can jump 5 pages ahead instead of having to jump one page at a time like with the github example, without most of the performance overhead of using pure offset pagination.

>So, I think it's reasonable to return cursors that traverse in just one direction

Being able to traverse backwards is genuinely useful in a lot of cases, and as a developer it is your job to figure out a technical solution to the user needs. Again, have a look at the pagination wikipedia page to see how you can implement it in sql.

[1] https://en.wikipedia.org/wiki/Select_(SQL)#Method_with_filte...

reply
millipede
10 months ago
[-]
Cursors are generally better, but they don't give any hint about how much more is in front of the cursor. I don't think there is a O(log n) algorithm to keep track of what position a given item is in a collection (k'th item out of n), which still keeps the other CRUD operations at O(log n). This would be useful to split a large, dynamically changing result set into smaller parts that can be scanned through using a cursor.
reply
lucas_codes
10 months ago
[-]
If you only want a hint, in postgres you could always do an EXPLAIN and see the estimated number of rows returned
reply
Rendello
10 months ago
[-]
reply