better results paging in postgresql 8.2

some time ago merlin moncure wrote about one of new features of postgresql 8.2 – row-wise comparison.

i also read about it, but at first didn't find anything very useful about it – after all it doesn't give you any new functionality. any kind “(a, b, c) > (…)" can be written with standard column-based operators. so it's basically just a syntactic-sugar.

that's true. but merlin pointed me to not-so-obvious benefit – basically it allows much better paging of results. how? let's see.

one of the databases i wrote/used had a very interesting, and very useful, table. basically it was denormalization of standard normalized tables (which are irrelevant at the moment).

table looks like this:

create table test (
    id serial primary key,
    object_id int4,
    x_id int4,
    y_id int4,
    sortable text
);

columns in fact don't contain nulls, but i skipped all limitations to simplify example.

basically it stores data about “object", which belongs to many (usually 2-4) xs and many (also usually 2-4) ys (these are things like categories, but there are two separate lists).

“sortable" field contains some text which is constant for given object.

so, having object with id 1, that “belongs to" xs: 267, 449 and 475, and ys: 27, 281, 509 and 696, which sortable data is ‘twin confidant policing' i would get these rows:

 id  | object_id | x_id | y_id |        sortable
-----+-----------+------+------+-------------------------
 347 |         1 |  267 |  509 | twin confidant policing
 348 |         1 |  267 |  281 | twin confidant policing
 349 |         1 |  267 |   27 | twin confidant policing
 350 |         1 |  267 |  696 | twin confidant policing
 351 |         1 |  449 |  509 | twin confidant policing
 352 |         1 |  449 |  281 | twin confidant policing
 353 |         1 |  449 |   27 | twin confidant policing
 354 |         1 |  449 |  696 | twin confidant policing
 355 |         1 |  475 |  509 | twin confidant policing
 356 |         1 |  475 |  281 | twin confidant policing
 357 |         1 |  475 |   27 | twin confidant policing
 357 |         1 |  475 |  696 | twin confidant policing

what's important is that (x_id, y_id, object_id) are unique – because given object can be only once in given “crossing" of x and y (so, let's add: create unique index ui_test_xiyioi on test (x_id, y_id, object_id);)

now, we need to get list of objects in given x,y, sorted by “sortable". seems simple. the problem is that in given crossing there might be many hundred thousand objects.

in my dataset the most populated crossings are:

# select x_id, y_id, count (*) from test group by x_id, y_id order by count desc limit 10;
 x_id | y_id | count
------+------+--------
   14 |   14 | 449591
   14 |   13 | 353883
   13 |   14 | 353244
   15 |   14 | 321052
   14 |   15 | 320918
   13 |   13 | 278365
   14 |   12 | 262972
   12 |   14 | 262863
   15 |   13 | 253198
   13 |   15 | 251951
(10 rows)

(sidenote, this is just sample data, not real-world case 🙂

to search efficiently i will create standard index:

create index i_test_xiyis on test (x_id, y_id, sortable);

and a query:

# explain analyze select object_id from test where x_id = 14 and y_id = 14 order by x_id, y_id, sortable limit 20;
                                                              QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..66.05 rows=20 width=25) (actual time=0.132..0.445 rows=20 loops=1)
   ->  Index Scan using i_test_xiyis on test  (cost=0.00..1499461.42 rows=454038 width=25) (actual time=0.127..0.365 rows=20 loops=1)
         Index Cond: ((x_id = 14) AND (y_id = 14))
 Total runtime: 0.613 ms
(4 rows)

(in this table i have 7780865 with 1000000 different object_id's. so getting 20 sorted rows in 0.6ms is not too shabby 🙂 (next calls get between 0.33 to 0.37 ms).

now. this is perfectly fine. but let's add paging. using standard offset. let's first try something totally “wrong":

# explain analyze select object_id from test where x_id = 14 and y_id = 14 order by x_id, y_id, sortable limit 20 offset 100000;
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=136716.17..136716.22 rows=20 width=25) (actual time=9943.891..9944.014 rows=20 loops=1)
   ->  Sort  (cost=136466.17..137601.27 rows=454038 width=25) (actual time=9359.915..9814.020 rows=100020 loops=1)
         Sort Key: sortable
         Sort Method:  external merge  Disk: 16376kB
         ->  Bitmap Heap Scan on test  (cost=10639.16..73622.73 rows=454038 width=25) (actual time=795.671..3467.840 rows=449591 loops=1)
               Recheck Cond: ((x_id = 14) AND (y_id = 14))
               ->  Bitmap Index Scan on ui_test_xiyioi  (cost=0.00..10525.65 rows=454038 width=0) (actual time=790.566..790.566 rows=449591 loops=1)
                     Index Cond: ((x_id = 14) AND (y_id = 14))
 Total runtime: 9949.831 ms
(9 rows)

ok, it doesn't look good. the problem lies in fact that postgresql has to fetch all 100020 rows just to get me 20.

what's also important – even without changing plan it can become slow for distant pages:

# explain analyze select object_id from test where x_id = 14 and y_id = 14 order by x_id, y_id, sortable limit 20 offset 20000;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=66050.04..66116.09 rows=20 width=25) (actual time=307.900..308.188 rows=20 loops=1)
   ->  Index Scan using i_test_xiyis on test  (cost=0.00..1499461.42 rows=454038 width=25) (actual time=0.166..279.179 rows=20020 loops=1)
         Index Cond: ((x_id = 14) AND (y_id = 14))
 Total runtime: 308.273 ms
(4 rows)

what can we do about it? with our new row-wise comparisons we can make it significantly faster.

now, we will need to modify the query, so that our order by “thing" will be unique. at the moment it is not. so, let's simply add there “object_id". since (x_id, y_id, object_id) is unique, then (x_id, y_id, sortable, object_id) will be as well 🙂

so i'll drop previous index, and create new one:

# drop index i_test_xiyis;
DROP INDEX
# create index i_test_xiyisoi on test (x_id, y_id, sortable, object_id);
CREATE INDEX

now, quick sanity check:

# explain analyze select object_id from test where x_id = 14 and y_id = 14 order by x_id, y_id, sortable, object_id limit 20;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..66.69 rows=20 width=25) (actual time=0.063..0.228 rows=20 loops=1)
   ->  Index Scan using i_test_xiyisoi on test  (cost=0.00..1514047.83 rows=454038 width=25) (actual time=0.059..0.146 rows=20 loops=1)
         Index Cond: ((x_id = 14) AND (y_id = 14))
 Total runtime: 0.325 ms
(4 rows)

looks fine.

now, let's get first page of results:

# select * from test where x_id = 14 and y_id = 14 order by x_id, y_id, sortable, object_id limit 20;
    id    | object_id | x_id | y_id |        sortable
----------+-----------+------+------+-------------------------
 29843090 |      7443 |   14 |   14 | a
...
 32357888 |    330757 |   14 |   14 | aardvarks
(20 rows)

now, to get second page, i will just:

# explain analyze select * from test where x_id = 14 and y_id = 14 and (x_id, y_id, sortable, object_id) > (14, 14, 'aardvarks', 330757) order by x_id, y_id, sortable, object_id limit 20;
                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.01..67.48 rows=20 width=29) (actual time=0.188..0.852 rows=20 loops=1)
   ->  Index Scan using i_test_xiyisoi on test  (cost=0.01..261465.67 rows=77505 width=29) (actual time=0.183..0.768 rows=20 loops=1)
         Index Cond: ((ROW(x_id, y_id, sortable, object_id) > ROW(14, 14, 'aardvarks'::text, 330757)) AND (x_id = 14) AND (y_id = 14))
 Total runtime: 0.993 ms
(4 rows)

ok. so, how would i get 1000th page of results? to get it i will need last record from 999th page of results, so let's check it (using plain old limit/offset):

# select * from test where x_id = 14 and y_id = 14 order by x_id, y_id, sortable, object_id limit 1 offset 19979;
    id    | object_id | x_id | y_id | sortable
----------+-----------+------+------+----------
 30025740 |     30905 |   14 |   14 | armlets
(1 row)

(time to get it: 280-320ms).

now, let's use this data to get 1000th page:

# explain analyze select * from test where x_id = 14 and y_id = 14 and (x_id, y_id, sortable, object_id) > (14, 14, 'armlets', 30905) order by x_id, y_id, sortable, object_id limit 20;
                                                               QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.01..67.48 rows=20 width=29) (actual time=21.222..21.505 rows=20 loops=1)
   ->  Index Scan using i_test_xiyisoi on test  (cost=0.01..261465.67 rows=77505 width=29) (actual time=21.218..21.429 rows=20 loops=1)
         Index Cond: ((ROW(x_id, y_id, sortable, object_id) > ROW(14, 14, 'armlets'::text, 30905)) AND (x_id = 14) AND (y_id = 14))
 Total runtime: 21.745 ms
(4 rows)

ok. and how about a 10000th page?

again – i have to check last row of previous page using limit/offset (it takes nearly 10 seconds).

and time to get 10000th page?

# explain analyze select * from test where x_id = 14 and y_id = 14 and (x_id, y_id, sortable, object_id) > (14, 14, 'infantrymen', 988050) order by x_id, y_id, sortable, object_id limit 20;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.01..67.48 rows=20 width=29) (actual time=119.769..119.864 rows=20 loops=1)
   ->  Index Scan using i_test_xiyisoi on test  (cost=0.01..261465.67 rows=77505 width=29) (actual time=119.765..119.812 rows=20 loops=1)
         Index Cond: ((ROW(x_id, y_id, sortable, object_id) > ROW(14, 14, 'infantrymen'::text, 988050)) AND (x_id = 14) AND (y_id = 14))
 Total runtime: 119.964 ms
(4 rows)

not bad. it grew over original 0.3ms, but 100ms for 10000th page seems like a good deal.

the problem is that to get data for page number X you have to know last row of page (X-1).

while this seems like a problem, it is not really such a big deal – after all, people usually will do something like “next" on long-listing-pages. so, as long as you're providing only next/previous buttons – you're fine.

but what in case you want to provide “shortcuts" for next 10 pages? it's simple – simply get 180 rows (9 pages) instead of only 20 to your app (should be reasonably fast), and use fetched data to provide “pointers" for these 10 pages.

how long does it take to get 180 rows? let's see:

from first page:

# explain analyze select * from test where x_id = 14 and y_id = 14 order by x_id, y_id, sortable, object_id limit 180;
                                                               QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..600.23 rows=180 width=29) (actual time=0.118..5.786 rows=180 loops=1)
   ->  Index Scan using i_test_xiyisoi on test  (cost=0.00..1514047.83 rows=454038 width=29) (actual time=0.113..5.079 rows=180 loops=1)
         Index Cond: ((x_id = 14) AND (y_id = 14))
 Total runtime: 6.206 ms
(4 rows)

from 10000th page:

# explain analyze select * from test where x_id = 14 and y_id = 14 and (x_id, y_id, sortable, object_id) > (14, 14, 'infantrymen', 988050) order by x_id, y_id, sortable, object_id limit 20;
                                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.01..67.48 rows=20 width=29) (actual time=117.349..117.440 rows=20 loops=1)
   ->  Index Scan using i_test_xiyisoi on test  (cost=0.01..261465.67 rows=77505 width=29) (actual time=117.343..117.387 rows=20 loops=1)
         Index Cond: ((ROW(x_id, y_id, sortable, object_id) > ROW(14, 14, 'infantrymen'::text, 988050)) AND (x_id = 14) AND (y_id = 14))
 Total runtime: 117.543 ms
(4 rows)

so, as you can see – there is drawback, but (at least for me) very limited.

and what it somebody wanted to go straight to last page? then we can use a simple trick and present him last 20 adverts. it's not exactly the same as “last page", but it should be close enough for most people to don't mind:

# explain analyze select * from ( select * from test where x_id = 14 and y_id = 14 order by x_id desc, y_id desc, sortable desc, object_id desc limit 20 ) t order by x_id, y_id, sortable, object_id;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=67.32..67.37 rows=20 width=48) (actual time=0.844..0.880 rows=20 loops=1)
   Sort Key: test.x_id, test.y_id, test.sortable, test.object_id
   Sort Method:  quicksort  Memory: 18kB
   ->  Limit  (cost=0.00..66.69 rows=20 width=29) (actual time=0.115..0.412 rows=20 loops=1)
         ->  Index Scan Backward using i_test_xiyisoi on test  (cost=0.00..1514047.83 rows=454038 width=29) (actual time=0.110..0.334 rows=20 loops=1)
               Index Cond: ((x_id = 14) AND (y_id = 14))
 Total runtime: 1.058 ms
(7 rows)

one last thing – you might think that passing everytime 4 arguments (x_id, y_id, sortable and object_id) to client (browser?) and back is far from optimal. simple “one-number" would be much better.

true. and this is where our artificial primary key comes to help.

our 10000th page condition used data from row with id = 37473241:

# select * from test where (x_id, y_id, sortable, object_id) = (14, 14, 'infantrymen', 988050);
    id    | object_id | x_id | y_id |  sortable
----------+-----------+------+------+-------------
 37473241 |    988050 |   14 |   14 | infantrymen
(1 row)

so, instead of writing:

select * from test where x_id = 14 and y_id = 14 and (x_id, y_id, sortable, object_id) > (14, 14, 'infantrymen', 988050) order by x_id, y_id, sortable, object_id limit 20;

i can:

# explain analyze
=# select t.* from test t
=# where t.x_id = 14 and t.y_id = 14
=#     AND (t.x_id, t.y_id, t.sortable, t.object_id) > (SELECT q.x_id, q.y_id, q.sortable, q.object_id FROM test q WHERE q.id = 37473241)
=# order by t.x_id, t.y_id, t.sortable, t.object_id limit 20;
                                                                QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=8.96..76.00 rows=20 width=29) (actual time=65.308..65.431 rows=20 loops=1)
   InitPlan
     ->  Index Scan using test_pkey on test q  (cost=0.00..8.95 rows=1 width=25) (actual time=0.018..0.020 rows=1 loops=1)
           Index Cond: (id = 37473241)
   ->  Index Scan using i_test_xiyisoi on test t  (cost=0.01..507333.25 rows=151346 width=29) (actual time=65.304..65.358 rows=20 loops=1)
         Index Cond: ((ROW(x_id, y_id, sortable, object_id) > ROW($0, $1, $2, $3)) AND (x_id = 14) AND (y_id = 14))
 Total runtime: 65.515 ms
(7 rows)

so basically, all you need to pass to webbrowser is a number. this number will not change in the same way page numbers will.

so, with this trick you can make paging that will go to further pages (than limit/offset) with acceptable speed.

10 thoughts on “better results paging in postgresql 8.2”

Jonathan Gardner says:

2007-08-29 at 21:20

This is an interesting solution, but the pagination will go funky if that data is changing. For instance, what you think would be page 3 may really be page 4 if someone inserted 20 records on page 1.
depesz says:

2007-08-30 at 07:03

@Jonathan Gardner:
sure, but if you would use traditional limit/offset and you would like to go from 3rd to 4th page, while at the same time somebody would add 20 records for page1 – for user it would look like the page didn’t change – he would still see the same records.

both solutions have issues – i’m not saying one is much better than the other. the one with () > () seems to solve one particular problem that i encountered – thus i wrote about it 🙂
Ian says:

2008-01-03 at 22:31

Excellent. I love stuff like this. Thank you.
paftek says:

2008-07-01 at 10:42

I read your article with interest.
I am considering refactoring the results paging of my web app. Actually, the maximum number of results is about 100k, and paging is done via the traditional OFFSET and LIMIT stuff.
I saw that PG has a cursor functionality. Did you compare your solution to this ?
In a few words, do you have any feedback on PG cursors ? What is the memory impact if 1000 of my customers have an 100k rows cursor opened ?
Many thanks.
depesz says:

2008-07-01 at 10:50

@paftek:
i don’t really view cursors as solution. remember that they exist only in transaction.
paftek says:

2008-07-01 at 10:59

Great day for me, I am going to contradict a PG expert 😉
Declared with “WITH HOLD”, the cursor “may continue to be used after the transaction that created it successfully commits”.
depesz says:

2008-07-01 at 11:03

@paftek:
ah, true. forgot about it.
so yes, you can use it. but i would be quite “scared” to allow basically unlimited number of cursors to be used in database.
plus there is synchronization issue – cursor is visible (afaik) only in one connection.
paftek says:

2008-07-01 at 12:17

Indeed you’re right : “If WITH HOLD is specified and the transaction that created the cursor successfully commits, the cursor can continue to be accessed by subsequent transactions in the same session.”
So I will investigate your proposal.
Thanks for you work, and sorry to pollute your comments.
Wrona says:

2010-02-02 at 11:58

Pretty interesting post.
Herndon says:

2010-03-01 at 12:57

Good Thanks

Comments are closed.