Waiting for 9.5 – Add support for index-only scans in GiST.

On 26th of March, Heikki Linnakangas committed patch:

Add support for index-only scans in GiST.
 
This adds a new GiST opclass method, 'fetch', which is used to reconstruct
the original Datum from the value stored in the index. Also, the 'canreturn'
index AM interface function gains a new 'attno' argument. That makes it
possible to use index-only scans on a multi-column index where some of the
opclasses support index-only scans but some do not.
 
This patch adds support in the box and point opclasses. Other opclasses
can added later as follow-on patches (btree_gist would be particularly
interesting).
 
Anastasia Lubennikova, with additional fixes and modifications by me.

After this commit there was also a bunch of others that add necessary logic so that index only gist scans can be used on other datatypes too, but since I'm just writing about new functionality, I figured I can use points:

$ create table test (id serial primary key, some_point point);
$ insert into test (some_point) select point(random() * 5000, random() * 5000) from generate_series(1,100000) i;
$ create index tst_idx on test using gist (some_point);

And now, let's try to use index only scan to find 10 points nearest to given location:

$ explain analyze select some_point, some_point <-> point(500,500) from test order by some_point <-> point(500,500) limit 10;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..1.09 rows=10 width=16) (actual time=0.169..0.284 rows=10 loops=1)
   ->  Index Only Scan using tst_idx on test  (cost=0.28..8136.28 rows=100000 width=16) (actual time=0.168..0.280 rows=10 loops=1)
         Order By: (some_point <-> '(500,500)'::point)
         Heap Fetches: 10
 Planning time: 0.129 ms
 Execution time: 0.332 ms
(6 rows)

Of course, for sanity checking, the values:

$ select some_point, some_point <-> point(500,500) from test order by some_point <-> point(500,500) limit 10;
             some_point              |     ?column?     
-------------------------------------+------------------
 (496.12135393545,491.772019304335)  | 9.09634880720175
 (479.593751952052,492.762844078243) | 21.6515908244684
 (478.387083858252,508.600422181189) | 23.2612425688083
 (476.656502578408,504.556254018098) | 23.7839929900202
 (494.28480444476,526.699291076511)  | 27.3041316328299
 (474.246060475707,515.573592856526) | 30.0965478997474
 (490.935954730958,529.630510136485) |  30.985868514334
 (530.101526528597,485.602258704603) |  33.367601858105
 (466.621380764991,489.233953412622) | 35.0719258261832
 (526.196013670415,523.584464099258) | 35.2485188209342
(10 rows)

Of course, if I wanted to add id column, it would switch to normal index scan, because index doesn't contain values for id column:

$ explain analyze select id, some_point, some_point <-> point(500,500) from test order by some_point <-> point(500,500) limit 10;
                                                          QUERY PLAN                                                          
------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..1.09 rows=10 width=20) (actual time=0.128..0.205 rows=10 loops=1)
   ->  Index Scan using tst_idx on test  (cost=0.28..8136.28 rows=100000 width=20) (actual time=0.126..0.200 rows=10 loops=1)
         Order By: (some_point <-> '(500,500)'::point)
 Planning time: 0.168 ms
 Execution time: 0.254 ms
(5 rows)

Which, of course, can be fixed:

$ drop index tst_idx ;
$ create extension btree_gist ;
$ create index tst_idx on test using gist (some_point, id);

And now:

$ explain analyze select id, some_point, some_point <-> point(500,500) from test order by some_point <-> point(500,500) limit 10;
                                                            QUERY PLAN                                                             
-----------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..1.17 rows=10 width=20) (actual time=0.066..0.107 rows=10 loops=1)
   ->  Index Only Scan using tst_idx on test  (cost=0.28..8908.28 rows=100000 width=20) (actual time=0.065..0.106 rows=10 loops=1)
         Order By: (some_point <-> '(500,500)'::point)
         Heap Fetches: 10
 Planning time: 0.114 ms
 Execution time: 0.130 ms
(6 rows)

Nice. That's definitely going to be useful. Thanks Anastasia and Heikki.

=$
|

Waiting for 9.5 – Add support for index-only scans in GiST.

One thought on “Waiting for 9.5 – Add support for index-only scans in GiST.”