On 8th of October, Tom Lane committed patch:

Support index-only scans using the visibility map to avoid heap fetches.
 
When a btree index contains all columns required by the query, and the
visibility map shows that all tuples on a target heap page are
visible-to-all, we don't need to fetch that heap page.  This patch depends
on the previous patches that made the visibility map reliable.
 
There's a fair amount left to do here, notably trying to figure out a less
chintzy way of estimating the cost of an index-only scan, but the core
functionality seems ready to commit.
 
Robert Haas and Ibrar Ahmed, with some previous work by Heikki Linnakangas.

Before I will go into details, let me just say one thing:

YEAAAAAAAAAAAAAAAHHHH!

I wrote about lack of this functionality in PostgreSQL in January 2007. Since then I mentioned it on my wishlists. And now, we got it.

So, let's see what it is, how it works, and what it can do for us.

Explain looks very simply:

$ create table test as
    select
        i as id,
        i || ' ' || repeat('depesz', 2) as z,
        repeat('xxx', 500) as q
    from generate_series(1,3000) i;
SELECT 3000
 
$ create index zx on test (id, z );
CREATE INDEX

And with this I can:

$ explain analyze select id, z from test order by id limit 20;
                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..9.09 rows=20 width=21) (actual time=0.017..0.027 rows=20 loops=1)
   ->  Index Only Scan using zx on test  (cost=0.00..1363.69 rows=3000 width=21) (actual time=0.015..0.024 rows=20 loops=1)
 Total runtime: 0.051 ms
(3 rows)

Please note the word “Only" in above explain. If I'd change the index to not contain column z, I would get:

$ drop index zx;
DROP INDEX
 
$ create index zx on test (id);
CREATE INDEX
 
$ explain analyze select id, z from test order by id limit 20;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..4.22 rows=20 width=21) (actual time=0.020..0.035 rows=20 loops=1)
   ->  Index Scan using zx on test  (cost=0.00..632.25 rows=3000 width=21) (actual time=0.019..0.031 rows=20 loops=1)
 Total runtime: 0.057 ms
(3 rows)

The idea is that explain quite well in the commit message, but let me just shed a bit more light on it.

Whenever you select data from table using index, index returns list of
tuples that might be part of the result. Might because it is
possible that some of the rows were deleted, or that they were inserted, but
your current transaction shouldn't see it yet.

This means that for every row PostgreSQL has to fetch the page of table
data, if only to check if the row is visible for the transaction that
requested it.

But now, since 8.4 we have “visibility maps".

The point is that for every page (8192 bytes) PostgreSQL will keep information about (as I understand) newest transaction that changed anything on this page.

So now, with this new patch, PostgreSQL no longer has to read the page – it just checks if the newest transaction that changed given page is older than the transaction requiring the data. If so, and if the index contains all the information we need – we can skip getting data from table.

So, what exactly does “has all the information we need" mean? As simple as that – index has to contain all fields used in query for all clauses – SELECT, WHERE, ORDER BY.

So, for example, if you have query:

SELECT a, b, c FROM table WHERE a = 1 order by b desc limit 20

Normally you'd just make index on (a,b) to get maximum performance. But now, it might be beneficial to make the index on (a, b, c) – i.e. to include column c in the index, so that this query will return data from index only.

Of course there is drawback – index on (a,b,c) will be larger – so it's general performance might be worse. But it should be simply tested on case-by-case basis to see what makes sense and when.

All in all – great stuff.

16 thoughts on “Waiting for 9.2 – Index only scans”

And of course you can disable this functionality with enable_indexonlyscan = false, which you’d only want to do for comparing performance of different methods. I’ve been testing this using various data set sizes, and the performance gain I saw in one instance was 700x without patch. Naturally the effectiveness of this feature will vary, but it’s something PostgreSQL has needed for a long time (as you mentioned).

But this certainly improves the whole count(*) situation for many cases, which has been a bugbear of many a PostgreSQL user. So this will be one of many awesome performance improvements in 9.2! 🙂

Yeaahhhh!!! It was one of my voted features since time ago. I saw the commit by Tom Lane in the git.

Postgresql’s Rock level: 95 😉

Finally!!!! This is great news

To taki “ogólny feature” czy uzależniony od konkretnego typu indeksu? Czy to oznacza, że postgis także przyspieszy gdy będę wyszukiwał współrzędne po indeksie gist/gin?

@pijoter:
Z tego co wiem to tylko dla B-Tree.

@pijoter:
IMHO, index only scans is impossible for GIN and potentially possible for GiST, but require changes in both index only scans and GiST interface.

How I can do the test? How I apply the patch? I apply it to 9.1? or you make available a version of 9.2 with the patch included?

@Hugo – just download current git HEAD, and you’ll have 9.2 with this patch.

“So now, with this new patch, PostgreSQL no longer has to read the page – it just checks if the newest transaction that changed given page is older than the transaction requiring the data. If so, and if the index contains all the information we need – we can skip getting data from table.”

how does it know that the newest transaction that changed the page was committed? it is quite possible that a transaction changed the page and has not committed yet and a new transaction starts and requests that page. what will happen in that case?

@Jigar:

in exactly the same way it knows usually – pg keeps information about which transaction committed.

if i am not mistaken, it keeps the commit information in xmin/xmax_committed columns of the tuple. if the index-only scan has to get that info from the tuple on disk or in some cases from pg_clog, where is the performance gain coming from?

here is the answer to my own question.

http://pgsnaga.blogspot.com/2011/10/index-only-scans-and-heap-block-reads.html

Great stuff. Does anyone know if index only scan is available over multiple single-column indexes (ie. not just multi-column indexes) ? Eg. in your example, if you created one index for column id and one index for z, would it possible to achieve index-only scans for the same query? I guess not but if it did, this would make postgres very suitable for analytical queries on wide tables.

@Lars:
I don’t know, but I don’t think so.

What’s more important – why don’t you simply try?

Thanks. I did try without succeeding and was interested in whether I was overlooking something.

Comments are closed.