On 21st of October, Robert Haas committed patch:

postgres_fdw: Push down aggregates to remote servers.
 
Now that the upper planner uses paths, and now that we have proper hooks
to inject paths into the upper planning process, it's possible for
foreign data wrappers to arrange to push aggregates to the remote side
instead of fetching all of the rows and aggregating them locally.  This
figures to be a massive win for performance, so teach postgres_fdw to
do it.
 
Jeevan Chalke and Ashutosh Bapat.  Reviewed by Ashutosh Bapat with
additional testing by Prabhat Sahu.  Various mostly cosmetic changes
by me.

The description seems simple, but let's see how it works in Pg 9.6 and Pg 10.

First the old version (9.6):

$ create table sample_data as select i as id, cast(random() * 10 as int4) as group_id from generate_series(1,100000) i;
SELECT 100000
 
$ create extension postgres_fdw;
CREATE EXTENSION
 
$ create server origin foreign data wrapper postgres_fdw options( dbname 'depesz' );
CREATE SERVER
 
$ create user mapping for depesz server origin options ( user 'depesz' );
CREATE USER MAPPING
 
$ create foreign table data_from_origin (id int4, group_id int4) server origin options ( table_name 'sample_data' );
CREATE FOREIGN TABLE

Sanity check:

$ select group_id, count(*) from sample_data group by group_id order by group_id;
 group_id | count 
----------+-------
        0 |  4935
        1 | 10131
        2 | 10083
        3 | 10036
        4 |  9828
        5 | 10073
        6 |  9912
        7 | 10027
        8 | 10130
        9 |  9816
       10 |  5029
(11 rows)
 
$ select group_id, count(*) from data_from_origin group by group_id order by group_id;
 group_id | count 
----------+-------
        0 |  4935
        1 | 10131
        2 | 10083
        3 | 10036
        4 |  9828
        5 | 10073
        6 |  9912
        7 | 10027
        8 | 10130
        9 |  9816
       10 |  5029
(11 rows)

OK. Data sets are the same. Let's see explain analyze for the second query:

$ explain (analyze on, verbose on) select group_id, count(*) from data_from_origin group by group_id order by group_id;
                                                                QUERY PLAN                                                                
------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=100.00..222.22 rows=200 width=12) (actual time=55.823..119.840 rows=11 loops=1)
   Output: group_id, count(*)
   Group Key: data_from_origin.group_id
   ->  Foreign Scan on public.data_from_origin  (cost=100.00..205.60 rows=2925 width=4) (actual time=51.703..110.939 rows=100000 loops=1)
         Output: id, group_id
         Remote SQL: SELECT group_id FROM public.sample_data ORDER BY group_id ASC NULLS LAST
 Planning time: 0.056 ms
 Execution time: 120.534 ms
(8 rows)

Now, I did all of this on Pg 10 as well, but the final explain analyze looks different:

$ explain (analyze on, verbose on) select group_id, count(*) from data_from_origin group by group_id order by group_id;
                                                QUERY PLAN
----------------------------------------------------------------------------------------------------------
 Sort  (cost=167.52..168.02 rows=200 width=12) (actual time=21.606..21.607 rows=11 loops=1)
   Output: group_id, (count(*))
   Sort Key: data_from_origin.group_id
   Sort Method: quicksort  Memory: 25kB
   ->  Foreign Scan  (cost=114.62..159.88 rows=200 width=12) (actual time=21.596..21.597 rows=11 loops=1)
         Output: group_id, (count(*))
         Relations: Aggregate on (public.data_from_origin)
         Remote SQL: SELECT group_id, count(*) FROM public.sample_data GROUP BY group_id
 Planning time: 0.073 ms
 Execution time: 21.900 ms
(10 rows)

Please note the “Foreign Scan” in the first explain – it shows actual…rows=100000, while in the second explain it shows actual…rows=11. Given that the data has to be transferred over the network (in my case, over a unix socket on the same machine), this is a clear win, which can also be seen in the execution times of both queries – 120ms vs. 21ms. Pretty cool.
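By the way, you don't have to actually run the query to see whether the aggregate will be pushed down – plain explain (verbose on), without analyze, already shows the query that would be sent to the remote server:

$ explain (verbose on) select group_id, count(*) from data_from_origin group by group_id;

On Pg 10 the “Remote SQL” line should contain the GROUP BY and count(*) (like in the plan above), while on 9.6 the remote query fetches all the raw rows and the aggregation happens locally.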

I would assume that foreign data wrappers for other servers will have to be modified to support aggregate pushdown, but it is definitely cool anyway.

  1. 8 comments

  2. # Kai Guo
    Oct 27, 2016

    I wonder how you did “explain analyze for the second query” on pg 10. Where is pg 10? The latest version is 9.6.

  3. Oct 27, 2016

    @Kai:
The same way I've done all my work for the “Waiting for …” blog post series since 2008 – I test what devs are committing to the PostgreSQL source repository, which is what will become Pg version 10.

  4. # Hi
    Nov 10, 2016

    Hello, do you know anything about the timeline for implementing ‘LIMIT’ statement push down?

  5. Nov 10, 2016

    @Hi:

    no.

  6. # Hi
    Nov 10, 2016

    That’s sad, because this feature looks not too complex to implement, but very powerful. Can you advise how to push for it?

  7. Nov 10, 2016

    @Hi:

    Write to the developers of PostgreSQL? The pgsql-hackers mailing list if you want to implement it, and pgsql-general if you want to talk about it.

  8. # Andrei
    Nov 21, 2016

    Is there a way to make postgres_fdw send the ORDER BY to the remote server?

    I have a table that grows by about 5 million rows/day; this table has an index on a timestamp column that tracks when each row was added.

    I can easily do “select * from table order by timestamp_column desc limit 1000”

    But on the foreign table, postgres_fdw executes this remote query: “select * from table” and does the sort locally.

    I’m even using use_remote_estimate ‘true’, fetch_size ‘10000’ to see if it helps :'(

  9. Nov 21, 2016

    @Andrei:

    Well, sure – you’d have to add support for this operation to PostgreSQL. As far as I know there is no support for it yet.
