October 7th, 2008 by depesz | Tags: , , , , | 6 comments »
Did it help? If yes - maybe you can help me? Donate BTC to 19zPa5diT2LZqGtTi8f8bfApLn8rw9zBHx

On 4th of September Tom Lane committed another great patch. This one is very large, and even after applying – it's has some rough edges. There will be need for additional patches to make the functionality fully robust, but the fact that it got committed means that it will be available in final 8.4.

What does it do?

First, let's check what was written in commit log:

Implement SQL-standard WITH clauses, including WITH RECURSIVE.
 
There are some unimplemented aspects: recursive queries must use UNION ALL
(should allow UNION too), and we don't have SEARCH or CYCLE clauses.
These might or might not get done for 8.4, but even without them it's a
pretty useful feature.
 
There are also a couple of small loose ends and definitional quibbles,
which I'll send a memo about to pgsql-hackers shortly. But let's land
the patch now so we can get on with other development.
 
Yoshiyuki Asaba, with lots of help from Tatsuo Ishii and Tom Lane

Description looks interesting, it relates to SQL standard, but what it really is?

Example straight from new docs:

WITH regional_sales AS (
SELECT region, SUM(amount) AS total_sales
FROM orders
GROUP BY region
), top_regions AS (
SELECT region
FROM regional_sales
WHERE total_sales > (SELECT SUM(total_sales)/10 FROM
regional_sales)
)
SELECT region,
product,
SUM(quantity) AS product_units,
SUM(amount) AS product_sales
FROM orders
WHERE region IN (SELECT region FROM top_regions)
GROUP BY region, product;

By looking at it, you will see that “WITH" queries are basically queries with inlined views or rather temporary tables, but such that exist only for the time of running of the query.

What does it give us?

Let's test:

# create table orders (region int4, product int4, quantity int4, amount int4);
CREATE TABLE
 
# insert into orders (region, product, quantity, amount)
select random() * 1000, random() * 5000, 1 + random() * 20, 1 + random() * 1000
from generate_series(1,10000);
INSERT 0 10000

So, first, let's try the docs query (or actually, a bit modified doc query):

# explain analyze
>> WITH regional_sales AS (
>> SELECT region, SUM(amount) AS total_sales
>> FROM orders
>> GROUP BY region
>> ), top_regions AS (
>> SELECT region
>> FROM regional_sales
>> WHERE total_sales > (SELECT 2 * avg(total_sales) FROM regional_sales)
>> )
>> SELECT region,
>> product,
>> SUM(quantity) AS product_units,
>> SUM(amount) AS product_sales
>> FROM orders
>> WHERE region IN (SELECT region FROM top_regions)
>> GROUP BY region, product;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=509.55..539.52 rows=1998 width=16) (actual time=71.882..72.254 rows=221 loops=1)
InitPlan
-> HashAggregate (cost=205.00..217.51 rows=1001 width=8) (actual time=31.532..33.234 rows=1001 loops=1)
-> Seq Scan on orders (cost=0.00..155.00 rows=10000 width=8) (actual time=0.005..13.888 rows=10000 loops=1)
-> CTE Scan on regional_sales (cost=22.54..47.56 rows=334 width=4) (actual time=39.250..39.719 rows=12 loops=1)
Filter: ((total_sales)::numeric > $1)
InitPlan
-> Aggregate (cost=22.52..22.54 rows=1 width=8) (actual time=7.666..7.667 rows=1 loops=1)
-> CTE Scan on regional_sales (cost=0.00..20.02 rows=1001 width=8) (actual time=0.002..4.786 rows=1001 loops=1)
-> Hash Join (cost=12.02..224.50 rows=1998 width=16) (actual time=40.069..71.422 rows=221 loops=1)
Hash Cond: (public.orders.region = top_regions.region)
-> Seq Scan on orders (cost=0.00..155.00 rows=10000 width=16) (actual time=0.011..16.861 rows=10000 loops=1)
-> Hash (cost=9.52..9.52 rows=200 width=4) (actual time=39.825..39.825 rows=12 loops=1)
-> HashAggregate (cost=7.51..9.52 rows=200 width=4) (actual time=39.785..39.805 rows=12 loops=1)
-> CTE Scan on top_regions (cost=0.00..6.68 rows=334 width=4) (actual time=39.254..39.762 rows=12 loops=1)
Total runtime: 72.674 ms
(16 rows)

As you can see I changes definition of “top_regions" – instead of having 10% of total sales, I define top region as having over 2 times average sale.

Now, let's try to write the same query without using “WITH":

# explain analyze
>> SELECT region,
>> product,
>> SUM(quantity) AS product_units,
>> SUM(amount) AS product_sales
>> FROM orders
>> WHERE region IN (
>> SELECT region
>> FROM orders
>> group by region
>> having sum(amount) > 2 * (
>> select avg(sum) from (
>> Select sum(amount) from orders group by region
>> ) as x
>> )
>> )
>> GROUP BY region, product;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=710.04..740.01 rows=1998 width=16) (actual time=130.271..130.644 rows=221 loops=1)
-> Hash Join (cost=477.58..690.06 rows=1998 width=16) (actual time=85.452..129.649 rows=221 loops=1)
Hash Cond: (public.orders.region = public.orders.region)
-> Seq Scan on orders (cost=0.00..155.00 rows=10000 width=16) (actual time=0.011..16.427 rows=10000 loops=1)
-> Hash (cost=465.07..465.07 rows=1001 width=4) (actual time=85.157..85.157 rows=12 loops=1)
-> HashAggregate (cost=435.04..455.06 rows=1001 width=8) (actual time=83.615..85.127 rows=12 loops=1)
Filter: ((sum(public.orders.amount))::numeric > (2::numeric * $0))
InitPlan
-> Aggregate (cost=230.03..230.04 rows=1 width=8) (actual time=47.372..47.374 rows=1 loops=1)
-> HashAggregate (cost=205.00..217.51 rows=1001 width=8) (actual time=41.329..43.404 rows=1001 loops=1)
-> Seq Scan on orders (cost=0.00..155.00 rows=10000 width=8) (actual time=0.007..20.318 rows=10000 loops=1)
-> Seq Scan on orders (cost=0.00..155.00 rows=10000 width=8) (actual time=0.005..15.737 rows=10000 loops=1)
Total runtime: 131.050 ms
(13 rows)

As you can see – it's slower. Reason is very simple. “Old" query had to scan orders table 3 times.

New query does it only 2 times.

So, WITH can be used to write more readable queries. Queries which work on smaller amounts of data. Queries which can be faster.

But, that's not all. There is also a special type of WITH query which can do something that is simply not possible without it. It is WITH RECURSIVE.

Let's imagine the simplest possible tree structure:

create table tree (id serial primary key, parent_id int4 references tree (id));
insert into tree (parent_id) values (NULL);
insert into tree (parent_id)
select case when random() < 0.95 then floor(1 + random() * currval('tree_id_seq')) else NULL end
from generate_series(1,1000) i;

This created nice forest (not tree, as we have many root elements), that has (with my random data):

  • 1001 elements
  • 56 root nodes
  • 28 of root nodes have any elements “below"
  • longest path in tree has 16 nodes: 1 -> 2 -> 3 -> 7 -> 12 -> 15 -> 39 -> 52 -> 61 -> 107 -> 123 -> 194 -> 466 -> 493 -> 810 -> 890

Traditionally, if one would want to print all parents of node 890, he would have to write a loop to query tree table many time – until it would find row that has “parent_id IS NULL".

But now, we can:

WITH RECURSIVE struct AS (
SELECT t.* FROM tree t WHERE id = 890
UNION ALL
SELECT t.* FROM tree t, struct s WHERE t.id = s.parent_id
)
SELECT * FROM struct;

Which works really cool:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on struct (cost=402.18..404.20 rows=101 width=8) (actual time=0.037..69.311 rows=16 loops=1)
InitPlan
-> Recursive Union (cost=0.00..402.18 rows=101 width=8) (actual time=0.030..69.222 rows=16 loops=1)
-> Index Scan using tree_pkey on tree t (cost=0.00..8.27 rows=1 width=8) (actual time=0.024..0.029 rows=1 loops=1)
Index Cond: (id = 890)
-> Hash Join (cost=0.33..39.19 rows=10 width=8) (actual time=2.165..4.313 rows=1 loops=16)
Hash Cond: (t.id = s.parent_id)
-> Seq Scan on tree t (cost=0.00..35.01 rows=1001 width=8) (actual time=0.005..2.434 rows=1001 loops=15)
-> Hash (cost=0.20..0.20 rows=10 width=4) (actual time=0.012..0.012 rows=1 loops=16)
-> WorkTable Scan on struct s (cost=0.00..0.20 rows=10 width=4) (actual time=0.002..0.005 rows=1 loops=16)
Total runtime: 69.445 ms
(11 rows)

You might be worried by seeing “Seq Scan on tree…loops=15″, but don't worry. It worked this way only because i have very little rows in the table.

After adding another 50000, and re-running the query, I got:

QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
CTE Scan on struct (cost=840.76..842.78 rows=101 width=8) (actual time=0.039..0.672 rows=16 loops=1)
InitPlan
-> Recursive Union (cost=0.00..840.76 rows=101 width=8) (actual time=0.032..0.591 rows=16 loops=1)
-> Index Scan using tree_pkey on tree t (cost=0.00..8.27 rows=1 width=8) (actual time=0.025..0.029 rows=1 loops=1)
Index Cond: (id = 890)
-> Nested Loop (cost=0.00..83.05 rows=10 width=8) (actual time=0.018..0.027 rows=1 loops=16)
-> WorkTable Scan on struct s (cost=0.00..0.20 rows=10 width=4) (actual time=0.002..0.004 rows=1 loops=16)
-> Index Scan using tree_pkey on tree t (cost=0.00..8.27 rows=1 width=8) (actual time=0.008..0.011 rows=1 loops=16)
Index Cond: (t.id = s.parent_id)
Total runtime: 0.799 ms
(10 rows)

Which looks perfect.

As I wrote in the beginning – there are still some things to be polished. But as for now, WITH queries look simply great.

  1. 6 comments

  2. # Lafriks
    Oct 7, 2008

    Just perfect! :) I have been waiting for this feature sooo long :)

  3. Oct 7, 2008

    This is wonderful news. Selecting from tree structures will be a whole lot easier.

  4. # pabloj
    Oct 7, 2008

    That’s great news, now on with the window function, CTE and Window function will make PostgreSQL the Open Source database with the most rich SQL dialect

  5. # xor
    Oct 7, 2008

    Excellent news! :) I think some queries may be 2-10… times faster!

  6. # Thom Brown
    Oct 7, 2008

    I wanted something like WITH RECURSIVE for ages! That can be incredibly useful and there are so many applications that could benefit form it.

  7. # VLDG
    Apr 30, 2009

    I confirm : CTE is great enhancement.
    I use it in Firebird 2.1 and it works very well.

Leave a comment