January 21st, 2009 by depesz | Tags: , , , , , , , , , | 2 comments »
Did it help? If yes - maybe you can help me? Donate BTC to 19zPa5diT2LZqGtTi8f8bfApLn8rw9zBHx

A long overdue post about new functionality. At this moment it is no longer such new, as it was committed on 28th of December (yes, I know, I should have written it earlier, Sorry).

On this day Tom Lane committed patch by Hitoshi Harada which adds support for so called window functions:

Support window functions a la SQL:2008.
 
Hitoshi Harada, with some kibitzing from Heikki and Tom.

Disclaimer: I find naming in this topic to be “a bit" confusing. I'm trying to explain it as I understand, but I might be wrong, and if I am, please comment, so others will benefit from clarifications.

What “window functions" are? What they do?

Well, window functions are function that can work on some subset of returned rows to gather some general statistics.

So, they “kind of" work like aggregates, but aggregates force you to return only 1 row per group, while in window function you can return many.

Simplest possible example:

This is aggregate:

# select avg(i) from generate_series(1,3) i;
avg
--------------------
2.0000000000000000
(1 row)

There is only 1 group (all rows from generate_series(1,3), and it returns only 1 row).

But with window functions we can do something like this:

# select i, avg(i) over () from generate_series(1,3) i;
i | avg
---+--------------------
1 | 2.0000000000000000
2 | 2.0000000000000000
3 | 2.0000000000000000
(3 rows)

This is not something you could do with aggregates otherwise then using some kind of subselect.

Before I'll explain how my above query worked, let me introduce some basic concepts of window functions: partitions and windows.

Partition is generally the same as “group by" in standard aggregates. It generates groups (partitions), and all window functions calls are ran in “context" of given partition.

For example. Let's assume we have pretty simple table with people, their department and salary:

create table test (
person text,
dept text,
salary int4
);

Now, let's put there some people:

insert into test (person, dept, salary) values
( 'alexander', 'it', 3700 ),
( 'benjamin', 'accounting', 2100 ),
( 'bradley', 'marketing', 1000 ),
( 'chandler', 'management', 2400 ),
( 'fernando', 'marketing', 1650 ),
( 'joey', 'management', 3250 ),
( 'lincoln', 'accounting', 2950 ),
( 'lj', 'it', 3250 ),
( 'michael', 'accounting', 1700 ),
( 'monica', 'management', 2950 ),
( 'paul', 'accounting', 1650 ),
( 'phoebe', 'accounting', 3700 ),
( 'rachel', 'accounting', 2950 ),
( 'ross', 'it', 1250 ),
( 'sara', 'it', 2600 ),
( 'theodore', 'management', 3250 )
;

So, let's use window functions to show what is the salary of any given person against average salary in his department.

To do so, we will partition the data by department, and calculate averages in these partitions:

# select person, dept, salary, avg(salary) over( partition by dept ) from test;
person | dept | salary | avg
-----------+------------+--------+-----------------------
lincoln | accounting | 2950 | 2508.3333333333333333
benjamin | accounting | 2100 | 2508.3333333333333333
rachel | accounting | 2950 | 2508.3333333333333333
phoebe | accounting | 3700 | 2508.3333333333333333
paul | accounting | 1650 | 2508.3333333333333333
michael | accounting | 1700 | 2508.3333333333333333
alexander | it | 3700 | 2700.0000000000000000
lj | it | 3250 | 2700.0000000000000000
ross | it | 1250 | 2700.0000000000000000
sara | it | 2600 | 2700.0000000000000000
chandler | management | 2400 | 2962.5000000000000000
theodore | management | 3250 | 2962.5000000000000000
joey | management | 3250 | 2962.5000000000000000
monica | management | 2950 | 2962.5000000000000000
fernando | marketing | 1650 | 1325.0000000000000000
bradley | marketing | 1000 | 1325.0000000000000000
(16 rows)

As you can see it automatically sorted results by dept – this is side effect, and don't count on it.

Anyway – we can add our own sorting to this:

# select person, dept, salary, avg(salary) over( partition by dept ) from test order by person;
person | dept | salary | avg
-----------+------------+--------+-----------------------
alexander | it | 3700 | 2700.0000000000000000
benjamin | accounting | 2100 | 2508.3333333333333333
bradley | marketing | 1000 | 1325.0000000000000000
chandler | management | 2400 | 2962.5000000000000000
fernando | marketing | 1650 | 1325.0000000000000000
joey | management | 3250 | 2962.5000000000000000
lincoln | accounting | 2950 | 2508.3333333333333333
lj | it | 3250 | 2700.0000000000000000
michael | accounting | 1700 | 2508.3333333333333333
monica | management | 2950 | 2962.5000000000000000
paul | accounting | 1650 | 2508.3333333333333333
phoebe | accounting | 3700 | 2508.3333333333333333
rachel | accounting | 2950 | 2508.3333333333333333
ross | it | 1250 | 2700.0000000000000000
sara | it | 2600 | 2700.0000000000000000
theodore | management | 3250 | 2962.5000000000000000
(16 rows)

Pretty nifty, isn't it?

Especially, since you can:

# select
person,
dept,
cast(
salary * 100 / avg(salary) over( partition by dept )
as numeric(5,2)
) as percent_of_average
from test
order by percent_of_average desc;
person | dept | percent_of_average
-----------+------------+--------------------
phoebe | accounting | 147.51
alexander | it | 137.04
fernando | marketing | 124.53
lj | it | 120.37
rachel | accounting | 117.61
lincoln | accounting | 117.61
theodore | management | 109.70
joey | management | 109.70
monica | management | 99.58
sara | it | 96.30
benjamin | accounting | 83.72
chandler | management | 81.01
bradley | marketing | 75.47
michael | accounting | 67.77
paul | accounting | 65.78
ross | it | 46.30
(16 rows)

To quickly check who (in given department) is the most overpaid. Or looks like the most overpaid :)

So, I hope you understand what partition is. Now to the windows.

As you saw, all window functions that I called till now, worked on all rows in given partition.

It means that given “average" salary was average to whole department. Windows let you say that you want to calculate something for only part of the partition.

The 2 most common examples of such calculations are “rownum" and cumulative sum. I wrote about them long time ago, but with 8.4 you can write it much simpler:

rownum using window functions:

# select person, dept, salary, rank() over ( partition by dept order by salary ), row_number() over (partition by dept order by salary) from test;
person | dept | salary | rank | row_number
-----------+------------+--------+------+------------
paul | accounting | 1650 | 1 | 1
michael | accounting | 1700 | 2 | 2
benjamin | accounting | 2100 | 3 | 3
lincoln | accounting | 2950 | 4 | 4
rachel | accounting | 2950 | 4 | 5
phoebe | accounting | 3700 | 6 | 6
ross | it | 1250 | 1 | 1
sara | it | 2600 | 2 | 2
lj | it | 3250 | 3 | 3
alexander | it | 3700 | 4 | 4
chandler | management | 2400 | 1 | 1
monica | management | 2950 | 2 | 2
joey | management | 3250 | 3 | 3
theodore | management | 3250 | 3 | 4
bradley | marketing | 1000 | 1 | 1
fernando | marketing | 1650 | 2 | 2
(16 rows)

Again – it just so happens that rows are ordered, but don't count on it! If you want them ordered, add proper order by clause.

rank() is kind-of like rownum. It returns which position given row gets when ordering by (whatever we put in “order by" clause inside “over ()".

Please notice, that we don't have to partition the data:

# select person, dept, salary, rank() over ( order by salary ) from test;
person | dept | salary | rank
-----------+------------+--------+------
bradley | marketing | 1000 | 1
ross | it | 1250 | 2
paul | accounting | 1650 | 3
fernando | marketing | 1650 | 3
michael | accounting | 1700 | 5
benjamin | accounting | 2100 | 6
chandler | management | 2400 | 7
sara | it | 2600 | 8
lincoln | accounting | 2950 | 9
monica | management | 2950 | 9
rachel | accounting | 2950 | 9
theodore | management | 3250 | 12
joey | management | 3250 | 12
lj | it | 3250 | 12
phoebe | accounting | 3700 | 15
alexander | it | 3700 | 15
(16 rows)

In this situation there is only 1 partition – i.e. whole resultset.

This example also shows why rank() is not really rownum – it's not unique (it was also visible in previous example, but I think in this one it's clearer).

There are 2 rows with rank = 3, 3 rows with 9 or 12 and so on.

The reason is very simple – given the ordering we shows (i.e. only salary) these rows are indistinguishable. But if I'd change the order by (within over()) to make them different:

# select person, dept, salary, rank() over ( order by salary, person ) from test;
person | dept | salary | rank
-----------+------------+--------+------
bradley | marketing | 1000 | 1
ross | it | 1250 | 2
fernando | marketing | 1650 | 3
paul | accounting | 1650 | 4
michael | accounting | 1700 | 5
benjamin | accounting | 2100 | 6
chandler | management | 2400 | 7
sara | it | 2600 | 8
lincoln | accounting | 2950 | 9
monica | management | 2950 | 10
rachel | accounting | 2950 | 11
joey | management | 3250 | 12
lj | it | 3250 | 13
theodore | management | 3250 | 14
alexander | it | 3700 | 15
phoebe | accounting | 3700 | 16
(16 rows)

rank() is simple, but what about cumulative sum? Let's try:

# select person, dept, salary, sum(salary) over ( partition by dept order by person ) from test;
person | dept | salary | sum
-----------+------------+--------+-------
benjamin | accounting | 2100 | 2100
lincoln | accounting | 2950 | 5050
michael | accounting | 1700 | 6750
paul | accounting | 1650 | 8400
phoebe | accounting | 3700 | 12100
rachel | accounting | 2950 | 15050
alexander | it | 3700 | 3700
lj | it | 3250 | 6950
ross | it | 1250 | 8200
sara | it | 2600 | 10800
chandler | management | 2400 | 2400
joey | management | 3250 | 5650
monica | management | 2950 | 8600
theodore | management | 3250 | 11850
bradley | marketing | 1000 | 1000
fernando | marketing | 1650 | 2650
(16 rows)

Generally – it runs sum(salary) on every row and previous rows (in given partition).

Now, since it's quite sensible to use the same “OVER" multiple times in given query (to get average and sum of salaries for example) you can define them separately, like this:

select
dept,
rank() over ( w_depts order by person ),
person,
salary,
sum(salary) over ( w_depts ),
avg(salary) over (w_depts),
count(*) over ( w_depts )
from
test
window
w_depts as ( partition by dept )
order by
dept, person;
dept | rank | person | salary | sum | avg | count
------------+------+-----------+--------+-------+-----------------------+-------
accounting | 1 | benjamin | 2100 | 15050 | 2508.3333333333333333 | 6
accounting | 2 | lincoln | 2950 | 15050 | 2508.3333333333333333 | 6
accounting | 3 | michael | 1700 | 15050 | 2508.3333333333333333 | 6
accounting | 4 | paul | 1650 | 15050 | 2508.3333333333333333 | 6
accounting | 5 | phoebe | 3700 | 15050 | 2508.3333333333333333 | 6
accounting | 6 | rachel | 2950 | 15050 | 2508.3333333333333333 | 6
it | 1 | alexander | 3700 | 10800 | 2700.0000000000000000 | 4
it | 2 | lj | 3250 | 10800 | 2700.0000000000000000 | 4
it | 3 | ross | 1250 | 10800 | 2700.0000000000000000 | 4
it | 4 | sara | 2600 | 10800 | 2700.0000000000000000 | 4
management | 1 | chandler | 2400 | 11850 | 2962.5000000000000000 | 4
management | 2 | joey | 3250 | 11850 | 2962.5000000000000000 | 4
management | 3 | monica | 2950 | 11850 | 2962.5000000000000000 | 4
management | 4 | theodore | 3250 | 11850 | 2962.5000000000000000 | 4
marketing | 1 | bradley | 1000 | 2650 | 1325.0000000000000000 | 2
marketing | 2 | fernando | 1650 | 2650 | 1325.0000000000000000 | 2
(16 rows)

You can also use window functions to show you parts of partitions. For example, let's assume you want list of people who earn the most, or second to most in their departments.

Doing this without window functions would be really tedious, but window functions, give as simple way:

# select dept, person, salary from (
select dept, person, salary, rank() over (partition by dept order by salary desc) from test
) as x where rank in (1,2);
dept | person | salary
------------+-----------+--------
accounting | phoebe | 3700
accounting | lincoln | 2950
accounting | rachel | 2950
it | alexander | 3700
it | lj | 3250
management | theodore | 3250
management | joey | 3250
marketing | fernando | 1650
marketing | bradley | 1000
(9 rows)

You might ask yourself if it is not possible to do without subselect – unfortunately – window functions are evaluated after result was generated, i.e. after all conditions have been passed. But the subselect shouldn't be big issue.

Now, after all these great things, there is one thing that bugs me. Citation from manual:

For each row, the window function is computed across the rows that fall into the same partition as the current row

This rings some bad bells for me. Let's write a simple aggregate that will generally do count(), but will also raise notice every time it will get called:

CREATE OR REPLACE FUNCTION test_count(a INT4, b INT4) RETURNs INT4 as $$
BEGIN
raise notice 'test_count(%, %)', a, b;
RETURN a + 1;
END;
$$ language plpgsql;
CREATE aggregate test_count (INT4) (
sfunc = test_count,
stype = INT4,
initcond = '0'
);

It's pretty simple, and we can easily validate how it works:

# select test_count(1), count(*) from test;
NOTICE: test_count(0, 1)
NOTICE: test_count(1, 1)
NOTICE: test_count(2, 1)
NOTICE: test_count(3, 1)
NOTICE: test_count(4, 1)
NOTICE: test_count(5, 1)
NOTICE: test_count(6, 1)
NOTICE: test_count(7, 1)
NOTICE: test_count(8, 1)
NOTICE: test_count(9, 1)
NOTICE: test_count(10, 1)
NOTICE: test_count(11, 1)
NOTICE: test_count(12, 1)
NOTICE: test_count(13, 1)
NOTICE: test_count(14, 1)
NOTICE: test_count(15, 1)
test_count | count
------------+-------
16 | 16
(1 row)

As you can see the function has been called 16 times – one for every row in data.

But let's see how it will go when calling it as window function:

# select person, test_count(1) over (order by person) from test where dept = 'it';
NOTICE: test_count(0, 1)
NOTICE: test_count(1, 1)
NOTICE: test_count(2, 1)
NOTICE: test_count(3, 1)
person | test_count
-----------+------------
alexander | 1
lj | 2
ross | 3
sara | 4
(4 rows)

Great. I was afraid that it will be:

NOTICE: test_count(0, 1)
NOTICE: test_count(0, 1)
NOTICE: test_count(1, 1)
NOTICE: test_count(0, 1)
NOTICE: test_count(1, 1)
NOTICE: test_count(2, 1)
NOTICE: test_count(0, 1)
NOTICE: test_count(1, 1)
NOTICE: test_count(2, 1)
NOTICE: test_count(3, 1)

i.e. – one time for every row in every window it has to take into consideration.

But luckily it's smarter, and realizes that the value can be reused.

So, to finish this long post – another great feature of PostgreSQL – have fun playing with it.

  1. 2 comments

  2. Jan 21, 2009

    breinbaas on irc suggested using row_number in my example. I checked it and yes – it looks great. I added it to one example, but then I realized that adding it, and commenting everywhere would take too much time.

    To put it simply – while writing the post I didn’t know row_number() exists. In http://developer.postgresql.org/pgdocs/postgres/tutorial-window.html it never links to http://developer.postgresql.org/pgdocs/postgres/functions-window.html and I somehow missed it.

    I’m sorry – will check the docs better next time.

  3. # Hitoshi Harada
    Jan 22, 2009

    Thanks for mentioning it. Oh, I shoud’ve had to add to tutorial row_number(), which is the most basic window function :P).
    I believe this is a great feature and will help many poeple to solve complex problems such as above easily.
    Enjoy!

Leave a comment