May 13th, 2010 by depesz | Tags: , , , , , , , | 3 comments »
Did it help? If yes - maybe you can help me?

I got asked this: having this table:

# select * from a order by d;
t | d
---+----
O | 1
O | 2
O | 3
M | 4
M | 5
M | 6
M | 7
O | 8
O | 9
O | 10
I | 11
I | 12
I | 13
(13 rows)

Is it possible to add “rank" column, that will increment whenever t changed?

Expected output looks like this:

t | d | magic
---+----+-------
O | 1 | 1
O | 2 | 1
O | 3 | 1
M | 4 | 2
M | 5 | 2
M | 6 | 2
M | 7 | 2
O | 8 | 3
O | 9 | 3
O | 10 | 3
I | 11 | 4
I | 12 | 4
I | 13 | 4
(13 rows)

Please note that value ‘O' in “t" column has 2 different values, so we can't simply rank() over (order by t).

Solution proved to be relatively simple.

First of all – I need to find out whether current row has t different from previous one. This is trivially done using:

# select
t,
d,
t is distinct from lag(t) over ( order by d ) as diff
from a order by d;
t | d | diff
---+----+------
O | 1 | t
O | 2 | f
O | 3 | f
M | 4 | t
M | 5 | f
M | 6 | f
M | 7 | f
O | 8 | t
O | 9 | f
O | 10 | f
I | 11 | t
I | 12 | f
I | 13 | f
(13 rows)

As you can see column ‘diff' has value true when there is change in column ‘t', and false when no change happened.

Now, all we need to do, is count true values in diff column:

# select
t,
d,
count(nullif(diff, false)) over (order by d)
from
(
select
t,
d,
t is distinct from lag(t) over ( order by d ) as diff
from a
) as x
order by d;
t | d | count
---+----+-------
O | 1 | 1
O | 2 | 1
O | 3 | 1
M | 4 | 2
M | 5 | 2
M | 6 | 2
M | 7 | 2
O | 8 | 3
O | 9 | 3
O | 10 | 3
I | 11 | 4
I | 12 | 4
I | 13 | 4
(13 rows)

Simple, and to the point. Alternative way would be using plpgsql function, which would return setof(t, d, count) and would increment count for every change, but if you can't use PL/pgSQL, or don't want to – pure SQL version is possible.

  1. 3 comments

  2. May 13, 2010

    You could roll this with a custom aggregate, too. We did something similar to count the number of consecutive t’s and reset to 0 when t changed.

  3. May 14, 2010

    Since you’re using 8.4 features, you could also throw in a CTE 🙂

    WITH t1 AS (
    SELECT t,d, t is distinct from lag(t) OVER (order by d) AS diff
    FROM a
    )
    SELECT t, d, count(nullif(diff, false)) over (order by d)
    FROM t1
    ORDER BY d;

  4. May 14, 2010

    @David:
    While I do use CTE when necessary, I am far from being entirely thrilled about them. These are performance issues with them, that basically make me discard the idea of using CTE unless I know I will benefit from it.

Leave a comment