postgresql tips & tricks

luckymurali_81 on freenodes #postgresql had a problem. his query returns data in wrong order. what can we do about it?

his query:

SELECT yday.region,yday.yday,mtd.mtd,target.target FROM
(SELECT dr.region,SUM(f.collection) AS yday FROM fact_collection f,dim_time dt,dim_region dr 
WHERE dt.bus_date=current_date-45 AND dt.time_id=f.time_id AND f.region_id=dr.region_id  GROUP BY dr.region) AS yday,
(SELECT dr.region,SUM(f.collection) AS mtd FROM fact_collection f,dim_time dt, dim_region dr 
WHERE dt.bus_date BETWEEN date_trunc('month',current_date-45) AND CURRENT_DATE AND dt.time_id=f.time_id 
AND f.region_id=dr.region_id GROUP BY dr.region) AS mtd,
(SELECT dr.region,SUM(f.target)AS target FROM dim_region dr,fact_collection_target f ,dim_time dt 
WHERE dt.month_id=f.month_id AND dr.region_id=f.region_id AND dt.month_id=5 GROUP BY dr.region) AS target
WHERE yday.region=mtd.region AND mtd.region=target.region 
UNION
SELECT 'Total' AS region,SUM(foo.yday),SUM(foo.mtd),SUM(foo.target)
FROM
(
SELECT yday.region,yday.yday,mtd.mtd,target.target FROM
(SELECT dr.region,SUM(f.collection) AS yday FROM fact_collection f,dim_time dt,dim_region dr 
WHERE dt.bus_date=current_date-45 AND dt.time_id=f.time_id AND f.region_id=dr.region_id  GROUP BY dr.region) AS yday,
(SELECT dr.region,SUM(f.collection) AS mtd FROM fact_collection f,dim_time dt, dim_region dr 
WHERE dt.bus_date BETWEEN date_trunc('month',current_date-45) AND CURRENT_DATE AND dt.time_id=f.time_id 
AND f.region_id=dr.region_id GROUP BY dr.region) AS mtd,
(SELECT dr.region,SUM(f.target)AS target FROM dim_region dr,fact_collection_target f ,dim_time dt 
WHERE dt.month_id=f.month_id AND dr.region_id=f.region_id AND dt.month_id=5 GROUP BY dr.region) AS target
WHERE yday.region=mtd.region AND mtd.region=target.region
)AS foo;

outputted results as:

"Central";144034;2594109;100394040
"East";144344;2591332;101193660
"North";144910;2591508;100558860
"South";143733;2588980;100303770
"Total";720490;12960302;502879290
"West";143469;2594373;100428960

while he wanted “Total" to be at the end.

so, how to you do it?

there are at least 3 possible choices:

  1. change “union" to “union all" – union has the feature that is removes duplicate rows. to do so, it has to sort source resultsets – thus killing order. union all doesn't sort as it doesn't remove duplicates – so order will be “as expected"
  2. you can notice that third column is some kind of summary, so adding “order by 3 asc" would effectively put “total" to the end of results because it has the highest value there:)
  3. third way involves a small trick. we can sort the results the way we want using functional order by.

first two options are easy. what about a functional order?

to simplify queries i created a table with data as shown by luckymurali_81:

CREATE TABLE lucky (
    region text,
    yday int4,
    mtd int4,
    target int4
);
copy lucky FROM stdin;
West    143469  2594373 100428960
Central 144034  2594109 100394040
Total   720490  12960302    502879290
South   143733  2588980 100303770
East    144344  2591332 101193660
North   144910  2591508 100558860
\.

now, let's try some “magic":

# SELECT * FROM lucky ORDER BY (CASE WHEN region = 'Total' THEN 1 ELSE 0 END);
 region  |  yday  |   mtd    |  target
---------+--------+----------+-----------
 West    | 143469 |  2594373 | 100428960
 Central | 144034 |  2594109 | 100394040
 South   | 143733 |  2588980 | 100303770
 East    | 144344 |  2591332 | 101193660
 North   | 144910 |  2591508 | 100558860
 Total   | 720490 | 12960302 | 502879290
(6 ROWS)

nice. but can we do it that regions will be sorted by name, but still keep total “down there"?

sure, we will use a nice addition to postgresql “order by .. nulls first/last":

# SELECT * FROM lucky ORDER BY (CASE WHEN region = 'Total' THEN NULL ELSE region END) ASC NULLS LAST;
 region  |  yday  |   mtd    |  target
---------+--------+----------+-----------
 Central | 144034 |  2594109 | 100394040
 East    | 144344 |  2591332 | 101193660
 North   | 144910 |  2591508 | 100558860
 South   | 143733 |  2588980 | 100303770
 West    | 143469 |  2594373 | 100428960
 Total   | 720490 | 12960302 | 502879290
(6 ROWS)

of course, this kind of functional order can be used in much more complicated scenarios. but this shows it's usability in quite simple way 🙂

8 thoughts on “postgresql tips & tricks”

  1. luckymurali_81 suggested another method:
    13:29 depesz, thanks
    13:30 despesz one more method
    13:31 select * from (youroriginalquery) as foo order by (region=’Total’),region

  2. ouch – one bad news – order by “nulls first/last” is 8.3 only.
    i’m sorry for inconvenience – i’m using 8.3 on my lap, and didn’t check previous versions.
    so, on 8.2 the best way seems to be luckymurali_81 way with order by (condition), field;

  3. I actually do the sorting by adding an index column, let’s say:

    select ‘W’,10 union select ‘A’,20 union select ‘T’,10+20

    I change to:

    select col1,col2 from (select ‘W’ as col1,10 as col2,1 as idx union select ‘A’,20,1 union select ‘T’,10+20,2) as t order by idx, col1

Comments are closed.