Was asked recently about optimization of interesting case. There was table like:
=$ CREATE TABLE input_data (
category_id INT8,
object_id INT8,
interaction_ts timestamptz,
interaction_type TEXT,
interaction_count INT4
);
And there was a code that was grouping it all by sum()ing interaction_count per category, object, interaction_type, and timestamp truncated to hour.
Basically, storing somewhere result of:
=$ select
category_id,
object_id,
date_trunc( 'hour', interaction_ts ) as ts,
sum(interaction_count) filter (where interaction_type = 'a') as a_count,
sum(interaction_count) filter (where interaction_type = 'b') as b_count
from
input_data
group by 1, 2, 3;
While talking about optimizations, one idea that came was to store whole day of counts in single row, as array. So the resulting count table would be:
=$ create table results (
category_id int8,
object_id int8
interaction_day date,
a_counts int4[],
b_counts int4[]
);
Where a_counts, and b_counts would always have 24 elements, one for each hour.
Now, how to roll it up like this?
Continue reading Grouping data into array of sums – fun with custom aggregates