Getting first and last values per group

Every so often someone needs solution to getting first (or couple of first) values for given column. Or last.

For some cases (when there is not many groups) you can use recursive queries. But it's not always the best choice.

Let's try to implement first() and last() aggregates, so these could be easily used by anybody.

Continue reading Getting first and last values per group

Waiting for PostgreSQL 10 – postgres_fdw: Push down aggregates to remote servers.

On 21st of October, Robert Haas committed patch:

postgres_fdw: Push down aggregates to remote servers.
 
Now that the upper planner uses paths, and now that we have proper hooks
to inject paths into the upper planning process, it's possible for
foreign data wrappers to arrange to push aggregates to the remote side
instead of fetching all of the rows and aggregating them locally.  This
figures to be a massive win for performance, so teach postgres_fdw to
do it.
 
Jeevan Chalke and Ashutosh Bapat.  Reviewed by Ashutosh Bapat with
additional testing by Prabhat Sahu.  Various mostly cosmetic changes
by me.

Continue reading Waiting for PostgreSQL 10 – postgres_fdw: Push down aggregates to remote servers.

Getting list of unique elements in table, per group

Today, on irc, someone asked interesting question.

Basically she ran a query like:

SELECT a, b, c, d, e, f FROM TABLE ORDER BY a

then, she processed the query to get, for each a array of unique values of b, c, d, e, and f, and then he inserted it back to database, to some other table.

It was a problem, because the table had many rows (millions I would assume), and the whole process was slow.

So, how to make it faster?

Continue reading Getting list of unique elements in table, per group

Waiting for 9.6 – Support parallel aggregation.

On 21st of March, Robert Haas committed patch:

Support parallel aggregation.
 
Parallel workers can now partially aggregate the data and pass the
transition values back to the leader, which can combine the partial
results to produce the final answer.
 
David Rowley, based on earlier work by Haribabu Kommi.  Reviewed by
Álvaro Herrera, Tomas Vondra, Amit Kapila, James Sewell, and me.

Continue reading Waiting for 9.6 – Support parallel aggregation.

Filling in the blanks

Some time ago someone on irc asked interesting question. One that I couldn't answer then (didn't have an immediate idea, and didn't have time to spend on looking into it).

Now, I have some more time, and despite the fact that the person that had this problem no longer cares about it (he found some solution himself if I recall correctly), decided to look into it.

Continue reading Filling in the blanks

Waiting for 9.4 – Provide moving-aggregate support for a bunch of aggregates.

On 13th of April, Tom Lane committed patch:

Provide moving-aggregate support for a bunch of numerical aggregates.
 
First installment of the promised moving-aggregate support in built-in
aggregates: count(), sum(), avg(), stddev() and variance() for
assorted datatypes, though not for float4/float8.
 
In passing, remove a 2001-vintage kluge in interval_accum(): interval
array elements have been properly aligned since around 2003, but
nobody remembered to take out this workaround.  Also, fix a thinko
in the opr_sanity tests for moving-aggregate catalog entries.
 
David Rowley and Florian Pflug, reviewed by Dean Rasheed

On the same day he also committed:

Provide moving-aggregate support for boolean aggregates.
 
David Rowley and Florian Pflug, reviewed by Dean Rasheed

Continue reading Waiting for 9.4 – Provide moving-aggregate support for a bunch of aggregates.

Waiting for 9.4 – Support ordered-set (WITHIN GROUP) aggregates.

On 23rd of December, Tom Lane committed patch:

Support ordered-set (WITHIN GROUP) aggregates.
 
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()).  We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.
 
Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions.  To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c.  This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need.  There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.
 
In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates.  Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.
 
Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane

Continue reading Waiting for 9.4 – Support ordered-set (WITHIN GROUP) aggregates.

Waiting for 9.4 – Implement the FILTER clause for aggregate function calls.

On 17th of July, Noah Misch committed patch:

Implement the FILTER clause for aggregate function calls.
 
This is SQL-standard with a few extensions, namely support for
subqueries and outer references in clause expressions.
 
catversion bump due to change in Aggref and WindowFunc.
 
David Fetter, reviewed by Dean Rasheed.

Continue reading Waiting for 9.4 – Implement the FILTER clause for aggregate function calls.

Explaining the unexplainable – part 3

In previous post in the series I wrote about how to interpret single line in explain analyze output, it's structure, and later on described all basic data-getting operations (nodes in explain tree).

Today, we'll move towards more complicated operations.

Continue reading Explaining the unexplainable – part 3