Converting list of integers into list of ranges

Yesterday someone on irc asked:

i've a query that returns sequential numbers with gaps (generate_series + join) and my question is: can is somehow construct ranges out of the returned values? sort of range_agg or something?

There was no further discussion, aside from me saying

sure you can. not trivial task, but possible.
you'd need window functions.

but it got me thinking …

Continue reading Converting list of integers into list of ranges

Waiting for PostgreSQL 11 – Support all SQL:2011 options for window frame clauses.

On 7th of February 2018, Tom Lane committed patch:

Support all SQL:2011 options for window frame clauses.
 
 
This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING"
frame boundaries in window functions.  We'd punted on that back in the
original patch to add window functions, because it was not clear how to
do it in a reasonably data-type-extensible fashion.  That problem is
resolved here by adding the ability for btree operator classes to provide
an "in_range" support function that defines how to add or subtract the
RANGE offset value.  Factoring it this way also allows the operator class
to avoid overflow problems near the ends of the datatype's range, if it
wishes to expend effort on that.  (In the committed patch, the integer
opclasses handle that issue, but it did not seem worth the trouble to
avoid overflow failures for datetime types.)
 
The patch includes in_range support for the integer_ops opfamily
(int2/int4/int8) as well as the standard datetime types.  Support for
other numeric types has been requested, but that seems like suitable
material for a follow-on patch.
 
In addition, the patch adds GROUPS mode which counts the offset in
ORDER-BY peer groups rather than rows, and it adds the frame_exclusion
options specified by SQL:2011.  As far as I can see, we are now fully
up to spec on window framing options.
 
Existing behaviors remain unchanged, except that I changed the errcode
for a couple of existing error reports to meet the SQL spec's expectation
that negative "offset" values should be reported as SQLSTATE 22013.
 
Internally and in relevant parts of the documentation, we now consistently
use the terminology "offset PRECEDING/FOLLOWING" rather than "value
PRECEDING/FOLLOWING", since the term "value" is confusingly vague.
 
Oliver Ford, reviewed and whacked around some by me
 
Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com

Continue reading Waiting for PostgreSQL 11 – Support all SQL:2011 options for window frame clauses.

Grouping data by timestamp and interval

Someone asked today on irc about grouping data, that contains timestamps, into “partitions".

Usually when someone wants something like this, you can do grouping by date_trunc(), but this time, this person, wanted to group data that all timestamps are within given interval from each other.

I'm not sure I understood him/her right, but I think he/she wanted something like this:

Continue reading Grouping data by timestamp and interval

Filling in the blanks

Some time ago someone on irc asked interesting question. One that I couldn't answer then (didn't have an immediate idea, and didn't have time to spend on looking into it).

Now, I have some more time, and despite the fact that the person that had this problem no longer cares about it (he found some solution himself if I recall correctly), decided to look into it.

Continue reading Filling in the blanks

Waiting for 9.4 – Provide moving-aggregate support for a bunch of aggregates.

On 13th of April, Tom Lane committed patch:

Provide moving-aggregate support for a bunch of numerical aggregates.
 
First installment of the promised moving-aggregate support in built-in
aggregates: count(), sum(), avg(), stddev() and variance() for
assorted datatypes, though not for float4/float8.
 
In passing, remove a 2001-vintage kluge in interval_accum(): interval
array elements have been properly aligned since around 2003, but
nobody remembered to take out this workaround.  Also, fix a thinko
in the opr_sanity tests for moving-aggregate catalog entries.
 
David Rowley and Florian Pflug, reviewed by Dean Rasheed

On the same day he also committed:

Provide moving-aggregate support for boolean aggregates.
 
David Rowley and Florian Pflug, reviewed by Dean Rasheed

Continue reading Waiting for 9.4 – Provide moving-aggregate support for a bunch of aggregates.

Window, window on the wall …

And maybe not on the wall, but instead in your SQLz, eating your data.

But a bit more seriously. Ever since PostgreSQL 8.4 we have window functions, but still I see people which do not know it or are wary to use it.

That's why I decided to write a piece on window functions. How they work and what they can be used for.

Continue reading Window, window on the wall …

Getting top-N rows per group

Yesterday on irc someone asked:

Hi, how do I get top 5 values from a column group by another column??

From further discussion, I learned that:

total rows in table is 2 million. It'll have unique words of less than 1 million.. (approx count)

I didn't have time yesterday, but decided to write a solution, or two, to the problem.

Continue reading Getting top-N rows per group

Filling the gaps with window functions

Couple of days ago I had a problem that I couldn't solve after ~ 2 hours, and decided to ask on IRC. Almost immediately after asking, I figured out the solution, but David asked me to write about the solution, even though it's now (for me) completely obvious.

The problem was like this:

I had two tables, with very simple structure: event_when timestamptz, event_count int4, and wanted to show it as a single recordset with columns: event_when, event_count_a, event_count_b, but the problem was that event_when usually didn't match. Here is an example:

Continue reading Filling the gaps with window functions

How to group messages into chats?

My jabber server had the feature, that it logs all messages that got sent through it.

This is pretty cool, and useful. And now, i got asked to use it to create list of conversations.

What exactly is this? Whenever I send (or receive) something there is record in database with information about which local user, communication type (send/recv), correspondent, when it happened, and what is the body of message.

And based on this, we want to list messages into chats. How?

Continue reading How to group messages into chats?