Waiting for PostgreSQL 14 – Add date_bin function

On 24th of March 2021, Peter Eisentraut committed patch:

Add date_bin function
 
Similar to date_trunc, but allows binning by an arbitrary interval
rather than just full units.
 
Author: John Naylor <john.naylor@enterprisedb.com>
Reviewed-by: David Fetter <david@fetter.org>
Reviewed-by: Isaac Morland <isaac.morland@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Artur Zakirov <zaartur@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACPNZCt4buQFRgy6DyjuZS-2aPDpccRkrJBmgUfwYc1KiaXYxg@mail.gmail.com

This is pretty interesting.

First some background. We have date_trunc function which does:

=$ SELECT 'untruncated' AS spec, now()
UNION ALL
SELECT spec, date_trunc(spec, now())
FROM
    unnest('{microseconds,milliseconds,second,minute,hour,day,week,month,quarter,year,decade,century,millennium}'::text[]) AS u(spec);
     spec     │              now
──────────────┼───────────────────────────────
 untruncated  │ 2021-03-31 20:27:25.923338+02
 microseconds │ 2021-03-31 20:27:25.923338+02
 milliseconds │ 2021-03-31 20:27:25.923+02
 SECOND2021-03-31 20:27:25+02
 MINUTE2021-03-31 20:27:00+02
 HOUR2021-03-31 20:00:00+02
 DAY2021-03-31 00:00:00+02
 week         │ 2021-03-29 00:00:00+02
 MONTH2021-03-01 00:00:00+01
 quarter      │ 2021-01-01 00:00:00+01
 YEAR2021-01-01 00:00:00+01
 decade       │ 2020-01-01 00:00:00+01
 century      │ 2001-01-01 00:00:00+01
 millennium   │ 2001-01-01 00:00:00+01
(14 ROWS)

Nice. So, this new function, date_bin, but it takes (almost) any interval as base for truncation.

Almost as it can't take any interval with units of months or more (due to varying duration).

Let's see:

=$ SELECT date_bin('5 minutes', now());
ERROR:  FUNCTION date_bin(UNKNOWN, TIMESTAMP WITH TIME zone) does NOT exist
LINE 1: SELECT date_bin('5 minutes', now());
               ^
HINT:  No FUNCTION matches the given name AND argument types. You might need TO ADD explicit TYPE casts.

hmm … careful reason suggests that there is second argument – basically what is base for the bin calculation. So, to get the same results, I'd need to use epoch start as base:

=$ SELECT date_bin('5 minutes', now(), '1970-01-01');
        date_bin
------------------------
 2021-03-31 20:35:00+02
(1 ROW)

Nice. Of course, I can use any length that can be expressed as interval, and any starting point:

=$ SELECT date_bin('17 minutes 31 seconds'::INTERVAL, now(), '2000-01-01');
        date_bin
------------------------
 2021-03-31 20:23:16+02
(1 ROW)

This is pretty cool.

Of course I don't expect many people needing to group their data in weird durations, like 17 minutes and 31 seconds, but 5/15 minutes can be pretty easy to imagine to be helpful.

Thanks a lot, everyone 🙂

8 thoughts on “Waiting for PostgreSQL 14 – Add date_bin function”

  1. Hi Depesz,

    can you just point out what u(spec) in your first query is doing?
    never has seen that…

    thanks karsten

  2. Hi karsten,

    I wondered as well, never saw it before either.
    I did not find an explanation in the SELECT docs for this, but it is in the docs as well here: https://www.postgresql.org/docs/current/functions-srf.html (search for “AS s(a)”).
    It does not seem to matter if you put s(spec), xx(spec) or just spec, though. So it’s just a different way to specify an ALIAS name.

    Best regards
    Salek

  3. Thanks Salek, with you hint and a second look its clear to me! thx

  4. Hi @karsten

    I think that ‘u’ is the alias name and ‘spec’ is the column name in the resultset. Greetings

  5. I needed this some time ago and ended up with more or less the same function in SQL. It still may be used in pre-14 Postgresql versions.

    CREATE OR REPLACE FUNCTION date_trunc
      (trunc_period INTERVAL, ts timestamptz, base_ts timestamptz DEFAULT '1970-01-01Z')
      RETURNS timestamptz LANGUAGE SQL immutable AS  $function$
    SELECT
      base_ts
      + FLOOR(EXTRACT(epoch FROM ts - base_ts) / EXTRACT(epoch FROM trunc_period))::BIGINT
      * trunc_period;
    $function$;

Comments are closed.