Waiting for 9.5 – Support GROUPING SETS, CUBE and ROLLUP.

On 16th of May, Andres Freund committed patch:

Support GROUPING SETS, CUBE and ROLLUP.
 
This SQL standard functionality allows to aggregate data by different
GROUP BY clauses at once. Each grouping set returns rows with columns
grouped by in other sets set to NULL.
 
This could previously be achieved by doing each grouping as a separate
query, conjoined by UNION ALLs. Besides being considerably more concise,
grouping sets will in many cases be faster, requiring only one scan over
the underlying data.
 
The current implementation of grouping sets only supports using sorting
for input. Individual sets that share a sort order are computed in one
pass. If there are sets that don't share a sort order, additional sort &
aggregation steps are performed. These additional passes are sourced by
the previous sort step; thus avoiding repeated scans of the source data.
 
The code is structured in a way that adding support for purely using
hash aggregation or a mix of hashing and sorting is possible. Sorting
was chosen to be supported first, as it is the most generic method of
implementation.
 
Instead of, as in an earlier versions of the patch, representing the
chain of sort and aggregation steps as full blown planner and executor
nodes, all but the first sort are performed inside the aggregation node
itself. This avoids the need to do some unusual gymnastics to handle
having to return aggregated and non-aggregated tuples from underlying
nodes, as well as having to shut down underlying nodes early to limit
memory usage.  The optimizer still builds Sort/Agg node to describe each
phase, but they're not part of the plan tree, but instead additional
data for the aggregation node. They're a convenient and preexisting way
to describe aggregation and sorting.  The first (and possibly only) sort
step is still performed as a separate execution step. That retains
similarity with existing group by plans, makes rescans fairly simple,
avoids very deep plans (leading to slow explains) and easily allows to
avoid the sorting step if the underlying data is sorted by other means.
 
A somewhat ugly side of this patch is having to deal with a grammar
ambiguity between the new CUBE keyword and the cube extension/functions
named cube (and rollup). To avoid breaking existing deployments of the
cube extension it has not been renamed, neither has cube been made a
reserved keyword. Instead precedence hacking is used to make GROUP BY
cube(..) refer to the CUBE grouping sets feature, and not the function
cube(). To actually group by a function cube(), unlikely as that might
be, the function name has to be quoted.
 
Needs a catversion bump because stored rules may change.
 
Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas
    Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com

Continue reading Waiting for 9.5 – Support GROUPING SETS, CUBE and ROLLUP.

Waiting for 9.5 – TABLESAMPLE, SQL Standard and extensible

On 15th of May, Simon Riggs committed patch:

TABLESAMPLE, SQL Standard and extensible
 
Add a TABLESAMPLE clause to SELECT statements that allows
user to specify random BERNOULLI sampling or block level
SYSTEM sampling. Implementation allows for extensible
sampling functions to be written, using a standard API.
Basic version follows SQLStandard exactly. Usable
concrete use cases for the sampling API follow in later
commits.
 
Petr Jelinek
 
Reviewed by Michael Paquier and Simon Riggs

Continue reading Waiting for 9.5 – TABLESAMPLE, SQL Standard and extensible

Waiting for 9.5 – Add pg_audit, an auditing extension

UPDATE

Apparently it's not going to happen.

I'm leaving the post here anyway, as I hope the code will resurface as extension, on PGXN maybe?


On 14th of May, Stephen Frost committed patch:

Add pg_audit, an auditing extension
 
This extension provides detailed logging classes, ability to control
logging at a per-object level, and includes fully-qualified object
names for logged statements (DML and DDL) in independent fields of the
log output.
 
Authors: Ian Barwick, Abhijit Menon-Sen, David Steele
Reviews by: Robert Haas, Tatsuo Ishii, Sawada Masahiko, Fujii Masao,
Simon Riggs
 
Discussion with: Josh Berkus, Jaime Casanova, Peter Eisentraut,
David Fetter, Yeb Havinga, Alvaro Herrera, Petr Jelinek, Tom Lane,
MauMau, Bruce Momjian, Jim Nasby, Michael Paquier,
Fabrízio de Royes Mello, Neil Tiffin

Continue reading Waiting for 9.5 – Add pg_audit, an auditing extension

Waiting for 9.5 – Additional functions and operators for jsonb

On 12th of May, Andrew Dunstan committed patch:

Additional functions and operators for jsonb
 
jsonb_pretty(jsonb) produces nicely indented json output.
jsonb || jsonb concatenates two jsonb values.
jsonb - text removes a key and its associated value from the json
jsonb - int removes the designated array element
jsonb - text[] removes a key and associated value or array element at
the designated path
jsonb_replace(jsonb,text[],jsonb) replaces the array element designated
by the path or the value associated with the key designated by the path
with the given value.
 
Original work by Dmitry Dolgov, adapted and reworked for PostgreSQL core
by Andrew Dunstan, reviewed and tidied up by Petr Jelinek.

Continue reading Waiting for 9.5 – Additional functions and operators for jsonb

Waiting for 9.5 – Add support for INSERT … ON CONFLICT DO NOTHING/UPDATE.

On 8th of May, Andres Freund committed patch:

Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
 
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint.  DO NOTHING avoids the
constraint violation, without touching the pre-existing row.  DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed.  The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.
 
This feature is often referred to as upsert.
 
This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert.  If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made.  If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.
 
To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.
 
Bumps catversion as stored rules change.
 
Author: Peter Geoghegan, with significant contributions from Heikki
    Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
    Dean Rasheed, Stephen Frost and many others.

Continue reading Waiting for 9.5 – Add support for INSERT … ON CONFLICT DO NOTHING/UPDATE.

Waiting for 9.5 – Add transforms feature

On 26th of April, Peter Eisentraut committed patch:

Add transforms feature
 
This provides a mechanism for specifying conversions between SQL data
types and procedural languages.  As examples, there are transforms
for hstore and ltree for PL/Perl and PL/Python.
 
reviews by Pavel Stěhule and Andres Freund

Continue reading Waiting for 9.5 – Add transforms feature