Recently I noticed that more and more of the cases I deal with could use some partitioning. And while most people know about it in theory, it's definitely not a well-understood feature, and sometimes people are outright scared of it.
So, I'll try to explain, to the best of my knowledge, what it is, why one would want to use it, and how to actually make it happen.
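To give a rough idea of what it looks like in practice, here is a minimal sketch of inheritance-based partitioning (table and column names are made up for illustration):

CREATE TABLE measurements (logdate date NOT NULL, value int4);

CREATE TABLE measurements_2015_06 (
    CHECK ( logdate >= '2015-06-01' AND logdate < '2015-07-01' )
) INHERITS (measurements);

CREATE TABLE measurements_2015_07 (
    CHECK ( logdate >= '2015-07-01' AND logdate < '2015-08-01' )
) INHERITS (measurements);

Rows get routed to the right child table either by a trigger on the parent or by inserting directly into the partitions, and with constraint_exclusion enabled the planner can skip partitions whose CHECK constraints rule them out. More on the details later.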
Some time ago someone on IRC asked an interesting question. One that I couldn't answer at the time (I didn't have an immediate idea, and didn't have time to spend looking into it).
Now I have some more time, and even though the person who had this problem no longer cares about it (he found some solution himself, if I recall correctly), I decided to look into it.
On 1st of June, Andrew Dunstan committed patch:
Rename jsonb_replace to jsonb_set and allow it to add new values

The function is given a fourth parameter, which defaults to true. When this parameter is true, if the last element of the path is missing in the original json, jsonb_set creates it in the result and assigns it the new value. If it is false then the function does nothing unless all elements of the path are present, including the last.

Based on some original code from Dmitry Dolgov, heavily modified by me.

Catalog version bumped.
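A quick illustration of what the fourth parameter does (the values are mine, not from the commit):

SELECT jsonb_set('{"a": 1}', '{b}', '2', true);
-- {"a": 1, "b": 2}  – the missing key "b" gets created

SELECT jsonb_set('{"a": 1}', '{b}', '2', false);
-- {"a": 1}  – last path element is missing, so nothing changes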
Some time ago Karl Bartel asked me to add the ability to parse plans that were run with "ANALYZE ON, TIMING OFF". Initially I didn't see the point, but he explained that explain.depesz.com lets him hide parts of the tree, and that the other columns (aside from actual time) are extracted and presented in a more readable way.
OK, I got his point, but I was busy. Today I finally committed the change:
Now, plans made using ANALYZE without TIMING work nicely. In the process, I also fixed the display of nodes that were never executed.
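For reference, such plans are generated like this (the query itself is just an example):

EXPLAIN (ANALYZE ON, TIMING OFF) SELECT count(*) FROM some_table;

With TIMING OFF you still get actual row and loop counts per node, just not the per-node actual time, which cuts down the timing overhead when gathering the plan.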
And now time for some bragging, a.k.a. statistics:
On 16th of May, Andres Freund committed patch:
Support GROUPING SETS, CUBE and ROLLUP.

This SQL standard functionality allows to aggregate data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL.

This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data.

The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step; thus avoiding repeated scans of the source data.

The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation.

Instead of, as in an earlier versions of the patch, representing the chain of sort and aggregation steps as full blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg node to describe each phase, but they're not part of the plan tree, but instead additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and easily allows to avoid the sorting step if the underlying data is sorted by other means.

A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted.

Needs a catversion bump because stored rules may change.

Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund
Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule
Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
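As a quick example (table and columns are made up), this single query:

SELECT region, product, sum(amount)
FROM sales
GROUP BY GROUPING SETS ((region), (product), ());

returns the same rows as three separate GROUP BY queries (per region, per product, and a grand total) glued together with UNION ALL, with the columns that are not part of a given set returned as NULL. GROUP BY ROLLUP (region, product) and GROUP BY CUBE (region, product) are shorthands for commonly useful combinations of grouping sets.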
On 15th of May, Simon Riggs committed patch:
TABLESAMPLE, SQL Standard and extensible

Add a TABLESAMPLE clause to SELECT statements that allows user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows SQLStandard exactly. Usable concrete use cases for the sampling API follow in later commits.

Petr Jelinek

Reviewed by Michael Paquier and Simon Riggs
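Usage looks like this (table name and percentage are just examples):

SELECT * FROM big_table TABLESAMPLE SYSTEM (1);
SELECT * FROM big_table TABLESAMPLE BERNOULLI (1);

SYSTEM picks whole random blocks (fast, but the sample can be clumpy), while BERNOULLI considers every row individually (slower, but more evenly distributed); the number is the approximate percentage of rows to return.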
Apparently it's not going to happen.
I'm leaving the post here anyway, as I hope the code will resurface as an extension, on PGXN maybe?
On 14th of May, Stephen Frost committed patch:
Add pg_audit, an auditing extension

This extension provides detailed logging classes, ability to control logging at a per-object level, and includes fully-qualified object names for logged statements (DML and DDL) in independent fields of the log output.

Authors: Ian Barwick, Abhijit Menon-Sen, David Steele
Reviews by: Robert Haas, Tatsuo Ishii, Sawada Masahiko, Fujii Masao, Simon Riggs
Discussion with: Josh Berkus, Jaime Casanova, Peter Eisentraut, David Fetter, Yeb Havinga, Alvaro Herrera, Petr Jelinek, Tom Lane, MauMau, Bruce Momjian, Jim Nasby, Michael Paquier, Fabrízio de Royes Mello, Neil Tiffin
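As far as I can tell, configuration was meant to look roughly like this; the setting names below are my assumption, based on the externally maintained pgaudit extension, so treat this purely as a sketch:

# postgresql.conf – hypothetical sketch, not taken from the commit
shared_preload_libraries = 'pg_audit'
pg_audit.log = 'ddl, write'      # which classes of statements get logged
pg_audit.log_relation = on       # log each relation referenced by a statement

The point being: instead of parsing log_statement output, you get structured audit lines with the statement class and fully-qualified object names in separate fields.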
On 12th of May, Andrew Dunstan committed patch:
Additional functions and operators for jsonb

jsonb_pretty(jsonb) produces nicely indented json output.
jsonb || jsonb concatenates two jsonb values.
jsonb - text removes a key and its associated value from the json
jsonb - int removes the designated array element
jsonb - text[] removes a key and associated value or array element at the designated path
jsonb_replace(jsonb,text[],jsonb) replaces the array element designated by the path or the value associated with the key designated by the path with the given value.

Original work by Dmitry Dolgov, adapted and reworked for PostgreSQL core by Andrew Dunstan, reviewed and tidied up by Petr Jelinek.
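A few quick examples of how these behave (the values are made up by me):

SELECT jsonb_pretty('{"a": [1, 2, 3]}');
SELECT '{"a": 1}'::jsonb || '{"b": 2}'::jsonb;   -- {"a": 1, "b": 2}
SELECT '{"a": 1, "b": 2}'::jsonb - 'a';          -- {"b": 2}
SELECT '[1, 2, 3]'::jsonb - 1;                   -- [1, 3]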