On 2nd of April 2020, Fujii Masao committed patch:
Allow pg_stat_statements to track planning statistics.
This commit makes pg_stat_statements support new GUC
pg_stat_statements.track_planning. If this option is enabled,
pg_stat_statements tracks the planning statistics of the statements,
e.g., the number of times the statement was planned, the total time
spent planning the statement, etc. This feature is useful to check
the statements that it takes a long time to plan. Previously since
pg_stat_statements tracked only the execution statistics, we could
not use that for the purpose.
The planning and execution statistics are stored at the end of
each phase separately. So there are not always one-to-one relationship
between them. For example, if the statement is successfully planned
but fails in the execution phase, only its planning statistics are stored.
This may cause the users to be able to see different pg_stat_statements
results from the previous version. To avoid this,
pg_stat_statements.track_planning needs to be disabled.
This commit bumps the version of pg_stat_statements to 1.8
since it changes the definition of pg_stat_statements function.
Author: Julien Rouhaud, Pascal Legrand, Thomas Munro, Fujii Masao
Reviewed-by: Sergei Kornilov, Tomas Vondra, Yoshikazu Imai, Haribabu Kommi, Tom Lane
Continue reading Waiting for PostgreSQL 13 – Allow pg_stat_statements to track planning statistics.
Recently, on irc, there were couple of cases where someone wanted to use uuid as datatype for their primary key.
I opposed, and tried to explain, but IRC doesn't really allow for longer texts, so figured I'll write a blogpost.
Continue reading Why I'm not fan of uuid datatype
On 27th of July 2019, Michael Paquier committed patch:
Add support for --jobs in reindexdb
When doing a schema-level or a database-level operation, a list of
relations to build is created which gets processed in parallel using
multiple connections, based on the recent refactoring for parallel slots
in src/bin/scripts/. System catalogs are processed first in a
serialized fashion to prevent deadlocks, followed by the rest done in
This new option is not compatible with --system as reindexing system
catalogs in parallel can lead to deadlocks, and with --index as there is
no conflict handling for indexes rebuilt in parallel depending in the
Author: Julien Rouhaud
Continue reading Waiting for PostgreSQL 13 – Add support for –jobs in reindexdb
On 4th of April 2019, Robert Haas committed patch:
Allow VACUUM to be run with index cleanup disabled.
This commit adds a new reloption, vacuum_index_cleanup, which
controls whether index cleanup is performed for a particular
relation by default. It also adds a new option to the VACUUM
command, INDEX_CLEANUP, which can be used to override the
reloption. If neither the reloption nor the VACUUM option is
used, the default is true, as before.
Masahiko Sawada, reviewed and tested by Nathan Bossart, Alvaro
Herrera, Kyotaro Horiguchi, Darafei Praliaskouski, and me.
The wording of the documentation is mostly due to me.
Continue reading Waiting for PostgreSQL 12 – Allow VACUUM to be run with index cleanup disabled.
On 30th of March 2019, Peter Eisentraut committed patch:
This is an SQL-standard feature that allows creating columns that are
computed from expressions rather than assigned, similar to a view or
materialized view but on a column basis.
This implements one kind of generated column: stored (computed on
write). Another kind, virtual (computed on read), is planned for the
future, and some room is left for it.
Continue reading Waiting for PostgreSQL 12 – Generated columns
On 16th of February 2019, Tom Lane committed patch:
Allow user control of CTE materialization, and change the default behavior.
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it). This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.
Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query. Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.
Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free. By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED. Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.
Andreas Karlsson, Andrew Gierth, David Fetter
Continue reading Waiting for PostgreSQL 12 – Allow user control of CTE materialization, and change the default behavior.
On 29th of November 2018, Alvaro Herrera committed patch:
Add log_statement_sample_rate parameter
This allows to set a lower log_min_duration_statement value without
incurring excessive log traffic (which reduces performance). This can
be useful to analyze workloads with lots of short queries.
Author: Adrien Nayrat
Continue reading Waiting for PostgreSQL 12 – Add log_statement_sample_rate parameter
On 1st of August 2018, Peter Eisentraut committed patch:
Allow multi-inserts during COPY into a partitioned table
CopyFrom allows multi-inserts to be used for non-partitioned tables, but
this was disabled for partitioned tables. The reason for this appeared
to be that the tuple may not belong to the same partition as the
previous tuple did. Not allowing multi-inserts here greatly slowed down
imports into partitioned tables. These could take twice as long as a
copy to an equivalent non-partitioned table. It seems wise to do
something about this, so this change allows the multi-inserts by
flushing the so-far inserted tuples to the partition when the next tuple
does not belong to the same partition, or when the buffer fills. This
improves performance when the next tuple in the stream commonly belongs
to the same partition as the previous tuple.
In cases where the target partition changes on every tuple, using
multi-inserts slightly slows the performance. To get around this we
track the average size of the batches that have been inserted and
adaptively enable or disable multi-inserts based on the size of the
batch. Some testing was done and the regression only seems to exist
when the average size of the insert batch is close to 1, so let's just
enable multi-inserts when the average size is at least 1.3. More
performance testing might reveal a better number for, this, but since
the slowdown was only 1-2% it does not seem critical enough to spend too
much time calculating it. In any case it may depend on other factors
rather than just the size of the batch.
Allowing multi-inserts for partitions required a bit of work around the
per-tuple memory contexts as we must flush the tuples when the next
tuple does not belong the same partition. In which case there is no
good time to reset the per-tuple context, as we've already built the new
tuple by this time. In order to work around this we maintain two
per-tuple contexts and just switch between them every time the partition
changes and reset the old one. This does mean that the first of each
batch of tuples is not allocated in the same memory context as the
others, but that does not matter since we only reset the context once
the previous batch has been inserted.
Author: David Rowley <firstname.lastname@example.org>
Continue reading Waiting for PostgreSQL 12 – Allow multi-inserts during COPY into a partitioned table
On 7th of April 2018, Alvaro Herrera committed patch:
Support partition pruning at execution time
Existing partition pruning is only able to work at plan time, for query
quals that appear in the parsed query. This is good but limiting, as
there can be parameters that appear later that can be usefully used to
further prune partitions.
This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
1. Parameterized Nested Loop Joins: The parameter from the outer side of the
join can be used to determine the minimum set of inner side partitions to
2. Initplans: Once an initplan has been executed we can then determine which
partitions match the value from the initplan.
Partition pruning is performed in two ways. When Params external to the plan
are found to match the partition key we attempt to prune away unneeded Append
subplans during the initialization of the executor. This allows us to bypass
the initialization of non-matching subplans meaning they won't appear in the
EXPLAIN or EXPLAIN ANALYZE output.
For parameters whose value is only known during the actual execution
then the pruning of these subplans must wait. Subplans which are
eliminated during this stage of pruning are still visible in the EXPLAIN
output. In order to determine if pruning has actually taken place, the
EXPLAIN ANALYZE must be viewed. If a certain Append subplan was never
executed due to the elimination of the partition then the execution
timing area will state "(never executed)". Whereas, if, for example in
the case of parameterized nested loops, the number of loops stated in
the EXPLAIN ANALYZE output for certain subplans may appear lower than
others due to the subplan having been scanned fewer times. This is due
to the list of matching subnodes having to be evaluated whenever a
parameter which was found to match the partition key changes.
This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exist in node types such as
Append, MergeAppend and ModifyTable. This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.
Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Continue reading Waiting for PostgreSQL 11 – Support partition pruning at execution time
On 28th of March 2018, Andrew Dunstan committed patch:
Fast ALTER TABLE ADD COLUMN with a non-NULL default
Currently adding a column to a table with a non-NULL default results in
a rewrite of the table. For large tables this can be both expensive and
disruptive. This patch removes the need for the rewrite as long as the
default value is not volatile. The default expression is evaluated at
the time of the ALTER TABLE and the result stored in a new column
(attmissingval) in pg_attribute, and a new column (atthasmissing) is set
to true. Any existing row when fetched will be supplied with the
attmissingval. New rows will have the supplied value or the default and
so will never need the attmissingval.
Any time the table is rewritten all the atthasmissing and attmissingval
settings for the attributes are cleared, as they are no longer needed.
The most visible code change from this is in heap_attisnull, which
acquires a third TupleDesc argument, allowing it to detect a missing
value if there is one. In many cases where it is known that there will
not be any (e.g. catalog relations) NULL can be passed for this
Andrew Dunstan, heavily modified from an original patch from Serge
Reviewed by Tom Lane, Andres Freund, Tomas Vondra and David Rowley.
Continue reading Waiting for PostgreSQL 11 – Fast ALTER TABLE ADD COLUMN with a non-NULL default