Waiting for PostgreSQL 14 – Allow configurable LZ4 TOAST compression.

On 19th of March 2021, Robert Haas committed patch:

Allow configurable LZ4 TOAST compression.
 
There is now a per-column COMPRESSION option which can be set to pglz
(the default, and the only option up until now) or lz4. Or, if you
like, you can set the new default_toast_compression GUC to lz4, and
then that will be the default for new table columns for which no value
is specified. We don't have lz4 support in the PostgreSQL code, so
to use lz4 compression, PostgreSQL must be built --with-lz4.
 
In general, TOAST compression means compression of individual column
values, not the whole tuple, and those values can either be compressed
inline within the tuple or compressed and then stored externally in
the TOAST table, so those properties also apply to this feature.
 
Prior to this commit, a TOAST pointer has two unused bits as part of
the va_extsize field, and a compressed datum has two unused bits as
part of the va_rawsize field. These bits are unused because the length
of a varlena is limited to 1GB; we now use them to indicate the
compression type that was used. This means we only have bit space for
2 more built-in compression types, but we could work around that
problem, if necessary, by introducing a new vartag_external value for
any further types we end up wanting to add. Hopefully, it won't be
too important to offer a wide selection of algorithms here, since
each one we add not only takes more coding but also adds a build
dependency for every packager. Nevertheless, it seems worth doing
at least this much, because LZ4 gets better compression than PGLZ
with less CPU usage.
 
It's possible for LZ4-compressed datums to leak into composite type
values stored on disk, just as it is for PGLZ. It's also possible for
LZ4-compressed attributes to be copied into a different table via SQL
commands such as CREATE TABLE AS or INSERT .. SELECT.  It would be
expensive to force such values to be decompressed, so PostgreSQL has
never done so. For the same reasons, we also don't force recompression
of already-compressed values even if the target table prefers a
different compression method than was used for the source data.  These
architectural decisions are perhaps arguable but revisiting them is
well beyond the scope of what seemed possible to do as part of this
project.  However, it's relatively cheap to recompress as part of
VACUUM FULL or CLUSTER, so this commit adjusts those commands to do
so, if the configured compression method of the table happens not to
match what was used for some column value stored therein.
 
Dilip Kumar. The original patches on which this work was based were
written by Ildus Kurbangaliev, and those patches were based on
even earlier work by Nikita Glukhov, but the design has since changed
very substantially, since allowing a potentially large number of
compression methods that could be added and dropped on a running
system proved too problematic given some of the architectural issues
mentioned above; the choice of which specific compression method to
add first is now different; and a lot of the code has been heavily
refactored.  More recently, Justin Pryzby helped quite a bit with
testing and reviewing and this version also includes some code
contributions from him. Other design input and review from Tomas
Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander
Korotkov, and me.
 
Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain
Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com
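
To try it out on a PostgreSQL 14 build configured --with-lz4, something like this minimal sketch should work (table and column names are made up):

CREATE TABLE compression_test (
    id      serial PRIMARY KEY,
    payload text COMPRESSION lz4    -- per-column setting
);

-- Alternatively, make lz4 the default for new columns:
SET default_toast_compression = 'lz4';

-- Check which method a stored value actually uses:
INSERT INTO compression_test (payload)
    SELECT repeat('some compressible text ', 1000);
SELECT pg_column_compression(payload) FROM compression_test;

-- Per the commit message, VACUUM FULL recompresses values whose
-- method does not match the current column setting:
ALTER TABLE compression_test ALTER COLUMN payload SET COMPRESSION pglz;
VACUUM FULL compression_test;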

Continue reading Waiting for PostgreSQL 14 – Allow configurable LZ4 TOAST compression.

Waiting for PostgreSQL 14 – Improvements for handling large number of connections

There is a thread on the pgsql-hackers mailing list, dating back to 1st of March 2020, about changes to Pg to optimize handling of larger numbers of connections.

The thread is pretty long, and the discussion took 5 months, but lately we got a couple of commits:

  1. snapshot scalability: Don't compute global horizons while building snapshots.
  2. snapshot scalability: Move PGXACT->xmin back to PGPROC.
  3. snapshot scalability: Move PGXACT->vacuumFlags to ProcGlobal->vacuumFlags.
  4. snapshot scalability: Move subxact info to ProcGlobal, remove PGXACT.
  5. snapshot scalability: Introduce dense array of in-progress xids.
  6. snapshot scalability: cache snapshots using a xact completion counter.
  7. Fix race condition in snapshot caching when 2PC is used.

that (supposedly) should improve handling of large numbers of concurrent connections.

Unfortunately, I don't have test servers ready that would allow me to sensibly test thousands of connections, but I think we can all safely assume that PostgreSQL 14 should handle large connection counts better than any PostgreSQL before. Thanks a lot to all involved, but especially to the mastermind of this project: Andres Freund.

Waiting for PostgreSQL 13 – Allow pg_stat_statements to track planning statistics.

On 2nd of April 2020, Fujii Masao committed patch:

Allow pg_stat_statements to track planning statistics.
 
This commit makes pg_stat_statements support new GUC
pg_stat_statements.track_planning. If this option is enabled,
pg_stat_statements tracks the planning statistics of the statements,
e.g., the number of times the statement was planned, the total time
spent planning the statement, etc. This feature is useful for
identifying statements that take a long time to plan. Previously,
since pg_stat_statements tracked only execution statistics, we could
not use it for that purpose.
 
The planning and execution statistics are stored at the end of
each phase separately, so there is not always a one-to-one relationship
between them. For example, if the statement is successfully planned
but fails in the execution phase, only its planning statistics are stored.
This means users may see pg_stat_statements results that differ from
those in previous versions. To avoid this,
pg_stat_statements.track_planning needs to be disabled.
 
This commit bumps the version of pg_stat_statements to 1.8
since it changes the definition of the pg_stat_statements function.
 
Author: Julien Rouhaud, Pascal Legrand, Thomas Munro, Fujii Masao
Reviewed-by: Sergei Kornilov, Tomas Vondra, Yoshikazu Imai, Haribabu Kommi, Tom Lane
Discussion: https://postgr.es/m/CAHGQGwFx_=DO-Gu-MfPW3VQ4qC7TfVdH2zHmvZfrGv6fQ3D-Tw@mail.gmail.com
Discussion: https://postgr.es/m/CAEepm=0e59Y_6Q_YXYCTHZkqOc6H2pJ54C_Xe=VFu50Aqqp_sA@mail.gmail.com
Discussion: https://postgr.es/m/DB6PR0301MB21352F6210E3B11934B0DCC790B00@DB6PR0301MB2135.eurprd03.prod.outlook.com
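
A minimal way to try it, assuming pg_stat_statements is already loaded via shared_preload_libraries (the column names below are as of pg_stat_statements 1.8):

-- in postgresql.conf:
--   shared_preload_libraries = 'pg_stat_statements'
--   pg_stat_statements.track_planning = on

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Find the statements that spend the most time in planning:
SELECT query,
       plans,
       total_plan_time,
       calls,
       total_exec_time
FROM pg_stat_statements
ORDER BY total_plan_time DESC
LIMIT 5;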

Continue reading Waiting for PostgreSQL 13 – Allow pg_stat_statements to track planning statistics.

Waiting for PostgreSQL 13 – Add support for --jobs in reindexdb

On 27th of July 2019, Michael Paquier committed patch:

Add support for --jobs in reindexdb
 
When doing a schema-level or a database-level operation, a list of
relations to build is created, which gets processed in parallel using
multiple connections, based on the recent refactoring for parallel slots
in src/bin/scripts/.  System catalogs are processed first in a
serialized fashion to prevent deadlocks, followed by the rest done in
parallel.
 
This new option is not compatible with --system as reindexing system
catalogs in parallel can lead to deadlocks, and with --index as there is
no conflict handling for indexes rebuilt in parallel depending on the
same relation.
 
Author: Julien Rouhaud
 
Discussion: https://postgr.es/m/CAOBaU_YrnH_Jqo46NhaJ7uRBiWWEcS40VNRQxgFbqYo9kApUsg@mail.gmail.com
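
In practice, rebuilding all user indexes of a database over several connections becomes a single command, e.g. reindexdb --jobs=4 mydb (database name and job count here are made up); as the commit message explains, --jobs cannot be combined with --system or --index.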

Continue reading Waiting for PostgreSQL 13 – Add support for --jobs in reindexdb

Waiting for PostgreSQL 12 – Allow VACUUM to be run with index cleanup disabled.

On 4th of April 2019, Robert Haas committed patch:

Allow VACUUM to be run with index cleanup disabled.
 
 
This commit adds a new reloption, vacuum_index_cleanup, which
controls whether index cleanup is performed for a particular
relation by default.  It also adds a new option to the VACUUM
command, INDEX_CLEANUP, which can be used to override the
reloption.  If neither the reloption nor the VACUUM option is
used, the default is true, as before.
 
Masahiko Sawada, reviewed and tested by Nathan Bossart, Alvaro
Herrera, Kyotaro Horiguchi, Darafei Praliaskouski, and me.
The wording of the documentation is mostly due to me.
 
Discussion: http://postgr.es/m/CAD21AoAt5R3DNUZSjOoXDUY=naYPUOuffVsRzuTYMz29yLzQCA@mail.gmail.com
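
On PostgreSQL 12 this can be used roughly like so (table name is made up):

-- Per-relation default, via the reloption:
CREATE TABLE big_table (id bigint) WITH (vacuum_index_cleanup = false);
-- (or later: ALTER TABLE big_table SET (vacuum_index_cleanup = false);)

-- One-off override on the VACUUM command itself:
VACUUM (INDEX_CLEANUP FALSE) big_table;
VACUUM (INDEX_CLEANUP TRUE, VERBOSE) big_table;

Skipping index cleanup makes vacuum faster at the price of leaving dead index entries behind, so it is meant for emergencies such as approaching transaction-ID wraparound, not for routine use.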

Continue reading Waiting for PostgreSQL 12 – Allow VACUUM to be run with index cleanup disabled.

Waiting for PostgreSQL 12 – Generated columns

On 30th of March 2019, Peter Eisentraut committed patch:

Generated columns
 
This is an SQL-standard feature that allows creating columns that are
computed from expressions rather than assigned, similar to a view or
materialized view but on a column basis.
 
This implements one kind of generated column: stored (computed on
write).  Another kind, virtual (computed on read), is planned for the
future, and some room is left for it.
 
Discussion: https://www.postgresql.org/message-id/flat/-4019-bdb1-699e-@2ndquadrant.com
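
A stored generated column looks like this (example table made up; the generation expression must be immutable, and the column cannot be written to directly):

CREATE TABLE people (
    height_cm numeric,
    height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);

INSERT INTO people (height_cm) VALUES (180);
SELECT height_in FROM people;   -- computed on write, ~70.87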

Continue reading Waiting for PostgreSQL 12 – Generated columns

Waiting for PostgreSQL 12 – Allow user control of CTE materialization, and change the default behavior.

On 16th of February 2019, Tom Lane committed patch:

Allow user control of CTE materialization, and change the default behavior.
 
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it).  This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.
 
Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query.  Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.
 
Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free.  By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED.  Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.
 
Andreas Karlsson, Andrew Gierth, David Fetter
 
Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
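
In PostgreSQL 12 syntax this looks like the following (table and column names made up):

CREATE TABLE big_table (key int, val text);

-- New default: a single-use, side-effect-free CTE is inlined,
-- so the outer WHERE clause can be pushed down into it:
WITH w AS (SELECT * FROM big_table)
SELECT * FROM w WHERE key = 123;

-- Force the old optimization-fence behavior:
WITH w AS MATERIALIZED (SELECT * FROM big_table)
SELECT * FROM w WHERE key = 123;

-- Force inlining even when the CTE is referenced more than once:
WITH w AS NOT MATERIALIZED (SELECT * FROM big_table)
SELECT (SELECT count(*) FROM w), (SELECT max(key) FROM w);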

Continue reading Waiting for PostgreSQL 12 – Allow user control of CTE materialization, and change the default behavior.

Waiting for PostgreSQL 12 – Add log_statement_sample_rate parameter

On 29th of November 2018, Alvaro Herrera committed patch:

Add log_statement_sample_rate parameter
 
This allows setting a lower log_min_duration_statement value without
incurring excessive log traffic (which reduces performance).  This can
be useful to analyze workloads with lots of short queries.
 
Author: Adrien Nayrat
 
Discussion: https://postgr.es/m/-ee1e-db9f-fa97-@anayrat.info
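
A sketch of how the two settings combine (values are just examples; both parameters normally go into postgresql.conf, and changing them per session requires superuser):

-- Consider statements running for 10ms or longer...
SET log_min_duration_statement = '10ms';
-- ...but actually log only a random 10% of them:
SET log_statement_sample_rate = 0.1;
-- The default of 1 logs every qualifying statement.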

Continue reading Waiting for PostgreSQL 12 – Add log_statement_sample_rate parameter

Waiting for PostgreSQL 12 – Allow multi-inserts during COPY into a partitioned table

On 1st of August 2018, Peter Eisentraut committed patch:

Allow multi-inserts during COPY into a partitioned table
 
 
CopyFrom allows multi-inserts to be used for non-partitioned tables, but
this was disabled for partitioned tables.  The reason for this appeared
to be that the tuple may not belong to the same partition as the
previous tuple did.  Not allowing multi-inserts here greatly slowed down
imports into partitioned tables.  These could take twice as long as a
copy to an equivalent non-partitioned table.  It seems wise to do
something about this, so this change allows the multi-inserts by
flushing the so-far inserted tuples to the partition when the next tuple
does not belong to the same partition, or when the buffer fills.  This
improves performance when the next tuple in the stream commonly belongs
to the same partition as the previous tuple.
 
In cases where the target partition changes on every tuple, using
multi-inserts slightly slows the performance.  To get around this we
track the average size of the batches that have been inserted and
adaptively enable or disable multi-inserts based on the size of the
batch.  Some testing was done and the regression only seems to exist
when the average size of the insert batch is close to 1, so let's just
enable multi-inserts when the average size is at least 1.3.  More
performance testing might reveal a better number for this, but since
the slowdown was only 1-2% it does not seem critical enough to spend too
much time calculating it.  In any case it may depend on other factors
rather than just the size of the batch.
 
Allowing multi-inserts for partitions required a bit of work around the
per-tuple memory contexts as we must flush the tuples when the next
tuple does not belong to the same partition, in which case there is no
good time to reset the per-tuple context, as we've already built the new
tuple by this time.  In order to work around this we maintain two
per-tuple contexts and just switch between them every time the partition
changes and reset the old one.  This does mean that the first of each
batch of tuples is not allocated in the same memory context as the
others, but that does not matter since we only reset the context once
the previous batch has been inserted.
 
Author: David Rowley <david.rowley@2ndquadrant.com>
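
This is purely an internal change, so the speedup applies to existing commands. A load of the following shape (names and file path invented) is the kind that should now run roughly as fast as a COPY into a plain table, especially when the input is ordered by the partition key:

CREATE TABLE measurements (
    logdate date NOT NULL,
    value   numeric
) PARTITION BY RANGE (logdate);

CREATE TABLE measurements_2018_07 PARTITION OF measurements
    FOR VALUES FROM ('2018-07-01') TO ('2018-08-01');
CREATE TABLE measurements_2018_08 PARTITION OF measurements
    FOR VALUES FROM ('2018-08-01') TO ('2018-09-01');

-- Input sorted by logdate rarely switches partitions mid-stream,
-- so the multi-insert buffer stays effective:
COPY measurements FROM '/tmp/measurements.csv' WITH (FORMAT csv);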

Continue reading Waiting for PostgreSQL 12 – Allow multi-inserts during COPY into a partitioned table