What tables were touched within given range of wal LSN?

We had a case recently where one of our DBs failed over to a new primary. We had a connection to this server (the old primary, and then the new primary) from some kind of CDC tool (Debezium, I think).

The thing is that during the failover this tool had read changes (using logical decoding) on the old primary up to location 1F1F/4ADC3348, but on the new primary it started reading from location 1F1F/4ADC4038. This means that some WAL was never decoded and sent to Debezium.

This is fixable, but we need to know which tables were touched in the part of the WAL stream that Debezium didn't see, if any. Perhaps we could also get the ids/locations of the affected rows?
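
To get a feeling for how much WAL is missing, the two locations can be subtracted. Below is a minimal sketch in Python, assuming only the standard textual LSN format (two hexadecimal numbers separated by a slash: the high and the low 32 bits of the position). The function and variable names are made up for this illustration.

```python
def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '1F1F/4ADC3348' into a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


# Locations taken from the text above.
last_decoded_on_old_primary = parse_lsn("1F1F/4ADC3348")
first_read_on_new_primary = parse_lsn("1F1F/4ADC4038")

gap_bytes = first_read_on_new_primary - last_decoded_on_old_primary
print(gap_bytes)  # 3312
```

So the gap is only about 3 kB of WAL. This says nothing yet about which tables were touched, only how large the range to inspect is.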

How much speed you’re leaving at the table if you use default locale?

I was at PGConf.dev recently, and one of the talks was about collations.

The whole talk was interesting (to put it mildly), but the thing that stuck with me is that we really shouldn't be using the default collation provider (libc, with locale-based collation) unless it's really needed, because we're wasting performance. But how big of a hit is it?

Waiting for PostgreSQL 17 – MERGE / SPLIT partitions

I thought for quite some time about whether I should write about it, and how. That's why there has been a delay since:

On 6th of April 2024, Alexander Korotkov committed patch:

Implement ALTER TABLE ... MERGE PARTITIONS ... command
 
This new DDL command merges several partitions into the one partition of the
target table.  The target partition is created using new
createPartitionTable() function with parent partition as the template.
 
This commit comprises quite naive implementation which works in single process
and holds the ACCESS EXCLUSIVE LOCK on the parent table during all the
operations including the tuple routing.  This is why this new DDL command
can't be recommended for large partitioned tables under a high load.  However,
this implementation come in handy in certain cases even as is.
Also, it could be used as a foundation for future implementations with lesser
locking and possibly parallel.
 
Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru
Author: Dmitry Koval
Reviewed-by: Matthias van de Meent, Laurenz Albe, Zhihong Yu, Justin Pryzby
Reviewed-by: Alvaro Herrera, Robert Haas, Stephane Tachoires

and, also on 6th of April 2024, Alexander Korotkov committed patch:

Implement ALTER TABLE ... SPLIT PARTITION ... command
 
This new DDL command splits a single partition into several parititions.
Just like ALTER TABLE ... MERGE PARTITIONS ... command, new patitions are
created using createPartitionTable() function with parent partition as the
template.
 
This commit comprises quite naive implementation which works in single process
and holds the ACCESS EXCLUSIVE LOCK on the parent table during all the
operations including the tuple routing.  This is why this new DDL command
can't be recommended for large partitioned tables under a high load.  However,
this implementation come in handy in certain cases even as is.
Also, it could be used as a foundation for future implementations with lesser
locking and possibly parallel.
 
Discussion: https://postgr.es/m/c73a1746-0cd0-6bdd-6b23-3ae0b7c0c582%40postgrespro.ru
Author: Dmitry Koval
Reviewed-by: Matthias van de Meent, Laurenz Albe, Zhihong Yu, Justin Pryzby
Reviewed-by: Alvaro Herrera, Robert Haas, Stephane Tachoires

Waiting for …: SQL/JSON is coming back. Hopefully.

This is not the usual Waiting for post, but something should be said.

Back in March/April of 2022 Andrew Dunstan committed a series of patches that added support for lots of really interesting features from the SQL/JSON standard.

While I'm not an avid user of JSON in the database, I was very, very happy. I wrote a couple of blog posts about it.

Then, around six months later, they got reverted.

Lately (since last year, actually) these have re-appeared:

  1. Commit by Alvaro Herrera, from March 29th, 2023: SQL/JSON: add standard JSON constructor functions
  2. Commit by Alvaro Herrera, from March 31st, 2023: SQL/JSON: support the IS JSON predicate
  3. Commit by Amit Langote, from July 20th, 2023: Add more SQL/JSON constructor functions
  4. Commit by Amit Langote, from March 21st, 2024: Add SQL/JSON query functions
  5. Commit by Amit Langote, from April 4th, 2024: Add basic JSON_TABLE() functionality

Since they re-appeared I was asked (twice) to write about them in the Waiting for series.

So, I just want to say that while I did notice these changes, and am very happy that they are there, I don't plan on writing Waiting for posts about them.

The reason is simple: I kinda have the feeling that I already wrote about waiting for them.

What I can say, though, is that as soon as a PostgreSQL version (be it 17, or any other) gets released with these in it, I will blog about SQL/JSON again, with updated examples, so that this huge functionality, and the astounding amount of work by developers and testers, gets as much publicity as it can.

For now: I hope it will make it into the Pg 17 release, and even before that I would like to thank everyone involved. By my quick count we have at least nine separate authors and fifteen reviewers, and that is just across the five commits I mentioned.

THANK YOU – can't wait until I'm able to write about it properly 🙂

Waiting for PostgreSQL 17 – Invent SERIALIZE option for EXPLAIN.

On 3rd of April 2024, Tom Lane committed patch:

Invent SERIALIZE option for EXPLAIN.
 
EXPLAIN (ANALYZE, SERIALIZE) allows collection of statistics about
the volume of data emitted by a query, as well as the time taken
to convert the data to the on-the-wire format.  Previously there
was no way to investigate this without actually sending the data
to the client, in which case network transmission costs might
swamp what you wanted to see.  In particular this feature allows
investigating the costs of de-TOASTing compressed or out-of-line
data during formatting.
 
Stepan Rutz and Matthias van de Meent,
reviewed by Tomas Vondra and myself
 
Discussion: https://postgr.es/m/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1@gmx.de

Waiting for PostgreSQL 17 – Add new COPY option LOG_VERBOSITY.

On 1st of April 2024, Masahiko Sawada committed patch:

Add new COPY option LOG_VERBOSITY.
 
This commit adds a new COPY option LOG_VERBOSITY, which controls the
amount of messages emitted during processing. Valid values are
'default' and 'verbose'.
 
This is currently used in COPY FROM when ON_ERROR option is set to
ignore. If 'verbose' is specified, a NOTICE message is emitted for
each discarded row, providing additional information such as line
number, column name, and the malformed value. This helps users to
identify problematic rows that failed to load.
 
Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
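
The behaviour described in this commit message can be sketched with a toy model. This is plain Python, not PostgreSQL code: the `copy_from` function, its parameters, the all-integer columns and the wording of the notices are assumptions made for illustration. Only the option values ("stop", "ignore", "default", "verbose") and the idea of reporting line number, column name and malformed value come from the commit messages.

```python
def copy_from(lines, columns, on_error="stop", log_verbosity="default"):
    """Toy loader: every column is an integer. Returns (loaded_rows, notices)."""
    loaded, notices = [], []
    for line_no, line in enumerate(lines, start=1):
        row, bad = {}, None
        for name, raw in zip(columns, line.split(",")):
            try:
                row[name] = int(raw)
            except ValueError:
                bad = (name, raw)  # remember the first malformed value
                break
        if bad is None:
            loaded.append(row)
        elif on_error == "stop":
            # Without "ignore", the whole load fails on the first bad value.
            raise ValueError(f"line {line_no}, column {bad[0]}: {bad[1]!r}")
        elif log_verbosity == "verbose":
            # With "ignore" the row is skipped; "verbose" also reports it.
            notices.append(
                f"skipping row at line {line_no}, column {bad[0]}: {bad[1]!r}"
            )
    return loaded, notices


data = ["1,10", "2,oops", "3,30"]
rows, notices = copy_from(
    data, ["id", "val"], on_error="ignore", log_verbosity="verbose"
)
print(rows)     # [{'id': 1, 'val': 10}, {'id': 3, 'val': 30}]
print(notices)  # ["skipping row at line 2, column val: 'oops'"]
```

With `log_verbosity="default"` the same call loads the same two rows, but the per-row notice is not produced.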

Waiting for PostgreSQL 17 – Add support for MERGE … WHEN NOT MATCHED BY SOURCE.

On 30th of March 2024, Dean Rasheed committed patch:

Add support for MERGE ... WHEN NOT MATCHED BY SOURCE.
 
This allows MERGE commands to include WHEN NOT MATCHED BY SOURCE
actions, which operate on rows that exist in the target relation, but
not in the data source. These actions can execute UPDATE, DELETE, or
DO NOTHING sub-commands.
 
This is in contrast to already-supported WHEN NOT MATCHED actions,
which operate on rows that exist in the data source, but not in the
target relation. To make this distinction clearer, such actions may
now be written as WHEN NOT MATCHED BY TARGET.
 
Writing WHEN NOT MATCHED without specifying BY SOURCE or BY TARGET is
equivalent to writing WHEN NOT MATCHED BY TARGET.
 
Dean Rasheed, reviewed by Alvaro Herrera, Ted Yu and Vik Fearing.
 
Discussion: https://postgr.es/m/CAEZATCWqnKGc57Y_JanUBHQXNKcXd7r=0R4NEZUVwP+syRkWbA@mail.gmail.com
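
To make the distinction between the clauses easier to see, here is a toy model in Python (not SQL, and not how PostgreSQL implements it). It assumes that rows are matched on a single key, while a real MERGE joins on an arbitrary condition. The `classify` function and the sample data are made up for this illustration.

```python
def classify(target: dict, source: dict) -> dict:
    """Group keys the way the three kinds of WHEN clauses would see them."""
    return {
        # Rows present on both sides.
        "MATCHED": sorted(target.keys() & source.keys()),
        # Rows in the data source, but not in the target relation.
        "NOT MATCHED BY TARGET": sorted(source.keys() - target.keys()),
        # Rows in the target relation, but not in the data source.
        "NOT MATCHED BY SOURCE": sorted(target.keys() - source.keys()),
    }


target = {1: "a", 2: "b", 3: "c"}
source = {2: "B", 3: "C", 4: "D"}

groups = classify(target, source)
for clause, keys in groups.items():
    print(clause, keys)
# MATCHED [2, 3]
# NOT MATCHED BY TARGET [4]
# NOT MATCHED BY SOURCE [1]
```

Row 1 is the kind of row that the new WHEN NOT MATCHED BY SOURCE actions operate on: it exists in the target, but nothing in the source matches it.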

Waiting for PostgreSQL 17 – Add RETURNING support to MERGE.

On 17th of March 2024, Dean Rasheed committed patch:

Add RETURNING support to MERGE.
 
This allows a RETURNING clause to be appended to a MERGE query, to
return values based on each row inserted, updated, or deleted. As with
plain INSERT, UPDATE, and DELETE commands, the returned values are
based on the new contents of the target table for INSERT and UPDATE
actions, and on its old contents for DELETE actions. Values from the
source relation may also be returned.
 
As with INSERT/UPDATE/DELETE, the output of MERGE ... RETURNING may be
used as the source relation for other operations such as WITH queries
and COPY commands.
 
Additionally, a special function merge_action() is provided, which
returns 'INSERT', 'UPDATE', or 'DELETE', depending on the action
executed for each row. The merge_action() function can be used
anywhere in the RETURNING list, including in arbitrary expressions and
subqueries, but it is an error to use it anywhere outside of a MERGE
query's RETURNING list.
 
Dean Rasheed, reviewed by Isaac Morland, Vik Fearing, Alvaro Herrera,
Gurjeet Singh, Jian He, Jeff Davis, Merlin Moncure, Peter Eisentraut,
and Wolfgang Walther.
 
Discussion: http://postgr.es/m/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com
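
What ends up in the RETURNING output can be sketched with another toy model in Python. Again, this is not PostgreSQL code: `merge_returning` is a made-up function that assumes matching on a single key and a "make target look like source" style of merge. The first element of each returned tuple plays the role of merge_action().

```python
def merge_returning(target: dict, source: dict) -> list:
    """Sync `target` with `source` in place; return (action, key, value) rows."""
    returned = []
    for key in sorted(target.keys() | source.keys()):
        if key in source and key in target:
            target[key] = source[key]
            returned.append(("UPDATE", key, target[key]))  # new contents
        elif key in source:
            target[key] = source[key]
            returned.append(("INSERT", key, target[key]))  # new contents
        else:
            old_value = target.pop(key)
            returned.append(("DELETE", key, old_value))    # old contents
    return returned


target = {1: "a", 2: "b"}
source = {2: "B", 3: "C"}

result = merge_returning(target, source)
print(result)  # [('DELETE', 1, 'a'), ('UPDATE', 2, 'B'), ('INSERT', 3, 'C')]
print(target)  # {2: 'B', 3: 'C'}
```

Note that the deleted row is reported with its old value, while the inserted and updated rows are reported with their new values, which is the rule described in the commit message.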

Waiting for PostgreSQL 17 – Add new COPY option SAVE_ERROR_TO / Rename COPY option from SAVE_ERROR_TO to ON_ERROR

On 16th of January 2024, Alexander Korotkov committed patch:

Add new COPY option SAVE_ERROR_TO
 
Currently, when source data contains unexpected data regarding data type or
range, the entire COPY fails. However, in some cases, such data can be ignored
and just copying normal data is preferable.
 
This commit adds a new option SAVE_ERROR_TO, which specifies where to save the
error information. When this option is specified, COPY skips soft errors and
continues copying.
 
Currently, SAVE_ERROR_TO only supports "none". This indicates error information
is not saved and COPY just skips the unexpected data and continues running.
 
Later works are expected to add more choices, such as 'log' and 'table'.
 
Author: Damir Belyalov, Atsushi Torikoshi, Alex Shulgin, Jian He
Discussion: https://postgr.es/m/87k31ftoe0.fsf_-_%40commandprompt.com
Reviewed-by: Pavel Stehule, Andres Freund, Tom Lane, Daniel Gustafsson,
Reviewed-by: Alena Rybakina, Andy Fan, Andrei Lepikhov, Masahiko Sawada
Reviewed-by: Vignesh C, Atsushi Torikoshi

and then, three days later, he changed the syntax in the next patch:

Rename COPY option from SAVE_ERROR_TO to ON_ERROR
 
The option names now are "stop" (default) and "ignore".  The future options
could be "file 'filename.log'" and "table 'tablename'".
 
Discussion: https://postgr.es/m/20240117.164859.2242646601795501168.horikyota.ntt%40gmail.com
Author: Jian He
Reviewed-by: Atsushi Torikoshi
