Waiting for PostgreSQL 11 – MERGE SQL Command following SQL:2016

update!

It looks that this has been reverted


On 2nd of April 2018, Simon Riggs committed patch:

MERGE SQL Command following SQL:2016
 
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.
 
MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;
 
MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.
 
MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.
 
MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.
 
Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.
 
This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.
 
Various issues reported via sqlsmith by Andreas Seltenreich
 
Authors: Pavan Deolasee, Simon Riggs
Reviewers: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs
 
Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com

That's a pretty long commit message.

The summary is that we can now, using single command, do insert/update/delete with some sanity checking – which can, to some extent, simplify, or maybe even optimize certain operations like loading data to existing datasets.

Let's see the example shown in commit message, with some sensible-ish data.

$ CREATE TABLE target (tid int4 PRIMARY KEY, balance int4);
CREATE TABLE
 
$ CREATE TABLE SOURCE (id serial PRIMARY KEY, sid int4, delta int4);
CREATE TABLE

With this in place, I can insert some rows:

$ INSERT INTO target (tid, balance) VALUES
    (1,0), (2,10), (3,-5);
INSERT 0 3
 
$ INSERT INTO SOURCE (sid, delta) VALUES
    (1,10), (2,0), (3, 15), (4,70), (1,5);
INSERT 0 5

and now, let's try the merge:

$ MERGE INTO target AS t
USING SOURCE AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;
ERROR:  MERGE command cannot affect ROW a SECOND TIME
HINT:  Ensure that NOT more than one SOURCE ROW matches any one target ROW

The problem is that we have two rows with sid = 1:

$ SELECT * FROM SOURCE WHERE sid = 1;
 id | sid | delta 
----+-----+-------
  1 |   1 |    10
  5 |   1 |     5
(2 ROWS)

So, let's delete one of them, and retry merge:

$ DELETE FROM SOURCE WHERE id = 5;
DELETE 1
 
$ MERGE INTO target AS t
USING SOURCE AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;
MERGE 4
 
$ SELECT * FROM target;
 tid | balance 
-----+---------
   2 |      10
   4 |      70
(2 ROWS)

So, we can see that:

  • row (sid, delta) (1,10) got matched (there is target row with tid = 1), but delta >= balance, so the row in target was removed.
  • row (2,0) got matched, and caused update as balance (10) was larger than delta (0)
  • row (3,-5) got matched, and just like row with sid = 1 – got deleted from target
  • row (4,70) was not matched, and because delta was > 0 – it got inserted

All in all – it looks very powerful, and will allow easier writing of certain data transformations.

Despite the fact that this feature is not yet fully complete (partitioning, returning, updateable views), it is definitely a great addition. Thanks a lot.

One thought on “Waiting for PostgreSQL 11 – MERGE SQL Command following SQL:2016”

Comments are closed.