Why is UPSERT so complicated?

If you have worked with certain other (than PostgreSQL) open source databases, you might wonder why PostgreSQL doesn't have MERGE, and why the UPSERT example in its documentation is so complicated.

Well, let's try to answer the question, and look into some alternatives.

First, of course – let's define the task.

I need a way to either insert a row, or update it if it already exists. What does "it already exists" mean? Let's assume we're talking about a row with the same value in a UNIQUE column.

For example – I have a table:

CREATE TABLE test (
    whatever INT4 PRIMARY KEY,
    counter  INT4 NOT NULL DEFAULT 0
);

Very simple, but that's basically the point. I want to insert a new "whatever", but if it already exists – I want to update its counter instead.

And – what makes UPSERT so complicated – it should never raise an exception.

The first, simplest solution is:

UPDATE test SET counter = counter + 1 WHERE whatever = ?;

and if it fails (the modified row count is 0) – run:

INSERT INTO test (whatever, counter) VALUES (?, 1);

It's so simple, it can't fail. Right? Let's see. Here is a very simple Perl program to test it:

  1. #!/usr/bin/env perl
  2. use DBI;
  3. my $dbh = DBI->connect(
  4.     'dbi:Pg:dbname=pgdba;host=127.0.0.1;port=5920',
  5.     undef,
  6.     undef,
  7.     { 'AutoCommit' => 1, },
  8. );
  9. while (1) {
  10.     my $whatever = int(1 + rand 100);
  11.     my $rows = $dbh->do(
  12.         'UPDATE test set counter = counter + 1 WHERE whatever = ?',
  13.         undef,
  14.         $whatever
  15.     );
  16.     if ( 0 == $rows ) {
  17.         $dbh->do(
  18.             'INSERT INTO test (whatever, counter) VALUES (?, 1)',
  19.             undef,
  20.             $whatever
  21.         );
  22.     }
  23. }

Of course it will never end (the while loop from lines 9 to 23 doesn't have an ending condition), but I can stop it by pressing ctrl-c. And I did, after a couple of minutes – and we're good. No errors. So it's OK, right? Wrong.

Before I show why it's wrong, let me explain what it does – in case a reader of this blogpost doesn't read such simple Perl programs fluently 🙂

  • Line 1 – standard UNIX "shebang" to tell the shell to run it via perl
  • Line 2 – load the DBI module, which is a general-purpose database access library
  • Lines 3-8 – technically one statement, just indented a lot – creates the database connection, using the Pg driver, to database pgdba, over a tcp/ip connection to 127.0.0.1, port 5920. The undefs are just a way to specify the default username and lack of password. The AutoCommit flag in line 7 makes each statement its own transaction.
  • Line 9 – beginning of the infinite loop
  • Line 10 – I pick a random integer value in the range 1-100
  • Lines 11-15 – a single command, which runs the UPDATE in the database with the given $whatever (the random value from the previous step)
  • Line 16 – if no rows were updated (->do() returned 0) – run the next step
  • Lines 17-21 – a single command, running the INSERT in the database

All very simple, step by step doing what we need.

And yet, despite initial test – I say that it's wrong. Why?

The answer is very simple – it breaks when I run many copies of it in parallel. How? Let's see:

=$ psql -c 'truncate test'; for i in {1..10}; do ./test.pl & done; sleep 5; killall perl
TRUNCATE TABLE
[1] 12843
[2] 12844
[3] 12845
[4] 12846
[5] 12847
[6] 12848
[7] 12849
[8] 12850
[9] 12851
[10] 12852
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(55) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(20) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(11) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(97) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(18) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(18) already exists. at ./test.pl line 17.
[1]   Terminated              ./test.pl
[2]   Terminated              ./test.pl
[4]   Terminated              ./test.pl
[5]   Terminated              ./test.pl
[6]   Terminated              ./test.pl
[7]   Terminated              ./test.pl
[8]   Terminated              ./test.pl
[9]-  Terminated              ./test.pl
[3]-  Terminated              ./test.pl
[10]+  Terminated              ./test.pl

We have errors. But how? It shows a duplicate key error at test.pl line 17 – which is the $dbh->do() of the INSERT. But the UPDATE didn't find the row. So what's going on?

The reason is very simple – with multiple processes working in parallel it is perfectly possible that two processes chose the same value. Both issued the UPDATE – and neither of the UPDATEs found a row, so they both issued the INSERT. And the second insert failed.

Of course the window for such a case to happen is very short – if the update had happened 1 millisecond later – I wouldn't have had the problem. That's why it's called a race condition.

At this point some people say: OK, but that's simple – just do it in a transaction.

So, let's try. I changed the script to be:

  1. #!/usr/bin/env perl
  2. use DBI;
  3. my $dbh = DBI->connect(
  4.     'dbi:Pg:dbname=pgdba;host=127.0.0.1;port=5920',
  5.     undef,
  6.     undef,
  7.     { 'AutoCommit' => 0, },
  8. );
  9. while (1) {
  10.     my $whatever = int(1 + rand 100);
  11.     my $rows = $dbh->do(
  12.         'UPDATE test set counter = counter + 1 WHERE whatever = ?',
  13.         undef,
  14.         $whatever
  15.     );
  16.     if ( 0 == $rows ) {
  17.         $dbh->do(
  18.             'INSERT INTO test (whatever, counter) VALUES (?, 1)',
  19.             undef,
  20.             $whatever
  21.         );
  22.     }
  23.     $dbh->commit();
  24. }

Changes – line 7 now shows disabled autocommit, and we have line 23 which does commit() after the work has been done. We don't need BEGIN; because Perl's DBI is smart enough to start a transaction whenever we are not in one and some work has to be done.

So, with this change, we see:

=$ psql -c 'truncate test'; for i in {1..10}; do ./test.pl & done; sleep 5; killall perl
TRUNCATE TABLE
...
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(23) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(98) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(22) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(77) already exists. at ./test.pl line 17.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(9) already exists. at ./test.pl line 17.
...

(I removed the “job started" and “Terminated" lines for brevity).

It still fails? Why? The reason is very simple – transactions (in the default isolation level) cannot prevent anything like this – they are just there to make sure that:

  • other connections will not see "your" changes before you finish making them
  • if you decide to – you can abandon all your changes – i.e. rollback

That's all.

If you're paying attention, you noticed "in the default isolation level".

There are docs on these levels, but generally they define how "separated" you want your transactions to be. There is even a "SERIALIZABLE" level, which is described like this:

The most strict is Serializable, which is defined by the standard in a paragraph which says that any concurrent execution of a set of Serializable transactions is guaranteed to produce the same effect as running them one at a time in some order.

This might sound great. Let's test it. Another change in the test script:

  1. #!/usr/bin/env perl
  2. use DBI;
  3. my $dbh = DBI->connect(
  4.     'dbi:Pg:dbname=pgdba;host=127.0.0.1;port=5920',
  5.     undef,
  6.     undef,
  7.     { 'AutoCommit' => 0, },
  8. );
  9. while (1) {
  10.     $dbh->do('SET TRANSACTION ISOLATION LEVEL SERIALIZABLE');
  11.     my $whatever = int(1 + rand 100);
  12.     my $rows = $dbh->do(
  13.         'UPDATE test set counter = counter + 1 WHERE whatever = ?',
  14.         undef,
  15.         $whatever
  16.     );
  17.     if ( 0 == $rows ) {
  18.         $dbh->do(
  19.             'INSERT INTO test (whatever, counter) VALUES (?, 1)',
  20.             undef,
  21.             $whatever
  22.         );
  23.     }
  24.     $dbh->commit();
  25. }

(new line 10).

Unfortunately, this time, when I try to run it, I get a HUGE number of errors like these:

=$ psql -c 'truncate test'; for i in {1..10}; do ./test.pl & done 2>&1 | head -n 15; sleep 5; killall perl
TRUNCATE TABLE
DBD::Pg::db commit failed: ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during commit attempt.
HINT:  The transaction might succeed if retried. at ./test.pl line 24.
DBD::Pg::db commit failed: ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on commit attempt with conflict in from prepared pivot.
HINT:  The transaction might succeed if retried. at ./test.pl line 24.
DBD::Pg::db commit failed: ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on commit attempt with conflict in from prepared pivot.
HINT:  The transaction might succeed if retried. at ./test.pl line 24.
DBD::Pg::db do failed: ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during write.
HINT:  The transaction might succeed if retried. at ./test.pl line 18.
DBD::Pg::db do failed: ERROR:  could not serialize access due to read/write dependencies among transactions
DETAIL:  Reason code: Canceled on identification as a pivot, during write.
HINT:  The transaction might succeed if retried. at ./test.pl line 18.

Ouch. That sucks.

So, another idea might be: just use a lock. Of course. But what kind of lock? You cannot lock a row if it doesn't exist. So we'd have to lock the table. That's doable.

So, let's change line 10 from setting the isolation level to:

  1.     $dbh->do( 'LOCK TABLE test IN ACCESS EXCLUSIVE MODE' );

A quick run shows – NO ERRORS. Yeah. We're all good. Are we? Well, not really. The reason is simple – performance.

Each command either inserted a row with counter = 1, or incremented an existing counter. So, by summing all counters we'll see how many operations actually happened. Since all copies of the script are killed after 5 seconds, we'll have some base number:

$ SELECT SUM(counter) FROM test;
 sum
─────
 439
(1 row)

OK. And how does it work without locking? We will have errors, but let's just see some numbers. Line 10 got commented out, and:

=$ psql -c 'truncate test'; for i in {1..10}; do ./test.pl & done &> /dev/null; sleep 5; killall perl; psql -c "select sum(counter) from test"
TRUNCATE TABLE
...
 sum
------
 2042
(1 row)

That's a very big difference. We had some errors, but the number of logged "items" was nearly 5 times higher. So – locking the table can be an answer, but it kills performance.

There is one kind of lock that could be used, though: the so-called advisory locks. Using them you can lock something that doesn't exist in the database – because you're applying the lock to a number. Not a row. Not a table. Not any object. Just a number (or two numbers, but that's irrelevant).
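At the SQL level, one iteration of the advisory-lock based flow boils down to something like this (a minimal sketch; 123 stands for the "whatever" value we're about to upsert):

SELECT pg_advisory_lock( 123 );   -- waits until no other session holds the lock on "123"
UPDATE test SET counter = counter + 1 WHERE whatever = 123;
-- and only if the UPDATE modified 0 rows:
INSERT INTO test (whatever, counter) VALUES (123, 1);
SELECT pg_advisory_unlock( 123 ); -- release the lock on "123"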

So – this looks like a godsend. Right? Well, mostly – yes. Let's see. Two small changes (the locking statement moved from line 10 to 11 and now takes an advisory lock, and I added lines 25 and 26 to release it):

  1. #!/usr/bin/env perl
  2. use DBI;
  3. my $dbh = DBI->connect(
  4.     'dbi:Pg:dbname=pgdba;host=127.0.0.1;port=5920',
  5.     undef,
  6.     undef,
  7.     { 'AutoCommit' => 0, },
  8. );
  9. while (1) {
  10.     my $whatever = int(1 + rand 100);
  11.     $dbh->do( 'SELECT pg_advisory_lock( ? )', undef, $whatever );
  12.     my $rows = $dbh->do(
  13.         'UPDATE test set counter = counter + 1 WHERE whatever = ?',
  14.         undef,
  15.         $whatever
  16.     );
  17.     if ( 0 == $rows ) {
  18.         $dbh->do(
  19.             'INSERT INTO test (whatever, counter) VALUES (?, 1)',
  20.             undef,
  21.             $whatever
  22.         );
  23.     }
  24.     $dbh->commit();
  25.     $dbh->do( 'SELECT pg_advisory_unlock( ? )', undef, $whatever );
  26.     $dbh->commit();
  27. }

You might ask: why do I need another commit() at line 26? The answer is simple – without it, the unlock would work, but it would be in the same transaction as the next iteration of the loop. And this – while not exactly an error – is something I don't like: doing work for two unrelated jobs in the same transaction.

So, let's run it, see if there are errors, and what's the performance:

=$ psql -c 'truncate test'; for i in {1..10}; do ./test.pl & done; sleep 5; killall perl; psql -c "select sum(counter) from test"
TRUNCATE TABLE
...
 sum
------
 2114
(1 row)

Looks good – no errors, and the performance number looks great. So, we're set. Right? Yeah, you guessed it – wrong.

Why? The reason is very simple – lack of trust.

Advisory locks work great, and will protect you from all kinds of errors. If you use them.

And since an advisory lock doesn't lock anything "real" in the database – like a table or a row – it will not stop other access to the table: from parts of your code where you forgot about the locks, or from psql sessions where you're too lazy to take them.

This wouldn't be that big of a problem generally, but PostgreSQL devs, and (to some extent) users strive for perfection. And safety in all cases. So – while using advisory locks (or even the solution with race condition) might be good for your situation – it is not good enough for the general case.

Now, let me make a small digression. Once, a long time ago, I wrote (in Polish) a description of a method to "insert only new rows" (i.e. to skip, without error, inserting rows that already exist), and quite recently David Fetter extended it.

Generally the method is basically either:

INSERT INTO test (whatever, counter)
    SELECT 123, 1 WHERE NOT EXISTS (SELECT * FROM test WHERE whatever = 123);

or the same, with OUTER JOIN:

INSERT INTO test (whatever, counter)
SELECT v.*
FROM
    (VALUES (123, 1) ) AS v(w, c)
    LEFT JOIN test t ON t.whatever = v.w
WHERE
    t.whatever IS NULL;

Unfortunately both of these approaches have the same problem – a race condition. The window of opportunity for the error is shorter than with two separate queries, but it exists. I will not show it, because the test is trivial, and I don't want to inflate this already big blogpost – you can either trust me or simply run the test yourself.

End of digression, back to main track.

So, we need a way to do the update, insert if it didn't work, and then – if the insert failed – redo the update. Seems simple, with one problem – we wanted errors not to happen.

And now we should ask ourselves – what does it mean that the "error did not happen"? Does it mean that at no point did we see the "ERROR" text, or simply that we have one transaction that makes the change in the database, and that it will always succeed, even if it sometimes logs an error along the way?

If we want to get rid of errors altogether, we need to dive into the PL/pgSQL language, but for now let's just assume we want it done in a single transaction that will always succeed. Errors might be shown, but each iteration of the loop from line 9 should increment a counter in the database.

So, I'll use a relatively unknown feature of PostgreSQL: SAVEPOINT. A savepoint is a "thing" within a transaction that we can always roll back to, even in case of errors that happened after setting the savepoint.
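To show what this gives us, here is – as a sketch in plain SQL, using whatever = 123 – what a single "protected" iteration will look like:

BEGIN;
UPDATE test SET counter = counter + 1 WHERE whatever = 123;
-- if the UPDATE modified 0 rows:
SAVEPOINT x;
INSERT INTO test (whatever, counter) VALUES (123, 1);
-- if the INSERT failed with a duplicate key error:
ROLLBACK TO SAVEPOINT x;  -- clears the error state, the transaction is usable again
UPDATE test SET counter = counter + 1 WHERE whatever = 123;
COMMIT;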

New version of test program:

  1. #!/usr/bin/env perl
  2. use DBI;
  3. my $dbh = DBI->connect(
  4.     'dbi:Pg:dbname=pgdba;host=127.0.0.1;port=5920',
  5.     undef,
  6.     undef,
  7.     { 'AutoCommit' => 0, },
  8. );
  9. for (1..500) {
  10.     my $whatever = int(1 + rand 100);
  11.     my $rows = $dbh->do(
  12.         'UPDATE test set counter = counter + 1 WHERE whatever = ?',
  13.         undef,
  14.         $whatever
  15.     );
  16.     if ( 1 == $rows ) {
  17.         $dbh->commit();
  18.         next;
  19.     }
  20.     $dbh->do('SAVEPOINT x');
  21.     $rows = $dbh->do(
  22.         'INSERT INTO test (whatever, counter) VALUES (?, 1)',
  23.         undef,
  24.         $whatever
  25.     );
  26.     if ( 1 == $rows ) {
  27.         $dbh->commit();
  28.         next;
  29.     }
  30.     $dbh->do( 'ROLLBACK TO x' );
  31.     $dbh->do(
  32.         'UPDATE test set counter = counter + 1 WHERE whatever = ?',
  33.         undef,
  34.         $whatever
  35.     );
  36.     $dbh->commit();
  37. }

(I know, it's not pretty, but it should be simple to understand).

So, what does it do? First – I changed the loop from infinite to "do 500 operations". I run 10 parallel copies, which should give me 5000 operations in the database – a bit more than I did previously, but good enough to see how it works.

Then – I do the update, and check if it worked. If it did – fine, commit the transaction, and move on to the next iteration.

If it didn't work – I create a savepoint, imaginatively named "x", and call the insert – this time checking how many rows were inserted.

If the row was inserted – commit the transaction, and continue with the next iteration – I don't have to "destroy" savepoints – they "live" within the transaction only, so when you commit the transaction the savepoint is removed too.

If the row was not inserted (because of, in our case, a UNIQUE violation) – issue a rollback to savepoint x (thus clearing the "transaction is aborted" state), and redo the update. Since we now know that the row is there, we don't have to check anything – just do it, commit, and continue with the next iteration of the loop.

Results?

=$ psql -c 'truncate test'; for i in {1..10}; do ./test.pl & done; time wait; psql -c "select sum(counter) from test"
TRUNCATE TABLE
...
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(46) already exists. at ./test.pl line 21.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(90) already exists. at ./test.pl line 21.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(81) already exists. at ./test.pl line 21.
DBD::Pg::db do failed: ERROR:  duplicate key value violates unique constraint "test_pkey"
DETAIL:  Key (whatever)=(71) already exists. at ./test.pl line 21.
...
 
real    0m12.216s
user    0m1.228s
sys     0m0.336s
 sum
------
 5000
(1 row)

So, we see that we had the INSERT error four times. But the final count on the table shows that all iterations of the loop did in fact increment a counter – which means that the savepoint approach worked.

What about performance? Previously we had ~2100 increments in 5 seconds (~420 per second). This time we have 5000 in 12.2 seconds (~410 per second). This means that this is only ~3% slower than the version with advisory locks.

But, I have to admit, I cheated a little. We do 5000 operations, but we have only 100 different values of "whatever". Which means that in 4896 cases the flow of the loop ended right after the first update.

Whether this is an issue for you – you have to decide for yourself, and perhaps redo the tests with a different number of loops, or a different range of "whatever" values.

Of course – you might say that you don't like the fact that the program shows errors. I could silence them in the script, or I could move the logic into a stored function. Since the code in the docs uses a function – let's do the same.

A PL/pgSQL function that implements the same logic:

  1. CREATE OR REPLACE FUNCTION upsert_test( in_whatever INT4 ) RETURNS void AS $$
  2. BEGIN
  3.     UPDATE test SET counter = counter + 1 WHERE whatever = in_whatever;
  4.     IF FOUND THEN
  5.         RETURN;
  6.     END IF;
  7.     BEGIN
  8.         INSERT INTO test (whatever, counter) VALUES (in_whatever, 1);
  9.     EXCEPTION WHEN OTHERS THEN
  10.         UPDATE test SET counter = counter + 1 WHERE whatever = in_whatever;
  11.     END;
  12.     RETURN;
  13. END;
  14. $$ LANGUAGE plpgsql;

and this allowed me to simplify the Perl test script to:

  1. #!/usr/bin/env perl
  2. use DBI;
  3. my $dbh = DBI->connect(
  4.     'dbi:Pg:dbname=pgdba;host=127.0.0.1;port=5920',
  5.     undef,
  6.     undef,
  7.     { 'AutoCommit' => 1, },
  8. );
  9. for (1..500) {
  10.     my $whatever = int(1 + rand 100);
  11.     $dbh->do( 'SELECT upsert_test( ? )', undef, $whatever );
  12. }

Couple of notes:

  • there is no "SAVEPOINT" command in PL/pgSQL. But the inner BEGIN (line 7) does exactly the same thing.
  • WHEN OTHERS (function, line 9) is a "catch all" type of exception handler. Using "WHEN unique_violation" (as in the Pg docs) is cleaner, but it's not relevant at the moment – a sketch of that variant follows after this list.
  • You might have noticed that the Perl script is again using AutoCommit – this is because now every transaction is just one command (a select from the function), so we don't need explicit transactions. Though – in real life applications I wouldn't use autocommit.
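For reference – a sketch of that stricter variant, differing only in the exception clause (it catches just the duplicate key error, and lets anything else propagate):

CREATE OR REPLACE FUNCTION upsert_test( in_whatever INT4 ) RETURNS void AS $$
BEGIN
    UPDATE test SET counter = counter + 1 WHERE whatever = in_whatever;
    IF FOUND THEN
        RETURN;
    END IF;
    BEGIN
        INSERT INTO test (whatever, counter) VALUES (in_whatever, 1);
    EXCEPTION WHEN unique_violation THEN
        UPDATE test SET counter = counter + 1 WHERE whatever = in_whatever;
    END;
    RETURN;
END;
$$ LANGUAGE plpgsql;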

So, how does it work?

=$ psql -c 'truncate test'; for i in {1..10}; do ./test.pl & done; time wait; psql -c "select sum(counter) from test"
TRUNCATE TABLE
...
 
real    0m12.094s
user    0m0.932s
sys     0m0.268s
 sum
------
 5000
(1 row)

No errors on screen. No errors in the logs. Performance acceptable. The function is a bit simpler than the one in the docs (no loop, just 3 simple queries).

So we're good? Yeah, sure, you expected it. No, we're not good.

We forgot about one fact. DELETEs happen.

What will happen if we have the following case:

session #1                              session #2
begin;
update … – fails, there is no row
                                        insert
insert – fails, the row exists
                                        DELETE
update – fails, there is no row!

Of course the chances for such a case are very low. And the timing would have to be perfect. But it is technically possible, and if it is technically possible, it should be at least mentioned, and at best – solved.

This is, of course, another case of a race condition. And this is exactly the reason why the docs version of the upsert function has a loop.

If you'll excuse me – I will skip showing the error happening, as it requires either changing the code by adding artificial slowdowns, or a lot of luck, or a lot of time. But I hope you understand why DELETEs can cause problems. And why a loop is needed to solve the problem.

Of course – adding the loop, the way it's shown in the docs, introduces a potential infinite loop – code that runs inserts and deletes in perfect synchronization with your function would cause your function to never end.

But such a situation (running in perfect synchronization) is simply improbable, so we should be good, and the loop will at most redo its work twice.
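For completeness – a sketch of such a looping version for our test table, modeled on the function from the PostgreSQL documentation (not the exact code from the docs, just the same idea):

CREATE OR REPLACE FUNCTION upsert_test( in_whatever INT4 ) RETURNS void AS $$
BEGIN
    LOOP
        -- try to update an existing row first
        UPDATE test SET counter = counter + 1 WHERE whatever = in_whatever;
        IF FOUND THEN
            RETURN;
        END IF;
        -- no row – try to insert it; a concurrent insert of the same
        -- key will make this raise unique_violation
        BEGIN
            INSERT INTO test (whatever, counter) VALUES (in_whatever, 1);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- do nothing, loop back and try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;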

So, all in all – now, I hope, you understand why the code in the documentation is so complex, with its exception handling and loop.

In your own programs you can use other approaches. For example – if you're sure you're running just one writing client at a time – you can use the approach with the outer join or "exists" that I mentioned in the digression. If you are sure about your app, and that no one will ever touch the database otherwise – you can use advisory locks (these would work with DELETEs too, as long as the delete would take the advisory lock too). Or you can use exception handling, and possibly a loop. It all depends on the case that you have.

As a final word – OK, but why doesn't PostgreSQL handle it internally, like the other database?

The answer lies somewhere between "developers have other things to do first" and "it's complicated, because there are many ways to do it, and the most correct one is slow – so there is a trade-off decision to be made".

I don't know which method is used by "MERGE" in the other database, and would like to know what kind of trade-offs they made – speed over correctness, or the other way around. And to what extent.

34 thoughts on “Why is UPSERT so complicated?”

  1. As noted, MERGE is complicated and slow. If two different database servers implement it, then it is going to be implemented with different semantics. Oracle and MySQL already have different implementations, and to nobody’s surprise, MySQL’s implementation allows for risky behavior that does not protect the data. PostgreSQL users expect that any data put into the database will reliably come out of the database. So MERGE is slow and inconsistently implemented. The correct thing to do is make the application aware of whether a record is new or existing. Any application making use of MERGE should open a bug to replace it with more predictable and faster logic using INSERT and UPDATE.

    If an application does use MERGE, it has to account for implementation-specific behavior and will not be portable, in a safe way, to other database servers. So it is not suitable for ORMs or database-independent applications.

    So what is a legitimate use case for MERGE, where knowing whether a record is new or existing is not possible prior to the transaction? The only thing that I can think of is one-off scripts that do not have any concurrency. But if you design for that, then some misguided ORM for web applications is going to use your MERGE function and users are going to wonder what happened to their data when concurrency concerns were ignored or not understood.

  2. Are upserts not really a DSS-type operation, in which you would get better performance having separate insert/update operations vs the overhead of error generation?

  3. Regarding overhead and more, I submitted a long comment about this a day or so ago, but added a link and mentioned “that other database”, so it is probably being held for approval. Basically, my conclusion is that combined operations are going to be slower. If you know your data has never entered its table before, an INSERT is always going to be faster. I see this as useful for poorly designed database abstraction layers and applications that do not know if something is new or not and certain types of load scenarios.

    If your application is currently doing a SELECT before choosing between INSERT or UPDATE because it does not know if a given record exists or not, then this has the potential to be faster since making that choice will be faster as the logic is moved closer to the database engine. However, certain operations will be added to make sure that this is done safely, which is the expectation with all PostgreSQL design choices.

  4. Can you arrange the stored procedure to try the INSERT first, trapping a duplicate key exception, and then if that fails because of duplicate key do the UPDATE?

  5. @Berend:
    sure, but it doesn’t change anything. Because still concurrent delete might occur, and delete the existing row before update.

    Positioning update as first command is beneficial because trapping errors is *relatively* expensive. Not very, just more than normal query running. So, it’s better to run update first, and only if it “failed” – run the expensive block of exception catching and insert.

  6. is it possible to lock the table for DELETEs? wouldn’t that “solve” the most hairy problems. for some (let’s say: most) workloads a delete-lock might be acceptable.

  7. @Tobias:

    well, you could use advisory locks. but it’s no match for “rogue” delete.
    you could write a trigger that rejects deletes and enable/disable it for the time you need.
    you could revoke delete privileges.

  8. @MATT

    Yes, MERGE maybe be a slower operation, but can it possibly be slower than having to issue an INSERT over the network, get back the exception over the network, issue an INSERT over the network, get back the result over the network? That is four network round-trips instead of two. I would venture to guess that whatever slowness MERGE/REPLACE introduces would be WAY faster than having to deal with one more network round-trip.

    Some would argue that MERGE/REPLACE can be accomplished with a stored proc or function. Yes, but that routine would only handle a small set of use cases. And besides, why have the user roll a custom solution when this functionality is already so widely available with other DBMS?

  9. @MATT

    Also, I strongly disagree that MERGE/REPLACE is only used in one-off scripts. There are many situations where someone would want to replace existing data when doing an update. One example would be for auditing events. You don’t really care if an event has already been logged, you just want to make sure that it is being logged now. The fact that this feature is available in Oracle and MySQL, AND the fact that many people have already requested this feature in Postgres, should tell you that MERGE/REPLACE is not only for one-off use cases.

  10. I think your SERIALIZABLE example is incorrect; you’re supposed to trap the error that you received and retry the entire transaction in the event of a serialization error. If you do this then you shouldn’t have this issue.

  11. @SamVilain:
    sure, but that kind-of defeats the purpose. I know I can retry, and I can retry under any isolation level, in every case. The point of this blogpost is to provide an idea how to write (or how not to write) solution that will work with single run.

  12. Thanks for writing this interesting article

    Concerning the last part of the article, when session 2 does a DELETE. Since there’s a race condition anyway, I think you could simply do the easiest thing, and pretend that the DELETE happened after the failed UPDATE, and simply skip the UPDATE. — That is, something like:

    EXCEPTION WHEN OTHERS THEN
    UPDATE test set counter = counter + 1 WHERE whatever = in_whatever;
    — another session deleted the row:
    EXCEPTION WHEN OTHERS THEN
    RETURN; — doesn’t matter, race condition regardless
    END;
    END;
    RETURN;

  13. @KajMagnus:

    Not sure what you mean – UPDATE will not raise an exception if someone deletes the row.

  14. One of the most beautiful and informative posts I’ve ever read. Thank you very much for such detailed explanation, kind sir.

  15. @Depesz UPDATE will throw an exception (‘ERROR: 40001: could not serialize access due to concurrent update’) if not in read committed isolation level. I know this was not the case in your examples, just found worth mentioning it. Otherwise this is a great article, but that’s not a big surprise 🙂

  16. I can do an UPSERT with a CTE in a single SQL statement, which might take care of concurrency issues. Have I stumbled upon something useful, or did everybody already know this? Here’s the example:

    => create table t (n int primary key, c varchar(10));
    NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index “t_pkey” for table “t”
    CREATE TABLE
    => insert into t values (1, ‘hello’);
    INSERT 0 1
    => select * from t;
    n | c
    —+——-
    1 | hello
    (1 row)

    => with
    postgres-> upsert
    postgres-> as
    postgres-> (select 2 as n, ‘world’::varchar as c),
    postgres-> update_option
    postgres-> as
    postgres-> (update t set c = upsert.c from upsert where t.n = upsert.n)
    postgres-> insert into t
    postgres-> select upsert.n, upsert.c from upsert where not exists(select 1 from t where t.n = upsert.n);
    INSERT 0 1
    => select * from t order by n;
    n | c
    —+——-
    1 | hello
    2 | world
    (2 rows)

    => with
    postgres-> upsert
    postgres-> as
    postgres-> (select 1 as n, ‘goodbye’::varchar as c),
    postgres-> update_option
    postgres-> as
    postgres-> (update t set c = upsert.c from upsert where t.n = upsert.n)
    postgres-> insert into t
    postgres-> select upsert.n, upsert.c from upsert where not exists(select 1 from t where t.n = upsert.n);
    INSERT 0 0
    => select * from t order by n;
    n | c
    —+———
    1 | goodbye
    2 | world
    (2 rows)

  17. What if: if the data is new – insert it; else, if the data exists – take a row lock and then update it.

    Will this work?

  18. @Chirag:
    it would be the same thing as the LOOPing function, it’s just insert first, and update later.

    The lock is irrelevant, because UPDATE takes a lock too, and the delete of the row can happen between the failed insert and the lock, just as it can between the insert and the update.

  19. What about
    begin transaction;
    ctr = select Coalesce(counter,0) from test where whatever = ?;
    delete from test where whatever = ?;
    insert into test(ctr+1, ?);
    commit transaction;

  20. Dave:

    Not sure what you mean by “what about” – clearly it’s buggy. Consider the case where the row does *not* exist when the select/delete happens, but is inserted by another transaction just before the insert in this transaction.

  21. Excellent summary! So a couple of years have gone by and still there is no solution to this rather common problem? The looping solution does not work in the case when you have multiple unique keys. Is that correct? Is there a known solution to that case? Also do you know the status of other vendor solutions like the mysql “on duplicate key update” solution? It also does not work when you have multiple keys and I’m not sure whether it is “concurrent safe”. Similarly for using merge for other vendors.

  22. @dan:
    there is work in progress for something like it, but I don’t know the status – you can ask on pgsql-hackers about it.

  23. Out of curiosity, I tested the current patches by modifying one of the scripts in this post. The performance is approximately the same as the performance for the final solution you posted above, without concurrent deletes being an issue.

  24. Starting with the standard “I love PostgreSQL”! Although I was in a previous life an Oracle DBA and loved and used the UPSERT command all the time. I just don’t understand why we can’t have an UPSERT command and have to jump through so many hoops to duplicate this functionality. Was there some race condition with the Oracle UPSERT that I was unaware of? I used it in various heavily transacted OLTP DBs without any issues. Why is this so complicated?

  25. And… we have it now! 🙂 For 9.5, of course. As of about 24 hours ago: http://www.postgresql.org/docs/devel/static/sql-insert.html

    It’s so funny, I *just* started “using” PostgreSQL (well, deciding to use for next projects) 3 weeks ago, and something like this was one of those MySQL-type things missing. While I don’t think I have a huge NEED for it, definitely will be great to have!

    I’ll be looking forward to your “Waiting for 9.5” post about it. Would be interesting to see some benchmarks comparing to one of the workarounds, as well as plain INSERT or UPDATE.

    Thanks for the great blog! A wonderful resource since finding it a couple weeks ago when starting to dig deep into all things Postgres. 🙂

  26. What I think Matt means specifically is that there is an “ON CONFLICT” clause, which will enable us to first do the INSERT and if it exists (there’s a unique index violation), then perform the UPDATE.

    I’m still curious what happens in that split nano-second between when a conflict is determined and a DELETE occurs (before the UPDATE is issued).

    Read more about it here: http://www.postgresql.org/docs/devel/static/sql-insert.html#SQL-ON-CONFLICT

  27. I’m a bit skeptical about the importance of handling the DELETE case. Let’s say that another concurrent process does insert the matching row and then delete it as you mentioned in the article:

    1: update (row doesn’t exist)
    2: insert
    1: insert (fails, row exists)
    2: delete
    1: update (row doesn’t exist)

    Here you indicate that client 1 should retry the insert since the row deletion caused the update to effectively not be recorded.

    I’m not sure this is necessary, strictly speaking. If you’re unlucky enough to be in this situation then as a side effect of whichever scheduler causes these concurrent statements to be executed in exactly this order, they could instead happen in this order:

    2: insert
    1: update (row exists and is updated)
    2: delete

    In this case, your update does get written and then is promptly discarded. It won’t have ever been seen, so what is the point here?

    “Correctness” in this case doesn’t seem to be very well-defined. Can you provide an example where the retry loop has a real advantage over the single pass update-then-insert-then-update (no looping) approach? I understand that consistency is key but when you’re one operation reorder away from the data being discarded silently in what we would call correct operation (my second example) then I’m not sure I see the point.

  28. As a follow-up, the only “correct” way I see to perform this query is in a transaction with serializable isolation, and loop it. Other isolation levels are problematic re my previous comment.

  29. For this particular example, where whatever is the primary key:
    if we do the insert first, and then do the update if the insert fails, there should be no issues.

  30. @HM:

    Surely there is an issue: the pre-existing row can be deleted in the meantime by some other transaction.
