Waiting for 9.2 – NOT VALID CHECKS

On 30th of June, Alvaro Herrera committed patch:

Enable CHECK constraints to be declared NOT VALID
 
This means that they can initially be added to a large existing table
without checking its initial contents, but new tuples must comply to
them; a separate pass invoked by ALTER TABLE / VALIDATE can verify
existing data and ensure it complies with the constraint, at which point
it is marked validated and becomes a normal part of the table ecosystem.
 
An non-validated CHECK constraint is ignored in the planner for
constraint_exclusion purposes; when validated, cached plans are
recomputed so that partitioning starts working right away.
 
This patch also enables domains to have unvalidated CHECK constraints
attached to them as well by way of ALTER DOMAIN / ADD CONSTRAINT / NOT
VALID, which can later be validated with ALTER DOMAIN / VALIDATE
CONSTRAINT.
 
Thanks to Thom Brown, Dean Rasheed and Jaime Casanova for the various
reviews, and Robert Hass for documentation wording improvement
suggestions.
 
This patch was sponsored by Enova Financial.

Some time ago I wrote about NOT VALID FOREIGN KEYS in 9.1. Now for 9.2, we'll be getting the same mechanism for CHECK constraints.

Normally one would add check constraint with:

ALTER TABLE test ADD CHECK ( FIELD >= 0 );

But now we can do it in two steps:

ALTER TABLE test ADD CHECK ( FIELD >= 0 ) NOT VALID;
ALTER TABLE test VALIDATE CONSTRAINT test_field_check;

So. Let's see if, and if yes – why it's cool.

For tests, I created simple tables:

$ CREATE TABLE test (id serial, some_text text, FIELD int4);
NOTICE:  CREATE TABLE will CREATE implicit SEQUENCE "test_id_seq" FOR serial COLUMN "test.id"
CREATE TABLE
 
$ INSERT INTO test (some_text, FIELD)
    SELECT repeat('depesz', 50), CAST( random() * 120000 AS int4)
    FROM generate_series(1,1000000);
INSERT 0 1000000
 
$ ALTER TABLE test ADD PRIMARY KEY (id);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will CREATE implicit INDEX "test_pkey" FOR TABLE "test"
ALTER TABLE
 
$ CREATE TABLE test2 (id serial, some_text text, FIELD int4);
NOTICE:  CREATE TABLE will CREATE implicit SEQUENCE "test2_id_seq" FOR serial COLUMN "test2.id"
CREATE TABLE
 
$ INSERT INTO test2 (id, some_text, FIELD) SELECT * FROM test;
INSERT 0 1000000
 
$ ALTER TABLE test2 ADD PRIMARY KEY (id);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will CREATE implicit INDEX "test2_pkey" FOR TABLE "test2"
ALTER TABLE

Structure and data in both tables are exactly the same. Each table, with indexes and toast takes ~ 360MB.

So, let's see speed and locks:

$ BEGIN;
BEGIN
 
*$ ALTER TABLE test ADD CHECK ( FIELD >= 0 );
ALTER TABLE
TIME: 206.138 ms
 
*$ SELECT * FROM pg_locks WHERE pid = pg_backend_pid();
   locktype    | DATABASE | relation |  page  | tuple  | virtualxid | transactionid | classid | objid  | objsubid | virtualtransaction | pid  |           mode           | GRANTED
---------------+----------+----------+--------+--------+------------+---------------+---------+--------+----------+--------------------+------+--------------------------+---------
 relation      |    16404 |    26486 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/295              | 9149 | ShareUpdateExclusiveLock | t
 relation      |    16404 |    26486 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/295              | 9149 | ShareRowExclusiveLock    | t
 transactionid |   [NULL] |   [NULL] | [NULL] | [NULL] | [NULL]     |          1399 |  [NULL] | [NULL] |   [NULL] | 2/295              | 9149 | ExclusiveLock            | t
 relation      |    16404 |    11000 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/295              | 9149 | AccessShareLock          | t
 virtualxid    |   [NULL] |   [NULL] | [NULL] | [NULL] | 2/295      |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/295              | 9149 | ExclusiveLock            | t
(5 ROWS)
 
*$ SELECT 11000::regclass, 26486::regclass;
 regclass | regclass
----------+----------
 pg_locks | test
(1 ROW)

OK. It took 206ms, and for this time it took ShareUpdateExclusiveLock and ShareRowExclusiveLock locks. Which, thanks to this table we know that it conflicts with everything except Access Share and Row Share – which basically turns table into read-only.

Now. What about new method?

First operation:

$ BEGIN;
BEGIN
TIME: 0.249 ms
 
*$ ALTER TABLE test2 ADD CHECK ( FIELD >= 0 ) NOT VALID;
ALTER TABLE
TIME: 0.871 ms
 
*$ SELECT * FROM pg_locks WHERE pid = pg_backend_pid();
   locktype    | DATABASE | relation |  page  | tuple  | virtualxid | transactionid | classid | objid  | objsubid | virtualtransaction | pid  |           mode           | GRANTED
---------------+----------+----------+--------+--------+------------+---------------+---------+--------+----------+--------------------+------+--------------------------+---------
 virtualxid    |   [NULL] |   [NULL] | [NULL] | [NULL] | 2/296      |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/296              | 9149 | ExclusiveLock            | t
 relation      |    16404 |    26497 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/296              | 9149 | ShareUpdateExclusiveLock | t
 relation      |    16404 |    26497 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/296              | 9149 | ShareRowExclusiveLock    | t
 transactionid |   [NULL] |   [NULL] | [NULL] | [NULL] | [NULL]     |          1400 |  [NULL] | [NULL] |   [NULL] | 2/296              | 9149 | ExclusiveLock            | t
 relation      |    16404 |    11000 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/296              | 9149 | AccessShareLock          | t
(5 ROWS)
 
*$ SELECT 26497::regclass;
 regclass
----------
 test2
(1 ROW)

OK. So we had the same level of locks, buit only for 0.871ms.

Now. In another transaction, I can run the validation:

$ BEGIN;
BEGIN
TIME: 0.139 ms
 
*$ ALTER TABLE test2 VALIDATE CONSTRAINT test2_field_check;
ALTER TABLE
TIME: 201.310 ms
 
*$ SELECT * FROM pg_locks WHERE pid = pg_backend_pid();
   locktype    | DATABASE | relation |  page  | tuple  | virtualxid | transactionid | classid | objid  | objsubid | virtualtransaction | pid  |           mode           | GRANTED
---------------+----------+----------+--------+--------+------------+---------------+---------+--------+----------+--------------------+------+--------------------------+---------
 relation      |    16404 |    26497 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/297              | 9149 | ShareUpdateExclusiveLock | t
 transactionid |   [NULL] |   [NULL] | [NULL] | [NULL] | [NULL]     |          1401 |  [NULL] | [NULL] |   [NULL] | 2/297              | 9149 | ExclusiveLock            | t
 virtualxid    |   [NULL] |   [NULL] | [NULL] | [NULL] | 2/297      |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/297              | 9149 | ExclusiveLock            | t
 relation      |    16404 |    11000 | [NULL] | [NULL] | [NULL]     |        [NULL] |  [NULL] | [NULL] |   [NULL] | 2/297              | 9149 | AccessShareLock          | t
(4 ROWS)

OK. Time is the same as in traditional way. But the locking is different – we have only ShareUpdateExclusiveLock, and no more ShareRowExclusiveLock. This change means that all modifications on the table will work just fine during the validation phase!

Let's see if it's really the case.

First, I'll drop the constraint:

$ ALTER TABLE test2 DROP CONSTRAINT test2_field_check;

Now, let's re-add invalid check:

$ ALTER TABLE test2 ADD CHECK ( FIELD >= 0 ) NOT VALID;

With this in place, in first psql, I run:

psql-1 $ BEGIN;
BEGIN
 
psql-1 1*$ ALTER TABLE test2 VALIDATE CONSTRAINT test2_field_check;
ALTER TABLE

and leave it in transaction.

Afterwards, I run another psql, and let's test some updates:

psql-2 $ UPDATE test2 SET FIELD = 50 WHERE id < 20;
UPDATE 19
 
psql-2 $ DELETE FROM test2 WHERE id < 10;
DELETE 9

All works. Sweet. That's really cool stuff.

3 thoughts on “Waiting for 9.2 – NOT VALID CHECKS”

  1. Thanks for the write up. This features is really cool, now we can add new constraints to huge tables with little relative impact!

Comments are closed.