Waiting for PostgreSQL 12 – Support foreign keys that reference partitioned tables

On 3rd of April 2019, Alvaro Herrera committed patch:

Support foreign keys that reference partitioned tables
 
 
Previously, while primary keys could be made on partitioned tables, it
was not possible to define foreign keys that reference those primary
keys.  Now it is possible to do that.
 
Author: Álvaro Herrera
 
Discussion: https://postgr.es/m/20181102234158.735b3fevta63msbj@alvherre.pgsql

THIS IS HUGE

In case you didn't know, if you partitioned tables, you had quite a lot of various limitations. One of the most important ones was that it wasn't possible to add fkey constraint that would point to partitioned table.

I once even wrote small series of blogposts how to hack it yourself ( part 1, part 2, and part 3 ).

But now, we got the real deal.

Let's see:

=$ CREATE TABLE users (
    id             serial NOT NULL,
    username       text NOT NULL,
    password       text
)
PARTITION BY RANGE ( id );
CREATE TABLE
 
=$ CREATE TABLE users_0
    partition OF users (id, PRIMARY KEY (id), UNIQUE (username))
    FOR VALUES FROM (MINVALUE) TO (10);
CREATE TABLE
 
=$ CREATE TABLE users_1
    partition OF users (id, PRIMARY KEY (id), UNIQUE (username))
    FOR VALUES FROM (10) TO (20);
CREATE TABLE
 
=$ INSERT INTO users (username) SELECT 'user #' || i FROM generate_series(1,15) i;
INSERT 0 15

Sanity check that data got distributed:

=$ SELECT tableoid::regclass, id, username FROM users ORDER BY id;
 tableoid | id | username 
----------+----+----------
 users_0  |  1 | USER #1
 users_0  |  2 | USER #2
 users_0  |  3 | USER #3
 users_0  |  4 | USER #4
 users_0  |  5 | USER #5
 users_0  |  6 | USER #6
 users_0  |  7 | USER #7
 users_0  |  8 | USER #8
 users_0  |  9 | USER #9
 users_1  | 10 | USER #10
 users_1  | 11 | USER #11
 users_1  | 12 | USER #12
 users_1  | 13 | USER #13
 users_1  | 14 | USER #14
 users_1  | 15 | USER #15
(15 ROWS)

OK. Looks that we have data properly spread across both partitions.

And now, let's try to make the table that has fkey to users:

=$ CREATE TABLE test (
    id serial PRIMARY KEY,
    user_id int4 NOT NULL REFERENCES users (id)
);
ERROR:  there IS no UNIQUE CONSTRAINT matching given KEYS FOR referenced TABLE "users"

OK, this failed. Previously, when partitioning was first added, you couldn't have pkey/unique on main table. Maybe I can and should now?

=$ ALTER TABLE users ADD PRIMARY KEY (id);
ALTER TABLE

Looks like it worked. So now, let's try to add test table again:

CREATE TABLE test (
    id serial PRIMARY KEY,
    user_id int4 NOT NULL REFERENCES users (id)
);
ERROR:  INDEX FOR 17666 NOT found IN partition users_0

OK. This was unexpected, and not really readable. But lets change my main table to have pkey from get go, and only then add partitions:

=$ DROP TABLE users cascade;
DROP TABLE;
 
=$ CREATE TABLE users (
    id             serial PRIMARY KEY,
    username       text NOT NULL,
    password       text
)
PARTITION BY RANGE ( id );
CREATE TABLE
 
=$ CREATE TABLE users_0
    partition OF users
    FOR VALUES FROM (MINVALUE) TO (10);
CREATE TABLE
 
=$ CREATE TABLE users_1
    partition OF users
    FOR VALUES FROM (10) TO (20);
CREATE TABLE
 
=$ INSERT INTO users (username) SELECT 'user #' || i FROM generate_series(1,15) i;
INSERT 0 15

and now let's try again with the referencing table:

=$ CREATE TABLE test (
    id serial PRIMARY KEY,
    user_id int4 NOT NULL REFERENCES users (id)
);
CREATE TABLE

SUCCESS.

So, let's test it. First, make sure we know correct user ids:

=$ SELECT string_agg(id::text, ', ' ORDER BY id) FROM users;
                    string_agg                     
---------------------------------------------------
 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
(1 ROW)

OK. So, these two should work:

=$ INSERT INTO test (user_id) VALUES (2);
INSERT 0 1
 
=$ INSERT INTO test (user_id) VALUES (14);
INSERT 0 1

but this should fail:

=$ INSERT INTO test (user_id) VALUES (100);
ERROR:  INSERT OR UPDATE ON TABLE "test" violates FOREIGN KEY CONSTRAINT "test_user_id_fkey"
DETAIL:  KEY (user_id)=(100) IS NOT present IN TABLE "users".

And it does!

Just to be safe, let's also test if it will correctly stop me from deleting used user:

=$ DELETE FROM users WHERE id = 2;
ERROR:  UPDATE OR DELETE ON TABLE "users_0" violates FOREIGN KEY CONSTRAINT "test_user_id_fkey1" ON TABLE "test"
DETAIL:  KEY (id)=(2) IS still referenced FROM TABLE "test".

SWEET.

While this looks good already, I do wonder, if it will work with hash based partitioning.

You see – range based partitions are great, but they require certain maintenance (create new one partitions when needed). So, let's try it:

=$ DROP TABLE test;
DROP TABLE
 
=$ DROP TABLE users cascade;
DROP TABLE
 
=$ CREATE TABLE users (
    username       text PRIMARY KEY,
    password       text
)
PARTITION BY HASH ( username );
CREATE TABLE
 
=$ CREATE TABLE users_0 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE
 
=$ CREATE TABLE users_1 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 1);
CREATE TABLE
 
=$ CREATE TABLE users_2 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 2);
CREATE TABLE
 
=$ CREATE TABLE users_3 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 3);
CREATE TABLE
 
=$ CREATE TABLE users_4 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 4);
CREATE TABLE
 
=$ CREATE TABLE users_5 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 5);
CREATE TABLE
 
=$ CREATE TABLE users_6 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 6);
CREATE TABLE
 
=$ CREATE TABLE users_7 partition OF users FOR VALUES WITH (MODULUS 8, REMAINDER 7);
CREATE TABLE
 
=$ INSERT INTO users (username) SELECT 'user #' || i FROM generate_series(1,15) i;
INSERT 0 15

Let's check if the users were distributed:

=$ SELECT tableoid::regclass, COUNT(*), string_agg(username, ', ' ORDER BY username)
    FROM users GROUP BY tableoid ORDER BY tableoid::regclass;
 tableoid | COUNT |              string_agg              
----------+-------+--------------------------------------
 users_0  |     2 | USER #12, USER #9
 users_3  |     4 | USER #11, USER #14, USER #2, USER #7
 users_4  |     4 | USER #10, USER #13, USER #4, USER #6
 users_5  |     2 | USER #1, USER #15
 users_6  |     2 | USER #5, USER #8
 users_7  |     1 | USER #3
(6 ROWS)

Well, not all of partitions were used, but I guess it's OK, given the small size of sample.

Now, can we add table that has fkey pointing to users?

=$ CREATE TABLE test (
    id serial PRIMARY KEY,
    username text NOT NULL REFERENCES users (username)
);
CREATE TABLE

OK. Soooo, let's see if we can insert some sane rows:

=$ INSERT INTO test (username) VALUES ('user #10');
INSERT 0 1
 
=$ INSERT INTO test (username) VALUES ('user #9');
INSERT 0 1

And a row that shouldn't be accepted:

=$ INSERT INTO test (username) VALUES ('user #666');
ERROR:  INSERT OR UPDATE ON TABLE "test" violates FOREIGN KEY CONSTRAINT "test_username_fkey"
DETAIL:  KEY (username)=(USER #666) IS NOT present IN TABLE "users".

This is amazing. Works as expected, with virtually no issues along the way.

Not sure if it was only Alvaros work, or were others involved, but in any way: HUGE thanks to everybody involved.

5 thoughts on “Waiting for PostgreSQL 12 – Support foreign keys that reference partitioned tables”

  1. @Alvaro:

    thanks a lot. Sorry for not filling in real bug report, but wasn’t sure if it’s not simply something I screwed up.

  2. Hi, in the first example of current implementation (Postgres 11), the table had a UNIQUE constraint on username.

    That constraint wasn’t used on the example of this new feature, is it a limitation or just an oversight?

    Could that constraint be created on the parent table instead of the child ones? And being on the child one, would that constraint allow these tuples to be inserted:

    (1, 'andrei', 'abc'),
    (11,'andrei','def');
  3. @Andrei:

    Please note that having the UNIQUE wasn’t really all that helpful – because it was only UNIQUE inside single partition. Consider:

    =$ CREATE TABLE users (
        id             serial NOT NULL,
        username       text NOT NULL,
        password       text
    ) PARTITION BY RANGE ( id );
    CREATE TABLE
     
    =$ CREATE TABLE users_0 partition OF users (id, PRIMARY KEY (id), UNIQUE (username)) FOR VALUES FROM (MINVALUE) TO (10);
    CREATE TABLE
     
    =$ CREATE TABLE users_1 partition OF users (id, PRIMARY KEY (id), UNIQUE (username)) FOR VALUES FROM (10) TO (20);
    CREATE TABLE
     
    =$ INSERT INTO users (id, username, password) VALUES (1, 'andrei', 'abc'), (11,'andrei','def');
    INSERT 0 2
     
    =$ SELECT username, COUNT(*) FROM users GROUP BY username;
     username | COUNT 
    ----------+-------
     andrei   |     2
    (1 ROW)

    As you can see – despite the UNIQUE, I was still able to insert duplicate username.

    Adding constraint on parent wouldn’t help – as parent table has no rows.

    To have cross-partition uniqueness, you’d need some kind of custom triggers. Or choose partition key that depends on the value you want to be unique (hash of username for example).

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.