On 23rd of March 2017, Peter Eisentraut committed patch:
Logical replication support for initial data copy Add functionality for a new subscription to copy the initial data in the tables and then sync with the ongoing apply process. For the copying, add a new internal COPY option to have the COPY source data provided by a callback function. The initial data copy works on the subscriber by receiving COPY data from the publisher and then providing it locally into a COPY that writes to the destination table. A WAL receiver can now execute full SQL commands. This is used here to obtain information about tables and publications. Several new options were added to CREATE and ALTER SUBSCRIPTION to control whether and when initial table syncing happens. Change pg_dump option --no-create-subscription-slots to --no-subscription-connect and use the new CREATE SUBSCRIPTION ... NOCONNECT option for that. Author: Petr Jelinek
Some time ago I wrote about support for logical replication in PostgreSQL 10, and noted that it lacks support for sending initial data state.
This was solvable using some semi-clever tricks, but was a pain. Now, supposedly, the problem is gone. Let's see if it really is the case.
Just like previously, I ran master on port 5100, and slave on 5101, and made my psql prompt obvious about where I run the queries:
So, let's create the example table, and some data:
master $ CREATE TABLE test ( id serial PRIMARY KEY, whatever text, when_ts timestamptz ); CREATE TABLE master $ INSERT INTO test (whatever, when_ts) SELECT 'whatever #' || i, now() - '1 year'::INTERVAL * random() FROM generate_series(1,5) i ORDER BY random(); INSERT 0 5
Then, I need to create the table on slave:
slave $ CREATE TABLE test ( id serial PRIMARY KEY, whatever text, when_ts timestamptz ); CREATE TABLE
With this in place, let's create publication on master:
master $ CREATE PUBLICATION test_pub FOR TABLE test; CREATE PUBLICATION
and now on slave, let's subscribe to it:
slave $ CREATE SUBSCRIPTION test_sub CONNECTION 'host=127.0.0.1 port=5100 dbname=depesz user=postgres' PUBLICATION test_pub; NOTICE: created replication slot "test_sub" ON publisher NOTICE: synchronized TABLE states CREATE SUBSCRIPTION slave $ SELECT * FROM test; id | whatever | when_ts ----+-------------+------------------------------- 1 | whatever #2 | 2017-03-24 10:12:52.351159+01 2 | whatever #3 | 2017-01-08 04:49:43.931959+01 3 | whatever #4 | 2016-10-14 23:43:48.079159+02 4 | whatever #5 | 2016-08-04 21:37:21.467959+02 5 | whatever #1 | 2016-07-23 05:21:23.090359+02 (5 ROWS)
Nice. Data is copied automatically, by default. What about changes?
master $ DELETE FROM test WHERE id = 2; DELETE 1 master $ UPDATE test SET whatever = 'updated' WHERE id = 3; UPDATE 1 master $ INSERT INTO test (whatever, when_ts) VALUES ('new', now()); INSERT 0 1
and a moment later on slave:
slave $ SELECT * FROM test; id | whatever | when_ts ----+-------------+------------------------------- 1 | whatever #2 | 2017-03-24 10:12:52.351159+01 4 | whatever #5 | 2016-08-04 21:37:21.467959+02 5 | whatever #1 | 2016-07-23 05:21:23.090359+02 3 | updated | 2016-10-14 23:43:48.079159+02 6 | NEW | 2017-04-03 12:16:07.775591+02 (5 ROWS)
Nice. There could be a case where you don't want to copy initial data. In such cases CREATE SUBSCRIPTION has option NOCOPY DATA that will bring back previous behavior.
Great. This is really, really amazing. Thanks a lot Petr (and all other involved).