Waiting for 8.4 – parallel restoration of dumps

On 2nd of February Andrew Dunstan committed his patch (with editing by Tom Lane) that:

Log Message:
-----------
Provide for parallel restoration from a custom format archive. Each data and
post-data step is run in a separate worker child (a thread on Windows, a child
process elsewhere) up to the concurrent number specified by the new pg_restore
command-line --multi-thread | -m switch.

Well, the basic idea is simple: loading using multiple threads will (in theory) make restoration faster.

Parallel restoration is available only when dealing with -Fc dumps (custom format).

So, let's test how it works.

First, I created tables, and put some rows in them:

# CREATE TABLE t1 (i INT4);
CREATE TABLE
TIME: 56.411 ms
 
# INSERT INTO t1 (i) SELECT generate_series(1,10000000);
INSERT 0 10000000
TIME: 34922.493 ms
 
# ALTER TABLE t1 ADD PRIMARY KEY (i);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will CREATE implicit INDEX "t1_pkey" FOR TABLE "t1"
ALTER TABLE
TIME: 25071.766 ms
 
# CREATE TABLE t2 (i INT4);
CREATE TABLE
TIME: 3.367 ms
 
# INSERT INTO t2 (i) SELECT generate_series(1,10000000);
INSERT 0 10000000
TIME: 32917.159 ms
 
# ALTER TABLE t2 ADD PRIMARY KEY (i);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will CREATE implicit INDEX "t2_pkey" FOR TABLE "t2"
ALTER TABLE
TIME: 26295.656 ms
 
# CREATE TABLE t3 (i INT4);
CREATE TABLE
TIME: 20.818 ms
 
# INSERT INTO t3 (i) SELECT generate_series(1,10000000);
INSERT 0 10000000
TIME: 33337.663 ms
 
# ALTER TABLE t3 ADD PRIMARY KEY (i);
NOTICE:  ALTER TABLE / ADD PRIMARY KEY will CREATE implicit INDEX "t3_pkey" FOR TABLE "t3"
ALTER TABLE
TIME: 25605.919 ms

So, now I have 3 tables, each with 10 million records. Each table is 320MB, and the indexes (unique index for primary key) are 170MB each.

Well, first, I need to make dump of the database:

=> time pg_dump -Fc -f test.dump
 
real    0m40.490s
user    0m36.554s
sys     0m0.628s

Dump has 63MB, so it's pretty small.

Now, let's try to load it using single task. To make the test a bit better, I will load it 3 times to get some average on timing.

=> for a in 1 2 3; do dropdb depesz; createdb; time pg_restore -d depesz test.dump; done
 
real    2m53.637s
user    0m15.741s
sys     0m0.284s
 
real    3m3.501s
user    0m16.337s
sys     0m0.264s
 
real    2m59.544s
user    0m16.629s
sys     0m0.296s

So, on average it did load in 2m 58.894ms.

Now, let's try with various number of working threads. Since there are 3
tables, each with 1 index, there is no point in going over 3 worker threads.

Generally – pg_restore can put every data load and index creation in its own worker thread, which can give a lot of benefits if your hardware is good enough.

Let's test how good it will be on my laptop.

With 2 worker threads I got these times:

real  1m15.807s
real    1m14.390s
real    1m11.966s

WHOA! 1m 14.054s in average. It's more than 2 times faster!

And what about 3 worker threads?

real  1m9.079s
real    1m13.560s
real    1m7.277s

1m 9.972s in average. Not bad.

Generally – adding second worker gave a lot. Adding third – less. Most likely because I have 2 cores in the machine. Unfortunately I don't have any machine with more than 2 cores for tests, but I can imagine that with more cores I could benefit from adding more workers (though one have to remember about limitations of i/o channel).

All in all – another great feature, that is greatly appreciated by me (and probably not only me).

3 thoughts on “Waiting for 8.4 – parallel restoration of dumps”

  1. Thanks Depesz.
    Yes, I tested it on 4 way xeon, and up to 6 threads I can see a difference, on fairly big db (35GB compressed dump, loads of tables, indices, etc).

  2. Hi,
    – do dropdb depesz; createdb;
    + do dropdb depesz; createdb depesz;
    I think this is little misspeling.

  3. @Jarek S.:
    not really – createdb, if you don’t give it database name uses name of user as database name. and if username is not given – it uses shell user name.

    so, if i’m logged as depesz to my linuxbox, createdb with no parameters will create database depesz. or at least – will try to.

Comments are closed.