Waiting for 9.3 – Add parallel pg_dump option.

On 24th of March, Andrew Dunstan committed this patch:

Add parallel pg_dump option.
 
New infrastructure is added which creates a set number of workers
(threads on Windows, forked processes on Unix). Jobs are then
handed out to these workers by the master process as needed.
pg_restore is adjusted to use this new infrastructure in place of the
old setup which created a new worker for each step on the fly. Parallel
dumps acquire a snapshot clone in order to stay consistent, if
available.
 
The parallel option is selected by the -j / --jobs command line
parameter of pg_dump.
 
Joachim Wieland, lightly editorialized by Andrew Dunstan.

Very recently I wrote about dumping in parallel, and now we have it committed 🙂

So, of course, I had to test it.

To have a sensible dataset, I took a couple of smallish databases and loaded them together into a single DB. The resulting database is 1043 MB and has 54 tables.

I tried dumping it normally, without parallelism, using pg_dump --format=p and pg_dump --format=c (the exact invocations are sketched below). For each format, I ran pg_dump 5 times. Results:

  • format=plain, average time: 29.716s (from 23.303s to 35.324s, file size 2,053,750,560 bytes)
  • format=custom, average time: 64.874s (from 63.129s to 66.354s, file size 475,318,581 bytes)
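
For reference, the baseline invocations were essentially the following (database and file names here are made up for illustration; each command was run 5 times):

    # Non-parallel baseline dumps; "testdb" is an illustrative database name
    time pg_dump --format=plain  --file=dump.sql    testdb
    time pg_dump --format=custom --file=dump.custom testdb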

The huge time difference made me think, and I realized it's probably due to compression. So I redid --format=c with the -Z0 option (sketched below). Results:

  • average time: 22.192s
  • time range: 21.564s to 23.086s
  • file size: 2,061,791,159 bytes
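
The uncompressed custom-format run looked roughly like this (names again illustrative):

    # Custom format with compression explicitly disabled
    time pg_dump --format=custom -Z0 --file=dump-z0.custom testdb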

So, that's the baseline.

Now, I tested the new parallel dumping, with 2, 4, and 8 jobs.
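
Parallel mode requires the directory format, so the runs were along these lines (the 4-job variant shown; names are illustrative):

    # Parallel dump with 4 worker jobs into a directory-format archive
    time pg_dump --format=directory --jobs=4 --file=dump_dir testdb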

Results:

  • -j2: from 18.592s to 23.564s, average 20.918s
  • -j4: from 18.631s to 31.076s, average 24.983s
  • -j8: from 18.528s to 24.062s, average 21.648s

The difference, while it exists, is not really great. It could be related to a number of factors, ranging from poor I/O (tested on a RAID1 of WD VelociRaptors, and on a 1st-gen Intel SSD) to simply not enough data. Or perhaps tuning issues. Whatever it is, the queries run by pg_dump are not blocked on locks, so it looks like the lack of a significant speedup is not a flaw in the technology, but rather in my testing setup.
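
One simple way to verify that nothing gets blocked (just a sketch of the idea; "testdb" is an illustrative name) is to look for ungranted lock requests from a second session while the dump runs:

    # In another session, while pg_dump is running, list lock requests that are still waiting
    psql -d testdb -c "SELECT locktype, relation::regclass, mode, pid FROM pg_locks WHERE NOT granted;"

If this returns no rows for the duration of the dump, nothing is waiting on locks.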

In any case, I am very enthusiastic about this patch, and I hope that when it gets into a released Pg version, and I get some more serious databases to upgrade, I will see better numbers.

8 thoughts on “Waiting for 9.3 – Add parallel pg_dump option.”

  1. Parallel pg_dump only works with --format=directory, according to the manual. But maybe they should make it more obvious, with a warning if you specify both jobs and a format which does not support concurrent writes.

  2. I don’t know about the comment above, but looking at the first result with a single-threaded pg_dump, I think it would be awesome if pg_dump had multithreaded compression. It would make dumping into the compressed custom format a lot faster.

    Maybe pg_dump could have an option to define a (de)compression program, so you could use it with pigz for example (see the sketch after the comments for a pipe-based approximation of this).

  3. @Kustodian:
    I imagine that when you use -Fd -j #, you effectively get multithreaded compression – each worker compresses its own data.

  4. @depesz it isn’t obvious since I do not see anywhere in the post a measurement with -Fd and no parallelism.

  5. @Andreas:
    well, I didn’t find them interesting. When one is doing single-threaded backups (as all backups are now), he/she chooses plain or custom format in the huge majority of cases.
    So, the speedup of a “single-threaded” dir format vs. a “multi-threaded” dir format is much less interesting than the “usual (plain/custom) single-threaded dump” vs. a “multi-threaded, and by requirement, dir dump”.

  6. Just testing out pg_restore -j for the first time.. so not entirely related to this post. I cannot speed the dump up yet as I’m only on 9.0.. 🙂

    A lot of the parallel process is stuck loading 2 very large tables. It would be an awesome feature of pg_dump to be able to break down large tables into smaller chunks, allowing pg_restore -j to utilize more resources here.

    I do know that index builds currently cannot utilize more than one core, but data load should be able to do it fairly easily.

  7. I think if all your tables are roughly the same size, you would not see much improvement.

    You would probably get a better idea if you had 50 other smallish tables and one or two large tables.

    Your large table will be the blocker, but with simultaneous dumping of the other tables, pg_dump can do that work while the big one is chugging along.

    When doing my dumps, I routinely see the audit log and audit log detail tables just hang there when I know it could be working on all the other tables.
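
A note on the pigz idea from comment 2: for plain-format dumps, parallel compression is already achievable today by piping pg_dump's output through an external compressor. A rough sketch (names are illustrative):

    # Plain-format dump piped through pigz, which compresses on multiple cores (8 here)
    pg_dump --format=plain testdb | pigz -p 8 > dump.sql.gz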
