Waiting for PostgreSQL 13 – pgbench: add –partitions and –partition-method options.

On 3rd of October 2019, Amit Kapila committed patch:

pgbench: add --partitions and --partition-method options.
 
These new options allow users to partition the pgbench_accounts table by
specifying the number of partitions and partitioning method.  The values
allowed for partitioning method are range and hash.
 
This feature allows users to measure the overhead of partitioning if any.
 
Author: Fabien COELHO
 
Alvaro Herrera
Discussion: https://postgr.es/m/alpine.DEB.2.21..7008@lancre

This is interesting addition. If you're not familiar with pgbench, it's a tool that benchmarks PostgreSQL instance.

Running it happens in two phases:

  1. initialize: pgbench -i -s …
  2. run benchmark: pgbench …

Obviously, partitioning has to be done initialization, so let's try first simple:

=$ pgbench -i -s 100
...
done in 10.03 s (drop tables 0.00 s, create tables 0.02 s, generate 6.70 s, vacuum 1.28 s, primary keys 2.03 s).

In the test database, I see:

$ \d+
                                List OF relations
 Schema |       Name       | TYPE  | Owner | Persistence |  SIZE   | Description
--------+------------------+-------+-------+-------------+---------+-------------
 public | pgbench_accounts | TABLE | pgdba | permanent   | 1281 MB |
 public | pgbench_branches | TABLE | pgdba | permanent   | 40 kB   |
 public | pgbench_history  | TABLE | pgdba | permanent   | 0 bytes |
 public | pgbench_tellers  | TABLE | pgdba | permanent   | 80 kB   |
(4 ROWS)

Now, let's retry initialize, with partitioning:

=$ pgbench -i -s 100 --partitions=10
...
done in 19.70 s (drop tables 0.00 s, create tables 0.03 s, generate 7.34 s, vacuum 10.24 s, primary keys 2.08 s).

and content is:

$ \d+
                                       List OF relations
 Schema |        Name         |       TYPE        | Owner | Persistence |  SIZE   | Description
--------+---------------------+-------------------+-------+-------------+---------+-------------
 public | pgbench_accounts    | partitioned TABLE | pgdba | permanent   | 0 bytes |
 public | pgbench_accounts_1  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_10 | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_2  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_3  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_4  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_5  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_6  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_7  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_8  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_9  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_branches    | TABLE             | pgdba | permanent   | 40 kB   |
 public | pgbench_history     | TABLE             | pgdba | permanent   | 0 bytes |
 public | pgbench_tellers     | TABLE             | pgdba | permanent   | 80 kB   |
(14 ROWS)

with main table looking like this:

$ \d+ pgbench_accounts
                            Partitioned TABLE "public.pgbench_accounts"
  COLUMN  |     TYPE      | Collation | NULLABLE | DEFAULT | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+-------------
 aid      | INTEGER       |           | NOT NULL |         | plain    |              |
 bid      | INTEGER       |           |          |         | plain    |              |
 abalance | INTEGER       |           |          |         | plain    |              |
 filler   | CHARACTER(84) |           |          |         | extended |              |
Partition KEY: RANGE (aid)
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Partitions: pgbench_accounts_1 FOR VALUES FROM (MINVALUE) TO (1000001),
            pgbench_accounts_10 FOR VALUES FROM (9000001) TO (MAXVALUE),
            pgbench_accounts_2 FOR VALUES FROM (1000001) TO (2000001),
            pgbench_accounts_3 FOR VALUES FROM (2000001) TO (3000001),
            pgbench_accounts_4 FOR VALUES FROM (3000001) TO (4000001),
            pgbench_accounts_5 FOR VALUES FROM (4000001) TO (5000001),
            pgbench_accounts_6 FOR VALUES FROM (5000001) TO (6000001),
            pgbench_accounts_7 FOR VALUES FROM (6000001) TO (7000001),
            pgbench_accounts_8 FOR VALUES FROM (7000001) TO (8000001),
            pgbench_accounts_9 FOR VALUES FROM (8000001) TO (9000001)

If I'd use hash based partitioning, it would be:

=$ pgbench -i -s 100 --partitions=10 --partition-method=hash
...
done in 11.98 s (drop tables 0.12 s, create tables 0.03 s, generate 7.40 s, vacuum 1.93 s, primary keys 2.51 s).

created:

$ \d+
                                       List OF relations
 Schema |        Name         |       TYPE        | Owner | Persistence |  SIZE   | Description
--------+---------------------+-------------------+-------+-------------+---------+-------------
 public | pgbench_accounts    | partitioned TABLE | pgdba | permanent   | 0 bytes |
 public | pgbench_accounts_1  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_10 | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_2  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_3  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_4  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_5  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_6  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_7  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_8  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_accounts_9  | TABLE             | pgdba | permanent   | 128 MB  |
 public | pgbench_branches    | TABLE             | pgdba | permanent   | 40 kB   |
 public | pgbench_history     | TABLE             | pgdba | permanent   | 0 bytes |
 public | pgbench_tellers     | TABLE             | pgdba | permanent   | 80 kB   |
(14 ROWS)
 
$ \d+ pgbench_accounts
                            Partitioned TABLE "public.pgbench_accounts"
  COLUMN  |     TYPE      | Collation | NULLABLE | DEFAULT | Storage  | Stats target | Description
----------+---------------+-----------+----------+---------+----------+--------------+-------------
 aid      | INTEGER       |           | NOT NULL |         | plain    |              |
 bid      | INTEGER       |           |          |         | plain    |              |
 abalance | INTEGER       |           |          |         | plain    |              |
 filler   | CHARACTER(84) |           |          |         | extended |              |
Partition KEY: HASH (aid)
Indexes:
    "pgbench_accounts_pkey" PRIMARY KEY, btree (aid)
Partitions: pgbench_accounts_1 FOR VALUES WITH (modulus 10, remainder 0),
            pgbench_accounts_10 FOR VALUES WITH (modulus 10, remainder 9),
            pgbench_accounts_2 FOR VALUES WITH (modulus 10, remainder 1),
            pgbench_accounts_3 FOR VALUES WITH (modulus 10, remainder 2),
            pgbench_accounts_4 FOR VALUES WITH (modulus 10, remainder 3),
            pgbench_accounts_5 FOR VALUES WITH (modulus 10, remainder 4),
            pgbench_accounts_6 FOR VALUES WITH (modulus 10, remainder 5),
            pgbench_accounts_7 FOR VALUES WITH (modulus 10, remainder 6),
            pgbench_accounts_8 FOR VALUES WITH (modulus 10, remainder 7),
            pgbench_accounts_9 FOR VALUES WITH (modulus 10, remainder 8)

When running actual test it detects partitioning setup on its own:

=$ pgbench -j $( nproc ) -c $( nproc ) -T 30
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
partition method: hash
partitions: 10
query mode: simple
number of clients: 8
number of threads: 8
duration: 30 s
number of transactions actually processed: 49460
latency average = 4.853 ms
tps = 1648.313046 (including connections establishing)
tps = 1648.411195 (excluding connections establishing)

while for range based partitioning it looks like:

=$ pgbench -j $( nproc ) -c $( nproc ) -T 30
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
partition method: range
partitions: 10
query mode: simple
number of clients: 8
number of threads: 8
duration: 30 s
number of transactions actually processed: 51453
latency average = 4.665 ms
tps = 1714.829850 (including connections establishing)
tps = 1715.280907 (excluding connections establishing)

and without partitioning:

=$ pgbench -j $( nproc ) -c $( nproc ) -T 30
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 100
query mode: simple
number of clients: 8
number of threads: 8
duration: 30 s
number of transactions actually processed: 52600
latency average = 4.563 ms
tps = 1753.153104 (including connections establishing)
tps = 1753.661528 (excluding connections establishing)

This is pretty cool, thanks to all involved.