Waiting for PostgreSQL 14 – Add support for partitioned tables and indexes in REINDEX

On 8th of September 2020, Michael Paquier committed patch:

Add support for partitioned tables and indexes in REINDEX
 
Until now, REINDEX was not able to work with partitioned tables and
indexes, forcing users to reindex partitions one by one.  This extends
REINDEX INDEX and REINDEX TABLE so as they can accept a partitioned
index and table in input, respectively, to reindex all the partitions
assigned to them with physical storage (foreign tables, partitioned
tables and indexes are then discarded).
 
This shares some logic with schema and database REINDEX as each
partition gets processed in its own transaction after building a list of
relations to work on.  This choice has the advantage to minimize the
number of invalid indexes to one partition with REINDEX CONCURRENTLY in
the event a cancellation or failure in-flight, as the only indexes
handled at once in a single REINDEX CONCURRENTLY loop are the ones from
the partition being working on.
 
Isolation tests are added to emulate some cases I bumped into while
developing this feature, particularly with the concurrent drop of a
leaf partition reindexed.  However, this is rather limited as LOCK would
cause REINDEX to block in the first transaction building the list of
partitions.
 
Per its multi-transaction nature, this new flavor cannot run in a
transaction block, similarly to REINDEX SCHEMA, SYSTEM and DATABASE.
 
Author: Justin Pryzby, Michael Paquier
Reviewed-by: Anastasia Lubennikova
Discussion: https://postgr.es/m/db12e897-73ff-467e-94cb-4af03705435f.adger.lj@alibaba-inc.com

This is HUGE.

Let's assume you have partitioned table users:

=$ CREATE TABLE users (
    id int8 generated always AS IDENTITY PRIMARY KEY,
    username text NOT NULL
) partition BY range (id);
=$ CREATE INDEX q ON users (username);
=$ CREATE TABLE users_0 partition OF users FOR VALUES FROM (0) TO (10);
=$ CREATE TABLE users_1 partition OF users FOR VALUES FROM (10) TO (20);
=$ CREATE TABLE users_2 partition OF users FOR VALUES FROM (20) TO (30);

And, after some time, you'd like to reindex an index, to remove bloat.

Running REINDEX would require access exclusive lock, effectively blocking any access to table.

And, so far, we coulnd't reindex concurrently partitioned indexes:

$ reindex (verbose) INDEX concurrently q;
ERROR:  REINDEX IS NOT yet implemented FOR partitioned indexes

We could, of course, reindex each of the sub-indexes separately:

$ reindex (verbose) INDEX concurrently users_0_username_idx;
INFO:  INDEX "z.users_0_username_idx" was reindexed
DETAIL:  CPU: USER: 0.00 s, system: 0.00 s, elapsed: 0.01 s.
REINDEX
 
$ reindex (verbose) INDEX concurrently users_1_username_idx;
INFO:  INDEX "z.users_1_username_idx" was reindexed
DETAIL:  CPU: USER: 0.00 s, system: 0.00 s, elapsed: 0.01 s.
REINDEX
 
$ reindex (verbose) INDEX concurrently users_2_username_idx;
INFO:  INDEX "z.users_2_username_idx" was reindexed
DETAIL:  CPU: USER: 0.00 s, system: 0.00 s, elapsed: 0.01 s.
REINDEX

but that is far from nice.

Luckily, now, with this new patch, we can:

$ reindex (verbose) INDEX concurrently q;
INFO:  INDEX "public.users_0_username_idx" was reindexed
DETAIL:  CPU: USER: 0.00 s, system: 0.00 s, elapsed: 0.02 s.
INFO:  INDEX "public.users_1_username_idx" was reindexed
DETAIL:  CPU: USER: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  INDEX "public.users_2_username_idx" was reindexed
DETAIL:  CPU: USER: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
REINDEX

This is great. Thanks a lot to all involved.

2 thoughts on “Waiting for PostgreSQL 14 – Add support for partitioned tables and indexes in REINDEX”

Comments are closed.