Waiting for PostgreSQL 19 – Add CONCURRENTLY option to REPACK

On 6th of April 2026, Álvaro Herrera committed patch:

Add CONCURRENTLY option to REPACK
 
When this flag is specified, REPACK no longer acquires access-exclusive
lock while the new copy of the table is being created; instead, it
creates the initial copy under share-update-exclusive lock only (same as
vacuum, etc), and it follows an MVCC snapshot; it sets up a replication
slot starting at that snapshot, and uses a concurrent background worker
to do logical decoding starting at the snapshot to populate a stash of
concurrent data changes.  Those changes can then be re-applied to the
new copy of the table just before swapping the relfilenodes.
Applications can continue to access the original copy of the table
normally until just before the swap, which is the only point at which
the access-exclusive lock is needed.
 
There are some loose ends in this commit:
1. concurrent repack needs its own replication slot in order to apply
   logical decoding, which are a scarce resource and easy to run out of.
2. due to the way the historic snapshot is initially set up, only one
   REPACK process can be running at any one time on the whole system.
3. there's a danger of deadlocking (and thus abort) due to the lock
   upgrade required at the final phase.
 
These issues will be addressed in upcoming commits.
 
The design and most of the code are by Antonin Houska, heavily based on
his own pg_squeeze third-party implementation.
 
Author: Antonin Houska <ah@cybertec.at>
Co-authored-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/5186.1706694913@antos
Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql
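Since concurrent repack consumes a replication slot (the first loose end mentioned above), it's worth knowing you can watch slot usage while it runs. This is a sketch of my own, not something from the commit – the slot name is whatever the server picks, I'm not assuming any particular one:

```sql
-- While REPACK (CONCURRENTLY) runs in another session, list the slots
-- currently held on the cluster:
select slot_name, slot_type, active
  from pg_replication_slots;

-- And check how many slots the cluster allows in total:
show max_replication_slots;
```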

Let's see how well that really works.

I made a table with 200 million rows:

=$ create table test_for_repack as
    select i::int8 as id, repeat('d'||i::text, (2+random() * 10)::int4) as payload
    from generate_series(1,200_000_000) i;
SELECT 200000000
 
=$ alter table test_for_repack add primary key (id);
ALTER TABLE

Then I deleted 80% of rows from it:

=$ delete from test_for_repack where random() < .8;

Afterwards, the size of the table, including all the dead space, was:

=$ select pg_size_pretty(pg_table_size('test_for_repack'));
 pg_size_pretty
────────────────
 19 GB
(1 row)

Then, in a separate session, I ran this every half a second:

=$ insert into test_for_repack
    select (extract(epoch from now())*1000)::int8, 'test for: ' || now();

And made sure all of these got logged, with their durations.
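For the record, a setup like this can be sketched roughly as below – the loop itself is my reconstruction (I'm not showing the exact script used), and logging every statement with its duration only needs log_min_duration_statement:

```shell
# Log every statement with its duration (run once, as superuser):
psql -c "alter system set log_min_duration_statement = 0"
psql -c "select pg_reload_conf()"

# Fire the insert every half a second:
while true
do
    psql -qc "insert into test_for_repack
              select (extract(epoch from now())*1000)::int8,
                     'test for: ' || now()"
    sleep 0.5
done
```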

With this running, I ran:

=$ select now();
              now
───────────────────────────────
 2026-04-21 12:25:47.976493+02
(1 row)
 
=$ repack (verbose, analyze, concurrently) test_for_repack;
INFO:  repacking "public.test_for_repack" in physical order
INFO:  "public.test_for_repack": found 0 removable, 39999324 nonremovable row versions in 2636864 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU: user: 29.13 s, system: 15.44 s, elapsed: 55.16 s.
INFO:  analyzing "public.test_for_repack"
INFO:  "test_for_repack": scanned 30000 of 526817 pages, containing 2278370 live rows and 0 dead rows; 30000 rows in sample, 40009468 estimated total rows
INFO:  finished analyzing table "depesz.public.test_for_repack"
I/O timings: read: 3.358 ms, write: 41.653 ms
avg read rate: 384.273 MB/s, avg write rate: 384.132 MB/s
buffer usage: 127 hits, 30004 reads, 29993 dirtied
WAL usage: 30003 records, 29993 full page images, 245767603 bytes, 244294468 full page image bytes, 28394 buffers full
system usage: CPU: user: 0.22 s, system: 0.27 s, elapsed: 0.60 s
REPACK
Time: 65724.623 ms (01:05.725)
 
=$ select now();
              now
───────────────────────────────
 2026-04-21 12:26:53.704368+02
(1 row)

Checking the size showed that the table shrank, as expected:

=$ select pg_size_pretty(pg_table_size('test_for_repack'));
 pg_size_pretty
────────────────
 4117 MB
(1 row)

How about times?

Well, before the repack started, I got 47 inserts with an average time of 1.469ms. The maximum time was 3.3ms.

During the repack I logged 131 inserts. The slowest was 63ms, with an average of 5.536ms.

Afterwards, I let the inserts run for another ~600 rows. The average logged time was 9.537ms, with a max of ~14.5ms.

Now, you might wonder why the average insert time increased – I guess it's related to allocating new pages for the table.

Anyway – while there was a visible slowdown during the repack, there was nothing I would call really damaging.

This is awesome. Thanks a lot to everyone who has been working on it.
