bloat – select * from depesz;

Waiting for PostgreSQL 19 – Add CONCURRENTLY option to REPACK

On 6th of April 2026, Álvaro Herrera committed patch:

Add CONCURRENTLY option to REPACK
 
When this flag is specified, REPACK no longer acquires access-exclusive
lock while the new copy of the table is being created; instead, it
creates the initial copy under share-update-exclusive lock only (same as
vacuum, etc), and it follows an MVCC snapshot; it sets up a replication
slot starting at that snapshot, and uses a concurrent background worker
to do logical decoding starting at the snapshot to populate a stash of
concurrent data changes.  Those changes can then be re-applied to the
new copy of the table just before swapping the relfilenodes.
Applications can continue to access the original copy of the table
normally until just before the swap, which is the only point at which
the access-exclusive lock is needed.
 
There are some loose ends in this commit:
1. concurrent repack needs its own replication slot in order to apply
   logical decoding, which are a scarce resource and easy to run out of.
2. due to the way the historic snapshot is initially set up, only one
   REPACK process can be running at any one time on the whole system.
3. there's a danger of deadlocking (and thus abort) due to the lock
   upgrade required at the final phase.
 
These issues will be addressed in upcoming commits.
 
The design and most of the code are by Antonin Houska, heavily based on
his own pg_squeeze third-party implementation.
 
Author: Antonin Houska <ah@cybertec.at>
Co-authored-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Noriyoshi Shinoda <noriyoshi.shinoda@hpe.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/5186.1706694913@antos
Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql
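
Since the commit message points out that concurrent repack needs its own replication slot, it may be worth checking how many slots are still free before trying it. A minimal sketch follows. The table name some_table is made up, and the REPACK (CONCURRENTLY) spelling is an assumption based on the commit title rather than something quoted from the commit, so check the documentation of your build first.

```sql
-- How many replication slots are configured, in use, and still free?
SELECT current_setting('max_replication_slots')::int            AS max_slots,
       count(*)                                                 AS used_slots,
       current_setting('max_replication_slots')::int - count(*) AS free_slots
FROM pg_replication_slots;

-- Assumed syntax (not taken from the commit message); some_table is hypothetical.
REPACK (CONCURRENTLY) some_table;
```

If free_slots is 0, the concurrent mode will presumably have no slot to use until one is dropped or max_replication_slots is raised.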


Waiting for PostgreSQL 19 – Introduce the REPACK command

On 10th of March 2026, Álvaro Herrera committed patch:

Introduce the REPACK command
 
REPACK absorbs the functionality of VACUUM FULL and CLUSTER in a single
command.  Because this functionality is completely different from
regular VACUUM, having it separate from VACUUM makes it easier for users
to understand; as for CLUSTER, the term is heavily overloaded in the
IT world and even in Postgres itself, so it's good that we can avoid it.
 
We retain those older commands, but de-emphasize them in the
documentation, in favor of REPACK; the difference between VACUUM FULL
and CLUSTER (namely, the fact that tuples are written in a specific
ordering) is neatly absorbed as two different modes of REPACK.
 
This allows us to introduce further functionality in the future that
works regardless of whether an ordering is being applied, such as (and
especially) a concurrent mode.
 
Author: Antonin Houska <ah@cybertec.at>
Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/82651.1720540558@antos
Discussion: https://postgr.es/m/202507262156.sb455angijk6@alvherre.pgsql
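
The "two modes" mentioned in the commit message could map onto the older commands roughly as sketched below. The VACUUM FULL and CLUSTER statements are long-standing syntax; the REPACK lines are an assumption about how the new command is spelled, and some_table / some_table_idx are hypothetical names.

```sql
-- Old spelling: rewrite the table without any particular row order.
VACUUM FULL some_table;
-- Assumed REPACK counterpart.
REPACK some_table;

-- Old spelling: rewrite the table in the order of an index.
CLUSTER some_table USING some_table_idx;
-- Assumed REPACK counterpart.
REPACK some_table USING INDEX some_table_idx;
```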


When to use VACUUM FULL

Well, the short answer is: NEVER. But given how often I see people ask about it, I'll try to expand my answer a bit…


Bloat removal without table swapping

Some time ago I wrote about my favorite method of bloat removal. Around one year earlier, I wrote about another idea for bloat removal. This older idea was great – it didn't involve triggers, overhead on all writes, or table swapping. It had just one small, tiny, minuscule little issue. It was unbearably slow.

My idea was explored by Nathan Thom, but his blogpost disappeared.

Recently, Sergey Konoplev wrote to me about a tool he built on the same idea – updating rows to move them to other pages. So I decided that I had to check it out.
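
The row-moving idea can be sketched in plain SQL. This is only an illustration under assumptions, not the code of any of the tools mentioned: some_table, its id column, and page number 9999 are hypothetical, and in practice the steps have to be repeated, because a new row version may well land on the same page again if that page still has free space.

```sql
-- 1. How many pages does the table have? (The last page is pages - 1.)
SELECT pg_relation_size('some_table') / current_setting('block_size')::int AS pages;

-- 2. No-op update of every row that lives on the last page (assumed here to
--    be page 9999), so that new row versions can be written to free space
--    elsewhere in the table. 300 is above the maximum number of rows that
--    fit on a default 8kB page, so all item pointers are covered.
UPDATE some_table
SET id = id
WHERE ctid = ANY (ARRAY(
    SELECT format('(%s,%s)', 9999, i)::tid
    FROM generate_series(1, 300) AS i
));

-- 3. Once the trailing pages hold only dead rows, a plain vacuum can
--    truncate them and return the space to the operating system.
VACUUM some_table;
```

Each round touches only a handful of rows, which is also why the approach is slow on a big table.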


Bloat removal by tuples moving

A looong time ago, I wrote a piece about removing bloat by moving rows away from the end of the table, and then vacuuming it.

This is/was very slow, and was optimized (to some extent) by Nathan Thom, but his blogpost vanished. Besides, later on we got a great tool: pg_reorg (or, as it's currently named: pg_repack).

But recently I was in a position where I couldn't use pg_reorg. So I had to look for another way. And I found it 🙂


Bloat happens

For various reasons, and in various cases, bloat happens. Theoretically autovacuum protects us all, but sometimes it doesn't. Sometimes someone disables it, or misconfigures it, or bad planet alignment happens, and we end up in deep bloat.
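
A rough way to spot candidates is to look at the statistics views. The sketch below uses only standard catalogs, but note that n_dead_tup is an estimate of dead rows, not a measurement of bloat, so treat the result as a hint of where to look.

```sql
-- Tables with the most dead rows, and when autovacuum last processed them.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- Tables where someone set autovacuum_enabled explicitly (on or off).
SELECT c.relname, c.reloptions
FROM pg_class c
WHERE c.relkind = 'r'
  AND array_to_string(c.reloptions, ',') LIKE '%autovacuum_enabled=%';
```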

What to do then? Vacuum? Vacuum Full? Cluster? No. pg_reorg!


Waiting for 9.1 – Add UNIQUE/PRIMARY KEY with index

On 25th of January, Tom Lane committed patch:

Implement ALTER TABLE ADD UNIQUE/PRIMARY KEY USING INDEX.
 
This feature allows a unique or pkey constraint to be created using an
already-existing unique index.  While the constraint isn't very
functionally different from the bare index, it's nice to be able to do that
for documentation purposes.  The main advantage over just issuing a plain
ALTER TABLE ADD UNIQUE/PRIMARY KEY is that the index can be created with
CREATE INDEX CONCURRENTLY, so that there is not a long interval where the
table is locked against updates.
 
On the way, refactor some of the code in DefineIndex() and index_create()
so that we don't have to pass through those functions in order to create
the index constraint's catalog entries.  Also, in parse_utilcmd.c, pass
around the ParseState pointer in struct CreateStmtContext to save on
notation, and add error location pointers to some error reports that didn't
have one before.
 
Gurjeet Singh, reviewed by Steve Singer and Tom Lane
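
In practice, the workflow described in the commit message looks more or less like this. The table, column, index and constraint names are hypothetical.

```sql
-- Build the unique index without blocking writes for the duration of the build.
CREATE UNIQUE INDEX CONCURRENTLY some_table_id_idx ON some_table (id);

-- Turn the existing index into a primary key constraint.
ALTER TABLE some_table
    ADD CONSTRAINT some_table_pkey PRIMARY KEY USING INDEX some_table_id_idx;
```

Two things to keep in mind: the index is renamed to match the constraint name, and for a primary key the column should already be NOT NULL, otherwise ALTER TABLE has to scan the whole table to verify that while holding its lock.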


Reduce bloat of table without long/exclusive locks

Some time ago Joshua Tolley described how to reduce bloat from tables without locking (well, some locks are there, but very short, and not really intrusive).

Side note: Joshua: big thanks, great idea.

Based on his idea and some of our own research, I wrote a tool which does just this – reduces bloat in a table.



Tag: bloat