On 20th of January 2020, Amit Kapila committed patch:
Allow vacuum command to process indexes in parallel. This feature allows the vacuum to leverage multiple CPUs in order to process indexes. This enables us to perform index vacuuming and index cleanup with background workers. This adds a PARALLEL option to VACUUM command where the user can specify the number of workers that can be used to perform the command which is limited by the number of indexes on a table. Specifying zero as a number of workers will disable parallelism. This option can't be used with the FULL option. Each index is processed by at most one vacuum process. Therefore parallel vacuum can be used when the table has at least two indexes. The parallel degree is either specified by the user or determined based on the number of indexes that the table has, and further limited by max_parallel_maintenance_workers. The index can participate in parallel vacuum iff it's size is greater than min_parallel_index_scan_size. Author: Masahiko Sawada and Amit Kapila Mahendra Singh and Sergei Kornilov Discussion: https://postgr.es/m/CAD21AoDTPMgzSkV4E3SFo1CH_x50bf5PqZFQf4jmqjk-C03BWg@mail.gmail.com https://postgr.es/m/CAA4eK1J-VoR9gzS5E75pcD-OH0mEyCdp8RihcwKrcuw7J-Q0email@example.com
On 4th of April 2019, Robert Haas committed patch:
Allow VACUUM to be run with index cleanup disabled. This commit adds a new reloption, vacuum_index_cleanup, which controls whether index cleanup is performed for a particular relation by default. It also adds a new option to the VACUUM command, INDEX_CLEANUP, which can be used to override the reloption. If neither the reloption nor the VACUUM option is used, the default is true, as before. Masahiko Sawada, reviewed and tested by Nathan Bossart, Alvaro Herrera, Kyotaro Horiguchi, Darafei Praliaskouski, and me. The wording of the documentation is mostly due to me. Discussion: http://postgr.es/m/CAD21AoAt5R3DNUZSjOoXDUY=naYPUOuffVsRzuTYMz29yLzQCA@mail.gmail.com
On 15th of March, Robert Haas committed patch:
Add simple VACUUM progress reporting. There's a lot more that could be done here yet - in particular, this reports only very coarse-grained information about the index vacuuming phase - but even as it stands, the new pg_stat_progress_vacuum can tell you quite a bit about what a long-running vacuum is actually doing. Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale and Rahila Syed.
On 9th of March, Robert Haas committed patch:
Add a generic command progress reporting facility. Using this facility, any utility command can report the target relation upon which it is operating, if there is one, and up to 10 64-bit counters; the intent of this is that users should be able to figure out what a utility command is doing without having to resort to ugly hacks like attaching strace to a backend. As a demonstration, this adds very crude reporting to lazy vacuum; we just report the target relation and nothing else. A forthcoming patch will make VACUUM report a bunch of additional data that will make this much more interesting. But this gets the basic framework in place. Vinayak Pokale, Rahila Syed, Amit Langote, Robert Haas, reviewed by Kyotaro Horiguchi, Jim Nasby, Thom Brown, Masahiko Sawada, Fujii Masao, and Masanori Oyama.
On 23rd of January, Alvaro Herrera committed patch:
vacuumdb: enable parallel mode This mode allows vacuumdb to open several server connections to vacuum or analyze several tables simultaneously. Author: Dilip Kumar. Some reworking by Álvaro Herrera Reviewed by: Jeff Janes, Amit Kapila, Magnus Hagander, Andres Freund
In release notes to latest release you can find:
Fix VACUUM's tests to see whether it can update relfrozenxid (Andres Freund) In some cases VACUUM (either manual or autovacuum) could incorrectly advance a table's relfrozenxid value, allowing tuples to escape freezing, causing those rows to become invisible once 2^31 transactions have elapsed. The probability of data loss is fairly low since multiple incorrect advancements would need to happen before actual loss occurs, but it's not zero. In 9.2.0 and later, the probability of loss is higher, and it's also possible to get "could not access status of transaction" errors as a consequence of this bug. Users upgrading from releases 9.0.4 or 8.4.8 or earlier are not affected, but all later versions contain the bug. The issue can be ameliorated by, after upgrading, vacuuming all tables in all databases while having vacuum_freeze_table_age set to zero. This will fix any latent corruption but will not be able to fix all pre-existing data errors. However, an installation can be presumed safe after performing this vacuuming if it has executed fewer than 2^31 update transactions in its lifetime (check this with SELECT txid_current() < 2^31).
What does it really mean?
Looong time ago, I wrote a piece about removing bloat by moving rows away from the end of table, and vacuuming it.
But recently I was in position where I couldn't pg_reorg. So I had to look for other way. And I found it 🙂
Now, in the final post, I will try to explain how it happens that Pg chooses “Operation X" over “Operation Y".
For various reasons, and in various cases, bloat happens. Theoretically autovacuum protects us all, but sometimes it doesn't. Sometimes someone disables it, or mis-configures, or bad planet alignment happens, and we end up in deep bloat.
What to do then? Vacuum? Vacuum Full? Cluster? No. pg_reorg!