Waiting for 9.6 – Add simple VACUUM progress reporting.

On 15th of March, Robert Haas committed patch:

Add simple VACUUM progress reporting. 
 
There's a lot more that could be done here yet - in particular, this
reports only very coarse-grained information about the index vacuuming
phase - but even as it stands, the new pg_stat_progress_vacuum can
tell you quite a bit about what a long-running vacuum is actually
doing.
 
Amit Langote and Robert Haas, based on earlier work by Vinayak Pokale
and Rahila Syed.

Continue reading Waiting for 9.6 – Add simple VACUUM progress reporting.

Waiting for 9.6 – Add a generic command progress reporting facility.

On 9th of March, Robert Haas committed patch:

Add a generic command progress reporting facility.
 
Using this facility, any utility command can report the target relation
upon which it is operating, if there is one, and up to 10 64-bit
counters; the intent of this is that users should be able to figure out
what a utility command is doing without having to resort to ugly hacks
like attaching strace to a backend.
 
As a demonstration, this adds very crude reporting to lazy vacuum; we
just report the target relation and nothing else.  A forthcoming patch
will make VACUUM report a bunch of additional data that will make this
much more interesting.  But this gets the basic framework in place.
 
Vinayak Pokale, Rahila Syed, Amit Langote, Robert Haas, reviewed by
Kyotaro Horiguchi, Jim Nasby, Thom Brown, Masahiko Sawada, Fujii Masao,
and Masanori Oyama.

Continue reading Waiting for 9.6 – Add a generic command progress reporting facility.

Waiting for 9.5 – vacuumdb: enable parallel mode

On 23rd of January, Alvaro Herrera committed patch:

vacuumdb: enable parallel mode
 
This mode allows vacuumdb to open several server connections to vacuum
or analyze several tables simultaneously.
 
Author: Dilip Kumar.  Some reworking by Álvaro Herrera
Reviewed by: Jeff Janes, Amit Kapila, Magnus Hagander, Andres Freund

Continue reading Waiting for 9.5 – vacuumdb: enable parallel mode

What does “Fix VACUUM’s tests to see whether it can update relfrozenxid” really mean?

In release notes to latest release you can find:

Fix VACUUM's tests to see whether it can update relfrozenxid (Andres Freund)
 
In some cases VACUUM (either manual or autovacuum) could incorrectly advance
a table's relfrozenxid value, allowing tuples to escape freezing, causing
those rows to become invisible once 2^31 transactions have elapsed. The
probability of data loss is fairly low since multiple incorrect advancements
would need to happen before actual loss occurs, but it's not zero. In 9.2.0
and later, the probability of loss is higher, and it's also possible to get
"could not access status of transaction" errors as a consequence of this
bug. Users upgrading from releases 9.0.4 or 8.4.8 or earlier are not
affected, but all later versions contain the bug.
 
The issue can be ameliorated by, after upgrading, vacuuming all tables in
all databases while having vacuum_freeze_table_age set to zero. This will
fix any latent corruption but will not be able to fix all pre-existing data
errors. However, an installation can be presumed safe after performing this
vacuuming if it has executed fewer than 2^31 update transactions in its
lifetime (check this with SELECT txid_current() < 2^31).

What does it really mean?

Continue reading What does “Fix VACUUM's tests to see whether it can update relfrozenxid" really mean?

Bloat removal by tuples moving

Looong time ago, I wrote a piece about removing bloat by moving rows away from the end of table, and vacuuming it.

This is/was very slow, and was optimized (to some extent) by Nathan Thom, but his blogpost vanished. Besides, later on we got great tool: pg_reorg (or, as it's currently named: pg_repack).

But recently I was in position where I couldn't pg_reorg. So I had to look for other way. And I found it 🙂

Continue reading Bloat removal by tuples moving

Explaining the unexplainable – part 5

In previous posts in this series, I talked about how to read EXPLAIN output, and what each line (operation/node) means.

Now, in the final post, I will try to explain how it happens that Pg chooses “Operation X" over “Operation Y".

Continue reading Explaining the unexplainable – part 5

Reduce bloat of table without long/exclusive locks

Some time ago Joshua Tolley described how to reduce bloat from tables without locking (well, some locks are there, but very short, and not really intrusive).

Side note: Joshua: big thanks, great idea.

Based on his idea and some our research, i wrote a tool which does just this – reduces bloat in table.

Continue reading Reduce bloat of table without long/exclusive locks

Waiting for 8.5 – VACUUM FULL change

Some time ago Josh Berkus wrote about possible changes in VACUUM FULL.

Now these changes came to life. By now, I mean 6th of January, when Takahiro Itagaki committed his patch:

Log Message:
-----------
Support rewritten-based full vacuum as VACUUM FULL. Traditional
VACUUM FULL was renamed to VACUUM FULL INPLACE. Also added a new
option -i, --inplace for vacuumdb to perform FULL INPLACE vacuuming.
 
Since the new VACUUM FULL uses CLUSTER infrastructure, we cannot
use it for system tables. VACUUM FULL for system tables always
fall back into VACUUM FULL INPLACE silently.
 
Itagaki Takahiro, reviewed by Jeff Davis and Simon Riggs.

Continue reading Waiting for 8.5 – VACUUM FULL change

Waiting for 8.4 – Visibility maps

Yeah. This one patch alone is worth upgrading to 8.4.

On 3rd of December Heikki Linnakangas committed his patch. Commit message:

Introduce visibility map. The visibility map is a bitmap with one bit per
heap page, where a set bit indicates that all tuples on the page are
visible to all transactions, and the page therefore doesn't need
vacuuming. It is stored in a new relation fork.
 
Lazy vacuum uses the visibility map to skip pages that don't need
vacuuming. Vacuum is also responsible for setting the bits in the map.
In the future, this can hopefully be used to implement index-only-scans,
but we can't currently guarantee that the visibility map is always 100%
up-to-date.
 
In addition to the visibility map, there's a new PD_ALL_VISIBLE flag on
each heap page, also indicating that all tuples on the page are visible to
all transactions. It's important that this flag is kept up-to-date. It
is also used to skip visibility tests in sequential scans, which gives a
small performance gain on seqscans.

Continue reading Waiting for 8.4 – Visibility maps