Which tables should be auto vacuumed or auto analyzed – UPDATE

Some time ago I wrote blogpost which showed how to list tables that should be autovacuumed or autoanalyzed.

Query in there had one important problem – it didn't take into account per-table settings.

Specifically – it only used system-wide values for:

autovacuum_analyze_scale_factor
autovacuum_analyze_threshold
autovacuum_vacuum_scale_factor
autovacuum_vacuum_threshold

but these can be also set per table using syntax like:

=$ ALTER TABLE t2 SET ( autovacuum_vacuum_scale_factor = 0.3, autovacuum_vacuum_threshold = 200 );

Let's fix it.

To get per-table settings you have to look into reloptions column in pg_class table.

Appropriate query that gets real values of factors and thresholds looks like:

```
=$ SELECT
```
```
    n.nspname,
```
```
    c.relname,
```
```
    c.oid as relid,
```
```
    c.reltuples,
```
```
    s.n_dead_tup,
```
```
    s.n_mod_since_analyze,
```
```
    coalesce(
```

        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_scale_factor=' ),

        current_setting('autovacuum_analyze_scale_factor')

    )::float8 as autovacuum_analyze_scale_factor,

```
    coalesce(
```

        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_threshold=' ),

        current_setting('autovacuum_analyze_threshold')

    )::float8 as autovacuum_analyze_threshold,

```
    coalesce(
```

        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ),

        current_setting('autovacuum_vacuum_scale_factor')

    )::float8 as autovacuum_vacuum_scale_factor,

```
    coalesce(
```

        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_threshold=' ),

        current_setting('autovacuum_vacuum_threshold')

    )::float8 as autovacuum_vacuum_threshold

```
FROM
```
```
    pg_class c
```

    join pg_namespace n on c.relnamespace = n.oid

    LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid

```
WHERE c.relkind = 'r'
```
```
ORDER BY n.nspname, c.relname;
```

The magic to get appropriate values in in lines 8-23, but it's just 4 separate blocks, one for each variable. Let's look at one of them:

```
coalesce(
```

    (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_scale_factor=' ),

    current_setting('autovacuum_analyze_scale_factor')

)::float8 as autovacuum_analyze_scale_factor

SELECT in line 2 unpacks reloptions (it's array of texts), and gets only the row that starts with “autovacuum_analyze_scale_factor=“. Then, value is split using = character, and it returns second part – which is the value.

If reloptions do not contain autovacuum_analyze_scale_factor, then the whole subselect in line 2 will return NULL, which will then get passed to COALESCE, which will substitute global value of autovacuum_analyze_scale_factor using current_setting call.

Finally in line 4 I added cast to float8, so that the value will be useful for math.

In my case, with these tables:

 relname │                                                                            reloptions                                                                            
─────────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 t1      │ {autovacuum_enabled=off,autovacuum_analyze_scale_factor=0.2,autovacuum_analyze_threshold=100}
 t2      │ {autovacuum_enabled=off,autovacuum_vacuum_scale_factor=0.3,autovacuum_vacuum_threshold=200}
 t3      │ {autovacuum_enabled=off,autovacuum_analyze_scale_factor=0.2,autovacuum_analyze_threshold=100,autovacuum_vacuum_scale_factor=0.3,autovacuum_vacuum_threshold=200}
 t4      │ [null]
(4 rows)

and these global settings:

              name               │ setting 
─────────────────────────────────┼─────────
 autovacuum_analyze_scale_factor │ 0.1
 autovacuum_analyze_threshold    │ 50
 autovacuum_vacuum_scale_factor  │ 0.2
 autovacuum_vacuum_threshold     │ 50
(4 rows)

The query from above shows:

 nspname │ relname │ relid │ reltuples │ n_dead_tup │ n_mod_since_analyze │ autovacuum_analyze_scale_factor │ autovacuum_analyze_threshold │ autovacuum_vacuum_scale_factor │ autovacuum_vacuum_threshold 
─────────┼─────────┼───────┼───────────┼────────────┼─────────────────────┼─────────────────────────────────┼──────────────────────────────┼────────────────────────────────┼─────────────────────────────
 public  │ t1      │ 51871 │         0 │       1052 │              103453 │                             0.2 │                          100 │                            0.2 │                          50
 public  │ t2      │ 51881 │         0 │      10128 │              112529 │                             0.1 │                           50 │                            0.3 │                         200
 public  │ t3      │ 51891 │         0 │      51360 │              153761 │                             0.2 │                          100 │                            0.3 │                         200
 public  │ t4      │ 51901 │    102401 │          0 │                   0 │                             0.1 │                           50 │                            0.2 │                          50
(4 rows)

Now, with this in place, I can rewrite view from original post to:

=$ CREATE VIEW autovacuum_queue AS
WITH s AS (
    SELECT
    n.nspname,
    c.relname,
    c.oid as relid,
    c.reltuples,
    s.n_dead_tup,
    s.n_mod_since_analyze,
    coalesce(
        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_scale_factor=' ),
        current_setting('autovacuum_analyze_scale_factor')
    )::float8 as analyze_factor,
    coalesce(
        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_analyze_threshold=' ),
        current_setting('autovacuum_analyze_threshold')
    )::float8 as analyze_threshold,
    coalesce(
        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ),
        current_setting('autovacuum_vacuum_scale_factor')
    )::float8 as vacuum_factor,
    coalesce(
        (SELECT split_part(x, '=', 2) FROM unnest(c.reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_threshold=' ),
        current_setting('autovacuum_vacuum_threshold')
    )::float8 as vacuum_threshold
    FROM
        pg_class c
        join pg_namespace n on c.relnamespace = n.oid
        LEFT OUTER JOIN pg_stat_all_tables s ON c.oid = s.relid
    WHERE
        c.relkind = 'r'
), tt AS (
    SELECT
        nspname,
        relname,
        relid,
        n_dead_tup,
        n_mod_since_analyze,
        reltuples * vacuum_factor + vacuum_threshold AS v_threshold,
        reltuples * analyze_factor + analyze_threshold AS a_threshold
    FROM
        s
)
SELECT
    nspname,
    relname,
    relid,
    CASE WHEN n_dead_tup > v_threshold THEN 'VACUUM' ELSE '' END AS do_vacuum,
    CASE WHEN n_mod_since_analyze > a_threshold THEN 'ANALYZE' ELSE '' END AS do_analyze
FROM
    tt
WHERE
    n_dead_tup > v_threshold OR
    n_mod_since_analyze > a_threshold;

and with this, I can:

=$ SELECT *,
    pg_relation_size(relid),
    pg_total_relation_size(relid)
FROM autovacuum_queue
    ORDER BY pg_total_relation_size(relid) DESC
;
  nspname   │   relname    │ relid │ do_vacuum │ do_analyze │ pg_relation_size │ pg_total_relation_size 
────────────┼──────────────┼───────┼───────────┼────────────┼──────────────────┼────────────────────────
 public     │ t3           │ 51891 │ VACUUM    │ ANALYZE    │          7159808 │               11812864
 public     │ t2           │ 51881 │ VACUUM    │ ANALYZE    │          5128192 │                7987200
 public     │ t1           │ 51871 │ VACUUM    │ ANALYZE    │          4677632 │                7020544
 pg_catalog │ pg_statistic │  2619 │           │ ANALYZE    │           196608 │                 344064
(4 rows)

Hope it helps.

=$
|

Which tables should be auto vacuumed or auto analyzed – UPDATE

One thought on “Which tables should be auto vacuumed or auto analyzed – UPDATE”