Waiting for 9.1 – Rewrite-less changing types of column

On 12th of February, Robert Haas committed patch:

Teach ALTER TABLE .. SET DATA TYPE TO avoid SOME TABLE rewrites.
 
WHEN the OLD TYPE IS BINARY coercible TO the NEW TYPE AND the USING     
clause does NOT CHANGE the COLUMN contents, we can avoid a FULL TABLE
rewrite, though any indexes ON the affected COLUMNS will still need
TO be rebuilt.  This applies, FOR example, WHEN changing a VARCHAR
COLUMN TO be OF TYPE text.
 
The prior coding assumed that the SET OF operations that force a
rewrite IS identical TO the SET OF operations that must be propagated
TO TABLES making USE OF the affected TABLE's rowtype.  This is
no longer true: even though the tuples in those tables wouldn't
need TO be modified, the DATA TYPE CHANGE invalidate indexes built
USING those composite TYPE COLUMNS.  Indexes ON the TABLE we're
actually modifying can be invalidated too, of course, but the
existing machinery is sufficient to handle that case.
 
Along the way, add some debugging messages that make it possible
to understand what operations ALTER TABLE is actually performing
in these cases.                                                      
 
Noah Misch and Robert Haas

Later on, on 15th, he committed second patch with few more cases where rewrite can be avoided.

This is not really new functionality, but rather (partial) removal of one of the biggest problems I ever had with PostgreSQL.

Consider table:

CREATE TABLE test (
    x VARCHAR(16)
);

If you'll decide that you need longer strings. Being smart guy, I assume that since I already had to increase it once, I will probably need to increase it again in the future, so I decide to use TEXT datatype.

ALTER TABLE test ALTER COLUMN x SET DATA TYPE text;

it will require full table lock, and rewrite, which can take very long time.

At least – up to 9.0.

So, let's see how it works. First, sanity check on 9.0:

$ CREATE TABLE test ( x VARCHAR(16) );
CREATE TABLE
 
$ INSERT INTO test SELECT i::text FROM generate_series(1,1000000) i;
INSERT 0 1000000

After these two commands, this table row in pg_class looks like this:

$ SELECT * FROM pg_class WHERE relname = 'test';
-[ RECORD 1 ]---+-------
relname         | test
relnamespace    | 2200
reltype         | 16482
reloftype       | 0
relowner        | 16385
relam           | 0
<b>relfilenode     | 16480</b>
reltablespace   | 0
relpages        | 0
reltuples       | 0
reltoastrelid   | 0
reltoastidxid   | 0
relhasindex     | f
relisshared     | f
relistemp       | f
relkind         | r
relnatts        | 1
relchecks       | 0
relhasoids      | f
relhaspkey      | f
relhasexclusion | f
relhasrules     | f
relhastriggers  | f
relhassubclass  | f
relfrozenxid    | 723
relacl          | [NULL]
reloptions      | [NULL]

Now, let's alter the table:

$ BEGIN;
BEGIN
 
*$ ALTER TABLE test ALTER COLUMN x SET DATA TYPE text;
ALTER TABLE
TIME: 1001.682 ms
 
*$ SELECT relation::regclass, mode FROM pg_locks WHERE pid = pg_backend_pid() AND relation IS NOT NULL;
 relation |        mode         
----------+---------------------
 pg_locks | AccessShareLock
 test     | ShareLock
 test     | AccessExclusiveLock
 16483    | AccessExclusiveLock
(4 ROWS)
 
*$ commit;

As you can see alter table took 1 second, and it obtained AccessExclusiveLock of test table.

Table row from pg_class:

$ SELECT * FROM pg_class WHERE relname = 'test';
-[ RECORD 1 ]---+-------
relname         | test
relnamespace    | 2200
reltype         | 16482
reloftype       | 0
relowner        | 16385
relam           | 0
<b>relfilenode     | 16483</b>
reltablespace   | 0
relpages        | 0
reltuples       | 0
reltoastrelid   | 0
reltoastidxid   | 0
relhasindex     | f
relisshared     | f
relistemp       | f
relkind         | r
relnatts        | 1
relchecks       | 0
relhasoids      | f
relhaspkey      | f
relhasexclusion | f
relhasrules     | f
relhastriggers  | f
relhassubclass  | f
relfrozenxid    | 726
relacl          | [NULL]
reloptions      | [NULL]

OK. So, sanity check complete, version 9.0 behaved as I described. What about 9.1?

$ CREATE TABLE test (
>>     x VARCHAR(16)
>> );
CREATE TABLE
 
$ INSERT INTO test SELECT i::text FROM generate_series(1,1000000) i;
INSERT 0 1000000
 
$ SELECT * FROM pg_class WHERE relname = 'test';
-[ RECORD 1 ]--+-------
relname        | test
relnamespace   | 2200
reltype        | 18663
reloftype      | 0
relowner       | 16387
relam          | 0
<b>relfilenode    | 18661</b>
reltablespace  | 0
relpages       | 0
reltuples      | 0
reltoastrelid  | 0
reltoastidxid  | 0
relhasindex    | f
relisshared    | f
relpersistence | p
relkind        | r
relnatts       | 1
relchecks      | 0
relhasoids     | f
relhaspkey     | f
relhasrules    | f
relhastriggers | f
relhassubclass | f
relfrozenxid   | 1090
relacl         | [NULL]
reloptions     | [NULL]

Test prepared, and now:

$ BEGIN;
BEGIN
 
*$ ALTER TABLE test ALTER COLUMN x SET DATA TYPE text;
ALTER TABLE
TIME: 30.846 ms
 
TIME: 0.391 ms
 
*$ SELECT relation::regclass, mode FROM pg_locks WHERE pid = pg_backend_pid() AND relation IS NOT NULL;
           relation            |        mode         
-------------------------------+---------------------
 pg_toast.pg_toast_18661       | ShareLock
 test                          | AccessExclusiveLock
 pg_toast.pg_toast_18661_index | AccessExclusiveLock
 pg_class                      | AccessShareLock
 pg_class_oid_index            | AccessShareLock
 pg_locks                      | AccessShareLock
 pg_class_relname_nsp_index    | AccessShareLock
(7 ROWS)
 
*$ SELECT * FROM pg_class WHERE relname = 'test';
-[ RECORD 1 ]--+-------
relname        | test
relnamespace   | 2200
reltype        | 18663
reloftype      | 0
relowner       | 16387
relam          | 0
<b>relfilenode    | 18661</b>
reltablespace  | 0
relpages       | 0
reltuples      | 0
reltoastrelid  | 18664
reltoastidxid  | 0
relhasindex    | f
relisshared    | f
relpersistence | p
relkind        | r
relnatts       | 1
relchecks      | 0
relhasoids     | f
relhaspkey     | f
relhasrules    | f
relhastriggers | f
relhassubclass | f
relfrozenxid   | 1090
relacl         | [NULL]
reloptions     | [NULL]

Here we can see three important informations:

  • relfilenode did not change – so it clearly proves that no rewrite happened
  • time spent by ALTER TABLE was 30ms, and not 1s
  • lock level didn't change – still AccessExclusiveLock on whole table

The last part could be troublesome, but given the very short time it means it's virtually irrelevant. Of course some caution is required, but the general disruption is much, much smaller.

One final note – while rewrite is avoided when changing from varchar() to text, it is not (as of now) avoided when changing varchar(n) to varchar(m)! Even for such a simple cases like increasing varchar(16) to varchar(32). So, if you have table with varchars you can easily change it to texts, and then add trigger/check to enforce length limit.

4 thoughts on “Waiting for 9.1 – Rewrite-less changing types of column”

  1. Filip:
    int4 and int8 have different storage – basically it’s not “variable length”. So it will need to rewrite anyway.

  2. Note that as of PostgreSQL 9.2 lengthening varchar (and other similar fields) no longer requires a table rewrite.

    http://www.postgresql.org/docs/current/static/release-9-2.html#AEN116571

    «Increasing the length limit for a varchar or varbit column, or removing the limit altogether, no longer requires a table rewrite. Similarly, increasing the allowable precision of a numeric column, or changing a column from constrained numeric to unconstrained numeric, no longer requires a table rewrite. Table rewrites are also avoided in similar cases involving the interval, timestamp, and timestamptz types.»

Comments are closed.