On 30th of March 2019, Peter Eisentraut committed patch:
Generated columns This is an SQL-standard feature that allows creating columns that are computed from expressions rather than assigned, similar to a view or materialized view but on a column basis. This implements one kind of generated column: stored (computed on write). Another kind, virtual (computed on read), is planned for the future, and some room is left for it. Discussion: https://firstname.lastname@example.org
This change looks like relatively simple to explain. Let's consider table like:
=$ create table test ( id serial primary key, val_a int4 not null default 0, val_b int4 not null default 0, val_sum int4 generated always as (val_a + val_b) stored ); CREATE TABLE
Now, I can:
=$ insert into test (val_a, val_b) values (1, 2); INSERT 0 1 =$ select * from test; id | val_a | val_b | val_sum ----+-------+-------+--------- 1 | 1 | 2 | 3 (1 row) =$ update test set val_b = 5; UPDATE 1 =$ select * from test; id | val_a | val_b | val_sum ----+-------+-------+--------- 1 | 1 | 5 | 6 (1 row)
If I'd try to update val_sum, or insert row with specified values for it, then:
=$ update test set val_sum = 15; ERROR: column "val_sum" can only be updated to DEFAULT DETAIL: Column "val_sum" is a generated column. =$ insert into test (val_a, val_b, val_sum) values (1,2,5); ERROR: cannot insert into column "val_sum" DETAIL: Column "val_sum" is a generated column.
even if val_sum value is correct:
=$ insert into test (val_a, val_b, val_sum) values (1,2,3); ERROR: cannot insert into column "val_sum" DETAIL: Column "val_sum" is a generated column.
Generally – generated columns are basically read-only for user.
This kind of functionality was available before with triggers. But there are couple of differences.
First of all, \d of the table shows generated values in easier to read way than digging through triggers:
=$ \d test Table "public.test" Column | Type | Collation | Nullable | Default ---------+---------+-----------+----------+-------------------------------------------- id | integer | | not null | nextval('test_id_seq'::regclass) val_a | integer | | not null | 0 val_b | integer | | not null | 0 val_sum | integer | | | generated always as (val_a + val_b) stored Indexes: "test_pkey" PRIMARY KEY, btree (id)
Secondly – COPY, without mentioning columns, will copy just real columns, skipping all that are generated:
=$ copy test to stdout with (format csv, header true); id,val_a,val_b 1,1,5
This is nice addition that is basically impossible to do with triggers, unless you'll modify every COPY statement everywhere.
What's more – it's faster. I tried, and doing COPY of 100k rows to this table, with GENERATED val_sum took ~ 200ms on my computer, but when I changed it to be calculated by trivial pl/PgSQL trigger, the time was ~ 400ms – twice as long.
And thirdly, though it hasn't been done yet, there will be non-STORED generated columns.
Currently every generated column has to be marked as STORED, which means that table data, on disk, will contain calculated value for the column.
But in future, we will (hopefully) get virtual columns that are calculated on demand.
There are limitations, of course:
- generated column can't be part of partition key (whatever you use to partition your data with)
- generated column can't access other generated columns to calculate its value
- expression used to calculate generated column value has to be immutable – so either plain(ish) expression using columns from current row, or a function call to immutable function.
All in all – it looks pretty cool, and will definitely make certain things easier, or at least easier to read.
Thanks a lot, devs 🙂