Waiting for 9.1 – Writable CTE

On 25th of February, Tom Lane committed patch:

Support data-modifying commands (INSERT/UPDATE/DELETE) IN WITH.   
 
This patch implements data-modifying WITH queries according TO the           
semantics that the updates ALL happen WITH the same command counter VALUE,
AND IN an unspecified ORDER.  Therefore one WITH clause can't see the 
effects of another, nor can the outer query see the effects other than
through the RETURNING values.  And attempts to do conflicting updates will
have unpredictable results.  We'll need TO document ALL that.           
 
This commit just fixes the code; documentation updates are waiting ON
author.                                                                 
 
Marko Tiikkaja AND Hitoshi Harada

Ever since we got 8.2, we have this brilliant thing called “RETURNING", which lets you get back information from queries that modify data (delete/update/insert).

For example you could:

DELETE FROM TABLE WHERE STATUS = 'inactive' RETURNING *;

And then use returned values for example to backup them.

One thing that wasn't possible, was to use the values in some other query.

But now, thanks to wCTE it's possible. Finally. And you can, for example do things like:

WITH deleted_posts AS (
    DELETE FROM posts
        WHERE created < now() - '6 months'::INTERVAL
        RETURNING *
)
SELECT user_id, COUNT(*) FROM deleted_posts GROUP BY 1;

Which would delete all posts older that 6 months, and show list of affected users, with count of removed posts.

Great thing is that you can use the CTE for next insert/update/delete – for example like this:

WITH deleted_posts AS (
    DELETE FROM posts
        WHERE created < now() - '6 months'::INTERVAL
        RETURNING *
), deleted_per_user AS (
    SELECT user_id, COUNT(*)
        FROM deleted_posts
        GROUP BY 1
)
UPDATE counts
    SET posts_count = posts_count - d.count
    FROM deleted_per_user d
    WHERE d.user_id = counts.user_id;

Which does the delete, generates summary of deleted posts per user, and updates
additional table, using single query – and not single query for every user, or
single query for every deleted post (like it would happen with triggers).

This is absolutely amazing. As a side benefit, it will provide answer to all
those people that “need" a way to update two tables with the same query.

10 thoughts on “Waiting for 9.1 – Writable CTE”

  1. WITH deleted_posts AS (
    DELETE FROM posts
    WHERE created < now() – '6 months'::INTERVAL
    RETURNING *
    )
    SELECT user_id, count(*) FROM deleted_posts group BY 1;

    why not
    SELECT user_id, count(*) FROM (
    DELETE FROM posts
    WHERE created < now() – '6 months'::INTERVAL
    RETURNING * ) deleted_posts
    group BY 1;

    ?

  2. @EDO yes, that’ll work for the simple case. This one makes more complicated cases simpler to do, as illustrated above.

  3. @Edo:
    because such syntax doesn’t work.

    Please note what I wrote about not being able to use RETURNING * in select queries.

  4. @EDO:

    I think the main reasons why we decided to forbid data-modifying statements in subqueries disappeared during the development, but as David pointed out, this syntax is a lot more flexible. Some databases allow INSERT/UPDATE/DELETE in a subquery, but you can’t JOIN against it, the top-level statement needs to be a SELECT etc. There are a number of reasons to implement this behaviour first. Maybe at some point in the future we can revisit the idea of allowing a data-modifying statement in a single, top level subquery if it’s the only item in FROM, but I don’t really see the point of it.

  5. @ Depesz, for what it’s worth, I’d like to see RETURNING get some enhancements. For example, in UPDATE, RETURNING can only access the new rows. It should have access to the old ones, too.

    Then there’s the case of RETURNING for other operations: COPY, TRUNCATE, DDL…

  6. Another example presented during PgCon was ho to move data from ‘big table’ to partition in one statement. There is lack of extra functional for partition handling in Postgres.BTW it is pitty that explain planer does not use contraints like in this definition CREATE TABLE accounting.cdr_ig_00 ( CHECK ( (id_group_B%64) = 0 ) ) INHERITS (accounting.cdr_ig);, insteda we have to create dynamic query

  7. Late to the party but you should be able to call one statement and then run multiple statements from the initial source statement.
    E.g If you wanted to delete a set of records and then update multiple other tables based on the first set of deleted tables.
    e.g

    WITH source AS (
    — delete from tbl returning *
    SELECT 1
    ), action_ctn AS (
    WITH action1 AS (
    — upd tble … from source returning 1; returning 1 just so to push to the next block
    SELECT 1
    ), action2 AS (
    — upd tble … from source returning 1;
    SELECT 1
    ), action3 AS (
    — upd tble … from source returning 1;
    SELECT 1
    ), action4 AS (
    — upd tble … from source returning 1;
    SELECT 1
    ) SELECT * FROM source
    )
    SELECT * FROM action_ctn

  8. Should of noted this will only be available If Davids suggestion of enhancing RETURNING.

    either returning from the source or
    being able to do what edo suggests.
    Would be most helpful

Comments are closed.