March 16th, 2011 by depesz | Tags: , , , , , | 10 comments »
Did it help? If yes - maybe you can help me? Donate BTC to 19zPa5diT2LZqGtTi8f8bfApLn8rw9zBHx

On 25th of February, Tom Lane committed patch:

Support data-modifying commands (INSERT/UPDATE/DELETE) in WITH.
 
This patch implements data-modifying WITH queries according to the
semantics that the updates all happen with the same command counter value,
and in an unspecified order. Therefore one WITH clause can't see the
effects of another, nor can the outer query see the effects other than
through the RETURNING values. And attempts to do conflicting updates will
have unpredictable results. We'll need to document all that.
 
This commit just fixes the code; documentation updates are waiting on
author.
 
Marko Tiikkaja and Hitoshi Harada

Ever since we got 8.2, we have this brilliant thing called “RETURNING", which lets you get back information from queries that modify data (delete/update/insert).

For example you could:

DELETE FROM table WHERE status = 'inactive' RETURNING *;

And then use returned values for example to backup them.

One thing that wasn't possible, was to use the values in some other query.

But now, thanks to wCTE it's possible. Finally. And you can, for example do things like:

WITH deleted_posts AS (
DELETE FROM posts
WHERE created < now() - '6 months'::INTERVAL
RETURNING *
)
SELECT user_id, count(*) FROM deleted_posts group BY 1;

Which would delete all posts older that 6 months, and show list of affected users, with count of removed posts.

Great thing is that you can use the CTE for next insert/update/delete – for example like this:

WITH deleted_posts AS (
DELETE FROM posts
WHERE created < now() - '6 months'::INTERVAL
RETURNING *
), deleted_per_user as (
SELECT user_id, count(*)
FROM deleted_posts
GROUP BY 1
)
UPDATE counts
SET posts_count = posts_count - d.count
FROM deleted_per_user d
WHERE d.user_id = counts.user_id;

Which does the delete, generates summary of deleted posts per user, and updates
additional table, using single query – and not single query for every user, or
single query for every deleted post (like it would happen with triggers).

This is absolutely amazing. As a side benefit, it will provide answer to all
those people that “need" a way to update two tables with the same query.

  1. 10 comments

  2. # edo
    Mar 16, 2011

    WITH deleted_posts AS (
    DELETE FROM posts
    WHERE created < now() – '6 months'::INTERVAL
    RETURNING *
    )
    SELECT user_id, count(*) FROM deleted_posts group BY 1;

    why not
    SELECT user_id, count(*) FROM (
    DELETE FROM posts
    WHERE created < now() – '6 months'::INTERVAL
    RETURNING * ) deleted_posts
    group BY 1;

    ?

  3. Mar 16, 2011

    @EDO yes, that’ll work for the simple case. This one makes more complicated cases simpler to do, as illustrated above.

  4. Mar 16, 2011

    @Edo:
    because such syntax doesn’t work.

    Please note what I wrote about not being able to use RETURNING * in select queries.

  5. Mar 16, 2011

    @David: it will? how? tested it and all I got was syntax error.

  6. Mar 16, 2011

    @ Depesz, I should know by now not to post before coffee :P

  7. # Marko Tiikkaja
    Mar 16, 2011

    @EDO:

    I think the main reasons why we decided to forbid data-modifying statements in subqueries disappeared during the development, but as David pointed out, this syntax is a lot more flexible. Some databases allow INSERT/UPDATE/DELETE in a subquery, but you can’t JOIN against it, the top-level statement needs to be a SELECT etc. There are a number of reasons to implement this behaviour first. Maybe at some point in the future we can revisit the idea of allowing a data-modifying statement in a single, top level subquery if it’s the only item in FROM, but I don’t really see the point of it.

  8. Mar 16, 2011

    @ Depesz, for what it’s worth, I’d like to see RETURNING get some enhancements. For example, in UPDATE, RETURNING can only access the new rows. It should have access to the old ones, too.

    Then there’s the case of RETURNING for other operations: COPY, TRUNCATE, DDL…

  9. May 20, 2011

    Another example presented during PgCon was ho to move data from ‘big table’ to partition in one statement. There is lack of extra functional for partition handling in Postgres.BTW it is pitty that explain planer does not use contraints like in this definition CREATE TABLE accounting.cdr_ig_00 ( CHECK ( (id_group_B%64) = 0 ) ) INHERITS (accounting.cdr_ig);, insteda we have to create dynamic query

  10. # Darryl Pye
    Jul 19, 2012

    Late to the party but you should be able to call one statement and then run multiple statements from the initial source statement.
    E.g If you wanted to delete a set of records and then update multiple other tables based on the first set of deleted tables.
    e.g

    WITH source AS (
    — delete from tbl returning *
    SELECT 1
    ), action_ctn AS (
    WITH action1 AS (
    — upd tble … from source returning 1; returning 1 just so to push to the next block
    SELECT 1
    ), action2 AS (
    — upd tble … from source returning 1;
    SELECT 1
    ), action3 AS (
    — upd tble … from source returning 1;
    SELECT 1
    ), action4 AS (
    — upd tble … from source returning 1;
    SELECT 1
    ) SELECT * FROM source
    )
    SELECT * FROM action_ctn

  11. # Darryl Pye
    Jul 19, 2012

    Should of noted this will only be available If Davids suggestion of enhancing RETURNING.

    either returning from the source or
    being able to do what edo suggests.
    Would be most helpful

Leave a comment