Waiting for PostgreSQL 12 – Allow user control of CTE materialization, and change the default behavior.

On 16th of February 2019, Tom Lane committed patch:

Allow user control of CTE materialization, and change the default behavior.
 
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it).  This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.
 
Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query.  Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.
 
Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free.  By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED.  Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.
 
Andreas Karlsson, Andrew Gierth, David Fetter
 
Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk

Continue reading Waiting for PostgreSQL 12 – Allow user control of CTE materialization, and change the default behavior.

Converting list of integers into list of ranges

Yesterday someone on irc asked:

i've a query that returns sequential numbers with gaps (generate_series + join) and my question is: can is somehow construct ranges out of the returned values? sort of range_agg or something?

There was no further discussion, aside from me saying

sure you can. not trivial task, but possible.
you'd need window functions.

but it got me thinking …

Continue reading Converting list of integers into list of ranges

Returning data in multiple columns

I was working today on some updates to client database. While doing it, I figured it would be simpler if I saw all “codenames" and ids of rows from dictionary table – not so big. But it was bigger than my screen – I have only 90 lines of text on screen, and there were ~ 200 rows of data in the table. So I started thinking – how to show this (codename, id) into more than one column, in psql.

Continue reading Returning data in multiple columns

Anonymize CTE names on explain.depesz.com

A colleague recently let me know that anonymization in explain.depesz.com doesn't handle CTE names. For example, in plan:

                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 CTE Scan ON some_name  (cost=0.01..0.03 ROWS=1 width=8) (actual TIME=0.027..0.028 ROWS=1 loops=1)
   CTE some_name
     ->  RESULT  (cost=0.00..0.01 ROWS=1 width=0) (actual TIME=0.023..0.023 ROWS=1 loops=1)
 Planning TIME: 0.217 ms
 Execution TIME: 0.124 ms
(5 ROWS)

“some_name" was kept even if anonymization was turned on.

Now it's fixed, and new code should properly anonymize all CTE names. Of course the change is live on explain.depesz.com too.

Getting count of distinct elements, per group, in PostgreSQL.

So, couple of days ago, some guy, from Periscope company wrote a blogpost about getting number of distinct elements, per group, faster using subqueries.

This was then submitted to Hacker News and r/Programming on Reddit.

Then, the original authors submitted second blogpost comparing speed between four different DB engines. Which, in turn, was also commented on Reddit.

I found the numbers presented by Periscope (as their improvement) as not that great.

Unfortunately – their blog doesn't allow for comments, so I decided to test it, and write on my own blog, what I can find about it.

Continue reading Getting count of distinct elements, per group, in PostgreSQL.

Waiting for 9.3 – Add CREATE RECURSIVE VIEW syntax

On 1st of February, Peter Eisentraut committed patch:

Add CREATE RECURSIVE VIEW syntax
 
This is specified in the SQL standard.  The CREATE RECURSIVE VIEW
specification is transformed into a normal CREATE VIEW statement with a
WITH RECURSIVE clause.
 
reviewed by Abhijit Menon-Sen and Stephen Frost

Continue reading Waiting for 9.3 – Add CREATE RECURSIVE VIEW syntax

Getting top-N rows per group

Yesterday on irc someone asked:

Hi, how do I get top 5 values from a column group by another column??

From further discussion, I learned that:

total rows in table is 2 million. It'll have unique words of less than 1 million.. (approx count)

I didn't have time yesterday, but decided to write a solution, or two, to the problem.

Continue reading Getting top-N rows per group

Filling the gaps with window functions

Couple of days ago I had a problem that I couldn't solve after ~ 2 hours, and decided to ask on IRC. Almost immediately after asking, I figured out the solution, but David asked me to write about the solution, even though it's now (for me) completely obvious.

The problem was like this:

I had two tables, with very simple structure: event_when timestamptz, event_count int4, and wanted to show it as a single recordset with columns: event_when, event_count_a, event_count_b, but the problem was that event_when usually didn't match. Here is an example:

Continue reading Filling the gaps with window functions