Using recursive queries to get distinct elements from table

I wrote about similar things couple of times, but recently found thread on pgsql-general mailing list that made me thing about it again.

Summary of the problem from mail is: we have a table, ~ 800 million rows, with, at least 2 columns:

  • station – 170 distinct values
  • channel – generally 1-3 channels per station

And then we want to run:

SELECT
    station,
    array_agg(DISTINCT (channel)) AS channels
FROM
    DATA
GROUP BY
    station;

Which, on Israel's (original poster) machine took ~ 5 minutes.

And this is with index on on data (station, channel).

Can we do better?

Continue reading Using recursive queries to get distinct elements from table

Waiting for PostgreSQL 14 – SEARCH and CYCLE clauses

On 1st of February 2021, Peter Eisentraut committed patch:

SEARCH and CYCLE clauses 
 
This adds the SQL standard feature that adds the SEARCH and CYCLE
clauses to recursive queries to be able to do produce breadth- or
depth-first search orders and detect cycles.  These clauses can be
rewritten into queries using existing syntax, and that is what this
patch does in the rewriter.
 
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/db80ceee-6f97-9b4a-8ee8-3ba0c58e5be2@2ndquadrant.com

Continue reading Waiting for PostgreSQL 14 – SEARCH and CYCLE clauses

Which schema is using the most disk space?

I was faced with interesting problem. Which schema, in my DB, uses the most disk space? Theoretically it's trivial, we have set of helpful functions:

  • pg_column_size
  • pg_database_size
  • pg_indexes_size
  • pg_relation_size
  • pg_table_size
  • pg_tablespace_size
  • pg_total_relation_size

But in some cases it becomes more of a problem. For example – when you have thousands of tables …

Continue reading Which schema is using the most disk space?

Waiting for 9.3 – Add CREATE RECURSIVE VIEW syntax

On 1st of February, Peter Eisentraut committed patch:

Add CREATE RECURSIVE VIEW syntax
 
This is specified in the SQL standard.  The CREATE RECURSIVE VIEW
specification is transformed into a normal CREATE VIEW statement with a
WITH RECURSIVE clause.
 
reviewed by Abhijit Menon-Sen and Stephen Frost

Continue reading Waiting for 9.3 – Add CREATE RECURSIVE VIEW syntax

Getting top-N rows per group

Yesterday on irc someone asked:

Hi, how do I get top 5 values from a column group by another column??

From further discussion, I learned that:

total rows in table is 2 million. It'll have unique words of less than 1 million.. (approx count)

I didn't have time yesterday, but decided to write a solution, or two, to the problem.

Continue reading Getting top-N rows per group

How to get shortest connection between two cities

Yesterday, on #postgresql on irc some guy asked:

22:28 < rafasc> i am trying to use plpgsql to find the shortest path between two cities, each pair of cities has one or more edges, each edge has a different wheight.
22:28 < rafasc> Is there a easy way to compute the shortest path between two cities?

Well, I was not really in a mood to solve it, so I just told him to try with recursive queries, and went on my way.

But I thought about it. And decided to see if I can write the query.

Continue reading How to get shortest connection between two cities

r/trees ( recursive trees, what did you think about? )

I got asked on irc to show some examples how to use recursive CTE. Apparently my previous post wasn't good enough 🙂

I think that most of the users will use recursive cte to deal with trees I decided to show how to use it, even though it's not my favorite approach to dealing with trees in SQL.

Continue reading r/trees ( recursive trees, what did you think about? )

Find cheapest combination of rooms in hotels

Today, on Stack Overflow there was interesting question.

Generally, given table that looks like this:

room | people | price   | hotel
 1   |    1   |   200   |   A
 2   |    2   |   99    |   A
 3   |    3   |   95    |   A
 4   |    1   |   90    |   B
 5   |    6   |   300   |   B

Find cheapest combination of rooms that would accomodate given number of guests.

Continue reading Find cheapest combination of rooms in hotels