I wrote about similar things couple of times, but recently found thread on pgsql-general mailing list that made me thing about it again.
Summary of the problem from mail is: we have a table, ~ 800 million rows, with, at least 2 columns:
- station – 170 distinct values
- channel – generally 1-3 channels per station
And then we want to run:
array_agg(DISTINCT (channel)) AS channels
Which, on Israel's (original poster) machine took ~ 5 minutes.
And this is with index on on data (station, channel).
Can we do better?
Continue reading Using recursive queries to get distinct elements from table
On 1st of February 2021, Peter Eisentraut committed patch:
SEARCH and CYCLE clauses
This adds the SQL standard feature that adds the SEARCH and CYCLE
clauses to recursive queries to be able to do produce breadth- or
depth-first search orders and detect cycles. These clauses can be
rewritten into queries using existing syntax, and that is what this
patch does in the rewriter.
Reviewed-by: Vik Fearing <email@example.com>
Reviewed-by: Pavel Stehule <firstname.lastname@example.org>
Continue reading Waiting for PostgreSQL 14 – SEARCH and CYCLE clauses
I was faced with interesting problem. Which schema, in my DB, uses the most disk space? Theoretically it's trivial, we have set of helpful functions:
But in some cases it becomes more of a problem. For example – when you have thousands of tables …
Continue reading Which schema is using the most disk space?
On 1st of February, Peter Eisentraut committed patch:
Add CREATE RECURSIVE VIEW syntax
This is specified in the SQL standard. The CREATE RECURSIVE VIEW
specification is transformed into a normal CREATE VIEW statement with a
WITH RECURSIVE clause.
reviewed by Abhijit Menon-Sen and Stephen Frost
Continue reading Waiting for 9.3 – Add CREATE RECURSIVE VIEW syntax
In PostgreSQL 8.4 we got CTE – Common Table Expressions. Since then we have this great tool available, but apparently for some people it's still black magic. CuTE, but still magic. I'll try to make it a bit less magical, and more understandable.
Continue reading CuTE overload
Yesterday on irc someone asked:
Hi, how do I get top 5 values from a column group by another column??
From further discussion, I learned that:
total rows in table is 2 million. It'll have unique words of less than 1 million.. (approx count)
I didn't have time yesterday, but decided to write a solution, or two, to the problem.
Continue reading Getting top-N rows per group
Yesterday, on #postgresql on irc some guy asked:
22:28 < rafasc> i am trying to use plpgsql to find the shortest path between two cities, each pair of cities has one or more edges, each edge has a different wheight.
22:28 < rafasc> Is there a easy way to compute the shortest path between two cities?
Well, I was not really in a mood to solve it, so I just told him to try with recursive queries, and went on my way.
But I thought about it. And decided to see if I can write the query.
Continue reading How to get shortest connection between two cities
I got asked on irc to show some examples how to use recursive CTE. Apparently my previous post wasn't good enough 🙂
I think that most of the users will use recursive cte to deal with trees I decided to show how to use it, even though it's not my favorite approach to dealing with trees in SQL.
Continue reading r/trees ( recursive trees, what did you think about? )
Today, on Stack Overflow there was interesting question.
Generally, given table that looks like this:
room | people | price | hotel
1 | 1 | 200 | A
2 | 2 | 99 | A
3 | 3 | 95 | A
4 | 1 | 90 | B
5 | 6 | 300 | B
Find cheapest combination of rooms that would accomodate given number of guests.
Continue reading Find cheapest combination of rooms in hotels