I wrote about similar things couple of times, but recently found thread on pgsql-general mailing list that made me thing about it again.
Summary of the problem from mail is: we have a table, ~ 800 million rows, with, at least 2 columns:
- station – 170 distinct values
- channel – generally 1-3 channels per station
And then we want to run:
array_agg(DISTINCT (channel)) AS channels
Which, on Israel's (original poster) machine took ~ 5 minutes.
And this is with index on on data (station, channel).
Can we do better?
Continue reading Using recursive queries to get distinct elements from table
I was faced with interesting problem. Which schema, in my DB, uses the most disk space? Theoretically it's trivial, we have set of helpful functions:
But in some cases it becomes more of a problem. For example – when you have thousands of tables …
Continue reading Which schema is using the most disk space?
On 1st of February, Peter Eisentraut committed patch:
Add CREATE RECURSIVE VIEW syntax
This is specified in the SQL standard. The CREATE RECURSIVE VIEW
specification is transformed into a normal CREATE VIEW statement with a
WITH RECURSIVE clause.
reviewed by Abhijit Menon-Sen and Stephen Frost
Continue reading Waiting for 9.3 – Add CREATE RECURSIVE VIEW syntax
In PostgreSQL 8.4 we got CTE – Common Table Expressions. Since then we have this great tool available, but apparently for some people it's still black magic. CuTE, but still magic. I'll try to make it a bit less magical, and more understandable.
Continue reading CuTE overload
Yesterday on irc someone asked:
Hi, how do I get top 5 values from a column group by another column??
From further discussion, I learned that:
total rows in table is 2 million. It'll have unique words of less than 1 million.. (approx count)
I didn't have time yesterday, but decided to write a solution, or two, to the problem.
Continue reading Getting top-N rows per group
Yesterday, on #postgresql on irc some guy asked:
22:28 < rafasc> i am trying to use plpgsql to find the shortest path between two cities, each pair of cities has one or more edges, each edge has a different wheight.
22:28 < rafasc> Is there a easy way to compute the shortest path between two cities?
Well, I was not really in a mood to solve it, so I just told him to try with recursive queries, and went on my way.
But I thought about it. And decided to see if I can write the query.
Continue reading How to get shortest connection between two cities
I got asked on irc to show some examples how to use recursive CTE. Apparently my previous post wasn't good enough 🙂
I think that most of the users will use recursive cte to deal with trees I decided to show how to use it, even though it's not my favorite approach to dealing with trees in SQL.
Continue reading r/trees ( recursive trees, what did you think about? )
Today, on Stack Overflow there was interesting question.
Generally, given table that looks like this:
room | people | price | hotel
1 | 1 | 200 | A
2 | 2 | 99 | A
3 | 3 | 95 | A
4 | 1 | 90 | B
5 | 6 | 300 | B
Find cheapest combination of rooms that would accomodate given number of guests.
Continue reading Find cheapest combination of rooms in hotels
Yesterday on my favorite IRC channel fooqux asked interesting question. I took some more questions, and here is problem description:
We have a system which, every 5 minutes, takes a number of tasks to be done. Tasks are uniform. Within 5 minutes we can handle at most 100 tasks. Given the history of number of tasks added every 5 minutes, calculate backlog at any given moment.
Did you understand the problem? Well – I didn't. So, let's see the data, and expected output.
Continue reading Calculating backlog of events to handle
On 4th of September Tom Lane committed another great patch. This one is very large, and even after applying – it's has some rough edges. There will be need for additional patches to make the functionality fully robust, but the fact that it got committed means that it will be available in final 8.4.
What does it do?
Continue reading Waiting for 8.4 – Common Table Expressions (WITH queries)