Tips N’ Tricks – Generating readable reports with plain SQL

Let's say you imported some data, but it contains duplicates. You will have to handle them in some way, but to make sensible choice on how to handle it, you need more information.

So, let's start. We have table:

# \d users
                                    Table "public.users"
   Column   |           Type           |                     Modifiers
------------+--------------------------+----------------------------------------------------
 id         | integer                  | not null default nextval('users_id_seq'::regclass)
 username   | text                     |
 registered | timestamp with time zone |
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)

Continue reading Tips N’ Tricks – Generating readable reports with plain SQL

Getting list of most common domains

Today, on #postgresql on IRC, guy (can't contact him now to get his permission to name him), said:

I have a table called problematic_hostnames. It contains a list of banned hostnames in column “hostname" (varchar). I would like to display the top 10 troll ISPs based on this. Does PG have a way of spotting a “pattern"? Some ISPs are example.net while others are foo.bar.example.net, so you can't just regexp the last X.Y (since that would cause “.co.uk" to be one of the top troll ISPs).

Continue reading Getting list of most common domains