Converting list of integers into list of ranges

Yesterday someone on irc asked:

i've a query that returns sequential numbers with gaps (generate_series + join) and my question is: can is somehow construct ranges out of the returned values? sort of range_agg or something?

There was no further discussion, aside from me saying

sure you can. not trivial task, but possible.
you'd need window functions.

but it got me thinking …

Continue reading Converting list of integers into list of ranges

I have PostgreSQL, loaded some data, and have app using it. Now what?

I had to deal with this question, or some version of it, quite a few times. So, decided to write a summary on what one could (or should) do, after data is in database, and application is running. Namely – setup some kind of replication and backups.

What to use, how, and why? This is what this post is all about.

Continue reading I have PostgreSQL, loaded some data, and have app using it. Now what?

Faking “From ” header with procmail/formail

I have an unusual mail setup.

My company mails are handled by gmail, and on my account there, I setup forwarding to my own mail account on my own server. Where I read the mails, and respond.

So, when someone (let's say “") sends me an email to my company email (let's say “"), it arrives to gmail, where it gets forwarded to my real account ( let's say “" ).

So far so good. The problem is that gmail, when forwarding mail modifies return-path, and thus my local SMTP server changes “From " pseudo-header from normal “From some date" to “From some date".

This is bad, because it destroys procmail logs, which use value from “From " to log information about who sent the email, and instead of nice and readable:

From Wed Sep 22 00:10:36 2010
 Subject: test Wed Sep 22 00:10:30 CEST 2010
   Folder: /somewhere/not/important/new/23423     4870

I get utterly useless:

From Wed Sep 22 00:10:36 2010
 Subject: test Wed Sep 22 00:10:30 CEST 2010
   Folder: /somewhere/not/important/new/23423     4870

I tried to fix the problem using formail, but apparently, when you do something like this:

:0 fhw
| formail -I "From $REAL_EMAIL"

in procmail.rc – it all works fine, but the “From " line is generated at the end of headers, which is pretty dumb, as it should be leading header.

Tried various stuff to solve the problem, but finally found one that really works. Here it goes in case someone in future will need it:

ENVELOPE_FROM=`formail "-xFrom "`
* ENVELOPE_FROM ?? ^depesz\+.*
    MAIL_FROM=`formail -xFrom: | perl -pe 's/.*?(\\S+@\\S+).*/\$1/;s/^<//;s/>\$//'`
    RESTORED_FROM=`echo "From $MAIL_FROM $( echo "$ENVELOPE_FROM" | sed 's/^[^[:space:]]* //' )"`
    :0 fhw
    | ( echo "$RESTORED_FROM"; formail -R "From " "X-Old_From_:" )

Important – this is at the beginning of procmailrc!

What it does?

First line: ENVELOPE_FROM=`formail "-xFrom "` gets current value of “From " header, and puts it to ENVELOPE_FROM variable.

Now, with:

* ENVELOPE_FROM ?? ^depesz\+.*

I check if the ENVELOPE_FROM is the one that is broken (after all, someone else might mail me directly), and if yes – I run what's in side of the { block }.

Inside, I get value of From: (with colon) header, and extract from it email.

After running this line:

MAIL_FROM=`formail -xFrom: | perl -pe 's/.*?(\\S+@\\S+).*/\$1/;s/^<//;s/>\$//'`

Assuming mail had line like From: “Some important guy" <>, MAIL_FROM will contain “".

Next line:

RESTORED_FROM=`echo "From $MAIL_FROM $( echo "$ENVELOPE_FROM" | sed 's/^[^[:space:]]* //' )"`

builds new value of “From " header, using extracted email, and timestamp (after first space) from original “From “.

So, to wrap with example. Assuming we have email with headers like:

From Wed Sep 22 00:10:36 2010
Subject: test Wed Sep 22 00:10:30 CEST 2010
From: "Some important guy" <>

After the RESTORED_FROM line, we will have following values in variables:

ENVELOPE_FROM=" Wed Sep 22 00:10:36 2010"
RESTORED_FROM="From Wed Sep 22 00:10:36 2010"

Then goes last part:

:0 fhw
| ( echo "$RESTORED_FROM"; formail -R "From " "X-Old_From_:" )

Which, passed mail headers to command, and treats it as filter.

The command does:

  1. prints new “From " header
  2. passes mail headers through formail, which renames “From " header into “X-Old_From_:" header

The trick is that the print happens before formail even will get the headers, so the new, fixed “From " will be returned to procmail before rest of headers, as returned by formail.

Effect: everything works, and logged From is now much more sensible.

It would be even better if gmail would include original envelope from in the headers, but it doesn't, so I have to take the address from “From:" (which not always is good idea, but at the very least – it's much better than getting all mails with the same “From “.

Anyone knows better/easier approach?

Installing PostgreSQL

First of all – I base information in this post on Linux, so if you are using another operating system – it's quite likely that it will not help you.

Second – if you're using Linux, you have probably some package manager – dpkg, rpm, pacman, poldek, whatever. Right? So, it should be easy to install Pg using this package manager?

Well, yes and no. Yes – of course all (known to me) Linux distributions include PostgreSQL, but I do not use their prepackaged version.

Why, and how do I install?

Continue reading Installing PostgreSQL

Getting record by 2 criteria, ordered by third – how to do it quickly?

Let's assume we have a simple table:

CREATE TABLE objects (
    id          serial primary key,
    category    INT4        NOT NULL DEFAULT 0,
    object_type INT4        NOT NULL DEFAULT 0,
    entered_on  TIMESTAMPTZ NOT NULL DEFAULT now()

(This is simplification, but it contains all necessary columns).

What should I do to be able to quickly get 50 newest objects in given category/object_type (or in many categories/many object_types, or in all categories/object_types), optionally with limiting entered_on “older than …".

Continue reading Getting record by 2 criteria, ordered by third – how to do it quickly?