June 13th, 2012 by depesz | Tags: , , , , , , , , , , | 5 comments »
Did it help? If yes - maybe you can help me?

Similar question has been asked many times on mailing lists and on IRC. Sometimes it's not mail sending, but file/directory creation, or something else that generally requires some interaction with “world outside of database".

Can it be done? Sure. How, then?

There are couple of methods, each has its own set of drawbacks and benefits (as always).

The answer most people are looking for (at least initially) is – how to do it from within database, without any kind of additional programs.

To do so, we will need to use stored procedure (function in PostgreSQL). But not just any.

We need something special.

Let's see. In psql, when I do \dL (to list languages) I get list like this:

$ \dL
                    List of languages
  Name   │ Owner │ Trusted │         Description
─────────┼───────┼─────────┼──────────────────────────────
 plperl  │ pgdba │ t       │ PL/Perl procedural language
 plpgsql │ pgdba │ t       │ PL/pgSQL procedural language
(2 rows)

Most of it is self explanatory, but what is the “Trusted" thing?

Trusted languages are named so because you (at least theoretically) cannot interact (using them) with anything outside of database.

So, you cannot use files, sockets. This includes also loading modules in PL/Perl – since modules are in files – you can't load them (well, technically you can, but not within PL/Perl function, you have to do it from postgresql.conf).

So – any language marked as trusted will not work for stuff like sending mails. Simply not possible.

But – there are, so called, untrusted languages.

The very first, not listed above, but working in every Pg instance is “C". This is not a language for stored procedures – it's rather interface to functions written in C (or any other language compilable to .so files), stored somewhere in filesystem, that PostgreSQL backend can load.

While it's great, and provides limitless possibilities, we'd rather use something interpreted, so we could create the functions within SQL itself (like in psql).

Luckily, PostgreSQL has many accessible languages – all depending on compile options, and installed extensions. In base sources for PostgreSQL 9.2 (9.1 too afaik, and in previous versions too), we have:

  • PL/Perl
  • PL/PerlU
  • PL/PythonU
  • PL/Python2U
  • PL/Tcl
  • PL/TclU
  • PL/pgSQL

As you can imagine – PL/Tcl is using Tcl language, and PL/PythonU, of course, Python. There are many more PL/* languages, available on PGXN or PgFoundryM, but that's not important at the moment.

You might notice that some of the languages have letter “U" at the end of name – these are untrusted languages. Meaning – using them you are not limited in what you can do. So, for example – using open() in PL/Perl will generally fail, but in PL/PerlU will happily work.

You might also notice, that there is no PL/Python – just PL/PythonU (and PL/Python2U, which is (currently) the same thing). Reason is very simple – some time ago developers found out that it is not possible to limit Python functionality in a way that's safe. So PL/Python had to disappear (in 7.4 version of Pg, loooong time ago).

So, let's load the language, and do something that will reach outside of database:

=$ psql -U pgdba -d test
(pgdba@[local]:5920) 10:58:42 [test]
$ create extension plperlu;
CREATE EXTENSION
 
(pgdba@[local]:5920) 10:58:47 [test]
$ \c test test
You are now connected to database "test" as user "test".
 
(test@[local]:5920) 10:58:52 [test]
$ create function make_dir(text) returns void as $$ mkdir('/tmp/' . shift ); $$ language plperlu;
ERROR:  permission denied for language plperlu

So, I was able to create the extension, but I can't create function using normal user. Reason is very simple – since plperlu is untrusted – only superusers can use it to create functions. And you can't even grant privileges on it:

(pgdba@[local]:5920) 11:00:31 [test]
$ grant usage on LANGUAGE plperlu to test;
ERROR:  language "plperlu" is not trusted
HINT:  Only superusers can use untrusted languages.

Does that mean that it's useless? Well, kind of. I stand on position that giving anyone superuser privileges in database equals to giving them shell access. But if you're not so strict – you can create the function using superuser, and then grant privileges to it to other users. Something like this:

(pgdba@[local]:5920) 11:03:39 [test]
$ create or replace function make_dir(text) returns void as $$ mkdir('/tmp/' . shift ); $$ language plperlu;
CREATE FUNCTION
 
(pgdba@[local]:5920) 11:03:49 [test]
$ grant execute on function make_dir(text) to test;
GRANT
 
(pgdba@[local]:5920) 11:03:52 [test]
$ \c test test
You are now connected to database "test" as user "test".
 
(test@[local]:5920) 11:03:54 [test]
$ select make_dir('whatever');
 make_dir
──────────
 [null]
(1 row)
 
(test@[local]:5920) 11:03:57 [test]
$ \! ls -ld /tmp/whatever
drwx------ 2 pgdba pgdba 4096 Jun 13 11:03 /tmp/whatever

As you can see I created the function using superuser account (pgdba), then granted privileges to execute it to test account, and ran it successfully from test. Directory has been created.

In the same way I could send mail, make HTTP request, or anything else.

And since it is a simple function in database – it can be used as a trigger function.

That's all good, but is there any way that doesn't require superuser privileges? Sure there is, but you will need to have something else instead.

The solution I will describe now, uses LISTEN and NOTIFY SQL commands.

The idea is very simple – you have one database connection that does “LISTEN", and then you have trigger that issues “NOTIFY".

So, let's test is.

First, I will need a test table, and the notifying trigger:

create table users (username text primary key, whatever int4);

and the trigger:

$ create function new_user_notify() returns trigger as $$
begin
    perform pg_notify('new_user', NEW.username);
    return NEW;
END;
$$ language plpgsql;
CREATE FUNCTION
 
$ create trigger new_user_notify after insert on users for each row execute procedure new_user_notify();
CREATE TRIGGER

And now I can test it:

Session #1 Session #2
  $ listen new_user;
$ begin;  
$ insert into users (username, whatever) values (‘depesz', 1); (nothing happens)
  (still nothing happens)
$ commit; (still nothing happens)
 
$ select 1;
 ?column?
──────────
        1
(1 row)
 
Asynchronous notification "new_user" with payload "depesz" received from server process with PID 11717.

We see couple of things:

  • notification was delivered only AFTER COMMIT. This is important, because it means that we will not have situations when NOTIFY was sent, but transaction is not yet committed, so the other process cannot see the data.
  • in psql at least – you have to do some kind of database interaction to get the notification

So – how can we use it to do something actually useful? Let's try to write a simple “daemon" that will simply print info about new users. Since I usually write in Perl, decided to write this (well, mostly copy/paste) in Python – for a change:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
 
import select
import psycopg2
import psycopg2.extensions
 
dbc = psycopg2.connect(database='depesz', host='127.0.0.1', port=5920)
dbc.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
cur = dbc.cursor()
cur.execute('LISTEN new_user')
 
while 1:
    if not select.select([dbc], [], [], 5) == ([], [], []):
        dbc.poll()
        while dbc.notifies:
            notify = dbc.notifies.pop()
            print "new_user %s, backed: %d" % (notify.payload, notify.pid)

When this is running – every inserted (and committed) new user, generates line like:

new_user neal_stephenson, backed: 11717
new_user daniel_suarez, backed: 13735

What the listening daemon will do – it's up to you. It can query back the database to get full row for new user, or it can send mails, or it can do anything you might want.

Is that perfect solution? Unfortunately no. While PostgreSQL will queue all NOTIFYs that weren't yet handled by all LISTENING processes (you can have many processes doing LISTEN on the same channel (new_user in my example)) – it will not queue them if nothing is listening.

So – if your listener will die – all notifications that happened in the time between its death and restart are gone. Of course this can be mitigated by running (for example) two daemons, and doing some kind of synchronization between them, but it's not perfect, and it's definitely not simple.

Interestingly – there is a whole set of tools that serve this particular case – get some info, and then keep it in queue until someone will take it. Yes – message queues.

Of course you can use general queues like RabbitMQ or ØMQ, but since we're in PostgreSQL, it would be cool to use something that was made for PostgreSQL. Right?

There is queue manager for PostgreSQL – it's called pgq, and it's part of Skype's SkyTools.

It's method of installation, and usage, is pretty well described in docs, and tutorial, so there is not much point in rewriting it here.

At this moment you might say: whoa, so much work just to get some emails sent when new user is created? Is there really nothing simpler? Well, there is. You can write a simple queue yourself.

For example:

$ create table process_queue (
    channel text,
    payload text,
    stored timestamptz,
    processed timestamptz
);
 
$ create function new_user_handle() returns trigger as $$
begin
    insert into process_queue (channel, payload, stored) values ('new_user', NEW.username, now());
    return NEW;
END;
$$ language plpgsql;
 
$ create trigger new_user_handle after insert on users for each row execute procedure new_user_handle();

Now, some inserts:

$ insert into users (username) values ('depesz');
INSERT 0 1
 
$ insert into users (username) values ('admin');
INSERT 0 1
 
$ insert into users (username) values ('user');
INSERT 0 1
 
$ insert into users (username) values ('test');
INSERT 0 1

Afterwards, process_queue contains some rows:

$ select * from process_queue ;
 channel  │ payload │            stored             │ processed
──────────┼─────────┼───────────────────────────────┼───────────
 new_user │ depesz  │ 2012-06-13 11:53:22.068471+02 │ [null]
 new_user │ admin   │ 2012-06-13 11:53:24.170629+02 │ [null]
 new_user │ user2012-06-13 11:53:31.141075+02 │ [null]
 new_user │ test    │ 2012-06-13 11:53:32.884775+02 │ [null]
(4 rows)

And, on top of it – just write a simple script that will run as daemon or as cronjob that will check the process_queue, and do whatever is necessary for all rows that are not yet processed.

What's more – you can also use LISTEN/NOTIFY with this approach, so that, with daemon approach, you will have fewer scans of the table. I.e. instead of:

while 1:
    check_for_new_rows_in_queue()
    sleep 5

(which scans the process_queue table every 5 seconds)

You will get logic as in python script above, waiting for notification from trigger, and only then checking for new rows. And if the daemon will die – all the not-yet-processed users will be waiting in process_queue, so on restart – just add one more check. Like:

check_for_new_rows_in_queue()
while 1:
    wait_for_notification()
    check_for_new_rows_in_queue()

As a final note – personally, I prefer the last option – its simple, can be done with standard PostgreSQL – even in cases when admins do not want to install new languages or extensions, and it “just works".

  1. 5 comments

  2. Jun 26, 2012

    Dear Hubert,
    Few years ago I had to solve a similar task:

    after something changes into certain db table to send immediately an email with attached db report, importable in spreadsheet software – Open Office and keep detailed log about sent email reports.
    I did not want to run daemons, because:
    1. they tend to quietly die sometimes
    2. put some useless burden on the server.
    LISTEN and NOTIFY solution seemed too complicated.

    I have sent you an email with all the details.

  3. # Pedro Ferro
    Mar 1, 2013

    Hi Hubert,

    I have a machine with OS Windows 64-bit (I can not change that).

    I have installed PostgrSQL 9.2 64bits and Active Perl 5.16.2 64bits on my machine.

    However when I create a new extension (plperlu) in Postgresql gives me an error like:
    “can not load library: c: \ Program Files \ PostgreSQL \ 9.2 \ lib \ plperl.dll.”

    I verified that the file plperl.dll exists in that exact location .

    Do you have any idea what is the problem?

    Thanks in advance,
    Pedro

  4. Mar 1, 2013

    @Pedro:
    sorry, I have no windows experience. Perhaps try pgsql-general mailing list.

  5. # postgresfan
    Jun 10, 2014

    Thanks a lot for this tutorial depesz!
    I just used this “howto” to notifiy my application when something changes in the database (which then sends out an email ;).
    Just wanna say you helped me quite often already with the tutorials on your blog – actually it became my #1 source when I got stuck with pg…
    Keep posting! 🙂

  6. # diego
    Aug 18, 2015

    Still waiting for the script to send emails…

Leave a comment