This post has been updated with new code that uses temporary table – the code is at the end of post!

There was this question on Stack Overflow.

For future reference: guy asked how to do session variables – i.e. something he could define once in session, and later reuse in standard sql queries – without modifying postgresql.conf – so usage of custom_variable_classes is forbidden 🙂

While I don't actually see why somebody would want to avoid modifying its postgresql.conf (short of “it's shared hosting and I don't have superuser privileges"), I thought that it will be rather simple, and at the same time, rather interesting.

So, let's do it:

First let's make some assumptions on what we want to achieve:

  • a way to store value for parameter in given session
  • a way to get value of parameter in given session

This looks really simple.

So, let's make a table, and store backend pid and parameter name in there – together with value. If some session asks for value of parameter X, we check if it's there for given backend_pid.

Brilliant.

But. Let's assume some session stored some parameters, then disconnected. Then some other session connected, and it just so happened that the backend got the same pid as previously. While this might sound far fetched – it's actually pretty common in case you're using connection pooling software.

So, we need a way to clean params – so pooling program can cleanup session, before reusing it.

But what about the case when we don't use connection pooling? Reusage of pid is less likely, but still possible. So, perhaps some kind of expiration of parameters? For example “this parameter is only for 5 minutes".

So, now our feature list is longer:

  • a way to store value for parameter in given session
  • a way to get value of parameter in given session
  • a way to clean all values for given session
  • a way to expire parameters

Expiration should be automatical – i.e. we shouldn't need any cronjob to do it. The simplest way to do it, is to add implicit expiration as first step to all other operations, and just making sure that it's as fast as possible.

So, let's start. First – to minimize impact on other code, let's make our own schema:

CREATE SCHEMA session_variables;

In this schema we can put all tables and functions we need.

First things first – let's add table to store data:

CREATE TABLE session_variables.data (
backend_pid INT4,
variable_name TEXT,
variable_value TEXT,
expires_on TIMESTAMPTZ NOT NULL,
PRIMARY KEY (backend_pid, variable_name)
);

Because we will be calling expiration on every set and get operation, we should have index on expires_on:

CREATE INDEX data_expires_on ON session_variables.data (expires_on);

and since we will be calling expire in other functions let's first create expiration:

CREATE OR REPLACE FUNCTION session_variables.expire() RETURNS void as $$
DELETE FROM session_variables.data WHERE expires_on < clock_timestamp();
$$ language sql;

Nothing really fancy, but will work.

Now we should add function to store new value:

CREATE OR REPLACE FUNCTION session_variables.set_value( IN _name TEXT, IN _value TEXT, IN _expires TIMESTAMPTZ ) RETURNS void as $$
DECLARE
_pid int4 := pg_backend_pid();
BEGIN
PERFORM session_variables.expire();
LOOP
UPDATE session_variables.data SET variable_value = _value, expires_on = _expires WHERE backend_pid = _pid AND variable_name = _name;
IF FOUND THEN
RETURN;
END IF;
BEGIN
INSERT INTO session_variables.data( backend_pid, variable_name, variable_value, expires_on ) VALUES ( _pid, _name, _value, _expires );
RETURN;
EXCEPTION
WHEN unique_violation THEN
-- ignore
END;
END LOOP;
END;
$$ language plpgsql;

It's pretty simple – you just call:

SELECT session_variables.set_value('language', 'pl_PL', now() + '2 hours'::interval);

and the value it set in your session.

To make testing and usage simpler we can add simplified function:

CREATE OR REPLACE FUNCTION session_variables.set_value( TEXT, TEXT ) RETURNS void as $$
SELECT session_variables.set_value($1, $2, 'infinity');
$$ language sql;

Which get's only first 2 parameters:

SELECT session_variables.set_value('language', 'pl_PL');

and makes the variable such that it will never expire.

Since we can add new variable, we should now get some way to retrieve it's value:

CREATE OR REPLACE FUNCTION session_variables.get_value( TEXT ) RETURNS TEXT as $$
SELECT session_variables.expire();
SELECT variable_value FROM session_variables.data WHERE backend_pid = pg_backend_pid() AND variable_name = $1;
$$ language sql;

As you can see it first calls expire(), so we can be reasonably sure that retrieved value is fresh.

And to finish it all, cleanup function to be used in custom disconnect method in your code, or in cleanup query in connection pooling software:

CREATE OR REPLACE FUNCTION session_variables.cleanup() RETURNS void as $$
DELETE FROM session_variables.data WHERE backend_pid = pg_backend_pid();
$$ language sql;

And that's all.

Now you can:

SELECT session_variables.set_value('who', 'depesz');
SELECT session_variables.get_value('who');

And play with it.

Full code for copy/paste:

CREATE SCHEMA session_variables;
CREATE TABLE session_variables.data (
backend_pid INT4,
variable_name TEXT,
variable_value TEXT,
expires_on TIMESTAMPTZ NOT NULL,
PRIMARY KEY (backend_pid, variable_name)
);
CREATE INDEX data_expires_on ON session_variables.data (expires_on);
 
CREATE OR REPLACE FUNCTION session_variables.cleanup() RETURNS void as $$
DELETE FROM session_variables.data WHERE backend_pid = pg_backend_pid();
$$ language sql;
 
CREATE OR REPLACE FUNCTION session_variables.expire() RETURNS void as $$
DELETE FROM session_variables.data WHERE expires_on < clock_timestamp();
$$ language sql;
 
CREATE OR REPLACE FUNCTION session_variables.set_value( IN _name TEXT, IN _value TEXT, IN _expires TIMESTAMPTZ ) RETURNS void as $$
DECLARE
_pid int4 := pg_backend_pid();
BEGIN
PERFORM session_variables.expire();
LOOP
UPDATE session_variables.data SET variable_value = _value, expires_on = _expires WHERE backend_pid = _pid AND variable_name = _name;
IF FOUND THEN
RETURN;
END IF;
BEGIN
INSERT INTO session_variables.data( backend_pid, variable_name, variable_value, expires_on ) VALUES ( _pid, _name, _value, _expires );
RETURN;
EXCEPTION
WHEN unique_violation THEN
-- ignore
END;
END LOOP;
END;
$$ language plpgsql;
 
CREATE OR REPLACE FUNCTION session_variables.set_value( TEXT, TEXT ) RETURNS void as $$
SELECT session_variables.set_value($1, $2, 'infinity');
$$ language sql;
 
CREATE OR REPLACE FUNCTION session_variables.get_value( TEXT ) RETURNS TEXT as $$
SELECT session_variables.expire();
SELECT variable_value FROM session_variables.data WHERE backend_pid = pg_backend_pid() AND variable_name = $1;
$$ language sql;

As several people pointed in comments one can use temporary tables which solve the problem of wraparound of backend pids. It does not solve problem of connection reusage by connection pooler, so I'll leave cleanup() function intact.

New code:

CREATE SCHEMA session_variables;
 
CREATE OR REPLACE FUNCTION session_variables.create_table() RETURNS VOID as $$
DECLARE
temprec RECORD;
BEGIN
LOOP
SELECT c.relname, n.nspname INTO temprec
FROM pg_class c join pg_namespace n on c.relnamespace = n.oid
WHERE c.relkind = 'r' AND c.relname = '_session_variables_data' AND n.nspname ~ '^pg_temp_';
IF FOUND THEN
RETURN;
END IF;
BEGIN
EXECUTE 'CREATE TEMP TABLE _session_variables_data(variable_name TEXT PRIMARY KEY, variable_value TEXT, expires_on TIMESTAMPTZ NOT NULL)';
EXECUTE 'CREATE INDEX session_variables_data_expires_on ON _session_variables_data ( expires_on )';
RETURN;
EXCEPTION
WHEN duplicate_table THEN
-- ignore, retry loop
END;
END LOOP;
END;
$$ language plpgsql;
 
CREATE OR REPLACE FUNCTION session_variables.cleanup() RETURNS void as $$
BEGIN
PERFORM session_variables.create_table();
EXECUTE 'TRUNCATE _session_variables_data';
RETURN;
END;
$$ language plpgsql;
 
CREATE OR REPLACE FUNCTION session_variables.expire() RETURNS void as $$
BEGIN
PERFORM session_variables.create_table();
EXECUTE 'DELETE FROM _session_variables_data WHERE expires_on < ' || quote_literal(clock_timestamp());
RETURN;
END;
$$ language plpgsql;
 
CREATE OR REPLACE FUNCTION session_variables.set_value( IN _name TEXT, IN _value TEXT, IN _expires TIMESTAMPTZ ) RETURNS void as $$
DECLARE
tempint INT4;
BEGIN
PERFORM session_variables.expire();
LOOP
EXECUTE 'UPDATE _session_variables_data SET variable_value = $2, expires_on = $3 WHERE variable_name = $1' USING _name, _value, _expires;
GET DIAGNOSTICS tempint = ROW_COUNT;
IF tempint > 0 THEN
RETURN;
END IF;
BEGIN
EXECUTE 'INSERT INTO _session_variables_data( variable_name, variable_value, expires_on) VALUES ($1, $2, $3 )' USING _name, _value, _expires;
RETURN;
EXCEPTION
WHEN unique_violation THEN
-- ignore
END;
END LOOP;
END;
$$ language plpgsql;
 
CREATE OR REPLACE FUNCTION session_variables.set_value( TEXT, TEXT ) RETURNS void as $$
SELECT session_variables.set_value($1, $2, 'infinity');
$$ language sql;
 
CREATE OR REPLACE FUNCTION session_variables.get_value( IN _name TEXT ) RETURNS TEXT as $$
DECLARE
reply TEXT;
BEGIN
PERFORM session_variables.expire();
EXECUTE 'SELECT variable_value FROM _session_variables_data WHERE variable_name = $1' INTO reply USING _name;
RETURN reply;
END;
$$ language plpgsql;

  1. 14 comments

  2. # Betelgeuse
    Aug 20, 2009

    Just use a temporary table and it will be automatically dropped at the end of the session.

  3. Very good idea… in our ERP Software we do a similar implementation to use session variables along a PostgreSQL connection live, but we using temp tables to do that.

    In our case we dont take any care with session expiration because this feature is implemented by temp tables, so it responsability is transfered to database.

    We using “pgbouncer” connection pooler on PostgreSQL 8.2 and I wrote a little plpgsql function to emulate the “DISCARD ALL” command present in version 8.3 and later. I published the code in my blog: http://fabriziomello.blogspot.com/2009/07/case-pgbouncer-13-com-postgresql-82.html

    I liked your implementation, so I will merge your idea with our to make a better solution.

    Best regards,

    Fabrízio de Royes Mello

  4. Aug 21, 2009

    We’ve also used session variables in our CRM. We’ve did some performance tests and discovered that using pltcl functions for this is much faster than using temp tables. We have written two simple functions that store and read a session variable in four ways: plpgsql temp table, pltcl global, plpython global and plperl global. Here are results of the test: http://www.truesolutions.pl/blog/benchmark-procedur-w-postgresql (sorry, Polish only, but the numbers should be understood).
    Currently we are using these functions:


    — Name: set_var(text, integer); Type: FUNCTION; Schema: public; Owner: postgres

    CREATE FUNCTION set_var(name text, val integer) RETURNS integer
    AS $_$
    global sess
    set sess($1) $2;
    $_$
    LANGUAGE pltcl;


    — Name: get_var(text); Type: FUNCTION; Schema: public; Owner: postgres

    CREATE FUNCTION get_var(text) RETURNS integer
    AS $_$
    global sess
    return $sess($1)
    $_$
    LANGUAGE pltcl;

    Regards,
    Wiktor

  5. # mpawlikowski
    Aug 21, 2009

    Perhaps google translate will help translate it to english:
    http://translate.google.pl/translate?prev=hp&hl=pl&js=y&u=http%3A%2F%2Fwww.truesolutions.pl%2Fblog%2Fbenchmark-procedur-w-postgresql&sl=pl&tl=en&history_state0=

    Temporary tables should work here. But are there any problems with efficiency and temp tables in pg?

  6. Aug 21, 2009

    everybody – you’re absolutely right. I totally forgot about temp tables. I’ll provide fixed (with temp tables usage) code today.

  7. Aug 21, 2009

    hmm .. no you are not right 🙂 temp tables will not get dropped in the case cleanup was for – for cases when there is connection pooler. – because connection to postgesql is not actually dropped.

    Regardless – it’s actually good idea to use temp tables – not everybody uses connection poolers, and pid wraparound can happen.

  8. # Taai
    May 19, 2012

    I really want to store a “session” data / a global variable during the connection and this article is the best I found. Can you, please, consider my suggestion and tell me if it may work?

    I suggest to use the timestamp of session when it started.

    SELECT backend_start FROM pg_stat_activity WHERE pg_stat_activity.procpid = procpid LIMIT 1

    It could be used instead of “expires_on” column and I think this should solve pid wraparound problem. Can you test this on some connection pooler?

  9. May 19, 2012

    @Taai:
    please note that in recent pg (the article you commented on is nearly 3 years old) you can use any “class” for uguc, without defining it with “custom_variable_classes”.

    so just use normal select set_config() and select current_setting() functions.

  10. # Taai
    May 19, 2012

    Can you, please, show me an example? I’m new at PostgreSQL and I don’t understand what is “class” and can’t find how to create a “class”.

    I’m using version 9.1. For example, this:
    SELECT set_config(‘sess.uid’, ‘123’, FALSE);
    gives me an error:
    [Err] ERROR: unrecognized configuration parameter “sess.uid”

  11. May 19, 2012

    @Taai:
    just add “sess” to custom_variable_classes variable in postgresql.conf.

    afterwards pg_ctl reload should make the class work.

  12. # Taai
    May 20, 2012

    Looks like I misunderstood the “without modifying postgresql.conf”. So, modifying the “postgresql.conf” is the only way?

  13. May 20, 2012

    @Taai:
    no, i was wrong. i use 9.2 on daily basis, and in 9.2 custom_variable_classes is gone.

    in pre 9.1 you can use the trick like i described in this post.

    and going back to your original question – yes, backend start is ok. and you don’t need “limit 1” in your query, but the query should be:
    select backend_start from pg_stat_activity where procpid = pg_backend_pid();

  14. Apr 23, 2013

    I’ll immediately clutch your rss as I can’t find your email
    subscription link or newsletter service. Do you’ve any? Kindly permit me recognize so that I may subscribe. Thanks.

  15. Jan 29, 2014

    I think your post may solve the problem I talked about here: http://news.freshports.org/2014/01/28/multiple-repos-good-progress/

Leave a comment