Incrementing counters in database – select * from depesz;

Some time ago someone on irc asked about creating fast counters for something (banners I think).

I talked with her (him?) about it, but figured, I can as well write a blogpost, so others can use it too.

First the general idea: we want to store incrementable counters, with optional trade off that “we can lose last x minutes of data if it will make things faster".

First approach is pretty simple, create table:

$ create table counters (
    counter_name text primary key,
    current_value int8 not null default 0
);

And then just insert whenever we see new counter, and then update when needed.

There is a problem with this approach though.

Let's assume that our counter-updating-application does something like this:

$ begin;
$ update counters set current_value = current_value + 1 where counter_name = 'whatever';
$ commit;

It might also do something else, before, or after, the update, and it might even be not in database.

Do you see the problem? If not, consider what will happen, in concurrent environment, if the transaction would look like:

$ begin;
$ update counters set current_value = current_value + 1 where counter_name = 'whatever';
sleep for 0.5 seconds
$ commit;

All other updates to this counter will wait for this particular transaction to commit.

It's even worse if we're updating multiple counters in the same transaction, because if one transaction will try to do:

$ begin;
$ update counters set current_value = current_value + 1 where counter_name = 'whatever';
$ update counters set current_value = current_value + 1 where counter_name = 'else';
$ commit;

and another will do:

$ begin;
$ update counters set current_value = current_value + 1 where counter_name = 'else';
$ update counters set current_value = current_value + 1 where counter_name = 'whatever';
$ commit;

then there will be deadlock, and one of them will die.

Of course you might say: my code will never sleep. And it will never update multiple counters. Sure. But there are things you can't control – your process might lock on something outside of database. Or get killed. Or there could be issues with network.

Long story short – locking approach will, sooner or later, cause problems.

So, let's see how it work.

I made 10 sql files, each doing 1000 updates to 10 counters, in random order.

Files look like:

=$ head -n 5 update-01.sql
update counters set current_value = current_value + 1 where counter_name = '4';
update counters set current_value = current_value + 1 where counter_name = '8';
update counters set current_value = current_value + 1 where counter_name = '1';
update counters set current_value = current_value + 1 where counter_name = '1';
update counters set current_value = current_value + 1 where counter_name = '4';

counters are named “0" to “9".

So, let's run it, and see how long it will take:

$ time /bin/ls -1 update-??.sql | xargs -n1 -P10 psql -qAtX -f
real    0m20.181s
user    0m0.248s
sys     0m0.072s

OK. So it took ~ 20 seconds. Counters are:

select * from counters;
 counter_name | current_value 
--------------+---------------
 7            |          1064
 9            |           955
 3            |          1034
 6            |           964
 5            |           973
 4            |          1031
 2            |          1010
 0            |           942
 1            |           975
 8            |          1052
(10 rows)

One thing worth mentioning is that the table got pretty bloated:

$ \dt+ counters
                     List of relations
 Schema |   Name   | Type  | Owner  |  Size  | Description 
--------+----------+-------+--------+--------+-------------
 public | counters | table | depesz | 560 kB | 
(1 row)

560kB just to store 10 counters. But that's side effect of updates.

So, what are other options?

There is very simple option – don't update, just insert. Inserts generally don't lock.

To make it work we need additional table:

$ create table counters_queue (
    id serial primary key,
    counter_name text,
    value_change int4
);

Now, instead of:

$ update counters set current_value = current_value + 1 where counter_name = ?;

we'd use:

$ insert into counters_queue (counter_name, value_change) values (?, 1);

Running the same counter increments using inserts is faster:

=$ time /bin/ls -1 insert-??.sql | xargs -n1 -P10 psql -qAtX -f 
real    0m12.495s
user    0m0.224s
sys     0m0.084s

But what's more important – it is virtually lock free.

Obviously – this leads to question – ok, but how do I get current counter?

Well, we need a thing (cronjob) that will update the counters. This can be done using such simple function:

$ CREATE OR REPLACE FUNCTION apply_counter_queue( IN p_limit INT4 DEFAULT 1000 ) RETURNS INT8 as $$
with id_range as (
    SELECT min(id) as min_id, min(id) + p_limit as max_id FROM counters_queue
), deleted_rows as (
    DELETE FROM counters_queue WHERE id between (SELECT min_id FROM id_range) AND (SELECT max_id FROM id_range) returning *
), summarized_change as (
    SELECT counter_name, sum(value_change) FROM deleted_rows group BY counter_name HAVING sum(value_change) <> 0
), update_counters as (
    UPDATE counters as c SET current_value = c.current_value + s.sum FROM summarized_change s WHERE c.counter_name = s.counter_name returning c.counter_name
)
SELECT count(*) FROM update_counters;
$$ language 'sql';

This function does:

(id_range) – gets minimal id from queue, and it's increment by given parameter (defaults to 1000)
(deleted_rows) – deletes rows from queue that have id between values from id_range, returning deleted rows, so we can work on them
(summarized_change) – calculates sum of value_changes for all deleted rows, and filters out counters which wouldn't be changed (sum of value_change is 0)
(update_counters) – updates the counters, returning name of each updated counter
returns number of updated counters

for 10000 rows in queue, this function took ~ 40ms to process them! Counters table got updated only once for each counter, so it wasn't bloated.

This function – apply_counter_queue() should be called from cronjob, or some other scheduler.

Now, if you want to have current real counter value, including both what is in counters table, and not-yet-applied in queue, then you need:

$ select
    current_value + coalesce( (select sum(value_change) from counters_queue where counter_name = 'whatever' ), 0 )
    from counters where counter_name = 'whatever'

It's more complex, but with index on counter_name column in counters_queue, it will be fast.

Additionally, if speed of inserts to queue will become a problem, you might want to consider two things:

make queue table unlogged
batch inserts to queue

First is very simple:

$ CREATE UNLOGGED TABLE counters_queue (
    id serial PRIMARY KEY,
    counter_name TEXT,
    value_change INT4
);

Now, the same 10,000 inserts, in 10 parallel workers, take ~ 0.4s (with logged table, it took 12s!).

Of course, having the table unlogged means that if Pg would crash, all data in this table will be lost – so it's a tradeoff between what you can lose, and how fast you want to go.

Second part – batching inserts, is more complicated, and can be done only if you can somehow “buffer" inserts, and instead of inserting every row separately, insert them in groups.

For example. Changed my inserts so that each worker has to do only 100 inserts, but each insert inserts 10 elements to queue, like this:

$ insert into counters_queue (counter_name, value_change) values ('4', 1),('8', 1),('1', 1),('1', 1),('4', 1),('2', 1),('0', 1),('3', 1),('4', 1),('4', 1);

When I ran these 10 tasks on unlogged table, I got basically identical time (0.4s). But, when I made the counters_queue table logged, the same 10000 rows that previously took 12s, now finished in 1.3s!

So, to wrap it up, to base counters table approach, I:

added queue table (which can be optionally unlogged)
added function that removes rows from queue, and applies changes to counters
added cronjob that calls this function
modified queries that get counter value from table, to account for unapplied-yet values from queue table

Effect is that I have counter changes work faster, without locking (well, there is some locking, but very minimal compared to UPDATE's), much less bloat. But more complicated.

Whether it's worth the extra complexity – it's up for you to decide :).

22 thoughts on “Incrementing counters in database”

Alex says:

2016-06-14 at 16:47

Is there any downside to using a sequence instead?
depesz says:

2016-06-14 at 17:06

@Alex:
well, you would need to have as many sequences as you have counters. And these are not transactional, so you can’t rollback seq increase.
ph says:

2016-06-14 at 18:13

How about serial type? It can produce some locks too, doesn’t it? I mean – it gives incremental values and increments its counter – some kind of bottleneck
depesz says:

2016-06-14 at 18:57

@ph:

sorry, not sure what you mean. serial is basically a sequence. what about it?
ph says:

2016-06-15 at 04:15

@DEPESZ:

I mean that serial type gives numbers (id) to new records, that’s why this insertions can’t be true parallel – they are all waiting for new incremented values from sequence. Yeah, sequences are very fast. But may be it would be better to use just inserts without primary keys? )
Craig Ringer says:

2016-06-15 at 05:06

PH: Sequences (and SERIAL) are non-blocking and do not wait for commit of other transactions. They’re non-transactional and simply discard values claimed by rolled back xacts. So they don’t ever wait for new incremented values from the sequence.

The main issue with the approach discussed here is insufficient consideration of concurrency. Unless it’s OK to get duplicate counter values this approach will fail unless xacts are serialized by locking on the counter.
Terje Elde says:

2016-06-15 at 08:19

Not only are serial and sequences comparable, but serial is basically shorthand for setting up a sequence, and setting values from it as default on an int column.
Terje Elde says:

2016-06-15 at 08:26

Good writeup, but once you start considering something external (thinking of the cron job), you might as well keep the counter-logic partly externally as well. Have a tiny utility-daemon that LISTENs for counter-updates, and trigger increments with NOTIFY.
Or move the congested counters to redis, and have a daemon sync PostgreSQL with current values.
depesz says:

2016-06-15 at 11:10

@Craig Ringer:

Can you elaborate on the serialization problem with my approach? I can’t see how/when the problem would appear.
misha says:

2016-06-15 at 13:20

@craig
@depesz

avito.ru
we’ve been using system like this for many years. works perfect!
no any concurrency issues — all under read committed isolation.
depesz says:

2016-06-15 at 13:27

@misha:
“like this” – like the one I described in blogpost?

@depezh

@depezh
yes, exactly the same approach.
but we use only one table both for aggregated data and “queue”, so at least

our query is simpler:

select sum(c.val) as val from counters c where c.id = 'id'

and our rotation function is

CREATE OR REPLACE FUNCTION statistics.aggregate_user_items_cnt()
  RETURNS integer AS
$BODY$
begin
        set local work_mem = '4 GB';
 
        with src as (
            select
                cnt.user_id,
                cnt.category_id
            from
                statistics.user_items_cnt cnt
            group by
                cnt.user_id, cnt.category_id
            having
                count(1) > 1
        ),
        del as (
            delete from statistics.user_items_cnt cnt
            using
                src
            where
                cnt.user_id = src.user_id and cnt.category_id = src.category_id
            returning cnt.*
        ),
        agg as (
            select
                del.user_id,
                del.category_id,
                sum(del.cnt_total)::integer  as cnt_total,
                sum(del.cnt_active)::integer as cnt_active
            from
                del
            group by
                del.user_id, del.category_id
        )
        insert into statistics.user_items_cnt(user_id, category_id, cnt_total, cnt_active, txtime)
            select agg.user_id, agg.category_id, agg.cnt_total, agg.cnt_active, now() from agg where agg.cnt_total > 0;
 
        return 0;
end;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION statistics.aggregate_user_items_cnt()
  OWNER TO postgres;

misha says:

2016-06-15 at 15:19

“>” is “>”

stupid cross-windows copy-paste
misha says:

2016-06-15 at 15:20

аааа

& g t ;

is “>”
depesz says:

2016-06-15 at 15:26

@misha:

fixed that 🙂
misha says:

2016-06-15 at 15:38

🙂
Asaf says:

2016-06-15 at 16:09

I don’t understand why I need a serial column.

Won’t this table work the same ?
CREATE UNLOGGED TABLE counters_queue (
counter_name TEXT,
value_change INT4
);

When aggregating the results, postgres is going to do a full table scan anyway.
depesz says:

2016-06-15 at 16:30

@Asaf:

because we don’t want to scan whole queue table at once, only at most “x” records.

Reason for this is that the queue deleting/counter-updating function issues updates to counters, thus locking the values for some time. the shorter the time the better.

With 1000 rows at a time, from queue, the time to process should be minimal, but we need something to limit with – so that’s why I use serial.
misha says:

2016-06-15 at 16:36

@depesh

and yes!

there is no updates at all in my version
misha says:

2016-06-15 at 16:37

statistics.aggregate_user_items_cnt()

absolutely free from any locking

@misha:
that’s obviously not true.

try:

begin;
select statistics.aggrega...
select * from pg_locks where pid = pg_backend_pid();

misha says:

2016-06-15 at 16:48

hmm..

> Reason for this is that the queue deleting/counter-updating function issues updates to counters, thus locking the values for some time.

what are you talking about?

but, i think this locking affecting nothing in average cases “plus/minus 1” and selecting counter values

Comments are closed.