What logging has least overhead?

When working with PostgreSQL you generally want to get information about slow queries. The usual approach is to set log_min_duration_statement to some low(ish) value, run your app, and then analyze logs.

But you can log to many places – flat file, flat file on another disk, local syslog, remote syslog. And – perhaps, instead of log_min_duration_statement – just use pg_stat_statements?

Well, I wondered about it, and decided to test.

My test environment consists of 3 machines:

  • krowka – where the db is
  • h3po4 – where pgbench runs (to avoid overloading cpu on krowka with both pgbench and pg)
  • blob – remote recipient of syslog

All three machines are in the same lan, connected to the same switch, with average rtt out of 10 ping packets of ~ 0.13ms.

I decided to use pgbench, as it's using fast queries, which should be the most influenced by potentially slow logging, and since krowka has 12GB of RAM, decided to use scale 2500 to keep the data size interesting (FIXME)

Basic config is:

            name             |            setting             
-----------------------------+--------------------------------
 log_autovacuum_min_duration | -1
 log_checkpoints             | ON
 log_connections             | ON
 log_destination             | stderr
 log_directory               | pg_log
 log_disconnections          | ON
 log_duration                | off
 log_error_verbosity         | DEFAULT
 log_executor_stats          | off
 log_file_mode               | 0600
 log_filename                | postgresql-%Y-%m-%d_%H%M%S.log
 log_hostname                | off
 log_line_prefix             | %m %u@%d %p %r 
 log_lock_waits              | ON
 log_min_duration_statement  | -1
 log_min_error_statement     | error
 log_min_messages            | warning
 log_parser_stats            | off
 log_planner_stats           | off
 log_replication_commands    | off
 log_rotation_age            | 1440
 log_rotation_size           | 102400
 log_statement               | NONE
 log_statement_stats         | off
 log_temp_files              | 0
 log_timezone                | Poland
 log_truncate_on_rotation    | off
 logging_collector           | ON
(28 ROWS)

Each test was done using:

  1. load new config
  2. show values of critical points of config
  3. initialize pgbench database
  4. run 1 hour test with pgbench -h krowka -c 7 -T 3600

Results were rather surprising:

  • no logging: 95.812276 tps
  • pg_stat_statements: 94.818323 tps
  • full logging to local file in $PGDATA: 96.345291 tps
  • full logging to file on another disk: 96.651113 tps
  • full logging to syslog (logging to another disk): 95.866050 tps
  • full logging to remote syslog: 95.187785 tps

Why isn't logging more problematic? Well, within this 1 hour of test, Pg logged ~ 500MB of data. That's merely 142kB/s.

So, I made a test script for pgbench, that just ran: select ‘very long, static string' – without touching any tables – so it was very fast, but log lines were rather long: 120kB.

Unfortunately, I wasn't able to run it for 1 hour. I simply didn't have enough free disk space – logs were generated at ~ 95MB/s.

But the performance numbers were much more interesting:

  • no logging: 86.519867 tps
  • pg_stat_statements: 86.679150 tps
  • full logging to local file in $PGDATA: 80.970747 tps
  • full logging to file on another disk: 77.005846 tps
  • full logging to syslog (logging to another disk): 63.636626 tps
  • full logging to remote syslog: 76.662515 tps

In my case the other disk is slower then the one with PGDATA.

Of course – the numbers above do not mean that everyone will lose 25% of performance when using syslog logging. The numbers will be different for various types of queries, size of generated logs, and speed of IO.

But, it looks that:

  • pg_stat_statements seems to be almost free
  • local syslog is the most expensive.
  • logging to remote syslog is comparable to logging to local file directly

Given this, I think it makes sense to at least test pg_stat_statements in your environment, as it is clearly helpful, and the overhead is very, very small.

6 thoughts on “What logging has least overhead?”

  1. When telling that pg_stat_statement is almost free please note the memory overhead. Maybe measure it in your example config, I’m curious myself.

  2. @John:
    I fail to see how memory usage for this would be in any way significant.

    Please note that it doesn’t store full text of the queries – just query with parameters removed. So, the text stored is much smaller, and it’s kept only once per query “type”, and not every query on every run.

  3. I also don’t know if it’s significant, let me rephrase myself.

    You measured only the speed. I’m curious how, or if at all, memory usage changes when enabling pg_stat_statement.

    I just checked the docs and it seems it just needs 1MB of shared memory with the default configuration.

  4. @John:
    sounds about right. I can’t imagine what it would use more memory for. And 1MB is pretty insignificant these days.

  5. Hi Depesz,

    Couple questions about Your results & setup :

    1. Comparing
    no logging: 95.812276 tps
    to
    no logging: 86.519867 tps
    it looks like select ‘very long, static string’ isn’t faster than pgbench queries . My guess is it takes longer to transfer query text or result back to client . Have You tried running the test with a query similar to “select 1 where ‘long text’ = ”” ?

    2. The results in second table :
    no logging: 86.519867 tps
    pg_stat_statements: 86.679150 tps
    full logging to local file in $PGDATA: 80.970747 tps
    full logging to file on another disk: 77.005846 tps
    full logging to syslog (logging to another disk): 63.636626 tps
    full logging to remote syslog: 76.662515 tps

    don’t seem to be consistent with Your remarks on generated logs volume :

    Unfortunately, I wasn’t able to run it for 1 hour. I simply didn’t have enough free disk space – logs were generated at ~ 95MB/s.

    best tps * log line size = 86.679150 * 120kB ~ 10 MB

    Am I misreading the results ?

    Regards,
    Bolek

  6. @Bolek:
    first of all: it’s *t*ps, not *q*ps. Simply my test transactions were slower than test pgbench transactions. Probably because I used more queries per transaction than pgbench does.

    So – don’t compare tps values between the pgbench test and custom-queries test, as the values have *no* meaning in such case.

    Again – you’re missing the point that tps is transactions, and not queries. there were multiple (10 afair) queries per transaction.

Comments are closed.