Waiting for PostgreSQL 12 – Add log_statement_sample_rate parameter

On 29th of November 2018, Alvaro Herrera committed patch:

Add log_statement_sample_rate parameter
This allows to set a lower log_min_duration_statement value without
incurring excessive log traffic (which reduces performance).  This can
be useful to analyze workloads with lots of short queries.
Author: Adrien Nayrat
Discussion: https://postgr.es/m/-ee1e-db9f-fa97-@anayrat.info

One of the problems I did encounter many times is that I can't realistically log all queries. There is just too many of them.

I can, and do, tune log_min_duration_statement, but this means that all the queries faster then this threshold are unnoticed.

Of course – one can say: well, fast queries are not a problem, are they? Well, that depends. Single query that takes 1 minute is big problem. But on the other hand 10 million queries that take 2ms each, in total, used more than 5 hours of server time.

This new patch has a change to solve my problem.

Basically it's a single guc, named “log_statement_sample_rate" with values from 0 to 1 which decides how many percent of queries (that are above log_min_duration_statement) will get logged.

I did simple test:

=$ pgbench -i
=$ ls -l data/log
=$ pgbench -c 2 -j 2 -T 10
=$ ls -l data/log

This first test, in 10 seconds, generated 6,717,401 bytes of log. Hardly much, but it's just a workstation, not a real-life production server.

Now, I changed log_statement_sample_rate from default 1 to 0.1, and reloaded config.

This time, to make the test even nicer I did it via:

=$ psql -c 'select pg_rotate_logfile();'
=$ ls -l data/log
=$ pgbench -c 2 -j 2 -T 10
=$ psql -c 'select pg_rotate_logfile();'
=$ ls -l data/log

This time, the log file I got was only 572,306 bytes.

Interestingly – tps fell – it was 655 before, and now it's only 571. But this is most likely due to some random fluctuations in load on my desktop.

I retried the test and got values 641 tps vs. 616 tps. And then once more, and got 643 vs. 646 tps.

So – the performance differences are due to random flukes on my desktop, and the ability to see some fraction of even fastest queries is absolutely amazing.

Great stuff, thanks a lot!

2 thoughts on “Waiting for PostgreSQL 12 – Add log_statement_sample_rate parameter”

  1. So this is useful if you set the log_min_duration_statement parameter above 0 and then it describes the percentage of logged queries?

    If you set log_min_duration_statement to 0 ALL queries are logged. Is the rate still applied?


Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.