On 29th of November 2018, Alvaro Herrera committed patch:
Add log_statement_sample_rate parameter This allows to set a lower log_min_duration_statement value without incurring excessive log traffic (which reduces performance). This can be useful to analyze workloads with lots of short queries. Author: Adrien Nayrat Discussion: https://firstname.lastname@example.org
One of the problems I did encounter many times is that I can't realistically log all queries. There is just too many of them.
I can, and do, tune log_min_duration_statement, but this means that all the queries faster then this threshold are unnoticed.
Of course – one can say: well, fast queries are not a problem, are they? Well, that depends. Single query that takes 1 minute is big problem. But on the other hand 10 million queries that take 2ms each, in total, used more than 5 hours of server time.
This new patch has a change to solve my problem.
Basically it's a single guc, named “log_statement_sample_rate" with values from 0 to 1 which decides how many percent of queries (that are above log_min_duration_statement) will get logged.
I did simple test:
=$ pgbench -i =$ ls -l data/log =$ pgbench -c 2 -j 2 -T 10 =$ ls -l data/log
This first test, in 10 seconds, generated 6,717,401 bytes of log. Hardly much, but it's just a workstation, not a real-life production server.
Now, I changed log_statement_sample_rate from default 1 to 0.1, and reloaded config.
This time, to make the test even nicer I did it via:
=$ psql -c 'select pg_rotate_logfile();' =$ ls -l data/log =$ pgbench -c 2 -j 2 -T 10 =$ psql -c 'select pg_rotate_logfile();' =$ ls -l data/log
This time, the log file I got was only 572,306 bytes.
Interestingly – tps fell – it was 655 before, and now it's only 571. But this is most likely due to some random fluctuations in load on my desktop.
I retried the test and got values 641 tps vs. 616 tps. And then once more, and got 643 vs. 646 tps.
So – the performance differences are due to random flukes on my desktop, and the ability to see some fraction of even fastest queries is absolutely amazing.
Great stuff, thanks a lot!
4 thoughts on “Waiting for PostgreSQL 12 – Add log_statement_sample_rate parameter”
So this is useful if you set the log_min_duration_statement parameter above 0 and then it describes the percentage of logged queries?
If you set log_min_duration_statement to 0 ALL queries are logged. Is the rate still applied?
Yes and yes 🙂
Comments are closed.