shared buffers and their impact on performance

just lately we found interesting case about shared_buffers settings.

the database in question is rather simple:

3 tables
a bit over 60 gigabytes
around 150 million rows

the server we run it on is not really fancy:

2, single core, 3ghz xeons
16 gb of ram
6x 72gb, scsi, 15krpm discs in low-end hardware raid 10.

initially postgresql was configured to use 2gb for shared_buffers. it's definitely less than “generally suggested" 20-25%, but so it was.

the problem we had was that during checkpoints (every 5 minutes) load on discs was so high that the whole system basically freezed. effect: load around 20, unhappy clients.

bgwriter tuning didn't really gave us much of anything, so we decided to make more “interesting" changes. for starters – increase checkpoint_timeout to 30 minutes. load dropped, but when 30 minutes passed load spiked over 30.

then we thought about the least “sensible" change – checkpoint saves data from shared_buffers. so let's limit the amount of data in shared buffers. shared_buffers were brought down to 128megs. effect – load in peak hours of day hits 3. not 30. 3! daily average is around 1.5!

now, lowering shared_buffers is definitely not for everyone. but, it just happened that it did help (a lot!) in our case.

14 thoughts on “shared buffers and their impact on performance”

Leandro Guimarães Faria Corcete DUTRA says:

2007-12-05 at 14:45

Wouldn’t shortening savepoints have around the same effect without depriving the system of an important buffer area?

Just speaking from the top of my ignorance…
depesz says:

2007-12-05 at 14:50

well – i didn’t try it, and now i’d rather not play with production server 🙂

so i can’t be sure about it – perhaps it would work.
Evgeny says:

2007-12-05 at 15:41

I’ve had some experiments with a 50Gb+ DB (8Gb RAM+10RAID) and noticed the following:
The OS will handle the caching job *MUCH* better than shared buffers, for the case of a very updatable database. You should only leave enough shared buffers for Postgres’ major in-memory control structures.
Trophaeum says:

2007-12-05 at 16:57

I’d be very curious how postgres 8.3 with it’s new table scan stuff handles on this box, any chance you’re planning to put postgres 8.3 on it anytime soon? 🙂
depesz says:

2007-12-05 at 17:02

@Trophaeum:
on production – a week after release.
on test systems i have it already, but they are to slow for meaningful comparison.
Jeff Davis says:

2007-12-05 at 20:47

That’s an interesting result.

See if you can reproduce it somehow. There’s test hardware available for community use if I understand correctly. Then see if 8.3 fixes the problem. If not, it will be valuable information to use to improve the buffer manager.
Greg Smith says:

2007-12-05 at 21:25

The one thing you left out of this description is the operating system; that certainly has an impact on how the burst of writes at checkpoint time actually happens.

Reducing shared_buffers to help with V8.2 checkpoints is one of the suggestions on my long paper on this subject at http://www.westnet.com/~gsmith/content/postgresql/chkp-bgw-83.htm

You might want to take a look at that as part of your 8.3 planning. It doesn’t really “fix” this problem directly, but it provides another good technique for dealing with it. If you can tolerate checkpoint_timeout being at 30 minutes the checkpoint_completion_target feature should work fairly well for you; the biggest problems I’ve heard about are from people where the checkpoints are much more frequent than that.
depesz says:

2007-12-05 at 22:11

@Greg Smith:
sorry, didn’t thought about it. os is linux 2.6.22.9, distribution: gentoo, compiled as x86_64.
depesz says:

2007-12-05 at 22:12

@Jeff Davis:
repeating might be an issue due to fact that i can’t freely distribute the database – it contains sensitive information 🙁

on the other hand – we will be doing the same change on much stronger machine on our main database server. we’ll see how it will go.
Joshua Drake says:

2007-12-06 at 00:29

A couple of things 🙂

I think you mean checkpoint not savepoint. Also you should look at the background writer in general here. It can greatly reduce the damage a checkpoint can do.

Further if you have the right hardware, turning off fullpage_writes 🙂
depesz says:

2007-12-06 at 00:58

@Joshua Drake:
1. of course – my mystake. checkpoint. blog post edited.

2. as for fullpage_writes – this option was so many times said to be “do not touch”, “always on!”, that i didn’t even looked at it. what hardware is right for it?

3. we did spent some time tuning bgwriter, but it didn’t help
Joshua Drake says:

2007-12-06 at 01:48

fullpage_writes needs to have a battery backed raid unit.

For bgwriter the key to watch how long the checkpoints are taking. If you are cycling a lot of logs and the checkpoints are taking a long time, you need to adjust.

Also did you have autovacuum turned on during the testing?
Joshua Drake says:

2007-12-06 at 01:51

I would also note that you did the correct thing in lowering the shared_buffers, it is common for a checkpoint to flood IO in that scenario but there is a good change you can increase it higher with proper bgwriter settings.
Derek Spencer says:

2007-12-18 at 01:02

I have a similar hardware setup (Linux, 16GB RAM) and have also found that a “low” shared_buffers setting is preferred. I run my shared_buffers at 320MB RAM with good results. During my early years with PostgreSQL I learned to let Linux handle as much of the caching as possible.

Comments are closed.