On 4th of December, Tom Lane committed a really cool patch:
KNNGIST, otherwise known as order-by-operator support for GIST.
This commit represents a rather heavily editorialized version of
Teodor's builtin_knngist_itself-0.8.2 and builtin_knngist_proc-0.8.1
patches. I redid the opclass API to add a separate Distance method
instead of turning the Consistent method into an illogical mess,
fixed some bit-rot in the rbtree interfaces, and generally worked over
the code style and comments.
There's still no non-code documentation to speak of, but I'll work on
that separately. Some contrib-module changes are also yet to come
(right now, point <-> point is the only KNN-ified operator).
Teodor Sigaev and Tom Lane
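To get a feel for what ordering by a distance operator means, here is a rough Python sketch of the result such a query is after: the k points closest to a given point, nearest first. This is only a brute-force illustration of the semantics, not how the GiST code does it (the whole point of the patch is to get this ordering from the index, without computing the distance for every row). The function name nearest_points and the sample data are made up for this example.

```python
import heapq
import math


def nearest_points(points, target, k):
    """Return the k points closest to target, nearest first.

    Brute force: every point's distance is computed. An index-assisted
    search would avoid visiting all points; this only shows the result.
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, target))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the commit.
    sample = [(0, 0), (1, 1), (5, 5), (2, 2), (10, 10)]
    for point in nearest_points(sample, (1.5, 1), 3):
        print(point)
```

With a table of points, the equivalent request would be "order the rows by distance to my point and give me the first k".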
Continue reading Waiting for 9.1 – KNNGIST
I'm starting a new blog series: explanations of various configuration parameters.
I will, of course, follow no schedule or order. If I had to, it would be a job; this way, it's fun.
The first configuration parameter to write about is checkpoint_completion_target.
Continue reading Understanding postgresql.conf : checkpoint_completion_target
Before I start, let me say that I am a fan of what Oleg and Teodor did. Their work is great, and I do appreciate their time and ideas.
But I simply don't like the idea of using FTS (Full Text Search) inside the database. Why? Let me show you.
Continue reading Why I'm not fan of TSearch?
Foreign keys are known for a couple of things, but speeding up your system is not one of them. Sometimes, though, having them in place lets you make queries significantly faster.
How? Let me show you an example I have seen lately (well, it's a simplified example based on something much more convoluted, and definitely longer):
Continue reading Performance gains from using foreign keys
Today, Mattias|farm on IRC asked how to create a primary key using a HASH index. After some talk, he said that some books claim hash indexes are better for "=" (equality) comparisons.
So, I dug a bit deeper.
Continue reading Should you use HASH index?
One database that I am monitoring uses a lot of stored procedures. Some of them are fast, some of them are not so fast. I thought: is there a sensible way to diagnose which part of a stored procedure takes the most time?
I mean, I could just put the logic into the application, and then every query would have its own timing in Pg logs, but this is not practical. And I also believe that using stored procedures/functions is way better than using plain SQL, for a number of reasons.
So, I'm back to the question: how to check which part of a function takes most of the time?
Continue reading Profiling stored procedures/functions
On 3rd of August, Tatsuo Ishii committed a patch by ITAGAKI Takahiro:
Multi-threaded version of pgbench contributed by ITAGAKI Takahiro,
reviewed by Greg Smith and Josh Williams.
Following is the proposal from ITAGAKI Takahiro:
Pgbench is a famous tool to measure postgres performance, but nowadays
it does not work well because it cannot use multiple CPUs. On the other
hand, postgres server can use CPUs very well, so the bottle-neck of
workload is *in pgbench*.
Multi-threading would be a solution. The attached patch adds -j
(number of jobs) option to pgbench. If the value N is greater than 1,
pgbench runs with N threads. Connections are equally-divided into
them (ex. -c64 -j4 => 4 threads with 16 connections each). It can
run on POSIX platforms with pthread and on Windows with win32 threads.
Here are results of multi-threaded pgbench runs on Fedora 11 with intel
core i7 (8 logical cores = 4 physical cores * HT). -j8 (8 threads) was
the best and the tps is 4.5 times of -j1, that is a traditional result.
$ pgbench -i -s10
$ pgbench -n -S -c64 -j1 => tps = 11600.158593
$ pgbench -n -S -c64 -j2 => tps = 17947.100954
$ pgbench -n -S -c64 -j4 => tps = 26571.124001
$ pgbench -n -S -c64 -j8 => tps = 52725.470403
$ pgbench -n -S -c64 -j16 => tps = 38976.675319
$ pgbench -n -S -c64 -j32 => tps = 28998.499601
$ pgbench -n -S -c64 -j64 => tps = 26701.877815
Is it acceptable to use pthread in contrib module?
If ok, I will add the patch to the next commitfest.
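The arithmetic behind the numbers quoted above is easy to check. Below is a small Python sketch that takes the tps values exactly as they appear in the proposal and computes the speedup of each -j setting relative to -j1, plus the number of connections each thread handles. The helper names (speedups, connections_per_thread) are mine, and the sketch assumes the number of clients divides evenly by the number of threads, as it does in every run shown.

```python
# tps values copied from the pgbench runs quoted above (-c64, varying -j).
RESULTS = {
    1: 11600.158593,
    2: 17947.100954,
    4: 26571.124001,
    8: 52725.470403,
    16: 38976.675319,
    32: 28998.499601,
    64: 26701.877815,
}
CLIENTS = 64  # the -c value used in every run


def speedups(tps_by_jobs, baseline_jobs=1):
    """Return tps of each run divided by tps of the baseline run."""
    base = tps_by_jobs[baseline_jobs]
    return {jobs: tps / base for jobs, tps in tps_by_jobs.items()}


def connections_per_thread(clients, jobs):
    """Connections handled by each thread, assuming an even split."""
    if clients % jobs != 0:
        raise ValueError("this sketch assumes clients is a multiple of jobs")
    return clients // jobs


if __name__ == "__main__":
    for jobs, ratio in speedups(RESULTS).items():
        per_thread = connections_per_thread(CLIENTS, jobs)
        print(f"-j{jobs:<3} {per_thread:>2} connections/thread  {ratio:.2f}x of -j1")
```

The -j8 run comes out at roughly 4.5 times the -j1 result, which matches the claim in the proposal, and -c64 -j4 gives 16 connections per thread.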
Continue reading Waiting for 8.5 – Multi-threaded pgbench
On 3rd of November, Andrew Dunstan committed his patch, which adds a new function to PostgreSQL: suppress_redundant_updates_trigger().
This function is not meant to be used in selects, but it can help you tremendously if your database access matches a certain pattern.
Continue reading Waiting for 8.4 – suppress_redundant_updates_trigger
When I was working for one of our customers, we found something strange. We needed to find the number of distinct sessions per day. The table layout was very simple:
Continue reading Counting number of distinct elements
As you perhaps know, there is this site/blog called High Scalability. It contains articles about various things related to performance, scalability, availability and so on.
Usually, when there is something about databases, it is about MySQL. Luckily for us, today they showed something about PostgreSQL:
“Skype Plans for PostgreSQL to Scale to 1 Billion Users”.