What logging has least overhead?

When working with PostgreSQL you generally want to get information about slow queries. The usual approach is to set log_min_duration_statement to some low(ish) value, run your app, and then analyze logs.

But you can log to many places – flat file, flat file on another disk, local syslog, remote syslog. And – perhaps, instead of log_min_duration_statement – just use pg_stat_statements?

Well, I wondered about it, and decided to test.
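For reference, a minimal sketch of the setting in question. The 100 ms threshold is an arbitrary value chosen for illustration, and changing this parameter with SET requires superuser privileges. Where the log actually goes (flat file, syslog) is controlled separately by log_destination in postgresql.conf.

```sql
-- Log every statement in this session that runs for 100 ms or longer.
-- (The threshold is an illustrative assumption, not a recommendation.)
SET log_min_duration_statement = '100ms';

-- Check the current threshold and where logs are being sent.
SHOW log_min_duration_statement;
SHOW log_destination;
```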


Waiting for 9.3 – Support indexing of regular-expression searches in contrib/pg_trgm.

On 9th of April, Tom Lane committed patch:

Support indexing of regular-expression searches in contrib/pg_trgm.
 
This works by extracting trigrams from the given regular expression,
in generally the same spirit as the previously-existing support for
LIKE searches, though of course the details are far more complicated.
 
Currently, only GIN indexes are supported.  We might be able to make
it work with GiST indexes later.
 
The implementation includes adding API functions to backend/regex/
to provide a view of the search NFA created from a regular expression.
These functions are meant to be generic enough to be supportable in
a standalone version of the regex library, should that ever happen.
 
Alexander Korotkov, reviewed by Heikki Linnakangas and Tom Lane

One day later, Tom Lane added support for the same operations using GiST indexes (the original patch worked only with GIN).
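A short sketch of what this makes possible, assuming PostgreSQL 9.3 or later with pg_trgm available. The table, column, and index names are hypothetical, and whether the planner actually picks the index depends on the data and on how many trigrams can be extracted from the pattern.

```sql
-- Hypothetical table used only for illustration.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE words (word text);

-- Trigram GIN index; gin_trgm_ops is the operator class shipped with pg_trgm.
CREATE INDEX words_word_trgm_idx ON words USING gin (word gin_trgm_ops);

-- Regular-expression searches (~ and ~*) that can be served by the trigram index.
SELECT word FROM words WHERE word ~ 'foo.*bar';
SELECT word FROM words WHERE word ~* 'depesz';
```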


Waiting for 9.3 – Add parallel pg_dump option.

On 24th of March, Andrew Dunstan committed patch:

Add parallel pg_dump option.
 
New infrastructure is added which creates a set number of workers
(threads on Windows, forked processes on Unix). Jobs are then
handed out to these workers by the master process as needed.
pg_restore is adjusted to use this new infrastructure in place of the
old setup which created a new worker for each step on the fly. Parallel
dumps acquire a snapshot clone in order to stay consistent, if
available.
 
The parallel option is selected by the -j / --jobs command line
parameter of pg_dump.
 
Joachim Wieland, lightly editorialized by Andrew Dunstan.


What is the point of bouncing?

Some of you might be familiar with the pgBouncer project. Some are not. Some understand what it does, how, and why; others do not.

This blog post is meant to be a place where I can point people who have questions about how it works, why, and when it makes sense to use it (pgBouncer, that is).


“= 123” vs. “= ‘depesz’” – followup

Yesterday I wrote about selects on int4 vs. text.

One of the comments that caught my attention was a question about index creation time. So, let's see…


“= 123” vs. “= ‘depesz’”. What is faster?

There is this idea that normal form in databases requires you to use integer, auto-incrementing primary keys.

The idea has been discussed by many people; I will just point you to the series of three blog posts on the subject by Josh Berkus (parts 1, 2, and 3, plus a reprise).

One of the points that proponents of surrogate keys (i.e. those based on integer and sequences) raise is that comparing integers is faster than comparing texts. So,

select * from users where id = 123

is faster than

select * from users where username = 'depesz'

Is it?


Waiting for 9.2 – pg_stat_statements improvements

Three interesting patches:

On 27th of March, Robert Haas committed patch:

New GUC, track_iotiming, to track I/O timings.
 
Currently, the only way to see the numbers this gathers is via
EXPLAIN (ANALYZE, BUFFERS), but the plan is to add visibility through
the stats collector and pg_stat_statements in subsequent patches.
 
Ants Aasma, reviewed by Greg Smith, with some further changes by me.

On 27th of March, Robert Haas committed patch:

Expose track_iotiming information via pg_stat_statements.
 
Ants Aasma, reviewed by Greg Smith, with very minor tweaks by me.

On 29th of March, Tom Lane committed patch:

Improve contrib/pg_stat_statements to lump "similar" queries together.
 
pg_stat_statements now hashes selected fields of the analyzed parse tree
to assign a "fingerprint" to each query, and groups all queries with the
same fingerprint into a single entry in the pg_stat_statements view.
In practice it is expected that queries with the same fingerprint will be
equivalent except for values of literal constants.  To make the display
more useful, such constants are replaced by "?" in the displayed query
strings.
 
This mechanism currently supports only optimizable queries (SELECT,
INSERT, UPDATE, DELETE).  Utility commands are still matched on the
basis of their literal query strings.
 
There remain some open questions about how to deal with utility statements
that contain optimizable queries (such as EXPLAIN and SELECT INTO) and how
to deal with expiring speculative hashtable entries that are made to save
the normalized form of a query string.  However, fixing these issues should
require only localized changes, and since there are other open patches
involving contrib/pg_stat_statements, it seems best to go ahead and commit
what we've got.
 
Peter Geoghegan, reviewed by Daniel Farina
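As a rough sketch of how these three changes show up together, assuming pg_stat_statements is loaded via shared_preload_libraries and the extension is created in the current database. The column names below are the ones from the 9.2-era view (later releases renamed some of them, for example total_time), and the GUC was released as track_io_timing rather than the track_iotiming name used in the commit message.

```sql
-- Top 10 normalized queries by total time, with I/O timing columns.
-- blk_read_time / blk_write_time stay at zero unless I/O timing is enabled.
SELECT query,
       calls,
       total_time,
       blk_read_time,
       blk_write_time
  FROM pg_stat_statements
 ORDER BY total_time DESC
 LIMIT 10;
```

With normalization in place, queries that differ only in literal constants are expected to collapse into a single row here.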


What index to create?

Some time ago I wrote a blog post about why an index might not be used.

While this post seemed to be well received (top link from depesz.com on reddit), it doesn't answer another question – what index to create for a given situation.

I'll try to cover this question now.

IMPORTANT UPDATE: As of PostgreSQL 10, hash indexes are WAL-logged. As such, the main point against them is gone.


Waiting for 9.1 – Faster LIKE/ILIKE

On 1st of February, Tom Lane committed patch:

Support LIKE and ILIKE index searches via contrib/pg_trgm indexes.
 
Unlike Btree-based LIKE optimization, this works for non-left-anchored
search patterns.  The effectiveness of the search depends on how many
trigrams can be extracted from the pattern.  (The worst case, with no      
trigrams, degrades to a full-table scan, so this isn't a panacea.  But   
it can be very useful.)                                                 
 
Alexander Korotkov, reviewed by Jan Urbanski
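To illustrate the non-left-anchored case, here is a sketch assuming PostgreSQL 9.1 or later with pg_trgm installed. The users table and its username column are hypothetical, and EXPLAIN is used only to check which plan gets chosen, which depends on table size and statistics.

```sql
-- Hypothetical table used only for illustration.
CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE users (id serial PRIMARY KEY, username text);
CREATE INDEX users_username_trgm_idx ON users USING gin (username gin_trgm_ops);

-- Pattern with a leading wildcard: a plain btree index cannot help here,
-- but the trigrams "epe" and "pes" can be looked up in the trigram index.
EXPLAIN SELECT * FROM users WHERE username LIKE '%epes%';

-- The case-insensitive variant is supported as well.
EXPLAIN SELECT * FROM users WHERE username ILIKE '%EPES%';
```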


Waiting for 9.1 – Unlogged tables

On 29th of December, Robert Haas committed an interesting patch, which does:

Support unlogged tables.
 
The contents of an unlogged table aren't WAL-logged; thus, they are not
available on standby servers and are truncated whenever the database
system enters recovery.  Indexes on unlogged tables are also unlogged.
Unlogged GiST indexes are not currently supported.

(Edited commit message, due to this mail.)
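A minimal sketch of the syntax, assuming PostgreSQL 9.1 or later. The table name and columns are hypothetical, picked to suggest the kind of disposable data that can tolerate being truncated after a crash.

```sql
-- Contents are not WAL-logged: faster writes, but the table is emptied
-- when the server enters recovery and is not available on standbys.
CREATE UNLOGGED TABLE session_cache (
    session_id text PRIMARY KEY,   -- the index on an unlogged table is unlogged too
    payload    text,
    updated_at timestamptz DEFAULT now()
);

-- relpersistence is 'u' for unlogged relations ('p' means permanent).
SELECT relname, relpersistence
  FROM pg_class
 WHERE relname = 'session_cache';
```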



Tag: performance