postgresql tips & tricks

cortilap @ freenode's #postgresql asked about how to create a check() that will allow only one of the columns to be not null.

it doesn't sound cool, let's see:

with 2 columns (a,b) you make a check: check ( (a is not null and b is null) or (a is null and b is not null) or (a is null and b is null))

whoa. and what about 3 columns? 4?

of course it creates some questions about the schema, but is there a way to do it? without such long checks?

one solution is to make a function to check it. but perhaps a simpler solution is possible?

luckily all of the fields are ints.

a quick think, and here we go:

check (coalesce(a*0, 1) + coalesce(b*0, 1) + coalesce(c*0, 1) > 1)

and what is the field was text? same thing, but instead of doing “X"*0, i would do “length(X)*0" 🙂

my thoughts on getting random row

this topic has been written about by many smart people – from the recent past, by greg sabino mullane and josh berkus.

they show 4 different approaches:

  1. order by random()
  2. >= random() limit 1
  3. random column
  4. random aggregate

all these approaches have their benefits and drawbacks, but i'd like to show another one (polish readers saw the approach already in january 2007, but this time i will make the code more robust).

Continue reading my thoughts on getting random row

speeding up like ‘%xxx%’

as most of you know postgresql can easily speedup searches using:

field like 'something%'

and (less easily):

field like '%something'

but how about:

field like '%something%'

general idea is to use some kind of full text search/indexing – tsearch, lucene, sphinx, you name it.

but sometimes you can't install fts/fti, or it doesn't really solve your problem. is there any help? let's find out.

Continue reading speeding up like ‘%xxx%'

objects in categories – counters with triggers

so there you go, you have some “categories" and some objects. for simplicity let's assume one object can be in only one category.

if this is too theoretical for you – let's assume these are “mails in folders", “photos in galleries", “posts in categories" or “auctions in categories on ebay".

everything clear? now, let's assume you want to know how many “objects" are in given “category".

most basic way to do it is:

SELECT COUNT(*) FROM objects WHERE category = some_category;

but this method is far from optimal. now, we'll learn how to do it better.

one warning for those of you who read the rss feed – if you say “yeah, i know the code, it's simple" – ask yourself – is your code deadlock-proof?

Continue reading objects in categories – counters with triggers

avoiding “empty” updates

just recently i saw a very interesting situation.

one table in customer database (with about 1.5 milion rows) was *very often* updated. using series of simple:

UPDATE TABLE SET FIELD = .., WHERE id = ...

updates always updated 1 record, search was using primary key, so it was quite fast.

what was strange was the fact that the table get about 20 times more updates then the next most updated table.

after careful checking i found out that a lot (about 60-70%) of the updates actually didn't change anything!

they were simply setting values that were already there.

so, i started to think about hwo to avoid this kind of situation.

and this is what i found out.

Continue reading avoiding “empty" updates

(postgre)sql gotchas

i would like to write about a things that are not really errors by themselves – they are simply situations when database acts in undesired way. a way that's perfectly logical, but where the logic is not always clearly seen “at first sight".

so, don't expect any postgresql-bashing. if you want some, indicate so in comments, i'll find some things to bash postgresql for.

but in here i'd rather bash us – people – for making mistakes. the ones that are not really easy to see as mistakes for database.

Continue reading (postgre)sql gotchas

indexable “field like ‘%something'” – update

thomas reiss just published his follow up to my post about indexed searches in “like ‘%something'" cases.

i don't understand french so i can't tell what exactly he wrote, but i understand c code and graphs 🙂

basically – he wrote c function that does reverse() of strings, and benchmarked it against pl/pgsql and pl/perl versions. of course c version is the fastest, but just take a look at how much faster it is. and how simple it is.