what should be fixed in postgresql

edit: title was changed because apparently it was too offensive

the title might a bit too offensive, but perhaps it will make it more visible.

some time ago (march 2007) i asked on polish db-related newsgroup about things that people hate about their databases.

to give some example i wrote about 5 things that i hate (dislike?) about postgresql. today i looked back at this topic to check what has changed. many things did change. some things don't piss me anymore. some new things showed up, so, here we go – what i (personally) see as things to be fixed/removed soon to get nicer (for users) database system.

Continue reading what should be fixed in postgresql

who has birthday tomorrow?

so, there you have a users table, with a very basic structure:

      Table "public.users"
  Column   |  Type   | Modifiers
-----------+---------+-----------
 id        | integer | not null
 birthdate | date    |
Indexes:
    "x_pkey" PRIMARY KEY, btree (id)

then, you have a task: find a query that will return all users which have birthday tomorrow. how will you do it?

Continue reading who has birthday tomorrow?

grant XXX on * ?

one of the more common problems new users have with postgresql (especially those that came from mysql background), is the lack of easy way to grant/revoke/do-something with many objects (tables/sequences/views) at once.

there are number of pages that deal with the problem, let's just name some from #postgresql infobots:

now, both of these pages have their benefits, but i'd like to show something simpler, yet (perhaps) more powerful.

instead of giving you the fish (figuratively speaking) i will give you the net and the skills so you'll be able to do the magic yourself.

Continue reading grant XXX on * ?

finding missing pairs

let's assume we have a simple table:

     Table "public.test"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
Indexes:
    "test_pkey" PRIMARY KEY, btree (a, b)
    "q" UNIQUE, btree (b, a)

now, let's insert some rows to it:

# INSERT INTO test SELECT * FROM generate_series(1,100) AS a, generate_series(1,100) AS b^J;
INSERT 0 10000

remove rows with a = b:

# DELETE FROM test WHERE a = b;
DELETE 100

and prepare test-case by randomly removing some rows:

# DELETE FROM test WHERE random() < 0.002;
DELETE 17

the question is – find all pairs of (a,b) where there is no row (a',b') where (a'=b and b'=a).

in other words – every row (a,b) should be paired. rows with a = 2 and b = 3, is paired by row with a = 3 and b = 2.

how to find incomplete pairs?

Continue reading finding missing pairs

“FATAL: Ident authentication failed”, or how cool ideas get bad usage schemas

UPDATE (2012-06-24): Version 9.1 of PostgreSQL renamed ident to peer (for local connections). So if you're having errors about “Peer authentication failed" – it is the same as “Ident authentication failed", and all described in this blogpost is still relevant.

ever seen one of those? i mean the “fatal: ident authentication failed"?

or, ever seen anyone having this problem when connecting to postgresql?

how often is this problem related to debian/post-debian linux distributions? 99%? 100%?

on #postgresql on irc.freenode.net it is the most common problem. my own irc logs show that “ident" showed over 300 times over last 41 days. now, that's something. and how come we have this problem? what can be done with it? read on.

Continue reading “FATAL: Ident authentication failed", or how cool ideas get bad usage schemas

finding optimum tables placement in 2-tablespace situation

just recently we got another array for out main production database. this means – we will be able to add new tablespace, thus making everything go faster.

in theory – it's nice. but which tables to move to the other?

the basic assumption is simple – index on table should not be on the same tablespace as the table itself. that's easy. but – should we really put all tables on one tablespace, and all indexes on another?

we decided that the important things that should be “boosted" are seeks and writes. sequential reads are (in our situation) more or less irrelevant.

read on to check how we split the load.

Continue reading finding optimum tables placement in 2-tablespace situation

easy access to postgresql documentation from firefox

for whatever reason i seem to remember names of files in postgresql documentation.

for example – i know, that psql docs are in app-psql.html, base page for configuration options is runtime-config.html. manual for “alter table" sql command, is in sql-altertable.html.

yet full page url (http://www.postgresql.org/docs/current/interactive/app-psql.html) is very long. on the other hand going to http://www.postgresql.org/docs/current/interactive/ is nto really useful, as i need to find specific page which i remember filename, but not necessarily title.

is there any solution? apparently, yes, simply follow these steps:

  1. go to http://www.postgresql.org/docs/current/interactive/
  2. bookmark this page
  3. go to “Bookmarks" -> “Organize Bookmarks…"
  4. find the bookmark that you just created, select it, and click on “properties" in toolbar
  5. in “keyword" enter letter “p" (without quotes)
  6. change url to http://www.postgresql.org/docs/current/interactive/%s.html
  7. press “ok" (you might want to create a special folder to hide this bookmark from standard view, as you will never use it directly)

and that's all. after you've done it, you can simply enter: “p app-psql" in your location edit box, press enter and you will be directed to correct page.

of course this trick works not only with postgresql docs. it can be used for google, search.postgresql.org, anything you want. but using this for postgresql manual will definitely shorten time spent on “going to manual" 🙂

dell powervault md1000 – storage test

i recently got new toy for tests – brand new dell powervault md1000.

what's this, you ask? basically – a rather nice das (direct attached storage) from dell.

the box i got had 15 sas discs, each disc being 72gb, 15krpm.

since this will be used as database storage, i wanted to make some performance tests.

Continue reading dell powervault md1000 – storage test

what fields are usually changed when update’ing?

there was this situation, that we had a lot of tables and a lot of update activity. so, we thought about splitting the most updated tables to parts that are usually stable, and parts (columns) which change often.

but how to know what changes? unfortunately orm that was used issued updates like this:

UPDATE TABLE SET field1='..', field2='...', field3='...' WHERE id = 123;

basically it always updated all fields. (don't even start to comment that orms are by definition broken).

so, i had to find a nice way to find out what was really updated.

Continue reading what fields are usually changed when update'ing?