edit: title was changed because apparently it was too offensive
the title might a bit too offensive, but perhaps it will make it more visible.
some time ago (march 2007) i asked on polish db-related newsgroup about things that people hate about their databases.
to give some example i wrote about 5 things that i hate (dislike?) about postgresql. today i looked back at this topic to check what has changed. many things did change. some things don't piss me anymore. some new things showed up, so, here we go – what i (personally) see as things to be fixed/removed soon to get nicer (for users) database system.
first thing – lack of planner hints. this is a major pita. i know that we should keep statistics up-to-date. i've read about “you have to learn your planner to do the thing". i have even gave the same advice to others.
yet, i've seen on number of cases that sometimes it simply gets it wrong. what's more – in some cases the costs are given in some arbitrary way (like the “select * from tablea left outer join tableb on tablea.x = tableb.y where tableb.y is null" case, which had been lately modified by tom lane by lowering estimated returned row count).
the idea is that planner is not perfect. i can teach it all time, and even then sometimes it will not get the best plan. solutions? fixing source to lower/boost some voodoo-parameters? come on – is adding planner hints really not possible? are all other things really more important? perhaps, but anyway – it's a real issue at the very least for me.
second – locale.
there are some things that i understand – adding full locale support per column/row/table is very complex. but what i personally find unjustifiable it the lack of ability to set locale per database, and – ability to change locale without re-initdb.
i know that changing locale would require rebuilding all indexes – but come on – we have the infrastructure (create index concurrently) – why can't it be used for change-locale thing? and even if we'll consider using concurrent index building as not possible – i would still prefer to have to “reindex database" than to reinitdb!
i have seen number of people that created their first database, found out that sorting/upper/lower is not working properly, and were then told that they have to dumpall, reinitdb, and reload everything. now – this is really pissing off.
third – “field quotation"
what drives me nuts about it is that we have tools (like pgadmin, at least that's what i've been told) which use quotation without informing users about possible consequences.
additionally – i don't quite understand why we can't go one step towards user and try to “guess" the table name. i mean – if there is no table with the name user specified – perhaps we could check if there is another, which has the same name with another capitalization? if there is one such table – use it (after all, in docs we say that identifiers are case-insensitive, so it's not like user would be shocked by this behavior). if there are more than one such tables – raise exception.
i honestly think that this would be much more user-friendly than current behavior.
fourth – mvcc-related lack of covering indexes.
now, this is pretty technical issue, but if you want to know what covering indexes are – take a look at this article.
now, i know that this is a huge thing to write, and i don't expect anyone rushing to do it (while i don't quite get why points 1 and 3 are not yet fixed), but it is so great feature, that i really think that it would be a great point in any “postgresql vs. other-database" discussion.
fifth – lack of global variables and packages in plpgsql.
plperl has it. plpgsql has to use some tricks with custom variable classes. why can't be there some “SHARED" record that would hold data between calls?
i decided to bring packages together, because they tend to have some namespace for package-wide variables.
plpgsql is really nice language, but it would definitely use some functionality-boost.
last point in postgresql itself (there will be one more thing) is lack of built-in jobs support.
i know about pgagent, but it comes with pgadmin, and not with postgresql. then, it's another process that have to be separately started.
usually cron job is enough – it simply works.
but i would simply prefer to have this functionality built into postgresql itself. why? it's safer (it's in database, so it doesn't have to authenticate, so it doesn't need “trust" or .pgpass), for one.
another point – it's simpler for me to manage – one tool (psql) would give me full control. pg_dumpall would contain everything including additional tasks – now, when making backup server, i have to do pg_dumpall, scan all possible crontabs, and all other, totally exotic places depending on what tool used somebody before me.
last point is not exactly about postgresql. it's about slony. i think that slony is the “poster-child" of postgresql replication, so i decided to include this point in this list.
slony is nice. works. i can set it up in 5 minutes. it has nice debugging.
all-round it's great product. with one exception.
try to add new column to existing table. it requires me to use “execute script" command of slonik.
this makes slony get database-wide exclusive lock to do the thing!
now, while i see why it was made so (for simplicity of slony code), it is not really acceptable.
the database i'm working on, when idling (in the night, with virtually no visitors) does around 30 transactions per second!
this brings some problems:
- getting the lock is actually not simple – usually it deadlocks with some other transaction
- when it's finally there the lock blocks everything on database for as-long-as-it-takes
- alter table x add column q; takes for some reason a lot of time – even though i didn't use default value – same command, executed directly takes around 1 second. through slony – 1-2 minutes.
now, i don't have any silver bullet on how to fix any of these issues. i'm not c programmer, so i can't write the patches myself. i'm an ordinary guy, so i can't really sponsor anyone to do it for me, but i would really love to see them somehow fixed. and i'm kind of sure, that i'm not alone on this 🙂