what should be fixed in postgresql

edit: title was changed because apparently it was too offensive

the title might a bit too offensive, but perhaps it will make it more visible.

some time ago (march 2007) i asked on polish db-related newsgroup about things that people hate about their databases.

to give some example i wrote about 5 things that i hate (dislike?) about postgresql. today i looked back at this topic to check what has changed. many things did change. some things don't piss me anymore. some new things showed up, so, here we go – what i (personally) see as things to be fixed/removed soon to get nicer (for users) database system.

first thing – lack of planner hints. this is a major pita. i know that we should keep statistics up-to-date. i've read about “you have to learn your planner to do the thing". i have even gave the same advice to others.

yet, i've seen on number of cases that sometimes it simply gets it wrong. what's more – in some cases the costs are given in some arbitrary way (like the “select * from tablea left outer join tableb on tablea.x = tableb.y where tableb.y is null" case, which had been lately modified by tom lane by lowering estimated returned row count).

the idea is that planner is not perfect. i can teach it all time, and even then sometimes it will not get the best plan. solutions? fixing source to lower/boost some voodoo-parameters? come on – is adding planner hints really not possible? are all other things really more important? perhaps, but anyway – it's a real issue at the very least for me.

second – locale.

there are some things that i understand – adding full locale support per column/row/table is very complex. but what i personally find unjustifiable it the lack of ability to set locale per database, and – ability to change locale without re-initdb.

i know that changing locale would require rebuilding all indexes – but come on – we have the infrastructure (create index concurrently) – why can't it be used for change-locale thing? and even if we'll consider using concurrent index building as not possible – i would still prefer to have to “reindex database" than to reinitdb!

i have seen number of people that created their first database, found out that sorting/upper/lower is not working properly, and were then told that they have to dumpall, reinitdb, and reload everything. now – this is really pissing off.

third – “field quotation"

what drives me nuts about it is that we have tools (like pgadmin, at least that's what i've been told) which use quotation without informing users about possible consequences.

additionally – i don't quite understand why we can't go one step towards user and try to “guess" the table name. i mean – if there is no table with the name user specified – perhaps we could check if there is another, which has the same name with another capitalization? if there is one such table – use it (after all, in docs we say that identifiers are case-insensitive, so it's not like user would be shocked by this behavior). if there are more than one such tables – raise exception.

i honestly think that this would be much more user-friendly than current behavior.

fourth – mvcc-related lack of covering indexes.

now, this is pretty technical issue, but if you want to know what covering indexes are – take a look at this article.

now, i know that this is a huge thing to write, and i don't expect anyone rushing to do it (while i don't quite get why points 1 and 3 are not yet fixed), but it is so great feature, that i really think that it would be a great point in any “postgresql vs. other-database" discussion.

fifth – lack of global variables and packages in plpgsql.

plperl has it. plpgsql has to use some tricks with custom variable classes. why can't be there some “SHARED" record that would hold data between calls?

i decided to bring packages together, because they tend to have some namespace for package-wide variables.

plpgsql is really nice language, but it would definitely use some functionality-boost.

last point in postgresql itself (there will be one more thing) is lack of built-in jobs support.

i know about pgagent, but it comes with pgadmin, and not with postgresql. then, it's another process that have to be separately started.

usually cron job is enough – it simply works.

but i would simply prefer to have this functionality built into postgresql itself. why? it's safer (it's in database, so it doesn't have to authenticate, so it doesn't need “trust" or .pgpass), for one.

another point – it's simpler for me to manage – one tool (psql) would give me full control. pg_dumpall would contain everything including additional tasks – now, when making backup server, i have to do pg_dumpall, scan all possible crontabs, and all other, totally exotic places depending on what tool used somebody before me.

last point is not exactly about postgresql. it's about slony. i think that slony is the “poster-child" of postgresql replication, so i decided to include this point in this list.

slony is nice. works. i can set it up in 5 minutes. it has nice debugging.

all-round it's great product. with one exception.

try to add new column to existing table. it requires me to use “execute script" command of slonik.

this makes slony get database-wide exclusive lock to do the thing!

now, while i see why it was made so (for simplicity of slony code), it is not really acceptable.

the database i'm working on, when idling (in the night, with virtually no visitors) does around 30 transactions per second!

this brings some problems:

getting the lock is actually not simple – usually it deadlocks with some other transaction
when it's finally there the lock blocks everything on database for as-long-as-it-takes
alter table x add column q; takes for some reason a lot of time – even though i didn't use default value – same command, executed directly takes around 1 second. through slony – 1-2 minutes.

now, i don't have any silver bullet on how to fix any of these issues. i'm not c programmer, so i can't write the patches myself. i'm an ordinary guy, so i can't really sponsor anyone to do it for me, but i would really love to see them somehow fixed. and i'm kind of sure, that i'm not alone on this 🙂

9 thoughts on “what should be fixed in postgresql”

Mathew says:

2007-10-29 at 00:04

Some good points here. I hope they get noticed by a friendly code monkey looking for a patch to write.

I agree with most of what you are saying.

On the package front I have to say I find the way Oracle does PL packages very strange, though.
Pavel Stehule says:

2007-10-29 at 09:19

I can agree with some. Lot of points are in ToDo – only waits for free programmers. Your solution about quoted fields is nonsens. It’s dangerous. There can be more user frendly diagnostics and warning messages. About hints .. it help for usually wrong programmer, with hints database use index, but you will have unefective statement still (you have to rewrite statement). All others I would to see in PostgreSQL too.
depesz says:

2007-10-29 at 09:42

why it’s nonsense?

i think that giving real case-insensitivity is good.

right now we have some “kind of” case-insensitivity, which bites a lot of people (new pg users usually).

it’s not like it is the top priority. yet i think it can, and should be fixed.
Zygo says:

2007-10-29 at 10:02

I haven’t felt any special need for query hinting…although that’s probably because 1) I’ve figured out how to write plans that the planner can’t optimize, which force it to use the plan I want, or 2) I’ve rewritten some particularly troublesome JOIN queries as nested FOR … IN query LOOPs with PL/PgSQL. You know something’s wrong when loop iteration through your query engine is faster than your query engine iterating through a loop.

For me the problem is usually that empirical testing with a bunch of query plans shows that some plan B is better than the plan A generated from the query planner. In these cases plan B will be slightly faster, or more predictable with different inputs, or plan A works better on a freshly dump/restored or CLUSTERed database while plan B works better on a database that has had a few million transactions pushed through it. Or perhaps plan B works better on a machine with reporting query C and D running at the same time (AFAIK the planner doesn’t currently take into account concurrent loads on the buffer caches).

The other annoying feature of statistics-based planning is that the query plans can go disastrously wrong without warning when the system is in production. I’ve been bitten by this a few times now (and this case is where the PL/PgSQL ‘join’ loops come in–those have predictable run times because the planner can’t possibly change the join algorithm).

My current biggest peeve (maybe fixed in 8.3?) is what happens when you have an application that alternates between:

select into my_job_id job_id from job_queue order by job_priority limit 1;

and

delete from job_queue where job_id = my_job_id;

where ‘x’ is an indexed column. Usually the application does a bunch of other processing for each row which isn’t shown here.

As of 8.2, this gradually gets slower and slower since the index contains all live *or dead* rows. Each time we do the select, we iterate linearly(!) over dead rows until we finally find a live row. Each time we do a delete, we add one more iteration to the next select.

Frequent vacuuming helps, but it’s hard to achieve “frequent” on a 60GB table.

It would be nice if the index search could say “hmmm, I seem to have followed an index to a dead row in the heap (i.e. global minimum transaction id is higher than the upper transaction id of this row), I’m going to go back and wipe the row out of the index right now, so I’ll never have to follow it again.” This is what VACUUM (without FULL) does already, and since the pages are already in the cache the cost of doing a “one-row VACUUM” is just the cost of writing the index page back (which would be a heck of a lot cheaper than all these linear searches across both heap and index on every query).
alec says:

2007-10-29 at 12:50

I know that programmers know why, me too, but what will be result of “SELECT 2/4”?
Thormick says:

2007-10-29 at 14:30

Having worked for years with “The Big Database Vendor(tm)”s product, I quickly learned that there were two very distinct types of hints: 1. Hints that tell the planner what to do. 2. Hints that give the planner information. Type one was just bad, and I’m happy to see that PostgreSQL doesn’t have those, but type two would be very welcome, rather than having to rely on “global voodoo parameters”.
Jeff Davis says:

2007-10-29 at 18:11

Interesting post.

(1) Look into the Dead Space Map, or DSM, which will hopefully be a part of 8.4. This feature promises the benefits of covering indexes and also drastically reducing the repetitiveness of VACUUM.

(2) Planner hints are not a bad idea, but they should be done declaratively, not within a query. There are no benefits to having hints inside the query, and lots of drawbacks.

(3) Locales: I agree completely. Locales and encodings could be improved a lot.
Zygo says:

2007-10-29 at 18:17

I looked over some of the queries I’ve had problems with in the last year to see where and why the planner alone fails.

This pattern seems to be quite common:

for some_row in select … loop
exit when random() > .01;
/* other code here */
end loop;

(note: the condition is usually more deterministic than random(), this is just a minimal example 😉

The problem here is a case the planner cannot possibly solve with statistical analysis of the query: I’m walking through a query with a cursor, but I’m going to consume a very small number of rows–much fewer rows than the full theoretical result set. Unfortunately I don’t know how many rows I will iterate over in advance, so I can’t simply use LIMIT.

Without a hint there’s no way for the planner to figure out that a small number of rows will ultimately be fetched from the theoretical result set. If the SELECT joins two large tables, this often means that the planner will choose seq_scan + sort instead of index_scan + nested loop. That actually would be faster if I intended to iterate over all rows with the cursor; however, since I usually iterate through at most a few hundred rows at a time the index scan is vastly superior.

There are a few solutions to this in current PostgreSQL, but they’re annoying:

1) Add “WHERE some_column > last_value_of_some_column ORDER BY some_column LIMIT 100” to the query. (I said I can’t *simply* use LIMIT, not that there was no way ;-). Even this fails if the planner thinks some intermediate JOIN in the query will be large enough before the LIMIT to use sorts instead of index scans.

2) Take the one-time hit to build the query result, and keep the cursor for the life of the application. That’s so annoying I’ve never used it. 😉

3) Turn enable_seqscan off and on just before and just after executing the query. Quite painful to do this in PL/PgSQL, especially if the query is in a FOR loop. It also turns off *all* seqscans, even if they do make sense for part of the query.
Pingback: Pythian Group Blog » Blog Archive » Log Buffer #69: a Carnival of the Vanities for DBAs

Comments are closed.