all of you know copy command. it's fast for inserting new data. but can it be used it we want to insert or update rows?
sure, it will be more than one command, but it's perfectly doable. here's how.
let's assume we have a table with user data (id, username, password).
we import it once, let's say 10000 accounts. then after time we get new import – 12000 accounts. and a task – new records should be inserted, and if somebody from already-imported users has different password – password from import should be used.
another thing – we can't remove users before import – reason is simple, there are other tables that have foreign keys to users.
so, what to do?
let's say our import data contains only username and password.
we import it to temp table:
CREATE temp TABLE new_import (username text, password text); copy new_import (username, password) FROM '/tmp/import.data';
now we have the data in side table, so let's update the users in case password is different:
UPDATE users AS u SET password = n.password FROM new_import n WHERE n.username = u.username AND n.password <> u.password;
and now, copy record from new_import users – but only about users which are not yet in users table:
INSERT INTO users (username, password) SELECT n.username, n.password FROM new_import n LEFT OUTER JOIN users u ON n.username = u.username WHERE u.username IS NULL;
and now cleaning:
DROP TABLE new_import;
and that's all. of course proper indexing will be needed to make it fast. simple, fast and effective.