indexable "field like '%something'"

for a long time everybody knew that you can't use an index on "LIKE" operations.

then came text_pattern_ops, so we could use indexes for prefix searches:

# \d depesz_test
                         Table "public.depesz_test"
 Column |  Type   |                        Modifiers
--------+---------+----------------------------------------------------------
 id     | integer | not null default nextval('depesz_test_id_seq'::regclass)
 email  | text    | not null
Indexes:
    "depesz_test_pkey" PRIMARY KEY, btree (id)
    "x" UNIQUE, btree (email text_pattern_ops)

# EXPLAIN ANALYZE SELECT COUNT(*) FROM depesz_test WHERE email LIKE 'dep%';
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=96.71..96.72 rows=1 width=0) (actual time=0.983..0.985 rows=1 loops=1)
   ->  Bitmap Heap Scan on depesz_test  (cost=4.68..96.65 rows=24 width=0) (actual time=0.184..0.641 rows=155 loops=1)
         Filter: (email ~~ 'dep%'::text)
         ->  Bitmap Index Scan on x  (cost=0.00..4.67 rows=24 width=0) (actual time=0.158..0.158 rows=155 loops=1)
               Index Cond: ((email ~>=~ 'dep'::text) AND (email ~<~ 'deq'::text))
 Total runtime: 1.067 ms
(6 rows)

but what if i'd like to search for '%something'? not a prefix, but a suffix. in my example: what can i do to use indexes when searching for people from a given domain?
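
one common way to get there (an assumption on my side, the excerpt itself doesn't name the solution) is to index the reversed text, so that a suffix search turns into a prefix search. below is a minimal python sketch of just that idea. a sorted list stands in for the btree index, the range bounds mimic the index condition from the plan above ('dep' <= email < 'deq'), and all function names are made up for illustration:

```python
import bisect


def prefix_range(prefix):
    """Return (low, high) such that low <= s < high holds for every s starting with prefix."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    # bump the last character by one: 'dep' -> 'deq'
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


def build_reversed_index(emails):
    """Sorted reversed emails plus the matching originals (a stand-in for an index)."""
    pairs = sorted((email[::-1], email) for email in emails)
    keys = [key for key, _ in pairs]
    values = [value for _, value in pairs]
    return keys, values


def find_by_suffix(index, suffix):
    """Find all emails ending with suffix using two binary searches on the reversed keys."""
    keys, values = index
    low, high = prefix_range(suffix[::-1])
    start = bisect.bisect_left(keys, low)
    end = bisect.bisect_left(keys, high)
    return values[start:end]


emails = ["depesz@depesz.com", "a@example.org", "b@depesz.com", "c@example.com"]
index = build_reversed_index(emails)
matches = find_by_suffix(index, "@depesz.com")
```

the point of the sketch is only that "ends with X" on the original text is the same question as "starts with reversed X" on the reversed text, and the second form can be answered with a range lookup.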

update account set password = 'new_password'; oops?!

how many times have you (or somebody in your environment) done something like this: an update (or delete) without a proper "where"?

it has happened to me at least a couple of times.

of course, using transactions solves the problem. but, on the other hand, if i can't trust myself to add a proper "where", how can i trust myself to add "begin"?

so, is there no hope?

are we doomed to always repeat the same mistake, and spend countless hours recovering damaged tables?

fortunately, there is help.

migrating a live system from single disk to raid1

i was given a machine with 2 discs (2x 160gb, sata), running debian linux 4.0, and the task of making it run as raid1.

in the beginning the layout was simple:

disc /dev/sda had 2 partitions:

  • sda1 – 2gb, swap
  • sda2 – rest of disc, root filesystem

the second disc (sdb) didn't have any partitions.

it was up to me how exactly to do it, but the outcome had to be:

  • all important data will be in raid1 setup on both discs
  • current data cannot be lost
  • installing everything from scratch is not an option.
  • the machine uses the lilo boot loader, and that shouldn't be changed

so, after some tests i did it, and i'm writing down how, for future reference.

all device names in the following text are the ones from the machine described above (sda1, sda2, sdb).

  1. of course: apt-get install mdadm. mdadm is the tool to make raid arrays on linux.
  2. since the debian kernel has everything important loaded, i didn't need to, but you can: modprobe md_mod; modprobe raid1
  3. create 2 partitions (sdb1, sdb2) on sdb disc. their layout and sizes should be the same as on source disc. in my case i decided to use sdb1 partition as /tmp disc – 2gb should be enough
  4. let's create filesystem on sdb1 partition (future /tmp space) : mkfs -t ext3 /dev/sdb1
  5. now i need to create the md0 device (it didn't exist in my system. if it does in yours, just skip this point). to create it, use: mknod -m 0660 /dev/md0 b 9 0; chgrp disk /dev/md0
  6. once i have /dev/md0, i create the array. i do so by creating a new array in raid1 mode that will contain the sdb2 partition and a "missing" disc. this means that the array will be in "degraded" mode, but this is perfectly fine for us
  7. mdadm --create /dev/md0 -l1 -n2 /dev/sdb2 missing
  8. now, the filesystem on /dev/md0: mkfs -t ext3 /dev/md0
  9. then you should edit /etc/fstab, and change the device for the root filesystem from "/dev/sda2" to "/dev/md0". the finished line can look like this: "/dev/md0 / ext3 defaults 0 1"
  10. i add information about /tmp to fstab: "/dev/sdb1 /tmp ext3 defaults,errors=continue,noexec,nosuid 0 0". it is very important to use "0" at the end. otherwise, if one of the discs fails, the system will not boot up correctly, claiming that it can't mount /tmp.
  11. in /etc/lilo.conf i modify “root=" entry to point it to /dev/md0: “root=/dev/md0". “boot=" stays “/dev/sda"
  12. mkdir /mnt; mount /dev/md0 /mnt; cd /; tar cf - --exclude=./proc --exclude=./mnt --exclude=./sys . | ( cd /mnt; tar xvf - )
  13. the above series of commands creates the /mnt directory, mounts our raid device there, and copies the whole filesystem to it.
  14. since i skipped /proc and /sys, i have to create them now, and fix permissions: mkdir /mnt/proc /mnt/sys; chmod 555 /mnt/proc
  15. ok. now we have: root filesystem on /dev/sda2, 2-device raid1 on /dev/sdb2 (and a missing disc), with copy of root filesystem. configured /etc/fstab and /etc/lilo.conf. so, just issue “lilo" command to install new bootblock (should go without errors), and reboot machine.
  16. after bootup root filesystem should be mounted on /dev/md/0, and cat /proc/mdstat should show that this array (md0) is working, but degraded.
  17. now, we add the (currently unused) /dev/sda2 to md0: mdadm --add /dev/md0 /dev/sda2
  18. the raid rebuild process is now running. its progress can be seen in the /proc/mdstat file. a full rebuild took me about 40 minutes. we can't proceed before the rebuild finishes.
  19. after it finished, we have to modify /etc/lilo.conf once again. this time, “boot=" parameter should be changed to “/dev/md0", and we should add new parameter: “raid-extra-boot=/dev/sda,/dev/sdb"
  20. after modifications of lilo.conf we should issue “lilo" command to make the change permanent.
  21. at this point the migration is practically finished. we can simply do one more reboot to test if it works (it should, and it did work for me 🙂)
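
since step 18 says not to continue before the rebuild finishes, here is a small python sketch of checking that from the text of /proc/mdstat. the three sample texts are illustrative only (they follow the usual layout of that file, they are not output captured from this machine), and the function names are made up:

```python
import re

# illustrative samples in the usual /proc/mdstat layout (not real captured output)
DEGRADED = """Personalities : [raid1]
md0 : active raid1 sdb2[0]
      154288192 blocks [2/1] [U_]

unused devices: <none>
"""

REBUILDING = """Personalities : [raid1]
md0 : active raid1 sda2[2] sdb2[0]
      154288192 blocks [2/1] [U_]
      [=>...................]  recovery =  8.5% (13132544/154288192) finish=35.2min speed=66755K/sec

unused devices: <none>
"""

HEALTHY = """Personalities : [raid1]
md0 : active raid1 sda2[1] sdb2[0]
      154288192 blocks [2/2] [UU]

unused devices: <none>
"""


def array_status(mdstat_text, array="md0"):
    """Return (active_devices, expected_devices, rebuilding) for one md array."""
    lines = mdstat_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith(array + " :"):
            # the lines up to the next blank line describe this array
            block = []
            for following in lines[i + 1:]:
                if not following.strip():
                    break
                block.append(following)
            text = "\n".join(block)
            # "[2/1]" means: 2 devices expected, 1 currently active
            counter = re.search(r"\[(\d+)/(\d+)\]", text)
            if counter is None:
                raise ValueError("no device counter found for " + array)
            expected, active = int(counter.group(1)), int(counter.group(2))
            rebuilding = "recovery" in text or "resync" in text
            return active, expected, rebuilding
    raise ValueError("array not found: " + array)


def safe_to_continue(mdstat_text, array="md0"):
    """True only when all expected devices are active and no rebuild is running."""
    active, expected, rebuilding = array_status(mdstat_text, array)
    return active == expected and not rebuilding
```

on the real machine you would feed it open("/proc/mdstat").read() and go on with step 19 only when safe_to_continue() returns True.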

now, the procedure i have shown above is not meant to be a full-fledged raid howto or manual. there are better sources for this kind of information.

this procedure is only meant to help in similar cases (lilo, migration of root filesystem to raid1).

if you have any questions about it, do not hesitate to ask. and if you don't understand something, please tell me so. i'll be glad to fix all that's not clear.

postgresql tips & tricks

mage_ from #postgresql had an interesting problem today.

he has a table with 2 date fields, and he wants to have a list of all years from both fields. together. as one list.

his approach:

SELECT date_part('year', date1) FROM test
UNION
SELECT date_part('year', date2) FROM test;

is hardly satisfactory – it takes too long.

any way to speed it up?

londiste – simple replication system for postgresql

some time ago skype released some tools for postgresql that they developed.

one of the released tools is londiste – simple replication system. it is master to multiple slaves, without cascading.

the problem with skype projects is that the documentation is not as complete as we expect from a pg-related project. luckily dim used londiste, and was kind enough to write a howto/manual for this project.

all i can say is: take a look at it. it looks very simple to use. it lacks some features of slony, but it's definitely worth a look.

how to insert data to database – as fast as possible

some time ago i wrote a piece on values(),() in postgresql 8.2, saying that multi-row inserts are very fast.

some people said that my benchmark is pretty useless as i didn't take transactions into consideration.

others asked me to translate the text to english.

so i decided to redo the test, with more test scenarios, and write it up in english. this is the summary.

first: what i used, what i tested, and how.

i used a linux machine, with these things inside:

  • cpu: AMD Sempron(tm) Processor 2600+ (1.6ghz)
  • memory: 3gb
  • discs: 4 250gb hitachi sata discs (only one was used)

i tried to make the machine as predictable as possible, so i stopped all daemons which were not necessary. full ps auxwwf output is provided in the results tar file. basically: there is postgresql, sshd, openvpn, dhclient and some gettys. no cron, atd, smtpd, httpd or anything like this.

then i wrote a small program which generated test files. i do not distribute test files themselves, as in total they use nearly 70gb!

then i wrote another small program – which basically ran all of the tests (3 times to get an average).

full set of results is downloadable as tar file, which contains 10598 files (tar file is 350k, unpacked directory takes 42megs).

one very important note. all tests that i have performed inserted random data into a table of this structure:

  • id int4 primary key,
  • some_text text,
  • bool_1 bool,
  • bool_2 bool,
  • some_date date,
  • some_timestamp timestamp

so results (especially "break-points" where there is no further gain) will be different when inserting into other tables. the only point of this benchmark is to show which approach can give which results. and what's really worth the trouble 🙂
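
to make the "values(),()" idea concrete, here is a minimal python sketch that generates random rows for the structure listed above and packs them into one multi-row insert statement. this is not the generator used for the benchmark (that one is not shown here). the table name test_table, the value ranges and all function names are assumptions made for illustration:

```python
import random
import string
from datetime import date, datetime, timedelta

COLUMNS = "id, some_text, bool_1, bool_2, some_date, some_timestamp"


def random_row(row_id, rng):
    """One row of random data matching the column list above (value ranges are arbitrary)."""
    length = rng.randint(5, 20)
    some_text = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    some_date = date(2000, 1, 1) + timedelta(days=rng.randrange(3650))
    some_timestamp = datetime(2000, 1, 1) + timedelta(seconds=rng.randrange(3650 * 86400))
    return (row_id, some_text, rng.random() < 0.5, rng.random() < 0.5, some_date, some_timestamp)


def sql_literal(value):
    """Render a python value as an sql literal."""
    # bool has to be checked before int, because bool is a subclass of int in python
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    # text, dates and timestamps go in as quoted strings, with quotes doubled
    return "'" + str(value).replace("'", "''") + "'"


def multi_row_insert(table, rows):
    """Build a single INSERT ... VALUES (...), (...) statement for all given rows."""
    values = ",\n".join(
        "(" + ", ".join(sql_literal(v) for v in row) + ")" for row in rows
    )
    return "INSERT INTO " + table + " (" + COLUMNS + ") VALUES\n" + values + ";"


rng = random.Random(0)
rows = [random_row(i, rng) for i in range(1, 4)]
statement = multi_row_insert("test_table", rows)
```

changing the number of rows per statement (and the number of statements per transaction) is what a benchmark of this kind would vary.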


looking for hosting

i'm looking for a hosting provider. what i need is basically a virtual server, on a separate ip (i'd like to put some non-web servers there, like smtpd, and doing so without a separate ip can/will be tricky).

other needs – up to 50g of disc space, bandwidth of up to 1mbit upload.

i will need to have root on the machine to be able to install new software, and run services on low-ports (ssh, smtpd, web) and some high-ports (not known at the moment).

can you suggest any company that will let me set up such an environment? at a low price, but not at the cost of service uptime.

trees in sql – ltree

note: this method works only and exclusively in postgresql, since it uses a non-standard data type that exists (as a contrib module) only in postgres.

i won't describe how ltree works, because that's what the ltree manual is for.

the schema for ltree is trivial. for example, we store the original test tree like this:

# create table tree_ltree (
id int4 primary key,
path ltree
);

after inserting our test tree we get the following table contents:

id path
1 sql
2 sql.postgresql
4 sql.postgresql.linux

ok. how do you query such a table?

1. getting the list of main (top-level) elements

select * from tree_ltree where path ~ '*{1}'

2. getting the element directly "above" the given element:

input data:

  • ID : id of the element
select p.* from tree_ltree c join tree_ltree p on c.path <@ p.path where c.id = [ID] and c.path ~ cast(p.path::text || '.*{1}' as lquery)

note that, given the path of an element, you can simply cut off its last label (from the last dot to the end) and that way get the path of the parent element right away.

3. getting the list of elements directly "under" the given element

input data:

  • ID : id of the element
select c.* from tree_ltree c join tree_ltree p on c.path <@ p.path where p.id = [ID] and c.path ~ cast(p.path::text || '.*{1}' as lquery);

note that, given the path of an element, you can simply append .*{1} to it and run the query:

select * from tree_ltree where path ~ [MODIFIED_PARENT_PATH]
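
the two path tricks mentioned above (cutting off the last label to get the parent, and appending .*{1} to match direct children) are plain string operations, so the application can do them before it talks to the database. a minimal python sketch, with made-up function names:

```python
def parent_path(path):
    """Cut off the last label: 'sql.postgresql.linux' -> 'sql.postgresql'.

    A top-level path has no parent, so None is returned for it.
    """
    head, dot, _last_label = path.rpartition(".")
    return head if dot else None


def children_lquery(path):
    """Text of an lquery that matches paths exactly one label below the given path."""
    return path + ".*{1}"


parent = parent_path("sql.postgresql.linux")
pattern = children_lquery("sql.postgresql")
```

the resulting strings are meant to be passed to the queries as parameters, the sketch itself does not touch the database.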

4. getting the list of all elements "above" the given (randomly picked) element

input data:

  • ID : id of the element
select p.* from tree_ltree c join tree_ltree p on c.path <@ p.path where c.id = [ID] AND p.id <> [ID]

5. getting the list of all elements "under" the given (randomly picked) element

input data:

  • ID : id of the element
select c.* from tree_ltree c join tree_ltree p on c.path <@ p.path where p.id = [ID] AND c.id <> [ID]

6. checking whether the given element is a "leaf" (whether it has sub-elements)

input data:

  • ID : id of the element
select count(*) from tree_ltree c join tree_ltree p on c.path <@ p.path where p.id = [ID] AND c.id <> [ID]

if it returns 0, it is a leaf. otherwise it returns the number of elements under it (all descendants, not only the direct "children").

7. getting the main (top-level) element of the tree branch that contains the given (randomly picked) element

input data:

  • ID : id of the element
select p.* from tree_ltree c join tree_ltree p on c.path <@ p.path where c.id = [ID] and p.path ~ '*{1}'
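
to see what the self-joins on "c.path <@ p.path" compute, here is a small in-memory python imitation of queries 4 to 7, run on the three sample rows shown earlier. it is only a sketch of the semantics (ltree's <@ means "is a descendant of, or equal to"), it does not use postgresql, and the function names are made up:

```python
# the three sample rows from the table contents shown earlier: id -> path
ROWS = {1: "sql", 2: "sql.postgresql", 4: "sql.postgresql.linux"}


def labels(path):
    return path.split(".")


def is_descendant_or_self(child_path, parent_path):
    """Imitates ltree's child <@ parent: parent's labels are a prefix of child's labels."""
    child, parent = labels(child_path), labels(parent_path)
    return child[:len(parent)] == parent


def ancestors(rows, node_id):
    """Query 4: all elements above the given one."""
    return sorted(i for i, path in rows.items()
                  if i != node_id and is_descendant_or_self(rows[node_id], path))


def descendants(rows, node_id):
    """Query 5: all elements under the given one."""
    return sorted(i for i, path in rows.items()
                  if i != node_id and is_descendant_or_self(path, rows[node_id]))


def is_leaf(rows, node_id):
    """Query 6: a leaf is an element with nothing under it."""
    return len(descendants(rows, node_id)) == 0


def root_of(rows, node_id):
    """Query 7: the single-label (top-level) element of the branch holding the given one."""
    for i, path in rows.items():
        if len(labels(path)) == 1 and is_descendant_or_self(rows[node_id], path):
            return i
    return None
```

comparing whole labels (and not raw string prefixes) matters: a path like "sql2" must not be treated as being under "sql".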

as for advantages, the most important are: speed of writing, intuitiveness of the queries, capabilities (e.g. indexed search for elements 2 levels below any element whose name starts with "dep"), and readability of the data.

there is basically only one drawback: portability. if you ever move the database to something other than postgres in the future, you have a problem. well, yes. but why would you move a database to something other than postgres?