Getting list of unique elements in table, per group

Today, on irc, someone asked an interesting question.

Basically she ran a query like:

SELECT a, b, c, d, e, f FROM TABLE ORDER BY a

Then she processed the result to get, for each a, an array of unique values of b, c, d, e, and f, and then she inserted it back into the database, into some other table.
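As a rough illustration of that client-side processing (the asker's actual code was not shown, so the function name and sample rows below are hypothetical), a minimal Python sketch could look like this. It assumes the rows arrive already ordered by a, as in the query above:

```python
from itertools import groupby
from operator import itemgetter


def unique_values_per_group(rows):
    """Collect unique values of b..f for every value of a.

    rows: iterable of (a, b, c, d, e, f) tuples, already ordered by a.
    Returns a dict mapping a -> list of five sets (one per column b..f).
    """
    result = {}
    for a, group in groupby(rows, key=itemgetter(0)):
        # One set per column b, c, d, e, f
        seen = [set() for _ in range(5)]
        for row in group:
            for column_set, value in zip(seen, row[1:]):
                column_set.add(value)
        result[a] = seen
    return result


# Hypothetical sample data standing in for the query result
rows = [
    (1, "x", 10, "p", True, 0.5),
    (1, "y", 10, "p", True, 0.5),
    (2, "x", 20, "q", False, 1.5),
]
per_group = unique_values_per_group(rows)
```

Every row has to travel from the database to the client before anything is deduplicated, which is where the cost comes from on a large table.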

It was a problem, because the table had many rows (millions I would assume), and the whole process was slow.

So, how to make it faster?

explain.depesz.com – new change and some stats

Quite a long time ago (in October), Oskar Liljeblad reported a bug in anonymization. Namely – group keys were not anonymized.

You can see an example of such a plan here.

I finally got to it, fixed the bug, pushed the new version to the live site, and now such plans will be correctly anonymized.

Thanks Oskar, and sorry for the long delay.

Fixed a bug in OmniPITR

Just thought I'd share a “fun” story. A friend reported a weird bug – OmniPITR reported that xlogs were being sent to the archive, but they actually weren't.

After some checking we found out that he was passing a custom --rsync-path (the path to the rsync program), and the path was broken.

In this case OmniPITR did not report an error, and quite happily carried on under the assumption that everything worked OK.

How to install your own copy of explain.depesz.com

There are some cases where you might want to get your own copy of explain.depesz.com. You might not trust me with your explains. You might want to use it without internet access. Or you just want to play with it, and have total control over the site.

Installing, while obvious to me, and recently described by John Poole, is not always 100% clear. So, I decided to write about how to set it up, from scratch.

Changes on explain.depesz.com

Uploaded a new version to the server – straight from GitHub. There are two changes – one visible, and one not really.

The invisible change, first, is one for people hosting explain.depesz.com on their own. As you perhaps know, you can get the sources of explain.depesz.com and install it on any box you want (as long as you can get PostgreSQL, Perl, and some Perl modules there). While working on it on my own, I figured I could use a way to tell which version of module-xxx the site is running right now. So I built an /info page (which is inaccessible to everyone except manually-marked admins), which lists versions and interesting paths.

The second change – the one visible to users – is that I made explain.depesz.com commify numbers. Sometimes it can be hard to read a value like 12325563, but now it will be displayed as 12,325,563, making it simpler to grasp.
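To show what commifying means in practice, here is a tiny Python sketch. It is only an illustration of the formatting, not the code the site uses (the site is written in Perl), and the function name is made up for this example:

```python
def commify(number):
    """Format an integer with a comma as the thousands separator."""
    return f"{number:,}"


print(commify(12325563))  # 12,325,563
```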

This second change was suggested by Jacek Wielemborek, so if you hate it – blame him. Of course if you love the change – it's all on me 🙂

Hope you'll find it helpful.

PostgreSQL + Perl + Unicode == confusion. Why?

Yesterday I had an interesting discussion on irc.

A guy wanted to know why a Perl script is causing problems when dealing with Pg
and Unicode characters.

The discussion went sideways, I got (a bit) upset, and had to leave anyway, so
I didn't finish it. But it did bother me, as for me the reasons for the problem
seem obvious, yet the person I talked with was very adamant that I had the
whole thing wrong.

So, I figured I'll use my blog to elaborate a bit…

OmniPITR 1.3.1

Right after releasing 1.3.0 I realized that I forgot about one thing.

If you're using the ext3 file system (and possibly others, I'm not sure), removal of a large file can cause problems due to heavy IO traffic.

We did hit this problem earlier at one of our client sites, and devised a way to remove large files by truncating them, bit by bit, until they are small enough to be removed in one go. I wrote about it earlier, of course.

Unfortunately – I forgot about this when releasing 1.3.0, but as soon as I tried to deploy at the client site, I noticed the missing functionality.

So, today I released 1.3.1, which adds two options to omnipitr-backup-cleanup:

  • --truncate
  • --sleep

If truncate is specified, and is more than 0, it will cause omnipitr-backup-cleanup to remove large files (larger than the truncate value) in steps.

In pseudocode:

if param('truncate') {
  file_size = file_to_be_removed.size()
  while ( file_size > param('truncate') ) {
    file_size = file_size - param('truncate')
    file_to_be_removed.truncate_to( file_size )
    sleep( param('sleep') )
  }
}
file_to_be_removed.unlink()

So, for example, specifying --truncate=1000000 will remove the file by first truncating it in 1MB steps.

The --sleep parameter is used to delay removal of the next part of the file (it's used only in the truncating loop, so it has no meaning when the truncate loop is not used). Its value is in milliseconds, and defaults to 500 (0.5 second).
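For those who prefer something runnable over pseudocode, the same loop can be sketched in Python. This is not OmniPITR's code (OmniPITR is written in Perl). The function name and its arguments are hypothetical, chosen only to mirror the two options described above:

```python
import os
import time


def remove_in_steps(path, truncate=0, sleep_ms=500):
    """Remove a file, optionally shrinking it step by step first.

    truncate: step size in bytes; 0 means plain removal in one go.
    sleep_ms: delay between truncation steps, in milliseconds.
    """
    if truncate > 0:
        file_size = os.path.getsize(path)
        # Cut `truncate` bytes off the end until the rest is small enough
        while file_size > truncate:
            file_size -= truncate
            os.truncate(path, file_size)
            time.sleep(sleep_ms / 1000.0)
    # Whatever is left (at most `truncate` bytes) is removed normally
    os.unlink(path)
```

Calling remove_in_steps("/some/large/file", truncate=1000000) would then shrink the file by 1,000,000 bytes every half second before finally unlinking it.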

Hope you'll find it useful.