Why does “sudo ls -l /proc/1/fd/*” fail?

I usually write about PostgreSQL, but recently someone asked me for help, and one of the problems was similar to the sudo command in the title.

This was not the first time I had seen it, so I figured I'd write a blog post about it, just so I can refer people to it in the future.


A tale of automating tests of Pg with Bash

Word of warning: this blog post is about something related to Bash (well, maybe other shells too, I didn't really test), but since I found it while doing Pg work, and it might bite someone else doing Pg-related work, I decided to add it to the “postgresql” tag.

So, due to some work I had to do, I needed a quick, repeatable way to set up some Pg instances, replication between them, and a data loader. All very simple, no real problems. At least that's what I thought…


Parallel dumping of databases

Some time ago I wrote a piece on speeding up the dump/restore process using a custom solution that parallelized it.

Later on I wrote some tools (“fast dump and restore”) to do it in a more general way.

But all of them had a problem – to get a consistent dump you need to stop all concurrent access to your database. Why, and how to get rid of this limitation?


Reduce bloat of table without long/exclusive locks

Some time ago Joshua Tolley described how to reduce bloat from tables without locking (well, some locks are there, but very short, and not really intrusive).

Side note: Joshua: big thanks, great idea.

Based on his idea and some research of our own, I wrote a tool which does just this – reduces bloat in a table.


Tips N’ Tricks – using GNU Screen as shell

I quite often do stuff on remote machines, and quite frequently I start some long-running job and only then remember that I didn't run it via screen – so it will break if my network connection dies.

Is there any sane way to start screen automatically? YES.


Set operations in shell

I had this interesting case at work. We have imports of objects. Each object in an import file has its “ID” (which can be any string). The same “ID” is in the database.

So the idea is pretty simple – we can/should check how many of the IDs from the import were in the database. Unfortunately, we'd rather not do the comparison in the DB, as it is pretty loaded.
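
As a rough illustration of doing such a comparison outside the database, here is a minimal sketch. It assumes the import IDs and a dump of the database IDs are already available as plain text files with one ID per line. The helper name count_common and the file names are made up for this example, and this is not necessarily the approach the full post takes. Each file is deduplicated first, so any ID that shows up twice in the combined stream must be present in both files.

```shell
#!/bin/sh
# count_common is a hypothetical helper (not from the original post).
# It prints how many distinct IDs appear in both files; each file is
# expected to hold one ID per line.
count_common() {
    # After "sort -u" every ID occurs at most once per file, so an ID
    # seen twice in the merged stream is in both files ("uniq -d").
    { LC_ALL=C sort -u "$1"; LC_ALL=C sort -u "$2"; } |
        LC_ALL=C sort |
        uniq -d |
        awk 'END { print NR }'
}

# Made-up sample data in a throwaway directory.
tmp=$(mktemp -d)
printf '%s\n' a17 b42 c99 b42 > "$tmp/import_ids.txt"
printf '%s\n' b42 c99 d00     > "$tmp/db_ids.txt"
count_common "$tmp/import_ids.txt" "$tmp/db_ids.txt"   # prints 2
rm -rf "$tmp"
```

The only load on the database in this setup would be the single export of IDs; the intersection itself runs entirely in the shell.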


How to find newest file with given name?

This post will probably be boring for you, but it is mostly just a reminder to myself, written in the form of a blog post.

So, I have a directory structure: /some/path/imported/DATE/TIME/file, where DATE is the date of the import, in format YYYY-MM-DD, and TIME is the time of the import, in format HHMMSS.

So, example paths look like this:

./2009-02-26/143251/5a6d001b94e47960fe41a262f70ed96a
./2009-02-26/143321/8e45f68421dad6129914fe068dfa5748
./2009-02-26/143407/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-26/143415/65180d1328e21959229e47b9288b6996
./2009-02-27/083542/5a6d001b94e47960fe41a262f70ed96a
./2009-02-27/084906/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-27/084926/65180d1328e21959229e47b9288b6996
./2009-02-27/155648/65180d1328e21959229e47b9288b6996

As you can see, some of the files were imported many times.

Now, I need to find the latest import of each file.

So, I need a way to convert the above list into:

./2009-02-26/143321/8e45f68421dad6129914fe068dfa5748
./2009-02-27/083542/5a6d001b94e47960fe41a262f70ed96a
./2009-02-27/084906/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-27/155648/65180d1328e21959229e47b9288b6996

Of course – with 10 imports, it's simple. But what if I had 10000 of them?

Luckily, it is rather simple:

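find . -mindepth 3 -maxdepth 3 -exec basename {} \; |
    sort -u |
    while IFS= read -r NAME   # NAME is a file name, read without mangling backslashes
    do
        find . -name "$NAME" |
        sort |
        tail -n 1             # paths sort by DATE/TIME, so the last one is the newest
    done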

Of course I originally typed it as a one-liner 🙂

While writing the post I realized I could do better:

find . -mindepth 3 -maxdepth 3 | \
    sort -r -t/ -k4,4 -k2,2 | \
    awk -F/ 'BEGIN{prev="/"} ($4!=prev) {print $0; prev=$4}'
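
To see what the sort and awk stages do, here is a minimal sketch that feeds the sample paths from above into them on standard input, so no directory tree is needed. The function name newest_per_name is made up for this illustration. Two details differ from the command above and are my assumptions: the second sort key is written as -k2,3 so that both DATE and TIME are explicitly part of the key, and LC_ALL=C is set so that ordering is plain byte order.

```shell
#!/bin/sh
# newest_per_name is a hypothetical helper (not from the original post).
# It reads paths of the form ./DATE/TIME/NAME on stdin and prints,
# for each NAME, the path with the latest DATE/TIME.
newest_per_name() {
    # With "/" as separator the fields are: 1=".", 2=DATE, 3=TIME, 4=NAME.
    # Sort descending by NAME, then descending by DATE/TIME, so the newest
    # import of every NAME comes first within its group.
    LC_ALL=C sort -r -t/ -k4,4 -k2,3 |
        # Print only the first line of each NAME group.
        awk -F/ 'BEGIN { prev = "/" } ($4 != prev) { print $0; prev = $4 }'
}

newest_per_name <<'EOF'
./2009-02-26/143251/5a6d001b94e47960fe41a262f70ed96a
./2009-02-26/143321/8e45f68421dad6129914fe068dfa5748
./2009-02-26/143407/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-26/143415/65180d1328e21959229e47b9288b6996
./2009-02-27/083542/5a6d001b94e47960fe41a262f70ed96a
./2009-02-27/084906/aa04aa9c1e8f87b25fef98bd9a64e94d
./2009-02-27/084926/65180d1328e21959229e47b9288b6996
./2009-02-27/155648/65180d1328e21959229e47b9288b6996
EOF
```

The result contains the same four paths as the wanted list, only ordered by file name in descending order instead of by date. The whole input is sorted once and scanned once, instead of running a separate find for every file name.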

Well. I understand the code, and what it does, but it doesn't change the fact that I'm not really a fan of shell programming.