August 24th, 2007 by depesz

let's imagine a simple situation – you have a postgresql server. configuration was fine-tuned, hardware is ok. yet the system is not really as fast as it should be.

most common problem – slow queries.

second most common problem – fast queries, but too many of them. for example – i once saw a system which did something like this:

  • select id from table;
  • for every id do:
  • select * from table where id = ?

reason? very "interesting" orm.
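to make the pattern concrete – a quick sketch (python with an in-memory sqlite3 database, purely illustrative) of the 1+N round-trips versus a single query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t (id, val) VALUES (?, ?)",
                 [(i, "row %d" % i) for i in range(1, 101)])

# what the orm did: one query for ids, then one query per id (1 + 100 round-trips)
ids = [r[0] for r in conn.execute("SELECT id FROM t ORDER BY id")]
rows_slow = [conn.execute("SELECT * FROM t WHERE id = ?", (i,)).fetchone()
             for i in ids]

# what it should have done: one query (1 round-trip)
rows_fast = conn.execute("SELECT * FROM t ORDER BY id").fetchall()

assert rows_slow == rows_fast   # same data, 100x fewer queries
```

each of those per-id queries is fast on its own – the cost is the sheer number of them.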

now i'll show you how i deal with this kind of situation πŸ™‚

first of all – we turned on logging of all queries (using log_min_duration_statement = 0 in postgresql.conf). then we set log_line_prefix to "[%t] [%p] <%u@%d> " to have database information for every query.
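for reference – the two postgresql.conf settings mentioned above (just a fragment, not a full config):

```
log_min_duration_statement = 0              # log every statement, with its duration
log_line_prefix = '[%t] [%p] <%u@%d> '      # timestamp, pid, user@database
```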

a note here – log_min_duration_statement can be set to a higher value, so there will be fewer logs, and it will be simpler to spot long-running queries. but then we would not see the short-but-frequent queries.

and then we simply ran the websites that use the database for 1 day.

after this one day i got 12 gigabytes of logs πŸ™‚

checking all of it by hand is simply not an option. i could use pgfouine, but i am not really happy with the tool, so i decided to write my own. it's a simple thing, and should be doable in a moment.

what this tool does is:

  1. brings a query to "normalized" form. i.e. – select * from table where field = 123; becomes select * from table where field = ?; this lets me do simple aggregates
  2. outputs the number of all queries and the total time used (by all queries in the specified database)
  3. for every “normalized" query it shows:
    1. query – normalized query itself
    2. count – how many times this type of query was called
    3. min – minimal time used by query of this type
    4. avg – average time used by query of this type
    5. max – maximal time used by query of this type
    6. totaltime – summarized runtime of all queries of this type
    7. count% – it shows how many percent of all queries are queries of this type
    8. time% – how many percent of total used time was used by queries of this type
    9. factor-x – basically it is time%/count% – it is used to find queries that are slower than average
    10. fastest – example of query of this type – the one that was the fastest
    11. slowest – example of query of this type – the one that was the slowest
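a minimal python sketch of the normalization and per-type stats described above (the names and sample data are mine – the real tool is perl):

```python
import re
from collections import defaultdict

def normalize(sql):
    """bring a query to its 'normalized' form: literals become '?'"""
    std = " ".join(sql.lower().split())      # collapse whitespace
    std = re.sub(r"'[^']*'", "?", std)       # string literals
    std = re.sub(r"\b\d+\b", "?", std)       # numeric literals
    return std

# (time_ms, sql) pairs, as they would come out of the logs
log = [
    (5.2,   "select * from t where id = 123"),
    (4.9,   "select * from t where id = 456"),
    (120.0, "select count(*) from big where name = 'x'"),
]

stats = defaultdict(lambda: {"count": 0, "time": 0.0})
for time_ms, sql in log:
    s = stats[normalize(sql)]
    s["count"] += 1
    s["time"] += time_ms

total_count = sum(s["count"] for s in stats.values())
total_time = sum(s["time"] for s in stats.values())
for std, s in stats.items():
    count_pct = 100.0 * s["count"] / total_count
    time_pct = 100.0 * s["time"] / total_time
    factor_x = time_pct / count_pct   # > 1 means slower than average
    print("%6.2f  %s" % (factor_x, std))
```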

output of the tool is a csv file with tab-separated fields – easy to load into any kind of "excel/openoffice-calc/gnumeric" tool.
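loading that output back is trivial – e.g. in python (the two-line fragment here is hypothetical, just to show the tab-separated parsing):

```python
import csv
import io

# a made-up fragment of the tool's output: header plus one data row
raw = ("query\tcount\ttotaltime\n"
       "select * from t where id = ?\t13669698\t70722649.63\n")

rows = list(csv.reader(io.StringIO(raw), delimiter="\t"))
header, data = rows[0], rows[1]
print(dict(zip(header, data)))
```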

now, while writing the code i found some pitfalls, but hopefully it is working nicely now. if you want to get the code – please fetch it from my svn repository.

so, how does it really work?

really simple:

  1. i modified the script (line: Readonly my $ANALYZE_DB => 'some_database';) to set which database i want to analyze
  2. then i ran: cat postgresql-2007-08-18_* | time -p ./analyze.pgsql.logs.pl > output.csv (logs were rotated after every 500megabytes)
  3. i waited
  4. and waited
  5. and waited πŸ™‚
  6. after nearly 9 hours it was done (average speed was around 390 kilobytes per second – for smaller files it will be much faster – the slowest part is sorting logs)
  7. output was generated – 3 megabytes of .csv file, with 11200 lines

at the beginning of file i got total stats:

Total-time:	174437449.26
Total-count:	35056191

total-time is in milliseconds, so it is about 48.5 hours. not bad for 24 hours of logs. and this is only one of the databases on this machine πŸ™‚
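the arithmetic behind the "48.5 hours", plus two figures derived from the same two numbers (my additions):

```python
total_time_ms = 174437449.26   # Total-time from the report
total_count = 35056191         # Total-count from the report

hours = total_time_ms / 1000.0 / 3600.0    # total query time, in hours
avg_ms = total_time_ms / total_count       # average time per query, in ms
per_sec = total_count / (24.0 * 3600.0)    # average queries per second over 24h

print(round(hours, 1), round(avg_ms, 2), round(per_sec))  # β†’ 48.5 4.98 406
```

in other words: the server spent roughly twice real time executing queries, at about 406 queries per second on this one database.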

after these 2 lines i got long list of query-types (normalized queries).

so, i sorted it by totaltime – let's see how the top 3 queries look:

    • query : “select t1.recid as core_recid, t1.parent as core_parent, t1.elevel as core_elevel, t1.name as core_name, t1.descr as core_descr, t1.class_id as core_class_id, t1.object_key as core_object_key, t1.element_type as core_element_type from core_tree as t1 where exists (select * from core_tree_path tp where tp.id = t1.recid and tp.path = cast(? as int8) ) or t1.recid = cast(? as int8) order by elevel"
    • count : 13669698
    • min : 0.92
    • avg : 5.17
    • max : 615.26
    • totaltime : 70722649.63
    • count% : 38.99%
    • time% : 40.54%
    • factor-x : 1.04
    • fastest : hidden
    • slowest : hidden
    • query : “select distinct n.*, (case when fn.fid isnull then ? else case when fnam.active >= ? then ? else ? end end) as forum_is, fn.fid as forum_id, c.mastercat from cms_news2cat c inner join news_articles n on c.nid = n.id and c.version = n.version left join eoforum2news as fn on fn.aid = n.id and fn.version = n.version left join eoforum_names as fnam on fn.fid = fnam.recid where c.cidIN (?,,?) and folder = ? and n.state = ? and n.startdate < now() and n.enddate > now() order by n.startdate desc limit ? offset ?"
    • count : 289872
    • min : 1.38
    • avg : 122.62
    • max : 3857.65
    • totaltime : 35542937.71
    • count% : 0.83%
    • time% : 20.38%
    • factor-x : 24.64
    • fastest : hidden
    • slowest : hidden
    • query : “select count(*) as count from (select distinct n.*, (case when fn.fid isnull then ? else case when fnam.active >= ? then ? else ? end end) as forum_is, fn.fid as forum_id, c.mastercat from cms_news2cat c inner join news_articles n on c.nid = n.id and c.version = n.version left join eoforum2news as fn on fn.aid = n.id and fn.version = n.version left join eoforum_names as fnam on fn.fid = fnam.recid where c.cidIN (?,,?) and folder = ? and n.state = ? and n.startdate < now() and n.enddate > now() ) as c"
    • count : 289871
    • min : 1.5
    • avg : 117.34
    • max : 64575.16
    • totaltime : 34013322.04
    • count% : 0.83%
    • time% : 19.50%
    • factor-x : 23.58
    • fastest : hidden
    • slowest : hidden

this gives us a lot of information. first of all, we see that one particular type of query is responsible for over 40% of total usage-time! what's more – this query is in itself rather fast (maximal time of ~615ms, average of 5ms). so the problem is the sheer number of calls – within 24 hours this query was called over 13.5 million times!

this cannot be fixed on the database level – the app will have to be modified to decrease the number of calls.

but the next queries are more interesting from the database point of view – only about 290,000 calls each, yet each consumes around 20% of total time! (by the way – notice that the top 3 queries on this database together use more than 80% of server time!).
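the factor-x of query #2 can be re-derived from the raw numbers in the listing (the rounded percentages alone give 20.38 / 0.83 β‰ˆ 24.55; the raw values give the reported 24.64):

```python
total_time, total_count = 174437449.26, 35056191

# query #2 from the listing above
q_time, q_count = 35542937.71, 289872

time_pct = 100.0 * q_time / total_time      # ~20.38
count_pct = 100.0 * q_count / total_count   # ~0.83
factor_x = time_pct / count_pct             # share of time vs share of calls

print(round(factor_x, 2))  # β†’ 24.64
```

so each call of this query type is, on average, almost 25 times more expensive than the average query on this database.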

now, i will not go into details about how to optimize these queries (but maybe i will write more about it someday). what i'm showing here is that it is quite simple to find what's really using up time on your db server.

now, at the end, a couple of side notes:

  • this analysis takes some disk space – with 12 gigs of logs the peak size of on-disk data (while analyzing) was around 40 gigabytes. so – be prepared
  • it is crucial to log all queries. for example – if i logged only queries over 100ms (or even 10ms) i would catch the longest examples of query #1 above, but it would not show me that there are simply too many calls to it (remember, the average call was 5ms!)
  • this software is definitely not perfect – i just found a “nice" bug that caused some of the queries not to be aggregated. it is not a big deal, and i will most probably fix it by tomorrow πŸ™‚
  • the longest part of the analysis is sorting the result set. so, if you're using a machine with lots of ram, play with the -S option of the sort call to make it go faster.
  5 comments

  2. # xor
    Aug 25, 2007

    pgfouine is great tool for postgresql, I love it! πŸ™‚

  3. # Hack KampbjΓΈrn
    Aug 27, 2007

    I don’t understand why you generate a raw file and then sort it before analyzing the content.

    I took your script and changed it to use a hash table to accumulate the query statistics and then sort them at the end. If I didn’t have enough RAM I would tie it to a berkeley db:

    #!/usr/bin/perl
    use strict;

    use POSIX qw(strftime);
    use Fatal qw(open);

    my $OPTIONAL_HEADER = qr{ \A (\d\d\d\d-\d\d-\d\d \s+ \d\d:\d\d:\d\d \s+ (?: [A-Z]+ | \+\d\d\d\d ) ) \s+ ( [^@]+ ) @ (\w+) \s+ }xms;
    my $ANALYZE_DB = 'somedatabase';

    my $globals = {};

    my $data = normalize_data();
    write_report($data);

    exit;

    sub write_report {
        my ($data) = @_;
        print_report_header();
        foreach my $key (sort { $data->{$b}->{'time'} <=> $data->{$a}->{'time'} } keys %{$data}) {
            report_data($data->{$key});
        }
        return;
    }

    sub report_columns {
        print join("\t", @_) . "\n";
        return;
    }

    sub print_report_header {
        report_columns( "Total-time:", $globals->{'time'} );
        report_columns( "Total-count:", $globals->{'count'} );
        report_columns( qw( query count min avg max totaltime count% time% factor-x fastest slowest ) );
        return;
    }

    sub report_data {
        my $d = shift;
        my @c = ();
        push @c, $d->{'std'};
        push @c, $d->{'count'};
        push @c, $d->{'min'};
        push @c, $d->{'time'} / $d->{'count'}; # avg
        push @c, $d->{'max'};
        push @c, $d->{'time'};
        push @c, $d->{'count'} / $globals->{'count'}; # count %
        push @c, $d->{'time'} / $globals->{'time'}; # time %
        push @c, ( $d->{'time'} / $globals->{'time'} ) / ( $d->{'count'} / $globals->{'count'} ); # factor x
        push @c, $d->{'min-sql'};
        push @c, $d->{'max-sql'};
        report_columns(@c);
        return;
    }

    sub normalize_data {
        my $data = {};
        my $last = {};
        while (my $line = <>) {
            chomp $line;
            my ($date, $user, $db) = ('', '', '');
            if ($line =~ s/$OPTIONAL_HEADER//) {
                ($date, $user, $db) = ($1, $2, $3);
            }
            if ($line =~ m{ \A \s* LOG: \s+ duration: \s+ (\d+\.\d+) \s+ ms \s+ (?: (?:bind|execute) \s+ \w+ | statement): \s+ (.*?) \s* \z }xms) {
                my ($time, $sql) = ($1, $2);
                if ($db eq $ANALYZE_DB) {
                    $globals->{'count'}++;
                    $globals->{'time'} += $time;
                }
                analyze_line($data, $last) if $last->{'time'};
                $last = { 'time' => $time, 'sql' => $sql, 'db' => $db };
            } elsif ($line =~ m{ \A \s* (?: LOG | DETAIL | ERROR ) : \s\s }xms) {
                next;
            } else {
                next unless $last->{'time'};
                $last->{'sql'} .= ' ' . $line;
            }
        }
        analyze_line($data, $last) if $last->{'sql'};
        return $data;
    }

    sub analyze_line {
        my ($data, $item) = @_;
        return unless $item->{'db'} eq $ANALYZE_DB;
        my $sql = $item->{'sql'};
        $sql =~ s/^\s*//;
        $sql =~ s/\s*$//;
        $sql =~ s/\s+/ /g;
        $item->{'sql'} = $sql;

        my $std = normalize_line($sql);
        unless (defined($data->{$std})) {
            $data->{$std} = { 'std' => $std };
        }
        my $acc = $data->{$std};
        my $time = $item->{'time'};
        $acc->{'count'}++;
        $acc->{'time'} += $time;
        if (!defined($acc->{'min'}) or $time < $acc->{'min'}) {
            $acc->{'min'} = $time;
            $acc->{'min-sql'} = $sql;
        }
        if (!defined($acc->{'max'}) or $time > $acc->{'max'}) {
            $acc->{'max'} = $time;
            $acc->{'max-sql'} = $sql;
        }
    }

    sub normalize_line {
        my ($sql) = @_;
        $sql =~ s/^\s*//;
        $sql =~ s/\s*$//;
        $sql =~ s/\s+/ /g;
        my $std = lc $sql;
        $std =~ s/'[^']*'/?/g;
        $std =~ s/\b\d+\b/?/g;
        $std =~ s/ in \(\s*\?\s*(?:,\s*\?\s*)*\)/IN (?,,?)/gi;
        # note: the tail of this sub was eaten by the blog's html filter
        # (the < characters were parsed as tags); from what survived, it
        # normalized whitespace around operators – roughly:
        $std =~ s/\s*(\bI?LIKE\b|<=|>=|<>|!=|=|<|>|\+|-)\s*/ $1 /g;
        return $std;
    }

  4. Aug 27, 2007

    the problem is the amount of data. generating a datafile, sorting it, and reading the sorted output makes it possible to analyze practically any number of queries with very minimal memory requirements (it used around 30 megs for 14 gigs of logs).

    using a hashtable will use much more memory – of course not as much as the logs themselves, but more than in the write->sort->read mode.

    this is especially true in case one would want to add more details – for example – median.

  5. Aug 27, 2007

    as for your berkeley-db remark – true, it’s possible. yet i wanted this code to be as simple as possible. perhaps i overdid it πŸ™‚
    i’ll think about it for a while; if i see no problems with an in-memory list (i don’t want to tie) i’ll fix the code πŸ™‚

  6. # figther_from_brazil!
    Aug 28, 2007

    Dude your site is great! i’m always reading your stuffZ!

    COOL!
