better explain analyze

everybody knows explain analyze.

it's well known, well understood tool.

the problem i have with it is that i don't see (at a glance) what's really eating the time in the query.

for example, there is a query (sorry, i had to hide table and column names):

# EXPLAIN analyze
FROM "t1" , "t2"
WHERE = t2.f6
    AND t2.f1 IN (SELECT f2 FROM t3 WHERE f7 = 51)
    AND ("t1"."f3" = 1
    AND "t1"."f4" = 4)
ORDER BY "t1"."f5" DESC LIMIT 3;
                                                                                     QUERY PLAN
 LIMIT  (cost=0.00..2518.33 ROWS=3 width=305) (actual TIME=4902.395..4902.425 ROWS=3 loops=1)
   ->  Nested Loop IN JOIN  (cost=0.00..91499.17 ROWS=109 width=305) (actual TIME=4902.393..4902.421 ROWS=3 loops=1)
         ->  Nested Loop  (cost=0.00..73809.64 ROWS=122223 width=309) (actual TIME=0.199..4451.765 ROWS=112264 loops=1)
               ->  INDEX Scan Backward USING t1_f5 ON t1  (cost=0.00..33453.27 ROWS=122225 width=305) (actual TIME=0.107..1105.096 ROWS=112264 loops=1)
                     FILTER: ((f3 = 1) AND (f4 = 4))
               ->  INDEX Scan USING t2_f6_key ON t2  (cost=0.00..0.32 ROWS=1 width=8) (actual TIME=0.027..0.028 ROWS=1 loops=112264)
                     INDEX Cond: ( = t2.f6)
         ->  INDEX Scan USING t3_f7_key ON t3  (cost=0.00..0.28 ROWS=1 width=4) (actual TIME=0.003..0.003 ROWS=0 loops=112264)
               INDEX Cond: ((t3.f7 = 51) AND (t2.f1 = t3.f2))
 Total runtime: 4902.582 ms

of course – finding where the time is really spent is quite simple. but not as simple as it could be.

so, let's write a small program:

#!/usr/bin/perl -p
my $string = " "x10;
if (/actual time=\d+\.\d*\.\.(\d+\.\d*) rows=\d+ loops=(\d+)/) {
    my $time = $1 * $2;
    $string = sprintf("%10.1f", $time);
$_ = "$string $_"

(i know. it's not pretty, but it serves it's purpose).

now – let's save the program as “/tmp/", and make it executable.

back to psql.

we will use \o switch:

\o |/tmp/

and not let's redo the explain. output now, looks like this:

                                                                                                QUERY PLAN
    4902.4  Limit  (cost=0.00..2518.33 rows=3 width=305) (actual time=4902.395..4902.425 rows=3 loops=1)
    4902.4    ->  Nested Loop IN Join  (cost=0.00..91499.17 rows=109 width=305) (actual time=4902.393..4902.421 rows=3 loops=1)
    4451.8          ->  Nested Loop  (cost=0.00..73809.64 rows=122223 width=309) (actual time=0.199..4451.765 rows=112264 loops=1)
    1105.1                ->  Index Scan Backward using t1_f5 on t1  (cost=0.00..33453.27 rows=122225 width=305) (actual time=0.107..1105.096 rows=112264 loops=1)
                                Filter: ((f3 = 1) AND (f4 = 4))
    3143.4                ->  Index Scan using t2_f6_key on t2  (cost=0.00..0.32 rows=1 width=8) (actual time=0.027..0.028 rows=1 loops=112264)
                                Index Cond: ( = t2.f6)
     336.8          ->  Index Scan using t3_f7_key on t3  (cost=0.00..0.28 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=112264)
                          Index Cond: ((t3.f7 = 51) AND (t2.f1 = t3.f2))
            Total runtime: 4902.582 ms

isn't it more readable? at the very least – for me, it's better – at the first glance i see total numbers 🙂

hope you'll like it as well.

10 thoughts on “better explain analyze”

  1. You and I are thinking along the same lines. I’ve also been working to make EXPLAIN output easier to read (at least for me). Thanks for sharing! Hopefully I’ll have something to share soon.

  2. I think explain should get an option to spit out structured data and/or dump into a table. the later is what oracle abd db2 do, but both of them have SQL support for recursive queries. There has been dicussion of this in the hackers list, not sure if a decision was made yet.

  3. That is actually being discussed for 8.4. We’ll see if it happens, but at least it’s being discussed 🙂

  4. Heh. Sometimes some solutions can be so simple. Thank you for the perl code and the tip. I hated using explain analyze until now. This will save a lot of time for pgadmins and makes it easier on db newbies

  5. Why does it seem that the first and the second numbers dont actually tally with the actual (max) times ? (336 and 3143). Since the other numbers match with the actual time (max) I reckon either the script screwed up in recognizing the actual (max) time or may be I am missing something…

    Thanks for the script anyway 🙂


  6. @Robins:
    the numbers 336 and 3143 are result of multiplication: maxtime * loops – as this is the real full time used by this entry.

    that was the whole purpose of writing the script – to do the multiplication.

    adding numbers at the begin of the line is just an addon.

  7. I saw what you were trying to accomplish here with parsing the output of EXPLAIN and agree fully that psql needs more functionality in this arena. I’ve begun creating a tool that takes care of all this stuff for you. You can find it at . So far it has TokenParsing and DOM-structure. Check out the README for more info.

  8. The tool mentioned is now available on CPAN as PSQL::Query, it suggests a tool called psql-plus the syntax to do what you’ve done would be:

    \o|psql-plus –printf “%at %s” –printf-info “%s”

  9. This tool definitely makes the explain-analyze output a lot more readable, especially to someone who is used to working with code profiling tools (which this output was clearly based on).

    I have a question though: Has Pg not fixed this issue? This script/site was created 9 years ago! While looking for a profiler for Pg online, I only see posts by people like myself, looking for something better.

    MySQL has “SHOW PROFILES” which is exactly what I want when I’m doing query tuning. It not only clearly shows the query detail, it shows things like IO.

Comments are closed.