February 13th, 2012 by depesz | Tags: analyze, explain, overhead, pg92, postgresql, timing | 2 comments »
On 7th of February, Robert Haas committed patch:
Sometimes it may be useful to get actual row counts out of EXPLAIN (ANALYZE) without paying the cost of timing every node entry/exit. With this patch, you can say EXPLAIN (ANALYZE, TIMING OFF) to get that. Tomas Vondra, reviewed by Eric Theise, with minor doc changes by me. |
The commit log says it all – we got new option to EXPLAIN.
But how much does it exactly help?
Let's see on a very specific case:
$ explain (analyze on) select ( select 1 where -1 < z ) from generate_series(1,10000000) z; QUERY PLAN ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── Function Scan on generate_series z (cost=0.00..22.50 rows=1000 width=4) (actual time=1091.010..6050.521 rows=10000000 loops=1) SubPlan 1 -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=10000000) One-Time Filter: ((-1) < z.z) Total runtime: 6443.722 ms (5 rows) |
and the same query – which is a very tight loop – without timing:
$ explain (analyze on, timing off) select ( select 1 where -1 < z ) from generate_series(1,10000000) z; QUERY PLAN ───────────────────────────────────────────────────────────────────────────────────────────────────────── Function Scan on generate_series z (cost=0.00..22.50 rows=1000 width=4) (actual rows=10000000 loops=1) SubPlan 1 -> Result (cost=0.00..0.01 rows=1 width=0) (actual rows=1 loops=10000000) One-Time Filter: ((-1) < z.z) Total runtime: 6083.863 ms (5 rows) |
The difference is not huge, but that could be because I'm running on bare metal. I heard that getting time on some virtualized systems is much more expensive, so such option, that would let you skip the timing, would be pretty helpful.
Personally – I don't use virtualization, but I know that many people do, so this could be a big win in such cases. The only problem is that explain.depesz.com doesn't (yet) parse such plans. But I'll look into it.

2 comments
Just a follow-on. There are other cases where timing information is expensive to get:
* Older multi-CPU servers didn’t keep the TSC (timestamp counter) synchronized between CPU sockets, so it was unstable for direct user space access
* Older CPUs also changed the TSC frequency with power management changes, making it unstable
* Linux bypasses timing syscalls and reads time directly in user space when TSC timing is stable, but other OSes (*BSDs for example) don’t have this optimization
If the TSC is unstable, kernels have to revert to slower hardware timing sources. And it makes a *huge* difference. You can try for yourself on Linux with the command:
echo hpet > /sys/devices/system/clocksource/clocksource0/current_clocksource
TL;DR: on my computer, your example takes:
* 16 seconds, with timing off
* 18 seconds, timing with tsc clocksource
* 75 seconds, timing with hpet clocksource
@Marti:
thanks for very informative comment.