There are many places that discuss this question, but none fully explain it; all leave it ambiguous.
From what I've been able to gather, the following factors are at play:
- The two times T0 and T1 included in the "(actual time=<T0>..<T1> rows=<R> loops=<L>)" output.
- Total number of threads: one "leader" plus any number of concurrent (parallel) "workers". If there are two workers, that means 1+2=3 threads.
- Number of "loops" (L in the above example; "executions", iterations).
I'll include the actual documentation example:
EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=4.65..118.62 rows=10 width=488) (actual time=0.128..0.377 rows=10 loops=1)
-> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244) (actual time=0.057..0.121 rows=10 loops=1)
Recheck Cond: (unique1 < 10)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.024..0.024 rows=10 loops=1)
Index Cond: (unique1 < 10)
-> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10)
Index Cond: (unique2 = t1.unique2)
Planning time: 0.181 ms
Execution time: 0.501 ms
and the official explanation:
Note that the “actual time” values are in milliseconds of real time, ... <unrelated text about "cost" removed for brevity>.
In some query plans, it is possible for a subplan node to be executed more than once. For example, the inner index scan will be executed once per outer row in the above nested-loop plan. In such cases, the loops value reports the total number of executions of the node, and the actual time and rows values shown are averages per-execution. This is done to make the numbers comparable with the way that the cost estimates are shown. Multiply by the loops value to get the total time actually spent in the node. In the above example, we spent a total of 0.220 milliseconds executing the index scans on tenk2.
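The multiplication the documentation describes can be checked with a few lines of arithmetic (a sketch; all figures are taken from the tenk2 index scan line in the plan above):

```python
# Per-loop figures from the tenk2 inner index scan:
# (actual time=0.021..0.022 rows=1 loops=10)
startup_ms = 0.021   # average time to the first row, per execution
total_ms = 0.022     # average time to completion, per execution
loops = 10

# Total time spent in the node = per-execution total * number of executions
node_total_ms = round(total_ms * loops, 3)
print(node_total_ms)  # 0.22 -> the 0.220 ms the documentation mentions

# Rows are averaged the same way: rows=1 per loop, 10 loops -> 10 rows in total
rows_total = 1 * loops
print(rows_total)  # 10
```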
When there is only one thread and loops=1, things appear almost clear:
- T0 is effectively the clock start time of the "node".
- T1 is effectively the clock completion time of the "node".
... and a nice execution time diagram could be made. However, this does not add up in the given example. T1 of the root node above is 0.377 ms, yet the execution time indicated at the end is 0.501 ms, which is 0.124 ms longer. That is less than the planning time (0.181 ms), so it can't be included there. It is also less than T0 = 0.128 ms, though T0 + T1 does come close: 0.505 ms, a mere 0.004 ms longer. This could be explained by some sort of time-accounting errors, but it would imply that T1 is not the clock "completion time" but the "raw execution" time, and that the completion time is T0 + T1... and that fails to match the observed reality. I have a case where T1 does match the end time and T0 is only slightly less than T1, so T0 + T1 would add up to almost double the total time spent... for queries that run much longer (and would not be prone to accounting errors). In other words, in all cases I observed, T1 is inclusive of T0 and T1 is always greater than T0.
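The discrepancies described above can be reproduced from the example plan's numbers (a quick arithmetic check, nothing more):

```python
# All figures come from the example plan above (milliseconds).
t0_root, t1_root = 0.128, 0.377   # root Nested Loop's actual time
planning_ms = 0.181
execution_ms = 0.501

# Time unaccounted for past the root node's T1
gap_ms = round(execution_ms - t1_root, 3)
print(gap_ms)  # 0.124 -> less than the planning time, and less than T0

# The "completion time is T0 + T1" hypothesis
print(round(t0_root + t1_root, 3))  # 0.505 -> only 0.004 ms more than 0.501
```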
Then we come to "loops" (still assuming a single thread, for now). In the example the Index Scan on tenk2 states:
(actual time=0.021..0.022 rows=1 loops=10)
... and the documentation states that this takes T1 * L = 0.022 ms/loop * 10 loops = 0.220 ms in total. That seems to "adjust" the definition of T1 to:
- T1 is the average total time taken by one execution.
But what do loops/executions mean for T0? Does T1 continue to be inclusive of T0? I can see the following possibilities:
- T0 is the group clock time before loops begin. The raw execution time is T1 * L - T0 = 0.022 ms/loop * 10 loops - 0.021 ms = 0.220 ms - 0.021 ms = 0.199 ms.
- T0 is not the group clock start time but the per-iteration initialization time. The raw execution time is (T1 - T0) * L = (0.022 ms/loop - 0.021 ms/loop) * 10 loops = 0.001 ms/loop * 10 loops = 0.010 ms.
- I was wrong, T1 no longer includes T0, and T0 is per loop. The raw execution time is simply T1 * L now (0.220 ms), and that occurs in addition to T0 * L = 0.210 ms, for a combined total of (T0 + T1) * L = 0.430 ms. This can't be, because the root node completed earlier than 0.430 ms... or maybe not... as execution time took 0.501 ms?
- As above, but T0 is not per loop. The total time is T0 + (T1 * L) = 0.021 ms + 0.220 ms = 0.241 ms. That's longer than the root node's T0... but perhaps these can overlap even when not concurrent, by means of interleaving - streaming data processing within a single thread.
Then we come to threads (extra workers). They make things even more "interesting": whatever we multiply by loops (may) now need to be divided by the number of threads, but this works out differently depending on the exact meanings of T0 and T1. When the number of threads equals the number of loops, things seem to cancel out, but given the ambiguity above I'm done with forensic analysis and came here...
Can someone, please, explain what's going on here? What exactly are T0 and T1 and how are they affected by loops and threads?
This is the meaning of the two times reported by EXPLAIN (ANALYZE) as "actual time":

- the first number is the time from the beginning of the statement execution until this execution plan node returned the first result row
- the second number is the time from the beginning of the statement execution until the execution plan node was done

All times are in milliseconds.

If "loops" of an execution plan step is greater than one, there can be two cases:

- The "loops" indicate repeated execution in a single thread, like on the inner side of a nested loop join. In that case, you have to multiply the second number by the "loops" value to get the total time spent executing this execution plan step.
- The "loops" indicate parallel execution in parallel worker processes (look at the Gather node to find the actual number of workers). In that case, these processes execute in parallel, so you don't have to multiply, and the second number is the average time it took to complete that step.

It is easy to tell how many parallel workers there are from the "Workers Launched" of the Gather node that is higher up in the execution plan tree. The two cases above can also occur in combination, if several parallel workers each execute a step several times.
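The two cases can be condensed into a small helper (a sketch only; the function name is my own, and the "divide by leader + workers" rule for the combined case is my reading of the answer, not something PostgreSQL itself reports):

```python
def node_total_time_ms(per_loop_total_ms, loops, parallel_workers=0):
    """Rough total time spent in a plan node.

    per_loop_total_ms: the second "actual time" number (average per execution)
    loops:             the reported "loops" value
    parallel_workers:  "Workers Launched" from the Gather node, 0 if not parallel
    """
    if parallel_workers == 0:
        # Case 1: repeated execution in a single thread -> multiply by loops
        return per_loop_total_ms * loops
    # Case 2 (and the combined case): the loops are spread across
    # concurrently running processes (leader + workers), so the elapsed
    # time is the per-loop average times the loops each process ran.
    processes = parallel_workers + 1
    return per_loop_total_ms * loops / processes

# The tenk2 inner index scan: 0.022 ms/loop, 10 loops, no parallelism
print(round(node_total_time_ms(0.022, 10), 3))  # 0.22

# Purely parallel case: loops == processes, so nothing gets multiplied
print(node_total_time_ms(1.0, 3, parallel_workers=2))  # 1.0
```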