Estimating Flow Graph Performance#
The performance or scalability of a flow graph is not easy to predict. However there are a few key points that can guide you in estimating the limits on performance and speedup of some graphs.
The Critical Path Limits the Scalability in a Dependence Graph
A critical path is the most time consuming path from a node with no predecessors to a node with no successors. In a dependence graph, the execution of the nodes along a path cannot be overlapped since they have a strict ordering. Therefore, for a dependence graph, the critical path limits scalability.
More formally, let T be the total time consumed by all of the nodes in your graph if executed sequentially. Then let C be the time consumed along the path that takes the most time. The nodes along this path cannot be overlapped even in a parallel execution. Therefore, even if all other paths are executed in parallel with C, the wall clock time for the parallel execution is at least C, and the maximum possible speedup (ignoring microarchitectural and memory effects) is T/C.
There is Overhead in Spawning a Node’s Body as a Task
The bodies of input_nodes
, function_nodes
, continue_nodes
and
multifunction_nodes
execute within spawned tasks by default. This
means that you need to take into account the overhead of task
scheduling when estimating the time it takes for a node to execute
its body. All of the rules of thumb for determining the appropriate
granularity of tasks therefore also apply to node bodies as well. If
you have many fine-grained nodes in your flow graph, the impact of
these overheads can noticeably impact your performance. However,
depending on the graph structure, you can reduce such overheads by
using lightweight policy with these nodes.