dehne_gotz_min_spanning_tree.rst 17 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423
  1. .. Copyright (C) 2004-2008 The Trustees of Indiana University.
  2. Use, modification and distribution is subject to the Boost Software
  3. License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  4. http://www.boost.org/LICENSE_1_0.txt)
  5. ============================
  6. |Logo| Minimum Spanning Tree
  7. ============================
  8. The Parallel BGL contains four `minimum spanning tree`_ (MST)
  9. algorithms [DG98]_ for use on undirected, weighted, distributed
  10. graphs. The graphs need not be connected: each algorithm will compute
  11. a minimum spanning forest (MSF) when provided with a disconnected
  12. graph.
  13. The interface to each of the four algorithms is similar to the
  14. implementation of 'Kruskal's algorithm'_ in the sequential BGL. Each
  15. accepts, at a minimum, a graph, a weight map, and an output
  16. iterator. The edges of the MST (or MSF) will be output via the output
  17. iterator on process 0: other processes may receive edges on their
  18. output iterators, but the set may not be complete, depending on the
  19. algorithm. The algorithm parameters are documented together, because
  20. they do not vary greatly. See the section `Selecting an MST
  21. algorithm`_ for advice on algorithm selection.
  22. The graph itself must model the `Vertex List Graph`_ concept and the
  23. Distributed Edge List Graph concept. Since the most common
  24. distributed graph structure, the `distributed adjacency list`_, only
  25. models the Distributed Vertex List Graph concept, it may only be used
  26. with these algorithms when wrapped in a suitable adaptor, such as the
  27. `vertex_list_adaptor`_.
  28. .. contents::
  29. Where Defined
  30. -------------
  31. <``boost/graph/distributed/dehne_gotz_min_spanning_tree.hpp``>
  32. Parameters
  33. ----------
  34. IN: ``Graph& g``
  35. The graph type must be a model of `Vertex List Graph`_ and
  36. `Distributed Edge List Graph`_.
  37. IN/OUT: ``WeightMap weight``
  38. The weight map must be a `Distributed Property Map`_ and a `Readable
  39. Property Map`_ whose key type is the edge descriptor of the graph
  40. and whose value type is numerical.
  41. IN/OUT: ``OutputIterator out``
  42. The output iterator through which the edges of the MSF will be
  43. written. Must be capable of accepting edge descriptors for output.
  44. IN: ``VertexIndexMap index``
  45. A mapping from vertex descriptors to indices in the range *[0,
  46. num_vertices(g))*. This must be a `Readable Property Map`_ whose
  47. key type is a vertex descriptor and whose value type is an integral
  48. type, typically the ``vertices_size_type`` of the graph.
  49. **Default:** ``get(vertex_index, g)``
  50. IN/UTIL: ``RankMap rank_map``
  51. Stores the rank of each vertex, which is used for maintaining
  52. union-find data structures. This must be a `Read/Write Property Map`_
  53. whose key type is a vertex descriptor and whose value type is an
  54. integral type.
  55. **Default:** An ``iterator_property_map`` built from an STL vector
  56. of the ``vertices_size_type`` of the graph and the vertex index map.
  57. IN/UTIL: ``ParentMap parent_map``
  58. Stores the parent (representative) of each vertex, which is used for
  59. maintaining union-find data structures. This must be a `Read/Write
  60. Property Map`_ whose key type is a vertex descriptor and whose value
  61. type is also a vertex descriptor.
  62. **Default:** An ``iterator_property_map`` built from an STL vector
  63. of the ``vertex_descriptor`` of the graph and the vertex index map.
  64. IN/UTIL: ``SupervertexMap supervertex_map``
  65. Stores the supervertex index of each vertex, which is used for
  66. maintaining the supervertex list data structures. This must be a
  67. `Read/Write Property Map`_ whose key type is a vertex descriptor and
  68. whose value type is an integral type.
  69. **Default:** An ``iterator_property_map`` built from an STL vector
  70. of the ``vertices_size_type`` of the graph and the vertex index map.
  71. ``dense_boruvka_minimum_spanning_tree``
  72. ---------------------------------------
  73. ::
  74. namespace graph {
  75. template<typename Graph, typename WeightMap, typename OutputIterator,
  76. typename VertexIndexMap, typename RankMap, typename ParentMap,
  77. typename SupervertexMap>
  78. OutputIterator
  79. dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map,
  80. OutputIterator out,
  81. VertexIndexMap index,
  82. RankMap rank_map, ParentMap parent_map,
  83. SupervertexMap supervertex_map);
  84. template<typename Graph, typename WeightMap, typename OutputIterator,
  85. typename VertexIndex>
  86. OutputIterator
  87. dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map,
  88. OutputIterator out, VertexIndex index);
  89. template<typename Graph, typename WeightMap, typename OutputIterator>
  90. OutputIterator
  91. dense_boruvka_minimum_spanning_tree(const Graph& g, WeightMap weight_map,
  92. OutputIterator out);
  93. }
  94. Description
  95. ~~~~~~~~~~~
  96. The dense Boruvka distributed minimum spanning tree algorithm is a
  97. direct parallelization of the sequential MST algorithm by
  98. Boruvka. The algorithm first creates a *supervertex* out of each
  99. vertex. Then, in each iteration, it finds the smallest-weight edge
  100. incident to each vertex, collapses supervertices along these edges,
  101. and removals all self loops. The only difference between the
  102. sequential and parallel algorithms is that the parallel algorithm
  103. performs an all-reduce operation so that all processes have the
  104. global minimum set of edges.
  105. Unlike the other three algorithms, this algorithm emits the complete
  106. list of edges in the minimum spanning forest via the output iterator
  107. on all processes. It may therefore be more useful than the others
  108. when parallelizing sequential BGL programs.
  109. Complexity
  110. ~~~~~~~~~~
  111. The distributed algorithm requires *O(log n)* BSP supersteps, each of
  112. which requires *O(m/p + n)* time and *O(n)* communication per
  113. process.
  114. Performance
  115. ~~~~~~~~~~~
  116. The following charts illustrate the performance of this algorithm on
  117. various random graphs. We see that the algorithm scales well up to 64
  118. or 128 processors, depending on the type of graph, for dense
  119. graphs. However, for sparse graphs performance tapers off as the
  120. number of processors surpases *m/n*, i.e., the average degree (which
  121. is 30 for this graph). This behavior is expected from the algorithm.
  122. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=5
  123. :align: left
  124. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=5&speedup=1
  125. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=5
  126. :align: left
  127. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=5&speedup=1
  128. ``merge_local_minimum_spanning_trees``
  129. --------------------------------------
  130. ::
  131. namespace graph {
  132. template<typename Graph, typename WeightMap, typename OutputIterator,
  133. typename VertexIndexMap>
  134. OutputIterator
  135. merge_local_minimum_spanning_trees(const Graph& g, WeightMap weight,
  136. OutputIterator out,
  137. VertexIndexMap index);
  138. template<typename Graph, typename WeightMap, typename OutputIterator>
  139. inline OutputIterator
  140. merge_local_minimum_spanning_trees(const Graph& g, WeightMap weight,
  141. OutputIterator out);
  142. }
  143. Description
  144. ~~~~~~~~~~~
  145. The merging local MSTs algorithm operates by computing minimum
  146. spanning forests from the edges stored on each process. Then the
  147. processes merge their edge lists along a tree. The child nodes cease
  148. participating in the computation, but the parent nodes recompute MSFs
  149. from the newly acquired edges. In the final round, the root of the
  150. tree computes the global MSFs, having received candidate edges from
  151. every other process via the tree.
  152. Complexity
  153. ~~~~~~~~~~
  154. This algorithm requires *O(log_D p)* BSP supersteps (where *D* is the
  155. number of children in the tree, and is currently fixed at 3). Each
  156. superstep requires *O((m/p) log (m/p) + n)* time and *O(m/p)*
  157. communication per process.
  158. Performance
  159. ~~~~~~~~~~~
  160. The following charts illustrate the performance of this algorithm on
  161. various random graphs. The algorithm only scales well for very dense
  162. graphs, where most of the work is performed in the initial stage and
  163. there is very little work in the later stages.
  164. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=6
  165. :align: left
  166. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=6&speedup=1
  167. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=6
  168. :align: left
  169. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=6&speedup=1
  170. ``boruvka_then_merge``
  171. ----------------------
  172. ::
  173. namespace graph {
  174. template<typename Graph, typename WeightMap, typename OutputIterator,
  175. typename VertexIndexMap, typename RankMap, typename ParentMap,
  176. typename SupervertexMap>
  177. OutputIterator
  178. boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out,
  179. VertexIndexMap index, RankMap rank_map,
  180. ParentMap parent_map, SupervertexMap
  181. supervertex_map);
  182. template<typename Graph, typename WeightMap, typename OutputIterator,
  183. typename VertexIndexMap>
  184. inline OutputIterator
  185. boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out,
  186. VertexIndexMap index);
  187. template<typename Graph, typename WeightMap, typename OutputIterator>
  188. inline OutputIterator
  189. boruvka_then_merge(const Graph& g, WeightMap weight, OutputIterator out);
  190. }
  191. Description
  192. ~~~~~~~~~~~
  193. This algorithm applies both Boruvka steps and local MSF merging steps
  194. together to achieve better asymptotic performance than either
  195. algorithm alone. It first executes Boruvka steps until only *n/(log_d
  196. p)^2* supervertices remain, then completes the MSF computation by
  197. performing local MSF merging on the remaining edges and
  198. supervertices.
  199. Complexity
  200. ~~~~~~~~~~
  201. This algorithm requires *log_D p* + *log log_D p* BSP supersteps. The
  202. time required by each superstep depends on the type of superstep
  203. being performed; see the distributed Boruvka or merging local MSFS
  204. algorithms for details.
  205. Performance
  206. ~~~~~~~~~~~
  207. The following charts illustrate the performance of this algorithm on
  208. various random graphs. We see that the algorithm scales well up to 64
  209. or 128 processors, depending on the type of graph, for dense
  210. graphs. However, for sparse graphs performance tapers off as the
  211. number of processors surpases *m/n*, i.e., the average degree (which
  212. is 30 for this graph). This behavior is expected from the algorithm.
  213. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=7
  214. :align: left
  215. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=7&speedup=1
  216. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=7
  217. :align: left
  218. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=7&speedup=1
  219. ``boruvka_mixed_merge``
  220. -----------------------
  221. ::
  222. namespace {
  223. template<typename Graph, typename WeightMap, typename OutputIterator,
  224. typename VertexIndexMap, typename RankMap, typename ParentMap,
  225. typename SupervertexMap>
  226. OutputIterator
  227. boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out,
  228. VertexIndexMap index, RankMap rank_map,
  229. ParentMap parent_map, SupervertexMap
  230. supervertex_map);
  231. template<typename Graph, typename WeightMap, typename OutputIterator,
  232. typename VertexIndexMap>
  233. inline OutputIterator
  234. boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out,
  235. VertexIndexMap index);
  236. template<typename Graph, typename WeightMap, typename OutputIterator>
  237. inline OutputIterator
  238. boruvka_mixed_merge(const Graph& g, WeightMap weight, OutputIterator out);
  239. }
  240. Description
  241. ~~~~~~~~~~~
  242. This algorithm applies both Boruvka steps and local MSF merging steps
  243. together to achieve better asymptotic performance than either method
  244. alone. In each iteration, the algorithm first performs a Boruvka step
  245. and then merges the local MSFs computed based on the supervertex
  246. graph.
  247. Complexity
  248. ~~~~~~~~~~
  249. This algorithm requires *log_D p* BSP supersteps. The
  250. time required by each superstep depends on the type of superstep
  251. being performed; see the distributed Boruvka or merging local MSFS
  252. algorithms for details. However, the algorithm is
  253. communication-optional (requiring *O(n)* communication overall) when
  254. the graph is sufficiently dense, i.e., *m/n >= p*.
  255. Performance
  256. ~~~~~~~~~~~
  257. The following charts illustrate the performance of this algorithm on
  258. various random graphs. We see that the algorithm scales well up to 64
  259. or 128 processors, depending on the type of graph, for dense
  260. graphs. However, for sparse graphs performance tapers off as the
  261. number of processors surpases *m/n*, i.e., the average degree (which
  262. is 30 for this graph). This behavior is expected from the algorithm.
  263. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=8
  264. :align: left
  265. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeSparse&columns=8&speedup=1
  266. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=8
  267. :align: left
  268. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER,SF,SW&dataset=TimeDense&columns=8&speedup=1
  269. Selecting an MST algorithm
  270. --------------------------
  271. Dehne and Gotz reported [DG98]_ mixed results when evaluating these
  272. four algorithms. No particular algorithm was clearly better than the
  273. others in all cases. However, the asymptotically best algorithm
  274. (``boruvka_mixed_merge``) seemed to perform more poorly in their tests
  275. than the other merging-based algorithms. The following performance
  276. charts illustrate the performance of these four minimum spanning tree
  277. implementations.
  278. Overall, ``dense_boruvka_minimum_spanning_tree`` gives the most
  279. consistent performance and scalability for the graphs we
  280. tested. Additionally, it may be more suitable for sequential programs
  281. that are being parallelized, because it emits complete MSF edge lists
  282. via the output iterators in every process.
  283. Performance on Sparse Graphs
  284. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  285. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeSparse&columns=5,6,7,8
  286. :align: left
  287. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeSparse&columns=5,6,7,8&speedup=1
  288. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeSparse&columns=5,6,7,8
  289. :align: left
  290. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeSparse&columns=5,6,7,8&speedup=1
  291. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeSparse&columns=5,6,7,8
  292. :align: left
  293. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeSparse&columns=5,6,7,8&speedup=1
  294. Performance on Dense Graphs
  295. ~~~~~~~~~~~~~~~~~~~~~~~~~~~
  296. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeDense&columns=5,6,7,8
  297. :align: left
  298. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=ER&dataset=TimeDense&columns=5,6,7,8&speedup=1
  299. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeDense&columns=5,6,7,8
  300. :align: left
  301. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SF&dataset=TimeDense&columns=5,6,7,8&speedup=1
  302. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeDense&columns=5,6,7,8
  303. :align: left
  304. .. image:: http://www.osl.iu.edu/research/pbgl/performance/chart.php?generator=SW&dataset=TimeDense&columns=5,6,7,8&speedup=1
  305. -----------------------------------------------------------------------------
  306. Copyright (C) 2004 The Trustees of Indiana University.
  307. Authors: Douglas Gregor and Andrew Lumsdaine
  308. .. |Logo| image:: pbgl-logo.png
  309. :align: middle
  310. :alt: Parallel BGL
  311. :target: http://www.osl.iu.edu/research/pbgl
  312. .. _minimum spanning tree: http://www.boost.org/libs/graph/doc/graph_theory_review.html#sec:minimum-spanning-tree
  313. .. _Kruskal's algorithm: http://www.boost.org/libs/graph/doc/kruskal_min_spanning_tree.html
  314. .. _Vertex list graph: http://www.boost.org/libs/graph/doc/VertexListGraph.html
  315. .. _distributed adjacency list: distributed_adjacency_list.html
  316. .. _vertex_list_adaptor: vertex_list_adaptor.html
  317. .. _Distributed Edge List Graph: DistributedEdgeListGraph.html
  318. .. _Distributed property map: distributed_property_map.html
  319. .. _Readable Property Map: http://www.boost.org/libs/property_map/ReadablePropertyMap.html
  320. .. _Read/Write Property Map: http://www.boost.org/libs/property_map/ReadWritePropertyMap.html
  321. .. [DG98] Frank Dehne and Silvia Gotz. *Practical Parallel Algorithms
  322. for Minimum Spanning Trees*. In Symposium on Reliable Distributed Systems,
  323. pages 366--371, 1998.