
[library Boost.MPI
    [quickbook 1.6]
    [authors [Gregor, Douglas], [Troyer, Matthias] ]
    [copyright 2005 2006 2007 Douglas Gregor, Matthias Troyer, Trustees of Indiana University]
    [id mpi]
    [license
        Distributed under the Boost Software License, Version 1.0.
        (See accompanying file LICENSE_1_0.txt or copy at
        <ulink url="http://www.boost.org/LICENSE_1_0.txt">
            http://www.boost.org/LICENSE_1_0.txt
        </ulink>)
    ]
]

[/ Links ]
[def _MPI_ [@http://www-unix.mcs.anl.gov/mpi/ MPI]]
[def _MPI_implementations_
 [@http://www-unix.mcs.anl.gov/mpi/implementations.html MPI implementations]]
[def _Serialization_ [@boost:/libs/serialization/doc Boost.Serialization]]
[def _BoostPython_ [@http://www.boost.org/libs/python/doc Boost.Python]]
[def _Python_ [@http://www.python.org Python]]
[def _MPICH_ [@http://www-unix.mcs.anl.gov/mpi/mpich/ MPICH2]]
[def _OpenMPI_ [@http://www.open-mpi.org OpenMPI]]
[def _IntelMPI_ [@https://software.intel.com/en-us/intel-mpi-library Intel MPI]]
[def _accumulate_ [@http://www.sgi.com/tech/stl/accumulate.html `accumulate`]]

[include introduction.qbk]
[include getting_started.qbk]
[include tutorial.qbk]
[include c_mapping.qbk]
[xinclude mpi_autodoc.xml]
[include python.qbk]

[section:design Design Philosophy]

The design philosophy of the Parallel MPI library is very simple: be
both convenient and efficient. MPI is a library built for
high-performance applications, but its FORTRAN-centric,
performance-minded design makes it rather inflexible from the C++
point of view: passing a string from one process to another is
inconvenient, requiring several messages and explicit buffering;
passing a container of strings from one process to another requires
an extra level of manual bookkeeping; and passing a map from strings
to containers of strings is positively infuriating. The Parallel MPI
library allows all of these data types to be passed using the same
simple `send()` and `recv()` primitives. Likewise, collective
operations such as [funcref boost::mpi::reduce `reduce()`] allow
arbitrary data types and function objects, much as the C++ Standard
Library does.
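
As a concrete illustration (a minimal sketch, not code taken from the
library's examples), the following program passes a map from strings to
vectors of strings between two processes with a single `send()`/`recv()`
pair; it assumes the job is launched with at least two processes.

    #include <boost/mpi.hpp>
    #include <boost/serialization/map.hpp>
    #include <boost/serialization/string.hpp>
    #include <boost/serialization/vector.hpp>
    #include <map>
    #include <string>
    #include <vector>

    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      // The Boost.Serialization headers above teach Boost.MPI how to
      // marshal the standard containers; the send/recv calls themselves
      // look the same as they would for a plain int.
      std::map<std::string, std::vector<std::string> > index;
      if (world.rank() == 0) {
        index["fruit"].push_back("apple");
        index["fruit"].push_back("pear");
        world.send(1, 0, index);   // destination rank 1, message tag 0
      } else if (world.rank() == 1) {
        world.recv(0, 0, index);   // source rank 0, message tag 0
      }
      return 0;
    }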

The higher-level abstractions provided for convenience must not have
an impact on the performance of the application. For instance, sending
an integer via `send` must be as efficient as a call to `MPI_Send`,
which means that it must be implemented by a simple call to
`MPI_Send`; likewise, an integer [funcref boost::mpi::reduce
`reduce()`] using `std::plus<int>` must be implemented with a call to
`MPI_Reduce` on integers using the `MPI_SUM` operation: anything less
will impact performance. In essence, this is the "don't pay for what
you don't use" principle: if the user is not transmitting strings,
they should not pay the overhead associated with strings.
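
For instance, the `reduce()` call in the sketch below operates on `int`
with `std::plus<int>`, exactly the case that the paragraph above says
must map onto `MPI_Reduce` with `MPI_SUM` (a minimal example, not
benchmark code):

    #include <boost/mpi.hpp>
    #include <functional>
    #include <iostream>

    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      int value = world.rank();
      int sum = 0;

      // int combined with std::plus<int> is the fast path described
      // above: no serialization, just the underlying MPI reduction.
      mpi::reduce(world, value, sum, std::plus<int>(), 0);

      if (world.rank() == 0)
        std::cout << "sum of ranks: " << sum << std::endl;
      return 0;
    }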

Sometimes, achieving maximal performance means foregoing convenient
abstractions and implementing certain functionality using lower-level
primitives. For this reason, it is always possible to extract enough
information from the abstractions in Boost.MPI to minimize
the amount of effort required to interface between Boost.MPI
and the C MPI library.
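
One such escape hatch, shown here as a brief sketch rather than a
prescribed pattern: a [classref boost::mpi::communicator communicator]
converts implicitly to the underlying `MPI_Comm`, so raw C MPI calls can
be mixed into Boost.MPI code wherever a lower-level primitive is needed.

    #include <boost/mpi.hpp>
    #include <mpi.h>

    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      // communicator converts implicitly to MPI_Comm, so the C API can
      // be called directly with the Boost.MPI object.
      int rank = 0;
      MPI_Comm_rank(world, &rank);   // same result as world.rank()
      MPI_Barrier(world);
      return 0;
    }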

[endsect]

[section:performance Performance Evaluation]

Message-passing performance is crucial in high-performance distributed
computing. To evaluate the performance of Boost.MPI, we modified the
standard [@http://www.scl.ameslab.gov/netpipe/ NetPIPE] benchmark
(version 3.6.2) to use Boost.MPI and compared its performance against
raw MPI. We ran five different variants of the NetPIPE benchmark:

# MPI: The unmodified NetPIPE benchmark.

# Boost.MPI: NetPIPE modified to use Boost.MPI calls for
  communication.

# MPI (Datatypes): NetPIPE modified to use a derived datatype (which
  itself contains a single `MPI_BYTE`) rather than a fundamental
  datatype.

# Boost.MPI (Datatypes): NetPIPE modified to use a user-defined type
  `Char` in place of the fundamental `char` type. The `Char` type
  (a sketch of such a type appears after this list) contains a single
  `char`, a `serialize()` method to make it serializable, and
  specializes [classref boost::mpi::is_mpi_datatype is_mpi_datatype]
  to force Boost.MPI to build a derived MPI data type for it.

# Boost.MPI (Serialized): NetPIPE modified to use a user-defined type
  `Char` in place of the fundamental `char` type. This `Char` type
  contains a single `char` and is serializable. Unlike the Datatypes
  case, [classref boost::mpi::is_mpi_datatype is_mpi_datatype] is
  *not* specialized, forcing Boost.MPI to perform many, many
  serialization calls.
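
For reference, here is a minimal sketch of the kind of `Char` type
described in the last two variants (the benchmark's actual definition
may differ in detail):

    #include <boost/mpi/datatype.hpp>
    #include <boost/mpl/bool.hpp>

    // A user-defined wrapper around a single char, made serializable by
    // providing a serialize() member.
    struct Char
    {
      char value;

      template<typename Archive>
      void serialize(Archive& ar, const unsigned int /*version*/)
      {
        ar & value;
      }
    };

    // Present only in the "Datatypes" variant: marking Char as an MPI
    // datatype lets Boost.MPI build a derived MPI data type for it. The
    // "Serialized" variant omits this specialization, so every Char is
    // serialized individually.
    namespace boost { namespace mpi {
      template<>
      struct is_mpi_datatype<Char> : mpl::true_ { };
    } }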

The actual tests were performed on the Odin cluster in the
[@http://www.cs.indiana.edu/ Department of Computer Science] at
[@http://www.iub.edu Indiana University], which contains 128 nodes
connected via Infiniband. Each node contains 4GB memory and two AMD
Opteron processors. The NetPIPE benchmarks were compiled with Intel's
C++ Compiler, version 9.0, Boost 1.35.0 (prerelease), and
[@http://www.open-mpi.org/ Open MPI] version 1.1. The NetPIPE results
follow:

[$../../libs/mpi/doc/netpipe.png]

There are several observations we can make about these NetPIPE
results. First of all, the top two plots show that Boost.MPI performs
on par with MPI for fundamental types. The next two plots show that
Boost.MPI performs on par with MPI for derived data types, even though
Boost.MPI provides a much more abstract, completely transparent
approach to building derived data types than raw MPI. Overall
performance for derived data types is significantly worse than for
fundamental data types, but the bottleneck is in the underlying MPI
implementation itself. Finally, when forcing Boost.MPI to serialize
characters individually, performance suffers greatly. This particular
instance is the worst possible case for Boost.MPI, because we are
serializing millions of individual characters. Overall, the
additional abstraction provided by Boost.MPI does not impair its
performance.

[endsect]

[section:history Revision History]

* *Boost 1.36.0*:
  * Support for non-blocking operations in Python, from Andreas Klöckner

* *Boost 1.35.0*: Initial release, containing the following post-review changes
  * Support for arrays in all collective operations
  * Support default-construction of [classref boost::mpi::environment environment]

* *2006-09-21*: Boost.MPI accepted into Boost.

[endsect:history]

[section:acknowledge Acknowledgments]

Boost.MPI was developed with support from Zurcher Kantonalbank. Daniel
Egloff and Michael Gauckler contributed many ideas to Boost.MPI's
design, particularly in the design of its abstractions for
MPI data types and the novel skeleton/context mechanism for large data
structures. Prabhanjan (Anju) Kambadur developed the predecessor to
Boost.MPI that proved the usefulness of the Serialization library in
an MPI setting and the performance benefits of specialization in a C++
abstraction layer for MPI. Jeremy Siek managed the formal review of Boost.MPI.

[endsect:acknowledge]