[library Boost.MPI
    [quickbook 1.6]
    [authors [Gregor, Douglas], [Troyer, Matthias] ]
    [copyright 2005 2006 2007 Douglas Gregor, Matthias Troyer, Trustees of Indiana University]
    [id mpi]
    [license
        Distributed under the Boost Software License, Version 1.0.
        (See accompanying file LICENSE_1_0.txt or copy at
        <ulink url="http://www.boost.org/LICENSE_1_0.txt">
            http://www.boost.org/LICENSE_1_0.txt
        </ulink>)
    ]
]
[/ Links ]

[def _MPI_ [@http://www-unix.mcs.anl.gov/mpi/ MPI]]
[def _MPI_implementations_
  [@http://www-unix.mcs.anl.gov/mpi/implementations.html
   MPI implementations]]
[def _Serialization_ [@boost:/libs/serialization/doc
  Boost.Serialization]]
[def _BoostPython_ [@http://www.boost.org/libs/python/doc
  Boost.Python]]
[def _Python_ [@http://www.python.org Python]]
[def _MPICH_ [@http://www-unix.mcs.anl.gov/mpi/mpich/ MPICH2]]
[def _OpenMPI_ [@http://www.open-mpi.org Open MPI]]
[def _IntelMPI_ [@https://software.intel.com/en-us/intel-mpi-library Intel MPI]]
[def _accumulate_ [@http://www.sgi.com/tech/stl/accumulate.html
  `accumulate`]]
[include introduction.qbk]
[include getting_started.qbk]
[include tutorial.qbk]
[include c_mapping.qbk]
[xinclude mpi_autodoc.xml]
[include python.qbk]
[section:design Design Philosophy]

The design philosophy of Boost.MPI is very simple: be both
convenient and efficient. MPI is a library built for
high-performance applications, but its Fortran-centric,
performance-minded design makes it rather inflexible from the C++
point of view: passing a string from one process to another is
inconvenient, requiring several messages and explicit buffering;
passing a container of strings from one process to another requires
an extra level of manual bookkeeping; and passing a map from strings
to containers of strings is positively infuriating. Boost.MPI
allows all of these data types to be passed using the same
simple `send()` and `recv()` primitives. Likewise, collective
operations such as [funcref boost::mpi::reduce `reduce()`]
accept arbitrary data types and function objects, much as the C++
Standard Library would.
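
For example, the following minimal program sends a map of string
vectors from one process to another with a single `send()`/`recv()`
pair. (The ranks, message tag, and payload here are illustrative
choices, not part of the library's interface.)

    #include <boost/mpi.hpp>
    #include <boost/serialization/map.hpp>
    #include <boost/serialization/string.hpp>
    #include <boost/serialization/vector.hpp>
    #include <map>
    #include <string>
    #include <vector>

    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      if (world.rank() == 0) {
        // A nested data structure; including the Boost.Serialization
        // headers above is all that is needed to make it transmissible.
        std::map<std::string, std::vector<std::string> > books;
        books["Gregor"].push_back("Boost.MPI documentation");
        world.send(1, 0, books);   // destination rank 1, tag 0
      } else if (world.rank() == 1) {
        std::map<std::string, std::vector<std::string> > books;
        world.recv(0, 0, books);   // source rank 0, tag 0
      }
      return 0;
    }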

The higher-level abstractions provided for convenience must not have
an impact on the performance of the application. For instance, sending
an integer via `send` must be as efficient as a call to `MPI_Send`,
which means that it must be implemented by a simple call to
`MPI_Send`; likewise, an integer [funcref boost::mpi::reduce
`reduce()`] using `std::plus<int>` must be implemented with a call to
`MPI_Reduce` on integers using the `MPI_SUM` operation: anything less
will impact performance. In essence, this is the "don't pay for what
you don't use" principle: users who are not transmitting strings
should not pay the overhead associated with strings.
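
To make this concrete, the following sketch sums the ranks of all
processes at the root; because the element type is `int` and the
operation is `std::plus<int>`, Boost.MPI lowers the call to a single
`MPI_Reduce` with `MPI_SUM`:

    #include <boost/mpi.hpp>
    #include <functional>
    #include <iostream>

    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      int sum = 0;
      // Maps directly onto MPI_Reduce(..., MPI_SUM, ...) for int.
      mpi::reduce(world, world.rank(), sum, std::plus<int>(), 0);
      if (world.rank() == 0)
        std::cout << "sum of ranks: " << sum << std::endl;
      return 0;
    }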

Sometimes, achieving maximal performance means forgoing convenient
abstractions and implementing certain functionality using lower-level
primitives. For this reason, it is always possible to extract enough
information from the abstractions in Boost.MPI to minimize
the amount of effort required to interface between Boost.MPI
and the C MPI library.
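
For instance, a [classref boost::mpi::communicator communicator]
converts implicitly to the underlying `MPI_Comm`, so it can be handed
directly to C MPI routines, as in this brief sketch:

    #include <boost/mpi.hpp>
    #include <mpi.h>

    namespace mpi = boost::mpi;

    int main(int argc, char* argv[])
    {
      mpi::environment env(argc, argv);
      mpi::communicator world;

      // The communicator converts to its underlying MPI_Comm, so raw
      // C MPI calls can operate on it directly.
      MPI_Comm raw = world;
      int size;
      MPI_Comm_size(raw, &size);
      return 0;
    }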

[endsect]
[section:performance Performance Evaluation]

Message-passing performance is crucial in high-performance distributed
computing. To evaluate the performance of Boost.MPI, we modified the
standard [@http://www.scl.ameslab.gov/netpipe/ NetPIPE] benchmark
(version 3.6.2) to use Boost.MPI and compared its performance against
raw MPI. We ran five different variants of the NetPIPE benchmark:
# MPI: The unmodified NetPIPE benchmark.

# Boost.MPI: NetPIPE modified to use Boost.MPI calls for
  communication.

# MPI (Datatypes): NetPIPE modified to use a derived datatype (which
  itself contains a single `MPI_BYTE`) rather than a fundamental
  datatype.

# Boost.MPI (Datatypes): NetPIPE modified to use a user-defined type
  `Char` in place of the fundamental `char` type. The `Char` type
  contains a single `char`, a `serialize()` method to make it
  serializable, and specializes [classref
  boost::mpi::is_mpi_datatype is_mpi_datatype] to force
  Boost.MPI to build a derived MPI data type for it. (A sketch of
  `Char` appears after this list.)

# Boost.MPI (Serialized): NetPIPE modified to use a user-defined type
  `Char` in place of the fundamental `char` type. This `Char` type
  contains a single `char` and is serializable. Unlike the Datatypes
  case, [classref boost::mpi::is_mpi_datatype
  is_mpi_datatype] is *not* specialized, forcing Boost.MPI to perform
  many, many serialization calls.
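
For reference, the user-defined `Char` type used by the last two
variants looks roughly like this (a sketch; the actual benchmark
sources may differ in detail). The `is_mpi_datatype` specialization is
present only in the Datatypes variant:

    #include <boost/mpi.hpp>

    class Char
    {
    public:
      char value;

      // Makes Char serializable via Boost.Serialization.
      template<typename Archive>
      void serialize(Archive& ar, const unsigned int /*version*/)
      {
        ar & value;
      }
    };

    // Datatypes variant only: tells Boost.MPI that Char maps onto an
    // MPI derived datatype, built once and reused, instead of being
    // serialized element by element.
    namespace boost { namespace mpi {
      template<>
      struct is_mpi_datatype<Char> : mpl::true_ { };
    } } // namespace boost::mpi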

The actual tests were performed on the Odin cluster in the
[@http://www.cs.indiana.edu/ Department of Computer Science] at
[@http://www.iub.edu Indiana University], which contains 128 nodes
connected via InfiniBand. Each node contains 4 GB of memory and two AMD
Opteron processors. The NetPIPE benchmarks were compiled with Intel's
C++ Compiler, version 9.0, Boost 1.35.0 (prerelease), and
[@http://www.open-mpi.org/ Open MPI] version 1.1. The NetPIPE results
follow:

[$../../libs/mpi/doc/netpipe.png]

We can make several observations about these NetPIPE
results. First of all, the top two plots show that Boost.MPI performs
on par with MPI for fundamental types. The next two plots show that
Boost.MPI performs on par with MPI for derived data types, even though
Boost.MPI provides a much more abstract, completely transparent
approach to building derived data types than raw MPI. Overall
performance for derived data types is significantly worse than for
fundamental data types, but the bottleneck is in the underlying MPI
implementation itself. Finally, when Boost.MPI is forced to serialize
characters individually, performance suffers greatly. This particular
instance is the worst possible case for Boost.MPI, because we are
serializing millions of individual characters. Overall, the
additional abstraction provided by Boost.MPI does not impair its
performance.

[endsect]
[section:history Revision History]

* *Boost 1.36.0*:
  * Support for non-blocking operations in Python, from Andreas Klöckner

* *Boost 1.35.0*: Initial release, containing the following post-review changes
  * Support for arrays in all collective operations
  * Support for default construction of [classref boost::mpi::environment environment]

* *2006-09-21*: Boost.MPI accepted into Boost.

[endsect:history]
[section:acknowledge Acknowledgments]

Boost.MPI was developed with support from Zürcher Kantonalbank. Daniel
Egloff and Michael Gauckler contributed many ideas to Boost.MPI's
design, particularly in the design of its abstractions for
MPI data types and the novel skeleton/content mechanism for large data
structures. Prabhanjan (Anju) Kambadur developed the predecessor to
Boost.MPI that proved the usefulness of the Serialization library in
an MPI setting and the performance benefits of specialization in a C++
abstraction layer for MPI. Jeremy Siek managed the formal review of Boost.MPI.

[endsect:acknowledge]