123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136 |
- .. Copyright (C) 2004-2008 The Trustees of Indiana University.
- Use, modification and distribution is subject to the Boost Software
- License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt)
- ===============================================
- An Overview of the Parallel Boost Graph Library
- ===============================================
- .. image:: ../graph.png
- :width: 206
- :height: 184
- :alt: An example graph
- :align: right
- The Parallel Boost Graph Library (Parallel BGL) is a C++ library for
- parallel, distributed computation on graphs. The Parallel BGL contains
- distributed graph data structures, distributed graph algorithms,
- abstractions over the communication medium (such as MPI), and
- supporting data structures. A graph (also called a *network*) consists
- of a set of *vertices* and a set of relationships between vertices,
- called *edges*. The edges may be *undirected*, meaning that the
- relationship between vertices is mutual, e.g., "X is related to Y", or
- they can be *directed*, meaning that the relationship goes only one
- way, e.g., "X is the child of Y". The following figure illustrates a
- typical directed graph, where *a-i* are the vertices and the arrows
- represent edges.
- .. image:: ../distributed-graph.png
- :width: 229
- :height: 199
- :alt: A distributed graph
- :align: right
- The Parallel BGL is primarily concerned with *distributed*
- graphs. Distributed graphs are conceptually graphs, but their storage
- is spread across multiple processors. The following figure
- demonstrates a distributed version of the graph above, where the graph
- has been divided among three processors (represented by the grey
- rectangles). Edges in the graph may be either local (with both
- endpoints stored on the same processor) or remote (the target of the
- edge is stored on a different processor).
- The Parallel BGL is a generic library. At its core are *generic*
- distributed graph algorithms, which can operate on any distributed
- graph data structure provided that data structure meets certain
- requirements. For instance, the algorithm may need to enumerate the
- set of vertices stored on the current processor, enumerate the set of
- outgoing edges from a particular vertex, and determine on which
- processor the target of each edge resides. The graph algorithms in the
- Parallel BGL are also generic with respect to the *properties*
- attached to edges and vertices in a graph; for instance, the weight of
- each edge can be stored as part of the graph or allocated in a
- completely separate data structure.
- The genericity available in the algorithms of the Parallel BGL allows
- them to be applied to existing graph data structures. However, most
- users will instead be writing new code that takes advantage of the
- Parallel BGL. The Parallel BGL provides distributed graph data
- structures that meet the requirements of the Parallel BGL
- algorithms. The primary data structure is the `distributed adjacency
- list`_, which allows storage and manipulation of a (distributed)
- graph. The vertices in the graph are divided among the various
- processors, and each of the edges outgoing from a vertex are stored on
- the processor that "owns" (stores) that vertex. The following figure
- illustrates the distributed adjacency list representation.
- .. image:: ../dist-adjlist.png
- :width: 446
- :height: 154
- :alt: A distributed adjacency list
- :align: center
- .. image:: ../dist-pmap.png
- :width: 271
- :height: 175
- :alt: A distributed property map
- :align: right
- The `distributed adjacency list`_ distributes the structure of a graph
- over multiple processors. While graph structure is in important part
- of many graph problems, there are typically other properties attached
- to the vertices and edges, such as edge weights or the position of
- vertices within a grid. These properties are stored in *property
- maps*, which associate a single piece of data with each edge or vertex
- in a graph. Distributed property maps extend this notion to
- distributed computing, where properties are stored on the same
- processor as the vertex or edge. The following figure illustrates the
- distribution of a property map storing colors (white, gray, black) for
- each vertex. In addition to the storage for each vertex, the
- processors store some "ghost cells" that cache values actually stored
- on other processors, represented by the dashed boxes.
- Tying together all of the distributed data structures of the Parallel
- BGL are its process groups and distributed graph algorithms. Process
- groups coordinate the interactions between multiple processes and
- distributed data structures by abstracting the communication
- mechanism. The algorithms are typically written using the SPMD model
- (Single Program, Multiple Data) and interact with both the distributed
- data structures and the process group itself. At various points in the
- algorithm's execution, all processes execute a synchronization point,
- which allows all of the distributed data structures to ensure an
- appropriate degree of consistency across processes. The following
- diagram illustrates the communication patterns within the the
- execution of a distributed algorithm in the Parallel BGL. In
- particular, the diagram illustrates the distributed data structures
- used in a distributed breadth-first search, from the top-left and
- proceeding clockwise:
- - a user-defined property map that tracks the distance from the
- source vertex to all other vertices,
- - an automatically-generated property map that tracks the "color"
- of vertices in the (distributed) graph, to determine which
- vertices have been seen before,
- - a distributed queue, which coordinates the breadth-first search
- and distributes new vertices to search, and
- - a distributed graph, on which the breadth-first search is
- operating.
- .. image:: ../architecture.png
- :width: 485
- :height: 410
- :alt: Parallel Boost Graph Library architecture
- :align: center
- ----------------------------------------------------------------------------
- Copyright (C) 2005 The Trustees of Indiana University.
- Authors: Douglas Gregor and Andrew Lumsdaine
- .. _Distributed adjacency list: distributed_adjacency_list.html
- .. _Process groups:
|