123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317 |
- <HTML>
- <!--
- Copyright (c) Jeremy Siek 2000
-
- Distributed under the Boost Software License, Version 1.0.
- (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt)
- -->
- <Head>
- <Title>Kevin Bacon Example</Title>
- <BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
- ALINK="#ff0000">
- <IMG SRC="../../../boost.png"
- ALT="C++ Boost" width="277" height="86">
- <BR Clear>
- <H1>Six Degrees of Kevin Bacon</H1>
- <P>
- A fun application of graph theory comes up in the popular game ``Six
- Degrees of Kevin Bacon''. The idea of the game is to connect an actor
- to Kevin Bacon through a trail of actors who appeared together in
- movies, and do so in less than six steps. For example, Theodore
- Hesburgh (President Emeritus of the University of Notre Dame) was in
- the movie ``Rudy'' with the actor Gerry Becker, who was in the movie
- ``Sleepers'' with Kevin Bacon. Why Kevin Bacon? For some reason the
- three students who invented the game, Mike Ginelli, Craig Fass, and
- Brian Turtle decided that Kevin Bacon is the center of the
- entertainment world. The Kevin Bacon game is quite similar to the <a
- href="http://www.oakland.edu/~grossman/erdoshp.html">Erdös
- Number</a> which has ``been a part of the folklore of
- mathematicians throughout the world for many years''.
- <P>
- The ``Six Degrees of Kevin Bacon'' game is really a graph problem. If
- you assign each actor to a vertex, and add an edge between two actors
- if they have appeared together in a movie, then you have a graph that
- represents the data for this game. Then the problem of finding a trail
- of actors to Kevin Bacon becomes a traditional graph problem: that of
- finding a <I>path</I> between two vertices. Since we wish to find a
- path that is shorter than six steps, ideally we would find the
- <I>shortest path</I> between the vertices. One might think to apply
- Dijkstra's shortest path algorithm to this problem, but that would be
- overkill since Dijkstra's algorithm is meant for situations when each
- edge has an associated length (or weight) and the goal is to find the
- path with the shortest cumulative length. Since we are only concerned
- with finding the shortest paths in terms of the number of edges the
- breadth-first search algorithm will solve the problem (and use less
- time than Dijkstra's).
- <P>
- <H2>Input File and Graph Setup</H2>
- <P>
- For this example we will use a much smaller graph than all movies and
- all actors. The full source code for this example is in <a
- href="../example/kevin-bacon.cpp"><TT>examples/kevin-bacon.cpp</TT></a>.
- We have supplied a file <TT>kevin-bacon.dat</TT> which contains a list
- of actor pairs who appeared in the same movie. Each line of the file
- contains an actor's name, a movie, and another actor that appeared in
- the movie. The ``;'' character is used as a separator. Here is an
- excerpt.
- <P>
- <PRE>
- Patrick Stewart;Prince of Egypt, The (1998);Steve Martin
- </PRE>
- <P>
- Our first task will be to read in the file and create a graph from
- it. We use a <TT>std::ifstream</TT> to input the file.
- <P>
- <PRE>
- std::ifstream datafile("./kevin-bacon.dat");
- if (!datafile) {
- std::cerr << "No ./kevin-bacon.dat file" << std::endl;
- return EXIT_FAILURE;
- }
- </PRE>
- <P>
- We will want to attach the actor's names to the vertices in the graph,
- and the movie names to the edges. We use the <TT>property</TT> class to
- specify the addition of these vertex and edge properties to the graph.
- We choose to use an undirected graph, because the relationship of
- actors appearing together in a movie is symmetric.
- <P>
- <PRE>
- using namespace boost;
- typedef adjacency_list<vecS, vecS, undirectedS,
- property<vertex_name_t, string>,
- property<edge_name_t, string > >
- > Graph;
- </PRE>
- <P>
- To access the properties, we will need to obtain property map
- objects from the graph. The following code shows how this is done.
- <P>
- <PRE>
- property_map<Graph, vertex_name_t>::type actor_name = get(vertex_name, g);
- property_map<Graph, edge_name_t>::type connecting_movie = get(edge_name, g);
- </PRE>
- <P>
- Next we can start to loop through the file, parsing each line into a
- sequence of tokens using the <a href="../../tokenizer/index.html">Boost
- Tokenizer Library</a>.
- <P>
- <PRE>
- for (std::string line; std::getline(datafile, line);) {
- char_delimiters_separator<char> sep(false, "", ";");
- tokenizer<> line_toks(line, sep);
- ...
- }
- </PRE>
- <P>
- Each line of the input corresponds to an edge in the graph, with the
- two actors as the vertices incident to the edge, and the movie name
- will be a property attached to the edge. One challenge in creating the
- graph from this format of input is that the edges are specified by a
- pair of actor names. As we add vertices to the graph, we'll need to
- create a map from actor names to their vertices so that later
- appearances of the same actor (on a different edge) can be linked with
- the correct vertex in the graph. The <a
- href="http://www.boost.org/sgi/stl/Map.html"><TT>std::map</TT></a>
- can be used to implement this mapping.
- <P>
- <PRE>
- typedef graph_traits<Graph>::vertex_descriptor Vertex;
- typedef std::map<string, Vertex> NameVertexMap;
- NameVertexMap actors;
- </PRE>
- <P>
- The first token will be an actor's name. If the actor is not already
- in our actor's map we will add a vertex to the graph, set the name
- property of the vertex, and record the vertex descriptor in the map.
- If the actor is already in the graph, we will retrieve the vertex
- descriptor from the map's <TT>pos</TT> iterator.
- <P>
- <PRE>
- NameVertexMap::iterator pos;
- bool inserted;
- Vertex u, v;
- tokenizer<>::iterator i = line_toks.begin();
- std::string actors_name = *i++;
- boost::tie(pos, inserted) = actors.insert(std::make_pair(actors_name, Vertex()));
- if (inserted) {
- u = add_vertex(g);
- actor_name[u] = actors_name;
- pos->second = u;
- } else
- u = pos->second;
- </PRE>
- <P>
- The second token is the name of the movie, and the third token is the
- other actor. We use the same technique as above to insert the second
- actor into the graph.
- <P>
- <PRE>
- std::string movie_name = *i++;
-
- boost::tie(pos, inserted) = actors.insert(std::make_pair(*i, Vertex()));
- if (inserted) {
- v = add_vertex(g);
- actor_name[v] = *i;
- pos->second = v;
- } else
- v = pos->second;
- </PRE>
- <P>
- The final step is to add an edge connecting the two actors, and record
- the name of the connecting movie.
- <P>
- <PRE>
- graph_traits<Graph>::edge_descriptor e;
- boost::tie(e, inserted) = add_edge(u, v, g);
- if (inserted)
- connecting_movie[e] = movie_name;
- </PRE>
- <P>
- <H2>A Custom Visitor Class</H2>
- <P>
- We create a custom visitor class to record the Kevin Bacon
- numbers. The Kevin Bacon number will be the shortest distances from
- each actor to Kevin Bacon. This distance has also been referred to as
- the ``Bacon Number'' in memory of the ``Erdos Number'' used to rank
- mathematicians according to how many publications separate them from
- Paul Erdos. The shortest distance to from Kevin Bacon to each actor
- will follow the breadth-first tree. The BFS algorithm identifies edges
- that belong to the breadth-first tree and calls our visitor's
- <tt>tree_edge</tt> function. We record the distance to the target
- vertex as one plus the distance to the source vertex.
- <PRE>
- template <typename DistanceMap>
- class bacon_number_recorder : public default_bfs_visitor
- {
- public:
- bacon_number_recorder(DistanceMap dist) : d(dist) { }
- template <typename Edge, typename Graph>
- void tree_edge(Edge e, const Graph& g) const
- {
- typename graph_traits<Graph>::vertex_descriptor
- u = source(e, g), v = target(e, g);
- d[v] = d[u] + 1;
- }
- private:
- DistanceMap d;
- };
- // Convenience function
- template <typename DistanceMap>
- bacon_number_recorder<DistanceMap>
- record_bacon_number(DistanceMap d)
- {
- return bacon_number_recorder<DistanceMap>(d);
- }
- </PRE>
- <P>
- We will use a vector to store the bacon numbers.
- <P>
- <PRE>
- std::vector<int> bacon_number(num_vertices(g));
- </PRE>
- <H2>Apply the Breadth-First Search</H2>
- <P>
- The BFS algorithm requires a source vertex as input; of course this
- will be Kevin Bacon. We now call the BGL BFS algorithm, passing in the
- graph, source vertex, and the visitor.
- <P>
- <PRE>
- Vertex src = actors["Kevin Bacon"];
- bacon_number[src] = 0;
- breadth_first_search(g, src, visitor(record_bacon_number(&bacon_number[0])));
- </PRE>
- <P>
- We can output the Bacon number for each actor simply by looping
- through all the vertices in the graph and use them to index into the
- <TT>bacon_number</TT> vector.
- <P>
- <PRE>
- graph_traits<Graph>::vertex_iterator i, end;
- for (boost::tie(i, end) = vertices(g); i != end; ++i)
- std::cout << actor_name[*i] << "'s bacon number is "
- << bacon_number[*i] << std::endl;
- </PRE>
- <P>
- Note that vertex descriptor objects can not always be used as indices
- into vectors or arrays such as <TT>bacon_number</TT>. This is valid
- with the <TT>adjacency_list</TT> class with <TT>VertexList=vecS</TT>,
- but not with other variations of <TT>adjacency_list</TT>. A more
- generic way to index based on vertices is to use the ID property
- map (<TT>vertex_index_t</TT>) in coordination with the <A
- HREF="../../property_map/doc/iterator_property_map.html"><TT>iterator_property_map</TT></a>.
- <P>
- Here are some excepts from the output of the program.
- <PRE>
- William Shatner's bacon number is 2
- Denise Richards's bacon number is 1
- Kevin Bacon's bacon number is 0
- Patrick Stewart's bacon number is 2
- Steve Martin's bacon number is 1
- ...
- </PRE>
- <br>
- <HR>
- <TABLE>
- <TR valign=top>
- <TD nowrap>Copyright © 2000-2001</TD><TD>
- <A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>)
- </TD></TR></TABLE>
- </BODY>
- </HTML>
- <!-- LocalWords: gif Hesburgh Ginelli Fass Erd vertices cerr namespace vecS
- -->
- <!-- LocalWords: undirectedS Tokenizer sep tokenizer toks NameVertexMap pos
- -->
- <!-- LocalWords: bool Erdos BFS typename DistanceMap bfs const num BGL src
- -->
- <!-- LocalWords: indices Shatner's Richards's siek htm
- -->
|