kevin_bacon.html 10 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317
  1. <HTML>
  2. <!--
  3. Copyright (c) Jeremy Siek 2000
  4. Distributed under the Boost Software License, Version 1.0.
  5. (See accompanying file LICENSE_1_0.txt or copy at
  6. http://www.boost.org/LICENSE_1_0.txt)
  7. -->
  8. <Head>
  9. <Title>Kevin Bacon Example</Title>
  10. <BODY BGCOLOR="#ffffff" LINK="#0000ee" TEXT="#000000" VLINK="#551a8b"
  11. ALINK="#ff0000">
  12. <IMG SRC="../../../boost.png"
  13. ALT="C++ Boost" width="277" height="86">
  14. <BR Clear>
  15. <H1>Six Degrees of Kevin Bacon</H1>
  16. <P>
  17. A fun application of graph theory comes up in the popular game ``Six
  18. Degrees of Kevin Bacon''. The idea of the game is to connect an actor
  19. to Kevin Bacon through a trail of actors who appeared together in
  20. movies, and do so in less than six steps. For example, Theodore
  21. Hesburgh (President Emeritus of the University of Notre Dame) was in
  22. the movie ``Rudy'' with the actor Gerry Becker, who was in the movie
  23. ``Sleepers'' with Kevin Bacon. Why Kevin Bacon? For some reason the
  24. three students who invented the game, Mike Ginelli, Craig Fass, and
  25. Brian Turtle decided that Kevin Bacon is the center of the
  26. entertainment world. The Kevin Bacon game is quite similar to the <a
  27. href="http://www.oakland.edu/~grossman/erdoshp.html">Erd&ouml;s
  28. Number</a> which has ``been a part of the folklore of
  29. mathematicians throughout the world for many years''.
  30. <P>
  31. The ``Six Degrees of Kevin Bacon'' game is really a graph problem. If
  32. you assign each actor to a vertex, and add an edge between two actors
  33. if they have appeared together in a movie, then you have a graph that
  34. represents the data for this game. Then the problem of finding a trail
  35. of actors to Kevin Bacon becomes a traditional graph problem: that of
  36. finding a <I>path</I> between two vertices. Since we wish to find a
  37. path that is shorter than six steps, ideally we would find the
  38. <I>shortest path</I> between the vertices. One might think to apply
  39. Dijkstra's shortest path algorithm to this problem, but that would be
  40. overkill since Dijkstra's algorithm is meant for situations when each
  41. edge has an associated length (or weight) and the goal is to find the
  42. path with the shortest cumulative length. Since we are only concerned
  43. with finding the shortest paths in terms of the number of edges the
  44. breadth-first search algorithm will solve the problem (and use less
  45. time than Dijkstra's).
  46. <P>
  47. <H2>Input File and Graph Setup</H2>
  48. <P>
  49. For this example we will use a much smaller graph than all movies and
  50. all actors. The full source code for this example is in <a
  51. href="../example/kevin-bacon.cpp"><TT>examples/kevin-bacon.cpp</TT></a>.
  52. We have supplied a file <TT>kevin-bacon.dat</TT> which contains a list
  53. of actor pairs who appeared in the same movie. Each line of the file
  54. contains an actor's name, a movie, and another actor that appeared in
  55. the movie. The ``;'' character is used as a separator. Here is an
  56. excerpt.
  57. <P>
  58. <PRE>
  59. Patrick Stewart;Prince of Egypt, The (1998);Steve Martin
  60. </PRE>
  61. <P>
  62. Our first task will be to read in the file and create a graph from
  63. it. We use a <TT>std::ifstream</TT> to input the file.
  64. <P>
  65. <PRE>
  66. std::ifstream datafile("./kevin-bacon.dat");
  67. if (!datafile) {
  68. std::cerr &lt;&lt; "No ./kevin-bacon.dat file" &lt;&lt; std::endl;
  69. return EXIT_FAILURE;
  70. }
  71. </PRE>
  72. <P>
  73. We will want to attach the actor's names to the vertices in the graph,
  74. and the movie names to the edges. We use the <TT>property</TT> class to
  75. specify the addition of these vertex and edge properties to the graph.
  76. We choose to use an undirected graph, because the relationship of
  77. actors appearing together in a movie is symmetric.
  78. <P>
  79. <PRE>
  80. using namespace boost;
  81. typedef adjacency_list&lt;vecS, vecS, undirectedS,
  82. property&lt;vertex_name_t, string&gt;,
  83. property&lt;edge_name_t, string &gt; &gt;
  84. &gt; Graph;
  85. </PRE>
  86. <P>
  87. To access the properties, we will need to obtain property map
  88. objects from the graph. The following code shows how this is done.
  89. <P>
  90. <PRE>
  91. property_map&lt;Graph, vertex_name_t&gt;::type actor_name = get(vertex_name, g);
  92. property_map&lt;Graph, edge_name_t&gt;::type connecting_movie = get(edge_name, g);
  93. </PRE>
  94. <P>
  95. Next we can start to loop through the file, parsing each line into a
  96. sequence of tokens using the <a href="../../tokenizer/index.html">Boost
  97. Tokenizer Library</a>.
  98. <P>
  99. <PRE>
  100. for (std::string line; std::getline(datafile, line);) {
  101. char_delimiters_separator&lt;char&gt; sep(false, "", ";");
  102. tokenizer&lt;&gt; line_toks(line, sep);
  103. ...
  104. }
  105. </PRE>
  106. <P>
  107. Each line of the input corresponds to an edge in the graph, with the
  108. two actors as the vertices incident to the edge, and the movie name
  109. will be a property attached to the edge. One challenge in creating the
  110. graph from this format of input is that the edges are specified by a
  111. pair of actor names. As we add vertices to the graph, we'll need to
  112. create a map from actor names to their vertices so that later
  113. appearances of the same actor (on a different edge) can be linked with
  114. the correct vertex in the graph. The <a
  115. href="http://www.boost.org/sgi/stl/Map.html"><TT>std::map</TT></a>
  116. can be used to implement this mapping.
  117. <P>
  118. <PRE>
  119. typedef graph_traits&lt;Graph&gt;::vertex_descriptor Vertex;
  120. typedef std::map&lt;string, Vertex&gt; NameVertexMap;
  121. NameVertexMap actors;
  122. </PRE>
  123. <P>
  124. The first token will be an actor's name. If the actor is not already
  125. in our actor's map we will add a vertex to the graph, set the name
  126. property of the vertex, and record the vertex descriptor in the map.
  127. If the actor is already in the graph, we will retrieve the vertex
  128. descriptor from the map's <TT>pos</TT> iterator.
  129. <P>
  130. <PRE>
  131. NameVertexMap::iterator pos;
  132. bool inserted;
  133. Vertex u, v;
  134. tokenizer&lt;&gt;::iterator i = line_toks.begin();
  135. std::string actors_name = *i++;
  136. boost::tie(pos, inserted) = actors.insert(std::make_pair(actors_name, Vertex()));
  137. if (inserted) {
  138. u = add_vertex(g);
  139. actor_name[u] = actors_name;
  140. pos-&gt;second = u;
  141. } else
  142. u = pos-&gt;second;
  143. </PRE>
  144. <P>
  145. The second token is the name of the movie, and the third token is the
  146. other actor. We use the same technique as above to insert the second
  147. actor into the graph.
  148. <P>
  149. <PRE>
  150. std::string movie_name = *i++;
  151. boost::tie(pos, inserted) = actors.insert(std::make_pair(*i, Vertex()));
  152. if (inserted) {
  153. v = add_vertex(g);
  154. actor_name[v] = *i;
  155. pos-&gt;second = v;
  156. } else
  157. v = pos-&gt;second;
  158. </PRE>
  159. <P>
  160. The final step is to add an edge connecting the two actors, and record
  161. the name of the connecting movie.
  162. <P>
  163. <PRE>
  164. graph_traits&lt;Graph&gt;::edge_descriptor e;
  165. boost::tie(e, inserted) = add_edge(u, v, g);
  166. if (inserted)
  167. connecting_movie[e] = movie_name;
  168. </PRE>
  169. <P>
  170. <H2>A Custom Visitor Class</H2>
  171. <P>
  172. We create a custom visitor class to record the Kevin Bacon
  173. numbers. The Kevin Bacon number will be the shortest distances from
  174. each actor to Kevin Bacon. This distance has also been referred to as
  175. the ``Bacon Number'' in memory of the ``Erdos Number'' used to rank
  176. mathematicians according to how many publications separate them from
  177. Paul Erdos. The shortest distance to from Kevin Bacon to each actor
  178. will follow the breadth-first tree. The BFS algorithm identifies edges
  179. that belong to the breadth-first tree and calls our visitor's
  180. <tt>tree_edge</tt> function. We record the distance to the target
  181. vertex as one plus the distance to the source vertex.
  182. <PRE>
  183. template &lt;typename DistanceMap&gt;
  184. class bacon_number_recorder : public default_bfs_visitor
  185. {
  186. public:
  187. bacon_number_recorder(DistanceMap dist) : d(dist) { }
  188. template &lt;typename Edge, typename Graph&gt;
  189. void tree_edge(Edge e, const Graph& g) const
  190. {
  191. typename graph_traits&lt;Graph&gt;::vertex_descriptor
  192. u = source(e, g), v = target(e, g);
  193. d[v] = d[u] + 1;
  194. }
  195. private:
  196. DistanceMap d;
  197. };
  198. // Convenience function
  199. template &lt;typename DistanceMap&gt;
  200. bacon_number_recorder&lt;DistanceMap&gt;
  201. record_bacon_number(DistanceMap d)
  202. {
  203. return bacon_number_recorder&lt;DistanceMap&gt;(d);
  204. }
  205. </PRE>
  206. <P>
  207. We will use a vector to store the bacon numbers.
  208. <P>
  209. <PRE>
  210. std::vector&lt;int&gt; bacon_number(num_vertices(g));
  211. </PRE>
  212. <H2>Apply the Breadth-First Search</H2>
  213. <P>
  214. The BFS algorithm requires a source vertex as input; of course this
  215. will be Kevin Bacon. We now call the BGL BFS algorithm, passing in the
  216. graph, source vertex, and the visitor.
  217. <P>
  218. <PRE>
  219. Vertex src = actors["Kevin Bacon"];
  220. bacon_number[src] = 0;
  221. breadth_first_search(g, src, visitor(record_bacon_number(&amp;bacon_number[0])));
  222. </PRE>
  223. <P>
  224. We can output the Bacon number for each actor simply by looping
  225. through all the vertices in the graph and use them to index into the
  226. <TT>bacon_number</TT> vector.
  227. <P>
  228. <PRE>
  229. graph_traits&lt;Graph&gt;::vertex_iterator i, end;
  230. for (boost::tie(i, end) = vertices(g); i != end; ++i)
  231. std::cout &lt;&lt; actor_name[*i] &lt;&lt; "'s bacon number is "
  232. &lt;&lt; bacon_number[*i] &lt;&lt; std::endl;
  233. </PRE>
  234. <P>
  235. Note that vertex descriptor objects can not always be used as indices
  236. into vectors or arrays such as <TT>bacon_number</TT>. This is valid
  237. with the <TT>adjacency_list</TT> class with <TT>VertexList=vecS</TT>,
  238. but not with other variations of <TT>adjacency_list</TT>. A more
  239. generic way to index based on vertices is to use the ID property
  240. map (<TT>vertex_index_t</TT>) in coordination with the <A
  241. HREF="../../property_map/doc/iterator_property_map.html"><TT>iterator_property_map</TT></a>.
  242. <P>
  243. Here are some excepts from the output of the program.
  244. <PRE>
  245. William Shatner's bacon number is 2
  246. Denise Richards's bacon number is 1
  247. Kevin Bacon's bacon number is 0
  248. Patrick Stewart's bacon number is 2
  249. Steve Martin's bacon number is 1
  250. ...
  251. </PRE>
  252. <br>
  253. <HR>
  254. <TABLE>
  255. <TR valign=top>
  256. <TD nowrap>Copyright &copy; 2000-2001</TD><TD>
  257. <A HREF="http://www.boost.org/people/jeremy_siek.htm">Jeremy Siek</A>, Indiana University (<A HREF="mailto:jsiek@osl.iu.edu">jsiek@osl.iu.edu</A>)
  258. </TD></TR></TABLE>
  259. </BODY>
  260. </HTML>
  261. <!-- LocalWords: gif Hesburgh Ginelli Fass Erd vertices cerr namespace vecS
  262. -->
  263. <!-- LocalWords: undirectedS Tokenizer sep tokenizer toks NameVertexMap pos
  264. -->
  265. <!-- LocalWords: bool Erdos BFS typename DistanceMap bfs const num BGL src
  266. -->
  267. <!-- LocalWords: indices Shatner's Richards's siek htm
  268. -->