symbols.html 18 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204
  1. <html>
  2. <head>
  3. <title>Symbols</title>
  4. <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  5. <link rel="stylesheet" href="theme/style.css" type="text/css">
  6. </head>
  7. <body>
  8. <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  9. <tr>
  10. <td width="10">
  11. </td>
  12. <td width="85%">
  13. <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Symbols</b></font>
  14. </td>
  15. <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  16. </tr>
  17. </table>
  18. <br>
  19. <table border="0">
  20. <tr>
  21. <td width="10"></td>
  22. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  23. <td width="30"><a href="distinct.html"><img src="theme/l_arr.gif" border="0"></a></td>
  24. <td width="30"><a href="trees.html"><img src="theme/r_arr.gif" border="0"></a></td>
  25. </tr>
  26. </table>
  27. <p>This class symbols implements a symbol table. The symbol table holds a dictionary
  28. of symbols where each symbol is a sequence of CharTs (a <tt>char</tt>, <tt>wchar_t</tt>,
  29. <tt>int</tt>, enumeration etc.) . The template class, parameterized by the character
  30. type (CharT), can work efficiently with 8, 16, 32 and even 64 bit characters.
  31. Mutable data of type T is associated with each symbol.<br>
  32. </p>
  33. <p>Traditionally, symbol table management is maintained seperately outside the
  34. BNF grammar through semantic actions. Contrary to standard practice, the Spirit
  35. symbol table class <tt>symbols</tt> is-a parser. An instance of which may be
  36. used anywhere in the EBNF grammar specification. It is an example of a dynamic
  37. parser. A dynamic parser is characterized by its ability to modify its behavior
  38. at run time. Initially, an empty symbols object matches nothing. At any time,
  39. symbols may be added, thus, dynamically altering its behavior.</p>
  40. <p>Each entry in a symbol table has an associated mutable data slot. In this regard,
  41. one can view the symbol table as an associative container (or map) of key-value
  42. pairs where the keys are strings. </p>
  43. <p>The symbols class expects two template parameters (actually there is a third,
  44. see detail box). The first parameter <tt>T</tt> specifies the data type associated
  45. with each symbol (defaults to <tt>int</tt>) and the second parameter <tt>CharT</tt>
  46. specifies the character type of the symbols (defaults to <tt>char</tt>). </p>
  47. <pre><span class=identifier> </span><span class=keyword>template
  48. </span><span class=special>&lt;
  49. </span><span class=keyword>typename </span><span class=identifier>T </span><span class=special>= </span><span class=keyword>int</span><span class=special>,
  50. </span><span class=keyword>typename </span><span class=identifier>CharT </span><span class=special>= </span><span class=keyword>char</span><span class=special>,
  51. </span><span class=keyword>typename </span><span class=identifier>SetT </span><span class=special>= </span><span class=identifier>impl</span><span class=special>::</span><span class=identifier>tst</span><span class=special>&lt;</span><span class=identifier>T</span><span class=special>, </span><span class=identifier>CharT</span><span class=special>&gt;
  52. </span><span class=special>&gt;
  53. </span><span class=keyword>class </span><span class=identifier>symbols</span><span class=special>;</span></pre>
  54. <table width="80%" border="0" align="center">
  55. <tr>
  56. <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>Ternary
  57. State Trees</b><br>
  58. <br>
  59. The actual set implementation is supplied by the SetT template parameter
  60. (3rd template parameter of the symbols class) . By default, this uses the
  61. tst class which is an implementation of the Ternary Search Tree. <br>
  62. <br>
  63. Ternary Search Trees are faster than hashing for many typical search problems
  64. especially when the search interface is iterator based. Searching for a
  65. string of length k in a ternary search tree with n strings will require
  66. at most O(log n+k) character comparisons. TSTs are many times faster than
  67. hash tables for unsuccessful searches since mismatches are discovered earlier
  68. after examining only a few characters. Hash tables always examine an entire
  69. key when searching.<br>
  70. <br>
  71. For details see <a href="http://www.cs.princeton.edu/%7Ers/strings/">http://www.cs.princeton.edu/~rs/strings/</a>.</td>
  72. </tr>
  73. </table>
  74. <p>Here are some sample declarations:</p>
  75. <pre><span class=identifier> </span><span class=identifier>symbols</span><span class=special>&lt;&gt; </span><span class=identifier>sym</span><span class=special>;
  76. </span><span class=identifier>symbols</span><span class=special>&lt;</span><span class=keyword>short</span><span class=special>, </span><span class=keyword>wchar_t</span><span class=special>&gt; </span><span class=identifier>sym2</span><span class=special>;
  77. </span><span class=keyword>struct </span><span class=identifier>my_info
  78. </span><span class=special>{
  79. </span><span class=keyword>int </span><span class=identifier>id</span><span class=special>;
  80. </span><span class=keyword>double </span><span class=identifier>value</span><span class=special>;
  81. </span><span class=special>};
  82. </span><span class=identifier>symbols</span><span class=special>&lt;</span><span class=identifier>my_info</span><span class=special>&gt; </span><span class=identifier>sym3</span><span class=special>;</span></pre>
  83. <p>After having declared our symbol tables, symbols may be added statically using
  84. the construct:</p>
  85. <pre><span class=identifier> sym </span><span class=special>= </span><span class=identifier>a</span><span class=special>, </span><span class=identifier>b</span><span class=special>, </span><span class=identifier>c</span><span class=special>, </span><span class=identifier>d </span><span class=special>...;</span></pre>
  86. <p>where <tt>sym</tt> is a symbol table and <tt>a..d</tt> etc. are strings. <img src="theme/note.gif" width="16" height="16">Note
  87. that the comma operator is separating the items being added to the symbol table,
  88. through an assignment. Due to operator overloading this is possible and correct
  89. (though it may take a little getting used to) and is a concise way to initialize
  90. the symbol table with many symbols. Also, it is perfectly valid to make multiple
  91. assignments to a symbol table to iteratively add symbols (or groups of symbols)
  92. at different times.</p>
  93. <p>Simple example:<br>
  94. </p>
  95. <pre><span class=identifier> sym </span><span class=special>= </span><span class=string>&quot;pineapple&quot;</span><span class=special>, </span><span class=string>&quot;orange&quot;</span><span class=special>, </span><span class=string>&quot;banana&quot;</span><span class=special>, </span><span class=string>&quot;apple&quot;</span><span class=special>, </span><span class=string>&quot;mango&quot;</span><span class=special>;</span></pre>
  96. <p>Note that it is invalid to add the same symbol multiple times to a symbol table,
  97. though you may modify the value associated with a symbol artibrarily many times.</p>
  98. <p>Now, we may use sym in the grammar. Example:</p>
  99. <pre><span class=identifier> fruits </span><span class=special>= </span><span class=identifier>sym </span><span class=special>&gt;&gt; </span><span class=special>*(</span><span class=literal>',' </span><span class=special>&gt;&gt; </span><span class=identifier>sym</span><span class=special>);</span></pre>
  100. <p>Alternatively, symbols may be added dynamically through the member functor
  101. <tt>add</tt> (see <tt><a href="#symbol_inserter">symbol_inserter</a></tt> below).
  102. The member functor <tt>add</tt> may be attached to a parser as a semantic action
  103. taking in a begin/end pair:</p>
  104. <pre><span class=identifier> p</span><span class=special>[</span><span class=identifier>sym</span><span class=special>.</span><span class=identifier>add</span><span class=special>]</span></pre>
  105. <p>where p is a parser (and sym is a symbol table). On success, the matching portion
  106. of the input is added to the symbol table.</p>
  107. <p><tt>add</tt> may also be used to directly initialize data. Examples:</p>
  108. <pre><span class=identifier> sym</span><span class=special>.</span><span class=identifier>add</span><span class=special>(</span><span class=string>&quot;hello&quot;</span><span class=special>, </span><span class=number>1</span><span class=special>)(</span><span class=string>&quot;crazy&quot;</span><span class=special>, </span><span class=number>2</span><span class=special>)(</span><span class=string>&quot;world&quot;</span><span class=special>, </span><span class=number>3</span><span class=special>);</span></pre>
  109. <p>Assuming of course that the data slot associated with <tt>sym</tt> is an integer.</p>
  110. <p>The data associated with each symbol may be modified any time. The most obvious
  111. way of course is through <a href="semantic_actions.html">semantic actions</a>.
  112. A function or functor, as usual, may be attached to the symbol table. The symbol
  113. table expects a function or functor compatible with the signature:</p>
  114. <p><b>Signature for functions:</b></p>
  115. <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>void </span><span class=identifier>func</span><span class=special>(</span><span class=identifier>T</span><span class="special">&amp;</span><span class=identifier> data</span><span class=special>);</span></font></code></pre>
  116. <p><b>Signature for functors:</b><br>
  117. </p>
  118. <pre><code><font color="#000000"><span class=special> </span><span class=keyword>struct </span><span class=identifier>ftor
  119. </span><span class=special>{
  120. </span><span class=keyword>void </span><span class=keyword>operator</span><span class=special>()(</span><span class=identifier>T</span><span class="special">&amp;</span><span class=identifier> data</span><span class=special>) </span><span class=keyword>const</span><span class=special>;
  121. </span><span class=special>};</span></font></code></pre>
  122. <p>Where <tt>T</tt> is the data type of the symbol table (the <tt>T</tt> in its
  123. template parameter list). When the symbol table successfully matches something
  124. from the input, the data associated with the matching entry in the symbol table
  125. is reported to the semantic action.</p>
  126. <h2>Symbol table utilities</h2>
  127. <p>Sometimes, one may wish to deal with the symbol table directly. Provided are
  128. some symbol table utilities.</p>
  129. <p><b>add</b></p>
  130. <pre><span class=identifier> </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>T</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>CharT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>SetT</span><span class=special>&gt;
  131. </span><span class=identifier>T</span><span class=special>* </span><span class=identifier>add</span><span class=special>(</span><span class=identifier>symbols</span><span class=special>&lt;</span><span class=identifier>T</span><span class=special>, </span><span class=identifier>CharT</span><span class=special>, </span><span class=identifier>SetT</span><span class=special>&gt;&amp; </span><span class=identifier>table</span><span class=special>, </span><span class=identifier>CharT </span><span class=keyword>const</span><span class=special>* </span><span class=identifier>sym</span><span class=special>, </span><span class=identifier>T </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>data </span><span class=special>= </span><span class=identifier>T</span><span class=special>());</span></pre>
  132. <p>adds a symbol <tt>sym</tt> (C string) to a symbol table <tt>table</tt> plus
  133. an optional data <tt>data</tt> associated with the symbol. Returns a pointer
  134. to the data associated with the symbol or <tt>NULL</tt> if add failed (e.g.
  135. when the symbol is already added before).<br>
  136. <br>
  137. <b>find</b></p>
  138. <pre><span class=special> </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>T</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>CharT</span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>SetT</span><span class=special>&gt;
  139. </span><span class=identifier>T</span><span class=special>* </span><span class=identifier>find</span><span class=special>(</span><span class=identifier>symbols</span><span class=special>&lt;</span><span class=identifier>T</span><span class=special>, </span><span class=identifier>CharT</span><span class=special>, </span><span class=identifier>SetT</span><span class=special>&gt; </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>table</span><span class=special>, </span><span class=identifier>CharT </span><span class=keyword>const</span><span class=special>* </span><span class=identifier>sym</span><span class=special>);</span></pre>
  140. <p>finds a symbol <tt>sym</tt> (C string) from a symbol table <tt>table</tt>.
  141. Returns a pointer to the data associated with the symbol or <tt>NULL</tt> if
  142. not found</p>
  143. <h2><a name="symbol_inserter"></a>symbol_inserter</h2>
  144. <p>The symbols class holds an instance of this class named <tt>add</tt>. This
  145. can be called directly just like a member function, passing in a first/last
  146. iterator and optional data:<br>
  147. <br>
  148. </p>
  149. <pre><span class=identifier> sym</span><span class=special>.</span><span class=identifier>add</span><span class=special>(</span><span class=identifier>first</span><span class=special>, </span><span class=identifier>last</span><span class=special>, </span><span class=identifier>data</span><span class=special>);</span></pre>
  150. <p>Or, passing in a C string and optional data:<br>
  151. </p>
  152. <pre><span class=identifier> sym</span><span class=special>.</span><span class=identifier>add</span><span class=special>(</span><span class=identifier>c_string</span><span class=special>, </span><span class=identifier>data</span><span class=special>);</span></pre>
  153. <p>where <tt>sym</tt> is a symbol table. The <tt>data</tt> argument is optional.
  154. The nice thing about this scheme is that it can be cascaded. We've seen this
  155. applied above. Here's a snippet from the roman numerals parser</p>
  156. <pre> <span class=comment>// Parse roman numerals (1..9) using the symbol table.
  157. </span> <span class=keyword>struct </span><span class=identifier>ones </span><span class=special>: </span><span class=identifier>symbols</span><span class=special>&lt;</span><span class=keyword>unsigned</span><span class=special>&gt;
  158. </span><span class=special>{
  159. </span><span class=identifier>ones</span><span class=special>()
  160. </span><span class=special>{
  161. </span><span class=identifier>add
  162. </span><span class=special>(</span><span class=string>&quot;I&quot; </span><span class=special>, </span><span class=number>1</span><span class=special>)
  163. </span><span class=special>(</span><span class=string>&quot;II&quot; </span><span class=special>, </span><span class=number>2</span><span class=special>)
  164. </span><span class=special>(</span><span class=string>&quot;III&quot; </span><span class=special>, </span><span class=number>3</span><span class=special>)
  165. </span><span class=special>(</span><span class=string>&quot;IV&quot; </span><span class=special>, </span><span class=number>4</span><span class=special>)
  166. </span><span class=special>(</span><span class=string>&quot;V&quot; </span><span class=special>, </span><span class=number>5</span><span class=special>)
  167. </span><span class=special>(</span><span class=string>&quot;VI&quot; </span><span class=special>, </span><span class=number>6</span><span class=special>)
  168. </span><span class=special>(</span><span class=string>&quot;VII&quot; </span><span class=special>, </span><span class=number>7</span><span class=special>)
  169. </span><span class=special>(</span><span class=string>&quot;VIII&quot; </span><span class=special>, </span><span class=number>8</span><span class=special>)
  170. </span><span class=special>(</span><span class=string>&quot;IX&quot; </span><span class=special>, </span><span class=number>9</span><span class=special>)
  171. </span><span class=special>;
  172. </span><span class=special>}
  173. </span><span class=special>} </span><span class=identifier>ones_p</span><span class=special>;</span></pre>
  174. <p>Notice that a user defined struct <tt>ones</tt> is subclassed from <tt>symbols</tt>.
  175. Then at construction time, we added all the symbols using the <tt>add</tt> symbol_inserter.</p>
  176. <p> <img height="16" width="15" src="theme/lens.gif"> The full source code can be <a href="../example/fundamental/roman_numerals.cpp">viewed here</a>. This is part of the Spirit distribution.</p>
  177. <p>Again, <tt>add</tt> may also be used as a semantic action since it conforms
  178. to the action interface (see semantic actions):<br>
  179. </p>
  180. <pre><span class=special></span><span class=identifier> p</span><span class=special>[</span><span class=identifier>sym</span><span class=special>.</span><span class=identifier>add</span><span class=special>]</span></pre>
  181. <p>where p is a parser of course.<span class=special><br>
  182. </span></p>
  183. <table border="0">
  184. <tr>
  185. <td width="10"></td>
  186. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  187. <td width="30"><a href="distinct.html"><img src="theme/l_arr.gif" border="0"></a></td>
  188. <td width="30"><a href="trees.html"><img src="theme/r_arr.gif" border="0"></a></td>
  189. </tr>
  190. </table>
  191. <br>
  192. <hr size="1">
  193. <p class="copyright">Copyright &copy; 1998-2003 Joel de Guzman<br>
  194. <br>
  195. <font size="2">Use, modification and distribution is subject to the Boost Software
  196. License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  197. http://www.boost.org/LICENSE_1_0.txt)</font></p>
  198. </body>
  199. </html>