  1. <html>
  2. <head>
  3. <title>Techniques</title>
  4. <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  5. <link rel="stylesheet" href="theme/style.css" type="text/css">
  6. </head>
  7. <body>
  8. <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
  9. <tr>
  10. <td width="10">
  11. </td>
  12. <td width="85%"> <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Techniques</b></font></td>
  13. <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
  14. </tr>
  15. </table>
  16. <br>
  17. <table border="0">
  18. <tr>
  19. <td width="10"></td>
  20. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  21. <td width="30"><a href="style_guide.html"><img src="theme/l_arr.gif" border="0"></a></td>
  22. <td width="30"><a href="faq.html"><img src="theme/r_arr.gif" border="0"></a></td>
  23. </tr>
  24. </table>
  25. <ul>
  26. <li><a href="#templatized_functors">Templatized Functors</a></li>
  27. <li><a href="#multiple_scanner_support">Rule With Multiple Scanners</a></li>
  28. <li><a href="#no_rules">Look Ma' No Rules!</a></li>
  29. <li><a href="#typeof">typeof</a></li>
  30. <li><a href="#nabialek_trick">Nabialek trick</a></li>
  31. </ul>
  32. <h3><a name="templatized_functors"></a> Templatized Functors</h3>
  33. <p>For the sake of genericity, it is often better to make the functor's member
  34. <tt>operator()</tt> a template. That way, we do not have to concern ourselves
  35. with the type of the argument to expect as long as the behavior is appropriate.
  36. For instance, rather than hard-coding <tt>char const*</tt> as the argument of
  37. a generic semantic action, it is better to make it a template member function.
  38. That way, it can accept any type of iterator:</p>
  39. <pre><code><font color="#000000"><span class=special> </span><span class=keyword>struct </span><span class=identifier>my_functor
  40. </span><span class=special>{
  41. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>IteratorT</span><span class=special>&gt;
  42. </span><span class=keyword>void </span><span class=keyword>operator</span><span class=special>()(</span><span class=identifier>IteratorT </span><span class=identifier>first</span><span class=special>, </span><span class=identifier>IteratorT </span><span class=identifier>last</span><span class=special>) </span><span class=keyword>const</span><span class=special>;
  43. </span><span class=special>};</span></font></code></pre>
<p>Take note that this is only possible with functors. It is not possible to pass
in function templates as semantic actions unless you cast them to the correct
function signature, in which case you <em>monomorphize</em> the function. This
clearly shows that functors are superior to plain functions for generic semantic actions.</p>
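<p>As a minimal sketch of the difference, the following stand-alone C++ does not use
Spirit at all. The names <tt>invoke_action</tt>, <tt>count_chars</tt> and <tt>count_chars_fn</tt>
are hypothetical and introduced only for illustration; <tt>invoke_action</tt> merely stands in
for a parser calling its semantic action with an iterator pair.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a parser invoking a semantic action on a match.
template <typename IteratorT, typename ActionT>
void invoke_action(IteratorT first, IteratorT last, ActionT action)
{
    action(first, last);
}

// Functor with a templatized operator(): usable with any iterator type.
struct count_chars
{
    explicit count_chars(std::size_t& n) : n_(n) {}

    template <typename IteratorT>
    void operator()(IteratorT first, IteratorT last) const
    {
        for (; first != last; ++first)
            ++n_;
    }

    std::size_t& n_;
};

// The same logic as a function template; it keeps its count in a global.
std::size_t g_count = 0;

template <typename IteratorT>
void count_chars_fn(IteratorT first, IteratorT last)
{
    for (; first != last; ++first)
        ++g_count;
}

// The functor is passed as-is, once with string iterators and once with
// raw pointers; operator() is instantiated separately for each.
std::size_t count_with_functor(std::string const& s)
{
    std::size_t n = 0;
    invoke_action(s.begin(), s.end(), count_chars(n));
    char const* p = s.c_str();
    invoke_action(p, p + s.size(), count_chars(n));
    return n;
}

// The function template cannot be passed by name alone:
//     invoke_action(first, last, count_chars_fn);   // ActionT cannot be deduced
// It has to be cast to one concrete signature, which fixes the iterator type.
std::size_t count_with_function(char const* first, char const* last)
{
    g_count = 0;
    invoke_action(first, last,
        static_cast<void (*)(char const*, char const*)>(&count_chars_fn));
    return g_count;
}
```

<p>The cast selects exactly one instantiation, so the function version works only for
<tt>char const*</tt> ranges, while the functor keeps working for every iterator type it is
handed.</p>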
  48. <h3><b><a name="multiple_scanner_support" id="multiple_scanner_support"></a> Rule
  49. With Multiple Scanners</b></h3>
  50. <p>As of v1.8.0, rules can use one or more scanner types. There are cases, for
  51. instance, where we need a rule that can work on the phrase and character levels.
  52. Rule/scanner mismatch has been a source of confusion and is the no. 1 <a href="faq.html#scanner_business">FAQ</a>.
  53. To address this issue, we now have <a href="rule.html#multiple_scanner_support">multiple
  54. scanner support</a>. </p>
  55. <p>Here is an example of a grammar with a rule <tt>r</tt> that can be called with
  56. 3 types of scanners (phrase-level, lexeme, and lower-case). See the <a href="rule.html">rule</a>,
  57. <a href="grammar.html">grammar</a>, <a href="scanner.html#lexeme_scanner">lexeme_scanner</a>
  58. and <a href="scanner.html#as_lower_scanner">as_lower_scanner </a>for more information.
  59. </p>
  60. <p>Here's the grammar (see <a href="../example/techniques/multiple_scanners.cpp">multiple_scanners.cpp</a>):
  61. </p>
  62. <pre><span class=special> </span><span class=keyword>struct </span><span class=identifier>my_grammar </span><span class=special>: </span><span class=identifier>grammar</span><span class=special>&lt;</span><span class=identifier>my_grammar</span><span class=special>&gt;
  63. </span><span class=special>{
  64. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  65. </span><span class=keyword>struct </span><span class=identifier>definition
  66. </span><span class=special>{
  67. </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>my_grammar </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>self</span><span class=special>)
  68. </span><span class=special>{
  69. </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>lower_p</span><span class=special>;
  70. </span><span class=identifier>rr </span><span class=special>= </span><span class=special>+(</span><span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>r</span><span class=special>] </span><span class=special>&gt;&gt; </span><span class=identifier>as_lower_d</span><span class=special>[</span><span class=identifier>r</span><span class=special>] </span><span class=special>&gt;&gt; </span><span class=identifier>r</span><span class=special>);
  71. </span><span class=special>}
  72. </span><span class=keyword>typedef </span><span class=identifier>scanner_list</span><span class=special>&lt;
  73. </span><span class=identifier>ScannerT
  74. </span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>lexeme_scanner</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt;::</span><span class=identifier>type
  75. </span><span class=special>, </span><span class=keyword>typename </span><span class=identifier>as_lower_scanner</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt;::</span><span class=identifier>type
  76. </span><span class=special>&gt; </span><span class=identifier>scanners</span><span class=special>;
  77. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>scanners</span><span class=special>&gt; </span><span class=identifier>r</span><span class=special>;
  78. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=identifier>rr</span><span class=special>;
  79. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>rr</span><span class=special>; </span><span class=special>}
  80. </span><span class=special>};
  81. </span><span class=special>};</span></pre>
<p>By default, support for multiple scanners is disabled. The macro
<tt>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</tt> must be defined to the
maximum number of scanners allowed in a <tt>scanner_list</tt>. The value must
be greater than 1 to enable multiple scanners. Given the
example above, to define a limit of three scanners for the list, the
following line must be inserted into the source file before the
inclusion of Spirit headers:
</p>
  90. <pre><span class=special> </span><span class=preprocessor>#define </span><span class=identifier>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</span> <span class=literal>3</span></pre>
<h3><b> <a name="no_rules" id="no_rules"></a> Look Ma' No Rules!</b></h3>
  93. <p>You use grammars and you use lots of 'em? Want a fly-weight, no-cholesterol,
  94. super-optimized grammar? Read on...</p>
<p>I have a love-hate relationship with rules. I guess you know the reasons why.
A lot of problems stem from the limitations of rules. Dynamic polymorphism and
static polymorphism in C++ do not mix well. There is no notion of virtual template
functions in C++; at least not just yet. Thus, the <strong>rule is tied to a
specific scanner type</strong>. This results in problems such as the <a href="faq.html#scanner_business">scanner
business</a>, our no. 1 FAQ. Apart from that, the virtual functions in rules
slow down parsing, kill all meta-information, and kill inlining, hence bloating
the generated code, especially for very tiny rules such as:</p>
  103. <pre> r <span class="special">=</span> ch_p<span class="special">(</span><span class="quotes">'x'</span><span class="special">) &gt;&gt;</span> uint_p<span class="special">;</span></pre>
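<p>To make the scanner tie-in concrete, here is a rough sketch of the type-erasure
idea in plain C++. It is not Spirit's implementation: <tt>string_scanner</tt>,
<tt>abstract_parser</tt>, <tt>concrete_parser</tt>, <tt>char_parser</tt> and <tt>toy_rule</tt> are
simplified, hypothetical stand-ins. The point is that the virtual function has to name
one concrete scanner type, because a virtual member function cannot be a template.</p>

```cpp
#include <cassert>
#include <memory>
#include <string>

// Toy scanner: just a cursor over a string (far simpler than a real scanner).
struct string_scanner
{
    std::string::const_iterator first, last;
};

// Type-erased interface. The scanner type is baked into the virtual
// function's signature; there is no way to make do_parse a template.
template <typename ScannerT>
struct abstract_parser
{
    virtual ~abstract_parser() {}
    virtual bool do_parse(ScannerT& scan) const = 0;
};

// Wraps a statically typed parser behind the interface above.
template <typename ParserT, typename ScannerT>
struct concrete_parser : abstract_parser<ScannerT>
{
    explicit concrete_parser(ParserT const& p) : p(p) {}
    bool do_parse(ScannerT& scan) const { return p.parse(scan); }
    ParserT p;
};

// A statically typed parser: its parse() is a template and accepts any scanner.
struct char_parser
{
    char ch;

    template <typename ScannerT>
    bool parse(ScannerT& scan) const
    {
        if (scan.first != scan.last && *scan.first == ch)
        {
            ++scan.first;
            return true;
        }
        return false;
    }
};

// The rule-like placeholder: it can hold any parser, but only for one ScannerT.
template <typename ScannerT>
struct toy_rule
{
    template <typename ParserT>
    toy_rule& operator=(ParserT const& p)
    {
        ptr.reset(new concrete_parser<ParserT, ScannerT>(p));
        return *this;
    }

    // Every parse goes through a virtual call, which blocks inlining.
    bool parse(ScannerT& scan) const { return ptr && ptr->do_parse(scan); }

    std::shared_ptr<abstract_parser<ScannerT> > ptr;
};

bool match_x(std::string const& input)
{
    toy_rule<string_scanner> r;
    r = char_parser{'x'};   // the static type of the right-hand side is erased here
    string_scanner scan = { input.begin(), input.end() };
    return r.parse(scan);
}
```

<p>In this sketch <tt>char_parser</tt> works with any scanner, but once it is assigned to
<tt>toy_rule&lt;string_scanner&gt;</tt> it can only be driven by a <tt>string_scanner</tt>, and its
one-line <tt>parse</tt> is reached through an indirect call.</p>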
<p> The rule's limitation is the main reason why the grammar is designed the way
it is now, with a nested template definition class. The rule's limitation is
also the reason why subrules exist. But do we really need rules? Of course!
Until C++ adopts some sort of auto-type deduction, such as that proposed by
David Abrahams in clc++m:</p>
  109. <pre>
  110. <code><span class=keyword>auto </span><span class=identifier>r </span><span class=special>= ...</span><span class=identifier>definition </span><span class=special>...</span></code></pre>
<p> we are tied to rules as RHS placeholders. However, on some occasions
we can get by without rules! For instance, rather than writing:</p>
  113. <pre>
  114. <code><span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>x </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'x'</span><span class=special>);</span></code></pre>
  115. <p> It's better to write:</p>
  116. <pre>
  117. <code><span class=identifier>chlit</span><span class=special>&lt;&gt; </span><span class=identifier>x </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'x'</span><span class=special>);</span></code></pre>
  118. <p> That's trivial. But what if the rule is rather complicated? Ok, let's proceed
  119. stepwise... I'll investigate a simple skip_parser based on the C grammar from
  120. Hartmut Kaiser. Basically, the grammar is written as (see <a href="../example/techniques/no_rules/no_rule1.cpp">no_rule1.cpp</a>):</p>
  121. <pre><code> <span class=keyword>struct </span><span class=identifier>skip_grammar </span><span class=special>: </span><span class=identifier>grammar</span><span class=special>&lt;</span><span class=identifier>skip_grammar</span><span class=special>&gt;
  122. {
  123. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  124. </span><span class=keyword>struct </span><span class=identifier>definition
  125. </span><span class=special>{
  126. </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>skip_grammar </span><span class=keyword>const</span><span class=special>&amp; /*</span><span class=identifier>self</span><span class=special>*/)
  127. {
  128. </span><span class=identifier>skip
  129. </span><span class=special>= </span><span class=identifier>space_p
  130. </span><span class=special>| </span><span class=string>&quot;//&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=literal>'\n'</span><span class=special>) &gt;&gt; </span><span class=literal>'\n'
  131. </span><span class=special>| </span><span class=string>&quot;/*&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=string>&quot;*/&quot;</span><span class=special>) &gt;&gt; </span><span class=string>&quot;*/&quot;
  132. </span><span class=special>;
  133. }
  134. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=identifier>skip</span><span class=special>;
  135. </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=keyword>const</span><span class=special>&amp;
  136. </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>skip</span><span class=special>; }
  137. };
  138. };</span></code></pre>
  139. <p> Ok, so far so good. Can we do better? Well... since there are no recursive
  140. rules there (in fact there's only one rule), you can expand the type of rule's
  141. RHS as the rule type (see <a href="../example/techniques/no_rules/no_rule2.cpp">no_rule2.cpp</a>):</p>
  142. <pre><code><span class=special> </span><span class=keyword>struct </span><span class=identifier>skip_grammar </span><span class=special>: </span><span class=identifier>grammar</span><span class=special>&lt;</span><span class=identifier>skip_grammar</span><span class=special>&gt;
  143. {
  144. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  145. </span><span class=keyword>struct </span><span class=identifier>definition
  146. </span><span class=special>{
  147. </span> <span class=identifier>definition</span><span class=special>(</span><span class=identifier>skip_grammar </span><span class=keyword>const</span><span class=special>&amp; /*</span><span class=identifier>self</span><span class=special>*/)
  148. : </span><span class=identifier>skip</span><span class=special>
  149. ( </span><span class=identifier>space_p
  150. </span><span class=special>| </span><span class=string>&quot;//&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=literal>'\n'</span><span class=special>) &gt;&gt; </span><span class=literal>'\n'
  151. </span><span class=special>| </span><span class=string>&quot;/*&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=string>&quot;*/&quot;</span><span class=special>) &gt;&gt; </span><span class=string>&quot;*/&quot;
  152. </span><span class=special>)
  153. {
  154. }
  155. </span><span class=keyword>typedef
  156. </span><span class=identifier>alternative</span><span class=special>&lt;</span><span class=identifier>alternative</span><span class=special>&lt;</span><span class=identifier>space_parser</span><span class=special>, </span><span class=identifier>sequence</span><span class=special>&lt;</span><span class=identifier>sequence</span><span class=special>&lt;
  157. </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt;, </span><span class=identifier>kleene_star</span><span class=special>&lt;</span><span class=identifier>difference</span><span class=special>&lt;</span><span class=identifier>anychar_parser</span><span class=special>,
  158. </span><span class=identifier>chlit</span><span class=special>&lt;</span><span class=keyword>char</span><span class=special>&gt; &gt; &gt; &gt;, </span><span class=identifier>chlit</span><span class=special>&lt;</span><span class=keyword>char</span><span class=special>&gt; &gt; &gt;, </span><span class=identifier>sequence</span><span class=special>&lt;</span><span class=identifier>sequence</span><span class=special>&lt;
  159. </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt;, </span><span class=identifier>kleene_star</span><span class=special>&lt;</span><span class=identifier>difference</span><span class=special>&lt;</span><span class=identifier>anychar_parser</span><span class=special>,
  160. </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt; &gt; &gt; &gt;, </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt; &gt; &gt;
  161. </span><span class=identifier>skip_t</span><span class=special>;
  162. </span><span class=special> </span><span class=identifier>skip_t </span><span class=identifier>skip</span><span class=special>;
  163. </span><span class=identifier>skip_t </span><span class=keyword>const</span><span class=special>&amp;
  164. </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>skip</span><span class=special>; }
  165. };
  166. };</span></code></pre>
  167. <p> Ughhh! How did I do that? How was I able to get at the complex typedef? Am
  168. I insane? Well, not really... there's a trick! What you do is define the typedef
  169. <tt>skip_t</tt> first as int:</p>
  170. <pre>
  171. <code><span class=keyword>typedef </span><span class=keyword>int </span><span class=identifier>skip_t</span><span class=special>;</span></code></pre>
  172. <p> Try to compile. Then, the compiler will generate an obnoxious error message
  173. such as:</p>
  174. <pre>
  175. <code><span class=string>&quot;cannot convert boost::spirit::alternative&lt;... blah blah...to int&quot;</span><span class=special>.</span></code></pre>
<p> <strong>THERE YOU GO!</strong> You got its type! I just copy and paste the
correct type (removing explicit qualifications, if preferred).</p>
  178. <p> Can we still go further? Yes. Remember that the grammar was designed for rules.
  179. The nested template definition class is needed to get around the rule's limitations.
  180. Without rules, I propose a new class called <tt>sub_grammar</tt>, the grammar's
  181. low-fat counterpart:</p>
  182. <pre><code><span class=special> </span><span class=keyword>namespace </span><span class=identifier>boost </span><span class=special>{ </span><span class=keyword>namespace </span><span class=identifier>spirit
  183. </span><span class=special>{
  184. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>DerivedT</span><span class=special>&gt;
  185. </span><span class=keyword>struct </span><span class=identifier>sub_grammar </span><span class=special>: </span><span class=identifier>parser</span><span class=special>&lt;</span><span class=identifier>DerivedT</span><span class=special>&gt;
  186. {
  187. </span><span class=keyword>typedef </span><span class=identifier>sub_grammar </span><span class=identifier>self_t</span><span class=special>;
  188. </span><span class=keyword>typedef </span><span class=identifier>DerivedT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>embed_t</span><span class=special>;
  189. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  190. </span><span class=keyword>struct </span><span class=identifier>result
  191. </span><span class=special>{
  192. </span><span class=keyword>typedef </span><span class=keyword>typename </span><span class=identifier>parser_result</span><span class=special>&lt;
  193. </span><span class=keyword>typename </span><span class=identifier>DerivedT</span><span class=special>::</span><span class=identifier>start_t</span><span class=special>, </span><span class=identifier>ScannerT</span><span class=special>&gt;::</span><span class=identifier>type
  194. </span><span class=identifier>type</span><span class=special>;
  195. };
  196. </span><span class=identifier>DerivedT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>derived</span><span class=special>() </span><span class=keyword>const
  197. </span><span class=special>{ </span><span class=keyword>return </span><span class=special>*</span><span class=keyword>static_cast</span><span class=special>&lt;</span><span class=identifier>DerivedT </span><span class=keyword>const</span><span class=special>*&gt;(</span><span class=keyword>this</span><span class=special>); }
  198. </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
  199. </span><span class=keyword>typename </span><span class=identifier>parser_result</span><span class=special>&lt;</span><span class=identifier>self_t</span><span class=special>, </span><span class=identifier>ScannerT</span><span class=special>&gt;::</span><span class=identifier>type
  200. </span><span class=identifier>parse</span><span class=special>(</span><span class=identifier>ScannerT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>scan</span><span class=special>) </span><span class=keyword>const
  201. </span><span class=special>{
  202. </span><span class=keyword>return </span><span class=identifier>derived</span><span class=special>().</span><span class=identifier>start</span><span class=special>.</span><span class=identifier>parse</span><span class=special>(</span><span class=identifier>scan</span><span class=special>);
  203. }
  204. };
  205. }}</span></code></pre>
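<p>The forwarding pattern used by <tt>sub_grammar</tt> can be tried out in isolation.
The following is a stripped-down sketch in plain C++ with no Spirit headers;
<tt>toy_scanner</tt>, <tt>digit_parser</tt>, <tt>plus_parser</tt>, <tt>toy_sub_grammar</tt> and
<tt>number_grammar</tt> are hypothetical names, and the parsers return a plain <tt>bool</tt>
instead of a match object.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct toy_scanner
{
    std::string::const_iterator first, last;
};

// Matches a single decimal digit.
struct digit_parser
{
    bool parse(toy_scanner& scan) const
    {
        if (scan.first != scan.last && *scan.first >= '0' && *scan.first <= '9')
        {
            ++scan.first;
            return true;
        }
        return false;
    }
};

// Matches its subject one or more times.
template <typename SubjectT>
struct plus_parser
{
    bool parse(toy_scanner& scan) const
    {
        if (!subject.parse(scan))
            return false;
        while (subject.parse(scan)) {}
        return true;
    }

    SubjectT subject;
};

// CRTP base: parse() forwards to the derived class's `start` member.
// No virtual functions are involved, so the call can be inlined.
template <typename DerivedT>
struct toy_sub_grammar
{
    DerivedT const& derived() const
    { return *static_cast<DerivedT const*>(this); }

    bool parse(toy_scanner& scan) const
    { return derived().start.parse(scan); }
};

// A grammar only has to provide start_t and a start member.
struct number_grammar : toy_sub_grammar<number_grammar>
{
    typedef plus_parser<digit_parser> start_t;
    number_grammar() : start() {}
    start_t start;
};

// Returns how many leading digits the grammar consumed.
std::size_t count_digits(std::string const& input)
{
    number_grammar g;
    toy_scanner scan = { input.begin(), input.end() };
    std::string::const_iterator begin = scan.first;
    if (!g.parse(scan))
        return 0;
    return static_cast<std::size_t>(scan.first - begin);
}
```

<p>The base class never needs to know the type of <tt>start</tt>; it only relies on the
derived class exposing a member with that name, which is resolved at compile time.</p>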
  206. <p>With the <tt>sub_grammar</tt> class, we can define our skipper grammar this
  207. way (see <a href="../example/techniques/no_rules/no_rule3.cpp">no_rule3.cpp</a>):</p>
  208. <pre><code><span class=special> </span><span class=keyword>struct </span><span class=identifier>skip_grammar </span><span class=special>: </span><span class=identifier>sub_grammar</span><span class=special>&lt;</span><span class=identifier>skip_grammar</span><span class=special>&gt;
  209. {
  210. </span><span class=keyword>typedef
  211. </span><span class=identifier>alternative</span><span class=special>&lt;</span><span class=identifier>alternative</span><span class=special>&lt;</span><span class=identifier>space_parser</span><span class=special>, </span><span class=identifier>sequence</span><span class=special>&lt;</span><span class=identifier>sequence</span><span class=special>&lt;
  212. </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt;, </span><span class=identifier>kleene_star</span><span class=special>&lt;</span><span class=identifier>difference</span><span class=special>&lt;</span><span class=identifier>anychar_parser</span><span class=special>,
  213. </span><span class=identifier>chlit</span><span class=special>&lt;</span><span class=keyword>char</span><span class=special>&gt; &gt; &gt; &gt;, </span><span class=identifier>chlit</span><span class=special>&lt;</span><span class=keyword>char</span><span class=special>&gt; &gt; &gt;, </span><span class=identifier>sequence</span><span class=special>&lt;</span><span class=identifier>sequence</span><span class=special>&lt;
  214. </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt;, </span><span class=identifier>kleene_star</span><span class=special>&lt;</span><span class=identifier>difference</span><span class=special>&lt;</span><span class=identifier>anychar_parser</span><span class=special>,
  215. </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt; &gt; &gt; &gt;, </span><span class=identifier>strlit</span><span class=special>&lt;</span><span class=keyword>const </span><span class=keyword>char</span><span class=special>*&gt; &gt; &gt;
  216. </span><span class=identifier>start_t</span><span class=special>;
  217. </span><span class=identifier>skip_grammar</span><span class=special>()
  218. : </span><span class=identifier>start
  219. </span><span class=special>(
  220. </span><span class=identifier>space_p
  221. </span><span class=special>| </span><span class=string>&quot;//&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=literal>'\n'</span><span class=special>) &gt;&gt; </span><span class=literal>'\n'
  222. </span><span class=special>| </span><span class=string>&quot;/*&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=string>&quot;*/&quot;</span><span class=special>) &gt;&gt; </span><span class=string>&quot;*/&quot;
  223. </span><span class=special>)
  224. {}
  225. </span><span class=identifier>start_t </span><span class=identifier>start</span><span class=special>;
  226. };</span></code></pre>
<p>But what for, you ask? You can simply use the <tt>start_t</tt> type above as-is
(named <tt>skipper_t</tt> here). It's already a parser! Can't we just type:</p>
<pre>
<code><span class=identifier>skipper_t </span><span class=identifier>skipper </span><span class=special>=
</span><span class=identifier>space_p
</span><span class=special>| </span><span class=string>&quot;//&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=literal>'\n'</span><span class=special>) &gt;&gt; </span><span class=literal>'\n'
</span><span class=special>| </span><span class=string>&quot;/*&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=string>&quot;*/&quot;</span><span class=special>) &gt;&gt; </span><span class=string>&quot;*/&quot;</span>
<span class=special> ;</span></code></pre>
<p> and use <tt>skipper</tt> just as we would any parser? Well, a subtle difference
is that <tt>skipper</tt>, used this way, will be embedded <strong>by value</strong> when
you compose more complex parsers using it. That is, if we use <tt>skipper</tt>
inside another production, the whole thing will be stored in the composite.
Heavy!</p>
  239. <p> The proposed <tt>sub_grammar</tt> OTOH will be held by reference. Note:</p>
  240. <pre><code> <span class=keyword>typedef </span><span class=identifier>DerivedT </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>embed_t</span><span class=special>;</span></code></pre>
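<p>The effect of <tt>embed_t</tt> on the size of a composite can be seen with a small
experiment. This is only a sketch: <tt>heavy_by_value</tt>, <tt>heavy_by_reference</tt> and
<tt>toy_sequence</tt> are hypothetical types, and the 256-byte payload is an arbitrary
stand-in for a large parser expression.</p>

```cpp
#include <cassert>
#include <cstddef>

// A deliberately large parser whose embed_t asks composites to store a copy.
struct heavy_by_value
{
    typedef heavy_by_value embed_t;
    char payload[256];
};

// The same parser, but embed_t asks composites to store a reference instead.
struct heavy_by_reference
{
    typedef heavy_by_reference const& embed_t;
    char payload[256];
};

// A composite that stores each operand the way the operand's embed_t requests.
template <typename A, typename B>
struct toy_sequence
{
    toy_sequence(A const& a, B const& b) : a(a), b(b) {}
    typename A::embed_t a;
    typename B::embed_t b;
};

std::size_t size_by_value()
{
    return sizeof(toy_sequence<heavy_by_value, heavy_by_value>);
}

std::size_t size_by_reference()
{
    return sizeof(toy_sequence<heavy_by_reference, heavy_by_reference>);
}
```

<p>The by-value composite carries both payloads, while the by-reference one holds only two
references. The price of the second form is a lifetime requirement: the referenced
parsers must outlive every composite that embeds them.</p>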
<p>The proposed <tt>sub_grammar</tt> does not have the inherent limitations of
rules, is very lightweight, and should be blazingly fast (it can be fully inlined
and does not use virtual functions). Perhaps this class will be part of a future
Spirit release. </p>
  245. <table width="80%" border="0" align="center">
  246. <tr>
  247. <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <strong>The
  248. no-rules result</strong><br> <br>
So, how much did we save? On MSVC 7.1, the original code, <a href="../example/techniques/no_rules/no_rule1.cpp">no_rule1.cpp</a>,
compiles to <strong>28k</strong>. Eliding rules, <a href="../example/techniques/no_rules/no_rule2.cpp">no_rule2.cpp</a>
compiles to <strong>24k</strong>. Not bad: we shaved off 4k, a 14%
reduction. But you'll be in for a surprise. The last version, using the
sub-grammar, <a href="../example/techniques/no_rules/no_rule3.cpp">no_rule3.cpp</a>,
compiles to <strong>5.5k</strong>! That's a whopping 80% reduction from the original.<br>
  255. <br>
  256. <table width="100%" border="1">
  257. <tr>
  258. <td><a href="../example/techniques/no_rules/no_rule1.cpp">no_rule1.cpp</a></td>
  259. <td><strong>28k</strong></td>
  260. <td>standard rule and grammar</td>
  261. </tr>
  262. <tr>
  263. <td><a href="../example/techniques/no_rules/no_rule2.cpp">no_rule2.cpp</a></td>
  264. <td><strong>24k</strong></td>
  265. <td>standard grammar, no rule</td>
  266. </tr>
  267. <tr>
  268. <td><a href="../example/techniques/no_rules/no_rule3.cpp">no_rule3.cpp</a></td>
  269. <td><strong>5.5k</strong></td>
  270. <td>sub_grammar, no rule, no grammar</td>
  271. </tr>
  272. </table> </td>
  273. </tr>
  274. </table>
  275. <h3><b> <a name="typeof" id="typeof"></a> typeof</b></h3>
  276. <p>Some compilers already support the <tt>typeof</tt> keyword. Examples are g++
  277. and Metrowerks CodeWarrior. Someday, <tt>typeof</tt> will become commonplace.
  278. It is worth noting that we can use <tt>typeof</tt> to define non-recursive rules
  279. without using the rule class. To give an example, we'll use the skipper example
  280. above; this time using <tt>typeof</tt>. First, to avoid redundancy, we'll introduce
  281. a macro <tt>RULE</tt>: </p>
  282. <pre><code> <span class=preprocessor>#define </span><span class=identifier>RULE</span><span class=special>(</span><span class=identifier>name</span><span class=special>, </span><span class=identifier>definition</span><span class=special>) </span><span class="keyword">typeof</span><span class=special>(</span><span class=identifier>definition</span><span class=special>) </span><span class=identifier>name </span><span class=special>= </span><span class=identifier>definition</span></code></pre>
  283. <p>Then, simply:</p>
  284. <pre><code><span class=identifier> </span><span class=identifier>RULE</span><span class=special>(
  285. </span><span class=identifier>skipper</span><span class=special>,
  286. ( </span><span class=identifier>space_p
  287. </span><span class=special>| </span><span class=string>&quot;//&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=literal>'\n'</span><span class=special>) &gt;&gt; </span><span class=literal>'\n'
  288. </span><span class=special>| </span><span class=string>&quot;/*&quot; </span><span class=special>&gt;&gt; *(</span><span class=identifier>anychar_p </span><span class=special>- </span><span class=string>&quot;*/&quot;</span><span class=special>) &gt;&gt; </span><span class=string>&quot;*/&quot;
  289. </span><span class=special>)
  290. );</span></code></pre>
  291. <p>(see <a href="../example/techniques/typeof.cpp">typeof.cpp</a>)</p>
<p>That's it! Now you can use <tt>skipper</tt> just as you would any parser. Be reminded,
however, that <tt>skipper</tt> above will be embedded by value when
you compose more complex parsers using it (see <tt>sub_grammar</tt> rationale above). You can use the <tt>sub_grammar</tt> class to avoid this problem.</p>
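<p>On compilers that implement C++11, the standard <tt>decltype</tt> and <tt>auto</tt>
keywords can play the role of <tt>typeof</tt>. The sketch below shows the same macro idea
with <tt>decltype</tt>, using a handful of toy combinators so that it compiles without
Spirit; <tt>toy_scanner</tt>, <tt>parser</tt>, <tt>char_parser</tt>, <tt>sequence_parser</tt> and
<tt>alternative_parser</tt> are hypothetical stand-ins, not Spirit classes.</p>

```cpp
#include <cassert>
#include <string>

struct toy_scanner
{
    std::string::const_iterator first, last;
};

// CRTP tag base so the operators below only apply to our parsers.
template <typename DerivedT>
struct parser
{
    DerivedT const& derived() const
    { return *static_cast<DerivedT const*>(this); }
};

struct char_parser : parser<char_parser>
{
    explicit char_parser(char ch) : ch(ch) {}

    bool parse(toy_scanner& scan) const
    {
        if (scan.first != scan.last && *scan.first == ch)
        {
            ++scan.first;
            return true;
        }
        return false;
    }

    char ch;
};

template <typename A, typename B>
struct sequence_parser : parser<sequence_parser<A, B> >
{
    sequence_parser(A const& a, B const& b) : a(a), b(b) {}

    bool parse(toy_scanner& scan) const
    {
        std::string::const_iterator save = scan.first;
        if (a.parse(scan) && b.parse(scan))
            return true;
        scan.first = save;   // backtrack on failure
        return false;
    }

    A a;
    B b;
};

template <typename A, typename B>
struct alternative_parser : parser<alternative_parser<A, B> >
{
    alternative_parser(A const& a, B const& b) : a(a), b(b) {}
    bool parse(toy_scanner& scan) const { return a.parse(scan) || b.parse(scan); }
    A a;
    B b;
};

template <typename A, typename B>
sequence_parser<A, B> operator>>(parser<A> const& a, parser<B> const& b)
{ return sequence_parser<A, B>(a.derived(), b.derived()); }

template <typename A, typename B>
alternative_parser<A, B> operator|(parser<A> const& a, parser<B> const& b)
{ return alternative_parser<A, B>(a.derived(), b.derived()); }

// Same shape as the typeof-based macro, spelled with standard decltype.
#define RULE(name, definition) decltype(definition) name = definition

bool match_ab_or_c(std::string const& input)
{
    // r has the full expression type; no rule class and no virtual calls.
    RULE(r, (char_parser('a') >> char_parser('b') | char_parser('c')));
    toy_scanner scan = { input.begin(), input.end() };
    return r.parse(scan);
}
```

<p>With C++11 the macro is not strictly needed, since <tt>auto r = ...;</tt> deduces the
same type. Either way, the caveat about by-value embedding still applies.</p>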
  295. <h3><a name="nabialek_trick"></a> Nabialek trick</h3>
<p>This technique, which I'll call the <strong><em>&quot;Nabialek trick&quot;</em></strong> (after its inventor, Sam Nabialek), can improve rule dispatch from linear non-deterministic to deterministic. The trick applies to grammars where a keyword (operator, etc.) precedes a production. There are lots of grammars similar to this:</p>
  297. <pre> <span class=identifier>r </span><span class=special>=
  298. </span><span class=identifier>keyword1 </span><span class=special>&gt;&gt; </span><span class=identifier>production1
  299. </span><span class=special>| </span><span class=identifier>keyword2 </span><span class=special>&gt;&gt; </span><span class=identifier>production2
  300. </span><span class=special>| </span><span class=identifier>keyword3 </span><span class=special>&gt;&gt; </span><span class=identifier>production3
  301. </span><span class=special>| </span><span class=identifier>keyword4 </span><span class=special>&gt;&gt; </span><span class=identifier>production4
  302. </span><span class=special>| </span><span class=identifier>keyword5 </span><span class=special>&gt;&gt; </span><span class=identifier>production5
  303. </span><span class=comment>/*** etc ***/
  304. </span><span class=special>;</span></pre>
  305. <p>The cascaded alternatives are tried one at a time through trial and error until something matches. The Nabialek trick takes advantage of the <a href="symbols.html">symbol table</a>'s search properties to optimize the dispatching of the alternatives. For an example, see <a href="../example/techniques/nabialek.cpp">nabialek.cpp</a>. The grammar works as follows. There are two rules (<tt>one</tt> and <tt>two</tt>). When &quot;one&quot; is recognized, rule <tt>one</tt> is invoked. When &quot;two&quot; is recognized, rule <tt>two</tt> is invoked. Here's the grammar:</p>
  306. <pre><span class=special> </span><span class=identifier>one </span><span class=special>= </span><span class=identifier>name</span><span class=special>;
  307. </span><span class=identifier>two </span><span class=special>= </span><span class=identifier>name </span><span class=special>&gt;&gt; </span><span class=literal>',' </span><span class=special>&gt;&gt; </span><span class=identifier>name</span><span class=special>;
  308. </span><span class=identifier>continuations</span><span class=special>.</span><span class=identifier>add
  309. </span><span class=special>(</span><span class=string>&quot;one&quot;</span><span class=special>, &amp;</span><span class=identifier>one</span><span class=special>)
  310. </span><span class=special>(</span><span class=string>&quot;two&quot;</span><span class=special>, &amp;</span><span class=identifier>two</span><span class=special>)
  311. </span><span class=special>;
  312. </span><span class=identifier>line </span><span class=special>= </span><span class=identifier>continuations</span><span class=special>[</span><span class=identifier>set_rest</span><span class=special>&lt;</span><span class=identifier>rule_t</span><span class=special>&gt;(</span><span class=identifier>rest</span><span class=special>)] </span><span class=special>&gt;&gt; </span><span class=identifier>rest</span><span class=special>;</span></pre>
  313. <p>where <tt>continuations</tt> is a <a href="symbols.html">symbol table</a> whose slots hold pointers to <tt>rule_t</tt>, and <tt>one</tt>, <tt>two</tt>, <tt>name</tt>, <tt>line</tt> and <tt>rest</tt> are rules:</p>
  314. <pre><span class=special> </span><span class=identifier>rule_t </span><span class=identifier>name</span><span class=special>;
  315. </span><span class=identifier>rule_t </span><span class=identifier>line</span><span class=special>;
  316. </span><span class=identifier>rule_t </span><span class=identifier>rest</span><span class=special>;
  317. </span><span class=identifier>rule_t </span><span class=identifier>one</span><span class=special>;
  318. </span><span class=identifier>rule_t </span><span class=identifier>two</span><span class=special>;
  319. </span><span class=identifier>symbols</span><span class=special>&lt;</span><span class=identifier>rule_t</span><span class=special>*&gt; </span><span class=identifier>continuations</span><span class=special>;</span></pre>
  320. <p><tt>set_rest</tt>, the semantic action attached to <tt>continuations</tt>, is:</p>
  321. <pre><span class=special> </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>Rule</span><span class=special>&gt;
  322. </span><span class=keyword>struct </span><span class=identifier>set_rest
  323. </span><span class=special>{
  324. </span><span class=identifier>set_rest</span><span class=special>(</span><span class=identifier>Rule</span><span class=special>&amp; </span><span class=identifier>the_rule</span><span class=special>)
  325. </span><span class=special>: </span><span class=identifier>the_rule</span><span class=special>(</span><span class=identifier>the_rule</span><span class=special>) </span><span class=special>{}
  326. </span><span class=keyword>void </span><span class=keyword>operator</span><span class=special>()(</span><span class=identifier>Rule</span><span class=special>* </span><span class=identifier>newRule</span><span class=special>) </span><span class=keyword>const
  327. </span><span class=special>{ </span><span class=identifier>the_rule </span><span class=special>= </span><span class=special>*</span><span class=identifier>newRule</span><span class=special>; </span><span class=special>}
  328. </span><span class=identifier>Rule</span><span class=special>&amp; </span><span class=identifier>the_rule</span><span class=special>;
  329. </span><span class=special>};</span></pre>
  330. <p>Notice how the <tt>rest</tt> rule gets set dynamically when the <tt>set_rest</tt> action is called. The dynamic grammar parses inputs such as:</p>
  331. <p> &quot;one only&quot;<br>
  332. &quot;one again&quot;<br>
  333. &quot;two first, second&quot;</p>
  334. <p>The cool part is that the <tt>rest</tt> rule is set (by the <tt>set_rest</tt> action) depending on what the symbol table got. If it got a <em>&quot;one&quot;</em> then rest = one. If it got <em>&quot;two&quot;</em>, then rest = two. Very nifty! This technique should be very fast, especially when there are lots of keywords. It would be nice to add special facilities to make this easy to use. I imagine:</p>
  335. <pre><span class=special> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>keywords </span><span class=special>&gt;&gt; </span><span class=identifier>rest</span><span class=special>;</span></pre>
  336. <p>where <tt>keywords</tt> is a special parser (based on the symbol table) that automatically sets its RHS (rest) depending on the acquired symbol. This, I think, is mighty cool! Someday perhaps... </p>
  337. <p><img src="theme/note.gif" width="16" height="16"> Also, see the <a href="switch_parser.html">switch parser</a> for another deterministic parsing trick for character/token prefixes. </p>
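<p>The dispatch idea can also be sketched without Spirit. The code below is a minimal stand-in under stated assumptions, not the contents of nabialek.cpp: <tt>cursor</tt>, <tt>skip_space</tt>, <tt>comma</tt> and <tt>parse_line</tt> are hypothetical helpers, <tt>rule_t</tt> is a plain function pointer rather than a Spirit rule, and a <tt>std::map</tt> keyed on whole words takes the place of the symbol table. The point it illustrates is the same: the keyword is looked up once, and the result of the lookup decides which rule parses the rest of the line.</p>

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical input position, standing in for a Spirit scanner.
struct cursor {
    const char* first;
    const char* last;
};

void skip_space(cursor& c) {
    while (c.first != c.last && std::isspace(static_cast<unsigned char>(*c.first)))
        ++c.first;
}

// name: one or more letters, preceded by optional white space.
bool name(cursor& c) {
    skip_space(c);
    const char* start = c.first;
    while (c.first != c.last && std::isalpha(static_cast<unsigned char>(*c.first)))
        ++c.first;
    return c.first != start;
}

bool comma(cursor& c) {
    skip_space(c);
    if (c.first != c.last && *c.first == ',') { ++c.first; return true; }
    return false;
}

// The two productions from the grammar above.
bool one(cursor& c) { return name(c); }                          // one = name
bool two(cursor& c) { return name(c) && comma(c) && name(c); }   // two = name >> ',' >> name

typedef bool (*rule_t)(cursor&);  // assumption: a rule is just a function here

// line = continuations[set_rest] >> rest
bool parse_line(const std::string& input) {
    static const std::map<std::string, rule_t> continuations = {
        {"one", &one},
        {"two", &two},
    };

    cursor c{input.data(), input.data() + input.size()};

    // Read the leading keyword.
    skip_space(c);
    const char* start = c.first;
    while (c.first != c.last && std::isalpha(static_cast<unsigned char>(*c.first)))
        ++c.first;

    // A single lookup replaces the chain of alternatives.
    auto it = continuations.find(std::string(start, c.first));
    if (it == continuations.end()) return false;

    rule_t rest = it->second;  // what the set_rest action does
    if (!rest(c)) return false;

    skip_space(c);
    return c.first == c.last;  // require the whole line to be consumed
}
```

<p>With this sketch, <tt>parse_line</tt> accepts &quot;one only&quot;, &quot;one again&quot; and &quot;two first, second&quot;, and rejects &quot;two first&quot; (missing second name) and &quot;three x&quot; (unknown keyword). Note that the whole-word <tt>std::map</tt> lookup is a simplification chosen for brevity, so it does not reproduce the symbol table's own matching behavior.</p>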
  339. <table border="0">
  340. <tr>
  341. <td width="10"></td>
  342. <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
  343. <td width="30"><a href="style_guide.html"><img src="theme/l_arr.gif" border="0"></a></td>
  344. <td width="30"><a href="faq.html"><img src="theme/r_arr.gif" border="0"></a></td>
  345. </tr>
  346. </table>
  347. <br>
  348. <hr size="1">
  349. <p class="copyright">Copyright &copy; 1998-2003 Joel de Guzman<br>
  350. <br>
  351. <font size="2">Use, modification and distribution is subject to the Boost Software
  352. License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
  353. http://www.boost.org/LICENSE_1_0.txt)</font></p>
  354. <p class="copyright">&nbsp;</p>
  355. </body>
  356. </html>