<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Warming up</title>
<link rel="stylesheet" href="../../../../../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../../index.html" title="Spirit X3 3.0.4">
<link rel="up" href="../tutorials.html" title="Tutorials">
<link rel="prev" href="quick_start.html" title="Quick Start">
<link rel="next" href="semantic_actions.html" title="Parser Semantic Actions">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="quick_start.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../tutorials.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="semantic_actions.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="spirit_x3.tutorials.warming_up"></a><a class="link" href="warming_up.html" title="Warming up">Warming up</a>
</h3></div></div></div>
<p>
We'll start with a few parser expressions to give you a feel for how to
build parsers, beginning with the simplest one and building up as we go. When
comparing EBNF to <a href="http://boost-spirit.com" target="_top">Spirit</a>, the
expressions may seem awkward at first. <a href="http://boost-spirit.com" target="_top">Spirit</a>
relies heavily on operator overloading to accomplish its magic.
</p>
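<p>
To make the role of operator overloading concrete, here is a minimal, self-contained
sketch of how <code class="computeroutput">&gt;&gt;</code> and unary
<code class="computeroutput">*</code> can be overloaded to compose parsers. The
<code class="computeroutput">Parser</code> type and the helpers
<code class="computeroutput">digit</code>, <code class="computeroutput">lit</code> and
<code class="computeroutput">matches_fully</code> are hypothetical names invented for this
illustration. This is not how Spirit is implemented; it only shows the general idea.
</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Toy parser type (an assumption for illustration, NOT Spirit's design).
// run() tries to match at `pos`; on success it advances `pos` and returns true.
struct Parser {
    std::function<bool(const std::string&, std::size_t&)> run;
};

// Matches a single decimal digit.
Parser digit() {
    return {[](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
            ++pos;
            return true;
        }
        return false;
    }};
}

// Matches exactly the character c.
Parser lit(char c) {
    return {[c](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && s[pos] == c) {
            ++pos;
            return true;
        }
        return false;
    }};
}

// a >> b : "a followed by b". On failure the position is restored.
Parser operator>>(Parser a, Parser b) {
    return {[a, b](const std::string& s, std::size_t& pos) {
        std::size_t save = pos;
        if (a.run(s, pos) && b.run(s, pos))
            return true;
        pos = save;
        return false;
    }};
}

// *a : zero or more repetitions of a. Always succeeds.
// (Assumes `a` consumes input whenever it succeeds.)
Parser operator*(Parser a) {
    return {[a](const std::string& s, std::size_t& pos) {
        while (a.run(s, pos)) {
        }
        return true;
    }};
}

// True only if p matches and consumes the whole input.
bool matches_fully(const Parser& p, const std::string& input) {
    std::size_t pos = 0;
    return p.run(input, pos) && pos == input.size();
}
```

<p>
With these overloads, <code class="computeroutput">digit() &gt;&gt; *(lit(',') &gt;&gt; digit())</code>
is an ordinary C++ expression whose value is one composed parser, which is the same shape
the Spirit expressions in this section take.
</p>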
<h5>
<a name="spirit_x3.tutorials.warming_up.h0"></a>
<span class="phrase"><a name="spirit_x3.tutorials.warming_up.trivial_example__1_parsing_a_number"></a></span><a class="link" href="warming_up.html#spirit_x3.tutorials.warming_up.trivial_example__1_parsing_a_number">Trivial
Example #1 Parsing a number</a>
</h5>
<p>
Create a parser that will parse a floating-point number.
</p>
<pre class="programlisting"><span class="identifier">double_</span>
</pre>
<p>
(You've got to admit, that's trivial!) The above code actually generates
a Spirit floating-point parser (a built-in parser). Spirit has many predefined
parsers, and consistent naming conventions help keep you from going insane!
</p>
<h5>
<a name="spirit_x3.tutorials.warming_up.h1"></a>
<span class="phrase"><a name="spirit_x3.tutorials.warming_up.trivial_example__2_parsing_two_numbers"></a></span><a class="link" href="warming_up.html#spirit_x3.tutorials.warming_up.trivial_example__2_parsing_two_numbers">Trivial
Example #2 Parsing two numbers</a>
</h5>
<p>
Create a parser that will accept a line consisting of two floating-point
numbers.
</p>
<pre class="programlisting"><span class="identifier">double_</span> <span class="special">&gt;&gt;</span> <span class="identifier">double_</span>
</pre>
<p>
Here you see the familiar floating-point numeric parser <code class="computeroutput"><span class="identifier">double_</span></code>
used twice, once for each number. What's that <code class="computeroutput"><span class="special">&gt;&gt;</span></code>
operator doing in there? Well, they had to be separated by something, and
this was chosen as the "followed by" sequence operator. The above
expression creates a parser from two simpler parsers, gluing them together
with the sequence operator. The result is a parser that is a composition
of smaller parsers. Whitespace between the numbers can be consumed implicitly,
depending on how the parser is invoked (see below).
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
When we combine parsers, we end up with a "bigger" parser, but
it's still a parser. Parsers can get bigger and bigger, nesting more and
more, but whenever you glue two parsers together, you end up with one bigger
parser. This is an important concept.
</p></td></tr>
</table></div>
<h5>
<a name="spirit_x3.tutorials.warming_up.h2"></a>
<span class="phrase"><a name="spirit_x3.tutorials.warming_up.trivial_example__3_parsing_zero_or_more_numbers"></a></span><a class="link" href="warming_up.html#spirit_x3.tutorials.warming_up.trivial_example__3_parsing_zero_or_more_numbers">Trivial
Example #3 Parsing zero or more numbers</a>
</h5>
<p>
Create a parser that will accept zero or more floating-point numbers.
</p>
<pre class="programlisting"><span class="special">*</span><span class="identifier">double_</span>
</pre>
<p>
This is like a regular-expression Kleene Star, though the syntax might look
a bit odd to a C++ programmer not used to seeing the <code class="computeroutput"><span class="special">*</span></code>
operator overloaded like this. Actually, if you know regular expressions,
it may look odd too, since the star comes before the expression it modifies.
C'est la vie. Blame it on the fact that we must work with the syntax rules
of C++.
</p>
<p>
Any expression that evaluates to a parser may be used with the Kleene Star.
Keep in mind that C++ operator precedence rules may require you to put complex
expressions in parentheses. The Kleene Star is also known as
a Kleene Closure, but we call it the Star in most places.
</p>
<h5>
<a name="spirit_x3.tutorials.warming_up.h3"></a>
<span class="phrase"><a name="spirit_x3.tutorials.warming_up.trivial_example__4_parsing_a_comma_delimited_list_of_numbers"></a></span><a class="link" href="warming_up.html#spirit_x3.tutorials.warming_up.trivial_example__4_parsing_a_comma_delimited_list_of_numbers">Trivial
Example #4 Parsing a comma-delimited list of numbers</a>
</h5>
<p>
This example will create a parser that accepts a comma-delimited list of
numbers.
</p>
<pre class="programlisting"><span class="identifier">double_</span> <span class="special">&gt;&gt;</span> <span class="special">*(</span><span class="identifier">char_</span><span class="special">(</span><span class="char">','</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">double_</span><span class="special">)</span>
</pre>
<p>
Notice <code class="computeroutput"><span class="identifier">char_</span><span class="special">(</span><span class="char">','</span><span class="special">)</span></code>. It is a
literal character parser that recognizes the comma <code class="computeroutput"><span class="char">','</span></code>.
In this case, the Kleene Star is modifying a more complex parser, namely,
the one generated by the expression:
</p>
<pre class="programlisting"><span class="special">(</span><span class="identifier">char_</span><span class="special">(</span><span class="char">','</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="identifier">double_</span><span class="special">)</span>
</pre>
<p>
Note that this is a case where the parentheses are necessary. Unary <code class="computeroutput"><span class="special">*</span></code>
binds more tightly than <code class="computeroutput"><span class="special">&gt;&gt;</span></code> in C++, so without
the parentheses the Kleene Star would apply only to <code class="computeroutput"><span class="identifier">char_</span><span class="special">(</span><span class="char">','</span><span class="special">)</span></code>
rather than to the complete expression above.
</p>
<h5>
<a name="spirit_x3.tutorials.warming_up.h4"></a>
<span class="phrase"><a name="spirit_x3.tutorials.warming_up.let_s_parse_"></a></span><a class="link" href="warming_up.html#spirit_x3.tutorials.warming_up.let_s_parse_">Let's
Parse!</a>
</h5>
<p>
We're done defining the parser. The next step is to invoke it
to do its work. There are a couple of ways to do this. For now, we
will use the <code class="computeroutput"><span class="identifier">phrase_parse</span></code>
function. One overload of this function accepts four arguments:
</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
An iterator pointing to the start of the input
</li>
<li class="listitem">
An iterator pointing to one past the end of the input
</li>
<li class="listitem">
The parser object
</li>
<li class="listitem">
Another parser called the skip parser
</li>
</ol></div>
<p>
In our example, we wish to skip whitespace such as spaces and tabs. Spirit's repertoire
of predefined parsers includes one named <code class="computeroutput"><span class="identifier">space</span></code>,
a very simple parser that recognizes
whitespace. We will use <code class="computeroutput"><span class="identifier">space</span></code>
as our skip parser. The skip parser is responsible for skipping characters
in between parser elements such as <code class="computeroutput"><span class="identifier">double_</span></code>
and <code class="computeroutput"><span class="identifier">char_</span></code>.
</p>
<p>
OK, so now let's parse!
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Iterator</span><span class="special">&gt;</span>
<span class="keyword">bool</span> <span class="identifier">parse_numbers</span><span class="special">(</span><span class="identifier">Iterator</span> <span class="identifier">first</span><span class="special">,</span> <span class="identifier">Iterator</span> <span class="identifier">last</span><span class="special">)</span>
<span class="special">{</span>
    <span class="keyword">using</span> <span class="identifier">x3</span><span class="special">::</span><span class="identifier">double_</span><span class="special">;</span>
    <span class="keyword">using</span> <span class="identifier">x3</span><span class="special">::</span><span class="identifier">phrase_parse</span><span class="special">;</span>
    <span class="keyword">using</span> <span class="identifier">ascii</span><span class="special">::</span><span class="identifier">space</span><span class="special">;</span>
    <span class="keyword">bool</span> <span class="identifier">r</span> <span class="special">=</span> <span class="identifier">phrase_parse</span><span class="special">(</span>
        <span class="identifier">first</span><span class="special">,</span> <span class="comment">// Start Iterator</span>
        <span class="identifier">last</span><span class="special">,</span> <span class="comment">// End Iterator</span>
        <span class="identifier">double_</span> <span class="special">&gt;&gt;</span> <span class="special">*(</span><span class="char">','</span> <span class="special">&gt;&gt;</span> <span class="identifier">double_</span><span class="special">),</span> <span class="comment">// The Parser</span>
        <span class="identifier">space</span> <span class="comment">// The Skip-Parser</span>
    <span class="special">);</span>
    <span class="keyword">if</span> <span class="special">(</span><span class="identifier">first</span> <span class="special">!=</span> <span class="identifier">last</span><span class="special">)</span> <span class="comment">// fail if we did not get a full match</span>
        <span class="keyword">return</span> <span class="keyword">false</span><span class="special">;</span>
    <span class="keyword">return</span> <span class="identifier">r</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
The <code class="computeroutput"><span class="identifier">phrase_parse</span></code> function returns <code class="computeroutput"><span class="keyword">true</span></code>
or <code class="computeroutput"><span class="keyword">false</span></code> depending on the result
of the parse. The first iterator is passed by reference. On a successful
parse, this iterator is repositioned to the rightmost position consumed by
the parser. If it becomes equal to <code class="computeroutput"><span class="identifier">last</span></code>,
then we have a full match. If not, then we have a partial match. A partial
match happens when the parser is able to parse only a portion of the input.
</p>
<p>
Note that we inlined the parser directly in the call to <code class="computeroutput"><span class="identifier">phrase_parse</span></code>. Upon calling
<code class="computeroutput"><span class="identifier">phrase_parse</span></code>, the expression evaluates into a temporary, unnamed parser which is
passed into the function, used, and then destroyed.
</p>
<p>
Here, we opted to make the parsing function generic by making it a template, parameterized
by the iterator type. By doing so, it can take in data coming from any STL-conforming
sequence, as long as its iterators conform to a forward iterator.
</p>
<p>
You can find the complete source file here: <a href="../../../../../example/x3/num_list/num_list1.cpp" target="_top">num_list1.cpp</a>
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top">
<p>
<code class="computeroutput"><span class="keyword">char</span></code> and <code class="computeroutput"><span class="keyword">wchar_t</span></code>
operands
</p>
<p>
The careful reader may notice that the parser expression has <code class="computeroutput"><span class="char">','</span></code> instead of <code class="computeroutput"><span class="identifier">char_</span><span class="special">(</span><span class="char">','</span><span class="special">)</span></code>
as the previous examples did. This is OK because of the way C++ resolves overloaded operators.
There are <code class="computeroutput"><span class="special">&gt;&gt;</span></code> operators
that are overloaded to accept a <code class="computeroutput"><span class="keyword">char</span></code>
or <code class="computeroutput"><span class="keyword">wchar_t</span></code> argument on their
left or right (but not both). An operator may be overloaded only if at least
one of its parameters is a user-defined type. In this case, the <code class="computeroutput"><span class="identifier">double_</span></code> is the second argument to <code class="computeroutput"><span class="keyword">operator</span><span class="special">&gt;&gt;</span></code>,
and so the proper overload of <code class="computeroutput"><span class="special">&gt;&gt;</span></code>
is used, converting <code class="computeroutput"><span class="char">','</span></code> into
a character literal parser.
</p>
<p>
The problem with omitting the <code class="computeroutput"><span class="identifier">char_</span></code>
on both sides should be obvious: <code class="computeroutput"><span class="char">'a'</span> <span class="special">&gt;&gt;</span>
<span class="char">'b'</span></code> is not a Spirit parser; it is a
numeric expression, right-shifting the ASCII (or another encoding) value
of <code class="computeroutput"><span class="char">'a'</span></code> by the ASCII value of
<code class="computeroutput"><span class="char">'b'</span></code>. However, both <code class="computeroutput"><span class="identifier">char_</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">&gt;&gt;</span>
<span class="char">'b'</span></code> and <code class="computeroutput"><span class="char">'a'</span>
<span class="special">&gt;&gt;</span> <span class="identifier">char_</span><span class="special">(</span><span class="char">'b'</span><span class="special">)</span></code>
are Spirit sequence parsers for the letter <code class="computeroutput"><span class="char">'a'</span></code>
followed by <code class="computeroutput"><span class="char">'b'</span></code>. You'll get used
to it, sooner or later.
</p>
</td></tr>
</table></div>
<p>
Finally, take note that we test for a full match (i.e. the parser fully parsed
the input) by checking if the first iterator, after parsing, is equal to
the end iterator. You may remove this check if partial matches are to
be allowed.
</p>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2001-2018 Joel de Guzman,
Hartmut Kaiser<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="quick_start.html"><img src="../../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../tutorials.html"><img src="../../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="semantic_actions.html"><img src="../../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>