warming_up.qbk 6.8 KB

  1. [/==============================================================================
  2. Copyright (C) 2001-2015 Joel de Guzman
  3. Copyright (C) 2001-2011 Hartmut Kaiser
  4. Distributed under the Boost Software License, Version 1.0. (See accompanying
  5. file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
  6. ===============================================================================/]
  7. [section Warming up]
  8. We'll start by showing examples of parser expressions to give you a feel for how
  9. to build parsers from the simplest parser, building up as we go. When comparing
  10. EBNF to __spirit__, the expressions may seem awkward at first. __spirit__ heavily
  11. uses operator overloading to accomplish its magic.
  12. [heading Trivial Example #1 Parsing a number]
  13. Create a parser that will parse a floating-point number.
  14.     double_
  15. (You've got to admit, that's trivial!) The above code actually generates a
  16. Spirit floating-point parser (a built-in parser). Spirit has many pre-defined
  17. parsers and consistent naming conventions help you keep from going insane!
  18. [heading Trivial Example #2 Parsing two numbers]
  19. Create a parser that will accept a line consisting of two floating-point numbers.
  20.     double_ >> double_
  21. Here you see the familiar floating-point numeric parser `double_` used twice,
  22. once for each number. What's that `>>` operator doing in there? Well, they had
  23. to be separated by something, and this was chosen as the "followed by" sequence
  24. operator. The above expression creates a parser from two simpler parsers, gluing
  25. them together with the sequence operator. The result is a parser that is a
  26. composition of smaller parsers. Whitespace between numbers can implicitly be
  27. consumed depending on how the parser is invoked (see below).
  28. [note When we combine parsers, we end up with a "bigger" parser, but
  29. it's still a parser. Parsers can get bigger and bigger, nesting more and more,
  30. but whenever you glue two parsers together, you end up with one bigger parser.
  31. This is an important concept.
  32. ]
  33. [heading Trivial Example #3 Parsing zero or more numbers]
  34. Create a parser that will accept zero or more floating-point numbers.
  35.     *double_
  36. This is like a regular-expression Kleene Star, though the syntax might look a
  37. bit odd for a C++ programmer not used to seeing the `*` operator overloaded like
  38. this. Actually, if you know regular expressions it may look odd too since the
  39. star is before the expression it modifies. C'est la vie. Blame it on the fact
  40. that we must work with the syntax rules of C++.
  41. Any expression that evaluates to a parser may be used with the Kleene Star.
  42. Keep in mind that C++ operator precedence rules may require you to put
  43. expressions in parentheses for complex expressions. The Kleene Star
  44. is also known as a Kleene Closure, but we call it the Star in most places.
  45. [heading Trivial Example #4 Parsing a comma-delimited list of numbers]
  46. This example will create a parser that accepts a comma-delimited list of
  47. numbers.
  48.     double_ >> *(char_(',') >> double_)
  49. Notice `char_(',')`. It is a literal character parser that can recognize the
  50. comma `','`. In this case, the Kleene Star is modifying a more complex parser,
  51. namely, the one generated by the expression:
  52.     (char_(',') >> double_)
  53. Note that the parentheses are necessary here: unary `*` binds more tightly than
  54. `>>`, so without them the Kleene Star would apply to `char_(',')` alone.
  55. [heading Let's Parse!]
  56. We're done defining the parser, so the next step is to invoke this
  57. parser to do its work. There are a couple of ways to do this. For now, we will
  58. use the `phrase_parse` function. One overload of this function accepts four
  59. arguments:
  60. # An iterator pointing to the start of the input
  61. # An iterator pointing to one past the end of the input
  62. # The parser object
  63. # Another parser called the skip parser
  64. In our example, we wish to skip spaces and tabs. Spirit's repertoire of
  65. predefined parsers includes one named `space`, a very simple parser that
  66. recognizes whitespace. We will use `space` as our skip
  67. parser. The skip parser is the one responsible for skipping characters in
  68. between parser elements such as the `double_` and `char_`.
  69. Ok, so now let's parse!
  70.     template <typename Iterator>
  71.     bool parse_numbers(Iterator first, Iterator last)
  72.     {
  73.         using x3::double_;
  74.         using x3::phrase_parse;
  75.         using ascii::space;
  76.         bool r = phrase_parse(
  77.             first,                          // Start Iterator
  78.             last,                           // End Iterator
  79.             double_ >> *(',' >> double_),   // The Parser
  80.             space                           // The Skip-Parser
  81.         );
  82.         if (first != last) // fail if we did not get a full match
  83.             return false;
  84.         return r;
  85.     }
  86. The `phrase_parse` function returns `true` or `false` depending on the result of
  87. the parse. The first iterator is passed by reference. On a successful
  88. parse, this iterator is repositioned to the rightmost position consumed
  89. by the parser. If this becomes equal to `last`, then we have a full
  90. match. If not, then we have a partial match. A partial match happens
  91. when the parser is only able to parse a portion of the input.
  92. Note that we inlined the parser directly in the call to `phrase_parse`. Upon the
  93. call, the expression evaluates into a temporary, unnamed parser which is passed
  94. into `phrase_parse`, used, and then destroyed.
  95. Here, we opted to make the parser generic by making it a template, parameterized
  96. by the iterator type. By doing so, it can take in data coming from any
  97. STL-conforming sequence as long as its iterators model a forward iterator.
  98. You can find the full cpp file here:
  99. [@../../../example/x3/num_list/num_list1.cpp num_list1.cpp]
  100. [note `char` and `wchar_t` operands
  101. The careful reader may notice that the parser expression has `','` instead of
  102. `char_(',')` as the previous examples did. This is ok due to C++ syntax rules of
  103. conversion. There are `>>` operators that are overloaded to accept a `char` or
  104. `wchar_t` argument on their left or right (but not both). An operator may be
  105. overloaded if at least one of its parameters is a user-defined type. In this
  106. case, the `double_` is the 2nd argument to `operator>>`, and so the proper
  107. overload of `>>` is used, converting `','` into a character literal parser.
  108. The problem with omitting the `char_` should be obvious: `'a' >> 'b'` is not a
  109. Spirit parser, it is a numeric expression, right-shifting the ASCII (or another
  110. encoding) value of `'a'` by the ASCII value of `'b'`. However, both
  111. `char_('a') >> 'b'` and `'a' >> char_('b')` are Spirit sequence parsers
  112. for the letter `'a'` followed by `'b'`. You'll get used to it, sooner or later.
  113. ]
  114. Finally, take note that we test for a full match (i.e. the parser fully parsed
  115. the input) by checking if the first iterator, after parsing, is equal to the end
  116. iterator. You may strike out this part if partial matches are to be allowed.
  117. [endsect] [/ Warming up]