[/==============================================================================
    Copyright (C) 2001-2011 Hartmut Kaiser
    Copyright (C) 2001-2011 Joel de Guzman

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section Porting from Spirit 1.8.x]

[import ../example/qi/porting_guide_classic.cpp]
[import ../example/qi/porting_guide_qi.cpp]

The current version of __spirit__ is a complete rewrite of earlier versions (we
refer to earlier versions as __classic__). The parser generators are now only
one part of the whole library. The parser submodule of __spirit__ is now called
__qi__. It is conceptually different and exposes a completely different
interface. Generally, there is no easy (or automated) way of converting parsers
written for __classic__ to __qi__. Therefore, this section can give only
guidelines on how to approach porting your older parsers to the current version
of __spirit__.

[heading Include Files]

The overall directory structure of the __spirit__ directories is described
in the section __include_structure__ and the FAQ entry
__include_structure_faq__. This should give you a good overview of how to find
the header files needed for your new parsers. Moreover, each section in the
__sec_qi_reference__ lists the include files required for any particular
component.

The name of a header file tells you which version it belongs to. All main
include files for __classic__ have the string 'classic_' in their name, for
instance:

    #include <boost/spirit/include/classic_core.hpp>

while all main include files for __qi__ have the string 'qi_' as part of
their name, for instance:

    #include <boost/spirit/include/qi_core.hpp>

The following table gives a rough list of corresponding header files between
__classic__ and __qi__. It can be used as a starting point only, as
several components have either been moved to different submodules or might not
exist in the newer version anymore. We list only include files for the topmost
submodules. For header files required for lower-level components please
refer to the reference documentation of the corresponding component.

[table
    [[Include file in /Spirit.Classic/] [Include file in /Spirit.Qi/]]
    [[`classic.hpp`]                    [`qi.hpp`]]
    [[`classic_actor.hpp`]              [none, use __phoenix__ for writing semantic actions]]
    [[`classic_attribute.hpp`]          [none, use local variables for rules instead of closures,
                                         the primitive parsers now directly support lazy
                                         parameterization]]
    [[`classic_core.hpp`]               [`qi_core.hpp`]]
    [[`classic_debug.hpp`]              [`qi_debug.hpp`]]
    [[`classic_dynamic.hpp`]            [none, use __qi__ predicates instead of if_p, while_p, for_p
                                         (included by `qi_core.hpp`), the equivalent for lazy_p
                                         is now included by `qi_auxiliary.hpp`]]
    [[`classic_error_handling.hpp`]     [none, included in `qi_core.hpp`]]
    [[`classic_meta.hpp`]               [none]]
    [[`classic_symbols.hpp`]            [none, included in `qi_core.hpp`]]
    [[`classic_utility.hpp`]            [none, not part of __qi__ anymore, these components
                                         will be added over time to the __repo__]]
]

[heading The Free Parse Functions]

The free parse functions (i.e. the main parser API) have been changed. This
includes the names of the free functions as well as their interface. In
__classic__ all free functions were named `parse`. In __qi__ they are named
either `qi::parse` or `qi::phrase_parse` depending on whether the parsing should
be done using a skipper (`qi::phrase_parse`) or not (`qi::parse`). All free
functions now return a simple `bool`: `true` means success (i.e. the
parser has matched) and `false` means failure (i.e. the parser didn't match).
This is equivalent to the old `parse_info` member `hit`. __qi__ doesn't support
tracking of the matched input length anymore. The old `parse_info` member
`full` can be emulated by comparing the iterators after `qi::parse` has returned.

All code examples in this section assume the following include statements and
using directives to be inserted. For __classic__:

[porting_guide_classic_includes]
[porting_guide_classic_namespace]

and for __qi__:

[porting_guide_qi_includes]
[porting_guide_qi_namespace]

The following similar examples should clarify the differences. First the
base example in __classic__:

[porting_guide_classic_parse]

And here is the equivalent piece of code using __qi__:

[porting_guide_qi_parse]

The changes required for phrase parsing (i.e. parsing using a skipper) are
similar. Here is how phrase parsing works in __classic__:

[porting_guide_classic_phrase_parse]

And here is the equivalent example in __qi__:

[porting_guide_qi_phrase_parse]

Note how character parsers are in a separate namespace (here
`boost::spirit::ascii::space`) as __qi__ now supports working with different
character sets. See the section __char_encoding_namespace__ for more information.

[heading Naming Conventions]

In __classic__ all parser primitives have suffixes appended to their names,
encoding their type: `"_p"` for parsers, `"_a"` for lazy actions, `"_d"` for
directives, etc. In __qi__ we don't have anything similar. The only suffix
is a single trailing underscore `"_"`, applied where the name would otherwise
conflict with a keyword or predefined name (such as `int_` for the
integer parser). Overall, most, if not all, primitive parsers and directives
have been renamed. Please see the __qi_quickref__ for an overview of the
names of the available parser primitives, directives and operators.

[heading Parser Attributes]

In __classic__ most of the parser primitives don't expose a specific attribute
type. Most parsers expose the pair of iterators pointing to the matched input
sequence. In __qi__ all parsers expose a parser-specific attribute type, so
__qi__ introduces a special directive, __qi_raw__`[]`, which achieves an effect
similar to __classic__. The __qi_raw__`[]` directive exposes the pair of
iterators pointing to the sequence matched by its embedded parser. Although we
very much encourage you to rewrite your parsers to take advantage of the
generated parser-specific attributes, sometimes it is helpful to get access to
the underlying matched input sequence.

[heading Grammars and Rules]

The `grammar<>` and `rule<>` types are of equal importance to __qi__ as they
are for __classic__. Their main purpose is still the same: they allow you to
define non-terminals and they are the main building blocks for more complex
parsers. Nevertheless, both types have been redesigned and their interfaces
have changed. Let's have a look at two examples first; we'll explain the
differences afterwards. Here is a simple grammar and its usage in __classic__:

[porting_guide_classic_grammar]
[porting_guide_classic_use_grammar]

And here is a similar grammar and its usage in __qi__:

[porting_guide_qi_grammar]
[porting_guide_qi_use_grammar]

Both versions look similar enough, but we see several differences (we will
cover each of those differences in more detail below):

* Neither the grammars nor the rules depend on a scanner type anymore, both
  depend only on the underlying iterator type. That means the dreaded scanner
  business is no issue anymore!
* Grammars have no embedded class `definition` anymore.
* Grammars and rules may have an explicit attribute type specified in their
  definition.
* Grammars do not have any explicit start rules anymore. Instead one of the
  contained rules is used as a start rule by default.

The first two points are tightly interrelated. The scanner business (see the
FAQ number one of __classic__ here: __scanner_business__) has been
a problem for a long time. The grammar and rule types have been specifically
redesigned to avoid this problem in the future. This also means that we don't
need any delayed instantiation of the inner definition class in a grammar
anymore. So the redesign not only helped fix a long-standing design problem,
it also helped to simplify things considerably.

All __qi__ parser components have well-defined attribute types. Grammars and
rules are no exception. But since both need to be generic enough to be usable
for any parser, their attribute type has to be explicitly specified. In the
example above the `roman` grammar and the rule `first` both have an `unsigned`
attribute:

    // grammar definition
    template <typename Iterator>
    struct roman : qi::grammar<Iterator, unsigned()> {...};

    // rule definition
    qi::rule<Iterator, unsigned()> first;

The notation used resembles the definition of a function type. This is very
natural as you can think of the synthesized attribute of the grammar and the
rule as its 'return value'. In fact the rule and the grammar both 'return'
an unsigned value - the value they matched.

[note The function type notation allows specifying parameters as well. These
      are interpreted as the types of the inherited attributes the rule or
      grammar expects to be passed during parsing. For more information
      please see the section about inherited and synthesized attributes for
      rules and grammars (__sec_attributes__).]

If no attribute is desired, none needs to be specified. The default attribute
type for both grammars and rules is __unused_type__, which is a special
placeholder type. Generally, using __unused_type__ as the attribute of a parser
is interpreted as 'this parser has no attribute'. This is mostly used for
parsers applied to parts of the input that carry no significant information
but are delimiters or structural elements needed for correct interpretation
of the input.

The last difference might seem to be rather cosmetic and insignificant. But it
turns out that not having to specify which rule in a grammar is the start rule
(by returning it from the function `start()`) also means that any rule in a
grammar can be used directly as the start rule. Nevertheless, the grammar base
class is initialized with the rule it has to use as the start rule whenever
the grammar instance itself is used as a parser.

[endsect]