[/==============================================================================
    Copyright (C) 2001-2015 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section:employee Employee - Parsing into structs]

It's a common question on the __spirit_list__: How do I parse and place
the results into a C++ struct? Of course, at this point, you already
know various ways to do it, using semantic actions. There are many ways
to skin a cat. Spirit X3, being fully attributed, makes it even easier.
The next example demonstrates some features of Spirit X3 that make this
easy. In the process, you'll learn about:

* Attributes, in more depth
* Auto rules
* Some more built-in parsers
* Directives

First, let's create a struct representing an employee:

    namespace client { namespace ast
    {
        struct employee
        {
            int age;
            std::string forename;
            std::string surname;
            double salary;
        };
    }}

Then, we need to tell __fusion__ about our employee struct to make it a first-class
Fusion citizen that the grammar can utilize. If you don't know Fusion yet,
it is a __boost__ library for working with heterogeneous collections of data,
commonly referred to as tuples. Spirit uses Fusion extensively as part of its
infrastructure.

In Fusion's view, a struct is just a form of a tuple. You can adapt any struct
to be a fully conforming Fusion tuple:

    BOOST_FUSION_ADAPT_STRUCT(
        client::ast::employee,
        age, forename, surname, salary
    )

Now we'll write a parser for our employee. Inputs will be of the form:

    employee{ age, "forename", "surname", salary }

[#__tutorial_employee_parser__]
Here goes:

    namespace parser
    {
        namespace x3 = boost::spirit::x3;
        namespace ascii = boost::spirit::x3::ascii;

        using x3::int_;
        using x3::lit;
        using x3::double_;
        using x3::lexeme;
        using ascii::char_;

        x3::rule<class employee, ast::employee> const employee = "employee";

        auto const quoted_string = lexeme['"' >> +(char_ - '"') >> '"'];

        auto const employee_def =
            lit("employee")
            >> '{'
            >>  int_ >> ','
            >>  quoted_string >> ','
            >>  quoted_string >> ','
            >>  double_
            >>  '}'
            ;

        BOOST_SPIRIT_DEFINE(employee);
    }

The full cpp file for this example can be found here:
[@../../../example/x3/employee.cpp employee.cpp]

Let's walk through this one step at a time (not necessarily from top to bottom).

[heading Rule Declaration]

We are assuming that you already know about rules. We introduced rules in the
previous [tutorial_roman Roman Numerals example]. Please go back and review
the previous tutorial if you have to.

    x3::rule<class employee, ast::employee> const employee = "employee";

[heading Lexeme]

    lexeme['"' >> +(char_ - '"') >> '"'];

`lexeme` inhibits space skipping inside its brackets, that is, from the opening
double quote to the closing double quote. The expression parses quoted strings.

    +(char_ - '"')

parses one or more chars, except the double quote. It stops when it sees
a double quote.

[heading Difference]

The expression:

    a - b

matches `a`, but only where `b` does not match. Its attribute is just `A`, the
attribute of `a`. `b`'s attribute is ignored. Hence, the attribute of:

    char_ - '"'

is just `char`.

[heading Plus]

    +a

is similar to the Kleene star. Rather than matching zero or more, `+a` matches
one or more. Like its relative, the Kleene star, its attribute is a
`std::vector<A>` where `A` is the attribute of `a`. So, putting all these
together, the attribute of

    +(char_ - '"')

is then:

    std::vector<char>

[heading Sequence Attribute]

Now what's the attribute of

    '"' >> +(char_ - '"') >> '"'

?

Well, typically, the attribute of:

    a >> b >> c

is:

    fusion::vector<A, B, C>

where `A` is the attribute of `a`, `B` is the attribute of `b` and `C` is the
attribute of `c`. What is `fusion::vector`? A tuple.

[note If you don't know what I am talking about, see: [@http://tinyurl.com/6xun4j
Fusion Vector]. It might be a good idea to have a look into __fusion__ at this
point. You'll definitely see more of it in the coming pages.]

[heading Attribute Collapsing]

Some parsers, especially those very little literal parsers you see, like `'"'`,
do not have attributes.

Nodes without attributes are disregarded. In a sequence, like above, all nodes
with no attributes are filtered out of the `fusion::vector`. So, since `'"'` has
no attribute, and `+(char_ - '"')` has a `std::vector<char>` attribute, the
whole expression's attribute should have been:

    fusion::vector<std::vector<char> >

But wait, there's one more collapsing rule: If the attribute is a
single-element `fusion::vector`, the element is stripped naked from its container.
To make a long story short, the attribute of the expression:

    '"' >> +(char_ - '"') >> '"'

is:

    std::vector<char>

[heading Rule Definition]

Again, we are assuming that you already know about rules and rule
definitions. We introduced rules in the previous [tutorial_roman Roman
Numerals example]. Please go back and review the previous tutorial if you
have to.

    auto const employee_def =
        lit("employee")
        >> '{'
        >>  int_ >> ','
        >>  quoted_string >> ','
        >>  quoted_string >> ','
        >>  double_
        >>  '}'
        ;

    BOOST_SPIRIT_DEFINE(employee);

Applying our collapsing rules above, the RHS has an attribute of:

    fusion::vector<int, std::string, std::string, double>

Here, the `std::vector<char>` attribute of each `quoted_string` is compatible
with `std::string`. These nodes do not have an attribute:

* `lit("employee")`
* `'{'`
* `','`
* `'}'`

[note In case you are wondering, `lit("employee")` is the same as `"employee"`. We
had to wrap it inside `lit` because immediately after it is `>> '{'`. You can't
right-shift a `char[]` and a `char` - you know, C++ syntax rules.]

Recall that the attribute of `parser::employee` is the `ast::employee` struct.

Now everything is clear, right? The `struct employee` *IS* compatible with
`fusion::vector<int, std::string, std::string, double>`. So, the RHS of `employee`
uses `employee`'s attribute (an `ast::employee`) in-situ when it does its work.

[endsect]