[/==============================================================================
    Copyright (C) 2001-2015 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section Introduction]

Boost Spirit X3 is an object-oriented, recursive-descent parser library for C++.
It allows you to write grammars using a format similar to
Extended Backus-Naur Form (EBNF)[footnote
[@http://www.cl.cam.ac.uk/%7Emgk25/iso-14977.pdf ISO-EBNF]] directly in C++.
These inline grammar specifications can mix freely with other C++ code and,
thanks to the generative power of C++ templates, are immediately executable.
By contrast, conventional compiler-compilers or parser-generators have to
perform an additional translation step from the source EBNF code to C or C++
code.

Since the grammars are written entirely in C++, we do not need any separate
tools to compile, preprocess, or integrate them into the build process.
__spirit__ integrates the parsing process seamlessly with other C++ code, which
often allows for simpler and more efficient code.

The resulting parsers are fully attributed, which makes it easy to build and
handle hierarchical data structures in memory. These data structures mirror
the structure of the input data and can be used directly to generate
arbitrarily formatted output.

A simple EBNF grammar snippet:

    group       ::= '(' expression ')'
    factor      ::= integer | group
    term        ::= factor (('*' factor) | ('/' factor))*
    expression  ::= term (('+' term) | ('-' term))*

is approximated using facilities of Spirit's /X3/ as seen in this code snippet:

    group       = '(' >> expression >> ')';
    factor      = integer | group;
    term        = factor >> *(('*' >> factor) | ('/' >> factor));
    expression  = term >> *(('+' >> term) | ('-' >> term));

Through the magic of expression templates, this is perfectly valid and
executable C++ code. The production rule `expression` is, in fact, an object
that has a member function `parse`, which does the work when given source text
written in the grammar that we have just declared. Yes, it's a calculator. We
shall simplify for now by skipping the type declarations and the definition of
the rule `integer` invoked by `factor`. The production rule `expression` in
our grammar specification, traditionally called the `start` symbol, can
recognize inputs such as:

    12345
    -12345
    +12345
    1 + 2
    1 * 2
    1/2 + 3/4
    1 + 2 + 3 + 4
    1 * 2 * 3 * 4
    (1 + 2) * (3 + 4)
    (-1 + 2) * (3 + -4)
    1 + ((6 * 200) - 20) / 6
    (1 + (2 + (3 + (4 + 5))))

We have, of course, modified the original EBNF syntax. This is done to
conform to C++ syntax rules. Most notably, we see an abundance of shift `>>`
operators. Since there is no juxtaposition operator in C++, it is simply not
possible to write something like:

    a b

as seen in math syntax, for example, to mean multiplication or, in our case,
as seen in EBNF syntax to mean sequencing (`b` should follow `a`). __x3__
uses the shift `>>` operator instead for this purpose. We take the `>>`
operator, with arrows pointing to the right, to mean "is followed by". Thus we
write:

    a >> b

The alternative operator `|` and the parentheses `()` remain as is. The
assignment operator `=` is used in place of EBNF's `::=`. Last but not least,
the Kleene star `*`, which is a postfix operator in EBNF, becomes a prefix
operator. Instead of:

    a* //... in EBNF syntax,

we write:

    *a //... in Spirit.

since there is no postfix star, `*`, in C/C++. Finally, we terminate each
rule with the ubiquitous semicolon, `;`.
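
The operator mapping described above can be tried out without Spirit. The
following toy sketch, written against the standard library only, overloads
binary `>>`, binary `|`, and unary prefix `*` for a hypothetical `Parser` type.
It is not how X3 is implemented (X3 relies on expression templates rather than
`std::function`), and the names `Parser`, `lit`, and `matches` are invented for
this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Toy parser: on a match, advance pos and return true; otherwise
// return false and leave pos unchanged.
struct Parser
{
    std::function<bool(std::string const&, std::size_t&)> run;
};

// Match a single literal character.
Parser lit(char c)
{
    return Parser{[c](std::string const& s, std::size_t& pos) -> bool {
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }};
}

// a >> b : a "is followed by" b (sequence).
Parser operator>>(Parser a, Parser b)
{
    return Parser{[a, b](std::string const& s, std::size_t& pos) -> bool {
        std::size_t const saved = pos;
        if (a.run(s, pos) && b.run(s, pos)) return true;
        pos = saved; // backtrack if either half fails
        return false;
    }};
}

// a | b : try a first, then b (alternative).
Parser operator|(Parser a, Parser b)
{
    return Parser{[a, b](std::string const& s, std::size_t& pos) -> bool {
        return a.run(s, pos) || b.run(s, pos);
    }};
}

// *a : zero or more repetitions of a (Kleene star, written as a prefix).
Parser operator*(Parser a)
{
    return Parser{[a](std::string const& s, std::size_t& pos) -> bool {
        for (;;) {
            std::size_t const before = pos;
            // Stop on failure, or on an empty match to avoid looping forever.
            if (!a.run(s, pos) || pos == before) break;
        }
        return true; // zero repetitions is still a match
    }};
}

// Hypothetical helper: true if p consumes the entire input.
bool matches(Parser const& p, std::string const& input)
{
    std::size_t pos = 0;
    return p.run(input, pos) && pos == input.size();
}
```

For instance, `(lit('a') >> *lit('b')) | lit('c')` builds a parser for an `a`
followed by any number of `b`s, or a single `c`. The parentheses are optional:
C++ gives unary `*` the highest precedence of the three operators and `|` the
lowest, which conveniently matches the way the EBNF original is read.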

[endsect]