syntax_diagram.qbk 3.9 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104
  1. [/==============================================================================
  2. Copyright (C) 2001-2011 Joel de Guzman
  3. Copyright (C) 2001-2011 Hartmut Kaiser
  4. Distributed under the Boost Software License, Version 1.0. (See accompanying
  5. file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
  6. ===============================================================================/]
  7. [section Syntax Diagram]
  8. In the next section, we will deal with Parsing Expression Grammars
  9. (PEG) [footnote Bryan Ford: Parsing Expression Grammars: A Recognition-Based
  10. Syntactic Foundation, [@http://pdos.csail.mit.edu/~baford/packrat/popl04/]],
  11. a variant of Extended Backus-Naur Form (EBNF) [footnote Richard E. Pattis: EBNF:
  12. A Notation to Describe Syntax, [@http://www.cs.cmu.edu/~pattis/misc/ebnf.pdf]]
  13. with a different interpretation. It is easier to understand PEG using Syntax
  14. Diagrams. Syntax diagrams represent a grammar graphically. It was used extensively
  15. by Niklaus Wirth [footnote Niklaus Wirth: The Programming Language
  16. Pascal. (July 1973)] in the "Pascal User Manual". Syntax Diagrams are
  17. easily understandable by programmers due to their similarity to flow
  18. charts. The isomorphism of the diagrams and functions make them ideal for
  19. representing __rd__ parsers which are essentially mutually recursive
  20. functions.
  21. Historically, Parsing Expression Grammars have been used for describing grammars
  22. for parsers only (hence the name). In __spirit__ we use a very similar notation
  23. for output generation as well. Almost all the concepts described here are
  24. equally applicable both to __qi__ parsers and to __karma__ generators.
  25. [heading Elements]
  26. All diagrams have one entry and one exit point. Arrows connect all possible
  27. paths through the grammar from the entry point to the exit point.
  28. [:__sd_start_stop__]
  29. Terminals are represented by round boxes. Terminals are atomic and
  30. usually represent plain characters, strings or tokens.
  31. [:__sd_terminals__]
  32. Non-terminals are represented by boxes. Diagrams are modularized using
  33. named non-terminals. A complex diagram can be broken down into a set of
  34. non-terminals. Non-terminals also allow recursion (i.e. a non-terminal
  35. can call itself).
  36. [:__sd_non_terminals__]
  37. [heading Constructs]
  38. The most basic composition is the Sequence. B follows A:
  39. [:__sd_sequence__]
  40. The ordered choice henceforth we will call /alternatives/. In PEG,
  41. ordered choice and alternatives are not quite the same. PEG allows
  42. ambiguity of choice where one or more branches can succeed. In PEG, in
  43. case of ambiguity, the first one always wins.
  44. [:__sd_choice__]
  45. The optional (zero-or-one):
  46. [:__sd_optional__]
  47. Now, the loops. We have the zero-or-more and one-or-more:
  48. [:__sd_kleene__]
  49. [:__sd_plus__]
  50. Take note that, as in PEG, these loops behave greedily. If there is
  51. another 'A' just before the end-point, it will always fail because the
  52. preceding loop has already exhausted all 'A's and there is nothing more
  53. left. This is a crucial difference between PEG and general Context Free
  54. Grammars (CFGs). This behavior is quite obvious with syntax diagrams as
  55. they resemble flow-charts.
  56. [heading Predicates]
  57. Now, the following are Syntax Diagram versions of PEG predicates. These
  58. are not traditionally found in Syntax Diagrams. These are special
  59. extensions we invented to closely follow PEGs.
  60. First, we introduce a new element, the Predicate:
  61. [:__sd_predicate__]
  62. This is similar to the conditionals in flow charts where the 'No' branch
  63. is absent and always signals a failed parse.
  64. We have two versions of the predicate, the /And-Predicate/ and the
  65. /Not-Predicate/:
  66. [:__sd_and_predicate__]
  67. [:__sd_not_predicate__]
  68. The /And-Predicate/ tries the predicate, P, and succeeds if P succeeds,
  69. or otherwise fail. The opposite is true with the /Not-Predicate/. It
  70. tries the predicate, P, and fails if P succeeds, or otherwise succeeds.
  71. Both versions do a look-ahead but do not consume any input regardless if
  72. P succeeds or not.
  73. [endsect]