[/============================================================================== Copyright (C) 2001-2015 Hartmut Kaiser Copyright (C) 2001-2011 Joel de Guzman Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) ===============================================================================/] [/////////////////////////////////////////////////////////////////////////////] [section:primitive_attributes Attributes of Primitive Components] Parsers in __spirit__ are fully attributed. __x3__ parsers always /expose/ an attribute specific to their type. This is called /synthesized attribute/ as it is returned from a successful match representing the matched input sequence. For instance, numeric parsers, such as `int_` or `double_`, return the `int` or `double` value converted from the matched input sequence. Other primitive parser components have other intuitive attribute types, such as for instance `int_` which has `int`, or `ascii::char_` which has `char`. Primitive parsers apply the normal C++ convertibility rules: you can use any C++ type to receive the parsed value as long as the attribute type of the parser is convertible to the type provided. The following example shows how a synthesized parser attribute (the `int` value) is extracted by calling the API function `x3::parse`: int value = 0; std::string str("123"); std::string::iterator strbegin = str.begin(); x3::parse(strbegin, str.end(), int_, value); // value == 123 For a full list of available parser primitives and their attribute types please see the sections __sec_x3_primitive__. [endsect] [/////////////////////////////////////////////////////////////////////////////] [section:compound_attributes Attributes of Compound Components] __x3__ implement well defined attribute type propagation rules for all compound parsers, such as sequences, alternatives, Kleene star, etc. The main attribute propagation rule for a sequences is for instance: a: A, b: B --> (a >> b): tuple which reads as: [:Given `a` and `b` are parsers, and `A` is the attribute type of `a`, and `B` is the attribute type of `b`, then the attribute type of `a >> b` (`a << b`) will be `tuple`.] [note The notation `tuple` is used as a placeholder expression for any fusion sequence holding the types A and B, such as `boost::fusion::tuple` or `std::pair` (for more information see __fusion__).] As you can see, in order for a type to be compatible with the attribute type of a compound expression it has to * either be convertible to the attribute type, * or it has to expose certain functionalities, i.e. it needs to conform to a concept compatible with the component. Each compound component implements its own set of attribute propagation rules. For a full list of how the different compound parsers consume attributes see the sections __sec_x3_compound__. [heading The Attribute of Sequence Parsers] Sequences require an attribute type to expose the concept of a fusion sequence, where all elements of that fusion sequence have to be compatible with the corresponding element of the component sequence. For example, the expression: double_ >> double_ is compatible with any fusion sequence holding two types, where both types have to be compatible with `double`. The first element of the fusion sequence has to be compatible with the attribute of the first `double_`, and the second element of the fusion sequence has to be compatible with the attribute of the second `double_`. If we assume to have an instance of a `std::pair`, we can directly use the expressions above to do both, parse input to fill the attribute: // the following parses "1.0 2.0" into a pair of double std::string input("1.0 2.0"); std::string::iterator strbegin = input.begin(); std::pair p; x3::phrase_parse(strbegin, input.end(), x3::double_ >> x3::double_, // parser grammar x3::space, // delimiter grammar p); // attribute to fill while parsing [tip *For sequences only:* To keep it simple, unlike __Spirit.qi__, __x3__ does not support more than one attribute anymore in the `parse` and `phrase_parse` function. Just use `std:tuple'. Be sure to include `boost/fusion/adapted/std_tuple.hpp' in this case. ] [heading The Attribute of Alternative Parsers] Alternative parsers are all about - well - alternatives. In order to store possibly different result (attribute) types from the different alternatives we use the data type __boost_variant__. The main attribute propagation rule of these components is: a: A, b: B --> (a | b): variant Alternatives have a second very important attribute propagation rule: a: A, b: A --> (a | b): A often simplifying things significantly. If all sub expressions of an alternative expose the same attribute type, the overall alternative will expose exactly the same attribute type as well. [endsect] [/////////////////////////////////////////////////////////////////////////////] [section:more_compound_attributes More About Attributes of Compound Components] While parsing input, it is often desirable to combine some constant elements with variable parts. For instance, let us look at the example of parsing or formatting a complex number, which is written as `(real, imag)`, where `real` and `imag` are the variables representing the real and imaginary parts of our complex number. This can be achieved by writing: '(' >> double_ >> ", " >> double_ >> ')' Literals (such as `'('` and `", "`) do /not/ expose any attribute (well actually, they do expose the special type `unused_type`, but in this context `unused_type` is interpreted as if the component does not expose any attribute at all). It is very important to understand that the literals don't consume any of the elements of a fusion sequence passed to this component sequence. As said, they just don't expose any attribute and don't produce (consume) any data. The following example shows this: // the following parses "(1.0, 2.0)" into a pair of double std::string input("(1.0, 2.0)"); std::string::iterator strbegin = input.begin(); std::pair p; x3::parse(strbegin, input.end(), '(' >> x3::double_ >> ", " >> x3::double_ >> ')', // parser grammar p); // attribute to fill while parsing where the first element of the pair passed in as the data to generate is still associated with the first `double_`, and the second element is associated with the second `double_` parser. This behavior should be familiar as it conforms to the way other input and output formatting libraries such as `scanf`, `printf` or `boost::format` are handling their variable parts. In this context you can think about __x3__'s primitive components (such as the `double_` above) as of being type safe placeholders for the attribute values. [tip *For sequences only:* To keep it simple, unlike __Spirit.qi__, __x3__ does not support more than one attribute anymore in the `parse` and `phrase_parse` function. Just use `std:tuple'. Be sure to include `boost/fusion/adapted/std_tuple.hpp' in this case. ] Let's take a look at this from a more formal perspective: a: A, b: Unused --> (a >> b): A which reads as: [:Given `a` and `b` are parsers, and `A` is the attribute type of `a`, and `unused_type` is the attribute type of `b`, then the attribute type of `a >> b` (`a << b`) will be `A` as well. This rule applies regardless of the position the element exposing the `unused_type` is at.] This rule is the key to the understanding of the attribute handling in sequences as soon as literals are involved. It is as if elements with `unused_type` attributes 'disappeared' during attribute propagation. Notably, this is not only true for sequences but for any compound components. For instance, for alternative components the corresponding rule is: a: A, b: Unused --> (a | b): A again, allowing to simplify the overall attribute type of an expression. [endsect] [/////////////////////////////////////////////////////////////////////////////] [section:nonterminal_attributes Attributes of Nonterminals] Nonterminals are the main means of constructing more complex parsers out of simpler ones. The nonterminals in the parser world are very similar to functions in an imperative programming language. They can be used to encapsulate parser expressions for a particular input sequence. After being defined, the nonterminals can be used as 'normal' parsers in more complex expressions whenever the encapsulated input needs to be recognized. Parser nonterminals in __x3__ usually return a value (the synthesized attribute). The type of the synthesized attribute as to be explicitly specified while defining the particular nonterminal. Example (ignore ID for now): x3::rule r; [endsect]