[/==============================================================================
    Copyright (C) 2001-2011 Hartmut Kaiser
    Copyright (C) 2001-2011 Joel de Guzman

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section:attributes Attributes]

[/////////////////////////////////////////////////////////////////////////////]
[section:primitive_attributes Attributes of Primitive Components]

Parsers and generators in __spirit__ are fully attributed. __qi__ parsers always
/expose/ an attribute specific to their type. This is called the /synthesized
attribute/, as it is returned from a successful match and represents the matched
input sequence. For instance, numeric parsers, such as `int_` or `double_`,
return the `int` or `double` value converted from the matched input sequence.
Other primitive parser components have similarly intuitive attribute types; for
instance, `ascii::char_` has `char`.

For primitive parsers the normal C++ convertibility rules apply: you can use any
C++ type to receive the parsed value as long as the attribute type of the parser
is convertible to the type provided. The following example shows how a
synthesized parser attribute (the `int` value) is extracted by calling the API
function `qi::parse`:

    int value = 0;
    std::string str("123");
    std::string::iterator strbegin = str.begin();
    qi::parse(strbegin, str.end(), int_, value);   // value == 123

The attribute type of a generator defines what data types this generator is able
to consume in order to produce its output. __karma__ generators always /expect/
an attribute specific to their type. This is called the /consumed attribute/ and
is expected to be passed to the generator. The consumed attribute is most of the
time the value the generator is designed to emit output for.
For primitive generators the normal C++ convertibility rules apply. Any data
type convertible to the attribute type of a primitive generator can be used to
provide the data to generate. The following example is similar to the one above;
this time the consumed attribute of the `int_` generator (the `int` value) is
passed to the API function `karma::generate`:

    int value = 123;
    std::string str;
    std::back_insert_iterator<std::string> out(str);
    karma::generate(out, int_, value);              // str == "123"

Other primitive generator components have other intuitive attribute types, very
similar to the corresponding parser components. For instance, the
`ascii::char_` generator has `char` as its consumed attribute. For a full list
of available parser and generator primitives and their attribute types please
see the sections __sec_qi_primitive__ and __sec_karma_primitive__.

[endsect]

[/////////////////////////////////////////////////////////////////////////////]
[section:compound_attributes Attributes of Compound Components]

__qi__ and __karma__ implement well defined attribute type propagation rules
for all compound parsers and generators, such as sequences, alternatives,
Kleene star, etc. The main attribute propagation rule for sequences is, for
instance:

[table
    [[Library]  [Sequence attribute propagation rule]]
    [[Qi]       [`a: A, b: B --> (a >> b): tuple<A, B>`]]
    [[Karma]    [`a: A, b: B --> (a << b): tuple<A, B>`]]
]

which reads as:

[:Given `a` and `b` are parsers (generators), and `A` is the attribute type of
  `a`, and `B` is the attribute type of `b`, then the attribute type of
  `a >> b` (`a << b`) will be `tuple<A, B>`.]

[note The notation `tuple<A, B>` is used as a placeholder expression for any
      fusion sequence holding the types A and B, such as
      `boost::fusion::tuple<A, B>` or `std::pair<A, B>` (for more information
      see __fusion__).]

As you can see, in order for a type to be compatible with the attribute type
of a compound expression it has to

* either be convertible to the attribute type,
* or it has to expose certain functionalities, i.e.
  it needs to conform to a concept compatible with the component.

Each compound component implements its own set of attribute propagation rules.
For a full list of how the different compound parsers and generators expose and
consume attributes see the sections __sec_qi_compound__ and
__sec_karma_compound__.

[heading The Attribute of Sequence Parsers and Generators]

Sequences require an attribute type to expose the concept of a fusion sequence,
where all elements of that fusion sequence have to be compatible with the
corresponding element of the component sequence. For example, the expression:

[table
    [[Library]  [Sequence expression]]
    [[Qi]       [`double_ >> double_`]]
    [[Karma]    [`double_ << double_`]]
]

is compatible with any fusion sequence holding two types, where both types have
to be compatible with `double`. The first element of the fusion sequence has to
be compatible with the attribute of the first `double_`, and the second element
of the fusion sequence has to be compatible with the attribute of the second
`double_`. If we have an instance of a `std::pair<double, double>`, we can
directly use the expressions above both to parse input to fill the attribute:

    // the following parses "1.0 2.0" into a pair of double
    std::string input("1.0 2.0");
    std::string::iterator strbegin = input.begin();
    std::pair<double, double> p;
    qi::phrase_parse(strbegin, input.end(),
        qi::double_ >> qi::double_,       // parser grammar
        qi::space,                        // delimiter grammar
        p);                               // attribute to fill while parsing

and to generate output for it:

    // the following generates: "1.0 2.0" from the pair filled above
    std::string str;
    std::back_insert_iterator<std::string> out(str);
    karma::generate_delimited(out,
        karma::double_ << karma::double_, // generator grammar (format description)
        karma::space,                     // delimiter grammar
        p);                               // data to use as the attribute

(where the `karma::space` generator is used as the delimiter, which allows
delimiting spaces to be skipped or inserted automatically between all
primitives).
[tip *For sequences only:* __qi__ and __karma__ expose a set of API functions
     usable mainly with sequences. Very much like the functions of the `scanf`
     and `printf` families, these functions allow the attributes for each of
     the elements of the sequence to be passed separately. Using the
     corresponding overloads of /Qi's/ `phrase_parse()` and /Karma's/
     `generate_delimited()` the expressions above could be rewritten as:
     ``
        double d1 = 0.0, d2 = 0.0;
        qi::phrase_parse(begin, end, qi::double_ >> qi::double_, qi::space, d1, d2);
        karma::generate_delimited(out, karma::double_ << karma::double_, karma::space, d1, d2);
     ``
     where the first attribute is used for the first `double_`, and
     the second attribute is used for the second `double_`.
]

[heading The Attribute of Alternative Parsers and Generators]

Alternative parsers and generators are all about - well - alternatives. In
order to store possibly different result (attribute) types from the different
alternatives we use the data type __boost_variant__. The main attribute
propagation rule of these components is:

    a: A, b: B --> (a | b): variant<A, B>

Alternatives have a second very important attribute propagation rule:

    a: A, b: A --> (a | b): A

which often simplifies things significantly. If all sub expressions of an
alternative expose the same attribute type, the overall alternative will expose
exactly the same attribute type as well.

[endsect]

[/////////////////////////////////////////////////////////////////////////////]
[section:more_compound_attributes More About Attributes of Compound Components]

While parsing input or generating output it is often desirable to combine some
constant elements with variable parts. For instance, let us look at the example
of parsing or formatting a complex number, which is written as `(real, imag)`,
where `real` and `imag` are the variables representing the real and imaginary
parts of our complex number.
This can be achieved by writing:

[table
    [[Library]  [Sequence expression]]
    [[Qi]       [`'(' >> double_ >> ", " >> double_ >> ')'`]]
    [[Karma]    [`'(' << double_ << ", " << double_ << ')'`]]
]

Fortunately, literals (such as `'('` and `", "`) do /not/ expose any attribute
(well actually, they do expose the special type `unused_type`, but in this
context `unused_type` is interpreted as if the component does not expose any
attribute at all). It is very important to understand that the literals don't
consume any of the elements of a fusion sequence passed to this component
sequence. As said, they just don't expose any attribute and don't produce
(consume) any data. The following example shows this:

    // the following parses "(1.0, 2.0)" into a pair of double
    std::string input("(1.0, 2.0)");
    std::string::iterator strbegin = input.begin();
    std::pair<double, double> p;
    qi::parse(strbegin, input.end(),
        '(' >> qi::double_ >> ", " >> qi::double_ >> ')', // parser grammar
        p);                                               // attribute to fill while parsing

and here is the equivalent __karma__ code snippet:

    // the following generates: (1.0, 2.0)
    std::string str;
    std::back_insert_iterator<std::string> out(str);
    karma::generate(out,
        '(' << karma::double_ << ", " << karma::double_ << ')', // generator grammar (format description)
        p);                                                     // data to use as the attribute

where the first element of the pair passed in as the data to generate is still
associated with the first `double_`, and the second element is associated with
the second `double_` generator.

This behavior should be familiar as it conforms to the way other input and
output formatting libraries such as `scanf`, `printf` or `boost::format` handle
their variable parts. In this context you can think of __qi__'s and __karma__'s
primitive components (such as the `double_` above) as type safe placeholders
for the attribute values.
[tip Similarly to the tip provided above, this example could be rewritten
     using /Spirit's/ multi-attribute API functions:
     ``
        double d1 = 0.0, d2 = 0.0;
        qi::parse(begin, end, '(' >> qi::double_ >> ", " >> qi::double_ >> ')', d1, d2);
        karma::generate(out, '(' << karma::double_ << ", " << karma::double_ << ')', d1, d2);
     ``
     which provides a clear and comfortable syntax, more similar to the
     placeholder based syntax exposed by `printf` or `boost::format`.
]

Let's take a look at this from a more formal perspective. The sequence
attribute propagation rules define a special behavior if parsers or generators
exposing `unused_type` as their attribute are involved (see
__sec_qi_compound__ and __sec_karma_compound__):

[table
    [[Library]  [Sequence attribute propagation rule]]
    [[Qi]       [`a: A, b: Unused --> (a >> b): A`]]
    [[Karma]    [`a: A, b: Unused --> (a << b): A`]]
]

which reads as:

[:Given `a` and `b` are parsers (generators), and `A` is the attribute type of
  `a`, and `unused_type` is the attribute type of `b`, then the attribute type
  of `a >> b` (`a << b`) will be `A` as well. This rule applies regardless of
  the position of the element exposing the `unused_type`.]

This rule is the key to understanding the attribute handling in sequences as
soon as literals are involved. It is as if elements with `unused_type`
attributes 'disappeared' during attribute propagation. Notably, this is true
not only for sequences but for all compound components. For instance, for
alternative components the corresponding rule is:

    a: A, b: Unused --> (a | b): A

which again simplifies the overall attribute type of an expression.

[endsect]

[/////////////////////////////////////////////////////////////////////////////]
[section:nonterminal_attributes Attributes of Rules and Grammars]

Nonterminals are well known from parsers, where they are used as the main means
of constructing more complex parsers out of simpler ones. The nonterminals in
the parser world are very similar to functions in an imperative programming
language.
They can be used to encapsulate parser expressions for a particular input
sequence. After being defined, the nonterminals can be used as 'normal' parsers
in more complex expressions whenever the encapsulated input needs to be
recognized. Parser nonterminals in __qi__ may accept /parameters/ (inherited
attributes) and usually return a value (the synthesized attribute).

The types of both the inherited and the synthesized attributes have to be
explicitly specified when defining the particular `grammar` or `rule` (the
Spirit __repo__ additionally has `subrules`, which conform to a similar
interface). As an example, the following code declares a __qi__ `rule` exposing
an `int` as its synthesized attribute, while expecting a single `double` as its
inherited attribute (see the section about the __qi__ __rule__ for more
information):

    qi::rule<Iterator, int(double)> r;

In the world of generators, nonterminals are just as useful as in the parser
world. Generator nonterminals encapsulate a format description for a particular
data type, and, whenever we need to emit output for this data type, the
corresponding nonterminal is invoked in a similar way as the predefined
__karma__ generator primitives. The __karma__ [karma_nonterminal nonterminals]
are very similar to the __qi__ nonterminals. Generator nonterminals may accept
/parameters/ as well, and we call those inherited attributes too. The main
difference is that they do not expose a synthesized attribute (as parsers do),
but they require a special /consumed attribute/. Usually the consumed attribute
is the value the generator creates its output from. Even though the consumed
attribute is not 'returned' from the generator, we chose to use the same
function style declaration syntax as used in __qi__. The example below declares
a __karma__ `rule` consuming a `double` while not expecting any additional
inherited attributes.

    karma::rule<OutputIterator, double()> r;

The inherited attributes of nonterminal parsers and generators are normally
passed to the component during its invocation. These are the /parameters/ the
parser or generator may accept, and they can be used to parameterize the
component depending on the context it is invoked from.

[/
 * attribute propagation
 * explicit and operator%=
]

[endsect]

[endsect] [/ Attributes]