123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202 |
- <html>
- <head>
- <title>Directives</title>
- <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
- <link rel="stylesheet" href="theme/style.css" type="text/css">
- </head>
- <body>
- <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
- <tr>
- <td width="10">
- </td>
- <td width="85%">
- <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Directives</b></font>
- </td>
- <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
- </tr>
- </table>
- <br>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="epsilon.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="scanner.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <p>Parser directives have the form: <b>directive[expression]</b></p>
- <p>A directive modifies the behavior of its enclosed expression, essentially <em>decorating</em>
- it. The framework pre-defines a few directives. Clients of the framework are
- free to define their own directives as needed. Information on how this is done
- will be provided later. For now, we shall deal only with predefined directives.</p>
- <h2>lexeme_d</h2>
- <p>Turns off white space skipping. At the phrase level, the parser ignores white
- spaces, possibly including comments. Use <tt>lexeme_d</tt> in situations where
- we want to work at the character level instead of the phrase level. Parsers
- can be made to work at the character level by enclosing the pertinent parts
- inside the lexeme_d directive. For example, let us complete the example presented
- in the <a href="introduction.html">Introduction</a>. There, we skipped the definition
- of the <tt>integer</tt> rule. Here's how it is actually defined:</p>
- <pre><code><font color="#000000"><span class=identifier> </span><span class=identifier>integer </span><span class=special>= </span><span class=identifier>lexeme_d</span><span class=special>[ </span><span class=special>!(</span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'+'</span><span class=special>) </span><span class=special>| </span><span class=literal>'-'</span><span class=special>) </span><span class=special>>> </span><span class=special>+</span><span class=identifier>digit </span><span class=special>];</span></font></code></pre>
- <p>The <tt>lexeme_d</tt> directive instructs the parser to work on the character
- level. Without it, the <tt>integer</tt> rule would have allowed erroneous embedded
- white spaces in inputs such as <span class="quotes">"1 2 345"</span>
- which will be parsed as <span class="quotes">"12345"</span>.</p>
- <h2>as_lower_d</h2>
- <p>There are times when we want to inhibit case sensitivity. The <tt>as_lower_d</tt>
- directive converts all characters from the input to lower-case.</p>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><img src="theme/alert.gif" width="16" height="16"><b>
- as_lower_d behavior</b> <br>
- <br>
- It is important to note that only the input is converted to lower case.
- Parsers enclosed inside the <tt>as_lower_d</tt> expecting upper case characters
- will fail to parse. Example: <tt>as_lower_d[<span class="quotes">'X'</span>]</tt>
- will never succeed because it expects an upper case <tt class="quotes">'X'</tt>
- that the <tt>as_lower_d</tt> directive will never supply.</td>
- </tr>
- </table>
- <p>For example, in Pascal, keywords and identifiers are case insensitive. Pascal
- ignores the case of letters in identifiers and keywords. Identifiers Id, ID
- and id are indistinguishable in Pascal. Without the as_lower_d directive, it
- would be awkward to define a rule that recognizes this. Here's a possibility:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"id"</span><span class=special>) </span><span class=special>| </span><span class=string>"Id" </span><span class=special>| </span><span class=string>"iD" </span><span class=special>| </span><span class=string>"ID"</span><span class=special>;</span></font></code></pre>
- <p>Now, try doing that with the case insensitive Pascal keyword <span class="quotes">"BEGIN"</span>.
- The <tt>as_lower_d</tt> directive makes this simple:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>as_lower_d</span><span class=special>[</span><span class=string>"begin"</span><span class=special>];</span></font></code></pre>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><div align="justify"><img src="theme/note.gif" width="16" height="16">
- <b>Primitive arguments</b> <br>
- <br>
- The astute reader will notice that we did not explicitly wrap <span class="quotes">"begin"</span>
- inside an <tt>str_p</tt>. Whenever appropriate, directives should be able
- to allow primitive types such as <tt>char</tt>, <tt>int</tt>, <tt>wchar_t</tt>,
- <tt>char const<span class="operators">*</span></tt>, <tt>wchar_t const<span class="operators">*</span></tt>
- and so on. Examples: <tt><br>
- <br>
- </tt><code><span class=identifier>as_lower_d</span><tt><span class=special>[</span><span class=string>"hello"</span><span class=special>]
- </span><span class=comment>// same as as_lower_d[str_p("hello")]</span></tt><code></code><span class=identifier><br>
- as_lower_d</span><span class=special>[</span><span class=literal>'x'</span><span class=special>]
- </span><span class=comment>// same as as_lower_d[ch_p('x')]</span></code></div></td>
- </tr>
- </table>
- <h3>no_actions_d</h3>
- <p>There are cases where you want <a href="semantic_actions.html">semantic actions</a>
- not to be triggered. By enclosing a parser in the <tt>no_actions_d</tt> directive,
- all semantic actions directly or indirectly attached to the parser will not
- fire. </p>
- <pre><code><font color="#000000"><span class=special> </span>no_actions_d<span class=special>[</span><span class=identifier>expression</span><span class=special>]</span></font></code><code><font color="#000000"><span class=special></span></font></code></pre>
- <h3>Tweaking the Scanner Type</h3>
- <p><img src="theme/note.gif" width="16" height="16"> How does <tt>lexeme_d, as_lower_d</tt>
- and <font color="#000000"><tt>no_actions_d</tt></font> work? These directives
- do their magic by tweaking the scanner policies. Well, you don't need to know
- what that means for now. Scanner policies are discussed <a href="indepth_the_scanner.html">later</a>.
- However, it is important to note that when the scanner policy is tweaked, the
- result is a different scanner. Why is this important to note? The <a href="rule.html">rule</a>
- is tied to a particular scanner (one or more scanners, to be precise). If you
- wrap a rule inside a <tt>lexeme_d, as_lower_d</tt> or <font color="#000000"><tt>no_actions_d,</tt>the
- compiler will complain about <a href="faq.html#scanner_business">scanner mismatch</a>
- unless you associate the required scanner with the rule. </font></p>
- <p><tt>lexeme_scanner</tt>, <tt>as_lower_scanner</tt> and <tt>no_actions_scanner</tt>
- are your friends if the need to wrap a rule inside these directives arise. Learn
- bout these beasts in the next chapter on <a href="scanner.html#lexeme_scanner">The
- Scanner and Parsing</a>.</p>
- <h2>longest_d</h2>
- <p>Alternatives in the Spirit parser compiler are short-circuited (see <a href="operators.html">Operators</a>).
- Sometimes, this is not what is desired. The <tt>longest_d</tt> directive instructs
- the parser not to short-circuit alternatives enclosed inside this directive,
- but instead makes the parser try all possible alternatives and choose the one
- matching the longest portion of the input stream.</p>
- <p>Consider the parsing of integers and real numbers:</p>
- <pre><code><font color="#000000"><span class=comment> </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>real </span><span class=special>| </span><span class=identifier>integer</span><span class=special>;</span></font></code></pre>
- <p>A number can be a real or an integer. This grammar is ambiguous. An input <span class="quotes">"1234"</span>
- should potentially match both real and integer. Recall though that alternatives
- are short-circuited . Thus, for inputs such as above, the real alternative always
- wins. However, if we swap the alternatives:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>real</span><span class=special>;</span></font></code></pre>
- <p>we still have a problem. Now, an input <span class="quotes">"123.456"</span>
- will be partially matched by integer until the decimal point. This is not what
- we want. The solution here is either to fix the ambiguity by factoring out the
- common prefixes of real and integer or, if that is not possible nor desired,
- use the <tt>longest_d</tt> directive:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>longest_d</span><span class=special>[ </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>real </span><span class=special>];</span></font></code></pre>
- <h2>shortest_d</h2>
- <p>Opposite of the <tt>longest_d</tt> directive.</p>
- <table width="80%" border="0" align="center">
- <tr>
- <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <b>Multiple
- alternatives</b> <br>
- <br>
- The <tt>longest_d</tt> and <tt>shortest_d</tt> directives can accept two
- or more alternatives. Examples:<br>
- <br>
- <font color="#000000"><span class=identifier><code>longest</code></span><code><span class=special>[
- </span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b
- </span><span class=special>| </span><span class=identifier>c </span><span class=special>];
- </span><span class=identifier><br>
- shortest</span><span class=special>[ </span><span class=identifier>a </span><span class=special>|
- </span><span class=identifier>b </span><span class=special>| </span><span class=identifier>c
- </span><span class=special>| </span><span class=identifier>d </span><span class=special>];</span></code></font></td>
- </tr>
- </table>
- <h2>limit_d</h2>
- <p>Ensures that the result of a parser is constrained to a given min..max range
- (inclusive). If not, then the parser fails and returns a no-match.</p>
- <p><b>Usage:</b></p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>, </span><span class=identifier>max</span><span class=special>)[</span><span class=identifier>expression</span><span class=special>]</span></font></code></pre>
- <p>This directive is particularly useful in conjunction with parsers that parse
- specific scalar ranges (for example, <a href="numerics.html">numeric parsers</a>).
- Here's a practical example. Although the numeric parsers can be configured to
- accept only a limited number of digits (say, 0..2), there is no way to limit
- the result to a range (say -1.0..1.0). This design is deliberate. Doing so would
- have undermined Spirit's design rule that <i><span class="quotes">"the
- client should not pay for features that she does not use"</span></i>. We
- would have stored the min, max values in the numeric parser itself, used or
- unused. Well, we could get by by using static constants configured by a non-type
- template parameter, but that is not acceptable because that way, we can only
- accommodate integers. What about real numbers or user defined numbers such as
- big-ints?</p>
- <p><b>Example</b>, parse time of the form <b>HH:MM:SS</b>:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>uint_parser</span><span class=special><</span><span class=keyword>int</span><span class=special>, </span><span class=number>10</span><span class=special>, </span><span class=number>2</span><span class=special>, </span><span class=number>2</span><span class=special>> </span><span class=identifier>uint2_p</span><span class=special>;
- </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>lexeme_d
- </span><span class=special>[
- </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>23u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=special>>> </span><span class=literal>':' </span><span class=comment>// Hours 00..23
- </span><span class=special>>> </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>59u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=special>>> </span><span class=literal>':' </span><span class=comment>// Minutes 00..59
- </span><span class=special>>> </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>59u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=comment>// Seconds 00..59
- </span><span class=special>];</span></font></code>
- </pre>
- <h2>min_limit_d</h2>
- <p>Sometimes, it is useful to unconstrain just the maximum limit. This will allow
- for an interval that's unbounded in one direction. The directive min_limit_d
- ensures that the result of a parser is not less than minimun. If not, then the
- parser fails and returns a no-match.</p>
- <p><b>Usage:</b></p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>)[</span><span class=identifier>expression</span><span class=special>]</span></font></code></pre>
- <p><b>Example</b>, ensure that a date is not less than 1900</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=number>1900u</span><span class=special>)[</span><span class=identifier>uint_p</span><span class=special>]</span></font></code></pre>
- <h2>max_limit_d</h2>
- <p>Opposite of <tt>min_limit_d</tt>. Take note that <tt>limit_d[p]</tt> is equivalent
- to:</p>
- <pre><code><font color="#000000"><span class=special> </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>)[</span><span class=identifier>max_limit_d</span><span class=special>(</span><span class=identifier>max</span><span class=special>)[</span><span class=identifier>p</span><span class=special>]]</span></font></code><code><font color="#000000"><span class=special></span></font></code></pre>
- <table border="0">
- <tr>
- <td width="10"></td>
- <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
- <td width="30"><a href="epsilon.html"><img src="theme/l_arr.gif" border="0"></a></td>
- <td width="30"><a href="scanner.html"><img src="theme/r_arr.gif" border="0"></a></td>
- </tr>
- </table>
- <br>
- <hr size="1">
- <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br>
- <br>
- <font size="2">Use, modification and distribution is subject to the Boost Software
- License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt) </font> </p>
- <p> </p>
- </body>
- </html>
|