<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!--
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt)
-->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" type="text/css" href="../../../boost.css">
<link rel="stylesheet" type="text/css" href="style.css">
<title>Serialization - Dataflow Iterators</title>
</head>
<body link="#0000ff" vlink="#800080">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
<tr>
<td valign="top" width="300">
<h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
</td>
<td valign="top">
<h1 align="center">Serialization</h1>
<h2 align="center">Dataflow Iterators</h2>
</td>
</tr>
</table>
<hr>
<h3>Motivation</h3>
Consider the problem of translating an arbitrary length sequence of 8 bit bytes
to base64 text. Such a process can be summarized as:
<p>
source =&gt; 8 bit bytes =&gt; 6 bit integers =&gt; encode to base64 characters =&gt; insert line breaks =&gt; destination
<p>
We would prefer a solution that is:
<ul>
<li>Decomposable, so we can code, test, verify and use each (simple) stage of the conversion
independently.
<li>Composable, so we can use this composite as a new component somewhere else.
<li>Efficient, so we're not required to re-implement it again.
<li>Scalable, so that it works well for short and arbitrarily long sequences.
</ul>
The approach that comes closest to meeting these requirements is that described
and implemented with <a href="../../iterator/doc/index.html">Iterator Adaptors</a>.
The fundamental feature of an Iterator Adaptor template that makes it interesting to
us is that it takes as a parameter a base iterator from which it derives its
input. This suggests that something like the following might be possible.
<pre><code>
typedef
    insert_linebreaks&lt;         // insert line breaks every 76 characters
        base64_from_binary&lt;    // convert binary values to base64 characters
            transform_width&lt;   // retrieve 6 bit integers from a sequence of 8 bit bytes
                const char *,
                6,
                8
            &gt;
        &gt;
        ,76
    &gt;
    base64_text; // compose all the above operations into a new iterator

std::copy(
    base64_text(address),
    base64_text(address + count),
    ostream_iterator&lt;CharType&gt;(os)
);
</code></pre>
Indeed, this seems to be exactly the kind of problem that iterator adaptors are
intended to address. The Iterator Adaptor library already includes
modules which can be configured to implement some of the operations above. For example,
included is <a target="transform_iterator" href="../../iterator/doc/transform_iterator.html">
transform_iterator</a>, which can be used to implement 6 bit integer =&gt; base64 code.
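<p>
As a point of comparison, the stages listed above can be sketched without iterators at all.
The following is only a minimal, eager sketch using the standard library. The names
<code style="white-space: normal">sextets_from_bytes</code>, <code style="white-space: normal">base64_from_sextets</code>,
<code style="white-space: normal">break_lines</code> and <code style="white-space: normal">base64_encode</code>
are hypothetical and are not part of the serialization library, and the sketch is assumed to zero-fill the
final 6 bit group rather than emit "=" padding. Each stage can be tested on its own, but every stage
materializes its whole output, which is the cost the iterator approach is meant to avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stage 1: regroup 8 bit bytes into 6 bit integers.
// Left over bits at the end are zero-filled on the right.
std::vector<unsigned char> sextets_from_bytes(const std::string& bytes) {
    std::vector<unsigned char> out;
    unsigned int buffer = 0; // most recent input bits, newest in the low end
    int bits = 0;            // number of bits in buffer not yet emitted
    for (unsigned char byte : bytes) {
        buffer = (buffer << 8) | byte;
        bits += 8;
        while (bits >= 6) {
            bits -= 6;
            out.push_back(static_cast<unsigned char>((buffer >> bits) & 0x3F));
        }
    }
    if (bits > 0)
        out.push_back(static_cast<unsigned char>((buffer << (6 - bits)) & 0x3F));
    return out;
}

// Stage 2: map each 6 bit integer to its base64 character.
std::string base64_from_sextets(const std::vector<unsigned char>& sextets) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (unsigned char s : sextets) {
        assert(s < 64);
        out.push_back(table[s]);
    }
    return out;
}

// Stage 3: insert a newline after every `width` characters.
std::string break_lines(const std::string& text, std::size_t width) {
    assert(width > 0);
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (i > 0 && i % width == 0)
            out.push_back('\n');
        out.push_back(text[i]);
    }
    return out;
}

// Composite: the three stages applied in order.
std::string base64_encode(const std::string& bytes, std::size_t width = 76) {
    return break_lines(base64_from_sextets(sextets_from_bytes(bytes)), width);
}
```

For example, <code style="white-space: normal">base64_encode("Man")</code> yields <code style="white-space: normal">TWFu</code>.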
<h3>Dataflow Iterators</h3>
Unfortunately, not all iterators which inherit from Iterator Adaptors are guaranteed
to meet the composability goals stated above. To accomplish this purpose, they have
to be written with some additional considerations in mind.
We define a Dataflow Iterator as a class inherited from <code style="white-space: normal">iterator_adaptor</code> which
fulfills a small set of additional requirements.
<h4>Templated Constructors</h4>
<p>
Templated constructors have the form:
<pre><code>
template&lt;class T&gt;
dataflow_iterator(T start) :
    iterator_adaptor(Base(start))
{}
</code></pre>
When these constructors are applied to our example above, the following code is generated:
<pre><code>
std::copy(
    insert_linebreaks(
        base64_from_binary(
            transform_width(
                address
            )
        )
    ),
    insert_linebreaks(
        base64_from_binary(
            transform_width(
                address + count
            )
        )
    ),
    ostream_iterator&lt;char&gt;(os)
);
</code></pre>
The recursive application of this template is what automatically generates the
constructor <code style="white-space: normal">base64_text(const char *)</code> in our example above. The original
Iterator Adaptors include <code style="white-space: normal">make_xxx_iterator</code> functions to fulfill this role.
However, I believe these are unwieldy to use compared to the above solution using
templated constructors.
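<p>
The forwarding effect can be sketched with two small adaptors written against the standard library only.
This is a hedged illustration, not the library's code: <code style="white-space: normal">plus_one</code>,
<code style="white-space: normal">times_two</code>, <code style="white-space: normal">collect</code> and
<code style="white-space: normal">composite</code> are hypothetical names, and the adaptors are written by hand
rather than derived from <code style="white-space: normal">iterator_adaptor</code> to keep the sketch self-contained.

```cpp
#include <cassert>
#include <vector>

// Hypothetical adaptor: yields each base element plus one.
template<class Base>
class plus_one {
    Base base_;
public:
    // Templated constructor: forwards whatever it is given to Base.
    template<class T>
    explicit plus_one(T start) : base_(Base(start)) {}

    int operator*() const { return *base_ + 1; }
    plus_one& operator++() { ++base_; return *this; }
    bool operator!=(const plus_one& rhs) const { return base_ != rhs.base_; }
};

// Hypothetical adaptor: yields each base element times two.
template<class Base>
class times_two {
    Base base_;
public:
    template<class T>
    explicit times_two(T start) : base_(Base(start)) {}

    int operator*() const { return *base_ * 2; }
    times_two& operator++() { ++base_; return *this; }
    bool operator!=(const times_two& rhs) const { return base_ != rhs.base_; }
};

// Helper: drain an iterator range into a vector.
template<class Iter>
std::vector<int> collect(Iter first, Iter last) {
    std::vector<int> out;
    for (; first != last; ++first)
        out.push_back(*first);
    return out;
}

// The composite is constructed directly from the innermost argument type.
typedef times_two<plus_one<const int*> > composite;

void example_templated_constructors() {
    const int data[] = {1, 2, 3};
    // composite(data) expands to times_two(plus_one(data)).
    std::vector<int> result = collect(composite(data), composite(data + 3));
    assert(result.size() == 3);
    assert(result[0] == 4 && result[1] == 6 && result[2] == 8);
}
```

Each layer passes its argument on to the constructor of its own base, so the outermost type accepts a plain
<code style="white-space: normal">const int *</code> without any helper function.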
<h4>Dereferencing</h4>
Dereferencing some iterators can cause problems. For example, a natural
way to write a <code style="white-space: normal">remove_whitespace</code> iterator is to increment past the initial
whitespace when the iterator is constructed. This will fail if the iterator passed to the
constructor "points" to the end of a string. The
<a target="filter_iterator" href="../../iterator/doc/filter_iterator.html">
<code style="white-space: normal">filter_iterator</code></a> is implemented
in this way, so it can't be used in our context. Instead, in our implementation of this iterator,
whitespace removal is deferred until the iterator is actually dereferenced.
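<p>
A minimal sketch of this deferral, under stated assumptions: <code style="white-space: normal">skip_whitespace</code>
and <code style="white-space: normal">strip</code> are hypothetical names, the base is a plain
<code style="white-space: normal">const char *</code> into a null-terminated string, every position is dereferenced
before it is incremented, and the input does not end in whitespace. It is not the library's
<code style="white-space: normal">remove_whitespace</code>.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <string>

// Hypothetical adaptor that skips whitespace lazily.
class skip_whitespace {
    mutable const char* base_; // may be advanced from within operator*
    mutable bool settled_;     // true once base_ is known to be at a non-space
public:
    // The constructor never reads through start, so it is safe to
    // construct an iterator that "points" to the end of the string.
    explicit skip_whitespace(const char* start)
        : base_(start), settled_(false) {}

    // Whitespace is skipped here, on first dereference of a position.
    char operator*() const {
        if (!settled_) {
            while (std::isspace(static_cast<unsigned char>(*base_)))
                ++base_;
            settled_ = true;
        }
        return *base_;
    }
    skip_whitespace& operator++() {
        settled_ = false;
        ++base_;
        return *this;
    }
    bool operator!=(const skip_whitespace& rhs) const {
        return base_ != rhs.base_;
    }
};

// Copy text with its whitespace removed.
std::string strip(const char* text) {
    const char* end = text + std::strlen(text);
    std::string out;
    for (skip_whitespace it(text), last(end); it != last; ++it)
        out.push_back(*it);
    return out;
}

void example_deferred_skip() {
    assert(strip("  a b\tc") == "abc");
    assert(strip("").empty()); // the end iterator is never dereferenced
}
```

Had the skipping loop been placed in the constructor, building the end iterator would already have read
past the last character of the sequence.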
<h4>Comparison</h4>
The default implementation of iterator equality of <code style="white-space: normal">iterator_adaptor</code> just
invokes the equality operator on the base iterators. Generally this is satisfactory.
However, it relies on other operations (e.g. dereference) not prematurely
incrementing the base iterator. Guaranteeing this can be surprisingly tricky in some cases
(e.g. <code style="white-space: normal">transform_width</code>).
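<p>
One way to keep base-iterator equality meaningful is to do any look-ahead on a copy of the base iterator.
The sketch below assumes a multi-pass base iterator and an even number of input elements;
<code style="white-space: normal">pair_sum</code> is a hypothetical adaptor used only for illustration and is
much simpler than <code style="white-space: normal">transform_width</code>.

```cpp
#include <cassert>
#include <vector>

// Hypothetical adaptor: each output element is the sum of two
// consecutive base elements.
template<class Base>
class pair_sum {
    Base base_;
public:
    template<class T>
    explicit pair_sum(T start) : base_(Base(start)) {}

    // Dereference reads two elements but advances only a local copy,
    // so base_ still marks the start of the current pair.
    int operator*() const {
        Base next = base_;
        const int first = *next;
        ++next;
        return first + *next;
    }
    // Only increment moves the stored base iterator.
    pair_sum& operator++() {
        ++base_;
        ++base_;
        return *this;
    }
    // Equality can therefore simply compare the base iterators.
    bool operator==(const pair_sum& rhs) const { return base_ == rhs.base_; }
    bool operator!=(const pair_sum& rhs) const { return !(*this == rhs); }
};

void example_pair_sum() {
    const int data[] = {1, 2, 3, 4};
    typedef pair_sum<const int*> sums;
    std::vector<int> out;
    for (sums it(data), last(data + 4); it != last; ++it) {
        const int once = *it;
        assert(once == *it); // a second dereference sees the same position
        out.push_back(once);
    }
    assert(out.size() == 2 && out[0] == 3 && out[1] == 7);
}
```

If dereference advanced the stored base iterator instead, an iterator that had been dereferenced would no
longer compare equal to an otherwise identical one that had not.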
<p>
Iterators which fulfill the above requirements should be composable and the above sample
code should implement our binary to base64 conversion.
<h3>Iterators Included in the Library</h3>
Dataflow iterators for the serialization library are all defined in the namespace
<code style="white-space: normal">boost::archive::iterators</code>. Included here are:
<dl class="index">
<dt><a target="base64_from_binary" href="../../../boost/archive/iterators/base64_from_binary.hpp">
base64_from_binary</a></dt>
<dd>transforms a sequence of integers to base64 text</dd>
<dt><a target="base64_from_binary" href="../../../boost/archive/iterators/binary_from_base64.hpp">
binary_from_base64</a></dt>
<dd>transforms a sequence of base64 characters to a sequence of integers</dd>
<dt><a target="insert_linebreaks" href="../../../boost/archive/iterators/insert_linebreaks.hpp">
insert_linebreaks</a></dt>
<dd>given a sequence, creates a sequence with newline characters inserted</dd>
<dt><a target="mb_from_wchar" href="../../../boost/archive/iterators/mb_from_wchar.hpp">
mb_from_wchar</a></dt>
<dd>transforms a sequence of wide characters to a sequence of multi-byte characters</dd>
<dt><a target="remove_whitespace" href="../../../boost/archive/iterators/remove_whitespace.hpp">
remove_whitespace</a></dt>
<dd>given a sequence of characters, returns a sequence with the whitespace characters
removed. This is a derivation from the <code style="white-space: normal">boost::filter_iterator</code></dd>
<dt><a target="transform_width" href="../../../boost/archive/iterators/transform_width.hpp">
transform_width</a></dt>
<dd>transforms a sequence of x bit elements into a sequence of y bit elements. This
is a key component in iterators which translate to and from base64 text.</dd>
<dt><a target="wchar_from_mb" href="../../../boost/archive/iterators/wchar_from_mb.hpp">
wchar_from_mb</a></dt>
<dd>transforms a sequence of multi-byte characters in the current locale to wide characters.</dd>
<dt><a target="xml_escape" href="../../../boost/archive/iterators/xml_escape.hpp">
xml_escape</a></dt>
<dd>escapes xml meta-characters from xml text</dd>
<dt><a target="xml_unescape" href="../../../boost/archive/iterators/xml_unescape.hpp">
xml_unescape</a></dt>
<dd>unescapes xml escape sequences to create a sequence of normal text</dd>
</dl>
<p>
The standard stream iterators don't quite work for us. On systems which implement <code style="white-space: normal">wchar_t</code>
as unsigned short integers (e.g. VC 6) they didn't function as I expected. I also made some
adjustments to be consistent with our concept of Dataflow Iterators. Like the rest of our
iterators, they are found in the namespace <code style="white-space: normal">boost::archive::iterators</code> to avoid
conflicts with the standard library versions.
<dl class = "index">
<dt><a target="istream_iterator" href="../../../boost/archive/iterators/istream_iterator.hpp">
istream_iterator</a></dt>
<dt><a target="ostream_iterator" href="../../../boost/archive/iterators/ostream_iterator.hpp">
ostream_iterator</a></dt>
</dl>
<hr>
<p><i>&copy; Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
Distributed under the Boost Software License, Version 1.0. (See
accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
</i></p>
</body>
</html>