- <!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
- <html>
- <!--
- (C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
- Use, modification and distribution is subject to the Boost Software
- License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt)
- -->
- <head>
- <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
- <link rel="stylesheet" type="text/css" href="../../../boost.css">
- <link rel="stylesheet" type="text/css" href="style.css">
- <title>Serialization - Dataflow Iterators</title>
- </head>
- <body link="#0000ff" vlink="#800080">
- <table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
- <tr>
- <td valign="top" width="300">
- <h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
- </td>
- <td valign="top">
- <h1 align="center">Serialization</h1>
- <h2 align="center">Dataflow Iterators</h2>
- </td>
- </tr>
- </table>
- <hr>
- <h3>Motivation</h3>
- Consider the problem of translating an arbitrary length sequence of 8 bit bytes
- to base64 text. Such a process can be summarized as:
- <p>
- source => 8 bit bytes => 6 bit integers => encode to base64 characters => insert line breaks => destination
- <p>
- We would prefer a solution that is:
- <ul>
- <li>Decomposable, so we can code, test, verify and use each (simple) stage of the conversion
- independently.
- <li>Composable, so we can use this composite as a new component somewhere else.
- <li>Efficient, so we're not required to re-implement it.
- <li>Scalable, so that it works well for short and arbitrarily long sequences.
- </ul>
- The approach that comes closest to meeting these requirements is that described
- and implemented with <a href="../../iterator/doc/index.html">Iterator Adaptors</a>.
- The fundamental feature of an Iterator Adaptor template that makes it interesting to
- us is that it takes as a parameter a base iterator from which it derives its
- input. This suggests that something like the following might be possible.
- <pre><code>
- typedef
-     insert_linebreaks<         // insert line breaks every 76 characters
-         base64_from_binary<    // convert binary values to base64 characters
-             transform_width<   // retrieve 6 bit integers from a sequence of 8 bit bytes
-                 const char *,
-                 6,
-                 8
-             >
-         >
-         ,76
-     >
-     base64_text; // compose all the above operations into a new iterator
- std::copy(
-     base64_text(address),
-     base64_text(address + count),
-     ostream_iterator<CharType>(os)
- );
- </code></pre>
- Indeed, this seems to be exactly the kind of problem that iterator adaptors are
- intended to address. The Iterator Adaptor library already includes
- modules which can be configured to implement some of the operations above. For example,
- included is <a target="transform_iterator" href="../../iterator/doc/transform_iterator.html">
- transform_iterator</a>, which can be used to implement 6 bit integer => base64 code.
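<p>
As a minimal, self-contained sketch of that single stage (6 bit integer => base64 character), the following uses only the standard library. Here <code style="white-space: normal">std::transform</code> stands in for <code style="white-space: normal">transform_iterator</code>, and the names <code style="white-space: normal">to_base64_char</code> and <code style="white-space: normal">encode_sextets</code> are illustrative assumptions, not part of any library.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Map one 6 bit value (0..63) to its base64 character.
// (Hypothetical helper, named for this sketch only.)
inline char to_base64_char(unsigned char sextet) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789+/";
    assert(sextet < 64);  // precondition: the input is a 6 bit value
    return table[sextet];
}

// Apply the mapping element by element. A transform_iterator would do the
// same work lazily, one element per dereference, instead of eagerly.
inline std::string encode_sextets(const std::vector<unsigned char>& sextets) {
    std::string out;
    std::transform(sextets.begin(), sextets.end(),
                   std::back_inserter(out), to_base64_char);
    return out;
}
```

<p>
For example, the four 6 bit values 19, 22, 5, 46 (the regrouped bytes of "Man") map to the text "TWFu".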
- <h3>Dataflow Iterators</h3>
- Unfortunately, not all iterators which inherit from Iterator Adaptors are guaranteed
- to meet the composability goals stated above. To accomplish this purpose, they have
- to be written with some additional considerations in mind.
- We define a Dataflow Iterator as a class inherited from <code style="white-space: normal">iterator_adaptor</code> which
- fulfills a small set of additional requirements.
- <h4>Templated Constructors</h4>
- <p>
- Templated constructors have the form:
- <pre><code>
- template<class T>
- dataflow_iterator(T start) :
-     iterator_adaptor(Base(start))
- {}
- </code></pre>
- When these constructors are applied to our example above, the following code is generated:
- <pre><code>
- std::copy(
-     insert_linebreaks(
-         base64_from_binary(
-             transform_width(
-                 address
-             )
-         )
-     ),
-     insert_linebreaks(
-         base64_from_binary(
-             transform_width(
-                 address + count
-             )
-         )
-     ),
-     ostream_iterator<char>(os)
- );
- </code></pre>
- The recursive application of this template is what automatically generates the
- constructor <code style="white-space: normal">base64_text(const char *)</code> in our example above. The original
- Iterator Adaptors include a <code style="white-space: normal">make_xxx_iterator</code> function for this purpose.
- However, I believe these are unwieldy to use compared to the above solution using
- templated constructors.
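<p>
The recursive construction can be seen in a small sketch that does not depend on Boost. The two stages below (<code style="white-space: normal">upper_case</code> and <code style="white-space: normal">underscore_spaces</code>) and the helper <code style="white-space: normal">shout_text</code> are hypothetical names invented for illustration; they hold their base iterator directly rather than inheriting from <code style="white-space: normal">iterator_adaptor</code>, and they implement only the operations the loop needs.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical inner stage: upper-cases each character on dereference.
template<class Base>
class upper_case {
    Base base_;
public:
    // Templated constructor: forwards whatever it is given to the Base.
    template<class T>
    explicit upper_case(T start) : base_(Base(start)) {}

    char operator*() const {
        return static_cast<char>(
            std::toupper(static_cast<unsigned char>(*base_)));
    }
    upper_case& operator++() { ++base_; return *this; }
    bool operator!=(const upper_case& rhs) const { return base_ != rhs.base_; }
};

// Hypothetical outer stage: replaces spaces with underscores.
template<class Base>
class underscore_spaces {
    Base base_;
public:
    template<class T>
    explicit underscore_spaces(T start) : base_(Base(start)) {}

    char operator*() const {
        const char c = *base_;
        return c == ' ' ? '_' : c;
    }
    underscore_spaces& operator++() { ++base_; return *this; }
    bool operator!=(const underscore_spaces& rhs) const { return base_ != rhs.base_; }
};

// The composite is constructible directly from a const char *,
// because each templated constructor passes its argument one level down.
typedef underscore_spaces<upper_case<const char*> > shout;

inline std::string shout_text(const char* first, const char* last) {
    assert(first <= last);  // precondition: a valid range
    std::string out;
    for (shout it(first), end(last); it != end; ++it)
        out += *it;
    return out;
}
```

<p>
Constructing <code style="white-space: normal">shout</code> from a pointer builds the inner <code style="white-space: normal">upper_case</code> first and then wraps it, with no <code style="white-space: normal">make_xxx_iterator</code> call at any level.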
- <h4>Dereferencing</h4>
- Dereferencing some iterators can cause problems. For example, a natural
- way to write a <code style="white-space: normal">remove_whitespace</code> iterator is to increment past the initial
- whitespace when the iterator is constructed. This will fail if the iterator passed to the
- constructor "points" to the end of a string, because the constructor would then dereference an end iterator. The
- <a target="filter_iterator" href="../../iterator/doc/filter_iterator.html">
- <code style="white-space: normal">filter_iterator</code></a> is implemented
- in this way, so it can't be used in our context. Instead, in our implementation of this iterator,
- whitespace removal is deferred until the iterator is actually dereferenced.
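<p>
A rough sketch of the deferred approach follows. The class <code style="white-space: normal">lazy_skip_ws</code> and the helper <code style="white-space: normal">strip_ws</code> are hypothetical and are not the library's implementation. The sketch also assumes the range does not end in whitespace; handling that case needs extra care and is left out here.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical whitespace-skipping iterator over a character range.
// It reads *base_ only inside operator*, never in the constructor,
// so it can safely be constructed from an end pointer.
class lazy_skip_ws {
    mutable const char* base_;  // advanced lazily from the const operator*
public:
    explicit lazy_skip_ws(const char* start) : base_(start) {}

    char operator*() const {
        // Deferred work: skip whitespace only when a value is requested.
        while (std::isspace(static_cast<unsigned char>(*base_)))
            ++base_;
        return *base_;
    }
    lazy_skip_ws& operator++() { ++base_; return *this; }
    bool operator!=(const lazy_skip_ws& rhs) const { return base_ != rhs.base_; }
};

// Assumption: [first, last) does not end in whitespace.
inline std::string strip_ws(const char* first, const char* last) {
    assert(first <= last);  // precondition: a valid range
    std::string out;
    for (lazy_skip_ws it(first), end(last); it != end; ++it)
        out += *it;
    return out;
}
```

<p>
Because construction does no work, building both the begin and the end iterator of an empty range is harmless.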
- <h4>Comparison</h4>
- The default implementation of iterator equality in <code style="white-space: normal">iterator_adaptor</code> just
- invokes the equality operator on the base iterators. Generally this is satisfactory.
- However, it relies on other operations (e.g. dereference) not prematurely
- incrementing the base iterator. Ensuring this can be surprisingly tricky in some cases
- (e.g. <code style="white-space: normal">transform_width</code>).
- <p>
- Iterators which fulfill the above requirements should be composable and the above sample
- code should implement our binary to base64 conversion.
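<p>
To make the comparison requirement concrete, here is a sketch of an adaptor that needs two base elements for every output value, yet leaves its stored base position alone when dereferenced. The class <code style="white-space: normal">byte_from_hex</code> and the helper <code style="white-space: normal">decode_hex</code> are hypothetical names. Reading ahead through a copy, as done here, assumes the base iterator can be copied and re-read, which holds for pointers but not for single-pass input iterators.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical adaptor: two hex digits from the base yield one byte.
class byte_from_hex {
    const char* base_;
    static int digit(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        assert(c >= 'A' && c <= 'F');  // precondition: a hex digit
        return c - 'A' + 10;
    }
public:
    explicit byte_from_hex(const char* start) : base_(start) {}

    // Reads ahead through a copy; base_ itself is left untouched,
    // so equality on the base position stays meaningful.
    unsigned char operator*() const {
        const char* p = base_;
        const int hi = digit(*p++);
        const int lo = digit(*p);
        return static_cast<unsigned char>(hi * 16 + lo);
    }
    // Only increment moves the base, by one whole output element.
    byte_from_hex& operator++() { base_ += 2; return *this; }
    bool operator==(const byte_from_hex& rhs) const { return base_ == rhs.base_; }
    bool operator!=(const byte_from_hex& rhs) const { return !(*this == rhs); }
};

inline std::vector<unsigned char> decode_hex(const std::string& s) {
    assert(s.size() % 2 == 0);  // precondition: whole digit pairs only
    std::vector<unsigned char> out;
    const char* first = s.data();
    for (byte_from_hex it(first), end(first + s.size()); it != end; ++it)
        out.push_back(*it);
    return out;
}
```

<p>
Dereferencing the same iterator any number of times leaves it equal to an untouched copy, which is the property the default equality relies on.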
- <h3>Iterators Included in the Library</h3>
- Dataflow iterators for the serialization library are all defined in the namespace
- <code style="white-space: normal">boost::archive::iterators</code>. Included here are:
- <dl class="index">
- <dt><a target="base64_from_binary" href="../../../boost/archive/iterators/base64_from_binary.hpp">
- base64_from_binary</a></dt>
- <dd>transforms a sequence of integers to base64 text</dd>
- <dt><a target="binary_from_base64" href="../../../boost/archive/iterators/binary_from_base64.hpp">
- binary_from_base64</a></dt>
- <dd>transforms a sequence of base64 characters to a sequence of integers</dd>
- <dt><a target="insert_linebreaks" href="../../../boost/archive/iterators/insert_linebreaks.hpp">
- insert_linebreaks</a></dt>
- <dd>given a sequence, creates a sequence with newline characters inserted</dd>
- <dt><a target="mb_from_wchar" href="../../../boost/archive/iterators/mb_from_wchar.hpp">
- mb_from_wchar</a></dt>
- <dd>transforms a sequence of wide characters to a sequence of multi-byte characters</dd>
- <dt><a target="remove_whitespace" href="../../../boost/archive/iterators/remove_whitespace.hpp">
- remove_whitespace</a></dt>
- <dd>given a sequence of characters, returns a sequence with the whitespace characters
- removed. This is a derivation from the <code style="white-space: normal">boost::filter_iterator</code></dd>
- <dt><a target="transform_width" href="../../../boost/archive/iterators/transform_width.hpp">
- transform_width</a></dt>
- <dd>transforms a sequence of x bit elements into a sequence of y bit elements. This
- is a key component in iterators which translate to and from base64 text.</dd>
- <dt><a target="wchar_from_mb" href="../../../boost/archive/iterators/wchar_from_mb.hpp">
- wchar_from_mb</a></dt>
- <dd>transforms a sequence of multi-byte characters in the current locale to wide characters.</dd>
- <dt><a target="xml_escape" href="../../../boost/archive/iterators/xml_escape.hpp">
- xml_escape</a></dt>
- <dd>escapes xml meta-characters from xml text</dd>
- <dt><a target="xml_unescape" href="../../../boost/archive/iterators/xml_unescape.hpp">
- xml_unescape</a></dt>
- <dd>unescapes xml escape sequences to create a sequence of normal text</dd>
- </dl>
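<p>
To illustrate what a width transformation computes in the base64 case (8 bit elements in, 6 bit elements out), here is an eager, standalone sketch. The function <code style="white-space: normal">sextets_from_bytes</code> is a hypothetical name, and padding the final partial group with zero bits is an assumption of this sketch rather than a statement about the library's behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Regroup a sequence of 8 bit bytes into 6 bit values, most significant
// bits first. A trailing partial group is padded with zero bits.
inline std::vector<unsigned char>
sextets_from_bytes(const std::vector<unsigned char>& bytes) {
    std::vector<unsigned char> out;
    unsigned int buffer = 0;  // holds bits that have not been emitted yet
    int bits = 0;             // number of valid bits in buffer
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        buffer = (buffer << 8) | bytes[i];
        bits += 8;
        while (bits >= 6) {
            bits -= 6;
            out.push_back(
                static_cast<unsigned char>((buffer >> bits) & 0x3F));
        }
        buffer &= (1u << bits) - 1;  // drop the bits already emitted
    }
    assert(bits >= 0 && bits < 6);  // invariant: less than one group remains
    if (bits > 0)  // pad the final partial group with zeros on the right
        out.push_back(
            static_cast<unsigned char>((buffer << (6 - bits)) & 0x3F));
    return out;
}
```

<p>
The three bytes 0x4D, 0x61, 0x6E ("Man") regroup into the four values 19, 22, 5, 46. A dataflow iterator would produce the same values one at a time, on demand.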
- <p>
- The standard stream iterators don't quite work for us. On systems which implement <code style="white-space: normal">wchar_t</code>
- as unsigned short integers (e.g. VC 6) they didn't function as I expected. I also made some
- adjustments to be consistent with our concept of Dataflow Iterators. Like the rest of our
- iterators, they are found in the namespace <code style="white-space: normal">boost::archive::iterators</code> to avoid
- conflicts with the standard library versions.
- <dl class="index">
- <dt><a target="istream_iterator" href="../../../boost/archive/iterators/istream_iterator.hpp">
- istream_iterator</a></dt>
- <dt><a target="ostream_iterator" href="../../../boost/archive/iterators/ostream_iterator.hpp">
- ostream_iterator</a></dt>
- </dl>
- <hr>
- <p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
- Distributed under the Boost Software License, Version 1.0. (See
- accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
- </i></p>
- </body>
- </html>