[library Boost.CRC [quickbook 1.5] [version 1.5] [id crc] [dirname crc] [copyright 2001 2003 2012 Daryle Walker] [purpose Cyclic Redundancy Code computation] [category Miscellaneous] [authors [Walker, Daryle]] [license Distributed under the Boost Software License, Version 1.0. (See the accompanying file LICENSE_1_0.txt or a copy at [@http://www.boost.org/LICENSE_1_0.txt]) ] [source-mode c++] ] [/ Macros ] [def __RMCA__ Rocksoft\u2122 Model CRC Algorithm] [section:motivation What is Boost.CRC?] CRCs (cyclic redundancy codes) is one common technique to confirming data integrity after transmission. The [*Boost.CRC] library provides access to two styles of CRC computation, one as a function template, the other as a function template and two computation object class templates, where the two class templates differ in speed. [endsect] [section:introduction Introduction] [note Some of the introductory material is based on [@http://www.ross.net/crc/download/crc_v3.txt ['A Painless Guide to CRC Error Detection Algorithms]] by Ross N. Williams at his [@http://www.ross.net/crc/ [*The CRC Pitstop]] site. ] When binary data is transmitted, usually electronically, there is a chance that the data gets corrupted. One method to pick up said corruption is to generate some value that is coded from the original data, send said value to the receiver, then confirm that the received data generates the same value when it's coded at the destination. There are several possibilities after the receiver's check: * The check value matches at both places and the data was transmitted intact. * The data got corrupted and the check values do not match. * The data is intact, but the check value got corrupted. This can't the distinguished from the previous case, but is a (relatively) safe false positive. * Either the data and/or check value gets corrupted, but such that the results of the check-coding are still consistent. This is a dangerous false negative! The way to minimize false negatives is to choose coding algorithms that cause a lot of churn per input, especially a variable amount. The check values are known as [*checksum]s because they are used to /check/ for data consistency and the first coding algorithms were addition- (i.e. ['sum]ming-) based. [section:intro_crcs CRCs] [*Cyclic Redundancy Codes] are a type of consistency check that treats the message data as a (long) dividend of a modulo-2 polynomial division. Modulo-2 arithmetic doesn't use carries/borrows when combining numbers. A specific CRC defines a set number of bits to work on at a time, where said number is also the degree of a fixed polynomial (with modulo-2 coefficients) used as a divisor. Since ordering doesn't apply to modulo arithmetic, the check between the current high part of the dividend and the trial partial product (of the divisor and the trial new quotient coefficient) is done by seeing if the highest-degree coefficient of the dividend is one. (The highest-degree coefficient of the divisor must be one by definition, since it's the only non-zero choice.) The remainder after the division is finished is used as the basis of the CRC checksum. [section:intro_crcs_impl CRC Implementation] For a given degree /x/ for the modulo-2 polynomial divisor, the remainder will have at most /x/ terms (from degree /x/ - 1 down to the constant term). The coefficients are modulo-2, which means that they can be represented by 0's and 1's. So a remainder can be modeled by an (unsigned) integer of at least /x/ bits in *width*. The divisor must have its /x/ degree term be one, which means it is always known and can be implied instead of having to explicitly include in representations. Its lower /x/ terms must be specified, so a divisor can be modeled the same way as remainders. With such a modeling, the divisor representation could be said to be truncated since the uppermost term's value is implied and not stored. The remainder and [*(truncated) divisor polynomial]s are stored as basic computer integers. This is in contrast to the dividend, which is modeled from the input stream of data bits, where each new incoming bit is the next lower term of the dividend polynomial. Long division can be processed in piecemeal, reading new upper terms as needed. This maps to reading the data a byte (or bit) at a time, generating updated remainders just-in-time, without needing to read (and/or store(!)) the entire data message at once. Long division involves appending new dividend terms after the previous terms have been processed into the (interim) remainder. So the remainder it the only thing that has to change during each division step; a new input byte (or bit) is combined with the remainder to make the interim dividend, and then combined with the partial product (based on the divisor and top dividend bit(s)) to become a remainder again. [endsect] [section:intro_crcs_augment Message Augmentation] When all of the input data has been read during division, the last /x/ bits are still stuck in the interim remainder. They have not been pushed through the division steps; to do so, /x/ zero-valued extra bits must be passed into the system. This ensures all of the message's data bits get processed. The post-processed remainder is the checksum. The system requires the message to be *augmented* with /x/ extra bits to get results. Alternatively, if the post-division augmentation bits are the expected checksum instead, then the remainder will "subtract" the checksum with itself, giving zero as the final remainder. The remainder will end up non-zero if bit errors exist in either the data or checksum or both. This option requires the checksum to be fed from highest-order bit first on down (i.e. big endian). Exploiting the properties of how the division is carried out, the steps can be rearranged such that the post-processing zero-valued bits are not needed; their effect is merged into the start of the process. Such systems read *unaugmented* messages and expose the checksum directly from the interim remainder afterwards. (You can't use the "augment-message-with-checksum-and-zero-check" technique with this, of course.) [endsect] [section:intro_crcs_param Other CRC Parameters] Since long division proceeds from the uppermost terms on down, it's easiest to treat an incoming byte as the uppermost unprocessed terms, and to read the bits within that byte as the highest-order bit is the uppermost unprocessed term and go down. However, some hardware implementations have an easier time reading each byte from the lowest-order bit and go up. To simulate those systems in software, the program needs to be flagged that *input reflection* needs to be applied. Reflecting a built-in integer reverses the order of its bits, such that the lowest- and highest-order bits swap states, the next-lowest- and next-highest-order bits swap, etc. The input reflection can be done by reflecting each byte as it comes in or keeping the bytes unchanged but reflect the other internal functioning. The latter sounds harder, but what it usually done in the real world, since it's a one-time cost, unlike reflecting the bytes. Similarly, the final remainder is processed by some hardware in reverse order, which means software that simulate such systems need to flag that *output reflection* is in effect. Some CRCs don't return the remainder directly (reflected or not), but add an extra step complementing the output bits. Complementing turns 1 values into 0 values and vice versa. This can simulated by using a XOR (exclusive-or) bit mask of all 1-values (of the same bit length as the remainder). Some systems use a *final XOR mask* that isn't all 1-values, for variety. (This mask takes place /after/ any output reflection.) At the other end, the built-in-integer register normally starts at zero as the first bytes are read. Instead of just doing nothing but load input bits for /x/ steps, some CRC systems use a non-zero *initial remainder* to add extra processing. This initial value has to be different for the augmented versus un-augmented versions of the same system, due to possible incorporation with the zero-valued augment bits. [endsect] [section:intro_crcs_rmca Compact CRC Parameter Specification] The __RMCA__, or RMCA for short, was designed by Ross Williams to describe all the specification points of a given CRC system (quoted): [variablelist RMCA Parameters [[WIDTH] [This is the width of the algorithm expressed in bits. This is one less than the width of the Poly.]] [[POLY] [This parameter is the poly. This is a binary value that should be specified as a hexadecimal number. The top bit of the poly should be omitted. For example, if the poly is 10110, you should specify 06. An important aspect of this parameter is that it represents the unreflected poly; the bottom bit of this parameter is always the LSB of the divisor during the division regardless of whether the algorithm being modelled is reflected.]] [[INIT] [This parameter specifies the initial value of the register when the algorithm starts. This is the value that is to be assigned to the register in the direct table algorithm. In the table algorithm, we may think of the register always commencing with the value zero, and this value being XORed into the register after the N'th bit iteration. This parameter should be specified as a hexadecimal number.]] [[REFIN] [This is a boolean parameter. If it is FALSE, input bytes are processed with bit 7 being treated as the most significant bit (MSB) and bit 0 being treated as the least significant bit. If this parameter is FALSE, each byte is reflected before being processed.]] [[REFOUT] [This is a boolean parameter. If it is set to FALSE, the final value in the register is fed into the XOROUT stage directly, otherwise, if this parameter is TRUE, the final register value is reflected first.]] [[XOROUT] [This is an W-bit value that should be specified as a hexadecimal number. It is XORed to the final register value (after the REFOUT) stage before the value is returned as the official checksum.]] ] His description assumes an octet-sized byte. The /POLY/ is the (truncated) divisor. The /INIT/ is the initial remainder, assuming the unaugmented version of CRC processing is used. (If you're using an augmented-style CRC, you have to undo the effect of the built-in zero-augment before initialization.) [endsect] The two function templates and two class templates in this library provide ways to carry out CRC computations. You give the various __RMCA__ parameters as template parameters and/or constructor parameters. You then submit all the message data bytes at once (for the functions) or piecemeal (for the class objects). [endsect] Note that some error-detection techniques merge their checksum results within the message data, while CRC checksums are either at the end (when augmented, without either kind of reflection, with a bit-width that's a multiple of byte size, and no XOR mask) or out-of-band. [endsect] [section:crc_basic Theoretical CRC Computer] #include // for std::size_t namespace boost { template < std::size_t Bits > class crc_basic; } The [*`boost::crc_basic`] class template acts as an unaugmented-CRC processor that can accept input at the bit-level. It only takes one __RMCA__ parameter as a template parameter, the /WIDTH/, which determines the built-in unsigned integer type used for division registers. The other __RMCA__ parameters can be passed on through the constructor. (Most of the parameters have defaults.) The integer type used for registers is published as `value_type`, while the __RMCA__ attributes can be discovered as: [table:crc_basic_rmca RMCA Parameters in boost::crc_basic [[Parameter] [Member Name] [Kind] [Expression Type]] [[['WIDTH]] [`bit_count`] [class-static immutable data member] [`std::size_t`]] [[['POLY]] [`get_truncated_polynominal`] [`const` member function] [`value_type`]] [[['INIT]] [`get_initial_remainder`] [`const` member function] [`value_type`]] [[['REFIN]] [`get_reflect_input`] [`const` member function] [`bool`]] [[['REFOUT]] [`get_reflect_remainder`] [`const` member function] [`bool`]] [[['XOROUT]] [`get_final_xor_value`] [`const` member function] [`value_type`]] ] Since most of the parameters are specified at run-time, you can reuse a single computer object for separate runs with differing parameters. The type uses the automatically-defined copy/move/assign and destruction routines. [import ./crc_examples.cpp] [crc_basic_reuse] This example necessarily demonstrates input and output. There is one output member function, `checksum` that returns the remainder from the CRC steps plus any post-processing reflection and/or XOR-masking. There are several options to submit input data for processing: [table:crc_basic_input Member Functions for Input in boost::crc_basic [[Member Function] [Input Type/Style]] [[`process_bit`] [A single bit.]] [[`process_bits`] [A specified number of bits within the given byte. Reading starts from the highest-order desired bit.]] [[`process_byte`] [An entire byte. The bits within the byte are read in an order consistent with `get_reflect_input`.]] [[`process_block`] [All bytes in the range. The range is defined by a pointer to the first byte and another pointer to one (byte) past-the-end.]] [[`process_bytes`] [All bytes in the range. The range is defined by a pointer to the first byte and a count of number of bytes to read.]] ] The input functions currently do not return anything. [crc_piecemeal_run] The input-influenced state of a computer object may be read with the `get_interim_remainder` member function. An object may be loaded with the same kind of state with the one-argument version of the `reset` member function. The state is the remainder from the CRC operations performed on all the already-entered input, without any output post-processing. The two functions can be used together to save the state to a persistent medium and restore it later. Note that the __RMCA__ parameters have to saved/restored via other means, if they're not fixed within a program. Calling `reset` with no arguments sets the interim remainder to what /INIT/ was set at object construction. This lets one computer object be used for independent runs with separate data messages using the same CRC standard. [endsect] [section:crc_optimal Optimized CRC Computer] #include // for boost::uintmax_t #include // for std::size_t namespace boost { template < std::size_t Bits, uintmax_t TruncPoly, uintmax_t InitRem, uintmax_t FinalXor, bool ReflectIn, bool ReflectRem > class crc_optimal; } The [*`boost::crc_optimal`] class template acts as an unaugmented-CRC processor that can accept input at the byte-level. It takes all the __RMCA__ parameters as template parameters. Like in `crc_basic`, the /WIDTH/ stays as the first parameter and determines the built-in unsigned integer type used for division registers. The other __RMCA__ parameters move up to the template parameter field in the same relative order they were in the `crc_basic` constructor. (Some parameters have defaults.) Objects based from `crc_optimal` can either be default-constructed, giving it the same behavior as a `crc_basic` object with the equivalent settings, or use one parameter, which overrides the initial remainder value[footnote i.e. The interim-remainder before any input is read.] without permanently affecting the initial-remainder attribute. Besides template parameters and construction, `crc_optimal` differs from `crc_basic` interface-wise by: * Adding five class-static immutable data members corresponding to the new template parameters. [table:crc_optimal_rmca Additional RMCA Expressions in boost::crc_optimal [[New Member] [Equivalent]] [[`truncated_polynominal`] [`get_truncated_polynominal`]] [[`initial_remainder`] [`get_initial_remainder`]] [[`reflect_input`] [`get_reflect_input`]] [[`reflect_remainder`] [`get_reflect_remainder`]] [[`final_xor_value`] [`get_final_xor_value`]] ] * Removing the `process_bit` and `process_bits` member functions. * Adding two versions of the `operator ()` member function. The single-argument version forwards to `process_byte`, making it suitable to STL algorithms that take (and possibly return) function objects[footnote Like `std::for_each`.]. The argument-less version forwards to `checksum`, making it suitable for STL algorithms that expect generator objects[footnote Albeit this object won't change its return value within code that only uses it as a generator.]. * Merging the two `reset` member functions into one. (It uses a single parameter that can have a default argument). The major difference between `crc_basic` and `crc_optimal` is the internals. Objects from `crc_basic` run their CRC algorithm one bit at a time, no matter which unit is used as input. Objects from `crc_optimal`, when /WIDTH/ is at least `CHAR_BIT`[footnote i.e. The optimizations are suspended if the /WIDTH/ only justifies using part of a single byte.], use a byte-indexed table-driven CRC algorithm which is a *lot* faster than processing individual bits. Since all of the parameters are specified at compile-time, you cannot reuse a single computer object for separate runs with differing parameters. The type uses the automatically-defined copy/move/assign and destruction routines. [endsect] [section:crc_function CRC Function] #include // for boost::uintmax_t #include // for boost::uint_t #include // for std::size_t namespace boost { template < std::size_t Bits, uintmax_t TruncPoly, uintmax_t InitRem, uintmax_t FinalXor, bool ReflectIn, bool ReflectRem > typename uint_t::fast crc( void const *buffer, std::size_t byte_count ); } The [*`boost::crc`] function template computes the unaugmented CRC of a single data block in one pass. It has the same template parameter list as `boost::crc_optimal`, which it uses for implementation. This function saves you the trouble of setting up and using the `boost::crc_optimal` object manually. [endsect] [section:acrc_function Augmented-CRC Function] #include // for boost::uintmax_t #include // for boost::uint_t #include // for std::size_t namespace boost { template < std::size_t Bits, uintmax_t TruncPoly > typename uint_t::fast augmented_crc( void const *buffer, std::size_t byte_count, typename uint_t::fast initial_remainder = 0u ); } The [*`boost::augmented_crc`] function computes the augmented-style CRC of a data block. Like `boost::crc`, the first two template parameters are the /WIDTH/ and /POLY/. However, the /INIT/ has moved to being a function parameter, after the data block's starting address and byte length, defaulting to zero if not given. This function uses modulo-2 division at its most raw, and so forfeits the /REFIN/, /REFOUT/, and /XOROUT/ attributes, setting them to `0` or `false`. Another difference from `boost::crc` is that a non-zero /INIT/ has to be skewed when used with this function. (No conversion functions are currently given.) [acrc_piecemeal_run] Since `augmented_crc` doesn't know when your data ends, you must supply the augment, either /WIDTH/ zero bits or the expected checksum. The augment can be either at the end of last data block or from an extra call. Remember that if an expected checksum is used as the augment, its bits must be arranged in big-endian order. Because `augmented_crc` reads byte-wise, while augments assume bit-wise reading, augmentations are valid only when /WIDTH/ is a multiple of the bits-per-byte ratio (`CHAR_BIT`). [endsect] [section:crc_samples Pre-Defined CRC Samples] namespace boost { typedef crc_optimal<16, 0x8005, 0, 0, true, true> crc_16_type; typedef crc_optimal<16, 0x1021, 0xFFFF, 0, false, false> crc_ccitt_false_t, crc_ccitt_type; typedef crc_optimal<16, 0x1021, 0, 0, true, true> crc_ccitt_true_t; typedef crc_optimal<16, 0x8408, 0, 0, true, true> crc_xmodem_type; typedef crc_optimal<16, 0x1021, 0, 0, false, false> crc_xmodem_t; typedef crc_optimal<32, 0x04C11DB7, 0xFFFFFFFF, 0xFFFFFFFF, true, true> crc_32_type; } Several sample CRC types are given, representing common CRC algorithms. The samples have now been checked against the [@http://regregex.bbcmicro.net/crc-catalogue.htm ['Catalogue of parametrised CRC algorithms]], leading to some new type-aliases corresponding to the corrected profiles. (Older, incorrect profiles keep their name for backwards compatibility.) However, this library is primarily concerned with CRC implementation, and not with determining "good" sets of CRC parameters. [table:crc_samples_list Common CRCs [[Computer Type] [Standard(s)]] [[`crc_16_type`] [BISYNCH, ARC, LHA, ZOO]] [[`crc_ccitt_false_t`] [Commonly misidentified as the standard by CCITT]] [[`crc_ccitt_type`] [`crc_ccitt_false_t` (I made the same mistake.)]] [[`crc_ccitt_true_t`] [Designated by CCITT (Comit\u00E9 Consultatif International T\u00E9l\u00E9graphique et T\u00E9l\u00E9phonique), KERMIT]] [[`crc_xmodem_type`] [A mistake I didn't catch in defining `crc_xmodem_t`.]] [[`crc_xmodem_t`] [XMODEM, ZMODEM, ACORN]] [[`crc_32_type`] [ADCCP, PKZip, libPNG, AUTODIN II, Ethernet, FDDI]] ] [endsect] [section:end End Matter] [heading References] *The [@boost:/boost/crc.hpp CRC header] itself *Some [@boost:/libs/crc/test/crc_test.cpp test code] and some [@boost:/libs/crc/test/crc_test2.cpp newer tests] that use the [@boost:/libs/test/index.html Boost test library] *Some [@boost:/libs/crc/crc_example.cpp example code] [variablelist History [[Boost 1.49.0] [Refined implementation, testing, and documentation. Some interfaces were tweaked.]] [[Boost 1.30.2 (estimated)] [Released an example program.]] [[Boost 1.22.0] [First public release.]] ] [heading Acknowledgments] For giving advice on compiler/C++ compliance, implementation, interface, algorithms, and bug reports: *Darin Adler *Beman Dawes *Doug Gregor *John Maddock *Joe Mariadassou *Jens Maurer *Vladimir Prus *Joel Young [variablelist Contributors [[Michael Barr [@mailto:mbarr@netrino.com mbarr@netrino.com]] [Wrote [@http://www.netrino.com/Embedded-Systems/How-To/CRC-Calculation-C-Code ["CRC Implementation Code in C]], a less-confusing guide to implementing CRC algorithms. (Originally published as ["Slow and Steady Never Lost the Race] in the January 2000 issue of [@http://www.embedded.com ['Embedded Systems Programming]], pages 37\u201346. The web version used to be known as [@http://www.netrino.com/Connecting/2000-01/ ['Easier Said Than Done]].) ]] [[Daryle Walker] [Started the library and contributed the theoretical and optimal CRC computation class templates and the CRC computing function template. Contributed the test and example code. ]] [[Ross N. Williams] [Wrote [@http://www.ross.net/crc/crcpaper.html ['A Painless Guide to CRC Error Detection Algorithms]], a definitive source of CRC information. ]] ] [endsect] [xinclude autodoc.xml]