projects.qbk 18 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503
  1. [/
  2. Copyright (c) 2009-2009 Joachim Faulhaber
  3. Distributed under the Boost Software License, Version 1.0.
  4. (See accompanying file LICENSE_1_0.txt or copy at
  5. http://www.boost.org/LICENSE_1_0.txt)
  6. ]
  7. [section Projects]
  8. ['*Projects*] are examples on the usage of interval containers
  9. that go beyond small toy snippets of code. The code presented
  10. here addresses more serious applications that approach the
  11. quality of real world programming. At the same time it aims to
  12. guide the reader more deeply into various aspects of
  13. the library. In order not to overburden the reader with implementation
  14. details, the code in ['*projects*] tries to be ['*minimal*]. It has a focus on
  15. the main aspects of the projects and is not intended to be complete
  16. and mature like the library code itself. Cause it's minimal,
  17. project code lives in `namespace mini`.
  18. [/
  19. [section Overview]
  20. [table Overview over Icl Examples
  21. [[level] [example] [classes] [features]]
  22. [[basic] [[link boost_icl.examples.large_bitset Large Bitset]]
  23. [__itv_map__][The most simple application of an interval map:
  24. Counting the overlaps of added intervals.]]
  25. ]
  26. [endsect][/ Overview IN Projects]
  27. ]
  28. [section Large Bitset][/ IN Projects]
  29. Bitsets are just sets. Sets of unsigned integrals,
  30. to be more precise.
  31. The prefix ['*bit*] usually only indicates,
  32. that the representation of those sets is organized in a compressed
  33. form that exploits the fact, that we can switch on an off single
  34. bits in machine words. Bitsets are therefore known to be very small
  35. and thus efficient.
  36. The efficiency of bitsets is usually coupled to the
  37. precondition that the range of values of elements
  38. is relatively small, like [0..32) or [0..64), values that can
  39. be typically represented in single or a small number of
  40. machine words. If we wanted to represent a set containing
  41. two values {1, 1000000}, we would be much better off using
  42. other sets like e.g. an `std::set`.
  43. Bitsets compress well, if elements spread over narrow ranges
  44. only. Interval sets compress well, if many elements are clustered
  45. over intervals. They can span large sets very efficiently then.
  46. In project ['*Large Bitset*] we want to ['*combine the bit compression
  47. and the interval compression*] to achieve a set implementation,
  48. that is capable of spanning large chunks of contiguous elements
  49. using intervals and also to represent more narrow ['nests] of varying
  50. bit sequences using bitset compression.
  51. As we will see, this can be achieved using only a small
  52. amount of code because most of the properties we need
  53. are provided by an __itv_map__ of `bitsets`:
  54. ``
  55. typedef interval_map<IntegralT, SomeBitSet<N>, partial_absorber,
  56. std::less, inplace_bit_add, inplace_bit_and> IntervalBitmap;
  57. ``
  58. Such an `IntervalBitmap` represents `k*N` bits for every segment.
  59. ``
  60. [a, a+k)->'1111....1111' // N bits associated: Represents a total of k*N bits.
  61. ``
  62. For the interval `[a, a+k)` above all bits are set.
  63. But we can also have individual ['nests] or ['clusters]
  64. of bitsequences.
  65. ``
  66. [b, b+1)->'01001011...1'
  67. [b+1,b+2)->'11010001...0'
  68. . . .
  69. ``
  70. and we can span intervals of equal bit sequences that represent
  71. periodic patterns.
  72. ``
  73. [c,d)->'010101....01' // Every second bit is set in range [c,d)
  74. [d,e)->'001100..0011' // Every two bits alterate in range [d,e)
  75. [e,f)->'bit-sequence' // 'bit-sequence' reoccurs every N bits in range [e,f)
  76. ``
  77. An `IntervalBitmap` can represent
  78. `N*(2^M)` elements, if `M` is the number of bits of the
  79. integral type `IntegralT`. Unlike bitsets, that usually
  80. represent ['*unsigned*] integral numbers, large_bitset may
  81. range over negative numbers as well.
  82. There are fields where such large
  83. bitsets implementations are needed. E.g. for the compact
  84. representation of large file allocation tables.
  85. What remains to be done for project *Large Bitset* is
  86. to code a wrapper `class large_bitset` around `IntervalBitmap`
  87. so that `large_bitset` looks and feels like a usual
  88. set class.
  89. [section Using large_bitset]
  90. To quicken your appetite for a look at the implementation
  91. here are a few use cases first. Within the examples that follow,
  92. we will use `nat`[^['k]] for unsigned integrals
  93. and `bits`[^['k]] for bitsets containing [^['k]] bits.
  94. Let's start large.
  95. In the first example . . .
  96. [import ../example/large_bitset_/large_bitset.cpp]
  97. [large_bitset_test_large_set_all]
  98. . . . we are testing the limits. First we set all bits and
  99. then we switch off the very last bit.
  100. [large_bitset_test_large_erase_last]
  101. Program output (/a little beautified/):
  102. ``
  103. ----- Test function test_large() -----------------------------------------------
  104. We have just turned on the awesome amount of 18,446,744,073,709,551,616 bits ;-)
  105. [ 0, 288230376151711744) -> 1111111111111111111111111111111111111111111111111111111111111111
  106. ---- Let's swich off the very last bit -----------------------------------------
  107. [ 0, 288230376151711743) -> 1111111111111111111111111111111111111111111111111111111111111111
  108. [288230376151711743, 288230376151711744) -> 1111111111111111111111111111111111111111111111111111111111111110
  109. ---- Venti is plenty ... let's do something small: A tall ----------------------
  110. ``
  111. More readable is a smaller version of `large_bitset`.
  112. In function `test_small()` we apply a few more operations . . .
  113. [large_bitset_test_small]
  114. . . . producing this output:
  115. ``
  116. ----- Test function test_small() -----------
  117. -- Switch on all bits in range [0,64] ------
  118. [0,8)->11111111
  119. [8,9)->10000000
  120. --------------------------------------------
  121. -- Turn off bits: 25,27,28 -----------------
  122. [0,3)->11111111
  123. [3,4)->10100111
  124. [4,8)->11111111
  125. [8,9)->10000000
  126. --------------------------------------------
  127. -- Flip bits in range [24,30) --------------
  128. [0,3)->11111111
  129. [3,4)->01011011
  130. [4,8)->11111111
  131. [8,9)->10000000
  132. --------------------------------------------
  133. -- Remove the first 10 bits ----------------
  134. [1,2)->00111111
  135. [2,3)->11111111
  136. [3,4)->01011011
  137. [4,8)->11111111
  138. [8,9)->10000000
  139. -- Remove even bits in range [0,72) --------
  140. [1,2)->00010101
  141. [2,3)->01010101
  142. [3,4)->01010001
  143. [4,8)->01010101
  144. -- Set odd bits in range [0,72) --------
  145. [0,9)->01010101
  146. --------------------------------------------
  147. ``
  148. Finally, we present a little /picturesque/ example,
  149. that demonstrates that `large_bitset` can also serve
  150. as a self compressing bitmap, that we can 'paint' with.
  151. [large_bitset_test_picturesque]
  152. Note that we have two `large_bitsets` for the /outline/
  153. and the /interior/. Both parts are compressed but we can
  154. compose both by `operator +`, because the right /positions/
  155. are provided. This is the program output:
  156. ``
  157. ----- Test function test_picturesque() -----
  158. -------- empty face: 3 intervals -----
  159. ********
  160. * *
  161. * *
  162. * *
  163. * *
  164. ******
  165. -------- compressed smile: 2 intervals -----
  166. * *
  167. ****
  168. -------- staring bitset: 6 intervals -----
  169. ********
  170. * *
  171. * * * *
  172. * *
  173. * **** *
  174. ******
  175. --------------------------------------------
  176. ``
  177. So, may be you are curious how this class template
  178. is coded on top of __itv_map__ using only about 250 lines
  179. of code. This is shown in the sections that follow.
  180. [endsect][/ Usage of a large_bitset]
  181. [section The interval_bitmap]
  182. To begin, let's look at the basic data type again, that
  183. will be providing the major functionality:
  184. ``
  185. typedef interval_map<DomainT, BitSetT, partial_absorber,
  186. std::less, inplace_bit_add, inplace_bit_and> IntervalBitmap;
  187. ``
  188. `DomainT` is supposed to be an integral
  189. type, the bitset type `BitSetT` will be a wrapper class around an
  190. unsigned integral type. `BitSetT` has to implement bitwise operators
  191. that will be called by the functors `inplace_bit_add<BitSetT>`
  192. and `inplace_bit_and<BitSetT>`.
  193. The type trait of interval_map is `partial_absorber`, which means
  194. that it is /partial/ and that empty `BitSetTs` are not stored in
  195. the map. This is desired and keeps the `interval_map` minimal, storing
  196. only bitsets, that contain at least one bit switched on.
  197. Functor template `inplace_bit_add` for parameter `Combine` indicates that we
  198. do not expect `operator +=` as addition but the bitwise operator
  199. `|=`. For template parameter `Section` which is instaniated by
  200. `inplace_bit_and` we expect the bitwise `&=` operator.
  201. [endsect][/ section The interval_bitmap]
  202. [section A class implementation for the bitset type]
  203. The code of the project is enclosed in a `namespace mini`.
  204. The name indicates, that the implementation is a /minimal/
  205. example implementation. The name of the bitset class will
  206. be `bits` or `mini::bits` if qualified.
  207. To be used as a codomain parameter of class template __itv_map__,
  208. `mini::bits` has to implement all the functions that are required
  209. for a codomain_type in general, which are the default constructor `bits()`
  210. and an equality `operator==`. Moreover `mini::bits` has to implement operators
  211. required by the instantiations for parameter `Combine` and `Section`
  212. which are `inplace_bit_add` and `inplace_bit_and`. From functors
  213. `inplace_bit_add` and `inplace_bit_and` there are inverse functors
  214. `inplace_bit_subtract` and `inplace_bit_xor`. Those functors
  215. use operators `|= &= ^=` and `~`. Finally if we want to apply
  216. lexicographical and subset comparison on large_bitset, we also
  217. need an `operator <`. All the operators that we need can be implemented
  218. for `mini::bits` on a few lines:
  219. [import ../example/large_bitset_/bits.hpp]
  220. [mini_bits_class_bits]
  221. Finally there is one important piece of meta information, we have
  222. to provide: `mini::bits` has to be recognized as a `Set` by the
  223. icl code. Otherwise we can not exploit the fact that a map of
  224. sets is model of `Set` and the resulting large_bitset would not
  225. behave like a set. So we have to say that `mini::bits` shall be sets:
  226. [mini_bits_is_set]
  227. This is done by adding a partial template specialization to
  228. the type trait template `icl::is_set`.
  229. For the extension of this type trait template and the result
  230. values of inclusion_compare we need these `#includes` for the
  231. implementation of `mini::bits`:
  232. [mini_bits_includes]
  233. [endsect][/ A class implementation for the bitset type]
  234. [section Implementation of a large bitset]
  235. Having finished our `mini::bits` implementation, we can start to
  236. code the wrapper class that hides the efficient interval map of mini::bits
  237. and exposes a simple and convenient set behavior to the world of users.
  238. Let's start with the required `#include`s this time:
  239. [import ../example/large_bitset_/large_bitset.hpp]
  240. [large_bitset_includes]
  241. Besides `boost/icl/interval_map.hpp` and `bits.hpp` the most important
  242. include here is `boost/operators.hpp`. We use this library in order
  243. to further minimize the code and to provide pretty extensive operator
  244. functionality using very little code.
  245. For a short and concise naming of the most important unsigned integer
  246. types and the corresponding `mini::bits` we define this:
  247. [large_bitset_natural_typedefs]
  248. [section large_bitset: Public interface][/ ------------------------------------]
  249. And now let's code `large_bitset`.
  250. [large_bitset_class_template_header]
  251. The first template parameter `DomainT` will be instantiated with
  252. an integral type that defines the kind of numbers that can
  253. be elements of the set. Since we want to go for a large set we
  254. use `nat64` as default which is a 64 bit unsigned integer ranging
  255. from `0` to `2^64-1`. As bitset parameter we also choose a 64-bit
  256. default. Parameters `Combine` and `Interval` are necessary to
  257. be passed to dependent type expressions. An allocator can be
  258. chosen, if desired.
  259. The nested list of private inheritance contains groups of template
  260. instantiations from
  261. [@http://www.boost.org/doc/libs/1_39_0/libs/utility/operators.htm Boost.Operator],
  262. that provides derivable operators from more fundamental once.
  263. Implementing the fundamental operators, we get the derivable ones
  264. /for free/. Below is a short overview of what we get using Boost.Operator,
  265. where __S stands for `large_bitset`, __i for it's `interval_type`
  266. and __e for it's `domain_type` or `element_type`.
  267. [table
  268. [[Group ][fundamental] [derivable ]]
  269. [[Equality, ordering ][`==`] [`!=` ]]
  270. [[ ][`<` ] [`> <= >=` ]]
  271. [[Set operators (__S x __S)][`+= |= -= &= ^=` ][`+ | - & ^` ]]
  272. [[Set operators (__S x __e)][`+= |= -= &= ^=` ][`+ | - & ^` ]]
  273. [[Set operators (__S x __i)][`+= |= -= &= ^=` ][`+ | - & ^` ]]
  274. ]
  275. There is a number of associated types
  276. [large_bitset_associated_types]
  277. most importantly the implementing `interval_bitmap_type` that is used
  278. for the implementing container.
  279. [large_bitmap_impl_map]
  280. In order to use
  281. [@http://www.boost.org/doc/libs/1_39_0/libs/utility/operators.htm Boost.Operator]
  282. we have to implement the fundamental operators as class members. This can be
  283. done quite schematically.
  284. [large_bitset_operators]
  285. As we can see, the seven most important operators that work on the
  286. class type `large_bitset` can be directly implemented by propagating
  287. the operation to the implementing `_map`
  288. of type `interval_bitmap_type`. For the operators that work on segment and
  289. element types, we use member functions `add`, `subtract`, `intersect` and
  290. `flip`. As we will see only a small amount of adaper code is needed
  291. to couple those functions with the functionality of the implementing
  292. container.
  293. Member functions `add`, `subtract`, `intersect` and `flip`,
  294. that allow to combine ['*intervals*] to `large_bitsets` can
  295. be uniformly implemented using a private function
  296. `segment_apply` that applies /addition/, /subtraction/,
  297. /intersection/ or /symmetric difference/, after having
  298. translated the interval's borders into the right bitset
  299. positions.
  300. [large_bitset_fundamental_functions]
  301. In the sample programs, that we will present to demonstrate
  302. the capabilities of `large_bitset` we will need a few additional
  303. functions specifically output functions in two different
  304. flavors.
  305. [large_bitset_demo_functions]
  306. * The first one, `show_segments()` shows the container
  307. content as it is implemented, in the compressed form.
  308. * The second function `show_matrix` shows the complete
  309. matrix of bits that is represented by the container.
  310. [endsect][/ large_bitset: Public interface]
  311. [section large_bitset: Private implementation] [/ -------------------------------------]
  312. In order to implement operations like the addition of
  313. an element say `42` to
  314. the large bitset, we need to translate the /value/ to the
  315. /position/ of the associated *bit* representing `42` in the
  316. interval container of bitsets. As an example, suppose we
  317. use a
  318. ``
  319. large_bitset<nat, mini::bits8> lbs;
  320. ``
  321. that carries small bitsets of 8 bits only.
  322. The first four interval of `lbs` are assumed to
  323. be associated with some bitsets. Now we want to add the interval
  324. `[a,b]==[5,27]`. This will result in the following situation:
  325. ``
  326. [0,1)-> [1,2)-> [2,3)-> [3,4)->
  327. [00101100][11001011][11101001][11100000]
  328. + [111 11111111 11111111 1111] [5,27] as bitset
  329. a b
  330. => [0,1)-> [1,3)-> [3,4)->
  331. [00101111][11111111][11110000]
  332. ``
  333. So we have to convert values 5 and 27 into a part that
  334. points to the interval and a part that refers to the
  335. position within the interval, which is done by a
  336. /division/ and a /modulo/ computation. (In order to have a
  337. consistent representation of the bitsequences across the containers,
  338. within this project, bitsets are denoted with the
  339. ['*least significant bit on the left!*])
  340. ``
  341. A = a/8 = 5/8 = 0 // refers to interval
  342. B = b/8 = 27/8 = 3
  343. R = a%8 = 5%8 = 5 // refers to the position in the associated bitset.
  344. S = b%8 = 27%8 = 3
  345. ``
  346. All /division/ and /modulo operations/ needed here are always done
  347. using a divisor `d` that is a power of `2`: `d = 2^x`. Therefore
  348. division and modulo can be expressed by bitset operations.
  349. The constants needed for those bitset computations are defined here:
  350. [large_bitset_impl_constants]
  351. Looking at the example again, we can see that we have to identify the
  352. positions of the beginning and ending of the interval [5,27] that is
  353. to insert, and then ['*subdivide*] that range of bitsets into three partitions.
  354. # The bitset where the interval starts.
  355. # the bitset where the interval ends
  356. # The bitsets that are completely overlapped by the interval
  357. ``
  358. combine interval [5,27] to large_bitset lbs w.r.t. some operation o
  359. [0,1)-> [1,2)-> [2,3)-> [3,4)->
  360. [00101100][11001011][11101001][11100000]
  361. o [111 11111111 11111111 1111]
  362. a b
  363. subdivide:
  364. [first! ][mid_1] . . .[mid_n][ !last]
  365. [00000111][1...1] . . .[1...1][11110000]
  366. ``
  367. After subdividing, we perform the operation `o` as follows:
  368. # For the first bitset: Set all bits from ther starting bit (`!`)
  369. to the end of the bitset to `1`. All other bits are `0`. Then
  370. perform operation `o`: `_map o= ([0,1)->00000111)`
  371. # For the last bitset: Set all bits from the beginning of the bitset
  372. to the ending bit (`!`) to `1`. All other bits are `0`. Then
  373. perform operation `o`: `_map o= ([3,4)->11110000)`
  374. # For the range of bitsets in between the staring and ending one,
  375. perform operation `o`: `_map o= ([1,3)->11111111)`
  376. The algorithm, that has been outlined and illustrated by the
  377. example, is implemented by the private member function
  378. `segment_apply`. To make the combiner operation a variable
  379. in this algorithm, we use a /pointer to member function type/
  380. [large_bitset_segment_combiner]
  381. as first function argument. We will pass member functions `combine_` here,
  382. ``
  383. combine_(first_of_interval, end_of_interval, some_bitset);
  384. ``
  385. that take the beginning and ending of an interval and a bitset and combine
  386. them to the implementing `interval_bitmap_type _map`. Here are these functions:
  387. [large_bitmap_combiners]
  388. Finally we can code function `segment_apply`, that does the partitioning
  389. and subsequent combining:
  390. [large_bitset_segment_apply]
  391. The functions that help filling bitsets to and from a given bit are
  392. implemented here:
  393. [large_bitset_bitset_filler]
  394. This completes the implementation of class template `large_bitset`.
  395. Using only a small amount of mostly schematic code,
  396. we have been able to provide a pretty powerful, self compressing
  397. and generally usable set type for all integral domain types.
  398. [endsect][/ large_bitset: Private implementation]
  399. [endsect][/ Implementation of a large bitset]
  400. [endsect][/ Large Bitsets IN Projects]
  401. [endsect][/ Projects]