performance.html 19 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472
  1. <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0.1 Transitional//EN">
  2. <html>
  3. <head>
  4. <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
  5. <title>Boost.Flyweight Documentation - Performance</title>
  6. <link rel="stylesheet" href="style.css" type="text/css">
  7. <link rel="start" href="index.html">
  8. <link rel="prev" href="reference/tracking.html">
  9. <link rel="up" href="index.html">
  10. <link rel="next" href="examples.html">
  11. </head>
  12. <body>
  13. <h1><img src="../../../boost.png" alt="Boost logo" align=
  14. "middle" width="277" height="86">Boost.Flyweight Performance</h1>
  15. <div class="prev_link"><a href="reference/tracking.html"><img src="prev.gif" alt="tracking policies" border="0"><br>
  16. Tracking policies
  17. </a></div>
  18. <div class="up_link"><a href="index.html"><img src="up.gif" alt="index" border="0"><br>
  19. Index
  20. </a></div>
  21. <div class="next_link"><a href="examples.html"><img src="next.gif" alt="examples" border="0"><br>
  22. Examples
  23. </a></div><br clear="all" style="clear: all;">
  24. <br clear="all" style="clear: all;">
  25. <hr>
  26. <h2>Contents</h2>
  27. <ul>
  28. <li><a href="#intro">Introduction</a></li>
  29. <li><a href="#memory">Memory consumption</a>
  30. <ul>
  31. <li><a href="#flyweight_size">Flyweight size</a></li>
  32. <li><a href="#entry_size">Entry size</a></li>
  33. <li><a href="#overall_memory">Overall memory consumption</a></li>
  34. </ul>
  35. </li>
  36. <li><a href="#time">Time efficiency</a>
  37. <ul>
  38. <li><a href="#initialization">Initialization</a></li>
  39. <li><a href="#assignment">Assignment</a></li>
  40. <li><a href="#equality_comparison">Equality comparison</a></li>
  41. <li><a href="#value_access">Value access</a></li>
  42. </ul>
  43. </li>
  44. <li><a href="#results">Experimental results</a>
  45. <ul>
  46. <li><a href="#msvc_80">Microsoft Visual C++ 8.0</a>
  47. <ul>
  48. <li><a href="#msvc_80_memory">Memory</a></li>
  49. <li><a href="#msvc_80_time">Execution time</a></li>
  50. </ul>
  51. </li>
  52. <li><a href="#gcc_344">GCC 3.4.4</a>
  53. <ul>
  54. <li><a href="#gcc_344_memory">Memory</a></li>
  55. <li><a href="#gcc_344_time">Execution time</a></li>
  56. </ul>
  57. </li>
  58. </ul>
  59. </li>
  60. <li><a href="#conclusions">Conclusions</a></li>
  61. </ul>
  62. <h2><a name="intro">Introduction</a></h2>
  63. <p>
  64. We show how to estimate the memory reduction obtained by the usage of
  65. Boost.Flyweight in a particular scenario and study the impact on the execution
  66. time for the different functional areas of <code>flyweight</code>.
  67. Some experimental results are provided.
  68. </p>
  69. <h2><a name="memory">Memory consumption</a></h2>
  70. <p>
  71. As we saw in the <a href="tutorial/index.html#rationale">tutorial rationale</a>,
  72. the flyweight pattern is based on two types of objects:
  73. <ul>
  74. <li>The flyweight objects proper, which have very small size, typically
  75. that of a pointer.
  76. </li>
  77. <li>The shared values, which are stored as internal <i>entries</i> into the
  78. flyweight factory.
  79. </li>
  80. </ul>
  81. The overall memory consumption is then a function of the size of the
  82. flyweight objects, the size of the entry objects and the degree of
  83. value redundancy.
  84. </p>
  85. <h3><a name="flyweight_size">Flyweight size</a></h3>
  86. <p>
  87. The only data member of a <code>flyweight</code> object is a so-called
  88. <i>handle</i>, an opaque object of small size provided by the internal
  89. flyweight factory to refer to the entries it stores. For the default
  90. <a href="tutorial/configuration.html#hashed_factory"><code>hashed_factory</code></a>,
  91. this handle is merely a pointer, so <code>sizeof(flyweight&lt;T&gt;)=sizeof(void*)</code>,
  92. 4 bytes in typical 32-bit architectures.
  93. For other types of factories, the handle is an iterator to an internal
  94. container used in the implementation of the factory: again, its size
  95. is typically that of a pointer.
  96. </p>
  97. <h3><a name="entry_size">Entry size</a></h3>
  98. <p>
  99. The entries stored in the factory associated to <code>flyweight&lt;T,...&gt;</code>
  100. need not only hold a value of <code>T</code>, but also contain additional
  101. information related to the internal implementation of
  102. <code>flyweight&lt;T,...&gt;</code>:
  103. </p>
  104. <blockquote>
  105. <i>entry</i> = <code>sizeof(T)</code> + <i>overhead</i>.
  106. </blockquote>
  107. <p>
  108. For the current implementation of Boost.Flyweight, the following aspects
  109. contribute to <i>overhead</i>:
  110. <ul>
  111. <li>Usage of <a href="tutorial/key_value.html">key-value flyweights</a>.</li>
  112. <li>Internal overhead of the <a href="tutorial/configuration.html#factories">factory</a>
  113. container.
  114. </li>
  115. <li>Bookkeeping information associated to the
  116. <a href="tutorial/configuration.html#tracking">tracking mechanism</a>.
  117. </li>
  118. </ul>
  119. The table summarizes the separate contributions to <i>overhead</i> introduced
  120. by the different components taking part of the definition of
  121. a <code>flyweight</code> instantiation. Values are given in <i>words</i>,
  122. i.e. the size of a pointer, which is 4 bytes in a typical 32-bit architecture.
  123. Alignment may introduce additional overhead.
  124. </p>
  125. <p align="center">
  126. <table cellspacing="0">
  127. <caption><b>Entry overhead of the components of Boost.Flyweight.</b></caption>
  128. <tr>
  129. <th align="center"colspan="2">component</th>
  130. <th align="center">overhead (words)</th>
  131. </tr>
  132. <tr>
  133. <td align="center" rowspan="2">&nbsp;&nbsp;<a href="tutorial/key_value.html#key_value"><code>key_value</code></a>&nbsp;&nbsp;</td>
  134. <td align="center">&nbsp;&nbsp;with <a href="tutorial/key_value.html#key_extractor">key extractor</a>&nbsp;&nbsp;</td>
  135. <td align="center">&nbsp;&nbsp;1<sup>(1)</sup>&nbsp;&nbsp;</td>
  136. </tr>
  137. <tr>
  138. <td align="center">&nbsp;&nbsp;without <a href="tutorial/key_value.html#key_extractor">key extractor</a>&nbsp;&nbsp;</td>
  139. <td align="center">&nbsp;&nbsp;1 + <code>sizeof(Key)&nbsp;&nbsp;</td>
  140. </tr>
  141. <tr class="odd_tr">
  142. <td align="center" rowspan="3">&nbsp;&nbsp;factory&nbsp;&nbsp;</td>
  143. <td align="center">&nbsp;&nbsp;<a href="tutorial/configuration.html#hashed_factory"><code>hashed_factory</code></a>&nbsp;&nbsp;</td>
  144. <td align="center">&nbsp;&nbsp;~2.5&nbsp;&nbsp;</td>
  145. </tr>
  146. <tr class="odd_tr">
  147. <td align="center">&nbsp;&nbsp;<a href="tutorial/configuration.html#set_factory"><code>set_factory</code></a>&nbsp;&nbsp;</td>
  148. <td align="center">&nbsp;&nbsp;4<sup>(2)</sup>&nbsp;&nbsp;</td>
  149. </tr>
  150. <tr class="odd_tr">
  151. <td align="center">&nbsp;&nbsp;<a href="tutorial/configuration.html#assoc_container_factory"><code>assoc_container_factory</code></a>&nbsp;&nbsp;</td>
  152. <td align="center">&nbsp;&nbsp;depends on the container used&nbsp;&nbsp;</td>
  153. </tr>
  154. <tr>
  155. <td align="center" rowspan="2">&nbsp;&nbsp;tracking mechanism&nbsp;&nbsp;</td>
  156. <td align="center">&nbsp;&nbsp;<a href="tutorial/configuration.html#refcounted"><code>refcounted</code></a>&nbsp;&nbsp;</td>
  157. <td align="center">&nbsp;&nbsp;2<sup>(3)</sup>&nbsp;&nbsp;</td>
  158. </tr>
  159. <tr>
  160. <td align="center">&nbsp;&nbsp;<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>&nbsp;&nbsp;</td>
  161. <td align="center">&nbsp;&nbsp;0&nbsp;&nbsp;</td>
  162. </tr>
  163. </table>
  164. <sup>(1)</sup> <small>Assuming that <code>sizeof(Key)&lt;=sizeof(Value)</code>.</small><br>
  165. <sup>(2)</sup> <small>For some implementations of <code>std::set</code> this overhead reduces to 3.</small><br>
  166. <sup>(3)</sup> <small>In some platforms this value can be 3.</small>
  167. </p>
  168. <p>
  169. For instance, for the default configuration parameters of <code>flyweight</code>,
  170. <i>overhead</i> is typically 2.5(<code>hashed_factory</code>) + 2(<code>refcounted</code>)
  171. = 4.5 words.
  172. </p>
  173. <h3><a name="overall_memory">Overall memory consumption</a></h3>
  174. <p>
  175. Consider a scenario where there are <i>N</i> different objects of type <code>T</code>
  176. jointly taking <i>M</i> different values. The objects consume then
  177. <i>S</i> = <i>N</i>&middot;<i>T</i> bytes, where <i>T</i> is defined as the
  178. average size of <code>T</code> (<code>sizeof(T)</code> plus dynamic
  179. memory allocated by <code>T</code> objects).
  180. If we now replace <code>T</code> by some instantiation
  181. <code>flyweight&lt;T,...&gt;</code>, the resulting memory consumption
  182. will be
  183. </p>
  184. <blockquote>
  185. <i>S<sub>F</sub></i> =
  186. <i>N</i>&middot;<i>P</i> + <i>M</i>&middot;(<i>T</i> + <i>overhead</i>),
  187. </blockquote>
  188. <p>
  189. where <i>P</i> is <code>sizeof(flyweight&lt;T,...&gt;)</code>, typically
  190. equal to <code>sizeof(void*)</code>, as seen <a href="#flyweight_size">before</a>.
  191. The ratio <i>S<sub>F</sub></i> / <i>S</i> is then
  192. </p>
  193. <blockquote>
  194. <i>S<sub>F</sub></i> / <i>S</i> =
  195. (<i>P</i> / <i>T</i>)+ (<i>M</i> / <i>N</i>)(1 + <i>overhead</i> / <i>T</i>).
  196. </blockquote>
  197. <p>
  198. <i>S<sub>F</sub></i> / <i>S</i> tends to its minimum, <i>P</i> / <i>T</i>,
  199. as <i>M</i> / <i>N</i> tends to 0, i.e. when the degree of value redundancy
  200. among <code>T</code> objects grows higher. On the other hand, the worst possible case
  201. <i>S<sub>F</sub></i> / <i>S</i> = 1 + (<i>P</i> + <i>overhead</i>) / <i>T</i>
  202. happens when <i>M</i> / <i>N</i> = 1, that is, if there is no value redundancy at all; in this situation there is
  203. no point in applying the flyweight pattern in the first place.
  204. </p>
  205. <p align="center">
  206. <img src="memory.png" alt="relative memory consumption of Boost.Flyweight as a function of value diversity"
  207. width="446" height="411"><br>
  208. <b>Fig. 1: Relative memory consumption of Boost.Flyweight as a function of value diversity.</b>
  209. </p>
  210. <h2>
  211. <a name="time">Time efficiency</a>
  212. </h2>
  213. <p>
  214. The introduction of the flyweight pattern involves an extra level of indirection
  215. that, in general, results in some execution overhead when accessing the values. On
  216. the other hand, manipulation of flyweight objects is considerably faster than
  217. moving around the heavy values they stand for. We analyze qualitatively the
  218. execution overheads or improvements associated to the different usage contexts
  219. of Boost.Flyweight.
  220. </p>
  221. <h3><a name="initialization">Initialization</a></h3>
  222. <p>
  223. As compared with the initialization an object of type <code>T</code>, constructing
  224. a <code>flyweight&lt;T&gt;</code> performs important extra work like looking
  225. up the value in the flyweight factory and inserting it if it is not present.
  226. So, construction of flyweights (other than copy construction, which is
  227. cheap), is expected to be noticeably slower than the construction of the
  228. underlying type <code>T</code>. Much of the time spent at constructing
  229. the associated <code>T</code> value proper can be saved, however, by
  230. using <a href="tutorial/key_value.html">key-value flyweights</a>.
  231. </p>
  232. <h3><a name="assignment">Assignment</a></h3>
  233. <p>
  234. Assignment of flyweight objects is extremely fast, as it only involves
  235. assigning an internal handle type used to refer to the shared value. Moreover,
  236. assignment of <code>flyweight</code> objects never throws. Assignment time
  237. is influenced by the type of <a href="tutorial/configuration.html#tracking">tracking
  238. policy</a> used; in this regard,
  239. <a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>
  240. is the fastest option.
  241. </p>
  242. <h3><a name="equality_comparison">Equality comparison</a></h3>
  243. <p>
  244. Comparing two <code>flyweight</code> objects for equality reduces to
  245. checking that the <i>addresses</i> of the values they are associated to
  246. are equal; in general, this operation is much faster than comparing the
  247. underlying values. This aspect is of particular relevance when the flyweight
  248. objects stand for complex values like those arising in the application of
  249. the <a href="examples.html#example3"><i>composite pattern</i></a>.
  250. </p>
  251. <h3><a name="value_access">Value access</a></h3>
  252. <p>
  253. The conversion from <code>flyweight&lt;T&gt;</code> to <code>const T&amp;</code>
  254. relies on a level of indirection relating the flyweight objects to the
  255. values they are associated to; so, value access is expected to be slower
  256. when using Boost.Flyweight as compared to using the associated values
  257. directly. This overhead, however, can be masked by an indirect improvement
  258. resulting from locality and cache effects: as the set of different <code>T</code>
  259. values handled by an instantiation of <code>flyweight&lt;T&gt;</code> is
  260. generally much smaller than the equivalent family of <code>T</code> objects
  261. when Boost.Flyweight is not used, active values can fit better
  262. into the processor cache.
  263. </p>
  264. <h2><a name="results">Experimental results</a></h2>
  265. <p>
  266. A <a href="examples.html#example7">profiling program</a> was devised to test
  267. the space and time efficiency of different instantiations of <code>flyweight</code>
  268. against a base situation not using Boost.Flyweight. The profiled scenarios are:
  269. <ol>
  270. <li><code>std::string</code>.</li>
  271. <li><code>flyweight&lt;std::string&gt;</code> with default configuration aspects
  272. (<a href="tutorial/configuration.html#hashed_factory"><code>hashed_factory</code></a>,
  273. <a href="tutorial/configuration.html#refcounted"><code>refcounted</code></a> tracking,
  274. <a href="tutorial/configuration.html#simple_locking"><code>simple_locking</code></a>).
  275. </li>
  276. <li><code>flyweight&lt;std::string,<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>&gt;</code>.</li>
  277. <li><code>flyweight&lt;std::string,<a href="tutorial/configuration.html#set_factory"><code>set_factory</code></a>&gt;</code>.</li>
  278. <li><code>flyweight&lt;std::string,<a href="tutorial/configuration.html#set_factory"><code>set_factory</code></a>,<a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>&gt;</code>.</li>
  279. </ol>
  280. </p>
  281. <p>
  282. Actually the types tested are not exactly those listed above, but instrumented
  283. versions that keep track of the allocated memory for profiling purposes.
  284. The program parses a text file into an array of words and then perform various
  285. manipulations involving the different context usages of Boost.Flyweight discussed
  286. <a href="#time">previously</a>. As our text file we have used the
  287. <a href="http://www.gutenberg.org/dirs/etext99/2donq10.txt">plain text</a>
  288. version of Project Gutenberg edition of <a href="http://www.gutenberg.org/etext/2000"><i>Don
  289. Quijote</i></a> (2.04 MB).
  290. </p>
  291. <h3><a name="msvc_80">Microsoft Visual C++ 8.0</a></h3>
  292. <p>
  293. The program was built with default release settings and <code>_SECURE_SCL=0</code>.
  294. Tests were run under Windows XP in a machine equipped with an Intel Core 2 Duo T5500
  295. processor and 1 GB of RAM.
  296. </p>
  297. <h4><a name="msvc_80_memory">Memory</a></h4>
  298. <p align="center">
  299. <img src="memory_msvc_80.png" alt="memory consumption (MB), MSVC++ 8.0"
  300. width="798" height="323"><br>
  301. <b>Fig. 2: Memory consumption, MSVC++ 8.0. Values in MB.</b>
  302. </p>
  303. <p>
  304. The results show the memory consumption figures for the different profiled
  305. scenarios.
  306. The standard library implementation of MSVC++ 8.0 features the so-called
  307. small buffer optimization for strings, by which <code>std::string</code>
  308. objects hold a small buffer that can be used when the string is short,
  309. thus avoding dynamic allocations. This results in <code>sizeof(std::string)</code>
  310. being quite high, 28 bytes. In our particular test strings are almost always
  311. held in the small buffer, so the minimum <a href="#overall_memory"><i>S<sub>F</sub></i> / <i>S</i></a>
  312. achievable is 4/28 = 14.3%, which is quite close to the experimental
  313. results, given that the memory devoted to storage of shared values
  314. is residual (around 3% of total memory) due to the high word redundancy
  315. of the text source.
  316. </p>
  317. <h4><a name="msvc_80_time">Execution time</a></h4>
  318. <p align="center">
  319. <img src="time_msvc_80.png" alt="execution time (s), MSVC++ 8.0"
  320. width="820" height="325"><br>
  321. <b>Fig. 3: Execution time, MSVC++ 8.0. Values in seconds.</b>
  322. </p>
  323. <p>
  324. The figure displays execution times for the profiled scenarios in different
  325. usage contexts. In accordance with our previous
  326. <a href="#time">qualitative analysis</a>, initialization of <code>flyweight</code>s
  327. carries an important overhead with respect to the base case scenario (between 20% and 40%
  328. of additional execution time), while the other usage contexts
  329. (assignment, equality comparison and value access) have performance gains,
  330. with speedup factors of more than 10 in some cases. The use of a
  331. <a href="tutorial/configuration.html#refcounted"><code>refcounted</code></a>
  332. tracking policy introduces penalties with respect to
  333. <a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>
  334. in initialization and assignment, but has no effect in equality comparison
  335. and value access.
  336. </p>
  337. <h3><a name="gcc_344">GNU GCC 3.4.4</a></h3>
  338. <p>
  339. The Cygwin/MinGW version of the compiler was used, with command options
  340. <code>-ftemplate-depth-128 -O3 -finline-functions -DNDEBUG</code>.
  341. Tests were run under a Cygwin terminal in the same machine as <a href="#msvc_80">before</a>.
  342. </p>
  343. <h4><a name="gcc_344_memory">Memory</a></h4>
  344. <p align="center">
  345. <img src="memory_gcc_344.png" alt="memory consumption (MB), GCC 3.4.4"
  346. width="798" height="323"><br>
  347. <b>Fig. 4: Memory consumption, GCC 3.4.4. Values in MB.</b>
  348. </p>
  349. <p>
  350. The standard library used by GCC 3.4.4 implements <code>std::string</code>
  351. using <a href="http://en.wikipedia.org/wiki/Copy-on-write">copy-on-write</a>
  352. optimization techniques, which leads to very small value redundancy for
  353. some usage patterns. This explains why the memory reduction achieved
  354. by Boost.Flyweight is so poor in this case. Other contexts where assignment
  355. is much less used than direct construction will favor Boost.Flyweight
  356. over plain copy-on-write <code>std::string</code>s.
  357. </p>
  358. <h4><a name="gcc_344_time">Execution time</a></h4>
  359. <p align="center">
  360. <img src="time_gcc_344.png" alt="execution time (s), GCC 3.4.4"
  361. width="820" height="325"><br>
  362. <b>Fig. 5: Execution time, GCC 3.4.4. Values in seconds.</b>
  363. </p>
  364. <p>
  365. Relative performance figures are similar to those obtained for
  366. <a href="#msvc_80_time">MSVC++ 8.0</a>, although some of the
  367. speedups achieved by Boost.Flyweight are higher here (&times;25
  368. in equality comparison and up to &times;100 in assignment when
  369. <a href="tutorial/configuration.html#no_tracking"><code>no_tracking</code></a>
  370. is in effect).
  371. </p>
  372. <h2><a name="conclusions">Conclusions</a></h2>
  373. <p>
  374. The introduction of Boost.Flyweight in application scenarios with very
  375. high value redundancy yields important reductions in memory consumption:
  376. this is especially relevant when data volume approaches the limits of
  377. physical memory in the machine, since Boost.Flyweight can avoid virtual
  378. memory thrashing thus making the application viable. We have shown
  379. how to estimate the achievable reduction in memory consumption from
  380. some basic value statistics and knowledge of the <code>flyweight</code>
  381. configuration aspects being used.
  382. </p>
  383. <p>
  384. Boost.Flyweight can also accelerate execution times in areas other than
  385. object initialization, due to the fastest manipulation of small
  386. flyweight objects and to locality and cache effects arising from the
  387. drastic reduction of the set of allocated values.
  388. </p>
  389. <hr>
  390. <div class="prev_link"><a href="reference/tracking.html"><img src="prev.gif" alt="tracking policies" border="0"><br>
  391. Tracking policies
  392. </a></div>
  393. <div class="up_link"><a href="index.html"><img src="up.gif" alt="index" border="0"><br>
  394. Index
  395. </a></div>
  396. <div class="next_link"><a href="examples.html"><img src="next.gif" alt="examples" border="0"><br>
  397. Examples
  398. </a></div><br clear="all" style="clear: all;">
  399. <br clear="all" style="clear: all;">
  400. <br>
  401. <p>Revised September 1st 2014</p>
  402. <p>&copy; Copyright 2006-2014 Joaqu&iacute;n M L&oacute;pez Mu&ntilde;oz.
  403. Distributed under the Boost Software
  404. License, Version 1.0. (See accompanying file <a href="../../../LICENSE_1_0.txt">
  405. LICENSE_1_0.txt</a> or copy at <a href="http://www.boost.org/LICENSE_1_0.txt">
  406. http://www.boost.org/LICENSE_1_0.txt</a>)
  407. </p>
  408. </body>
  409. </html>