binomial_example.qbk 13 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332
  1. [section:binom_eg Binomial Distribution Examples]
  2. See also the reference documentation for the __binomial_distrib.
  3. [section:binomial_coinflip_example Binomial Coin-Flipping Example]
  4. [import ../../example/binomial_coinflip_example.cpp]
  5. [binomial_coinflip_example1]
  6. See [@../../example/binomial_coinflip_example.cpp binomial_coinflip_example.cpp]
  7. for full source code, the program output looks like this:
  8. [binomial_coinflip_example_output]
  9. [endsect] [/section:binomial_coinflip_example Binomial coinflip example]
  10. [section:binomial_quiz_example Binomial Quiz Example]
  11. [import ../../example/binomial_quiz_example.cpp]
  12. [binomial_quiz_example1]
  13. [binomial_quiz_example2]
  14. [discrete_quantile_real]
  15. See [@../../example/binomial_quiz_example.cpp binomial_quiz_example.cpp]
  16. for full source code and output.
  17. [endsect] [/section:binomial_coinflip_quiz Binomial Coin-Flipping example]
  18. [section:binom_conf Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution]
  19. Imagine you have a process that follows a binomial distribution: for each
  20. trial conducted, an event either occurs or does it does not, referred
  21. to as "successes" and "failures". If, by experiment, you want to measure the
  22. frequency with which successes occur, the best estimate is given simply
  23. by /k/ \/ /N/, for /k/ successes out of /N/ trials. However our confidence in that
  24. estimate will be shaped by how many trials were conducted, and how many successes
  25. were observed. The static member functions
  26. `binomial_distribution<>::find_lower_bound_on_p` and
  27. `binomial_distribution<>::find_upper_bound_on_p` allow you to calculate
  28. the confidence intervals for your estimate of the occurrence frequency.
  29. The sample program [@../../example/binomial_confidence_limits.cpp
  30. binomial_confidence_limits.cpp] illustrates their use. It begins by defining
  31. a procedure that will print a table of confidence limits for various degrees
  32. of certainty:
  33. #include <iostream>
  34. #include <iomanip>
  35. #include <boost/math/distributions/binomial.hpp>
  36. void confidence_limits_on_frequency(unsigned trials, unsigned successes)
  37. {
  38. //
  39. // trials = Total number of trials.
  40. // successes = Total number of observed successes.
  41. //
  42. // Calculate confidence limits for an observed
  43. // frequency of occurrence that follows a binomial
  44. // distribution.
  45. //
  46. using namespace std;
  47. using namespace boost::math;
  48. // Print out general info:
  49. cout <<
  50. "___________________________________________\n"
  51. "2-Sided Confidence Limits For Success Ratio\n"
  52. "___________________________________________\n\n";
  53. cout << setprecision(7);
  54. cout << setw(40) << left << "Number of Observations" << "= " << trials << "\n";
  55. cout << setw(40) << left << "Number of successes" << "= " << successes << "\n";
  56. cout << setw(40) << left << "Sample frequency of occurrence" << "= " << double(successes) / trials << "\n";
  57. The procedure now defines a table of significance levels: these are the
  58. probabilities that the true occurrence frequency lies outside the calculated
  59. interval:
  60. double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
  61. Some pretty printing of the table header follows:
  62. cout << "\n\n"
  63. "_______________________________________________________________________\n"
  64. "Confidence Lower CP Upper CP Lower JP Upper JP\n"
  65. " Value (%) Limit Limit Limit Limit\n"
  66. "_______________________________________________________________________\n";
  67. And now for the important part - the intervals themselves - for each
  68. value of /alpha/, we call `find_lower_bound_on_p` and
  69. `find_lower_upper_on_p` to obtain lower and upper bounds
  70. respectively. Note that since we are calculating a two-sided interval,
  71. we must divide the value of alpha in two.
  72. Please note that calculating two separate /single sided bounds/, each with risk
  73. level [alpha] is not the same thing as calculating a two sided interval.
  74. Had we calculate two single-sided intervals each with a risk
  75. that the true value is outside the interval of [alpha], then:
  76. * The risk that it is less than the lower bound is [alpha].
  77. and
  78. * The risk that it is greater than the upper bound is also [alpha].
  79. So the risk it is outside *upper or lower bound*, is *twice* alpha, and the
  80. probability that it is inside the bounds is therefore not nearly as high as
  81. one might have thought. This is why [alpha]/2 must be used in
  82. the calculations below.
  83. In contrast, had we been calculating a
  84. single-sided interval, for example: ['"Calculate a lower bound so that we are P%
  85. sure that the true occurrence frequency is greater than some value"]
  86. then we would *not* have divided by two.
  87. Finally note that `binomial_distribution` provides a choice of two
  88. methods for the calculation, we print out the results from both
  89. methods in this example:
  90. for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
  91. {
  92. // Confidence value:
  93. cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
  94. // Calculate Clopper Pearson bounds:
  95. double l = binomial_distribution<>::find_lower_bound_on_p(
  96. trials, successes, alpha[i]/2);
  97. double u = binomial_distribution<>::find_upper_bound_on_p(
  98. trials, successes, alpha[i]/2);
  99. // Print Clopper Pearson Limits:
  100. cout << fixed << setprecision(5) << setw(15) << right << l;
  101. cout << fixed << setprecision(5) << setw(15) << right << u;
  102. // Calculate Jeffreys Prior Bounds:
  103. l = binomial_distribution<>::find_lower_bound_on_p(
  104. trials, successes, alpha[i]/2,
  105. binomial_distribution<>::jeffreys_prior_interval);
  106. u = binomial_distribution<>::find_upper_bound_on_p(
  107. trials, successes, alpha[i]/2,
  108. binomial_distribution<>::jeffreys_prior_interval);
  109. // Print Jeffreys Prior Limits:
  110. cout << fixed << setprecision(5) << setw(15) << right << l;
  111. cout << fixed << setprecision(5) << setw(15) << right << u << std::endl;
  112. }
  113. cout << endl;
  114. }
  115. And that's all there is to it. Let's see some sample output for a 2 in 10
  116. success ratio, first for 20 trials:
  117. [pre'''___________________________________________
  118. 2-Sided Confidence Limits For Success Ratio
  119. ___________________________________________
  120. Number of Observations = 20
  121. Number of successes = 4
  122. Sample frequency of occurrence = 0.2
  123. _______________________________________________________________________
  124. Confidence Lower CP Upper CP Lower JP Upper JP
  125. Value (%) Limit Limit Limit Limit
  126. _______________________________________________________________________
  127. 50.000 0.12840 0.29588 0.14974 0.26916
  128. 75.000 0.09775 0.34633 0.11653 0.31861
  129. 90.000 0.07135 0.40103 0.08734 0.37274
  130. 95.000 0.05733 0.43661 0.07152 0.40823
  131. 99.000 0.03576 0.50661 0.04655 0.47859
  132. 99.900 0.01905 0.58632 0.02634 0.55960
  133. 99.990 0.01042 0.64997 0.01530 0.62495
  134. 99.999 0.00577 0.70216 0.00901 0.67897
  135. ''']
  136. As you can see, even at the 95% confidence level the bounds are
  137. really quite wide (this example is chosen to be easily compared to the one
  138. in the __handbook
  139. [@http://www.itl.nist.gov/div898/handbook/prc/section2/prc241.htm
  140. here]). Note also that the Clopper-Pearson calculation method (CP above)
  141. produces quite noticeably more pessimistic estimates than the Jeffreys Prior
  142. method (JP above).
  143. Compare that with the program output for
  144. 2000 trials:
  145. [pre'''___________________________________________
  146. 2-Sided Confidence Limits For Success Ratio
  147. ___________________________________________
  148. Number of Observations = 2000
  149. Number of successes = 400
  150. Sample frequency of occurrence = 0.2000000
  151. _______________________________________________________________________
  152. Confidence Lower CP Upper CP Lower JP Upper JP
  153. Value (%) Limit Limit Limit Limit
  154. _______________________________________________________________________
  155. 50.000 0.19382 0.20638 0.19406 0.20613
  156. 75.000 0.18965 0.21072 0.18990 0.21047
  157. 90.000 0.18537 0.21528 0.18561 0.21503
  158. 95.000 0.18267 0.21821 0.18291 0.21796
  159. 99.000 0.17745 0.22400 0.17769 0.22374
  160. 99.900 0.17150 0.23079 0.17173 0.23053
  161. 99.990 0.16658 0.23657 0.16681 0.23631
  162. 99.999 0.16233 0.24169 0.16256 0.24143
  163. ''']
  164. Now even when the confidence level is very high, the limits are really
  165. quite close to the experimentally calculated value of 0.2. Furthermore
  166. the difference between the two calculation methods is now really quite small.
  167. [endsect] [/section:binom_conf Calculating Confidence Limits on the Frequency of Occurrence for a Binomial Distribution]
  168. [section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]
  169. Imagine you have a critical component that you know will fail in 1 in
  170. N "uses" (for some suitable definition of "use"). You may want to schedule
  171. routine replacement of the component so that its chance of failure between
  172. routine replacements is less than P%. If the failures follow a binomial
  173. distribution (each time the component is "used" it either fails or does not)
  174. then the static member function `binomial_distibution<>::find_maximum_number_of_trials`
  175. can be used to estimate the maximum number of "uses" of that component for some
  176. acceptable risk level /alpha/.
  177. The example program
  178. [@../../example/binomial_sample_sizes.cpp binomial_sample_sizes.cpp]
  179. demonstrates its usage. It centres on a routine that prints out
  180. a table of maximum sample sizes for various probability thresholds:
  181. void find_max_sample_size(
  182. double p, // success ratio.
  183. unsigned successes) // Total number of observed successes permitted.
  184. {
  185. The routine then declares a table of probability thresholds: these are the
  186. maximum acceptable probability that /successes/ or fewer events will be
  187. observed. In our example, /successes/ will be always zero, since we want
  188. no component failures, but in other situations non-zero values may well
  189. make sense.
  190. double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
  191. Much of the rest of the program is pretty-printing, the important part
  192. is in the calculation of maximum number of permitted trials for each
  193. value of alpha:
  194. for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
  195. {
  196. // Confidence value:
  197. cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
  198. // calculate trials:
  199. double t = binomial::find_maximum_number_of_trials(
  200. successes, p, alpha[i]);
  201. t = floor(t);
  202. // Print Trials:
  203. cout << fixed << setprecision(5) << setw(15) << right << t << endl;
  204. }
  205. Note that since we're
  206. calculating the maximum number of trials permitted, we'll err on the safe
  207. side and take the floor of the result. Had we been calculating the
  208. /minimum/ number of trials required to observe a certain number of /successes/
  209. using `find_minimum_number_of_trials` we would have taken the ceiling instead.
  210. We'll finish off by looking at some sample output, firstly for
  211. a 1 in 1000 chance of component failure with each use:
  212. [pre
  213. '''________________________
  214. Maximum Number of Trials
  215. ________________________
  216. Success ratio = 0.001
  217. Maximum Number of "successes" permitted = 0
  218. ____________________________
  219. Confidence Max Number
  220. Value (%) Of Trials
  221. ____________________________
  222. 50.000 692
  223. 75.000 287
  224. 90.000 105
  225. 95.000 51
  226. 99.000 10
  227. 99.900 0
  228. 99.990 0
  229. 99.999 0'''
  230. ]
  231. So 51 "uses" of the component would yield a 95% chance that no
  232. component failures would be observed.
  233. Compare that with a 1 in 1 million chance of component failure:
  234. [pre'''
  235. ________________________
  236. Maximum Number of Trials
  237. ________________________
  238. Success ratio = 0.0000010
  239. Maximum Number of "successes" permitted = 0
  240. ____________________________
  241. Confidence Max Number
  242. Value (%) Of Trials
  243. ____________________________
  244. 50.000 693146
  245. 75.000 287681
  246. 90.000 105360
  247. 95.000 51293
  248. 99.000 10050
  249. 99.900 1000
  250. 99.990 100
  251. 99.999 10'''
  252. ]
  253. In this case, even 1000 uses of the component would still yield a
  254. less than 1 in 1000 chance of observing a component failure
  255. (i.e. a 99.9% chance of no failure).
  256. [endsect] [/section:binom_size_eg Estimating Sample Sizes for a Binomial Distribution.]
  257. [endsect] [/section:binom_eg Binomial Distribution]
  258. [/
  259. Copyright 2006 John Maddock and Paul A. Bristow.
  260. Distributed under the Boost Software License, Version 1.0.
  261. (See accompanying file LICENSE_1_0.txt or copy at
  262. http://www.boost.org/LICENSE_1_0.txt).
  263. ]