arcsine.qbk 11 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288
  1. [section:arcine_dist Arcsine Distribution]
  2. [import ../../example/arcsine_example.cpp] [/ for arcsine snips below]
  3. ``#include <boost/math/distributions/arcsine.hpp>``
  4. namespace boost{ namespace math{
  5. template <class RealType = double,
  6. class ``__Policy`` = ``__policy_class`` >
  7. class arcsine_distribution;
  8. typedef arcsine_distribution<double> arcsine; // double precision standard arcsine distribution [0,1].
  9. template <class RealType, class ``__Policy``>
  10. class arcsine_distribution
  11. {
  12. public:
  13. typedef RealType value_type;
  14. typedef Policy policy_type;
  15. // Constructor from two range parameters, x_min and x_max:
  16. arcsine_distribution(RealType x_min, RealType x_max);
  17. // Range Parameter accessors:
  18. RealType x_min() const;
  19. RealType x_max() const;
  20. };
  21. }} // namespaces
  22. The class type `arcsine_distribution` represents an
  23. [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine]
  24. [@http://en.wikipedia.org/wiki/Probability_distribution probability distribution function].
  25. The arcsine distribution is named because its CDF uses the inverse sin[super -1] or arcsine.
  26. This is implemented as a generalized version with support from ['x_min] to ['x_max]
  27. providing the 'standard arcsine distribution' as default with ['x_min = 0] and ['x_max = 1].
  28. (A few make other choices for 'standard').
  29. The arcsine distribution is generalized to include any bounded support ['a <= x <= b] by
  30. [@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram] and
  31. [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia],
  32. but also using ['location] and ['scale] parameters by
  33. [@http://www.math.uah.edu/stat/index.html Virtual Laboratories in Probability and Statistics]
  34. [@http://www.math.uah.edu/stat/special/Arcsine.html Arcsine distribution].
  35. The end-point version is simpler and more obvious, so we implement that.
  36. If desired, [@http://en.wikipedia.org/wiki/arcsine_distribution this]
  37. outlines how the __beta_distrib can be used to add a shape factor.
  38. The [@http://en.wikipedia.org/wiki/Probability_density_function probability density function PDF]
  39. for the [@http://en.wikipedia.org/wiki/arcsine_distribution arcsine distribution]
  40. defined on the interval \[['x_min, x_max]\] is given by:
  41. [expression f(x; x_min, x_max) = 1 /([pi][sdot][sqrt]((x - x_min)[sdot](x_max - x_min))]
  42. For example, __WolframAlpha arcsine distribution, from input of
  43. N[PDF[arcsinedistribution[0, 1], 0.5], 50]
  44. computes the PDF value
  45. 0.63661977236758134307553505349005744813783858296183
  46. The Probability Density Functions (PDF) of generalized arcsine distributions are symmetric U-shaped curves,
  47. centered on ['(x_max - x_min)/2],
  48. highest (infinite) near the two extrema, and quite flat over the central region.
  49. If random variate ['x] is ['x_min] or ['x_max], then the PDF is infinity.
  50. If random variate ['x] is ['x_min] then the CDF is zero.
  51. If random variate ['x] is ['x_max] then the CDF is unity.
  52. The 'Standard' (0, 1) arcsine distribution is shown in blue
  53. and some generalized examples with other ['x] ranges.
  54. [graph arcsine_pdf]
  55. The Cumulative Distribution Function CDF is defined as
  56. [expression F(x) = 2[sdot]arcsin([sqrt]((x-x_min)/(x_max - x))) / [pi]]
  57. [graph arcsine_cdf]
  58. [h5 Constructor]
  59. arcsine_distribution(RealType x_min, RealType x_max);
  60. constructs an arcsine distribution with range parameters ['x_min] and ['x_max].
  61. Requires ['x_min < x_max], otherwise __domain_error is called.
  62. For example:
  63. arcsine_distribution<> myarcsine(-2, 4);
  64. constructs an arcsine distribution with ['x_min = -2] and ['x_max = 4].
  65. Default values of ['x_min = 0] and ['x_max = 1] and a ` typedef arcsine_distribution<double> arcsine;` mean that
  66. arcsine as;
  67. constructs a 'Standard 01' arcsine distribution.
  68. [h5 Parameter Accessors]
  69. RealType x_min() const;
  70. RealType x_max() const;
  71. Return the parameter ['x_min] or ['x_max] from which this distribution was constructed.
  72. So, for example:
  73. [arcsine_snip_8]
  74. [h4 Non-member Accessor Functions]
  75. All the [link math_toolkit.dist_ref.nmp usual non-member accessor functions]
  76. that are generic to all distributions are supported: __usual_accessors.
  77. The formulae for calculating these are shown in the table below, and at
  78. [@http://mathworld.wolfram.com/arcsineDistribution.html Wolfram Mathworld].
  79. [note There are always [*two] values for the [*mode], at ['x_min] and at ['x_max], default 0 and 1,
  80. so instead we raise the exception __domain_error.
  81. At these extrema, the PDFs are infinite, and the CDFs zero or unity.]
  82. [h4 Applications]
  83. The arcsine distribution is useful to describe
  84. [@http://en.wikipedia.org/wiki/Random_walk Random walks], (including drunken walks)
  85. [@http://en.wikipedia.org/wiki/Brownian_motion Brownian motion],
  86. [@http://en.wikipedia.org/wiki/Wiener_process Weiner processes],
  87. [@http://en.wikipedia.org/wiki/Bernoulli_trial Bernoulli trials],
  88. and their appplication to solve stock market and other
  89. [@http://en.wikipedia.org/wiki/Gambler%27s_ruin ruinous gambling games].
  90. The random variate ['x] is constrained to ['x_min] and ['x_max], (for our 'standard' distribution, 0 and 1),
  91. and is usually some fraction. For any other ['x_min] and ['x_max] a fraction can be obtained from ['x] using
  92. [expression fraction = (x - x_min) / (x_max - x_min)]
  93. The simplest example is tossing heads and tails with a fair coin and modelling the risk of losing, or winning.
  94. Walkers (molecules, drunks...) moving left or right of a centre line are another common example.
  95. The random variate ['x] is the fraction of time spent on the 'winning' side.
  96. If half the time is spent on the 'winning' side (and so the other half on the 'losing' side) then ['x = 1/2].
  97. For large numbers of tosses, this is modelled by the (standard \[0,1\]) arcsine distribution,
  98. and the PDF can be calculated thus:
  99. [arcsine_snip_2]
  100. From the plot of PDF, it is clear that ['x] = [frac12] is the [*minimum] of the curve,
  101. so this is the [*least likely] scenario.
  102. (This is highly counter-intuitive, considering that fair tosses must [*eventually] become equal.
  103. It turns out that ['eventually] is not just very long, but [*infinite]!).
  104. The [*most likely] scenarios are towards the extrema where ['x] = 0 or ['x] = 1.
  105. If fraction of time on the left is a [frac14],
  106. it is only slightly more likely because the curve is quite flat bottomed.
  107. [arcsine_snip_3]
  108. If we consider fair coin-tossing games being played for 100 days
  109. (hypothetically continuously to be 'at-limit')
  110. the person winning after day 5 will not change in fraction 0.144 of the cases.
  111. We can easily compute this setting ['x] = 5./100 = 0.05
  112. [arcsine_snip_4]
  113. Similarly, we can compute from a fraction of 0.05 /2 = 0.025
  114. (halved because we are considering both winners and losers)
  115. corresponding to 1 - 0.025 or 97.5% of the gamblers, (walkers, particles...) on the [*same side] of the origin
  116. [arcsine_snip_5]
  117. (use of the complement gives a bit more clarity,
  118. and avoids potential loss of accuracy when ['x] is close to unity, see __why_complements).
  119. [arcsine_snip_6]
  120. or we can reverse the calculation by assuming a fraction of time on one side, say fraction 0.2,
  121. [arcsine_snip_7]
  122. [*Summary]: Every time we toss, the odds are equal,
  123. so on average we have the same change of winning and losing.
  124. But this is [*not true] for an an individual game where one will be [*mostly in a bad or good patch].
  125. This is quite counter-intuitive to most people, but the mathematics is clear,
  126. and gamblers continue to provide proof.
  127. [*Moral]: if you in a losing patch, leave the game.
  128. (Because the odds to recover to a good patch are poor).
  129. [*Corollary]: Quit while you are ahead?
  130. A working example is at [@../../example/arcsine_example.cpp arcsine_example.cpp]
  131. including sample output .
  132. [h4 Related distributions]
  133. The arcsine distribution with ['x_min = 0] and ['x_max = 1] is special case of the
  134. __beta_distrib with [alpha] = 1/2 and [beta] = 1/2.
  135. [h4 Accuracy]
  136. This distribution is implemented using sqrt, sine, cos and arc sine and cos trigonometric functions
  137. which are normally accurate to a few __epsilon.
  138. But all values suffer from [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance or cancellation error]
  139. for values of ['x] close to ['x_max].
  140. For example, for a standard [0, 1] arcsine distribution ['as], the pdf is symmetric about random variate ['x = 0.5]
  141. so that one would expect `pdf(as, 0.01) == pdf(as, 0.99)`. But as ['x] nears unity, there is increasing
  142. [@http://en.wikipedia.org/wiki/Loss_of_significance loss of significance].
  143. To counteract this, the complement versions of CDF and quantile
  144. are implemented with alternative expressions using ['cos[super -1]] instead of ['sin[super -1]].
  145. Users should see __why_complements for guidance on when to avoid loss of accuracy by using complements.
  146. [h4 Testing]
  147. The results were tested against a few accurate spot values computed by __WolframAlpha, for example:
  148. N[PDF[arcsinedistribution[0, 1], 0.5], 50]
  149. 0.63661977236758134307553505349005744813783858296183
  150. [h4 Implementation]
  151. In the following table ['a] and ['b] are the parameters ['x_min] and ['x_max],
  152. ['x] is the random variable, ['p] is the probability and its complement ['q = 1-p].
  153. [table
  154. [[Function][Implementation Notes]]
  155. [[support] [x [isin] \[a, b\], default x [isin] \[0, 1\] ]]
  156. [[pdf] [f(x; a, b) = 1/([pi][sdot][sqrt](x - a)[sdot](b - x))]]
  157. [[cdf] [F(x) = 2/[pi][sdot]sin[super-1]([sqrt](x - a) / (b - a) ) ]]
  158. [[cdf of complement] [2/([pi][sdot]cos[super-1]([sqrt](x - a) / (b - a)))]]
  159. [[quantile] [-a[sdot]sin[super 2]([frac12][pi][sdot]p) + a + b[sdot]sin[super 2]([frac12][pi][sdot]p)]]
  160. [[quantile from the complement] [-a[sdot]cos[super 2]([frac12][pi][sdot]p) + a + b[sdot]cos[super 2]([frac12][pi][sdot]q)]]
  161. [[mean] [[frac12](a+b)]]
  162. [[median] [[frac12](a+b)]]
  163. [[mode] [ x [isin] \[a, b\], so raises domain_error (returning NaN).]]
  164. [[variance] [(b - a)[super 2] / 8]]
  165. [[skewness] [0]]
  166. [[kurtosis excess] [ -3/2 ]]
  167. [[kurtosis] [kurtosis_excess + 3]]
  168. ]
  169. The quantile was calculated using an expression obtained by using __WolframAlpha
  170. to invert the formula for the CDF thus
  171. solve [p - 2/pi sin^-1(sqrt((x-a)/(b-a))) = 0, x]
  172. which was interpreted as
  173. Solve[p - (2 ArcSin[Sqrt[(-a + x)/(-a + b)]])/Pi == 0, x, MaxExtraConditions -> Automatic]
  174. and produced the resulting expression
  175. [expression x = -a sin^2((pi p)/2)+a+b sin^2((pi p)/2)]
  176. Thanks to Wolfram for providing this facility.
  177. [h4 References]
  178. * [@http://en.wikipedia.org/wiki/arcsine_distribution Wikipedia arcsine distribution]
  179. * [@http://en.wikipedia.org/wiki/Beta_distribution Wikipedia Beta distribution]
  180. * [@http://mathworld.wolfram.com/BetaDistribution.html Wolfram MathWorld]
  181. * [@http://www.wolframalpha.com/ Wolfram Alpha]
  182. [h4 Sources]
  183. *[@http://estebanmoro.org/2009/04/the-probability-of-going-through-a-bad-patch The probability of going through a bad patch] Esteban Moro's Blog.
  184. *[@http://www.gotohaggstrom.com/What%20do%20schmucks%20and%20the%20arc%20sine%20law%20have%20in%20common.pdf What soschumcks and the arc sine have in common] Peter Haggstrom.
  185. *[@http://www.math.uah.edu/stat/special/Arcsine.html arcsine distribution].
  186. *[@http://reference.wolfram.com/language/ref/ArcSinDistribution.html Wolfram reference arcsine examples].
  187. *[@http://www.math.harvard.edu/library/sternberg/slides/1180908.pdf Shlomo Sternberg slides].
  188. [endsect] [/section:arcsine_dist arcsine]
  189. [/ arcsine.qbk
  190. Copyright 2014 John Maddock and Paul A. Bristow.
  191. Distributed under the Boost Software License, Version 1.0.
  192. (See accompanying file LICENSE_1_0.txt or copy at
  193. http://www.boost.org/LICENSE_1_0.txt).
  194. ]