
[section:st_eg Student's t Distribution Examples]

[section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]

Let's say you have a sample mean; you may wish to know what confidence intervals
you can place on that mean.  Colloquially: "I want an interval that I can be
P% sure contains the true mean".  (On a technical point, note that
any given interval either contains the true mean or it does not: the
meaning of the confidence level is subtly different from this colloquialism.
More background information can be found on the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm NIST site]).

The formula for the interval can be expressed as:

[equation dist_tutorial4]

where ['Y[sub s]] is the sample mean, /s/ is the sample standard deviation,
/N/ is the sample size, /[alpha]/ is the desired significance level, and
['t[sub ([alpha]/2,N-1)]] is the upper critical value of the Students-t
distribution with /N-1/ degrees of freedom.
[note
The quantity [alpha] is the maximum acceptable risk of falsely rejecting
the null-hypothesis.  The smaller the value of [alpha], the greater the
strength of the test.

The confidence level of the test is defined as 1 - [alpha], and is often expressed
as a percentage.  So, for example, a significance level of 0.05 is equivalent
to a 95% confidence level.  Refer to
[@http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm
"What are confidence intervals?"] in __handbook for more information.
] [/ Note]

[note
The usual assumptions of
[@http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables independent and identically distributed (i.i.d.)]
variables and a [@http://en.wikipedia.org/wiki/Normal_distribution normal distribution]
of course apply here, as they do in other examples.
]
From the formula, it should be clear that:

* The width of the confidence interval decreases as the sample size increases.
* The width increases as the standard deviation increases.
* The width increases as the ['confidence level increases] (from 0.5 towards 0.99999 - stronger).
* The width increases as the ['significance level decreases] (from 0.5 towards 0.00000...01 - stronger).

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to calculate intervals for
various confidence levels; the procedure will print these out
as a table:
    // Needed includes:
    #include <boost/math/distributions/students_t.hpp>
    #include <iostream>
    #include <iomanip>
    // Bring everything into global namespace for ease of use:
    using namespace boost::math;
    using namespace std;

    void confidence_limits_on_mean(
       double Sm,    // Sm = Sample Mean.
       double Sd,    // Sd = Sample Standard Deviation.
       unsigned Sn)  // Sn = Sample Size.
    {
       using namespace std;
       using namespace boost::math;

       // Print out general info:
       cout <<
          "__________________________________\n"
          "2-Sided Confidence Limits For Mean\n"
          "__________________________________\n\n";
       cout << setprecision(7);
       cout << setw(40) << left << "Number of Observations" << "= " << Sn << "\n";
       cout << setw(40) << left << "Mean" << "= " << Sm << "\n";
       cout << setw(40) << left << "Standard Deviation" << "= " << Sd << "\n";
We'll define a table of significance/risk levels for which we'll compute intervals:

    double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Note that these are the complements of the confidence/probability levels: 0.5, 0.75, 0.9 ... 0.99999.
Next we'll declare the distribution object we'll need; note that
the /degrees of freedom/ parameter is the sample size less one:

    students_t dist(Sn - 1);

Most of what follows in the program is pretty-printing, so let's focus
on the calculation of the interval.  First we need the t-statistic,
computed using the /quantile/ function and our significance level.  Note
that since the significance levels are the complement of the probability,
we have to wrap the arguments in a call to /complement(...)/:

    double T = quantile(complement(dist, alpha[i] / 2));

Note that alpha was divided by two, since we'll be calculating
both the upper and lower bounds: had we been interested in a single-sided
interval then we would have omitted this step.

Now to complete the picture, we'll get the (one-sided) width of the
interval from the t-statistic
by multiplying by the standard deviation, and dividing by the square
root of the sample size:

    double w = T * Sd / sqrt(double(Sn));

The two-sided interval is then the sample mean plus and minus this width.
And apart from some more pretty-printing, that completes the procedure.
Let's take a look at some sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site.  The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis.

The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.
[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations        =  195
   Mean                          =  9.26146
   Standard Deviation            =  0.02278881

   ___________________________________________________________________
   Confidence        T           Interval          Lower          Upper
    Value (%)      Value          Width            Limit          Limit
   ___________________________________________________________________
       50.000       0.676       1.103e-003        9.26036        9.26256
       75.000       1.154       1.883e-003        9.25958        9.26334
       90.000       1.653       2.697e-003        9.25876        9.26416
       95.000       1.972       3.219e-003        9.25824        9.26468
       99.000       2.601       4.245e-003        9.25721        9.26571
       99.900       3.341       5.453e-003        9.25601        9.26691
       99.990       3.973       6.484e-003        9.25498        9.26794
       99.999       4.537       7.404e-003        9.25406        9.26886
''']
As you can see, the large sample size (195) and small standard deviation (0.023)
have combined to give very small intervals; indeed, we can be
very confident that the true mean lies close to 9.26.

For comparison, the next example data output is taken from
['P.K. Hou, O.W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64]
and from ['Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55,
J.C. Miller and J.N. Miller, Ellis Horwood, ISBN 0 13 0309907].

The values result from the determination of mercury by cold-vapour
atomic absorption.
[pre'''
   __________________________________
   2-Sided Confidence Limits For Mean
   __________________________________

   Number of Observations        =  3
   Mean                          =  37.8000000
   Standard Deviation            =  0.9643650

   ___________________________________________________________________
   Confidence        T           Interval          Lower          Upper
    Value (%)      Value          Width            Limit          Limit
   ___________________________________________________________________
       50.000       0.816          0.455         37.34539       38.25461
       75.000       1.604          0.893         36.90717       38.69283
       90.000       2.920          1.626         36.17422       39.42578
       95.000       4.303          2.396         35.40438       40.19562
       99.000       9.925          5.526         32.27408       43.32592
       99.900      31.599         17.594         20.20639       55.39361
       99.990      99.992         55.673        -17.87346       93.47346
       99.999     316.225        176.067       -138.26683      213.86683
''']
This time the fact that there are only three measurements leads to
much wider intervals; indeed, the intervals are so large that it's hard
to be very confident in the location of the mean.

[endsect] [/section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]
[section:tut_mean_test Testing a sample mean for difference from a "true" mean]

When calibrating or comparing a scientific instrument or measurement method of some kind,
we want to answer the question "Does an observed sample mean differ from the
"true" mean in any significant way?".  If it does, then we have evidence of
a systematic difference.  This question can be answered with a Students-t test:
more information can be found
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
on the NIST site].

Of course, the assignment of "true" to one mean may be quite arbitrary;
often it is simply a "traditional" method of measurement.

The following example code is taken from the example program
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].

We'll begin by defining a procedure to determine which of the
possible hypotheses are rejected or not-rejected
at a given significance level:

[note
Non-statisticians might say 'not-rejected' means 'accepted'
(often of the null-hypothesis), implying, wrongly, that there really *IS* no difference,
but statisticians eschew this usage to avoid implying that there is positive evidence of 'no difference'.
'Not-rejected' here means there is *no evidence* of difference, but there still might well be a difference.
For example, see [@http://en.wikipedia.org/wiki/Argument_from_ignorance argument from ignorance] and
[@http://www.bmj.com/cgi/content/full/311/7003/485 Absence of evidence does not constitute evidence of absence.]
] [/ note]
    // Needed includes:
    #include <boost/math/distributions/students_t.hpp>
    #include <iostream>
    #include <iomanip>
    // Bring everything into global namespace for ease of use:
    using namespace boost::math;
    using namespace std;

    void single_sample_t_test(double M, double Sm, double Sd, unsigned Sn, double alpha)
    {
       //
       // M = true mean.
       // Sm = Sample Mean.
       // Sd = Sample Standard Deviation.
       // Sn = Sample Size.
       // alpha = Significance Level.
Most of the procedure is pretty-printing, so let's just focus on the
calculation; we begin by calculating the t-statistic:

    // Difference in means:
    double diff = Sm - M;
    // Degrees of freedom:
    unsigned v = Sn - 1;
    // t-statistic:
    double t_stat = diff * sqrt(double(Sn)) / Sd;

Finally calculate the probability from the t-statistic.  If we're interested
in simply whether there is a difference (either less or greater) or not,
we don't care about the sign of the t-statistic,
and we take the complement of the probability for comparison
to the significance level:

    students_t dist(v);
    double q = cdf(complement(dist, fabs(t_stat)));
The procedure then prints out the results of the various tests
that can be done; these
can be summarised in the following table:

[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:
`cdf(complement(dist, fabs(t))) < alpha / 2`]]
[[The Alternative-hypothesis: there
*is a difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:
`cdf(complement(dist, fabs(t))) > alpha / 2`]]
[[The Alternative-hypothesis: the sample mean *is less* than
the true mean.]
[Reject if CDF of t > 1 - significance level:
`cdf(complement(dist, t)) < alpha`]]
[[The Alternative-hypothesis: the sample mean *is greater* than
the true mean.]
[Reject if CDF of t < significance level:
`cdf(dist, t) < alpha`]]
]

[note
Notice that the comparisons are against `alpha / 2` for a two-sided test
and against `alpha` for a one-sided test.]
Now that we have all the parts in place, let's take a look at some
sample output, first using the
[@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
Heat flow data] from the NIST site.  The data set was collected
by Bob Zarr of NIST in January, 1990 from a heat flow meter
calibration and stability analysis.  The corresponding dataplot
output for this test can be found in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
section 3.5.2] of the __handbook.

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 =  195
Sample Mean                                            =  9.26146
Sample Standard Deviation                              =  0.02279
Expected True Mean                                     =  5.00000

Sample Mean - Expected Test Mean                       =  4.26146
Degrees of Freedom                                     =  194
T Statistic                                            =  2611.28380
Probability that difference is due to chance           =  0.000e+000

Results for Alternative Hypothesis and alpha           =  0.0500

Alternative Hypothesis     Conclusion
Mean != 5.000              NOT REJECTED
Mean  < 5.000              REJECTED
Mean  > 5.000              NOT REJECTED
]
You will note the line that says the probability that the difference is
due to chance is zero.  From a philosophical point of view, of course,
the probability can never reach zero.  However, in this case the calculated
probability is smaller than the smallest representable double precision number,
hence the appearance of a zero here.  Whatever its "true" value is, we know it
must be extraordinarily small, so the alternative hypothesis - that there is
a difference in means - is not rejected.

For comparison, the next example data output is taken from
['P.K. Hou, O.W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64]
and from ['Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55,
J.C. Miller and J.N. Miller, Ellis Horwood, ISBN 0 13 0309907].

The values result from the determination of mercury by cold-vapour
atomic absorption.
[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 =  3
Sample Mean                                            =  37.80000
Sample Standard Deviation                              =  0.96437
Expected True Mean                                     =  38.90000

Sample Mean - Expected Test Mean                       =  -1.10000
Degrees of Freedom                                     =  2
T Statistic                                            =  -1.97566
Probability that difference is due to chance           =  1.869e-001

Results for Alternative Hypothesis and alpha           =  0.0500

Alternative Hypothesis     Conclusion
Mean != 38.900             REJECTED
Mean  < 38.900             NOT REJECTED
Mean  > 38.900             NOT REJECTED
]
As you can see, the small number of measurements (3) has led to a large uncertainty
in the location of the true mean.  So even though there appears to be a difference
between the sample mean and the expected true mean, we conclude that there
is no significant difference, and are unable to reject the null hypothesis.

However, if we were to lower the bar for acceptance down to alpha = 0.1
(a 90% confidence level) we would see a different output:

[pre
__________________________________
Student t test for a single sample
__________________________________

Number of Observations                                 =  3
Sample Mean                                            =  37.80000
Sample Standard Deviation                              =  0.96437
Expected True Mean                                     =  38.90000

Sample Mean - Expected Test Mean                       =  -1.10000
Degrees of Freedom                                     =  2
T Statistic                                            =  -1.97566
Probability that difference is due to chance           =  1.869e-001

Results for Alternative Hypothesis and alpha           =  0.1000

Alternative Hypothesis     Conclusion
Mean != 38.900             REJECTED
Mean  < 38.900             NOT REJECTED
Mean  > 38.900             REJECTED
]

In this case, we really have a borderline result,
and more data (and/or more accurate data)
is needed for a more convincing conclusion.
[endsect] [/section:tut_mean_test Testing a sample mean for difference from a "true" mean]

[section:tut_mean_size Estimating how large a sample size would have to become
in order to give a significant Students-t test result with a single sample test]

Imagine you have conducted a Students-t test on a single sample in order
to check for systematic errors in your measurements.  Imagine that the
result is borderline.  At this point one might go off and collect more data,
but it might be prudent to first ask the question "How much more?".
The parameter estimators of the students_t_distribution class
can provide this information.

This section is based on the example code in
[@../../example/students_t_single_sample.cpp students_t_single_sample.cpp]
and we begin by defining a procedure that will print out a table of
estimated sample sizes for various confidence levels:
    // Needed includes:
    #include <boost/math/distributions/students_t.hpp>
    #include <iostream>
    #include <iomanip>
    // Bring everything into global namespace for ease of use:
    using namespace boost::math;
    using namespace std;

    void single_sample_find_df(
       double M,   // M = true mean.
       double Sm,  // Sm = Sample Mean.
       double Sd)  // Sd = Sample Standard Deviation.
    {

Next we define a table of significance levels:

    double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };

Printing out the table of sample sizes required for various confidence levels
begins with the table header:

    cout << "\n\n"
            "_______________________________________________________________\n"
            "Confidence       Estimated          Estimated\n"
            " Value (%)      Sample Size        Sample Size\n"
            "              (one sided test)   (two sided test)\n"
            "_______________________________________________________________\n";
And now the important part: the sample sizes required.  Class
`students_t_distribution` has a static member function
`find_degrees_of_freedom` that will calculate how large
a sample size needs to be in order to give a definitive result.

The first argument is the difference between the means that you
wish to be able to detect - here it's the absolute value of the
difference between the sample mean and the true mean.

Then come two probability values: alpha and beta.  Alpha is the
maximum acceptable risk of rejecting the null-hypothesis when it is
in fact true.  Beta is the maximum acceptable risk of failing to reject
the null-hypothesis when in fact it is false.
Also note that for a two-sided test, alpha must be divided by 2.

The final parameter of the function is the standard deviation of the sample.

In this example, we assume that alpha and beta are the same, and call
`find_degrees_of_freedom` twice: once with alpha for a one-sided test,
and once with alpha/2 for a two-sided test.
    for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
    {
       // Confidence value:
       cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
       // calculate df for single sided test:
       double df = students_t::find_degrees_of_freedom(
          fabs(M - Sm), alpha[i], alpha[i], Sd);
       // convert to sample size:
       double size = ceil(df) + 1;
       // Print size:
       cout << fixed << setprecision(0) << setw(16) << right << size;
       // calculate df for two sided test:
       df = students_t::find_degrees_of_freedom(
          fabs(M - Sm), alpha[i]/2, alpha[i], Sd);
       // convert to sample size:
       size = ceil(df) + 1;
       // Print size:
       cout << fixed << setprecision(0) << setw(16) << right << size << endl;
    }
    cout << endl;
    }
Let's now look at some sample output using data taken from
['P.K. Hou, O.W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64]
and from ['Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55,
J.C. Miller and J.N. Miller, Ellis Horwood, ISBN 0 13 0309907].

The values result from the determination of mercury by cold-vapour
atomic absorption.

Only three measurements were made, and the Students-t test above
gave a borderline result, so this example
will show us how many samples would need to be collected:
[pre'''
_____________________________________________________________
Estimated sample sizes required for various confidence levels
_____________________________________________________________

True Mean                               =  38.90000
Sample Mean                             =  37.80000
Sample Standard Deviation               =  0.96437

_______________________________________________________________
Confidence       Estimated          Estimated
 Value (%)      Sample Size        Sample Size
              (one sided test)   (two sided test)
_______________________________________________________________
    75.000               3                  4
    90.000               7                  9
    95.000              11                 13
    99.000              20                 22
    99.900              35                 37
    99.990              50                 53
    99.999              66                 68
''']
So in this case, many more measurements would have had to be made:
for example, at the 95% level, 13 measurements in total for a two-sided test.

[endsect] [/section:tut_mean_size Estimating how large a sample size would have to become in order to give a significant Students-t test result with a single sample test]
[section:two_sample_students_t Comparing the means of two samples with the Students-t test]

Imagine that we have two samples, and we wish to determine whether
their means are different or not.  This situation often arises when
determining whether a new process or treatment is better than an old one.

In this example, we'll be using the
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3531.htm
Car Mileage sample data] from the
[@http://www.itl.nist.gov NIST website].  The data compares
miles per gallon of US cars with miles per gallon of Japanese cars.

The sample code is in
[@../../example/students_t_two_samples.cpp students_t_two_samples.cpp].

There are two ways in which this test can be conducted: we can assume
that the true standard deviations of the two samples are equal, or not.
If the standard deviations are assumed to be equal, then the calculation
of the t-statistic is greatly simplified, so we'll examine that case first.
In real life we should verify whether this assumption is valid with an
F test for equal variances.

We begin by defining a procedure that will conduct our test assuming equal
variances:
    // Needed headers:
    #include <boost/math/distributions/students_t.hpp>
    #include <iostream>
    #include <iomanip>
    // Simplify usage:
    using namespace boost::math;
    using namespace std;

    void two_samples_t_test_equal_sd(
       double Sm1,    // Sm1 = Sample 1 Mean.
       double Sd1,    // Sd1 = Sample 1 Standard Deviation.
       unsigned Sn1,  // Sn1 = Sample 1 Size.
       double Sm2,    // Sm2 = Sample 2 Mean.
       double Sd2,    // Sd2 = Sample 2 Standard Deviation.
       unsigned Sn2,  // Sn2 = Sample 2 Size.
       double alpha)  // alpha = Significance Level.
    {
Our procedure will begin by calculating the t-statistic; assuming
equal variances, the needed formulae are:

[equation dist_tutorial1]

where Sp is the "pooled" standard deviation of the two samples,
and /v/ is the number of degrees of freedom of the two combined
samples.  We can now write the code to calculate the t-statistic:

    // Degrees of freedom:
    double v = Sn1 + Sn2 - 2;
    cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n";
    // Pooled variance:
    double sp = sqrt(((Sn1-1) * Sd1 * Sd1 + (Sn2-1) * Sd2 * Sd2) / v);
    cout << setw(55) << left << "Pooled Standard Deviation" << "= " << sp << "\n";
    // t-statistic:
    double t_stat = (Sm1 - Sm2) / (sp * sqrt(1.0 / Sn1 + 1.0 / Sn2));
    cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n";
The next step is to define our distribution object, and calculate the
complement of the probability:

    students_t dist(v);
    double q = cdf(complement(dist, fabs(t_stat)));
    cout << setw(55) << left << "Probability that difference is due to chance" << "= "
       << setprecision(3) << scientific << 2 * q << "\n\n";

Here we've used the absolute value of the t-statistic, because we initially
want to know simply whether there is a difference or not (a two-sided test).
However, we can also test whether the mean of the second sample is greater
or is less (a one-sided test) than that of the first:
all the possible tests are summed up in the following table:
[table
[[Hypothesis][Test]]
[[The Null-hypothesis: there is
*no difference* in means]
[Reject if complement of CDF for |t| < significance level / 2:
`cdf(complement(dist, fabs(t))) < alpha / 2`]]
[[The Alternative-hypothesis: there is a
*difference* in means]
[Reject if complement of CDF for |t| > significance level / 2:
`cdf(complement(dist, fabs(t))) > alpha / 2`]]
[[The Alternative-hypothesis: Sample 1 Mean is *less* than
Sample 2 Mean.]
[Reject if CDF of t > significance level:
`cdf(dist, t) > alpha`]]
[[The Alternative-hypothesis: Sample 1 Mean is *greater* than
Sample 2 Mean.]
[Reject if complement of CDF of t > significance level:
`cdf(complement(dist, t)) > alpha`]]
]

[note
For a two-sided test we must compare against alpha / 2 and not alpha.]
Most of the rest of the sample program is pretty-printing, so we'll
skip over that, and take a look at the sample output for alpha=0.05
(a 95% probability level).  For comparison, the dataplot output
for the same data is in
[@http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
section 1.3.5.3] of the __handbook.

[pre'''
________________________________________________
Student t test for two samples (equal variances)
________________________________________________

Number of Observations (Sample 1)                      =  249
Sample 1 Mean                                          =  20.145
Sample 1 Standard Deviation                            =  6.4147
Number of Observations (Sample 2)                      =  79
Sample 2 Mean                                          =  30.481
Sample 2 Standard Deviation                            =  6.1077
Degrees of Freedom                                     =  326
Pooled Standard Deviation                              =  6.3426
T Statistic                                            =  -12.621
Probability that difference is due to chance           =  5.273e-030

Results for Alternative Hypothesis and alpha           =  0.0500'''

Alternative Hypothesis              Conclusion
Sample 1 Mean != Sample 2 Mean      NOT REJECTED
Sample 1 Mean <  Sample 2 Mean      NOT REJECTED
Sample 1 Mean >  Sample 2 Mean      REJECTED
]
So with a probability that the difference is due to chance of just
5.273e-030, we can safely conclude that there is indeed a difference.

The tests on the alternative hypothesis show that we must
also reject the hypothesis that Sample 1 Mean is
greater than that for Sample 2: in this case Sample 1 represents the
miles per gallon for US cars, and Sample 2 the miles per gallon for
Japanese cars, so we conclude that Japanese cars are on average more
fuel efficient.
Now that we have the simple case out of the way, let's look for a moment
at the more complex one: that the standard deviations of the two samples
are not equal.  In this case the formula for the t-statistic becomes:

[equation dist_tutorial2]

And for the combined degrees of freedom we use the
[@http://en.wikipedia.org/wiki/Welch-Satterthwaite_equation Welch-Satterthwaite]
approximation:

[equation dist_tutorial3]

Note that this is one of the rare situations where the degrees-of-freedom
parameter to the Student's t distribution is a real number, and not an
integer value.

[note
Some statistical packages truncate the effective degrees of freedom to
an integer value: this may be necessary if you are relying on lookup tables,
but since our code fully supports non-integer degrees of freedom there is no
need to truncate in this case.  Also note that when the degrees of freedom
is small then the Welch-Satterthwaite approximation may be a significant
source of error.]
Putting these formulae into code we get:

    // Degrees of freedom:
    double v = Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2;
    v *= v;
    double t1 = Sd1 * Sd1 / Sn1;
    t1 *= t1;
    t1 /= (Sn1 - 1);
    double t2 = Sd2 * Sd2 / Sn2;
    t2 *= t2;
    t2 /= (Sn2 - 1);
    v /= (t1 + t2);
    cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n";
    // t-statistic:
    double t_stat = (Sm1 - Sm2) / sqrt(Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2);
    cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n";
Thereafter the code and the tests are performed the same as before.  Using
our car mileage data again, here's what the output looks like:
[pre'''
__________________________________________________
Student t test for two samples (unequal variances)
__________________________________________________

Number of Observations (Sample 1)                      =  249
Sample 1 Mean                                          =  20.145
Sample 1 Standard Deviation                            =  6.4147
Number of Observations (Sample 2)                      =  79
Sample 2 Mean                                          =  30.481
Sample 2 Standard Deviation                            =  6.1077
Degrees of Freedom                                     =  136.87
T Statistic                                            =  -12.946
Probability that difference is due to chance           =  1.571e-025

Results for Alternative Hypothesis and alpha           =  0.0500'''

Alternative Hypothesis              Conclusion
Sample 1 Mean != Sample 2 Mean      NOT REJECTED
Sample 1 Mean <  Sample 2 Mean      NOT REJECTED
Sample 1 Mean >  Sample 2 Mean      REJECTED
]
This time, allowing the variances in the two samples to differ has yielded
a higher likelihood that the observed difference is down to chance alone
(1.571e-025 compared to 5.273e-030 when equal variances were assumed).
However, the conclusion remains the same: US cars are less fuel efficient
than Japanese models.

[endsect] [/section:two_sample_students_t Comparing the means of two samples with the Students-t test]
[section:paired_st Comparing two paired samples with the Student's t distribution]

Imagine that we have a before and after reading for each item in the sample:
for example, we might have measured blood pressure before and after administration
of a new drug.  We can't pool the results and compare the means before and after
the change, because each patient will have a different baseline reading.

Instead we calculate the difference between the before and after measurements
for each patient, and calculate the mean and standard deviation of the differences.
To test whether a significant change has taken place, we can then test
the null-hypothesis that the true mean is zero using the same procedure
we used in the single sample cases previously discussed.

That means we can:

* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_intervals Calculate confidence intervals of the mean].
If the endpoints of the interval differ in sign then we are unable to reject
the null-hypothesis that there is no change.
* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_test Test whether the true mean is zero].  If the
result is consistent with a true mean of zero, then we are unable to reject the
null-hypothesis that there is no change.
* [link math_toolkit.stat_tut.weg.st_eg.tut_mean_size Calculate how many pairs of readings we would need
in order to obtain a significant result].
[endsect] [/section:paired_st Comparing two paired samples with the Student's t distribution]

[endsect] [/section:st_eg Student's t]
[/
  Copyright 2006, 2012 John Maddock and Paul A. Bristow.
  Distributed under the Boost Software License, Version 1.0.
  (See accompanying file LICENSE_1_0.txt or copy at
  http://www.boost.org/LICENSE_1_0.txt).
]