123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785 |
- [section:st_eg Student's t Distribution Examples]
- [section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]
- Let's say you have a sample mean, you may wish to know what confidence intervals
- you can place on that mean. Colloquially: "I want an interval that I can be
- P% sure contains the true mean". (On a technical point, note that
- the interval either contains the true mean or it does not: the
- meaning of the confidence level is subtly
- different from this colloquialism. More background information can be found on the
- [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm NIST site]).
- The formula for the interval can be expressed as:
- [equation dist_tutorial4]
- Where, ['Y[sub s]] is the sample mean, /s/ is the sample standard deviation,
- /N/ is the sample size, /[alpha]/ is the desired significance level and
- ['t[sub ([alpha]/2,N-1)]] is the upper critical value of the Students-t
- distribution with /N-1/ degrees of freedom.
- [note
- The quantity [alpha] is the maximum acceptable risk of falsely rejecting
- the null-hypothesis. The smaller the value of [alpha] the greater the
- strength of the test.
- The confidence level of the test is defined as 1 - [alpha], and often expressed
- as a percentage. So for example a significance level of 0.05, is equivalent
- to a 95% confidence level. Refer to
- [@http://www.itl.nist.gov/div898/handbook/prc/section1/prc14.htm
- "What are confidence intervals?"] in __handbook for more information.
- ] [/ Note]
- [note
- The usual assumptions of
- [@http://en.wikipedia.org/wiki/Independent_and_identically-distributed_random_variables independent and identically distributed (i.i.d.)]
- variables and [@http://en.wikipedia.org/wiki/Normal_distribution normal distribution]
- of course apply here, as they do in other examples.
- ]
- From the formula, it should be clear that:
- * The width of the confidence interval decreases as the sample size increases.
- * The width increases as the standard deviation increases.
- * The width increases as the ['confidence level increases] (0.5 towards 0.99999 - stronger).
- * The width increases as the ['significance level decreases] (0.5 towards 0.00000...01 - stronger).
- The following example code is taken from the example program
- [@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].
- We'll begin by defining a procedure to calculate intervals for
- various confidence levels; the procedure will print these out
- as a table:
- // Needed includes:
- #include <boost/math/distributions/students_t.hpp>
- #include <iostream>
- #include <iomanip>
- // Bring everything into global namespace for ease of use:
- using namespace boost::math;
- using namespace std;
- void confidence_limits_on_mean(
- double Sm, // Sm = Sample Mean.
- double Sd, // Sd = Sample Standard Deviation.
- unsigned Sn) // Sn = Sample Size.
- {
- using namespace std;
- using namespace boost::math;
- // Print out general info:
- cout <<
- "__________________________________\n"
- "2-Sided Confidence Limits For Mean\n"
- "__________________________________\n\n";
- cout << setprecision(7);
- cout << setw(40) << left << "Number of Observations" << "= " << Sn << "\n";
- cout << setw(40) << left << "Mean" << "= " << Sm << "\n";
- cout << setw(40) << left << "Standard Deviation" << "= " << Sd << "\n";
- We'll define a table of significance/risk levels for which we'll compute intervals:
- double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
- Note that these are the complements of the confidence/probability levels: 0.5, 0.75, 0.9 .. 0.99999).
- Next we'll declare the distribution object we'll need, note that
- the /degrees of freedom/ parameter is the sample size less one:
- students_t dist(Sn - 1);
- Most of what follows in the program is pretty printing, so let's focus
- on the calculation of the interval. First we need the t-statistic,
- computed using the /quantile/ function and our significance level. Note
- that since the significance levels are the complement of the probability,
- we have to wrap the arguments in a call to /complement(...)/:
- double T = quantile(complement(dist, alpha[i] / 2));
- Note that alpha was divided by two, since we'll be calculating
- both the upper and lower bounds: had we been interested in a single
- sided interval then we would have omitted this step.
- Now to complete the picture, we'll get the (one-sided) width of the
- interval from the t-statistic
- by multiplying by the standard deviation, and dividing by the square
- root of the sample size:
- double w = T * Sd / sqrt(double(Sn));
- The two-sided interval is then the sample mean plus and minus this width.
- And apart from some more pretty-printing that completes the procedure.
- Let's take a look at some sample output, first using the
- [@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
- Heat flow data] from the NIST site. The data set was collected
- by Bob Zarr of NIST in January, 1990 from a heat flow meter
- calibration and stability analysis.
- The corresponding dataplot
- output for this test can be found in
- [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
- section 3.5.2] of the __handbook.
- [pre'''
- __________________________________
- 2-Sided Confidence Limits For Mean
- __________________________________
- Number of Observations = 195
- Mean = 9.26146
- Standard Deviation = 0.02278881
- ___________________________________________________________________
- Confidence T Interval Lower Upper
- Value (%) Value Width Limit Limit
- ___________________________________________________________________
- 50.000 0.676 1.103e-003 9.26036 9.26256
- 75.000 1.154 1.883e-003 9.25958 9.26334
- 90.000 1.653 2.697e-003 9.25876 9.26416
- 95.000 1.972 3.219e-003 9.25824 9.26468
- 99.000 2.601 4.245e-003 9.25721 9.26571
- 99.900 3.341 5.453e-003 9.25601 9.26691
- 99.990 3.973 6.484e-003 9.25498 9.26794
- 99.999 4.537 7.404e-003 9.25406 9.26886
- ''']
- As you can see the large sample size (195) and small standard deviation (0.023)
- have combined to give very small intervals, indeed we can be
- very confident that the true mean is 9.2.
- For comparison the next example data output is taken from
- ['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
- and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
- J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
- The values result from the determination of mercury by cold-vapour
- atomic absorption.
- [pre'''
- __________________________________
- 2-Sided Confidence Limits For Mean
- __________________________________
- Number of Observations = 3
- Mean = 37.8000000
- Standard Deviation = 0.9643650
- ___________________________________________________________________
- Confidence T Interval Lower Upper
- Value (%) Value Width Limit Limit
- ___________________________________________________________________
- 50.000 0.816 0.455 37.34539 38.25461
- 75.000 1.604 0.893 36.90717 38.69283
- 90.000 2.920 1.626 36.17422 39.42578
- 95.000 4.303 2.396 35.40438 40.19562
- 99.000 9.925 5.526 32.27408 43.32592
- 99.900 31.599 17.594 20.20639 55.39361
- 99.990 99.992 55.673 -17.87346 93.47346
- 99.999 316.225 176.067 -138.26683 213.86683
- ''']
- This time the fact that there are only three measurements leads to
- much wider intervals, indeed such large intervals that it's hard
- to be very confident in the location of the mean.
- [endsect] [/section:tut_mean_intervals Calculating confidence intervals on the mean with the Students-t distribution]
- [section:tut_mean_test Testing a sample mean for difference from a "true" mean]
- When calibrating or comparing a scientific instrument or measurement method of some kind,
- we want to be answer the question "Does an observed sample mean differ from the
- "true" mean in any significant way?". If it does, then we have evidence of
- a systematic difference. This question can be answered with a Students-t test:
- more information can be found
- [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
- on the NIST site].
- Of course, the assignment of "true" to one mean may be quite arbitrary,
- often this is simply a "traditional" method of measurement.
- The following example code is taken from the example program
- [@../../example/students_t_single_sample.cpp students_t_single_sample.cpp].
- We'll begin by defining a procedure to determine which of the
- possible hypothesis are rejected or not-rejected
- at a given significance level:
- [note
- Non-statisticians might say 'not-rejected' means 'accepted',
- (often of the null-hypothesis) implying, wrongly, that there really *IS* no difference,
- but statisticans eschew this to avoid implying that there is positive evidence of 'no difference'.
- 'Not-rejected' here means there is *no evidence* of difference, but there still might well be a difference.
- For example, see [@http://en.wikipedia.org/wiki/Argument_from_ignorance argument from ignorance] and
- [@http://www.bmj.com/cgi/content/full/311/7003/485 Absence of evidence does not constitute evidence of absence.]
- ] [/ note]
- // Needed includes:
- #include <boost/math/distributions/students_t.hpp>
- #include <iostream>
- #include <iomanip>
- // Bring everything into global namespace for ease of use:
- using namespace boost::math;
- using namespace std;
- void single_sample_t_test(double M, double Sm, double Sd, unsigned Sn, double alpha)
- {
- //
- // M = true mean.
- // Sm = Sample Mean.
- // Sd = Sample Standard Deviation.
- // Sn = Sample Size.
- // alpha = Significance Level.
- Most of the procedure is pretty-printing, so let's just focus on the
- calculation, we begin by calculating the t-statistic:
- // Difference in means:
- double diff = Sm - M;
- // Degrees of freedom:
- unsigned v = Sn - 1;
- // t-statistic:
- double t_stat = diff * sqrt(double(Sn)) / Sd;
- Finally calculate the probability from the t-statistic. If we're interested
- in simply whether there is a difference (either less or greater) or not,
- we don't care about the sign of the t-statistic,
- and we take the complement of the probability for comparison
- to the significance level:
- students_t dist(v);
- double q = cdf(complement(dist, fabs(t_stat)));
- The procedure then prints out the results of the various tests
- that can be done, these
- can be summarised in the following table:
- [table
- [[Hypothesis][Test]]
- [[The Null-hypothesis: there is
- *no difference* in means]
- [Reject if complement of CDF for |t| < significance level / 2:
- `cdf(complement(dist, fabs(t))) < alpha / 2`]]
- [[The Alternative-hypothesis: there
- *is difference* in means]
- [Reject if complement of CDF for |t| > significance level / 2:
- `cdf(complement(dist, fabs(t))) > alpha / 2`]]
- [[The Alternative-hypothesis: the sample mean *is less* than
- the true mean.]
- [Reject if CDF of t > 1 - significance level:
- `cdf(complement(dist, t)) < alpha`]]
- [[The Alternative-hypothesis: the sample mean *is greater* than
- the true mean.]
- [Reject if complement of CDF of t < significance level:
- `cdf(dist, t) < alpha`]]
- ]
- [note
- Notice that the comparisons are against `alpha / 2` for a two-sided test
- and against `alpha` for a one-sided test]
- Now that we have all the parts in place, let's take a look at some
- sample output, first using the
- [@http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm
- Heat flow data] from the NIST site. The data set was collected
- by Bob Zarr of NIST in January, 1990 from a heat flow meter
- calibration and stability analysis. The corresponding dataplot
- output for this test can be found in
- [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm
- section 3.5.2] of the __handbook.
- [pre
- __________________________________
- Student t test for a single sample
- __________________________________
- Number of Observations = 195
- Sample Mean = 9.26146
- Sample Standard Deviation = 0.02279
- Expected True Mean = 5.00000
- Sample Mean - Expected Test Mean = 4.26146
- Degrees of Freedom = 194
- T Statistic = 2611.28380
- Probability that difference is due to chance = 0.000e+000
- Results for Alternative Hypothesis and alpha = 0.0500
- Alternative Hypothesis Conclusion
- Mean != 5.000 NOT REJECTED
- Mean < 5.000 REJECTED
- Mean > 5.000 NOT REJECTED
- ]
- You will note the line that says the probability that the difference is
- due to chance is zero. From a philosophical point of view, of course,
- the probability can never reach zero. However, in this case the calculated
- probability is smaller than the smallest representable double precision number,
- hence the appearance of a zero here. Whatever its "true" value is, we know it
- must be extraordinarily small, so the alternative hypothesis - that there is
- a difference in means - is not rejected.
- For comparison the next example data output is taken from
- ['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
- and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
- J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
- The values result from the determination of mercury by cold-vapour
- atomic absorption.
- [pre
- __________________________________
- Student t test for a single sample
- __________________________________
- Number of Observations = 3
- Sample Mean = 37.80000
- Sample Standard Deviation = 0.96437
- Expected True Mean = 38.90000
- Sample Mean - Expected Test Mean = -1.10000
- Degrees of Freedom = 2
- T Statistic = -1.97566
- Probability that difference is due to chance = 1.869e-001
- Results for Alternative Hypothesis and alpha = 0.0500
- Alternative Hypothesis Conclusion
- Mean != 38.900 REJECTED
- Mean < 38.900 NOT REJECTED
- Mean > 38.900 NOT REJECTED
- ]
- As you can see the small number of measurements (3) has led to a large uncertainty
- in the location of the true mean. So even though there appears to be a difference
- between the sample mean and the expected true mean, we conclude that there
- is no significant difference, and are unable to reject the null hypothesis.
- However, if we were to lower the bar for acceptance down to alpha = 0.1
- (a 90% confidence level) we see a different output:
- [pre
- __________________________________
- Student t test for a single sample
- __________________________________
- Number of Observations = 3
- Sample Mean = 37.80000
- Sample Standard Deviation = 0.96437
- Expected True Mean = 38.90000
- Sample Mean - Expected Test Mean = -1.10000
- Degrees of Freedom = 2
- T Statistic = -1.97566
- Probability that difference is due to chance = 1.869e-001
- Results for Alternative Hypothesis and alpha = 0.1000
- Alternative Hypothesis Conclusion
- Mean != 38.900 REJECTED
- Mean < 38.900 NOT REJECTED
- Mean > 38.900 REJECTED
- ]
- In this case, we really have a borderline result,
- and more data (and/or more accurate data),
- is needed for a more convincing conclusion.
- [endsect] [/section:tut_mean_test Testing a sample mean for difference from a "true" mean]
- [section:tut_mean_size Estimating how large a sample size would have to become
- in order to give a significant Students-t test result with a single sample test]
- Imagine you have conducted a Students-t test on a single sample in order
- to check for systematic errors in your measurements. Imagine that the
- result is borderline. At this point one might go off and collect more data,
- but it might be prudent to first ask the question "How much more?".
- The parameter estimators of the students_t_distribution class
- can provide this information.
- This section is based on the example code in
- [@../../example/students_t_single_sample.cpp students_t_single_sample.cpp]
- and we begin by defining a procedure that will print out a table of
- estimated sample sizes for various confidence levels:
- // Needed includes:
- #include <boost/math/distributions/students_t.hpp>
- #include <iostream>
- #include <iomanip>
- // Bring everything into global namespace for ease of use:
- using namespace boost::math;
- using namespace std;
- void single_sample_find_df(
- double M, // M = true mean.
- double Sm, // Sm = Sample Mean.
- double Sd) // Sd = Sample Standard Deviation.
- {
- Next we define a table of significance levels:
- double alpha[] = { 0.5, 0.25, 0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001 };
- Printing out the table of sample sizes required for various confidence levels
- begins with the table header:
- cout << "\n\n"
- "_______________________________________________________________\n"
- "Confidence Estimated Estimated\n"
- " Value (%) Sample Size Sample Size\n"
- " (one sided test) (two sided test)\n"
- "_______________________________________________________________\n";
- And now the important part: the sample sizes required. Class
- `students_t_distribution` has a static member function
- `find_degrees_of_freedom` that will calculate how large
- a sample size needs to be in order to give a definitive result.
- The first argument is the difference between the means that you
- wish to be able to detect, here it's the absolute value of the
- difference between the sample mean, and the true mean.
- Then come two probability values: alpha and beta. Alpha is the
- maximum acceptable risk of rejecting the null-hypothesis when it is
- in fact true. Beta is the maximum acceptable risk of failing to reject
- the null-hypothesis when in fact it is false.
- Also note that for a two-sided test, alpha must be divided by 2.
- The final parameter of the function is the standard deviation of the sample.
- In this example, we assume that alpha and beta are the same, and call
- `find_degrees_of_freedom` twice: once with alpha for a one-sided test,
- and once with alpha/2 for a two-sided test.
- for(unsigned i = 0; i < sizeof(alpha)/sizeof(alpha[0]); ++i)
- {
- // Confidence value:
- cout << fixed << setprecision(3) << setw(10) << right << 100 * (1-alpha[i]);
- // calculate df for single sided test:
- double df = students_t::find_degrees_of_freedom(
- fabs(M - Sm), alpha[i], alpha[i], Sd);
- // convert to sample size:
- double size = ceil(df) + 1;
- // Print size:
- cout << fixed << setprecision(0) << setw(16) << right << size;
- // calculate df for two sided test:
- df = students_t::find_degrees_of_freedom(
- fabs(M - Sm), alpha[i]/2, alpha[i], Sd);
- // convert to sample size:
- size = ceil(df) + 1;
- // Print size:
- cout << fixed << setprecision(0) << setw(16) << right << size << endl;
- }
- cout << endl;
- }
- Let's now look at some sample output using data taken from
- ['P.K.Hou, O. W. Lau & M.C. Wong, Analyst (1983) vol. 108, p 64.
- and from Statistics for Analytical Chemistry, 3rd ed. (1994), pp 54-55
- J. C. Miller and J. N. Miller, Ellis Horwood ISBN 0 13 0309907.]
- The values result from the determination of mercury by cold-vapour
- atomic absorption.
- Only three measurements were made, and the Students-t test above
- gave a borderline result, so this example
- will show us how many samples would need to be collected:
- [pre'''
- _____________________________________________________________
- Estimated sample sizes required for various confidence levels
- _____________________________________________________________
- True Mean = 38.90000
- Sample Mean = 37.80000
- Sample Standard Deviation = 0.96437
- _______________________________________________________________
- Confidence Estimated Estimated
- Value (%) Sample Size Sample Size
- (one sided test) (two sided test)
- _______________________________________________________________
- 75.000 3 4
- 90.000 7 9
- 95.000 11 13
- 99.000 20 22
- 99.900 35 37
- 99.990 50 53
- 99.999 66 68
- ''']
- So in this case, many more measurements would have had to be made,
- for example at the 95% level, 14 measurements in total for a two-sided test.
- [endsect] [/section:tut_mean_size Estimating how large a sample size would have to become in order to give a significant Students-t test result with a single sample test]
- [section:two_sample_students_t Comparing the means of two samples with the Students-t test]
- Imagine that we have two samples, and we wish to determine whether
- their means are different or not. This situation often arises when
- determining whether a new process or treatment is better than an old one.
- In this example, we'll be using the
- [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda3531.htm
- Car Mileage sample data] from the
- [@http://www.itl.nist.gov NIST website]. The data compares
- miles per gallon of US cars with miles per gallon of Japanese cars.
- The sample code is in
- [@../../example/students_t_two_samples.cpp students_t_two_samples.cpp].
- There are two ways in which this test can be conducted: we can assume
- that the true standard deviations of the two samples are equal or not.
- If the standard deviations are assumed to be equal, then the calculation
- of the t-statistic is greatly simplified, so we'll examine that case first.
- In real life we should verify whether this assumption is valid with a
- Chi-Squared test for equal variances.
- We begin by defining a procedure that will conduct our test assuming equal
- variances:
- // Needed headers:
- #include <boost/math/distributions/students_t.hpp>
- #include <iostream>
- #include <iomanip>
- // Simplify usage:
- using namespace boost::math;
- using namespace std;
- void two_samples_t_test_equal_sd(
- double Sm1, // Sm1 = Sample 1 Mean.
- double Sd1, // Sd1 = Sample 1 Standard Deviation.
- unsigned Sn1, // Sn1 = Sample 1 Size.
- double Sm2, // Sm2 = Sample 2 Mean.
- double Sd2, // Sd2 = Sample 2 Standard Deviation.
- unsigned Sn2, // Sn2 = Sample 2 Size.
- double alpha) // alpha = Significance Level.
- {
- Our procedure will begin by calculating the t-statistic, assuming
- equal variances the needed formulae are:
- [equation dist_tutorial1]
- where Sp is the "pooled" standard deviation of the two samples,
- and /v/ is the number of degrees of freedom of the two combined
- samples. We can now write the code to calculate the t-statistic:
- // Degrees of freedom:
- double v = Sn1 + Sn2 - 2;
- cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n";
- // Pooled variance:
- double sp = sqrt(((Sn1-1) * Sd1 * Sd1 + (Sn2-1) * Sd2 * Sd2) / v);
- cout << setw(55) << left << "Pooled Standard Deviation" << "= " << sp << "\n";
- // t-statistic:
- double t_stat = (Sm1 - Sm2) / (sp * sqrt(1.0 / Sn1 + 1.0 / Sn2));
- cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n";
- The next step is to define our distribution object, and calculate the
- complement of the probability:
- students_t dist(v);
- double q = cdf(complement(dist, fabs(t_stat)));
- cout << setw(55) << left << "Probability that difference is due to chance" << "= "
- << setprecision(3) << scientific << 2 * q << "\n\n";
- Here we've used the absolute value of the t-statistic, because we initially
- want to know simply whether there is a difference or not (a two-sided test).
- However, we can also test whether the mean of the second sample is greater
- or is less (one-sided test) than that of the first:
- all the possible tests are summed up in the following table:
- [table
- [[Hypothesis][Test]]
- [[The Null-hypothesis: there is
- *no difference* in means]
- [Reject if complement of CDF for |t| < significance level / 2:
- `cdf(complement(dist, fabs(t))) < alpha / 2`]]
- [[The Alternative-hypothesis: there is a
- *difference* in means]
- [Reject if complement of CDF for |t| > significance level / 2:
- `cdf(complement(dist, fabs(t))) > alpha / 2`]]
- [[The Alternative-hypothesis: Sample 1 Mean is *less* than
- Sample 2 Mean.]
- [Reject if CDF of t > significance level:
- `cdf(dist, t) > alpha`]]
- [[The Alternative-hypothesis: Sample 1 Mean is *greater* than
- Sample 2 Mean.]
- [Reject if complement of CDF of t > significance level:
- `cdf(complement(dist, t)) > alpha`]]
- ]
- [note
- For a two-sided test we must compare against alpha / 2 and not alpha.]
- Most of the rest of the sample program is pretty-printing, so we'll
- skip over that, and take a look at the sample output for alpha=0.05
- (a 95% probability level). For comparison the dataplot output
- for the same data is in
- [@http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm
- section 1.3.5.3] of the __handbook.
- [pre'''
- ________________________________________________
- Student t test for two samples (equal variances)
- ________________________________________________
- Number of Observations (Sample 1) = 249
- Sample 1 Mean = 20.145
- Sample 1 Standard Deviation = 6.4147
- Number of Observations (Sample 2) = 79
- Sample 2 Mean = 30.481
- Sample 2 Standard Deviation = 6.1077
- Degrees of Freedom = 326
- Pooled Standard Deviation = 6.3426
- T Statistic = -12.621
- Probability that difference is due to chance = 5.273e-030
- Results for Alternative Hypothesis and alpha = 0.0500'''
- Alternative Hypothesis Conclusion
- Sample 1 Mean != Sample 2 Mean NOT REJECTED
- Sample 1 Mean < Sample 2 Mean NOT REJECTED
- Sample 1 Mean > Sample 2 Mean REJECTED
- ]
- So with a probability that the difference is due to chance of just
- 5.273e-030, we can safely conclude that there is indeed a difference.
- The tests on the alternative hypothesis show that we must
- also reject the hypothesis that Sample 1 Mean is
- greater than that for Sample 2: in this case Sample 1 represents the
- miles per gallon for Japanese cars, and Sample 2 the miles per gallon for
- US cars, so we conclude that Japanese cars are on average more
- fuel efficient.
- Now that we have the simple case out of the way, let's look for a moment
- at the more complex one: that the standard deviations of the two samples
- are not equal. In this case the formula for the t-statistic becomes:
- [equation dist_tutorial2]
- And for the combined degrees of freedom we use the
- [@http://en.wikipedia.org/wiki/Welch-Satterthwaite_equation Welch-Satterthwaite]
- approximation:
- [equation dist_tutorial3]
- Note that this is one of the rare situations where the degrees-of-freedom
- parameter to the Student's t distribution is a real number, and not an
- integer value.
- [note
- Some statistical packages truncate the effective degrees of freedom to
- an integer value: this may be necessary if you are relying on lookup tables,
- but since our code fully supports non-integer degrees of freedom there is no
- need to truncate in this case. Also note that when the degrees of freedom
- is small then the Welch-Satterthwaite approximation may be a significant
- source of error.]
- Putting these formulae into code we get:
- // Degrees of freedom:
- double v = Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2;
- v *= v;
- double t1 = Sd1 * Sd1 / Sn1;
- t1 *= t1;
- t1 /= (Sn1 - 1);
- double t2 = Sd2 * Sd2 / Sn2;
- t2 *= t2;
- t2 /= (Sn2 - 1);
- v /= (t1 + t2);
- cout << setw(55) << left << "Degrees of Freedom" << "= " << v << "\n";
- // t-statistic:
- double t_stat = (Sm1 - Sm2) / sqrt(Sd1 * Sd1 / Sn1 + Sd2 * Sd2 / Sn2);
- cout << setw(55) << left << "T Statistic" << "= " << t_stat << "\n";
- Thereafter the code and the tests are performed the same as before. Using
- are car mileage data again, here's what the output looks like:
- [pre'''
- __________________________________________________
- Student t test for two samples (unequal variances)
- __________________________________________________
- Number of Observations (Sample 1) = 249
- Sample 1 Mean = 20.145
- Sample 1 Standard Deviation = 6.4147
- Number of Observations (Sample 2) = 79
- Sample 2 Mean = 30.481
- Sample 2 Standard Deviation = 6.1077
- Degrees of Freedom = 136.87
- T Statistic = -12.946
- Probability that difference is due to chance = 1.571e-025
- Results for Alternative Hypothesis and alpha = 0.0500'''
- Alternative Hypothesis Conclusion
- Sample 1 Mean != Sample 2 Mean NOT REJECTED
- Sample 1 Mean < Sample 2 Mean NOT REJECTED
- Sample 1 Mean > Sample 2 Mean REJECTED
- ]
- This time allowing the variances in the two samples to differ has yielded
- a higher likelihood that the observed difference is down to chance alone
- (1.571e-025 compared to 5.273e-030 when equal variances were assumed).
- However, the conclusion remains the same: US cars are less fuel efficient
- than Japanese models.
- [endsect] [/section:two_sample_students_t Comparing the means of two samples with the Students-t test]
- [section:paired_st Comparing two paired samples with the Student's t distribution]
- Imagine that we have a before and after reading for each item in the sample:
- for example we might have measured blood pressure before and after administration
- of a new drug. We can't pool the results and compare the means before and after
- the change, because each patient will have a different baseline reading.
- Instead we calculate the difference between before and after measurements
- in each patient, and calculate the mean and standard deviation of the differences.
- To test whether a significant change has taken place, we can then test
- the null-hypothesis that the true mean is zero using the same procedure
- we used in the single sample cases previously discussed.
- That means we can:
- * [link math_toolkit.stat_tut.weg.st_eg.tut_mean_intervals Calculate confidence intervals of the mean].
- If the endpoints of the interval differ in sign then we are unable to reject
- the null-hypothesis that there is no change.
- * [link math_toolkit.stat_tut.weg.st_eg.tut_mean_test Test whether the true mean is zero]. If the
- result is consistent with a true mean of zero, then we are unable to reject the
- null-hypothesis that there is no change.
- * [link math_toolkit.stat_tut.weg.st_eg.tut_mean_size Calculate how many pairs of readings we would need
- in order to obtain a significant result].
- [endsect] [/section:paired_st Comparing two paired samples with the Student's t distribution]
- [endsect] [/section:st_eg Student's t]
- [/
- Copyright 2006, 2012 John Maddock and Paul A. Bristow.
- Distributed under the Boost Software License, Version 1.0.
- (See accompanying file LICENSE_1_0.txt or copy at
- http://www.boost.org/LICENSE_1_0.txt).
- ]
|