Practice problems for these concepts can be found at:
In this lesson, we get more specific by actually constructing confidence intervals for each of the parameters under consideration. The chart below lists each parameter for which we will construct confidence intervals, the conditions under which we are justified in constructing the interval, and the formula for actually constructing the interval.
Special note concerning the degrees of freedom for the sampling distribution of the difference of two means: In most situations, a conservative, and usually acceptable, approach for determining the required number of degrees of freedom is to let df = min { n_{1} – 1, n_{2} – 1 }. This is "conservative" in the sense that it will give a smaller number of degrees of freedom than other methods, which translates to a larger margin of error. If you choose not to use the conservative approach, there are two cases of interest: (1) the population variances are assumed to be equal; (2) the population variances are not assumed to be equal (the usual case).
- If we can justify the assumption that the population variances are equal, we can "pool" our estimates of the population standard deviation. In practice, this is rarely done since the statistical test for equal variances is unreliable. However, if we can make that assumption, then df = n_{1} + n_{2} – 2, and the standard error becomes s_{p}√(1/n_{1} + 1/n_{2}), where s_{p}² = [(n_{1} – 1)s_{1}² + (n_{2} – 1)s_{2}²]/(n_{1} + n_{2} – 2) is the pooled estimate of the common variance. You will never be required to use this method (since it is very difficult to justify the assumption that the population variances are equal), although you should know when it is permitted.
- If the population variances are not assumed to be equal, the confidence interval can be constructed by calculator or computer (that's the "computed by software" notation in the chart). In this case, the degrees of freedom will be computed using the following expression:
  df = (s_{1}²/n_{1} + s_{2}²/n_{2})² / [ (s_{1}²/n_{1})²/(n_{1} – 1) + (s_{2}²/n_{2})²/(n_{2} – 1) ]
You probably don't want to do this computation by hand, but you could! Note that this technique usually results in a noninteger number of degrees of freedom.
In practice, since most people will be constructing a two-sample confidence interval using a calculator, the second method above (referred to as the "computed by software" method in the box above) is acceptable. Just be sure to report the degrees of freedom as given by the calculator so that a reader knows that you used a calculator.
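The three ways of choosing the degrees of freedom discussed above can be compared side by side. The following is a minimal sketch, not a definitive implementation: the function names and the summary statistics (s1, n1, s2, n2) are hypothetical and chosen only for illustration.

```python
def conservative_df(n1, n2):
    """Conservative choice: the smaller of the two one-sample df values."""
    return min(n1 - 1, n2 - 1)


def pooled_df(n1, n2):
    """df when the population variances are assumed equal (pooled method)."""
    return n1 + n2 - 2


def software_df(s1, n1, s2, n2):
    """df for unequal variances, as typically computed by software."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical summary statistics, for illustration only.
s1, n1 = 5.0, 25
s2, n2 = 3.0, 16

df_conservative = conservative_df(n1, n2)
df_software = software_df(s1, n1, s2, n2)
df_pooled = pooled_df(n1, n2)

print("conservative:", df_conservative)
print("software:    ", round(df_software, 3))
print("pooled:      ", df_pooled)
```

The software value always falls between the conservative and pooled values, and it is usually not an integer.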
example: An airline is interested in determining the average number of unoccupied seats for all of its flights. It selects an SRS of 81 flights and determines that the average number of unoccupied seats for the sample is 12.5 seats with a sample standard deviation of 3.9 seats. Construct a 95% confidence interval for the true mean number of unoccupied seats for all flights.
solution: The problem states that the sample is an SRS. The large sample size justifies the construction of a one-sample confidence interval for the population mean. For a 95% confidence interval with df = 81 – 1 = 80, we have, from Table B, t* = 1.990. The interval is x̄ ± t*(s/√n) = 12.5 ± 1.990(3.9/√81) = (11.64, 13.36).
If the problem had stated that n = 80 instead of 81, we would have had df = 80 – 1 = 79. There is no entry in Table B for 79 degrees of freedom. In this case we would have had to round down and use df = 60, resulting in t* = 2.000 and an interval of 12.5 ± 2.000(3.9/√80) = (11.63, 13.37). The difference isn't large, but the interval is slightly wider. (For the record, we note that the value of t* for df = 79 is given by the TI-84 as invT(0.975,79) = 1.99045.)
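The arithmetic in the airline example can be checked with a short script. This is a sketch under stated assumptions: the helper name t_interval is hypothetical, and the critical values t* are passed in by hand from the table (1.990 for df = 80, 2.000 for df = 60) rather than computed.

```python
from math import sqrt


def t_interval(xbar, s, n, t_star):
    """One-sample t interval: xbar +/- t* * s / sqrt(n).

    t_star must be looked up separately (table or calculator).
    """
    margin = t_star * s / sqrt(n)
    return xbar - margin, xbar + margin


# n = 81 flights, df = 80, t* = 1.990 from the table.
low81, high81 = t_interval(12.5, 3.9, 81, 1.990)

# n = 80 flights, df = 79 rounded down to 60, t* = 2.000 from the table.
low80, high80 = t_interval(12.5, 3.9, 80, 2.000)

print(round(low81, 2), round(high81, 2))
print(round(low80, 2), round(high80, 2))
```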
You can use the STAT TESTS TInterval function on the TI-83/84 calculator to find a confidence interval for a population mean (a confidence interval for a population mean is often called a "one-sample" t interval). It's recommended that you show how the interval was constructed as well as reporting the calculator answer. And don't forget to show that the conditions needed to construct the interval are present.
example: Interpret the confidence interval from the previous example in the context of the problem.
solution: We are 95% confident that the true mean number of unoccupied seats is between 11.6 and 13.4 seats. (Remember that we are not making any probability statement about the particular interval we have constructed. Either the true mean is in the interval or it isn't.)
For large-sample confidence intervals utilizing z-procedures, it is probably worth memorizing the critical values of z for the most common C levels of 0.90, 0.95, and 0.99. They are: z* = 1.645 for C = 0.90, z* = 1.96 for C = 0.95, and z* = 2.576 for C = 0.99.
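The critical values of z for these confidence levels can be computed from the inverse of the standard normal CDF. A minimal sketch using Python's standard library follows; the helper name z_star is an assumption made for illustration.

```python
from statistics import NormalDist


def z_star(confidence):
    """Upper critical value z* leaving (1 - confidence)/2 in each tail."""
    return NormalDist().inv_cdf((1 + confidence) / 2)


# Critical values for the three most common confidence levels.
critical_values = {c: round(z_star(c), 3) for c in (0.90, 0.95, 0.99)}

for level, z in critical_values.items():
    print(level, z)
```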
example: Brittany thinks she has a bad penny because, after 150 flips, she counted 88 heads. Find a 99% confidence interval for the true proportion of heads. Do you think the coin is biased?
solution: First we need to check whether using a z interval is justified. The sample proportion is p̂ = 88/150 ≈ 0.587, so np̂ = 150(0.587) ≈ 88 and n(1 – p̂) = 150(0.413) ≈ 62.
Because np̂ and n(1 – p̂) are both greater than or equal to 5, we can construct a 99% z interval for the population proportion: p̂ ± z*√(p̂(1 – p̂)/n) = 0.587 ± 2.576√((0.587)(0.413)/150) ≈ 0.587 ± 2.576(0.040) ≈ 0.587 ± 0.103 = (0.484, 0.690).
We are 99% confident that the true proportion of heads for this coin is between 0.484 and 0.690. If the coin were fair, we would expect, on average, 50% heads. Since 0.50 is in the interval, it is a plausible population value for this coin. We do not have strong evidence that Brittany's coin is bad.
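Brittany's interval can be reproduced with a few lines of code. This is a sketch only: the function name proportion_z_interval is hypothetical, and because it carries full precision instead of rounding intermediate values, its endpoints may differ from the ones quoted above by about 0.001.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_interval(successes, n, confidence):
    """Large-sample z interval for a single population proportion."""
    p_hat = successes / n
    # Check the usual condition before trusting the normal approximation.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("need n*p_hat and n*(1 - p_hat) both at least 5")
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 88 heads in 150 flips, 99% confidence.
low, high = proportion_z_interval(88, 150, 0.99)
print(round(low, 3), round(high, 3))

# A fair coin (p = 0.50) is inside the interval.
fair_coin_plausible = low < 0.50 < high
print(fair_coin_plausible)
```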
Generally, you should use t procedures for one- or two-sample problems that involve means (unless you are given the population standard deviation(s)), and z procedures for one- or two-proportion problems.
example: The following data were collected as part of a study. Construct a 90% confidence interval for the true difference between the means (μ_{1} – μ_{2}). Does it seem likely the differences in the sample indicate that there is a real difference between the population means? The samples were SRSs from independent, approximately normal, populations.
solution: The relatively small values of n tell us that we need to use a two-sample t interval. The conditions necessary for using this interval are given in the problem: SRSs from independent, approximately normal, populations. Using the "conservative" method of choosing the degrees of freedom, df = min{ n_{1} – 1, n_{2} – 1 }, the interval has the form (x̄_{1} – x̄_{2}) ± t*√(s_{1}²/n_{1} + s_{2}²/n_{2}).
We are 90% confident that the true difference between the means lies in the interval from 0.227 to 5.25. If the true difference between the means were zero, we would expect to find 0 in the interval. Because it isn't, this interval provides evidence that there might be a real difference between the means.
If you do the same exercise on your calculator (STAT TESTS 2-SampTInt), you get (0.302, 5.178) with df = 35.999. This interval is narrower, highlighting the conservative nature of using df = min{ n_{1} – 1, n_{2} – 1 }. Also, note that the calculator computes the number of degrees of freedom using the "computed by software" expression for unequal variances described earlier, which is why the reported df is not an integer.
example: Construct a 95% confidence interval for p_{1} – p_{2} given that n_{1} = 180, n_{2} = 250, p̂_{1} = 0.31, p̂_{2} = 0.25. Assume that these are data from SRSs from independent populations.
solution: 180(0.31) = 55.8, 180(1 – 0.31) = 124.2, 250(0.25) = 62.5, and 250(0.75) = 187.5 are all greater than or equal to 5 so, with what is given in the problem, we have the conditions needed to construct a two-proportion z interval: (p̂_{1} – p̂_{2}) ± z*√(p̂_{1}(1 – p̂_{1})/n_{1} + p̂_{2}(1 – p̂_{2})/n_{2}) = (0.31 – 0.25) ± 1.96√((0.31)(0.69)/180 + (0.25)(0.75)/250) ≈ 0.06 ± 0.086 = (–0.026, 0.146).
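The two-proportion z interval for this example can be sketched in code as follows. The function name two_proportion_z_interval is hypothetical, and the standard error uses the usual unpooled formula for a confidence interval; treat this as an illustration under those assumptions, not as the only way to carry out the computation.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_interval(p1, n1, p2, n2, confidence):
    """Large-sample z interval for p1 - p2 from two independent samples."""
    # Each sample needs at least 5 expected successes and 5 failures.
    counts = (n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2))
    if min(counts) < 5:
        raise ValueError("sample sizes too small for a z interval")
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Values from the example: sample proportions 0.31 and 0.25.
low, high = two_proportion_z_interval(0.31, 180, 0.25, 250, 0.95)
print(round(low, 3), round(high, 3))
```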