Confidence Intervals for Comparing Two Treatment or Population Means Study Guide (page 3)

Updated on Aug 24, 2011


  1. The populations of interest are the population of identical twins and the population of fraternal twins. The type of twins cannot be assigned at random. Fraternal and identical twins constitute different populations.
  2. The estimated mean difference in the heights of identical twins is 1.68 cm. The estimated variance of the difference in the heights of identical twins is 2.10 cm², and the estimated standard deviation is 1.45 cm. The estimated mean difference in the heights of fraternal twins is 3.40 cm. The estimated variance of the difference in heights of fraternal twins is 14.76 cm², and the estimated standard deviation is 3.842 cm.
  3. The variance of the difference in heights of fraternal twins is about seven times the variance of the difference in heights of identical twins. Thus, it is unlikely that these are estimates of the same quantity. (In general, if one variance is about four times that of the other, then it is unlikely the two are equal.) Thus, we would not want to estimate a common variance.
  4. The estimated difference in population means is $\bar{x}_F - \bar{x}_I$ = 3.9 – 1.5 = 2.4 cm. Because the variances are not the same, the standard error of the estimate has the unpooled form $\sqrt{s_F^2/n_F + s_I^2/n_I}$.
  5. Parallel dotplots and boxplots are shown in Figures 19.3 and 19.4. Both graphs indicate that the sample distributions are skewed right. The difference in identical twin heights has an outlier as well. Normality is not a reasonable assumption for these populations.

    Figure 19.3

    Figure 19.4

  6. Because twins were recruited and not randomly selected, inference may be drawn only to twins in the study. We would hope that this sample is representative of the larger population of identical and fraternal twins so that the inferences could be drawn more broadly. However, we cannot be assured of this.
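
The rule of thumb quoted in answer 3 (a variance ratio of about four or more suggests unequal variances) can be checked with a few lines of Python. This is a minimal sketch: the function names and the default threshold of 4 are illustrative choices, not part of the study guide, and the rule itself is only a rough screen rather than a formal test.

```python
def variance_ratio(var_a: float, var_b: float) -> float:
    """Return the larger sample variance divided by the smaller one."""
    if var_a <= 0 or var_b <= 0:
        raise ValueError("sample variances must be positive")
    return max(var_a, var_b) / min(var_a, var_b)


def pooling_looks_unreasonable(var_a: float, var_b: float,
                               threshold: float = 4.0) -> bool:
    """Rough screen (assumed threshold of 4): True when the ratio is that large."""
    return variance_ratio(var_a, var_b) >= threshold


# Twin data from answer 2: fraternal 14.76 cm^2, identical 2.10 cm^2.
print(round(variance_ratio(14.76, 2.10), 2))      # 7.03
print(pooling_looks_unreasonable(14.76, 2.10))    # True
```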

Confidence Intervals Comparing Two Means

Two-Group Design

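As before, let $\bar{X}_1$ and $\bar{X}_2$ be the sample means based on units receiving treatments one and two, respectively. Then $\bar{X}_1 - \bar{X}_2$ is a point estimate of the difference in the two treatment means, $\mu_1 - \mu_2$. The standard error of $\bar{X}_1 - \bar{X}_2$ is $s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$ if $\sigma_1^2 = \sigma_2^2$ and $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ if $\sigma_1^2 \neq \sigma_2^2$. Here $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ is the pooled estimate of the common variance.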

To set a confidence interval on the difference in two treatment means, $\mu_1 - \mu_2$, using the methods outlined here, two conditions must be satisfied. First, the responses must be independent random samples from the populations of units receiving treatment 1 and treatment 2. Suppose the units are randomly selected from the population and then randomly assigned to either treatment 1 or treatment 2. The random selection of the units gives us the random samples, and the random assignment of treatments ensures independence. If the units are not randomly selected, then we must rely solely on the random assignment of the treatments to give us a population of all possible samples for the two treatments from these units. Either way, the random assignment of treatments is critical for inference. Second, the responses must either be normally distributed, or the sample size for each treatment must be large enough (n ≥ 30) so that, by the Central Limit Theorem, each estimated treatment mean is approximately normal. Once these conditions are met, the approach we use will depend on whether we believe $\sigma_1^2 = \sigma_2^2$ or $\sigma_1^2 \neq \sigma_2^2$. We will consider these two cases in turn.

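First, assume $\sigma_1^2 = \sigma_2^2$. To standardize $\bar{X}_1 - \bar{X}_2$, we take $T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, which has a t-distribution with $(n_1 + n_2 - 2)$ degrees of freedom. Thus, a 100(1 – α)% confidence interval on $\mu_1 - \mu_2$ is $(\bar{X}_1 - \bar{X}_2) \pm t^* s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, where t* with $(n_1 + n_2 - 2)$ degrees of freedom is the tabulated value such that $P(T > t^*) = \alpha/2$.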
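
The equal-variance interval can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, and the critical value `t_star` is supplied by the caller (for example, read from a t-table) rather than computed here.

```python
import math


def pooled_variance(var1: float, n1: int, var2: float, n2: int) -> float:
    """Pooled estimate of the common variance, weighted by degrees of freedom."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def pooled_interval(mean1: float, var1: float, n1: int,
                    mean2: float, var2: float, n2: int,
                    t_star: float) -> tuple[float, float]:
    """Confidence interval on mu1 - mu2 assuming equal population variances.

    t_star is the tabulated t value with n1 + n2 - 2 degrees of freedom
    and upper-tail area alpha / 2; it is an input, not computed here.
    """
    sp2 = pooled_variance(var1, n1, var2, n2)
    std_err = math.sqrt(sp2 * (1.0 / n1 + 1.0 / n2))
    diff = mean1 - mean2
    margin = t_star * std_err
    return diff - margin, diff + margin
```

Passing the critical value in keeps the sketch free of third-party dependencies; a statistics library could supply it instead.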

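Next, suppose that $\sigma_1^2 \neq \sigma_2^2$. Standardizing $\bar{X}_1 - \bar{X}_2$, we have $T' = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$.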

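This standardized variable is only approximately distributed as a t-distribution, and the approximation involves a complicated formula for the degrees of freedom. That is, $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$.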

Typically, computation of these degrees of freedom is built into a calculator or computer software. A 100(1 – α)% confidence interval has the form $(\bar{X}_1 - \bar{X}_2) \pm t^*\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, where t* has the degrees of freedom given previously.
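
For readers without such software at hand, the unequal-variance quantities can be sketched in plain Python. The degrees-of-freedom function below uses the standard Welch-Satterthwaite approximation; the function names are illustrative, and the result is generally not a whole number (tables are usually entered after rounding down).

```python
import math


def unpooled_standard_error(var1: float, n1: int,
                            var2: float, n2: int) -> float:
    """Standard error of the difference in means without pooling variances."""
    return math.sqrt(var1 / n1 + var2 / n2)


def approximate_degrees_of_freedom(var1: float, n1: int,
                                   var2: float, n2: int) -> float:
    """Welch-Satterthwaite approximate degrees of freedom."""
    a = var1 / n1  # estimated variance of the first sample mean
    b = var2 / n2  # estimated variance of the second sample mean
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
```

When the two sample variances and sample sizes are equal, the approximation reduces to n1 + n2 – 2, the same degrees of freedom as in the equal-variance case.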


Consider the telemarketing example in the previous lesson. Let the subscript H represent the hard-sell approach and the subscript S represent the soft-sell approach. Set an 80% confidence interval on the difference in the treatment means, μH – μS.


Two conditions must be satisfied. Randomly selected members of the sales force were assigned at random to the two treatments, so the first condition is satisfied. We noted earlier that the data consisted of discrete counts so they could not be normally distributed. However, the sample size is 40 for each treatment, allowing us to invoke the Central Limit Theorem.

In the previous lesson, we found $\bar{x}_H - \bar{x}_S = -0.93$ sales. The estimated variance for the hard-sell approach is 1.59 sales², and that for the soft-sell approach is 1.93 sales². Because these two estimates are close to each other, we assume that $\sigma_H^2 = \sigma_S^2$. Therefore, as we saw earlier, the standard error of $\bar{x}_H - \bar{x}_S$ is $s_p\sqrt{\frac{1}{n_H} + \frac{1}{n_S}} = 0.30$. For an 80% confidence interval, α = 0.20. We must find t* such that $P(T > t^*) = \alpha/2 = 0.10$, where we have $(n_H + n_S - 2) = 40 + 40 - 2 = 78$ df. From the t-table in Lesson 12, we look in the row for 78 df and the column for an upper-tail area of 0.10 to find t* = 1.292. Therefore, an 80% confidence interval on μH – μS is –0.93 ± 1.292(0.30), or –0.93 ± 0.39. We are thus 80% confident that, on average, the number of sales using the hard-sell approach is between 0.54 and 1.32 less each hour than using the soft-sell approach. Notice that the negative number means that fewer sales were made using the hard-sell approach because we were estimating μH – μS. A positive number would have indicated that the estimated mean for the hard-sell approach was larger than that of the soft-sell approach.
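
The arithmetic behind this interval can be laid out step by step. This is a sketch using only the figures quoted above; the symbols $s_p^2$, $n_H$, and $n_S$ are notation assumed here, and the pooled variance is simply the average of the two sample variances because both samples have 40 units.

```latex
\begin{align*}
s_p^2 &= \frac{(40 - 1)(1.59) + (40 - 1)(1.93)}{40 + 40 - 2} = \frac{1.59 + 1.93}{2} = 1.76 \\
\text{SE} &= \sqrt{1.76\left(\frac{1}{40} + \frac{1}{40}\right)} = \sqrt{0.088} \approx 0.30 \\
t^* \times \text{SE} &= 1.292 \times 0.30 \approx 0.39 \\
-0.93 \pm 0.39 &\;\Longrightarrow\; (-1.32,\; -0.54)
\end{align*}
```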
