Confidence Intervals for Comparing Two Treatment or Population Means Study Guide

Updated on Aug 24, 2011

Find practice problems and solutions for these concepts at Confidence Intervals for Comparing Two Treatment or Population Means Practice Problems.

Matched-pairs and two-group designs were considered in the previous lesson, but only the paired design was discussed in detail. Now we will focus on the two-group design and on random samples from two populations. Design considerations as well as inference for the difference in the treatment or population means will be discussed.

Two-Group Designs

Suppose we have decided to conduct a study using a two-group design. As with the paired design, we begin by selecting the study units. If this selection is made at random from some population, inferences can be made for this population at the end of the study. Otherwise, inference will be restricted to units in the study.

After the study units have been chosen, they are randomly assigned to the two treatments. Typically, half are assigned to the first treatment and the other half receive the second, but the two groups need not be equal in size. For example, we could flip a fair coin to determine which treatment each unit receives. About half would get each treatment, but it is likely that one treatment would end up with a few more study units than the other. There are also times when we want more units to receive one treatment than the other. However, in the absence of additional information, we will seek a randomization process that results in the same number of units in each group.
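One way to get equal group sizes is to shuffle the list of study units and split it in half. The following is only a minimal sketch of that idea; the function name assign_two_groups, the seed argument, and the unit labels are illustrative assumptions, not part of the original lesson.

```python
import random

def assign_two_groups(units, seed=None):
    """Randomly split units into two groups whose sizes differ by at most one.

    Hypothetical helper for illustration: shuffles a copy of the units
    and gives the first half to treatment 1, the rest to treatment 2.
    """
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(units)         # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Made-up unit labels for the example.
units = [f"unit{i}" for i in range(1, 11)]
group1, group2 = assign_two_groups(units, seed=1)
print(len(group1), len(group2))    # prints "5 5"
```

Unlike the coin-flip approach, this scheme fixes the group sizes in advance, so with an even number of units the two groups are always the same size.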

The goal of a two-group study is usually to compare the means of the two groups. Let the mean of the first population be denoted by $\mu_1$ and the mean of the second population by $\mu_2$. Let $X_{1i}$ be the observed response from the ith unit receiving the first treatment, $i = 1, 2, \ldots, n_1$, where $n_1$ units are receiving treatment 1. Similarly, let $X_{2i}$ be the observed response from the ith unit receiving the second treatment, $i = 1, 2, \ldots, n_2$. We use the sample mean of the units receiving the first treatment to estimate that treatment mean, and the sample mean of the units under the second treatment to estimate the second treatment mean. Let $\bar{X}_1$ and $\bar{X}_2$ be the sample means based on units receiving treatments 1 and 2, respectively.

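To estimate the difference in the two treatment means, $\mu_1 - \mu_2$, we would use $\bar{X}_1 - \bar{X}_2$. Although we have only one sample from each treatment, we can imagine repeating the study many times, and computing $\bar{X}_1 - \bar{X}_2$ each time. This gives rise to the sampling distribution of $\bar{X}_1 - \bar{X}_2$. If each population distribution is normal, the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is normal with mean $\mu_1 - \mu_2$ and variance $\sigma_1^2/n_1 + \sigma_2^2/n_2$, where $\sigma_1^2$ and $\sigma_2^2$ are the variances of the two populations.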
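The idea of repeating the study many times can be imitated by simulation. The sketch below draws repeated pairs of independent normal samples and compares the mean and variance of the simulated differences in sample means with the theoretical values. The function name and all numeric settings (means, standard deviations, sample sizes, number of repetitions) are made up for illustration.

```python
import random
import statistics

def simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2,
                              reps=20000, seed=0):
    """Simulate the difference in sample means over many repeated studies.

    Hypothetical helper: each repetition draws n1 values from a normal
    population (mu1, sigma1) and n2 values from another (mu2, sigma2).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        xbar1 = statistics.fmean([rng.gauss(mu1, sigma1) for _ in range(n1)])
        xbar2 = statistics.fmean([rng.gauss(mu2, sigma2) for _ in range(n2)])
        diffs.append(xbar1 - xbar2)
    return diffs

# Assumed populations: mu1 = 10, sigma1 = 2, n1 = 8; mu2 = 7, sigma2 = 3, n2 = 12.
diffs = simulate_mean_differences(10.0, 2.0, 8, 7.0, 3.0, 12)

theoretical_mean = 10.0 - 7.0                      # mu1 - mu2 = 3.0
theoretical_variance = 2.0**2 / 8 + 3.0**2 / 12    # 0.5 + 0.75 = 1.25

print("simulated mean:    ", statistics.fmean(diffs))
print("simulated variance:", statistics.variance(diffs))
print("theoretical values:", theoretical_mean, theoretical_variance)
```

With this many repetitions, the simulated mean and variance should land close to 3.0 and 1.25, though the exact values depend on the seed.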

The standard error of $\bar{X}_1 - \bar{X}_2$ depends on whether the variances of the units receiving the two treatments are equal ($\sigma_1^2 = \sigma_2^2$). If we believe that the two variances are equal, then we want to use information from each sample to estimate the common variance; that is, we want to find the pooled estimate of the variance. The term pooled estimate means that information from multiple samples is combined to provide one estimate. We must allow for the fact that the means could be different under the two treatments. These ideas lead us to use $s_p^2$ (called s-squared pooled),

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},$$

as the estimate of this common variance, where $s_1^2$ and $s_2^2$ are the sample variances of the units receiving treatments 1 and 2, respectively. Notice that $s_p^2$ is a weighted average of the estimated variances within each treatment. If the two sample sizes ($n_1$ and $n_2$) are equal, $s_p^2$ is the average of $s_1^2$ and $s_2^2$; otherwise, the sample with the larger number of observations has the greater weight. Assuming that the variance is the same under the two treatments, the standard error of $\bar{X}_1 - \bar{X}_2$ is $s_p\sqrt{1/n_1 + 1/n_2}$.
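The pooled estimate and the resulting standard error can be computed directly from two samples. This is a small sketch under the equal-variance assumption; the function name pooled_standard_error and the data values are invented for illustration.

```python
import math
import statistics

def pooled_standard_error(sample1, sample2):
    """Return (pooled variance, standard error of the difference in means).

    Hypothetical helper that assumes the two treatments share a common
    variance, so the two sample variances are pooled.
    """
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)   # sample variance, divisor n1 - 1
    s2_sq = statistics.variance(sample2)   # sample variance, divisor n2 - 1
    sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    return sp_sq, se

# Made-up responses for two treatments with equal group sizes.
treatment1 = [1.0, 2.0, 3.0, 4.0]
treatment2 = [2.0, 4.0, 6.0, 8.0]

sp_sq, se = pooled_standard_error(treatment1, treatment2)
print("pooled variance:", sp_sq)
print("standard error: ", se)
```

Because the two made-up samples have the same size, the pooled variance here is the simple average of the two sample variances (5/3 and 20/3), which is 25/6.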

What happens if we are unwilling to assume that the variances are the same under the two treatments? In this case, we must obtain separate estimates of the variance for units receiving each treatment. That is, $s_1^2$ is the sample variance for all units receiving treatment 1. Similarly, $s_2^2$ is the sample variance for all units receiving treatment 2. The standard error of $\bar{X}_1 - \bar{X}_2$ is $\sqrt{s_1^2/n_1 + s_2^2/n_2}$.
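When the variances are not assumed equal, each sample variance is divided by its own sample size before the two are added. The sketch below shows that calculation; the function name unpooled_standard_error and the data values are invented for illustration.

```python
import math
import statistics

def unpooled_standard_error(sample1, sample2):
    """Standard error of the difference in means without pooling.

    Hypothetical helper: each treatment's variance is estimated
    separately from its own sample.
    """
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)   # sample variance for treatment 1
    s2_sq = statistics.variance(sample2)   # sample variance for treatment 2
    return math.sqrt(s1_sq / n1 + s2_sq / n2)

# Made-up responses; the group sizes differ on purpose.
treatment1 = [1.0, 2.0, 3.0, 4.0]
treatment2 = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

se_unpooled = unpooled_standard_error(treatment1, treatment2)
print("unpooled standard error:", se_unpooled)
```

When the two sample sizes are equal, this expression gives the same value as the pooled standard error, so the choice between the two formulas only changes the result when the group sizes differ.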
