Confidence Intervals for Comparing Two Treatment or Population Means Study Guide (page 2)
Find practice problems and solutions for these concepts at Confidence Intervals for Comparing Two Treatment or Population Means Practice Problems.
Matched-pairs and two-group designs were considered in the previous lesson, but only the paired design was discussed in detail. Now we will focus on the two-group design and on random samples from two populations. Design considerations as well as inference for the difference in the treatment or population means will be discussed.
Suppose we have decided to conduct a study using a two-group design. As with the paired design, we begin by selecting the study units. If this selection is made at random from some population, inferences can be made for this population at the end of the study. Otherwise, inference will be restricted to units in the study.
After the study units have been chosen, half are randomly assigned to the first treatment; the other half receive the second treatment. It is not necessary for the two groups to be evenly divided as just described. We could flip a fair coin to determine which treatment each unit receives. Although about half would get each treatment, it is likely that one treatment will have a few more study units than the other treatment. There are times in which we want to have more units receiving one treatment than another. However, in the absence of additional information, we will seek a randomization process that will result in the same number of units within each group.
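The two randomization schemes described above can be sketched in Python. This is an illustration only: the function names `assign_two_groups` and `assign_by_coin_flip`, the unit labels 1 to 80, and the seed are hypothetical choices, not part of the study guide.

```python
import random

def assign_two_groups(units, seed=None):
    """Shuffle the units and split them so both groups have (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def assign_by_coin_flip(units, seed=None):
    """Flip a fair coin for each unit; group sizes are random, not fixed."""
    rng = random.Random(seed)
    group1, group2 = [], []
    for unit in units:
        if rng.random() < 0.5:
            group1.append(unit)
        else:
            group2.append(unit)
    return group1, group2

units = list(range(1, 81))  # hypothetical labels for 80 study units
group1, group2 = assign_two_groups(units, seed=1)
coin1, coin2 = assign_by_coin_flip(units, seed=1)

print("shuffle-and-split sizes:", len(group1), len(group2))
print("coin-flip sizes:", len(coin1), len(coin2))
```

The shuffle-and-split version always gives equal group sizes when the number of units is even, while the coin-flip version usually gives sizes that differ by a few units.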
The goal of a two-group study is usually to compare the means of the two groups. Let the mean of the first population be denoted by $\mu_1$ and the mean of the second population by $\mu_2$. Let $X_{1i}$ be the observed response from the ith unit receiving the first treatment, $i = 1, 2, \ldots, n_1$, where $n_1$ units are receiving treatment 1. Similarly, let $X_{2i}$ be the observed response from the ith unit receiving the second treatment, $i = 1, 2, \ldots, n_2$. We use the sample mean of the units receiving the first treatment to estimate that treatment mean, and the sample mean of the units under the second treatment to estimate the second treatment mean. Let $\bar{X}_1$ and $\bar{X}_2$ be the sample means based on units receiving treatments 1 and 2, respectively.
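To estimate the difference in the two treatment means, $\mu_1 - \mu_2$, we would use $\bar{X}_1 - \bar{X}_2$. Although we have only one sample from each treatment, we can imagine repeating the study many times and computing $\bar{X}_1 - \bar{X}_2$ each time. This gives rise to the sampling distribution of $\bar{X}_1 - \bar{X}_2$. If each population distribution is normal, the sampling distribution of $\bar{X}_1 - \bar{X}_2$ is normal with mean $\mu_1 - \mu_2$ and variance $\sigma_1^2/n_1 + \sigma_2^2/n_2$, where $\sigma_1^2$ and $\sigma_2^2$ are the variances of the responses under treatments 1 and 2.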
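The idea of repeating the study many times can be imitated by simulation. The sketch below draws normal samples for two treatments, records the difference in sample means on each repetition, and compares the simulated mean and variance of those differences with $\mu_1 - \mu_2$ and $\sigma_1^2/n_1 + \sigma_2^2/n_2$. All parameter values, the seed, and the function name `simulate_mean_differences` are assumptions made for illustration.

```python
import random
import statistics

def simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps, seed=0):
    """Repeat the two-group study `reps` times; return each difference in sample means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        sample1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        sample2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))
    return diffs

# Hypothetical population parameters and sample sizes.
mu1, sigma1, n1 = 5.0, 2.0, 10
mu2, sigma2, n2 = 3.0, 1.0, 15

diffs = simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps=10000)

theoretical_mean = mu1 - mu2
theoretical_var = sigma1**2 / n1 + sigma2**2 / n2

print("simulated mean:", statistics.fmean(diffs), "theory:", theoretical_mean)
print("simulated variance:", statistics.variance(diffs), "theory:", theoretical_var)
```

With many repetitions the simulated values should land close to the theoretical ones, though they will not match exactly.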
The standard error of $\bar{X}_1 - \bar{X}_2$ depends on whether the variances of the units receiving the two treatments are equal ($\sigma_1^2 = \sigma_2^2$). If we believe that the two variances are equal, then we want to use information from each sample to estimate the common variance; that is, we want to find the pooled estimate of the variance. The term pooled estimate means that information from multiple samples is combined to provide one estimate. We must allow for the fact that the means could be different under the two treatments. These ideas lead us to use $s_p^2$ (called s-squared pooled),
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},$$
as the estimate of this common variance, where $s_1^2$ and $s_2^2$ are the sample variances within treatments 1 and 2. Notice that $s_p^2$ is a weighted average of the estimated variances within each treatment. If the two sample sizes ($n_1$ and $n_2$) are equal, $s_p^2$ is the average of $s_1^2$ and $s_2^2$; otherwise, the sample with the larger number of observations has the larger weight. Assuming that the variance is the same under the two treatments, the standard error of $\bar{X}_1 - \bar{X}_2$ is $s_p\sqrt{1/n_1 + 1/n_2}$.
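As a minimal sketch of the pooled calculation, the functions below take the two sample variances and sample sizes and return the pooled variance and the resulting standard error. The function names and the numbers in the usage lines are hypothetical.

```python
import math

def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Weighted average of two sample variances, weighted by degrees of freedom."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def pooled_standard_error(s1_sq, n1, s2_sq, n2):
    """Standard error of the difference in sample means under a common variance."""
    sp_sq = pooled_variance(s1_sq, n1, s2_sq, n2)
    return math.sqrt(sp_sq * (1 / n1 + 1 / n2))

# Equal sample sizes: the pooled variance is the simple average of 4.0 and 6.0.
equal_sizes = pooled_variance(4.0, 10, 6.0, 10)

# Unequal sample sizes: the result is pulled toward the larger sample's variance (4.0).
unequal_sizes = pooled_variance(4.0, 30, 6.0, 10)

print(equal_sizes, unequal_sizes)
print(pooled_standard_error(4.0, 10, 6.0, 10))
```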
What happens if we are unwilling to assume that the variances are the same under the two treatments? In this case, we must obtain separate estimates of the variance for units receiving each treatment. That is, $s_1^2$ is the sample variance for all units receiving treatment 1. Similarly, $s_2^2$ is the sample variance for all units receiving treatment 2. The standard error of $\bar{X}_1 - \bar{X}_2$ is $\sqrt{s_1^2/n_1 + s_2^2/n_2}$.
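When the variances are not assumed equal, the calculation uses each sample variance with its own sample size. A short sketch, with a hypothetical function name and made-up inputs:

```python
import math

def unpooled_standard_error(s1_sq, n1, s2_sq, n2):
    """Standard error of the difference in sample means without assuming equal variances."""
    return math.sqrt(s1_sq / n1 + s2_sq / n2)

# Hypothetical summary statistics: variances 4.0 and 9.0, sample sizes 16 and 36.
se = unpooled_standard_error(4.0, 16, 9.0, 36)
print(se)  # sqrt(4/16 + 9/36) = sqrt(0.5)
```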
A large telemarketing firm acquired a new client with a product. A script for the sales people to use when calling prospective customers needed to be developed. Because the product was different from ones the firm had handled in the past, the script writers were divided as to which of two approaches, a hard-sell approach or a soft-sell approach, would result in the greatest number of sales. They decided to conduct a study to compare the two approaches. Eighty people were randomly selected from the sales force. Of these, 40 were randomly assigned to use the hard-sell approach; the other 40 were to use the soft-sell approach. Each person was then trained using the script of the method to which he or she was assigned. After having each study participant use the script for one day, the number of sales made during a randomly selected hour during the next work day was recorded. The results are in Table 19.1.
- This study has a two-group design. Explain why this statement is true.
- Estimate the mean and standard deviation for each treatment.
- Is it reasonable to assume the variance is the same for both populations? If so, estimate the variance common to both.
- Estimate the difference in the treatment means and find its standard error.
- Is it reasonable to assume that the numbers of sales are normally distributed for each treatment?
- To which population may inference be drawn from this study?
- The hard-sell approach was randomly assigned to half of the study participants, and the other half of the study participants was assigned the soft-sell approach. No effort was made to pair the study participants to control other factors.
- The estimated mean number of sales per hour when using the hard-sell approach is 1.45 sales. The estimated variance of the number of sales per hour for this approach is 1.59 sales², so the estimated standard deviation is 1.26 sales. The estimated mean and variance of the number of sales per hour when using the soft-sell approach are 2.38 sales and 1.93 sales², respectively. The estimated standard deviation of the number of sales using the soft-sell approach is 1.39 sales.
- Because the standard deviations for the two treatment groups are similar, it is reasonable to assume that they are estimating a common variance.
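Using the subscript H to represent the hard-sell approach and the subscript S to represent the soft-sell approach, the estimate of that common variance is: $s_p^2 = \frac{(n_H - 1)s_H^2 + (n_S - 1)s_S^2}{n_H + n_S - 2} = \frac{(39)(1.59) + (39)(1.93)}{78} = 1.76$ sales².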
- The estimated difference in the mean number of sales using the hard-sell and the soft-sell approaches is $\bar{X}_H - \bar{X}_S = 1.45 - 2.38 = -0.93$ sales; that is, 0.93 fewer sales are made, on average, using the hard-sell approach compared to the soft-sell approach. The standard error of this estimate is: $s_p\sqrt{1/n_H + 1/n_S} = \sqrt{1.76}\,\sqrt{1/40 + 1/40} \approx 0.30$ sales.
- For a two-group experiment, the condition of normality is checked within each treatment group. Because we are working with counts (the number of sales in an hour), the data are discrete. They cannot be normally distributed. We will focus on the shape of the sample distributions. Figures 19.1 and 19.2 show parallel dotplots and parallel boxplots. Based on the dotplot, the sample distribution of the hard-sell approach is skewed to the right, but the distribution of the soft-sell approach is reasonably symmetric. The boxplot supports the view that the sample distribution of the hard-sell approach is skewed to the right; further, the lone observation of five sales in an hour is an outlier. Based on the boxplot, the symmetry of the sample distribution of the soft-sell approach is a reasonable assumption, though some may believe the distribution to be skewed left.
- Because the study participants were randomly drawn from the firm's sales force, the sales force of this large telemarketing firm is the population to which inference may be drawn.
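The arithmetic in the solutions above can be checked from the summary statistics alone (means 1.45 and 2.38, variances 1.59 and 1.93, and 40 participants per group). The variable names below are hypothetical, and the variances are the rounded values quoted in the solution, so the results are approximate.

```python
import math

# Summary statistics quoted in the solution (rounded).
mean_h, var_h, n_h = 1.45, 1.59, 40  # hard-sell approach
mean_s, var_s, n_s = 2.38, 1.93, 40  # soft-sell approach

# Pooled estimate of the common variance.
sp_sq = ((n_h - 1) * var_h + (n_s - 1) * var_s) / (n_h + n_s - 2)

# Estimated difference in treatment means and its standard error.
diff = mean_h - mean_s
se_pooled = math.sqrt(sp_sq * (1 / n_h + 1 / n_s))

# Standard error without assuming equal variances, for comparison.
se_unpooled = math.sqrt(var_h / n_h + var_s / n_s)

print(f"pooled variance: {sp_sq:.2f}")
print(f"difference in means: {diff:.2f}")
print(f"pooled SE: {se_pooled:.3f}, unpooled SE: {se_unpooled:.3f}")
```

Because the two groups have the same number of participants, the pooled and unpooled standard errors coincide here; they differ only when the sample sizes differ.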
Comparing Two Populations
If we randomly select samples from each of two populations, the two samples are independent. The statistical methods used to compare the means of two populations based on independent samples from each population are identical to those used in analyzing studies of the two-group design. We estimate the difference using $\bar{X}_1 - \bar{X}_2$. The standard error of $\bar{X}_1 - \bar{X}_2$ is $s_p\sqrt{1/n_1 + 1/n_2}$ if the two population standard deviations are equal, where $s_p^2$ is the pooled estimate of the common variance. If the two population standard deviations are not equal, then the standard error of $\bar{X}_1 - \bar{X}_2$ is $\sqrt{s_1^2/n_1 + s_2^2/n_2}$.
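Starting from raw observations rather than summary statistics, the same estimate and either standard error can be computed in one step. The sketch below assumes two independent samples; the function name, the `equal_variances` flag, and the data values are all hypothetical.

```python
import math
import statistics

def difference_and_standard_error(sample1, sample2, equal_variances=True):
    """Return the difference in sample means and its standard error."""
    n1, n2 = len(sample1), len(sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    s1_sq = statistics.variance(sample1)  # sample variance (divides by n - 1)
    s2_sq = statistics.variance(sample2)
    if equal_variances:
        sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
        se = math.sqrt(sp_sq * (1 / n1 + 1 / n2))
    else:
        se = math.sqrt(s1_sq / n1 + s2_sq / n2)
    return diff, se

# Made-up observations from two populations.
sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_b = [2.0, 4.0, 6.0]

diff, se_pooled = difference_and_standard_error(sample_a, sample_b)
_, se_unpooled = difference_and_standard_error(sample_a, sample_b, equal_variances=False)

print(diff, se_pooled, se_unpooled)
```

With unequal sample sizes, as in this made-up data, the two standard errors differ, so the choice of assumption matters.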
A researcher conducted a study to compare traits of identical and fraternal twins. She wanted to know whether the mean difference in twin heights was different for identical and fraternal twins. She recruited 30 identical twin pairs and 30 fraternal twin pairs to participate in the study. The difference in each pair's height was recorded and presented in Table 19.3.
- This study compares two population means. Explain why this statement is true.
- Estimate the mean and standard deviation for each population.
- Is it reasonable to assume the variance is the same for both populations? If so, estimate the variance common to both.
- Estimate the difference in the population means and find the standard error of this estimate.
- Is it reasonable to assume that the differences in twin heights are normally distributed for each population?
- To which populations may inference be drawn from this study?