Design of a Study: Sampling, Surveys, and Experiments Free Response Practice Problems for AP Statistics (page 2)
Review the following concepts if necessary:
- Samples for AP Statistics
- Sampling Bias for AP Statistics
- Statistical Significance for AP Statistics
- Experiments and Observational Studies for AP Statistics
- You are interested in the extent to which ingesting vitamin C inhibits getting a cold. You identify 300 volunteers, 150 of whom have been taking more than 1000 mg of vitamin C a day for the past month, and 150 of whom have not taken vitamin C at all during the past month. You record the number of colds during the following month for each group and find that the vitamin C group had significantly fewer colds. Is this an experiment or an observational study? Explain. What do we mean in this case when we say that the finding was significant?
- Design an experiment that employs a completely randomized design to study the question of whether or not taking large doses of vitamin C is effective in reducing the number of colds.
- A survey of physicians found that some doctors gave a placebo rather than an actual medication to patients who experience pain symptoms for which no physical reason can be found. If the pain symptoms were reduced, the doctors concluded that there was no real physical basis for the complaints. Do the doctors understand the placebo effect? Explain.
- Explain how you would use a table of random digits to help obtain a systematic sample of 10% of the names on an alphabetical list of voters in a community. Is this a random sample? Is it a simple random sample?
- The Literary Digest Magazine, in 1936, predicted that Alf Landon would defeat Franklin Roosevelt in the presidential election that year. The prediction was based on questionnaires mailed to 10 million of its subscribers and to names drawn from other public lists. Those receiving the questionnaires were encouraged to mail back their ballot preference. The prediction was off by 19 percentage points. The magazine received back some 2.3 million ballots from the 10 million sent out. What are some of the things that might have caused the magazine to be so wrong (the same techniques had produced accurate predictions for several previous elections)? (Hint: Think about what was going on in the world in 1936.)
- Interviewers, after the 9/11 attacks, asked a group of Arab Americans if they trust the administration to make efforts to counter anti-Arab activities. If the interviewer was of Arab descent, 42% responded "yes" and if the interviewer was of non-Arab descent, 55% responded "yes." What seems to be going on here?
- There are three classes of statistics at your school, each with 30 students. You want to select a simple random sample of 15 students from the 90 students as part of an opinion-gathering project for your social studies class. Describe a procedure for doing this.
- Question #1 stated, in part: "You are interested in the extent to which ingesting vitamin C inhibits getting a cold. You identify 300 volunteers, 150 of whom have been taking more than 1000 mg of vitamin C a day for the past month, and 150 of whom have not taken vitamin C at all during the past month. You record the number of colds during the following month for each group and find that the vitamin C group had significantly fewer colds." Explain the concept of confounding in the context of this problem and give an example of how it might have affected the finding that the vitamin C group had fewer colds.
- A shopping mall wants to know about the attitudes of all shoppers who visit the mall. On a Wednesday morning, the mall places 10 interviewers at a variety of places in the mall and asks questions of shoppers as they pass by. Comment on any bias that might be inherent in this approach.
- Question #2 asked you to design a completely randomized experiment for the situation presented in question #1. That is, you were to design an experiment that uses treatment and control groups to see whether the number of colds differs between those who take 1000 mg a day of vitamin C and those who don't take vitamin C. Question #8 asked you about possible confounding variables in this study. Given that you believe that both general health habits and use of vitamin C might explain a reduced number of colds, design an experiment to determine the effectiveness of vitamin C, taking into account general health habits. You may assume your volunteers vary in their history of vitamin C use.
- You have developed a weight-loss treatment that involves a combination of exercise and diet pills. The treatment has been effective with subjects who have used a regular dose of the pill of 200 mg, when exercise level is held constant. There is some indication that higher doses of the pill will promote even better results, but you are worried about side effects if the dosage becomes too great. Assume you have 400 overweight volunteers for your study, who have all been on the same exercise program, but who have not been taking any kind of diet pill. Design a study to evaluate the relative effects of 200 mg, 400 mg, 600 mg, and 800 mg daily dosage of the pill.
- You are going to study the effectiveness of three different SAT preparation courses. You obtain 60 high school juniors as volunteers to participate in your study. You want to assign each of the 60 students, at random, to one of the three programs. Describe a procedure for assigning students to the programs if
- you want there to be an equal number of students taking each course.
- you want each student to be assigned independently to a group. That is, each student should have the same probability of being in any of the three groups.
- A researcher wants to obtain a sample of 100 teachers who teach in high schools at various economic levels and has access to a list of teachers in several schools for each of the levels. She has identified four such economic levels (A, B, C, and D) that comprise 10%, 15%, 45%, and 30% of the schools in which the teachers work. Describe what is meant by a stratified random sample in this situation and discuss how she might obtain it.
- You are testing for sweetness in five varieties of strawberry. You have 10 plots available for testing, numbered 1–10. The 10 plots are arranged in two side-by-side groups of five. A river runs along the edge of one of the groups of five plots. [Diagram not reproduced: plots 1–5 form one group running north to south, and plots 6–10 form the group next to the river.]
- Look at problem #14 again. It is the following year, and you now have only two types of strawberries to test. Faced with the same physical conditions you had in problem 14, and given that you are concerned that differing soil conditions (as well as proximity to the river) might affect sweetness, how might you block the experiment to produce the most reliable results?
- A group of volunteers, who had never been in any kind of therapy, were randomly separated into two groups, one of which received an experimental therapy to improve self-concept. The other group, the control group, received traditional therapy. The subjects were not informed of which therapy they were receiving. Psychologists who specialize in self-concept issues evaluated both groups after training for self-concept, and the self-concept scores for the two groups were compared. Could this experiment have been double-blind? Explain. If it wasn't double-blind, what might have been the impact on the results?
- You want to determine how students in your school feel about a new dress code for school dances. One faction in the student council, call them group A, wants to word the question as follows: "As one way to help improve student behavior at school-sponsored events, do you feel that there should be a dress code for school dances?" Another group, group B, prefers, "Should the school administration be allowed to restrict student rights by imposing a dress code for school dances?" Which group do you think favors a dress code and which opposes it? Explain.
- A study of reactions to different types of billboard advertising is to be carried out. Two different types of ads (call them Type I and Type II) for each product will be featured on numerous billboards. The organizer of the campaign is concerned that communities representing different economic strata will react differently to the ads. The three communities where billboards will be placed have been identified as Upper Middle, Middle, and Lower Middle. Four billboards are available in each of the three communities. Design a study to compare the effectiveness of the two types of advertising taking into account the communities involved.
- In 1976, Shere Hite published a book entitled The Hite Report on Female Sexuality. The conclusions reported in the book were based on 3000 returned surveys from some 100,000 sent out to, and distributed by, various women's groups. The results were that women were highly critical of men. In what way might the author's findings have been biased?
- You have 26 women available for a study: Annie, Betty, Clara, Darlene, Edie, Fay, Grace, Helen, Ina, Jane, Koko, Laura, Mary, Nancy, Ophelia, Patty, Quincy, Robin, Suzy, Tina, Ulla, Vivien, Wanda, Xena, Yolanda, and Zoe. The women need to be divided into four groups for the purpose of the study. Explain how you could use a table of random digits to make the needed assignments.
Continuation of problem #14: You decide to control for the possible confounding effect of the river by planting one of each type of strawberry in plots 1–5 and one of each type in plots 6–10 (that is, you block to control for the river). Then, within each block, you randomly assign one type of strawberry to each of the five plots within the block. What is the purpose of randomization in this situation?
- It's an observational study because the researcher didn't provide a treatment, but simply observed different outcomes from two groups with at least one different characteristic. Participants self-selected into either the vitamin C group or the non-vitamin C group. To say that the finding was significant in this case means that the difference between the number of colds in the vitamin C group and in the non-vitamin C group was too great to attribute to chance; it appears that something besides random variation may have accounted for the difference.
- Identify 300 volunteers for the study, preferably none of whom have been taking vitamin C. Randomly split the group into two groups of 150 participants each. One group can be randomly selected to receive a set dosage of vitamin C each day for a month and the other group to receive a placebo. Neither the subjects nor those who administer the medication will know which subjects received the vitamin C and which received the placebo (that is, the study should be double-blind). During the month after the pills are given, count the number of colds within each group. Your measurement of interest is the difference in the number of colds between the two groups. Also, placebo effects often diminish over time.
- The doctors probably did not understand the placebo effect. We know that, sometimes, a real effect can occur even from a placebo. If people believe they are receiving a real treatment, they will often show a change. But without a control group, we have no way of knowing if the improvement would not have been even more significant with a real treatment. The difference between the placebo score and the treatment score is what is important, not one or the other.
- If you want 10% of the names on the list, you need every 10th name for your sample. Number the first ten names on the list 0, 1, 2,…, 9. Pick a random place to enter the table of random digits and note the first digit. The first person in your sample is the person among the first 10 on the list who corresponds to the digit chosen. Then pick every 10th name on the list after that name. This is a random sample to the extent that, before the first name was selected, every member of the population had an equal chance to be chosen. It is not a simple random sample because not all possible samples of 10% of the population are equally likely: adjacent names on the list, for example, could not both be part of the sample.
- This is an instance of voluntary response bias. This poll was taken during the depths of the Depression, and people felt strongly about national leadership. Those who wanted a change were more likely to respond than those who were more or less satisfied with the current administration. Also, at the height of the Depression, people who subscribed to magazines and were on public lists were more likely to be well-to-do and, hence, Republican (Landon was a Republican and Roosevelt was a Democrat).
- Almost certainly, respondents are responding in a way they feel will please the interviewer. This is a form of response bias—in this circumstance, people may just not give a truthful answer.
- Many different solutions are possible. One way would be to put the names of all 90 students on slips of paper and put the slips of paper into a box. Then draw out 15 slips of paper at random. The names on the paper are your sample. Another way would be to identify each student by a two-digit number 01, 02,…, 90 and use a table of random digits to select 15 numbers. Or you could use the randInt function on your calculator to select 15 numbers between 1 and 90 inclusive. What you cannot do, if you want it to be an SRS, is to employ a procedure that selects five students randomly from each of the three classes.
- Because the two groups were not selected randomly, it is possible that the smaller number of colds in the vitamin C group could be the result of some variable whose effects cannot be separated from the effects of the vitamin C. That would make this other variable a confounding variable. A possible confounding variable in this case might be that the group who takes vitamin C might be, as a group, more health conscious than those who do not take vitamin C. This could account for the difference in the number of colds but could not be separated from the effects of taking vitamin C.
- The study suffers from undercoverage of the population of interest, which was declared to be all shoppers at the mall. By restricting their interview time to a Wednesday morning, they effectively exclude most people who work. They essentially have a sample of the opinions of nonworking shoppers. There may be other problems with randomness, but without more specific information about how they gathered their sample, talking about it would only be speculation.
- We could first administer a questionnaire to all 300 volunteers to determine differing levels of health consciousness. For simplicity, let's just say that the two groups identified are "health conscious" and "not health conscious." Then you would block by "health conscious" and "not health conscious" and run the experiment within each block. That is, within each block, randomly assign the volunteers to a vitamin C group and a placebo group, and then compare the number of colds in the two groups. [Diagram of the experiment not reproduced.]
- Because exercise level seems to be more or less constant among the volunteers, there is no need to block for its effect. Furthermore, because the effects of a 200 mg dosage are known, there is no need to have a placebo (although you could)—the 200 mg dosage will serve as the control. Randomly divide your 400 volunteers into four groups of 100 each. Randomly assign each group to one of the four treatment levels: 200 mg, 400 mg, 600 mg, or 800 mg. The study can be and should be double-blind. After a period of time, compare the weight loss results for the four groups.
- Many answers are possible. One solution involves putting the names of all 60 students on slips of paper, then randomly selecting the papers. The first student goes into program 1, the next into program 2, etc. until all 60 students have been assigned.
- Use a random number generator to select integers from 1 to 3 (like randInt(1,3) on the TI-83/84), or use a table of random numbers, assigning each of the programs a range of values (such as 1–3, 4–6, 7–9, and ignore 0). Pick any student and generate a random number from 1 to 3. The student enters the program that corresponds to the number. In this way, the probability of a student ending up in any one group is 1/3, and the selections are independent. It would be unlikely to have the three groups come out completely even in terms of the numbers in each, but we would expect it to be close.
- In this situation, a stratified random sample would be a sample in which the proportion of teachers from each of the four levels is the same as that of the population from which the sample was drawn. That is, in the sample of 100 teachers, 10 should be from level A, 15 from level B, 45 from level C, and 30 from level D. For level A, she could accomplish this by taking an SRS of 10 teachers from a list of all teachers who teach at that level. SRSs of 15, 45, and 30 would then be obtained from each of the other lists.
- Remember that you block to control for the variables that might affect the outcome that you know about, and you randomize to control for the effect of those you don't know about. In this case, then, you randomize to control for any unknown systematic differences between the plots that might influence sweetness. An example might be that the plots on the northern end of the rows (plots 1 and 6) have naturally richer soil than those plots on the south side.
- The idea is to get plots that are most similar in order to run the experiment. One possibility would be to match the plots the following way: close to the river north (6 and 7); close to the river south (9 and 10); away from the river north (1 and 2); and away from the river south (4 and 5). This pairing controls for both the effects of the river and possible north–south differences that might affect sweetness. Within each pair, you would randomly select one plot to plant one variety of strawberry, planting the other variety in the other plot.
- The study could have been double-blind. The question indicates that the subjects did not know which treatment they were receiving. If the psychologists did not know which therapy the subjects had received before being evaluated, then the basic requirement of a double-blind study was met: neither the subjects nor the researchers who come in contact with them are aware of who is in the treatment and who is in the control group.
- Group A favors a dress code; group B does not. Both groups are hoping to bias the response in favor of their position by the way they have worded the question.
- You probably want to block by community since it is felt that economic status influences attitudes toward advertising. That is, you will have three blocks: Upper Middle, Middle, and Lower Middle. Within each, you have four billboards. Randomly select two of the billboards within each block to receive the Type I ads, and put the Type II ads on the other two. After a few weeks, compare the differences in reaction to each type of advertising within each block.
- With only 3000 of 100,000 surveys returned, voluntary response bias is most likely operating. That is, the 3000 women represented those who felt strongly enough (negatively) about men and were the most likely to respond. We have no way of knowing if the 3% who returned the survey were representative of the 100,000 who received it, but they most likely were not.
- Assign each of the 26 women a two-digit number, say 01, 02,…, 26. Then enter the table at a random location and read off two-digit numbers. Ignore numbers outside the 01–26 range and any number that has already come up. The first number chosen assigns the corresponding woman to the first group, the second to the second group, and so on, returning to the first group after the fourth, until all 26 have been assigned. This method roughly equalizes the numbers in the groups (not quite, because 4 doesn't go evenly into 26), but does not assign them independently.
If you wanted to assign the women independently, you would consider only the digits 1, 2, 3, or 4, which correspond to the four groups, and skip all other digits. As one of the women steps forward, the next usable random digit is identified, and that woman goes into the group that corresponds to the chosen digit. Proceed in this fashion until all 26 women are assigned to a group. This procedure yields independent assignments to groups, but the groups most likely will be somewhat unequal in size. In fact, with only 26 women, group sizes might be quite unequal (a TI-83/84 simulation of this produced 4 1s, 11 2s, 4 3s, and 7 4s).
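The two assignment procedures for the 26 women can be sketched in Python. This is only an illustration under stated assumptions: Python's `random` module stands in for the table of random digits, the function names are made up for this sketch, and the seed is arbitrary. The same two approaches apply to the SAT preparation problem (60 students, three programs).

```python
import random

WOMEN = [
    "Annie", "Betty", "Clara", "Darlene", "Edie", "Fay", "Grace", "Helen",
    "Ina", "Jane", "Koko", "Laura", "Mary", "Nancy", "Ophelia", "Patty",
    "Quincy", "Robin", "Suzy", "Tina", "Ulla", "Vivien", "Wanda", "Xena",
    "Yolanda", "Zoe",
]


def assign_in_rotation(names, n_groups, rng):
    """Put the names in random order, then deal them into groups 1..n_groups in turn.

    Group sizes differ by at most one, but the assignments are not independent.
    """
    order = list(names)
    rng.shuffle(order)  # stands in for reading names off the random digit table
    groups = {g: [] for g in range(1, n_groups + 1)}
    for position, name in enumerate(order):
        groups[position % n_groups + 1].append(name)
    return groups


def assign_independently(names, n_groups, rng):
    """Give each name its own random group number, independent of the others.

    Each group has probability 1/n_groups for every name; sizes may be unequal.
    """
    groups = {g: [] for g in range(1, n_groups + 1)}
    for name in names:
        groups[rng.randint(1, n_groups)].append(name)
    return groups


if __name__ == "__main__":
    rng = random.Random(2024)  # arbitrary seed, used only to make the run repeatable
    rotation = assign_in_rotation(WOMEN, 4, rng)
    independent = assign_independently(WOMEN, 4, rng)
    print("rotation sizes:   ", [len(rotation[g]) for g in rotation])
    print("independent sizes:", [len(independent[g]) for g in independent])
```

With 26 names and four groups, the rotation method always produces two groups of 7 and two of 6, while the independent method produces sizes that vary from run to run.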
The pairing described in the solution to problem #15 leaves plots 3 and 8 unassigned. One possibility is simply to leave them empty. Another possibility is to randomly assign each of them to one of the pairs it adjoins. That is, plot 3 could be randomly assigned to join either plot 2 or plot 4. Similarly, plot 8 would join either plot 7 or plot 9.
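The idea of randomizing separately within each block, used in the strawberry solutions, can be sketched in Python. This is a rough illustration, not a prescribed method: the function name, the block labels, and the variety labels (`V1` to `V5`, `A`, `B`) are hypothetical, and `random.shuffle` stands in for a table of random digits.

```python
import random


def randomize_within_blocks(blocks, treatments, rng):
    """Return {unit: treatment}, shuffling the treatments separately in each block."""
    layout = {}
    for units in blocks.values():
        if len(units) != len(treatments):
            raise ValueError("each block needs exactly one unit per treatment")
        shuffled = list(treatments)
        rng.shuffle(shuffled)  # a fresh random order for every block
        layout.update(zip(units, shuffled))
    return layout


# Problem 14: two blocks of five plots, five varieties (labels are hypothetical).
RIVER_BLOCKS = {
    "away from river": [1, 2, 3, 4, 5],
    "near river": [6, 7, 8, 9, 10],
}
FIVE_VARIETIES = ["V1", "V2", "V3", "V4", "V5"]

# Problem 15: four matched pairs of plots, two varieties (labels are hypothetical).
PLOT_PAIRS = {
    "near river, north": [6, 7],
    "near river, south": [9, 10],
    "away from river, north": [1, 2],
    "away from river, south": [4, 5],
}
TWO_VARIETIES = ["A", "B"]

if __name__ == "__main__":
    rng = random.Random(14)  # arbitrary seed for a repeatable run
    print(randomize_within_blocks(RIVER_BLOCKS, FIVE_VARIETIES, rng))
    print(randomize_within_blocks(PLOT_PAIRS, TWO_VARIETIES, rng))
```

The same function would handle the billboard problem by passing three community blocks of four billboards each and the treatment list `["I", "I", "II", "II"]`.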
If the study in problem #16 wasn't double-blind, it would be because the psychologists were aware of which subjects had which therapy. In this case, the attitudes of the psychologists toward the different therapies might influence their evaluations; for example, they might read more improvement into a therapy of which they approve.
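Three of the sampling procedures described in the solutions above (the SRS of 15 students from 90, the systematic 10% sample from the voter list, and the stratified sample of 100 teachers) can be sketched in Python. This is a minimal sketch under assumptions: Python's `random` module replaces the table of random digits or the calculator's randInt, the function names are made up for illustration, and the voter and teacher lists are hypothetical placeholders because the problems do not give them.

```python
import random


def simple_random_sample(population, n, rng):
    """Draw n distinct members so that every set of n is equally likely."""
    return rng.sample(list(population), n)


def systematic_sample(names, step, rng):
    """Pick a random start among the first `step` names, then take every step-th name."""
    start = rng.randrange(step)  # plays the role of the random digit 0-9 when step is 10
    return names[start::step]


def stratified_sample(strata, sizes, rng):
    """Take a separate SRS of the requested size from each stratum's list."""
    return {level: rng.sample(strata[level], sizes[level]) for level in sizes}


if __name__ == "__main__":
    rng = random.Random(7)  # arbitrary seed for a repeatable run

    # Problem 7: students numbered 1-90, sample of 15.
    print(sorted(simple_random_sample(range(1, 91), 15, rng)))

    # Problem 4: hypothetical alphabetical voter list of 200 names, every 10th name.
    voters = [f"Voter {i:03d}" for i in range(1, 201)]
    print(systematic_sample(voters, 10, rng))

    # Problem 13: hypothetical teacher lists of 60 names per level; 10/15/45/30 drawn.
    teachers = {level: [f"{level}-{i}" for i in range(1, 61)] for level in "ABCD"}
    sample = stratified_sample(teachers, {"A": 10, "B": 15, "C": 45, "D": 30}, rng)
    print({level: len(chosen) for level, chosen in sample.items()})
```

Note that `systematic_sample` can never return two adjacent names, which is why a systematic sample is not a simple random sample.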