Hypotheses, Prediction, and Regression
The following problems involve a hypothetical experiment in which the incidence of a mystery illness, Syndrome X, is scrutinized. Twenty groups of 100 people are selected according to various criteria. Then the percentage of people in each group who exhibit Syndrome X is recorded. Graphs are compiled, and the data is analyzed. This example, like all the others in this chapter, is based on real-world possibilities, but otherwise it's entirely make-believe.
Practice 1
Suppose we test 100 randomly selected people at each of 20 different latitudes (for a total of 2000 people in the experiment). We render the results as a scatter plot. Latitude is the independent variable. The percentage of people exhibiting Syndrome X is the dependent variable. The result is shown in Fig. 8-14. Someone states a hypothesis: "If you move closer to, or farther from, the equator, your risk of developing Syndrome X does not change." What sort of hypothesis is this? Someone else says, "Look at Fig. 8-14. Obviously, people who live close to the equator get Syndrome X more often than people who live far from the equator. I believe that if you move closer to the equator, your risk of developing Syndrome X will increase." What sort of hypothesis is this? A third person says, "The scatter plot shows that a greater proportion of people who live close to the equator have Syndrome X, as compared with people who live far from the equator. But this does not logically imply that if you move closer to the equator, then you run a greater risk of developing Syndrome X than if you stay here or move farther from the equator." What sort of hypothesis is this?
Fig. 8-14. Illustration for Practice 1, 2, 3, and 6.
Solution 1
The first hypothesis is an example of a null hypothesis. The second and third hypotheses are examples of alternative hypotheses.
Practice 2
Provide an argument that can be used to support the first hypothesis in Practice 1.
Solution 2
The first hypothesis is this: "If you move closer to, or farther from, the equator, your risk of developing Syndrome X does not change." The scatter plot of Fig. 8-14 shows that people who live close to the equator have Syndrome X in greater proportion than people who live far from the equator. But it is an oversimplification to say that the latitude of residence, all by itself, is responsible for Syndrome X. The syndrome might be preventable by taking precautions that most people who live near the equator don't know about, or in which they don't believe, or that their governments forbid them to take. If you live in Amsterdam and are knowledgeable about Syndrome X, you might adjust your lifestyle or take a vaccine so that, if you move to Singapore, you will bear no greater risk of contracting Syndrome X than you have now.
Practice 3
How can the first hypothesis in Practice 1 be tested?
Solution 3
In order to discover whether or not moving from one latitude to another affects the probability that a person will develop Syndrome X, it will be necessary to test a large number of people who have moved from various specific places to various other specific places. This test will be more complex and time-consuming than the original experiment. Additional factors will enter in, too. For example, we will have to find out how long each person has lived in the new location after moving, and how much traveling each person does (for example, in conjunction with employment). The extent, and not only the direction, of the latitude change will also have to be taken into account. Is there a difference between moving from Amsterdam to Singapore, as compared with moving from Amsterdam to Rome? Another factor is the original residence latitude. Is there a difference between moving from Rome to Singapore, as compared with moving from Amsterdam to Singapore?
Practice 4
Figure 8-15 shows a scatter plot of data for the same 20 groups of 100 people that have been researched in our hypothetical survey involving Syndrome X. But instead of the latitude in degrees north or south of the equator, the altitude, in meters above sea level, is the independent variable. What does this graph tell us?
Fig. 8-15. Illustration for Practice 4, 5, and 7.
Solution 4
It is difficult to see any correlation here. Some people might see a weak negative correlation between the altitude of a place above sea level and the proportion of the people exhibiting Syndrome X. But other people might see a weak positive correlation because of the points in the upper-right portion of the plot. A computer must be used to determine the actual correlation, and when it is found, it might turn out to be so weak as to be insignificant.
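The computer calculation mentioned in Solution 4 can be sketched as follows. This is a minimal illustration of the Pearson correlation coefficient, not the method used to prepare the figures. The function name `pearson_r` and the data values are assumptions made up for this example; they are not read from Fig. 8-15.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, NOT taken from Fig. 8-15:
# altitude in meters above sea level, and percent of each group with Syndrome X
altitudes_m = [0, 150, 300, 450, 600, 750, 900, 1050, 1200, 1350]
percent_with_x = [12, 30, 8, 25, 17, 5, 22, 14, 28, 9]

r = pearson_r(altitudes_m, percent_with_x)
print(f"r = {r:+.3f}")
```

A value of r near +1 or -1 indicates a strong correlation, and a value near 0 indicates a weak one. The sign tells whether the correlation is positive or negative, which is exactly what is hard to judge by eye in a plot such as Fig. 8-15.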
Practice 5
Suppose someone comes forward with a hypothesis: "If you move to a higher or lower altitude above sea level, your risk of developing Syndrome X does not change." What sort of hypothesis is this? Someone else says, "It seems to me that Fig. 8-15 shows a weak, but not a significant, correlation between altitude and the existence of Syndrome X in the resident population. But I disagree with you concerning the hazards involved with moving. There might be factors that don't show up in this data, even if the correlation is equal to 0; and one or more of these factors might drastically affect your susceptibility to developing Syndrome X if you move much higher up or lower down, relative to sea level." What sort of hypothesis is this?
Solution 5
The first hypothesis is a null hypothesis. The second hypothesis is an alternative hypothesis.
Practice 6
Estimate the position of the line of least squares for the scatter plot showing the incidence of Syndrome X versus the latitude north or south of the equator (Fig. 8-14).
Solution 6
Figure 8-16 shows a "good guess" at the line of least squares for the points in Fig. 8-14.
Fig. 8-16. Illustration for Practice 6.
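When the actual data points are available, the line of least squares need not be guessed; it can be computed. The sketch below shows one standard way to do this. The function name `least_squares_line` and the data values are assumptions invented for illustration; they are not the points plotted in Fig. 8-14.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line that minimizes the sum of
    squared vertical distances between the line and the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data, NOT taken from Fig. 8-14:
# latitude in degrees from the equator, and percent of each group with Syndrome X
latitudes_deg = [0, 10, 20, 30, 40, 50, 60, 70]
percent_with_x = [40, 37, 30, 28, 21, 18, 12, 9]

slope, intercept = least_squares_line(latitudes_deg, percent_with_x)
print(f"percent = {slope:.3f} * latitude + {intercept:.3f}")
```

With data that fall as latitude rises, as these made-up values do, the slope comes out negative, which corresponds to a line that ramps downward from left to right.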
Practice 7
Figure 8-17 shows a "guess" at a regression curve for the points in Fig. 8-15, based on the notion that the correlation is weak, but negative. Is this a "good guess"? If so, why? If not, why not?
Solution 7
Figure 8-17 is not a "good guess" at a regression curve for the points in Fig. 8-15. There is no such thing as a "good guess" here. The correlation is weak at best, and its nature is uncertain in the absence of computer analysis.
Fig. 8-17. Illustration for Practice 7.
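One way a computer analysis can indicate whether a weak correlation is significant is a permutation test: shuffle the dependent-variable values many times and count how often chance alone produces a correlation at least as strong as the observed one. The sketch below is only an illustration of that idea, not a procedure taken from this chapter. The function names, the number of trials, the random seed, and the data values are all assumptions; the data are not read from Fig. 8-15.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(xs, ys, trials=10000, seed=0):
    """Fraction of random shuffles of ys whose |r| is at least as large
    as the |r| of the data as observed (a two-sided test)."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials

# Hypothetical data, NOT taken from Fig. 8-15
altitudes_m = [0, 150, 300, 450, 600, 750, 900, 1050, 1200, 1350]
percent_with_x = [12, 30, 8, 25, 17, 5, 22, 14, 28, 9]

p = permutation_p_value(altitudes_m, percent_with_x)
print(f"r = {pearson_r(altitudes_m, percent_with_x):+.3f}, p = {p:.3f}")
```

A large fraction means that random scatter routinely produces a correlation this strong, so neither the sign nor the slope of a curve such as the one in Fig. 8-17 can be trusted. A small fraction (0.05 is a common threshold) suggests that the correlation is unlikely to be a product of chance alone.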
From Statistics Demystified: A Self-Teaching Guide. Copyright © 2004 by The McGraw-Hill Companies. All Rights Reserved.