One-Variable Data Analysis Free Response Practice Problems for AP Statistics (page 2)

By — McGraw-Hill Professional
Updated on Feb 5, 2011


  1. Using the calculator, we find that x̄ = 29.78, s = 11.94, Q1 = 21, Q3 = 37. Using the 1.5(IQR) rule, outliers are values that are less than 21 - 1.5(37 - 21) = -3 or greater than 37 + 1.5(37 - 21) = 61. Because no values lie outside of those boundaries, there are no outliers by this rule.
  2. Using the x̄ ± 2s rule, we have x̄ ± 2s = 29.78 ± 2(11.94) = (5.9, 53.66). By this standard, the year he hit 54 home runs would be considered an outlier.
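The two outlier rules above can be checked with a few lines of Python. This is only a sketch: the function names `iqr_fences` and `two_sd_interval` are made up for illustration, and the inputs are the summary statistics quoted in the answer, not the raw home run data.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier boundaries for the k*(IQR) rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def two_sd_interval(mean, s):
    """Return the interval mean - 2s to mean + 2s."""
    return mean - 2 * s, mean + 2 * s


# Summary statistics quoted in the answer above.
fence_low, fence_high = iqr_fences(21, 37)
sd_low, sd_high = two_sd_interval(29.78, 11.94)

best_season = 54  # the season singled out by the 2s rule
outlier_by_iqr = best_season < fence_low or best_season > fence_high
outlier_by_2s = best_season < sd_low or best_season > sd_high

print(fence_low, fence_high)
print(round(sd_low, 2), round(sd_high, 2))
print(outlier_by_iqr, outlier_by_2s)
```

The same season is flagged by one rule and not the other, which is why the answer names the rule each time.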

  3. (a) is a property of the standard normal distribution, not a property of normal distributions in general. (b) is a property of the normal distribution. (c) is not a property of the normal distribution: almost all of the terms are within four standard deviations of the mean but, at least in theory, there are terms at any given distance from the mean. (d) is a property of the normal distribution: the normal curve is the perfect bell-shaped curve. (e) is a property of the normal distribution and is the property that makes this curve useful as a probability density curve.
  4. [Figure: plot of the home run data grouped by 5s]
  5. What shows up when the data are grouped by 5s rather than by 10s is the gap between 42 and 52. In 16 out of 18 years, Mantle hit 42 or fewer home runs. He hit more than 50 only twice.

  6. x̄ = 76.4 and s = 10.17.
      z(84) = (84 - 76.4)/10.17 ≈ 0.75 and z(89) = (89 - 76.4)/10.17 ≈ 1.24
  7. Using the Standard Normal Probability table, a score of 84 corresponds to the 77.34th percentile, and a score of 89 corresponds to the 89.25th percentile. Both students were in the top quartile of scores after the program and performed better than all but one of the other students. We don't know that there is a cause-and-effect relationship between the pilot program and the high scores (that would require comparisons with a pretest), but it's reasonable to assume that the program had a positive impact. You might wonder how the student who got the 98 did so well!
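The table lookup above can be mimicked in Python. This sketch assumes the standard library's `statistics.NormalDist` as a stand-in for the Standard Normal Probability table; the helper name `table_percentile` is made up for illustration. Rounding z to two decimals imitates reading a printed table.

```python
from statistics import NormalDist

mean, s = 76.4, 10.17            # values quoted in the answer
standard_normal = NormalDist()   # mean 0, standard deviation 1


def table_percentile(score):
    """Standardize a score, round z to 2 places as a table would, and return (z, area to the left)."""
    z = round((score - mean) / s, 2)
    return z, standard_normal.cdf(z)


z84, p84 = table_percentile(84)
z89, p89 = table_percentile(89)

print(z84, round(p84, 4))
print(z89, round(p89, 4))
```

Without the rounding step the areas differ slightly in the third decimal place, which is the usual gap between table and calculator answers.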

  8. [Figure: graph of the scores]
  9. The most distinguishing feature is that the range (43) is quite large compared to the middle 50% of the scores (13). That is, we can see from the graph that the scores are packed somewhat closely about the median. The shape of a histogram of the data would be symmetric and mound shaped.

  10. The area to the left of z = 3.28 is 0.9995.
  11. There are 18 values in the stemplot. The median is 17 (it actually falls between the last two 7s in the row marked by the (5) in the count column of the plot, but it is still 17). Because there are 9 values in each half of the stemplot, the median of the lower half of the data, Q1, is the 5th score from the top. So, Q1 = 14. Q3 = the 5th score counting from the bottom = 24. Thus, IQR = 24 - 14 = 10.
  12. There are 3 values in the first bar, 6 in the second, 2 in the third, 9 in the fourth, and 5 in the fifth for a total of 25 values in the dataset. Of these, 3 + 6 + 2 = 11 are less than 3.5. There are 25 terms altogether, so the proportion of terms less than 3.5 is 11/25 = 0.44.
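The count above is easy to reproduce. A small sketch, assuming (as the answer does) that the first three bars are the ones that lie below 3.5:

```python
# Bar heights read from the histogram, left to right, as listed in the answer.
counts = [3, 6, 2, 9, 5]

# Assumption taken from the answer: the first three bars lie below 3.5.
below = sum(counts[:3])
total = sum(counts)
proportion = below / total

print(below, total, proportion)
```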
  13. With the exception of the one outlier for Bonds, the most obvious thing about these two is just how similar the two are. The medians of the two are almost identical and the IQRs are very similar. The data do not show it, but with the exception of 2001, the year Bonds hit 73 home runs, neither batter ever hit 50 or more home runs in a season. So, for any given season, you should be overjoyed to have either on your team, but there is no good reason to choose one over the other. However, if you based your decision on who had the most home runs in a single season, you would certainly choose Bonds.
  14. Let x be the value in question. Because we do not want to be in the top 20%, the area to the left of x is 0.8. Hence z_x = 0.84 (found by locating the nearest table entry to 0.8, which is 0.7995, and reading the corresponding z-score as 0.84). Then
      z_x = 0.84 = (x - 185)/25, so x = 185 + 0.84(25) = 206.
  15. [Using the calculator, the solution to this problem is given by invNorm(0.8, 185, 25).]
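For readers without a TI calculator, `statistics.NormalDist.inv_cdf` in Python's standard library plays the same role as invNorm. The sketch below compares it with the table-based value; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma = 185, 25  # mean and standard deviation from the invNorm call above

# Counterpart of invNorm(0.8, 185, 25): the value with 80% of the area to its left.
x_exact = NormalDist(mu, sigma).inv_cdf(0.8)

# Table-based version: nearest table z-score to an area of 0.8 is 0.84.
x_table = mu + 0.84 * sigma

print(round(x_exact, 2), round(x_table, 2))
```

The two answers differ by a few hundredths because the table only resolves z to two decimal places.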

  16. x̄ = $3.36 million, s = $1.88 million, Med = $3.35 million, IQR = $2.6 million. A boxplot of the data looks like this:
  17. [Figure: boxplot of the data]

    The fact that the mean and median are virtually the same, and that the boxplot shows that the data are more or less symmetric, indicates that either set of measures would be appropriate.

  18. The easiest way to do this is to use the calculator. Put the age data in L1 and the frequencies in L2. Then do 1-Var Stats L1,L2 (the calculator will read the second list as frequencies for the first list).
    • The mean is 2.48 years, and the median is 2 years. This indicates that the mean is being pulled to the right, and that the distribution is skewed to the right or has outliers in the direction of the larger values.
    • The standard deviation is 2.61 years. Because one standard deviation to the left of the mean would yield a negative value, this also indicates that the distribution extends further to the right than to the left.
    • A histogram of the data, drawn on the TI-83/84, is shown below. This definitely indicates that the distribution of the ages of these pennies is skewed to the right.
  19. [Figure: histogram of the penny ages]
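What 1-Var Stats L1,L2 does with a frequency list can be sketched in Python by repeating each value as many times as its frequency. The function name `one_var_stats` is made up, and the ages and counts below are hypothetical illustration data, NOT the penny data from the problem.

```python
from statistics import mean, median, stdev


def one_var_stats(values, freqs):
    """Summarize a frequency table by expanding it into a plain list first."""
    expanded = [v for v, f in zip(values, freqs) for _ in range(f)]
    return {
        "n": len(expanded),
        "mean": mean(expanded),
        "median": median(expanded),
        "s": stdev(expanded),  # sample standard deviation, like Sx on the calculator
    }


# Hypothetical ages (L1) and frequencies (L2), for illustration only.
ages = [0, 1, 2, 3, 10]
counts = [4, 3, 2, 1, 1]

stats = one_var_stats(ages, counts)
print(stats)

# A mean well above the median is the same right-skew signal described above.
print(stats["mean"] > stats["median"])
```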

  20. Since we don't know the shape of the distribution of coin values, we must use Chebyshev's rule to help us solve this problem. Let k = the number of standard deviations that 170 is above the mean. Then 130 + k · (15) = 170. So, k ≈ 2.67. Thus, at most 1/k² = 1/(2.67)² ≈ 0.14, or 14%, of the coins are valued at more than $170. Her requirement was that 15.5% of the coins must be valued at more than $170. Since at most 14% can be valued that highly, she should not buy the collection.
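The Chebyshev argument above reduces to two lines of arithmetic. A minimal sketch, with illustrative variable names and the numbers taken from the answer:

```python
mean, sd = 130, 15   # mean and standard deviation of the coin values
threshold = 170      # value the coins must exceed
required = 0.155     # proportion she requires above the threshold

# Number of standard deviations the threshold lies above the mean.
k = (threshold - mean) / sd

# Chebyshev: at most 1/k^2 of any distribution lies more than k
# standard deviations from the mean, so at most that much lies above 170.
chebyshev_bound = 1 / k**2

should_buy = chebyshev_bound >= required

print(round(k, 2), round(chebyshev_bound, 4), should_buy)
```

The bound covers both tails together, so it is a generous upper limit for the upper tail alone; even so it falls short of the requirement.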
  21. The new mean is 5(35 - 10) = 125.
  22. The new median is 5(33 - 10) = 115.

    The new variance is 5²(6²) = 900.

    The new standard deviation is 5(6) = 30.

    The new IQR is 5(12) = 60.
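The pattern in these answers is that measures of center follow the full transformation 5(x - 10), while measures of spread are affected only by the multiplier. A short sketch, assuming the original summary values implied by the answers above (mean 35, median 33, standard deviation 6, IQR 12); the function name `transform_summary` is made up.

```python
def transform_summary(mean, median, sd, iqr, shift=-10, scale=5):
    """Summary statistics after replacing each x with scale * (x + shift)."""
    return {
        "mean": scale * (mean + shift),      # center: shifted, then scaled
        "median": scale * (median + shift),  # center: shifted, then scaled
        "variance": scale**2 * sd**2,        # spread: scaled by scale squared
        "sd": abs(scale) * sd,               # spread: shift has no effect
        "iqr": abs(scale) * iqr,             # spread: shift has no effect
    }


new = transform_summary(mean=35, median=33, sd=6, iqr=12)
print(new)
```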

  23. First we need to find the proportion of women who would be less than 62'' tall:
      [Equation: z-score for a height of 62'' and the corresponding table area]
  24. So 0.1151 of the terms in the distribution would be less than 62''. This means that 0.1151(300) = 34.53, so you would expect that 34 or 35 of the women would be less than 62'' tall.

  25. (a), (c), and (d) are properties of the standard deviation. (a) serves as a definition of the standard deviation. For (c), the standard deviation is independent of the number of terms in the distribution in the sense that simply adding more terms will not necessarily increase or decrease s. (d) is another way of saying that the standard deviation is independent of the mean; it's a measure of spread, not a measure of center.
  26. The standard deviation is not resistant to extreme values (b) because it is based on the mean, not the median. (e) is a statement about the interquartile range. In general, unless we know something about the curve, we don't know what proportion of terms are within 2 standard deviations of the mean.

  27. For these data, Q1 = $2.3 million, Q3 = $4.9 million. To be an outlier, Erick would need to make at least 4.9 + 1.5(4.9 - 2.3) = $8.8 million. In other words, he would need a $2.6 million raise for his salary to be an outlier.
  28. You need to estimate the median and the quartiles. Note that the histogram is skewed to the left, so that the scores tend to pack to the right. This means that the median is to the right of center and that the boxplot would have a long whisker to the left. The boxplot looks like this:
  29. [Figure: boxplot with a long left whisker]

  30. If you standardize both scores, you can compare them on the same scale. Accordingly,
  31. [Equations: standardized scores for the two tests]

    Nathan did slightly, but only slightly, better on the second test.
