Measures of Center for AP Statistics
Practice problems for these concepts can be found at:
- One-Variable Data Analysis Multiple Choice Practice Problems for AP Statistics
- One-Variable Data Analysis Free Response Practice Problems for AP Statistics
- One-Variable Data Analysis Review Problems for AP Statistics
- One-Variable Data Analysis Rapid Review for AP Statistics
In the last example of the previous section, we said that the graph appeared to be centered about a height of 66". In this section, we talk about ways to describe the center of a distribution. There are two primary measures of center: the mean and the median. There is a third measure, the mode, but it tells where the most frequent values occur much more than it describes the center. In some distributions, the mean, median, and mode will be close in value, but the mode can appear at any point in the distribution.
Let $x_i$ represent any value in a set of n values (i = 1, 2, …, n). The mean of the set is defined as the sum of the x's divided by n. Symbolically, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$. Usually, the indices on the summation symbol in the numerator are left out and the expression is simplified to $\bar{x} = \frac{\sum x}{n}$.
$\sum x$ means "the sum of x" and is defined as follows: $\sum x = x_1 + x_2 + \dots + x_n$. Think of it as the "add-'em-up" symbol to help remember what it means. $\bar{x}$ is used for a mean based on a sample (a statistic). In the event that you have access to an entire distribution (such as in Chapters 9 and 10), its mean is symbolized by the Greek letter μ.
(Note: in the previous chapter, we made a distinction between statistics, which are values that describe sample data, and parameters, which are values that describe populations. Unless we are clear that we have access to an entire population, or that we are discussing a distribution, we use the statistics rather than parameters.)
example: During his major league career, Babe Ruth hit the following number of home runs (1914–1935): 0, 4, 3, 2, 11, 29, 54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22, 6. What was the mean number of home runs per year for his major league career?
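The arithmetic for this example can be checked with a short Python sketch. The variable names (`home_runs`, `total`, `x_bar`) are illustrative choices, not part of the original text; the data are the 22 season totals listed above.

```python
# Babe Ruth's home runs per season, 1914-1935, as listed in the example
home_runs = [0, 4, 3, 2, 11, 29, 54, 59, 35, 41, 46, 25,
             47, 60, 54, 46, 49, 46, 41, 34, 22, 6]

n = len(home_runs)        # number of seasons
total = sum(home_runs)    # the "add-'em-up" step: sum of all x values
x_bar = total / n         # sample mean = (sum of x) / n

print(n, total, round(x_bar, 2))  # 22 714 32.45
```

So the mean is 714/22, or about 32.45 home runs per year.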
The median of an ordered dataset is the "middle" value in the set. If the dataset has an odd number of values, the median is a member of the set and is the middle value. If there are 3 values, the median is the second value. If there are 5, it is the third, etc. If the dataset has an even number of values, the median is the mean of the two middle numbers. If there are 4 values, the median is the mean of the second and third values. In general, if there are n values in the ordered dataset, the median is at the $\frac{n+1}{2}$ position. If you have 28 terms in order, you will find the median at the $\frac{28+1}{2}$ = 14.5th position (that is, between the 14th and 15th terms). Be careful not to interpret $\frac{n+1}{2}$ as the value of the median rather than as the location of the median.
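To keep the location of the median separate from its value, here is a minimal Python sketch of the position rule described above. The function name `median_by_position` and the sample lists are hypothetical, chosen only for illustration.

```python
def median_by_position(values):
    """Return (position, median) using the (n + 1) / 2 location rule."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    position = (n + 1) / 2          # 1-based location; ends in .5 when n is even
    lower = ordered[(n - 1) // 2]   # value just at or below that location
    upper = ordered[n // 2]         # value just at or above that location
    return position, (lower + upper) / 2

print(median_by_position([7, 1, 4]))        # (2.0, 4.0): 2nd of 3 values
print(median_by_position([1, 3, 5, 100]))   # (2.5, 4.0): mean of 2nd and 3rd

# 28 made-up values: 10, 20, ..., 280
position, value = median_by_position(range(10, 290, 10))
print(position, value)                      # 14.5 145.0
```

In the last call the position is 14.5 but the median itself is 145, which is the distinction the paragraph warns about.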
example: Consider once again the data in the previous example from Babe Ruth's career. What was the median number of home runs per year he hit during his major league career?
solution: First, put the numbers in order from smallest to largest: 0, 2, 3, 4, 6, 11, 22, 25, 29, 34, 35, 41, 41, 46, 46, 46, 47, 49, 54, 54, 59, 60. There are 22 scores, so the median is found at the 11.5th position, between the 11th and 12th scores (35 and 41). So the median is $\frac{35 + 41}{2} = 38$.
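This solution can be verified with Python's standard `statistics` module. The sketch below is only a check on the hand calculation; the variable names are illustrative.

```python
from statistics import median

# Babe Ruth's home runs per season, 1914-1935, in the original (unsorted) order
home_runs = [0, 4, 3, 2, 11, 29, 54, 59, 35, 41, 46, 25,
             47, 60, 54, 46, 49, 46, 41, 34, 22, 6]

ordered = sorted(home_runs)
median_position = (len(ordered) + 1) / 2   # location of the median: 11.5

# The 11th and 12th ordered values sit at 0-based indices 10 and 11
print(median_position, ordered[10], ordered[11])  # 11.5 35 41
print(median(home_runs))                          # 38.0
```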
The 1-Var Stats procedure, described in the previous Calculator Tip box, will give you the median if you scroll down to the second screen of output. It appears as part of the five-number summary of the data: minimum, lower quartile, median, upper quartile, and maximum.
Although the mean and median are both measures of center, the choice of which to use depends on the shape of the distribution. If the distribution is symmetric and mound-shaped, the mean and median will be close. However, if the distribution has outliers or is strongly skewed, the median is probably the better choice to describe the center. This is because the median is a resistant statistic, one whose numerical value is not dramatically affected by extreme values, while the mean is not resistant.
example: A group of five teachers in a small school have salaries of $32,700, $32,700, $38,500, $41,600, and $44,500. The mean and median salaries for these teachers are $38,000 and $38,500, respectively. Suppose the highest paid teacher gets sick, and the school superintendent volunteers to substitute for her. The superintendent's salary is $174,300. If you replace the $44,500 salary with the $174,300 one, the median doesn't change at all (it's still $38,500), but the new mean is $63,960. Almost everybody is below average if, by "average," you mean mean. It's sort of like Lake Wobegon, where all of the children are expected to be above average.
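The effect of the single extreme salary can be recomputed directly from the five salaries listed in the example. This is a minimal sketch using Python's standard `statistics` module; the list names are illustrative.

```python
from statistics import mean, median

# Salaries of the five teachers, as listed in the example
teachers = [32_700, 32_700, 38_500, 41_600, 44_500]

# Same group with the $44,500 salary replaced by the superintendent's $174,300
with_superintendent = [32_700, 32_700, 38_500, 41_600, 174_300]

print(mean(teachers), median(teachers))                        # 38000 38500
print(mean(with_superintendent), median(with_superintendent))  # 63960 38500

# Count how many salaries fall below the new mean
below = sum(s < mean(with_superintendent) for s in with_superintendent)
print(below)  # 4
```

The median is unchanged by the replacement, while the mean rises by more than $25,000 and ends up above four of the five salaries.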
example: For the graph given below, would you expect the mean or median to be larger? Why?
solution: You would expect the median to be larger than the mean. Because the graph is skewed to the left, and the mean is not resistant, you would expect the mean to be pulled to the left. In fact, the dataset from which this graph was drawn has a mean of 5.4 and a median of 6, as expected given the skewness.
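The same pull can be seen in a small numerical sketch. The values below are hypothetical, made up to have a long left tail; they are not the data behind the graph in the example.

```python
from statistics import mean, median

# Hypothetical left-skewed data: most values cluster at 6-8, with a tail toward 1
values = [1, 3, 5, 6, 6, 6, 7, 7, 8]

m = mean(values)
med = median(values)

print(round(m, 2), med)  # 5.44 6
print(m < med)           # True: the low tail drags the mean below the median
```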