
Descriptive Measures Other Specifications Help

By McGraw-Hill Professional
Updated on Sep 12, 2011

Other Specifications—Range

There are additional descriptive measures that can be used to describe the characteristics of data. Let's look at the definitions of some of them.

Range

In a data set, or in any contiguous ("all-of-a-piece") interval in that set, the range is the difference between the largest value and the smallest value in the set or interval.

In the graph of hypothetical blood-pressure test results (Fig. 4-1), the lowest systolic pressure in the data set is 60, and the highest is 160. Therefore, the range is the difference between these two values, or 100. It's possible that a few of the people tested have pressures lower than 60 or higher than 160, but their readings have been, in effect, thrown out of the data set.
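The calculation can be sketched in a few lines of Python. This is only an illustration: the function name `data_range` and the list of readings are invented here, chosen so that the lowest value is 60 and the highest is 160, as in the example above.

```python
def data_range(values):
    """Return the largest value minus the smallest value."""
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Hypothetical systolic readings, made up for this sketch (not taken from Fig. 4-1).
readings = [60, 85, 110, 120, 135, 160]

pressure_range = data_range(readings)
print(pressure_range)  # 100
```

Only the two extreme values matter here, so any readings between 60 and 160 would give the same result.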

In the 40-question test we've examined so often in this chapter, the lowest score is 0, and the highest score is 40. Therefore, the range is 40. We might want to restrict our attention to the range of some portion of the scores, for example the second-lowest 25% of them. This range can be determined from Table 4-4; it is equal to 24 – 17, or 7. Note that the word "range" in this context has a different meaning from the word "range" at the top of the left-hand column of Table 4-4.

Intervals by Element Quantity

Coefficient of Variation

Do you remember the definitions of the mean (μ) and the standard deviation (σ)? Let's review them briefly. There's an important specification that can be derived from them.

In a normal distribution, such as the one that shows the results of our hypothetical blood-pressure data-gathering adventure, the mean is the value (in this case the blood pressure) such that the area under the curve is equal on either side of a vertical line corresponding to that value.

In tabulated data for discrete elements, the mean is the arithmetic average of all the results. If we have results $\{x_1, x_2, x_3, \ldots, x_n\}$ whose mean is μ, then the standard deviation is

    $$\sigma = \left\{ \frac{1}{n} \left[ (x_1 - \mu)^2 + (x_2 - \mu)^2 + \cdots + (x_n - \mu)^2 \right] \right\}^{1/2}$$
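As a rough sketch, the formula above translates directly into Python. The function names and the sample scores are assumptions made for this illustration; the formula divides by n, so this is the population standard deviation (the same quantity that the standard library's `statistics.pstdev` computes).

```python
import math


def mean(values):
    """Arithmetic average of all the results."""
    return sum(values) / len(values)


def population_std(values):
    """Square root of the average squared distance from the mean."""
    mu = mean(values)
    squared_deviations = [(x - mu) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / len(values))


# Hypothetical test scores, invented for this sketch.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(scores)               # 5.0
sigma = population_std(scores)  # 2.0
print(mu, sigma)
```

For these scores the squared deviations add up to 32, and 32/8 = 4, whose square root is 2.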

The mean is a measure of central tendency or "centeredness." The standard deviation is a measure of dispersion or "spread-outedness." Suppose we want to know how spread out the data is relative to the mean. We can get an expression for this if we divide the standard deviation by the mean. This gives us a quantity known as the coefficient of variation, which is symbolized as CV. Mathematically, the CV is:

    CV = σ/μ

The standard deviation and the mean are both expressed in the same units, such as systolic blood pressure or test score. Because of this, when we divide one by the other, the units cancel each other out, so the CV doesn't have any units. A number with no units associated with it is known as a dimensionless quantity.

Because the CV is dimensionless, it can be used to compare the "spread-outedness" of data sets that describe vastly different things, such as blood pressures and test scores. A large CV means that data is relatively spread out around the mean. A small CV means that data is concentrated closely around the mean. In the extreme, if CV = 0, all the data values are the same, and are exactly at the mean. Figure 4-6 shows two distributions in graphical form, one with a fairly low CV, and the other with a higher CV.
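A short Python sketch can make the comparison concrete. The two data sets below are invented for illustration and are not the distributions of Fig. 4-6; they were chosen to have the same standard deviation but different means. The helper `coefficient_of_variation` is a name assumed here, built on the standard library's `statistics.fmean` and `statistics.pstdev`.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (assumes mean != 0)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return sigma / mu


# Hypothetical data: same spread in absolute terms, different means.
pressures = [110, 115, 120, 125, 130]  # mean 120
scores = [10, 15, 20, 25, 30]          # mean 20

cv_pressures = coefficient_of_variation(pressures)
cv_scores = coefficient_of_variation(scores)

print(f"CV of pressures: {cv_pressures:.3f}")
print(f"CV of scores:    {cv_scores:.3f}")
```

Both sets have the same standard deviation, but the scores have the larger CV because that spread is large compared with a mean of 20 and small compared with a mean of 120.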

[Fig. 4-6: two distributions, one with a fairly low CV and one with a higher CV]

There is one potential difficulty with the above formula. Have you guessed it? If you wonder what happens in a distribution where the data can attain either positive or negative values (for example, temperatures in degrees Celsius), your concern is justified. If μ = 0 (the freezing point of water on the Celsius temperature scale), there's a problem: the formula calls for division by 0, so the CV is undefined. This trouble can be avoided by changing the units in which the data is specified, so that 0 doesn't occur within the set of possible values. When expressing temperatures, for example, we could use the Kelvin scale, on which all temperature readings are above 0, rather than the Celsius scale.

In a situation where all the elements in a data set are equal to 0, such as would happen if a whole class of students turns in blank papers on a test, the CV is undefined because the mean really is equal to 0.
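The zero-mean problem can be sketched in Python as follows. The function `safe_cv`, its choice to return `None` when the mean is 0, and the temperature readings are all assumptions made for this illustration.

```python
import statistics


def safe_cv(values):
    """Return sigma/mu, or None when the mean is 0 and the CV is undefined."""
    mu = statistics.fmean(values)
    if mu == 0:
        return None  # dividing by a zero mean is not possible
    return statistics.pstdev(values) / mu


# Hypothetical temperatures in degrees Celsius whose mean is exactly 0.
celsius = [-10.0, -5.0, 0.0, 5.0, 10.0]

# The same temperatures on the Kelvin scale, where every reading is above 0.
kelvin = [c + 273.15 for c in celsius]

# A whole class turning in blank papers: every score is 0.
blank_papers = [0, 0, 0, 0]

print(safe_cv(celsius))       # None
print(safe_cv(kelvin))        # a small positive number
print(safe_cv(blank_papers))  # None
```

Returning `None` is just one way to signal an undefined result; raising an exception would be an equally reasonable design.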
