Dotplots and Stem-and-Leaf Plots Study Guide
Introduction to Dotplots and Stem-and-Leaf Plots
When numerical data are collected, we no longer have the counts for each category as we did for categorical data. Each observation is a number. We want to be able to display these numbers in a way that will provide more insight into the data. Displaying numerical data using dotplots and stem-and-leaf plots is the focus of this lesson.
When numerical data are collected during a study, it is often difficult to understand what the numbers mean by simply looking at them. Recall the 62-member orchestra that we discussed in the beginning of Lesson 3. In addition to gender, the height (in inches) of each member was recorded. The heights and genders of the orchestra members are given in Table 4.1.
Although we have all 62 heights of the orchestra members, it is difficult to describe the basic characteristics of the population distribution by looking only at the values. In this lesson, we will first look at some ways to display the data graphically. In the next lesson, we will consider some common measures used to summarize these population characteristics, beginning with measures of central tendency and measures of dispersion. Remember: The orchestra is the population of interest. Because we are summarizing information from the whole population, these measures are parameters. Later, we will consider how to estimate these values from a sample.
A dotplot is a simple way to present numerical data when the data set is reasonably small. To construct a dotplot, complete the following steps:
- Draw a horizontal line and mark it with a measurement scale that extends at least as low as the smallest value in the data set and as high as the largest value in the data set.
- For each observation in the data set, locate the value on the measurement scale and represent it by a dot. If two or more observations have the same value, stack the dots vertically.
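The two steps above can be sketched in a few lines of Python. This is a minimal text-only illustration, not a plotting-library recipe: the function name text_dotplot, the step parameter, and the sample heights are assumptions made for the example and are not the values from Table 4.1.

```python
from collections import Counter

def text_dotplot(values, step=1.0):
    """Return text rows forming a stacked dotplot (hypothetical helper).

    Each value is placed at the nearest multiple of `step` on the scale,
    so equal observations stack vertically in the same column.
    """
    # Step 1: the scale runs from the smallest to the largest value.
    cols = [round(v / step) for v in values]
    lo, hi = min(cols), max(cols)

    # Step 2: one dot per observation; repeated values stack upward.
    counts = Counter(cols)
    rows = []
    for level in range(max(counts.values()), 0, -1):
        row = "".join("." if counts.get(c, 0) >= level else " "
                      for c in range(lo, hi + 1))
        rows.append(row.rstrip())

    rows.append("-" * (hi - lo + 1))                # horizontal axis
    rows.append(f"{lo * step:g} to {hi * step:g}")  # scale endpoints
    return rows

# Hypothetical heights in inches (NOT the data from Table 4.1).
heights = [62, 64, 64, 65, 65, 65, 67, 70]
for line in text_dotplot(heights):
    print(line)
```

With these made-up heights, the three observations at 65 form the tallest stack, and the empty columns between 67 and 70 show up as a gap on the axis.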
A dotplot of the orchestra members' heights is shown in Figure 4.1.
It is interesting to notice that we have gaps in the data. The member who is 53.5 inches tall is quite a bit shorter than the next shortest orchestra member. Similarly, the member who is 83.8 inches tall appears to be several inches taller than the next tallest orchestra member.
Because females tend to be shorter, on average, than males by the time they are in high school, it may be helpful to compare the distributions of female- and male-member heights. An effective way of doing this is to construct parallel dotplots. Here, parallel lines are drawn for each gender. The same scale is used for both lines, and the lines are labeled. Then dotplots are constructed for each group as in Figure 4.2.
From the parallel dotplots shown in Figure 4.2, we see that, although some overlap exists, the males tend to be taller than the females. The shortest person is a female, and the tallest person is a male.
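Parallel dotplots depend on every group sharing one scale. The sketch below illustrates that idea in plain text, assuming each group is non-empty; the function name parallel_dotplots and the heights are hypothetical and do not come from Table 4.1.

```python
from collections import Counter

def parallel_dotplots(groups, step=1.0):
    """Render one text dotplot per group on a shared scale (hypothetical helper).

    `groups` maps a label to a non-empty list of numbers.
    """
    # The shared scale must cover the smallest and largest value overall.
    all_cols = [round(v / step) for vals in groups.values() for v in vals]
    lo, hi = min(all_cols), max(all_cols)

    lines = []
    for label, vals in groups.items():
        counts = Counter(round(v / step) for v in vals)
        lines.append(f"{label}:")                   # label each line
        for level in range(max(counts.values()), 0, -1):
            row = "".join("." if counts.get(c, 0) >= level else " "
                          for c in range(lo, hi + 1))
            lines.append(row.rstrip())
        lines.append("-" * (hi - lo + 1))           # same axis length for all
    lines.append(f"{lo * step:g} to {hi * step:g}")  # common scale endpoints
    return lines

# Hypothetical heights in inches, grouped by gender (NOT the real data).
sample = {"Female": [60, 62, 62, 64], "Male": [63, 66, 66, 70]}
for line in parallel_dotplots(sample):
    print(line)
```

Because both axes span the same range, a dot's horizontal position means the same thing in each group, which is what makes the visual comparison valid.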
A study was conducted to determine whether a person's "blinking rate" was, on average, different when playing video games than when engaged in conversation. Fifteen high school students were randomly selected from a large high school for participation in the study. The order in which treatments (video games and conversation) were applied to each individual was determined by the flip of a coin. The investigators were concerned that the study participants would, perhaps unconsciously, alter their blinking rate if they knew that was what was being observed. To avoid this, the study participants were told that swallow patterns were being studied and that the hypothesis was as follows: People will swallow less when playing a game or reading a book because they will produce less saliva due to the fight-or-flight reflex. Each participant was videotaped, and the number of blinks in a two-minute time interval was recorded for each treatment.
To illustrate how the study was conducted, suppose the first person to be tested was George. It was decided that if the coin landed heads-up, George would play the video game first; otherwise, the first treatment would be normal conversation. The coin was flipped, and the upper face was tails. George was engaged in normal conversation for about ten minutes. During this time, he was videotaped. The investigators wanted to pick a time period during the middle of the conversation to count the blink rate, so they counted the number of blinks in the fourth and fifth minutes of conversation. George was then given a 30-minute break. At the end of the break, he began playing a video game for ten minutes, during which time he was videotaped. The number of blinks in the fourth and fifth minutes was recorded. This process was repeated for each study participant. The results are shown in Table 4.2.
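The coin-flip rule for treatment order can be mimicked with a pseudorandom number generator. The sketch below is only an illustration of the rule described above (heads means the video game comes first); the function name assign_orders, the seed parameter, and the participant names other than George are assumptions.

```python
import random

def assign_orders(participants, seed=None):
    """Assign a treatment order to each participant by a simulated coin flip.

    Heads: video game first. Tails: conversation first.
    `seed` is optional and only makes the simulation repeatable.
    """
    rng = random.Random(seed)
    orders = {}
    for name in participants:
        heads = rng.random() < 0.5          # a fair coin
        if heads:
            orders[name] = ("video game", "conversation")
        else:
            orders[name] = ("conversation", "video game")
    return orders

# "Ana" and "Luis" are hypothetical participants added for illustration.
for name, order in assign_orders(["George", "Ana", "Luis"], seed=1).items():
    print(name, "->", " then ".join(order))
```

Every participant still receives both treatments; only the order is left to chance, which keeps any order effect from systematically favoring one treatment.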
Based on what we learned in Lesson 2, this is an experiment with a broad scope of inference. It is an experiment because the treatments (that is, playing a video game and engaging in normal conversation) are randomly assigned. In this case, the random assignment is the order in which the two are applied. The population is students in the large high school. Because the study participants were randomly selected from the population, conclusions can be made for all students in the high school; that is, a broad scope of inference exists. Can inference be drawn to all high school students? Perhaps, but that is not clear. If we assume that these high school students are representative of all high school students, then yes. However, these students may differ in some manner that would affect their blink rate while playing video games or during normal conversation. If so, then we could draw erroneous conclusions if we tried to extend the results to all high school students.
The manner in which the study was conducted led to paired data. A pair of observations, one taken during normal conversation and the other taken while playing video games, is collected for each study participant. Because the objective of the study is to compare the two treatments, the difference in each pair of observations, representing the difference in the two treatments for that person, should be computed and these differences plotted. Most of us like to work with positive numbers more than negative numbers. Looking at the data, the number of blinks during normal conversation tends to be larger than the number of blinks while playing a video game, so we will compute the number of blinks during normal conversation minus the number of blinks while playing video games for each study participant.
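The difference for each pair is a single subtraction per participant, always in the same direction (conversation minus video game). The sketch below shows the computation; the function name paired_differences and the blink counts are hypothetical and are not the values from Table 4.2.

```python
def paired_differences(conversation, video_game):
    """Return conversation minus video-game blink counts, pair by pair.

    Both lists must be in the same participant order.
    """
    if len(conversation) != len(video_game):
        raise ValueError("each participant needs one value per treatment")
    return [c - v for c, v in zip(conversation, video_game)]

# Hypothetical two-minute blink counts for four participants.
conversation = [34, 41, 29, 38]
video_game = [20, 25, 31, 22]

diffs = paired_differences(conversation, video_game)
print(diffs)  # [14, 16, -2, 16]
```

A negative difference, such as the third one here, marks a participant who blinked more while playing the video game than during conversation. The resulting list of differences is what would then be plotted.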