Education.com

Describing and Displaying Bivariate Data Study Guide (page 3)

Updated on Oct 5, 2011

Example

  1. Find the z-scores associated with the drop heights and the rebound heights of the dropped basketball.
  2. Find Pearson's correlation coefficient for these data.
  3. Relate the correlation coefficient r to the graph.

Solution

  1. Many calculators have a built-in function that computes the correlation coefficient. We will not use that function here; instead, we demonstrate a way to organize the computations required to find r when such a function key is not available. First, we need the sample mean and sample standard deviation for the drop height and for the rebound height. The sample mean for the drop height is 42 inches, and the sample standard deviation is 19.2678489 inches. The sample mean for the rebound height is 22.9090909 inches, and the sample standard deviation is 11.4390057 inches. Notice that we are carrying all of the decimal places at this point to reduce the effect of rounding error on our value for r. If we were to report these values, we would round to about one decimal place. Using the sample means and sample standard deviations, we find the z-scores for drop height and rebound height for each observation, as well as the product of the two z-scores. These are presented in Table 8.3.

    The zDrop values sum to zero, and so do the zRebound values. Standardizing subtracts the sample mean from each observation, so the z-scores have a mean of zero and therefore a total of zero.

  2. Pearson's r is found by totaling the products of the two z-scores and dividing by n − 1; that is, r = Σ(zDrop × zRebound)/(n − 1), where the sum runs over all n observations.

Table 8.3 z-scores for drop height and rebound height

  3. This high value of r, together with the scatter plot of the data, allows us to conclude that a line would provide a good model for these data, at least in the range of drop heights considered in this study. It is important to look at both the scatter plot and r, not just r by itself, before concluding that a relationship is linear, as the next example demonstrates.
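The tabular z-score computation described in this solution can be sketched in a few lines of Python. This is only an illustration: the function names z_scores and pearson_r_from_z are hypothetical, and the five (drop, rebound) pairs are made-up numbers, not the basketball measurements from the study.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values with the sample mean and sample standard deviation."""
    m, s = mean(values), stdev(values)  # stdev divides by n - 1
    return [(v - m) / s for v in values]


def pearson_r_from_z(xs, ys):
    """Return (r, rows), where each row is (z_x, z_y, z_x * z_y)."""
    zx, zy = z_scores(xs), z_scores(ys)
    rows = [(a, b, a * b) for a, b in zip(zx, zy)]
    # r is the total of the products divided by n - 1.
    r = sum(product for _, _, product in rows) / (len(xs) - 1)
    return r, rows


# Made-up illustrative data (inches), NOT the data behind Table 8.3.
drop = [12.0, 24.0, 36.0, 48.0, 60.0]
rebound = [6.5, 13.0, 20.5, 26.0, 33.5]

r, rows = pearson_r_from_z(drop, rebound)
print(" z_drop   z_rebound  product")
for zx, zy, product in rows:
    print(f"{zx:8.4f} {zy:10.4f} {product:8.4f}")
print("sum of z_drop    =", round(sum(row[0] for row in rows), 10))
print("sum of z_rebound =", round(sum(row[1] for row in rows), 10))
print("r =", round(r, 4))
```

The two z-score columns should total zero (up to floating-point error), which is a quick check that the standardization was done correctly before r is read off the product column.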

Because the rebound heights come from a sample, Pearson's r is an estimate of the population correlation coefficient rho, or ρ. The population correlation coefficient has the same basic properties as r. However, it is important to remember that ρ does not change; it is a population characteristic. In contrast, r is based on a sample. If another sample is drawn from the same population, the value of r is very likely to change. Each sample provides an estimate of ρ. In the example of dropping a basketball, we have a sample of size n = 33. We can use the data to estimate the population correlation. If we were to conduct the study again, we would undoubtedly obtain a similar but different value of r, which would also be an estimate of the population correlation.
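A small simulation can make the distinction between ρ and r concrete. The sketch below assumes a population in which both variables are standard normal with ρ = 0.9; that value, the seed, and the helper names are assumptions chosen for illustration and are not taken from the basketball study. It draws 200 samples of size n = 33 and computes r for each.

```python
import random
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson's r as the sum of z-score products divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    total = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def draw_sample(rho, n, rng):
    """Draw n (x, y) pairs from a bivariate normal population with correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Mixing x with independent noise gives y unit variance and corr(x, y) = rho.
        y = rho * x + (1 - rho ** 2) ** 0.5 * noise
        xs.append(x)
        ys.append(y)
    return xs, ys


RHO = 0.9  # assumed population correlation; fixed for every sample
N = 33     # same sample size as the basketball example
rng = random.Random(2011)  # arbitrary seed so the run is repeatable

sample_rs = [pearson_r(*draw_sample(RHO, N, rng)) for _ in range(200)]
print("smallest r:", round(min(sample_rs), 3))
print("largest r: ", round(max(sample_rs), 3))
print("average r: ", round(mean(sample_rs), 3))
```

Every sample comes from the same population, so ρ never changes, yet the printed range shows r moving from sample to sample while staying in the neighborhood of ρ.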

Example

We have collected the following (x,y) pairs: (–8.7,–0.6), (–8.2,–1.3), (–6.1,–2.0), (–4.1,–4.0), (–1.6,–5.5), (–0.2,–6.0), (0.7,4.6), (1.4,4.2), (3.8,3.8), (6.5,3.1), (8.2,2.0), and (9.1,0.6).

  1. Construct a scatter plot of the data and discuss the relationship between X and Y based on the plot.
  2. Find Pearson's correlation coefficient r.
  3. Discuss the relationship between r and what was observed in the graph.

Solution

The scatter plot of the data is shown in Figure 8.5.

Figure 8.5

It appears that there are two groups of responses. The group in the upper right portion of the graph seems to have decreasing y-values as x increases. Similarly, the group in the lower left portion of the graph seems to have decreasing y-values as x increases.

To find Pearson's correlation coefficient, we organize the data in tabular form, find zx, zy, and the product zx zy for each pair, and total the columns (see Table 8.4).

As expected, the sums of zx and zy are zero. The total of the products zx zy is used to find Pearson's r: r = Σ(zx zy)/(n − 1), which for these n = 12 pairs is approximately 0.48.

This value of r would tend to lead us to believe that there is a moderately strong, positive, linear relationship between X and Y. This would be the wrong conclusion for these data, which is why it is so important to look at the scatter plot when interpreting r. If we consider the two groups separately, r for the group with negative x values is –0.992, and r for the group with positive x values is –0.941. Both of these suggest a strong, negative relationship between X and Y. When something like this happens, the researcher must try to determine what distinguishes the two groups. If the observations were taken from people, factors such as gender, age, or disease status might account for the two groups.
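The contrast between the pooled r and the within-group values can be checked directly from the twelve (x, y) pairs given in the example. The sketch below uses the equivalent sums-of-squares form of Pearson's r rather than the z-score table; the function name pearson_r and the split on the sign of x are choices made here for illustration.

```python
from math import sqrt


def pearson_r(pairs):
    """Pearson's r from sums of squares and cross-products."""
    n = len(pairs)
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sxy = sum(x * y for x, y in pairs) - sum_x * sum_y / n
    sxx = sum(x * x for x, _ in pairs) - sum_x ** 2 / n
    syy = sum(y * y for _, y in pairs) - sum_y ** 2 / n
    return sxy / sqrt(sxx * syy)


# The twelve (x, y) pairs listed in the example.
pairs = [
    (-8.7, -0.6), (-8.2, -1.3), (-6.1, -2.0), (-4.1, -4.0),
    (-1.6, -5.5), (-0.2, -6.0), (0.7, 4.6), (1.4, 4.2),
    (3.8, 3.8), (6.5, 3.1), (8.2, 2.0), (9.1, 0.6),
]

# Split into the two clusters seen in the scatter plot.
negative_x = [p for p in pairs if p[0] < 0]
positive_x = [p for p in pairs if p[0] > 0]

print("all pairs:       r =", round(pearson_r(pairs), 3))
print("negative x only: r =", round(pearson_r(negative_x), 3))
print("positive x only: r =", round(pearson_r(positive_x), 3))
```

The pooled value comes out positive while both group values are strongly negative and should agree with the –0.992 and –0.941 quoted above, which is the reversal the scatter plot reveals and r alone hides.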

Table 8.4

Describing and Displaying Bivariate Data In Short

Bivariate data arise often in studies, and the relationship between the two variables is often of great interest. Scatter plots are visual displays of the data that help us understand how the two variables might be related. Pearson's correlation coefficient r is a measure of the strength of the linear relationship between the variables. It is important to look at the scatter plot when interpreting the meaning of r.

Find practice problems and solutions for these concepts at Describing and Displaying Bivariate Data Practice Questions.
