Correlation Principles Help

By — McGraw-Hill Professional
Updated on Sep 12, 2011

Introduction to Correlation Principles

Let's examine correlation a little more closely now. When two things are correlated, does one cause the other? Does a third phenomenon cause both? Is there any cause-and-effect relationship at all? People often conclude that there is a cause-and-effect relationship when they see a correlation. But this is not necessarily true.

Quantitative Versus Qualitative

Correlation (often symbolized by the italicized, lowercase letter r) can be numerically defined only between variables that can be quantified. Examples of quantitative variables include time, temperature, and average monthly rainfall.
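As a rough illustration of what "numerically defined" means here, the following sketch computes r for two quantifiable variables. It assumes that r refers to the Pearson product-moment coefficient, which this section does not spell out, and the temperature and rainfall figures are made-up values used only for demonstration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each observation from its variable's mean
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data: monthly mean temperature (deg C) and monthly rainfall (mm)
temperature = [2.0, 4.0, 9.0, 14.0, 19.0, 23.0]
rainfall = [80.0, 70.0, 65.0, 50.0, 45.0, 30.0]

r = pearson_r(temperature, rainfall)
print(f"r = {r:+.3f}")
```

Because every input is a number, the calculation goes through. A variable such as rudeness has no agreed numeric scale, so there would be nothing to pass to a function like this.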

It's possible to express the correlation between two variables qualitatively if one or both of them cannot be quantified. But it's not possible to express correlation quantitatively unless both variables and their relationship can be quantified. Even if it seems obvious that two variables are correlated, there is a big difference between saying that, for example, "rudeness and violence are strongly correlated" and "the correlation between rudeness and violence is +0.75." Violence can be quantified on the basis of crime statistics, but rudeness is much harder to express numerically.

Imagine that a massive social experiment is conducted over a period of years, and researchers come to the conclusion that people develop schizophrenia more often in some geographic regions than in others. Suppose, for example, that there are more people with this disorder living at high elevations in the mountains, where there is lots of snow and the weather is cool all year round, than there are at low elevations near tropical seas, where it rains often and the weather is warm all year. Both of these variables – schizophrenia and environment – are difficult or impossible to quantify. In particular, if you took 100 psychiatrists and asked them to diagnose a person who behaves strangely, you might end up with 40 diagnoses of "schizophrenia," 10 diagnoses of "paranoid psychosis," 15 diagnoses of "depression," 5 diagnoses of "bipolar disorder," 25 diagnoses of "normal but upset," and 5 verdicts of "not enough information to make a diagnosis." In this breakdown, the largest group of doctors (40%) thinks the person has schizophrenia, but that is not even a simple majority. Such a diagnosis is not absolute in the way that a diagnosis of an unmistakable physical ailment, such as a brain tumor, would be.

Correlation Range

The first thing we should know about correlation, as shown or implied by a scatter plot, was suggested earlier in this book. But it's so important that it bears repetition. Correlation can be expressed as a numerical value r such that the following restriction holds:

    –1 ≤ r ≤ +1

This means the mathematical correlation can be equal to anything between, and including, –1 and +1. Sometimes percentages are used instead, so the possible range of correlation values, r%, is as follows:

    –100% ≤ r% ≤ +100%
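The two ranges above describe the same quantity on different scales. A minimal sketch of the conversion, with a range check in each direction, might look like the following. The function names `r_to_percent` and `percent_to_r` are hypothetical helpers introduced for illustration, and the sample values are arbitrary.

```python
def r_to_percent(r):
    """Convert a correlation value r in [-1, +1] to its percentage form r%."""
    if not -1.0 <= r <= 1.0:
        raise ValueError(f"r must lie between -1 and +1, got {r}")
    return 100.0 * r

def percent_to_r(r_percent):
    """Convert a percentage correlation r% in [-100, +100] back to r."""
    if not -100.0 <= r_percent <= 100.0:
        raise ValueError(f"r% must lie between -100 and +100, got {r_percent}")
    return r_percent / 100.0

print(r_to_percent(0.7))
print(percent_to_r(-20.0))
```

Raising an error for out-of-range input is a design choice made for this sketch: a value outside the limits can only come from a mistake upstream, so it seems better to stop than to pass it along.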

A correlation value of r = –1 represents the strongest possible negative correlation; r = +1 represents the strongest possible positive correlation. Moderately strong positive correlation might be reflected by a figure of r = +0.7; weak negative correlation might show up as r% = –20%. A value of r = 0 or r% = 0% means there is no correlation at all. Interestingly, the absence of any correlation can be more difficult to prove than the existence of correlation, especially if the number of samples (or points in a scatter plot) is small.
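The remark about small samples can be made concrete with a simulation. The sketch below, which again assumes r is the Pearson coefficient, draws pairs of variables that are unrelated by construction and records the largest |r| that shows up by chance. The sample sizes, trial count, and random seed are arbitrary choices for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def largest_chance_r(n_points, n_trials, rng):
    """Largest |r| seen across n_trials scatter plots of independent random points."""
    largest = 0.0
    for _ in range(n_trials):
        # xs and ys are generated independently, so any correlation is pure chance
        xs = [rng.random() for _ in range(n_points)]
        ys = [rng.random() for _ in range(n_points)]
        largest = max(largest, abs(pearson_r(xs, ys)))
    return largest

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
small_sample = largest_chance_r(n_points=5, n_trials=200, rng=rng)
large_sample = largest_chance_r(n_points=500, n_trials=200, rng=rng)

print(f"largest |r| with   5 points per plot: {small_sample:.2f}")
print(f"largest |r| with 500 points per plot: {large_sample:.2f}")
```

With only a handful of points, unrelated variables can produce a large r by accident, so a small scatter plot gives little basis for ruling correlation in or out.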

It's impossible for anything to be correlated with anything else to an extent beyond the above limits. If you ever hear anyone talking about two phenomena being correlated by "a factor of –2" or "r = 150%," you know they're wrong. We also need to be careful when we say that two effects are correlated to "twice the extent" of two other effects. If two phenomena are correlated with r = +0.75, and someone comes along and tells you that changing the temperature (or some other parameter) will "double the correlation," you know something is wrong, because this suggests that the correlation could become r = +1.50, an impossibility.
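For readers who want to see where the limits come from, here is a short derivation. It assumes that r is the Pearson product-moment coefficient, a definition this section does not state explicitly, and the symbols a_i and b_i are introduced only for this sketch.

```latex
% Assumed definition of r for n paired observations (x_i, y_i):
r = \frac{\sum_{i=1}^{n} a_i b_i}
         {\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}},
\qquad a_i = x_i - \bar{x}, \quad b_i = y_i - \bar{y}

% The Cauchy-Schwarz inequality bounds the numerator:
\left( \sum_{i=1}^{n} a_i b_i \right)^2
  \le \left( \sum_{i=1}^{n} a_i^2 \right) \left( \sum_{i=1}^{n} b_i^2 \right)

% Dividing both sides by the right-hand side gives
r^2 \le 1 \quad\Longrightarrow\quad -1 \le r \le +1

% A "doubled" value of r = +0.75 would require
r = +1.50 \quad\Longrightarrow\quad r^2 = 2.25 > 1,
% which contradicts the bound.
```

Under this definition the limits are a mathematical consequence, not a convention, which is why a claimed value outside them must be a mistake.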
