Transformations to Achieve Linearity for AP Statistics

By McGraw-Hill Professional
Updated on Feb 5, 2011

Practice problems for these concepts can be found at:

Until now, we have been concerned with data that can be modeled with a line. Of course, many two-variable relationships are nonlinear. The path of an object thrown in the air is parabolic (quadratic). Population tends to grow exponentially, at least for a while. Even though you could find a least-squares regression line (LSRL) for nonlinear data, it makes no sense to do so. The AP Statistics course deals only with two-variable data that can be modeled by a line or with nonlinear two-variable data that can be transformed in such a way that the transformed data can be modeled by a line.

example: Let g(x) = 2^x, which is exponential and clearly nonlinear. Let f(x) = ln(x). Then f[g(x)] = ln(2^x) = x ln(2), which is linear. That is, we can transform an exponential function such as g(x) into a linear function by taking the log of each value of g(x).
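
As a quick numerical check of this example, the following sketch (in Python, which is not part of the original text) applies the natural log to g(x) = 2^x at a few values of x. If the transformed values are linear in x, equal steps in x should produce equal steps in the output, and each step should equal ln(2). The function and variable names are illustrative only.

```python
import math

def g(x):
    """Exponential function from the example: g(x) = 2^x."""
    return 2 ** x

def f(y):
    """Natural-log transformation applied to each value of g(x)."""
    return math.log(y)

xs = [0, 1, 2, 3, 4, 5]
transformed = [f(g(x)) for x in xs]

# For a linear function, equal steps in x give equal steps in the output.
steps = [transformed[i + 1] - transformed[i] for i in range(len(xs) - 1)]

for x, t in zip(xs, transformed):
    # Compare the transformed value with the closed form x * ln(2).
    print(f"x = {x}: ln(2^x) = {t:.4f}, x*ln(2) = {x * math.log(2):.4f}")
```

Every entry of steps comes out equal to ln(2), about 0.6931, which is the slope of the transformed relationship.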

example: Let g(x) = 4x^2, which is quadratic. Let f(x) = √x. Then f[g(x)] = √(4x^2) = 2x (for x ≥ 0), which is linear.

example: The number of a certain type of bacteria present (in thousands) after a certain number of hours is given in the following chart:

[Table not reproduced: Hours and Number of bacteria (in thousands)]

What would be the predicted quantity of bacteria after 3.75 hours?

solution: A scatterplot of the data and a residual plot [for Number = a + b(Hours)] show that a line is not a good model for these data:

[Figure not reproduced: scatterplot of Number versus Hours and residual plot for the linear fit]

Now, take ln(Number) to produce the following data:

[Table not reproduced: Hours and ln(Number)]

The scatterplot of Hours versus ln(Number) and the residual plot for ln(Number) = –0.0047 + 0.586(Hours) are as follows:

[Figure not reproduced: scatterplot of ln(Number) versus Hours and residual plot for the transformed fit]

The scatterplot looks much more linear, and the residual plot no longer has the distinctive pattern of the raw data. We have transformed the original data in such a way that the transformed data are well modeled by a line. The regression equation for the transformed data is: ln(Number) = –0.0047 + 0.586(Hours).

The question asked how many bacteria are predicted to be present after 3.75 hours. Plugging 3.75 into the regression equation, we have ln(Number) = –0.0047 + 0.586(3.75) = 2.19. But that is ln(Number), not Number. We must back-transform this answer to the original units. Doing so, we have Number = e^2.19 = 8.94 thousand bacteria.
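
The prediction and back-transformation can be sketched in a few lines of Python. This is a minimal illustration, assuming the coefficients of the regression equation as first stated, ln(Number) = –0.0047 + 0.586(Hours); the constant and function names are hypothetical and not from the original text.

```python
import math

# Coefficients of the transformed model ln(Number) = INTERCEPT + SLOPE * Hours,
# taken from the regression equation as first stated in the text.
INTERCEPT = -0.0047
SLOPE = 0.586

def predict_ln_number(hours):
    """Predicted ln(Number) from the line fitted to the transformed data."""
    return INTERCEPT + SLOPE * hours

def predict_number(hours):
    """Back-transform the prediction to the original units (thousands)."""
    return math.exp(predict_ln_number(hours))

ln_pred = predict_ln_number(3.75)
rounded_first = math.exp(round(ln_pred, 2))  # round to 2.19, then exponentiate
unrounded = predict_number(3.75)             # carry full precision through

print(f"ln(Number) at 3.75 hours: {ln_pred:.4f}")
print(f"Number, rounding ln(Number) first: {rounded_first:.2f} thousand")
print(f"Number, without intermediate rounding: {unrounded:.2f} thousand")
```

Rounding ln(Number) to 2.19 before exponentiating reproduces the 8.94 thousand in the worked solution. Carrying the unrounded value through gives about 8.96 thousand, a reminder that rounding before a back-transformation can shift the final answer slightly.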


It may be worth your while to try several different transformations to see if you can achieve linearity. Some possible transformations are: take the log of both variables, raise one or both variables to a power, take the square root of one of the variables, take the reciprocal of one or both variables, etc.
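
One way to try several transformations side by side is sketched below in Python. The data are hypothetical, generated from y = 3x^2 purely for illustration, and are not from the original text. The helper pearson_r and the set of candidate transformations are likewise assumptions made for this sketch. The correlation coefficient r is used here only as a rough screen; a residual plot, as in the bacteria example, remains the better check of whether a line fits.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (NOT from the text): an exact power relationship y = 3x^2.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3 * x ** 2 for x in xs]

# Candidate transformations of the kind listed in the text.
candidates = {
    "raw (x, y)": (xs, ys),
    "(x, ln y)": (xs, [math.log(y) for y in ys]),
    "(ln x, ln y)": ([math.log(x) for x in xs], [math.log(y) for y in ys]),
    "(x, sqrt y)": (xs, [math.sqrt(y) for y in ys]),
    "(x, 1/y)": (xs, [1 / y for y in ys]),
}

results = {name: pearson_r(tx, ty) for name, (tx, ty) in candidates.items()}

for name, r in results.items():
    print(f"{name:14s} r = {r:.4f}")
```

For this made-up power relationship, taking the log of both variables and taking the square root of y each give a perfectly linear relationship, while the raw data do not. With real data no transformation will be exact, so compare the residual plots of the leading candidates before choosing one.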
