Practice problems for these concepts can be found at:

- Two-Variable Data Analysis Multiple Choice Practice Problems for AP Statistics
- Two-Variable Data Analysis Free Response Practice Problems for AP Statistics
- Two-Variable Data Analysis Review Problems for AP Statistics
- Two-Variable Data Analysis Rapid Review for AP Statistics

When we developed the LSRL, we referred to *y* – *ŷ* (the *actual value* – the *predicted value*) as an error in prediction. The formal name for *y* – *ŷ* is the **residual**. Note that the order is always "actual" – "predicted," so that a positive residual means that the prediction was too small and a negative residual means that the prediction was too large.

**Example:** In the previous example, a criminal earning $1560/month paid restitution of $800/month. The predicted restitution for this income would be *ŷ* = –56.22 + 0.46(1560) = $661.38. Thus, the residual for this case is $800 – $661.38 = $138.62.
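The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function names `predict` and `residual` are made up for illustration, and the coefficients are the rounded ones quoted in the example.

```python
def predict(x, intercept, slope):
    """Value of the regression line at x."""
    return intercept + slope * x


def residual(actual, predicted):
    """Residual = actual - predicted (the order matters)."""
    return actual - predicted


# Rounded coefficients and the data point from the restitution example.
intercept, slope = -56.22, 0.46
income, paid = 1560, 800

predicted = predict(income, intercept, slope)
res = residual(paid, predicted)

# A positive residual means the line predicted too little.
print(f"predicted = {predicted:.2f}, residual = {res:.2f}")
```

The residual is positive here, so the line underpredicted what this person actually paid.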

Residuals can be useful to us in determining the extent to which a linear model is appropriate for a dataset. If a line is an appropriate model, we would expect to find the residuals more or less randomly scattered about the average residual (which is, of course, 0). In fact, we expect to find them approximately normally distributed about 0. A pattern of residuals that does not appear to be more or less randomly distributed about 0 (that is, there is a systematic nature to the graph of the residuals) is evidence that a line is not a good model for the data. If the residuals are small, the line may predict well even though it isn't a good theoretical model for the data. The usual way to judge whether a line is a good model is to examine a plot of the residuals against the explanatory variable.
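To see what a "systematic" residual pattern looks like, the sketch below fits a least-squares line to data that are deliberately curved. The data are synthetic (*y* = *x*² for *x* = 1 to 9), not taken from any example in this article, and the helper `least_squares` is a hypothetical name used only here.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line for the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Synthetic, deliberately curved data (an assumption for illustration only).
xs = list(range(1, 10))
ys = [x ** 2 for x in xs]

a, b = least_squares(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Record the sign of each residual in order of increasing x.
signs = "".join("+" if r > 0 else "-" for r in residuals)
print(signs)                     # ++-----++
print(round(sum(residuals), 9))  # residuals from an LSRL sum to (about) 0
```

The residuals average 0, as they must for a least-squares line, but they are positive at both ends and negative in the middle. That U-shape is the kind of pattern that signals a line is not a good model.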

**Example:** The data given below show the height (in cm) at various ages (in months) for a group of children.

- Does a line seem to be a good model for the data? Explain.
- What is the value of the residual for a child of 19 months?

**Solution:**

- Using the calculator (LinReg(a+bx) L1, L2, Y1), we find *height* = 64.94 + 0.634(*age*), *r* = 0.993. The large value of *r* tells us that the points are close to a line. The scatterplot and LSRL are shown below on the graph at the left.
- The residual (actual minus predicted) for *age* = 19 months is 77.1 – (64.94 + 0.634 · 19) = 0.114. Note that 77.1 – Y1(19) = 0.112; the small difference arises because Y1 uses the calculator's unrounded coefficients.

From the graph on the left, a line appears to be a good fit for the data (the points lie close to the line). The residual plot on the right shows no readily obvious pattern, so we have good evidence that a line is a good model for the data and we can feel good about using the LSRL to predict height from age.

(Note that you can generate a complete set of residuals, which will match what is stored in RESID, in a list. Assuming your data are in L1 and L2 and that you have found the LSRL and stored it in Y1, let L3 = L2 – Y1(L1). The residuals for each value will then appear in L3. You might want to let L4 = RESID (by pasting RESID from the LIST menu) and observe that L3 and L4 are the same.)
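The same list operation can be sketched outside the calculator. In the Python below, the names `L1`, `L2`, `L3`, and `Y1` simply mirror the calculator's lists and function. Only the 19-month height (77.1) comes from the example; the other two heights are placeholders, since the full data table is not reproduced here. `Y1` uses the rounded equation, so its residuals will differ slightly from what RESID holds.

```python
def Y1(age):
    """Rounded LSRL from the example: predicted height at a given age."""
    return 64.94 + 0.634 * age


# Ages in months and heights in cm.
# Only the 19-month value is from the article; the others are placeholders.
L1 = [18, 19, 20]
L2 = [76.0, 77.1, 78.0]

# Calculator equivalent: L3 = L2 - Y1(L1), applied element by element.
L3 = [height - Y1(age) for age, height in zip(L1, L2)]

for age, r in zip(L1, L3):
    print(f"age {age}: residual {r:.3f}")
```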

If we are trying to predict a value of *y* from a value of *x*, it is called **interpolation** if we are predicting from an *x*-value within the range of observed *x*-values. It is called **extrapolation** if we are predicting from a value of *x* outside the range of observed *x*-values.

**Example:** Using the age/height data from the previous example, we are *interpolating* if we attempt to predict height from an age between 18 and 29 months. It is interpolation if we try to predict the height of a 20.5-month-old baby. We are *extrapolating* if we try to predict the height of a child less than 18 months old or more than 29 months old.

If a line has been shown to be a good model for the data and if the data fit the line well (i.e., we have a strong *r* and a more or less random distribution of residuals), we can have confidence in interpolated predictions. We can rarely have confidence in extrapolated values. In the example above, we might be willing to go slightly beyond the ages given because of the high correlation and the good linear model, but it's good practice not to extrapolate beyond the data given. If we were to extrapolate the data in the example to a child of 12 years of age (144 months), we would predict a height of 64.94 + 0.634(144) ≈ 156.2 cm. Nothing in data that span only 18 to 29 months tells us whether growth stays linear for another ten years, so a prediction this far outside the data should not be trusted.
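One way to keep the distinction in view is to label every prediction with whether it falls inside the observed range. The sketch below assumes the rounded LSRL and the 18-to-29-month age range from the example; the function name `predict_height` and the constants are illustrative only.

```python
# Observed range of ages (in months) in the example data.
AGE_MIN, AGE_MAX = 18, 29


def predict_height(age_months):
    """Return (predicted height, 'interpolation' or 'extrapolation')."""
    height = 64.94 + 0.634 * age_months
    inside = AGE_MIN <= age_months <= AGE_MAX
    kind = "interpolation" if inside else "extrapolation"
    return height, kind


for age in (20.5, 144):
    height, kind = predict_height(age)
    print(f"age {age} months: predicted {height:.1f} ({kind})")
```

The line will return a number for any age; the label is a reminder that only the first prediction is supported by the data.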
