Practice problems for these concepts can be found at:

- Inference for Regression Multiple Choice Practice Problems for AP Statistics
- Inference for Regression Free Response Practice Problems for AP Statistics
- Inference for Regression Review Problems for AP Statistics
- Inference for Regression Rapid Review for AP Statistics

Let's distinguish between *statistics* and *parameters*. Statistics are measurements or values that describe samples, and parameters are measurements that describe populations. We have also seen that statistics can be used to estimate parameters. Thus, we have used x̄ to estimate the population mean μ, *s* to estimate the population standard deviation σ, etc. The least-squares regression line (ŷ = *a* + *bx*) is based on a set of ordered pairs, so ŷ is itself a statistic: it is computed from sample data. In this chapter, we study the parameter, μ_{y}, that is estimated by ŷ.

Before we look at the model for linear regression, let's consider an example:

**example:** The following data are pulse rates and heights for a group of 10 female statistics students:

- What is the least-squares regression line for predicting pulse rate from height?
- What is the correlation coefficient between height and pulse rate? Interpret the correlation coefficient in the context of the problem.
- What is the predicted pulse rate of a 67" tall student?
- Interpret the slope of the regression line in the context of the problem.

**solution:**

- *Pulse rate* = 47.17 + 0.302(*Height*). (Done on the TI-83/84 with *Height* in L1 and *Pulse* in L2, the LSRL can be found with STAT CALC LinReg(a+bx) L1,L2,Y1.)
- *r* = 0.21. There is a weak, positive, linear relationship between height and pulse rate.
- *Pulse rate* = 47.17 + 0.302(67) = 67.4. (On the TI-83/84: Y1(67) = 67.42. Remember that you can paste Y1 to the home screen by entering VARS Y-VARS Function Y1.)
- For each increase in height of one inch, the pulse rate is predicted to increase by 0.302 beats per minute (or: the pulse rate will increase, on average, by 0.302 beats per minute).
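The prediction and the slope interpretation in the solution can be checked with a few lines of code. This is a minimal sketch that uses only the rounded coefficients reported above; the names `INTERCEPT`, `SLOPE`, and `predict_pulse` are illustrative and not part of the original solution.

```python
# Coefficients as reported (rounded) in the solution.
INTERCEPT = 47.17  # a, in beats per minute
SLOPE = 0.302      # b, in beats per minute per inch of height


def predict_pulse(height_in):
    """Predicted pulse rate (bpm) from height (inches), using y-hat = a + b*x."""
    return INTERCEPT + SLOPE * height_in


pred_67 = predict_pulse(67)
# Change in predicted pulse for a one-inch increase in height: equals the slope.
one_inch_change = predict_pulse(68) - predict_pulse(67)

print(round(pred_67, 1))
print(round(one_inch_change, 3))
```

The calculator value Y1(67) = 67.42 differs slightly from 67.4, most likely because the calculator carries unrounded coefficients while this sketch uses the rounded ones.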

When doing inference for regression, we use ŷ = *a* + *bx* to estimate the true population regression line. Similar to what we have done with other statistics used for inference, we use *a* and *b* as estimators of the population parameters α and β, the intercept and slope of the population regression line. The conditions necessary for doing inference for regression are:

- For each given value of *x*, the values of the response variable *y* are independent and normally distributed.
- For each given value of *x*, the standard deviation, σ, of the *y*-values is the same.
- The mean responses of the *y*-values for the fixed values of *x* are linearly related by the equation μ_{y} = α + β*x*.

**example:** Consider a situation in which we are interested in how well a person scores on an agility test after a fixed number of 3-oz. glasses of wine. Let *x* be the number of glasses consumed, taking on the values 1, 2, 3, 4, 5, and 6. Let *y* be the score on the agility test (scale: 1–100). Then for any given value *x*_{i}, there will be a distribution of *y*-values with mean μ_{y_i}. The conditions for inference for regression are that (i) each of these distributions of *y*-values is normally distributed, (ii) each of these distributions of *y*-values has the same standard deviation σ, and (iii) each of the means μ_{y_i} lies on a line.
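To make the three conditions concrete, the sketch below simulates agility scores under the regression model. The parameter values `ALPHA`, `BETA`, and `SIGMA` are hypothetical, chosen only for illustration, and the simulation ignores the fact that real scores are bounded between 1 and 100.

```python
import random
import statistics

# Hypothetical population parameters (not taken from the text).
ALPHA = 95.0  # intercept of the population regression line
BETA = -8.0   # slope: change in mean score per additional glass
SIGMA = 5.0   # common standard deviation of y at every value of x


def simulate_scores(x, n, rng):
    """Draw n independent scores for x glasses from Normal(ALPHA + BETA*x, SIGMA)."""
    return [rng.gauss(ALPHA + BETA * x, SIGMA) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
summary = {}
for x in range(1, 7):
    scores = simulate_scores(x, 2000, rng)
    # Sample mean and sample standard deviation of y at this x.
    summary[x] = (statistics.mean(scores), statistics.stdev(scores))
    print(x, round(summary[x][0], 1), round(summary[x][1], 1))
```

At each *x* the simulated mean should sit close to ALPHA + BETA·*x* (condition iii) and the simulated standard deviation close to SIGMA (condition ii), while the normal draws themselves build in condition (i).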

Remember that a *residual* is the error involved when making a prediction from a regression equation (residual = actual value of *y* − predicted value of *y* = *y*_{i} − ŷ_{i}). Not surprisingly, the standard error of the predictions is a function of the squared residuals:

$$s = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$

*s* is an estimator of σ, the standard deviation of the residuals. Thus, there are actually three parameters to worry about in regression: α, β, and σ, which are estimated by *a*, *b*, and *s*, respectively.

The final statistic we need to do inference for regression is the standard error of the slope of the regression line:

$$s_b = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$
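As a rough illustration of how these statistics fit together, the sketch below computes *a*, *b*, *s*, and *s*_{b} from a small data set using the standard least-squares formulas. The function name `regression_statistics` and the data values are made up for this example; they are not the pulse-rate data or real agility scores.

```python
import math


def regression_statistics(xs, ys):
    """Return (a, b, s, s_b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx          # slope estimate
    a = y_bar - b * x_bar  # intercept estimate
    # Sum of squared residuals, with n - 2 degrees of freedom.
    ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_resid / (n - 2))  # standard error of the residuals
    s_b = s / math.sqrt(sxx)           # standard error of the slope
    return a, b, s, s_b


# Made-up scores for 1 through 6 glasses, for illustration only.
glasses = [1, 2, 3, 4, 5, 6]
scores = [88, 80, 69, 65, 52, 49]
a, b, s, s_b = regression_statistics(glasses, scores)
print(a, b, s, s_b)
```

Dividing by *n* − 2 rather than *n* reflects that two quantities, *a* and *b*, were estimated from the data before the residuals could be computed.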

In summary, inference for regression depends upon estimating μ_{y} = α + β*x* with ŷ = *a* + *bx*. For each *x*, the response values of *y* are independent and follow a normal distribution, each distribution having the same standard deviation. Inference for regression depends on the following statistics:

- *a*, the estimate of the *y*-intercept, α, of μ_{y}
- *b*, the estimate of the slope, β, of μ_{y}
- *s*, the standard error of the residuals
- *s*_{b}, the standard error of the slope of the regression line

In the section that follows, we explore inference for the slope of a regression line in terms of a significance test and a confidence interval for the slope.
