Practice problems for these concepts can be found at:
- Inference for Regression Multiple Choice Practice Problems for AP Statistics
- Inference for Regression Free Response Practice Problems for AP Statistics
- Inference for Regression Review Problems for AP Statistics
- Inference for Regression Rapid Review for AP Statistics
Let's distinguish between statistics and parameters. Statistics are measurements or values that describe samples, and parameters are measurements that describe populations. We have also seen that statistics can be used to estimate parameters. Thus, we have used x̄ to estimate the population mean μ, s to estimate the population standard deviation σ, etc. The least-squares regression line (ŷ = a + bx) is based on a set of ordered pairs, so ŷ is actually a statistic because it is based on sample data. In this chapter, we study the parameter, μy, that is estimated by ŷ.
Before we look at the model for linear regression, let's consider an example:
example: The following data are pulse rates and heights for a group of 10 female statistics students:
- What is the least-squares regression line for predicting pulse rate from height?
- What is the correlation coefficient between height and pulse rate? Interpret the correlation coefficient in the context of the problem.
- What is the predicted pulse rate of a 67" tall student?
- Interpret the slope of the regression line in the context of the problem.
solution:
- Pulse rate = 47.17 + 0.302(Height). (On the TI-83/84, with Height in L1 and Pulse in L2, the LSRL can be found with STAT CALC LinReg(a+bx) L1,L2,Y1.)
- r = 0.21. There is a weak, positive, linear relationship between Height and Pulse rate.
- Pulse rate = 47.17 + 0.302(67) = 67.4. (On the TI-83/84: Y1(67) = 67.42. Remember that you can paste Y1 to the home screen by entering VARS Y-VARS Function Y1.)
- For each increase in height of one inch, the pulse rate is predicted to increase by 0.302 beats per minute (or: the pulse rate will increase, on average, by 0.302 beats per minute).
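The prediction and the slope interpretation above can be checked with a short sketch. It uses only the rounded coefficients reported in the solution; the function name `predict_pulse` is our own, chosen for illustration.

```python
# Rounded coefficients as reported in the solution above.
INTERCEPT = 47.17  # a, in beats per minute
SLOPE = 0.302      # b, in beats per minute per inch of height

def predict_pulse(height_in):
    """Predicted pulse rate (beats per minute) for a height given in inches."""
    return INTERCEPT + SLOPE * height_in

# Predicted pulse rate for a 67" tall student.
prediction = predict_pulse(67)
print(round(prediction, 1))

# Slope interpretation: one more inch of height changes the prediction by b.
change_per_inch = predict_pulse(68) - predict_pulse(67)
print(round(change_per_inch, 3))
```

The hand calculation gives 67.4, while the calculator reports 67.42. The small difference most likely arises because the calculator stores the unrounded coefficients.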
When doing inference for regression, we use ŷ = a + bx to estimate the true population regression line. Similar to what we have done with other statistics used for inference, we use a and b as estimators of the population parameters α and β, the intercept and slope of the population regression line. The conditions necessary for doing inference for regression are:
- For each given value of x, the values of the response variable y are independent and normally distributed.
- For each given value of x, the standard deviation, σ, of y-values is the same.
- The mean responses of the y-values for the fixed values of x are linearly related by the equation μy = α + βx.
example: Consider a situation in which we are interested in how well a person scores on an agility test after a fixed number of 3-oz. glasses of wine. Let x be the number of glasses consumed. Let x take on the values 1, 2, 3, 4, 5, and 6. Let y be the score on the agility test (scale: 1–100). Then for any given value xi, there will be a distribution of y-values with mean μyi. The conditions for inference for regression are that (i) each of these distributions of y-values is normally distributed, (ii) each of these distributions of y-values has the same standard deviation σ, and (iii) each of the μyi lies on a line.
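To make the three conditions concrete, here is a minimal simulation sketch of the wine example. The values of `ALPHA`, `BETA`, and `SIGMA` are invented purely for illustration (the text gives no numbers), and the simulated scores are not clipped to the 1–100 scale.

```python
import random
import statistics

# Hypothetical parameter values, assumed only for this illustration.
ALPHA = 90.0  # intercept of the population regression line
BETA = -8.0   # slope: change in mean agility score per glass
SIGMA = 5.0   # common standard deviation of y at every x

def simulate_scores(x, n, rng):
    """Draw n independent y-values at a fixed x from Normal(ALPHA + BETA*x, SIGMA)."""
    mean_y = ALPHA + BETA * x
    return [rng.gauss(mean_y, SIGMA) for _ in range(n)]

rng = random.Random(0)  # seeded so the run is repeatable
for x in range(1, 7):
    scores = simulate_scores(x, 2000, rng)
    # The sample mean should be near ALPHA + BETA*x, and the sample SD near SIGMA.
    print(x, round(statistics.mean(scores), 1), round(statistics.stdev(scores), 1))
```

At each x the simulated scores are normal, share the same spread, and have means that fall on the line μy = α + βx, which is exactly what conditions (i) through (iii) require.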
Remember that a residual is the error involved when making a prediction from a regression equation (residual = actual value of y − predicted value of y = yi − ŷi). Not surprisingly, the standard error of the predictions is a function of the squared residuals:

$$s = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$
s is an estimator of σ, the standard deviation of the residuals. Thus, there are actually three parameters to worry about in regression: α, β, and σ, which are estimated by a, b, and s, respectively.
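As a rough sketch, s can be computed as the square root of the sum of squared residuals divided by n − 2. The data below are made up for illustration (they are not the pulse and height data), and a = 2.2, b = 0.6 are the least-squares coefficients for that made-up data.

```python
import math

def residual_standard_error(xs, ys, a, b):
    """s = sqrt(sum of squared residuals / (n - 2)) for the line y-hat = a + b*x."""
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sum_sq_resid = sum(r * r for r in residuals)
    return math.sqrt(sum_sq_resid / (len(xs) - 2))

# Made-up illustrative data with its least-squares line y-hat = 2.2 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6

s = residual_standard_error(xs, ys, a, b)
print(round(s, 4))
```

The divisor is n − 2 rather than n because two parameters, α and β, are estimated from the data before the residuals can be computed.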
The final statistic we need to do inference for regression is the standard error of the slope of the regression line:

$$s_b = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$$
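The standard error of the slope is s divided by the square root of the sum of squared deviations of the x-values from their mean. A minimal sketch follows; the x-values and the value of s are assumed for illustration and do not come from the example in this section.

```python
import math

def slope_standard_error(xs, s):
    """s_b = s / sqrt(sum of (x_i - x_bar)^2)."""
    x_bar = sum(xs) / len(xs)
    sum_sq_dev = sum((x - x_bar) ** 2 for x in xs)
    return s / math.sqrt(sum_sq_dev)

# Assumed inputs for illustration only.
xs = [1, 2, 3, 4, 5]
s = math.sqrt(0.8)  # hypothetical standard error of the residuals

s_b = slope_standard_error(xs, s)
print(round(s_b, 4))
```

Note that sb shrinks when the x-values are more spread out: a wider range of x pins down the slope more precisely for the same amount of scatter about the line.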
In summary, inference for regression depends upon estimating μy = α + βx with ŷ = a + bx. For each x, the response values of y are independent and follow a normal distribution, each distribution having the same standard deviation. Inference for regression depends on the following statistics:
- a, the estimate of the y intercept, α, of μy
- b, the estimate of the slope, β, of μy
- s, the standard error of the residuals
- sb, the standard error of the slope of the regression line
In the section that follows, we explore inference for the slope of a regression line in terms of a significance test and a confidence interval for the slope.