Probability & Statistics Honors
Unit Test #4

_____ / 37

1.
For a particular car model, data were collected on the age of each car (in years) and the sale price of the car (in hundreds of dollars). A least squares regression model, suitable for predicting sale price, was obtained: $\widehat{y}=195.47-20.26x$. One particular car was six years old and had a sale price of $7000. What was the residual for that car?
(4)

For a six-year-old car, the model predicts an average sale price of $\widehat{y}=195.47-20.26\left(6\right)=73.91$ hundreds of dollars ($7391). The residual is the difference between the actual and predicted sale prices: $7000-7391=-391$ dollars.
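The arithmetic above can be checked with a short Python sketch. The names `predicted_price`, `INTERCEPT`, and `SLOPE` are illustrative and not part of the test; prices are kept in hundreds of dollars, as in the model.

```python
# Coefficients of the fitted model given in the question:
# predicted sale price (hundreds of dollars) as a function of age (years).
INTERCEPT = 195.47
SLOPE = -20.26


def predicted_price(age_years):
    """Return the predicted sale price in hundreds of dollars."""
    return INTERCEPT + SLOPE * age_years


age = 6
actual = 70.0  # $7000 expressed in hundreds of dollars
predicted = predicted_price(age)

# A residual is always actual minus predicted.
residual = actual - predicted

print(f"predicted: {predicted:.2f} hundreds of dollars")
print(f"residual:  {residual:.2f} hundreds of dollars")
```

A negative residual means the car sold for less than the model predicted for its age.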

2.
A grocery chain studied the relationship between the length (miles) and shipping costs (dollars per 100 pounds) of its shipping routes. The results are shown above. Identify any unusual points in this scatterplot.
(3)

The point with an x value of 120 and a y value near 11 is a clear outlier. It does not have leverage. Because it has a very large residual compared to the rest of the data, it is likely influential.

3.
A salary study at a large company found the following relationship between years of experience and annual salary: $\ln\left(\widehat{y}\right)=10.4838+0.0789\cdot x$. Use this model to predict the salary of an employee who has seven years of experience.
(3)

$\ln\left(\widehat{y}\right)=10.4838+0.0789\cdot 7\Rightarrow \ln\left(\widehat{y}\right)=11.0361\Rightarrow \widehat{y}={e}^{11.0361}\approx 62075.09$; the model predicts an average salary of $62075.09 for an employee with seven years of experience.
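As a check on the back-transformation, here is a minimal Python sketch. It assumes, as the solution does, that the model uses the natural logarithm; the function name `predicted_salary` is illustrative.

```python
import math

# Coefficients of the log-linear model given in the question:
# ln(predicted salary) = INTERCEPT + SLOPE * years of experience.
INTERCEPT = 10.4838
SLOPE = 0.0789


def predicted_salary(years):
    """Predict salary by evaluating the model on the log scale, then undoing the log."""
    log_salary = INTERCEPT + SLOPE * years
    return math.exp(log_salary)


salary = predicted_salary(7)
print(f"predicted salary: {salary:.2f}")
```

The key step is applying `math.exp` at the end: the model predicts the logarithm of salary, not salary itself.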

As part of a study of commercial bank branches, data were obtained on the number of independent
businesses (*x*) located in sample zip code areas and the number of bank branches (*y*) located in
these areas. The sample data are shown below.

| x | 92 | 116 | 124 | 210 | 216 | 267 | 306 | 378 | 415 | 502 | 615 | 703 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| y | 3 | 2 | 3 | 5 | 4 | 5 | 5 | 6 | 7 | 7 | 9 | 9 |

Use these data for all remaining questions on this test.

4.
Find the least squares regression line suitable for predicting number of branches from number of independent businesses.
(4)

$\widehat{y}=1.7668+0.0111x$
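The line above would normally come from a calculator or statistics package. As a rough check, the following Python sketch computes the same coefficients from the standard least squares formulas (slope $=S_{xy}/S_{xx}$, intercept $=\bar{y}-\text{slope}\cdot\bar{x}$). The helper name `least_squares` is illustrative.

```python
# Sample data from the table: independent businesses (x) and bank branches (y).
x = [92, 116, 124, 210, 216, 267, 306, 378, 415, 502, 615, 703]
y = [3, 2, 3, 5, 4, 5, 5, 6, 7, 7, 9, 9]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line for ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-deviations and squared x-deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = least_squares(x, y)
print(f"y-hat = {intercept:.4f} + {slope:.4f} x")
```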

5.
Interpret the slope of the line you found in [4] in the context of this scenario.
(4)

For each additional independent business, the model predicts an average of 0.0111 additional bank branches in that zip code.

6.
Interpret the $y$-intercept of the line you found in [4] in the context of this scenario.
(4)

In a zip code with no independent businesses, the model predicts an average of 1.7668 bank branches.

7.
Calculate and interpret the coefficient of determination for these data.
(4)

${r}^{2}=0.9452$; 94.52% of the variation in number of bank branches can be explained by a linear relationship with number of independent businesses.
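For simple linear regression, the coefficient of determination is the square of the correlation coefficient, $r^{2}=S_{xy}^{2}/(S_{xx}S_{yy})$. The Python sketch below evaluates that formula on the sample data; the function name `r_squared` is illustrative.

```python
# Sample data from the table: independent businesses (x) and bank branches (y).
x = [92, 116, 124, 210, 216, 267, 306, 378, 415, 502, 615, 703]
y = [3, 2, 3, 5, 4, 5, 5, 6, 7, 7, 9, 9]


def r_squared(xs, ys):
    """Return the coefficient of determination for a simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    # Squared correlation: shared variation relative to total variation.
    return sxy ** 2 / (sxx * syy)


r2 = r_squared(x, y)
print(f"r^2 = {r2:.4f}")
```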

8.
Use the least squares regression line you found in [4] to predict the number of bank branches when there are 250 independent businesses.
(3)

Plugging $x=250$ into the unrounded regression equation results in a prediction of 4.5431 bank branches (on average); the rounded coefficients from [4] give 4.5418.
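The small gap between a prediction from the full-precision fit and one from coefficients rounded to four decimal places can be seen in this Python sketch. The helper `least_squares` and the variable names are illustrative.

```python
# Sample data from the table: independent businesses (x) and bank branches (y).
x = [92, 116, 124, 210, 216, 267, 306, 378, 415, 502, 615, 703]
y = [3, 2, 3, 5, 4, 5, 5, 6, 7, 7, 9, 9]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line for ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = least_squares(x, y)

# Prediction with full-precision coefficients.
full = intercept + slope * 250
# Prediction with the coefficients rounded to four decimal places.
rounded = 1.7668 + 0.0111 * 250

print(f"full precision: {full:.4f}")
print(f"rounded coefficients: {rounded:.4f}")
```

Both values describe the same prediction; the difference is only rounding error carried through the slope.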

9.
Would your least squares regression line be suitable for predicting the number of bank branches when there are 50 independent businesses? Justify your answer.
(3)

It would not. The lowest number of independent businesses in the data is 92, so the data tell us nothing about the relationship at x-values that low. Using $x=50$ would be extrapolation: we would have to assume that the linear relationship continues beyond what we have observed.

10.
Construct a residual plot for the least squares regression on these data.
(5)
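No plot survives on this page, so the following Python sketch only computes the points such a plot would show: each residual (actual minus predicted) paired with its x-value. The helper `least_squares` is illustrative, and the plotting step itself is left to whatever tool is available.

```python
# Sample data from the table: independent businesses (x) and bank branches (y).
x = [92, 116, 124, 210, 216, 267, 306, 378, 415, 502, 615, 703]
y = [3, 2, 3, 5, 4, 5, 5, 6, 7, 7, 9, 9]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line for ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = least_squares(x, y)

# Residual = actual y minus the y predicted by the fitted line.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

# Each (x, residual) pair is one point on the residual plot.
for a, r in zip(x, residuals):
    print(f"x = {a:3d}   residual = {r:+.4f}")
```

Plotting these pairs with x on the horizontal axis and a reference line at zero gives the residual plot. Least squares residuals always sum to zero, which is a quick sanity check on the computation.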

Page last updated 11:16 2020-10-06