Mechanics and Techniques Problems
7.1. Look at the data on home prices in the Rochester, NY area in 2000 found in the data file C07
Homes.xls [.rda].
- If you were to use this data to predict the sales price of a home, which variables
would you use? Based on your intuition about homes, rank the top five most important
variables in determining the price of the home in order from most influential to least
influential.
- Use the graphical and numerical tools of this chapter to determine the five variables
that most influence the price of a home. Rank them in order. Compare these results
with your intuitive ranking from the first part. Provide evidence for all conclusions.
- If some of the independent variables in a data set are related to each other, you may
have a problem called “co-linearity”. Are there any variables in the home data that
you would expect to be related? Based on the numerical calculations (and possibly
graphs), are any of the independent variables co-linear? Which ones? To what degree?
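One way to put numbers behind the ranking and the co-linearity check is to compute correlation coefficients between columns. The following is a minimal sketch in Python, not the chapter's prescribed method: the function names, the dictionary-of-columns layout, and the 0.8 cutoff are all assumptions made for illustration, and the column names in C07 Homes are not assumed here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / sqrt(sxx * syy)


def rank_by_correlation(columns, target):
    """Return (name, r) pairs sorted by |r| with the target, strongest first.

    `columns` is a hypothetical dict mapping a variable name to its values.
    """
    scores = [(name, pearson_r(values, columns[target]))
              for name, values in columns.items() if name != target]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


def collinear_pairs(columns, target, threshold=0.8):
    """List pairs of independent variables with |r| at or above the threshold.

    The 0.8 default is an arbitrary illustrative cutoff, not a rule.
    """
    names = [name for name in columns if name != target]
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs
```

With the home data loaded into such a dictionary, the first five entries of rank_by_correlation would give a numerical ranking to compare against the intuitive one, and collinear_pairs would flag independent variables that move together. A strong correlation is evidence only of a linear relationship, so the scatterplots are still worth checking.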
7.2. Consider the data in C07 Electricity.xls [.rda], which contains observations of total
monthly electric power usage along with the size of the home (in square feet).
- Create a scatterplot of this data. Do you expect that a simple linear model will be a
good fit to this data? Why or why not? Use the features you see in the graph to explain
your answer.
- Add a linear trendline (along with its equation) to the graph. What is the best-fit
simple linear model for predicting monthly electricity usage as a function of home size?
What do the slope and y-intercept mean? Do these numbers make sense? Why or why
not?
- Use the model to predict the electricity usage for the following two homes: Home #1
is 2050 square feet. Home #2 is 3200 square feet.
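The linear trendline in this problem is an ordinary least-squares line, so the slope and y-intercept can also be computed directly. The sketch below assumes the home sizes and usage values have already been read into two plain Python lists; the names fit_line and predict are hypothetical helpers, not part of any particular package.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # change in y per one unit of x
    intercept = mean_y - slope * mean_x    # predicted y when x is 0
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y for a given x under the fitted line."""
    return intercept + slope * x
```

After fitting with the real data, predict(slope, intercept, 2050) and predict(slope, intercept, 3200) give the model's estimates for the two homes. The slope carries units of electricity usage per square foot, and the intercept is the predicted usage for a home of zero square feet, which is worth keeping in mind when judging whether the numbers make sense.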
7.3. Suppose you have two different phone plans to select from when you make long-distance calls.
Plan #1 costs a flat rate of 7 cents each minute (or fraction of a minute) that the call lasts. Plan
#2 costs only 3 cents per minute, but has a 39 cent connection charge for all calls, no matter how
long. Which calling plan would you use for a 3 minute call? Which would you use for a 45 minute
call? How can you decide ahead of time which plan to use when making a call? Explain all of your
answers using trendlines and scatterplots to help. Be sure your explanation uses terms like slope
and y-intercept and includes information about the units of the variables involved.
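The two plans can be written as linear cost functions and compared numerically. This is a rough sketch under one stated assumption: both plans bill a fraction of a minute as a full minute (the problem says so explicitly only for Plan #1). The function names are made up for illustration.

```python
from math import ceil


def plan1_cost(minutes):
    """Plan #1: 7 cents per billed minute, no connection charge (cents)."""
    return 7 * ceil(minutes)


def plan2_cost(minutes):
    """Plan #2: 39 cent connection charge plus 3 cents per billed minute.

    Rounding up to whole minutes here is an assumption, not stated in the problem.
    """
    return 39 + 3 * ceil(minutes)


def cheaper_plan(minutes):
    """Name of the plan with the lower cost for a call of this length."""
    return "Plan #1" if plan1_cost(minutes) <= plan2_cost(minutes) else "Plan #2"


for m in (3, 45):
    print(m, plan1_cost(m), plan2_cost(m), cheaper_plan(m))
# 3 21 48 Plan #1
# 45 315 174 Plan #2
```

In slope and intercept terms, Plan #1 has slope 7 cents per minute and y-intercept 0 cents, while Plan #2 has slope 3 cents per minute and y-intercept 39 cents. Setting 7m = 39 + 3m gives m = 9.75 minutes as the point where the two lines cross, so under the whole-minute assumption Plan #1 is cheaper for calls billed at 9 minutes or fewer and Plan #2 is cheaper from 10 minutes on.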