Mechanics and Techniques Problems

7.1. Look at the data on home prices in the Rochester, NY area in 2000 found in the data file C07 Homes.xls [.rda].

  1. If you were to use this data to predict the sales price of a home, which variables would you use? Based on your intuition about homes, rank the top five most important variables in determining the price of the home in order from most influential to least influential.
  2. Use the graphical and numerical tools of this chapter to determine the five variables that most influence the price of a home. Rank them in order. Compare these results with your estimates in part 1. Provide evidence for all conclusions.
  3. If some of the independent variables in a data set are related to each other, you may have a problem called "co-linearity." Are there any variables in the home data that you would expect to be related? Based on the numerical calculations (and possibly graphs), are any of the independent variables co-linear? Which ones? To what degree?
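
One way to approach the co-linearity question in 7.1 is to compute the correlation between every pair of variables and look for pairs of independent variables with a correlation near +1 or -1. The sketch below is only an illustration: the column names (`price`, `sqft`, `bedrooms`, `age`) and all of the values are made up and are not taken from C07 Homes, and the helper names `pearson` and `correlation_table` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, NOT the contents of C07 Homes.
homes = {
    "price": [120, 150, 180, 210, 260],        # thousands of dollars (assumed)
    "sqft": [1100, 1400, 1700, 2000, 2500],    # living area (assumed)
    "bedrooms": [2, 3, 3, 4, 5],
    "age": [40, 12, 35, 8, 20],                # years (assumed)
}


def correlation_table(data):
    """Return a dict mapping (name_a, name_b) to their Pearson correlation."""
    names = list(data)
    return {(a, b): pearson(data[a], data[b]) for a in names for b in names}


table = correlation_table(homes)

# Print each unordered pair once.
for (a, b), r in table.items():
    if a < b:
        print(f"{a:>8} vs {b:<8} r = {r:+.2f}")
```

In a table like this, a large correlation between an independent variable and the price suggests influence, while a large correlation between two independent variables is the warning sign for co-linearity.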

7.2. Consider the data in C07 Electricity.xls [.rda], which contains observations of total monthly electricity usage and the size of the home (in square feet).

  1. Create a scatterplot of this data. Do you expect that a simple linear model will be a good fit to this data? Why or why not? Use the features you see in the graph to explain your answer.
  2. Add a linear trendline (along with its equation) to the graph. What is the best-fit simple linear model for predicting monthly electricity usage as a function of home size? What do the slope and y-intercept mean? Do these numbers make sense? Why or why not?
  3. Use the model to predict the electricity usage for the following two homes: Home #1 is 2050 square feet. Home #2 is 3200 square feet.
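
The trendline in 7.2 is an ordinary least-squares line, and the prediction step is just a matter of plugging a home size into its equation. The following is a minimal sketch under stated assumptions: the data values are invented for illustration (they are not from C07 Electricity), the usage units are assumed, and the function names `fit_line` and `predict` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Invented values for illustration only, NOT the contents of C07 Electricity.
size_sqft = [1200, 1600, 2000, 2400, 2800, 3400]
usage = [820, 1010, 1190, 1430, 1580, 1900]   # monthly usage, units assumed

slope, intercept = fit_line(size_sqft, usage)
print(f"usage = {slope:.3f} * size + {intercept:.1f}")

# The two home sizes asked about in part 3.
for home_size in (2050, 3200):
    print(home_size, "sq ft ->", round(predict(slope, intercept, home_size), 1))
```

The slope is in usage units per square foot and the intercept is in usage units, which is worth keeping in mind when judging whether the fitted numbers make sense.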

7.3. Suppose you have two different phone plans to select from when you make long distance calls. Plan #1 costs a flat rate of 7 cents for each minute (or fraction of a minute) that the call lasts. Plan #2 costs only 3 cents per minute, but has a 39 cent connection charge for all calls, no matter how long. Which calling plan would you use for a 3 minute call? Which would you use for a 45 minute call? How can you decide ahead of time which plan to use when making a call? Support all of your answers with trendlines and scatterplots. Be sure your explanation uses terms like slope and y-intercept and includes information about the units of the variables involved.
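
Each plan in 7.3 can be written as a line with cost in cents and call length in minutes, where the per-minute rate is the slope and the connection charge is the y-intercept. The sketch below sets up both cost functions so they can be tabulated or plotted. It assumes that Plan #2 also bills a fraction of a minute as a full minute, which the problem states only for Plan #1, and the function names are hypothetical.

```python
import math


def plan_cost(minutes, cents_per_minute, connection_fee_cents=0):
    """Cost in cents of one call, billing any fraction of a minute as a full minute."""
    billed_minutes = math.ceil(minutes)
    return connection_fee_cents + cents_per_minute * billed_minutes


def plan1(minutes):
    """Plan #1: 7 cents per minute, no connection charge."""
    return plan_cost(minutes, 7)


def plan2(minutes):
    """Plan #2: 3 cents per minute plus a 39 cent connection charge."""
    return plan_cost(minutes, 3, 39)


def break_even_minutes(rate_a, fee_a, rate_b, fee_b):
    """Call length at which the two lines rate*m + fee cross (ignores rounding)."""
    return (fee_b - fee_a) / (rate_a - rate_b)


# Tabulate both plans for a range of call lengths, ready for a scatterplot.
for m in range(1, 16):
    print(f"{m:2d} min  plan 1: {plan1(m):3d} cents  plan 2: {plan2(m):3d} cents")

crossover = break_even_minutes(7, 0, 3, 39)
print("The two lines cross at", crossover, "minutes")
```

The crossover comes from setting the two line equations equal, 7m = 3m + 39, and solving for m. Calls shorter than that length favor one plan and longer calls favor the other.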