11.1 What if the Data is Not Proportional

Our first assumption when modeling data using regression is that the data is based on an underlying linear relationship. Such relationships are said to be proportional: if the x data increases by a certain amount, the y data increases by a fixed constant times that same amount. The fixed constant relating changes in the x-variable to changes in the y-variable is called the slope of the linear model.
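To make the idea of a fixed slope concrete, here is a minimal sketch in Python. The function names, the slope (2.5), and the intercept (10.0) are made-up values chosen for illustration, not taken from any data set in this chapter. It shows that, in a linear model, the same increase in x gives the same increase in y no matter where along the x-axis we start.

```python
# Hypothetical linear model y = slope * x + intercept.
# The slope and intercept values are illustrative assumptions.
def linear_model(x, slope=2.5, intercept=10.0):
    return slope * x + intercept


def change_in_y(x_start, step, slope=2.5, intercept=10.0):
    """Change in y when x moves from x_start to x_start + step."""
    y_before = linear_model(x_start, slope, intercept)
    y_after = linear_model(x_start + step, slope, intercept)
    return y_after - y_before


# Increase x by 4 units starting from three very different x-values.
changes = [change_in_y(x0, 4.0) for x0 in (0.0, 50.0, 1000.0)]
print(changes)  # each change equals slope * step = 2.5 * 4.0
```

Every entry in the list is the same, because the change in y depends only on the slope and the size of the step, not on the starting x-value.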

For many sets of data, however, the assumption of linearity is simply false. For example, the amount of electricity used in a house is related to the size of the house; larger houses are more expensive to heat or cool, so they tend to use more electricity. However, this relationship does not mean that doubling the size of the house always doubles the electricity costs. Much of the electricity use comes from lights, computers, televisions, and radios, and no matter how much bigger the house is, a family of four can only use so many of these devices at one time. So while the cost will generally increase with the size of the house, we might expect a dramatic increase when comparing a small house to a medium-sized house, but a much less dramatic increase when comparing a medium-sized house to a large house. This implies that the slope of the model relating the electricity costs (y) to the size of the house (x) would be different for large houses than for small houses. In a linear function, by contrast, the slope must be the same regardless of the x-value being considered.
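The changing slope in the electricity example can be sketched numerically. The house sizes and monthly costs below are invented for illustration only; they are not measurements. The helper `segment_slopes` is likewise a hypothetical name. It computes the slope between each pair of neighboring data points, which is one simple way to check whether a single slope could describe the data.

```python
# Hypothetical data: house size (square feet) and monthly electricity
# cost (dollars). These numbers are made up for illustration.
sizes_sqft = [1000, 2000, 3000]    # small, medium, large
monthly_cost = [600, 900, 1000]


def segment_slopes(xs, ys):
    """Slope (change in y divided by change in x) between neighbors."""
    slopes = []
    for i in range(len(xs) - 1):
        rise = ys[i + 1] - ys[i]
        run = xs[i + 1] - xs[i]
        slopes.append(rise / run)
    return slopes


slopes = segment_slopes(sizes_sqft, monthly_cost)
print(slopes)  # slope from small to medium, then from medium to large
```

With these made-up numbers, the slope from the small house to the medium house is larger than the slope from the medium house to the large house. If the relationship were linear, the two slopes would be equal.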

  11.1.1 Definitions and Formulas
  11.1.2 Worked Examples
  11.1.3 Exploration 11A: Non-proportional data