The chain rule has often proved to be a baffling concept for many people. Let's examine the chain rule through an example.
Suppose $V(r) = \frac{4}{3}\pi r^3$ represents the volume of air in a spherical balloon as a function of the radius, r, of the balloon. If the radius increases, the volume should increase according to
$$\frac{dV}{dr} = 4\pi r^2,$$
using the power rule for evaluating the derivative of a monomial combined with the linearity properties of the derivative. What if we have another function r(t) which tells us how the radius is changing with respect to time? How fast is the volume changing in time?
One way to compute this would be to plug r(t) into V(r) to get V as a function of t, then differentiate this with respect to time. This would give us the rate dV/dt at which V is changing. There is another way to compute this which generalizes to other situations. The chain rule says that
$$\frac{dV}{dt} = \frac{dV}{dr} \cdot \frac{dr}{dt}.$$
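So, if $V(r) = \frac{4}{3}\pi r^3$ and $r = r(t)$, then $\frac{dV}{dr} = 4\pi r^2$ and $\frac{dr}{dt} = r'(t)$. Using the chain rule, we get
$$\frac{dV}{dt} = 4\pi \, r(t)^2 \, r'(t).$$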
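As a rough numerical check of the balloon example, the sketch below compares the chain-rule value of dV/dt with a finite-difference estimate obtained by plugging r(t) into V(r) first. The radius function r(t) = 1 + 0.5t is an assumption made up for illustration and does not come from the text; the function names are likewise hypothetical.

```python
import math

def volume(r):
    """Volume of a sphere of radius r: V = (4/3) * pi * r^3."""
    return 4.0 / 3.0 * math.pi * r**3

def dvolume_dr(r):
    """dV/dr = 4 * pi * r^2, from the power rule."""
    return 4.0 * math.pi * r**2

# Hypothetical radius function, chosen only for illustration (not from the text).
def radius(t):
    return 1.0 + 0.5 * t

def dradius_dt(t):
    """dr/dt for the assumed r(t) = 1 + 0.5 t (constant in t)."""
    return 0.5

def dvolume_dt_chain(t):
    """Chain rule: dV/dt = dV/dr (evaluated at r(t)) times dr/dt."""
    return dvolume_dr(radius(t)) * dradius_dt(t)

def dvolume_dt_numeric(t, h=1e-6):
    """Central-difference estimate of d/dt V(r(t)): plug in, then differentiate."""
    return (volume(radius(t + h)) - volume(radius(t - h))) / (2.0 * h)

if __name__ == "__main__":
    t = 2.0
    print("chain rule:        ", dvolume_dt_chain(t))
    print("finite difference: ", dvolume_dt_numeric(t))
```

The two printed values should agree to several decimal places. Note that dV/dr is evaluated at r(t), not at t itself; mixing these up is a common source of errors.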
This process may seem a little cumbersome, but imagine you have a string of functions, each depending on the next. For example, V = V(x), x = x(y), y = y(t), t = t(s). The chain rule then says
$$\frac{dV}{ds} = \frac{dV}{dx} \cdot \frac{dx}{dy} \cdot \frac{dy}{dt} \cdot \frac{dt}{ds}.$$
Notice that the derivatives simply chain together (hence the name "chain rule"), and you can check whether it's right by imagining that you cancel the dx on top with the dx on the bottom, and so forth, in a chain of fractions. If you've used the chain rule correctly, this cancellation will result in a mathematical identity (in this case, dV/ds = dV/ds).
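The longer chain V(x), x(y), y(t), t(s) can be checked the same way. The following is a minimal sketch in which all four functions are assumptions chosen only for illustration (none of them appears in the text); it multiplies the four derivatives together and compares the product with a finite-difference estimate of the full composition.

```python
import math

# Hypothetical functions, chosen only for illustration (not from the text).
def V_of_x(x):
    return x**3

def x_of_y(y):
    return math.sin(y)

def y_of_t(t):
    return t**2

def t_of_s(s):
    return 2.0 * s + 1.0

def dV_ds_chain(s):
    """dV/ds = dV/dx * dx/dy * dy/dt * dt/ds, each factor evaluated at its own argument."""
    t = t_of_s(s)
    y = y_of_t(t)
    x = x_of_y(y)
    dV_dx = 3.0 * x**2     # derivative of x^3
    dx_dy = math.cos(y)    # derivative of sin(y)
    dy_dt = 2.0 * t        # derivative of t^2
    dt_ds = 2.0            # derivative of 2s + 1
    return dV_dx * dx_dy * dy_dt * dt_ds

def dV_ds_numeric(s, h=1e-6):
    """Central-difference estimate of the derivative of the full composition V(x(y(t(s))))."""
    def composed(u):
        return V_of_x(x_of_y(y_of_t(t_of_s(u))))
    return (composed(s + h) - composed(s - h)) / (2.0 * h)

if __name__ == "__main__":
    s = 0.25
    print("chain rule:        ", dV_ds_chain(s))
    print("finite difference: ", dV_ds_numeric(s))
```

Each factor in the product corresponds to one link in the chain, which mirrors the cancellation check described above.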