The Chain Rule

In mathematics and its applications, we frequently transform between different coordinate systems. For example, the surface in Figure 1a can be represented by the Cartesian equation

z=x^{2}-y^{2}.

However, the same surface can also be represented in polar coordinates \left(r,\,\theta \right), where x=r\cos \theta and y=r\sin \theta, by the equation

z=r^{2}\cos \,2\theta

(see Figure 1b).

Figure 1: (a) the surface z=x^{2}-y^{2} in Cartesian coordinates and (b) the equivalent surface z=r^{2}\cos \,2\theta in polar coordinates
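
The equivalence of the two equations follows from x=r\cos \theta, y=r\sin \theta and the identity \cos ^{2}\theta -\sin ^{2}\theta =\cos \,2\theta. As a quick numerical sanity check, the sketch below evaluates both forms at a few sample points. The function names and the sample points are chosen here purely for illustration and are not part of the original text.

```python
import math

def z_cartesian(x, y):
    """Height of the surface in Cartesian coordinates."""
    return x**2 - y**2

def z_polar(r, theta):
    """Height of the same surface in polar coordinates."""
    return r**2 * math.cos(2 * theta)

# Arbitrary sample points (r, theta), assumed for illustration only.
samples = [(0.5, 0.0), (1.0, 0.7), (2.0, 2.5), (3.0, -1.2)]

max_gap = 0.0
for r, theta in samples:
    # Convert the polar point to Cartesian coordinates.
    x, y = r * math.cos(theta), r * math.sin(theta)
    max_gap = max(max_gap, abs(z_cartesian(x, y) - z_polar(r, theta)))

# The largest disagreement should be at the level of rounding error.
print(max_gap)
```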

In this situation, we can consider how z behaves as x and y vary, or how it behaves as r and \theta vary. Accordingly, we can form the partial derivatives

\frac{\partial z}{\partial x},\frac{\partial z}{\partial y}

and also the partial derivatives

\frac{\partial z}{\partial r},\frac{\partial z}{\partial \theta }.

What concerns us here is how these two sets of partial derivatives are related.

The answer comes from Taylor's theorem. Consider the surface

z=f\left(x,y\right).

Suppose that we change the value of the variable r by \delta r, while holding \theta constant. In doing so, we will have to change the values of both x and y: let us say that these variables change by \delta x and \delta y respectively. Then the change in z will be equal to

\delta z=f\left(x+\delta x,y+\delta y\right)-f\left(x,y\right).

But, by Taylor's theorem,

f\left(x+\delta x,y+\delta y\right)=f\left(x,y\right)+\delta x\,f_{x}\left(x,y\right)+\delta y\,f_{y}\left(x,y\right)+\cdots ,

and therefore

\delta z\,=\,\frac{\partial z}{\partial x}\delta x\,+\,\frac{\partial z}{\partial y}\delta y+\cdots \,.
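
To see how good this first-order approximation is, the following sketch compares the exact change \delta z with the linear estimate for the example surface z=x^{2}-y^{2}. The base point and step sizes are assumptions made for illustration. For this particular surface the omitted terms are exactly \delta x^{2}-\delta y^{2}, so shrinking both steps by a factor of 10 shrinks the remainder by a factor of 100.

```python
def f(x, y):
    """Example surface z = x^2 - y^2."""
    return x**2 - y**2

def f_x(x, y):
    """Partial derivative of f with respect to x."""
    return 2 * x

def f_y(x, y):
    """Partial derivative of f with respect to y."""
    return -2 * y

# Base point, assumed for illustration only.
x0, y0 = 1.0, 2.0

def remainder(dx, dy):
    """Exact change in z minus its first-order (linear) estimate."""
    exact = f(x0 + dx, y0 + dy) - f(x0, y0)
    linear = f_x(x0, y0) * dx + f_y(x0, y0) * dy
    return exact - linear

coarse = remainder(0.1, 0.05)
fine = remainder(0.01, 0.005)
print(coarse, fine)
```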

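Dividing through by \delta r and taking the limit as \delta r tends to zero (so that \delta x and \delta y also tend to zero, and the higher-order terms vanish), with \theta held constant, we obtain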

\frac{\partial z}{\partial r}=\frac{\partial z}{\partial x}\,\frac{\partial x}{\partial r}+\frac{\partial z}{\partial y}\,\frac{\partial y}{\partial r}.

Similarly,

\frac{\partial z}{\partial \theta }=\frac{\partial z}{\partial x}\,\frac{\partial x}{\partial \theta }+\frac{\partial z}{\partial y}\,\frac{\partial y}{\partial \theta }.

Together, these form the chain rule for partial differentiation.
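
As an illustration, the sketch below applies these two formulas to the example surface z=x^{2}-y^{2} with x=r\cos \theta and y=r\sin \theta, and compares the results with central-difference estimates computed directly from z=r^{2}\cos \,2\theta. The function names, the evaluation point and the step size are assumptions made for this example. Differentiating the polar form directly gives 2r\cos \,2\theta and -2r^{2}\sin \,2\theta, which the chain-rule values should match.

```python
import math

def z_polar(r, theta):
    """The surface written directly in polar coordinates."""
    return r**2 * math.cos(2 * theta)

def chain_rule(r, theta):
    """dz/dr and dz/dtheta assembled from Cartesian partial derivatives."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    z_x, z_y = 2 * x, -2 * y                      # partials of z = x^2 - y^2
    x_r, y_r = math.cos(theta), math.sin(theta)   # partials with respect to r
    x_t, y_t = -r * math.sin(theta), r * math.cos(theta)  # with respect to theta
    return z_x * x_r + z_y * y_r, z_x * x_t + z_y * y_t

def central_difference(r, theta, step=1e-6):
    """Numerical estimates of dz/dr and dz/dtheta from the polar form."""
    dz_dr = (z_polar(r + step, theta) - z_polar(r - step, theta)) / (2 * step)
    dz_dt = (z_polar(r, theta + step) - z_polar(r, theta - step)) / (2 * step)
    return dz_dr, dz_dt

# Evaluation point, assumed for illustration only.
r0, theta0 = 2.0, 0.6
from_chain_rule = chain_rule(r0, theta0)
from_differences = central_difference(r0, theta0)
print(from_chain_rule, from_differences)
```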

The observation that

\delta z\,=\,\frac{\partial z}{\partial x}\delta x\,+\,\frac{\partial z}{\partial y}\delta y+\cdots

is in itself very useful: it enables us to estimate the error in z when we know the errors in x and y. For example, consider a closed cylinder with radius r m and height h m. It has total surface area

A=2\,\pi \,r\,\left(r+h\right).

Now, suppose we know r and h to within 5% and 1% respectively, and suppose both are measured as 3 metres. Then, to first order, the error in A is given by

\delta A\,=\,\frac{\partial A}{\partial r}\delta r\,+\,\frac{\partial A}{\partial h}\delta h=2\pi \,\left(2r+h\right)\delta r+2\pi \,r\,\delta h.

Thus

\frac{\delta A}{A}=\frac{2\pi \,\left(2r+h\right)\delta r+2\pi \,r\,\delta h}{2\,\pi \,r\,\left(r+h\right)}=\frac{2\,r+h}{r+h}\times \frac{\delta r}{r}+\frac{h}{r+h}\times \frac{\delta h}{h}.

The maximum relative error in A, as a percentage, is thus approximately

\frac{2\times 3+3}{3+3}\times 5+\frac{3}{3+3}\times 1=8.
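
The arithmetic above can be checked with a short script. The sketch below computes the first-order relative error and, for comparison, the actual relative change in A when r and h are both at the upper end of their stated tolerances. The variable names are chosen here for illustration only.

```python
import math

def area(r, h):
    """Total surface area of a closed cylinder."""
    return 2 * math.pi * r * (r + h)

r, h = 3.0, 3.0            # measured values in metres
rel_r, rel_h = 0.05, 0.01  # relative errors of 5% and 1%
dr, dh = rel_r * r, rel_h * h

# Partial derivatives of A with respect to r and h.
dA_dr = 2 * math.pi * (2 * r + h)
dA_dh = 2 * math.pi * r

# First-order estimate of the relative error in A.
first_order = (dA_dr * dr + dA_dh * dh) / area(r, h)

# Actual relative change when both measurements are too large.
actual = (area(r + dr, h + dh) - area(r, h)) / area(r, h)

print(f"{first_order * 100:.2f}% {actual * 100:.2f}%")
```

The first-order estimate reproduces the figure of 8%, while the actual change is slightly larger (about 8.15%) because the estimate neglects the second-order terms.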