3. A dispute over notation

The concepts of differential calculus that we have been building in the previous chapters of Living Geometry hopefully make sense by now. For practical use, however, they must not only make sense, but also be as easy to work with as possible. We need to develop a notation, an algebra. So, stepping back into history for a moment, let us look at how the notation of differential calculus was conceived by its two ``founders'', I. Newton and G. Leibniz.

The two thinkers long argued over which of them conceived the theory first, but today's consensus is that both developed it independently of each other. And because each worked on the theory without the other's knowledge, both devised their own notation for the same mathematical concepts. The notation we use today combines elements of both origins.

Although the two creators couldn't stand each other, today we can be grateful that there were two somewhat independent views of the same thing. The mathematical notation we use for our calculations in fact affects how we think about a problem. With good notation, things go smoothly and gracefully; if our notation contains ambiguities or we don't fully understand it, we won't get very far. A proper understanding of how mathematical notation is handled, and of how that notation corresponds to the reality it describes, is the essence of the algebra we are dealing with here. And few places show this as clearly as the so-called chain rule, the subject of this chapter.

Composite functions

The functions we looked at in Living Geometry describe a relationship between two variables. For example, we can describe how many pigs a farmer might have depending on how large his field is. In the everyday world, however, we rarely find two isolated phenomena that are connected in some way. Rather, we move through a network of interrelationships. To describe this network better, we use composite functions.

Composite functions describe a relationship in which a quantity $A$ depends on $B$, which in turn depends on $C$. For example, to see how an automaker's earnings depend on the level of taxes, you just need to figure out how car sales depend on taxes (the lower the tax, the more people will want to buy a car) and how the company's profit depends on the number of cars sold. But let's give an even clearer example: say a state or a large company wants to build charging stations for electric cars. Thanks to so-called economies of scale, the price per station gets smaller the more of them they choose to build. The question is how much money they will have available, because the economy is unstable and, of course, not all the stations will be paid for at once. So they can invest more the greater the economic growth. We can therefore ask how many stations the state will build depending on economic growth.

Suppose the number $n$ of charging stations built depends on the amount of money we have available, say $$n(p) = p^2 \,,$$

where $p$ is the amount of money in millions of dollars. Next, imagine that $p$ depends on economic growth as follows: $$ p(r) = r^3 \cdot 100 \,,$$

where $r$ is economic growth in percent. In this case, the dependence of $n$ on $r$ will be $$n(r) = n(p(r)) = (p(r))^2 = (r^3 \cdot 100)^2 = 10\,000 \, r^6\,. $$

The function $n(r)$ is now a simple function, because we substituted the explicit relationship $p(r)$ into the original composite function $n(p(r))$. You might wonder why we work with composite functions at all, and why we don't always just convert them into simple functions. The beauty is that the government can change the function $p(r)$ depending on how strongly it wants to translate economic growth into electric cars, and with this method it can then observe how everything manifests itself in $n$. The description by composite functions also corresponds more closely to the reality that our world is a complex web of relationships. And the function $p(r)$ can appear in other composite functions, so it makes sense to keep it.
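As a quick sketch (in Python, using the chapter's illustrative formulas), the composition can be written directly; the function names are our own choice:

```python
def n(p):
    """Stations we can afford with p million dollars; n(p) = p^2 from the text."""
    return p ** 2

def p(r):
    """Millions of dollars invested at r percent growth; p(r) = 100 r^3 from the text."""
    return r ** 3 * 100

def n_of_r(r):
    """The composite function n(p(r)) = (100 r^3)^2 = 10_000 r^6."""
    return n(p(r))

print(n_of_r(1))  # 10000 stations at 1 % growth
```

Keeping `n` and `p` as separate functions mirrors the point above: we can swap out `p` without touching `n`.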

Chain rule

For further analysis of composite functions, it is useful to calculate their derivatives. We can of course express a composite function explicitly as a simple function and differentiate that, but this is not always practical. The second approach is to differentiate the composite function directly. This time we will give away the formula for the derivative of a composite function outright, and only then interpret it and explain why it works.

As we wrote at the beginning of the chapter, we will present the two independent notations of Newton and Leibniz, the so-called Newton notation and Leibniz notation. To express the formula, we begin with Newton's notation, in which the following relationship can be written: $$ (n(p(r)))' = n'(p(r)) \cdot p'(r) \,.$$

Understanding the formula through Leibniz's notation may seem imprecise, but Leibniz's notation bears similarities to the rigorous mathematical proof of this relationship. Besides, in Leibniz's time, and for some time after, such rigorous proofs did not even exist, and yet Leibniz somehow understood the relationship.

All those parentheses look rather confusing, don't they? We would read the formula aloud as follows: the derivative of the function $n$, which is a function of $p$, which in turn depends on $r$, equals the derivative of $n$ with respect to $p$ multiplied by the derivative of $p$ with respect to $r$. In Newton's notation, the derivative is usually denoted by a prime. Since the prime does not say which variable we differentiate with respect to, the notation is ambiguous. We can partially remedy this by writing the variable we differentiate with respect to in parentheses. Another remedy: if we differentiate with respect to time, we write a dot over the quantity, as in $\dot v$ (the derivative of $v$ with respect to time). But the result is often just as messy.

The formula above says that the change of the composite function is the product of the changes of both functions, but it appears to have fallen from the sky with no justification behind it. To see the inner workings better, we would normally have to state the formula more precisely and mathematically and prove why (and under what conditions) it holds. In this case, however, a change of notation can also provide deeper understanding: let's just use Leibniz's instead of Newton's. Leibniz's notation differs in that we write: $$ f'(x) = \frac{\mathrm{d} f(x)}{\mathrm{d} x} \leftrightsquigarrow \frac{\Delta y}{\Delta x}\,.$$

In this formula, motivated by the definition of the derivative as a limit of a ratio, we clearly see the meaning of the derivative. And as we know well, we can expand fractions, that is, multiply by one in a suitable form. So for the derivative of the composite function we write: $$\begin{align*} \frac{\mathrm d n(p(r))}{\mathrm d r} = \frac{\mathrm d n(p(r))}{\mathrm d r} \frac{\mathrm d p}{\mathrm d p }\\ \frac{\mathrm d n(p(r))}{\mathrm d r} = \frac{\mathrm d n(p)}{\mathrm d p} \cdot \frac{\mathrm d p(r)}{\mathrm d r }\,. \end{align*}$$

In the transition to the second line, we suggestively separated the factors with $\cdot$, swapped $\mathrm d p$ and $\mathrm d r$ thanks to the commutativity of multiplication, and in places made explicit which quantity is a function of which (e.g. whether $p$ is a function of $r$, written $p(r)$, or not). We have finally arrived at a formula called the chain rule. It owes its name to the way the individual factors are linked together like links in a chain.
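We can check the chain rule numerically on the chapter's example. The sketch below (our own construction, not from the text) approximates each derivative with a symmetric finite difference and compares the derivative of the composition with the chain-rule product:

```python
def n(p):            # n(p) = p^2, from the text
    return p ** 2

def p(r):            # p(r) = 100 r^3, from the text
    return r ** 3 * 100

def diff(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

r = 1.5
direct = diff(lambda t: n(p(t)), r)     # derivative of the composition n(p(r))
chained = diff(n, p(r)) * diff(p, r)    # chain rule: n'(p(r)) * p'(r)
print(direct, chained)                  # both should be close to 60_000 * r^5
```

Within the accuracy of the finite differences, the two numbers agree, as the chain rule promises.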

Having shown how the formula for the derivative of a composite function can be arrived at using Leibniz notation, you can perhaps already see how and why the change of a composite function decomposes into the product of the two derivatives.

Differentials

You may wonder whether the above manipulation of the quantities $\mathrm d p$ was actually justified. Indeed, the meaning of infinitesimal quantities such as $\mathrm d p$ has not been defined anywhere; even in Living Geometry we defined only the derivative operator $\frac{\mathrm d}{\mathrm d x}$. Strictly speaking, it therefore makes no mathematical sense to move terms between the numerators and denominators of these fractions. Yet it turns out that the formula we obtain this way is valid, and thinking of the derivative as a fraction is convenient. So let's try to give mathematical meaning to terms like $\mathrm d x$.

The best way to conquer something is to name it. So from now on, let's call terms of the form $\mathrm d x$ differentials. We can view a differential as an infinitely small (infinitesimal) increment of the given quantity; for instance, $\mathrm d V$ can be an infinitesimal increment of volume. When we divide two differentials, we get: $$ \frac{\mathrm{d}y}{\mathrm d x} \equiv y'(x) \,.$$

We obtained the same symbol as for the derivative, so we will assume that what we got is indeed the derivative. Another reason to back up this belief: when we introduced the derivative in Living Geometry, we said it was good to think of it as $\frac{\Delta y}{\Delta x}$, and we have a similar expression here, only with $\mathrm{d}$ in place of $\Delta$.

If we trust the differentials, they give us good intuition. We see, for example, that $$ \frac{\mathrm d x}{\mathrm d y} = \left( \frac{\mathrm d y}{\mathrm d x} \right)^{-1} \,.$$

In other words, the derivative of the inverse function equals one over the derivative of the original function. This formula indeed holds under certain conditions. But our procedure cannot be considered a proof, or even a hint of one, given how hastily we have introduced the differentials.
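A small numerical check (our own sketch, with an illustrative choice of function) of the inverse-function formula: take $y(x) = x^2$ on $x > 0$, whose inverse is $x(y) = \sqrt{y}$, and verify that the two derivatives multiply to one:

```python
import math

def diff(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 2.0
y0 = x0 ** 2                               # y(x0) = 4
dy_dx = diff(lambda x: x ** 2, x0)         # y'(x0) = 2 x0 = 4
dx_dy = diff(math.sqrt, y0)                # x'(y0) = 1 / (2 sqrt(y0)) = 0.25
print(dy_dx * dx_dy)                       # close to 1
```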

Another identity the differentials suggest is: $$\begin{align*} \frac{\frac{\mathrm{d} y}{\mathrm{d}x} }{\frac{\mathrm{d} z}{\mathrm{d}x}} = \frac{\mathrm{d}y}{\mathrm{d}z} \,. \end{align*}$$

The identity ought to hold if we expand the outer fraction by $\mathrm{d}x$. Again, this is not a proof, but we can rely on the fact that it mostly works.
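We can again check this numerically (a sketch of our own, with illustrative functions): take $y = x^2$ and $z = x^3$, so that eliminating $x$ gives $y(z) = z^{2/3}$, and compare both sides of the identity:

```python
def diff(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x0 = 2.0
y = lambda x: x ** 2            # y as a function of x
z = lambda x: x ** 3            # z as a function of x
y_of_z = lambda t: t ** (2 / 3) # y as a function of z, from eliminating x

lhs = diff(y, x0) / diff(z, x0)   # (dy/dx) / (dz/dx) = 2x / (3x^2) = 1/(3) at x=2... -> 1/3
rhs = diff(y_of_z, z(x0))         # dy/dz at z = x0^3 = 8
print(lhs, rhs)                   # both close to 1/3
```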

We can also integrate differentials: $$ \mathrm d x \;\Rightarrow\; \int_0^x \mathrm d x' = x \,.$$

That is, adding an integral sign is an adjustment we could make, for example, to both sides of an equation.
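A numerical picture of this (our own sketch): summing a great many tiny increments $\mathrm d x$ from $0$ to $x$ recovers $x$ itself.

```python
# Approximate the integral of dx' from 0 to x by summing many small increments.
x = 2.0
steps = 1_000_000
dx = x / steps
total = sum(dx for _ in range(steps))
print(total)  # close to 2.0
```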

The status of differentials in differential calculus

It is possible to build the theory of differential calculus by talking about differentials from the very beginning. However, this is a rather lengthy process and is not common in textbooks. The mathematically rigorous theory is more easily constructed using concepts like the limit, which we tried to sketch in our series. The popularity of differentials stems from their abundant use in physics. Since physicists tend to be familiar with the problem they are working on before they start solving it, they may not care as much about mathematical rigor. Differentials are thus the more intuitive tool for them, and if they happen to arrive at a wrong result, they can catch it using other rules.

Living Geometry is written for everyone interested in differential calculus, physicists or not. We have therefore chosen a procedure that at least roughly follows good mathematical practice while still making intuitive sense. We speak about differentials only now, because they are a widespread device that nevertheless lacks proper mathematical grounding in our presentation. In what follows, we will give example solutions using both notations. So feel free to stick with Leibniz notation if it suits your experience and interests; Newton's notation, we believe, is the more consistent and mathematically precise way of writing solutions.

Error estimation and the derivative

Now that we have built an intuition for the chain rule, let's look at one place where it is used: the theory of measurement uncertainties (or errors). Suppose that, in our example of a state that wants to build charging stations, we want to take the analysis a bit more seriously. Then we should be prepared for our estimate not being 100% accurate. The number of charging stations for a given amount of money is fixed by the price list of the company we buy from, so $n(p)$ is a fixed function. On the other hand, there is no certainty that we can pin down the economic growth $r$ exactly. If we determine it with some error, then $p$, and with it our estimate of $n$, will be inaccurate too. But how does the uncertainty in determining $r$ propagate to the uncertainty in $n$?

The answer lies in the derivative: a derivative expresses change. The larger the derivative, the more strongly the function responds to a change in its input parameter, a change that may well be caused by the uncertainty of that input parameter.

By the uncertainty of a quantity we mean an estimate of its maximum error. We never know the error of a quantity precisely: if we did, we could use it to recover the exact value of the quantity.

So for each quantity $r$, we will denote its uncertainty by $\Delta r$. For example, for economic growth $r=5\,\%$, we might have an uncertainty of $\Delta r = 0.5\,\%$. The question is how to calculate the uncertainty $\Delta n$ from the uncertainty $\Delta r$, given that $n$ depends on $r$.

The answer is not hard: the uncertainty is calculated as $|n( p(r+\Delta r)) - n(p(r))|$. We can further estimate this expression using the derivative (approximating $n(p(r + \Delta r))$ linearly) as follows: $$\begin{align*} |n( p(r+\Delta r)) - n(p(r))| &\doteq \left| \left( n( p(r)) + \Delta r \cdot (n( p(r)))' \right) - n(p(r))\right|\\ &= | \Delta r \cdot (n( p(r)))'| = | \Delta r |\cdot |(n( p(r)))'| \,. \end{align*}$$

In general, we found that the uncertainty of a function is the derivative of the function multiplied by the uncertainty of its input parameter (argument). So let's work out the uncertainty of $n$ for $r=1\,\%$ and $\Delta r = 0.5\,\%$.

First, note that $n(p(1\,\%)) = 10\,000$. Next, let's compute the derivative of $n(p(r))$ with respect to $r$: $$\begin{align*} \frac{\mathrm d n(p(r))}{\mathrm d r} = \frac{\mathrm d n(p)}{\mathrm d p} \cdot \frac{\mathrm d p(r)}{\mathrm d r } = (2p(r))\cdot (3r^2\cdot 100) = (2 \cdot r^3\cdot 100)\cdot (3r^2\cdot 100) = 6r^5 \cdot 10\,000 \,. \end{align*}$$

To calculate the individual derivatives, we used the rule for the derivative of a power function, which can also be found in the derivative table in the appendix. For our particular case, with $r = 1$ and $\Delta r = 0.5$, we get the uncertainty $\Delta n = 0.5 \cdot 6 \cdot 10\,000 = 30\,000$, several times the estimated number of stations itself. So we see that with our data we can hardly plan for the future; the state should investigate the situation more thoroughly.
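As a sanity check, the sketch below (our own construction, using the chapter's model $n(p(r)) = 10\,000\,r^6$) compares the linear uncertainty estimate $|\Delta r| \cdot |n'|$ with the exact change of $n$. Since $\Delta r$ here is not small compared to how quickly $n$ varies, the linear estimate is only a rough guide:

```python
def n_of_r(r):
    """Composite model from the text: n(p(r)) = 10_000 r^6."""
    return 10_000 * r ** 6

def diff(f, x, h=1e-6):
    """Symmetric finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

r, dr = 1.0, 0.5
linear_estimate = abs(dr) * abs(diff(n_of_r, r))  # |Δr| * |n'(r)| = 0.5 * 60_000
exact_change = abs(n_of_r(r + dr) - n_of_r(r))    # how much n really moves
print(linear_estimate, exact_change)
```

Either way, the uncertainty swamps the estimate of $10\,000$ stations, confirming the conclusion above.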

Beyond the example above, measurement error calculations are used mainly wherever we work with measuring apparatus. It used to be chiefly the domain of physicists, who built their own experiments and needed to know how accurate they were. But today we are surrounded by various sensors that would not give meaningful values if we did not know their uncertainty. For example, a phone receiving a signal needs to know the uncertainty to tell whether the connection is reliable.

Derivative of a product

Finally, we present one more useful rule for handling derivatives. We would already like to get our hands on some interesting applications of the derivative, but that won't be possible without the proper equipment. So without further ado, here is the rule for the derivative of a product of two functions, $f(x)$ and $g(x)$. In Newton's notation, it reads: $$\begin{align*} (f(x)g(x)) ' = f'(x) g(x) + f(x) g'(x) \,. \end{align*}$$

In Leibniz's notation, we can verify the rule directly from the definition of the derivative as a limit: $$\begin{align*} \frac{\mathrm{d} (f(x) g(x))}{\mathrm{d}x} &= \lim_{h\to 0} \frac{ f(x+ h) g(x +h) - f(x) g(x)}{h} \\ &= \lim_{h\to 0} \frac{ f(x+ h) g(x +h) - f(x+h)g(x) + f(x+h)g(x) - f(x) g(x) }{h} \\ &= \lim_{h\to 0} \left( f(x+ h)\,\frac{ g(x +h) - g(x)}{h} + g(x)\,\frac{f(x+h) - f(x)}{h}\right)\\ &= f(x)\,\frac{\mathrm{d} g(x)}{\mathrm{d}x} + g(x)\,\frac{\mathrm{d} f(x)}{\mathrm{d}x} \,. \end{align*}$$

The trick is to add and subtract the term $f(x+h)g(x)$ in the numerator; in the last step we used that $f(x+h) \to f(x)$ as $h \to 0$.

Don't worry if you find the proof too complicated or have trouble keeping it in your head. Once again, we remind you that this is a sketch rather than a rigorous proof; rigorous mathematical treatment is not what we are aiming at here, nor is it needed.
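Even without following every step of the proof, we can convince ourselves numerically. This sketch (our own, with an illustrative pair of functions) compares the derivative of a product with the right-hand side of the product rule:

```python
import math

def diff(fn, x, h=1e-6):
    """Symmetric finite-difference approximation of fn'(x)."""
    return (fn(x + h) - fn(x - h)) / (2 * h)

f, g = math.sin, math.exp
x0 = 0.7
lhs = diff(lambda x: f(x) * g(x), x0)            # (f g)'(x0)
rhs = diff(f, x0) * g(x0) + f(x0) * diff(g, x0)  # f'(x0) g(x0) + f(x0) g'(x0)
print(lhs, rhs)  # the two values agree up to finite-difference error
```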
