Theory - [T17]
Introduction
The probability density function of the normal distribution is given by: $$ f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$
We will try to derive this formula using an intuitive approach rather than a fully rigorous mathematical one.
Suppose I throw a dart into a dartboard. I aim at the centre of the board $(0,0)$ but I’m not all that good with darts so the dart lands in a random position $(X,Y)$.
Assumptions
Before we start the derivation, we will make some assumptions on the way the darts are thrown.
- The density is rotationally invariant so the distribution of where my dart lands only depends on the distance of the dart to the centre.
- The random variables $X$ and $Y$ are independent: how much I miss left or right makes no difference to the distribution of how much I miss up or down.
- For square targets of equal area, the closer the square is to the origin, the higher the probability of hitting it.
- For square targets at the same distance from the origin, the larger the area of the square, the higher the probability of hitting it.
Derivation of the form $e^{-x^2}$
We are going to start by showing that $f(x)$ (the PDF of the normal distribution) has the form $e^{-x^2}$.
Let’s consider the probability of hitting a small rectangle $A$ with width $\Delta x$ and height $\Delta y$ at an arbitrary position $(x,y)$ in Cartesian coordinates, keeping in mind the assumptions listed earlier.
Let $f(x,y)$ be the joint probability density function for the dart to land at the point $(x,y)$.
Since we assumed $X$ and $Y$ are independent, we can write $f(x,y)$ as the product $f(x)f(y)$ of the densities of $X$ and $Y$. By the rotational symmetry of assumption 1, both coordinates share the same density $f$.
Therefore, for small $\Delta x$ and $\Delta y$, the probability of hitting $A$ is approximately:
$$
f(x)\Delta x f(y)\Delta y
$$
Let $g(r,\theta)$ be the probability density function expressed in polar coordinates.
We can use assumption 1 to write $g(r,\theta)$ as $g(r)$, since the density only depends on the distance $r = \sqrt{x^2+y^2}$ of the dart from the centre.
Therefore the probability of hitting the same region $A$ is also approximately:
$$
g(r)\Delta x\Delta y
$$
Since both derived equations are equal to the probability of hitting the square $A$, we can equate them to get: $$ f(x)\Delta x f(y)\Delta y = g(r)\Delta x\Delta y \\ \dArr \\ \frac{f(x)\cancel{\Delta x} f(y)\cancel{\Delta y}}{\cancel{\Delta x \Delta y}} = \frac{g(r)\cancel{\Delta x \Delta y}}{\cancel{\Delta x \Delta y}} \\ \dArr \\ f(x)f(y) = g(r) $$
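The condition $f(x)f(y) = g(r)$ is restrictive, and a quick numerical check makes that concrete. The sketch below is only an illustration: it takes the standard normal density as a candidate for $f$ (the form the derivation is heading towards) and a Laplace density as a counterexample, then measures how much $f(x)f(y)$ varies around a circle of fixed radius. The function names and the radius $1.5$ are arbitrary choices made for this example.

```python
import math

def gaussian(t):
    """Standard normal density (illustrative candidate for f)."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def laplace(t):
    """Laplace density, used here only as a counterexample."""
    return 0.5 * math.exp(-abs(t))

def spread_on_circle(f, r, n=360):
    """Return max - min of f(x) * f(y) over n points on the circle of radius r."""
    values = []
    for k in range(n):
        theta = 2 * math.pi * k / n
        x, y = r * math.cos(theta), r * math.sin(theta)
        values.append(f(x) * f(y))
    return max(values) - min(values)

# If f(x) f(y) depends only on r, the spread should be zero up to rounding.
gaussian_spread = spread_on_circle(gaussian, r=1.5)
laplace_spread = spread_on_circle(laplace, r=1.5)
print(f"Gaussian spread: {gaussian_spread:.3e}")
print(f"Laplace spread:  {laplace_spread:.3e}")
```

The Gaussian product is constant around the circle up to floating-point rounding, while the Laplace product clearly changes with $\theta$, so the Laplace density cannot satisfy both assumptions 1 and 2.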
Differentiating this equation with respect to $\theta$ and separating the variables, we obtain the following result: $$ x = C \frac{f'(x)}{f(x)} $$ where $C$ is a constant that arises from the separation of variables.
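The differentiation step can be written out in more detail. This is a sketch that assumes $f$ is differentiable and nonzero, takes $x, y \neq 0$, and uses the polar substitution $x = r\cos\theta$, $y = r\sin\theta$, so that $\frac{\partial x}{\partial\theta} = -y$ and $\frac{\partial y}{\partial\theta} = x$. Since $g(r)$ does not depend on $\theta$:

$$
\begin{aligned}
\frac{\partial}{\partial\theta}\bigl[f(x)f(y)\bigr] &= \frac{\partial}{\partial\theta}\,g(r) = 0 \\
f'(x)\,\frac{\partial x}{\partial\theta}\,f(y) + f(x)\,f'(y)\,\frac{\partial y}{\partial\theta} &= 0 \\
-y\,f'(x)\,f(y) + x\,f(x)\,f'(y) &= 0 \\
\frac{f'(x)}{x\,f(x)} &= \frac{f'(y)}{y\,f(y)}
\end{aligned}
$$

The left-hand side depends only on $x$ and the right-hand side only on $y$, so both must equal the same constant. Calling that constant $\frac{1}{C}$ gives $x = C\,\frac{f'(x)}{f(x)}$.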
Integrating both sides with respect to $x$ then gives us: $$ \frac{1}{2}x^2 = C \ln f(x) + C' $$
where $C'$ is a constant of integration.
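Solving this equation for $f(x)$ gives: $$ f(x) = e^{-C'/C}\, e^{\frac{x^2}{2C}} $$
By assumption 3 the density must decrease as we move away from the origin, so $C$ must be negative. This shows that $f(x)$ has the form $e^{-x^2}$, up to constant factors.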
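As a sanity check on the last two steps, the differential equation $x = C\,\frac{f'(x)}{f(x)}$ can be integrated numerically and compared with the closed form $f(0)\,e^{x^2/(2C)}$ that results from solving the integrated equation for $f(x)$. The sketch below uses a hand-written classical Runge-Kutta (RK4) integrator. The values $C = -1$ and $f(0) = 1$ are arbitrary illustrative choices, not values fixed by the derivation.

```python
import math

def integrate_density(C, x_end, steps=1000, f0=1.0):
    """Integrate f'(x) = x * f(x) / C from x = 0 to x_end using classical RK4."""

    def slope(x, f):
        # Rearranged form of x = C * f'(x) / f(x)
        return x * f / C

    h = x_end / steps
    x, f = 0.0, f0
    for _ in range(steps):
        k1 = slope(x, f)
        k2 = slope(x + h / 2, f + h * k1 / 2)
        k3 = slope(x + h / 2, f + h * k2 / 2)
        k4 = slope(x + h, f + h * k3)
        f += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return f

C = -1.0       # assumed negative constant (illustrative value)
x_end = 2.0
numeric = integrate_density(C, x_end)
closed_form = math.exp(x_end ** 2 / (2 * C))  # f(0) * exp(x^2 / (2C)) with f(0) = 1

print(f"numeric:     {numeric:.10f}")
print(f"closed form: {closed_form:.10f}")
```

With a negative $C$ the numerical solution decays exactly like the bell-shaped closed form. With a positive $C$ it would grow without bound and could not be a probability density.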