
Theory - [T16]

Question
Explain in your own words the “Law of large numbers” and sketch a simple proof (Markov inequality, Čebyšëv inequality, … )

Introduction

When creating a sample of a population, we collect units of observation from the given population until the sample resembles it as a whole, growing the sample larger and larger. What the Law of Large Numbers states is that as the number of units in our sample grows, the sample mean gets closer and closer to the expected value of the population (i.e. its mean). In other words, the larger the sample size, the more accurately the sample mean represents the population mean.

Demonstration

This can be demonstrated by making use of the Markov inequality and the Chebyshev inequality.

Markov inequality

The Markov inequality states that, in an empirical distribution of a non-negative random variable, the frequency of the variable taking a value greater than or equal to a positive constant is less than or equal to the mean divided by that constant. $$ \operatorname{freq}(x \geq k) \leq \frac{\bar{x}}{k} $$ This can be generalized to the theoretical case as well: for a non-negative random variable $X$ with mean $\mu$ and any constant $k > 0$, $$ P(X \geq k) \leq \frac{\mu}{k} $$
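As a quick sanity check (not a substitute for the proof), a short simulation can compare the empirical tail frequency against the Markov bound. This is a minimal sketch assuming NumPy is available; the exponential distribution with mean 1 and the thresholds $k$ are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 100_000
# Arbitrary non-negative random variable: exponential with mean 1.
x = rng.exponential(scale=1.0, size=n)
x_bar = x.mean()

for k in (1.0, 2.0, 5.0):
    freq = np.mean(x >= k)  # empirical freq(x >= k)
    bound = x_bar / k       # Markov bound: mean / k
    print(f"k={k}: freq(x >= k) = {freq:.4f} <= bound {bound:.4f}")
```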

Chebyshev inequality

The Markov inequality just defined can be utilized to prove the Chebyshev inequality, as follows. The random variable $(X - \mu)^2$ is non-negative, so the Markov inequality can be applied to it with the constant $k^2$: $$ P\big((X - \mu)^2 \geq k^2\big) \leq \frac{E\big[(X - \mu)^2\big]}{k^2} $$ We know that the expected value of the squared difference between a random variable and its mean is the variance of the random variable, therefore we can write: $$ P\big((X - \mu)^2 \geq k^2\big) \leq \frac{\sigma^2}{k^2} $$ If we then choose our constant $k$ to be a multiple of $\sigma$, $k = a\sigma$, we can write: $$ P\big((X - \mu)^2 \geq a^2\sigma^2\big) \leq \frac{\cancel{\sigma^2}}{a^2\cancel{\sigma^2}} = \frac{1}{a^2} $$

Since $(X - \mu)^2 \geq a^2\sigma^2$ holds exactly when $|X - \mu| \geq a\sigma$, we can take the square root inside the event on the left side of the inequality and write: $$ P(|X - \mu| \geq a\sigma) \leq \frac{1}{a^2} $$

This is the Chebyshev inequality, and it states that the probability of a random variable lying at least $a$ standard deviations from its mean is less than or equal to $\frac{1}{a^2}$. This is of great importance, as it shows that the probability mass in the “tails” of a distribution decreases as the distance from the mean increases.
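The same kind of simulation can check the Chebyshev bound. This is a minimal sketch assuming NumPy; the standard normal distribution is an arbitrary choice, since any distribution with finite variance satisfies the bound.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n = 100_000
# Arbitrary finite-variance distribution: standard normal.
x = rng.standard_normal(n)
mu, sigma = x.mean(), x.std()

for a in (1, 2, 3):
    freq = np.mean(np.abs(x - mu) >= a * sigma)  # empirical tail probability
    bound = 1 / a**2                             # Chebyshev bound
    print(f"a={a}: P(|X - mu| >= {a} sigma) = {freq:.4f} <= bound {bound:.4f}")
```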

Law of Large Numbers

Finally, the inequality just demonstrated can be used to prove the weak form of the Law of Large Numbers. Consider a sample of $n$ independent and identically distributed observations drawn from a population with mean $\mu$ and finite variance $\sigma^2$: the sample mean $\bar{X}$ then has expected value $\mu$ and variance $\frac{\sigma^2}{n}$. Applying the Chebyshev inequality to $\bar{X}$, with the constant chosen to be an arbitrarily small number $\epsilon > 0$, we can write: $$ P(|\bar{X} - \mu| \geq \epsilon) \leq \frac{\sigma^2}{n\epsilon^2} $$ Since the bound on the right goes to $0$ as $n$ goes to infinity, and a probability cannot be negative, we can write: $$ \lim_{n\to\infty} P(|\bar{X} - \mu| \geq \epsilon) = 0 $$

This proves that the sample mean converges in probability to the population mean as the sample size grows to infinity.
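This convergence can also be observed directly in a simulation. Below is a minimal sketch assuming NumPy, using repeated rolls of a fair six-sided die (population mean $3.5$) as an arbitrary example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
mu = 3.5  # population mean of a fair six-sided die

for n in (10, 1_000, 100_000):
    rolls = rng.integers(1, 7, size=n)  # n fair die rolls, values 1..6
    print(f"n={n}: sample mean = {rolls.mean():.4f} (population mean = {mu})")
```

As $n$ grows, the printed sample mean settles ever closer to $3.5$, as the Law of Large Numbers predicts.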