
Theory - [T4]

Question
Concept of distribution. Univariate and multivariate. Conditional and marginal distributions.

General definition

A probability distribution is a mathematical function that gives the probabilities of occurrence of the possible outcomes of an experiment. Probability distributions are used to model different types of random variables, so that decisions can be based on these models.
For instance, if $X$ is used to denote the outcome of a coin toss (“the experiment”), then the probability distribution of $X$ would take the value $0.5$ for $X$ = heads, and $0.5$ for $X$ = tails.
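A minimal sketch of this idea in Python (the "heads"/"tails" labels are just an assumed encoding of the outcomes): a discrete distribution can be written down as a mapping from outcomes to probabilities, with non-negative values that sum to one.

```python
# A minimal sketch of the fair-coin probability distribution:
# a mapping from each possible outcome to its probability.
coin_pmf = {"heads": 0.5, "tails": 0.5}

# A valid probability distribution assigns non-negative probabilities
# that sum to 1 over all possible outcomes.
assert all(p >= 0 for p in coin_pmf.values())
assert abs(sum(coin_pmf.values()) - 1.0) < 1e-12

print(coin_pmf["heads"])  # 0.5
```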

Univariate distribution

A univariate distribution is the probability distribution of a single random variable. Referring to the previous section, the coin toss is an example of a univariate distribution, since it involves a single random variable: the face the coin shows after the toss. More precisely, if the coin is assumed to be fair, the distribution is a discrete uniform distribution, since all of its finitely many possible values are equally likely to occur.
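As a small sketch (assuming NumPy is available), we can sample this univariate discrete uniform distribution many times and check that the empirical frequency of each face approaches the theoretical value of $0.5$:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# The fair coin is a discrete uniform distribution over two outcomes,
# so each outcome has theoretical probability 1/2.
outcomes = np.array(["heads", "tails"])
tosses = rng.choice(outcomes, size=100_000)

for outcome in outcomes:
    freq = np.mean(tosses == outcome)
    print(f"{outcome}: empirical {freq:.4f} vs theoretical 0.5")
```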

Multivariate distribution

A multivariate distribution is the joint probability distribution of two or more random variables; it describes the individual measurements together with the relationships among them. For each univariate distribution with one random variable, there is a more general multivariate distribution. For example, the normal distribution is univariate and its more general counterpart is the multivariate normal distribution.
A very common special case of a multivariate distribution is the bivariate distribution, in which exactly two random variables are sampled and observed.
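As an illustration (a sketch assuming NumPy, with an arbitrarily chosen mean vector and covariance matrix), a bivariate normal distribution can be sampled and its sample statistics compared with the parameters that define it:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Parameters of a bivariate (2-dimensional multivariate) normal distribution.
mean = np.array([0.0, 1.0])           # mean vector of (X1, X2)
cov = np.array([[1.0, 0.8],
                [0.8, 2.0]])          # covariance matrix: encodes the relationship between X1 and X2

samples = rng.multivariate_normal(mean, cov, size=50_000)  # shape (50000, 2)

print("sample mean:      ", samples.mean(axis=0))
print("sample covariance:\n", np.cov(samples, rowvar=False))
```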

Conditional distribution

The conditional probability distribution of $Y$ given $X$ is the probability distribution of $Y$ when $X$ is known to be a particular value; in some cases the conditional probabilities may be expressed as functions containing the unspecified value $x$ of $X$ as a parameter. In the discrete case it is given by $P(Y=y \mid X=x) = \frac{P(X=x,\, Y=y)}{P(X=x)}$. The conditional distribution contrasts with the marginal distribution of a random variable, which is its distribution without reference to the value of the other variable.
When more than two variables are involved, we can also consider the conditional distribution of a subset of them given the values of all the remaining variables; if the subset contains more than one variable, this is the conditional joint distribution of the included variables.
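A small discrete sketch (with a hypothetical joint table, assuming NumPy): given a joint probability table for $(X, Y)$, the conditional distribution of $Y$ given $X = x$ is obtained by taking the row for $x$ and renormalising it by the marginal probability $P(X=x)$.

```python
import numpy as np

# Hypothetical joint pmf of (X, Y): rows index values of X, columns index values of Y.
joint = np.array([[0.10, 0.20, 0.10],   # X = 0
                  [0.05, 0.25, 0.30]])  # X = 1
assert np.isclose(joint.sum(), 1.0)

x = 1
p_x = joint[x].sum()                 # marginal probability P(X = x)
cond_y_given_x = joint[x] / p_x      # conditional distribution P(Y = y | X = x)

print(cond_y_given_x)        # [0.0833... 0.4166... 0.5]
print(cond_y_given_x.sum())  # 1.0 -- a valid distribution in its own right
```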

Marginal distribution

The marginal distribution of a random variable, or set of random variables, is the distribution obtained by considering a component, or subset of components, of a larger random vector with a given distribution, without reference to the values of the other components. In the discrete bivariate case it is obtained by summing the joint distribution over the other variable, $P(X=x) = \sum_y P(X=x,\, Y=y)$. Thus the marginal distribution is the projection of the distribution of the random vector $X=(X_1,\dots,X_n)$ onto an axis $x_i$ or onto the subspace defined by the variables $x_{i_1},\dots,x_{i_k}$, and it is completely determined by the distribution of the original vector.
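Continuing the discrete sketch from the previous section (same hypothetical joint table, assuming NumPy), the marginal distribution of one variable is obtained by summing the joint table over the other variable, i.e. by projecting the joint distribution onto a single axis:

```python
import numpy as np

# Hypothetical joint pmf of (X, Y): rows index values of X, columns index values of Y.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.05, 0.25, 0.30]])

# Marginal of X: sum out Y along each row.
marginal_x = joint.sum(axis=1)   # P(X = x) = sum_y P(X = x, Y = y)
# Marginal of Y: sum out X along each column.
marginal_y = joint.sum(axis=0)   # P(Y = y) = sum_x P(X = x, Y = y)

print(marginal_x)  # [0.4 0.6]
print(marginal_y)  # [0.15 0.45 0.4]
```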


Sources