Contents

Theory - [T5]

Question
Illustrate the concept of conditional, joint, marginal (relative) frequency using a simple bivariate distribution

Dataset

Let’s first define our bivariate distribution.
As a case study we take a population of students, and as variables we take the grade each student obtained in a certain exam and each student’s current average.

Student Grade Average
Kiersten 28 29
Reese 26 21
Andy 30 27
Marc 26 20

Contingency table

Now that we have our dataset, let’s build the contingency table:

Grade\Average 20 21 27 29
26 1 1 0 0
28 0 0 0 1
30 0 0 1 0
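
The table above can also be built programmatically. The following is a minimal sketch in plain Python, assuming the dataset is stored as a list of tuples; the names `students`, `counts` and `table` are illustrative choices, not part of the original notes.

```python
from collections import Counter

# Dataset from the table above: (student, grade, average).
students = [
    ("Kiersten", 28, 29),
    ("Reese", 26, 21),
    ("Andy", 30, 27),
    ("Marc", 26, 20),
]

# Count how many students fall into each (grade, average) pair.
counts = Counter((grade, average) for _, grade, average in students)

# Distinct values of each variable, sorted to fix the row and column order.
grades = sorted({grade for _, grade, _ in students})
averages = sorted({average for _, _, average in students})

# One row per grade, one column per average; pairs never observed count as 0.
table = [[counts[(g, a)] for a in averages] for g in grades]

print("Grade\\Average", *averages)
for g, row in zip(grades, table):
    print(g, *row)
```

A `Counter` returns 0 for pairs that never occur, which is why the empty cells need no special handling.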

Joint frequency

The joint frequency is the number of observations in which two events occur together. For example, from the previous contingency table: $$freq(Grade=26, Average=21) = 1$$ This value can be read directly from the contingency table. Dividing it by the total number of students (4) gives the joint relative frequency, $$1/4 = 0.25$$
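
As a small illustration, the joint frequency and its relative counterpart can be looked up as follows. This is a sketch under the assumption that the data are kept as (grade, average) pairs; the helper names `joint_freq` and `joint_rel_freq` are hypothetical.

```python
from collections import Counter

# (grade, average) for each of the four students in the dataset.
pairs = [(28, 29), (26, 21), (30, 27), (26, 20)]

# Joint absolute frequencies: one count per observed (grade, average) pair.
joint = Counter(pairs)


def joint_freq(grade, average):
    """Number of students with this exact grade and average."""
    return joint[(grade, average)]


def joint_rel_freq(grade, average):
    """Joint frequency divided by the total number of students."""
    return joint[(grade, average)] / len(pairs)


print(joint_freq(26, 21))      # 1
print(joint_rel_freq(26, 21))  # 0.25
```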

Extended contingency table

Now let’s show the contingency table above with the corresponding marginal frequencies:

Grade\Average 20 21 27 29 Marginal Grade
26 1 1 0 0 2
28 0 0 0 1 1
30 0 0 1 0 1
Marginal Average 1 1 1 1 4

Marginal frequency

Given the extended contingency table, we can read off the marginal frequencies of each row and column. Each marginal frequency is the sum of all the joint frequencies in a specific row or column; for example, the marginal frequency of Grade = 26 is 1 + 1 + 0 + 0 = 2.
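
The row and column sums can be sketched in Python as follows, again assuming the data are stored as (grade, average) pairs; `marginal_grade` and `marginal_average` are illustrative names.

```python
from collections import Counter

# (grade, average) for each of the four students in the dataset.
pairs = [(28, 29), (26, 21), (30, 27), (26, 20)]

# Marginal frequency of each grade: count students while ignoring the average.
marginal_grade = Counter(grade for grade, _ in pairs)

# Marginal frequency of each average: count students while ignoring the grade.
marginal_average = Counter(average for _, average in pairs)

print(dict(marginal_grade))    # {28: 1, 26: 2, 30: 1}
print(dict(marginal_average))  # {29: 1, 21: 1, 27: 1, 20: 1}

# Both sets of marginals add up to the total number of students.
total = sum(marginal_grade.values())
print(total)  # 4
```

Counting one variable while ignoring the other is equivalent to summing a row or column of the contingency table.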

Conditional frequency

Given all the information derived so far, the conditional frequency is straightforward to define: it is the relative frequency of a certain event among only those observations in which another specific event occurs.
It is given by the following formula: $$ cond\_freq = (joint / marginal) $$ where the marginal frequency is that of the conditioning event. For example, the conditional frequency of students who got a grade of 30, given an average of 27, is: $$ cond\_freq = (1 / 1) = 1 $$
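
The formula can be checked with a short Python sketch. It assumes the same (grade, average) pairs as the dataset; the function names `cond_freq_grade_given_average` and `cond_freq_average_given_grade` are hypothetical.

```python
# (grade, average) for each of the four students in the dataset.
pairs = [(28, 29), (26, 21), (30, 27), (26, 20)]


def cond_freq_grade_given_average(grade, average):
    """Conditional frequency of a grade, given an average: joint / marginal."""
    joint = sum(1 for g, a in pairs if g == grade and a == average)
    marginal = sum(1 for _, a in pairs if a == average)
    if marginal == 0:
        raise ValueError("the conditioning event never occurs in the data")
    return joint / marginal


def cond_freq_average_given_grade(average, grade):
    """Conditional frequency of an average, given a grade: joint / marginal."""
    joint = sum(1 for g, a in pairs if g == grade and a == average)
    marginal = sum(1 for g, _ in pairs if g == grade)
    if marginal == 0:
        raise ValueError("the conditioning event never occurs in the data")
    return joint / marginal


print(cond_freq_grade_given_average(30, 27))  # 1.0
print(cond_freq_average_given_grade(21, 26))  # 0.5
```

The second call conditions on the row instead of the column: of the two students with a grade of 26, one has an average of 21, so the result is 1/2.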


Sources