Multiple Random Variables

Tanav — Sun, 06 Feb 2022 21:17:57 +0000

Topics Covered — Joint PMF and Marginal PMF of 2 or more Random Variables.
Prerequisite — Random Variables
This article covers the math behind the PMF of random variables and how it is used when more than one random variable is involved.

Joint PMF

Assume X and Y are discrete random variables defined on the same probability space.
Range of X = Tx
Range of Y = Ty
Then the joint PMF Fₓᵧ is a function from Tx × Ty to [0,1], defined by
Fₓᵧ(t₁, t₂) = P(X=t₁ and Y=t₂) for t₁ ∈ Tx and t₂ ∈ Ty
Here P(X=t₁, Y=t₂) is shorthand for P(X=t₁ and Y=t₂).
A joint PMF is usually written as a table or matrix.
Let’s take an example to understand it better.
Tossing a fair coin twice.
Let Xᵢ = 1 if i’th toss is heads and Xᵢ = 0 if i’th toss is tails
Here i= 1,2
Fₓ₁ₓ₂(0,0) = P(X₁=0 , X₂=0)=1/2 * 1/2 = 1/4
Fₓ₁ₓ₂(0,1)= P(X₁=0 , X₂=1) = 1/2 *1/2 = 1/4
The same information in tabular form: each of the four cells (X₁, X₂) ∈ {0,1} × {0,1} holds the probability 1/4.
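As a rough sketch, the coin-toss table can be built in Python. The names `p_toss` and `joint` are made up for this illustration, and the two tosses are taken to be independent, as in the products computed above.

```python
from fractions import Fraction
from itertools import product

# Each toss is fair: P(X_i = 0) = P(X_i = 1) = 1/2.
p_toss = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# The tosses are treated as independent, so each joint probability is a product.
joint = {(x1, x2): p_toss[x1] * p_toss[x2] for x1, x2 in product(p_toss, repeat=2)}

# Print the table: rows are values of X1, columns are values of X2.
print("X1\\X2   0     1")
for x1 in (0, 1):
    print(f"  {x1}     {joint[(x1, 0)]}   {joint[(x1, 1)]}")
```

Every cell comes out as 1/4, and the four cells add up to 1.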

Since all entries have equal probability, this example does not show how the entries of a joint PMF can differ, so let’s take another example.
Random number less than 100: pick a whole number from 0 to 99, each equally likely.
X is the units digit and Y is the remainder when the number is divided by 4.
For example, for 31 we get X=1 and Y=3.
Fₓᵧ(0,0) = P(X=0, Y=0) = P(Number ends in 0 and is divisible by 4)
So the set of numbers in this category is {0,20,40,60,80}
P({0,20,40,60,80}) = 5/100 = 1/20
Similarly
Fₓᵧ(4,2) = P( X=4 , Y=2) = P(Number ends in 4 and has 2 as remainder)
Set of numbers = {14,34,54,74,94}
P({14,34,54,74,94}) = 5/100=1/20
So far this looks like the coin example, where all values were equal, but that is not the case here.
Fₓᵧ(1,0) = P( X=1 , Y=0) = P(Number ends in 1 and has remainder 0)
Set of numbers = ∅
A number ending in 1 is odd, so it can never be divisible by 4.
So, P(X=1,Y=0)=0

A table depicting all values of Fₓᵧ can be formed this way, with one row per units digit and one column per remainder.
Note that the sum of one row or column gives the probability of that single value of X or Y on its own.
That is what we discuss in the next part of this article.
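Here is a minimal Python sketch that builds the whole table by counting. It assumes the number is drawn uniformly from 0 to 99; the helper name `f_xy` is invented for this example.

```python
from collections import Counter
from fractions import Fraction

# Assumption: the number is drawn uniformly from 0, 1, ..., 99.
numbers = range(100)

# Count how many numbers fall in each (units digit, remainder mod 4) cell.
counts = Counter((n % 10, n % 4) for n in numbers)

# Joint PMF: cell count divided by the number of equally likely outcomes.
joint_xy = {cell: Fraction(c, len(numbers)) for cell, c in counts.items()}

def f_xy(x, y):
    """Joint PMF value P(X = x, Y = y); cells that never occur have probability 0."""
    return joint_xy.get((x, y), Fraction(0))

# Print the full table: one row per units digit x, one column per remainder y.
print("x\\y " + "".join(f"{y:>6}" for y in range(4)))
for x in range(10):
    print(f"{x:>3} " + "".join(f"{str(f_xy(x, y)):>6}" for y in range(4)))
```

Each row has exactly two nonzero cells of 1/20: even units digits pair only with remainders 0 and 2, odd ones only with 1 and 3.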

Marginal PMF

Suppose X and Y are jointly distributed discrete random variables with joint PMF Fₓᵧ. The PMFs of the individual random variables X and Y are called marginal PMFs. It can be shown that
Fₓ(t) = ∑ Fₓᵧ(t, t′), summing over t′ ∈ Ty
Fᵧ(t′) = ∑ Fₓᵧ(t, t′), summing over t ∈ Tx

Important point: the marginal PMF of X is simply the PMF of X, as if the other variable did not exist.
In terms of the table, all this means is that adding up a whole row or column gives the probability of that single value of one variable.
For example, take the table formed in the example above.

Add all the probabilities in column 0 (remainder of the number is 0).
The probability of this is 1/4, because the remainder is equally likely to be 0, 1, 2 or 3. The same value comes from adding the entries of every row in that column; the nonzero ones are the rows X = 0, 2, 4, 6, 8.
So we can say
Fᵧ(0) = 1/20 + 1/20 + 1/20 + 1/20 + 1/20 = 1/4

Similarly for rows:
The probability of the units digit being 5 is 1/10, and it equals the sum of the entries of every column in that row. The only nonzero entries are at Y = 1 and Y = 3.
Fₓ(5) = 1/20 + 1/20 = 1/10
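The row and column sums can be checked with a short Python sketch. It rebuilds the joint PMF under the same assumption (a number drawn uniformly from 0 to 99); `marginal_x` and `marginal_y` are names chosen just for this illustration.

```python
from collections import defaultdict
from fractions import Fraction

# Joint PMF of (units digit, remainder mod 4) for a number drawn uniformly from 0..99.
joint_xy = defaultdict(Fraction)
for n in range(100):
    joint_xy[(n % 10, n % 4)] += Fraction(1, 100)

# Marginal of X: for each x, add up the row across all y.
marginal_x = defaultdict(Fraction)
# Marginal of Y: for each y, add up the column across all x.
marginal_y = defaultdict(Fraction)
for (x, y), p in joint_xy.items():
    marginal_x[x] += p
    marginal_y[y] += p

print(marginal_y[0])  # column 0 -> 1/4
print(marginal_x[5])  # row 5 -> 1/10
```

Both marginals add up to 1 on their own, as any PMF must.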

Joint PMF of more than two discrete random variables

Suppose X₁, X₂, . . . , Xₙ are discrete random variables defined on the same probability space. Let the range of Xᵢ be Tₓᵢ. The joint PMF of X₁, X₂, . . . , Xₙ, denoted by Fₓ₁,ₓ₂,…,ₓₙ, is a function from Tₓ₁ × Tₓ₂ × . . . × Tₓₙ to [0, 1] defined as
Fₓ₁,ₓ₂,…,ₓₙ(t₁, t₂, . . . , tₙ) = P(X₁ = t₁, X₂ = t₂, . . . , Xₙ = tₙ); tᵢ ∈ Tₓᵢ
This is exactly like using the “and” operator to combine the events Xᵢ = tᵢ for all the variables and taking the probability of the result.

Marginal PMF in case of more than two discrete random variables

Suppose X₁, X₂, . . . , Xₙ are jointly distributed discrete random variables with joint PMF Fₓ₁,ₓ₂,…,ₓₙ. The PMFs of the individual random variables X₁, X₂, . . . , Xₙ are called marginal PMFs. For X₁, for example, it can be shown that
Fₓ₁(t₁) = ∑ Fₓ₁,ₓ₂,…,ₓₙ(t₁, t₂, . . . , tₙ), summing over all t₂ ∈ Tₓ₂, . . . , tₙ ∈ Tₓₙ

In simple terms: take the joint PMF and sum it over all values of every variable except the one whose marginal PMF is being found.
For example:
Let’s take 3 random variables X₁, X₂ and X₃.
To find the marginal PMF of X₁, sum the joint PMF over all values of X₂ and X₃:
Fₓ₁(t₁) = ∑ Fₓ₁,ₓ₂,ₓ₃(t₁, t₂, t₃), summing over t₂ ∈ Tₓ₂ and t₃ ∈ Tₓ₃
The same can be done for a group of variables at once; this is called marginalisation.

The rule simplifies to: sum over everything that is not needed.
Suppose X₁, X₂, X₃ and X₄ are random variables with a joint PMF.
To find the marginal PMF of X₁ and X₃ together, sum the joint PMF over all values of X₂ and X₄:
Fₓ₁,ₓ₃(t₁, t₃) = ∑ Fₓ₁,ₓ₂,ₓ₃,ₓ₄(t₁, t₂, t₃, t₄), summing over t₂ ∈ Tₓ₂ and t₄ ∈ Tₓ₄
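“Sum everything that is not needed” can be sketched as a small Python helper. The function `marginalise` and the four-variable example are hypothetical, made up for illustration: a number n is drawn uniformly from 0 to 99, with X₁ = units digit, X₂ = tens digit, X₃ = n mod 4 and X₄ = n mod 3.

```python
from collections import defaultdict
from fractions import Fraction

def marginalise(joint, keep):
    """Sum a joint PMF over every variable whose position is not in `keep`.

    `joint` maps tuples (t1, ..., tn) to probabilities; `keep` lists the
    0-based positions of the variables to retain.
    """
    result = defaultdict(Fraction)
    for values, p in joint.items():
        result[tuple(values[i] for i in keep)] += p
    return dict(result)

# Hypothetical example: n is uniform on 0..99 and
# X1 = units digit, X2 = tens digit, X3 = n mod 4, X4 = n mod 3.
joint = defaultdict(Fraction)
for n in range(100):
    joint[(n % 10, n // 10, n % 4, n % 3)] += Fraction(1, 100)

# Marginal PMF of (X1, X3): sum out X2 and X4 (positions 1 and 3).
f_x1_x3 = marginalise(joint, keep=(0, 2))
# Marginal PMF of X1 alone: sum out X2, X3 and X4.
f_x1 = marginalise(joint, keep=(0,))

print(f_x1_x3[(0, 0)])  # 1/20
print(f_x1[(5,)])       # 1/10
```

Keeping X₁ and X₃ gives back the units digit and remainder table from the two-variable example, which is a handy sanity check.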

Stay Tuned

Statistics In Data Science

Tanav — Sun, 06 Feb 2022 20:36:37 +0000

This will be a series of articles dealing with statistical concepts from Random variables to various distributions to likelihood estimation and hypothesis testing.
Prerequisite- Basic Probability.

Random Variables

The formal definition of a random variable is “a function whose domain is the sample space of an experiment and whose range is the real numbers”. This just means that a random variable assigns a number to each outcome of an experiment.

Simplest possible example - Coin Toss :
Sample Space = {Heads , Tails}
So random variable X can be written as X(Heads)=1 , X(Tails)=0.
We can assign any numbers we like, but usually only meaningful assignments are considered.

Throw a die -
Sample Space = {1,2,3,4,5,6}
X is defined as X(1)= x₁, X(2)= x₂ ….. X(6)=x₆

What are the values of x₁, x₂, x₃ , x₄, x₅, x₆?

For a one-to-one function, these xᵢ’s are all distinct and are essentially a relabelling of the sample space. But they need not be distinct; several outcomes can share the same value.

Example - In the same die roll, suppose we only care about even vs odd numbers.
Random variable: E(2)=E(4)=E(6)=1 and E(1)=E(3)=E(5)=0
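Since a random variable is just a function on the sample space, it can be sketched as an ordinary Python function. Here `X` is assumed to map each face to its own number (one possible choice of the xᵢ’s), and `E` is the even/odd variable from the example.

```python
# Sample space of one die roll.
sample_space = [1, 2, 3, 4, 5, 6]

def X(outcome):
    """One-to-one random variable: the face value itself (an assumed choice)."""
    return outcome

def E(outcome):
    """Many-to-one random variable: 1 for an even face, 0 for an odd face."""
    return 1 if outcome % 2 == 0 else 0

print([X(s) for s in sample_space])  # [1, 2, 3, 4, 5, 6]
print([E(s) for s in sample_space])  # [0, 1, 0, 1, 0, 1]
```

X takes six distinct values, while E takes only two, so different outcomes share a value.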

Random Variables and Events

If X is a random variable, then (X < x) is an event for every real number x.
Similarly, (X > x), (X = x), (X ≤ x) and (X ≥ x) are all events.

In the die example-
S={1,2,3,4,5,6}
X(1)= 1 , X(2) = 2 , X(3) = 3 , X(4) = 4 , X(5)= 5 and X(6) =6
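The event (X<4) is {1,2,3}.
The event {2,5} can also be written as (X = 2) ∪ (X = 5). The intersection (X = 2) ⋂ (X = 5) would be empty, since X cannot take both values at once.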
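Events defined through a random variable are just subsets of the sample space, which a short Python sketch makes concrete. The helper `event` is a name invented for this example, and X is the identity map from the die example.

```python
sample_space = {1, 2, 3, 4, 5, 6}

def X(s):
    """Random variable from the die example: X(s) = s."""
    return s

def event(predicate):
    """Return the set of outcomes s for which predicate(X(s)) is true."""
    return {s for s in sample_space if predicate(X(s))}

less_than_4 = event(lambda v: v < 4)
two_or_five = event(lambda v: v == 2) | event(lambda v: v == 5)   # union
two_and_five = event(lambda v: v == 2) & event(lambda v: v == 5)  # intersection

print(sorted(less_than_4))   # [1, 2, 3]
print(sorted(two_or_five))   # [2, 5]
print(sorted(two_and_five))  # []
```

The union gives {2,5}, while the intersection is empty because X takes only one value per outcome.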

Why use Random Variables?

Instead of assigning probabilities to every detailed outcome, we assign them to events defined through random variables.
This reduces the detail in the outcome to something simpler.
With limited data, only the random variables can be studied.

Types of Random Variables -

Discrete Random Variable
Continuous Random Variable

Discrete Random Variable

If the range of a random variable is a discrete set, it is called a discrete random variable. It is usually described by its Probability Mass Function (PMF).
Let X be a discrete random variable with range T. Its PMF
Fₓ : T -> [0,1] is defined as Fₓ(t) = P(X=t) for t ∈ T
(X=t) is an event
P(X=t) is the probability of X taking the value t.
For example-
A fair coin is tossed 3 times
Sample Space = {HHH , HHT , HTH , HTT , THH , THT , TTH , TTT}
X= Number of heads

X ∈ {0,1,2,3}
Fₓ(0) = 1/8
Fₓ(1)=3/8
Fₓ(2)=3/8
Fₓ(3)=1/8
This shows the PMF of one discrete random variable. The main thing to remember about a PMF is that the **sum of its values over the range is always 1**, and each value lies between 0 and 1:
0 ≤ Fₓ(t) ≤ 1
∑ Fₓ(t) = 1, summing over t ∈ T
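The three-toss PMF can be reproduced by listing the sample space and counting. This is a minimal Python sketch; the names `sample_space`, `heads_count` and `pmf` are chosen just for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of three fair tosses, e.g. ('H', 'T', 'H').
sample_space = list(product("HT", repeat=3))

# X = number of heads; count how many outcomes give each value of X.
heads_count = Counter(outcome.count("H") for outcome in sample_space)

# PMF: number of outcomes with X = t, divided by the total number of outcomes.
pmf = {t: Fraction(c, len(sample_space)) for t, c in sorted(heads_count.items())}

for t, p in pmf.items():
    print(f"F_X({t}) = {p}")
print("total =", sum(pmf.values()))
```

The printed values are 1/8, 3/8, 3/8 and 1/8, and the total is 1.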

Continuous Random Variable -

A random variable X with CDF Fₓ is said to be a continuous random variable if Fₓ is continuous at every x, that is, the CDF has no jumps or steps.
CDF = Cumulative Distribution Function
This is a very detailed topic so I will be writing another article discussing this in detail.

DEV Community: Tanav