Properties of Joint Distributions
Chris Piech
CS109, Stanford University
Joint Probability Table

                   Dining Hall   Eating Club   Cafe   Self-made   Marginal (Year)
Freshman              0.02          0.00       0.02      0.00          0.04
Sophomore             0.51          0.15       0.03      0.03          0.69
Junior                0.08          0.02       0.02      0.02          0.13
Senior                0.02          0.05       0.01      0.01          0.08
5+                    0.02          0.01       0.05      0.05          0.07
Marginal (Lunch)      0.65          0.23       0.13      0.11
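A marginal is just a row or column sum of the joint table. A minimal sketch in Python, using the table values above (the variable names are my own):

```python
# Marginalizing a joint PMF table: sum the joint probabilities over the
# variable you want to eliminate. Values are the Year x Lunch table above.
joint = {
    "Freshman":  {"Dining Hall": 0.02, "Eating Club": 0.00, "Cafe": 0.02, "Self-made": 0.00},
    "Sophomore": {"Dining Hall": 0.51, "Eating Club": 0.15, "Cafe": 0.03, "Self-made": 0.03},
    "Junior":    {"Dining Hall": 0.08, "Eating Club": 0.02, "Cafe": 0.02, "Self-made": 0.02},
    "Senior":    {"Dining Hall": 0.02, "Eating Club": 0.05, "Cafe": 0.01, "Self-made": 0.01},
    "5+":        {"Dining Hall": 0.02, "Eating Club": 0.01, "Cafe": 0.05, "Self-made": 0.05},
}

# P(Year = y) = sum over lunch options l of P(Year = y, Lunch = l)
marginal_year = {y: sum(row.values()) for y, row in joint.items()}

# P(Lunch = l) = sum over years y of P(Year = y, Lunch = l)
lunches = joint["Freshman"].keys()
marginal_lunch = {l: sum(joint[y][l] for y in joint) for l in lunches}
```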
Continuous Joint Random Variables

[Figure: joint distribution of samples over 0 ≤ x ≤ 900, 0 ≤ y ≤ 900]

Joint Probability Density Function

[Figure: joint density surface over 0 ≤ x ≤ 900, 0 ≤ y ≤ 900]
F_{X,Y}(a, b) = \int_{-\infty}^{a} \int_{-\infty}^{b} f_{X,Y}(x, y) \, dy \, dx

f_{X,Y}(a, b) = \frac{\partial^2}{\partial a \, \partial b} F_{X,Y}(a, b)
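The two identities above are inverses of each other: differentiating the joint CDF recovers the density. A small numeric sanity check, assuming (as an illustration, not from the slides) that X and Y are independent standard normals so both F and f have closed forms:

```python
from math import erf, exp, pi, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def F(a, b):
    # Assumed example joint CDF: X, Y independent standard normals
    return phi(a) * phi(b)

def f(a, b):
    # Corresponding joint density
    return exp(-(a * a + b * b) / 2.0) / (2.0 * pi)

# Mixed second partial of F, by central finite differences, approximates f
h = 1e-4
a, b = 0.5, -0.25
mixed = (F(a + h, b + h) - F(a + h, b - h) - F(a - h, b + h) + F(a - h, b - h)) / (4 * h * h)
```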
Joint CDF

F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y)

F_{X,Y}(x, y) → 1 as x → +∞ and y → +∞, and → 0 as x → −∞ or y → −∞.

[Surface plot of a joint CDF, by Academo]
Probabilities from Joint CDF

P(a_1 < X ≤ a_2, b_1 < Y ≤ b_2) = F_{X,Y}(a_2, b_2)
                                  − F_{X,Y}(a_1, b_2)
                                  − F_{X,Y}(a_2, b_1)
                                  + F_{X,Y}(a_1, b_1)

[Figure: the rectangle (a_1, a_2] × (b_1, b_2] in the x–y plane]
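The inclusion-exclusion identity above can be sketched in code. The particular joint CDF used here, a product of standard normal CDFs, is an assumed example for illustration, not from the slides:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def joint_cdf(x, y):
    # Assumed example: X, Y independent standard normals, so the
    # joint CDF factors as Phi(x) * Phi(y)
    return phi(x) * phi(y)

def rect_prob(F, a1, a2, b1, b2):
    """P(a1 < X <= a2, b1 < Y <= b2) by inclusion-exclusion on the joint CDF."""
    return F(a2, b2) - F(a1, b2) - F(a2, b1) + F(a1, b1)

p = rect_prob(joint_cdf, -1, 1, -1, 1)  # both coordinates within one std dev
```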
Probability for Instagram!

Gaussian Blur: in image processing, a Gaussian blur is the result of blurring an image by a Gaussian function. It is a widely used effect in graphics software, typically to reduce image noise.

Joint PDF:

f_{X,Y}(x, y) = \frac{1}{2\pi \cdot 3^2} e^{-\frac{x^2 + y^2}{2 \cdot 3^2}}

Joint CDF:

F_{X,Y}(x, y) = \Phi\left(\frac{x}{3}\right) \cdot \Phi\left(\frac{y}{3}\right)

Questions: P(X ≤ 5)? P(Y = 6)? P(5 ≤ Z ≤ 10)?
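A sketch of the first two questions, assuming X, Y ~ N(0, 3²) independent as in the joint CDF above (`phi` is my own name for the standard normal CDF):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# X, Y ~ N(0, 3^2) independent, so F_{X,Y}(x, y) = phi(x/3) * phi(y/3)
p_x_le_5 = phi(5 / 3)  # P(X <= 5)
p_y_eq_6 = 0.0         # P(Y = 6): exactly 0 for any continuous random variable
```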
Is Year Independent of Lunch?

No: the joint table does not factor into the product of its marginals. For example, P(Freshman, Eating Club) = 0.00, but the product of the marginals is 0.04 · 0.23 ≈ 0.009 ≠ 0.
X ~ Poi(λ)
§ λ is the "rate"
§ X takes on values 0, 1, 2, …
§ has distribution (PMF): P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}
Web Server Requests

• Let N = # of requests to web server/day
§ Suppose N ~ Poi(λ)
§ Each request comes from a human (probability = p) or from a "bot" (probability = 1 − p), independently
§ X = # requests from humans/day, where (X | N) ~ Bin(N, p)
§ Y = # requests from bots/day, where (Y | N) ~ Bin(N, 1 − p)

Condition on the total number of requests:

P(X = i, Y = j) = P(X = i, Y = j | X + Y = i + j) P(X + Y = i + j)
                + P(X = i, Y = j | X + Y ≠ i + j) P(X + Y ≠ i + j)

The first factor in the first term is the probability of i human requests and j bot requests given that we got i + j requests; the second factor is the probability that the number of requests in a day was i + j.

§ Note: P(X = i, Y = j | X + Y ≠ i + j) = 0, so the second term vanishes.

The two remaining pieces are binomial and Poisson:

P(X = i, Y = j | X + Y = i + j) = \binom{i+j}{i} p^i (1-p)^j

P(X + Y = i + j) = e^{-\lambda} \frac{\lambda^{i+j}}{(i+j)!}

Multiplying and simplifying:

P(X = i, Y = j) = \binom{i+j}{i} p^i (1-p)^j \, e^{-\lambda} \frac{\lambda^{i+j}}{(i+j)!}
                = \frac{(i+j)!}{i! \, j!} p^i (1-p)^j \, e^{-\lambda} \frac{\lambda^{i+j}}{(i+j)!}
                = e^{-\lambda p} \frac{(\lambda p)^i}{i!} \cdot e^{-\lambda(1-p)} \frac{(\lambda(1-p))^j}{j!}

So X ~ Poi(λp), Y ~ Poi(λ(1 − p)), and X and Y are independent.
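This "Poisson splitting" result can be checked numerically: the conditioning route and the derived product-of-Poissons factorization should agree for every (i, j). The values of λ and p below are arbitrary assumptions for the check:

```python
from math import comb, exp, factorial

lam, p = 6.0, 0.3  # assumed example values, not from the slides

def poisson_pmf(mu, k):
    return exp(-mu) * mu**k / factorial(k)

def joint_via_conditioning(i, j):
    # P(X=i, Y=j) = P(i humans, j bots | i+j requests) * P(N = i+j)
    n = i + j
    return comb(n, i) * p**i * (1 - p) ** j * poisson_pmf(lam, n)

def joint_via_splitting(i, j):
    # The derived factorization: Poi(lam*p) times Poi(lam*(1-p))
    return poisson_pmf(lam * p, i) * poisson_pmf(lam * (1 - p), j)

ok = all(
    abs(joint_via_conditioning(i, j) - joint_via_splitting(i, j)) < 1e-12
    for i in range(10) for j in range(10)
)
```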
• Equivalently:
F_{X,Y}(a, b) = F_X(a) F_Y(b) for all a, b
f_{X,Y}(a, b) = f_X(a) f_Y(b) for all a, b
• More generally, the joint density factors separately:
f_{X,Y}(x, y) = h(x) g(y), where −∞ < x, y < ∞
Is the Blur Distribution Independent?

Joint PDF:

f_{X,Y}(x, y) = \frac{1}{2\pi \cdot 3^2} e^{-\frac{x^2 + y^2}{2 \cdot 3^2}}

Joint CDF:

F_{X,Y}(x, y) = \Phi\left(\frac{x}{3}\right) \cdot \Phi\left(\frac{y}{3}\right)

Yes: the exponent splits, e^{-(x^2+y^2)/18} = e^{-x^2/18} \cdot e^{-y^2/18}, so the density factors as h(x) · g(y) and X and Y are independent.
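A numeric spot-check of that factorization: the joint blur density should equal the product of its N(0, 3²) marginal densities at every point (function names here are my own):

```python
from math import exp, pi, sqrt

SIGMA = 3.0

def blur_pdf(x, y):
    # Joint PDF of the Gaussian blur: circular normal with sigma = 3
    return exp(-(x * x + y * y) / (2 * SIGMA**2)) / (2 * pi * SIGMA**2)

def normal_pdf(z):
    # Marginal density: N(0, 3^2)
    return exp(-z * z / (2 * SIGMA**2)) / (SIGMA * sqrt(2 * pi))

# Independence: the joint density equals the product of marginals everywhere
ok = all(
    abs(blur_pdf(x, y) - normal_pdf(x) * normal_pdf(y)) < 1e-12
    for x in (-4.0, -1.0, 0.0, 2.5) for y in (-3.0, 0.0, 1.5)
)
```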
Discrete Conditional Distributions

• Recall that for events E and F:
P(E | F) = \frac{P(EF)}{P(F)}, where P(F) > 0
• Now let X and Y be discrete random variables.
§ Conditional PMF of X given Y (where p_Y(y) > 0):
p_{X|Y}(x | y) = P(X = x | Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)} = \frac{p_{X,Y}(x, y)}{p_Y(y)}
§ Conditional CDF of X given Y (where p_Y(y) > 0):
F_{X|Y}(a | y) = P(X ≤ a | Y = y) = \frac{P(X ≤ a, Y = y)}{P(Y = y)} = \frac{\sum_{x \le a} p_{X,Y}(x, y)}{p_Y(y)} = \sum_{x \le a} p_{X|Y}(x | y)
Lunch | Year

Conditioning in the joint probability table above: the conditional PMF of Lunch given Year renormalizes each row by its Year marginal, p_{Lunch|Year}(l | y) = p_{Year,Lunch}(y, l) / p_{Year}(y).
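A minimal sketch of that row renormalization, using the Junior row of the table above (only one row is included for brevity):

```python
# Conditional PMF from the joint table: p(Lunch | Year = y) is the Year-y row
# divided by its marginal. Values are the Junior row of the table above.
joint = {
    "Junior": {"Dining Hall": 0.08, "Eating Club": 0.02, "Cafe": 0.02, "Self-made": 0.02},
}

def lunch_given_year(year):
    row = joint[year]
    p_year = sum(row.values())  # marginal p_Year(year), by summing the row
    return {lunch: p / p_year for lunch, p in row.items()}

cond = lunch_given_year("Junior")  # a valid PMF: the entries sum to 1
```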
And It Applies to Books Too

Start from the discrete definition of conditioning:

P(X = x | Y = y) = \frac{P(X = x, Y = y)}{P(Y = y)}

For continuous X and Y, apply it to tiny intervals of widths ε_x and ε_y:

f_{X|Y}(x | y) \cdot \epsilon_x = \frac{f_{X,Y}(x, y) \cdot \epsilon_x \cdot \epsilon_y}{f_Y(y) \cdot \epsilon_y}

The ε's cancel, leaving the conditional density:

f_{X|Y}(x | y) = \frac{f_{X,Y}(x, y)}{f_Y(y)}
Mixing Discrete and Continuous

Let X be a continuous random variable and N a discrete random variable. Bayes' theorem still applies; use the same ε trick to trade probabilities for densities:

P(X = x | N = n) = \frac{P(N = n | X = x) P(X = x)}{P(N = n)}

f_{X|N}(x | n) \cdot \epsilon_x = \frac{P_{N|X}(n | x) \cdot f_X(x) \cdot \epsilon_x}{P_N(n)}

f_{X|N}(x | n) = \frac{P_{N|X}(n | x) \, f_X(x)}{P_N(n)}
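A sketch of this discrete-observation/continuous-prior pattern, using an assumed example that is not from the slides: a coin's unknown bias X ~ Uniform(0, 1) and an observed head count N | X = x ~ Bin(10, x), with the posterior density f_{X|N} computed on a grid:

```python
from math import comb

# Assumed example: X ~ Uniform(0, 1) is a coin's unknown bias (continuous),
# N | X = x ~ Bin(10, x) is the observed number of heads (discrete).
flips, heads = 10, 7

def likelihood(x):
    # P(N = heads | X = x), a binomial PMF
    return comb(flips, heads) * x**heads * (1 - x) ** (flips - heads)

# Discretize (0, 1); the Uniform prior density f_X(x) = 1 everywhere.
eps = 0.001
grid = [i * eps for i in range(1, 1000)]

# Bayes numerator P_{N|X}(n|x) * f_X(x); denominator P_N(n) normalizes.
unnorm = [likelihood(x) * 1.0 for x in grid]
p_n = sum(u * eps for u in unnorm)             # P(N = heads), by integration
posterior = [u / p_n for u in unnorm]          # f_{X|N}(x | heads) on the grid

map_x = grid[posterior.index(max(posterior))]  # posterior mode
```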
All the Bayes Belong to Us

M, N are discrete; X, Y are continuous.

Discrete posterior from a discrete observation:

p_{M|N}(m | n) = \frac{P_{N|M}(n | m) \, p_M(m)}{p_N(n)}

Continuous posterior from a discrete observation:

f_{X|N}(x | n) \cdot \epsilon_x = \frac{P_{N|X}(n | x) \, f_X(x) \cdot \epsilon_x}{P_N(n)}
\quad\Rightarrow\quad
f_{X|N}(x | n) = \frac{P_{N|X}(n | x) \, f_X(x)}{P_N(n)}
Tracking in 2D Space: Prior

With μ_x = 3, μ_y = 3, σ = 2:

f_{X,Y}(x, y) = K \cdot e^{-\frac{(x-3)^2 + (y-3)^2}{8}}

Tracking in 2D Space: Observation!

A noisy distance measurement D of the object from the origin:

D \mid X, Y \sim N(\mu = \sqrt{x^2 + y^2}, \sigma^2 = 1)

f_{D|X,Y}(d | x, y) = K \cdot e^{-\frac{(d - \sqrt{x^2 + y^2})^2}{2}}

What is your new belief for the location of the object being tracked? The joint probability density function can be expressed with a constant (K denotes a generic normalizing constant, not the same one in each equation).

Tracking in 2D Space: Posterior

After observing D = 4:

f_{X,Y|D}(x, y | 4) = K \cdot e^{-\left[\frac{(4 - \sqrt{x^2 + y^2})^2}{2} + \frac{(x-3)^2 + (y-3)^2}{8}\right]}
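A grid approximation of this posterior recovers the constant K by normalization. This is a sketch under the densities above; the grid bounds and step size are arbitrary choices of mine:

```python
from math import exp, sqrt

# Grid-based posterior for the 2D tracking example: prior centered at (3, 3)
# with sigma = 2, noisy distance observation D = 4 with sigma = 1.
def unnorm_posterior(x, y, d=4.0):
    prior = exp(-((x - 3) ** 2 + (y - 3) ** 2) / 8.0)
    likelihood = exp(-((d - sqrt(x * x + y * y)) ** 2) / 2.0)
    return prior * likelihood  # K is recovered by normalizing over the grid

step = 0.05
pts = [i * step for i in range(-100, 201)]  # grid over [-5, 10] in each axis
weights = {(x, y): unnorm_posterior(x, y) for x in pts for y in pts}
total = sum(weights.values())
posterior = {xy: w / total for xy, w in weights.items()}

# Posterior mode: pulled from the prior mean (3, 3) toward the circle of
# radius 4 around the origin implied by the observation
map_xy = max(posterior, key=posterior.get)
```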