L28: Bayesian Linear Regression Linchpin Sampler
The instructor of this course owns the copyright of all the course materials. This lecture
material was distributed only to the students attending the course MTH511a: “Statistical
Simulation and Data Analysis” of IIT Kanpur, and should not be distributed in print or
through electronic media without the consent of the instructor. Students can make their own
copies of the course materials for their use.
In this lecture, we will focus only on the Bayesian linear regression model, and discuss accept-reject algorithms that sample from the posterior.
Recall the posterior density of interest:

$$
\pi(\beta, \sigma^2 \mid y) \propto \left(\sigma^2\right)^{-n/2 - p/2 - a - 1} \exp\left\{-\frac{(y - X\beta)^T (y - X\beta)}{2\sigma^2} - \frac{\beta^T \beta}{2\sigma^2} - \frac{b}{\sigma^2}\right\}\,.
$$

More generally, suppose we want samples from a joint density π(x, y).
Then, we can split the joint distribution as the product of conditional times marginal.
That is
π(x, y) = π(x|y) π(y) .
If X|Y is known in closed form and we can sample from it, then we may try to obtain samples from the marginal distribution of y. This is beneficial since the dimension of y is smaller than that of (x, y), and implementing AR on a lower-dimensional problem will be much easier. So the algorithm would be
• Generate Y ∼ π(y)
• Generate X ∼ π(x | Y)
• Output (X, Y).
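As a quick illustration of these three steps, consider a hypothetical toy joint density (not from the lecture) in which the linchpin Y is Gamma and X | Y is normal; both steps then reduce to built-in R generators:

```r
# Hypothetical toy joint density (illustration only, not from the lecture):
# linchpin Y ~ Gamma(shape = 3, rate = 2), and X | Y = y ~ N(0, 1/y).
set.seed(42)
N <- 1e5
y <- rgamma(N, shape = 3, rate = 2)       # Step 1: Y ~ pi(y)
x <- rnorm(N, mean = 0, sd = 1/sqrt(y))   # Step 2: X | Y (vectorized over y)
# Step 3: the pairs (x, y) are exact draws from the joint pi(x, y)
```

No accept-reject step is needed here since both π(y) and π(x|y) are standard distributions; the same structure drives the Bayesian linear regression sampler below.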
The variable Y is called the linchpin variable with target density π(y). We can use
this quite easily in Bayesian linear regression.
Example 1 (Bayesian linear regression). Recall the posterior distribution in Bayesian
linear regression as:
$$
\pi(\beta, \sigma^2 \mid y) \propto \left(\sigma^2\right)^{-n/2 - p/2 - a - 1} \exp\left\{-\frac{(y - X\beta)^T (y - X\beta)}{2\sigma^2} - \frac{\beta^T \beta}{2\sigma^2} - \frac{b}{\sigma^2}\right\}\,.
$$
First, note that we prefer σ² to be the linchpin variable since it is univariate, while β is p-variate. So we need to find the conditional distribution of β | σ², y and the marginal distribution of σ². Let A = (XᵀX + I).
$$
\begin{aligned}
\int \pi(\beta, \sigma^2 \mid y)\, d\beta
&\propto (\sigma^2)^{-n/2 - p/2 - a - 1} \int \exp\left\{-\frac{y^T y - 2\beta^T X^T y + \beta^T X^T X \beta}{2\sigma^2} - \frac{\beta^T \beta}{2\sigma^2} - \frac{b}{\sigma^2}\right\} d\beta \\
&= (\sigma^2)^{-n/2 - p/2 - a - 1} \exp\left\{-\frac{y^T y}{2\sigma^2} - \frac{b}{\sigma^2}\right\} \int \exp\left\{-\frac{\beta^T X^T X \beta + \beta^T \beta - 2\beta^T X^T y}{2\sigma^2}\right\} d\beta \\
&= (\sigma^2)^{-n/2 - p/2 - a - 1} \exp\left\{-\frac{y^T y}{2\sigma^2} - \frac{b}{\sigma^2}\right\} \int \exp\left\{-\frac{\beta^T (X^T X + I) \beta - 2\beta^T X^T y}{2\sigma^2}\right\} d\beta \\
&= (\sigma^2)^{-n/2 - p/2 - a - 1} \exp\left\{-\frac{y^T y}{2\sigma^2} - \frac{b}{\sigma^2}\right\} \int \exp\left\{-\frac{\beta^T A \beta - 2\beta^T A A^{-1} X^T y + (A^{-1} X^T y)^T A (A^{-1} X^T y)}{2\sigma^2} + \frac{(A^{-1} X^T y)^T A (A^{-1} X^T y)}{2\sigma^2}\right\} d\beta \\
&= (\sigma^2)^{-n/2 - p/2 - a - 1} \exp\left\{-\frac{y^T y}{2\sigma^2} - \frac{b}{\sigma^2} + \frac{y^T X A^{-1} X^T y}{2\sigma^2}\right\} \int \exp\left\{-\frac{(\beta - A^{-1} X^T y)^T A (\beta - A^{-1} X^T y)}{2\sigma^2}\right\} d\beta\,.
\end{aligned}
$$
So β | σ², y is a multivariate normal distribution:

$$\beta \mid \sigma^2, y \sim N_p\!\left(A^{-1} X^T y,\; \sigma^2 A^{-1}\right),$$

and carrying out the Gaussian integral over β,
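The completing-the-square step in the derivation is easy to check numerically. The sketch below, using an arbitrary simulated design (assumed for illustration, not the cars data), verifies that βᵀAβ − 2βᵀXᵀy equals (β − A⁻¹Xᵀy)ᵀA(β − A⁻¹Xᵀy) − yᵀXA⁻¹Xᵀy:

```r
# Numerical check of the completing-the-square identity,
# using an arbitrary simulated design (assumed for illustration)
set.seed(7)
n <- 20; p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
beta <- rnorm(p)
A <- t(X) %*% X + diag(p)
m <- solve(A, t(X) %*% y)   # A^{-1} X^T y, the conditional posterior mean
lhs <- t(beta) %*% A %*% beta - 2 * t(beta) %*% t(X) %*% y
rhs <- t(beta - m) %*% A %*% (beta - m) - t(y) %*% X %*% m
# lhs and rhs agree up to floating-point error
```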
$$
\begin{aligned}
\int \pi(\beta, \sigma^2 \mid y)\, d\beta
&\propto (\sigma^2)^{-n/2 - p/2 - a - 1} \exp\left\{-\frac{y^T (I - X A^{-1} X^T) y}{2\sigma^2} - \frac{b}{\sigma^2}\right\} \cdot (2\pi\sigma^2)^{p/2} \det(A)^{-1/2} \\
&\propto (\sigma^2)^{-n/2 - a - 1} \exp\left\{-\frac{y^T (I - X A^{-1} X^T) y}{2\sigma^2} - \frac{b}{\sigma^2}\right\},
\end{aligned}
$$

which is the kernel of an inverse gamma distribution,

$$\sigma^2 \mid y \sim \text{IG}\!\left(a + \frac{n}{2},\; b + \frac{y^T (I - X A^{-1} X^T) y}{2}\right),$$
where both those densities are available in closed-form and samples can be generated
easily from them in the following way:
1. Generate σ² ∼ Inverse Gamma as indicated above.
2. Generate β | σ² ∼ Normal distribution as indicated above.
3. (β, σ²) is one draw from the posterior. Repeat for many draws, and estimate posterior means and quantiles.
We now implement the linchpin sampler for Bayesian linear regression on the cars dataset. The data setup and prior hyperparameters below are filled in to make the excerpted code runnable; a = b = 0.1 is an assumed choice.

###########################################
## Linchpin variable sampler
## for Bayesian linear regression for cars
###########################################
set.seed(1)
data(cars)
y <- cars$dist
X <- cbind(1, cars$speed)    # intercept and speed
n <- length(y); p <- ncol(X)
a <- b <- 0.1                # IG(a, b) prior hyperparameters (assumed values)
A <- t(X) %*% X + diag(p)
A.inv <- solve(A)
Ainv.sqrt <- t(chol(A.inv))  # square root of A^{-1}
rate.sig <- ( t(y) %*% (diag(1,n) - X %*% A.inv %*% t(X)) %*% y )/2 + b
N <- 1e4
sig2 <- numeric(N)
beta <- matrix(0, nrow = N, ncol = p)
for(i in 1:N)
{
  sig2[i] <- 1/rgamma(1, shape = a + n/2, rate = rate.sig)  # linchpin draw: sigma^2 | y
  beta[i,] <- A.inv %*% t(X) %*% y + Ainv.sqrt %*% rnorm(p, sd = sqrt(sig2[i]))
  # Getting beta estimates: beta | sigma^2, y is N_p(A^{-1} X^T y, sigma^2 A^{-1})
}
[Figure: estimated marginal posterior densities of σ², β1, and β2 from the linchpin sampler.]
# Posterior 2.5% and 97.5% quantiles:
#           sig2      beta1    beta2
# 2.5%  157.9957 -27.370729 2.993072
# 97.5% 342.2522  -1.772828 4.539354
Note that the posterior credible intervals for both β1 and β2 do not contain 0, implying both regression coefficients are important and should be treated as non-zero.
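One consequence of the N(0, σ²I) prior on β is that the posterior mean A⁻¹Xᵀy is a ridge-type shrinkage of the ordinary least squares estimate (XᵀX)⁻¹Xᵀy toward zero. A short sketch comparing the two for the cars data:

```r
# Compare the posterior mean of beta (ridge-type shrinkage under the
# N(0, sigma^2 I) prior) with ordinary least squares, for the cars data
data(cars)
y <- cars$dist
X <- cbind(1, cars$speed)
A <- t(X) %*% X + diag(2)
post.mean <- solve(A, t(X) %*% y)     # A^{-1} X^T y
ols <- solve(t(X) %*% X, t(X) %*% y)  # (X^T X)^{-1} X^T y
# post.mean is pulled toward zero relative to ols
```

Each component of the posterior mean is scaled by d/(d + 1) in the eigenbasis of XᵀX (with eigenvalues d), so its norm is strictly smaller than that of the OLS estimate.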