Ky - Unit 3

Potential Exam Questions
1. Explain the concept of pseudo-random number generation. How does the Linear Congruential Generator (LCG) work?
2. Derive and explain the inverse-transform method for random number generation.
3. Write the steps of the acceptance-rejection method and explain its working with an example.
4. What is the Box-Muller transform? How is it used to generate normal random variables?
5. Discuss the application of Monte Carlo methods in numerical integration. Provide an example.
6. Compare and contrast antithetic variables and control variates as variance reduction techniques.
7. What is importance sampling? Explain its formula with an example.
8. How does stratified sampling reduce variance in Monte Carlo methods?
9. Define a Markov chain. What are its key properties?
10. Explain the Metropolis-Hastings algorithm. How does it ensure convergence to the target distribution?
11. What is Gibbs sampling? Provide an example of its use in bivariate distributions.
12. How can one check for convergence in MCMC algorithms?

Give me the answers to all of these questions in detail.

ChatGPT said:

ChatGPT
Sure! Here's a detailed explanation of the concepts and methods mentioned in the exam questions:

1. Pseudo-Random Number Generation and the Linear Congruential Generator (LCG)
Pseudo-random number generation (PRNG) refers to the generation of numbers that appear
random but are actually determined by an algorithm. These numbers are called pseudo-random
because they are not truly random, but they appear statistically similar to random sequences for
practical purposes. The most commonly used PRNGs are deterministic, meaning they will always
generate the same sequence for a given seed.
The Linear Congruential Generator (LCG) is a widely used PRNG that generates a sequence of
numbers based on a linear recurrence relation. The general form of the LCG is:
X_{n+1} = (a·X_n + c) mod m
Where:
• X_n is the current value (X_0 is the seed),
• a is the multiplier,
• c is the increment,
• m is the modulus, and
• X_{n+1} is the next value in the sequence.
This method has a period that depends on the parameters a, c, and m. A well-chosen LCG can have a long period and is computationally efficient, but it can also exhibit patterns if the parameters are poorly chosen.

2. Inverse-Transform Method for Random Number Generation


The inverse-transform method is a technique for generating random variables with a given probability distribution. The basic idea is that if we can express the cumulative distribution function (CDF) F(x) of a random variable X, we can invert it to obtain the corresponding values of X.
Steps:
1. Uniform random variable generation: Generate a random number U from a uniform distribution on the interval [0, 1].
2. Find the inverse: Solve for X in the equation F(X) = U. This gives the value of X corresponding to the uniform random number.
3. Return the value of X: The value of X is now a sample from the desired distribution.
For example, to generate a random variable with an exponential distribution, we use:
• F(x) = 1 − e^{−λx}
• Inverting it gives: x = −ln(1 − U)/λ

3. Acceptance-Rejection Method
The acceptance-rejection method is used to generate random variables from a distribution that is difficult to sample from directly. It involves sampling from an easier distribution and rejecting some of the samples based on a criterion.
Steps:
1. Choose a proposal distribution g(x), which is easy to sample from.
2. Find a constant M such that f(x) ≤ M·g(x) for all x, where f(x) is the target distribution.
3. Sample x* from g(x) and generate a uniform random number U from [0, 1].
4. If U ≤ f(x*)/(M·g(x*)), accept x*; otherwise, reject it.
5. Repeat steps 3 and 4 until a valid sample is accepted.
Example: Suppose you want to sample from a distribution f(x) and choose a proposal distribution g(x) that is easy to sample from, like a uniform or normal distribution.

4. Box-Muller Transform for Generating Normal Random Variables

The Box-Muller transform is a method to generate pairs of independent standard normal random variables from two independent uniform random variables.
Steps:
1. Generate two independent random variables U_1 and U_2 from the uniform distribution on [0, 1].
2. Compute the following:
Z_0 = sqrt(−2 ln U_1) · cos(2π U_2)
Z_1 = sqrt(−2 ln U_1) · sin(2π U_2)
Z_0 and Z_1 are independent standard normal variables.
This transformation ensures that the generated random variables have a normal distribution with mean 0 and variance 1.
5. Monte Carlo Methods in Numerical Integration
Monte Carlo methods use random sampling to estimate numerical results. In numerical
integration, these methods are useful for approximating integrals, especially in high-dimensional
spaces.
Example: To estimate the integral of a function f(x) over the interval [a, b], we use:
I = ∫_a^b f(x) dx ≈ ((b − a)/N) Σ_{i=1}^{N} f(x_i)
Where the x_i are random samples from the interval [a, b], and N is the number of samples. The more samples we take, the more accurate the estimate becomes.

6. Antithetic Variables vs Control Variates

• Antithetic variables: This variance reduction technique generates dependent pairs of random variables that are negatively correlated, so that the errors of the pair tend to cancel when averaged. It exploits symmetry in the problem.
• Example: If you draw U ~ Uniform(0, 1), also use the antithetic draw 1 − U; averaging estimates over the pair reduces variance.
• Control variates: This technique uses an auxiliary random variable with known expectation to reduce the variance of the estimator. The key idea is to subtract a weighted version of the control variate from the original estimator.
• Example: If you know the expectation of Y, adjust your estimator by subtracting α(Y − E[Y]).

7. Importance Sampling
Importance sampling is a variance reduction technique in which you sample from a distribution that is more concentrated in the regions that contribute most to the integral or expectation, and weight the samples accordingly.
The formula is:
E[f(X)] = ∫ f(x) p(x) dx ≈ (1/N) Σ_{i=1}^{N} f(x_i) · p(x_i)/q(x_i)
Where:
• p(x) is the target distribution,
• q(x) is the proposal distribution, and
• the x_i are samples drawn from q(x).

8. Stratified Sampling
Stratified Sampling involves dividing the sample space into strata (subintervals or subsets) and
sampling from each stratum. This reduces variance by ensuring that all parts of the sample space are
represented proportionally.
Steps:
1. Divide the sample space into N strata.
2. Sample from each stratum proportionally.
3. Combine the results to estimate the desired quantity.
By ensuring each stratum is sampled, stratified sampling can achieve lower variance than simple
random sampling.

9. Markov Chain
A Markov chain is a sequence of random variables where the future state depends only on the
current state, not on the history of previous states. Its key properties are:
• Memorylessness: The probability of transitioning to the next state depends only on the
current state.
• Stationary Distribution: Over time, the chain may converge to a stable distribution.
• Transition Matrix: The probabilities of moving from one state to another are often
represented in a matrix.

10. Metropolis-Hastings Algorithm

The Metropolis-Hastings algorithm is used to generate samples from a target distribution that is difficult to sample from directly. The algorithm is a type of Markov Chain Monte Carlo (MCMC) method.
Steps:
1. Start with an initial state x_0.
2. Propose a new state x′ based on a proposal distribution q(x′ | x).
3. Calculate the acceptance ratio:
α = min(1, [p(x′) q(x | x′)] / [p(x) q(x′ | x)])
Where p(x) is the target distribution.
4. Accept the new state x′ with probability α; otherwise, keep the current state.
5. Repeat steps 2-4 for a large number of iterations.
Under mild conditions (the chain is irreducible and aperiodic), the resulting Markov chain converges to the target distribution.

11. Gibbs Sampling


Gibbs sampling is a special case of the Metropolis-Hastings algorithm, used for sampling from the
joint distribution of multiple variables. It works by iteratively sampling each variable from its
conditional distribution given the others.
Example: In bivariate distributions, Gibbs sampling updates each variable by sampling from its conditional distribution:
• p(x_1 | x_2)
• p(x_2 | x_1)
These steps are repeated for many iterations to obtain samples from the joint distribution.

12. Checking Convergence in MCMC Algorithms


Convergence in MCMC algorithms can be checked using the following methods:
1. Visual Inspection: Plot the chain over time to check for stability.
2. Autocorrelation: Compute autocorrelations of the chain. A well-mixed chain should have
low autocorrelation.
3. Convergence Diagnostics: Use diagnostic tests such as the Gelman-Rubin test, which
compares chains started from different initial values.
4. Effective Sample Size (ESS): Assess the ESS, which indicates how many independent
samples are effectively represented by the MCMC chain.
Convergence is important to ensure the Markov chain has sufficiently explored the target
distribution.

You said:
Explain in more detail, taking examples.
ChatGPT said:

ChatGPT
Sure! Let’s go deeper into each concept with detailed explanations and examples.

1. Pseudo-Random Number Generation and the Linear Congruential Generator (LCG)
Pseudo-random number generation (PRNG) is a process where numbers that appear to be
random are generated using an algorithm. Although they are deterministic, they appear random for
most practical applications. These numbers are typically used in simulations, cryptography, and
statistical sampling.

Linear Congruential Generator (LCG):


The LCG is one of the simplest and oldest algorithms used for PRNG. It generates numbers based on a recurrence relation of the form:
X_{n+1} = (a·X_n + c) mod m
Where:
• X_n is the current value in the sequence.
• a, c, and m are parameters chosen for the generator.
• The mod m operation ensures that the values wrap around within the range 0 to m − 1.
Example: Suppose we want to generate a sequence with the following parameters:
• a = 5
• c = 1
• m = 16
• Seed X_0 = 7
The first few values would be:
• X_1 = (5 × 7 + 1) mod 16 = 36 mod 16 = 4
• X_2 = (5 × 4 + 1) mod 16 = 21 mod 16 = 5
• X_3 = (5 × 5 + 1) mod 16 = 26 mod 16 = 10
• And so on...
This sequence will eventually repeat after a number of steps that depends on the parameters a, c, and m. The period of the LCG is the number of unique values before it starts repeating.
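The recurrence can be sketched in a few lines of Python (a minimal illustration using the worked example's parameters, not a production-quality generator):

```python
def lcg(seed, a=5, c=1, m=16, n=6):
    """Generate n values of the LCG recurrence X_{k+1} = (a*X_k + c) mod m."""
    values = []
    x = seed
    for _ in range(n):
        x = (a * x + c) % m
        values.append(x)
    return values

print(lcg(7))  # [4, 5, 10, 3, 0, 1] — matches the worked example
```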

2. Inverse-Transform Method for Random Number Generation


The inverse-transform method is a technique for generating random variables that follow a specific probability distribution, given the cumulative distribution function (CDF) F(x) of the desired distribution.

Steps:
1. Generate a uniform random variable U on [0, 1].
2. Find the inverse of the CDF: solve for X in the equation F(X) = U.
3. Return the value of X.

Example: Exponential Distribution


Suppose we want to generate a random variable from an exponential distribution with rate parameter λ. The CDF of the exponential distribution is:
F(x) = 1 − e^{−λx}
To apply the inverse-transform method, we solve for x in terms of U, where U ~ Uniform(0, 1):
U = 1 − e^{−λx}  ⇒  e^{−λx} = 1 − U  ⇒  x = −ln(1 − U)/λ
Thus, to generate a random number from an exponential distribution, we:
1. Generate a uniform random variable U ~ Uniform(0, 1).
2. Calculate x = −ln(1 − U)/λ.
For example, if U = 0.5 and λ = 2, then:
x = −ln(1 − 0.5)/2 = −ln(0.5)/2 ≈ 0.3466
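A minimal sketch of the inverse-transform step in Python (the optional `u` argument lets us reproduce the worked numbers):

```python
import math
import random

def exponential_sample(lam, u=None):
    """Inverse-transform sample from Exponential(lam): x = -ln(1 - U)/lam."""
    if u is None:
        u = random.random()  # U ~ Uniform(0, 1)
    return -math.log(1.0 - u) / lam

# Reproduce the worked example: U = 0.5, lambda = 2
print(round(exponential_sample(2, u=0.5), 4))  # 0.3466
```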

3. Acceptance-Rejection Method
The acceptance-rejection method is a technique used when it’s difficult to sample from a
distribution directly, but easier to sample from a proposal distribution. It generates random variables
from the target distribution by rejecting certain samples based on a probability criterion.
Steps:
1. Choose a proposal distribution g(x)g(x)g(x) and a constant MMM such that
f(x)≤M⋅g(x)f(x) \leq M \cdot g(x)f(x)≤M⋅g(x) for all xxx, where f(x)f(x)f(x) is the target
distribution.
2. Sample x∗x^*x∗ from g(x)g(x)g(x).
3. Generate a uniform random variable UUU on [0, 1].
4. Accept x∗x^*x∗ if U≤f(x∗)M⋅g(x∗)U \leq \frac{f(x^*)}{M \cdot
g(x^*)}U≤M⋅g(x∗)f(x∗); otherwise, reject it.
5. Repeat the above steps until a sample is accepted.

Example: Sampling from a Beta Distribution

Suppose we want to sample from a Beta distribution with parameters α = 2 and β = 5, whose density is f(x) = 30·x·(1 − x)^4 on [0, 1], using a simpler proposal distribution.
1. Choose a proposal distribution: Here we choose g(x) = 1 for x ∈ [0, 1] (a uniform distribution).
2. Find the constant M: We need f(x) ≤ M·g(x). The Beta(2, 5) density peaks at its mode x = (α − 1)/(α + β − 2) = 0.2, where f(0.2) ≈ 2.46, so M = 2.5 suffices.
3. Generate x* from g(x), say x* = 0.4.
4. Generate U ~ Uniform(0, 1), say U = 0.7.
5. Check the acceptance criterion: we accept if U ≤ f(0.4)/(M·1) = 1.5552/2.5 ≈ 0.62. Since U = 0.7 > 0.62, we reject x* and repeat.
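The loop can be sketched in Python. This assumes the Beta(2, 5) density f(x) = 30·x·(1 − x)^4, whose maximum f(0.2) ≈ 2.46 makes M = 2.5 a valid bound with a Uniform(0, 1) proposal:

```python
import random

def beta_pdf(x):
    """Density of Beta(2, 5): f(x) = 30 * x * (1 - x)**4 on [0, 1]."""
    return 30.0 * x * (1.0 - x) ** 4

def sample_beta_2_5(M=2.5):
    """Acceptance-rejection with a Uniform(0, 1) proposal, g(x) = 1."""
    while True:
        x = random.random()        # draw from the proposal g
        u = random.random()        # uniform for the accept test
        if u <= beta_pdf(x) / M:   # accept with probability f(x)/(M*g(x))
            return x

random.seed(0)
samples = [sample_beta_2_5() for _ in range(10000)]
print(sum(samples) / len(samples))  # near E[X] = 2/7 ≈ 0.286
```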

4. Box-Muller Transform for Normal Distribution

The Box-Muller transform is a method for generating two independent standard normal random variables from two independent uniform random variables.

Steps:
1. Generate two independent uniform random variables U_1, U_2 ~ Uniform(0, 1).
2. Compute the two standard normal variables using the formulas:
Z_0 = sqrt(−2 ln U_1) · cos(2π U_2)
Z_1 = sqrt(−2 ln U_1) · sin(2π U_2)
These two variables Z_0 and Z_1 are independent and follow a standard normal distribution.

Example:
Let's generate two standard normal variables using the Box-Muller method.
1. Suppose U_1 = 0.7 and U_2 = 0.3.
2. Compute Z_0 and Z_1:
Z_0 = sqrt(−2 ln 0.7) · cos(2π · 0.3) ≈ 0.8446 × (−0.3090) ≈ −0.2610
Z_1 = sqrt(−2 ln 0.7) · sin(2π · 0.3) ≈ 0.8446 × 0.9511 ≈ 0.8033
Now, Z_0 and Z_1 are independent standard normal random variables.
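The transform is a one-liner per variable in Python; this sketch reproduces the numbers for U_1 = 0.7, U_2 = 0.3:

```python
import math

def box_muller(u1, u2):
    """Box-Muller: map two Uniform(0,1) draws to two independent N(0,1) draws."""
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

z0, z1 = box_muller(0.7, 0.3)
print(round(z0, 4), round(z1, 4))  # approximately -0.2610 and 0.8033
```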

5. Monte Carlo Methods in Numerical Integration


Monte Carlo methods estimate numerical results through random sampling. They are particularly
useful for high-dimensional integrals, which are difficult to compute using traditional methods.

Example: Estimating an Integral

To estimate the integral of f(x) = x² over [0, 1], we use Monte Carlo integration:
1. Generate random samples x_1, x_2, ..., x_N from the uniform distribution U(0, 1).
2. Estimate the integral as:
I = ∫_0^1 x² dx ≈ (1/N) Σ_{i=1}^{N} f(x_i)
3. For example, if we sample 3 values x_1 = 0.2, x_2 = 0.5, x_3 = 0.9, then:
I ≈ (1/3)(0.2² + 0.5² + 0.9²) = (1/3)(0.04 + 0.25 + 0.81) ≈ 0.3667
This is an approximation of the true integral, which is 1/3.
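A minimal Monte Carlo integrator in Python, applied to the same integral (the (b − a) factor is 1 here but is included for generality):

```python
import random

def mc_integral(f, a, b, n):
    """Monte Carlo estimate of the integral of f over [a, b]."""
    total = sum(f(random.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

random.seed(42)
estimate = mc_integral(lambda x: x * x, 0.0, 1.0, 100000)
print(estimate)  # close to the true value 1/3
```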

6. Antithetic Variables vs Control Variates

Antithetic Variables:
This technique reduces variance by pairing each draw with a negatively correlated "mirror" draw, so that the errors of the pair tend to cancel when averaged.

Example:
Suppose we want to estimate the expectation of a function f(X), where X ~ Uniform(0, 1). If we generate a sample X_1, we also generate the "antithetic" sample 1 − X_1. The paired estimator (f(X_1) + f(1 − X_1))/2 typically has lower variance than an average of two independent draws, because f(X_1) and f(1 − X_1) are negatively correlated whenever f is monotone.

Control Variates:
In this method, we use an auxiliary random variable with a known expectation to reduce variance. If Y is a random variable with known expectation E[Y], we use it to adjust the estimator.
Example:
Suppose we are estimating E[X], and Y is correlated with X with known E[Y]. We adjust the estimate as:
E[X] ≈ (1/N) Σ_{i=1}^{N} [X_i − α(Y_i − E[Y])]
Where α is a constant chosen to minimize the variance of the estimator; the optimal choice is α = Cov(X, Y)/Var(Y).
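The antithetic idea can be sketched in Python. As an illustrative target (my choice, not from the original), take E[e^U] = e − 1 for U ~ Uniform(0, 1); e^u is monotone, so the pairing helps:

```python
import math
import random

def plain_mc(f, n):
    """Standard Monte Carlo estimate of E[f(U)], U ~ Uniform(0, 1)."""
    return sum(f(random.random()) for _ in range(n)) / n

def antithetic_mc(f, n):
    """Antithetic estimate: average f(U) and f(1 - U) over n/2 pairs."""
    total = 0.0
    for _ in range(n // 2):
        u = random.random()
        total += 0.5 * (f(u) + f(1.0 - u))
    return total / (n // 2)

random.seed(1)
plain = plain_mc(math.exp, 10000)
anti = antithetic_mc(math.exp, 10000)
print(plain, anti)  # both near e - 1 ≈ 1.7183
```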

7. Importance Sampling
Importance sampling is a method used to estimate the expected value of a function by sampling from a distribution that is more concentrated in the regions of interest.

Formula:
E[f(X)] = ∫ f(x) p(x) dx ≈ (1/N) Σ_{i=1}^{N} f(x_i) · p(x_i)/q(x_i)
Where:
• p(x) is the target distribution.
• q(x) is the proposal distribution, and the x_i are drawn from q(x).

Example:
To estimate E[X²] where X ~ Exponential(1), we can draw samples from the proposal q(x) = 0.5·exp(−0.5x) (an Exponential(0.5) distribution) and weight each sample by p(x_i)/q(x_i) = exp(−x_i)/(0.5·exp(−0.5·x_i)) = 2·exp(−0.5·x_i).
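A sketch of this estimator in Python, assuming p = Exp(1) and q = Exp(0.5) as above (the true value is E[X²] = 2 for Exp(1)):

```python
import math
import random

def importance_estimate(n):
    """Estimate E[X^2] for X ~ Exp(1) by sampling from the proposal q = Exp(0.5).

    Weight: p(x)/q(x) = exp(-x) / (0.5 * exp(-0.5 * x)) = 2 * exp(-0.5 * x).
    """
    total = 0.0
    for _ in range(n):
        x = random.expovariate(0.5)        # draw from the proposal q
        weight = 2.0 * math.exp(-0.5 * x)  # importance weight p(x)/q(x)
        total += (x * x) * weight
    return total / n

random.seed(7)
est = importance_estimate(200000)
print(est)  # near the true value E[X^2] = 2
```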

8. Stratified Sampling
Stratified Sampling divides the population into subgroups or "strata" and then samples from each
stratum. This approach helps ensure that every part of the distribution is well-represented, reducing
variance.

Example:
Suppose we want to estimate the mean of a function f(x) over [0, 1]. We divide the interval into 3 equal strata and sample from each subinterval, which guarantees coverage of the whole interval and typically yields a lower-variance estimate.
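A minimal stratified estimator in Python (equal-width strata over [0, 1], so the stratum means get equal weights), applied to the same f(x) = x² used earlier:

```python
import random

def stratified_mean(f, strata, per_stratum):
    """Stratified estimate of the integral of f over [0, 1].

    The interval is split into equal-width strata; each stratum is sampled
    uniformly, and the stratum means are averaged with equal weights.
    """
    width = 1.0 / strata
    total = 0.0
    for k in range(strata):
        lo = k * width
        samples = [f(random.uniform(lo, lo + width)) for _ in range(per_stratum)]
        total += sum(samples) / per_stratum
    return total / strata

random.seed(3)
est = stratified_mean(lambda x: x * x, 3, 1000)
print(est)  # near the true value 1/3
```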

9. Markov Chains
A Markov Chain is a sequence of random variables where the future state depends only on the
current state, not on the history.
Example:
Suppose we have a weather model where the weather tomorrow depends only on today’s weather:
• If it’s sunny today, there’s a 70% chance of sunny tomorrow, 30% chance of rain.
• If it’s rainy today, there’s an 80% chance of rain tomorrow, 20% chance of sunny.
This is a Markov Chain.

10. Metropolis-Hastings Algorithm


The Metropolis-Hastings algorithm is an MCMC method for sampling from a probability
distribution when direct sampling is difficult. It works by proposing new states and accepting or
rejecting them based on a calculated ratio.

Example:
Suppose you want to sample from a target distribution p(x). Start with an initial value, propose a new value, and accept or reject it based on the acceptance ratio.

11. Gibbs Sampling


Gibbs Sampling is a Markov Chain Monte Carlo (MCMC) algorithm used for sampling from
multivariate distributions by iteratively sampling each variable conditioned on the others.

Example:
Suppose we have a bivariate normal distribution. We sample each variable conditioned on the other
variable.

12. Convergence in MCMC


To ensure that the Markov Chain has converged to the target distribution, we use methods like:
• Gelman-Rubin test: Compares chains with different initial values.
• Autocorrelation: Checks if samples are independent.
• Effective Sample Size: Measures the number of independent samples.
Certainly! Let's go into more detail on Markov Chains, the Metropolis-Hastings algorithm,
Gibbs sampling, and how to check for convergence in MCMC.

9. Markov Chains
A Markov Chain is a sequence of random variables where the future state depends only on the
current state and not on the sequence of events that preceded it. This property is called the Markov
property or memoryless property.
Key Characteristics:
• State space: The set of all possible states the system can be in.
• Transition probabilities: The probabilities of moving from one state to another.
• Stationary distribution: If the Markov Chain reaches a steady state, the distribution of
states stops changing.
A Markov Chain can be represented by a transition matrix P, where the entry P(i, j) is the probability of transitioning from state i to state j.

Example: Weather Model


Let's consider a simple weather model with two states: Sunny and Rainy. The transition matrix might look like:
P = | 0.7  0.3 |
    | 0.2  0.8 |
This means:
• If it’s Sunny today, there’s a 70% chance it will be Sunny tomorrow and a 30% chance it
will be Rainy.
• If it’s Rainy today, there’s a 20% chance it will be Sunny tomorrow and an 80% chance it
will be Rainy.
The process is memoryless because the weather tomorrow depends only on the current weather, not
on how we got there.
Applications: Markov Chains are used in various fields such as physics (modeling systems in
equilibrium), economics (modeling market states), and biology (modeling gene sequences).
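The weather chain above can be simulated directly. A minimal sketch in Python; the long-run fraction of sunny days approaches the stationary probability, which for this matrix works out to 0.4:

```python
import random

# Transition probabilities from the weather example:
# P[today][tomorrow] for states Sunny and Rainy
P = {"Sunny": {"Sunny": 0.7, "Rainy": 0.3},
     "Rainy": {"Sunny": 0.2, "Rainy": 0.8}}

def simulate(start, steps):
    """Simulate the weather chain and return the visited states."""
    state, path = start, [start]
    for _ in range(steps):
        state = "Sunny" if random.random() < P[state]["Sunny"] else "Rainy"
        path.append(state)
    return path

random.seed(0)
path = simulate("Sunny", 100000)
frac_sunny = path.count("Sunny") / len(path)
print(frac_sunny)  # approaches the stationary value 0.4
```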

10. Metropolis-Hastings Algorithm

The Metropolis-Hastings algorithm is a method for generating samples from a probability distribution p(x) where direct sampling is difficult. It belongs to the family of Markov Chain Monte Carlo (MCMC) methods and is widely used to sample from complex distributions.

Key Idea:
• Start with an initial guess or state x_0.
• Propose a new state x′ from a proposal distribution q(x′ | x), which proposes new states based on the current state x.
• Calculate the acceptance ratio:
A(x, x′) = min(1, [p(x′) q(x | x′)] / [p(x) q(x′ | x)])
This ratio compares the target distribution at x′ with the target distribution at x, adjusting for the proposal distribution.

Acceptance Rule:
• Accept the new state x′ with probability A(x, x′). If accepted, x_{n+1} = x′.
• Reject the new state with probability 1 − A(x, x′). If rejected, keep the current state: x_{n+1} = x_n.
Example: Sampling from a Normal Distribution
Suppose we want to sample from a normal distribution with mean μ = 0 and variance σ² = 1. Let's say we start with x_0 = 2.
1. Propose a new value x′ from a normal proposal distribution, say x′ ~ N(x_n, 1), meaning x′ is sampled around the current value x_n.
2. Calculate the acceptance ratio. Because this proposal is symmetric (q(x′ | x) = q(x | x′)), the ratio simplifies to:
A(x, x′) = min(1, p(x′)/p(x))
Here, p(x) and p(x′) are the probability densities of the target distribution (a normal distribution in this case).
3. Accept x′ with probability A(x, x′); otherwise keep x_n. Repeat this process iteratively.
Over time, the chain of values generated will converge to the target distribution.
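A minimal random-walk Metropolis sampler for this example in Python. The target density is used unnormalized, since the normalization constant cancels in the acceptance ratio:

```python
import math
import random

def target_pdf(x):
    """Unnormalized standard normal density."""
    return math.exp(-0.5 * x * x)

def metropolis(n, x0=2.0, step=1.0):
    """Random-walk Metropolis sampler for a standard normal target."""
    x, chain = x0, []
    for _ in range(n):
        proposal = random.gauss(x, step)              # symmetric proposal
        ratio = target_pdf(proposal) / target_pdf(x)  # p(x')/p(x)
        if random.random() < ratio:                   # accept w.p. min(1, ratio)
            x = proposal
        chain.append(x)
    return chain

random.seed(0)
chain = metropolis(50000)
burned = chain[1000:]  # discard burn-in
mean_est = sum(burned) / len(burned)
print(mean_est)  # sample mean should be near the target mean 0
```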

11. Gibbs Sampling

Gibbs Sampling is an MCMC technique used to sample from multivariate probability distributions by iteratively sampling from the conditional distributions of each variable, given the current values of the others. It is particularly useful when the full conditionals are easy to sample from even though the joint distribution is not.

Key Steps:
1. Start with initial values for all variables X_1^{(0)}, X_2^{(0)}, ..., X_d^{(0)}.
2. Iteratively update each variable by sampling from its conditional distribution given the others:
X_1^{(t+1)} ~ p(X_1 | X_2^{(t)}, ..., X_d^{(t)})
X_2^{(t+1)} ~ p(X_2 | X_1^{(t+1)}, X_3^{(t)}, ..., X_d^{(t)})
Repeat for each variable in the system.
3. Repeat this process for a number of iterations, where each iteration updates the values of all the variables.

Example: Bivariate Normal Distribution

Suppose we want to sample from a standard bivariate normal distribution with correlation ρ (both marginals N(0, 1)). The full conditionals are:
X_1 | X_2 ~ N(ρ·X_2, 1 − ρ²)  and  X_2 | X_1 ~ N(ρ·X_1, 1 − ρ²)
1. Start with X_1^{(0)} = 0, say.
2. Update X_2^{(1)} by sampling from N(ρ·X_1^{(0)}, 1 − ρ²).
3. Update X_1^{(1)} by sampling from N(ρ·X_2^{(1)}, 1 − ρ²).
4. Repeat until convergence.
At each step, the sampling process uses the current value of one variable to sample the next, which allows us to build up a sample from the joint distribution.
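A minimal Gibbs sampler for this case in Python, using the full conditionals of a standard bivariate normal with correlation ρ, namely X_1 | X_2 ~ N(ρ·X_2, 1 − ρ²) and symmetrically:

```python
import math
import random

def gibbs_bivariate_normal(rho, n, burn=1000):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    sd = math.sqrt(1.0 - rho * rho)  # std dev of each full conditional
    x1, x2 = 0.0, 0.0
    samples = []
    for t in range(n + burn):
        x1 = random.gauss(rho * x2, sd)  # X1 | X2 ~ N(rho*X2, 1 - rho^2)
        x2 = random.gauss(rho * x1, sd)  # X2 | X1 ~ N(rho*X1, 1 - rho^2)
        if t >= burn:
            samples.append((x1, x2))
    return samples

random.seed(0)
pairs = gibbs_bivariate_normal(0.8, 50000)
m1 = sum(p[0] for p in pairs) / len(pairs)
m2 = sum(p[1] for p in pairs) / len(pairs)
emp_cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs) / len(pairs)
print(emp_cov)  # near 0.8 (unit variances, so covariance ≈ correlation)
```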
12. Checking Convergence in MCMC
One of the challenges in MCMC is to ensure that the algorithm has converged to the target
distribution. If the chain hasn’t converged, the samples may not represent the target distribution
properly. There are several methods to check for convergence:

1. Gelman-Rubin Diagnostic (Potential Scale Reduction Factor)

The Gelman-Rubin test compares multiple chains that are initialized at different points. The idea is that after a sufficient number of iterations, all chains should converge to the same distribution. With m chains of length n, compute the within-chain variance W (the average of the individual chain variances) and the between-chain variance B (n times the variance of the chain means). The pooled variance estimate is:
V̂ = ((n − 1)/n)·W + (1/n)·B
and the potential scale reduction factor is:
R̂ = sqrt(V̂ / W)
If R̂ is close to 1, the chains have likely converged.

2. Autocorrelation:
Autocorrelation measures the correlation between samples at different steps. For the chain to be
effectively independent, the autocorrelation should decay to zero as the lag increases. A high
autocorrelation suggests that the chain has not mixed well, and more iterations are needed.

3. Trace Plots:
Plotting the trace of the MCMC chain over time can help visualize convergence. A well-mixed
chain should show no obvious trends, and the values should wander around the target distribution.
A non-mixing chain might show a pattern or be stuck in a local mode of the distribution.

4. Effective Sample Size (ESS):


The effective sample size gives an estimate of the number of independent samples. It accounts for
autocorrelation and tells us how much information the chain provides. A higher ESS means more
independent information.

Example:
If we’re using an MCMC method to sample from a distribution and plotting the trace of the chain,
we would expect the plot to show fluctuations around a constant value. If the plot shows a steady
upward or downward trend, it indicates that the chain has not yet converged.
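The Gelman-Rubin statistic described above can be computed directly. A minimal sketch in Python, using the standard definitions W = (1/m)·Σ s_j², B = n·Var(chain means), V̂ = ((n − 1)/n)·W + B/n, R̂ = sqrt(V̂/W); the test case uses two well-mixed chains, for which R̂ should be near 1:

```python
import random

def gelman_rubin(chains):
    """Potential scale reduction factor R-hat for equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    # Between-chain variance B and mean within-chain variance W
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m
    V_hat = (n - 1) / n * W + B / n  # pooled variance estimate
    return (V_hat / W) ** 0.5

random.seed(0)
# Two chains drawn independently from the same distribution: R-hat ≈ 1
chains = [[random.gauss(0, 1) for _ in range(5000)] for _ in range(2)]
rhat = gelman_rubin(chains)
print(rhat)  # close to 1.0
```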

In summary:
• Markov Chains are the foundation of MCMC and are widely used to model stochastic
processes.
• Metropolis-Hastings is a general-purpose MCMC algorithm for sampling from complex
distributions.
• Gibbs Sampling simplifies sampling from multivariate distributions by sampling each
variable conditioned on the others.
• To check for convergence, we can use diagnostics like the Gelman-Rubin test,
autocorrelation, trace plots, and effective sample size to ensure that the samples represent
the target distribution.
Let me know if you need further clarification or examples!
Here's a detailed explanation of the topics you mentioned, focusing on their importance for your
AKTU 3rd-year semester exam. This explanation includes essential concepts, detailed
descriptions, and examples.

1. Pseudo-Random Numbers
Random Number Generation
• Random numbers are numbers generated in such a way that their values are unpredictable
and follow a uniform distribution.
• Pseudo-random numbers are generated using deterministic algorithms but are designed to
appear random.

Linear Congruential Generator (LCG):

• One of the simplest methods for generating pseudo-random numbers.
• Formula:
X_{n+1} = (a·X_n + c) mod m
• X_0: Seed value (initial value).
• a: Multiplier.
• c: Increment.
• m: Modulus (usually a large number, often a power of two or a prime).
• Example: If X_0 = 1, a = 5, c = 1, m = 7, generate the sequence:
X_1 = (5 × 1 + 1) mod 7 = 6
X_2 = (5 × 6 + 1) mod 7 = 31 mod 7 = 3
X_3 = (5 × 3 + 1) mod 7 = 16 mod 7 = 2

Inverse-Transform Method
• Used to generate random variables from any probability distribution with an invertible CDF.
• Steps:
1. Generate a random number U ~ U(0, 1).
2. Solve F_X(X) = U, where F_X is the CDF of the desired distribution.
• Example: For an exponential distribution with rate parameter λ:
F_X(X) = 1 − e^{−λX}, so X = −ln(1 − U)/λ.

Acceptance-Rejection Method
• Used when direct sampling is difficult.
• Steps:
1. Choose a proposal distribution g(x) such that f(x) ≤ M·g(x), where M ≥ 1.
2. Generate X from g(x) and U ~ U(0, 1).
3. Accept X if U ≤ f(X)/(M·g(X)); otherwise reject.

Transformations
• To generate random variables with complex distributions, transformations of simple distributions are used.
• Example: Generate normal random variables using the Box-Muller transformation:
Z_1 = sqrt(−2 ln U_1) · cos(2π U_2),  Z_2 = sqrt(−2 ln U_1) · sin(2π U_2),
where U_1, U_2 ~ U(0, 1).

Multivariate Probability Calculations


• Used in generating multiple correlated random variables.
• Example: Multivariate normal variables can be generated by applying Cholesky
decomposition to the covariance matrix.
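As a sketch of the Cholesky approach: factor the covariance matrix as Σ = L·Lᵀ, then map independent standard normals z to correlated samples via μ + L·z. The 2×2 factorization is hand-rolled here for illustration; in practice one would use numpy.linalg.cholesky:

```python
import math
import random

def cholesky_2x2(cov):
    """Cholesky factor L (lower-triangular) of a 2x2 covariance matrix."""
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 * l21)
    return [[l11, 0.0], [l21, l22]]

def mvn_sample(mean, cov):
    """One draw from N(mean, cov) as mean + L @ z, with z ~ N(0, I)."""
    L = cholesky_2x2(cov)
    z = [random.gauss(0, 1), random.gauss(0, 1)]
    return [mean[0] + L[0][0] * z[0],
            mean[1] + L[1][0] * z[0] + L[1][1] * z[1]]

random.seed(0)
cov = [[1.0, 0.8], [0.8, 1.0]]
samples = [mvn_sample([0.0, 0.0], cov) for _ in range(50000)]
emp = sum(x * y for x, y in samples) / len(samples)
print(emp)  # empirical covariance, near 0.8
```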

2. Monte Carlo Integration

Simulation and Monte Carlo Integration
• Simulation: Use random sampling to model real-world systems or processes.
• Monte Carlo Integration: Estimate integrals using random sampling.
• Formula:
I = ∫_a^b f(x) dx ≈ ((b − a)/N) Σ_{i=1}^{N} f(x_i),
where the x_i are points sampled uniformly in [a, b].

Variance Reduction Techniques


• Reduce the variance of Monte Carlo estimates to improve accuracy.
1. Antithetic Variables: Use negatively correlated variables.
2. Control Variates: Use a known quantity to adjust estimates.

Monte Carlo Hypothesis Testing


• Use simulation to test hypotheses when theoretical tests are difficult to perform.
• Example: Permutation tests for comparing two distributions.
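A minimal sketch of a two-sample permutation test (the function name and the toy data are made up for this demo): shuffle the pooled labels many times and count how often the shuffled mean difference is at least as extreme as the observed one.

```python
import random

def permutation_test(x, y, n_perm=10_000, seed=0):
    """Two-sample permutation test for a difference in means.
    Returns the p-value: the fraction of label shuffles whose mean
    difference is at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    pooled = list(x) + list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        xs, ys = pooled[:len(x)], pooled[len(x):]
        diff = abs(sum(xs) / len(xs) - sum(ys) / len(ys))
        if diff >= observed:
            count += 1
    return count / n_perm

# Clearly separated samples should give a small p-value
p = permutation_test([1.1, 1.3, 0.9, 1.2, 1.0], [2.0, 2.2, 1.9, 2.1, 2.3])
print(p)
```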

Importance Sampling
• Focus sampling efforts on regions where the function contributes most to the integral.
• Weights: Scale samples by the ratio of the target distribution to the proposal distribution.
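To make the weighting concrete, here is a hedged sketch (the choice of target, proposal, and sample size is mine, not from the notes) estimating the rare-event probability P(Z > 3) for Z ∼ N(0, 1). Naive sampling wastes nearly every draw; instead we draw from a shifted exponential proposal on [3, ∞) and weight by target density over proposal density.

```python
import math
import random

random.seed(1)
N = 100_000

def phi(x):
    """Standard normal density (the target)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Proposal: g(x) = exp(-(x - 3)) on [3, inf), sampled via a shifted exponential.
total = 0.0
for _ in range(N):
    x = 3.0 + random.expovariate(1.0)   # draw from the proposal
    g = math.exp(-(x - 3.0))            # proposal density at x
    total += phi(x) / g                 # importance weight (indicator is 1 on [3, inf))
estimate = total / N
print(estimate)  # true value is about 0.00135
```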

Stratified Sampling
• Divide the domain into non-overlapping strata and sample separately from each.
• Improves accuracy by ensuring uniform coverage of the domain.
3. Markov Chain Monte Carlo (MCMC)
Markov Chains
• A sequence of random variables X_1, X_2, … where the future state depends only on the current state.
• Transition Probability Matrix: P(i, j) = P(X_{n+1} = j | X_n = i).
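A small illustrative sketch (the two-state "weather" chain and its transition probabilities are invented for this demo): simulate the chain and check that the long-run fraction of time in each state matches the stationary distribution.

```python
import random

# Two-state chain: P(sunny -> sunny) = 0.9, P(rainy -> sunny) = 0.5.
# Balance condition pi_s * 0.1 = pi_r * 0.5 gives pi_s = 5/6.
P = {"sunny": {"sunny": 0.9, "rainy": 0.1},
     "rainy": {"sunny": 0.5, "rainy": 0.5}}

random.seed(0)
state, visits = "sunny", {"sunny": 0, "rainy": 0}
for _ in range(200_000):
    state = "sunny" if random.random() < P[state]["sunny"] else "rainy"
    visits[state] += 1

frac_sunny = visits["sunny"] / sum(visits.values())
print(round(frac_sunny, 2))  # stationary distribution gives 5/6, about 0.83
```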

Metropolis-Hastings Algorithm
• Used to sample from complex probability distributions.
• Steps:
 1. Start with an initial state X_0.
 2. Propose a new state Y from a proposal distribution q(X → Y).
 3. Accept Y with probability: α = min(1, [π(Y) q(Y → X)] / [π(X) q(X → Y)]).

Gibbs Sampling
• A special case of MCMC used when the joint distribution can be decomposed into
conditional distributions.
• Iteratively sample each variable from its conditional distribution:
  X_1 ∼ P(X_1 | X_2, …, X_n), then X_2 ∼ P(X_2 | X_1, X_3, …, X_n), and so on.

Convergence
• MCMC chains converge to the target distribution as the number of iterations increases.
• Diagnostics:
• Autocorrelation: Measures dependence between samples.
• Gelman-Rubin Statistic: Checks for convergence by comparing within-chain and
between-chain variance.
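The Gelman-Rubin diagnostic can be sketched directly from the definition above (this is a simplified scalar version under my own choice of test chains; values near 1 suggest convergence):

```python
import random

def gelman_rubin(chains):
    """Gelman-Rubin R-hat for m equal-length chains of a scalar parameter."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    B = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)   # between-chain variance
    W = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m               # within-chain variance
    var_hat = (n - 1) / n * W + B / n                          # pooled variance estimate
    return (var_hat / W) ** 0.5

random.seed(0)
# Two chains sampling the same N(0,1) target: R-hat should be near 1.
good = [[random.gauss(0, 1) for _ in range(5000)] for _ in range(2)]
# Two chains stuck around different modes: R-hat should be well above 1.
bad = [[random.gauss(0, 1) for _ in range(5000)],
       [random.gauss(5, 1) for _ in range(5000)]]
rhat_good, rhat_bad = gelman_rubin(good), gelman_rubin(bad)
print(round(rhat_good, 2), round(rhat_bad, 2))
```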

Important Questions for Exams


1. Explain the Inverse-Transform Method with an example.
2. What is the Acceptance-Rejection Method? Describe its steps.
3. Derive the formula for Monte Carlo Integration.
4. Explain Variance Reduction Techniques in Monte Carlo methods.
5. Describe the Metropolis-Hastings Algorithm with an example.
6. What is Gibbs Sampling, and how is it used in MCMC?
7. How are multivariate normal random variables generated?
8. Write a note on Stratified Sampling and its benefits.

This detailed explanation covers all major concepts along with their potential applications and
derivations to help you prepare for your AKTU semester exam. If you'd like this content as a PDF
or need additional clarifications, let me know!
KHUSHI YADUVANSHI
Here’s a detailed explanation of the important questions mentioned earlier. These answers are
crafted in a way to help you write well in your exams.

1. Explain the Inverse-Transform Method with an Example


Definition:
The Inverse-Transform Method is used to generate random variables from any probability
distribution by using the inverse of the cumulative distribution function (CDF).

Steps:
 1. Generate a random number U from the uniform distribution on [0, 1].
 2. Use the relationship F_X(x) = P(X ≤ x) = U, where F_X is the CDF of the desired distribution.
 3. Solve for X in terms of U to obtain the random variable X.

Example: Exponential Distribution


For an exponential distribution with parameter λ > 0:
• The CDF is F_X(x) = 1 − e^(−λx).
• Set F_X(X) = U, so U = 1 − e^(−λX), which gives e^(−λX) = 1 − U. Taking logarithms on both sides: X = −ln(1 − U)/λ.
• If U ∼ U(0, 1), then X follows the exponential distribution with rate λ.

Application in Exams:
Explain this method step by step and derive the formula for X. If needed, substitute numerical values for U and λ to show a worked example.
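The derivation above translates directly into code. A minimal Python sketch (the function name is illustrative, and λ = 2 is an arbitrary choice for the demo) that checks the generated sample against the known mean 1/λ:

```python
import math
import random

def exponential_inverse_transform(lam, u):
    """Invert the exponential CDF F(x) = 1 - exp(-lam*x): x = -ln(1 - u)/lam."""
    return -math.log(1.0 - u) / lam

random.seed(0)
lam = 2.0
xs = [exponential_inverse_transform(lam, random.random()) for _ in range(100_000)]
mean = sum(xs) / len(xs)
print(round(mean, 2))  # Exponential(lam) has mean 1/lam = 0.5
```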

2. What is the Acceptance-Rejection Method? Describe its Steps.


Definition:
The Acceptance-Rejection Method is used to generate random samples from a target density f(x) when direct sampling is difficult.

Steps:
 1. Choose a proposal distribution g(x) and a constant M ≥ 1 such that f(x) ≤ M·g(x) for all x.
 2. Generate a candidate X from g(x) and a uniform random number U ∼ U(0, 1).
 3. Accept X if: U ≤ f(X)/(M·g(X)). Otherwise, reject X and repeat the process.
Example: Sampling from a Truncated Normal Distribution
• Let f(x) be a standard normal density restricted to [−3, 3] and g(x) the uniform density on [−3, 3]. (A uniform proposal can only dominate a target with bounded support.)
• Choose M so that f(x) ≤ M·g(x) on [−3, 3], and generate candidates X from g(x).
• Apply the acceptance criterion U ≤ f(X)/(M·g(X)).

Advantages:
• Simple to implement when f(x) can be evaluated pointwise.
• Effective for distributions with complex shapes.
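Here is a minimal sketch of the method (my own choice of target: the Beta(2, 2) density f(x) = 6x(1 − x) on [0, 1], with a uniform proposal g(x) = 1 and envelope M = 1.5 = max f):

```python
import random

def sample_beta22(rng):
    """Acceptance-rejection for f(x) = 6x(1-x) on [0,1] (Beta(2,2)),
    with proposal g(x) = 1 on [0,1] and envelope constant M = 1.5."""
    while True:
        x = rng.random()                  # candidate from g
        u = rng.random()
        if u <= 6 * x * (1 - x) / 1.5:    # accept with probability f(x) / (M g(x))
            return x

rng = random.Random(0)
xs = [sample_beta22(rng) for _ in range(100_000)]
mean = sum(xs) / len(xs)
print(round(mean, 2))  # Beta(2,2) has mean 0.5
```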

3. Derive the Formula for Monte Carlo Integration


Definition:
Monte Carlo Integration estimates the value of an integral using random sampling.

Steps:
 1. Suppose we want to estimate I = ∫_a^b f(x) dx.
 2. Generate N random points x_1, x_2, …, x_N uniformly in [a, b].
 3. Compute the estimate: I ≈ ((b − a)/N) Σ_{i=1}^N f(x_i).

Example:
Estimate ∫_0^1 x² dx (exact value 1/3):
• Generate N = 3 random points: x_1 = 0.2, x_2 = 0.5, x_3 = 0.8.
• Compute the estimate (here b − a = 1): I ≈ (1/3)[f(0.2) + f(0.5) + f(0.8)] = (1/3)(0.04 + 0.25 + 0.64) = 0.31.

Applications:
Monte Carlo Integration is widely used in high-dimensional integrals or complex functions.
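The same estimate can be run with a larger sample. A minimal Python sketch (the function name `mc_integrate` is illustrative) for the worked example above:

```python
import random

def mc_integrate(f, a, b, n, seed=0):
    """Estimate the integral of f over [a, b] as (b - a) times the
    average of f at n uniformly sampled points."""
    rng = random.Random(seed)
    total = sum(f(a + (b - a) * rng.random()) for _ in range(n))
    return (b - a) * total / n

est = mc_integrate(lambda x: x * x, 0.0, 1.0, 100_000)
print(round(est, 2))  # exact value is 1/3
```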

4. Explain Variance Reduction Techniques in Monte Carlo Methods


Definition:
Variance reduction techniques improve the accuracy of Monte Carlo simulations by reducing the
variability of estimates.

Key Techniques:
 1. Antithetic Variables:
 • Generate pairs of negatively correlated random variables.
 • Example: If U ∼ U(0, 1), then use 1 − U as its antithetic counterpart.
 • Reduces variance because the pair balances extreme values.
 2. Control Variates:
 • Use a variable Y with a known expectation E[Y] to adjust estimates.
 • Formula: Î = (1/N) Σ_{i=1}^N [f(X_i) − c(Y_i − E[Y])], where the coefficient c is chosen to minimize the variance.
 3. Importance Sampling:
 • Focus samples on the regions of the domain where f(x) contributes most.
 • Weight samples by f(x)/g(x), where g(x) is the sampling distribution.
4. Stratified Sampling:
• Divide the domain into strata and sample independently from each stratum.

Exam Tip:
Define each technique clearly and use examples to demonstrate.
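The antithetic-variable idea can be demonstrated in a few lines (the example integrand E[e^U] = e − 1 and the budget of n evaluations are my own choices for the demo): the plain estimator averages n independent draws, while the antithetic estimator averages n/2 pairs (U, 1 − U) at the same cost.

```python
import math
import random

def plain_and_antithetic(n, seed=0):
    """Estimate E[exp(U)], U ~ U(0,1), two ways with the same budget of
    n function evaluations: plain averaging vs. antithetic pairs."""
    rng = random.Random(seed)
    plain = sum(math.exp(rng.random()) for _ in range(n)) / n
    anti = 0.0
    for _ in range(n // 2):
        u = rng.random()
        anti += (math.exp(u) + math.exp(1 - u)) / 2   # negatively correlated pair
    anti /= (n // 2)
    return plain, anti

plain, anti = plain_and_antithetic(100_000)
print(round(plain, 3), round(anti, 3))  # both near e - 1, about 1.718
```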

5. Describe the Metropolis-Hastings Algorithm


Definition:
The Metropolis-Hastings algorithm generates samples from a target distribution π(x) using a proposal distribution q(x → y).

Steps:
 1. Start at an initial state X_0.
 2. Propose a new state Y from q(X → Y).
 3. Compute the acceptance probability: α = min(1, [π(Y) q(Y → X)] / [π(X) q(X → Y)]).
 4. Accept Y with probability α; otherwise, retain X_n (i.e., set X_{n+1} = X_n).

Example:
• Suppose π(x)\pi(x) is a normal distribution and q(x→y)q(x \to y) is a uniform distribution.
• Propose Y=X+ϵY = X + \epsilon, where ϵ∼U(−1,1)\epsilon \sim U(-1,1).
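A minimal sketch of this random-walk sampler targeting the standard normal (the function name, step size, and burn-in length are my own choices for the demo):

```python
import math
import random

def metropolis_normal(n, step=1.0, seed=0):
    """Random-walk Metropolis sampler for the standard normal target.
    The proposal Y = X + eps, eps ~ U(-step, step), is symmetric, so
    the acceptance ratio reduces to pi(Y)/pi(X) = exp((X^2 - Y^2)/2)."""
    rng = random.Random(seed)
    x, samples = 0.0, []
    for _ in range(n):
        y = x + rng.uniform(-step, step)
        alpha = math.exp(min(0.0, (x * x - y * y) / 2))  # min caps alpha at 1
        if rng.random() < alpha:
            x = y
        samples.append(x)   # on rejection the current state is repeated
    return samples

s = metropolis_normal(200_000)[10_000:]   # discard burn-in
mean = sum(s) / len(s)
var = sum((v - mean) ** 2 for v in s) / len(s)
print(round(mean, 2), round(var, 2))  # target has mean 0, variance 1
```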

6. What is Gibbs Sampling, and How is it Used in MCMC?


Definition:
Gibbs sampling is a special case of MCMC where samples are drawn from the conditional
distributions of each variable.

Steps:
1. Initialize the values of all variables.
2. Iteratively sample each variable from its conditional distribution, keeping others fixed.

Example:
For a bivariate distribution P(X, Y), alternate:
• Sample X ∼ P(X | Y).
• Sample Y ∼ P(Y | X).

Applications:
Used in Bayesian inference for complex models.
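The bivariate example above can be sketched for a bivariate normal target (my own choice of target: unit variances with correlation ρ = 0.8, for which the full conditionals are X | Y = y ∼ N(ρy, 1 − ρ²) and symmetrically for Y):

```python
import math
import random

def gibbs_bivariate_normal(n, rho=0.8, seed=0):
    """Gibbs sampler for a bivariate normal with unit variances and
    correlation rho, alternating draws from the two full conditionals."""
    rng = random.Random(seed)
    x = y = 0.0
    sd = math.sqrt(1 - rho * rho)
    out = []
    for _ in range(n):
        x = rng.gauss(rho * y, sd)   # X | Y = y
        y = rng.gauss(rho * x, sd)   # Y | X = x
        out.append((x, y))
    return out

pairs = gibbs_bivariate_normal(100_000)[1000:]   # discard burn-in
exy = sum(x * y for x, y in pairs) / len(pairs)
print(round(exy, 2))  # E[XY] should be close to rho = 0.8
```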

7. How are Multivariate Normal Random Variables Generated?


Steps:
 1. Start with a vector of independent standard normal variables Z ∼ N(0, I).
 2. Factor the covariance matrix as Σ = LLᵀ, where L is obtained via Cholesky decomposition.
 3. Compute X = μ + LZ, where μ is the mean vector.

Example:
For μ = [1, 2] and Σ = [[1, 0.8], [0.8, 1]]:
 1. Perform the Cholesky decomposition of Σ.
 2. Use Z = [Z_1, Z_2], where Z_1, Z_2 ∼ N(0, 1), to compute X = μ + LZ.
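The worked example above can be carried out in a short Python sketch (the helper names are illustrative; for a 2×2 covariance matrix the Cholesky factor has a closed form):

```python
import math
import random

def cholesky_2x2(s11, s12, s22):
    """Lower-triangular Cholesky factor L of a 2x2 covariance matrix, Sigma = L L^T."""
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    return l11, l21, l22

def mvn_sample(mu, sigma, rng):
    """Draw X = mu + L Z, with Z a pair of independent N(0,1) variables."""
    l11, l21, l22 = cholesky_2x2(sigma[0][0], sigma[0][1], sigma[1][1])
    z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
    return mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2

rng = random.Random(0)
mu, sigma = [1.0, 2.0], [[1.0, 0.8], [0.8, 1.0]]
xs = [mvn_sample(mu, sigma, rng) for _ in range(100_000)]
m1 = sum(x for x, _ in xs) / len(xs)
m2 = sum(y for _, y in xs) / len(xs)
cov = sum((x - m1) * (y - m2) for x, y in xs) / len(xs)
print(round(m1, 2), round(m2, 2), round(cov, 2))  # near 1, 2, and 0.8
```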

8. Write a Note on Stratified Sampling and Its Benefits


Definition:
Stratified sampling divides the domain into distinct strata and performs independent sampling
within each.

Benefits:
• Reduces variance.
• Ensures better representation of the entire domain.

Example:
If estimating the mean of a population divided into males and females, stratify by gender and
sample within each group.
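For a numerical illustration (the function name, the integrand x², and the stratum counts are my own choices for the demo), stratified Monte Carlo integration on [0, 1] splits the interval into equal strata and averages f over uniform draws within each:

```python
import random

def stratified_estimate(f, n_strata, per_stratum, seed=0):
    """Stratified Monte Carlo estimate of the integral of f over [0, 1]:
    split [0, 1] into n_strata equal strata and average f over
    per_stratum uniform draws inside each stratum."""
    rng = random.Random(seed)
    width = 1.0 / n_strata
    total = 0.0
    for k in range(n_strata):
        lo = k * width
        total += width * sum(f(lo + width * rng.random())
                             for _ in range(per_stratum)) / per_stratum
    return total

est = stratified_estimate(lambda x: x * x, n_strata=10, per_stratum=1000)
print(round(est, 3))  # exact value is 1/3
```

Because each draw is confined to a narrow stratum where f varies little, the estimate is far less variable than plain Monte Carlo with the same total number of samples.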

ALL BEST KHUSHI


JAI SHRI KRISHNA
