Diffusion Models: A Concise Perspective
Shashwat Gupta1[0009−0003−8037−2348]
1 Diffusion Models
There are several types of generative models popular now (as shown in Figure 1), but none is without its flaws.
The forward process starts from data $x_0 \sim q(x)$ and adds Gaussian noise over $T$ steps:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\, \sqrt{1-\beta_t}\, x_{t-1},\, \beta_t I\right)$$

$$q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})$$

Defining $\alpha_t = 1 - \beta_t$ and $\bar\alpha_t = \prod_{i=1}^{t} \alpha_i$, the marginal at any step has the closed form

$$q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\, \sqrt{\bar\alpha_t}\, x_0,\, (1-\bar\alpha_t)\, I\right)$$
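As a concrete illustration, the closed-form marginal lets us sample $x_t$ directly from $x_0$ in one shot, without simulating the chain. A minimal NumPy sketch; the linear schedule endpoints ($10^{-4}$ to $0.02$) are illustrative assumptions, not values fixed by the text:

```python
import numpy as np

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule beta_t and its cumulative products alpha_bar_t."""
    betas = np.linspace(beta_start, beta_end, T)
    alphas = 1.0 - betas                 # alpha_t = 1 - beta_t
    alpha_bars = np.cumprod(alphas)      # alpha_bar_t = prod_{i <= t} alpha_i
    return betas, alpha_bars

def q_sample(x0, t, alpha_bars, rng=None):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
```

For large $t$, $\bar\alpha_t$ approaches zero, so $x_t$ is approximately standard Gaussian noise.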
Since $\beta_t$ is small, $q(x_{t-1} \mid x_t)$ is also Gaussian. However, estimating this quantity would require using the entire dataset, so we learn a model $p_\theta$ to approximate the conditional probabilities.
The learned noise predictor $\epsilon_\theta$ also yields an estimate of the score:

$$s_\theta(x_t, t) \approx \nabla_{x_t} \log q(x_t) = \mathbb{E}_{q(x_0)}\!\left[\nabla_{x_t} \log q(x_t \mid x_0)\right] = \mathbb{E}_{q(x_0)}\!\left[-\frac{\epsilon_\theta(x_t, t)}{\sqrt{1-\bar\alpha_t}}\right] = -\frac{\epsilon_\theta(x_t, t)}{\sqrt{1-\bar\alpha_t}}$$
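In code, converting a model's noise prediction into a score estimate is a one-line rescaling. A sketch, where `eps` stands in for the output of a trained $\epsilon_\theta$:

```python
import numpy as np

def score_from_eps(eps, alpha_bar_t):
    """Score estimate: s_theta(x_t, t) = -eps_theta(x_t, t) / sqrt(1 - alpha_bar_t)."""
    return -eps / np.sqrt(1.0 - alpha_bar_t)
```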
To turn a diffusion model into a conditional model [22], we can add conditioning information $y$ at each step, with a guidance scale $s$:
$$p_\theta(x_{0:T} \mid y) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t, y)$$

$$\nabla_{x_t} \log p_\theta(x_t \mid y) = \nabla_{x_t} \log p_\theta(x_t) + s \cdot \nabla_{x_t} \log p_\theta(y \mid x_t)$$
Using $\nabla_{x_t} \log q(x_t) = -\frac{1}{\sqrt{1-\bar\alpha_t}}\,\epsilon_\theta(x_t, t)$, this gives a modified noise estimate

$$\bar\epsilon_\theta(x_t, t) = \epsilon_\theta(x_t, t) - \sqrt{1-\bar\alpha_t}\, s\, \nabla_{x_t} \log p_\theta(y \mid x_t)$$
The above score-based formulation eliminates the term $p_\theta(y)$, which would require knowledge of all data points.
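The guided estimate can be sketched as a post-hoc correction to the model's noise prediction. Here `eps` and `classifier_grad` are hypothetical stand-ins for the output of a trained noise predictor and for $\nabla_{x_t} \log p_\theta(y \mid x_t)$ (in practice obtained by backpropagating through a noisy-image classifier):

```python
import numpy as np

def classifier_guided_eps(eps, classifier_grad, alpha_bar_t, s=1.0):
    """eps_bar = eps - sqrt(1 - alpha_bar_t) * s * grad_x log p(y | x_t)."""
    return eps - np.sqrt(1.0 - alpha_bar_t) * s * classifier_grad
```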
The following are the popular ways to condition a diffusion model:
[Figure: (b) Adding noise extends predictions into sparse regions closer to the low-dimensional manifold.]
The implicit classifier gradient can be expressed through conditional and unconditional noise predictions:

$$\nabla_{x_t} \log p(y \mid x_t) = \nabla_{x_t} \log p(x_t \mid y) - \nabla_{x_t} \log p(x_t) = -\frac{1}{\sqrt{1-\bar\alpha_t}}\left(\epsilon_\theta(x_t, t, y) - \epsilon_\theta(x_t, t)\right)$$

i.e., with guidance weight $w$,

$$\bar\epsilon_\theta(x_t, t, y) = \epsilon_\theta(x_t, t, y) - \sqrt{1-\bar\alpha_t}\, w\, \nabla_{x_t} \log p(y \mid x_t) = \epsilon_\theta(x_t, t, y) + w\left(\epsilon_\theta(x_t, t, y) - \epsilon_\theta(x_t, t)\right) = (w+1)\,\epsilon_\theta(x_t, t, y) - w\,\epsilon_\theta(x_t, t)$$
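The final expression reduces classifier-free guidance to a linear combination of the conditional and unconditional predictions. A minimal sketch:

```python
import numpy as np

def cfg_eps(eps_cond, eps_uncond, w=1.0):
    """Classifier-free guidance: (w + 1) * eps_cond - w * eps_uncond."""
    return (w + 1.0) * eps_cond - w * eps_uncond
```

With $w = 0$ this recovers the purely conditional prediction; larger $w$ pushes samples toward higher $p(y \mid x_t)$ at the cost of diversity.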
References
1. Blog: https://lilianweng.github.io/posts/2021-07-11-diffusion-models/
2. Blog: https://theaisummer.com/diffusion-models/
3. Blog: https://towardsdatascience.com/diffusion-models-made-easy-8414298ce4da (a simplistic explanation)
4. Video: https://www.youtube.com/watch?v=HoKDTa5jHvg&t=1284s
5. Paper: DDPM: https://arxiv.org/pdf/2006.11239.pdf (Ho et al., 2020)
6. Video Explanation: https://www.youtube.com/watch?v=W-O7AZNzbzQ
7. Annotated Code: https://huggingface.co/blog/annotated-diffusion
8. Blog - Variational AutoEncoders: https://lilianweng.github.io/posts/2018-08-12-vae/
9. Blog - Latent Variable Models: https://theaisummer.com/latent-variable-models/
10. Paper: Deep Unsupervised Learning using Nonequilibrium Thermodynamics, Sohl-Dickstein et al., 2015: https://arxiv.org/pdf/1503.03585.pdf
11. Paper: Improved Denoising Diffusion Probabilistic Models, Nichol and Dhariwal, 2021: https://arxiv.org/pdf/2102.09672.pdf
12. Paper: Diffusion Models Beat GANs on Image Synthesis, Dhariwal and Nichol, 2021: https://arxiv.org/pdf/2105.05233.pdf
13. Paper: Generative Modeling by Estimating Gradients of the Data Distribution (noise-conditioned score network), Song and Ermon, 2019: https://arxiv.org/abs/1907.05600
14. Paper: Cold Diffusion: https://arxiv.org/pdf/2208.09392.pdf
15. Paper: Understanding Diffusion Models: A Unified Perspective, Calvin Luo, 2022: https://arxiv.org/pdf/2208.11970.pdf
16. Paper: Fast Sampling of Diffusion Models with Exponential Integrator, Zhang et al., 2022: https://arxiv.org/abs/2204.13902
17. Paper: Classifier-Free Diffusion Guidance, Ho and Salimans, 2021: https://openreview.net/pdf?id=qw8AKxfYbI
18. Paper: Diffusion Models: A Comprehensive Survey of Methods and Applications, Yang et al., 2022: https://arxiv.org/pdf/2209.00796.pdf
19. Video: Diffusion and Score-Based Generative Models: https://www.youtube.com/watch?v=wMmqCMwuM2Q
20. Blog: https://yang-song.net/blog/2021/score/
21. Blog: Score-based generative modeling papers (autoregressive models, normalizing flows, energy-based models, VAEs): https://scorebasedgenerativemodeling.github.io/
22. Blog: Guiding the Diffusion Process: https://sander.ai/2022/05/26/guidance.html
23. Blog: Diffusion Models as Autoencoders: https://sander.ai/2022/01/31/diffusion.html
24. Video: Langevin Dynamics end to end: https://www.youtube.com/watch?v=3-KzIjoFJy4&t=2379s
25. Paper: Adding Conditional Control to Text-to-Image Diffusion Models, Zhang et al., 2023: https://arxiv.org/abs/2302.05543
26. Paper: Bayesian Learning via Stochastic Gradient Langevin Dynamics, Welling and Teh, 2011: https://www.stats.ox.ac.uk/~teh/research/compstats/WelTeh2011a.pdf
27. Paper: High-Resolution Image Synthesis with Latent Diffusion Models, Rombach et al., 2022: https://arxiv.org/abs/2112.10752
28. Paper: Denoising Diffusion Implicit Models, Song et al., 2021: https://arxiv.org/abs/2010.02502