Atharv Report Final

Abstract
Generative Adversarial Networks (GANs) are a class of deep learning models that have
shown remarkable capabilities in generating realistic synthetic data. Introduced by Ian
Goodfellow in 2014, GANs work through a unique adversarial setup involving two neural
networks: a Generator and a Discriminator. The Generator aims to create synthetic data
that is indistinguishable from real data, while the Discriminator attempts to differentiate
between real and generated data. The two networks are trained simultaneously in a zero-sum game where the Generator learns to improve its outputs and the Discriminator improves its ability to distinguish real data from fake.
This project focuses on the implementation and training of a basic GAN model using the
MNIST dataset of handwritten digits. The Generator in the model is designed to take
random noise as input and generate images that resemble handwritten digits. The
Discriminator, on the other hand, is tasked with classifying these images as either real
(from the MNIST dataset) or fake (produced by the Generator).
To improve the stability of the training process, several techniques were incorporated into the model architecture, such as label smoothing and batch normalization. Label smoothing keeps the Discriminator from becoming overly confident in its predictions, which can destabilize training, while batch normalization in the Generator regulates the learning process, improving both convergence speed and generalization. Training used the Adam optimizer, which is known for its efficiency and effectiveness in training deep learning models.
TABLE OF CONTENTS
1 Certificate
2 Abstract
3 Table of Contents
4 Chapter 1: Introduction
5 Chapter 2: Literature Survey
7 Chapter 4: Experimental Results
8 Chapter 5: Conclusion
9 References
Chapter 1: Introduction
1.1 Introduction
Generative Adversarial Networks (GANs) have emerged as one of the most powerful
and intriguing innovations in the field of deep learning. Introduced by Ian Goodfellow
in 2014, GANs have revolutionized the way synthetic data is generated. They consist
of two neural networks, the Generator and the Discriminator, engaged in a two-player game: the Generator creates data that mimics real-world samples, while the
Discriminator tries to distinguish between real and synthetic data. Through this
adversarial training process, both networks iteratively improve, leading to the
generation of realistic data that is often indistinguishable from the actual data.
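Formally, this two-player game is the minimax objective introduced in Goodfellow et al. (2014), where $G$ is the Generator, $D$ the Discriminator, $p_{\text{data}}$ the real data distribution, and $p_z$ the noise prior:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$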
The strength of GANs lies in their ability to generate high-quality, realistic data
without the need for paired input-output datasets. Their impact spans various
domains such as computer vision, text-to-image synthesis, style transfer, medical
image enhancement, and artistic content generation. However, training GANs is a
complex task, often plagued with challenges such as mode collapse, training
instability, and gradient vanishing.
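A standard mitigation for vanishing generator gradients, and the formulation used in the training loop of Chapter 4, is the non-saturating loss: rather than minimizing log(1 - D(G(z))), the Generator maximizes log D(G(z)). A minimal sketch with PyTorch's BCELoss follows; the tensor disc_out is a hypothetical batch of discriminator scores, used only for illustration.

import torch
import torch.nn as nn

criterion = nn.BCELoss()

# Hypothetical discriminator scores for a batch of generated images, in (0, 1).
disc_out = torch.rand(64)

# Non-saturating generator loss: BCE against all-ones targets is equivalent
# to minimizing -log(D(G(z))). Early in training, when the Discriminator
# confidently rejects fakes (scores near 0), this loss still produces strong
# gradients, whereas the original log(1 - D(G(z))) form saturates.
lossG = criterion(disc_out, torch.ones_like(disc_out))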
This project implements a basic GAN using PyTorch with the goal of generating
synthetic images of handwritten digits from the MNIST dataset. This dataset, a
staple in the machine learning community, contains 28x28 grayscale images of digits
ranging from 0 to 9. The project emphasizes both qualitative and quantitative
evaluation of the generated outputs and introduces stabilization techniques like label
smoothing and batch normalization to enhance performance.
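As a concrete illustration of the label-smoothing technique (mirroring the discriminator update in the Chapter 4 code), real labels are softened from a hard 1.0 to 0.9, so the Discriminator is never pushed toward absolute certainty. A minimal sketch with hypothetical scores:

import torch
import torch.nn as nn

criterion = nn.BCELoss()

# Hypothetical discriminator outputs for a batch of real images.
disc_real = torch.rand(64)

# One-sided label smoothing: targets of 0.9 instead of 1.0 for real samples.
smooth_targets = torch.ones_like(disc_real) * 0.9
lossD_real = criterion(disc_real, smooth_targets)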
1.2 Need
In many real-world applications, access to high-quality labeled data can be limited
due to cost, privacy, or availability constraints. Generative models, especially GANs,
offer a solution by creating synthetic data that can augment or even replace real
datasets in training machine learning models.
The need for this project is rooted in the following:
- Understanding GANs from the ground up by implementing and experimenting with the architecture.
- Generating realistic synthetic data for use in applications like data augmentation or image simulation.
- Exploring stabilization techniques to improve GAN training, which is notoriously unstable.
- Creating a foundation for more complex generative models like DCGANs, WGANs, and CycleGANs.
For beginners and practitioners alike, understanding how to build and train GANs on
datasets like MNIST provides critical insights into adversarial learning, optimization
challenges, and generative modeling strategies.
1.3 Motivation
The motivation behind this project is twofold: academic learning and practical skill-building. GANs represent a cutting-edge advancement in artificial intelligence, and understanding how they function equips one with a powerful tool in the deep learning
arsenal. By implementing a GAN from scratch using PyTorch, this project allows for
a hands-on experience that bridges theoretical concepts with practical execution.
Specific motivations include:
- Curiosity about generative AI: with growing trends in AI-generated art, text, and images, understanding GANs opens doors to innovation and creativity.
- Hands-on learning of PyTorch: building a GAN provides exposure to model definition, custom training loops, and visualization of outputs using one of the most widely used deep learning frameworks.
- Career relevance: GANs are increasingly applied in industry sectors such as fashion, healthcare, and entertainment. This project acts as a stepping stone toward mastering generative models for real-world use cases.
- Academic value: the MNIST-based GAN implementation sets a solid foundation for advanced research or coursework in AI and deep learning.
This project also serves as an inspiration to explore deeper GAN variants and apply
them to more complex datasets and domains.
Chapter 2: Literature Survey
Chapter 4: Experimental Results
CODE
# ===============================
# ✅ FINAL GAN PROJECT CODE
# ===============================
# 🔧 SETUP (Remove old logs)
!rm -rf runs
!pip install -q tensorboard
%load_ext tensorboard
%tensorboard --logdir runs

# ===============================
# 🧠 IMPORTS
# ===============================
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.datasets as datasets
from torch.utils.data import DataLoader
import torchvision.transforms as transforms
from torch.utils.tensorboard import SummaryWriter
import matplotlib.pyplot as plt
import os

# ===============================
# 🧠 MODEL DEFINITIONS
# ===============================
class Discriminator(nn.Module):
    def __init__(self, img_dim):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(img_dim, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
            nn.Sigmoid()  # outputs a probability that the input is real
        )

    def forward(self, x):
        return self.model(x)


class Generator(nn.Module):
    def __init__(self, z_dim, img_dim):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(z_dim, 256),
            nn.BatchNorm1d(256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 512),
            nn.BatchNorm1d(512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, img_dim),
            nn.Tanh()  # outputs in [-1, 1], matching the normalized images
        )

    def forward(self, x):
        return self.model(x)

# ===============================
# ⚙️ HYPERPARAMETERS
# ===============================
device = "cuda" if torch.cuda.is_available() else "cpu"
lr = 1e-4
z_dim = 64
image_dim = 28 * 28
batch_size = 64
num_epochs = 100

# ===============================
# 📦 DATASET & LOADER
# ===============================
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))  # scale pixels to [-1, 1]
])
dataset = datasets.MNIST(root="dataset/", transform=transform, download=True)
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

# ===============================
# 🧠 INITIALIZE MODELS & OPTIMIZERS
# ===============================
disc = Discriminator(image_dim).to(device)
gen = Generator(z_dim, image_dim).to(device)
opt_disc = optim.Adam(disc.parameters(), lr=lr)
opt_gen = optim.Adam(gen.parameters(), lr=lr)
criterion = nn.BCELoss()
fixed_noise = torch.randn((batch_size, z_dim)).to(device)  # fixed batch for consistent visualization
writer_fake = SummaryWriter("runs/GAN_MNIST/fake")
writer_real = SummaryWriter("runs/GAN_MNIST/real")
step = 0
G_losses = []
D_losses = []

# Directories to save generated samples and model checkpoints
os.makedirs("samples", exist_ok=True)
os.makedirs("models", exist_ok=True)

# ===============================
# 🔁 TRAINING LOOP
# ===============================
for epoch in range(num_epochs):
    for batch_idx, (real, _) in enumerate(loader):
        real = real.view(-1, image_dim).to(device)
        cur_batch_size = real.size(0)

        # ========== Train Discriminator ==========
        noise = torch.randn(cur_batch_size, z_dim).to(device)
        fake = gen(noise)
        disc_real = disc(real).view(-1)
        lossD_real = criterion(disc_real, torch.ones_like(disc_real) * 0.9)  # label smoothing
        disc_fake = disc(fake.detach()).view(-1)
        lossD_fake = criterion(disc_fake, torch.zeros_like(disc_fake))
        lossD = (lossD_real + lossD_fake) / 2
        disc.zero_grad()
        lossD.backward()
        opt_disc.step()

        # ========== Train Generator ==========
        output = disc(fake).view(-1)
        lossG = criterion(output, torch.ones_like(output))  # fool the discriminator
        gen.zero_grad()
        lossG.backward()
        opt_gen.step()

        # Save losses
        G_losses.append(lossG.item())
        D_losses.append(lossD.item())

        # ========== Logging & Visualization ==========
        if batch_idx % 100 == 0:
            print(
                f"Epoch [{epoch+1}/{num_epochs}] Batch {batch_idx}/{len(loader)} "
                f"Loss D: {lossD:.4f}, Loss G: {lossG:.4f}"
            )
            with torch.no_grad():
                fake_imgs = gen(fixed_noise).reshape(-1, 1, 28, 28)
                real_imgs = real.reshape(-1, 1, 28, 28)
                img_grid_fake = torchvision.utils.make_grid(fake_imgs, normalize=True)
                img_grid_real = torchvision.utils.make_grid(real_imgs, normalize=True)
                writer_fake.add_image("Fake Images", img_grid_fake, global_step=step)
                writer_real.add_image("Real Images", img_grid_real, global_step=step)
            step += 1

    # Save model checkpoints and generated images every 10 epochs
    if (epoch + 1) % 10 == 0:
        torch.save(gen.state_dict(), f"models/generator_epoch_{epoch+1}.pth")
        with torch.no_grad():
            samples = gen(fixed_noise).reshape(-1, 1, 28, 28)
        torchvision.utils.save_image(samples, f"samples/fake_epoch_{epoch+1}.png", normalize=True)

# ===============================
# 📊 PLOT LOSS CURVES
# ===============================
plt.figure(figsize=(10, 5))
plt.title("Generator and Discriminator Loss During Training")
plt.plot(G_losses, label="G")
plt.plot(D_losses, label="D")
plt.xlabel("Iterations")
plt.ylabel("Loss")
plt.legend()
plt.savefig("samples/loss_curve.png")
plt.show()
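After training, a saved Generator checkpoint can be reloaded to sample new digits without rerunning the loop. A short usage sketch, assuming the checkpoint paths produced above and that training ran long enough for the epoch-100 checkpoint to exist:

# Reload a saved Generator and sample a grid of new digits.
# Uses the Generator class, hyperparameters, and device defined above.
gen_loaded = Generator(z_dim, image_dim).to(device)
gen_loaded.load_state_dict(torch.load("models/generator_epoch_100.pth", map_location=device))
gen_loaded.eval()  # BatchNorm switches to running statistics for inference

with torch.no_grad():
    z = torch.randn(16, z_dim).to(device)
    samples = gen_loaded(z).reshape(-1, 1, 28, 28)
torchvision.utils.save_image(samples, "samples/inference_grid.png", normalize=True)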
Output
[Figures: grids of generated MNIST digits and real-image grids logged to TensorBoard during training, along with the saved loss curve.]
Chapter 5: Conclusion
REFERENCES
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative Adversarial Networks. arXiv preprint arXiv:1406.2661, 2014. https://arxiv.org/abs/1406.2661

PyTorch Documentation: Neural Networks (torch.nn). https://pytorch.org/docs/stable/nn.html

MNIST Handwritten Digit Dataset. http://yann.lecun.com/exdb/mnist/

TensorBoard: Visualizing Learning. https://www.tensorflow.org/tensorboard

Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv preprint arXiv:1511.06434, 2015. https://arxiv.org/abs/1511.06434

Towards Data Science. A Beginner's Guide to GANs with Code in PyTorch. https://towardsdatascience.com/a-beginners-guide-to-generative-adversarial-networks-gans-with-pytorch-code-79a36549f0c4

PyTorch-GAN: GitHub repository of GAN implementations and tutorials. https://github.com/eriklindernoren/PyTorch-GAN