Generative Adversarial Network Based Image Denoising Methods
2024
A Term paper report submitted in partial fulfillment of the requirement for the Award of degree
BACHELOR OF TECHNOLOGY
in
CSE (Artificial Intelligence & Machine Learning)
Submitted
By
CH. PRASAD
23345A4205
CERTIFICATE
This is to certify that the term paper report titled “Generative Adversarial Based Image Denoising
Methods” submitted by Ch. Prasad bearing Reg. No: 23345A4205 has been carried out in partial
fulfillment for the award of Bachelor of Technology in CSE (Artificial Intelligence & Machine
Learning) of GMRIT, Rajam affiliated to JNTUGV, Vizianagaram is a record of bonafide work
carried out by him under my guidance & supervision. The results embodied in this report have not
been submitted to any other University or Institute for the award of any degree.
ACKNOWLEDGEMENT
It gives me immense pleasure to express my deep sense of gratitude to my guide, Dr.
Sreejyothsna Ankam, Senior Assistant Professor, Department of CSE (Artificial Intelligence &
Machine Learning), for her wholehearted and invaluable guidance throughout this report. Without her
sustained and sincere effort, this report would not have taken its present shape. She encouraged and helped
me to overcome the various difficulties I faced at different stages of the report.
I would like to sincerely thank Dr. K. Srividya, Associate Professor & HOD,
Department of CSE (Artificial Intelligence & Machine Learning), for providing all the necessary
facilities that led to the successful completion of my report.
I take this privilege to thank our Principal, Dr. C.L.V.R.S.V. Prasad, who has made the
atmosphere so conducive to work. I shall always be indebted to him.
I would like to thank all the faculty members of the Department of CSE (Artificial
Intelligence & Machine Learning) for their direct and indirect support, and all the lab technicians
for their valuable suggestions and for providing excellent opportunities during the completion of this report.
Ch. Prasad
23345A4205
INDEX
CONTENTS PAGE NO
CERTIFICATE 2
ACKNOWLEDGEMENT 3
ABSTRACT 5
Chapter 1. INTRODUCTION 6
1.1 Historical Perspective on Image Denoising
1.2 Challenges in Image Denoising
1.3 Emergence of Deep Learning for Image Denoising
1.4 Role of GANs in Image Denoising
Chapter 2. RELATED WORK 9
Chapter 3. METHODOLOGY 12
Chapter 4. TRADITIONAL IMAGE DENOISING METHODS 22
Chapter 5. EXPERIMENT & SETUP 31
5.1 Dataset Selection and Preparation
5.2 Evaluation Metrics
5.3 Visualization of Denoised Images
Chapter 6. RESULTS AND COMPARISONS 35
Chapter 7. FUTURE DIRECTIONS AND CONCLUSION 37
7.1 Summary of Contributions
7.2 Limitations of the Proposed Methods
7.3 Future Research Directions
7.4 Conclusion
REFERENCES 42
ABSTRACT
One of the main challenges in image processing is image denoising, which aims to eliminate noise while
keeping crucial details. This topic has benefited greatly from recent developments in Generative
Adversarial Networks (GANs), which provide fresh approaches to long-standing denoising problems. This
paper provides a thorough overview of GAN-based image denoising techniques, examining several
approaches and their efficacy. I trace the development of GAN architectures used for denoising tasks,
starting with simple models and moving up to more complex variants such as self-supervised Noise2Noise
methods and Wasserstein GANs (WGANs). Important topics covered include single-image denoising and
the adaptation of GANs to handle blur. In addition to offering suggestions for future research directions in
the field, this assessment aims to shed light on the advantages and disadvantages of the GAN-based
denoising techniques already in use.
Keywords: Generative Adversarial Networks (GANs), Image Denoising, Noise2Noise, Wasserstein GAN
(WGAN), Image Quality Enhancement
INTRODUCTION
Image denoising has evolved significantly over the years, beginning with basic linear filtering techniques
like Gaussian and mean filters in the 1960s and 1970s, which smoothed images to reduce noise but often
blurred details. In the 1980s and 1990s, non-linear filters such as the median filter and the Wiener filter
improved noise reduction while preserving edges. The introduction of wavelet and Fourier transform
methods allowed denoising in the transform domain, enabling more precise noise removal by isolating
noise in specific frequency ranges. In the 2000s, variational methods like total variation denoising and
Bayesian approaches gained prominence, offering more adaptive solutions for preserving image features.
The 2000s also saw the rise of non-local means, which improved noise removal by leveraging similar
patches across an image. With the advent of deep learning in the 2010s, techniques such as convolutional
neural networks (CNNs) and generative adversarial networks (GANs) revolutionized image denoising,
achieving impressive results by learning complex patterns in noisy images. Modern methods like
Noise2Noise also introduced self-supervised learning, further advancing the field. Despite these
advancements, challenges such as dealing with real-world noise and the computational cost of deep
learning models continue to drive ongoing research in the field.
Traditional denoising techniques face several challenges that limit their effectiveness, particularly in
complex or real-world scenarios. Many methods, like Gaussian and mean filters, tend to blur fine details
and edges, leading to a loss of important features in the image. While non-linear filters like bilateral
filtering improve edge preservation, they still struggle with heavy noise or intricate textures. Additionally,
traditional methods often assume a specific noise model, such as Gaussian noise, making them less
adaptable to different types of noise found in real-world images. Computational complexity is another
issue, as some techniques, like wavelet-based denoising, can be resource-intensive, making them
unsuitable for real-time applications. These methods also risk over-smoothing or removing high-frequency
information, resulting in unnatural or blurred images. Furthermore, traditional denoising algorithms can
struggle with textured or homogeneous regions, where distinguishing noise from important details is
difficult. These limitations have spurred the development of more advanced approaches, particularly those
based on machine learning, which offer more robust solutions for complex noise patterns and preserve
image quality more effectively.
The emergence of deep learning for image denoising has marked a significant shift in the field, offering a
more powerful and adaptive approach compared to traditional methods. Traditional denoising techniques,
while effective in certain scenarios, struggled with issues such as blurring of fine details, difficulty in
handling different types of noise, and computational inefficiency. Deep learning, particularly through the
use of Convolutional Neural Networks (CNNs), has addressed many of these limitations by learning
complex patterns and features from large datasets.
Deep learning methods, especially CNNs, have revolutionized image denoising by learning to identify and
remove noise without explicitly relying on predefined noise models or assumptions. These models are
trained on pairs of noisy and clean images, enabling them to learn the underlying structure of the noise and
the important features of the image. Unlike traditional methods, deep learning approaches can generalize
across a wide range of noise types, including Gaussian, salt-and-pepper, and speckle noise, providing
more flexible and robust denoising solutions.
One of the key advantages of deep learning in denoising is its ability to preserve fine details and edges.
CNNs can learn spatial hierarchies of features, allowing them to better differentiate between noise and
important image structures. This leads to sharper and more accurate denoising results, particularly in
images with complex textures or fine details.
In addition to CNNs, other deep learning architectures, such as autoencoders and generative adversarial
networks (GANs), have also been applied to image denoising. Autoencoders learn to compress and
reconstruct images, effectively removing noise during the reconstruction process. GANs, on the other
hand, can generate high-quality, noise-free images by training a generator to produce realistic denoised
images and a discriminator to distinguish between denoised and real images, improving the quality of
denoising even further.
Generative Adversarial Networks (GANs) have played a transformative role in the field of image
denoising, offering a new and powerful approach to remove noise while preserving image details. GANs
consist of two neural networks: a generator and a discriminator, which work in tandem to improve the
quality of denoised images. The generator is responsible for creating denoised images from noisy inputs,
while the discriminator judges how realistic and noise-free those outputs are.
The key advantage of GANs in image denoising is their ability to model complex, high-level structures in
images. Unlike traditional methods, which rely on predefined filters or statistical models, GANs learn to
remove noise through a process of competition between the generator and discriminator. The generator
continuously improves its denoising capability, while the discriminator helps the network refine the output
to make it as realistic and noise-free as possible.
One significant benefit of using GANs for denoising is their ability to preserve fine details, textures, and
edges. Traditional denoising methods often struggle with over-smoothing, where details are lost along
with the noise. GANs, on the other hand, focus on learning the natural distribution of images and can
generate sharp, detailed images while removing noise. This makes GAN-based denoising particularly
effective for images with complex textures or fine structures, such as in medical imaging or high-
resolution photographs.
The process of training GANs for image denoising typically involves using noisy images as inputs and
clean, noise-free images as the target outputs. Over time, as the generator and discriminator improve
through adversarial training, the generator becomes capable of producing high-quality denoised images.
GANs have also been applied in a variety of denoising tasks, from handling Gaussian noise to more
challenging types such as salt-and-pepper noise, speckle, or even real-world noise introduced by camera
sensors.
Another notable advancement in GAN-based image denoising is the use of conditional GANs (cGANs). In
these models, both the noisy image and the corresponding clean image are used to guide the generator,
providing a more targeted approach to noise reduction. cGANs enable more precise denoising by
conditioning the model on additional information, such as the specific type of noise or image content.
Despite their impressive capabilities, GANs for image denoising are not without challenges. Training
GANs can be computationally expensive and may require large amounts of data to generalize effectively.
Additionally, the adversarial training process can sometimes lead to instability, requiring careful tuning of
hyperparameters. However, the remarkable ability of GANs to produce high-quality denoised images
while preserving intricate details has made them a powerful tool in modern image denoising applications,
outperforming traditional methods in many cases.
Related Work
Qu, Z., Zhang, Y., Sun, Y., & Lin, X. [1] proposed A New Generative Adversarial Network for
Texture Preserving Image Denoising. The Noise2Noise framework builds on classical and deep learning-
based denoising methods. Traditional techniques used filtering and optimization, while modern
approaches, like CNNs (e.g., DnCNN) and autoencoders, learn mappings from noisy to clean images.
GANs further enhance denoising by preserving textures through adversarial training, making them ideal
for detail-sensitive applications.
Samik Banerjee and Sukhendu Das [2] proposed SD-GAN: Structural and Denoising GAN reveals facial parts under
occlusion. Classical methods like Non-Local Means and BM3D use filtering and statistical models for
noise removal. Deep learning models like DnCNN leverage CNNs trained on clean-noisy pairs for
improved denoising, while GANs enhance perceptual quality through adversarial training. Noise2Noise
innovates by using noisy image pairs instead of clean references, enabling denoising without clean data.
Self-supervised extensions adapt this approach for single-image denoising by simulating multiple noisy
versions.
Xin Cheng, Jingmei Zhou, Jiachun Song, and Xiangmo Zhao [3] proposed A Highway Traffic Image Enhancement
Algorithm Based on Improved GAN in Complex Weather Conditions. Traditional methods like histogram
equalization and Retinex struggle with complex textures and overexposure in challenging weather. Deep
learning models, such as GCANet and object detection networks, improve clarity and adapt to varying
conditions, while GANs enhance underwater image clarity but face synthetic-to-real distribution
challenges. However, many methods still struggle with fine details and color fidelity, highlighting the
need for better solutions.
Songkui Chen and Daming Shi [4] proposed Image Denoising With Generative Adversarial Networks and Its
Application to Cell Image Enhancement. The paper critiques traditional MSE-based denoising for
producing blurry results and limited texture recovery. Feature extraction methods are also constrained by
task-specific sensitivity. The proposed framework addresses these issues using Wasserstein GANs
(WGANs), improving detail recovery by focusing on the distribution of clean images, particularly for cell
image denoising.
Shaobo Zhao, Sheng Lin, and Xi Cheng [6] proposed Dual-GAN Complementary Learning for Real-World Image
Denoising. Notable denoising methods like DnCNNs, FFDNet, CBDNet, RIDNet, and SADNet address
image noise using deep learning. However, many rely on L1 or L2 losses, leading to blurring and limited
real-world effectiveness. Handling complex, spatially varying noise has driven the development of
strategies combining denoised image and noise learning. Single GAN-based methods face challenges with
complexity and training, prompting the proposed DGCL strategy to enhance performance and overcome
these limitations.
Haijun Hu, Bo Gao, and Zhiyuan Shen [7] proposed Image Smear Removal via Improved Conditional
GAN and Semantic Network. Image smear removal techniques often face challenges with paired training
data and reconstruction loss in traditional methods like autoencoders. GANs offer a solution by generating
realistic images, but issues like error oscillations and gradient instability persist, leading to the adoption of
stable models like Wasserstein GANs (WGAN). Recent methods use architectures like VGG-16 for
improved feature extraction, combining low- and high-dimensional features. However, visual consistency
in complex backgrounds remains a challenge. The proposed method addresses this by leveraging GANs
and multi-scale feature fusion to enhance restoration accuracy and realism.
Xin Jin, Ying Hu, and Chu-Yue Zhang [8] proposed an image restoration method that combines GANs with multi-
scale feature fusion to address limitations in traditional algorithms requiring paired data. While GAN-
based approaches often complicate training and produce unnatural results, this method uses a VGG-16
encoder-decoder to fuse low- and high-dimensional features, enhancing restoration quality. Incorporating
WGAN principles and L1 loss improves similarity to target images. Experiments show improved accuracy
and visual consistency, though challenges remain in complex backgrounds, which future work aims to
address.
Luan Thanh Trinh and Tomoki Hamagami [9] proposed Latent Denoising Diffusion GAN: Faster
Sampling, Higher Image Quality. The Latent Denoising Diffusion GAN (LDDGAN) improves image
generation by addressing the slow inference of diffusion models while enhancing quality and diversity.
Wenda Li and Jian Wang [12] proposed Residual Learning of Cycle-GAN for Seismic Data Denoising.
The Residual Cycle-GAN (RCGAN) improves seismic data denoising by integrating residual learning into
a Cycle-GAN framework, addressing the limitations of traditional methods that require manual parameter
selection. RCGAN uses data augmentation to enhance training efficiency and adaptability to real seismic
data, generating labeled datasets from noisy unlabeled data. It outperforms methods like FXDM and
DnCNN by better suppressing random noise while preserving important data details in both synthetic and
real seismic datasets.
Joonyoung Song, Jae-Heon Jeong, and Dae-Soon Park [15] proposed Unsupervised Denoising for Satellite Imagery
Using Wavelet Directional Cycle GAN. Satellite image denoising includes traditional methods like Total
Variation (TV), which preserves high-frequency information, and Low-Rank Matrix Recovery (LRMR),
which uses the low-rank properties of clean images. However, these methods require paired noisy and
clean images, which are hard to obtain. Recent deep learning approaches, like deep image prior and
Noise2Void, face issues with computational complexity and noise assumptions. This highlights the need
for innovative methods like WavCycleGAN, which denoises satellite imagery without paired datasets,
addressing challenges in remote sensing.
METHODOLOGY
In the field of image denoising, several methodologies have been developed to improve the accuracy,
efficiency, and generalization ability of models. These methodologies often involve advanced network
architectures, self-supervised learning frameworks, and robust optimization techniques to achieve superior
denoising results. Below is an in-depth look at the methodologies used for image denoising, including the
Noise2Noise framework, GAN architectures, and key implementation details.
The Noise2Noise framework is a self-supervised learning approach that uses noisy image pairs to train a
model to effectively denoise images without requiring clean ground-truth data. This approach has proven
to be highly effective in scenarios where obtaining clean images is impractical or costly. Below are the
key components of the Noise2Noise framework:
Data augmentation plays a critical role in self-supervised learning by artificially expanding the training
data, improving the model's generalization, and preventing overfitting. Common data augmentation
techniques used in the Noise2Noise framework include:
Cropping and Scaling: Random cropping and resizing of image patches can help the model learn to
focus on localized areas of the image, improving its ability to denoise even small patches.
Noise Injection: Introducing different types of noise (e.g., Gaussian noise, salt-and-pepper noise)
into clean images during training can further strengthen the model's ability to generalize to various
noise types.
Color Jittering: Slight adjustments to the brightness, contrast, and saturation of images can help the
model handle variations in lighting and visual conditions.
These augmentation techniques enable the model to learn more robust denoising representations by
artificially expanding the noisy image dataset.
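To make this concrete, here is a minimal NumPy sketch (my own illustration, not code from any cited work) of how random cropping and independent Gaussian noise injection could be combined to build Noise2Noise-style training pairs; the function names, patch size, and noise level are assumptions.

```python
import numpy as np

def random_crop(img, size=64, rng=None):
    # Randomly crop a square patch so the model sees localized regions of the image.
    rng = rng or np.random.default_rng()
    h, w = img.shape[:2]
    y = rng.integers(0, h - size + 1)
    x = rng.integers(0, w - size + 1)
    return img[y:y + size, x:x + size]

def add_gaussian_noise(img, sigma=0.1, rng=None):
    # Inject zero-mean Gaussian noise and clip back to the valid [0, 1] range.
    rng = rng or np.random.default_rng()
    noisy = img + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)

def make_noise2noise_pair(clean_img, sigma=0.1, patch=64, rng=None):
    # Two independent noise realizations of the same patch form one training pair.
    rng = rng or np.random.default_rng()
    patch_img = random_crop(clean_img, patch, rng)
    return add_gaussian_noise(patch_img, sigma, rng), add_gaussian_noise(patch_img, sigma, rng)

# Example with a synthetic 128x128 grayscale "image" in [0, 1].
clean = np.random.default_rng(0).random((128, 128))
inp, target = make_noise2noise_pair(clean, sigma=0.1)
print(inp.shape, target.shape)  # (64, 64) (64, 64)
```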
In the Noise2Noise framework, the loss function plays a crucial role in guiding the training process.
Common loss functions used for image denoising in this context include:
Mean Squared Error (MSE): The most common loss function in denoising tasks, MSE calculates
the squared difference between the predicted denoised image and the noisy image. This function
helps in minimizing pixel-wise errors.
Structural Similarity Index (SSIM): SSIM is used to measure the perceptual quality of images,
comparing structural information such as luminance, contrast, and texture. It is particularly useful
for preserving image quality while reducing noise.
Perceptual Loss: Based on the features extracted from pretrained deep neural networks (e.g., VGG
networks), perceptual loss evaluates the similarity between the high-level representations of the
predicted and target images. This helps in preserving high-level structures like textures and edges
during denoising.
Adversarial Loss: In GAN-based models, adversarial loss is employed, where the generator
attempts to create denoised images that cannot be distinguished from real clean images by the
discriminator. This loss encourages the generator to produce more realistic images.
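The PyTorch sketch below illustrates how such terms are often combined into a single training objective. It is a hedged example rather than a prescribed recipe: the loss weights, the tiny stand-in feature extractor, and the use of binary cross-entropy for the adversarial term are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def denoiser_loss(denoised, target, disc_fake_logits, feat_extractor=None,
                  w_mse=1.0, w_perc=0.1, w_adv=0.01):
    """Weighted sum of pixel, perceptual, and adversarial terms (weights are illustrative)."""
    # Pixel-wise fidelity term.
    loss = w_mse * F.mse_loss(denoised, target)
    # Perceptual term: distance between deep features of the output and the target.
    if feat_extractor is not None:
        loss = loss + w_perc * F.mse_loss(feat_extractor(denoised), feat_extractor(target))
    # Adversarial term: push the discriminator to label the output as "real".
    loss = loss + w_adv * F.binary_cross_entropy_with_logits(
        disc_fake_logits, torch.ones_like(disc_fake_logits))
    return loss

# Toy usage with random tensors and a tiny stand-in feature extractor.
feat = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU())
denoised = torch.rand(2, 1, 32, 32)
target = torch.rand(2, 1, 32, 32)
disc_logits = torch.randn(2, 1)
print(denoiser_loss(denoised, target, disc_logits, feat).item())
```

An SSIM-based term can be added in the same way when a differentiable SSIM implementation is available; for evaluation alone, SSIM is usually computed with a library such as scikit-image.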
Optimization techniques in the Noise2Noise framework focus on adjusting model parameters to minimize
the loss functions efficiently:
Stochastic Gradient Descent (SGD): One of the most common optimization methods, where the
model parameters are updated iteratively based on gradients computed from the loss function.
Adam Optimizer: An adaptive optimization algorithm that adjusts the learning rate during training,
making it faster and more efficient for training deep networks. Adam is widely used in denoising
tasks due to its ability to handle noisy gradients and sparse data.
Learning Rate Scheduling: A technique where the learning rate decreases over time to allow the
model to converge smoothly. This helps in preventing overfitting and achieving a more optimal
denoising result.
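A minimal PyTorch sketch of this setup, assuming a toy denoiser and illustrative hyperparameters (learning rate, decay schedule), is shown below.

```python
import torch
import torch.nn as nn

# A tiny stand-in denoiser, just to have parameters to optimize.
model = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 3, padding=1))

# Adam with a modest learning rate; a step schedule halves the rate every 30 epochs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.5)

for epoch in range(3):  # shortened loop for illustration
    noisy = torch.rand(4, 1, 32, 32)
    target = torch.rand(4, 1, 32, 32)
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(noisy), target)
    loss.backward()
    optimizer.step()
    scheduler.step()  # decay the learning rate over time for smoother convergence
```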
Several network architectures have been developed to further improve GAN-based denoising
performance. The following architectures are frequently used:
U-Net is a widely used architecture for image denoising tasks, especially in biomedical imaging and other
high-resolution tasks. It consists of an encoder-decoder structure with skip connections that enable the
model to retain high-resolution information.
Encoder: The encoder progressively downscales the image, extracting features at different levels of
abstraction.
Decoder: The decoder upscales the image, reconstructing the image details at a higher resolution.
The skip connections between encoder and decoder layers help in recovering fine-grained
information and retaining spatial details.
Advantages: U-Net is well-suited for image denoising because of its ability to preserve spatial
information through the use of skip connections. It also performs well when dealing with small
datasets, as it can generalize well.
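The following is a deliberately tiny, two-level U-Net sketch in PyTorch, intended only to show the encoder-decoder structure and skip connection described above; channel counts and depth are arbitrary choices.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
                         nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """Two-level U-Net: the encoder downsamples, the decoder upsamples, skips carry detail."""
    def __init__(self, ch=1):
        super().__init__()
        self.enc1 = conv_block(ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec1 = conv_block(64, 32)   # 64 = 32 (upsampled) + 32 (skip)
        self.out = nn.Conv2d(32, ch, 1)

    def forward(self, x):
        s1 = self.enc1(x)                    # high-resolution features, kept for the skip
        bottom = self.enc2(self.pool(s1))    # downsampled, more abstract features
        up = self.up(bottom)                 # back to the original resolution
        merged = torch.cat([up, s1], dim=1)  # skip connection restores spatial detail
        return self.out(self.dec1(merged))

noisy = torch.rand(1, 1, 64, 64)
print(TinyUNet()(noisy).shape)  # torch.Size([1, 1, 64, 64])
```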
ResNet (Residual Networks) incorporates skip connections, or residual connections, between layers to
prevent the vanishing gradient problem and enable the training of very deep networks.
Residual Blocks: ResNet uses residual blocks, where the input to each block is added directly to
the output, enabling smoother gradient flow and improving the training of deep networks.
Advantages: ResNet is known for its deep architectures, which can capture complex image
features, making it effective in handling high-dimensional data like images. This is particularly
useful for denoising tasks that require capturing intricate details.
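A minimal residual block, sketched in PyTorch to illustrate the identity shortcut; the channel count and the number of stacked blocks are arbitrary.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = x + F(x): the block only learns the residual, which eases training of deep networks."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)  # identity shortcut keeps gradients flowing

# Stack a few blocks into a simple residual denoiser trunk.
trunk = nn.Sequential(nn.Conv2d(1, 64, 3, padding=1),
                      *[ResidualBlock(64) for _ in range(4)],
                      nn.Conv2d(64, 1, 3, padding=1))
print(trunk(torch.rand(1, 1, 32, 32)).shape)
```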
The Vision Transformer (ViT) is a relatively new architecture that applies the transformer model,
originally developed for natural language processing, to vision tasks.
Patch-based Input: ViT divides an image into patches and processes them using transformer layers.
Each patch is treated as an individual token, and the model learns relationships between different
patches to understand the global image context.
CycleGAN: CycleGAN is used for tasks where paired data is unavailable. It learns to map noisy
images to clean images by introducing a cycle consistency loss, which ensures that transforming a
noisy image to clean and back to noisy should result in the original image.
WGAN-GP: Wasserstein GAN with Gradient Penalty is an improved version of GANs that uses
the Wasserstein distance to measure the difference between real and generated images. The
gradient penalty helps in stabilizing training, leading to better denoising results.
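As an illustration of the gradient penalty idea (a sketch, not the exact formulation from any cited paper), the PyTorch function below penalizes deviations of the critic's gradient norm from 1 on samples interpolated between real and generated images; the toy critic and penalty weight are assumptions.

```python
import torch

def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """WGAN-GP penalty: push the critic's gradient norm toward 1 on interpolated samples."""
    eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    scores = critic(interp)
    grads, = torch.autograd.grad(outputs=scores.sum(), inputs=interp, create_graph=True)
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return lambda_gp * ((grad_norm - 1.0) ** 2).mean()

# Toy usage with a linear "critic" over flattened 8x8 images.
critic = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(64, 1))
real, fake = torch.rand(4, 1, 8, 8), torch.rand(4, 1, 8, 8)
print(gradient_penalty(critic, real, fake).item())
```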
Implementing an image denoising model requires a well-structured process involving dataset preparation,
model training, and hyperparameter tuning to optimize performance.
For supervised denoising tasks, paired datasets consisting of noisy and clean image pairs are required. In
the case of Noise2Noise, noisy images are used in pairs to train the model without requiring clean images.
Public Datasets: Popular datasets like BSDS500, ImageNet, or Set5/Set14 are often used for
training denoising models. These datasets provide a wide variety of images, helping the model
generalize to different types of noise.
Synthetic Noise Generation: For datasets without noise, synthetic noise (e.g., Gaussian or salt-and-
pepper noise) can be added to clean images to create noisy datasets for training.
Initialization: Proper initialization of weights is essential for stable training. Methods like Xavier
or He initialization are commonly used to give training a well-scaled starting point (a brief sketch follows this list).
Validation: A validation set is used to monitor the model's performance during training and
prevent overfitting. The model's ability to generalize to unseen noisy images is evaluated using the
validation dataset.
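Referring back to the initialization bullet above, here is a minimal PyTorch sketch of He initialization applied across a small network; the network itself is only a stand-in.

```python
import torch.nn as nn

def init_weights(module):
    # He (Kaiming) initialization suits ReLU networks; Xavier is a common alternative.
    if isinstance(module, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_normal_(module.weight, nonlinearity='relu')
        if module.bias is not None:
            nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 1, 3, padding=1))
model.apply(init_weights)  # applies the function recursively to every submodule
```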
Hyperparameter tuning is crucial for achieving optimal denoising performance. Key hyperparameters
include:
Learning Rate: The learning rate controls the speed of gradient descent. A smaller learning rate
allows for more precise updates, but may require more epochs to converge.
Batch Size: The batch size determines how many samples are processed together before the
model's parameters are updated. A smaller batch size can improve the model's generalization but
increases computational cost.
Network Depth and Width: The depth (number of layers) and width (number of neurons per layer)
of the neural network can significantly impact the model's capacity to learn from data. A deeper or
wider network may lead to better performance but requires more computational resources.
Regularization: Techniques such as dropout, weight decay, or early stopping can be applied to
prevent overfitting and improve model generalization.
Medical imaging is a field where image clarity and precision are of paramount importance. Techniques
such as MRI, CT, and ultrasound are used to capture high-resolution images that aid in the diagnosis and
treatment of various diseases. However, these images are often affected by noise due to factors like low-
light conditions, patient motion, or imperfections in the imaging sensor. GAN-based denoising techniques
have shown great promise in improving the quality of medical images, particularly in preserving important
structural details while reducing noise.
MRI Denoising:
MRI images are often affected by noise arising from thermal fluctuations in the scanner, motion
artifacts due to patient movement, and low signal-to-noise ratios (SNR) caused by suboptimal
scanning parameters. GAN-based denoising models are particularly useful for MRI, as they can
learn the complex noise patterns specific to the scanner and the body region being imaged.
CT Denoising:
In CT scans, noise can be generated by low-dose radiation or motion artifacts, which can distort
the structural features of the imaged tissue. GAN-based denoising methods can be trained to
remove these distortions while preserving critical anatomical details, which is crucial for accurate
diagnosis.
Ultrasound Denoising:
Ultrasound images are often plagued by speckle noise, which can obscure fine structures in the
image. GANs, particularly with architectures like U-Net or ResNet, have been employed to reduce
speckle noise while preserving edges and fine details of the imaged tissue.
While GAN-based denoising techniques have shown promising results, there are several challenges
associated with their use in medical imaging:
1. Generalization to Different Devices: GAN models trained on one type of MRI or CT scanner may
not perform well on images captured by different devices due to variations in noise characteristics.
2. Preservation of Critical Information: In medical imaging, removing noise must not come at the
cost of losing important structural or functional information. GAN models must balance noise removal
with detail preservation.
3. Real-Time Denoising: Some medical imaging applications require real-time processing (e.g.,
ultrasound), which places constraints on the computational complexity of denoising models.
A recent study demonstrated the use of GANs for denoising MRI images. The study compared
traditional methods like Non-Local Means (NLM) with a GAN-based approach, where the GAN model
significantly outperformed NLM in terms of both PSNR and SSIM, particularly in regions with high
noise and low contrast.
In oncology, low-dose CT scans are used for early cancer detection, but these images are often noisy
due to the reduced radiation exposure. GANs were applied to denoise these CT scans, leading to
improved tumor detection and more accurate treatment planning.
Remote sensing involves collecting data about the Earth's surface using satellite or aerial sensors. Images
obtained from these platforms are often subject to noise caused by environmental factors such as clouds,
sensor limitations, and atmospheric interference. High-quality denoising is crucial for extracting accurate
information from satellite and aerial images, which are used in applications such as environmental
monitoring, agriculture, and disaster management.
Satellite images are often affected by sensor noise, cloud cover, and atmospheric conditions, which
introduce artifacts and degrade image quality. Remote sensing applications such as land-use
mapping, urban planning, and environmental monitoring require high-quality images, and
denoising is essential for accurate data extraction. GANs have been shown to improve the quality
of satellite images by removing noise while preserving geographical features like roads, rivers, and
buildings.
Aerial images captured by drones or aircraft are subject to similar types of noise, such as motion
blur, sensor noise, and low-light conditions. Denoising these images is crucial for applications like
precision agriculture and disaster response. GANs can be used to remove these artifacts while
maintaining the sharpness of edges and other important details.
Satellite and aerial images are often subject to varying noise types, such as Gaussian noise, salt-
and-pepper noise, and motion blur. A single model may struggle to handle all these noise types
simultaneously, requiring more complex training strategies.
Remote sensing images are typically high-resolution, which increases the computational load
required for processing. Efficient, lightweight GAN models are needed to handle these large
images in real-time applications.
Cloud cover and atmospheric conditions can cause significant distortions in remote sensing
images. While GANs can help reduce the effects of noise, accurately modeling and removing these
distortions remains a challenge.
A case study focused on land-use classification in agricultural fields used satellite images that were
denoised using a GAN-based method. The results showed that GAN denoising improved
classification accuracy, especially in areas with low contrast between land features.
In disaster management, aerial images captured after natural disasters (e.g., hurricanes, floods) are
often noisy due to the conditions under which they are taken. GANs were applied to denoise these
images, improving the accuracy of damage assessment and facilitating faster decision-making
during disaster relief efforts.
While medical imaging and remote sensing are two major fields that benefit from image denoising, GAN-
based denoising methods have applications in several other areas as well. These include surveillance,
forensics, and astronomy.
3.7.1 Surveillance
Surveillance cameras, especially those used in public and security applications, are often affected by noise
due to low-light conditions, sensor limitations, or environmental factors. Denoising these images is crucial
for accurate facial recognition, object tracking, and anomaly detection.
Challenge: The noise in surveillance videos often has a temporal component (frame-to-frame
variation) and spatial components (individual pixel noise), which makes it challenging to remove
without blurring or distorting important details.
Solution: GANs have been applied to video denoising, where temporal coherence between frames
is maintained, and noise is effectively reduced while preserving sharpness and details for
identification purposes.
3.7.2 Forensics
In forensic investigations, digital images are often crucial pieces of evidence. However, these images can
be degraded due to compression, low quality, or noise from various sources. Denoising such images is
essential for accurate analysis and evidence retrieval.
Challenge: Forensic images often contain fine details that are critical for analysis (e.g.,
fingerprints, facial features), and denoising must not distort these details.
3.7.3 Astronomy
Astronomical images, such as those captured by telescopes, are often affected by noise due to the long
distances the light travels, atmospheric disturbances, and sensor limitations. Image denoising is crucial for
enhancing the visibility of celestial bodies and improving the precision of measurements in astronomical
research.
Challenge: Noise in astronomical images can vary depending on the light source, sensor, and
atmospheric conditions, making it difficult to develop a single denoising method that works
universally.
Solution: GAN-based denoising methods can be used to enhance the visibility of celestial objects
by removing noise while preserving the fine details necessary for accurate measurement and
analysis.
While GAN-based denoising techniques have made significant strides in several real-world applications,
they are not without their challenges. Some of the limitations include:
1. Training Complexity: GANs require careful training and can be sensitive to hyperparameters and
training data.
2. Generalization: GANs trained on one dataset (e.g., medical images) may not generalize well to
other domains (e.g., satellite images) due to the variation in noise types and image characteristics.
Traditional image denoising methods encompass a variety of techniques that aim to remove noise from
images while retaining important details. These methods can be broadly categorized into linear filtering,
nonlinear filtering, wavelet-based methods, and sparse coding.
Linear filtering is one of the simplest and most commonly used techniques in image denoising. It involves
the use of convolution-based filters, which operate by averaging or smoothing pixel values based on their
neighborhood. The most common types of linear filters include:
Gaussian Filter: This filter applies a Gaussian function to smooth the image and reduce noise. It
works by averaging pixels in a weighted manner, where pixels closer to the center have a higher
weight. While effective at removing Gaussian noise, it can blur edges and fine details.
Mean Filter: The mean filter replaces each pixel with the average of its neighbors. This approach is
simple and easy to implement but can cause blurring of edges, which reduces image quality,
especially in images with fine details.
Wiener Filter: The Wiener filter adapts to the local image statistics by estimating the local variance
and then adjusting its smoothing process accordingly. It works well for images with Gaussian
noise, but it can also blur edges when the noise characteristics are not well-defined.
While linear filtering methods are computationally efficient, they often struggle with preserving fine
details and sharp edges, especially when noise is heavy or the noise characteristics are complex.
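For illustration, the SciPy sketch below applies Gaussian, mean, and Wiener filtering to a synthetically noised array; the filter sizes and noise level are arbitrary assumptions.

```python
import numpy as np
from scipy import ndimage, signal

rng = np.random.default_rng(0)
clean = rng.random((128, 128))
noisy = clean + rng.normal(0, 0.1, clean.shape)    # synthetic Gaussian noise

gauss = ndimage.gaussian_filter(noisy, sigma=1.5)  # Gaussian filter: weighted smoothing
mean = ndimage.uniform_filter(noisy, size=3)       # mean filter: plain neighborhood average
wiener = signal.wiener(noisy, mysize=5)            # Wiener filter: adapts to local variance

for name, img in [("gaussian", gauss), ("mean", mean), ("wiener", wiener)]:
    print(name, float(np.mean((img - clean) ** 2)))  # pixel MSE against the clean image
```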
Nonlinear filtering methods were introduced to address the shortcomings of linear filters, especially their
tendency to blur edges and fine structures. Nonlinear filters adjust the intensity of each pixel based on the
local neighborhood, but in a way that is not directly proportional (unlike linear filters). Key nonlinear
filtering methods include:
Median Filter: This filter replaces each pixel with the median value of its neighbors rather than the
average. The median filter is particularly effective in removing salt-and-pepper noise, as it
preserves edges better than mean filters. However, it can still struggle with certain types of noise,
such as Gaussian noise.
Bilateral Filter: The bilateral filter is a more sophisticated nonlinear filter that considers both
spatial proximity and intensity similarity to preserve edges while reducing noise. It works by
smoothing the image based on both the pixel’s spatial distance and intensity difference from the
central pixel. This approach is better at preserving edges but can be computationally expensive for
large images.
Anisotropic Diffusion (or Perona-Malik Filter): This method uses partial differential equations to
iteratively reduce image gradients, effectively smoothing out areas of uniform intensity while
preserving edges. It is particularly good at removing noise while maintaining edge sharpness.
Nonlinear filters are generally more effective than linear ones at preserving edges and texture, but they can
still be computationally expensive and may struggle in highly noisy conditions.
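A short OpenCV sketch of median and bilateral filtering on a synthetically corrupted image; the corruption rate and filter parameters are illustrative, and anisotropic diffusion is omitted here.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)
clean = (rng.random((128, 128)) * 255).astype(np.uint8)

# Salt-and-pepper corruption: set random pixels to 0 or 255.
noisy = clean.copy()
mask = rng.random(clean.shape)
noisy[mask < 0.05] = 0
noisy[mask > 0.95] = 255

median = cv2.medianBlur(noisy, 5)                   # median of a 5x5 neighborhood
bilateral = cv2.bilateralFilter(noisy, d=9,         # mixes spatial and intensity closeness
                                sigmaColor=75, sigmaSpace=75)

print("median MSE:", float(np.mean((median.astype(float) - clean) ** 2)))
print("bilateral MSE:", float(np.mean((bilateral.astype(float) - clean) ** 2)))
```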
Wavelet-based methods rely on the transformation of the image into the wavelet domain, where noise and
signal components can be separated more effectively. Wavelets allow the image to be represented at
multiple scales, making it easier to isolate high-frequency noise from the lower-frequency signal. Key
wavelet-based denoising techniques include:
Wavelet Thresholding: This method involves applying a wavelet transform to the image, then
thresholding the wavelet coefficients to remove noise. Coefficients associated with high-frequency
noise are shrunk or set to zero, while low-frequency components, which represent the underlying
image structure, are retained. This method effectively removes noise while preserving important
image features.
Wavelet Shrinkage: In this technique, wavelet coefficients are "shrunk" based on a threshold value,
either using hard or soft thresholding. Hard thresholding sets coefficients below a threshold to
zero, while soft thresholding reduces the magnitude of the coefficients but does not set them to
zero entirely. Wavelet shrinkage can be very effective in removing noise, especially when the
noise is predominantly high-frequency.
Bayesian Wavelet Denoising: Bayesian methods in wavelet denoising treat the wavelet coefficients
probabilistically, using a prior distribution to estimate which coefficients correspond to noise and
which represent the true image content. This approach is often more robust to various types of
noise, but it can be computationally intensive.
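A minimal PyWavelets sketch of soft wavelet thresholding, assuming a 'db4' wavelet and an arbitrary fixed threshold; practical methods estimate the threshold from the noise level instead.

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)
clean = rng.random((128, 128))
noisy = clean + rng.normal(0, 0.1, clean.shape)

# Multi-level 2-D wavelet decomposition.
coeffs = pywt.wavedec2(noisy, 'db4', level=3)

# Soft-threshold the detail bands; keep the approximation (low-frequency) band intact.
threshold = 0.2
new_coeffs = [coeffs[0]] + [
    tuple(pywt.threshold(band, threshold, mode='soft') for band in level)
    for level in coeffs[1:]
]

denoised = pywt.waverec2(new_coeffs, 'db4')[:128, :128]  # crop in case of padding
print(float(np.mean((denoised - clean) ** 2)))
```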
Sparse coding is a more recent approach to image denoising that models an image as a sparse combination
of dictionary elements (basis functions). The idea is that natural images can be represented efficiently
using a small number of basis functions, and noise typically does not follow the same sparse structure.
Key elements of sparse coding for image denoising include:
Dictionary Learning: In sparse coding, a dictionary of image patches is learned from training data.
This dictionary captures the essential structures and textures present in the image. During
denoising, the image is represented as a sparse combination of these dictionary elements, with non-
sparse components typically corresponding to noise.
Sparse Representation: The denoising process involves finding a sparse representation of the noisy
image. The sparse coefficients corresponding to noise are suppressed, and the remaining
coefficients are used to reconstruct a cleaner version of the image. This process typically involves
l1-norm minimization, which promotes sparsity in the representation.
K-SVD (K-means Singular Value Decomposition): This algorithm is commonly used for
dictionary learning in sparse coding, where the dictionary is updated iteratively to better represent
the image data. It has shown excellent results in denoising by using a dictionary that is specifically
trained to handle noise.
Sparse coding is particularly effective for handling real-world images and complex noise patterns because
it can adapt to different textures and structures. However, the dictionary learning and sparse coding
processes can be computationally intensive and may require substantial data for training.
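The scikit-learn sketch below illustrates patch-based dictionary learning and sparse reconstruction in this spirit; it uses scikit-learn's MiniBatchDictionaryLearning rather than K-SVD, and the patch size, dictionary size, and sparsity level are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import extract_patches_2d, reconstruct_from_patches_2d

rng = np.random.default_rng(0)
clean = rng.random((64, 64))
noisy = clean + rng.normal(0, 0.1, clean.shape)

# Learn a small patch dictionary directly from the noisy image.
patches = extract_patches_2d(noisy, (7, 7)).reshape(-1, 49)
mean = patches.mean(axis=0)
dico = MiniBatchDictionaryLearning(n_components=50, alpha=1.0,
                                   transform_algorithm='omp',
                                   transform_n_nonzero_coefs=2, random_state=0)
dico.fit(patches - mean)

# Sparse-code each patch and reconstruct; averaging overlapping patches suppresses noise.
codes = dico.transform(patches - mean)
recon_patches = (codes @ dico.components_ + mean).reshape(-1, 7, 7)
denoised = reconstruct_from_patches_2d(recon_patches, noisy.shape)
print(float(np.mean((denoised - clean) ** 2)))
```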
With the rise of deep learning in the 2010s, new methods for image denoising emerged that leveraged the
power of neural networks to learn complex features and patterns in noisy images. These early deep
learning-based approaches primarily involved Convolutional Neural Networks (CNNs) and Recurrent
Neural Networks (RNNs), both of which brought significant improvements over traditional methods by
leveraging the ability of deep models to automatically learn features and better adapt to noisy data.
Convolutional Neural Networks (CNNs) quickly became one of the most popular and effective deep
learning architectures for image denoising. CNNs excel at capturing spatial hierarchies in images,
allowing them to learn both low-level features (e.g., edges, textures) and high-level structures (e.g.,
objects or patterns) in the image. This ability makes CNNs particularly well-suited for tasks like
denoising, where it’s crucial to identify and separate noise from the original image content.
Network Architecture: In early CNN-based denoising methods, networks were designed with
several convolutional layers followed by activation functions, pooling layers, and fully connected
layers. The input to the network was a noisy image, and the output was the denoised version of that
image. The network learned to map noisy input images to their clean counterparts by minimizing
the difference (usually using a loss function like mean squared error) between the network's output
and the ground truth.
Training on Large Datasets: One of the key advantages of CNNs is their ability to be trained on
large datasets of noisy and clean image pairs. During training, CNNs learn to recognize the
statistical characteristics of the noise and the structure of the underlying clean image.
Advantages: CNNs are highly effective in handling various types of noise (Gaussian, salt-and-
pepper, etc.) and are good at preserving edges and textures, which are often problematic for
traditional denoising methods. The ability to learn from large datasets means CNNs can generalize
well to different types of noise and image content, leading to more flexible and robust denoising
solutions.
Challenges: Early CNN-based denoising methods required substantial computational resources for
training and a large amount of labeled training data (clean/noisy image pairs). Additionally, while
CNNs can effectively reduce noise, they still require careful architecture design to avoid
overfitting and to balance denoising with detail preservation.
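A compact, DnCNN-style sketch in PyTorch that predicts the noise residual and is trained on noisy/clean pairs; depth, width, and the use of batch normalization are illustrative choices, not the exact configuration of any published model.

```python
import torch
import torch.nn as nn

class SmallDnCNN(nn.Module):
    """Small DnCNN-style network: predicts the noise map, which is subtracted from the input."""
    def __init__(self, channels=1, features=32, depth=5):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1),
                       nn.BatchNorm2d(features), nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.net = nn.Sequential(*layers)

    def forward(self, noisy):
        return noisy - self.net(noisy)  # residual learning: clean ≈ noisy - predicted noise

model = SmallDnCNN()
noisy = torch.rand(2, 1, 48, 48)
clean = torch.rand(2, 1, 48, 48)
loss = nn.functional.mse_loss(model(noisy), clean)  # trained on noisy/clean pairs
loss.backward()
```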
While CNNs were dominant in image processing tasks, Recurrent Neural Networks (RNNs) were also
explored for image denoising, especially when temporal or sequential dependencies in the data were
important. RNNs, traditionally used for tasks involving sequential data (such as natural language
processing), are designed to capture long-range dependencies, making them a natural choice for certain
types of denoising tasks.
Network Architecture: RNN-based denoising models were typically built to process image data
sequentially or in patches, where each patch or pixel’s denoising process was informed by the
previously processed patches or pixels. Variants like Long Short-Term Memory (LSTM) networks
and Gated Recurrent Units (GRUs) were used to capture temporal dependencies, which can be
particularly helpful for handling large-scale images with varying noise patterns.
Advantages: RNNs, particularly LSTMs, have the advantage of maintaining a form of memory,
which allows the model to account for long-range dependencies in the image. This can be useful
when denoising images with complex structures or when there is noise that is spread out across the
image in a correlated manner.
The early deep learning-based methods, including those utilizing CNNs and RNNs, represented a
major advancement over traditional denoising techniques. CNNs, in particular, showed impressive
results in preserving edges and textures while effectively reducing noise. The introduction of deep
networks allowed for more flexibility in denoising various types of noise, and CNNs could
generalize to new, unseen types of noise with sufficient training data.
However, these early methods also faced challenges such as the need for large datasets, long training
times, and high computational costs. RNN-based methods, while promising, did not outperform CNNs in
most denoising tasks and were limited by their complexity and difficulties in handling high-dimensional
image data.
Despite these challenges, the success of CNNs in early deep learning-based denoising laid the foundation
for more advanced architectures, including residual networks (ResNets) and generative adversarial
networks (GANs), which have further improved the quality and efficiency of image denoising.
Generative Adversarial Networks (GANs) have made a significant impact in the field of image denoising
by offering a powerful framework for learning to remove noise while preserving fine details and textures.
Unlike traditional methods, GANs use a generative approach where two neural networks, a generator and
a discriminator, are trained adversarially to improve the quality of denoised images.
Generator: The generator takes a noisy image as input and generates a denoised version of that
image. It aims to remove noise while maintaining the underlying structure of the image. The
generator learns to improve its denoising ability by receiving feedback from the discriminator.
Discriminator: The discriminator's role is to distinguish between real (clean) and generated
(denoised) images. It tries to classify whether a given image is a true clean image or a denoised
image produced by the generator. The generator improves over time by learning to produce images
that the discriminator can no longer easily distinguish from real clean images.
Through this adversarial training process, the generator becomes better at removing noise and
preserving important image features, while the discriminator helps guide the network to generate
more realistic denoised images.
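The PyTorch sketch below shows these alternating generator and discriminator updates in miniature, with tiny stand-in networks and an added pixel loss on the generator; it illustrates the training scheme rather than any specific published model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in generator (denoiser) and discriminator; real models would be much deeper.
G = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1))
D = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                  nn.Flatten(), nn.Linear(16 * 16 * 16, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

for step in range(3):  # shortened loop for illustration
    noisy = torch.rand(4, 1, 32, 32)
    clean = torch.rand(4, 1, 32, 32)

    # 1) Discriminator: score real clean images high, generated (denoised) images low.
    fake = G(noisy).detach()
    d_loss = (F.binary_cross_entropy_with_logits(D(clean), torch.ones(4, 1)) +
              F.binary_cross_entropy_with_logits(D(fake), torch.zeros(4, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) Generator: fool the discriminator while staying close to the clean target.
    fake = G(noisy)
    g_loss = (F.binary_cross_entropy_with_logits(D(fake), torch.ones(4, 1)) +
              F.mse_loss(fake, clean))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```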
Several GAN variants have been developed to enhance the denoising process by addressing issues such
as stability, convergence, and performance:
DCGAN (Deep Convolutional GAN): DCGANs are a variant that uses convolutional layers in
both the generator and discriminator. This architecture improves the learning process and stability
for image-based tasks like denoising. DCGANs have been shown to produce high-quality results
for image generation and restoration tasks, including denoising.
WGAN-GP (Wasserstein GAN with Gradient Penalty): WGAN-GP improves upon the original
GAN by using the Wasserstein distance (instead of Jensen-Shannon divergence) to measure the
difference between real and generated images. This metric allows for more stable training and
mitigates issues like mode collapse, where the generator produces only a limited set of outputs.
WGAN-GP is useful for denoising tasks where high-quality, realistic image generation is critical.
CycleGAN: While not specifically designed for denoising, CycleGANs have been applied to
scenarios where paired training data is not available. In the context of denoising, CycleGAN can
learn to map noisy images to clean images without the need for clean-ground truth pairs, which is
helpful for real-world denoising tasks with unpaired noisy data.
The Noise2Noise framework is an innovative approach to training GANs for image denoising. Instead of
relying on clean images to train the model, Noise2Noise uses noisy image pairs for training. This self-
supervised method works by treating noisy images as the "ground truth" and leveraging the fact that noise
in the image is not highly correlated between different noisy samples.
Training with Noisy Pairs: In Noise2Noise, two noisy versions of the same image are provided as
input, and the network learns to map one noisy image to the other. By using multiple noisy
versions of the same scene, the network can effectively learn to separate noise from the underlying
image structure, improving the denoising process.
Advantages: This approach does not require paired clean and noisy images, which makes it highly
suitable for real-world applications where acquiring clean images for training can be challenging
or impossible.
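A minimal PyTorch sketch of one Noise2Noise training step, assuming synthetic Gaussian noise so that two independent noisy views of the same image can be generated; the network and noise level are stand-ins.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

clean = torch.rand(8, 1, 32, 32)                  # never shown to the network
noisy_a = clean + 0.1 * torch.randn_like(clean)   # first noise realization (network input)
noisy_b = clean + 0.1 * torch.randn_like(clean)   # second, independent realization (target)

# One training step: map one noisy view to the other. Because the noise is zero-mean and
# uncorrelated between views, the minimizer of this loss approaches the clean image.
opt.zero_grad()
loss = nn.functional.mse_loss(net(noisy_a), noisy_b)
loss.backward()
opt.step()
print(loss.item())
```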
Supervised GAN-Based Denoising: In supervised learning, GANs are trained using paired noisy
and clean images. The generator learns to map noisy images to their clean counterparts, while the
discriminator helps guide the learning process. Supervised GAN-based denoising is effective when
paired data is available, as it ensures that the network learns the exact mapping between noisy and
clean images.
Unsupervised GAN-Based Denoising: Unsupervised methods, on the other hand, do not require
paired clean images. Instead, GANs can be trained using noisy image datasets and learn to denoise
through techniques such as the Noise2Noise framework or domain adaptation approaches like
CycleGAN. These methods are particularly useful in scenarios where obtaining clean images for
training is impractical or infeasible.
Recent advancements in GAN-based image denoising have led to more sophisticated and efficient
approaches that address the limitations of earlier models. Some of the leading techniques include:
Few-shot and zero-shot denoising aim to improve the ability of GANs to denoise images with
very limited training data (few-shot) or without any labeled training data at all (zero-shot). These
techniques typically use transfer learning, meta-learning, or pre-trained models to generalize
denoising to new domains or noise types with minimal data.
Few-Shot Denoising: In few-shot learning, GANs can be fine-tuned on a small set of noisy
images, leveraging prior knowledge or pre-trained models. This allows them to adapt to new types
of noise or image distributions with a small amount of data.
Zero-Shot Denoising: Zero-shot learning techniques allow GANs to perform denoising tasks
without any training data specific to the target domain. These models rely on pre-trained networks
or unsupervised approaches like Noise2Noise to perform denoising directly on unseen noisy
images.
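As a rough sketch of the few-shot idea, assuming a pre-trained generator is available (here replaced by a freshly built stand-in), one option is to freeze the early layers and fine-tune only the final layer on a handful of target-domain samples.

```python
import torch
import torch.nn as nn

# Pretend this generator carries pre-trained weights; in practice they would be loaded from disk.
generator = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(32, 1, 3, padding=1))

# Freeze the early layers (generic features); fine-tune only the last layer.
for p in generator[:2].parameters():
    p.requires_grad = False
opt = torch.optim.Adam([p for p in generator.parameters() if p.requires_grad], lr=1e-4)

few_noisy = torch.rand(4, 1, 32, 32)   # small target-domain set
few_clean = torch.rand(4, 1, 32, 32)
for epoch in range(5):
    opt.zero_grad()
    loss = nn.functional.mse_loss(generator(few_noisy), few_clean)
    loss.backward()
    opt.step()
```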
FIGURE 6.1: The experimental results comparing the usage of only MSE and MSE with normalized
variance loss function over training epochs.
In evaluating GAN-based image denoising models, selecting the appropriate datasets is critical to ensuring
that the models are tested under a variety of conditions. We use a mix of standard benchmark datasets and
real-world datasets that simulate the types of noise encountered in practical applications. Below is a
detailed description of the datasets used in this study.
Standard datasets are essential for benchmarking denoising algorithms, providing consistent ground truth
images and noise characteristics. These datasets typically contain images with artificially added noise,
allowing for controlled experiments.
1. BSD68
The Berkeley Segmentation Dataset (BSD68) is one of the most commonly used datasets for
image denoising tasks. It consists of 68 natural images covering a wide range of scenes and
textures, which are typically corrupted with synthetic Gaussian noise at several levels for
controlled evaluation.
Example Use: Benchmarking models' ability to handle different noise levels and image complexities,
such as textures and sharp edges.
2. Set12
Set12 is a smaller dataset consisting of 12 images, selected to cover a range of categories,
including natural scenes, architecture, and animals. Set12 is typically used for evaluating the
performance of denoising algorithms in situations where a smaller, more varied dataset is needed.
Similar to BSD68, images in Set12 are corrupted with Gaussian noise at various levels.
Example Use: Testing the ability of models to generalize across a variety of image types and noise levels
with a smaller dataset.
Example Use: Evaluating the model’s effectiveness in denoising images affected by real-world noise
patterns, such as sensor noise and low-light conditions.
Real-world datasets are essential to evaluate how well denoising algorithms perform when applied to
images that are noisy due to complex, unmodeled sources. These datasets provide practical scenarios
where noise is not simply synthetic Gaussian noise but rather varies based on environmental factors.
1. Medical Images
Medical imaging, including X-ray, MRI, and CT scan images, is one of the most critical
applications for image denoising. These images are often noisy due to low-light conditions, sensor
imperfections, or patient motion.
Example Use: Evaluating the preservation of important anatomical details in noisy medical images,
ensuring that the denoising model does not remove or distort critical information.
2. Satellite Images
Satellite images, particularly those captured in adverse weather or lighting conditions, can be
heavily affected by noise. Noise in satellite images can come from sensor limitations, atmospheric
interference, or environmental factors. Denoising these images is important for various
applications, such as environmental monitoring, urban planning, and disaster response.
Example Use: Testing how well the model performs in preserving geographical details while removing
noise in satellite imagery captured in challenging conditions.
Evaluation metrics are essential for objectively assessing the performance of image denoising models. In
this work, we employ several widely recognized metrics: Peak Signal-to-Noise Ratio (PSNR),
Structural Similarity Index Measure (SSIM), and Perceptual Quality Metrics such as VGG-
Perceptual Loss. These metrics help in quantitatively measuring the quality of the denoised images and
evaluating how well the model restores the original image structure.
PSNR is a traditional metric used to evaluate image quality by measuring the ratio between the maximum
possible pixel value and the Mean Squared Error (MSE) between the original and denoised images. A
higher PSNR value indicates better image quality and less noise.
SSIM evaluates the structural similarity between two images by comparing luminance, contrast, and
texture. Unlike PSNR, SSIM takes into account the perceptual aspects of image quality and can be more
aligned with human visual perception.
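Both metrics are readily computed with scikit-image; the sketch below uses synthetic arrays, so the printed numbers only illustrate the API, not real denoising performance.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
clean = rng.random((128, 128))
denoised = np.clip(clean + rng.normal(0, 0.02, clean.shape), 0, 1)

# PSNR = 10 * log10(MAX^2 / MSE); higher means less residual error.
psnr = peak_signal_noise_ratio(clean, denoised, data_range=1.0)

# SSIM compares local luminance, contrast, and structure; 1.0 means identical images.
ssim = structural_similarity(clean, denoised, data_range=1.0)

print(f"PSNR: {psnr:.2f} dB, SSIM: {ssim:.4f}")
```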
Visualization plays an important role in understanding the effectiveness of denoising models. In this
section, we provide side-by-side comparisons of noisy images and their corresponding denoised versions
produced by different methods.
Example 1: X-ray Image Denoising The figure shows an X-ray image with sensor noise and its
denoised output using the GAN-based method, compared with the results from traditional methods
like BM3D and DnCNN. The GAN method restores fine details like bone structures while
effectively removing noise.
Example 2: Satellite Image Denoising We show a satellite image with cloud-induced noise and
compare the denoising results using different methods. The GAN model is able to preserve
geographical details like roads and water bodies while effectively removing the noise.
In a system with high noise level interference, there may be a significant loss of important image features,
leading to the possibility of overfitting in the proposed self-supervised learning method. Therefore, this
research compares the learning outcomes of deep learning using only MSE loss with the combination of
MSE and variance-based loss. This is because MSE alone may lead the network to perceive only average
pixel-wise values during training, potentially overlooking the distribution of noise in the input.
Consequently, this may result in a high pixel error when applying noise-augmented noise removal, as
presented in the proposed method.
FIGURE 6.1: Example of noisy image augmentation results from each network model for Gaussian noise
with a standard deviation of 50.
The experimental comparison reveals that pixel errors are more prominent in flat regions, where the
network often struggles to predict image features accurately. Despite achieving better PSNR in the
validation set with just MSE loss during training, when applying the generated noisy images to the N2N
denoising process, some residual noise remains. This suggests that relying solely on MSE loss may lead to
more overfitting in the learning of the self-generated noisy image network compared to using MSE with
variance normalization as a loss function.
Noise2Noise    Noisy image pairs                 Limited    Limited    Moderate
Self2Self      Single noisy image                Limited    Yes        Moderate
Proposed       Single noisy image (self-aug.)    Yes        Yes        Moderate
In this work, we have explored the application of Generative Adversarial Networks (GANs) for image
denoising, focusing on state-of-the-art methods and novel approaches, such as Noise2Noise and self-
augmented Noise2Noise. The primary contributions of this document are as follows:
1. We introduced various GAN-based architectures, including U-Net, ResNet, and Vision Transformer generators, and assessed their effectiveness in denoising a wide range of images. Through extensive experimentation, we demonstrated how GANs outperform traditional denoising methods such as Gaussian filtering and wavelet transforms in preserving structural and textural details.
2. A novel self-augmented extension of the Noise2Noise framework was presented, which removes the dependency on paired clean-noisy images. This method generates augmented noisy images from a single noisy input, enabling high-quality denoising even when clean reference images are not available; this has practical implications for applications where acquiring clean images is difficult or impossible (a simplified sketch of this re-noising step is given after this list).
3. We employed several quantitative metrics, including PSNR, SSIM, and perceptual loss, to evaluate the effectiveness of our denoising methods. The results showed significant improvements in image quality and structural similarity compared with state-of-the-art methods. We also conducted ablation studies to validate the importance of key components such as variance normalization and the choice of network architecture.
4. One of the key strengths of GAN-based denoising models is their ability to generalize across different types of images and noise patterns. By training directly on noisy images, the GANs learned to adapt to varying noise conditions and achieve high-quality denoising regardless of the dataset or noise type.
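To make the self-augmentation idea concrete, the sketch below shows the simplest possible stand-in: re-noising a single noisy input with synthetic Gaussian noise so that the pair (augmented, original) can be used for Noise2Noise-style training. The report's actual method generates the augmented noise with a learned network, so this Gaussian version, the sigma value, and the clamping range are purely illustrative.

import torch

def self_augment(noisy, sigma=25.0 / 255.0):
    # Stand-in for the learned noise-generation step: add extra synthetic
    # Gaussian noise to the single noisy input so that (noisy_aug, noisy)
    # can serve as a training pair without any clean reference image.
    noisy_aug = noisy + torch.randn_like(noisy) * sigma
    return noisy_aug.clamp(0.0, 1.0)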
Despite the promising results of GAN-based denoising methods, several limitations and challenges remain that need to be addressed before these techniques can be used reliably in real-world applications.
1. Training Instability: GANs, by nature, are known to suffer from training instability, where the generator and discriminator fail to converge, leading to mode collapse. This can result in blurry or unrealistic denoised images. Careful hyperparameter tuning, loss-function adjustments, and advanced training techniques (e.g., Wasserstein GANs) mitigate the problem, but they increase the complexity of the model.
2. Computational Complexity: Training and running GAN-based denoisers is computationally intensive, which limits their use in real-time or resource-constrained settings; this issue is revisited in the hardware-acceleration discussion below.
3. Robustness to Real-World Noise: Although GANs show impressive results on benchmark datasets with synthetic noise, they sometimes struggle with real-world noise, which can vary in structure, intensity, and source. For example, satellite images affected by atmospheric interference may require additional preprocessing or specialized training to achieve optimal denoising.
4. Domain Generalization: GANs trained on a specific domain (e.g., medical or satellite images) may not generalize well to other domains. For instance, a model trained on MRI images may not perform well on X-ray images because of differences in noise characteristics. This makes it challenging to build universal denoising models that work effectively across all image types and noise patterns.
5. Detail Preservation under Heavy Noise: While GANs are effective at removing noise, in high-noise conditions there is a risk of losing fine details or introducing artifacts. The balance between removing noise and preserving image structure is difficult to maintain, especially when the noise level is high and fine features must be retained.
Despite the limitations, GAN-based denoising methods hold great potential for future advancements.
Below are several key research directions that could further enhance the capabilities and applicability of
GANs in image denoising.
One of the major challenges in image denoising is the need for large amounts of paired noisy-clean image
data for supervised learning. In many real-world scenarios, acquiring such paired data is either impractical
or costly. Unsupervised and self-supervised learning methods have the potential to alleviate this problem.
Unsupervised Learning: GANs trained in an unsupervised manner can learn to denoise images by leveraging unpaired data, enabling models to generalize to a wider range of images without paired training sets. Unsupervised methods, such as those using cycle-consistency losses (e.g., CycleGAN), could help learn mappings from noisy images to clean images without relying on ground-truth images (a minimal sketch of such a cycle-consistency term is given after this list).
Optimizing for Speed: Research into optimizing GAN architectures for real-time applications is
necessary. Techniques such as model pruning, quantization, and knowledge distillation could be
explored to reduce the computational load without compromising denoising quality.
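As a concrete illustration of the cycle-consistency idea mentioned above, the sketch below shows the forward-cycle term only. The names denoiser and renoiser, the L1 formulation, and the weight lam are assumptions; a full CycleGAN-style system would add adversarial losses and the reverse cycle.

import torch
import torch.nn.functional as F

def cycle_consistency_loss(noisy, denoiser, renoiser, lam=10.0):
    # denoiser maps noisy -> clean-looking, renoiser maps clean -> noisy-looking.
    # Reconstructing the original noisy image after the round trip encourages
    # the two mappings to be consistent without any paired ground truth.
    fake_clean = denoiser(noisy)
    reconstructed = renoiser(fake_clean)
    return lam * F.l1_loss(reconstructed, noisy)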
Although GANs have achieved impressive results in image denoising, they remain largely black-box
models, making it difficult to understand how they make decisions or why certain artifacts may appear in
the output images. Explainable AI (XAI) techniques for GANs could help increase trust in the model's
denoising process and improve its reliability.
Visualizing Feature Importance: Techniques such as saliency maps or Grad-CAM could be used
to understand which regions of the image the network is focusing on during denoising.
Many real-world applications require denoising across multiple modalities, such as combining MRI and
CT scans for more comprehensive medical imaging or integrating data from different types of remote
sensing sensors (e.g., optical, radar, infrared). Multi-modal denoising aims to combine the strengths of
different modalities to improve the overall image quality.
Cross-Domain Denoising: Future models could be designed to learn how to denoise images from
one domain (e.g., medical images) using data from another domain (e.g., satellite images), thereby
improving generalization across different imaging systems.
Given the computational intensity of GANs, hardware acceleration is a promising direction for making
GAN-based denoising more efficient and scalable. Leveraging specialized hardware, such as Graphics
Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), or Application-Specific Integrated
Circuits (ASICs), can greatly enhance the speed and efficiency of denoising models.
Optimizing for Parallelism: The parallel processing capabilities of GPUs can be leveraged to
train GANs faster, allowing for the denoising of high-resolution images in real time.
Edge Computing for Denoising: Future research could focus on deploying lightweight denoising
models on edge devices (e.g., smartphones, drones, and medical imaging equipment), enabling
real-time denoising without relying on cloud computing resources.
7.4 Conclusion
GAN-based image denoising methods have shown remarkable potential in removing noise from images
while preserving crucial details. These methods offer significant advantages over traditional denoising
approaches, especially when paired with advanced architectures like U-Net, ResNet, and Vision
Transformers. The proposed self-augmented Noise2Noise framework has also opened up new possibilities
for denoising even when clean reference images are not available. However, challenges such as training
instability, computational complexity, and generalization to real-world noise remain.