Full Term Paper
1. INTRODUCTION
Image denoising is a fundamental task in image processing, crucial for enhancing image quality
by removing noise that often degrades visual clarity. Noise can be introduced during image
capture due to factors like environmental conditions, low-light settings, or limitations of imaging
sensors. Denoising techniques broadly fall into two families: filtering-based methods and deep
learning-based approaches. Filtering techniques adjust pixel values according to predefined
parameters and have been widely used to improve image sharpness and color accuracy.
However, these methods often struggle with complex noise patterns, especially in high-noise
scenarios.
Deep learning methods have advanced denoising by learning to differentiate noise from the true
image signal using large datasets of paired noisy and clean images. In particular, the
Noise2Clean (N2C) approach trains a network to convert a noisy image into its clean
counterpart. This, however, requires a considerable amount of paired data, which can be difficult
to obtain, especially in blind noise scenarios where only a single noisy image is available.
To address this limitation, Noise2Noise (N2N) methods were developed, demonstrating that
neural networks can be trained on pairs of independently noisy images to effectively reduce
noise without needing clean data. The N2N approach relies on the fact that, for zero-mean noise,
the expected value of a noisy target equals the underlying clean signal, so regressing one noisy
observation onto another drives the network output toward the clean image and improves
denoising performance.
Nonetheless, Noise2Noise still requires multiple noisy images for training, making it less
practical for cases where only a single noisy image is available or in scenarios with unpredictable
noise patterns.
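To make the idea concrete, the following is a minimal sketch of a single Noise2Noise training step in PyTorch, assuming a generic convolutional denoiser and synthetic Gaussian noise; the architecture, noise level, and learning rate are illustrative placeholders rather than the settings used in this paper.

```python
import torch
import torch.nn as nn

# A generic small convolutional denoiser; any image-to-image network fits here.
net = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 3, 3, padding=1),
)
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

# Two independent noisy observations of the same (unknown) clean scene.
clean = torch.rand(1, 3, 64, 64)                 # stands in for a real image
noisy_input = clean + 0.1 * torch.randn_like(clean)
noisy_target = clean + 0.1 * torch.randn_like(clean)

# One N2N step: regress one noisy realization onto the other. Because the
# noise is zero-mean and independent, the MSE minimizer approaches the
# clean image even though no clean target is ever shown to the network.
optimizer.zero_grad()
loss = nn.functional.mse_loss(net(noisy_input), noisy_target)
loss.backward()
optimizer.step()
```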
This paper introduces a self-augmented approach to enhance the N2N framework, aiming to
address single-image denoising in blind noise settings. The proposed method leverages self-
supervised learning by generating augmented noisy images that mimic the distribution of the
original noisy image. This self-augmentation allows for training using only the noisy image
itself, eliminating the dependency on paired noisy data. The self-augmented model is validated
using Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM)
metrics, with results indicating that the approach achieves competitive performance in single-
image denoising tasks. This method opens new possibilities for image denoising, particularly in
fields where obtaining clean or multiple noisy images is impractical, such as in medical imaging
or satellite data analysis.
2. METHODOLOGY
The methodology involves implementing self-augmented noisy image generation for single-
image denoising within a Noise2Noise (N2N) framework. Initially, a self-augmented network is
trained by feeding a noisy image set as both input and validation data, producing a similar noisy
output that differs slightly from the input due to inherent learning imperfections in deep learning.
This process creates a new set of noisy images which, alongside the original noisy images, form
input-validation pairs for the N2N denoising network. Three models—ResNet, U-Net, and
Vision Transformer (ViT)—were tested for self-augmentation, with U-Net and ViT yielding the
best results. The self-augmented noisy image generation approach emphasizes variance
normalization in the loss function to prevent overfitting and improve the distribution of noise
across RGB channels. The methodology was evaluated using standard datasets and real-world
noisy images with performance metrics like PSNR and SSIM, demonstrating effectiveness in
denoising without requiring noise-free images.
This methodology begins with generating self-augmented noisy images to create a secondary
noisy image set from a single noisy image, addressing the challenge of requiring two noisy
images with different noise for traditional Noise2Noise setups. A self-augmented network
produces slightly altered versions of a noisy image, using the original as both input and
validation data. Through training imperfections, the network generates outputs mimicking the
original noise distribution with subtle variations, suitable for Noise2Noise denoising. Three
models—ResNet, U-Net, and Vision Transformer (ViT)—were tested for this approach. ResNet
prevents data loss with skip connections, U-Net retains spatial context via an encoder-decoder
structure, and ViT uses attention mechanisms to enhance feature retention. The models were
optimized with mean squared error (MSE) loss and variance normalization, which adapts to RGB
noise distribution and ensures realistic noisy images for effective denoising. In this work, the
network generates a set of noisy images that resemble the original noisy image but differ enough
to be used for training the Noise2Noise denoising network; this strategy is essential in scenarios
where only one noisy image is available for training.
Each model was trained with mean squared error (MSE) loss as the primary optimization target,
alongside variance normalization to control the noise distribution across RGB channels,
particularly in uniform regions where standard MSE can lead to overfitting. The addition of
variance normalization ensures that each architecture adapts more effectively to the noise
characteristics, thereby generating noisy outputs that serve as reliable training pairs for the
Noise2Noise denoising network. Through this methodology, I was able to evaluate each
architecture's effectiveness in self-augmented noisy image generation, with ViT and U-Net
showing particular strengths in maintaining consistent noise distributions.
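The paper does not give a closed form for the variance-normalization term, so the sketch below encodes one plausible reading as an assumption: alongside the pixel-wise MSE, a penalty matches the per-RGB-channel variance of the generated image to that of the noisy input, discouraging the network from either copying the input exactly or collapsing uniform regions to an over-smooth output. The weighting factor var_weight is hypothetical.

```python
import torch
import torch.nn.functional as F

def self_aug_loss(output: torch.Tensor, noisy_input: torch.Tensor,
                  var_weight: float = 0.1) -> torch.Tensor:
    """MSE plus an assumed variance-matching penalty per RGB channel.

    output, noisy_input: batches of shape (N, 3, H, W).
    The variance term pushes the generated image to carry the same
    per-channel spread as the noisy input, so the self-augmented
    output keeps a realistic noise distribution in flat regions.
    """
    mse = F.mse_loss(output, noisy_input)
    # Per-channel variance taken over batch and spatial dimensions.
    out_var = output.var(dim=(0, 2, 3))
    in_var = noisy_input.var(dim=(0, 2, 3))
    var_penalty = F.mse_loss(out_var, in_var)
    return mse + var_weight * var_penalty
```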
3. GANs (Generative Adversarial Networks)
In addition to ResNet, U-Net, and Vision Transformer (ViT), I experimented with Generative
Adversarial Networks (GANs) for self-augmentation. GANs consist of a generator, which
creates new images, and a discriminator, which evaluates their authenticity against the original
noisy input. Using a U-Net generator for its feature-preserving properties and a CNN-based
discriminator enhanced with VGG-19 for subtle noise differentiation, the GAN trained on noisy
images with a dual-loss approach: MSE for content loss and binary cross-entropy for adversarial
loss. This setup aimed to produce realistic noisy images while minimizing pixel-level
differences. Despite challenges like overfitting and minor artifacts in regions of high feature
variability, careful tuning of loss functions mitigated these issues, enabling the GAN to generate
noisy images closely resembling the input. However, GANs proved less consistent than ViT and
U-Net due to the complexities of adversarial training, though they provided valuable insights into
capturing detailed noise patterns in self-augmentation.
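The sketch below illustrates the dual-loss training step described above, under stated assumptions: G stands in for the U-Net generator, D for the CNN discriminator (assumed to emit one logit per image), and content_weight is a hypothetical balance between the MSE content loss and the binary cross-entropy adversarial loss; the VGG-19 feature branch is omitted for brevity.

```python
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
mse = nn.MSELoss()

def gan_step(G, D, opt_G, opt_D, noisy, content_weight=100.0):
    """One adversarial step on a batch of noisy images (N, 3, H, W).

    G: generator mapping noisy images to new noisy images.
    D: discriminator mapping an image batch to one logit per image.
    """
    real_lbl = torch.ones(noisy.size(0), 1)
    fake_lbl = torch.zeros(noisy.size(0), 1)

    # Discriminator update: real noisy inputs vs. generated noisy images.
    opt_D.zero_grad()
    fake = G(noisy).detach()
    d_loss = bce(D(noisy), real_lbl) + bce(D(fake), fake_lbl)
    d_loss.backward()
    opt_D.step()

    # Generator update: fool D while staying close to the input content.
    opt_G.zero_grad()
    fake = G(noisy)
    g_loss = bce(D(fake), real_lbl) + content_weight * mse(fake, noisy)
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```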
4. Self-Augmented Noise2Noise Algorithm
In scenarios where only a single noisy image is available, the self-augmented N2N approach is
particularly valuable. The methodology includes an additional step to simulate a second noisy
version from a single noisy image, enabling the Noise2Noise process even in single-image cases.
Fig. 2.2.3: Flow diagram of the Noise2Noise algorithm
Self-Augmentation of Noisy Images:
o The model first creates a second noisy version of each image by learning to
approximate the noise distribution in the original noisy image.
o This process involves training a deep learning model (e.g., U-Net, ResNet, or
Vision Transformer) to create a noisy output that closely resembles the input
noisy image without directly replicating it.
o To prevent overfitting, the training uses a modified loss function that combines
MSE with variance normalization. This encourages the model to focus on the
underlying structure of the image rather than memorizing the noise pattern.
Self-Supervised Noise2Noise Training:
o The original noisy image serves as the input, and the self-augmented noisy
version serves as the target during training.
o By minimizing the loss between the original noisy image and the self-augmented
noisy image, the network learns to reduce noise.
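Putting the two stages together, the following is a minimal end-to-end sketch of the self-augmented N2N procedure under the assumptions above; make_model is a hypothetical stand-in for U-Net, ResNet, or ViT, and the iteration counts, learning rates, and plain MSE loss (used here in place of the full variance-normalized loss sketched earlier) are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_model() -> nn.Module:
    # Hypothetical stand-in for U-Net / ResNet / ViT.
    return nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 3, 3, padding=1),
    )

noisy = torch.rand(4, 3, 64, 64)  # the only available noisy images

# Stage 1: train the self-augmentation network on noisy -> noisy.
# Imperfect convergence yields outputs whose noise resembles, but does
# not exactly replicate, the noise in the input.
aug_net = make_model()
opt = torch.optim.Adam(aug_net.parameters(), lr=1e-3)
for _ in range(200):                      # iteration count is illustrative
    opt.zero_grad()
    F.mse_loss(aug_net(noisy), noisy).backward()
    opt.step()

# Stage 2: standard Noise2Noise, using the augmented image as the target.
with torch.no_grad():
    noisy_aug = aug_net(noisy)
denoiser = make_model()
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    F.mse_loss(denoiser(noisy), noisy_aug).backward()
    opt.step()
```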
Network Architecture:
o Various architectures can be used for this self-augmentation step, such as U-Net
(an encoder-decoder structure) or Vision Transformer (ViT), which have shown
better results in handling noise.
o In particular, U-Net and ViT are effective because they capture relationships
within image patches and avoid overfitting, which helps maintain image quality
while denoising.
Evaluation Metrics:
o The model's performance is evaluated using metrics like Peak Signal-to-Noise
Ratio (PSNR) and Structural Similarity Index Measure (SSIM), which quantify
how well the noise is removed while preserving image details.
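Where a clean reference is available for evaluation, PSNR and SSIM can be computed as below; this assumes a recent scikit-image and float RGB images scaled to [0, 1].

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(clean: np.ndarray, denoised: np.ndarray):
    """PSNR (dB) and SSIM for float RGB images in [0, 1], shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(clean, denoised, data_range=1.0)
    ssim = structural_similarity(clean, denoised, data_range=1.0,
                                 channel_axis=-1)
    return psnr, ssim
```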
5. LITERATURE SURVEY
5.1 RELATED WORK
Qu, Z., Zhang, Y., Sun, Y., and Lin, X. [1] proposed A New Generative Adversarial
Network for Texture Preserving Image Denoising. The Noise2Noise framework builds on
classical and deep learning-based denoising methods. Traditional techniques used filtering and
optimization, while modern approaches, like CNNs (e.g., DnCNN) and autoencoders, learn
mappings from noisy to clean images. GANs further enhance denoising by preserving textures
through adversarial training, making them ideal for detail-sensitive applications.
Samik Banerjee and Sukhendu Das [2], SD-GAN: Structural and Denoising GAN reveals facial parts
under occlusion. Classical methods like Non-Local Means and BM3D use filtering and statistical
models for noise removal. Deep learning models like DnCNN leverage CNNs trained on clean-
noisy pairs for improved denoising, while GANs enhance perceptual quality through adversarial
training. Noise2Noise innovates by using noisy image pairs instead of clean references, enabling
denoising without clean data. Self-supervised extensions adapt this approach for single-image
denoising by simulating multiple noisy versions.
Xin Cheng, Jingmei Zhou, Jiachun Song, and Xiangmo Zhao [3], A Highway Traffic Image
Enhancement Algorithm Based on Improved GAN in Complex Weather Conditions. Traditional
methods like histogram equalization and Retinex struggle with complex textures and
overexposure in challenging weather. Deep learning models, such as GCANet and object
detection networks, improve clarity and adapt to varying conditions, while GANs enhance
underwater image clarity but face synthetic-to-real distribution challenges. However, many
methods still struggle with fine details and color fidelity, highlighting the need for better
solutions.
Shaobo Zhao, Sheng Lin, and Xi Cheng [6], Dual-GAN Complementary Learning for Real-World
Image Denoising. Notable denoising methods like DnCNNs, FFDNet, CBDNet, RIDNet, and
SADNet address image noise using deep learning. However, many rely on L1 or L2 losses,
leading to blurring and limited real-world effectiveness. Handling complex, spatially varying
noise has driven the development of strategies combining denoised image and noise learning.
Single GAN-based methods face challenges with complexity and training, prompting the
proposed DGCL strategy to enhance performance and overcome these limitations.
Xin Jin, Ying Hu, and Chu-Yue Zhang [8] propose an image restoration method combining GANs with
multi-scale feature fusion to address limitations in traditional algorithms requiring paired data.
While GAN-based approaches often complicate training and produce unnatural results, this
method uses a VGG-16 encoder-decoder to fuse low- and high-dimensional features, enhancing
restoration quality. Incorporating WGAN principles and L1 loss improves similarity to target
images. Experiments show improved accuracy and visual consistency, though challenges remain
in complex backgrounds, which future work aims to address.
Luan Thanh Trinh and Tomoki Hamagami [9], Latent Denoising Diffusion GAN:
Faster Sampling, Higher Image Quality. The Latent Denoising Diffusion GAN (LDDGAN)
improves image generation by addressing the slow inference of diffusion models while
enhancing quality and diversity. Using pre-trained autoencoders, it compresses images into a
low-dimensional latent space for faster sampling and efficiency. A Weighted Learning strategy
balances adversarial and reconstruction losses, boosting image quality and diversity. LDDGAN
achieves state-of-the-art speeds and competitive quality compared to models like DiffusionGAN,
showing promise for real-time applications and advancing high-fidelity diffusion models by
moving beyond Gaussian latent spaces.
Wenda Li and Jian Wang [12], Residual Learning of Cycle-GAN for Seismic Data
Denoising. The Residual Cycle-GAN (RCGAN) improves seismic data denoising by integrating
residual learning into a Cycle-GAN framework, addressing the limitations of traditional methods
that require manual parameter selection. RCGAN uses data augmentation to enhance training
efficiency and adaptability to real seismic data, generating labeled datasets from noisy unlabeled
data. It outperforms methods like FXDM and DnCNN by better suppressing random noise while
preserving important data details in both synthetic and real seismic datasets.
Joonyoung Song, Jae-Heon Jeong, and Dae-Soon Park [15], Unsupervised Denoising for Satellite
Imagery Using Wavelet Directional CycleGAN. Satellite image denoising includes traditional
methods like Total Variation (TV), which preserves high-frequency information, and Low-Rank
Matrix Recovery (LRMR), which uses the low-rank properties of clean images. However, these
methods require paired noisy and clean images, which are hard to obtain. Recent deep learning
approaches, like deep image prior and Noise2Void, face issues with computational complexity
and noise assumptions. This highlights the need for innovative methods like WavCycleGAN,
which denoises satellite imagery without paired datasets, addressing challenges in remote
sensing.
CONCLUSION
This research presents a novel approach to enhancing the Noise2Noise (N2N) algorithm for
image denoising, specifically targeting single-image and blind noise scenarios by generating
noisy images from limited datasets. By leveraging the imperfections inherent in the deep learning
training process, the proposed self-augmented noisy image network enables the generation of
new noisy images that can be used for training and validation, effectively facilitating noise
removal. Experimental results demonstrate that this self-augmentation strategy achieves
performance comparable to other unsupervised denoising methods at lower noise levels,
although it struggles at higher noise levels and on real-world images because the network lacks
an explicit understanding of image features. The findings underscore the importance of accurately
estimating noise characteristics in real-world applications to enhance denoising performance and
minimize pixel errors.
REFERENCES
1. Qu, Z., Zhang, Y., Sun, Y., and Lin, X. (2018), A New Generative Adversarial Network
for Texture Preserving Image Denoising, IEEE, 5356-5361.
2. Samik Banerjee and Sukhendu Das (2020), SD-GAN: Structural and Denoising GAN
reveals facial parts under occlusion, arXiv, 1-24.
3. Xin Cheng, Jingmei Zhou, Jiachun Song, and Xiangmo Zhao (2023), A Highway
Traffic Image Enhancement Algorithm Based on Improved GAN in Complex
Weather Conditions, IEEE, 8716-8726.
4. Songkui Chen and Daming Shi (2020), Image Denoising With Generative
Adversarial Networks and Its Application to Cell Image Enhancement, IEEE,
82819-82831.
5. Haibo Zhang and Kouichi Sakurai (2022), Conditional Generative Adversarial
Network-Based Image Denoising for Defending Against Adversarial Attack,
IEEE Access, 169031-169043.
6. Shaobo Zhao, Sheng Lin, and Xi Cheng (2024), Dual-GAN Complementary Learning
for Real-World Image Denoising, IEEE, 355-366.
7. Haijun Hu, Bo Gao, and Zhiyuan Shen (2020), Image Smear Removal via
Improved Conditional GAN and Semantic Network, IEEE Access, 113104-113111.
8. Xin Jin, Ying Hu, and Chu-Yue Zhang (2020), Image restoration method based on
GAN and multi-scale feature fusion, IEEE Xplore, 2305-2310.
9. Luan Thanh Trinh and Tomoki Hamagami (2024), Latent Denoising
Diffusion GAN: Faster Sampling, Higher Image Quality, IEEE Access, 78161-78172.
10. Ibrahim H. El-Shal and Omar M. Fahmy (2022), License Plate Image Analysis
Empowered by Generative Adversarial Neural Networks (GANs), IEEE Access,
30846-30857.
11. Huihui Li, Cang Gu, Dongqing Wu, and Gong Cheng (2022), Multiscale Generative
Adversarial Network Based on Wavelet Feature Learning for SAR-to-Optical Image
Translation, IEEE, 5236115-5236125.
12. Wenda Li and Jian Wang (2021), Residual Learning of Cycle-GAN for
Seismic Data Denoising, IEEE Access, 11585-11597.
13. Qi-Feng Sun, Jia-Yue Xu, and Han-Xiao Zhang (2021), Random noise suppression
and super-resolution reconstruction algorithm of seismic profile based
on GAN, Springer, 2107-2119.
14. Asavaron Limsuebchuea and Rakkrit Duangsoithong (2024), Self-Augmented
Noisy Image for Noise2Noise Image Denoising, IEEE Access, 71076-71087.
15. Joonyoung Song, Jae-Heon Jeong, and Dae-Soon Park (2021), Unsupervised
Denoising for Satellite Imagery Using Wavelet Directional CycleGAN, IEEE Xplore,
6823-6839.