
electronics

Article
Learning to See in Extremely Low-Light
Environments with Small Data
Yifeng Xu 1,2, Huigang Wang 1,*, Garth Douglas Cooper 1, Shaowei Rong 1 and Weitao Sun 1
1 School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China;
xuyifeng123@mail.nwpu.edu.cn (Y.X.); Coopergd1@mail.nwpu.edu.cn (G.D.C.);
rsw1986@mail.nwpu.edu.cn (S.R.); Sunwt1223@gmail.com (W.S.)
2 Jinhua Polytechnic, Jinhua 321017, China
* Correspondence: wanghg74@nwpu.edu.cn; Tel.: +86-029-88460521

Received: 25 May 2020; Accepted: 14 June 2020; Published: 17 June 2020

Abstract: Recent advances in deep learning have shown exciting promise in various artificial
intelligence vision tasks, such as image classification, image noise reduction, object detection,
semantic segmentation, and more. Restoring images captured in extremely dark environments is one
such subtask of computer vision. Some of the latest progress in this field depends on sophisticated
algorithms and massive image pairs taken under low-light and normal-light conditions. However, it is
difficult to capture image pairs of the same size and location under two different light levels.
We propose a method named NL2LL to collect underexposed images and the corresponding normally
exposed images by adjusting camera settings in the "normal" light of the daytime. The normal light
of the daytime provides better conditions for taking high-quality image pairs quickly and accurately.
Additionally, we show that a regularized denoising autoencoder is effective for restoring a low-light
image. Owing to the high-quality training data, the proposed restoration algorithm achieves superior
results for images taken in an extremely low-light environment (about 100x underexposure).
Our algorithm surpasses most compared methods while relying on only a small amount of training data,
20 image pairs. The experiments also show that the model adapts to environments of different brightness.

Keywords: low light; image restoration; denoise; noise reduction; deep learning; machine learning

1. Introduction
Image restoration is a challenging task, particularly in extremely dark environments, where recovering
the captured scene is a difficult problem. There are two ways to approach the problem: relying on
hardware and relying on software. On the hardware side, adjusting the camera settings can partially
solve the problem, but several difficulties remain: (1) a higher sensitivity value of the image sensor
increases brightness, but it also increases high-sensitivity noise, and, although the ISO value in the
latest cameras can be set to 25,600, images taken above ISO 400 tend to exhibit noticeable noise;
(2) a larger aperture receives more light, but it also leads to worse sharpness and a shallower depth
of field; (3) extending the exposure time is one of the most direct solutions, but even a slight
movement then leads to blurred images; (4) a larger photosensitive element receives more photons,
although its size is limited by camera size and cost; (5) using flash helps to capture more light;
however, the flash range is limited and flash is forbidden in some situations.
In addition to the suitable hardware settings, some sophisticated algorithms have been designed
to restore the images in the dark. Many denoising, deblurring, color calibration, and enhancement
algorithms [1–5] are applied to low-light images. These algorithms only deal with normal low-light




images but are inefficient for extremely low-light images with brightness as low as 1 lumen.
An alternative approach, burst alignment [6–8], uses multiple pictures taken continuously.
However, this kind of method still loses efficacy for pictures taken in extremely low-light conditions.
Chen et al. [9] proposed an end-to-end deep learning method which can restore images captured with only
0.1 lumens. However, this method was trained with massive amounts of training data and incurred a huge
computational cost. It is well known that deep learning algorithms require big data; one of the reasons
is that part of the data is of poor quality. By obtaining better training data, an algorithm can
accomplish the same result with less data.
In the past, the low-light data were typically collected in a low-light environment, while the ground
truth data were collected with a long exposure. This collection method had many difficulties and
generated low-quality training data. We propose a new method to collect dark pictures and corresponding
normal-exposure pictures by adjusting camera settings in the daytime.
We use an end-to-end neural network to restore extremely low-light images with only 20 high-quality
image pairs, building on the work of [9,10]. Our contributions in this paper are summarized as follows:
(1) we propose a new low-cost method to capture high-quality image pairs, consisting of dark pictures
and corresponding normal-exposure pictures; (2) we use the theory of the regularized denoising
autoencoder to explain why the algorithm works effectively; (3) our proposed algorithm can restore
images taken in an extremely dark environment (100x underexposure) using only 20 image pairs.
The rest of this paper is organized as follows. Related work on low-light image restoration is reviewed
in Section 2. The image acquisition method and the framework of the neural network are presented in
Section 3. Detailed experimental results are shown in Section 4. Several problems that require further
research are put forward in Section 5. The paper concludes with Section 6.

2. Related Work
The restoration of low-light images has been extensively studied in the literature. In this section,
we provide a short review of related work.

2.1. Low-Light Image Datasets


Well-known public image datasets, such as PASCAL VOC [11], ImageNet [12], and COCO [13]
have played a significant role in traditional computer vision tasks. Because less than 2% of the images
in these datasets were taken in a low-light environment, the public datasets were unsuitable for training
low-light image restoration.
Many researchers have proposed low-light image datasets. LLNet [14] darkened the original images to
simulate low-light images; the original images were used as the ground truth, but the generated dark
images were artificial. The authors of [5] proposed a new dataset of 3000 underexposed images,
each with an expert-retouched reference. PolyU [15] collected real-world noisy images taken from
40 scenes, including indoor normal-lighting, dark-lighting, and outdoor normal-light scenes. However,
the images from these datasets are merely dim, not extremely dark. SID [9] proposed an extremely dark
image dataset captured by Sony and Fuji cameras, including 5094 short-exposure images and
424 long-exposure images. LOL [16] consists of 500 image pairs. The ExDARK [17] dataset is made up of
images captured in real environments, containing various objects. Because the images require careful
adjustment of the camera settings in dark conditions, the previously mentioned databases have two
common problems: the high cost of collecting data and very few high-quality images.

2.2. Image Denoising


Since noise is one of the most significant obstacles in the restoration of low-light images, denoising
is a notable subtask in low-light enhancement. A classic traditional way of dealing with a low-light
image is to scale the luminosity and then apply a denoising procedure. Image denoising is an
often-visited topic in low-level computer vision. Many approaches have been proposed,
such as total variation [18], sparse coding [19], and 3D transform-domain filtering (BM3D) [20].
These approaches are grouped into traditional denoising methods. Their effectiveness is often based on
an image prior's information, such as smoothness, low rank, and self-similarity. Unfortunately, most
traditional methods only work effectively on synthetic noise, such as salt-and-pepper and Gaussian
noise, and their performance drops sharply for noisy images taken in real-world environments.

Researchers have also widely explored the application of deep learning networks to image denoising.
Important deep learning methods include deep illumination estimation [5], end-to-end convolutional
networks [9,21], autoencoders [22,23], and multi-layer perceptrons [24]. Generally, most methods based
on deep learning work better than traditional methods. However, the former require huge amounts of
training data.

Beyond single-image denoising, the alternative, multiple-image denoising [6,7,25], achieves better
results since more information is collected. However, it is difficult to select the "lucky image" and
to estimate the correspondence between images. On some occasions, taking more than one image is
infeasible.
2.3. Low-Light Image Restoration

One basic method is histogram equalization, which expands the dynamic range of the images. The recent
effort in low-light image restoration is learning-based methods. For instance, the authors of [14]
proposed a deep autoencoder approach. WESPE [26] proposed a weakly supervised image-to-image network
based on a Generative Adversarial Network (GAN). However, WESPE was more focused on image enhancement.
In addition, other methods include approaches based on the dark channel prior [27], the wavelet
transform [23], and illumination map estimation [28]. The methods mentioned above only deal with images
captured in ordinary dark environments, such as dusk, morning, and shadow. The end-to-end model
proposed by Chen et al. [9] could restore extremely low-light images using RAW sensor data; however,
its model was heavyweight. In sum, the current research suggests that either images in extreme darkness
cannot be restored or the algorithms require big data.
either the image in the extreme dark cannot be restored or the algorithms require big data.
3. The Approach

3.1. The End-to-End Pipeline Based on Deep Learning

The pipelines based on deep learning and on the traditional method can both be used to restore
low-light images. The two kinds of pipelines are shown in Figure 1. The deep learning model (the top
sub-image) is an end-to-end method. This method generates a model from image pairs, while the
traditional method cascades a sequence of low-level vision processing procedures, such as luminosity
scaling, demosaicing, denoising, sharpening, and color correction. In Figure 1, luminosity scaling and
cBM3D are selected as the major procedures in the traditional method.

Figure 1. The two categories of pipelines concerning low-light restoration. The top sub-image shows the
pipeline based on deep learning. The bottom sub-image shows a traditional image processing pipeline.
The toy images on the right side are the results of the two pipelines, respectively. The sub-image
surrounded by a red line box is the zoom-in image. BM3D = 3D transform-domain filtering.

In the pipeline of the traditional method, the first step is luminosity scaling. The images taken by
the Nikon D700 camera are Nikon Electronic Format (NEF) RAW images with 14 bits, which means that the
maximum brightness value is 2^14, that is, 16,384. Because the images were taken in an extremely dark
condition, the brightness values of the pixels are distributed between 1 and 50. The light-scaling
procedure can be expressed by the formula v_x/v_max × 16,384, where the parameter v_x represents the
brightness value of a pixel and v_max represents the maximum brightness value over all pixels.
This simple luminosity scaling also amplifies the noise in the images. The high noise level is
demonstrated in the zoom-in image surrounded by the red box after luminance scaling. The second step is
noise reduction by BM3D.
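As a concrete illustration of the scaling step, the following minimal sketch (our own, not code from
the original pipeline) applies the formula v_x/v_max × 16,384 to a 14-bit RAW array with NumPy;
the subsequent BM3D denoising is assumed to be provided by an external implementation.

import numpy as np

def scale_luminosity(raw, full_scale=2 ** 14):
    # Scale a 14-bit RAW image so that its brightest pixel reaches the 14-bit maximum.
    # In the extremely dark images the raw values lie roughly between 1 and 50, so the
    # scaling factor is large and the sensor noise is amplified by the same factor.
    raw = raw.astype(np.float64)
    return raw / raw.max() * full_scale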
In our work, a deep learning neural network is proposed for the direct single-image restoration of
extremely low-light images. Specifically, a convolutional neural network [29], U-net [10], is used for
processing, inspired by the recent algorithms in the work of [9,10]. The structure of the network is
shown in Figure 2, and the details of the structure are listed in Table 1.

Figure 2. The structure of the network. The input array is 4D data converted from the original RAW
image. A convolutional block is abbreviated as "Conv Block", which represents a convolutional block
including a 2D convolutional layer and a pooling layer. The red dotted arrow represents copy and crop
operations.
Table 1. The parameters of the neural network. The parameter "32 [3,3]" represents that the output
array size is 32 and the convolutional kernel size is 3 × 3.

ID of the Block    The First Layer       The Second Layer      The Third Layer
Block 1            Conv2d(32, [3,3])     Conv2d(32, [3,3])     Max pooling2d
Block 2            Conv2d(64, [3,3])     Conv2d(64, [3,3])     Max pooling2d
Block 3            Conv2d(128, [3,3])    Conv2d(128, [3,3])    Max pooling2d
Block 4            Conv2d(256, [3,3])    Conv2d(256, [3,3])    Max pooling2d
Block 5            Conv2d(512, [3,3])    Conv2d(512, [3,3])    none
Block 6            Conv2d(256, [3,3])    Conv2d(256, [3,3])    none
Block 7            Conv2d(128, [3,3])    Conv2d(128, [3,3])    none
Block 8            Conv2d(64, [3,3])     Conv2d(64, [3,3])     none
Block 9            Conv2d(32, [3,3])     Conv2d(32, [3,3])     Conv2d(12, [1,1])
9 Autoencoder
Conv2d(32, [3,3]) Conv2d(32, [3,3]) Conv2d(12, [1,1])
In our study, we selected an autoencoder neural network [30,31] similar to U-net to restore dark
3.2. Regularized Denoising Autoencoder
images. The autoencoder is a neural networks that is trained to attempt to map the input to the output.
In other words, it is restricted
In our study, we selected an in autoencoder
some ways to learnnetwork
neural the useful properties
[30,31] similaroftothe data.
U-net It has many
to restore dark
layers internally called the hidden layer. The network is divided into two
images. The autoencoder is a neural networks that is trained to attempt to map the input parts: an encoder function
to the
h = F(x) In
output. and a decoder
other words,function G(h) which
it is restricted generates
in some the
ways to reconstruction.
learn the useful properties of the data. It has
Regularized technology [32,33] is used to solve the invalidation
many layers internally called the hidden layer. The network is divided of the into
over-complete
two parts:autoencoder.
an encoder
function h = F(x) and a decoder function G(h) which generates the reconstruction. The over-complete
The case is called over-complete when the hidden dimension is greater than the input.
autoencoders fail to learn anything useful if the encoder and decoder have a large number of parameters.
autoencoder to learn useful information about the data distribution.
The denoising autoencoder (DAE) [34,35] is one of the autoencoders with corrupted data as
input and clear data as output by a trained model. The structure of a DAE is shown in Figure 3. It
aims to learn a reconstruction distribution precontstruct(y|x) by the given training pairs (x,y). The DAE
minimizes
Electronics the
2020, function L(y,G(F(x))) to obtain the useful properties, where x is a corrupted 5data
9, 1011 of 15
relative to the original data y. Specifically, in our study, data x indicates the images with dark noise
and data y indicates the ground truth images.
Regularized
The first autoencoders use a loss
step is sampling y function
from theto training
learn useful information
data. The second fromstep
the input. The useful
is sampling the
information includes the sparsity of the representation, robustness to noise,
corresponding data point x by M(x|y). The third step is estimating the reconstruction distribution and robustness to the
by
missing input. In particular, the clear and real image data is useful information, hidden
pdecoder(y|h) with h the output of encoder and pdecoder defined by the decoder G(h). DAE is a feedforward in the dark
background. In one word,
network and trained by theregularization
methods of any enables
other the nonlinearnetwork.
feedforward and over-complete autoencoder
We can perform gradient- to
learn useful information about the data distribution.
based approximate minimization on the negative log-likelihood. For example, the stochastic gradient
descentThecandenoising
be written autoencoder
by: (DAE) [34,35] is one of the autoencoders with corrupted data as
input and clear data as output by a trained model. The structure of a DAE is shown in Figure 3.
It aims to learn a reconstruction -Ε y~distribution
ˆ
pdata(y)
Ε x~ M(x|y) log pdecoder
precontstruct ( yby| the
(y|x) F(x)
h =given )
training pairs (x,y). The DAE(1)
minimizes the function L(y,G(F(x))) to obtain the useful properties, where x is a corrupted data relative
to theInoriginal
Equation data(1),
y. the pdecoder isinthe
Specifically, ourdistribution
study, data xcalculated by images
indicates the the decoder and noise
with dark the p̂data
andisdata
the
ytraining
indicates thedistribution.
data ground truth images.

Figure 3. The structure of a denoising autoencoder (DAE). The input data tagged as x and the output
data tagged as y represent the noisy data and the ground truth, respectively. The functions F(x) and
G(h) represent the encoder and decoder. The map M(x|y) represents the procedure of generating x from y.

3.3. The Procedure of Collecting Data

The traditional low-light enhancement methods cascade the procedures of scaling, denoising, and color
correction. The traditional methods do not need the ground truth images during processing. On the
contrary, a deep learning neural network must be trained on data before the testing phase. The training
and testing phases are shown in Figure 4. The upper part of this figure shows that the training data
consist of two parts: the low-light (dark) images and the corresponding normal-light images. Every
image pair in the two parts has the same size and shooting range and aligns pixel by pixel. There are
only a few low-light image datasets available; an example from one of these datasets is seen in the
upper left. The learning-based model can learn the fitting parameters to map the image pairs.
The mapping relationship from the low-light images to the normal-light images is non-linear, and thus
deep learning is appropriate.

The bottom sub-image of Figure 4 shows the test phase. In the test phase, the low-light images are
input into the trained model and the restored normal-light images are output. In addition, while the
output restored image (bottom right corner) is unknown, the other three kinds of images are known.
Another significant point is that the training and the test images are independent.

Figure 4. The schematic diagram of the train and test phase in the deep learning approach. The upper
sub-image shows the training phase. The bottom sub-image shows the test phase using the trained
model generated in the training phase.

In order to improve the effectiveness of the restoration, we can start from the two aspects of the
algorithm and the training data. In this section, we describe the training data. Due to the large
computational cost of the training data, data collection becomes an obstacle for deep learning
algorithms.

The data collection method used in other deep learning literature is shown in the upper sub-image of
Figure 5. The low-light datasets used for deep learning in the computer vision community are almost all
collected in low-light conditions, shown as the left box of the upper sub-figure of Figure 5.
The corresponding ground truth images are also taken in low-light conditions, shown as the right box of
the upper sub-figure. To take normally exposed images, the camera is set to a higher ISO, a larger
aperture, a longer exposure time, a larger light-sensing element, and flash. However, such settings
reduce the quality of the images.

We collected the training data during the daytime, in a normal-light condition, shown as the bottom
sub-image of Figure 5. We named the collection method NL2LL (normal-light to low-light).
An environment with enough light makes it convenient to shoot high-quality data. The three pillars of
photography, shutter speed, ISO, and aperture, can be set to "better" parameters during the daytime.
We set shorter exposure times, a lower ISO, and larger aperture values to take the dark images during
the daytime.

Figure 5. The different light conditions during the phase of collecting training images in deep
learning. The upper sub-figure shows that the image pairs are both captured in low-light conditions.
In the lower sub-figure, our proposed image acquisition method captures images during the daytime.
In our method, the low-light images are taken in the normal-light condition.

The influence of the aperture is rarely discussed in the relevant literature. Aperture is defined as
the opening in the lens through which light passes to enter the camera. Small aperture numbers
represent a large aperture opening size, whereas large numbers represent small apertures. The critical
effect of the aperture is on the depth of field. Depth of field is the amount of the photograph that
appears sharp from front to back. According to the principles of optics, a larger aperture size
(smaller aperture value) leads to a shallower depth of field and therefore more defocus blur.

The effect of the aperture size on the image depth of field is shown in Figure 6. The left half of this
figure has a "thin" depth of field, where the background is completely out of focus. On the contrary,
the right sub-image of Figure 6 has a "deep" depth of field, where both the foreground and the
background are sharp; the camera focuses on the foreground. Because the zoom-in images surrounded by
the solid red boxes are near the focus point, both images (the first and the third image at the bottom)
are clear. However, the distance between the dotted red region and the camera is far from that between
the focus point and the camera. According to the principles of optics, the background area away from
the focus point becomes blurred with a large-aperture camera setting. The zoom-in sub-image taken with
the large aperture size (the second image at the bottom) is more blurred than the respective sub-image
taken with the small aperture size (the fourth image at the bottom).

Figure 6. The influence of the aperture size on the image quality. The aperture of the left and right
images is set as 4 (large aperture size) and 25 (small aperture size), respectively. The center of the
red circle is the location of the focal point. The areas surrounded by the solid red box and the dotted
red box are in the foreground and background, respectively.

If the images are taken in dark conditions, the aperture must be set as large as possible to receive
more light. The large aperture setting in the camera inevitably results in a large amount of background
blur. The training dataset used in deep learning has many image pairs. Every image pair consists of
a low-light image and a corresponding normal-light image. To achieve high-quality data, each pixel pair
in both images must match one-to-one. Unfortunately, the blur of some pixels impacted the quality of
the training data [9,14].

The large aperture size blurs the image. It is also plain to see that a longer exposure time and
a higher ISO reduce the quality of the image pair; more specifically, it becomes harder to align the
two images pixel by pixel. When there is less light at night, the exposure time must be increased to
capture more light for the images. During the day, in contrast, the exposure time can be decreased,
which in turn reduces the amount of light that enters the apparatus. The exposure time of the ground
truth in the literature [9] is 10 s and 30 s, while the respective exposure time in our data is between
1/10 s and 3 s.


The parameter ISO plays the same role. The value of ISO can then be adjusted to a smaller value
(for better quality) in the daytime; it was set to 100 in our first experiment.

Moreover, we adopted Wi-Fi equipment to remotely adjust the camera settings, and the camera was fixed
on a tripod. These hardware devices ensure the stability of the camera while taking image pairs.

Most researchers consider that the training data used for low-light restoration methods based on deep
learning must be collected in low-light conditions. On the contrary, our experiments have shown that
the training data can be collected in normal-light conditions. As opposed to previous methods that
photograph in a low-light environment, our proposed method takes images in a bright environment.
Figure 7 shows all the training images in our experiment. The shooting parameters are listed in
Table 2. Our algorithm achieves exciting results using only 20 image pairs. The camera parameters in
our method make it easier to take a high-quality image.
Figure 7. All images in our dataset. Normal-light images (ground truth) are shown at the front.
The low-light images (extremely dark) are shown behind.
Table 2. The EXIF parameters of the training images. The first column, ID, indicates the number of the
image pair. The order of the IDs is from left to right in every line, then the next line, and so on.
The second column, EXIF, indicates the photos' parameters. GT = ground truth. All the parameters of the
dark images are the same as those of the GT except the shutter time.

ID    EXIF
1     GT: F16.0, 1/8 s, ISO 100; Dark: 1/800 s
2     GT: F16.0, 1 s, ISO 100; Dark: 1/100 s
3     GT: F16.0, 1/4 s, ISO 100; Dark: 1/400 s
4     GT: F16.0, 1/8 s, ISO 100; Dark: 1/800 s
5     GT: F16.0, 1/10 s, ISO 100, +1.0 EV; Dark: 1/1000 s
6     GT: F16.0, 1/10 s, ISO 100, +1.0 EV; Dark: 1/1000 s
7     GT: F16.0, 1/10 s, ISO 100, +1.0 EV; Dark: 1/1000 s
8     GT: F16.0, 1/8 s, ISO 100, +1.0 EV; Dark: 1/800 s
9     GT: F16.0, 1/4 s, ISO 100, +1.0 EV; Dark: 1/400 s
10    GT: F16.0, 1/4 s, ISO 100, +1.0 EV; Dark: 1/400 s
11    GT: F16.0, 1/4 s, ISO 100, +1.0 EV; Dark: 1/400 s
12    GT: F16.0, 1/2 s, ISO 100, +1.0 EV; Dark: 1/200 s
13    GT: F16.0, 1/5 s, ISO 100, +1.0 EV; Dark: 1/500 s
14    GT: F16.0, 1/8 s, ISO 100, +1.0 EV; Dark: 1/800 s
15    GT: F16.0, 1/8 s, ISO 100, +1.0 EV; Dark: 1/800 s
16    GT: F16.0, 3 s, ISO 100, +1.0 EV; Dark: 1/30 s
17    GT: F16.0, 3 s, ISO 100, +1.0 EV; Dark: 1/30 s
18    GT: F16.0, 2 s, ISO 100, +1.0 EV; Dark: 1/50 s
19    GT: F16.0, 1/4 s, ISO 100, +1.0 EV; Dark: 1/400 s
20    GT: F16.0, 1/4 s, ISO 100, +1.0 EV; Dark: 1/400 s
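The EXIF values in Table 2 can be checked with a little arithmetic: dividing the ground-truth shutter
time by the dark shutter time gives the underexposure factor of the dark image, which is roughly 100x
for every pair. The short snippet below is only a reading aid for the table, not part of the method.

# Shutter times in seconds for a few (ground truth, dark) pairs taken from Table 2.
pairs = {1: (1 / 8, 1 / 800), 2: (1, 1 / 100), 13: (1 / 5, 1 / 500), 16: (3, 1 / 30)}

for pair_id, (gt_time, dark_time) in pairs.items():
    factor = gt_time / dark_time   # underexposure factor of the dark image
    print(f"pair {pair_id}: {factor:.0f}x less exposure")
# Output: pair 1: 100x, pair 2: 100x, pair 13: 100x, pair 16: 90x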
4. Experiments

4.1. Dataset

The images in our dataset were collected from real-world scenes instead of by artificial brightness
adjustment. The images in our training dataset were taken in a cloudy environment by
a Nikon D700 made in Japan. The lens was the fixed-focus lens labeled as AF-S NIKKOR 50 mm
f/1.4G. Because the RAW format can save more low-light information than the sRGB format, the images
were saved in the RAW format. The data collection environments of various algorithms are shown
in Table 3. The dataset was divided into the training data and the test data. All the training images
were taken outdoors and selected randomly from the image pairs. The training data included the
20 image pairs shown in Figure 7. To avoid overfitting, the test images were taken separately in
the low-light environment and independent of the training data. The test images were taken from
a variety of scenes, such as a bedroom and outdoors. They were taken in an environment of about
1 lumen, approximately 100 times lower brightness than that of the training images. Some of the test
images are shown in Figure 8. The exposure time of the training pictures was reduced by exactly
100 times compared to that of the test pictures. The training images and test images were independent
and identically distributed (i.i.d.) to guarantee the effectiveness of our algorithm.

Table 3. The environments of collecting images in various algorithms.

Class of Methods The Training Data The Test Data


Traditional methods [20] None Low-light environment
Methods based on deep learning [5,9,14] Low-light environment Low-light environment
Our methods Normal-light environment Low-light environment

Figure 8. The results of the various algorithms. The images marked with a red dotted box on the second
row are the zoom-in images of the regions surrounded by the small red dotted boxes in line 1,
respectively. The images surrounded by the orange boxes in the fourth line are also the magnified
images from the solid orange box regions in line 3. The indoor test image (upper left sub-image) was
taken with the parameters: exposure compensation +1.0, f 6.1, 0.1 s and ISO 400. The outdoor test image
(third line and first column) was taken with the parameters: f 3.5, 0.2 s and ISO 400. The first column
shows the test images taken in an extremely dark condition. The second column shows our results by the
end-to-end deep learning model. The third column shows the results by the traditional method which
cascades luminosity scaling and BM3D [20]. The fourth to the seventh columns show the results by the
Robust Retinex Model algorithm [36], by JED [37], by LIME [28] and by SID [9]. The last column shows
the ground truth taken with a long exposure time.

4.2. Qualitative Results and Perceptual Analysis

The methods used as a comparative baseline include the traditional method and the "modern" methods.
The traditional method cascades a luminosity scaling algorithm and a denoising algorithm. BM3D was
selected as the classic denoising algorithm in our research because it outperforms most techniques on
images with real noise. The modern data-driven approach selects some machine learning algorithms that
have been proposed in recent years.

The process of determining the level of image quality is called Image Quality Assessment (IQA).
IQA is part of the quality of experience measures. Image quality can be assessed using two kinds of
methods: subjective and objective. In the subjective method, the corresponding result images processed
by different pipelines were presented to students, who determined which image had higher quality.
The images were presented in a random order, with a random left–right order and without any indication
of their provenance. A total of 100 comparisons were performed by five students. The students found
that the results of the traditional method (the third column) still preserved the lighting well,
but there were some yellow patches in the large white regions as well as noise particles. The other
"modern" algorithms [28,36,37] did not work well in the extremely dark real environments. Our results
were superior in terms of image contrast, color accuracy, dynamic range, and exposure accuracy.
Our pipeline significantly outperformed the traditional method and the "modern" methods in the aspects
of denoising and color correction, respectively. The results of the various algorithms are shown in
Figure 8.

4.3. Quantitative Analysis


The objective IQA methods were used to quantitatively measure the results. The objective methods
can be classified into full-reference (FR) methods, reduced-reference (RR) methods, and no-reference
(NR) methods.
FR metrics try to assess an image quality by comparing it with a reference image (ground truth)
that is assumed to have perfect quality. The classical FR methods, the peak signal-to-noise ratio (PSNR)
and the structural similarity index (SSIM [38]) were selected in our quantitative analysis. The higher
the value of the two FR methods, the better the image quality.
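For reference, both FR metrics can be computed directly from an image pair. The sketch below is our own
(assuming 8-bit RGB images stored as NumPy arrays and a recent scikit-image release for
structural_similarity) and follows the standard definitions.

import numpy as np
from skimage.metrics import structural_similarity

def psnr(result, ground_truth, max_value=255.0):
    # Peak signal-to-noise ratio in dB; identical images give infinity,
    # as for the ground truth column in Tables 4 and 5.
    mse = np.mean((result.astype(np.float64) - ground_truth.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)

def ssim(result, ground_truth):
    # Structural similarity index for color images.
    return structural_similarity(result, ground_truth, channel_axis=-1, data_range=255)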
ENIQA [39] and Integrated Local NIQE (IL-NIQE) [40] are high-performance general-purpose NR IQA methods
based on image entropy; IL-NIQE uses a feature-enriched completely blind image quality evaluator.
NIQE [41] is a completely blind image quality analyzer and is also one of the NR methods. SSEQ [42]
evaluates image quality based on spatial and spectral entropies without a reference image. Lower scores
calculated by these NR methods represent better image quality.
The quantitative IQA of the experimental results is shown in Tables 4 and 5. The size of the
toy image in Table 4 was set to 512 × 339 pixels. The size of the bicycle image in Table 5 was set to
512 × 340 pixels. The images in the PNG format were evaluated by the following IQA algorithms,
except SID [9]. The original model in SID only accepted the 16-bit raw image taken by the Sony camera
and 14-bit raw image taken by the Fuji camera. Our test images were taken with a Nikon DSLR D700
made in Japan. Therefore, we modified the test code of SID to accept our test images.

Table 4. The Image Quality Assessment (IQA) of the recovery results about the toy image. The first
column shows the various IQA methods. The columns (from second to seventh) show the corresponding
IQA scores of the image results by various algorithms. The last column shows the score of the ground
truth. The bold font and “(1)” indicate the value that is the best score. The underline and “(2)” represent
second place.
Assessment Methods Our Method The Traditional Method Robust Retinex Model JED LIME SID Ground Truth
PSNR 27.4600 (2) 22.1908 12.5453 12.3313 14.3542 21.7659 Inf (1)
SSIM [38] 0.9479 (2) 0.8593 0.3019 0.2726 0.3442 0.5128 1 (1)
ENIQA [39] 0.0762 (2) 0.2004 0.5808 0.5499 0.3101 0.1900 0.0709 (1)
IL-NIQE [40] 34.3996 (3) 37.4977 71.1242 75.1293 70.9793 34.3299 (2) 30.9900 (1)
NIQE [41] 4.6251 (1) 5.5973 8.5458 7.6560 7.9087 5.1727 4.7387 (2)
SSEQ [42] 5.3654 (1) 29.0665 37.1942 19.3989 35.3465 30.2516 15.8142 (2)

Similar to the qualitative assessment, our method also exceeds most methods in the quantitative
analysis. The Robust Retinex Model, JED, and LIME can achieve a better effect at dusk-level light,
but these methods do not work in extremely dark environments.

Table 5. The IQA of the recovery results about the bicycle image. The first column shows the various
IQA methods. The columns (from second to seventh) show the corresponding IQA scores of the image
results by various algorithms. The last column shows the score of the ground truth. The bold font and
“(1)” indicate the value that is the best score. The underline and “(2)” represent second place.
Assessment Methods Our Method The Traditional Method Robust Retinex Model JED LIME SID Ground Truth
PSNR 30.8136 (2) 21.1950 11.9917 11.9579 13.7948 23.4298 Inf (1)
SSIM [38] 0.9535 (2) 0.8106 0.1963 0.1832 0.2072 0.8374 1 (1)
ENIQA [39] 0.1320 (2) 0.2860 0.1476 0.2346 0.2048 0.1658 0.0995 (1)
IL-NIQE [40] 25.1430(3) 25.5302 50.0900 47.2871 46.7496 23.6752 (1) 24.7926 (2)
NIQE [41] 3.6368 (1) 4.1941 7.0819 5.7749 7.0317 3.7191 (2) 3.7534
SSEQ [42] 15.6676(1) 38.2989 35.0894 28.4950 42.2645 24.2338 19.7217 (2)

Due to the lack of reference images, the NR methods do not know which image is the ground truth.
By NIQE and SSEQ (Lines 5 and 6), the scores of our results are even slightly better than those of the
ground truth. The closeness of our scores to those of the ground truth under NIQE indicates that our
restored images are of high quality.
Next, we analyzed the generalization performance of the network model. The generalization performance
of a learning algorithm refers to the performance of the learned models on out-of-sample data. The test
images in Figure 8 were shot in an environment that is as bright as that of the training images.
In this experiment, the test images were taken in an extremely low-light environment with different
camera parameters. The camera metering system indicated that the setting (F5, 50 s, ISO 400) can
achieve normal exposure. Test images with different levels of darkness were collected by adjusting only
the shutter time. For example, the exposure time was set to 0.5 s to obtain 100x less exposure,
as shown in the second column of Figure 7. Similarly, the exposure times were set to 1 s, 1/4 s, 1/8 s,
and 1/15 s, which means 50x, 200x, 400x, and 750x less exposure, respectively.
The curves are smooth in the second line of Figure 9. However, some line deformation can be found in
the fourth line of Figure 9. These details illustrate that our method has a more powerful capability to
restore details than the traditional method. The sub-images from the first to the third column show
that our algorithm adapts to environments with varying degrees of darkness. Even in the extreme
exposure condition (750x less exposure), our method can still restore acceptable details.
Electronics 2020, 9, 1011 12 of 15

Figure 9. The restoration results in different brightness environments. The first line shows the
restoration results of the test images with different black levels by our algorithm. The second line
shows the details of the results. The bottom two lines show the results handled by traditional methods.
The columns represent the different original brightness levels.
In addition to the subjective analysis, several objective IQA methods were adopted to evaluate image quality. The quantitative analysis of the effect of different exposures is shown in Table 6. It is clear that our method surpasses the traditional approach.

Table 6. The IQA of the recovery results for images with different exposures. In the first row, "n×" (where "n" is a number) indicates an n-fold reduction of the exposure amount relative to normal exposure. The first column indicates the assessment method and restoration algorithm; the traditional method is abbreviated as TM. Bold numbers represent the best results in the corresponding row. Underlined numbers indicate the better result between the two algorithms.

Assessment Methods (Restoration Algorithm)   50×       100×      200×      400×      750×
ENIQA [39] (Ours)                            0.0694    0.0701    0.0800    0.1369    0.1856
ENIQA (TM)                                   0.2583    0.2318    0.2793    0.4141    0.5599
IL-NIQE [40] (Ours)                          36.5769   38.7293   47.2164   58.6588   71.5074
IL-NIQE (TM)                                 50.1786   50.9244   62.1660   80.4942   96.9210
NIQE [41] (Ours)                             6.1894    5.9605    6.0204    6.7881    7.0984
NIQE (TM)                                    7.5423    7.5152    8.4179    8.6997    9.1467
SSEQ [42] (Ours)                             2.4008    4.0506    4.6036    9.4805    13.3380
SSEQ (TM)                                    21.6355   22.9971   19.5719   17.6767   15.8840

4.4. Implementation Details


In our study, the input images are RAW images with the size H × W × 1, where H and W denote the height and the width and are equal to 2832 and 4256, respectively. The input images were packed into four channels, with each spatial dimension correspondingly cut in half; thus, the size of the packed data was 0.5H × 0.5W × 4. The packed data was fed into a neural network. Because the U-net network has residual connections [43,44] and allows the full-resolution image to fit in GPU memory, we selected a network with an architecture similar to U-net [10]. The output of the deep learning network was a 12-channel image with the size 0.5H × 0.5W × 12. Lastly, the 12-channel image was processed by a sub-pixel layer to recover the RGB image with the size H × W × 3.
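The packing and sub-pixel steps can be summarized by the following TensorFlow sketch. It is a simplified illustration under the assumption of a standard 2 × 2 Bayer layout, not the authors' exact implementation, and the placeholder tensors only demonstrate the shapes involved.

```python
# Illustrative sketch of the packing / unpacking described above (assumed Bayer layout).
import tensorflow as tf

def pack_raw(raw):
    # raw: H x W x 1 Bayer mosaic -> (H/2) x (W/2) x 4 by grouping the four
    # pixels of each 2 x 2 Bayer block into separate channels.
    return tf.concat([raw[0::2, 0::2, :],
                      raw[0::2, 1::2, :],
                      raw[1::2, 0::2, :],
                      raw[1::2, 1::2, :]], axis=-1)

def unpack_output(net_out):
    # net_out: batch x (H/2) x (W/2) x 12 -> batch x H x W x 3 RGB image
    # via a sub-pixel (depth-to-space) layer.
    return tf.nn.depth_to_space(net_out, block_size=2)

raw = tf.zeros([2832, 4256, 1])           # Nikon D700 frame size used in the paper
packed = pack_raw(raw)                    # shape (1416, 2128, 4), fed to the U-net
net_out = tf.zeros([1, 1416, 2128, 12])   # placeholder for the 12-channel network output
rgb = unpack_output(net_out)              # shape (1, 2832, 4256, 3)
```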
Our implementation was based on TensorFlow and Python. In all of our experiments, we used the L1 loss, the Adam optimizer [45], and the Leaky ReLU (LReLU) [46] activation function. We trained the network on Nikon D700 camera images. The initial learning rate was set to 0.0001, the weight decay to 0.00001, and the dampening to 0. The learning rate decreased according to a cosine function. Based on the practical effect observed in the experiments, we set the number of training epochs to 4000.
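A hedged sketch of this training setup is given below. The model constructor and data pipeline are placeholders, and the weight decay term is omitted for brevity, so it should be read as an approximation of the configuration rather than the authors' exact code.

```python
# Illustrative training configuration: L1 loss, Adam, cosine-decayed learning rate.
import tensorflow as tf

EPOCHS = 4000
STEPS_PER_EPOCH = 20  # 20 training image pairs

lr_schedule = tf.keras.optimizers.schedules.CosineDecay(
    initial_learning_rate=1e-4, decay_steps=EPOCHS * STEPS_PER_EPOCH)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
l1_loss = tf.keras.losses.MeanAbsoluteError()

@tf.function
def train_step(model, packed_dark, ground_truth):
    with tf.GradientTape() as tape:
        restored = model(packed_dark, training=True)  # U-net-like network with LReLU
        loss = l1_loss(ground_truth, restored)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```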

5. Discussion
In this work, we shared a new image-collection method that can be used for future research on machine learning. With the help of a high-quality dataset, our algorithm was trained with only 20 image pairs and achieved inspiring results in the restoration of extremely low-light images. The method can be applied to most supervised learning tasks.
In the future, we will try to improve our work on the following points. (1) It is known that Restricted Boltzmann Machines (RBMs) [47] can be used to preprocess data and make the machine learning process more efficient; an NL2LL model based on RBMs might provide even better results. (2) Improved U-net networks can be used to boost performance. (3) The generalization performance of the method still needs further study. In our study, the recovered images were poor when the brightness of the test images had to be magnified more than 400 times; prior information about the environmental light could be supplied to the algorithm. (4) The shooting environment can be extended to more complex scenarios, such as dark and blurred conditions or dark and scattering conditions. (5) The method can be applied to low-movement (spatiotemporal) data and 3D objects (3D geometry).
Our research is of great significance in areas such as underwater robotics and surveillance.

6. Conclusions
To see in the extreme dark, we have proposed a new method, NL2LL (collecting a low-light dataset under normal-light conditions), for collecting image pairs. The method has many potential applications in convolutional neural networks, dilated convolutional NNs, regression, and graphical models. Our end-to-end approach is simple and highly effective, and we have demonstrated its efficacy in low-light image restoration. The experiments show that our approach achieves inspiring results using only 20 image pairs.

Author Contributions: All authors contributed to the paper. H.W. performed project administration; Y.X.
conceived, designed and performed the experiments; Y.X. and G.D.C. wrote and reviewed the paper; S.R.
performed the experiments; W.S. analyzed the data. All authors have read and agreed to the published version of
the manuscript.
Funding: This research was funded by National Science Foundation of China (NSFC) grant No. 61571369. It was
also funded by Zhejiang Provincial Natural Science Foundation (ZJNSF) grant No.LY18F010018. It was also
supported by the 111 Project under Grant No. B18041.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Hu, Z.; Cho, S.; Wang, J.; Yang, M.H. Deblurring low-light images with light streaks. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 24–27 June 2014;
pp. 3382–3389.
2. Remez, T.; Litany, O.; Giryes, R.; Bronstein, A.M. Deep Convolutional Denoising of Low-Light Images.
arXiv 2017, arXiv:1701.01687.
3. Zhang, X.; Shen, P.; Luo, L.; Zhang, L.; Song, J. Enhancement and noise reduction of very low light level images.
In Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), Tsukuba, Japan,
11–15 November 2012.
4. Plotz, T.; Roth, S. Benchmarking denoising algorithms with real photographs. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1586–1595.
5. Wang, R.; Zhang, Q.; Fu, C.W.; Shen, X.; Zheng, W.S.; Jia, J. Underexposed Photo Enhancement using
Deep Illumination Estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern
Recognition, Long Beach, CA, USA, 16–20 June 2019.
6. Hasinoff, S.W.; Sharlet, D.; Geiss, R.; Adams, A.; Barron, J.T.; Kainz, F.; Chen, J.; Levoy, M. Burst photography
for high dynamic range and low-light imaging on mobile cameras. ACM Trans. Graph. 2016. [CrossRef]
7. Liu, Z.; Yuan, L.; Tang, X.; Uyttendaele, M.; Sun, J. Fast burst images denoising. ACM Trans. Graph. 2014.
[CrossRef]
8. Mildenhall, B.; Barron, J.T.; Chen, J.; Sharlet, D.; Ng, R.; Carroll, R. Burst Denoising with Kernel
Prediction Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
Salt Lake City, UT, USA, 18–22 June 2018. [CrossRef]
9. Chen, C.; Chen, Q.; Xu, J.; Koltun, V. Learning to See in the Dark. In Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018. [CrossRef]
10. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation.
In International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Cham,
Switzerland, 2015; pp. 234–241. [CrossRef]
11. Everingham, M.; van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The pascal visual object classes (VOC)
challenge. Int. J. Comput. Vis. 2010. [CrossRef]
12. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.;
Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. Int. J. Comput. Vis. 2015. [CrossRef]
13. Cheng, M.M.; Zhang, Z.; Lin, W.Y.; Torr, P. BING: Binarized normed gradients for objectness estimation
at 300fps. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,
Columbus, OH, USA, 23–28 June 2014. [CrossRef]
14. Lore, K.G.; Akintayo, A.; Sarkar, S. LLNet: A deep autoencoder approach to natural low-light image
enhancement. Pattern Recognit. 2017. [CrossRef]

15. Xu, J.; Li, H.; Liang, Z.; Zhang, D.; Zhang, L. Real-World Noisy Image Denoising: A New Benchmark.
arXiv 2018, arXiv:1804.02603.
16. Wei, C.; Wang, W.; Yang, W.; Liu, J. Deep Retinex Decomposition for Low-Light Enhancement.
arXiv 2018, arXiv:1808.04560.
17. Loh, Y.P.; Chan, C.S. Getting to know low-light images with the Exclusively Dark dataset. Comput. Vis.
Image Underst. 2019. [CrossRef]
18. Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Phys. D
Nonlinear Phenom. 1992. [CrossRef]
19. Mairal, J.; Bach, F.; Ponce, J.; Sapiro, G.; Zisserman, A. Non-local sparse models for image restoration.
In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision, Kyoto, Japan,
29 September–2 October 2009. [CrossRef]
20. Dabov, K.; Foi, A.; Katkovnik, V.; Egiazarian, K. Image denoising by sparse 3-D transform-domain
collaborative filtering. IEEE Trans. Image Process. 2007. [CrossRef]
21. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep
CNN for image denoising. IEEE Trans. Image Process. 2017. [CrossRef] [PubMed]
22. Xie, J.; Xu, L.; Chen, E. Image denoising and inpainting with deep neural networks. Adv. Neural Inf.
Process. Syst. 2012. [CrossRef]
23. Łoza, A.; Bull, D.; Achim, A. Automatic contrast enhancement of low-light images based on local statistics
of wavelet coefficients. In Proceedings of the 2010 IEEE International Conference on Image Processing,
Hong Kong, China, 26–29 September 2010. [CrossRef]
24. Burger, H.C.; Schuler, C.J.; Harmeling, S. Image denoising: Can plain neural networks compete with BM3D?
In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA,
16–21 June 2012. [CrossRef]
25. Joshi, N.; Cohen, M.F. Seeing Mt. Rainier: Lucky imaging for multi-image denoising, sharpening, and haze
removal. In Proceedings of the 2010 IEEE International Conference on Computational Photography (ICCP),
Cambridge, MA, USA, 29–30 March 2010. [CrossRef]
26. Ignatov, A.; Kobyshev, N.; Timofte, R.; Vanhoey, K.; van Gool, L. WESPE: Weakly supervised photo enhancer
for digital cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
Workshops, Salt Lake City, UT, USA, 18–22 June 2018. [CrossRef]
27. Dong, X.; Wang, G.; Pang, Y.; Li, W.; Wen, J.; Meng, W.; Lu, Y. Fast efficient algorithm for enhancement of
low lighting video. In Proceedings of the 2011 IEEE International Conference on Multimedia and Expo,
Barcelona, Spain, 11–15 July 2011. [CrossRef]
28. Guo, X.; Li, Y.; Ling, H. LIME: Low-light image enhancement via illumination map estimation. IEEE Trans.
Image Process. 2017. [CrossRef] [PubMed]
29. LeCun, Y.; Boser, B.; Denker, J.S.; Henderson, D.; Howard, R.E.; Hubbard, W.; Jackel, L.D. Backpropagation
Applied to Handwritten Zip Code Recognition. Neural Comput. 1989, 1, 541–551. [CrossRef]
30. le Cun, Y.; Fogelman-Soulié, F. Modèles connexionnistes de l’apprentissage, Intellectica. Rev. l’Association
Pour Rech. Cogn. 1987. [CrossRef]
31. Hinton, G.E.; Zemel, R.S. Autoencoders, minimum description length and Helmholtz free energy. Adv. Neural
Inf. Process. Syst. 1994. [CrossRef]
32. Poggio, T.; Torre, V.; Koch, C. Computational vision and regularization theory. Nature 1985. [CrossRef]
33. Friedman, J.H. Regularized discriminant analysis. J. Am. Stat. Assoc. 1989. [CrossRef]
34. Vincent, P.; Larochelle, H.; Bengio, Y.; Manzagol, P.A. Extracting and composing robust features with
denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning, Helsinki,
Finland, 5–9 July 2008. [CrossRef]
35. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
36. Li, M.; Liu, J.; Yang, W.; Sun, X.; Guo, Z. Structure-Revealing Low-Light Image Enhancement Via Robust
Retinex Model. IEEE Trans. Image Process. 2018. [CrossRef]
37. Ren, X.; Li, M.; Cheng, W.H.; Liu, J. Joint enhancement and denoising method via sequential decomposition.
In Proceedings of the 2018 IEEE International Symposium on Circuits and Systems (ISCAS), Florence, Italy,
27–30 May 2018; pp. 1–5.
38. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to
structural similarity. IEEE Trans. Image Process. 2004. [CrossRef] [PubMed]

39. Chen, X.; Zhang, Q.; Lin, M.; Yang, G.; He, C. No-reference color image quality assessment: From entropy to
perceptual quality. EURASIP J. Image Video Process. 2019. [CrossRef]
40. Zhang, L.; Zhang, L.; Bovik, A.C. A feature-enriched completely blind image quality evaluator. IEEE Trans.
Image Process. 2015. [CrossRef] [PubMed]
41. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal
Process. Lett. 2013. [CrossRef]
42. Liu, L.; Liu, B.; Huang, H.; Bovik, A.C. No-reference image quality assessment based on spatial and spectral
entropies. Signal Process. Image Commun. 2014. [CrossRef]
43. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the
2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016.
[CrossRef]
44. Xu, Y.; Wang, H.; Liu, X. An improved multi-branch residual network based on random multiplier and
adaptive cosine learning rate method. J. Vis. Commun. Image Represent. 2019, 59, 363–370. [CrossRef]
45. Kingma, D.P.; Ba, J.L. Adam: A method for stochastic optimization. arXiv 2015, arXiv:1412.6980.
46. Xu, B.; Wang, N.; Chen, T.; Li, M. Empirical Evaluation of Rectified Activations in Convolutional Network.
arXiv 2015, arXiv:1505.00853.
47. Salakhutdinov, R.; Mnih, A.; Hinton, G. Restricted Boltzmann machines for collaborative filtering.
In Proceedings of the 24th International Conference on Machine Learning, Corvallis, OR, USA, 20 June 2007;
pp. 791–798.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
