visible artifacts. On the other hand, optimization strategies such as learned simultaneous sparse coding (LSSC) and sparsity and adaptive PCA (SAPCA) based algorithms were shown to greatly improve image quality compared to that delivered by interpolation methods, but unfortunately are computationally heavy. In this paper we propose ‘efficient regression priors (ERP)’ as a novel, fast post-processing algorithm that learns the regression priors offline from training data. We also propose an independent efficient demosaicing algorithm based on directional difference regression (DDR), and introduce its enhanced version based on fused regression (FR). We achieve an image quality comparable to that of state-of-the-art methods for three benchmarks, while being order(s) of magnitude faster.

Index Terms—Demosaicing, Color filter array, Super-resolution, Image Enhancement, Linear Regression.

[Figure: scatter plot of PSNR (dB) versus running time (s) on IMAX, comparing BILINEAR, HQL, AP, AHD, PCSD, DLMMSE, LPAICI, GBTF, MSG, LSSC and the proposed DDR, FR and ERP.]
Fig. 2: Our proposed methods (DDR, FR and ERP) provide the best average demosaicing quality with low time complexity, on the IMAX dataset. Details are given in Section V.
1057-7149 (c) 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2016.2574984, IEEE
Transactions on Image Processing
the best, through an optimal statistical decision or inference process.

GBTF & MSG. Pekkucuksen and Altunbasak propose the gradient-based threshold-free (GBTF) method [12] and an improved version, the multiscale gradients-based (MSG) [13] color filter array interpolation. GBTF addresses certain limitations of DLMMSE by introducing gradients of color differences to compute weights for the west, east, north and south directions. MSG further applies multiscale color gradients to adaptively combine color estimates from different directions.

MLRI. Incorporating the idea from GBTF, Kiku et al. [14] propose minimized-Laplacian residue interpolation (MLRI). They estimate the tentative pixel values by minimizing the Laplacian energies of the residuals.

AVSC. Zhang et al. [15] propose a robust color demosaicing method with adaptation to varying spectral correlations (AVSC). AVSC is a hybrid approach which combines an existing color demosaicing algorithm such as DLMMSE [6] with an adaptive intraband interpolation.

LDINAT. Zhang et al. [16] derive a color demosaicing method by local directional interpolation and nonlocal adaptive thresholding (LDINAT) and exploit the non-local image redundancy to enhance the local color results.

Besides interpolation based methods, the demosaicing problem is also tackled with optimization-based methods.

AP. For optimization, Gunturk et al. [17] iteratively exploit inter-channel correlation in an alternating-projections scheme (AP). After initial estimation, intermediate results are projected onto two constraint sets, which are determined by the observed data and prior information on spectral correlation.

AHD. Hirakawa et al. [18] propose an adaptive homogeneity-directed demosaicing algorithm (AHD). AHD employs metric neighborhood modeling and filter bank interpolation in order to determine the interpolation direction and cancel aliasing, followed by artifact reduction iterations.

LSSC. Mairal et al. [19] derive a learned simultaneous sparse coding method (LSSC) for both denoising and demosaicing. Essentially, they unify two steps – dictionary learning adapted to sparse signal description and exploiting the self-similarities of images – into LSSC.

SAPCA. Last but not least, Gao et al. [20] propose the sparsity and adaptive principal component analysis (PCA) based algorithm (SAPCA) by solving a minimization problem, i.e. by minimizing an $l_1$ function that contains sparsity and PCA terms.

We observe that most methods do not perform consistently on the IMAX and Kodak datasets (see Fig. 7), which are the two most commonly used datasets for testing demosaicing algorithms. When they perform well on Kodak, they tend to be less convincing on IMAX. Of course, part of the reason is that the study of Kodak has a longer history than that of IMAX, and the images in IMAX seem to be more challenging to reconstruct. LSSC and SAPCA report the best performances on the Kodak dataset and SAPCA substantially outperforms all other methods on the IMAX dataset. Yet, both methods come with a high computational cost.

In this paper, we propose an efficient post-processing step that can be combined with all aforementioned demosaicing methods, and boost their performance. Of particular interest is its combination with the fastest ones, as this leads to state-of-the-art performance at high speed. On top of that, we also propose modifications that go beyond sheer post-processing and that further improve the results.

Our post-processing step is coined ‘efficient regression priors method’ (ERP). For a given demosaicing method, ERP learns offline linear regressors for the residuals between demosaiced training images and the ground truth, and then applies them to the output of the demosaicing method at runtime. ERP is inspired by adjusted anchored neighborhood regression (A+) [21], [22], a state-of-the-art method in image super-resolution. Farsiu et al. [23] were among the first to observe the connection between super-resolution and demosaicing. ERP as a sheer post-processing step has already been introduced in our previous paper [24]. Here we add two further refined versions for fast demosaicing, one based on directional difference regression (DDR) and the other on fused regression (FR). DDR and FR integrate MLRI and ERP beyond simply post-processing the demosaiced images. Motivated by MLRI, we fully explore the correlation between channels by training directional differences. As a result, our methods reduce the color artifacts and achieve state-of-the-art performance comparable to that of LSSC/SAPCA, but at running times that are order(s) of magnitude lower (see Fig. 2).

Our paper is organized as follows. Section II briefly reviews MLRI and A+, as both underlie our methods. Section III introduces our proposed post-processing method, ERP. Section IV further introduces our novel demosaicing methods DDR and FR. In Section V, we discuss the choices of parameters and the experimental results. Finally, we conclude the paper in Section VI.

II. REVIEW OF MLRI AND A+

This section briefly reviews the two major sources of inspiration for our proposed methods: the MLRI demosaicing method [14] and the A+ super-resolution method [21].

A. Minimized-Laplacian Residue Interpolation (MLRI)

The MLRI method of Kiku et al. [14] is mainly motivated by the GBTF method of Pekkucuksen et al. [12]. MLRI includes two stages (see Fig. 3). Let $G_{x,y}$ and $R_{x,y}$ denote the raw values at position $(x, y)$ for the green and red channels, respectively.

First stage.¹ MLRI estimates the missing G values at locations with R information, as well as the R values at locations with G information, through linear interpolation. Assuming the raw value $G_{i,j}$ or $R_{i,j}$ is missing, we have

$$G^H_{i,j} = (G_{i,j-1} + G_{i,j+1})/2, \quad R^H_{i,j} = (R_{i,j-1} + R_{i,j+1})/2. \quad (1)$$

Next, after computing the horizontal Laplacian of the tentative R and G estimations by the 1D-filter

$$F_{1D} = [-1 \;\; 0 \;\; 2 \;\; 0 \;\; -1], \quad (2)$$

¹ Here we only discuss the estimation of the G values at R positions in the horizontal direction; G values at B positions are handled similarly.
MLRI uses a modified version of guided image filters (GIF) [25] to obtain intermediate G values, meaning that $F_{1D} R^H$ is treated as the guided Laplacian for $F_{1D} G^H$, so that the dilation coefficient $a_{i,j}$ is obtained,

$$a_{i,j} = \frac{\frac{1}{|\omega|} \sum_{(m,n) \in \omega_{i,j}} (F_{1D} R^H_{m,n})(F_{1D} G^H_{m,n})}{\sigma^2_{i,j} + \epsilon}, \quad (3)$$

where $\omega_{i,j}$ is a local image patch centered at pixel $(i, j)$, $|\omega|$ is the number of pixels in $\omega_{i,j}$, $\sigma^2_{i,j}$ is the variance of $F_{1D} R^H$ in $\omega_{i,j}$, and $\epsilon$ is a regularization parameter.

The translation coefficient $b_{i,j}$ is obtained as follows,

$$b_{i,j} = \bar{G}^H_{i,j} - a_{i,j} \bar{R}^H_{i,j}, \quad (4)$$

where $\bar{G}^H_{i,j}$ and $\bar{R}^H_{i,j}$ are the mean values of $G^H$ and $R^H$ in $\omega_{i,j}$. The intermediate G value $\check{G}^H_{i,j}$ is further determined,

$$\check{G}^H_{i,j} = \frac{1}{|\omega|} \sum_{(k,l) \in \omega_{i,j}} (a_{k,l} R^H_{i,j} + b_{k,l}). \quad (5)$$

Under the assumption that the residues vary linearly in a small area, the smoothed residues $\Delta^H_g$ are estimated by linear interpolation

$$\Delta^H_g(i,j) = (G_{i,j-1} - \check{G}^H_{i,j-1})/2 + (G_{i,j+1} - \check{G}^H_{i,j+1})/2. \quad (6)$$

Correspondingly, the horizontally enhanced G values at the R locations are acquired by adding the tentative values $\check{G}^H$ and the interpolated residuals $\Delta^H_g$. To get the other enhanced R, B values at different positions MLRI applies the same modified GIF.

Second stage. It starts with computing the tentative horizontal/vertical (h/v) color differences (G-R, G-B) $\tilde{\Delta}^{H,V}_{g,r/b}$,

$$\tilde{\Delta}^{H,V}_{g,r/b}(i,j) = \begin{cases} \tilde{G}^{H,V}_{i,j} - R_{i,j} & \text{G is interpolated at R,} \\ \tilde{G}^{H,V}_{i,j} - B_{i,j} & \text{G is interpolated at B,} \\ G_{i,j} - \tilde{R}^{H,V}_{i,j} & \text{R is interpolated,} \\ G_{i,j} - \tilde{B}^{H,V}_{i,j} & \text{B is interpolated,} \end{cases} \quad (7)$$

where $\tilde{G}^{H,V}_{i,j}$, $\tilde{R}^{H,V}_{i,j}$ and $\tilde{B}^{H,V}_{i,j}$ are the above enhanced horizontal/vertical values. Then the color differences $\tilde{\Delta}_{g,r/b}$ are weighted and improved as

$$\tilde{\Delta}_{g,r/b}(i,j) = \{ \omega_s F_G \tilde{\Delta}^V_{g,r/b}(i-4:i, j) + \omega_n F_G \tilde{\Delta}^V_{g,r/b}(i:i+4, j) + \omega_w \tilde{\Delta}^H_{g,r/b}(i, j-4:j) F^T_G + \omega_e \tilde{\Delta}^H_{g,r/b}(i, j:j+4) F^T_G \} / \omega_t, \quad (8)$$

where $F_G$ is the Gaussian weighted averaging filter

$$F_G = [0.56 \;\; 0.35 \;\; 0.08 \;\; 0.01 \;\; 0], \quad (9)$$

$\omega_{n,s,e,w}$ are computed by color difference gradients and $\omega_t$ is the sum of $\omega_{n,s,e,w}$. Eventually, G values at R locations are obtained by

$$\tilde{G}_{i,j} = R_{i,j} + \tilde{\Delta}_{g,r/b}(i,j). \quad (10)$$

A similar derivation holds for the G values at B locations.

As to the R channel, MLRI computes the Laplacian of R and G values with the 2D-filter

$$F_{2D} = \begin{bmatrix} 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 4 & 0 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{bmatrix}. \quad (11)$$

Again, the modified GIF is applied. The R channel is guided by the $\tilde{G}_{i,j}$ values. In the end, the output R values are enhanced by smoothing the residues as Eq. (6) indicates. The B channel goes through exactly the same process.

B. Adjusted Anchored Neighborhood Regression (A+)

A+, proposed by Timofte et al. [21], derives from and greatly enhances the performance of Anchored Neighborhood Regression (ANR) [26] for image super-resolution tasks. The algorithm contains two important stages:

Offline Stage. A+ uses Zeyde et al.'s algorithm [27] as a starting point, which trains a sparse dictionary from millions of low resolution (LR) patches collected from 91 training images [28]. To begin with, the LR images in the YCbCr color space are scaled up to the size of the output high-resolution (HR) images by bicubic interpolation. In the next step, the upscaled LR image $y_l$ is filtered by the first- and second-order gradients, and features $\{p^k_l\}_k$ corresponding to LR patches of size $3 \times 3$ are collected accordingly. A+ projects them onto a low-dimensional subspace by PCA, discarding 0.1% of the energy. When it comes to the training, K-SVD [29], an iterative method that alternates between sparse coding of the examples and updating the dictionary atoms, is applied to solve the following optimization problem

$$D_l, \{q_k\} = \underset{D_l, \{q_k\}}{\operatorname{argmin}} \sum_k \| p^k_l - D_l q_k \|^2 \;\; \text{s.t.} \;\; \|q_k\|_0 \leq L \;\; \forall k, \quad (12)$$
where $\{p^k_l\}_k$ are the training LR feature vectors, $q_k$ are the coefficient vectors and $D_l$ is the LR training dictionary. The training process of A+ goes through 20 iterations of the K-SVD algorithm, with 1024 atoms in the dictionary, and allocating $L = 3$ atoms per coefficient vector.

Instead of optimizing the reconstruction of high resolution (HR) patches at runtime, A+ uses offline trained anchored regressors to directly obtain them. More specifically, A+ uses the atoms of the trained dictionary $D_l$ as anchors for the surrounding neighbourhood and the corresponding LR to HR patch regressor. A+ collects 5 million pairs of corresponding LR features and HR patches from a scaled pyramid of the 91 training images. For each anchor point (LR atom), A+ retrieves the nearest $n = 2048$ training samples. Due to the $l_2$-norm used in Eq. (12) the distance between the atom and its neighbor is also Euclidean, and all of the 5 million candidates are normalized by the $l_2$-norm. Then for an arbitrary input LR feature $y$, A+ solves

$$\min_{\delta} \{ \| y - S_{l,y} \delta \|^2 + \lambda \| \delta \|^2 \}, \quad (13)$$

where $S_{l,y}$ is the matrix of the 2048 nearest neighbors anchored to the atom $d_y$ and $\lambda$ is set to 0.1. ‘Nearest’ is measured by correlation. The algebraic solution of Eq. (13) is

$$\delta = P_{l,y} y, \quad P_{l,y} = (S^T_{l,y} S_{l,y} + \lambda I)^{-1} S^T_{l,y}, \quad (14)$$

where $I$ is the unit matrix. As to the HR training images, the first thing is to remove the low frequencies by subtracting the upscaled corresponding LR image. Then, A+ collects 5 million such $3 \times 3$ HR patches, each corresponding to an LR patch. The HR patch values are further normalized by the corresponding $l_2$-norm of the LR patch features. The anchored regressor $P_y$ corresponding to the atom $d_y$ is precomputed offline,

$$P_y = S_{h,y} P_{l,y} = S_{h,y} (S^T_{l,y} S_{l,y} + \lambda I)^{-1} S^T_{l,y}, \quad (15)$$

where $S_{h,y}$ contains the 2048 HR patches corresponding to the LR features in $S_{l,y}$.

Online Stage. During this stage, the testing LR image is (as done in the offline stage) firstly scaled up to the target size by bicubic interpolation, and the first- and second-order horizontal and vertical finite differences are calculated. After extracting the LR patch features (PCA projected), A+ searches the atom $d_j$ in $D_l$ with the highest correlation to each input LR feature $y_j$, and the residual HR patch $x_j$ without low frequencies is obtained by multiplication of the regressor $P_j$ anchored to $d_j$ with $y_j$,

$$x_j = P_j y_j. \quad (16)$$

Subsequently the low frequencies are added. The HR patches are combined by averaging in the overlapping areas to complete the output HR image.

III. EFFICIENT REGRESSION PRIORS (ERP)

Our ERP method is inspired by the A+ method introduced for image super-resolution. As a post-processing step, ERP has two major strengths. Firstly, it is capable of improving the results of many demosaicing methods; especially MLRI+ERP combines low time complexity and good performance. Secondly, ERP trains a dictionary and regressors offline and, thus, allows for low computational times during testing. In the following we describe how ERP (see Fig. 4) is derived and used for the post-processing of demosaiced images. ERP goes through two stages just as A+ does.

Offline stage. ERP is trained using 100 high quality images collected from the Internet for post-processing the results of a selected demosaicing method. The demosaicing method reconstructs the LR image. In the CPCA (Collection + PCA) step (see Fig. 4), the first- and second-order finite differences of the G channel are extracted for the LR training images, in both the h/v directions,

$$F_{1h} = [1 \;\; -1] = F^T_{1v}, \quad F_{2h} = [1 \;\; -2 \;\; 1]/2 = F^T_{2v}, \quad (17)$$

so that we can keep information on edges and mosaic artifacts, and train a dictionary adapted to a specific demosaicing method. Small $3 \times 3$ regions at the same position of the filtered G channel are collected and concatenated to form one vector (feature), along with PCA dimensionality reduction with 99.9% preserved energy. After repeating the process for the R and B channels, the input vector $v^k_l$ is eventually formed
by three features at the same position of the RGB channels. This process is called the CPCA step in Fig. 4.

Later, ERP applies the K-SVD [29] method as in [21], [26], [27] to train an LR dictionary $W_l$ with 4096 atoms:

$$W_l = \underset{W_l, \{c_k\}}{\operatorname{argmin}} \sum_k \| v^k_l - W_l c_k \|^2 \;\; \text{s.t.} \;\; \|c_k\|_0 \leq N \;\; \forall k, \quad (18)$$

IV. DIRECTIONAL DIFFERENCE REGRESSION (DDR) AND FUSED REGRESSION (FR)

In this section, we make a couple of observations on the MLRI and ERP methods and introduce our proposed independent demosaicing methods, DDR and FR.
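The offline machinery shared by A+, ERP and the directional regressors of this section reduces to the same closed-form ridge regression per anchor (Eqs. (14)-(15), reused in Eq. (24)). The following is a minimal NumPy sketch with toy sizes; the function and variable names are illustrative, not the authors' code. Columns of `S_l` / `S_h` stand for the anchored LR / HR neighbors.

```python
import numpy as np

def anchored_regressor(S_l, S_h, lam=0.1):
    """Return P = S_h (S_l^T S_l + lam*I)^-1 S_l^T, the precomputed
    regressor that maps an input LR feature y directly to its HR
    residual, x = P @ y (cf. Eq. (16))."""
    n = S_l.shape[1]  # number of anchored neighbors (columns)
    return S_h @ np.linalg.inv(S_l.T @ S_l + lam * np.eye(n)) @ S_l.T
```

At test time each input feature is matched to its most correlated dictionary atom and multiplied by that atom's precomputed regressor, so the online cost is a nearest-anchor search plus one matrix-vector product.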
pyramids of LR training images, selecting the 2048 LR differences most correlated to an atom as its anchored neighbors. Correspondingly, we collect 5 million HR differences without low frequencies. Then, we compute the directional difference regressor $P_d$ anchored to an atom $d$,

$$P_d = S_{h,d} (S^T_{l,d} S_{l,d} + \lambda I)^{-1} S^T_{l,d}, \quad (24)$$

where $S_{h,d}$ is the matrix of the 2048 HR differences corresponding to the LR differences in $S_{l,d}$ and $\lambda$ is 0.1. Finally, a tentative color difference in the input image is improved by the offline computed regressors as follows

$$y_{h,j} = \lambda_1 P_j y_{l,j}, \quad (25)$$

where $P_j$ is the regressor anchored to the atom with the highest correlation to the LR difference $y_{l,j}$ and $\lambda_1$ is the regressor correction parameter. By adding the HR differences to the channel-shared low frequencies we have the enhanced color differences $\hat{\Delta}^H_{g,r/b}$.

(ii) Now we come to the stage of color difference updating. Based on MLRI, we compute the weights $w_{n,s,e,w}$ and the sum of the weights $w_t = w_n + w_s + w_e + w_w$, where

$$w_e = 1 \Big/ \Big( \sum_{a=i-1}^{i+1} \sum_{b=j}^{j+2} D^H_{a,b} \Big)^2, \quad w_w = 1 \Big/ \Big( \sum_{a=i-1}^{i+1} \sum_{b=j-2}^{j} D^H_{a,b} \Big)^2, \quad w_s = 1 \Big/ \Big( \sum_{a=i}^{i+2} \sum_{b=j-1}^{j+1} D^V_{a,b} \Big)^2, \quad w_n = 1 \Big/ \Big( \sum_{a=i-2}^{i} \sum_{b=j-1}^{j+1} D^V_{a,b} \Big)^2, \quad (26)$$

where $D^H_{i,j}$ and $D^V_{i,j}$ are given by

$$D^H_{i,j} = \| \hat{\Delta}^H_{g,r/b}(i, j+1) - \hat{\Delta}^H_{g,r/b}(i, j-1) \|, \quad D^V_{i,j} = \| \hat{\Delta}^V_{g,r/b}(i+1, j) - \hat{\Delta}^V_{g,r/b}(i-1, j) \|. \quad (27)$$

We observe that MLRI uses the neighbor color differences weighted by a Gaussian filter (Eq. (9)) to alleviate the error of the color differences in the center, and the neighbor size is

Finally, the color difference is updated as follows

$$\Delta_{g,r/b}(i,j) = \{ w_s F_k * \hat{\Delta}^V_{g,r/b}(i-k+1:i, j) + w_n F_k * \hat{\Delta}^V_{g,r/b}(i:i+k-1, j) + w_w \hat{\Delta}^H_{g,r/b}(i, j-k+1:j) * F^T_k + w_e \hat{\Delta}^H_{g,r/b}(i, j:j+k-1) * F^T_k \} / w_t, \quad (29)$$

where $\hat{\Delta}^V_{g,r/b}$ is the vertically enhanced color difference. By adding the ground truth R/B values and $\Delta_{g,r/b}$ we obtain the updated G values.

(iii) When it comes to the R channel, we apply the Laplacian filter (Eq. (11)) to obtain the tentative R values $R^1$, as well as the horizontal and vertical first-order difference filters $F_h$, $F_v$,

$$F_h = [-1 \;\; 0 \;\; 0 \;\; 0 \;\; 1] = F^T_v, \quad (30)$$

and get the h/v R values $R^2$ and $R^3$. The above processes yield residues for the raw values of the R channel. For $k = 1, 2, 3$ consider

$$\epsilon_k(i,j) = \begin{cases} R_{i,j} - R^k_{i,j} & \text{if } R_{i,j} \text{ is the raw value,} \\ 0 & \text{otherwise.} \end{cases} \quad (31)$$

After using the bilinear filter

$$F_b = \begin{bmatrix} 0.25 & 0.5 & 0.25 \\ 0.5 & 1 & 0.5 \\ 0.25 & 0.5 & 0.25 \end{bmatrix}, \quad (32)$$

we have the enhanced estimations of the R values

$$\tilde{R}^k = \lambda_2 F_b * \epsilon_k + R^k, \quad \text{for } k \in \{1, 2, 3\}, \quad (33)$$

where $\lambda_2$ is the residue correction parameter. Thus, we have the Laplacian updated R value $\tilde{R}^1$. With the help of the previous weights $w_{n,s,e,w}$, we obtain the gradient updated color difference for $\tilde{R}^2$ and $\tilde{R}^3$,

$$\Delta_r(i,j) = \{ w_n \tilde{\Delta}^3_r(i-1,j) + w_s \tilde{\Delta}^3_r(i+1,j) + w_e \tilde{\Delta}^2_r(i,j-1) + w_w \tilde{\Delta}^2_r(i,j+1) \} / w_t, \quad (34)$$
Fig. 6: PSNR versus number of training LR/HR differences, atoms/regressors, training images, and region size for the IMAX dataset.

by averaging $\tilde{R}^1$, $\tilde{R}^2$ and $\tilde{R}^3$. For the B values we follow the same process.
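The directional weights of Eqs. (26)-(27) can be sketched in a few lines of NumPy; `dh` and `dv` are toy stand-ins for the gradient maps $D^H$ and $D^V$, the function name is illustrative, and the slice bounds mirror the summation limits of Eq. (26).

```python
import numpy as np

def directional_weights(dh, dv, i, j):
    """Inverse squared sums of color-difference gradients on each side
    of pixel (i, j), following the summation limits of Eq. (26)."""
    we = 1.0 / dh[i-1:i+2, j:j+3].sum() ** 2    # east:  a=i-1..i+1, b=j..j+2
    ww = 1.0 / dh[i-1:i+2, j-2:j+1].sum() ** 2  # west:  a=i-1..i+1, b=j-2..j
    ws = 1.0 / dv[i:i+3, j-1:j+2].sum() ** 2    # south: a=i..i+2,  b=j-1..j+1
    wn = 1.0 / dv[i-2:i+1, j-1:j+2].sum() ** 2  # north: a=i-2..i,  b=j-1..j+1
    return wn, ws, we, ww, wn + ws + we + ww    # last value is w_t
```

A side with strong gradients (likely an edge crossing) sums to a large value and thus receives a small weight, so the update of Eq. (29) trusts the smoother directions.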
complementary to the standard benchmarks, we also selected 500 real world (RW) color images with RGB channels as HR images, using the Google search engine. Dozens of keywords – such as nature, landscape, people, city – yielded images from daily life. We made sure that all the categories contain a similar amount of images. Then we added a Bayer pattern mask on them to obtain LR images.

As a result, we not only consider images of high visual quality such as those in Kodak and IMAX, but also the products of everyday photography. Whereas regions with slowly varying color intensities tend to show good performance with interpolation methods, reconstructing high-quality outputs from ‘busy’ images is more difficult. Therefore, we also focus on images which are highly textured and have a rich color gamut.

ensure the relevance of our training dictionary and its anchored regressors.

Performance measures. In order to evaluate the performance of the demosaicing methods we employ the standard Peak Signal-to-Noise Ratio (PSNR), the Structural Similarity Index (SSIM) [30], the Zipper Effect Ratio (ZER) [31], and the runtime at test. All the compared methods along with our proposed methods share the same testing environment – an Intel(R) Core(TM) i7-930 @ 2.80GHz with 8 GB RAM. PSNR quantitatively measures the fidelity of the restoration in comparison with the ground truth, while SSIM measures the structural similarity with the ground truth and ZER the ratio of pixels affected by the zipper effect or edge blurring.
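For 8-bit images, the PSNR used in our evaluation reduces to $10 \log_{10}(255^2 / \mathrm{MSE})$. A small self-contained helper illustrating the measure (not the authors' evaluation pipeline, which additionally crops boundary pixels):

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images."""
    mse = np.mean((np.asarray(reference, float) - np.asarray(estimate, float)) ** 2)
    # Identical images have zero MSE, i.e. infinite PSNR.
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```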
Method     |  IMAX:  R     G     B     All   Time(s)  |  Kodak: R     G     B     All   Time(s)  |  RW:    R     G     B     All   Time(s)
BILINEAR   |  31.72 35.41 31.27 32.36    0.02  |  29.30 33.19 29.24 30.22    0.04  |  26.85 30.99 27.16 27.92    0.02
HQL        |  34.02 37.57 33.03 34.46    0.02  |  34.85 39.08 34.74 35.80    0.02  |  30.80 34.98 30.93 31.84    0.01
AHD        |  33.06 37.00 32.17 33.53   38.96  |  37.02 39.67 37.33 37.79   59.54  |  31.51 35.20 31.61 32.36   19.39
AP         |  32.85 34.92 32.01 33.06    0.46  |  38.03 41.56 38.64 39.11    0.74  |  31.72 34.79 31.78 32.49    0.25
PCSD       |  34.66 38.12 33.46 34.94    1.21  |  38.07 40.55 38.30 38.78    1.45  |  32.72 36.12 32.85 33.55    0.92
DLMMSE     |  34.06 38.00 33.04 34.49   27.13  |  39.19 42.63 39.58 40.12   42.50  |  32.81 36.88 32.89 33.72   12.84
LPAICI     |  34.40 37.87 33.30 34.73    0.61  |  39.68 43.04 39.95 40.56    1.00  |  33.01 36.80 33.02 33.86    0.33
GBTF       |  34.05 37.37 33.09 34.42   10.60  |  39.69 43.36 40.03 40.64   16.42  |  33.11 37.03 33.15 33.99    5.28
MSG        |  34.44 37.68 33.40 34.76    9.65  |  40.09 43.81 40.35 41.02   14.89  |  33.50 37.37 33.51 34.36    4.73
LDINAT     |  36.28 39.76 34.39 36.21 1523.63  |  37.02 39.47 37.15 37.72 2418.24  |  32.84 36.33 32.95 33.67  717.23
MLRI       |  36.62 40.03 35.43 36.80    0.54  |  39.33 42.92 39.63 40.24    0.74  |  33.76 37.44 33.98 34.66    0.26
LSSC       |  35.90 38.69 34.64 36.05  453.58  |  40.60 44.42 40.74 41.52  707.59  |  33.93 37.56 33.91 34.76  258.71
DDR (ours) |  37.12 40.34 35.63 37.17    6.31  |  40.18 43.92 40.39 41.10    9.60  |  34.12 37.53 34.26 34.95    3.17
FR (ours)  |  37.50 41.01 35.82 37.49   10.75  |  40.19 43.85 40.34 41.07   16.35  |  34.29 37.85 34.42 35.13    5.20

TABLE I: PSNR performance comparison of 14 demosaicing methods on 3 datasets (IMAX, Kodak, RW); within each dataset block the columns are R, G, B, All, and Time (s). The best two results are in bold.
Method     |  IMAX:  R     G     B     All  |  Kodak: R     G     B     All  |  RW:    R     G     B     All
BILINEAR   |  36.36 40.05 35.14 36.66  |  32.26 35.65 32.00 33.01  |  32.72 36.54 33.08 33.74
HQL        |  36.62 40.13 35.19 36.78  |  36.70 40.50 36.52 37.54  |  33.37 37.55 33.47 34.33
AHD        |  34.53 39.27 33.35 34.96  |  37.46 40.09 37.66 38.19  |  31.94 36.39 32.01 32.91
AP         |  36.86 40.28 35.19 36.89  |  38.82 42.65 39.07 39.80  |  33.20 37.41 33.25 34.15
PCSD       |  37.19 40.40 35.59 37.21  |  38.55 41.21 38.61 39.24  |  33.74 37.62 33.82 34.65
DLMMSE     |  37.23 40.53 35.57 37.23  |  39.57 42.88 39.77 40.41  |  33.98 38.02 34.00 34.88
LPAICI     |  37.29 40.66 35.55 37.26  |  39.94 43.18 40.01 40.73  |  33.97 37.96 33.94 34.85
GBTF       |  37.26 40.63 35.62 37.28  |  39.96 43.46 40.11 40.81  |  34.06 38.09 34.06 34.95
MSG        |  37.28 40.63 35.60 37.28  |  40.23 43.83 40.32 41.08  |  34.14 38.20 34.14 35.03
LDINAT     |  37.50 40.54 35.42 37.24  |  37.71 40.23 37.65 38.34  |  33.81 37.50 33.74 34.62
MLRI       |  37.81 40.93 36.08 37.72  |  39.60 43.17 39.80 40.47  |  34.24 38.07 34.37 35.16
LSSC       |  37.52 40.74 35.90 37.53  |  40.68 44.41 40.68 41.54  |  34.49 38.47 34.54 35.40
DDR (ours) |  38.00 40.76 36.20 37.82  |  40.25 43.91 40.40 41.13  |  34.46 38.29 34.54 35.35
FR (ours)  |  38.15 41.03 36.26 37.95  |  40.27 43.93 40.38 41.14  |  34.57 38.42 34.66 35.47

TABLE II: PSNR performance comparison after ERP post-processing on 3 datasets; within each dataset block the columns are R, G, B, All. The Time (s) column of the original table lists a single shared ERP overhead per dataset: +8.35 s (IMAX), +12.74 s (Kodak) and +4.01 s (RW). The best two results are in bold.
D. Results

In order to rule out boundary effects, we shave off 2, 4, 6 or 8 boundary pixels for all the methods. The compared methods are stable on the boundary; only for DLMMSE do we need to cut 6 boundary pixels to reach good, stable performance.

PSNR. In Tables I and II we report the best PSNR results from the above discussed 4 candidates for 14 demosaicing methods. The best two results are in bold. Our DDR and FR methods are the best on both the IMAX and RW datasets. For instance, the mean PSNR of DDR is 37.17dB on IMAX, 1.12dB higher than the state-of-the-art method LSSC, while FR reaches an improvement of 1.34dB over LSSC. At the same time (see Fig. 2), DDR is almost 80 times faster than LSSC and 250 times faster than LDINAT. The ranking is preserved on the 500 images of the RW dataset. Our proposed methods are among the fastest demosaicing methods, while performing significantly better.

SSIM and ZER. Complementary to the PSNR results, we report ZER [31] (computed in the CIELAB color space) and SSIM [30] results on each image of the IMAX and Kodak datasets (see Tables III and IV). The listed results correspond to the images in Fig. 7 from left to right and top to bottom for each dataset. Fig. 9 shows that the ZER relative ranking of the top 5 methods, together with FR + ERP, stays the same for a large range of values of the ZER threshold. The ZER values in Tables III and IV are obtained by fixing the threshold to 2.5. The results for the different measures confirm the top performance achieved by the proposed DDR/FR and ERP.

ERP post-processing. By applying the ERP post-processing step we significantly boost the performance of most methods, especially on IMAX, as shown in Table II, at the cost of some extra seconds. The improvements on the IMAX dataset vary from +0.46dB to +4.36dB (see Table II). Broadly speaking, the worse the initial demosaicing method, the larger the improvement. As to our post-processing method on MLRI, the PSNR is almost 1dB better than the original MLRI and outperforms all the other compared methods on IMAX, even when they are equipped with ERP. DDR with ERP achieves 37.82dB, while 37.88dB is the best result to date, reported for SAPCA. FR even slightly improves over SAPCA by achieving 37.95dB. FR with the ERP step takes less than 20 seconds while SAPCA costs 20 minutes. To sum up, all our proposed methods MLRI + ERP and DDR/FR + ERP achieve very good results, comparable to SAPCA on IMAX, while being significantly faster than SAPCA (by about 2 orders of magnitude).

When it comes to the Kodak dataset, the improvement of our post-processing methods is less impressive. It varies from +2.79dB for BILINEAR to as low as +0.02dB for LSSC. This is mainly due to the fact that on the Kodak dataset many methods have achieved impressive results above 40dB, and hence, there may not be much space left for further improvement. Still, our ERP methods achieve 40.47dB/40.63dB on the 24 Kodak images, 0.23dB/0.39dB higher than MLRI.

Also note that, if we apply the ERP regression training method multiple times, we repeatedly benefit, which is further confirmed by the results presented in [24].

Visual Comparison. For visual quality assessment we show a couple of image results of the top methods in Fig. 8. For example, in the image ‘Flower’ we clearly observe false
Fig. 8: Visual assessment of demosaicing results with and without ERP post-processing.
Fig. 9: Zipper Effect Ratio (ZER) vs. threshold on IMAX, for FR+ERP, FR, DDR, MLRI, LDINAT, and LSSC.

visual performance is consistent with the numerical PSNR results presented in Tables I and II. Last but not least, the visual artifacts are generally alleviated by the ERP post-processing.

E. Limitations and future work

Self-similarities. Our DDR and FR methods rely on trained priors and do not exploit the self-similarities and the particular content of the input image. LSSC does exploit self-similarities and is capable of achieving 0.4 dB higher PSNR than our methods on the Kodak dataset, but not on the IMAX and RW datasets. We believe this is caused by the particularities of the Kodak dataset with respect to the other datasets, such as larger flat regions and larger images, which better suit the LSSC method. The 16th IMAX image is the only one in that dataset on which LSSC achieves better demosaicing results than our methods; it differs from the other IMAX images by its highly regular texture content, a perfect fit for LSSC. The use of self-similarities is a direction for further performance improvement of our methods.

Design choices. All our methods (ERP, DDR, FR) closely follow the settings of the A+ super-resolution method [21]. The effect of the patch features and of the training procedure on the overall performance is unclear. If we were to train regressors specific to the different offsets with respect to the underlying mosaic pattern, further improvements are to be expected. As shown in [24], cascading the demosaicing methods, such that each cascade stage starts from the demosaicing result of the previous stage, is another direction for future research.

Time complexity. The proposed methods are highly parallelizable, but their time complexity depends linearly on the number of regressors and anchors in the dictionary. However, replacing the current linear search with a better, sublinear data search structure is rather straightforward and can lower the computation time [32].

VI. CONCLUSIONS

We propose* a novel fast demosaicing method based on directional difference regression (DDR), where the regressors are learned offline on training data, its enhanced version based on fused regression (FR), and an efficient regression priors (ERP) post-processing step. We keep the time complexity of the online stage limited by shifting the learning and the bulk of the computations to the offline stage. Thus, we achieve order(s) of magnitude lower running times at testing time than the state-of-the-art LSSC, LDINAT, and SAPCA methods. Moreover, the experimental results on various datasets demonstrate competitive performance. Last but not least, the performance of the proposed DDR and FR methods, and of any other demosaicing method, can be further improved by applying our ERP post-processing method.

*Codes available at http://www.vision.ee.ethz.ch/~timofter/

ACKNOWLEDGMENT

This work was supported by the ERC Advanced Grant VarCity (#273940).

REFERENCES

[1] B. E. Bayer, "Color imaging array," US Patent 3,971,065, Jul. 20, 1976.
[2] C. Bai, J. Li, Z. Lin, J. Yu, and Y.-W. Chen, "Penrose demosaicking," IEEE Transactions on Image Processing, vol. 24, no. 5, pp. 1672-1684, 2015.
[3] X. Li, B. Gunturk, and L. Zhang, "Image demosaicing: A systematic survey," in Electronic Imaging 2008. International Society for Optics and Photonics, 2008, pp. 68221J-68221J.
[4] D. Menon and G. Calvagno, "Color image demosaicking: An overview," Signal Processing: Image Communication, vol. 26, no. 8, pp. 518-533, 2011.
[5] H. S. Malvar, L.-W. He, and R. Cutler, "High-quality linear interpolation for demosaicing of Bayer-patterned color images," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 3, 2004, pp. iii-485.
[6] L. Zhang and X. Wu, "Color demosaicking via directional linear minimum mean square-error estimation," IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2167-2178, 2005.
[7] D. Paliy, V. Katkovnik, R. Bilcu, S. Alenius, and K. Egiazarian, "Spatially adaptive color filter array interpolation for noiseless and noisy data," International Journal of Imaging Systems and Technology, vol. 17, no. 3, pp. 105-122, 2007.
[8] V. Katkovnik, K. Egiazarian, and J. Astola, "Local approximation techniques in signal and image processing," SPIE Bellingham, 2006.
[9] V. Katkovnik, K. Egiazarian, and J. Astola, "Adaptive window size image de-noising based on intersection of confidence intervals (ICI) rule," Journal of Mathematical Imaging and Vision, vol. 16, no. 3, pp. 223-235, 2002.
[10] J. E. Adams Jr. and J. F. Hamilton Jr., "Adaptive color plane interpolation in single sensor color electronic camera," US Patent 5,629,734, May 13, 1997.
[11] X. Wu and N. Zhang, "Primary-consistent soft-decision color demosaicking for digital cameras (patent pending)," IEEE Transactions on Image Processing, vol. 13, no. 9, pp. 1263-1274, 2004.
[12] I. Pekkucuksen and Y. Altunbasak, "Gradient based threshold free color filter array interpolation," in IEEE International Conference on Image Processing (ICIP), 2010, pp. 137-140.
[13] I. Pekkucuksen and Y. Altunbasak, "Multiscale gradients-based color filter array interpolation," IEEE Transactions on Image Processing, vol. 22, no. 1, pp. 157-165, 2013.
[14] D. Kiku, Y. Monno, M. Tanaka, and M. Okutomi, "Minimized-Laplacian residual interpolation for color image demosaicking," in IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics, 2014, pp. 90230L-90230L.
[15] F. Zhang, X. Wu, X. Yang, W. Zhang, and L. Zhang, "Robust color demosaicking with adaptation to varying spectral correlations," IEEE Transactions on Image Processing, vol. 18, no. 12, pp. 2706-2717, 2009.
[16] L. Zhang, X. Wu, A. Buades, and X. Li, "Color demosaicking by local directional interpolation and nonlocal adaptive thresholding," Journal of Electronic Imaging, vol. 20, no. 2, pp. 023016-023016, 2011.
[17] B. K. Gunturk, Y. Altunbasak, and R. M. Mersereau, "Color plane interpolation using alternating projections," IEEE Transactions on Image Processing, vol. 11, no. 9, pp. 997-1013, 2002.
[18] K. Hirakawa and T. W. Parks, "Adaptive homogeneity-directed demosaicing algorithm," IEEE Transactions on Image Processing, vol. 14, no. 3, pp. 360-369, 2005.
[19] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, "Non-local sparse models for image restoration," in IEEE 12th International Conference on Computer Vision (ICCV), 2009, pp. 2272-2279.
[20] D. Gao, X. Wu, G. Shi, and L. Zhang, "Color demosaicking with an image formation model and adaptive PCA," Journal of Visual Communication and Image Representation, vol. 23, no. 7, pp. 1019-1030, 2012.
[21] R. Timofte, V. De Smet, and L. Van Gool, "A+: Adjusted anchored neighborhood regression for fast super-resolution," in Asian Conference on Computer Vision (ACCV), 2014.
[22] R. Rothe, R. Timofte, and L. Van Gool, "Efficient regression priors for reducing image compression artifacts," in IEEE International Conference on Image Processing (ICIP), Sept. 2015, pp. 1543-1547.
[23] S. Farsiu, M. Elad, and P. Milanfar, "Multiframe demosaicing and super-resolution of color images," IEEE Transactions on Image Processing, vol. 15, no. 1, pp. 141-159, 2006.
[24] J. Wu, R. Timofte, and L. Van Gool, "Efficient regression priors for post-processing demosaiced images," in IEEE International Conference on Image Processing (ICIP), Sept. 2015, pp. 3495-3499.
[25] K. He, J. Sun, and X. Tang, "Guided image filtering," in Computer Vision - ECCV 2010. Springer, 2010, pp. 1-14.
[26] R. Timofte, V. De Smet, and L. Van Gool, "Anchored neighborhood regression for fast example-based super-resolution," in IEEE International Conference on Computer Vision (ICCV), 2013, pp. 1920-1927.
[27] R. Zeyde, M. Elad, and M. Protter, "On single image scale-up using sparse-representations," in Curves and Surfaces. Springer, 2012, pp. 711-730.
[28] J. Yang, J. Wright, T. S. Huang, and Y. Ma, "Image super-resolution via sparse representation," IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2861-2873, 2010.
[29] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, 2006.
[30] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April 2004.
[31] W. Lu and Y.-P. Tan, "Color filter array demosaicking: New method and performance measures," IEEE Transactions on Image Processing, vol. 12, no. 10, pp. 1194-1210, 2003.
[32] R. Timofte, R. Rothe, and L. Van Gool, "Seven ways to improve example-based single image super resolution," in CVPR 2016, 2016.

Radu Timofte obtained the PhD degree in Electrical Engineering at the KU Leuven, Belgium, in 2013, the M.Sc. at the Univ. of Eastern Finland in 2007, and the Dipl.-Ing. at the Technical Univ. of Iasi, Romania, in 2006. He worked as a researcher for the Univ. of Joensuu (2006-2007), the Czech Technical Univ. (2008), and the KU Leuven (2008-2013). Since 2013 he has been a postdoc at ETH Zurich, Switzerland, in the lab of Prof. Luc Van Gool. He serves as a reviewer for major journals such as PAMI, TIP, IJCV, TNNLS, TKDE, PRL, and T-ITS, and for conferences such as CVPR, ICCV, and ECCV. He received best paper awards at ICPR 2012, at the CVVT workshop of ECCV 2012, and at the ChaLearn workshop of ICCV 2015. His current research interests include sparse and collaborative representations, image restoration and enhancement, and multi-view object class recognition.

Luc Van Gool obtained the master and PhD degrees in Electrical Engineering at the KU Leuven, Belgium, in 1982 and 1991, respectively. He is a full professor for Computer Vision at both KU Leuven and ETH Zurich. With his two research labs, he focuses on object recognition, tracking and gesture analysis, and 3D acquisition and modeling. Luc Van Gool was a program chair of ICCV 2005, general chair of ICCV 2011, and general chair of ECCV 2014. He is an editor-in-chief of Foundations and Trends in Computer Graphics and Vision. He is also a co-founder of the spin-off companies Eyetronics, GeoAutomation, kooaba, procedural, eSaturnus, upicto, Fashwell, Merantix, Spectando, and Parquery. He received several best paper awards, incl. at ICCV 1998, CVPR 2007, ACCV 2007, ICRA 2009, BMVC 2011, and ICPR 2012.