Trendy Regularized Robust Coding For Face Recognition
\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad \|y - D\alpha\|_2 \le \varepsilon \qquad (1)

where y is the given signal, D is the dictionary of coding atoms, α is the coding vector of y over D, and ε > 0 is a constant.
By coding a query image y as a sparse linear combination of all the training samples via Eq. (1), SRC classifies y by evaluating which class yields the minimal reconstruction error. However, it has been indicated in the literature that the success of SRC actually owes to its collaborative representation of the query image rather than to the l_1-norm sparsity constraint on the coding coefficients. One interesting feature of SRC is its handling of face occlusion and corruption. More specifically, it introduces an identity matrix I as a dictionary to code the outlier pixels:

\min_{\alpha, e} \|[\alpha; e]\|_1 \quad \text{s.t.} \quad y = [D, I][\alpha; e] \qquad (2)

By solving Eq. (2), SRC shows good robustness to face occlusions such as block occlusion, pixel corruption and disguise. It is not difficult to see that Eq. (2) is basically equivalent to \min_\alpha \|\alpha\|_1 \ \text{s.t.} \ \|y - D\alpha\|_1 \le \varepsilon; that is, it uses the l_1-norm to model the coding residual y − Dα to gain a certain robustness to outliers. SRC is closely related to the nearest classifiers: it can be seen as a more general model than the previous nearest classifiers, since it uses the samples from all classes to collaboratively represent the query sample, which helps overcome the small-sample-size problem in FR. In addition, unlike methods such as LBP and the probabilistic local approach, which use local region features, color features or gradient information to handle particular kinds of occlusion, SRC achieves interesting results in dealing with occlusion by assuming a sparse coding residual, as in Eq. (2). Many subsequent works have extended and improved SRC, such as feature-based SRC, SRC for face misalignment or pose variation, regularized collaborative representation, and SRC for continuous occlusion.
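To make the SRC pipeline concrete, the following is a minimal MATLAB sketch of classification by class-wise reconstruction error. Using lasso() (Statistics and Machine Learning Toolbox) for the l_1 coding step, and the names D, labels and lambda, are illustrative assumptions, not the solver of the original SRC work.

% A minimal SRC-style classifier: code y over all training samples,
% then assign y to the class with the smallest reconstruction error.
function id = src_classify(D, labels, y, lambda)
    alpha = lasso(D, y, 'Lambda', lambda);  % l1-regularized coding of y over D
    classes = unique(labels);
    err = zeros(numel(classes), 1);
    for k = 1:numel(classes)
        a_k = alpha;                        % keep only the class-k coefficients
        a_k(labels ~= classes(k)) = 0;
        err(k) = norm(y - D * a_k, 2);      % class-wise reconstruction error
    end
    [~, idx] = min(err);
    id = classes(idx);
end

% Example usage (hypothetical data): id = src_classify(D, labels, y, 0.01);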
2.1 Modeling of RRC
The conventional sparse coding model in Eq. (1) is equivalent to the so-called LASSO problem:

\min_\alpha \|y - D\alpha\|_2^2 \quad \text{s.t.} \quad \|\alpha\|_1 \le \sigma \qquad (3)

where σ > 0 is a constant, y = [y_1; y_2; …; y_n] ∈ R^n is the signal to be coded, D = [d_1, d_2, …, d_m] ∈ R^{n×m} is the dictionary with column vector d_j being its j-th atom, and α ∈ R^m is the vector of coding coefficients. In the problem of FR, the atom d_j can simply be set as a training face sample, and hence the dictionary D can be the whole training dataset. If we have the prior that the coding residual e = y − Dα follows a Gaussian distribution, the solution to Eq. (3) will be the MLE solution. If e follows a Laplacian distribution, the l_1-sparsity constrained MLE solution will be

\min_\alpha \|y - D\alpha\|_1 \quad \text{s.t.} \quad \|\alpha\|_1 \le \sigma \qquad (4)
The above Eq. (4) is essentially another expression of Eq. (2), because they have the same Lagrangian formulation: \min_\alpha \{\|y - D\alpha\|_1 + \lambda\|\alpha\|_1\}. In practice, however, the Gaussian or Laplacian priors on e may be invalid, especially when the face image y is occluded, corrupted, etc. Inspired by robust regression theory, in our previous work we proposed an MLE solution for robust face image representation. Rewrite D as D = [r_1; r_2; …; r_n], where r_i is the i-th row of D, and let e = y − Dα = [e_1; e_2; …; e_n], where e_i = y_i − r_i α, i = 1, 2, …, n. Assume that e_1, e_2, …, e_n are independently and identically distributed and that the PDF of e_i is f_θ(e_i), where θ denotes the unknown parameter set that characterizes the distribution. The so-called RSC was then formulated as the following l_1-sparsity constrained MLE problem [let ρ_θ(e) = −ln f_θ(e)]:

\min_\alpha \sum\nolimits_{i=1}^{n} \rho_\theta(y_i - r_i\alpha) \quad \text{s.t.} \quad \|\alpha\|_1 \le \sigma \qquad (5)
Like SRC, the above RSC model assumes that the coding coefficients are sparse and uses the l_1-norm to characterize the sparsity. However, the l_1-sparsity constraint makes the complexity of RSC high, and it has recently been indicated that the l_1-sparsity constraint on α is not the key to the success of SRC. In this project, we propose a more general model, namely RRC. The RRC can be much more efficient than RSC, while RSC is one specific instantiation of the RRC model. Let us consider the face representation problem from the viewpoint of Bayesian estimation, more specifically, MAP estimation. By coding the query image y over a given dictionary D, the MAP estimation of the coding vector α is

\hat{\alpha} = \arg\max_\alpha \ln p(\alpha \mid y) = \arg\max_\alpha \{\ln p(y \mid \alpha) + \ln p(\alpha)\} \qquad (6)

Assuming that the residuals e_i = y_i − r_i α are i.i.d. with PDF f_θ(e_i), we have p(y|α) = \prod_{i=1}^{n} f_θ(y_i − r_i α). Meanwhile, assuming that the elements α_j, j = 1, 2, …, m, of the coding vector α = [α_1; α_2; …; α_m] are i.i.d. with PDF f_o(α_j), there is p(α) = \prod_{j=1}^{m} f_o(α_j). The MAP estimation of α in Eq. (6) is then

\hat{\alpha} = \arg\max_\alpha \left\{ \sum\nolimits_{i=1}^{n} \ln f_\theta(y_i - r_i\alpha) + \sum\nolimits_{j=1}^{m} \ln f_o(\alpha_j) \right\} \qquad (7)
Letting ρ_θ(e) = −ln f_θ(e) and ρ_o(α) = −ln f_o(α), Eq. (7) is converted into

\hat{\alpha} = \arg\min_\alpha \left\{ \sum\nolimits_{i=1}^{n} \rho_\theta(y_i - r_i\alpha) + \sum\nolimits_{j=1}^{m} \rho_o(\alpha_j) \right\} \qquad (8)

We call the above model RRC because the fidelity term \sum_{i=1}^{n} \rho_\theta(y_i - r_i\alpha) will be very robust to outliers, while \sum_{j=1}^{m} \rho_o(\alpha_j) is the regularization term depending on the prior probability p(α).
It can be seen that \sum_j \rho_o(\alpha_j) becomes the l_1-norm sparsity constraint when α_j is Laplacian distributed, i.e., when α_j follows the generalized Gaussian distribution (GGD)

f_o(\alpha_j) = \frac{\beta \exp\left\{-\left(|\alpha_j|/\sigma_a\right)^{\beta}\right\}}{2\sigma_a \Gamma(1/\beta)} \qquad (9)

with shape parameter β = 1, where Γ denotes the gamma function.
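As a worked special case (a direct substitution, not in the original text): setting β = 1 in Eq. (9) and using Γ(1) = 1 gives

\rho_o(\alpha_j) = -\ln f_o(\alpha_j) = \frac{|\alpha_j|}{\sigma_a} + \ln(2\sigma_a), \qquad
\sum\nolimits_{j=1}^{m} \rho_o(\alpha_j) = \frac{1}{\sigma_a}\|\alpha\|_1 + m\ln(2\sigma_a),

so, up to an additive constant, the regularizer is exactly an l_1-norm penalty, recovering the sparse coding models of Eqs. (3) and (4).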
For the representation residual, it is difficult to predefine the distribution due to the diversity of image variations. In general, we assume that the unknown PDF f_θ(x) is symmetric, differentiable, and monotonically decreasing in |x|, i.e., f_θ(x_1) > f_θ(x_2) if |x_1| > |x_2|. Consequently, ρ_θ(x) has its minimum at x = 0; without loss of generality, we let ρ_θ(0) = 0.
The proposed RRC model in Eq. (8) is closely related to robust estimation, which also aims to eliminate the effect of outliers. Robust estimation methods, e.g., regression diagnostics, M-estimators and least median of squares, are widely used in parameter estimation and have various applications in computer vision, such as tracking, robust subspace learning and so on.
However, there are clear differences between the previous robust estimation methods and the proposed RRC. Most previous robust estimation methods treat whole samples, rather than the elements of a sample, as inliers or outliers. Although the robust subspace learning method weights each pixel by judging whether it is an inlier or outlier, it aims to learn robust principal components, not to solve for the regularized coding coefficients of a testing sample with outliers. Besides, the proposed RRC model is developed for classification tasks rather than regression.
Two key issues in solving the RRC model are how to determine the distribution ρ_θ (or f_θ) and how to minimize the resulting energy functional. The Gaussian and Laplacian settings of f_θ have much bias and are not robust enough to outliers, and the Laplacian setting of f_o makes the minimization inefficient. In this paper, we allow f_θ to have a more flexible shape, adaptive to the input query image y, so that the system is more robust to outliers. To this end, we transform the minimization of Eq. (8) into an iteratively reweighted regularized coding problem in order to obtain the approximate MAP solution of RRC effectively and efficiently.
2.2 RRC via Iterative Reweighting
Let F_θ(e) = \sum_{i=1}^{n} \rho_\theta(e_i). The first-order Taylor expansion of F_θ(e) in the neighborhood of e_0 is

\tilde{F}_\theta(e) = F_\theta(e_0) + (e - e_0)^T F'_\theta(e_0) + R_1(e) \qquad (10)
where R_1(e) is the high-order residual and F'_θ(e) is the derivative of F_θ(e). Denote by ρ'_θ the derivative of ρ_θ; then F'_θ(e_0) = [ρ'_θ(e_{0,1}); ρ'_θ(e_{0,2}); …; ρ'_θ(e_{0,n})], where e_{0,i} is the i-th element of e_0. To make F_θ(e) strictly convex for easier minimization, we approximate the residual term as R_1(e) ≈ 0.5(e − e_0)^T W(e − e_0), where W is a diagonal matrix, since the elements of e are independent and there is no cross term of e_i and e_j, i ≠ j, in F_θ(e). Since F_θ(e) reaches its minimum value (i.e., 0) at e = 0, we also require that its approximation \tilde{F}_\theta(e) reach its minimum at e = 0, which yields the diagonal elements

W_{i,i} = \omega_\theta(e_{0,i}) = \rho'_\theta(e_{0,i})/e_{0,i} \qquad (11)
According to the properties of ρ_θ, we know that ρ'_θ(e_i) has the same sign as e_i, so W_{i,i} is a non-negative scalar. Then

\tilde{F}_\theta(e) = \tfrac{1}{2}\|W^{1/2} e\|_2^2 + b_{e_0} \qquad (12)

where b_{e_0} = \sum_{i=1}^{n}\left[\rho_\theta(e_{0,i}) - \omega_\theta(e_{0,i})\,e_{0,i}^2/2\right] is a scalar constant determined by e_0. Without considering the constant b_{e_0}, the RRC model in Eq. (8) can be approximated as

\hat{\alpha} = \arg\min_\alpha \left\{ \|W^{1/2}(y - D\alpha)\|_2^2 + \sum\nolimits_{j=1}^{m} \rho_o(\alpha_j) \right\} \qquad (13)
Certainly, Eq. (13) is a local approximation of Eq. (8), but it makes the minimization of RRC feasible via iteratively reweighted l_2-regularized coding, in which W is updated via Eq. (11). The minimization of RRC thus reduces to calculating the diagonal weight matrix W.
2.3 Weights W
The element W_{i,i}, i.e., ω_θ(e_i), is the weight assigned to pixel i of the query image y. Intuitively, in FR the outlier pixels should have small weights to reduce their effect on coding y over D. Since the dictionary D, composed of non-occluded/non-corrupted training face images, can represent the facial parts well, the outlier pixels will have rather big coding residuals. Thus, a pixel with a big residual e_i should have a small weight. This principle can be observed in Eq. (11), where ω_θ(e_i) is inversely proportional to e_i and modulated by ρ'_θ(e_i). Referring to Eq. (11), since ρ_θ is differentiable, symmetric, monotonic and has its minimum at the origin, we can assume that ω_θ(e_i) is continuous and symmetric, while being inversely proportional to e_i but bounded. Without loss of generality, we let ω_θ(e_i) ∈ [0, 1]. With these considerations, one good choice of ω_θ(e_i) is the widely used logistic function

\omega_\theta(e_i) = \frac{\exp(-\mu e_i^2 + \mu\delta)}{1 + \exp(-\mu e_i^2 + \mu\delta)} \qquad (14)
where μ and δ are positive scalars. Parameter μ controls the decreasing rate of the weight from 1 to 0, and δ controls the location of the demarcation point. Here the value of μδ should be big enough so that ω_θ(0) is close to 1. With Eqs. (11) and (14), we have

\rho_\theta(e_i) = -\frac{1}{2\mu}\Big(\ln\big(1 + \exp(-\mu e_i^2 + \mu\delta)\big) - \ln\big(1 + \exp(\mu\delta)\big)\Big) \qquad (15)
The PDF f_θ associated with ρ_θ in Eq. (15) is more flexible than the Gaussian and Laplacian functions for modeling the residual e. It can have a longer tail to accommodate the residuals produced by outlier pixels such as corruptions and occlusions, and hence the coding vector α will be robust to the outliers in y. ω_θ(e_i) could also be set to other functions. However, as indicated previously, the proposed logistic weight function is the binary classifier derived via MAP estimation, which makes it suitable for distinguishing inliers from outliers.
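A minimal MATLAB sketch of Eq. (14) follows; the percentile-based choice of δ and the coupling μ = c/δ in the comment are illustrative heuristics (assumptions, not prescribed by the text above).

% Logistic weight of Eq. (14), written in the algebraically identical
% form omega = 1 / (1 + exp(mu*e^2 - mu*delta)).
function w = logistic_weight(e, mu, delta)
    w = 1 ./ (1 + exp(mu * e.^2 - mu * delta));
end

% Example heuristic for (mu, delta): place the demarcation point at the
% 80th percentile of the squared residuals and set a sharp transition.
% e2 = sort(e.^2); delta = e2(ceil(0.8 * numel(e2))); mu = 8 / delta;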
When ω_θ(e_i) is set to a constant, e.g., ω_θ(e_i) = 2, it corresponds to the l_2-norm fidelity in Eq. (3); when set to ω_θ(e_i) = 1/|e_i|, it corresponds to the l_1-norm fidelity in Eq. (4); and when set to a Gaussian function ω_θ(e_i) = exp(−e_i²/2σ²), it corresponds to the Gaussian kernel fidelity used in FR. However, none of these functions is as robust to outliers as Eq. (14): the l_2-norm fidelity treats all pixels equally, while the l_1-norm fidelity assigns higher weights to pixels with smaller residuals but its weight tends to infinity as the residual approaches zero, making the coding unstable.
Both our proposed weight function and the weight function of the Gaussian fidelity used in FR are bounded in [0, 1], and they have an intersection point at weight value 0.5. However, the proposed weight function tends to assign larger weights to inliers and smaller weights to outliers; that is, it has a higher capability to separate inliers from outliers.
There are also other candidates that could be adopted as the weight function of RRC. Like the Gaussian weight function, the weight functions used in M-estimation also assign high weights to inliers and low weights to outliers. Nevertheless, the proposed RRC model is a general model that can utilize various weight functions, and in this paper we adopt the logistic weight function because of the advantages analyzed above. The model in Eq. (4) is the special case obtained by letting ω_θ(e_i) = 1/|e_i|. Compared with the models in Eqs. (3) and (4), the proposed RRC model [Eq. (8) or Eq. (13)] is much more robust to outliers because it adaptively assigns small weights to them. Although the model in Eq. (4) also assigns small weights to outliers, its weight function ω_θ(e_i) = 1/|e_i| is unbounded, making it less effective at distinguishing between inliers and outliers.
2.4 Two Important Cases of RRC
The minimization of the RRC model in Eq. (13) can be accomplished iteratively, with W and α updated alternately in each iteration. By fixing the weight matrix W, the RRC with a GGD prior on the representation coefficients and ρ_o(α_j) = −ln f_o(α_j) can be written as

\hat{\alpha} = \arg\min_\alpha \left\{ \|W^{1/2}(y - D\alpha)\|_2^2 + \sum\nolimits_{j=1}^{m} \big(\lambda|\alpha_j|^\beta + b_o\big) \right\} \qquad (16)

where ρ_o(α_j) = λ|α_j|^β + b_o, λ = (1/σ_a)^β, and b_o = ln(2σ_a Γ(1/β)/β) is a constant. Similar to the processing of F_θ(e) = Σ_{i=1}^{n} ρ_θ(e_i), the term Σ_j ρ_o(α_j) can also be approximated by a Taylor expansion. Then Eq. (16) changes to

\hat{\alpha} = \arg\min_\alpha \left\{ \|W^{1/2}(y - D\alpha)\|_2^2 + \sum\nolimits_{j=1}^{m} V_{j,j}\,\alpha_j^2 \right\} \qquad (17)

where V is a diagonal matrix with V_{j,j} = ρ'_o(α_j)/α_j, obtained in the same way as the weights in Eq. (11).
The value of β determines the type of regularization. If 0 < β ≤ 1, sparse regularization is applied; otherwise, non-sparse regularization is imposed on the representation coefficients. In particular, the proposed RRC model has two important cases corresponding to two specific values of β.
When β = 2, the GGD degenerates to the Gaussian distribution, and the RRC model becomes

\hat{\alpha} = \arg\min_\alpha \left\{ \|W^{1/2}(y - D\alpha)\|_2^2 + \lambda\|\alpha\|_2^2 \right\} \qquad (18)

In this case the RRC model is essentially an l_2-regularized robust coding model. It can easily be derived that, when W is given, the solution to Eq. (18) is \hat{\alpha} = (D^T W D + \lambda I)^{-1} D^T W y.
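A minimal MATLAB sketch of this closed-form update, assuming D is n-by-m, w holds the diagonal of W, and lambda > 0; solving the normal equations with backslash instead of an explicit inverse is a standard numerical choice:

% Closed-form l2-regularized robust coding step (Eq. (18)), W fixed.
% D: n-by-m dictionary, y: n-by-1 query, w: n-by-1 diagonal of W.
function alpha = rrc_l2_step(D, y, w, lambda)
    DW = D' .* w';                                 % D^T W (scale each column of D^T)
    m  = size(D, 2);
    alpha = (DW * D + lambda * eye(m)) \ (DW * y); % (D^T W D + lambda I)^{-1} D^T W y
end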
When β = 1, the GGD degenerates to the Laplacian distribution, and the RRC model becomes

\hat{\alpha} = \arg\min_\alpha \left\{ \|W^{1/2}(y - D\alpha)\|_2^2 + \lambda\|\alpha\|_1 \right\} \qquad (19)
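For β = 1, ρ_o(α_j) = λ|α_j| gives V_{j,j} ∝ λ/|α_j| in Eq. (17), which in practice is smoothed with a small ε to avoid division by zero. A hedged MATLAB sketch of this inner reweighting follows; the warm start, ε and the iteration count are illustrative choices:

% Solving Eq. (19) through the reweighted quadratic problem of Eq. (17).
function alpha = rrc_l1_step(D, y, w, lambda, iters, eps_s)
    DW = D' .* w';                                % D^T W
    G  = DW * D;
    b  = DW * y;
    alpha = (G + lambda * eye(size(G, 1))) \ b;   % ridge warm start (illustrative)
    for t = 1:iters
        V = diag(1 ./ (abs(alpha) + eps_s));      % V_{j,j} ~ 1/|alpha_j|, smoothed
        alpha = (G + lambda * V) \ b;             % weighted ridge solve
    end
end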
3. IR³C (ITERATIVELY REWEIGHTED REGULARIZED ROBUST CODING) ALGORITHM
3.1 INPUT: Normalized query image y with unit l_2-norm; dictionary D; initial coding vector α^(1).
3.2 OUTPUT: The coding vector α.
Start from t = 1:
1. Compute the residual e^(t) = y − Dα^(t).
2. Estimate the weights as ω_θ(e_i^(t)) = 1/(1 + exp(μ(e_i^(t))² − μδ)), where μ and δ can be re-estimated in each iteration.
3. Weighted regularized robust coding:

\alpha^{*} = \arg\min_\alpha \left\{ \|(W^{(t)})^{1/2}(y - D\alpha)\|_2^2 + \sum\nolimits_{j=1}^{m} \rho_o(\alpha_j) \right\} \qquad (20)

where W^(t) is the estimated diagonal weight matrix with W_{i,i}^(t) = ω_θ(e_i^(t)), ρ_o(α_j) = λ|α_j|^β + b_o, and β = 2 or 1.
4. Update the sparse coding coefficients:
If t = 1, α^(t) = α*;
If t > 1, α^(t) = α^(t−1) + η^(t)(α* − α^(t−1));
where 0 < η^(t) ≤ 1 is a suitable step size such that

\sum\nolimits_{i} \rho_\theta\big(y_i - r_i\alpha^{(t)}\big) + \sum\nolimits_{j} \rho_o\big(\alpha_j^{(t)}\big) < \sum\nolimits_{i} \rho_\theta\big(y_i - r_i\alpha^{(t-1)}\big) + \sum\nolimits_{j} \rho_o\big(\alpha_j^{(t-1)}\big).

η^(t) can be searched from 1 toward 0 by a standard line-search process.
5. Compute the reconstructed test sample y_rec^(t) = Dα^(t), and let t = t + 1.
6. Go back to step 1 until the convergence condition is met or the maximal number of iterations is reached.
When β = 1, the RRC model is essentially the RSC model in FR, where sparse coding methods such as l1_ls are used to solve Eq. (19) when W is given. In this project, we solve Eq. (19) via Eq. (17) by the iterative reweighting technique.
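For concreteness, here is a compact MATLAB sketch of the IR³C loop for the β = 2 case, reusing the logistic_weight and rrc_l2_step helpers sketched earlier. The zero initialization, fixed step size, (μ, δ) heuristic and convergence tolerance are illustrative assumptions, not the exact choices of the method above.

% A compact IR3C loop (beta = 2 case).
function alpha = ir3c(D, y, lambda, maxIter)
    alpha = zeros(size(D, 2), 1);              % initial coding vector alpha^(1)
    for t = 1:maxIter
        e  = y - D * alpha;                    % step 1: coding residual
        e2 = sort(e.^2);                       % step 2: heuristic (mu, delta)
        delta = e2(ceil(0.8 * numel(e2)));
        mu    = 8 / delta;
        w  = logistic_weight(e, mu, delta);
        a_star = rrc_l2_step(D, y, w, lambda); % step 3: weighted coding, Eq. (18)
        if t == 1
            alpha_new = a_star;                % step 4: first iteration takes a_star
        else
            alpha_new = alpha + 0.5 * (a_star - alpha); % fixed step in lieu of line search
        end
        if norm(alpha_new - alpha) < 1e-4 * max(norm(alpha), 1)
            alpha = alpha_new;                 % step 6: converged
            break;
        end
        alpha = alpha_new;
    end
end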
4. EXPERIMENTAL RESULTS
4.1 Pass the input image
Collect all the necessary images for the project and save them in the database.
Use entropyfilt to create a texture image. The function entropyfilt returns an array in which each output pixel contains the entropy value of the 9-by-9 neighbourhood around the corresponding pixel in the input image I. Entropy is a statistical measure of randomness.
Threshold the rescaled image Eim to segment the textures. A threshold value of 0.8 is selected because it is roughly the intensity value of pixels along the boundary between the textures.
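A short MATLAB sketch of this step, assuming a grayscale input image (the file name is hypothetical; the names Eim and the 0.8 threshold follow the text):

I   = imread('texture.png');    % hypothetical file name; assumed grayscale
E   = entropyfilt(I);           % local entropy over the default 9-by-9 neighbourhood
Eim = mat2gray(E);              % rescale the entropy image to [0, 1]
BW  = Eim > 0.8;                % threshold along the texture boundary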
Fig. 4.1 Pass input image and feature extraction
4.2 Segment the image based on region
Fig. 4.2 Segmentation of input image
4.3 Find the intensity distribution
Fig. 4.3 (a) Gradient magnitude (gradmag); (b) watershed transform of gradient magnitude (Lrgb)
Your first step is to maximize the intensity contrast in the image. You can do this using ADAPTHISTEQ, which
performs contrast-limited adaptive histogram equalization. Rescale the image intensity using IMADJUST so
that it fills the data type's entire dynamic range.
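A two-line MATLAB sketch of this enhancement step (assuming a grayscale image I):

I2 = adapthisteq(I);            % contrast-limited adaptive histogram equalization
I2 = imadjust(I2);              % stretch intensities to the full dynamic range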
4.4 Segment the image based on region(with and without intensity)
Fig. 4.4 (a) Opening (Lo); (b) opening by reconstruction (Lobr); (c) opening and closing (Loc); (d) opening-closing by reconstruction (Lobrcbr); (e) regional maxima of opening-closing by reconstruction (fgm); (f) regional maxima superimposed on original image (L2); (g) thresholded opening-closing by reconstruction (Bw); (h) watershed ridge lines (bgm); (i) markers and object boundaries superimposed on original image; (j) colored watershed label matrix (Lrgb); (k) Lrgb superimposed transparently on original image.
Granulometry estimates the intensity surface area distribution of snowflakes as a function of size. Granulometry
likens image objects to stones whose sizes can be determined by sifting them through screens of increasing size
and collecting what remains after each pass. Image objects are sifted by opening the image with a structuring
element of increasing size and counting the remaining intensity surface area (summation of pixel values in the
image) after each opening. Choose a counter limit so that the intensity surface area goes to zero as you increase
the size of your structuring element. For display purposes, leave the first entry in the surface area array empty.
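A minimal MATLAB sketch of the granulometry loop described above; the radius limit of 25 is an assumed counter limit:

maxRadius = 25;                              % assumed counter limit
surfArea  = zeros(1, maxRadius + 1);         % first entry left empty (radius 0)
for r = 1:maxRadius
    opened = imopen(I, strel('disk', r));    % sift out objects smaller than the disk
    surfArea(r + 1) = sum(opened(:));        % remaining intensity surface area
end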
4.5 Comparison of Gaussian noise, speckle noise, and salt and pepper noise
Median filtering is a common image enhancement technique for removing salt and pepper noise. Because this
filtering is less sensitive than linear techniques to extreme changes in pixel values, it can remove salt and pepper
noise without significantly reducing the sharpness of an image. In this topic, you use the Median Filter block to remove salt and pepper noise from an intensity image.
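A short MATLAB sketch comparing the three noise types and the median filter (the noise densities and variances are illustrative choices):

J_sp = imnoise(I, 'salt & pepper', 0.05);    % salt and pepper noise, 5% density
J_g  = imnoise(I, 'gaussian', 0, 0.01);      % additive Gaussian noise
J_sk = imnoise(I, 'speckle', 0.04);          % multiplicative speckle noise
K    = medfilt2(J_sp, [3 3]);                % 3-by-3 median filtering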
5. CONCLUSION
We have proposed a more general model for segmenting images with/without intensity inhomogeneities and
with different types of noise. Our proposed level set energy function, which is dominated by the global Gaussian
distribution and constrained by local neighbor properties, can overcome the artifacts from both the intensity
inhomogeneity and image noise. A quantitative comparison on synthetic images and experimental results on real
images showed that our model outperformed the LBF and LSII models designed specifically for segmenting the
images with intensity in homogeneities. It was more robust and accurate than the CV model when segmenting
the images without intensity in homogeneities.
6. REFERENCES
[1] Dexing Zhong, Peihong Zhu, Jiuqiang Han and Shengbin Li (2012), An improved robust sparse coding for face recognition with disguise, Int. J. Adv. Robotic Sy., Vol. 9, 126.
[2] John Wright, A. Y. Yang, Arvind Ganesh, S. Shankar Sastry and Yi Ma (Feb. 2009), Robust face recognition via sparse representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 31, No. 2.
[3] M. Yang, L. Zhang, J. Yang and D. Zhang (2011), Robust sparse coding for face recognition, Proc. IEEE Conf. Computer Vision and Pattern Recognition.
[4] R. Chellappa, C. Wilson and S. Sirohey (1995), Human and machine recognition of faces: a survey, Proceedings of the IEEE, 83(5):705-741.
[5] C. Cortes and V. Vapnik (1995), Support-vector networks, Machine Learning, 20(3):273-297.
[6] M. Fleming and G. Cottrell (1990), Categorization of faces using unsupervised feature extraction, Proc. IEEE IJCNN International Joint Conference on Neural Networks, pages 65-70.
[7] G. Guodong, S. Li and C. Kapluk (2000), Face recognition by support vector machines, Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 196-201.
[8] B. Heisele, T. Poggio and M. Pontil (2000), Face detection in still gray images, AI Memo 1687, Center for Biological and Computational Learning, MIT, Cambridge, MA.
[9] K. Jonsson, J. Matas, J. Kittler and Y. Li (2000), Learning support vectors for face verification and recognition, Proc. IEEE International Conference on Automatic Face and Gesture Recognition, pages 208-213.
[10] A. Lanitis, C. Taylor and T. Cootes (1997), Automatic interpretation and coding of face images using flexible models, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):743-756.
7. BIOGRAPHIES
V. Ezhilya received the B.E. degree in Electronics and Communication Engineering from Anna University, Chennai, in 2010 and the M.E. in Power Electronics and Drives from Vinayaga Missions University. She is currently working as Head of the Department of Electronics and Communication Engineering at VSA Group of Institutions, affiliated to Anna University. Her major fields of interest are Power Electronics, Digital Signal Processing, and Image Processing. She was also a recipient of the distinguished graduate student award from Vinayaga Missions University.