
FACIAL EXPRESSION RECOGNITION SYSTEM USING WEIGHT-BASED APPROACH

Srinivasa K G, Inchara Shivalingaiah, Lisa Gracias, and Nischit Ranganath


M.S. Ramaiah Institute of Technology, India
http://www.msrit.edu

ABSTRACT
In this paper we present a method to identify the facial expressions of a user by
processing images taken from a webcam. This is done by passing each image
through three stages: face detection, feature extraction, and expression recognition.
Face detection is accomplished by implementing the concept of Haar-like features,
prioritizing reduction in detection time over accuracy. Following this, facial
features are extracted from their respective regions of interest using Gabor filters
and classified based on facial ratios and symmetry. We plotted a minimal number
of facial points (12 points) and compared the positions of these points in the
neutral expression image and in the current expression image. This allowed us to
recognize 8 expressions based on the facial movements made, as opposed to
identifying just the 6 basic expressions as in previous work. We reached an
accuracy of approximately 92% for expression recognition when tested on the
JAFFE database, a considerably high performance given the simplicity of the
algorithm involved.

Keywords: Face recognition, Image processing, Pattern recognition.

1 INTRODUCTION

A facial expression results from one or more motions or positions of the muscles of the face. These movements convey the emotional state of the individual to observers, making facial expressions a primary form of nonverbal communication and a principal means of conveying social information. In fields such as networking and robotics, facial expression recognition can therefore play a very important role. The process of expression recognition involves detecting the face in an image, extracting the facial features, and then using an algorithm to identify the expression made based on the movements of those features.

1.1. Face Detection
By face detection we generally mean that a bounding box (or an ellipsoid) enclosing the face (or faces) in an image needs to be specified at approximately the correct scale. Several methods have been proposed for face detection. Shape-based approaches are those that make use of active contours (Waite and Welsh [3], Cootes and Taylor [4]). An active contour provides continuity and smoothness in the output, yet is flexible and able to deform to fit variations of the face shape. The difficulty with shape-based approaches is that there is so much variability in shape between the faces of different people that the face region cannot always be reliably extracted when background clutter is present. Another successful family of methods uses color information from the images. Many researchers addressing skin color have demonstrated that skin color falls into a narrow band in the color space and can hence be harnessed to detect pixels that have the color of human skin. To avoid the pitfalls of these approaches, we use the concept of Haar-like features to detect the faces in an image.

1.2. Feature Extraction
Previous methods for facial feature point detection can be classified into two categories: texture-based and shape-based methods. Texture-based methods model the local texture around a given feature point, for example the pixel values in a small region around a mouth corner. Shape-based methods regard all facial feature points as a shape, which is learned from a set of labeled faces, and try to find the proper shape for any unknown face. Typical texture-based methods include gray-value, eye-configuration and neural-network-based eye-feature detection, log-Gabor-wavelet-based facial point detection, and two-stage facial point detection using a hierarchy of Gabor wavelet networks. Typical shape-based methods include active appearance model-based facial feature detectors. A number of approaches combining texture- and shape-based methods have been proposed as well [2]. We apply Gabor wavelet-based facial point detection, a popular method for detecting facial features. This paper describes an implementation of Gabor wavelets that detects the 12 feature points necessary for expression recognition.

1.3. Expression Recognition
Ekman and Friesen [5] produced a system for describing "all visually distinguishable facial movements", called the Facial Action Coding System, or FACS. It is based on the enumeration of all "action units" (AUs) of a face that cause facial movements. However, it detects only 6 basic expressions; for instance, all types of smiles are put into a single category. In our view, the principal difficulty these researchers encountered is the sheer complexity of describing human facial movement using FACS: there are a very large number of AUs, which combine in extremely complex ways to give rise to expressions. In contrast to this view, there is now a growing body of psychological research arguing that it is the dynamics of the expression, rather than detailed spatial deformations, that is important in expression recognition [13]. This paper produces better results by detecting a greater number of expressions than just the basic six, and by using a novel weight-based approach.
2 PROBLEM STATEMENT

This paper's objective is to design software that takes as input static images of a single user and finds the expression made with a high rate of accuracy. The static images are input from a webcam and are processed to satisfy the functional requirements given below.
1. Face localization: detect the face present in an image taken from the webcam.
2. Region of Interest Detection: divide the face into regions of interest for feature extraction in each region.
3. Feature Extraction: detect the features present in each region of interest.
4. Feature Classification: classify the detected features based on their relative and absolute distances on the face.
5. Expression Recognition: identify the expression made based on the feature movements.
The constraints faced are:
- The presence or absence of structural components such as facial hair and glasses affects the accuracy of detection.
- The orientation of the face affects feature detection. The face must not be tilted, and must be directed straight at the camera.
- Webcam images captured under improper lighting conditions or with a non-uniform background may lead to errors.

3 BACKGROUND

3.1 Haar-like Features
A publication by Papageorgiou et al. [8] discussed working with an alternative feature set instead of the usual image intensities. This feature set considers rectangular regions of the image and sums up the pixels in each region; the sums are used to categorize images. Haar-like features encode the existence of oriented contrasts between regions in the image, and a set of these features can be used to encode the contrasts exhibited by a human face and their spatial relationships. Haar-like features are so called because they are computed similarly to the coefficients of the Haar wavelet transform. The Haar wavelet is a single-wavelength square wave (one high interval and one low interval); in two dimensions, a square wave is a pair of adjacent rectangles, one light and one dark. It can be represented by equation (1):

    $\psi(t) = \begin{cases} 1 & 0 \le t < 1/2, \\ -1 & 1/2 \le t < 1, \\ 0 & \text{otherwise} \end{cases}$    (1)

To determine the presence or absence of hundreds of Haar features at every image location and at several scales efficiently, Viola and Jones used a technique called the integral image. In general, "integrating" means adding small units together; in this case, the small units are pixel values. The integral value at each pixel is the sum of all the pixels above it and to its left. Starting at the top left and traversing to the right and down, the entire image can be integrated with a few integer operations per pixel. As seen in fig. 1, after integrating, the pixel at any position (x,y) contains the sum of all pixel values in the shaded rectangle.

Fig 1. Haar-like rectangular features

To find the average pixel value in this rectangle, we need to divide the value at (x,y) by the rectangle's area.
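For illustration, the following sketch (ours, not part of the system's original implementation; Python with NumPy is assumed) computes an integral image and uses it to evaluate a rectangle sum, and hence a two-rectangle Haar-like feature, in constant time per query:

import numpy as np

def integral_image(img):
    # Cumulative sum down the rows, then across the columns:
    # ii[y, x] = sum of img[0..y, 0..x] inclusive.
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, top, left, height, width):
    # Sum over img[top:top+height, left:left+width] via the
    # four-corner identity on the integral image.
    bottom, right = top + height - 1, left + width - 1
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

# A two-rectangle Haar-like feature is then just the difference of two
# such sums, e.g. a light rectangle minus the dark rectangle below it.
img = np.random.randint(0, 256, (240, 320)).astype(np.int64)
ii = integral_image(img)
feature = rect_sum(ii, 10, 10, 8, 16) - rect_sum(ii, 18, 10, 8, 16)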
3.2 Gabor Function
A family of Gabor kernels is the product of a Gaussian envelope and a plane wave. The local appearance around a point x in a gray-scale image I(x) can be encoded using a set of Gabor coefficients J_j(x). Each coefficient J_j(x) is derived by convolving the input image I(x) with a family of Gabor kernels ψ_j, as shown in equation (2):

    $\psi_j(\mathbf{x}) = \frac{k_j^2}{\sigma^2} \exp\!\left(-\frac{k_j^2 x^2}{2\sigma^2}\right) \left[\exp(i\,\mathbf{k}_j \cdot \mathbf{x}) - \exp\!\left(-\frac{\sigma^2}{2}\right)\right]$    (2)

Gabor kernels are plane waves restricted by a Gaussian envelope function of relative width σ = 2π. Each kernel is characterized by a wave vector $\mathbf{k}_j = [k_v \cos\varphi_u,\; k_v \sin\varphi_u]^T$, where $k_v = 2^{-(v+1)}$ with v = 0, 1, ..., 4 indexes the spatial frequencies and $\varphi_u = (\pi/8)u$ with u = 0, 1, ..., 7 indexes the orientations of the Gabor kernels used in this paper.

A jet J is the set {J_j : j = u + 8v} of 40 complex Gabor coefficients obtained at a single image point. Complex Gabor coefficients can be written in exponential form as $J_j = a_j \exp(i\phi_j)$, where a_j(x) is the slowly varying magnitude and φ_j(x) is the phase of the j-th Gabor coefficient at pixel x.

The similarity between two jets is defined using the phase-sensitive similarity measure of equation (3):

    $S(J, J') = \frac{\sum_{i=1}^{40} a_i a'_i \cos(\phi_i - \phi'_i)}{\sqrt{\sum_{i=1}^{40} a_i^2 \,\sum_{i=1}^{40} a'^2_i}}$    (3)

This similarity measure returns real values in the range [−1, +1], where a value closer to +1 means higher similarity between the input jets.

In order to search for a given feature on a new face image, a general representation of that facial feature point is required. As proposed in [6], the general appearance of each feature point can be modeled by bundling the Gabor jets extracted from several manually marked examples of that feature point (e.g., eye corners) in a stack-like structure called a "Gabor bunch" [10].
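A minimal sketch of equations (2) and (3) follows (our illustration in Python/NumPy; σ = 2π and the frequency/orientation grid above are taken from the text, the 32x32 kernel window and square patch are assumptions):

import numpy as np

SIGMA = 2 * np.pi

def gabor_kernel(v, u, size=32):
    # Equation (2): Gaussian envelope times a plane wave, with
    # k_v = 2^-(v+1) and phi_u = (pi/8)*u as defined above.
    k = 2.0 ** -(v + 1)
    phi = np.pi * u / 8.0
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]
    ksq = kx ** 2 + ky ** 2
    envelope = (ksq / SIGMA ** 2) * np.exp(-ksq * (x ** 2 + y ** 2) / (2 * SIGMA ** 2))
    wave = np.exp(1j * (kx * x + ky * y)) - np.exp(-SIGMA ** 2 / 2)
    return envelope * wave

def jet(patch):
    # A jet: 40 complex coefficients, one per (scale, orientation) pair,
    # from a square patch (assumed 32x32) centered on the image point.
    return np.array([np.sum(patch * gabor_kernel(v, u, patch.shape[0]))
                     for v in range(5) for u in range(8)])

def similarity(j1, j2):
    # Equation (3): phase-sensitive similarity in [-1, +1].
    a1, a2 = np.abs(j1), np.abs(j2)
    num = np.sum(a1 * a2 * np.cos(np.angle(j1) - np.angle(j2)))
    return num / np.sqrt(np.sum(a1 ** 2) * np.sum(a2 ** 2))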
4 ALGORITHM

4.1 Face Detection
1. Read the 640x480 input image.
2. Convert the image to grayscale.
3. Use a trained database to find the portion of the given image that gives the best result for face detection using Haar-like features.
4. Return the points at which faces are centered in the image.
5. if (number of faces > 1)
       print failure in detection of a single face
       exit
   end if
6. Crop the face portion of the image.
7. Present the output (cropped image) as input to the next level, feature detection.
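For reference, the same pipeline can be sketched with OpenCV's pretrained frontal-face Haar cascade (a stand-in of ours for the paper's trained database, which is not publicly specified):

import cv2

def detect_and_crop_face(path):
    # Steps 1-2: read the input image and convert it to grayscale.
    image = cv2.imread(path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # Step 3: Haar-like feature cascade (OpenCV's pretrained model
    # stands in for the paper's trained database).
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    # Steps 4-5: fail unless exactly one face is found.
    if len(faces) != 1:
        raise RuntimeError("failure in detection of a single face")

    # Steps 6-7: crop the face portion and hand it to feature detection.
    x, y, w, h = faces[0]
    return gray[y:y + h, x:x + w]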
4.2 Feature Extraction
4.2.1. Database creation
- Create a Gabor matrix with a 32x32 window:
  a. Create a 5x8 matrix Gabor.
  b. for each scale i from 1 to 5 in Gabor
         for each orientation j from 1 to 8 in Gabor
             Gabor[i][j] = Gabor_equation(i, j)
         end for (j)
     end for (i)
- Read all cropped images of dimension 27x18 from the directory.
- Produce copies of the images by lateral inversion and by inverting the image upside down.
- Calculate the Gabor features of each image:
  a. A fast Fourier transform is applied to the high-contrast 27x18 image, converting it to a 32x32 matrix.
  b. An inverse Fourier transform is applied to the element-wise product of the Fourier result and the Gabor matrix, and the final matrix is converted back to 27x18:
     for each scale s from 1 to 5
         for each orientation j from 1 to 8
             result = fft2(double(W27x18), 32, 32);
             Features = ifft2(Gabor(s,j) .* result, 27, 18);
         end for (j)
     end for (s)
- The Gabor features are converted to a vector, reducing the matrix nine-fold for faster calculation; hence 2160 features are presented per image.
- All images and Gabor features are saved, and the procedure is repeated for all positive and negative images in the database.
- The network is initialized and trained on this image database until it reaches a low error rate.

4.2.2. Detection of any one feature (image_scan; a sketch of this procedure follows section 4.2.3)
- Create two templates (C2, C3) of the feature to be detected.
- Produce convolutions of the templates with the given image (C1):
      Corr_1 = double(conv2(C1, C2, 'same'));
      Corr_2 = double(conv2(C1, C3, 'same'));
- Save the combined state of the two convolution results in a cell matrix, where 0 indicates that no detection is possible and 1 indicates a possible detection.
- while there exists (x,y) in Cell whose state is 1, do
      Let the element be (i,j).
      Cut out the 27x18 image centered at this point.
      Invalidate the state so the point is not detected again.
      Find the simulated network value calculated from the net matrix and the given 'imcut' image.
      Save the point (i,j) if the simulated network value is greater than the required threshold.
  end while
- Find the centroid of the area covered by the group of points with acceptable network values; this indicates the feature detection point.
- Return all the possible feature positions detected.

4.2.3. Collective detection of all features
- Create regions of interest (as shown in fig. 3).
- Apply ratios to classify among the detected points (the ratio rules are described in section 5.4).
- Apply symmetry to synchronize some feature point positions.
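A condensed sketch of the image_scan procedure (our Python/SciPy rendering; the gating rule that turns the convolution results into 0/1 states is not specified, so a simple mean test is assumed, and 'simulate' stands for the trained network's forward pass):

import numpy as np
from scipy.signal import convolve2d

def image_scan(image, template_a, template_b, simulate, threshold):
    # Convolve both feature templates with the image (cf. conv2(...,'same')).
    corr = (convolve2d(image, template_a, mode='same') +
            convolve2d(image, template_b, mode='same'))
    # Cell matrix of states: 1 marks a location worth testing,
    # 0 rules it out (assumed gating rule).
    state = (corr > corr.mean()).astype(np.uint8)
    accepted = []
    while True:
        candidates = np.argwhere(state == 1)
        if len(candidates) == 0:
            break
        i, j = candidates[0]
        state[i, j] = 0                            # invalidate this point
        patch = image[i - 13:i + 14, j - 9:j + 9]  # 27x18 cut-out
        if patch.shape == (27, 18) and simulate(patch) > threshold:
            accepted.append((i, j))                # acceptable network value
    # The centroid of the accepted points indicates the feature position.
    return tuple(np.mean(accepted, axis=0)) if accepted else None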
4.3 Expression Recognition
Weights are assigned to each expression based on the movements made on the face when a neutral image of the user is compared with an image of the user making an expression. Some movements are given negative weights, decreasing the probability of that expression. The sum obtained after these weights are applied gives the overall probability of the expression.
1. Positive weights are assigned according to the priority of movements for each expression, as given in table 1.
2. Negative weights are assigned when the movement made rules out the possibility of the expression in question.
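A minimal sketch of this weighting scheme (ours; the movement codes are borrowed from table 2 in section 6, and the weight values are illustrative assumptions rather than the values used by the system):

import numpy as np

# Signed movement measures (current image minus neutral image), keyed by
# the codes of table 2, e.g. LS1LS2 > 0 means the mouth widens.
movements = {"LS1LS2": 4.1, "NLB": 1.8, "LSN": -0.6, "ETEB": -0.3}

# Hypothetical weight table: positive weights follow the movement
# priorities of table 1, negative weights rule an expression out.
WEIGHTS = {
    "happy smile": {"LS1LS2": +2.0, "NLB": +1.0, "LSN": -1.0},
    "sadness":     {"LS1LS2": -2.0, "NLB": +0.5, "ETEB": -1.0},
}

def expression_scores(moves):
    # Weighted sum of movements per expression; the largest score is
    # taken as the most probable expression.
    return {expr: sum(w * moves.get(code, 0.0) for code, w in weights.items())
            for expr, weights in WEIGHTS.items()}

scores = expression_scores(movements)
print(max(scores, key=scores.get))  # -> "happy smile" for this input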

5 FUNCTIONAL ARCHITECTURE

5.1. Face Detection
The input image is scanned across location and scale. At each location, an independent decision is made regarding the presence of a face, which leads to a very large number of classifier evaluations: approximately 50,000 in a 320x240 image. Following the AdaBoost algorithm, a set of weak binary classifiers is learned from a training set. Each classifier is a simple function made up of rectangular sums followed by a threshold. In each round of boosting, the one feature with the lowest weighted error is selected, and that feature is assigned a weight in the final classifier by the AdaBoost procedure. In subsequent rounds, incorrectly labeled examples are given a higher weight while correctly labeled examples are given a lower weight. To reduce the false positive rate, classification is divided into a cascade of classifiers. An input window is passed from one classifier in the cascade to the next as long as each classifier classifies the window as a face. The threshold of each classifier is set for a high detection rate, with the later stages covering the harder detection decisions. Once a face is detected, the Haar classifier crops the image for ROI detection (see fig. 2).

Fig. 2: Usage of Haar-like features to detect the face in an image and crop it out for processing.
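The cascade logic described above can be condensed into a short sketch (ours; the weak classifiers and thresholds are stand-ins for the trained AdaBoost stages):

from typing import Callable, List, Tuple
import numpy as np

# One cascade stage: a list of weak, Haar-feature-based classifiers
# (each maps a window to a vote) plus the stage threshold, which is
# tuned for a high detection rate.
Stage = Tuple[List[Callable[[np.ndarray], float]], float]

def cascade_accepts(window: np.ndarray, stages: List[Stage]) -> bool:
    # The window is passed from stage to stage and survives only while
    # every boosted stage scores it above its threshold; early stages
    # cheaply reject most non-faces, later stages handle harder cases.
    for weak_classifiers, threshold in stages:
        score = sum(h(window) for h in weak_classifiers)
        if score < threshold:
            return False
    return True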

5.2. ROI Detection
The next step in automatic facial point detection is to determine the region of interest (ROI) for each point, that is, to define a large region that contains the point we want to detect. To achieve this, we first detect the nose and then calculate the regions of interest for the other features relative to the nose, as it forms the central point of the image. This can be seen in fig. 3.

Fig. 3: Regions of interest for each feature detected, from left to right: nose, left eye top, left eye bottom, right eye top, right eye bottom, left brow top, right brow top, left brow side, right brow side, lip bottom, and left and right lip corners. The green pixels mark the correct detections.

5.3. Feature Extraction
The feature vector for each facial point is extracted from the 27x18-pixel image patch centered on that point. This feature vector is used to learn the pertinent point's patch template and, in the testing stage, to predict whether the current point represents a certain facial point or not. The 27x18-pixel image patch is extracted from the grayscale image of the ROI and from the representations of the ROI obtained by filtering it with a bank of 40 Gabor filters at 8 orientations and 5 spatial frequencies. Thus, 486x40 = 19440 features are obtained, which are reduced by a factor of 9 to leave 2160 features representing one facial point. Each feature contains the following information: (i) the position of the pixel inside the 27x18-pixel image patch, and (ii) whether the pixel originates from the grayscale or from a Gabor-filtered representation of the ROI.

Eventually, in the testing phase, each ROI is first filtered by the same bank of Gabor filters used in the training phase. Then, for a certain facial point, a 27x18-pixel sliding window is slid pixel by pixel across the grayscale and Gabor-filtered representations of the ROI. For each position of the sliding window, the GentleBoost classifier outputs a response depicting the similarity between the window's representation and the learned feature point model. After scanning the entire ROI, the position with the highest response reveals the feature point in question.

Fig 4. All the feature points being detected.
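As an illustration of this patch representation (our sketch using OpenCV; the Gabor kernel parameters and the plain 3x3 downsampling that stands in for the nine-fold reduction are assumptions, so the resulting dimensionality only approximates the paper's 2160 features):

import cv2
import numpy as np

def roi_representations(roi):
    # Grayscale plus 40 Gabor responses (8 orientations x 5 wavelengths;
    # sigma and the wavelength grid are assumed values).
    reps = [roi.astype(np.float32)]
    for v in range(5):
        for u in range(8):
            kern = cv2.getGaborKernel((31, 31), sigma=4.0,
                                      theta=np.pi * u / 8,
                                      lambd=2.0 ** (v + 2), gamma=1.0)
            reps.append(cv2.filter2D(roi.astype(np.float32), -1, kern))
    return reps

def patch_feature_vector(reps, center):
    # 27x18 patch around the candidate point in every representation,
    # downsampled 3x3 as an assumed stand-in for the 9-fold reduction.
    i, j = center
    feats = [rep[i - 13:i + 14, j - 9:j + 9][::3, ::3].ravel()
             for rep in reps]
    return np.concatenate(feats)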

5.4. Feature Classification
Since our database is not very large, there is the possibility that many incorrect features, or no features at all, will be detected. To avoid such cases, the chances of accurate feature detection are increased using properties such as the symmetry of the face and the standard ratios between certain features. Feature points such as the sides of the lip, the top of each eye and the bottom of each eye are symmetrical to each other about the nose. This information is used to choose the detected point that best conforms to this symmetry. Regarding the application of ratios, it is known that the golden ratio (1.618) applies to any face. Statistically, we also discovered other ratios on the face that are corollaries of the golden ratio; for example, the ratio of the vertical distances, measured from the top of the head, to the nose and to the bottom of each eye is found to be around 1.53. These ratios are used to choose the required feature from amongst the various false detections, producing a result with all the necessary feature points marked on the face, as in fig. 4.
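A sketch of how such constraints can prune false candidates (ours; the coordinate convention is (row, column) and the tolerance band is an assumption):

import numpy as np

HEAD_EYE_NOSE_RATIO = 1.53   # from the statistic quoted above
TOLERANCE = 0.08             # assumed acceptance band

def best_symmetric_pair(left_candidates, right_candidates, nose_x):
    # Choose the (left, right) candidate pair most symmetric about the
    # vertical line through the nose, penalizing vertical mismatch too.
    def asymmetry(l, r):
        return abs((nose_x - l[1]) - (r[1] - nose_x)) + abs(l[0] - r[0])
    pairs = [(l, r) for l in left_candidates for r in right_candidates]
    return min(pairs, key=lambda p: asymmetry(*p))

def ratio_ok(head_y, eye_bottom_y, nose_y):
    # Accept an eye-bottom candidate only if the head-to-nose versus
    # head-to-eye-bottom distance ratio is near the expected 1.53.
    ratio = (nose_y - head_y) / (eye_bottom_y - head_y)
    return abs(ratio - HEAD_EYE_NOSE_RATIO) < TOLERANCE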
5.5. Expression Recognition
One of the main advantages of the method presented here is the ability to use real imagery to define the representation of each expression. This paper did not rely on existing models of facial expression representation, but generated a representation of each expression from observations of people making expressions, and then used the extracted profiles for recognition. A database of 50 people making the 8 expressions depicted in fig. 5 was created, and the particular template of movements observed for each expression is tabulated in Table 1.

Fig 5. Expressions detected in this paper, from top left to bottom right: neutral, polite smile, happy smile, sadness, disgust, shock, surprise, anger, irritation.

Expression   | Movements observed
Polite smile | Lip corners move outwards and up; eyes narrowed.
Happy smile  | Lip corners move outwards and up; eyes narrowed; bottom of the lip moves down.
Anger        | Eyebrows down and squeezed; eyes widened; lips tightened; eyebrow inner corners move closer.
Irritation   | Eyebrows down and squeezed; eyes narrowed; lips tightened.
Surprise     | Eyebrows up; eyes widened; lip corners move apart; bottom of the lip moves down by a large margin.
Shock        | Eyebrows up; eyes widened; lip corners move closer; bottom of the lip moves down.
Sadness      | Eyebrow inner corners raised; eyes narrowed; lip corners move down.
Disgust      | Eyes narrowed; mouth open; lip corners pushed down and inward.
Table 1: Movements observed for each facial expression.

Using these observations, a classifier was produced that took into account all the features that showed movement for any of the expressions and the amount of movement of each. Each type of movement, for instance eyebrow movement, was given a particular priority for each expression, and weights were allotted to the expressions on that basis. The final result is the probability of each expression for that instance, plotted in a graph.

6 RESULTS

This method of expression recognition was tested on the JAFFE (Japanese Female Facial Expression) database, which contains 213 images of 7 facial expressions (6 basic facial expressions + 1 neutral) made by 10 female Japanese models. Graphs of the movements and the implied expressions were produced as in figs. 8a and 8b. The detection accuracies for each expression and feature point are given in tables 3 and 4, and some inaccurate and accurate detections are shown in figs. 6 and 7.

The movements measured and their meanings are given in table 2, which helps in reading the feature movement graphs. As seen in the feature movement graph of fig. 8a, the lip corners widen, the lip bottom moves away from the nose, and the lip corners move closer to the nose (LSN is negative); hence the expression is most probably that of happiness. Similarly, in fig. 8b, the lip corners move away from the nose and closer to each other, and hence the expression of sadness is detected.

Fig 6: Inaccurate detections of the lip bottom and eyebrow tops, on a larger scale than the other points produced in the image.

Fig 7: Accurate detections of feature points on the face.

Fig 8a: Graphs portraying the expression of a happy smile.

Fig 8b: Graphs portraying the expression of sadness.

Movement | Meaning
NLB      | Lip's vertical movement
LS1LS2   | Lip corner movement; a positive value implies widening of the mouth
LSN      | Distance of the lip corners from the nose
ETN      | Distance of the eye top from the nose
ETEB     | Width of the eyes
BTN      | Amount by which the brow raises/lowers
BTET     | Distance between the eye and the top of the eyebrow
Table 2: Movements recorded in the graphs.

Expression   | Detection rate
Polite smile | 91%
Happy smile  | 95%
Sad          | 90%
Disgust      | 93%
Anger        | 89%
Irritated    | 94%
Surprise     | 96%
Shock        | 90%
Table 3: Expression detection rates for the JAFFE database.

Facial feature point   | Detection rate
Nose (N)               | 97%
Left eye top (LET)     | 85%
Left eye bottom (LEB)  | 89%
Right eye top (RET)    | 83%
Right eye bottom (REB) | 86%
Left brow top (BT1)    | 81%
Right brow top (BT2)   | 82%
Left brow side (BS1)   | 93%
Right brow side (BS2)  | 90%
Left lip corner (LS1)  | 79%
Right lip corner (LS2) | 80%
Lip bottom (LB)        | 84%
Table 4: Facial feature point detection rates for the JAFFE database.
7 CONCLUSION

The results obtained for expression recognition were 92% accurate, and the feature point detections were 86% accurate, as inferred from the individual accuracies given in tables 3 and 4. In future work we will investigate the effect of using a reduced number of features for classification, and we plan to conduct extensive experimental studies using other publicly available face databases. Additionally, we plan to develop better methods of classification and a bigger image database, which can increase the performance of facial feature detection.
8 REFERENCES

[1] Liya Ding and Aleix M. Martinez, "Precise detailed detection of faces and facial features", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008.
[2] Danijela Vukadinovic and Maja Pantic, "Fully automatic facial feature point detection using Gabor feature based boosted classifiers", IEEE International Conference on Systems, Man and Cybernetics, 2005.
[3] J. B. Waite and W. J. Welsh, "An application of active contour models to head boundary location", Proc. British Machine Vision Conference, pp. 407-412, Oxford, 1990.
[4] T. F. Cootes and C. J. Taylor, "Locating faces using statistical feature detectors", Proc. 2nd Int. Conf. on Automatic Face and Gesture Recognition, pp. 204-209, Vermont, 1996. IEEE Comp. Soc. Press.
[5] HyoungWoo Lee, SeKee Kil, Younghwan Han and SeungHong Hong, "Automatic face and facial features detection", IEEE International Symposium on Industrial Electronics, 2001.
[6] I. R. Fasel et al., "GBoost: A generative framework for boosting with applications to realtime eye coding", Computer Vision and Image Understanding, 2005.
[7] P. Viola and M. Jones, "Robust real-time object detection", ICCV Workshop on Statistical and Computational Theories of Vision, 2001.
[8] C. Papageorgiou, M. Oren, and T. Poggio, "A general framework for object detection", International Conference on Computer Vision, 1998.
[9] Wei-feng Liu, Ji-li Lu, Zeng-fu Wang, and Hua-jun Song, "An expression space model for facial expression analysis", Congress on Image and Signal Processing, 2008.
[10] Sina Jahanbin, Alan C. Bovik, and Hyohoon Choi, "Automated facial feature detection from portrait and range images", IEEE Southwest Symposium on Image Analysis and Interpretation, pp. 25-28, 2008.
[12] Peng Yang, Qingshan Liu, Xinyi Cui, and Dimitris N. Metaxas, "Facial expression recognition using encoded dynamic features", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1-8, 2008.
[13] I. A. Essa and A. P. Pentland, "Facial expression recognition using a dynamic model and motion energy", Fifth International Conference on Computer Vision (ICCV'95), pp. 360, 1995.
