
Emotion Detection

Using OpenCV and TensorFlow

Group Members

18BCE2227- Yuvraj Chhabra


18BCI0162- Shubham Sareen
19BEE0359- Kshitij Joshi
19BEI0067- Aviral Singh

Faculty Name

Prof. Rajakumar K.
Associate Professor, VIT Vellore

Fall Semester 2020-2021


SYNOPSIS

In the modern era, we constantly face factors that contribute to and further pile on our already high levels of stress, and humanity faces a growing problem of degrading mental health. In the current Covid-19 scenario, this problem is further exacerbated by limited social interaction, more screen time and lack of exercise. When we came up with the project, we wanted to help people with mental health issues understand and comprehend the kind of pain they are going through. They can also use our software to know whether they need external support such as counselling. The basic idea was implemented so that it acts as a standalone product rather than a one-off project, delivered as a web app. On the emotion detection end, the Haar Cascade is first used to detect the facial region in the frame captured by the webcam. Its functioning is as follows: instead of applying all 6000 features to a window at once, it groups the features into different stages of classifiers (f1, f2, ...) and applies them one by one. Second comes the machine learning algorithm using a Convolutional Neural Network. OpenCV with Python is used to run the whole process, and the TensorFlow library is used to train the machine learning model. The model runs frame by frame to detect the emotion: the facial region detected through the Haar Cascade classifier is passed as input to the CNN (convolutional neural network) model, and the emotion with the maximum softmax score is given as the result on the live webcam feed itself.
WHAT YOU OR YOUR GROUP HAS DONE IN REVIEW 0 (FIRST)?

After surveying the various domains, the booming domain of facial recognition caught our imagination. Facial recognition is a way of recognizing a human face through technology. A facial recognition system uses biometrics to map facial features from a photograph or video, and compares the information with a database of known faces to find a match. Facial recognition can help verify personal identity, but it also raises privacy issues. Some face recognition algorithms identify facial features by extracting landmarks, or features, from an image of the subject's face. For example, an algorithm may analyze the relative position, size, and/or shape of the eyes, nose, cheekbones, and jaw. These features are then used to search for other images with matching features. Facial recognition has become so common that it is now on a majority of cell phones; not only that, but using just the face, computers can now differentiate between someone who is sad, angry, happy, or whatever else he/she is feeling. This captured our imagination: we started brainstorming and decided to build not just a makeshift project, but a product. We decided to build a complete website helping users analyze their state of mind using both visual aids and a questionnaire; if the user is experiencing a high percentage of a negative emotion, appropriate solutions are suggested. We started analyzing various algorithms to detect and analyze the face of the subject, and the fuzzy classifier caught our attention.
WHAT YOU OR YOUR GROUP HAS DONE IN REVIEW 1 (SECOND)?

After discussing various possibilities for how we should build the product, the specifications relating to the design, feel and look of the web app were decided, and we came up with various add-ons which would later be added to the website. Initially, we were going to use the histogram segmentation algorithm and the fuzzy classifier identification algorithm. We chose the histogram segmentation algorithm for the reasons explained below. Histograms are constructed by splitting the range of the data into equal-sized bins (called classes); then, for each bin, the number of points from the data set that fall into it is counted. The term fuzzy refers to things which are not clear or are vague. In the real world we often encounter situations where we cannot determine whether a state is true or false; there, fuzzy logic provides very valuable flexibility for reasoning, allowing us to account for the inaccuracies and uncertainties of a situation. Image segmentation is the process of grouping the pixels of a given image into regions that are homogeneous with respect to certain features and have potential semantic content. The many segmentation methods in the literature can be categorized under four main qualifiers, among them measurement-space-based (histogram) algorithms. The histogram of an image can be treated as a probability distribution over the pixel values. It provides a means to determine the information content of the image, but it does not take spatial factors into consideration. A localized framework should therefore be able to automatically determine the skewness introduced by non-uniform (heterogeneous edge and texture) regions varying above the uniform (homogeneous) regions. Further, the thresholds should be chosen based on the distance between the valleys and peaks of the histogram. The segmentation process is then straightforward, with all necessary parameters dictated by the histogram analysis or soft thresholding. The input image is segmented into various uniform and non-uniform regions without any loss of integrity with respect to the input, with high efficiency and low time complexity. The histogram analysis offered a better means to classify the image into binary labels based on the image statistics of each region, with a soft threshold used for classifying and merging regions based on the localized statistics (i.e., uniformity and non-uniformity) of the image under consideration. Simulation results and analysis showed that this algorithm performs well in image segmentation without requiring a region of interest to be chosen.
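To make the histogram-threshold idea concrete, here is a small self-contained sketch (our own simplified illustration, not the fuzzy-classifier pipeline itself) that bins an image's grayscale values and places a threshold in the valley between the two dominant peaks; 'face.jpg' is a placeholder path:

import cv2
import numpy as np

# Simplified illustration of histogram thresholding; 'face.jpg' is a placeholder
img = cv2.imread('face.jpg', cv2.IMREAD_GRAYSCALE)

# 256-bin histogram: how many pixels fall into each intensity bin
# (normalized, it can be read as a probability distribution over pixel values)
hist = cv2.calcHist([img], [0], None, [256], [0, 256]).ravel()

# Take the deepest valley between the two tallest peaks as the threshold.
# A real implementation would smooth the histogram first, since the two
# tallest bins can sit next to each other.
p1, p2 = sorted(np.argsort(hist)[-2:])
valley = p1 + int(np.argmin(hist[p1:p2 + 1]))

# Split the image into two regions (binary labels) at the valley threshold
_, segmented = cv2.threshold(img, valley, 255, cv2.THRESH_BINARY)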
However, we then found a method that not only increases our efficiency but also makes the whole process of hosting the Python file a lot easier. This method uses the Haar Cascade and convolutional neural networks. We further trained our model with a large dataset of images, after which it could correctly predict the state of emotion. We then used Flask to host the website on our local machine and to integrate all the other components of the website.
REVIEW 3 (LAST)

INTRODUCTION

We know that emotions play a huge role in a person's life. In different moments, a person's face expresses how they feel or what mood they are in. Humans are capable of producing thousands of facial expressions during interactions that vary in complexity, intensity and meaning. Emotion or intention is often communicated by subtle changes in one or several discrete features. According to various surveys, verbal components convey one-third of human communication, and nonverbal components convey two-thirds. Among the many nonverbal channels, facial expressions, by carrying emotional states, are one of the most important sources of information in human communication. It is therefore natural that facial expression research has received a great deal of attention over the past decades, with applications not only in cognitive science but also in computing and performance. To find the optimal solution for emotion recognition in a virtual learning environment, both accuracy and efficiency must be achieved. Recognizing emotional changes in the facial expressions of online students would allow teachers to tailor strategies and teaching methods, provide students with real feedback, and achieve excellent teaching quality. The goal is to find the best solution for emotion recognition based on facial recognition in virtual, real-time learning environments. To achieve this goal, it is necessary to improve the accuracy and efficiency of face recognition systems.
In this modern era, emotional distress has become a general aspect of life, but some cases may get out of hand if proper care and guidance are not provided. Moreover, this decade has seen a jump in the number of suicides, with emotional conditions such as depression among the main causes. Therefore, our group decided to develop a product which would not only help individuals understand their outer emotional state, but also their inner mental condition, helping them take preventive measures for their psychological state before the situation gets out of hand.

CONTRIBUTION

Despite the long history of emotion detection, there are no comprehensive literature reviews covering the internal feelings of the person; some review papers have focused solely on conventional research without introducing the psychological test. Therefore, this paper includes a brief literature review spanning conventional to recent advanced emotion detection. The main contribution of this paper is that results are obtained through both emotion detecting techniques: first through the face, and then by taking a psychological test. Both results are then evaluated, and solutions are given for low, moderate and high scores of stress and anger, which the person can follow to overcome their current emotional state.

2. LITERATURE SURVEY

a. LITERATURE SURVEY TABLE

Each entry below lists the reference, the method used, an evaluation of the paper, and its merits and demerits.

Reference: https://www.academia.edu/41549558/Face_Recognition_System_Based_on_Raspberry_Pi_Platform
Method used: Raspberry Pi

Evaluation: In "Face Recognition System Based on Raspberry Pi Platform", Nafis Mustakim, Noushad Hossain and Mohammad Mustafizur Rahman present an approach to detect and identify a human face from real-time video, tracking a face and comparing it with stored data of known individuals. Their approach recognizes an individual within a fraction of a second while completely ignoring any background effect, and also presents additional details about that individual. Besides, the method shown in the paper works under different lighting conditions, which makes it suitable for a wide variety of environments without significant error. The authors implemented the test on a Raspberry Pi, utilizing Python libraries such as OpenCV and NumPy, with an SQLite database management system as the backbone of its memory. Through its swift and sterling performance, this system is worthy of being integrated into applications to boost automation and user-friendliness.

Pros:
1. The information of an individual is stored in the SQLite database along with that individual's 20 sample pictures. When the data of a person matches a previous record, his/her sample pictures are updated.
2. Usually a small beard, make-up, or the lighting in a room has a small effect on a human face, so the performance of a face recognition algorithm gradually declines; it might work satisfactorily one day but not consistently over a longer period. The method shown in this paper is smart enough to ignore these small changes and detect the person correctly.

Cons:
1. In most cases, a face recognition approach is only favorable in the place where the previous data was taken; as a result, it does not function authentically when a test is run in a new environment. Though the experiment in this paper performed really well, it did not provide a solution to this problem.

Reference: https://www.academia.edu/41363496/A_Comparison_of_Face_Verification_with_Facial_Landmarks_and_Deep_Features
Method used: Face verification

Evaluation: In "A Comparison of Face Verification with Facial Landmarks and Deep Features", Giuseppe Amato, Fabrizio Falchi, Claudio Gennaro and Claudio Vairo test and compare several methods used for facial recognition and verification. Among these, facial landmarks are very important in forensics, since the distance between certain characteristic points of a face can be used as an objective measure in court during trials. However, the efficiency of landmark-based approaches in verifying whether a face belongs to a given person is often not very good. Lately, deep learning approaches have been introduced to address the face verification problem, with very good results. The paper assesses the accuracy of facial landmarks and deep learning approaches in performing the face verification task. The experiments, conducted on a real case scenario, show that the deep learning approach greatly outperforms the facial landmarks approach in accuracy.

Pros:
1. Results from this paper show that the accuracy of deep features in verifying whether a face belongs to a given person is much greater than that of the facial-landmark-based approach.
2. The paper shows that the 68-point feature is two times better than the 5-point feature, and the Pairs feature slightly improves on the 68-point feature result.

Cons:
1. Though this paper clarifies that facial recognition works better with deep learning than with facial landmarks, it does not guarantee that deep learning can completely replace the facial landmark approach.

Reference: https://ieeexplore.ieee.org/document/8250725
Method used: Data mining

Evaluation: Data mining, sometimes called data or knowledge discovery, is the process of analyzing data from different perspectives and summarizing it into useful information. Image processing is related to computer vision, a higher-level form of image processing in which a machine/computer/software tries to decipher the physical contents of an image or a sequence of images. One way to do this is by comparing selected facial features from the image with a facial database. Recognizing emotion from images has become one of the active research themes in image processing and in applications based on human-computer interaction. This research conducts an experimental study on recognizing facial emotions. The flow of the emotion recognition system includes the basic processes of an FER system: image acquisition, image preprocessing, face detection, feature extraction and classification; once the emotions are classified, the system assigns the user particular music according to his emotion. The system focuses on live images taken from the webcam. The aim of the research is to develop an automatic facial emotion recognition system for stressed individuals, assigning them music therapy to relieve stress. The emotions considered in the experiments include the universally accepted happiness, sadness, surprise, fear, disgust and anger.

Pros:
1. In this research, facial expressions are considered for the recognition of emotions in human facial images taken live from the webcam.
2. The proposed system is independent of factors like gender, age, ethnic group, beard, backgrounds and birthmarks.
3. The proposed system is promising for individuals going through stress during their working hours, giving them music therapy.

Cons:
1. This is a challenging endeavor because of the uncertainty inherent in inferring hidden mental states from behavioral cues, and because the automated analysis of expressions in images is itself challenging.

Reference: https://ieeexplore.ieee.org/document/8621562
Method used: Bespoke machine learning (SVM)

Evaluation: This paper describes a new real-time emotional detection system based on a video feed. It demonstrates how a bespoke machine learning support vector machine (SVM) can be utilized to provide quick and reliable classification. The features used in the study are 68-point facial landmarks. In a lab setting, the application was trained to detect six different emotions by monitoring changes in facial expressions. Its utility as a basis for evaluating the emotional condition of people using video and machine learning is discussed.

Pros:
1. The prototype has external validity, as the emotions detected were consistent with the emotions presented by the speakers.
2. Given the accuracy found in the results, the initial signs suggest that affective computing research is close to providing a powerful new tool to quickly and objectively determine fundamental aspects of human well-being.

Cons:
1. The system could be improved by deploying the application on more powerful hardware; the authors hope to achieve classification on 30 fps video in the future through the use of mobile edge computing (MEC).

Reference: https://www.vislab.ucr.edu/PUBLICATIONS/pubs/Journal%20and%20Conference%20Papers/after10-1-1997/Conference/2011/Facial%20expression%20recognition%20using11.pdf
Method used: Facial expression recognition using Emotion Avatar Image

Evaluation: Existing facial expression recognition techniques analyze the spatial and temporal information of every single frame in a human emotion video. On the contrary, the authors create the Emotion Avatar Image (EAI) as a single good representation of each video or image sequence for emotion recognition. They adopt the recently introduced SIFT flow algorithm to register every frame with respect to an Avatar reference face model. An iterative algorithm is then used not only to super-resolve the EAI representation for each video and the Avatar reference, but also to improve recognition performance. Subsequently, features are extracted from the EAIs using both Local Binary Patterns (LBP) and Local Phase Quantization (LPQ), and the results from both texture descriptors are tested on the Facial Expression Recognition and Analysis Challenge (FERA2011) data, the GEMEP-FERA dataset. To evaluate this simple yet powerful idea, the algorithm is trained using only the 155 training videos from the GEMEP-FERA dataset. The results show that the algorithm eliminates person-specific information for emotion and performs well on unseen data.

Pros:
1. Performance is dramatically improved by the single-EAI representation compared with the baseline approach.
2. The 10-fold cross-validation result on the training data shows that a higher-resolution Avatar reference has the potential to generate better-quality EAIs and, subsequently, a higher classification rate.

Cons:
1. Cross-validation on training data does not enforce the exclusiveness of test and training subjects; therefore, this test result is subject to person-specific effects.
2. A high-level Avatar reference model tends to super-resolve the EAIs by enhancing facial details, which attenuates person-specific information.

Reference: http://ndl.iitkgp.ac.in/document/MDl5cHdNUUlnd0lnZHNoQXlvOG5lQm4vc0lwQmJ4WGxtbm1oRUFMYlJOMD0
Method used: Emotion recognition using autonomic nervous system responses

Evaluation: Emotion recognition from facial images is a very active research topic in human-computer interaction (HCI). However, most previous approaches focus only on frontal or nearly frontal view facial images. In contrast, emotion recognition from non-frontal or even arbitrary view facial images is much more difficult yet of more practical utility. To handle emotion recognition from arbitrary view facial images, this paper proposes a novel method based on the regional covariance matrix (RCM) representation of facial images. The authors also develop a new discriminant analysis theory aimed at reducing the dimensionality of the facial feature vectors while preserving the most discriminative information, by minimizing an estimated multiclass Bayes error derived under the Gaussian mixture model (GMM). They further propose an efficient algorithm to solve for the optimal discriminant vectors of the proposed discriminant analysis method. They render thousands of multi-view 2D facial images from the BU-3DFE database and conduct extensive experiments on the generated database to demonstrate the effectiveness of the proposed method. Notably, the method does not require face alignment or facial landmark localization, which makes it very attractive.

Pros:
1. It does not need face alignment or facial landmark localization for arbitrary view facial images, both of which are very challenging.
2. The proposed method is tested on many facial images with various views, generated from 3D facial expression models in the BU-3DFE database; the experimental results show that the method can achieve a satisfactory recognition performance.

Cons:
1. Although it has been proven to be an effective image representation method, the RCM representation may also discard some useful discriminant information, e.g., the class means of samples.
2. Finding a better image representation method may help to improve the performance of emotion recognition.

Reference: https://www.sciencedirect.com/science/article/pii/S1319157818303847
Method used: MAFONN-EP: a Minimal Angular Feature Oriented Neural Network based emotion prediction system in image processing

Evaluation: In recent days, facial emotion recognition techniques are considered important and critical because a wide array of application domains use facial emotion as part of their workflow and analytics gathering. Reduced recognition rate, inefficient computation and increased time consumption are the major drawbacks of these techniques. To overcome these issues, this paper develops a new facial emotion recognition technique named Minimal Angular Feature Oriented Neural Network based Emotion Prediction (MAFONN-EP). Initially, the input video sequence is split into image frames, which are pre-processed by eliminating noise with the Weighted Median Filtering (WMF) technique and by separating the background and foreground regions with the Edge Preserved Background Separation and Foreground Extraction (EPBSFE) technique. Then, a set of texture patterns is extracted from four key parts (the two eyes, the nose and the mouth) using the Minimal Angular Deviation (MAD) technique. Particular features are selected by employing the Cuckoo Search based Particle Swarm Optimization (CS-PSO) technique, which also reduces the feature dimensionality. Finally, the Weight Based Pointing Kernel Classification (WBPKC) technique is employed for recognizing the emotion. In the experimental results, the performance of the proposed technique is analyzed and compared using different performance measures such as accuracy, sensitivity, specificity, precision and recall.

Pros:
1. It utilizes different and efficient image processing techniques for facial emotion recognition.
2. During experimentation, various performance measures such as sensitivity, specificity, accuracy, precision, recall and recognition rate are used for evaluating the proposed system.
3. From the results, it is evident that the MAFONN-EP technique provides better results compared to the existing techniques.

Cons:
1. As per the results obtained using the proposed method, the "fear" emotion offered the highest value and "anger" offered the lowest value compared to other algorithms.
2. From the results, it is observed that the proposed MAFONN-EP provides an increased recognition rate, except for the "fear" and "happy" emotions, when compared to the other existing techniques.

3. BACKGROUND OF YOUR PROJECT WORK

Mental health has been majorly affected by the outbreak of Covid-19. This project helps users analyze their own emotions and find solutions if they are feeling stressed, depressed or angry. The project provides two fronts to verify the user's emotions. One is to capture a live image of the user through the webcam and, through image processing, detect the emotion the user is feeling the most. The other is to ask the user to respond to a psychological quiz to detect their emotion; its result presents the percentage of each emotion. Based on both results, solutions are provided to the user.

a. Various components of work

The main objective of the project is to detect human emotion, both internal and external. The project is divided into two modules. One module is based on detecting the user's emotion through his/her facial expressions, implemented using image processing. The other module is based on detecting emotion from the responses given by the user to multiple questions asked through a quiz.

Detecting emotion through image processing and the quiz can further be divided into four components.

First, the Haar Cascade classifier. It is used to detect the facial region in the frame captured by the webcam. Its functioning is as follows: instead of applying all 6000 features to a window at once, it groups the features into different stages of classifiers (f1, f2, ...) and applies them one by one. If any stage suggests that the subregion is not part of the face region, the algorithm rules that subregion out as not-face. If the subregion passes all the stages, it is classified as the face region. Once the facial region is detected, it runs through the machine learning algorithm.
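The sketch below is a minimal, self-contained illustration of this detection step using OpenCV's bundled pre-trained frontal-face cascade; it grabs a single webcam frame and extracts the face sub-region that is later handed to the CNN:

import cv2

# Pre-trained Haar Cascade for frontal faces (ships with OpenCV)
facecasc = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

cap = cv2.VideoCapture(0)
ret, frame = cap.read()                          # one frame from the webcam
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # the cascade works on grayscale

# Every candidate window runs through the staged classifiers; a window that
# fails an early stage is rejected immediately, so most non-face regions
# are discarded cheaply
faces = facecasc.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
for (x, y, w, h) in faces:
    roi_gray = gray[y:y + h, x:x + w]            # face sub-region for the CNN
cap.release()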

Second, the machine learning algorithm using a Convolutional Neural Network. OpenCV with Python is used to run the whole process, and the TensorFlow library is used to train the machine learning model. The model runs frame by frame to detect the emotion. The facial region detected through the Haar Cascade classifier is passed as input to the CNN (convolutional neural network) model and sent through multiple layers for emotion detection. Convolution is the first layer, where features are extracted using Gabor-like filters. The output is then sent through a max pooling layer, whose function is to progressively reduce the spatial size of the representation, cutting down the number of parameters and the computation in the network; the pooling layer operates on each feature map independently. The result is then sent through fully connected (FC) layers, and a softmax function is applied to classify the input with probabilistic values between 0 and 1. The emotion with the maximum softmax score is given as the result on the live webcam feed itself.
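As a worked illustration of this final step, the sketch below (with invented logit values, not outputs from our trained model) shows how softmax turns the raw scores of the last layer into probabilities between 0 and 1 and how the winning label is picked:

import numpy as np

emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy",
                4: "Neutral", 5: "Sad", 6: "Surprised"}

# Hypothetical raw scores (logits) from the last dense layer for one face
logits = np.array([1.2, -0.8, 0.1, 3.5, 2.0, 0.4, -1.1])

# Softmax: exponentiate and normalize so the seven scores sum to 1
probs = np.exp(logits - logits.max())   # subtract the max for numerical stability
probs /= probs.sum()

# The emotion with the maximum softmax score is the displayed result
label = emotion_dict[int(np.argmax(probs))]   # "Happy" for these logits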

Third, the quiz. The user is presented with 24 questions which must be answered on a 5-point scale. The responses then go through an algorithm that gives the percentage of each emotion: Stress, Happy, Anger, Sad, Disgust, and Fear. The result signifies what emotion the user is feeling.
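The exact scoring algorithm is not reproduced in this report, so the following is only a rough sketch of how a 24-question, 5-point-scale quiz could be converted into per-emotion percentages; the question-to-emotion assignment (four questions per emotion) is a hypothetical choice for illustration:

# Hypothetical quiz scoring sketch; the real question/emotion mapping differs
EMOTIONS = ["Stress", "Happy", "Anger", "Sad", "Disgust", "Fear"]

# Assume 24 questions, four per emotion, in order (made up for illustration)
QUESTION_EMOTION = [EMOTIONS[i // 4] for i in range(24)]

def score_quiz(answers):
    """answers: list of 24 integers, each on a 1-5 scale."""
    totals = {e: 0 for e in EMOTIONS}
    for ans, emotion in zip(answers, QUESTION_EMOTION):
        totals[emotion] += ans
    grand_total = sum(totals.values())
    # Percentage of each emotion, as shown on the result page
    return {e: round(100 * t / grand_total, 1) for e, t in totals.items()}

# Example: answering 5 on every stress question and 1 everywhere else
print(score_quiz([5] * 4 + [1] * 20))   # Stress scores 50.0, the rest 10.0 each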
Fourth, solutions. The user is presented with solutions to help them overcome stress, depression and anger. These solutions vary according to the level of emotion the user is feeling: if they are mildly stressed, motivational videos are suggested; for moderate stress, calming music is suggested; for a high stress score, personal counselling is suggested. Similar solutions are given for the other emotions as well.
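A minimal sketch of this mapping is given below; the cut-off percentages are assumptions for illustration, not the values used on the website:

# Hedged sketch of the stress-solution mapping; thresholds are assumed values
def stress_solution(stress_pct):
    if stress_pct < 30:        # mild stress (assumed cut-off)
        return "Suggested: motivational videos"
    elif stress_pct < 60:      # moderate stress (assumed cut-off)
        return "Suggested: calming music"
    else:                      # high stress
        return "Suggested: personal counselling"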

4. Proposed Work

Once the user lands on the website, they are provided with information
about the project and how it works. The user can then begin analysis of their
current emotion.

First, we analyze the external emotion the user is feeling by capturing their live video through the webcam. Each frame goes through the Haar Cascade classifier to detect the facial region. Once the face is detected, the sub-region goes through the machine learning algorithm based on a CNN (convolutional neural network) to predict the emotion the user is experiencing. The result is based on the maximum softmax score given by the model. The accuracy of the model will depend on the number of epochs used in training. The user can see the result of external emotion detection on their live feed itself.
Next, the user is asked to take a psychological test. He/she must answer all the questions on the 5-point scale. Once submitted, the result showcases the percentage of each emotion the user is feeling: Stress, Happy, Anger, Sad, Disgust, and Fear.

Then, the results obtained through both the emotion detecting techniques
will help the user understand their emotion. Solutions are given for low,
moderate and high scores of stress and anger. The user can follow the
solutions to overcome their current emotional state.

a. Block Diagrams and Architectures of your Project

Figure 1 depicts the architecture of the project

Figure 2 depicts the multiple stages image sub-region goes through to detect the facial region
in Haar Cascade Classifier


Figure 3 depicts how the emotion of a user is detected based on the trained model

Figure 4 depicts the working of Convolution Neural Network Algorithm. The Face region,
detected through Haar Cascade Classifier, is passed through multiple layers to detect the
emotion

b. Source Code
Camera.py

import cv2
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D

# Haar Cascade classifier used to detect the facial region in each frame
facecasc = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Labels for the seven output classes of the CNN (alphabetical order)
emotion_dict = {0: "Angry", 1: "Disgusted", 2: "Fearful", 3: "Happy",
                4: "Neutral", 5: "Sad", 6: "Surprised"}

# CNN architecture: convolution/pooling blocks followed by fully connected layers
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(48, 48, 1)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))
model.load_weights('model.h5')  # weights produced by the training run


class VideoCamera(object):
    def __init__(self):
        # Open the webcam once and keep the handle on the instance
        self.video = cv2.VideoCapture(0)

    def __del__(self):
        self.video.release()
        cv2.destroyAllWindows()

    def get_frame(self):
        success, frame = self.video.read()
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = facecasc.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
        for (x, y, w, h) in faces:
            # Draw a bounding box around the detected face
            cv2.rectangle(frame, (x, y - 50), (x + w, y + h + 10), (255, 0, 0), 2)
            roi_gray = gray[y:y + h, x:x + w]
            # Resize the face region to the 48x48 grayscale input the CNN expects
            cropped_img = np.expand_dims(
                np.expand_dims(cv2.resize(roi_gray, (48, 48)), -1), 0)
            prediction = model.predict(cropped_img)
            maxindex = int(np.argmax(prediction))
            # Overlay the predicted emotion label on the live frame
            cv2.putText(frame, emotion_dict[maxindex], (x + 20, y - 60),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
            break  # annotate only the first detected face
        # Encode the annotated frame as JPEG for the Flask video stream
        ret, jpeg = cv2.imencode('.jpg', frame)
        return jpeg.tobytes()
Main.py

import os
import cv2
from flask import Flask, render_template, Response
from camera import VideoCamera

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # suppress TensorFlow info logging

app = Flask(__name__)


@app.route('/')
def index():
    # Landing page: project information and the "Begin Analysis" option
    return render_template('index.html')


# If you want to retrain the model, use the commented-out code below
# (the architecture is the same one defined in camera.py):
#
# from tensorflow.keras.optimizers import Adam
# from tensorflow.keras.preprocessing.image import ImageDataGenerator
#
# train_dir, val_dir = 'data/train', 'data/test'
# num_train, num_val = 28709, 7178
# batch_size, num_epoch = 64, 50
#
# train_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
#     train_dir, target_size=(48, 48), batch_size=batch_size,
#     color_mode="grayscale", class_mode='categorical')
# validation_generator = ImageDataGenerator(rescale=1./255).flow_from_directory(
#     val_dir, target_size=(48, 48), batch_size=batch_size,
#     color_mode="grayscale", class_mode='categorical')
#
# model.compile(loss='categorical_crossentropy',
#               optimizer=Adam(lr=0.0001, decay=1e-6),
#               metrics=['accuracy'])
# model_info = model.fit_generator(
#     train_generator,
#     steps_per_epoch=num_train // batch_size,
#     epochs=num_epoch,
#     validation_data=validation_generator,
#     validation_steps=num_val // batch_size)
# model.save_weights('model.h5')
# (a plot_model_history helper can then plot the accuracy and loss
# curves stored in model_info.history)


def gen(camera):
    # Prevents OpenCL usage and unnecessary logging messages in OpenCV
    cv2.ocl.setUseOpenCL(False)
    while True:
        # Each frame returned by the camera already carries the bounding
        # box and the predicted emotion drawn on it (see camera.py)
        frame = camera.get_frame()
        yield (b'--frame\r\n'
               b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n\r\n')


@app.route('/video_feed')
def video_feed():
    # Stream the annotated webcam frames as a multipart MJPEG response
    return Response(gen(VideoCamera()),
                    mimetype='multipart/x-mixed-replace; boundary=frame')


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000, debug=True)
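In the browser, the /video_feed route is typically consumed by pointing an HTML img element at it from the index.html template (the template itself is not reproduced in this report); the browser then renders the multipart JPEG stream returned by video_feed() as live video.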
5. EVALUATION AND RESULT ANALYSIS

a. Evaluation analysis

The neural network architecture that we developed, though it gives the impression of a shallow network, works comparably well against various previous emotion detection network architectures that consisted of deep and complicated neural network components such as batch normalization and transposed convolutions. We were able to attain an accuracy of 91.1% on the training set and 60.2% on the validation set, which is substantial compared to other implementations that achieved only 55.1% on their validation data. From the graph presented below, we can also conclude that our architecture steadily reduced the model's loss. This reduction in the loss function once again shows that the shallow network we developed works exceptionally well.

b. Result Analysis

After working on this project for two months, we were able to develop a product that is efficient at detecting the mental condition of a person. We have also tested our product in practice and were able to predict the emotional state of a person accurately. We also designed our website with colors that are soothing to the user's eyes, so as not to aggravate their emotional state.
c. Result of Emotion Detection using Image Processing

Video streaming Demonstration


d. Snapshots of Implementation

Figure 6 depicts the first page the user comes across. The user can directly Begin Analysis.

Figure 7 depicts the additional information provided to the user about different emotions.


Figure 10 depicts the page user comes across when they head over to the Quiz.

Figure 11 depicts a snippet of 24 questions presented to the user in the Emotion Detection Quiz.


Figure 12 depicts the user's result based on their responses. The percentage of each emotion is calculated and displayed for the user's self-analysis.

Figure 13 depicts the page the user comes across when they click on Solutions.


Figure 14 depicts that the user can contact their nearest counsellor.

Figure 15 depicts the solution for a Low Stress Score.


Figure 16 depicts the solution for a Moderate Stress Score.

Figure 17 depicts the solution for a High Stress Score.

7. Overall Discussion

As we know, emotions express a human being's feelings and reflect his/her mental health. In this busy lifestyle, people get stressed really fast and often do not even realize it. So we came up with a website which can help a person understand his/her mental state.

Here we use two powerful image processing tools. First, the Haar Cascade classifier, used to detect the facial region in the frame captured by the webcam. Second, a machine learning algorithm using a Convolutional Neural Network: OpenCV with Python is used to run the whole process, the TensorFlow library is used to train the machine learning model, and the model runs frame by frame to detect the emotion. The facial region detected through the Haar Cascade classifier is passed as input to the CNN (convolutional neural network) model.

For a better result, we also came up with an emotion-testing quiz through our research; after taking this quiz, we can tell more accurately which emotion one is feeling right now. These emotions include Stress, Happy, Anger, Sad, Disgust, and Fear.

8. Conclusion

This project was created and tested, and it successfully detects one's emotions both by image detection and by the quiz. The image detection part works in real time, with on-the-spot results on camera, while the quiz gives the percentage of each emotion one is feeling, covering Stress, Happy, Anger, Sad, Disgust, and Fear.

9. References

Nithya Roopa, S. (2019). Emotion Recognition from Facial Expression using Deep Learning. International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume-8, Issue-6S.

Varghese, A. A., Cherian, J. P. and Kizhakkethottam, J. J. (2015). Overview on emotion recognition system. 2015 International Conference on Soft-Computing and Networks Security (ICSNS), Coimbatore, 2015, pp. 1-5, doi: 10.1109/ICSNS.2015.7292443.

Singh, Dilbag (2012). Human Emotion Recognition System. International Journal of Image, Graphics and Signal Processing, 4, doi: 10.5815/ijigsp.2012.08.07.
