
Natl. Acad. Sci. Lett.

https://doi.org/10.1007/s40009-023-01346-4

SHORT COMMUNICATION

An Efficient Model for Facial Expression Recognition with Music Recommendation

Brijesh Bakariya1 · Krishna Kumar Mohbey2 · Arshdeep Singh1 · Harmanpreet Singh1 · Pankaj Raju1 · Rohit Rajpoot1

Received: 23 February 2023 / Revised: 20 July 2023 / Accepted: 10 August 2023


© The Author(s), under exclusive licence to The National Academy of Sciences, India 2023

Abstract  An AI interactive robot can identify human faces, determine the emotions of the person it is chatting with, and then pick appropriate replies by using algorithms that analyze facial expressions and recognize faces, such as face recognition and emotion recognition algorithms. Deep learning is currently the most effective method for carrying out such tasks. We have developed a real-time system that can recognize human faces, determine human emotions, and provide users with music recommendations by utilizing deep learning and a few Python modules. The models presented in this article are trained on the OAHEGA and FER-2013 datasets. The accuracy of our proposed model was compared with several baseline approaches, and the results were quite affirmative. Anger, fear, happiness, neutrality, sadness, and surprise are the six emotions that our CNN model can predict.

Keywords  Face recognition · Facial expression recognition · Artificial intelligence · CNN · Deep learning

Significance statement: The proposed approach predicts a person's emotion and recognizes the person's face from their facial expressions, and recommends music based on the detected emotion. This system can be of great use in several applications.

* Brijesh Bakariya
brijeshmanit@gmail.com
Krishna Kumar Mohbey
kmohbey@gmail.com
Arshdeep Singh
ishir.sagoo@gmail.com
Harmanpreet Singh
singhharmanpreet21@gmail.com
Pankaj Raju
rajupankaj20@gmail.com
Rohit Rajpoot
rohitrajpoot7696@gmail.com

1 Department of Computer Science and Engineering, I.K. Gujral Punjab Technical University Campus, Hoshiarpur, Punjab, India
2 Department of Computer Science, Central University of Rajasthan, Ajmer, India

Facial expressions reflect a person's state of mind: they mirror emotions impeccably and are kingpins of nonverbal human communication. Charles Darwin proposed the concept of universal emotions, arguing that emotional experiences are hardwired into every human being [1]. Several past works [2–5] have applied the study of facial expressions in different application domains. Reference [6] used face recognition and emotion recognition output for a humanoid robot built on convolutional neural networks (CNN).

We propose a real-time system consisting of three phases. In the first phase, we use the Haar cascade technique [7] to locate human faces; the "haarcascade_frontalface_default.xml" template serves this purpose. We use OpenCV to connect the camera and capture the image before applying the Haar cascade classifier. The image is then converted to grayscale, and the detectMultiScale method locates faces of varying sizes within the input image. If a face is detected, the image is shown in RGB and the cv2.rectangle function draws a rectangle from the coordinates (x, y), width ("w"), and height ("h"); otherwise, "No Face Detected." is shown on the screen. We use the cv2.imread(img) function to capture image details, and then, using face recognition encodings, the images are processed and compared against other encoded images to determine whether the face is already recognized or is a newly detected instance. At the GUI, the user is provided with an "Add the person's face" button to add the recognized face to the database for future reference and to reduce the time consumed. The entire database is read and encoded only once, at setup time, rather than continuously. Newly recognized faces are encoded and appended to the list with an alphabetically ordered index. The detected face image is reduced to 48 by 48 pixels and is then ready for the next phase.
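The paper describes this pipeline without code; the sketch below is a minimal illustration of the first phase, combining OpenCV's Haar cascade detector with encoding-based comparison via the face_recognition library. The detector parameters and the known_encodings/known_names lists (standing in for the face database) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of phase 1: Haar-cascade face detection, followed by
# face-encoding comparison. GUI and database handling are omitted.
import cv2
import face_recognition

# Load the frontal-face detection template named in the paper.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# Stand-ins for the face database, which the paper encodes once at setup.
known_encodings, known_names = [], []

cap = cv2.VideoCapture(0)                       # connect the camera
ret, frame = cap.read()
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # detector expects grayscale
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

if len(faces) == 0:
    print("No Face Detected.")
else:
    for (x, y, w, h) in faces:
        # Draw the rectangle from (x, y) with width w and height h.
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        # Encode the detected face; face_recognition expects RGB and
        # (top, right, bottom, left) box coordinates.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        encs = face_recognition.face_encodings(rgb, [(y, x + w, y + h, x)])
        if encs and known_encodings:
            matches = face_recognition.compare_faces(known_encodings, encs[0])
            if any(matches):
                print("Recognized:", known_names[matches.index(True)])
        # Resize to 48x48 pixels for the emotion-recognition phase.
        face_48 = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
cap.release()
```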

Fig. 1  CNN-based framework for emotion classification

Table 1  Confusion matrix (rows: predicted class; columns: actual class)

Predicted   Angry  Fear  Happy  Neutral  Sad  Surprise
Angry          65     2      1        2    2         0
Fear            1     6      1        4    0         0
Happy           8     0     74       12    1         2
Neutral        12     0      3       84    6         0
Sad             5     3     11       17   28         0
Surprise        3     2      5        3    3         7

Accuracy = 70.77%
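The reported figure follows directly from Table 1: accuracy is the sum of the diagonal (correct predictions) divided by the total number of samples, 264/373 ≈ 70.77%. A quick check:

```python
# Verify Table 1's accuracy: diagonal (correct) over total predictions.
import numpy as np

cm = np.array([[65, 2, 1, 2, 2, 0],    # rows: predicted class
               [1, 6, 1, 4, 0, 0],     # cols: actual class
               [8, 0, 74, 12, 1, 2],   # order: angry, fear, happy,
               [12, 0, 3, 84, 6, 0],   #        neutral, sad, surprise
               [5, 3, 11, 17, 28, 0],
               [3, 2, 5, 3, 3, 7]])

accuracy = np.trace(cm) / cm.sum()     # 264 / 373
print(f"{accuracy:.4f}")               # 0.7078, i.e., 70.77%
```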

In the second phase, we develop a CNN model for facial emotion recognition that can automatically and adaptively learn spatial hierarchies of features, moving from low-level to high-level patterns in grid-structured data, as proposed in [8]. We combine the FER-2013 (https://www.kaggle.com/datasets/msambare/fer2013) and OAHEGA [9] datasets to obtain 43,003 images for training and 8856 images for testing, covering six emotions: anger, fear, happy, neutral, sad, and surprise. The developed model has 12 layers. The initial layer is the input layer, which receives 48-by-48-pixel images on a single channel. The convolution and pooling layers perform feature extraction, and the fully connected layers classify the emotion on the face, as illustrated in Fig. 1. The hyperparameters were tuned as learning rate (lr) = 0.0001, decay = 1e-6, batch size = 32, epochs = 24, and optimizer = Adam. The confusion matrix for the process is depicted in Table 1.
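The exact layer arrangement is not given in the text, so the Keras sketch below is one plausible reconstruction rather than the authors' model. The kernel sequence (32–64–64–128), the 48-by-48 single-channel input, the six output classes, and the Adam settings come from the paper; the arrangement was chosen so that the trainable-parameter count equals the reported 19,011,142. Dropout rates and the 1024-unit dense layer are assumptions.

```python
# One plausible Keras realization of the described 12-layer CNN.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(48, 48, 1)),                 # 48x48 grayscale
    layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),                     # 48 -> 24
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),                     # 24 -> 12
    layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
    layers.Dropout(0.25),
    layers.Flatten(),                                # 12*12*128 = 18432
    layers.Dense(1024, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(6, activation="softmax"),           # six emotions
])
# model.summary() reports 19,011,142 trainable parameters.

# Reported training setup: Adam, lr = 0.0001, batch size = 32, 24 epochs
# (the paper's decay = 1e-6 corresponds to the legacy Keras
# learning-rate-decay argument of the optimizer).
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=32, epochs=24,
#           validation_data=(x_val, y_val))
```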
In the third phase, an algorithm analyzes the detected emotions over a threshold time window and extracts the predominant emotion. Beforehand, a separate CSV file named after each emotion under study is built with suitable songs; this forms our music database. The Python library pygame and its mixer sub-library are used to control music playback. Once the user initiates the "Suggest a Song" operation on the graphical user interface (GUI), the algorithm uses the recognized emotion to select a song at random from the corresponding CSV file. In parallel, the pygame music mixer is started and the relevant song is played; this parallel process can be controlled through the GUI. The entire process is depicted in Fig. 2.
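A minimal sketch of this phase, assuming one song path per CSV row and files named after each emotion (e.g., happy.csv); only the use of pygame's mixer and random selection from a per-emotion CSV is taken from the paper's description:

```python
# Minimal sketch of phase 3: pick a random song from the CSV file named
# after the detected emotion and play it with pygame's mixer.
# CSV layout (one song path per row) and file naming are assumptions.
import csv
import random
import pygame

def suggest_song(emotion: str) -> str:
    """Select a random song path from '<emotion>.csv', e.g. 'happy.csv'."""
    with open(f"{emotion}.csv", newline="") as f:
        songs = [row[0] for row in csv.reader(f) if row]
    return random.choice(songs)

pygame.mixer.init()                    # start the pygame music mixer
song = suggest_song("happy")           # emotion comes from the CNN phase
pygame.mixer.music.load(song)
pygame.mixer.music.play()              # playback runs in parallel; the GUI
# can call pygame.mixer.music.pause()/unpause()/stop() to control it.
```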
The model consists of 12 layers with 32–64–64–128 kernels and 19,011,142 trainable parameters. For an image of M by N pixels and a CNN with k kernels, the estimated complexity is O(MNk²). It consumed 0.70252 s and achieved a training accuracy of 0.9413 with a loss of 0.1687 and a validation accuracy of 0.7302 with a loss of 1.0412. The technique proposed in [10] has an accuracy of 0.65, and that in [11] an accuracy of 0.68. These were surpassed by the smoothed deep neural network ensemble proposed in [12], which achieved an accuracy of 0.71, and by the Bayesian CNN technique of Ref. [13], which achieved an accuracy of 0.72. Our proposed model surpasses all of these methods with an accuracy of 0.732.


Fig. 2  GUI representation of the model

The proposed model has been evaluated in real time with fewer computing resources, and the results are quite encouraging. It is a viable option for various applications such as face recognition-based mobile unlocking, camera-based applications, social networking apps, AI bots, gaming, healthcare, and the learning industry. Processing power, unbalanced data, overfitting and underfitting, and the limited research on music-oriented facial recognition are a few of the challenges that remain to be explored in future work.

Declarations

Conflict of interest  All the authors declare that there is no conflict of interest.

References

1. Dalvi C, Rathod M, Patil S, Gite S, Kotecha K (2021) A survey of AI-based facial emotion recognition: features, ML & DL techniques, age-wise datasets and future directions. IEEE Access 9:165806–165840


2. Boragule A, Akram H, Kim J, Jeon M (2022) Learning to resolve uncertainties for large-scale face recognition. Pattern Recogn Lett 160:58–65
3. Basha SM, Rajput DS (2018) Parsing based sarcasm detection from literal language in tweets. Recent Patents Comp Sci 11(1):62–69
4. Basha SM, Rajput DS, Thabitha TP, Srikanth P, Pavan Kumar CS (2019) Classification of sentiments from movie reviews using KNIME. In: Proceedings of the 2nd international conference on data engineering and communication technology: ICDECT 2017, pp 633–639. Springer, Singapore
5. Varshney N, Bakariya B, Kushwaha AKS (2022) Human activity recognition using deep transfer learning of cross position sensor based on vertical distribution of data. Multimed Tools Appl 81(16):22307–22322
6. Dwijayanti S, Iqbal M, Suprapto BY (2022) Real-time implementation of face recognition and emotion recognition in a humanoid robot using a convolutional neural network. IEEE Access 10:89876–89886
7. Viola P, Jones M (2001) Rapid object detection using a boosted cascade of simple features. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition (CVPR 2001), vol 1, pp I-I. IEEE, New York
8. Varshney N, Bakariya B (2021) Deep convolutional neural model for human activities recognition in a sequence of video by combining multiple CNN streams. Multimed Tools Appl, pp 1–13
9. Kovenko V, Shevchuk V (2021) OAHEGA: emotion recognition dataset. Mendeley Data, V2. https://doi.org/10.17632/5ck5zz6f2c.2
10. Meena G, Mohbey KK, Indian A, Kumar S (2022) Sentiment analysis from images using VGG19 based transfer learning approach. Proc Comp Sci 204:411–418
11. Yang L, Zhang H, Li D, Xiao F, Yang S (2021) Facial expression recognition based on transfer learning and SVM. J Phys Conf Ser 2025(1):012015
12. Benamara NK, Val-Calvo M, Alvarez-Sanchez JR, Diaz-Morcillo A, Ferrandez-Vicente JM, Fernandez-Jover E, Stambouli TB (2021) Real-time facial expression recognition using smoothed deep neural network ensemble. Integr Comput Aided Eng 28(1):97–111
13. Tai Y, Tan Y, Gong W, Huang H (2021) Bayesian convolutional neural networks for seven basic facial expression classifications. arXiv preprint arXiv:2107.04834

Publisher's Note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

