
Proceedings of the International Conference on Automation, Computing and Renewable Systems (ICACRS 2022)
IEEE Xplore Part Number: CFP22CB5-ART; ISBN: 978-1-6654-6084-2; DOI: 10.1109/ICACRS55517.2022.10029108

Deep Learning Aided Emotion Recognition from Music

R Raja Subramanian, Kokkirala Aditya Ram, Dola Lokesh Sai, K Venkatesh Reddy, Kondeti Akarsh Chowdary, Kundu Dheeraj Datta Reddy
Department of Computer Science and Engineering, Kalasalingam Academy of Research and Education, Virudhunagar, Tamil Nadu, India
rajasubramanian.r@klu.ac.in, adityaramkokkirala@gmail.com, lokesh091403@gmail.com, kvenkyreddy113@gmail.com, akarshchowdary2035@gmail.com, dheerajdattakundu@gmail.com

Abstract— Emotion identification from audio signals is a contemporary study area in the Human-Computer Interaction domain. The desire to improve the communication interface between people and digital media has increased. The emotion of a song is detected through its music; music is a great medium for conveying emotion, and the practice of determining emotions from music snippets is known as music emotion recognition. The audio dataset is collected from Kaggle. Researchers are increasingly concerned with improving the precision of emotion recognition techniques; however, a complete system that can discern emotions from speech has not yet been developed. This research work suggests a novel emotion recognition technique in which neural networks are trained to identify emotions from the retrieved features. The performance of the neural networks is then compared with baseline machine learning classification algorithms. The obtained results show that MFCC characteristics combined with a deep RNN perform better for instrument emotion identification, and that MFCC features paired with a deep neural network outperform other emotion recognition methods. They also show that the class has a major influence on the mood evoked by music. To make human-computer interaction more natural, the computer should be able to perceive different emotional states. A person's voice is very important in assessing individuals, and the emotion of the individual is detected through their speech. These audio clips are further classified as joyful, sad, neutral, or fearful.

Keywords— Audio Emotion Recognition, Deep Learning, Neural Network, LSTM (Long Short-Term Memory) and MFCC (Mel Frequency Cepstral Coefficient).

I. INTRODUCTION

Music is a powerful tool that has many positive effects on the human body and mind, both stimulating and relaxing. The part of the human mind that perceives music lies close to the region where emotional expressions are processed, so there is a direct relationship between music and emotional expression. Several research groups are looking at the connection [1] between music and emotion. Music researchers study the relationship between aural signals and the different emotions they express, and for the most part they build emotional models, while computer scientists build algorithms that recognize musical emotion automatically. Speech emotion recognition is a task that uses deep learning to classify sounds. The project's goal is to analyse spoken sounds and classify the accompanying emotion. This paradigm is applicable to any sound-based recognition project, including speech, music, and songs. [11]

Emotion categorization follows genre classification. For music retrieval, researchers are trying to use emotion in addition to conventional metadata such as genre and title. Many music websites have likewise built song recommendation systems to meet similar needs: based on user requests and the tracks users typically listen to, the system will also suggest similar songs from the music library. Recently, various listening sites have begun to offer music recommendation services for different moods to provide a better user experience, yet there are only a handful of music emotion classification systems and emotion-based search engines. [22] Emotion-based music retrieval is therefore an important part of meeting people's individualized music retrieval needs, as well as an essential development direction for current music retrieval. Several music specialists have contributed manual annotation of the relationship between feature quantity and song emotion, [18] and music works must be labelled with emotions to achieve emotion-based music identification and retrieval. Manual emotional annotation of vast music collections is not only time-consuming but also of uncertain quality. Consequently, investigating automatic music emotion identification technology and implementing automated emotion labelling of music works is a fundamental need. [20] To improve the system's reliability and resilience, a classification method simulates a feature classifier and is used to analyse each feature, resulting in a musical sentiment. The underlying recognition model in this study is a neural network.

II. DATASET

In music emotion recognition, the most commonly used public datasets mostly consist of audio recordings conveying emotions. Each of the two female performers and their emotions is assigned its own folder in the dataset, [17] and within it are all 5600 target-word audio files in WAV format. The classes in this dataset are joyful, sad, neutral, wrath, disgust, pleasure, and pleasant surprise. Several emotion models have been proposed in psychology and physiology. The Thayer model is probably the most relevant model to the sense of music, as it is highly related to musical aspects. Thayer's 2D model is based on two important and useful parameters: the energy of music and the pleasure of music, also called arousal and valence respectively. The moment a person listens to an angry or joyful melody, the pulse and circulatory load increase (high-energy music); these factors relate to the arousal dimension. Overall, higher blood cortisol levels were associated with positive valence. The model consists of a two-dimensional plane divided into four clusters based on the arousal and valence factors, each located in a quadrant of the plane. [29]


As shown in Fig. 2, the resulting four clusters are furious, joyful, relaxed, and sad; we applied this emotion model in our suggested technique.

Fig. 1 Some audio clips in the dataset
Fig. 1.2 Emotion classes of the dataset

III. FEATURES

In this work, the features extracted from the instrumental music clips give an excellent computational approach for describing the audio data. [19]

3.1 MFCC
Audio signal processing, a subset of signal processing, is concerned with the electronic manipulation of audio signals. Audio signals are electronic representations of sound waves: longitudinal waves that travel through air as compressions and rarefactions. The energy of audio signals is usually measured in decibels; this representation, however, is rarely employed directly in instrument emotion recognition. MFCC features enable near-human perception accuracy because they account for human perceptual sensitivity to frequencies. [7] To compute the MFCCs, the music clips are divided into 20 ms frames with a 10 ms shift. Each frame contributes MFCC properties such as static, derivative (delta), and acceleration (delta-delta) coefficients. The Python speech features module is used to extract the features. The MFCC feature was created by combining four different instrument clips and portrays the corresponding emotion.

Fig. 2 Speech recognition using MFCC (steps: speech preprocessing, framing, applying a Hamming window, FFT, Mel-frequency wrapping, cepstrum, MFCC)

Fig. 2 represents speech recognition using the MFCC (Mel-frequency cepstral coefficient); it shows the steps involved in the MFCC technique, which divides the signal according to its frequencies.
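As a minimal sketch (not the authors' exact code), the frame-level MFCC computation described above, with 20 ms windows, a 10 ms shift, and static plus delta and acceleration coefficients from the python_speech_features module, might look as follows; the file path, the number of static coefficients, and the FFT size are illustrative assumptions.

import numpy as np
import scipy.io.wavfile as wav
from python_speech_features import mfcc, delta

# Sketch: frame-level MFCCs with delta and acceleration coefficients,
# following the 20 ms window / 10 ms shift described in Section 3.1.
rate, signal = wav.read("clip.wav")            # placeholder path; a mono WAV is assumed

static = mfcc(signal, samplerate=rate,
              winlen=0.02, winstep=0.01,       # 20 ms frames, 10 ms shift
              numcep=13,                       # 13 static coefficients (assumed)
              nfft=1024)                       # FFT size large enough for a 20 ms frame at 44.1 kHz
d1 = delta(static, 2)                          # derivative (delta) coefficients
d2 = delta(d1, 2)                              # acceleration (delta-delta) coefficients

features = np.hstack([static, d1, d2])         # one 39-dimensional feature vector per frame
print(features.shape)                          # (num_frames, 39)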


3.2 LSTM
Long short-term memory (LSTM) is a type of artificial neural network used in artificial intelligence and deep learning. Unlike traditional feedforward neural networks, an LSTM has feedback connections. Many machine learning models have been developed to identify emotions from text; in this work, we concentrate on the bidirectional LSTM model. Bidirectional LSTMs, or Bi-LSTMs, are an extension of standard LSTMs used to improve performance on sequence classification problems. Bi-LSTMs train on the sequential input using two LSTMs: the first LSTM is applied directly to the input sequence, while the second is applied to a reversed representation of the input sequence. This helps to incorporate context and speeds up the model. [2]

Fig. 3 Architecture of speech emotion recognition [21]

Fig. 3 represents a speech recognition architecture using CNN-LSTM, a deep multi-task learning-based recognizer: a 2D-CNN-LSTM with two convolutional layers followed by two long short-term memory layers.
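The bidirectional arrangement described in Section 3.2, one LSTM reading the sequence forward and one reading it reversed, could be sketched in Keras as follows. The layer widths, dropout rate, input shape, and optimizer are our assumptions for illustration, not the configuration reported in this paper.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Bidirectional, LSTM, Dense, Dropout

# Sketch: a bidirectional LSTM classifier for the 7 emotion classes.
model = Sequential([
    # One LSTM reads the MFCC sequence forward, the other reads it reversed;
    # their outputs are concatenated, adding context from both directions.
    Bidirectional(LSTM(128), input_shape=(None, 40)),
    Dropout(0.3),
    Dense(64, activation="relu"),
    Dense(7, activation="softmax"),            # 7 emotion classes in the dataset
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()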
IV. Data Preprocessing

Preprocessing is an important step before feature extraction and classification. In this study, a second-order Butterworth filter is employed to remove noise from the music sound sources. Furthermore, the dataset's music recordings are converted to mp3 format with a sampling frequency of 41100 Hz. [8]
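A minimal sketch of the second-order Butterworth denoising step using SciPy follows; the paper does not state the cutoff frequency or filter type, so the 4 kHz low-pass design and the file path below are assumptions.

import librosa
from scipy.signal import butter, filtfilt

# Sketch: second-order Butterworth filtering of an audio signal with SciPy.
y, sr = librosa.load("clip.wav", sr=None)                 # placeholder file path

cutoff_hz = 4000.0                                        # assumed cutoff frequency
b, a = butter(N=2, Wn=cutoff_hz / (sr / 2), btype="low")  # second-order low-pass design
y_denoised = filtfilt(b, a, y)                            # zero-phase filtering of the signal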
4.1 Modules
The pandas module is a data manipulation and analysis tool. NumPy is a Python library that can perform a wide range of mathematical operations on arrays. Matplotlib is a data visualization and graphical plotting library. os is used to manage files through system commands. Seaborn provides similar functionality built on top of Matplotlib. librosa is an audio file analysis package, [15] and librosa.display is used to render audio data as graphics. Audio is used to display and listen to sound clips, and warnings is used to adjust how warnings are reported.
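As a rough sketch, the module list above corresponds to an import block like the following; the aliases and the blanket warnings filter are our assumptions.

import os                        # manage files through system commands
import warnings                  # adjust how warnings are reported

import pandas as pd              # data manipulation and analysis
import numpy as np               # mathematical operations on arrays
import matplotlib.pyplot as plt  # data visualization and graphical plotting
import seaborn as sns            # plotting utilities built on top of Matplotlib
import librosa                   # audio file analysis
import librosa.display           # render audio data as graphics
from IPython.display import Audio  # display and listen to sound clips

warnings.filterwarnings("ignore")  # silence non-critical warnings (assumed setting)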
Psychology and physiology have so far put forward various emotion models. The Thayer model is one of the most frequently applied models of musical emotion because of how closely it relates to musical elements. Two essential and useful parameters, music energy and music pleasure (often known as arousal and valence respectively), form the basis of Thayer's 2D model. People's heart rate and blood pressure rise when they listen to angry or cheerful music (music with high energy); the arousal dimension is associated with these factors. Positive valence, in turn, was connected with increased blood cortisol levels.

The first five file paths and the first five labels of the dataset's voice records are listed. We then build a data structure out of the audio files and their labels. The provided input is a file path, which serves as the input to the code, and the program's output is the label.


The dataset's classes and the number of samples in each class are listed. [16] The value_counts() method returns a Series containing counts of unique values; the resulting object is arranged in descending order, with the first element being the most frequently occurring. We then define both wave-plot and spectrogram functions. The features are extracted using the Python speech features module. The MFCC feature was created by combining four different instrument clips and depicts the corresponding emotion. [4] A wave plot is a visual representation of an audio file's waveform, while a spectrogram displays an audio file's frequency levels. The spectrogram features are used for feature extraction and feature selection in the neural network through the convolution layer and pooling layer, whereas the audio features act as the network input for the fusion classification model based on LSTM. [3] A sequence of serialized feature vectors is created by the model and fed into the LSTM network as new features before being output through an explicit sparse attention network. We can obtain the emotion of the audio after processing it, as shown in Fig. 4 and Fig. 4.1.
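A rough sketch of the steps just described follows: collecting file paths and labels, counting samples per class with value_counts(), and defining the wave-plot and spectrogram helpers. The folder name, label-parsing rule, column names, and figure sizes are illustrative assumptions.

import os
import numpy as np
import pandas as pd
import librosa
import librosa.display
import matplotlib.pyplot as plt

# Collect file paths and labels from the dataset folders ("dataset" path and
# the label-parsing rule are assumptions about how the data is organized).
paths, labels = [], []
for root, _, files in os.walk("dataset"):
    for name in files:
        if name.endswith(".wav"):
            paths.append(os.path.join(root, name))
            labels.append(name.split("_")[-1].replace(".wav", ""))

df = pd.DataFrame({"speech": paths, "label": labels})
print(df["label"].value_counts())          # samples per class, most frequent first

def waveplot(path, emotion):
    # Visual representation of the audio file's waveform.
    y, sr = librosa.load(path, duration=3)  # clips limited to 3 seconds
    plt.figure(figsize=(10, 3))
    plt.title(f"Waveplot - {emotion}")
    librosa.display.waveshow(y, sr=sr)      # waveplot() in older librosa releases
    plt.show()

def spectrogram(path, emotion):
    # Frequency content of the audio file over time.
    y, sr = librosa.load(path, duration=3)
    S_db = librosa.amplitude_to_db(np.abs(librosa.stft(y)))
    plt.figure(figsize=(10, 3))
    plt.title(f"Spectrogram - {emotion}")
    librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="hz")
    plt.colorbar()
    plt.show()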

Fig. 4 Frequency of the fear emotion
Fig. 4.1 Audio signal of the fear emotion

Each class's audio file is plotted as a wave plot and a spectrogram, and each class has a sample audio of an emotional speech. Darker colours are associated with lower-pitched voices, while colours are brighter for higher-pitched voices. The audio length is limited to 3 seconds so that the files are of identical size. [6] The Mel-frequency cepstral coefficient (MFCC) features are extracted with a limit of 40, and their mean is used as the final feature. The audio files' feature values are displayed in Table 1. The frequencies and audio signals of the different emotions (happy, sad, disgust, etc.) are shown in the figures below.
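A sketch of this extraction step, computing 40 MFCCs per clip with librosa and averaging them over time into a single 40-dimensional feature vector; the helper name, the 0.5 s offset, and the dataframe column are our assumptions.

import numpy as np
import librosa

def extract_mfcc(path, n_mfcc=40, duration=3):
    # Load up to 3 seconds of audio and average the 40 MFCCs over all frames.
    y, sr = librosa.load(path, duration=duration, offset=0.5)  # offset is an assumed value
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # shape: (40, num_frames)
    return np.mean(mfccs.T, axis=0)                            # mean over frames -> (40,)

# df comes from the earlier sketch; expand_dims reshapes the feature list into
# the (num_samples, 40, 1) array expected by the LSTM input layer.
X = np.array([extract_mfcc(p) for p in df["speech"]])
X = np.expand_dims(X, -1)
print(X.shape)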

Fig. 5 Frequency of the disgust emotion
Fig. 5.1 Audio signal of the disgust emotion

The features are extracted from all the audio files and the retrieved feature values are visualized. [14] The greater the number of samples in the dataset, the longer the processing time. The list is converted into a single-dimensional array; in a one-dimensional array, the shape indicates the number of samples in the dataset. [9] The shape denotes the number of samples and output classes. A single-dimension layer of hidden units is called Dense. Dropout applies regularization to the data in order to avoid overfitting by dropping out a portion of the data.

Fig. 6 Frequency of the angry emotion
Fig. 6.1 Audio signal of the angry emotion

The outcome of each training epoch is displayed. batch_size=64 indicates the amount of data to be processed at each step, epochs=50 is the number of iterations used to train the model, and validation_split=0.2 is the train/test split percentage. Each cycle improves the training and validation accuracy; the highest validation accuracy is 72.32%. The model with the best validation accuracy is saved using a checkpoint, and slow convergence requires adjusting the learning rate. [12]
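A sketch of this training call with the stated hyperparameters (batch size 64, 50 epochs, 0.2 validation split) and a checkpoint that keeps the weights with the best validation accuracy; the use of Keras's ModelCheckpoint, the output file name, and the model, X, and y objects (taken from the other sketches) are assumptions.

from tensorflow.keras.callbacks import ModelCheckpoint

# Keep only the weights of the epoch with the best validation accuracy.
checkpoint = ModelCheckpoint("best_model.h5",        # output path is a placeholder
                             monitor="val_accuracy",
                             save_best_only=True,
                             verbose=1)

# model, X (features) and y (one-hot labels) are assumed from the other sketches.
history = model.fit(X, y,
                    batch_size=64,
                    epochs=50,
                    validation_split=0.2,
                    callbacks=[checkpoint])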


Fig. 7 Frequency of the happy emotion
Fig. 7.1 Audio signal of the happy emotion

We create a categorization task for the MER job. In the VA (valence-arousal) emotional space, there are four distinct types of continuous emotions: joyous, sad, anxious, and calm. Since the music video labels in the dataset correspond to specific points in the VA space, the emotional values must be partitioned to map them to emotional categories. [5] Before the sample data were processed by the classification tasks in this study, the VA space was divided into four parts, and the four emotions were associated with its quadrants. The combination of short-term energy, short-term mean amplitude, and short-term autocorrelation function had the best recorded effect in the BP-based MER experiment. The outcome of each training epoch is displayed; the training accuracy and validation accuracy grow with each iteration, and the best validation accuracy is 72.32%. A checkpoint is used to save the model with the best validation accuracy, and slow convergence requires adjusting the learning rate. [13]
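As a toy sketch of the quadrant mapping described above, a (valence, arousal) point is assigned to one of the four emotion categories. The midpoint thresholds and the exact quadrant-to-label assignment are our assumptions, since the paper does not spell them out.

def va_to_emotion(valence: float, arousal: float,
                  v_mid: float = 0.0, a_mid: float = 0.0) -> str:
    # Split the VA plane into four quadrants and return the associated label.
    if arousal >= a_mid:
        return "joyous" if valence >= v_mid else "anxious"
    return "calm" if valence >= v_mid else "sad"

print(va_to_emotion(0.7, 0.8))    # -> "joyous"
print(va_to_emotion(-0.6, -0.4))  # -> "sad"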
Fig. 8 Frequency of the neutral emotion
Fig. 8.1 Audio signal of the emotion

Because there is no common standard database for music emotion detection, many databases have been used in published studies; because of this, the reported accuracy of each algorithm varies. All Music Guide (AMG) is a music company that uses moods to retrieve music; this organization uses 288 mood categories for emotional classification, provided by music professionals.

V. Result

Deep learning models outperform machine learning techniques in terms of accuracy. The voice emotion recognition model is trained using the retrieved audio features, and its accuracy increases with more training data. This model can be used in a variety of ways, including speech recognition or other audio-related tasks, depending on the settings and data collection. We reviewed the speech emotion recognition dataset as a deep learning classification project, and the various voice-emotion sounds were identified and classified using exploratory data analysis. The combined phase-spectrum feature set achieves an accuracy score of 83%, while the combination of short-term energy, short-term average amplitude, short-term autocorrelation function, frequency, amplitude, phase, and complex characteristics of the drum face is 72.32% accurate. In this study, the VA space was divided into four parts, and the four emotions were linked to the VA space before the sample data were processed by the classification tasks.

Table 1: Model layers and parameters

Layer (type)            Output Shape    Param #
lstm_3 (LSTM)           (None, 256)     264192
dropout_9 (Dropout)     (None, 256)     0
dense_9 (Dense)         (None, 128)     32896
dropout_10 (Dropout)    (None, 128)     0
dense_10 (Dense)        (None, 64)      8256
dropout_11 (Dropout)    (None, 64)      0
dense_11 (Dense)        (None, 7)       455

Total params: 305,799
Trainable params: 305,799
Non-trainable params: 0
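The layer stack in Table 1 can be reproduced with the Keras sketch below. The layer sizes follow the table and, on a (40, 1) input, yield the same parameter counts (264,192; 32,896; 8,256; 455; 305,799 in total); the dropout rates, activations, and optimizer are our assumptions.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

# Sketch: the LSTM classifier summarized in Table 1.
model = Sequential([
    LSTM(256, input_shape=(40, 1)),     # 264,192 params on a 40-step, 1-feature input
    Dropout(0.2),
    Dense(128, activation="relu"),      # 32,896 params
    Dropout(0.2),
    Dense(64, activation="relu"),       # 8,256 params
    Dropout(0.2),
    Dense(7, activation="softmax"),     # 455 params, one unit per emotion class
])
model.compile(loss="categorical_crossentropy", optimizer="adam",
              metrics=["accuracy"])
model.summary()                         # totals should match Table 1 (305,799 trainable)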
To determine which feature combination provides the best recognition results, the BP technique was initially used to study many different feature combinations. Short-term energy, short-term average amplitude, short-term autocorrelation function, short-term zero-crossing rate, frequency spectrum, and amplitude spectrum form the combination of features with the strongest identification effect; the combination including the phase-spectrum features has an accuracy rate of 83.83%. An accuracy of 77.89% was obtained by combining short-term energy, short-term average amplitude, short-term autocorrelation function, frequency spectrum, amplitude spectrum, phase spectrum, and complex features of the surface. The accuracy for short-term energy, short-term average amplitude, short-term autocorrelation function, frequency, amplitude, phase, and complex surface and drum characteristics is 76.48%. The test results show that the time factor has a significant effect. The differences are


small because they do not differ materially from the experimental results that the recognition models produce, as the graphical comparison of the test results shows.

VI. Conclusion

Music contains a wealth of human emotional information, and research on music emotion categorization is useful for organizing vast amounts of musical data. This study enhances the feature-gathering capability of the emotion identification model by incorporating the deep network model into an explicit sparse attention mechanism for optimization. This encourages the preparation of related data and enhances the input level of the model, which increases the model's recognition accuracy. Compared with other strategies, the proposed method includes an explicit sparse attention mechanism to deliberately filter out small amounts of information, concentrate the distribution of attention, and enable the collection and analysis of the relevant information. The test results show that the proposed method can effectively analyse and classify the data.

Research on audio digitization has advanced as a result of the continual development of modern information technology, and it is now possible to apply computer-related technologies to MER. To improve musical emotion recognition, this study uses an improved BP network to recognize music data. Before analysing the optimal feature data for emotion detection, this study first identifies the acoustic features of music in associative form for emotion classification. Second, using the ABC-modified BP network, a musical sentiment classifier was developed and its performance evaluated against other classifiers. The test results show that the network used has a major impact on recognition.
References

[1] R. R. Subramanian, Y. Sireesha, Y. S. P. K. Reddy, T. Bindamrutha, M. Harika and R. R. Sudharsan, "Audio Emotion Recognition by Deep Neural Networks and Machine Learning Algorithms," 2021 International Conference on Advancements in Electrical, Electronics, Communication, Computing and Automation (ICAECA), 2021, pp. 1-6, doi: 10.1109/ICAECA52838.2021.9675492.
[2] J. S. Gómez-Cañón et al., "Music Emotion Recognition: Toward New, Robust Standards in Personalized and Context-Sensitive Applications," IEEE Signal Processing Magazine, vol. 38, no. 6, pp. 106-114, Nov. 2021, doi: 10.1109/MSP.2021.3106232.
[3] S. Hizlisoy, S. Yildirim and Z. Türeci, "Music emotion recognition using convolutional long short term memory deep neural networks," Engineering Science and Technology, an International Journal, vol. 24, no. 3, 2021, ISSN 2215-0986, doi: 10.1016/j.jestch.2021.01.009.
[4] R. R. Subramanian, B. R. Babu, K. Mamta and K. Manogna, "Design and Evaluation of a Hybrid Feature Descriptor based Handwritten Character Inference Technique," 2019 IEEE International Conference on Intelligent Techniques in Control, Optimization and Signal Processing (INCOS), Tamil Nadu, India, 2019, pp. 1-5.
[5] R. Raja Subramanian, H. Mohan, A. Mounika Jenny, D. Sreshta, M. Lakshmi Prasanna and P. Mohan, "PSO Based Fuzzy-Genetic Optimization Technique for Face Recognition," 2021 11th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 2021, pp. 374-379, doi: 10.1109/Confluence51648.2021.9377028.
[6] Yang X Y, Dong Y Z, Li J. Review of data features-based music emotion recognition methods. Multimedia Systems, 2018, 24(4): 365-389.
[7] Singhal R, Srivatsan S, Panda P. Classification of Music Genres using Feature Selection and Hyperparameter Tuning. Journal of Artificial Intelligence, 2022, 4(3): 167-178.
[8] Cheng Z Y, Shen J L, Nie L Q, Chua T S, Kankanhalli M. Exploring user-specific information in music retrieval. In: Proceedings of the 40th International ACM SIGIR Conference.
[9] Kim Y E, Schmidt E M, Migneco R, Morton B G, Richardson P, Scott J, Speck J A, Turnbull D. Music emotion recognition: a state of the art review. In: Proceedings of the 11th International Society for Music Information Retrieval Conference. 2010, 255-266.
[10] Yang Y H, Chen H H. Machine recognition of music emotion: a review. ACM Transactions on Intelligent Systems and Technology, 2011, 3(3): 1-30.
[11] Bartoszewski M, Kwasnicka H, Kaczmar M U, Myszkowski P B. Extraction of emotional content from music data. In: Proceedings of the 7th International Conference on Computer Information Systems and Industrial Management Applications. 2008, 293-299.
[12] Hevner K. Experimental studies of the elements of expression in music. The American Journal of Psychology, 1936, 48(2): 246-268.
[13] Posner J, Russell J A, Peterson B S. The circumplex model of affect: an integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and Psychopathology, 2005, 17(3): 715-734.
[14] Thammasan N, Fukui K I, Numao M. Multimodal fusion of EEG and musical features in music-emotion recognition. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017, 4991-4992.
[15] R. R. Subramanian, M. Yaswanth, B. V. Rajkumar T S, K. Rama Sai Vamsi, D. Mahidhar and R. R. Sudharsan, "Musical Instrument Identification using Supervised Learning," 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), 2022, pp. 1550-1555, doi: 10.1109/ICICCS53718.2022.9788116.
[16] Turnbull D, Barrington L, Torres D, Lanckriet G. Towards musical query-by-semantic-description using the CAL500 data set. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2007, 439-446.
[17] Aljanaki A, Yang Y H, Soleymani M. Developing a benchmark for emotional analysis of music. PLoS ONE, 2017, 12(3): e0173392.
[18] Chen P L, Zhao L, Xin Z Y, Qiang Y M, Zhang M, Li T M. A scheme of MIDI music emotion classification based on fuzzy theme extraction and neural network. In: Proceedings of the 12th International Conference on Computational Intelligence and Security. 2016, 323-326.
[19] Juslin P N, Laukka P. Expression, perception, and induction of musical emotions: a review and a questionnaire study of everyday listening. Journal of New Music Research, 2004, 33(3): 217-238.
[20] R. Raja Subramanian and V. Vasudevan, "A deep genetic algorithm for human activity recognition leveraging fog computing frameworks," Journal of Visual Communication and Image Representation, vol. 77, 2021, 103132, ISSN 1047-3203.
[21] Kim J, Shareef I H, Regier P, Truong K P, Charisi V, Zaga C, Bennewitz M, Englebienne G, Evers V. Automatic ranking of engagement of a group of children "in the wild" using emotional states and deep pose machines.

