GLIMPSE - Journal of Computer Science • Vol. 3, No. 2, JULY-DECEMBER 2024, pp. 44-48
Abstract—In recent years, the use of machine learning to recognize human emotions through speech analysis has received a lot of attention. This approach involves identifying the relationship between speech features and emotions and training machine learning models to classify emotions based on these features. In this article, we present a new method for understanding human emotions by analyzing speech with neural networks. Without requiring handcrafted feature engineering, our approach extracts features directly from unprocessed speech data by leveraging the capabilities of deep learning. By evaluating our strategy on large datasets, we achieve strong performance on the core recognition task. Our findings demonstrate that deep learning can improve on conventional methods by eliminating hand-engineered recognition components. By using discourse analysis, this article advances cognitive psychology research and highlights the benefits of deep learning in this area. Speech is a powerful tool for communicating emotions, and understanding people's emotions by analyzing speech can have important applications in many areas. In this article, we propose a method for combining acoustic and textual data for emotion recognition. Our approach involves extracting text from speech data using natural language processing techniques and combining it with acoustic features. We then use a deep neural network model to classify emotions.

Keywords—Psychology, cognitive, spectrogram, optimization, effectiveness.

I. INTRODUCTION

Emotion recognition from speech, an integral aspect of affective computing, has experienced notable progress in recent years due to advancements in artificial intelligence (AI) and machine learning technologies. The accurate and efficient interpretation of emotional states from spoken language has become increasingly crucial as these technologies evolve. Speech Emotion Recognition (SER) involves identifying and categorizing emotional cues conveyed through vocal expressions, with significant implications for fields such as human-computer interaction, healthcare, customer service, and entertainment [1].

The exploration of emotions in speech historically commenced with fundamental research on the acoustic and linguistic features linked to different emotional states. Early approaches relied on rule-based systems and manually crafted characteristics, resulting in limited accuracy and scalability. However, the introduction of machine learning techniques brought about a paradigm shift, facilitating the development of more sophisticated models capable of learning intricate data patterns. Recent progress in deep learning, particularly the utilization of neural networks, has notably enhanced the performance of SER systems by streamlining feature extraction and harnessing extensive datasets.

Despite these advancements, numerous challenges persist in the realm of SER. One major obstacle is the variability of emotional expression across languages, cultures, and individual speakers, which can limit the generalizability of SER systems. Furthermore, the subtlety and context-dependency of emotional cues necessitate resilient models capable of handling a range of real-world data variations. Additionally, ethical considerations and privacy issues surrounding the acquisition and use of emotional data are crucial concerns that warrant meticulous attention [2].

This manuscript delivers a thorough overview of the present state of speech emotion recognition, concentrating on the latest methodologies, ongoing hurdles, and emerging applications. Various approaches employed in SER, spanning from traditional acoustic analysis to advanced deep learning methods, will be explored. Moreover, the paper will examine the practical implications of SER technology across different sectors and underscore prospective avenues for research and development. By consolidating recent progress and pinpointing critical challenges, this paper endeavors to furnish valuable insights into the future of emotion recognition from speech and its potential ramifications for technology and society [3].
Fig. 2: Flow chart

The system is designed to work across various individuals, cultures, and languages. Additionally, it takes into consideration the unique characteristics of each user, including their speaking patterns and personal habits. Such techniques are widely applicable across a variety of industries, including psychology, education, and entertainment. By examining alterations in speech patterns, the technique can be used in psychology to identify early indicators of mental illnesses such as anxiety or depression. By giving users feedback on pronunciation and intonation, the technology can be used in education to enhance language acquisition. In entertainment, the system could make games more interactive. All in all, understanding human emotions through speech analysis has the potential to change the way we interact with machines, and with the rapid development of machine learning algorithms we can expect more accurate and reliable emotion recognition in the future.

IV. RESULT

The results of the project "Emotion Recognition Using Speech Analysis" vary depending on the specific methods and models used. Overall, they show that machine learning algorithms can classify human emotions from speech analysis. For example, studies have reported emotion recognition accuracies ranging from about 70 percent to more than 90 percent. The accuracy depends on many factors, such as the quality of the data, the feature extraction process, and the classification model. The proposed method using a convolutional neural network (CNN) likewise shows good results in speech emotion recognition. CNNs can help improve classification accuracy by extracting features from speech signals while preserving the relationships between them; a minimal sketch of such a classifier is given below.
Overall, the findings from the use of speech analysis are encouraging: on our test data we obtained an overall accuracy of 83 percent, which might be further enhanced by utilising additional augmentation techniques and a variety of contemporary feature extraction approaches. One possible form of such augmentation is sketched below.
V. CONCLUSION

In conclusion, speech analysis for emotion recognition has shown great potential in recent years. With the use of machine learning algorithms, speech characteristics such as tone, volume, and intensity can be analyzed to describe human emotions accurately. The technology has many applications across a variety of industries, including psychological diagnostics, customer service, and human-computer interaction. However, ethical issues such as privacy concerns and the possibility of abuse must be addressed to ensure the fair and transparent use of the technology. Future research in this area should focus on developing new features and models for emotional analysis, integrating other sources of emotional data, and ensuring the fair use of these methods.

Speech analysis for emotion recognition is a complex process involving many machine-learning algorithms and techniques. Although it has shown potential for many applications, more research is needed to improve its accuracy and overcome its limitations. In addition, ethical issues such as data privacy and transparency must be addressed to ensure that these technologies are developed and used responsibly. Using speech analysis to understand human emotions is a rapidly growing area of research with the potential to change the way we interact with technology and with each other. By recognizing emotional cues in conversation, communication can be improved, psychological problems diagnosed, and customer service enhanced. However, to realize these benefits, more research is needed to improve the accuracy and reliability of emotion recognition from speech and to resolve ethical issues such as data privacy and integrity.

In general, the development and use of emotion recognition through speech analysis should be guided by ethical and social considerations. All things considered, speech analysis as a means of assessing emotional state is a fascinating field of study with great potential to influence numerous companies and sectors in the years to come. New features and models for sentiment analysis are still being investigated to increase the efficacy and accuracy of speech-based emotion recognition. For example, researchers are exploring deep learning techniques such as convolutional and recurrent neural networks to better model the interplay between listening and speaking. Another important aspect of emotion recognition is the integration of complementary information such as facial expressions, body language, and body movements, which can provide context beyond the voice signal and increase recognition accuracy. However, it is important to remember that the use of this technology should be guided by ethical principles such as privacy, transparency, and accountability.

REFERENCES

[1] M. S. Ram, A. Sreeram, M. Poongundran, P. Singh, Y. N. Prajapati, and S. Myrzahmetova, "Data fusion opportunities in IoT and its impact on decision-making process of organisations," in 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), 2022, pp. 459–464.
[2] S. Jain, "Deep learning's obstacles in medical image analysis: Boosting trust and explainability," Journal of Computer Science, vol. 3, no. 1, pp. 21–24, Jan. 2024.
[3] Y. N. Prajapati, U. Sesadri, T. Mahesh, S. Shreyanth, A. Oberoi, and K. P. Jayant, "Machine learning algorithms in big data analytics for social media data based sentimental analysis," International Journal of Intelligent Systems and Applications in Engineering, vol. 10, no. 2s, pp. 264–267, 2022.
[4] F. Khorrami, P. Vernant, F. Masson, F. Nilfouroushan, Z. Mousavi, N. R., R. Saadat, A. Walpersdorf, S. Hosseini, P. Tavakoli, A. Aghamohammadi, and M. Alijanzade, "An up-to-date crustal deformation map of Iran using integrated campaign-mode and permanent GPS velocities," Geophysical Journal International, vol. 217, Feb. 2019.
[5] D. Ververidis and C. Kotropoulos, "Emotional speech recognition: Resources, features, and methods," Speech Communication, vol. 48, no. 9, pp. 1162–1181, 2006. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0167639306000422
[6] C. Busso, M. Bulut, C.-C. Lee, A. Kazemzadeh, E. Mower, S. Kim, J. N. Chang, S. Lee, and S. S. Narayanan, "IEMOCAP: interactive emotional dyadic motion capture database," Language Resources and Evaluation, vol. 42, no. 4, pp. 335–359, Dec. 2008. [Online]. Available: https://doi.org/10.1007/s10579-008-9076-6
[7] W. Mellouk and W. Handouzi, "Facial emotion recognition using deep learning: review and insights," Procedia Computer Science, vol. 175, pp. 689–694, 2020, The 17th International Conference on Mobile Systems and Pervasive Computing (MobiSPC), The 15th International Conference on Future Networks and Communications (FNC), The 10th International Conference on Sustainable Energy Information Technology. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S1877050920318019
[8] Vikas, "Machine learning methods in software engineering – review," Journal of Computer Science, vol. 3, no. 1, pp. 48–51, Jan. 2024.
[9] F. Eyben, M. Wöllmer, and B. Schuller, "openSMILE – the Munich versatile and fast open-source audio feature extractor," in Proceedings of the 18th ACM International Conference on Multimedia, 2010, pp. 1459–1462.
[10] Y. N. Prajapati and M. Sharma, "Novel machine learning algorithms for predicting COVID-19 clinical outcomes with gender analysis," in Advanced Computing, D. Garg, J. J. P. C. Rodrigues, S. K. Gupta, X. Cheng, P. Sarao, and G. S. Patel, Eds. Cham: Springer Nature Switzerland, 2024, pp. 296–310.
[11] Y. N. Prajapati and M. Sharma, "Designing AI to predict COVID-19 outcomes by gender," in 2023 International Conference on Data Science, Agents & Artificial Intelligence (ICDSAAI), 2023, pp. 1–7.